-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add internal Array.Clear(Array), optimize some Array callers + codegen #51548
Conversation
Tagging subscribers to this area: @tannergooding Issue DetailsOptimized some calls to This is marked as draft because I'm soliciting feedback on whether to pursue this. It introduces some complexity to the The codegen for 00007ffa`321721a0 4883ec28 sub rsp, 28h
00007ffa`321721a4 4885c9 test rcx, rcx ; arr != null check
00007ffa`321721a7 745d je System_Private_CoreLib!System.Array.Clear(System.Array)+0xffffffff`a0fc77c6 (00007ffa`32172206)
00007ffa`321721a9 488b11 mov rdx, qword ptr [rcx] ; rdx := pMethodTable
00007ffa`321721ac 4c8bc2 mov r8, rdx
00007ffa`321721af 410fb700 movzx eax, word ptr [r8] ; rax := MethodTable::ComponentSize
00007ffa`321721b3 480faf4108 imul rax, qword ptr [rcx+8] ; rax := TotalByteLength (ComponentSize * NumElements)
00007ffa`321721b8 8b5204 mov edx, dword ptr [rdx+4] ; rdx := MethodTable::BaseSize
00007ffa`321721bb 4883c108 add rcx, 8 ; ideally would be rolled into line immediately below, but eh, whatever
00007ffa`321721bf 488d4c11f0 lea rcx, [rcx+rdx-10h] ; rcx := &Array.FirstElement
00007ffa`321721c4 41f70000000001 test dword ptr [r8], 1000000h ; <does the method table contain gc-tracked types?>
; <snip> jit-diff output for CoreLib below. Not sure what's going on with some methods being deleted + reintroduced. Maybe something weird in my build environment? Codegen for methods which now call Top method regressions (bytes):
260 ( ∞ of base) : System.Private.CoreLib.dasm - Array:<Sort>g__GenericSort|128_0(Array,Array,int,int) (0 base, 1 diff methods)
151 ( ∞ of base) : System.Private.CoreLib.dasm - Array:<BinarySearch>g__GenericBinarySearch|82_0(Array,int,int,Object):int (0 base, 1 diff methods)
151 ( ∞ of base) : System.Private.CoreLib.dasm - Array:<IndexOf>g__GenericIndexOf|107_0(Array,Object,int,int):int (0 base, 1 diff methods)
151 ( ∞ of base) : System.Private.CoreLib.dasm - Array:<LastIndexOf>g__GenericLastIndexOf|113_0(Array,Object,int,int):int (0 base, 1 diff methods)
120 ( ∞ of base) : System.Private.CoreLib.dasm - SpanHelpers:ClearWithReferences(byref,long) (0 base, 1 diff methods)
117 ( ∞ of base) : System.Private.CoreLib.dasm - Array:Clear(Array) (0 base, 1 diff methods)
5 ( ∞ of base) : System.Private.CoreLib.dasm - Array:get_NativeLength():long:this (0 base, 1 diff methods)
1 (25.00% of base) : System.Private.CoreLib.dasm - Array:get_LongLength():long:this
Top method improvements (bytes):
-260 (-100.00% of base) : System.Private.CoreLib.dasm - Array:<Sort>g__GenericSort|125_0(Array,Array,int,int) (1 base, 0 diff methods)
-151 (-100.00% of base) : System.Private.CoreLib.dasm - Array:<BinarySearch>g__GenericBinarySearch|79_0(Array,int,int,Object):int (1 base, 0 diff methods)
-151 (-100.00% of base) : System.Private.CoreLib.dasm - Array:<IndexOf>g__GenericIndexOf|104_0(Array,Object,int,int):int (1 base, 0 diff methods)
-151 (-100.00% of base) : System.Private.CoreLib.dasm - Array:<LastIndexOf>g__GenericLastIndexOf|110_0(Array,Object,int,int):int (1 base, 0 diff methods)
-62 (-29.81% of base) : System.Private.CoreLib.dasm - String:Ctor(ushort,int):String:this
-22 (-2.41% of base) : System.Private.CoreLib.dasm - TlsOverPerCoreLockedStacksArrayPool`1:Return(ref,bool):this
-13 (-6.02% of base) : System.Private.CoreLib.dasm - Array:Copy(Array,int,Array,int,int)
-10 (-1.34% of base) : System.Private.CoreLib.dasm - Buffer:BlockCopy(Array,int,Array,int,int)
-9 (-1.01% of base) : System.Private.CoreLib.dasm - Array:Copy(Array,int,Array,int,int,bool)
-6 (-2.70% of base) : System.Private.CoreLib.dasm - Array:Clear(Array,int,int)
-6 (-0.59% of base) : System.Private.CoreLib.dasm - TlsOverPerCoreLockedStacksArrayPool`1:Trim():bool:this
-5 (-1.62% of base) : System.Private.CoreLib.dasm - ConfigurableArrayPool`1:Return(ref,bool):this
-4 (-1.71% of base) : System.Private.CoreLib.dasm - Array:Copy(Array,Array,int)
-2 (-1.72% of base) : System.Private.CoreLib.dasm - Array:UnsafeArrayAsSpan(Array,int,int):Span`1
-2 (-1.13% of base) : System.Private.CoreLib.dasm - Buffer:ByteLength(Array):int
-2 (-2.99% of base) : System.Private.CoreLib.dasm - Buffer:GetByte(Array,int):ubyte
-2 (-2.82% of base) : System.Private.CoreLib.dasm - Buffer:SetByte(Array,int,ubyte)
|
Note to reviewers - test failures suggest that there are scenarios where the padding inside the array object is non-zero. However, browsing through the sources, I can't find any examples of where this might be the case. Thoughts? |
internal sealed class RawData | ||
{ | ||
[FieldOffset(0)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not see the improvement here. Is this just a micro-optimization to compensate for missing JIT optimization? I would much preffer to fix the JIT instead of this.
Whatever it is, I do not think we should be introducing code that depends on this padding. |
@@ -138,8 +138,8 @@ public static unsafe void Copy(Array sourceArray, Array destinationArray, int le | |||
MethodTable* pMT = RuntimeHelpers.GetMethodTable(sourceArray); | |||
if (pMT == RuntimeHelpers.GetMethodTable(destinationArray) && | |||
!pMT->IsMultiDimensionalArray && | |||
(uint)length <= (nuint)sourceArray.LongLength && | |||
(uint)length <= (nuint)destinationArray.LongLength) | |||
(uint)length <= sourceArray.NativeLength && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the NativeLength
property to make these casts go away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part of this was trying to reduce codegen and save a register allocation by having a qword comparison directly against the NativeLength field rather than first requiring zero-extension from a dword field into a qword register. Might have to give up that optimization if we can't assume the padding is all-zero, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think it is worth it to complicate the code to save a few bytes of codegen. It is very x86/64 specific code-size micro-optimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could just have NativeLength
be => Unsafe.As<RawArrayData>(this).Length
to force the zero-extension to nuint
.
That would make it easier to switch to properly be NativeLength
if we ever decide to make the switch in the future and would avoid the potential problems here while keeping things as is (not requiring more complex changes to actually track nuint
in the Array metadata today).
@@ -49,20 +49,17 @@ internal static partial class Sys | |||
|
|||
private static unsafe void AllocNullTerminatedArray(string[] arr, ref byte** arrPtr) | |||
{ | |||
int arrLength = arr.Length + 1; // +1 is for null termination | |||
int arrLength = checked(arr.Length + 1); // +1 is for null termination |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can't actually overflow, can it? arr.Length <= Array.MaxLength < int.MaxValue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mono isn't bound by this same constraint. It also seems fragile to depend on the runtime never allowing this value to increase in the future.
|
||
// Allocate the unmanaged array to hold each string pointer. | ||
// It needs to have an extra element to null terminate the array. | ||
arrPtr = (byte**)Marshal.AllocHGlobal(sizeof(IntPtr) * arrLength); | ||
arrPtr = (byte**)Marshal.AllocHGlobal((nint)sizeof(IntPtr) * arrLength); | ||
Debug.Assert(arrPtr != null); | ||
|
||
// Zero the memory so that if any of the individual string allocations fails, | ||
// we can loop through the array to free any that succeeded. | ||
// The last element will remain null. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is out of date.
{ | ||
_shortPrefixUri[i] = null; | ||
} | ||
Array.Clear(_shortPrefixUri, 0, _shortPrefixUri.Length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we open an issue to expose Array.Clear(Array) publicly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pulled out into #51581.
@@ -287,7 +309,7 @@ public static unsafe void Clear(Array array, int index, int length) | |||
|
|||
int offset = index - lowerBound; | |||
|
|||
if (index < lowerBound || offset < 0 || length < 0 || (uint)(offset + length) > (nuint)array.LongLength) | |||
if (index < lowerBound || offset < 0 || length < 0 || (uint)(offset + length) > array.NativeLength) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (index < lowerBound || offset < 0 || length < 0 || (uint)(offset + length) > array.NativeLength) | |
if (index < lowerBound || (offset | length) < 0 || (uint)(offset + length) > array.NativeLength) |
Ideally the JIT could do this optimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happily, already in progress! :)
#13573
@@ -266,6 +266,28 @@ public static void ConstrainedCopy(Array sourceArray, int sourceIndex, Array des | |||
Copy(sourceArray, sourceIndex, destinationArray, destinationIndex, length, reliable: true); | |||
} | |||
|
|||
internal static unsafe void Clear(Array array) | |||
{ | |||
if (array == null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be more convenient to just turn this into no-op if the array is null.
matchcount[i] = 0; | ||
} | ||
_ = matchcount.Length; // dereference to avoid null check on below line | ||
matchcount.AsSpan().Clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the typical matchcount.Length
? Is this going to be an improvement in typical case?
Latest commit explicitly zeroes out the padding in |
7008bdd
to
9b65c32
Compare
Force-push because I undid most of the work in this PR. The PR is now greatly slimmed:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Optimized some calls to
Array.Clear
, including adding an internal non-genericArray.Clear(Array)
method for common use cases, plus redirecting some existing calls toArray.Clear
orSpan<T>.Clear
(see #51534) orSpan<T>.Fill
.This is marked as draft because I'm soliciting feedback on whether to pursue this. It introduces some complexity to the
RawArrayData
andRawData
types (using FieldOffset) in order to knock down the codegen. Whether complicating the C# definitions of these types in order to save some bytes of codegen is worth it I'll leave up to smarter people. :)The codegen for
Array.Clear(Array)
is pretty well optimized now:jit-diff output for CoreLib below. Not sure what's going on with some methods being deleted + reintroduced. Maybe something weird in my build environment? Codegen for methods which now call
Array.Clear(Array)
instead ofArray.Clear(Array, int, int)
is smaller, as expected. Methods likeArray.Copy
which now rely on the internalNativeLength
property instead ofLongLength
also saw codegen reductions.