Skip to content

Conversation

huoyaoyuan
Copy link
Member

Since the initial publish commit of coreclr, FEATURE_ARRAYSTUB_AS_IL was not enabled for win-x86. Now it's 10 years later and our JIT compiler is able to produce better code than hand written assemblies. It's also much easier to maintain the logic in IL.

A little portion of dead code accessing ESP+offset is not removed in stublinkerx86.

@ghost ghost added the area-VM-coreclr label Jun 16, 2024
@huoyaoyuan huoyaoyuan marked this pull request as ready for review June 16, 2024 07:34
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Jun 16, 2024
Copy link
Contributor

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

@huoyaoyuan
Copy link
Member Author

Benchmarks:

    public class Base { }
    public class Derived : Base { }

    public struct GCStruct
    {
        public object a, b, c, d, e, f, g, h;
    }

        private Base[,] baseArray = new Base[10, 1];
        private Base[,] covariantArray = new Derived[10, 1];
        private Derived[,] derivedArray = new Derived[10, 1];
        private int[,] intArray = new int[10, 1];
        private Guid[,] guidArray = new Guid[10, 1];
        private GCStruct[,] gcStructArray = new GCStruct[10, 1];
        private double[,] doubleArray = new double[10, 1];

        private Derived derivedObj = new Derived();
        private Guid guid = Guid.NewGuid();
        private GCStruct gcStruct = new GCStruct
        {
            a = new object(),
            g = new object(),
        };

        private void GenericSet<T>(T[,] array, T value)
        {
            for (int i = 0; i < 10; i++)
            {
                array[i, 0] = value;
            }
        }

        [Benchmark]
        public void ReferenceTypeBase() => GenericSet(baseArray, derivedObj);

        [Benchmark]
        public void ReferenceTypeExactMatch() => GenericSet(derivedArray, derivedObj);

        [Benchmark]
        public void ReferenceTypeCovariant() => GenericSet(covariantArray, derivedObj);

        [Benchmark]
        public void PrimitiveInt() => GenericSet(intArray, 42);

        [Benchmark]
        public void NonGCStruct() => GenericSet(guidArray, guid);

        [Benchmark]
        public void GCStruct() => GenericSet(gcStructArray, gcStruct);

        [Benchmark]
        public void FPUType() => GenericSet(doubleArray, 123.456);
BenchmarkDotNet v0.13.12, Windows 11 (10.0.22631.3737/23H2/2023Update/SunValley3)
13th Gen Intel Core i9-13900K, 1 CPU, 32 logical and 24 physical cores
.NET SDK 9.0.100-preview.4.24267.66
  [Host]     : .NET 8.0.5 (8.0.524.21615), X86 RyuJIT AVX2
  Job-PHSMMU : .NET 9.0.0 (42.42.42.42424), X86 RyuJIT AVX2
  Job-LOQTGY : .NET 9.0.0 (42.42.42.42424), X86 RyuJIT AVX2

OutlierMode=DontRemove  Affinity=00000000000000001111111111111111  MemoryRandomization=True
Method Job Toolchain Mean Error StdDev Ratio RatioSD
ReferenceTypeBase Job-PHSMMU \x86-baseline\corerun.exe 39.986 ns 0.2994 ns 0.2800 ns 1.00 0.00
ReferenceTypeBase Job-LOQTGY \x86-pr\corerun.exe 31.664 ns 0.2694 ns 0.2520 ns 0.79 0.00
ReferenceTypeExactMatch Job-PHSMMU \x86-baseline\corerun.exe 24.066 ns 0.0779 ns 0.0729 ns 1.00 0.00
ReferenceTypeExactMatch Job-LOQTGY \x86-pr\corerun.exe 24.819 ns 0.5049 ns 0.5185 ns 1.03 0.02
ReferenceTypeCovariant Job-PHSMMU \x86-baseline\corerun.exe 23.918 ns 0.1255 ns 0.1174 ns 1.00 0.00
ReferenceTypeCovariant Job-LOQTGY \x86-pr\corerun.exe 24.498 ns 0.4961 ns 0.4872 ns 1.02 0.02
PrimitiveInt Job-PHSMMU \x86-baseline\corerun.exe 4.219 ns 0.0412 ns 0.0385 ns 1.00 0.00
PrimitiveInt Job-LOQTGY \x86-pr\corerun.exe 4.127 ns 0.0582 ns 0.0544 ns 0.98 0.02
NonGCStruct Job-PHSMMU \x86-baseline\corerun.exe 82.065 ns 0.9609 ns 0.8989 ns 1.00 0.00
NonGCStruct Job-LOQTGY \x86-pr\corerun.exe 10.953 ns 0.1486 ns 0.1390 ns 0.13 0.00
GCStruct Job-PHSMMU \x86-baseline\corerun.exe 103.016 ns 0.7472 ns 0.6989 ns 1.00 0.00
GCStruct Job-LOQTGY \x86-pr\corerun.exe 60.069 ns 1.1729 ns 1.0971 ns 0.58 0.01
FPUType Job-PHSMMU \x86-baseline\corerun.exe 5.275 ns 0.0429 ns 0.0402 ns 1.00 0.00
FPUType Job-LOQTGY \x86-pr\corerun.exe 5.140 ns 0.1218 ns 0.1140 ns 0.97 0.02

JIT generates much better code for large structs.

@huoyaoyuan
Copy link
Member Author

Test failure at ILStubCache::CreateNewMethodDesc. Looks related.

@huoyaoyuan
Copy link
Member Author

The test failure seems to unveil another bug:

In ILStubCache::CreateNewMethodDesc, the stub type is detected for each stub. For StubVirtualStaticMethodDispatch, it's detection is not in else if, but another block separated from the if-else if tree:

pMD->SetILStubType(DynamicMethodDesc::StubCLRToNativeInterop);
}
}
if (SF_IsVirtualStaticMethodDispatchStub(dwStubFlags))
{
pMD->SetILStubType(DynamicMethodDesc::StubVirtualStaticMethodDispatch);
}

Thus for SVM dispatch stubs, the stub type is set as the fallback StubCLRToNativeInterop, then StubVirtualStaticMethodDispatch.
However in DynamicMethodDesc::SetILStubType, the stub type bits are |'d, without clearing existing flag:

void SetILStubType(ILStubType type)
{
_ASSERTE(HasFlags(FlagIsILStub));
m_dwExtendedFlags |= type;
}

So the stub type will actually be set as StubCLRToNativeInterop|StubVirtualStaticMethodDispatch. The value of StubCLRToNativeInterop is 0x01, and for all other configurations and previously Windows x86, the value of StubVirtualStaticMethodDispatch is odd, so the |'d result is still StubVirtualStaticMethodDispatch.
By enabling StubArrayOp, this PR changes the value on Windows x86 from 0x09 to 0x0a, then the result of StubCLRToNativeInterop|StubVirtualStaticMethodDispatch becomes 0x0b, which is out of range.

Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Thank you

@jkotas jkotas merged commit 2d9373a into dotnet:main Jun 16, 2024
@huoyaoyuan huoyaoyuan deleted the array-stub-x86 branch June 17, 2024 01:58
@github-actions github-actions bot locked and limited conversation to collaborators Jul 17, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-VM-coreclr community-contribution Indicates that the PR has been added by a community member

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants