Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EqualityComparer<T>.Default.Equals doesn't devirtualize in a shared generic #10050

Open
benaadams opened this issue Mar 27, 2018 · 11 comments
Open
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI optimization
Milestone

Comments

@benaadams
Copy link
Member

benaadams commented Mar 27, 2018

It potentially could since it resolves to calls on object.

However more pressing is regression in dictionary for object types https://github.com/dotnet/corefx/issues/28511

/cc @AndyAyersMS @danmosemsft @ianhays

category:cq
theme:devirtualization
skill-level:intermediate
cost:medium
impact:medium

@AndyAyersMS
Copy link
Member

I don't believe the jit can devirtualize in this case in general.

For ref type T, EqualityComparer<T>.Default can either be a GenericEqualityComparer<T> or an ObjectEqualityComparer<T>.

In the typical case where the caller of Default is itself generic in T (that is, shared code) both comparer types are always possible at runtime.

If T is exactly known or we can inline EqualityComparer<T>.Default into a context where T is exactly known then it should devirtualize (eg T == string).

As noted in dotnet/corefx#28511 if you are doing a lot of comparisons using the default comparer the performant patterns for value and ref types differ. This is unfortunate. For ref types caching the result of Default in a local is beneficial; for value types it is harmful.

We should investigate what it would take to fix the jit so the same patterns perform equally well for value and ref types.

@benaadams
Copy link
Member Author

For ref type T, EqualityComparer<T>.Default can either be a GenericEqualityComparer<T> or an ObjectEqualityComparer<T>.

Hmm... that makes it more complicated and means I can't do a simple fallback to object.Equals as it may be IEquatable<T>.Equals(T) and a different method?

@benaadams
Copy link
Member Author

I think best can do is going back to caching in a local for object types 😢

Will treat it as a resolution to the regression rather than something more general

@benaadams
Copy link
Member Author

As aside (future) could generic lookups in loops be hoisted outside of loops?

@AndyAyersMS
Copy link
Member

Yes, that is part of getting the jit changes correct.

I'd like to make it so that if you call Default in a loop, and don't cache and the jit doesn't devirtualize / inline, it can can hoist the runtime handle lookup and the call to Default out of the loop and effectively re-introduce caching. That fixes things so if you use the new value-optimal pattern you come out mostly ok with refs too.

The second part of the fix is that if you cache the jit needs to be more flexible in propagating the devirtualized comparer to the use site. This is a bit trickier to arrange since devirtualization happens so early, but I think we can make it work provided the caching is done "near" the use. That way if you cache to get good perf for ref types (or have legacy code that cached because prior to all this devirtualization, caching was a win for both ref and value types) you now get the benefits of devirtualization.

@AndyAyersMS
Copy link
Member

Looked into this a bit, here are some notes:

To enable hoisting Default when we won't devirtualize Equals

We likely want to make Default be noinline, as once it gets inlined and the generic static base address lookup expands, it becomes too complex to analyze/hoist. There's probably no real benefit to inlining this anyways. If we do that then we need to adjust various optimizer methods to recognize calls to Default as an intrinsic with special properties: while it has global side effects they are idempotent and so it can be safely hoisted. We can also streamline/specialize the "special DCE" property which indicates that if the result of this call is unused the call can be removed.

I prototyped this a bit and another challenge I hit here was that morph also does some crude side effect analysis for calls and since this call ends up nested, morph creates an embedded assignment. Wasn't clear to me why we did not spill at the outer call in the importer but if/when we do this we will decouple the call to Default and the subsequent use at the virtual site and so introduce instances of the case below.

Also we might want to allow value numbering/CSE of these calls.

To enable devirtualization for cases where user or jit has split the call to Default away from the call to Equals

Assumption here is that in the importer we need to track the actual expression for single def locals and not just its type. Seems like this is not too hard to do though we must be careful about reimportation and such. Then during devirt of Equals if we have a single-def temp or local for this we can look back and see the definition is a call to Default and proceed as we do now. Later the call to Default would be dead coded as its result would no longer be used.

using System;
using System.Collections.Generic;

class GitHub_17273
{
    // Would like to see just one call to Default per call to Hoist
    public static int Hoist()
    {
        int result = 0;
        string s0 = "s";
        string s1 = "s";

        for (int i = 0; i < 100; i++)
        {
            if (EqualityComparer<string>.Default.Equals(s0, s1))
            {
                result++;
            }
        }

        return result;
    }

    // Would like to see call to Default optimized away
    // and Equals devirtualized and inlined
    public static int Sink(int v0, int v1)
    {
        int result = 0;

        var c = EqualityComparer<int>.Default;

        for (int i = 0; i < 100; i++)
        {
            if (c.Equals(v0, v1))
            {
                result++;
            }
        }

        return result;
    }

    // Would like to see just one call to Default per call to Common
    public static int Common()
    {
        int result = 0;
        string s0 = "t";
        string s1 = "t";

        for (int i = 0; i < 50; i++)
        {
            if (EqualityComparer<string>.Default.Equals(s0, s1))
            {
                result++;
            }
        }

        for (int i = 0; i < 50; i++)
        {
            if (EqualityComparer<string>.Default.Equals(s0, s1))
            {
                result++;
            }
        }

        return result;
    }

    // Optimization pattern should vary here.
    //
    // If T is a ref type, Default will be hoisted and Equals will not devirtualize.
    // If T is a value type, Default will be removed and Equals will devirtualize.
    public static int GeneralHoist<T>(T v0, T v1)
    {
        int result = 0;

        for (int i = 0; i < 100; i++)
        {
            if (EqualityComparer<T>.Default.Equals(v0, v1))
            {
                result++;
            }
        }

        return result;

    }

    // Optimization pattern should vary here.
    //
    // If T is a ref type we will compile as is.
    // If T is a value type, Default will be removed and Equals will devirtualize.
    public static int GeneralSink<T>(T v0, T v1)
    {
        int result = 0;

        var c = EqualityComparer<T>.Default;

        for (int i = 0; i < 100; i++)
        {
            if (c.Equals(v0, v1))
            {
                result++;
            }
        }

        return result;

    }

    public static int Main()
    {
        int h = Hoist();
        int s = Sink(33, 33);
        int c = Common();
        int ghr = GeneralHoist<string>("u", "u");
        int ghv = GeneralHoist<int>(44, 44);
        int gsr = GeneralSink<string>("v", "v");
        int gsv = GeneralSink<int>(55, 55);

        return h + s + c + ghr + ghv + gsr + gsv - 600;
    }
}

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@BruceForstall BruceForstall added the JitUntriaged CLR JIT issues needing additional triage label Oct 28, 2020
kamronbatman added a commit to modernuo/ModernUO that referenced this issue Feb 4, 2021
- [X] Adds the ability to store a null value in an ordered hash set

TODO:
Optimized the OrderedHashSet. See this:
* dotnet/runtime#10050
Looks like the OrderedDictionary that I based this structure from was not updated.
See the source for diffing:
https://github.com/dotnet/runtime/blob/master/src/libraries/System.Private.CoreLib/src/System/Collections/Generic/HashSet.cs

Difference between a HashSet and an OrderedHashSet is updating the indexes of the entries after fixing the chain to preserve insertion order.
kamronbatman added a commit to modernuo/ModernUO that referenced this issue Aug 9, 2021
…#679)

* Optimizes OrderedHashSet using the learnings here: dotnet/runtime#10050
* Removes ValueCollection since it was a relic from the conversion of the OrderedDictionary
* Centralizes throw strings.
* Adds pooled ordered hash set (but doesn't use it)

Closes #448
@BruceForstall BruceForstall removed the JitUntriaged CLR JIT issues needing additional triage label Jan 24, 2023
@EgorBo
Copy link
Member

EgorBo commented Mar 9, 2023

Just wanted to note that the penalty from non devirtualized Equals in that case is quite big:

image

@ghost ghost added in-pr There is an active PR which will close this issue when it is merged and removed in-pr There is an active PR which will close this issue when it is merged labels Mar 5, 2024
@AndyAyersMS
Copy link
Member

@EgorBo can you post updated perf for your benchmark(s) above?

@EgorBo
Copy link
Member

EgorBo commented Oct 16, 2024

@EgorBo can you post updated perf for your benchmark(s) above?

I don't think anything has changed for it, we bail out on devirtualizing it (and PGO refuses to devirt it either due to lookup-via-this/indirect call, something like that), but let me kick off a run via bot..

@EgorBo
Copy link
Member

EgorBo commented Oct 16, 2024

@EgorBot -intel -arm64

using System;
using BenchmarkDotNet.Attributes;
using System.Runtime.CompilerServices;
using BenchmarkDotNet.Running;

public class StringComparisons
{
    private string s1 = "1";
    private string s2 = "1";

    [Benchmark]
    public bool Equals_inline() => string.Equals(s1, s2);

    [Benchmark]
    public bool Equals_specialized() => EqualityComparer<string>.Default.Equals(s1, s2);

    [Benchmark]
    public bool Equals_generic() => GenericEquals(s1, s2);

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static bool GenericEquals<T>(T a, T b) => EqualityComparer<T>.Default.Equals(a, b);
}

@EgorBo
Copy link
Member

EgorBo commented Oct 16, 2024

cc @AndyAyersMS results in EgorBot/runtime-utils#120 probably better now? hard to say

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI optimization
Projects
None yet
Development

No branches or pull requests

5 participants