-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inefficiencies with IComparer<T>.Compare vs operators #81220
Comments
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak Issue DetailsConsider these three variations: public static bool M1(int i, int j) => i <= j;
public static bool M2<T>(T i, T j) where T : IBinaryInteger<T> => i <= j;
public static bool M3(int i, int j) => Comparer<int>.Default.Compare(i, j) <= 0; M1 and M2 (with T==int) both produce the nice: L0000: xor eax, eax
L0002: cmp ecx, edx
L0004: setle al
L0007: ret but M3 produces: L0000: cmp ecx, edx
L0002: jl short L0013
L0004: cmp ecx, edx
L0006: jg short L001a
L0008: xor eax, eax
L000a: test eax, eax
L000c: setle al
L000f: movzx eax, al
L0012: ret
L0013: mov eax, 0xffffffff
L0018: jmp short L000a
L001a: mov eax, 1
L001f: jmp short L000a It'd be helpful if the JIT were able to unravel such an
|
@AndyAyersMS I've not checked the dump yet but it looks like it was supposed to be handled by your #72979 ? (or #76283) |
I took a quick look and RBO does not see anything it can handle:
RBO doesn't look at predicates in returns; even if it did we'd need to do something like:
|
For reference, here's something much closer to my actual scenario: I'd like to be able to use #nullable disable
#pragma warning disable 0649
using System.Collections.Generic;
using System.Numerics;
using SharpLab.Runtime;
[JitGeneric(typeof(int))]
public class C1<T> where T : struct, IBinaryInteger<T>
{
private T _max;
private T[] _values; // sorted
public bool Contains(T value)
{
if (value <= _max)
{
T[] values = _values;
for (int i = 0; i < values.Length; i++)
{
if (value <= values[i])
{
if (value < values[i])
{
break;
}
return true;
}
}
}
return false;
}
}
[JitGeneric(typeof(int))]
public class C2<T> where T : struct
{
private T _max;
private T[] _values; // sorted
public bool Contains(T value)
{
if (Comparer<T>.Default.Compare(value, _max) <= 0)
{
T[] values = _values;
for (int i = 0; i < values.Length; i++)
{
if (Comparer<T>.Default.Compare(value, values[i]) <= 0)
{
if (Comparer<T>.Default.Compare(value, values[i]) < 0)
{
break;
}
return true;
}
}
}
return false;
}
} |
@AndyAyersMS here is a simplied version with no returns as Stephen posted above ^: static int CompareTo(int a, int b)
{
if (a < b) return -1;
if (a > b) return 1;
return 0;
}
[MethodImpl(MethodImplOptions.NoInlining)]
public static bool Test(int a, int b) => CompareTo(a, b) <= 0 ? true : false; Current codegen: ; Method Program:Test(int,int):bool
G_M37083_IG01:
G_M37083_IG02:
cmp ecx, edx
jl SHORT G_M37083_IG04
G_M37083_IG03:
cmp ecx, edx
jg SHORT G_M37083_IG06
G_M37083_IG04:
mov eax, 1
G_M37083_IG05:
ret
G_M37083_IG06:
xor eax, eax
G_M37083_IG07:
ret
; Total bytes of code: 17 |
cc @AndyAyersMS to look into it. |
Have the recognition part prototyped. Need to gather some stats on how often we see this flow pattern, but here's a peek at what it can do so far on the example above:
The next step would be to add boolean simplification logic to VN, to see if Already we can see that VN doesn't know how to do much here, eg If this pattern matches often enough, and if we can do the simplifications above, and then safely reason about side effects, we can optimize. |
ASPNET screening finds about 1400 cases. My shape matching is too permissive, but this is somewhat encouraging. Here's an organic example from Here BB14 is the dominating block, BB17 is a dominated block with a relop with the same operands, BB22 is block C and BB13 is block D. BB19 is empty, and BB18 has a neglectable side effect (dead local assign). So it appears we could alter BB14 to branch directly to BB13 if So we have failed to prove that BB22 and BB13 collectively postdominate BB14, and, if we're going to alter the route by which control reaches from A=BB14 to C=BB22 we need to have the same net side effect. |
I taught VN a bit about boolean logic simplification and some relop subsumption rules. Now it can reduce the
From here it should be straight forward to modify the flow and the relop and implement the optimization. To make this sound I still need to check for (lack of) side effects in the various paths. And there is one additional wrinkle now in the |
Recognize a particular case where a control flow pattern involving two BBJ_COND blocks with relops that test the same VNs can be simplified to a single relop in the dominating block. As part of this, teach VN about some of the rudiments of boolean simplification (DeMorgan's laws) and how to simplify some NOT / AND / OR expressions involving relops. Addresses some of the cases in dotnet#81220.
Been prototyping this too, on top of the changes in #83859. It seems to be able to simplify, but so far I'm not seeing a lot of cases like the one above (none in asp.net, 5 in libraries.pmi). Am double-checking my screening to see if it is flawed. On the
The next step from here is to figure out how to leverage this information. We don't keep inverse mappings from VNs to representative trees, and even if we did the tree values might be position dependent, so this may be tricky. One idea is to see if the dominating block's relop has operands with the required VNs. If so we could copy them into new temps and then use those temps down in BB06. Or alternatively, see if those operands might still have the same value if evaluated in BB06. Either would work in this case. If we did, that then the stores to V05 would all become dead and hopefully we could clean up all intermediate flow and just have a single compare. But I don't have enough examples yet where we find these sorts of cases to figure out how effective this kind of thing would be. |
I should add this sort of optimization could apply to more relops, it is a generalization of the phi-based jump threading to cover relops whose values aren't directly feeding into some GT_JTRUE. The prototype shares a lot of the same analysis so if the screening is indeed hitting some artificial limitation, fixing that may improve jump threading oo. Basically, given a relop VN in some tree in some block, we are trying to see if the value of the VN is entirely predictable if we know which predecessor transferred control to the block. So we check the relop operand VNs to see if any of them are local PHI defs, if so we find the appropriate phi input VNs (matching with preds), substitute those into the relop VN, and see if we can simplify to a constant (0 or 1 here, since it's a relop). This either gives us 0, 1, or "can't tell" -- so by doing this we sort the preds into 3 categories: true preds, false preds, and ambiguous preds. Phi-based jump threading then uses this info to rewire flow. But more generally we can use this info to either simplify the relop or (perhaps) decide to likewise duplicate this block to create contexts where the relop value is known. |
Also (note to self) we might want to put VNs onto flow edges indicating the conditions under which control reaches that edge, that would make this sort of path-based analysis simpler. This should be fairly cheap. Perhaps assertion prop could leverage this too, somehow? Going further we could update the VN phi defs to capture these conditions and the rest of VN to know how to simplify over these predicated value sets -- this pushes forwards the information that I am recovering backwards above. Probably not so cheap though. |
In libraries PMI I see about 177 cases where the return VN contains a PhiDef and we can't resolve it. In many cases the PhiDef is non local and these are perhaps not interesting for jump threading but may still be viable for general relop rewriting? This would go something like the following (assume just one phi def for now):
If there are multiple PHI defs and they don't all have the same def block then things are more complicated. In some other cases the PhiDef is local (phi def block is block) but the PhiDef is buried deeper in the func graph. Currently we only look one level down, eg there are 100 or so cases like this one:
These are cases that might help phi-based RBO too. But it is clunky to do substitutions in a multi-level VN tree. Would be nice to have a general substitution utility I suppose. This also starts to tread on if conversion territory -- if "materialize it somehow" was creation of a GT_SELECT tree and we relied on other opts to clean all this up. In the above we are effectively forward subbing that tree into the consumer (and insisting there be a local consumer) and only doing this for booleans, and only doing this when we know it will clean up. |
Recognize a particular case where a control flow pattern involving two BBJ_COND blocks with relops that test the same VNs can be simplified to a single relop in the dominating block. As part of this, teach VN about some of the rudiments of boolean simplification (DeMorgan's laws) and how to simplify some NOT / AND / OR expressions involving relops. Addresses some of the cases in dotnet#81220.
Consider these three variations:
M1 and M2 (with T==int) both produce the nice:
but M3 produces:
SharpLab
It'd be helpful if the JIT were able to unravel such an
IComparer<T>.Default.Compare(...)
followed by comparison to 0 into the equivalent of the direct operator usage.The text was updated successfully, but these errors were encountered: