Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

JIT: don't overlook strong nearby preference in LSRA #16028

Merged
merged 1 commit into from
Jan 29, 2018

Conversation

AndyAyersMS
Copy link
Member

When allocating an interval, LSRA will look at related intervals
for preference hints.

LSRA may look past a related interval I1 to its related interval
I2 if I1 looks like a simple copy. But if I1 has been previously
allocated then it seems we are often better served using I1's
preferences instead of using I2's.

Addresses some issues seen in #11390.

When allocating an interval, LSRA will look at related intervals
for preference hints.

LSRA may look past a related interval I1 to its related interval
I2 if I1 looks like a simple copy. But if I1 has been previously
allocated then it seems we are often better served using I1's
preferences instead of using I2's.

Addresses some issues seen in #11390.
@AndyAyersMS
Copy link
Member Author

@CarolEidt realize this is kind of from left field, but the diffs look surprisingly and consistently good.
cc @dotnet/jit-contrib

Total bytes of diff: -6203 (-0.03% of base)
    diff is an improvement.
Total byte diff includes 0 bytes from reconciling methods
        Base had    0 unique methods,        0 unique bytes
        Diff had    0 unique methods,        0 unique bytes
Top file regressions by size (bytes):
          17 : System.Linq.Expressions.dasm (0.00% of base)
           2 : NuGet.Protocol.Core.v3.dasm (0.00% of base)
           2 : System.Console.dasm (0.00% of base)
Top file improvements by size (bytes):
       -1566 : Microsoft.CodeAnalysis.CSharp.dasm (-0.07% of base)
       -1118 : Microsoft.CodeAnalysis.VisualBasic.dasm (-0.05% of base)
        -770 : System.Private.Xml.dasm (-0.03% of base)
        -766 : System.Private.CoreLib.dasm (-0.02% of base)
        -254 : System.Private.DataContractSerialization.dasm (-0.04% of base)
62 total files with size differences (59 improved, 3 regressed), 68 unchanged.
Top method regessions by size (bytes):
          57 : System.Linq.Expressions.dasm - System.Linq.Expressions.Interpreter.NotEqualInstruction:Create(ref,bool):ref
          19 : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.SyntaxTreeSemanticModel:GetDeclarationName(ref):ref:this
          11 : System.Private.Xml.dasm - System.Xml.Serialization.XmlReflectionImporter:ImportAccessorMapping(ref,ref,ref,ref,ref,bool,bool,ref):this
           9 : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.LocalRewriter:VisitCall(ref):ref:this
           9 : System.Private.CoreLib.dasm - System.RuntimeType:GetPropertyImpl(ref,int,ref,ref,ref,ref):ref:this
Top method improvements by size (bytes):
         -50 : Microsoft.CodeAnalysis.dasm - Microsoft.CodeAnalysis.ConstantValue:Create(ref,ubyte):ref
         -40 : Microsoft.CodeAnalysis.VisualBasic.dasm - Microsoft.CodeAnalysis.VisualBasic.Syntax.InternalSyntax.Scanner:ScanXmlString(ushort,ushort,bool):ref:this
         -34 : System.Private.CoreLib.dasm - System.IO.FileStream:ReadAsyncInternal(struct,struct,byref):ref:this
         -30 : Microsoft.CodeAnalysis.VisualBasic.dasm - Microsoft.CodeAnalysis.VisualBasic.Syntax.InternalSyntax.Scanner:ScanXmlContentInXmlDoc():ref:this
         -30 : System.Private.DataContractSerialization.dasm - System.Xml.XmlBinaryReader:ReadArray(ref,ref,ref,int,int):int:this (20 methods)
1515 total methods with size differences (1482 improved, 33 regressed), 153461 unchanged.

There are some notes over in #11390 if you wondered how I ended up with this.

Happy to post good and bad examples if it helps.

@mikedn
Copy link

mikedn commented Jan 26, 2018

Doesn't this also help with issues like #15782? It would be better if copies are eliminated earlier but after playing a bit with copy propagation I'm skeptical that the existing copy propagation phase can be significantly improved in this regard. Its reliance on heuristics is a bit of a problem.

@AndyAyersMS
Copy link
Member Author

Sadly, no.

I'll have to trace through what LSRA is thinking in that example and figure out why it's different.

@AndyAyersMS
Copy link
Member Author

For the example in #15782 -- don't fully understand it yet but it's a result of a different part of the heuristic. After the lea gets rax the allocator sees both rax and r8 as good candidates for the next ref position, but prefers r8 as it wins out via the "killed sooner" clause. If I disable this clause then we get

G_M5638_IG02:
       4885D2               test     rdx, rdx
       7507                 jne      SHORT G_M5638_IG03
       33C0                 xor      rax, rax
       4533C0               xor      r8d, r8d
       EB08                 jmp      SHORT G_M5638_IG04

G_M5638_IG03:
       488D4210             lea      rax, bword ptr [rdx+16]
       448B4208             mov      r8d, dword ptr [rdx+8]

G_M5638_IG04:
       488901               mov      bword ptr [rcx], rax
       44894108             mov      dword ptr [rcx+8], r8d
       488BC1               mov      rax, rcx

G_M5638_IG05:
       C3                   ret

@AndyAyersMS
Copy link
Member Author

@dotnet-bot test CentOS7.1 x64 Release Innerloop Build and Test

@CarolEidt
Copy link

Happy to post good and bad examples if it helps.

This LGTM, but it would be good to see some of the good & bad examples.

@AndyAyersMS
Copy link
Member Author

Examples below. Note in some good cases (eg Retain) there are still back to back moves, so there are other parts of the heuristic that might need a bit of tweaking.

I didn't look at all the regressions but the last one (extra movs) seems pretty rare.

+; *** GOOD ***
; Microsoft.CodeAnalysis.BitVector:AllSet(int):struct
 
 G_M13204_IG03:
        lea      eax, [rsi-1]
-       mov      ebx, eax
-       sar      ebx, 5
-       mov      eax, ebx
+       sar      eax, 5
 G_M13204_IG04:
        mov      ebx, eax
        test     ebx, ebx


+; *** GOOD ***
; System.CurrentSystemTimeZone:.ctor():this

G_M13204_IG04:

        call     CachedData:CreateLocal():ref:this
        mov      rdi, rax
 G_M12143_IG03:
-       mov      rax, rdi
-       mov      rdi, rax
        mov      rdx, qword ptr [rdi+56]
 

+; *** GOOD ***
; System.Memory`1[__Canon][System.__Canon]:Retain(bool):struct:this

G_M12143_IG03:
        or       rax, 1
        mov      rcx, rax
        mov      rax, rcx
-       mov      rcx, rax
-       mov      rax, rcx
        lea      rcx, bword ptr [rsi+16]
        mov      edx, dword ptr [rdi+8]


+; *** GOOD ***
; Assembly listing for method System.Memory`1[__Canon][System.__Canon]:Equals(ref):bool:this
 
 G_M39589_IG15:
        lea      rax, bword ptr [rdi+8]
-       mov      rdx, gword ptr [rax]
-       mov      r14, rdx
-       mov      edx, dword ptr [rax+8]
-       mov      r15d, edx
-       mov      eax, dword ptr [rax+12]
-       mov      ecx, eax
+       mov      r14, gword ptr [rax]
+       mov      r15d, dword ptr [rax+8]
+       mov      ecx, dword ptr [rax+12]
 G_M39589_IG16:
-       mov      rdx, r14
-       mov      eax, ecx
        test     ebp, ebp
        je       SHORT G_M39589_IG19
-       cmp      gword ptr [rbx], rdx
+       cmp      gword ptr [rbx], r14
        jne      SHORT G_M39589_IG17
 
 
- ; *** BAD ***  Use of R14 incurs prolog/epilog cost
 ; Assembly listing for method System.RuntimeType:GetPropertyImpl(ref,int,ref,ref,ref,ref):ref:this
-; Lcl frame size = 72
+; Lcl frame size = 80
 G_M462_IG01:
+       push     r14
        push     rdi
        ...
 G_M462_IG06:
-       mov      rcx, gword ptr [rsp+30H]
+       mov      rcx, gword ptr [rsp+38H]
        cmp      dword ptr [rcx+8], 0
        jbe      G_M462_IG19
-       mov      rcx, gword ptr [rsp+30H]
-       mov      rdi, gword ptr [rcx+16]
-       mov      rcx, rdi
+       mov      rcx, gword ptr [rsp+38H]
+       mov      rcx, gword ptr [rcx+16]
 G_M462_IG07:
-       mov      rdi, rcx
+       mov      r14, rcx
        test     rbx, rbx
        je       SHORT G_M462_IG10
-       mov      rcx, rdi
-       mov      rax, qword ptr [rdi]
+       mov      rcx, r14
+       mov      rax, qword ptr [r14]
        ...
-
 G_M462_IG09:
-       add      rsp, 72
+       add      rsp, 80
        pop      rbx
        pop      rbp
        pop      rsi
        pop      rdi
-       ret 
+       pop      r14
+       ret
 
 
- ; *** BAD *** Extra movs
 ; Assembly listing for method System.Linq.Expressions.Interpreter.NotEqualInstruction:Create(ref,bool):ref
 
 G_M18433_IG05:
-       mov      rsi, rbp
+       mov      rax, rbp
+       mov      rsi, rax
        jmp      G_M18433_IG53
 G_M18433_IG06:
        call     [CORINFO_HELP_READYTORUN_STATIC_BASE]
        mov      rbx, rax
-       mov      rbp, gword ptr [rbx+0B88H]
-       test     rbp, rbp
+       mov      rsi, gword ptr [rbx+0B88H]
+       test     rsi, rsi
        jne      SHORT G_M18433_IG07
        call     [CORINFO_HELP_READYTORUN_NEW]
-       mov      rbp, rax
+       mov      rsi, rax
        lea      rcx, bword ptr [rbx+0B88H]
-       mov      rdx, rbp
+       mov      rdx, rsi
        call     [CORINFO_HELP_ASSIGN_REF]
 G_M18433_IG07:
-       mov      rsi, rbp
+       mov      rax, rsi
+       mov      rsi, rax
        jmp      G_M18433_IG53

@CarolEidt
Copy link

@AndyAyersMS - thanks for the examples. In the past I've mostly dismissed the ones that cost an extra callee-reg save and restore, as I think that is a semi-orthogonal issue (and, if the reduction in instrs is in a loop, could be a CQ win even if a size increase). For the other, that one example looks like it might be a resolution move, which may also be somewhat orthogonal.

The preferencing heuristics need some focused work, e.g. see #11463, but incremental improvement is great!

@AndyAyersMS
Copy link
Member Author

BTW it sure would be nice if jit-diffs could automatically produce markdown-friendly diff output.What I did above required a bunch of manual collation. Opened dotnet/jitutils#125 in case anybody wants a fun side project.

@AndyAyersMS AndyAyersMS merged commit c3dd261 into dotnet:master Jan 29, 2018
@AndyAyersMS AndyAyersMS deleted the LsraPuzzle branch January 29, 2018 20:36
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants