[RyuJit] Declaring many vars causes stackspace spilling #7281

Closed
benaadams opened this issue Jan 23, 2017 · 10 comments
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions JitUntriaged CLR JIT issues needing additional triage optimization tenet-performance Performance related issue

@benaadams
Member

Optimized code still reserves stack space sized for the pre-optimization locals

e.g. The following code reserves 136 bytes of stack and zeros 96 bytes of it (24 dwords via rep stosd, which it does use), even though I'd expect its output to be identical to the output from https://github.com/dotnet/coreclr/issues/9066

; Lcl frame size = 136

G_M2227_IG01:
       57                   push     rdi
       56                   push     rsi
       4881EC88000000       sub      rsp, 136          // 136 bytes reserved
       488BF1               mov      rsi, rcx
       488D7C2428           lea      rdi, [rsp+28H]
       B918000000           mov      ecx, 24           // 24 dwords (96 bytes) zero'd
       33C0                 xor      rax, rax
       F3AB                 rep stosd 
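
The prolog arithmetic can be checked by hand. The sketch below is only a calculator for the numbers in the listing above; the class and method names are made up for illustration. It relies on the fact that rep stosd stores ECX doublewords (4 bytes each) starting at the address in RDI.

```csharp
using System;

public static class PrologMath
{
    // rep stosd stores ECX doublewords (4 bytes each) starting at [RDI].
    public static int ZeroedBytes(int dwordCount) => dwordCount * sizeof(uint);

    // End offset (relative to rsp) of the zeroed region.
    public static int ZeroedEnd(int startOffset, int dwordCount) =>
        startOffset + ZeroedBytes(dwordCount);
}
```

With the values from the prolog (start 0x28 from the lea, count 24 from the mov), ZeroedBytes(24) is 96 and ZeroedEnd(0x28, 24) is 136, so the zeroed region runs to the end of the 136-byte frame. That matches the four 24-byte slots at rsp+28H, rsp+40H, rsp+58H and rsp+70H that the full listing stores to.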

Code

[MethodImpl(MethodImplOptions.NoInlining)]
private static byte LotsOfSpans(byte[] array)
{
    var span00 = new Span<byte>(array);
    var span01 = span00.Slice(1);
    var span02 = span01.Slice(1);
    var span03 = span02.Slice(1);
    var span04 = span03.Slice(1);
    var span05 = span04.Slice(1);
    var span06 = span05.Slice(1);
    var span07 = span06.Slice(1);
    var span08 = span07.Slice(1);
    var span09 = span08.Slice(1);
    var span10 = span09.Slice(1);
    var span11 = span10.Slice(1);
    var span12 = span11.Slice(1);
    var span13 = span12.Slice(1);
    var span14 = span13.Slice(1);
    var span15 = span14.Slice(1);
    var span16 = span15.Slice(1);
    var span17 = span16.Slice(1);
    var span18 = span17.Slice(1);
    var span19 = span18.Slice(1);
    var span20 = span19.Slice(1);
    var span21 = span20.Slice(1);
    var span22 = span21.Slice(1);
    var span23 = span22.Slice(1);
    var span24 = span23.Slice(1);
    var span25 = span24.Slice(1);
    var span26 = span25.Slice(1);
    var span27 = span26.Slice(1);
    var span28 = span27.Slice(1);
    var span29 = span28.Slice(1);
    var span30 = span29.Slice(1);
    return span30[0];
}

Asm produced

; Lcl frame size = 136

G_M2227_IG01:
       57                   push     rdi
       56                   push     rsi
       4881EC88000000       sub      rsp, 136
       488BF1               mov      rsi, rcx
       488D7C2428           lea      rdi, [rsp+28H]
       B918000000           mov      ecx, 24
       33C0                 xor      rax, rax
       F3AB                 rep stosd 
       488BCE               mov      rcx, rsi

G_M2227_IG02:
       4885C9               test     rcx, rcx
       0F844D030000         je       G_M2227_IG45

G_M2227_IG03:
       8B7108               mov      esi, dword ptr [rcx+8]
       488BF9               mov      rdi, rcx
       48B9D07C960FF87F0000 mov      rcx, 0x7FF80F967CD0
       33D2                 xor      edx, edx
       E8B397AD5F           call     CORINFO_HELP_CLASSINIT_SHARED_DYNAMICCLASS
       488B052485EAFF       mov      rax, qword ptr [reloc classVar[0xf9693a8]]
       83FE01               cmp      esi, 1
       0F822D030000         jb       G_M2227_IG46

G_M2227_IG04:
       48FFC0               inc      rax
       8D4EFF               lea      ecx, [rsi-1]
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F8222030000         jb       G_M2227_IG47

G_M2227_IG05:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F8218030000         jb       G_M2227_IG48

G_M2227_IG06:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F820E030000         jb       G_M2227_IG49

G_M2227_IG07:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F8204030000         jb       G_M2227_IG50

G_M2227_IG08:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F82FA020000         jb       G_M2227_IG51

G_M2227_IG09:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F82F0020000         jb       G_M2227_IG52

G_M2227_IG10:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F82E6020000         jb       G_M2227_IG53

G_M2227_IG11:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F82DC020000         jb       G_M2227_IG54

G_M2227_IG12:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F82D2020000         jb       G_M2227_IG55

G_M2227_IG13:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F82C8020000         jb       G_M2227_IG56

G_M2227_IG14:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F82BE020000         jb       G_M2227_IG57

G_M2227_IG15:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F82B4020000         jb       G_M2227_IG58

G_M2227_IG16:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F82AA020000         jb       G_M2227_IG59

G_M2227_IG17:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F82A0020000         jb       G_M2227_IG60

G_M2227_IG18:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F8296020000         jb       G_M2227_IG61

G_M2227_IG19:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F828C020000         jb       G_M2227_IG62

G_M2227_IG20:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F8282020000         jb       G_M2227_IG63

G_M2227_IG21:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F8278020000         jb       G_M2227_IG64

G_M2227_IG22:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F826E020000         jb       G_M2227_IG65

G_M2227_IG23:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F8264020000         jb       G_M2227_IG66

G_M2227_IG24:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F825A020000         jb       G_M2227_IG67

G_M2227_IG25:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F8250020000         jb       G_M2227_IG68

G_M2227_IG26:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F8246020000         jb       G_M2227_IG69

G_M2227_IG27:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F823C020000         jb       G_M2227_IG70

G_M2227_IG28:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       488BFA               mov      rdi, rdx
       83F901               cmp      ecx, 1
       0F8232020000         jb       G_M2227_IG71

G_M2227_IG29:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       488BD7               mov      rdx, rdi
       83F901               cmp      ecx, 1
       0F822B020000         jb       G_M2227_IG72

G_M2227_IG30:
       48FFC0               inc      rax
       FFC9                 dec      ecx
       4533C0               xor      r8, r8
       4C8D4C2470           lea      r9, bword ptr [rsp+70H]

G_M2227_IG31:
       660F57C0             xorpd    xmm0, xmm0
       F3410F7F01           movdqu   qword ptr [r9], xmm0
       4D894110             mov      qword ptr [r9+16], r8

G_M2227_IG32:
       898C2480000000       mov      dword ptr [rsp+80H], ecx
       4889542470           mov      gword ptr [rsp+70H], rdx
       4889442478           mov      qword ptr [rsp+78H], rax
       488D442470           lea      rax, bword ptr [rsp+70H]
       488B08               mov      rcx, gword ptr [rax]
       488B5008             mov      rdx, qword ptr [rax+8]
       8B4010               mov      eax, dword ptr [rax+16]
       83F801               cmp      eax, 1
       0F82F2010000         jb       G_M2227_IG73

G_M2227_IG33:
       48FFC2               inc      rdx
       FFC8                 dec      eax
       4533C0               xor      r8, r8
       4C8D4C2458           lea      r9, bword ptr [rsp+58H]

G_M2227_IG34:
       660F57C0             xorpd    xmm0, xmm0
       F3410F7F01           movdqu   qword ptr [r9], xmm0
       4D894110             mov      qword ptr [r9+16], r8

G_M2227_IG35:
       89442468             mov      dword ptr [rsp+68H], eax
       48894C2458           mov      gword ptr [rsp+58H], rcx
       4889542460           mov      qword ptr [rsp+60H], rdx
       488D442458           lea      rax, bword ptr [rsp+58H]
       488B08               mov      rcx, gword ptr [rax]
       488B5008             mov      rdx, qword ptr [rax+8]
       8B4010               mov      eax, dword ptr [rax+16]
       83F801               cmp      eax, 1
       0F82BC010000         jb       G_M2227_IG74

G_M2227_IG36:
       48FFC2               inc      rdx
       FFC8                 dec      eax
       4533C0               xor      r8, r8
       4C8D4C2440           lea      r9, bword ptr [rsp+40H]

G_M2227_IG37:
       660F57C0             xorpd    xmm0, xmm0
       F3410F7F01           movdqu   qword ptr [r9], xmm0
       4D894110             mov      qword ptr [r9+16], r8

G_M2227_IG38:
       89442450             mov      dword ptr [rsp+50H], eax
       48894C2440           mov      gword ptr [rsp+40H], rcx
       4889542448           mov      qword ptr [rsp+48H], rdx
       488D442440           lea      rax, bword ptr [rsp+40H]
       488B08               mov      rcx, gword ptr [rax]
       488B5008             mov      rdx, qword ptr [rax+8]
       8B4010               mov      eax, dword ptr [rax+16]
       83F801               cmp      eax, 1
       0F8286010000         jb       G_M2227_IG75

G_M2227_IG39:
       48FFC2               inc      rdx
       FFC8                 dec      eax
       4533C0               xor      r8, r8
       4C8D4C2428           lea      r9, bword ptr [rsp+28H]

G_M2227_IG40:
       660F57C0             xorpd    xmm0, xmm0
       F3410F7F01           movdqu   qword ptr [r9], xmm0
       4D894110             mov      qword ptr [r9+16], r8

G_M2227_IG41:
       89442438             mov      dword ptr [rsp+38H], eax
       48894C2428           mov      gword ptr [rsp+28H], rcx
       4889542430           mov      qword ptr [rsp+30H], rdx
       488D442428           lea      rax, bword ptr [rsp+28H]
       488B08               mov      rcx, gword ptr [rax]
       488B5008             mov      rdx, qword ptr [rax+8]
       8B4010               mov      eax, dword ptr [rax+16]
       85C0                 test     eax, eax
       0F8651010000         jbe      G_M2227_IG76

G_M2227_IG42:
       4885C9               test     rcx, rcx
       7505                 jne      SHORT G_M2227_IG43
       0FB602               movzx    rax, byte  ptr [rdx]
       EB0A                 jmp      SHORT G_M2227_IG44

G_M2227_IG43:
       488D4108             lea      rax, bword ptr [rcx+8]
       4803C2               add      rax, rdx
       0FB600               movzx    rax, byte  ptr [rax]

G_M2227_IG44:
       4881C488000000       add      rsp, 136
       5E                   pop      rsi
       5F                   pop      rdi
       C3                   ret      

G_M2227_IG45:
       33C9                 xor      ecx, ecx
       E876F8FFFF           call     ThrowHelper:ThrowArgumentNullException(int)

G_M2227_IG46:
       B902000000           mov      ecx, 2
       E8BCF8FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG47:
       B902000000           mov      ecx, 2
       E8B2F8FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG48:
       B902000000           mov      ecx, 2
       E8A8F8FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG49:
       B902000000           mov      ecx, 2
       E89EF8FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG50:
       B902000000           mov      ecx, 2
       E894F8FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG51:
       B902000000           mov      ecx, 2
       E88AF8FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG52:
       B902000000           mov      ecx, 2
       E880F8FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG53:
       B902000000           mov      ecx, 2
       E876F8FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG54:
       B902000000           mov      ecx, 2
       E86CF8FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG55:
       B902000000           mov      ecx, 2
       E862F8FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG56:
       B902000000           mov      ecx, 2
       E858F8FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG57:
       B902000000           mov      ecx, 2
       E84EF8FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG58:
       B902000000           mov      ecx, 2
       E844F8FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG59:
       B902000000           mov      ecx, 2
       E83AF8FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG60:
       B902000000           mov      ecx, 2
       E830F8FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG61:
       B902000000           mov      ecx, 2
       E826F8FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG62:
       B902000000           mov      ecx, 2
       E81CF8FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG63:
       B902000000           mov      ecx, 2
       E812F8FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG64:
       B902000000           mov      ecx, 2
       E808F8FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG65:
       B902000000           mov      ecx, 2
       E8FEF7FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG66:
       B902000000           mov      ecx, 2
       E8F4F7FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG67:
       B902000000           mov      ecx, 2
       E8EAF7FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG68:
       B902000000           mov      ecx, 2
       E8E0F7FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG69:
       B902000000           mov      ecx, 2
       E8D6F7FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG70:
       B902000000           mov      ecx, 2
       E8CCF7FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG71:
       B902000000           mov      ecx, 2
       E8C2F7FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG72:
       B902000000           mov      ecx, 2
       E8B8F7FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG73:
       B902000000           mov      ecx, 2
       E8AEF7FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG74:
       B902000000           mov      ecx, 2
       E8A4F7FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG75:
       B902000000           mov      ecx, 2
       E89AF7FFFF           call     ThrowHelper:ThrowArgumentOutOfRangeException(int)

G_M2227_IG76:
       E885F7FFFF           call     ThrowHelper:ThrowIndexOutOfRangeException()
       CC                   int3     

; Total bytes of code 1196, prolog size 29 for method Program:LotsOfSpans(ref):ubyte

category:cq
theme:ref-counts
skill-level:expert
cost:medium

@benaadams benaadams changed the title [RyuJit] Declaring vars causes stackspace spilling [RyuJit] Declaring many vars causes stackspace spilling Jan 24, 2017
@gkhanna79
Member

CC @RussKeldorph

@RussKeldorph
Contributor

@dotnet/jit-contrib

@AndyAyersMS
Member

Now seeing similar stack size to #7279's example:

; Assembly listing for method T:LotsOfSpans(ref):ubyte
; Lcl frame size = 40

G_M3549_IG01:
       4883EC28             sub      rsp, 40

@benaadams
Member Author

It still happens when using a three-field "SpanLike" struct (StackSpace.cs), though interestingly dotnet/coreclr#9068 and #7279 have switched output (StackSpace.asm)
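
StackSpace.cs is not reproduced in this thread, so the following is only a guess at what a three-field "SpanLike" struct could look like. The field names, types and members are all assumptions; the layout (object reference at +0, pointer-sized value at +8, 32-bit length at +16) is chosen to mirror the gword/qword/dword stores in the first listing.

```csharp
using System;

// Hypothetical stand-in for the "SpanLike" type; not the code from StackSpace.cs.
public readonly struct SpanLike
{
    private readonly byte[] _array;   // object reference
    private readonly IntPtr _offset;  // pointer-sized offset into the array
    private readonly int _length;     // number of addressable bytes

    public SpanLike(byte[] array)
    {
        if (array == null) throw new ArgumentNullException(nameof(array));
        _array = array;
        _offset = IntPtr.Zero;
        _length = array.Length;
    }

    private SpanLike(byte[] array, IntPtr offset, int length)
    {
        _array = array;
        _offset = offset;
        _length = length;
    }

    // Returns a view that starts 'start' bytes further in.
    public SpanLike Slice(int start)
    {
        if ((uint)start > (uint)_length)
            throw new ArgumentOutOfRangeException(nameof(start));
        return new SpanLike(_array, _offset + start, _length - start);
    }

    public byte this[int index]
    {
        get
        {
            if ((uint)index >= (uint)_length)
                throw new IndexOutOfRangeException();
            return _array[(int)_offset + index];
        }
    }
}
```

For example, new SpanLike(new byte[] { 10, 20, 30 }).Slice(1).Slice(1)[0] reads the last element, 30. Each Slice call creates a new struct value, and the issue is about the JIT keeping stack slots for those values.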

@benaadams
Member Author

benaadams commented Oct 31, 2017

This sample now reserves and zeros a large amount of stack space it doesn't use

byte LotsOfSpans(byte[] array) (in StackSpace.cs/StackSpace.asm)

; Lcl frame size = 984

G_M3116_IG01:
       57                   push     rdi
       56                   push     rsi
       4881ECD8030000       sub      rsp, 984
       C5F877               vzeroupper 
       488BF1               mov      rsi, rcx
       488DBC24F0000000     lea      rdi, [rsp+F0H]
       B91E000000           mov      ecx, 30
       33C0                 xor      rax, rax
       F3AB                 rep stosd 
       488BCE               mov      rcx, rsi

G_M3116_IG02:
       4885C9               test     rcx, rcx
       0F848D020000         je       G_M3116_IG46

G_M3116_IG03:
       8B4108               mov      eax, dword ptr [rcx+8]
       83F801               cmp      eax, 1
       0F8288020000         jb       G_M3116_IG47

G_M3116_IG04:
       FFC8                 dec      eax
       83F801               cmp      eax, 1
       0F8287020000         jb       G_M3116_IG48

@AndyAyersMS
Member

In this version the excess stack looks like it is caused by local ref counts being too high. E.g., V37 has a final ref count of 1 but is actually unreferenced. Let me see if I can spot where the ref counts go wrong.

@AndyAyersMS
Member

The culprit seems to be fgMorphBlockStmt, which unconditionally increments ref counts.

This method potentially modifies a statement's expression in unspecified ways, so the safe way to handle ref count updates is:

  • decrement ref counts in the input tree
  • output tree = fgMorph(input tree)
  • increment ref counts in the output tree
    (assuming no expression morphing messes with counts).
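
The three steps above can be sketched with a toy model. Nothing here is JIT code: Node, Adjust and Morph are invented for illustration, and Morph is a stand-in that simply drops every child but the first, to mimic a transformation that removes a reference to a local.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: these types are illustrative and are not JIT data structures.
public sealed class Node
{
    public int Local = -1;                          // local number read here, or -1
    public List<Node> Children = new List<Node>();
}

public static class RefCountSketch
{
    // Adds delta to the count of every local referenced in the tree.
    public static void Adjust(Node tree, int delta, Dictionary<int, int> counts)
    {
        if (tree.Local >= 0)
        {
            counts.TryGetValue(tree.Local, out int current);
            counts[tree.Local] = current + delta;
        }
        foreach (Node child in tree.Children) Adjust(child, delta, counts);
    }

    // Stand-in for morphing: keeps only the first child and drops the rest.
    public static Node Morph(Node tree)
    {
        var result = new Node { Local = tree.Local };
        if (tree.Children.Count > 0) result.Children.Add(tree.Children[0]);
        return result;
    }

    // The decrement / morph / increment sequence described above.
    public static Node MorphAndMaintain(Node input, Dictionary<int, int> counts)
    {
        Adjust(input, -1, counts);    // decrement ref counts in the input tree
        Node output = Morph(input);   // output tree = morph(input tree)
        Adjust(output, +1, counts);   // increment ref counts in the output tree
        return output;
    }

    // For contrast: incrementing without first removing the old counts.
    public static Node MorphAndIncrement(Node input, Dictionary<int, int> counts)
    {
        Node output = Morph(input);
        Adjust(output, +1, counts);
        return output;
    }
}
```

Take a tree that reads locals 1 and 37, with both counts starting at 1. MorphAndMaintain leaves local 1 at 1 and local 37 at 0, while MorphAndIncrement leaves local 1 at 2 and local 37 at 1 even though the morphed tree no longer references it. In this toy, that is the kind of inflated count that would keep an unused local alive.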

It is likely that the common case is that the tree is unchanged by morphing and this ref count work ends up just burning cycles. Perhaps more motivation for the ideas in dotnet/coreclr#13280.

Doing this reduces the stack in the new case to 104 bytes, but it also triggers asserts, as the inflated ref counts may be masking maintenance issues elsewhere.

@AndyAyersMS
Member

There was at least one example where morph directly modifies ref counts (in the GT_COMMA case in fgMorphSmpOp).

"Fixing" this and updating fgMorphBlockStmt as above leads to the following diffs in corelib (more methods are affected than shown below, since many diffs are size-preserving, which jit-diffs doesn't pick up):

Total bytes of diff: -6673 (-0.19 % of base)
    diff is an improvement.
Total byte diff includes 0 bytes from reconciling methods
        Base had    0 unique methods,        0 unique bytes
        Diff had    0 unique methods,        0 unique bytes
Top file improvements by size (bytes):
       -6673 : System.Private.CoreLib.dasm (-0.19 % of base)
1 total files with size differences (1 improved, 0 regressed), 0 unchanged.
Top method regessions by size (bytes):
          32 : System.Private.CoreLib.dasm - System.Collections.Generic.Dictionary`2[KeyValuePair`2,__Canon][System.Collections.Generic.KeyValuePair`2[System.__Canon,System.__Canon],System.__Canon]:Remove(struct):bool:this (1 methods)
          24 : System.Private.CoreLib.dasm - System.Collections.Concurrent.ConcurrentQueue`1[__Canon][System.__Canon]:GetCount(ref,int,ref,int):long (1 methods)
          23 : System.Private.CoreLib.dasm - System.String:JoinCore(long,int,ref,int,int):ref (1 methods)
          18 : System.Private.CoreLib.dasm - System.Collections.Generic.Dictionary`2[__Canon,Guid][System.__Canon,System.Guid]:Remove(ref,byref):bool:this (1 methods)
          18 : System.Private.CoreLib.dasm - System.Collections.Generic.Dictionary`2[__Canon,Boolean][System.__Canon,System.Boolean]:Remove(ref,byref):bool:this (1 methods)
Top method improvements by size (bytes):
        -543 : System.Private.CoreLib.dasm - DomainNeutralILStubClass:IL_STUB_CLRtoWinRT():ref:this (68 methods)
        -180 : System.Private.CoreLib.dasm - System.Globalization.TimeSpanParse:ProcessTerminal_HMS_F_D(byref,ubyte,byref):bool (1 methods)
        -180 : System.Private.CoreLib.dasm - System.Globalization.TimeSpanParse:ProcessTerminal_HM_S_D(byref,ubyte,byref):bool (1 methods)
        -102 : System.Private.CoreLib.dasm - System.ValueTuple`8[__Canon,__Canon,__Canon,__Canon,__Canon,__Canon,__Canon,ValueTuple`7][System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.ValueTuple`7[System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon]]:System.IValueTupleInternal.ToStringEnd():ref:this (1 methods)
        -102 : System.Private.CoreLib.dasm - System.ValueTuple`8[__Canon,__Canon,__Canon,__Canon,__Canon,__Canon,__Canon,ValueTuple`7][System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.ValueTuple`7[System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon]]:ToString():ref:this (1 methods)

379 total methods with size differences (293 improved, 86 regressed), 24503 unchanged.

By and large the big size wins are from smaller local offsets as the stack shrinks down to the used portion. Size increases seem to mainly be from RA changes.
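
One plausible reading of "smaller local offsets" is the x64 displacement encoding: an offset that fits in a signed byte takes 1 byte in the instruction, anything larger takes 4. The sketch below (hypothetical helper names, covering only the lea reg, [rsp+disp] form) reproduces the sizes of two instructions quoted earlier in this thread: lea rdi, [rsp+28H] is encoded as 488D7C2428 (5 bytes) and lea rdi, [rsp+F0H] as 488DBC24F0000000 (8 bytes).

```csharp
using System;

public static class DispEncoding
{
    // Bytes needed for the displacement of an [rsp + disp] operand on x64.
    public static int DispBytes(int disp)
    {
        if (disp == 0) return 0;                                         // no displacement
        if (disp >= sbyte.MinValue && disp <= sbyte.MaxValue) return 1;  // disp8
        return 4;                                                        // disp32
    }

    // REX.W + opcode + ModRM + SIB (required for an rsp base) + displacement.
    public static int LeaFromRspBytes(int disp) => 4 + DispBytes(disp);
}
```

LeaFromRspBytes(0x28) is 5 and LeaFromRspBytes(0xF0) is 8, so every local access that moves back within 127 bytes of rsp saves 3 bytes.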

Adopting this for real implies a commitment to the current ref count maintenance strategy and will still likely be error prone, though perhaps if we can confine maintenance windows to a select set of higher-level APIs we could get by.

@AndyAyersMS
Member

Full set of jit-diff data from the prospective change. Something odd going on in the trace event code. Will take a closer look.

Total bytes of diff: -30115 (-0.13 % of base)
    diff is an improvement.
Total byte diff includes 0 bytes from reconciling methods
        Base had    0 unique methods,        0 unique bytes
        Diff had    0 unique methods,        0 unique bytes
Top file regressions by size (bytes):
          60 : System.Reflection.Metadata.dasm (0.08 % of base)
          49 : System.IO.FileSystem.Watcher.dasm (0.37 % of base)
          36 : System.Console.dasm (0.08 % of base)
          24 : System.Security.Cryptography.Csp.dasm (0.05 % of base)
          22 : System.Diagnostics.Process.dasm (0.04 % of base)
Top file improvements by size (bytes):
       -7892 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm (-0.62 % of base)
       -6673 : System.Private.CoreLib.dasm (-0.19 % of base)
       -2098 : Microsoft.CodeAnalysis.CSharp.dasm (-0.10 % of base)
       -1905 : System.Private.Xml.dasm (-0.07 % of base)
       -1887 : Microsoft.CodeAnalysis.VisualBasic.dasm (-0.08 % of base)
82 total files with size differences (68 improved, 14 regressed), 48 unchanged.
Top method regessions by size (bytes):
        1598 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - Microsoft.Diagnostics.Tracing.Parsers.KernelTraceEventParser:EnumerateTemplates(ref,ref):this (1 methods)
         454 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - Microsoft.Diagnostics.Tracing.Parsers.ClrPrivateTraceEventParser:EnumerateTemplates(ref,ref):this (1 methods)
         397 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - Microsoft.Diagnostics.Tracing.Parsers.JScriptTraceEventParser:EnumerateTemplates(ref,ref):this (1 methods)
         189 : Microsoft.CodeAnalysis.VisualBasic.dasm - Microsoft.CodeAnalysis.VisualBasic.Symbols.SourceMemberMethodSymbol:BindSingleHandlesClause(ref,ref,ref,ref,ref,ref,byref):ref:this (1 methods)
         115 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - Microsoft.Diagnostics.Tracing.Parsers.ClrTraceEventParser:EnumerateTemplates(ref,ref):this (1 methods)
Top method improvements by size (bytes):
        -543 : System.Private.CoreLib.dasm - DomainNeutralILStubClass:IL_STUB_CLRtoWinRT():ref:this (68 methods)
        -221 : System.Data.Common.dasm - System.Data.RBTree`1[__Canon][System.__Canon]:RBDeleteX(int,int,int):int:this (1 methods)
        -180 : System.Private.CoreLib.dasm - System.Globalization.TimeSpanParse:ProcessTerminal_HMS_F_D(byref,ubyte,byref):bool (1 methods)
        -180 : System.Private.CoreLib.dasm - System.Globalization.TimeSpanParse:ProcessTerminal_HM_S_D(byref,ubyte,byref):bool (1 methods)
        -177 : System.Threading.Tasks.Parallel.dasm - System.Threading.Tasks.Parallel:ForWorker(int,int,ref,ref,ref,ref,ref,ref):struct (1 methods)
2545 total methods with size differences (2118 improved, 427 regressed), 140438 unchanged.

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@BruceForstall BruceForstall added the JitUntriaged CLR JIT issues needing additional triage label Oct 28, 2020
@benaadams
Member Author

This is fixed

@ghost ghost locked as resolved and limited conversation to collaborators Dec 26, 2020