Delete GTF_BLK_VOLATILE and GTF_BLK_UNALIGNED #84217

Merged
merged 1 commit into from
Apr 3, 2023

SingleAccretion
Contributor

@SingleAccretion SingleAccretion commented Apr 1, 2023

These struct ASG-specific flags are a legacy leftover from the GT_COPYBLK days and are inconsistent with how non-struct stores are handled.

Some diffs are expected due to the better (more correct) handling of some volatile operations.
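
As a rough illustration of the consistency argument, here is a toy C++ model in which volatility and alignment are recorded on the destination indirection, so struct and non-struct stores are queried the same way. Every name here (`ToyStore`, `TOY_IND_VOLATILE`, `TOY_IND_UNALIGNED`, `IsVolatileStore`, `IsUnalignedStore`) is hypothetical and invented for this sketch; it is not the JIT's actual IR or flag set.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits for this sketch only; not the JIT's real values.
constexpr uint32_t TOY_IND_VOLATILE  = 1u << 0; // per-indirection "volatile" marker
constexpr uint32_t TOY_IND_UNALIGNED = 1u << 1; // per-indirection "unaligned" marker

enum class ToyType { Int, Struct };

// A store through an address. The flags live on the indirection itself,
// so there is no separate assignment-level flag for struct stores.
struct ToyStore
{
    ToyType  type;
    uint32_t indFlags;
};

// One query serves both struct and non-struct stores.
inline bool IsVolatileStore(const ToyStore& store)
{
    return (store.indFlags & TOY_IND_VOLATILE) != 0;
}

inline bool IsUnalignedStore(const ToyStore& store)
{
    return (store.indFlags & TOY_IND_UNALIGNED) != 0;
}

// Small usage example: the type of the store does not affect the queries.
inline void DemoStoreQueries()
{
    ToyStore intStore{ToyType::Int, TOY_IND_VOLATILE};
    ToyStore structStore{ToyType::Struct, TOY_IND_VOLATILE | TOY_IND_UNALIGNED};
    assert(IsVolatileStore(intStore) && IsVolatileStore(structStore));
    assert(!IsUnalignedStore(intStore) && IsUnalignedStore(structStore));
}
```

The point of the sketch is only that a single source of truth removes the chance of the two representations disagreeing.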

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Apr 1, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 1, 2023
@ghost

ghost commented Apr 1, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.


These struct ASG-specific flags are a legacy leftover from
GT_COPYBLK days, inconsistent with how non-struct stores are handled.
@SingleAccretion SingleAccretion marked this pull request as ready for review April 1, 2023 23:22
@SingleAccretion
Contributor Author

@dotnet/jit-contrib

@EgorBo
Member

EgorBo commented Apr 1, 2023

Any idea what happened with e.g. TestLclFldAddrIntrinsicsSSE41_BlendVariable:

@@ -7,33 +7,42 @@
 ; Final local variable assignments
 ;
 ;* V00 arg0         [V00    ] (  0,  0   )  double  ->  zero-ref    single-def
-;* V01 loc0         [V01,T00] (  0,  0   )  struct (32) zero-ref    do-not-enreg[SF] ld-addr-op
+;  V01 loc0         [V01    ] (  2,  2   )  struct (32) [rsp+08H]   do-not-enreg[XSF] must-init addr-exposed ld-addr-op
 ;* V02 loc1         [V02    ] (  0,  0   )   byref  ->  zero-ref   
 ;* V03 loc2         [V03    ] (  0,  0   )  simd16  ->  zero-ref   
 ;# V04 OutArgs      [V04    ] (  1,  1   )  struct ( 0) [rsp+00H]   do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
 ;
-; Lcl frame size = 0
+; Lcl frame size = 40
 
 G_M20604_IG01:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, nogc <-- Prolog IG
+       sub      rsp, 40
        vzeroupper 
-						;; size=3 bbWeight=1 PerfScore 1.00
+       xor      eax, eax
+       mov      qword ptr [rsp+08H], rax
+       vxorps   xmm4, xmm4
+       vmovdqa  xmmword ptr [rsp+10H], xmm4
+       mov      qword ptr [rsp+20H], rax
+						;; size=29 bbWeight=1 PerfScore 5.83
 G_M20604_IG02:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
        vmovups  xmm0, xmmword ptr [reloc @RWD00]
-       vmovups  xmm1, xmmword ptr [reloc @RWD16]
-       vpblendvb xmm0, xmm0, [reloc @RWD32], xmm1
+       vmovups  xmmword ptr [rsp+18H], xmm0
+       vmovups  xmm0, xmmword ptr [reloc @RWD16]
+       vmovups  xmm1, xmmword ptr [reloc @RWD32]
+       vpblendvb xmm0, xmm0, [rsp+18H], xmm1
        vmovd    eax, xmm0
        vxorps   xmm0, xmm0
        vcvtsi2sd  xmm0, eax
-						;; size=38 bbWeight=1 PerfScore 17.33
+						;; size=50 bbWeight=1 PerfScore 21.33
 G_M20604_IG03:        ; bbWeight=1, epilog, nogc, extend
+       add      rsp, 40
        ret      
-						;; size=1 bbWeight=1 PerfScore 1.00
-RWD00  	dq	0000000100000001h, 0000000100000001h
-RWD16  	dq	0000000300000003h, 0000000300000003h
-RWD32  	dq	0000000200000002h, 0000000200000002h
+						;; size=5 bbWeight=1 PerfScore 1.25
+RWD00  	dq	0000000200000002h, 0000000200000002h
+RWD16  	dq	0000000100000001h, 0000000100000001h
+RWD32  	dq	0000000300000003h, 0000000300000003h
 
 
-; Total bytes of code 42, prolog size 3, PerfScore 23.53, instruction count 8, allocated bytes for code 42 (MethodHash=ab2aaf83) for method Runtime_39424:TestLclFldAddrIntrinsicsSSE41_BlendVariable(double):double
+; Total bytes of code 84, prolog size 29, PerfScore 36.82, instruction count 17, allocated bytes for code 84 (MethodHash=ab2aaf83) for method Runtime_39424:TestLclFldAddrIntrinsicsSSE41_BlendVariable(double):double
 ; ============================================================

@SingleAccretion
Contributor Author

> Any idea what happened with e.g. TestLclFldAddrIntrinsicsSSE41_BlendVariable

Yes - all the diffs are from cases where we set the flags correctly but used not to. This particular one is from ldobj handling.
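
For readers less familiar with why honoring a volatile load keeps the local in memory, the following C++ analogy may help. It is not the IL test from the diff, and the names (`Pair`, `ReadPlain`, `ReadVolatile`) are made up for illustration. An ordinary read through a pointer to a local can be forwarded from the preceding store, letting the compiler drop the local entirely; a read through a volatile-qualified pointer is an access that compilers generally preserve, so the local stays on the stack and is actually loaded.

```cpp
#include <cassert>
#include <cstdint>

struct Pair
{
    int32_t a;
    int32_t b;
};

// Ordinary access: an optimizer is free to forward `v` to the return value
// and eliminate the local `p` altogether.
inline int32_t ReadPlain(int32_t v)
{
    Pair p{0, 0};
    p.b = v;
    int32_t* addr = &p.b;
    return *addr;
}

// Volatile access: the read through `addr` is treated as a side effect,
// so the local is typically kept in memory and really loaded.
inline int32_t ReadVolatile(int32_t v)
{
    Pair p{0, 0};
    p.b = v;
    volatile int32_t* addr = &p.b;
    return *addr;
}

// Both functions compute the same value; only the generated code differs.
inline void DemoReads()
{
    assert(ReadPlain(7) == ReadVolatile(7));
}
```

The observable result is identical in both cases, which matches the nature of the diff above: larger code with the same behavior.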

@EgorBo
Member

EgorBo commented Apr 2, 2023

> > Any idea what happened with e.g. TestLclFldAddrIntrinsicsSSE41_BlendVariable
>
> Yes - all the diffs are from cases where we set the flags correctly but used not to. This particular one is from ldobj handling.

Thanks, I didn't notice the volatile prefix in that IL test.

@EgorBo EgorBo merged commit 24fa97a into dotnet:main Apr 3, 2023
@SingleAccretion SingleAccretion deleted the No-Asg-Flags branch April 3, 2023 20:37
@ghost ghost locked as resolved and limited conversation to collaborators May 4, 2023