[JIT] Recognize when length is never negative for Span<T>::.ctor(void*, int32) #83248

Closed
xtqqczze opened this issue Mar 10, 2023 · 6 comments · Fixed by #83694

@xtqqczze
Contributor

xtqqczze commented Mar 10, 2023

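The JIT does not elide the negative-length check in Span<T>::.ctor(void*, int32) even when the length argument can never be negative. I noticed this with BitOperations.PopCount, a JIT intrinsic whose return value has been treated as never negative since #64951.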

unsafe Span<byte> M0(byte* p, byte b) => new(p, b);
unsafe Span<byte> M2(byte* p, int b) => new(p, (byte)b);
unsafe Span<byte> M3(byte* p, uint mask) => new(p, BitOperations.PopCount(mask));
unsafe Span<byte> M4(byte* p, uint mask) => new(p, (byte)BitOperations.PopCount(mask));
// crossgen2 8.0.0-preview.3.23159.99+1b2eb12a711410a358bdfa93af017a34a929d4cf

C:M0(ulong,ubyte):System.Span`1[ubyte]:this:
; Emitting BLENDED_CODE for X64 CPU with AVX - Unix
       push     rax
						;; size=1 bbWeight=1 PerfScore 1.00
       mov      rax, rsi
       movzx    rdx, dl
						;; size=6 bbWeight=1 PerfScore 0.50
       add      rsp, 8
       ret      
						;; size=5 bbWeight=1 PerfScore 1.25

C:M2(ulong,int):System.Span`1[ubyte]:this:
; Emitting BLENDED_CODE for X64 CPU with AVX - Unix
       push     rax
						;; size=1 bbWeight=1 PerfScore 1.00
       movzx    rdx, dl
       test     edx, edx
       jl       SHORT G_M62596_IG04
       mov      rax, rsi
						;; size=10 bbWeight=1 PerfScore 1.75
       add      rsp, 8
       ret      
						;; size=5 bbWeight=1 PerfScore 1.25
G_M62596_IG04:
       call     [System.ThrowHelper:ThrowArgumentOutOfRangeException()]
       int3     
						;; size=7 bbWeight=0 PerfScore 0.00

C:M3(ulong,uint):System.Span`1[ubyte]:this:
; Emitting BLENDED_CODE for X64 CPU with AVX - Unix
       push     rax
						;; size=1 bbWeight=1 PerfScore 1.00
       popcnt   edx, edx
       jl       SHORT G_M53218_IG04
       mov      rax, rsi
						;; size=9 bbWeight=1 PerfScore 3.25
       add      rsp, 8
       ret      
						;; size=5 bbWeight=1 PerfScore 1.25
G_M53218_IG04:
       call     [System.ThrowHelper:ThrowArgumentOutOfRangeException()]
       int3     
						;; size=7 bbWeight=0 PerfScore 0.00

C:M4(ulong,uint):System.Span`1[ubyte]:this:
; Emitting BLENDED_CODE for X64 CPU with AVX - Unix
       push     rax
						;; size=1 bbWeight=1 PerfScore 1.00
       popcnt   edx, edx
       jl       SHORT G_M40097_IG04
       mov      rax, rsi
						;; size=9 bbWeight=1 PerfScore 3.25
       add      rsp, 8
       ret      
						;; size=5 bbWeight=1 PerfScore 1.25
G_M40097_IG04:
       call     [System.ThrowHelper:ThrowArgumentOutOfRangeException()]
       int3     
						;; size=7 bbWeight=0 PerfScore 0.00
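For readers unfamiliar with the check being discussed: the `jl` / `ThrowArgumentOutOfRangeException` pair in the listings above corresponds to a `length < 0` guard on the constructor's length argument. The following is only a minimal sketch of why that guard is dead code for a `PopCount` result. `CheckedLength` is a hypothetical stand-in for the guard, not the runtime's actual source.

```csharp
using System;
using System.Numerics;

static class NeverNegativeDemo
{
    // Hypothetical stand-in for the negative-length guard; not the real Span<T> code.
    public static int CheckedLength(int length)
    {
        if (length < 0)
            throw new ArgumentOutOfRangeException(nameof(length));
        return length;
    }

    static void Main()
    {
        // PopCount of a 32-bit value counts set bits, so it is always in 0..32
        // and the guard above can never throw for these inputs.
        uint[] masks = { 0u, 1u, 0x80000000u, 0xFFFFFFFFu };
        foreach (uint mask in masks)
        {
            int count = BitOperations.PopCount(mask);
            Console.WriteLine($"{mask:X8} -> {CheckedLength(count)}");
        }
    }
}
```

Because the result of `PopCount` is bounded below by zero for every input, a compiler that tracks this fact can drop both the comparison and the throw block.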
@xtqqczze xtqqczze added the tenet-performance Performance related issue label Mar 10, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Mar 10, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 10, 2023
@ghost

ghost commented Mar 10, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

@xtqqczze
Contributor Author

xtqqczze commented Mar 10, 2023

Reproduces for XARCH and ARM64.

@JulieLeeMSFT JulieLeeMSFT removed the untriaged New issue has not been triaged by the area owner label Mar 14, 2023
@JulieLeeMSFT JulieLeeMSFT added this to the 8.0.0 milestone Mar 14, 2023
@JulieLeeMSFT
Member

@TIHan PTAL.

@xtqqczze xtqqczze changed the title from "JIT fails to recognize when length is never negative for Span<T>::.ctor(void*, int32)" to "[JIT] Recognize when length is never negative for Span<T>::.ctor(void*, int32)" Mar 14, 2023
@TIHan
Contributor

TIHan commented Mar 20, 2023

I did a quick investigation: IsNeverNegative does handle the PopCount intrinsics.

The problem may be in how value numbering interacts with assertion propagation or redundant branch optimization. I will look into it further.

@xtqqczze
Contributor Author

xtqqczze commented Apr 6, 2023

Confirmed fixed for BitOperations.PopCount on X64.

// crossgen2 8.0.0-preview.4.23206.99+18e2c5fd9e2239a8b06fe49dbb6492d40f5e5e19

C:M0(ulong,ubyte):System.Span`1[ubyte]:this:
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Unix
       push     rax
						;; size=1 bbWeight=1 PerfScore 1.00
       mov      rax, rsi
       movzx    rdx, dl
						;; size=6 bbWeight=1 PerfScore 0.50
       add      rsp, 8
       ret      
						;; size=5 bbWeight=1 PerfScore 1.25

C:M2(ulong,int):System.Span`1[ubyte]:this:
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Unix
       push     rax
						;; size=1 bbWeight=1 PerfScore 1.00
       movzx    rdx, dl
       test     edx, edx
       jl       SHORT G_M26260_IG04
       mov      rax, rsi
						;; size=10 bbWeight=1 PerfScore 1.75
       add      rsp, 8
       ret      
						;; size=5 bbWeight=1 PerfScore 1.25
G_M26260_IG04:
       call     [System.ThrowHelper:ThrowArgumentOutOfRangeException()]
       int3     
						;; size=7 bbWeight=0 PerfScore 0.00

C:M3(ulong,uint):System.Span`1[ubyte]:this:
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Unix
       push     rax
						;; size=1 bbWeight=1 PerfScore 1.00
       popcnt   edx, edx
       mov      rax, rsi
						;; size=7 bbWeight=1 PerfScore 2.25
       add      rsp, 8
       ret      
						;; size=5 bbWeight=1 PerfScore 1.25

C:M4(ulong,uint):System.Span`1[ubyte]:this:
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Unix
       push     rax
						;; size=1 bbWeight=1 PerfScore 1.00
       popcnt   edx, edx
       mov      rax, rsi
						;; size=7 bbWeight=1 PerfScore 2.25
       add      rsp, 8
       ret      
						;; size=5 bbWeight=1 PerfScore 1.25

C:.ctor():this:
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Unix
						;; size=0 bbWeight=1 PerfScore 0.00
       ret      
						;; size=1 bbWeight=1 PerfScore 1.00

https://csharp.godbolt.org/z/9Px9W3T5v

@xtqqczze
Contributor Author

xtqqczze commented Apr 6, 2023

However, there is still a range check for:

unsafe Span<byte> M2(byte* p, int b) => new(p, (byte)b);
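To illustrate why the remaining check in M2 is redundant, here is a small sketch showing that narrowing an `int` to `byte` keeps only the low 8 bits, so the value passed as the length is always in 0..255. `NarrowToByte` is a hypothetical helper introduced for this example, and the `unchecked` keyword is written out only to make the default conversion behaviour explicit.

```csharp
using System;

static class ByteCastDemo
{
    // Hypothetical helper mirroring the (byte)b narrowing in M2.
    // unchecked: the conversion truncates to the low 8 bits instead of throwing.
    public static int NarrowToByte(int b) => unchecked((byte)b);

    static void Main()
    {
        int[] samples = { int.MinValue, -1, 0, 255, 256, int.MaxValue };
        foreach (int b in samples)
        {
            int length = NarrowToByte(b);
            // The narrowed value always matches b & 0xFF, which is never negative.
            Console.WriteLine($"{b} -> {length} (same as b & 0xFF: {length == (b & 0xFF)})");
        }
    }
}
```

Every sample, including the negative ones, maps to a value between 0 and 255, which is the same range the `movzx` in the M2 listing produces before the `test` / `jl` pair.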

@ghost ghost locked as resolved and limited conversation to collaborators May 6, 2023