System.Numerics.Vector: Recognize division by a constant and inline #43761
Comments
I closed it in favor of yours since you noticed the problem 🙂
@AndyAyersMS side issue: any idea what can be done with
Vector<int> Test(Vector<int> v1, Vector<int> v2) => v1 / v2;
Codegen:
G_M26366_IG01:
push rdi
push rsi
sub rsp, 136
vzeroupper
mov rsi, rdx
G_M26366_IG02:
vmovupd ymm0, ymmword ptr[r8]
vmovupd ymmword ptr[rsp+40H], ymm0
vmovupd ymm0, ymmword ptr[r9]
vmovupd ymmword ptr[rsp+20H], ymm0
vxorps ymm0, ymm0
vmovupd ymmword ptr[rsp+60H], ymm0
xor rdi, rdi
G_M26366_IG03:
lea rcx, bword ptr [rsp+40H]
mov ecx, dword ptr [rcx+4*rdi]
lea rdx, bword ptr [rsp+20H]
mov edx, dword ptr [rdx+4*rdi]
call System.Numerics.Vector`1[Int32][System.Int32]:ScalarDivide(int,int):int ;; <====== not inlined
lea rdx, bword ptr [rsp+60H]
mov dword ptr [rdx+4*rdi], eax
inc rdi
cmp rdi, 8
jl SHORT G_M26366_IG03
G_M26366_IG04:
vmovupd ymm0, ymmword ptr[rsp+60H]
vmovupd ymmword ptr[rsi], ymm0
mov rax, rsi
G_M26366_IG05:
vzeroupper
add rsp, 136
pop rsi
pop rdi
ret
So the ScalarDivide call exceeds the time budget, but I'd love to inline it anyway to get a single division instruction.
This is the same issue as we see over in #41692, and I think the suggestion I made there should work.
Basically: for an aggressive inline, we check the budget incrementally as we import rather than doing the check up front, so if the method turns out to be small because of importer trimming, the inline will succeed. If the method does not end up small, we'll reject the inline and will have wasted some time and memory on the partial inlining attempt, but hopefully that's a fairly rare occurrence.
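To illustrate the difference between the two checks, here is a minimal C++ sketch. It is not the JIT's actual code: `Block`, `FitsBudgetUpFront`, `FitsBudgetIncremental`, and the cost numbers are all hypothetical, and "reachable" stands in for whatever survives importer trimming.

```cpp
#include <vector>

// Hypothetical model of an inlinee: one entry per IL block.
// `cost` is an estimated import cost; `reachable` is false for blocks the
// importer would trim (for example a type test that folds to false).
struct Block {
    int  cost;
    bool reachable;
};

// Up-front check: charges every block, including ones that would be trimmed.
bool FitsBudgetUpFront(const std::vector<Block>& blocks, int budget) {
    int total = 0;
    for (const Block& b : blocks) {
        total += b.cost;
    }
    return total <= budget;
}

// Incremental check: charges only the blocks that are actually imported and
// gives up as soon as the running cost goes over the budget.
bool FitsBudgetIncremental(const std::vector<Block>& blocks, int budget) {
    int spent = 0;
    for (const Block& b : blocks) {
        if (!b.reachable) {
            continue;  // trimmed during import, so it costs nothing
        }
        spent += b.cost;
        if (spent > budget) {
            return false;  // reject; the work done so far is wasted
        }
    }
    return true;
}
```

With ten blocks of cost 10, only one of them reachable, and a budget of 20, the up-front check rejects the method while the incremental check accepts it. If all ten blocks are reachable, the incremental check rejects it after importing three blocks.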
Marking as Future,
… opportunities
Look for IL patterns in an inlinee that are indicative of type folding. Also track when an inlinee has calls that are intrinsics. Use this to boost the inlinee profitability estimate.
Also, allow an aggressive inline candidate method to go over the budget when it contains type folding or intrinsics (and so the method may be simpler than it appears). Remove the old criterion (top-level inline), which was harder to reason about.
Closes dotnet#43761.
Closes dotnet#41692.
Closes dotnet#51587.
Closes dotnet#47434.
Sharplab
https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABABgAJiBGAOgDkBXfGKASzFwG4BYAKD+IBmcgwB2ubADMYlAEzkAwnwDefcuvKRxGcq1E6AYtjAZo5ALzlZpFAA5O5APSOr5AHrkqVNRq24dejoAygAWrJI6ll48vBrksNgAJhCiADYAnuQAajAm0AA8gQB82bmmUEZ5UBbkojAA7qVVhfpFABSV5QCUMXE+6kJN5S0YJQAirABurIkwbTnNxeSTXRYlk05D0J3QvRp8AL5AA
Given
Output
The assembly output for x64 is:
Expected
The division is inlined, division by a constant is recognized, and "optimal" SIMD code is emitted, e.g. using shifts or similar.
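As a rough idea of what "shifts or similar" could mean, here is a scalar C++ sketch of the usual multiply-and-shift replacement for division by a constant. It assumes unsigned 32-bit lanes and a divisor of 3, neither of which comes from this issue; signed lanes need an extra sign correction that is omitted here. `DivideBy3` and `DivideLanesBy3` are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Unsigned division by 3 without a divide instruction.
// 0xAAAAAAAB is (2^33 + 1) / 3, so (x * 0xAAAAAAAB) >> 33 equals x / 3
// for every 32-bit unsigned x.
inline std::uint32_t DivideBy3(std::uint32_t x) {
    const std::uint64_t product = static_cast<std::uint64_t>(x) * 0xAAAAAAABull;
    return static_cast<std::uint32_t>(product >> 33);
}

// The same operation applied to each of 8 lanes, mirroring the 8-element
// loop in the codegen above but with no call and no division per element.
inline std::array<std::uint32_t, 8> DivideLanesBy3(
    const std::array<std::uint32_t, 8>& v) {
    std::array<std::uint32_t, 8> result{};
    for (std::size_t i = 0; i < v.size(); ++i) {
        result[i] = DivideBy3(v[i]);
    }
    return result;
}
```

Each lane needs only a widening multiply and a shift, which is why recognizing the constant divisor matters: the per-element division disappears entirely.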
cc: @tannergooding
category:cq
theme:inlining
skill-level:intermediate
cost:medium