ARM64: Investigate why more stack space is allocated than needed and why they are not aligned #37429

kunalspathak · 2020-06-04T17:32:10Z

This was pointed out by @TamarChristinaArm in #37139 (comment). For a method that takes two Vector64<> parameters, there are a couple of problems:

We allocate 48 bytes of stack space although we only need to store 16 bytes.
The 16 bytes that are stored are not aligned properly. If they were aligned properly, we could convert the two stores to an stp instead.

Here is the sample code:

G_M35607_IG01:
        A9BD7BFD          stp     fp, lr, [sp,#-48]!
        910003FD          mov     fp, sp
        FD0017A0          str     d0, [fp,#40]
        FD000FA1          str     d1, [fp,#24]
						;; bbWeight=1    PerfScore 3.50
G_M35607_IG02:
        FD4017B0          ldr     d16, [fp,#40]
        FD400FB1          ldr     d17, [fp,#24]
        0EB01E10          mov     v16.8b, v16.8b

Here is the assembly code for reference: Create_after.txt

Dotnet-GitSync-Bot · 2020-06-04T17:32:14Z

I couldn't figure out the best area label to add to this issue. Please help me learn by adding exactly one area label.

kunalspathak · 2020-06-04T17:32:30Z

@BruceForstall , @CarolEidt

CarolEidt · 2020-06-04T17:37:20Z

As I mentioned here, this appears to be because Compiler::getSIMDTypeAlignment, which is called by Compiler::lvaAllocLocalAndSetVirtualOffset, always returns 16 for TARGET_ARM64, which it should not be doing for 8-byte vectors. That accounts for 32 of the 48 bytes. The remaining 16 bytes are because we store fp and lr and we're aligning the frame to 16 bytes.
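To make the arithmetic concrete, here is a rough model of the frame size under the two alignment choices. It is only a sketch, not the JIT's actual layout algorithm: the helper names `align_up` and `frame_size` are hypothetical, and it assumes each local takes a slot rounded up to its alignment, that fp and lr take 16 bytes, and that the whole frame is rounded up to 16 bytes.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def frame_size(local_sizes, simd_alignment, saved_regs_bytes=16, sp_alignment=16):
    """Simplified frame size model (assumption, not the real JIT logic).

    local_sizes      -- sizes in bytes of the SIMD locals to spill
    simd_alignment   -- alignment applied to each SIMD local
    saved_regs_bytes -- space for fp and lr (2 x 8 bytes)
    sp_alignment     -- required alignment of the total frame
    """
    total = saved_regs_bytes
    for size in local_sizes:
        # Each local occupies a slot padded to the requested alignment.
        total += align_up(size, simd_alignment)
    return align_up(total, sp_alignment)


# Two 8-byte Vector64 locals, each aligned to 16 bytes.
print(frame_size([8, 8], simd_alignment=16))  # 48

# The same two locals aligned to 8 bytes.
print(frame_size([8, 8], simd_alignment=8))   # 32
```

Under these assumptions, the 16-byte alignment gives the 48-byte frame seen in the prolog above, while an 8-byte alignment for Vector64 would need only 32 bytes.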

kunalspathak · 2020-06-04T17:40:17Z

always returns 16 for TARGET_ARM64 which it should not be doing for 8-byte vectors.

Thanks @CarolEidt, I will try out this suggestion and verify that the change doesn't cause regressions anywhere else.

kunalspathak added the tenet-performance Performance related issue label Jun 4, 2020

Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Jun 4, 2020

kunalspathak mentioned this issue Jun 4, 2020

Optimize WithLower, WithUpper, Create, AsInt64, AsUInt64, AsDouble with ARM64 hardware intrinsics #37139

Merged

mangod9 added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 5, 2020

BruceForstall added this to the Future milestone Jun 8, 2020

BruceForstall added arch-arm64 and removed untriaged New issue has not been triaged by the area owner labels Jun 8, 2020

kunalspathak mentioned this issue Jun 9, 2020

ARM64: Fix the alignment for Vector64 to 8 bytes #37649

Merged

kunalspathak closed this as completed in #37649 Jun 9, 2020

ghost locked as resolved and limited conversation to collaborators Dec 8, 2020
