Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT ARM32: Fix passing odd sized structs from arbitrary sources #70075

Merged
merged 4 commits into from
Jun 7, 2022

Conversation

jakobbotsch
Copy link
Member

ARM32 ABI allows passing structs in register even when their sizes are
not divisible by 4. This means we sometimes need to pass 3 bytes in the
last register. The JIT would not handle this when the source was an
arbitrary memory location (this would require multiple loads and
shifts). The fix is to just force a copy into the local stack frame for
this case.

Fix #61168
Fix #68837
Fix #70042

ARM32 ABI allows passing structs in register even when their sizes are
not divisible by 4. This means we sometimes need to pass 3 bytes in the
last register. The JIT would not handle this when the source was an
arbitrary memory location (this would require multiple loads and
shifts). The fix is to just force a copy into the local stack frame for
this case.

Fix dotnet#61168
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 1, 2022
@ghost ghost assigned jakobbotsch Jun 1, 2022
@ghost
Copy link

ghost commented Jun 1, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

ARM32 ABI allows passing structs in register even when their sizes are
not divisible by 4. This means we sometimes need to pass 3 bytes in the
last register. The JIT would not handle this when the source was an
arbitrary memory location (this would require multiple loads and
shifts). The fix is to just force a copy into the local stack frame for
this case.

Fix #61168
Fix #68837
Fix #70042

Author: jakobbotsch
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@jakobbotsch jakobbotsch changed the title JIT ARM32: Fix odd sized structs from arbitrary sources JIT ARM32: Fix passing odd sized structs from arbitrary sources Jun 1, 2022
Comment on lines 3273 to 3282
// ARM32 has an edge case where we need to pass 3 bytes in
// the last register, which would require two loads and a
// shift to avoid loading too much if the source is a
// potentially arbitrary address. Handle the edge case here
// by copying into the local stack frame first so we can
// use a register wide load.
if (!arg.AbiInfo.IsSplit() && (((arg.AbiInfo.NumRegs * REGSIZE_BYTES) - passingSize) == 1))
{
makeOutArgCopy = true;
}
Copy link
Contributor

@SingleAccretion SingleAccretion Jun 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I would have expected this code to be in ArgsComplete, where we force copies for all other multi-reg enabled platforms for the same reason (of fgMorphMultiregStructArg not being able to handle them):

// TODO-Arm: This optimization is not implemented for ARM32
// so we skip this for ARM32 until it is ported to use RyuJIT backend
//
else if (argx->OperGet() == GT_OBJ)

The code talks about this being an "optimization", not sure why that is (it is required for correctness in a number of cases).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can move it there, though I don't see why this has anything to do with multi-reg. I suspect this is handled for the single-reg case by

if (structSize < TARGET_POINTER_SIZE)
{
    makeOutArgCopy = true;
}

right below the code I added. I just did not want to remove that for this surgical fix.

Copy link
Member Author

@jakobbotsch jakobbotsch Jun 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me get this fix in and then look at generalizing the logic in ArgsComplete to work for ARM as well in a follow-up. The current code there looks needlessly conservative (for all platforms).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I suppose it would only be conservative for ARM, that's probably why it's not enabled.

Copy link
Contributor

@SingleAccretion SingleAccretion Jun 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me get this fix in and then look at generalizing the logic in ArgsComplete to work for ARM as well in a follow-up. The current code there looks needlessly conservative (for all platforms).

I would be fine with that. I agree the logic there is too conservative for platforms that support precise PUTARG_STKs (which is ARM and ARM64 only right now, plus Unix x64 after #68611).

Edit: note I tried the ArgsComplete fix a few days ago, there were no diffs.

@jakobbotsch
Copy link
Member Author

cc @dotnet/jit-contrib PTAL @BruceForstall

@JulieLeeMSFT JulieLeeMSFT added this to the 7.0.0 milestone Jun 3, 2022
@jakobbotsch jakobbotsch merged commit 0971e5e into dotnet:main Jun 7, 2022
@jakobbotsch jakobbotsch deleted the fix-61168 branch June 7, 2022 09:00
@ghost ghost locked as resolved and limited conversation to collaborators Jul 7, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
4 participants