Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move return buffer handling from interop to the JIT #39294

Merged
merged 82 commits into from
Dec 3, 2020

Conversation

jkoritzinsky
Copy link
Member

So far, when specific calling conventions have required different rules around return buffers than the managed calling conventions, the JIT has relied on the interop space to munge the IL stubs so the JIT doesn't have to handle the return buffer logic itself. Through the work for supporting thiscall in .NET Core 3.x and .NET 5, we've found that handling the return buffer logic in the interop space for anything more than the most trivial cases is prone to bugs and makes the logic hard to follow. In addition, handling the logic in the IL stubs precludes UnmanagedCallersOnly methods that might need return buffers that would be provided by the interop system from not needing an IL stub.

This PR moves the return buffer handling into the JIT so the interop space doesn't need to know when to create a return buffer nor does it need to know how to munge the IL to correctly create return buffers in the managed->native and native->managed directions.

Fixes #12375

The general logic of this PR goes as follows:

General return buffer handling

On Unix x86, always use a return buffer for structures. On Windows x86, use the JITs support for unwrapping trivial structs to enregister <=4-byte structs for non-thiscall scenarios. Add in EAX/EDX multireg return support for 8-byte structures on Windows x86 when not in the thiscall scenario.

Managed->Native

On Windows non-ARM, the JIT will generate a struct return buffer argument passed as per the platform ABI for thiscall.

Native->Managed

On platforms where we have a standard UMThunkStub (non-Windows x86), we pass the unmanaged calling convention to the JIT.

On Windows non-ARM, the JIT will generate a return buffer for thiscall methods.

Since Windows x86 has a special stub linker, we update the thiscall handling there. Since the current implementation tries to surface thiscall as stdcall for the majority of the thunk by swapping around arguments, I augment that support by adding support for copying an EAX/EDX multireg return value from the managed call to the native return buffer.

…tbuf arg after the `this` parameter, not before.
…hiscall on required platforms. Still need to reorder the emit so the arguments are emitted correctly.
…primitive-wrapper struct returning thiscall working.
…eems to optimize them correctly now. Also this enables the JIT to correctly handle native instance method calls that return single primitive field structs.
… since the arguments have already been reversed) on x86.
…n for thiscall Reverse P/Invoke stubs, update the stub linker to handle the case where a return value would be an enregistered return on stdcall but is returned via a return buffer for thiscall.
@jkoritzinsky jkoritzinsky added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 14, 2020
@jkoritzinsky jkoritzinsky added this to the 6.0.0 milestone Jul 14, 2020
@jkoritzinsky
Copy link
Member Author

Looks like there's more work I need to do to support the EAX/EDX multireg return on Windows x86. I'm locally seeing asserts about how the GCInfo emitting doesn't know how to handle multireg struct returns on Windows x86 Debug (which is possibly causing the libraries test failures). And there's the other assert on Windows x86 Checked when crossgenning, so I need to figure that out as well.

cc: @CarolEidt @echesakovMSFT if you could take a look when you have a chance and see if you can figure out what I'm missing/what I need to add or change to get this working, that would be great!

@jkotas
Copy link
Member

jkotas commented Jul 17, 2020

I'm locally seeing asserts about how the GCInfo emitting doesn't know how to handle multireg struct returns on Windows x86 Debug

I would avoid changing managed calling convention in this change, in particular for x86. I would limit this change to just changing handling of unmanaged calling conventions in the JIT.

@jkoritzinsky
Copy link
Member Author

Ok. I'll try to limit this change to unmanaged calls only.

@jkoritzinsky
Copy link
Member Author

I've tried to move this to only being applied to unmanaged calls, but there are a few places where I don't think I can easily ask "is this an unmanaged call?" when the JIT asks "how do I return this struct?" and as a result I'm ending up with the runtime in a bad state.

Since I also don't know how to correctly emit the GCInfo for the alternative case of multireg returns on x86 on all platforms, I'm going to ask for help.

@dotnet/jit-contrib any suggestions on how to handle the fact that unmanaged calls on x86 need to support multireg returns for 8-byte structs without causing the JIT to fail because we don't support multireg returns on x86 in the GCInfo encoder? We can't just have interop tell the JIT that the 8-byte struct is a long since in the ThisCall case, 8 byte structs are returned in a return buffer but longs are returned in a register.

@sandreenko
Copy link
Contributor

In #39294 (comment) we were discussing CorInfoUnmanagedCallConv and if should have CORINFO_UNMANAGED_CALLCONV_MANAGED there that is identical to CORINFO_UNMANAGED_CALLCONV_UNKNOWN, I can't respond to that thread so I am starting a new one, the current name still confuses me and hiding it under a define in Jit code does not look like a solution.
I think the questions are:

  1. Does Jit need to distinguish MANAGED from UNKNOWN?
  2. Do other Runtime parts need to distinguish them?
  3. Why enum is called enum CorInfoUnmanagedCallConv, can it be called enum CorInfoCallConvExtension or something similar?
  4. could it be enum class?

@jkotas
Copy link
Member

jkotas commented Nov 30, 2020

Why enum is called enum CorInfoUnmanagedCallConv

For historic reasons, there are multiple (at least 4) partially overlapping calling convention enum definitions.

Notice that CorInfoUnmanagedCallConv is subset of CorInfoCallConv. My guess is whoever introduced CorInfoUnmanagedCallConv as a subset of CorInfoCallConv wanted to make it clear that getUnmanagedCallConv method on JIT/EE interface returns a subset of callconv values. If we were to change the name of CorInfoUnmanagedCallConv to CorInfoCallConvExtension and mixed managed calling convention into it, we would lose this clarity. I think we would be better off to just use CorInfoCallConv instead of introducing CorInfoCallConvExtension.

Does Jit need to distinguish MANAGED from UNKNOWN?

It does not. UNKNOWN is rejected by the JIT inside impCheckForPInvokeCall. Nothing in the JIT will see UNKNOWN except for this method.

I do not believe that CORINFO_UNMANAGED_CALLCONV_UNKNOWN is actually ever returned by JIT/EE interface. And if it was ever returned, the JIT side immediately rejects it as something that it does not understand. It makes it safe to use this value as sentinel inside the JIT internals.

Do other Runtime parts need to distinguish them?

Other runtime parts need to distinguish between unspecified unmanaged calling convention that is substituted with platform default and explicitly specified unmanaged calling convention. They use other calling convention enums for this some of the time (e.g. CorPinvokeMap).

could it be enum class?

If we want to switch the JIT/EE interface to use enum classes, it should be done as a separate style-only PR, accross all enums on JIT/EE interface. This PR is big enough as it is.

@sandreenko
Copy link
Contributor

@jkotas got it, thank you.
Then I prefer what Jeremy suggested earlier: add a new jit internal enum for it (enum class CorInfoCallConvExtension) and use it instead of CorInfoUnmanagedCallConv.

Copy link
Contributor

@sandreenko sandreenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is hard to see the whole picture for such a big change, the small parts look close to a merge state.

I left a few comments/questions.

src/coreclr/src/jit/codegenxarch.cpp Outdated Show resolved Hide resolved
src/coreclr/src/jit/codegenxarch.cpp Outdated Show resolved Hide resolved
@@ -693,7 +742,7 @@ var_types Compiler::getArgTypeForStruct(CORINFO_CLASS_HANDLE clsHnd,
assert(structSize != 0);

// Determine if we can pass the struct as a primitive type.
// Note that on x86 we never pass structs as primitive types (unless the VM unwraps them for us).
// Note that on x86 we only pass specific pointer-sized structs as primitives that the VM used to unwrap.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will soon forget what VM used to unwrap, maybe delete an outdated reference and say something like:
// Note that on Windows we only pass specific pointer-sized structs that satisfy isTrivialPointerSizedStruct checks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the body says:

On Unix x86, always use a return buffer for structures.

Is it still true? What does this method return for a trivial struct on x86 unix?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The native calling conventions always uses a return buffer for structures, even non-trivial ones, on x86 unix. For the managed calling convention, we'll continue to support the current system (4-byte structs returned in registers).

src/coreclr/src/jit/compiler.cpp Outdated Show resolved Hide resolved
src/coreclr/src/jit/compiler.h Outdated Show resolved Hide resolved
src/coreclr/src/jit/lclvars.cpp Outdated Show resolved Hide resolved
// Skip any user args that we've already processed.
assert(userArgsToSkip <= argSigLen);
argSigLen -= userArgsToSkip;
for (unsigned i = 0; i < userArgsToSkip; i++, argLst = info.compCompHnd->getArgNext(argLst))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as I see userArgsToSkip can be only 0 or 1, do we need a loop here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loop is moreso to future proof this so it works for any input, even though we currently only use 0 or 1. If you prefer, I can change this to be an if condition and assert that the value is 0 or 1.

src/coreclr/src/jit/morph.cpp Outdated Show resolved Hide resolved
@@ -2837,10 +2846,11 @@ void Compiler::fgInitArgInfo(GenTreeCall* call)
{
maxRegArgs = 0;
}

#ifdef UNIX_X86_ABI
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change was to preserve the previous behavior. We used to do this on both x86 platforms. For consistency with the previous behavior (since I don't have a unix x86 test setup and we don't run it in CI), I wanted to keep this block active for Unix x86.

src/coreclr/src/jit/morph.cpp Show resolved Hide resolved
@jkoritzinsky
Copy link
Member Author

If we use CorInfoCallConv, then we won't be able to extend the enum later to represent calling conventions defined using the "extensible calling convention" system we designed for function pointers. We're limited in the number of bits we can use in that enum since it's a direct representation of metadata. I'd prefer to use a different type than CorInfoCallConv so we can more easily add new calling conventions and extend the enum we use in this PR to represent the new "extensible calling convention" calling conventions along the EE-JIT border and inside the JIT.

@jkoritzinsky
Copy link
Member Author

I've addressed all of the feedback excluding the conversation about "union vs side table". Does anyone have any more feedback for this PR before approval?

@BruceForstall
Copy link
Member

Once all the PR feedback has been addressed and the outerloop test run is clean, I suggest you trigger basically every JIT and GC stress AzDO job.

Have you analyzed JIT throughput impact (if any is expected)? Verified no asm diffs (if none are expected), or expected asm diffs?

@jkoritzinsky
Copy link
Member Author

I don't expect any JIT throughput impact. I've verified that the only asm diffs are the expected asm diffs (mentioned above that the only ones are on x86 and due to the change in struct normalization). I'll trigger the stress jobs once the PR jobs are done to try to not overload the queues.

@jkoritzinsky
Copy link
Member Author

/azp run runtime-coreclr outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jkoritzinsky
Copy link
Member Author

I've run the outerloop pipelines. I've validated that the jitstress failures match the same failures in master. The gcstress+jitstress failures are a little harder to validate, but it looks to me like they're the same failures that exist in master as well.

Copy link
Contributor

@sandreenko sandreenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for your patience during this massive change.

I have one more request, that could be done as a separate PR, I can open an issue if you don't have time for it right now:
https://github.com/dotnet/runtime/blob/master/docs/design/coreclr/botr/clr-abi.md does not have a description (or link) for x86/arm abi, so it would not be fair to ask you to write it fully, but could you please expand your examples from the PR header and put them there with additional comments and examples with accent on new managed/native differences?

Managed->Native
On Windows non-ARM, the JIT will generate a struct return buffer argument passed as per the platform ABI for thiscall.

please add what happens for windows arm and not for thiscall.

I think I would prefer just to see a banch of example, with different struct sizes, with/without special arguments/profiler attached etc with comments where each argument goes on each platform.

@jkoritzinsky
Copy link
Member Author

I'll work on another change to update that doc with my learnings in another PR so I don't slow down merging this one in. Thanks!

@jkoritzinsky jkoritzinsky merged commit b66138c into dotnet:master Dec 3, 2020
@jkoritzinsky jkoritzinsky deleted the retbuf-move-to-jit branch December 3, 2020 19:58
@ghost ghost locked as resolved and limited conversation to collaborators Jan 2, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move most of the calling-convention return buffer logic into the JIT
8 participants