-
Notifications
You must be signed in to change notification settings - Fork 5.2k
Support dynamically creating tailored continuation layouts #120411
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Support dynamically creating tailored continuation layouts #120411
Conversation
Add ability for the VM to dynamically create continuation layout types and for the JIT to request such types to be created.
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
src/coreclr/vm/jitinterface.cpp
Outdated
|
||
AllocMemTracker amTracker; | ||
// TODO: table to share/deduplicate these | ||
result = (CORINFO_CLASS_HANDLE)m_pMethodBeingCompiled->GetModule()->CreateContinuationMethodTable((unsigned)dataSize, objRefs, dataOffsets, &amTracker); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these should be created on the loader allocator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean place the method on another class? CreateContinuationMethodTable
uses the high frequency heap of the module's loader allocator for the allocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definition module loader allocator != method loader allocator.
Consider MethodInNonCollectibleAssembly<CollectibleType>()
: This method instantiation should be allocated on collectible loader allocator. As written, it will be allocated on non-collectible loader allocator of MethodInNonCollectibleAssembly and the collectability of the instantiation argument won't be taken into account. We will have a memory leak.
Loader allocator is the scope of unloading. If you are allocating something collectible, loader allocator is the right place to attach it to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes a lot of sense. Will change this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do we expect pMT->GetLoaderModule()
/pMT->GetModule()
to return for these types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, we may want to make sure that these methods are never called on these types. If these methods are called on the continuation types, it is a likely a bug. It may take a while to chase down all places.
That does seem best. For now I have set the fields as you suggested above, but will look into asserting that we don't access them for the continuation types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing that is already confused is SOS. !dumpobj
, !dumpmt
and !verifyheap
report invalid objects/MT when the continuation method tables are involved. Hopefully that's a simple check somewhere.
EDIT: This needed a fix in MethodTable::ValidateWithPossibleAV
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this would probably be a minor breaking change for debuggers and profilers and will require us to implement a few new diagnostic APIs. From what I see so far:
- All our APIs that expose object layouts do so in terms of metadata FieldDefs which we don't have here. At minimum we'd be breaking the assumption that the layout information is complete. If we wanted to go farther we could create new layout APIs that aren't backed by metadata tokens.
Profiler API
Debugger API - In the case of debugging, there is no API that enumerates GC references in an object separate from the layout API. We'd need to create that.
- For debugging, profiling, and GCDumps emitted by events we have algorithms that do GC heap traversal. We'll need to ensure those algorithms are correctly handling these types (this may just fall out for free from the right GCDescs)
- APIs that expose TypeDef tokens for types will need updates not to expose a token for these Continuation sub-types. The behavior change might need changes in the 3rd party tool code to accomodate for it. (ICorDebugClass::GetToken(), ICorProfilerInfo::GetClassIdInfo())
I don't think this needs to block the PR since we are still behind an environment variable, but could you open a GH issue to track the new diagnostics work this create? Thanks!
(I'm still looking through the PR, I might have other feedback apart from the diagnostic headline stuff :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this needs to block the PR since we are still behind an environment variable, but could you open a GH issue to track the new diagnostics work this create?
+1
When designing these APIs, we should generalize them where possible to cover any no metadata types internally created by the runtime. We may want to use them for storing statics and get rid of static boxes.
src/coreclr/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncHelpers.CoreCLR.cs
Outdated
Show resolved
Hide resolved
Now that we store directly into the ThunkTask's result we do not need to allocate a tail continuation at all.
src/coreclr/vm/jitinterface.cpp
Outdated
LoaderAllocator* allocator = m_pMethodBeingCompiled->GetLoaderAllocator(); | ||
AsyncContinuationsManager* asyncConts = allocator->GetAsyncContinuationsManager(); | ||
AllocMemTracker amTracker; | ||
result = (CORINFO_CLASS_HANDLE)asyncConts->LookupOrCreateContinuationMethodTable((unsigned)dataSize, objRefs, dataOffsets, m_pMethodBeingCompiled, &amTracker); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strictly speaking we can delay this call until the continuation needs to be created. But in any case we need to persist some description of the continuation somewhere. However, we would be able to store that description compressed until it is needed.
I don't have good intuition about the expense of creating these or the number of them that we are going to be creating (likely a lot), so it is hard to know if this will be good enough or if it will show up on the radar once we start measuring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think this will be a problem (with de-duping of the layout in place). Also, the de-duping can be helped by sorting the fields, so that there is higher probability of finding dups.
public uint State; | ||
public delegate*<Continuation, ref byte, Continuation?> Resume; | ||
public CorInfoContinuationFlags Flags; | ||
public int State; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't actually need this field in Continuation
, however this part is padding on 64-bit anyway. For 32-bit we could save space in methods with 1 suspension point by not having it at all, but then we make the 64-bit continuations with State
fields in them larger. And doing one thing in 64-bit and another in 32-bit is also more complex than I figured was worth it.
} | ||
pCode->EmitLDARG(1); // resultLoc | ||
pCode->EmitLDLOC(resultLoc); | ||
pCode->EmitSTOBJ(pCode->GetToken(resultTypeHnd)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice change. Transferring of return value is much simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, OTOH we now pay cost for void-returning functions to go and find out there is no result to store during the main dispatch loop.
We could get rid of it with special continuation types created in FinalizeTaskReturningThunk
, but then we go back to needing to allocate a continuation there. Trade-offs...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could avoid touching parent continuation (i.e. to avoid cache misses) if we set some flag in the current continuation indicating it is void-returning.
No sure how much that could save. Maybe put this on the pile of "possible perf improvements".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, we can leave it for the future to measure.
Not allocating the continuation in FinalizeTaskReturningThunk
is optimizing the async1 -> async2 path at a small cost to the async2 -> async2 path. It's probably worth it at this stage given that it is trading an allocation for a simple check against mt->ContinuationOffsets->Result == 0xFFFFFFFF
. But eventually taking an additional allocation in FinalizeTaskReturningThunk
may be preferable.
MethodTable* mt = RuntimeHelpers.GetMethodTable(this); | ||
|
||
// Only the special continuation sub types have continuation offsets. | ||
Debug.Assert(mt->IsContinuation); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How often a continuation does not have offsets? Is that just the sentinel, which is allocated on demand once per thread?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, with this change it should be just the sentinel.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If size is the only concern to keep offsets off the base continuation, I think it is ok to have them if that may simplify various mt->ContinuationOffsets
indirections.
We can afford a few ints per thread, especially since that is allocated only if a thread ever suspends.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not totally sure what you are suggesting here -- putting the offsets in the continuation itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that we are storing the offsets as part of the MethodTable
itself, not as part of each allocated instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I misunderstood the scheme. I thought the offsets are stored in the instance, but location of offsets is a per-type constant in method table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another thought - the CORINFO_CONTINUATION_DATA_OFFSETS
represents 4 optional parts with predefined order, if any present, and all have intptr
size, except Result
. Can Result
be stored last of all the 4 parts, and the presence of each part be a flags enum specifying which part is present? Then the offsets can be calculated via mask/popcount of the flags.
These flags could be folded with Flags
or can be readonly LayoutFlags
next to Flags
. If both are uint16
. Even though the layout flags would be stored in the instance, there will be no extra space used at all. Also no need to indirect via methodtable to find things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern, apart from complexity, is that methodtable will not be near the instance, thus a guaranteed cache miss.
It feels it would be nicer if methodtable is there mostly for GC purposes, but not used for the purposes of continuation dispatching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements the ability for the VM to dynamically create tailored continuation layout types and for the JIT to request such types to be created. This modernizes the async continuation system by replacing fixed-size arrays with dynamically generated continuation types that have customized layouts based on the specific continuation needs.
Key changes:
- Dynamic continuation type creation: The VM can now create continuation subtypes with tailored field layouts
- Simplified continuation data model: Eliminates separate data/GC arrays in favor of inline fields in dynamically created types
- New JIT-EE API: Adds
getContinuationType
method for the JIT to request specific continuation layouts
Reviewed Changes
Copilot reviewed 40 out of 40 changed files in this pull request and generated 3 comments.
Show a summary per file
File | Description |
---|---|
src/coreclr/vm/asynccontinuations.h/.cpp | New manager for creating dynamic continuation types with custom layouts |
src/coreclr/vm/methodtable.h/.inl/.cpp | Added continuation type detection and data offset storage |
src/coreclr/vm/jitinterface.cpp | Implemented getContinuationType API and updated async resumption stub |
src/coreclr/jit/async.h/.cpp | Refactored continuation layout to use dynamic types instead of arrays |
src/coreclr/inc/corinfo.h | Updated continuation flags and added data offsets structure |
src/coreclr/System.Private.CoreLib/ | Updated continuation class to work with dynamic subtypes |
src/tests/async/ | Removed project build restrictions and added warning suppressions |
Comments suppressed due to low confidence (1)
src/coreclr/vm/asynccontinuations.cpp:1
- Constructor parameters should be validated for null pointer and invalid size. Consider adding assertions or parameter validation.
// Licensed to the .NET Foundation under one or more agreements.
if (IsContinuation()) | ||
{ | ||
*pDest = OBJECTREFToObject(GetParent().GetManagedClassObject()); | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it reasonable to make continuation.GetType()
return typeof(Continuation)
for the sub types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It means that there will a discrepancy between code like o.GetType() == typeof(Continuation)
when optimized by the JIT down to comparing MethodTables vs. when it is not optimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think throwing NotSupportedException
may work better. It will make sure that there is no way to get reflection access to these continuation objects.
sorry I haven't gotten to this as quick as I hoped @jakobbotsch, this is on my todo list for tomorrow (technically later today I suppose). |
|
||
public bool IsArray => (Flags & enum_flag_Category_Array_Mask) == enum_flag_Category_Array; | ||
|
||
public bool IsContinuation => ParentMethodTable == (MethodTable*)typeof(Continuation).TypeHandle.Value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public bool IsContinuation => ParentMethodTable == (MethodTable*)typeof(Continuation).TypeHandle.Value; | |
public bool IsContinuation => ParentMethodTable == TypeHandle.TypeHandleOf<Continuation>().AsMethodTable(); |
|
||
LoaderAllocator* allocator = m_pMethodBeingCompiled->GetLoaderAllocator(); | ||
AsyncContinuationsManager* asyncConts = allocator->GetAsyncContinuationsManager(); | ||
result = (CORINFO_CLASS_HANDLE)asyncConts->LookupOrCreateContinuationMethodTable((unsigned)dataSize, objRefs, dataOffsets, m_pMethodBeingCompiled); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we also need to take into account required alignment for FEATURE_64BIT_ALIGNMENT
?
Add ability for the VM to dynamically create continuation layout types and for the JIT to request such types to be created.