Skip to content

Conversation

max-charlamb
Copy link
Member

@max-charlamb max-charlamb commented Jul 28, 2025

Why

As we move to completely implement the API's used by SOS in the cDAC, we need to access data structures and global values defined in the GC. These are special because the GC can either be internal to coreclr.dll or an external clrgc.dll. Given this option, it is impossible to completely define GC data structures at coreclr.dll compile time and store them in the data-descriptor like we do for VM structures.

Proposed Solution

Add support for a data-descriptor to have sub-descriptors. Data-descriptors will have a new optional sub-descriptor section which can contain pointers to sub-descriptors. Each sub-descriptor has the same spec as the main (existing) descriptor. The pointers act the same as the well-known symbol name for the main descriptor.

Memory Layout

The sub-descriptors will be an additional (optional) section of the data-descriptor identical in spec to the existing globals with the exception that the value's type must be a pointer.

Parsing

When parsing a data-descriptor, each sub-descriptor pointer is checked if it is non-null. If it is non-null, then the sub-descriptor should be parsed in the same way as the main descriptor.

  • The header of a sub-descriptor must match the parent descriptor. If the header of a sub-descriptor does not match (different pointer size/endianness) this is an error and undefined behavior.
  • The sub-descriptor and parent descriptor must not have conflicting descriptors. If any conflicting descriptors exist, this is an error and undefined behavior.

Types, globals, and contract versions are merged between the sub-descriptor and the parent descriptor.

Implemented in: #118050

@Copilot Copilot AI review requested due to automatic review settings July 28, 2025 13:45
@max-charlamb max-charlamb marked this pull request as draft July 28, 2025 13:45
@github-actions github-actions bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jul 28, 2025
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces support for multi data-descriptor functionality to the Contract Data Access (cDAC) system. The primary purpose is to enable deferred data definition resolution through sub-descriptors when certain data definitions are not known at compile time but may be provided by external components.

Key changes:

  • Adds support for string-type global values alongside existing primitive integer constants and pointers
  • Introduces sub-descriptor pointers as a new optional component of data descriptors
  • Updates the JSON format to include a "sub-descriptors" field with pointer-based references to external data descriptors

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
datacontracts_design.md Updates global values to support strings and adds sub-descriptor concept documentation
data_descriptor.md Adds comprehensive specification for sub-descriptor descriptors, JSON format updates, and example usage
contract-descriptor.md Adds sub-descriptors field to example JSON and clarifies contract symbol discovery language
Comments suppressed due to low confidence (1)

docs/design/datacontracts/data_descriptor.md:300

  • The link reference uses 'contract-descriptor.md' which is inconsistent with the actual filename 'contract-descriptor.md' shown in the diff. However, this appears to be a correction from 'contract_descriptor.md' to match the actual hyphenated filename, so this is likely fixing an existing inconsistency.
descriptor](./contract-descriptor.md#Contract_descriptor).

},
"sub-descriptors":
{
"GCDescriptor": [ 1 ]
Copy link
Member

@jkotas jkotas Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think we want to have the sub-descriptors in the json. We do not necessarily know how many of them we are going to have and what their names are going to be at build time. (We happen to know for GC that motivated this change, but it would be nice to allow for optional dynamically loaded components.)

Copy link
Member Author

@max-charlamb max-charlamb Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought having a separate section of pointers to sub-descriptors would be the cleanest way to implement them on the parser side. It would allow the parser to read the complete set of datadescriptors without outside information.

The name here isn't strictly required but I left it in for help debugging and to match the global spec. The parser machinery looks at the listed sub-descriptor pointers and if the values are non-null would recursively read in and merge the sub-descriptor. This would allow us to have sub-descriptor 'slots' that are not always used.

The alternative design I considered was to have the sub-descriptors be standard global values which are well-known to the relevant contracts. These contracts would use a new API on the Target to fetch this addition data. This would require a name and add more complexity to the Target as it's datastores would be mutable after creation.

The drawback is that the sub-descriptors couldn't be dynamically loaded (as you mention). I'm trying to understand if that would be an issue. Given the cDAC operates on a paused target, the memory between data descriptor initialization and contract use should not change (except for writes initiated by the cDAC), if there is a data descriptor JSON that can be loaded (ie no conflicts) would there be a benefit of pre-reading it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parser machinery looks at the listed sub-descriptor pointers and if the values are non-null would recursively read in and merge the sub-descriptor. This would allow us to have sub-descriptor 'slots' that are not always used.

It requires us to know all types of sub-descriptors that we may possibly reference upfront (when we are generating the json at build time). After giving it more thought, it should not be a problem in practice. It is very unlikely that we will allow extending the runtime in unknown ways. Consider this feedback resolved.

The drawback is that the sub-descriptors couldn't be dynamically loaded (as you mention).

My concern was about dynamic loading at runtime. The difference is whether the runtime can load arbitrary unknown components dynamically, or whether the runtime can only load a known set of components dynamically. As I have said, I think it is fine to limit the runtime to known components.

Given the cDAC operates on a paused target, the memory between data descriptor initialization and contract use should not change

Yes, this should not be a problem with what we have now. (My gut feel is that we may need it to evolve the cDAC architecture to cache more and be less eager with pre-computing once we get to scenarios like single stepping, but that is a problem for future.)

Copy link
Member

@noahfalk noahfalk Jul 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formalizing sub-descriptors seems like unnecessary complexity and constraint to me. The existing cDAC code seems amenable enough to dynamically loading new descriptors.

For example a runtime contract can declare any arbitrary field of a data structure to be a contract descriptor pointer:

CDAC_TYPE_FIELD(GCDacVars, /*pointer*/, StandaloneGcContractDesc, ...)

And cDAC code can load that contract descriptor on the fly if it needs to use it:

class GCContract : IGCContract
{
    IGCContract _standaloneGC;
    GCContract(Target t)
    {
        // do normal cDac stuff to read the value of a field
        ulong standaloneContractDesc = GetDescriptorFromGCDacVars(t);
        
        if(standaloneContractDesc != 0)
        {
             ContractDescriptorTarget.TryCreate(
                standaloneContractDesc,
                GetReadDelegate(target),
                GetGetThreadContextDelegate(target),
                out ContractDescriptorTarget standaloneGcTarget);
             standaloneGC = standaloneGcTarget.ContractRegistry.GCContract;
        }
    }
    
    public void EnumerateHeap()
    {
        if(_standaloneGC != null)
        {
            _standaloneGC.EnumerateHeap();
            return;
        }
        // do the normal built-in GC enumerate heap algorithm
    }
}

We could further simplify this a bit, it just shows the basic idea without diverging too far from how the code is currently structured. Is there any significant issue not treating ContractDescriptorTarget as a singleton?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, the contracts don't know about ContractDescriptorTarget or the read/write delegates. They interact with the target through the abstract Target class.

This change would be possible but require adding some complexity to the managed side. Either having multiple targets (and dealing with properly flushing and using the correct one) or merging the globals/types together.

The current plan is to always load a sub-descriptor for the GC contract, even when we use the default GC.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current plan is to always load a sub-descriptor for the GC contract, even when we use the default GC.

I think we'd be better off changing that plan and doing light changes to the managed code instead. Its harder to evolve interfaces between components (the contract descriptor format) than it is to change their internal implementation details so if complexity needs to be added somewhere I think we should bias towards putting it in the managed cDac implementation. Sub-descriptors also add constraints that we might find awkward later:

  • They prevent dynamic loaded dlls from defining new contracts. For example if one day we thought the GC contract would be better factored as two separate contracts it would be a breaking change for standalone GC that prevents it from being used on downlevel runtimes.
  • They prevent dynamic loaded dlls from having patterns other than singletons. For example imagine we have JIT plugin interface and we'd like to use the built-in JIT to compile some methods and the plugin JIT compiles other methods. We might want to access two instances of some JIT contract at the same time, not have one replace the other.

The data descriptor consists of:
* a collection of type structure descriptors
* a collection of global value descriptors
* an optional collection of pointers to sub-contracts
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional

Should global value descriptors on the previous line be tagged as optional as well? If the (sub-)descriptor does not need any global values, I would expect that it can be missing - similar to the list of sub-contracts can be missing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, ideally both the type structure descriptors and global value descriptors should be optional. The current infrastructure doesn't support that, but I can modify both the spec and infrastructure.

},
"sub-descriptors":
{
"GCDescriptor": [ 1 ] // indirect from aux data offset 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"GCDescriptor": [ 1 ] // indirect from aux data offset 1
"GC": [ 1 ] // indirect from aux data offset 1

Nit: I do not think we need to repeat "descriptor" in the name. It is clear what it points to by being under sub-descriptors.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

},
"sub-descriptors":
{
"GCDescriptor": [ 1 ]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parser machinery looks at the listed sub-descriptor pointers and if the values are non-null would recursively read in and merge the sub-descriptor. This would allow us to have sub-descriptor 'slots' that are not always used.

It requires us to know all types of sub-descriptors that we may possibly reference upfront (when we are generating the json at build time). After giving it more thought, it should not be a problem in practice. It is very unlikely that we will allow extending the runtime in unknown ways. Consider this feedback resolved.

The drawback is that the sub-descriptors couldn't be dynamically loaded (as you mention).

My concern was about dynamic loading at runtime. The difference is whether the runtime can load arbitrary unknown components dynamically, or whether the runtime can only load a known set of components dynamically. As I have said, I think it is fine to limit the runtime to known components.

Given the cDAC operates on a paused target, the memory between data descriptor initialization and contract use should not change

Yes, this should not be a problem with what we have now. (My gut feel is that we may need it to evolve the cDAC architecture to cache more and be less eager with pre-computing once we get to scenarios like single stepping, but that is a problem for future.)

* a name
* a pointer value

If these values are non-null, the pointer represents another JSON data descriptor with the specification described in this document.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If these values are non-null, the pointer represents another JSON data descriptor with the specification described in this document.
If the value is non-null, the pointer points to another [contract descriptor](contract-descriptor.md#contract-descriptor-1).

I assume it will point to DotNetRuntimeContractDescriptor that speced in the other doc, not directly to JSON.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we use DotNetRuntimeContractDescriptor for both the main contract export and the data structure that it points to. We may want to rename the data structure (e.g. to just ContractDescriptor) to avoid confusion now that there will be multiple instances of it in the system.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the plan was for it to point to the same type of structure (currently named DotNetRuntimeContractDescriptor) which the symbol points to.


If these values are non-null, the pointer represents another JSON data descriptor with the specification described in this document.

When parsing a data descriptor with sub-descriptors each sub-descriptor should be parsed then its type, global, and contract values should be merged in. If any conflicts arise when merging in sub-descriptor data, this is an error and behavior is undefined.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If any conflicts arise when merging in sub-descriptor data, this is an error and behavior is undefined.

This design means that the components involved need to be aware of each other to avoid conflicts. Just pointing it out.

@max-charlamb max-charlamb mentioned this pull request Aug 7, 2025
1 task
@max-charlamb max-charlamb added documentation Documentation bug or enhancement, does not impact product or test code area-Diagnostics-coreclr labels Aug 26, 2025
@max-charlamb max-charlamb marked this pull request as ready for review August 26, 2025 14:02
Copy link
Contributor

Tagging subscribers to this area: @steveisok, @dotnet/dotnet-diag
See info in area-owners.md if you want to be subscribed.

@max-charlamb max-charlamb removed the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Aug 26, 2025
@max-charlamb
Copy link
Member Author

/ba-g docs only change

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area-Diagnostics-coreclr documentation Documentation bug or enhancement, does not impact product or test code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants