Add type-flow analysis pass for specialized dynamic dispatch #7968
Merged
saipraveenb25 merged 133 commits into shader-slang:master on Dec 4, 2025
tangent-vector previously approved these changes on Nov 13, 2025
jhelferty-nv (Contributor) approved these changes on Dec 4, 2025:
Reapproving based on Tess's previous approval, and Sai's assurance that changes were minor to address branch drift.
github-merge-queue bot pushed a commit that referenced this pull request on Dec 10, 2025:
When generating the release note, the script was missing the new line characters for the breaking changes. The following is an example of the problem:
> === Breaking changes ===
> 72761cc Add error diagnostic for integer literals that don't fit into uint64_t (#9208 Remove the deprecated hlsl_coopvec_poc capability that was for POC CoopVec (#9213 Add type-flow analysis pass for specialized dynamic dispatch (#7968)

With this PR, it will be fixed as below:
> === Breaking changes ===
> 72761cc Add error diagnostic for integer literals that don't fit into uint64_t (#9208)
> cc73e8d Remove the deprecated hlsl_coopvec_poc capability that was for POC CoopVec (#9213)
> 4280f24 Add type-flow analysis pass for specialized dynamic dispatch (#7968)
ncelikNV added a commit to ncelikNV/slang that referenced this pull request on Dec 10, 2025:
Fixes the following warning emitted by Clang 20:
```
warning: first argument in call to 'memset' is a pointer to non-trivially copyable type 'IRConstant' [-Wnontrivial-memcall]
```
See shader-slang#8634.
github-merge-queue bot pushed a commit that referenced this pull request on Dec 16, 2025:
Fixes the following warning emitted by Clang 20:
```
warning: first argument in call to 'memset' is a pointer to non-trivially copyable type 'IRConstant' [-Wnontrivial-memcall]
```
See #8634.
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
Fixes: #7782
Unblocks: #6486
Type Flow Analysis for Specialized Dynamic Dispatch
This PR implements a transformation pass for dynamic dispatch in the Slang compiler, enabling the lowering of dynamic instructions (such as `LookupWitnessMethod`, `MakeExistential`, ...) into concrete instructions specialized to the set of possible elements that can be determined at compile time.
This PR also removes the old generics lowering approach that relied on detecting all available conformances in the global scope and generating dispatch functions based on this information.
Minor Note: This PR introduces new IR op-codes, but does not change the max version since these IR ops are strictly internal. They are created and destroyed within the lifetime of the typeflow specialization pass.
Overview
The implementation introduces an IR transformation pass that converts runtime polymorphic instructions into specialized dispatch functions and union types through 3 stages:
i. Analysis: determining sets of possible types/tables/functions/etc. for each dynamic instruction through a data-flow analysis pass,
ii. Specialization: replacing dynamic instructions with specialized versions that operate on the known sets via "set" instructions (e.g. `TypeSet`, `SetTagType`, etc.), and
iii. Lowering: lowering sets into dispatch functions (for sets of functions) and union types (for sets of types).
Analysis: Propagating sets of types, funcs, tables & generics using static analysis.
The analysis step proceeds similarly to a traditional inter-procedural, context-sensitive data-flow analysis.
The primary difference is that the state tracked for each instruction is not a set of values, but a set of global definitions (e.g. types, functions, witness tables, generics, etc.).
We maintain a map of `(context, instruction)` to `typeFlowInfo` for each instruction.
- `context` is either an `IRFunc*` or an `IRSpecialize*` inst and the `instruction` is any child inst of the context. `context` must be a global inst with concrete operands. In case of `IRSpecialize*`, the operands dictate the bindings for the generic parameters (these may also be sets).
- `typeFlowInfo` is an "effective type" that we track for each instruction that may be dynamic (or have a dynamic component). Here is the set of possible states:
  - `FuncSet`, `TypeSet`, `GenericSet`, `WitnessTableSet`: Represents a set of functions/types/generics/tables.
  - `ElementOfSetType(set)`: Represents an element of the given set. Used for instructions whose value can be one of the elements in the set (e.g. the result of a `LookupWitnessMethod`, `ExtractExistentialWitnessTable`, `ExtractExistentialType`, etc.).
  - `TaggedUnionType(tableSet : WitnessTableSet, typeSet : TypeSet)`: Represents a tagged union of an index into the tables in `tableSet` and a value whose type may be any of the types in `typeSet`. Typically used for instructions that create existentials.
  - `UntaggedUnionType(typeSet : TypeSet)`: Represents a union of values whose type may be any of the types in `typeSet`. Typically used for instructions that extract values from existentials.
  - `PtrTypeBase(typeFlowInfo)` / `ArrayType(typeFlowInfo, ...)`: Additionally, we allow type-flow info insts to be nested into pointer and array types. This enables us to analyze (and optimize) loads and stores from local vars, structs, arrays, etc.
  - `NoneTypeElement`, `NoneWitnessTableElement`: Represents a 'none' sentinel value for types and witness tables respectively, to be used to represent the none case for `OptionalType`.
  - `UninitializedTypeElement(interfaceType)`, `UninitializedWitnessTableElement(interfaceType)`: Represents an uninitialized value of the given type. Used to represent cases where a value may be uninitialized (e.g. the result of a `LoadFromUninitializedMemory`). This is primarily used to detect dynamic dispatch on uninitialized values and diagnose appropriate errors.
  - `UnboundedTypeElement(interfaceType)`, `UnboundedWitnessTableElement(interfaceType)`: Represents an unbounded set of types that conform to the given interface type. This is used to represent cases where we cannot determine a finite set of possible types/tables (e.g. when we need to compile a function for true dynamic dispatch by accepting v-table pointers). Note that the elements can be combined with known elements to represent a hybrid specialized + dynamic case.
  - `UnboundedFuncElement(funcType)`, `UnboundedGenericElement(genericType)`: Similar to above, but for functions and generics respectively. The `funcType` and `genericType` represent the type signature to be used to call into the element.

The fixed-point analysis iterates by adding the users of insts whose type-data has changed in each iteration until no more changes are made.
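The fixed-point iteration described above can be sketched as a small worklist loop. This is a minimal illustration in Python, not the pass's actual code: the names `propagate_sets`, `OPERANDS`, `USERS`, `WITNESS` and `transfer` are hypothetical, contexts are omitted (so the sketch is not context-sensitive), and sets are plain `frozenset`s of names rather than IR insts.

```python
from collections import deque

# Hypothetical toy program: two insts produce witness tables, a phi merges
# them, and a lookup reads a method out of whichever table reaches it.
OPERANDS = {"phi": ["a", "b"], "lookup": ["phi"]}
USERS = {"a": ["phi"], "b": ["phi"], "phi": ["lookup"]}
WITNESS = {"CircleTable": "Circle.area", "SquareTable": "Square.area"}


def transfer(inst, info):
    """Compute the set flowing into `inst` from its operands' current sets."""
    incoming = frozenset().union(
        *(info.get(op, frozenset()) for op in OPERANDS[inst])
    )
    if inst == "lookup":
        # A lookup maps each possible table to the function it holds.
        return frozenset(WITNESS[table] for table in incoming)
    return incoming


def propagate_sets(seeds, users, transfer_fn):
    """Re-visit the users of any inst whose set changed, until nothing changes."""
    info = dict(seeds)
    worklist = deque(seeds)
    while worklist:
        inst = worklist.popleft()
        for user in users.get(inst, ()):
            old = info.get(user, frozenset())
            new = old | transfer_fn(user, info)
            if new != old:
                info[user] = new
                worklist.append(user)  # its users must be re-examined
    return info


result = propagate_sets(
    {"a": frozenset({"CircleTable"}), "b": frozenset({"SquareTable"})},
    USERS,
    transfer,
)
```

Because sets only ever grow and the universe of global definitions is finite, a loop of this shape terminates; in the toy program `result["lookup"]` ends up holding both `area` functions.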
Specialization: Convert dynamic lookups to use tag operations on known sets
Transforms dynamic dispatch instructions into tag operations and lowers certain insts (e.g. call) to use these tags where necessary. This pass also rewrites the types of instructions to use the set-based types.
We also introduce a new type:
- `SetTagType(set)`: Represents a run-time identifier that can be used to select an element in `set`.

Any insts that were assigned an `ElementOfSetType(set)` during analysis will be specialized into insts that produce a `SetTagType(set)` to represent the run-time identifier, and any insts that were using the element will be specialized to accept the tag instead.
In some cases (like `Call`), this involves some re-writes to convert the tag into something callable.
Tag Operations:
- `GetTagForSuperSet(tag)`, `GetTagForSubSet(tag)`: Maps a tag for a sub-set to a tag for a super-set or vice-versa (the sets are encoded in the types of the get-tag inst and the operand).
- `GetTagForMappedSet(tag, key)`: Maps a tag for a set to another set under a table lookup operation.
- `GetTagForSpecializedSet(tag, specArgs...)`: Maps a tag for a generic set to a specialized set under specialization with the given arguments.
- `GetTagFromSequentialID(id, interfaceType)`: Converts witness-table sequential IDs (which are global & may be externally defined) to set tags.
- `GetSequentialIDFromTag(tag, interfaceType)`: Converts set tags to witness-table sequential IDs (which are global & may be externally defined).
- `GetTagForElementInSet(value, set)`: Given a concrete value and a set, returns the corresponding tag for that value in the set.
- `GetElementFromTag(tag : SetTagType(set))`: Given a tag and a set, returns the corresponding concrete value in the set.
- `CastInterfaceToTaggedUnionPtr(ptr)`: Converts a pointer to an interface-typed location into a pointer to a tagged union of known types. This is required to avoid changing the layout of an interface type that is used globally (e.g. const buffers, structured buffers, etc.), while still loading into a known tagged union type.

Tagged Union Operations:
- `MakeTaggedUnion(tag : SetTagType(tableSet), value : UntaggedUnionType(typeSet))`: Create a tagged-union value from a tag and a value.
- `GetTagFromTaggedUnion(taggedUnionValue : TaggedUnionType(typeSet, tableSet))`: Translate a tagged-union value to its corresponding tag for the witness table represented in the tagged-union's set.
- `GetTypeTagFromTaggedUnion(taggedUnionValue : TaggedUnionType(typeSet, tableSet))`: Translate a tagged-union value to its corresponding tag for the type in the tagged-union's set. Note that the run-time information about the type is not actually used anywhere in the current compiler, so this inst currently turns into an undefined (poison) value.
- `GetValueFromTaggedUnion(taggedUnionValue : TaggedUnionType(typeSet, tableSet))`: Translate a tagged-union value to its corresponding value in the tagged-union's set.

Logically, all tagged union operations can be thought of as operating on a tuple of `(tag, value)`, where the tag is an identifier for the set of witness tables and the value may be of any of the types in the set of types. These are lowered into `MakeTuple`/`GetTupleElement` operations.
Dispatch Operations:
- `GetDispatcher(witnessTableSet : WitnessTableSet, key : StructKey)`: Given a set of witness tables and a key, returns a callable function that dispatches to the correct implementation based on the tag (note that this uses the table tag and not the func tag, to skip the extra mapping step).
- `GetSpecializedDispatcher(witnessTableSet : WitnessTableSet, key : StructKey, specArgs...)`: Given a set of witness tables, a key, and specialization arguments, returns a callable function that dispatches to the correct implementation based on the tag and the specialization arguments.

Specialization Rules For Dynamic Insts
In general, every inst with a non-trivial type-flow-data has its type replaced with the type extracted from the type-flow-data.
For certain instructions, we have additional lowering logic:
Call
- … `FuncSet` … `Specialize` inst, any specialization arguments that are themselves set-tags are also passed in as arguments to the call.

LookupWitnessMethod
- … `GetTagForMappedSet(tag, key)`

Existential Insts
- `ExtractExistentialValue` and `ExtractExistentialWitnessTable` become extraction/reinterpretation operations.
- `ExtractExistentialType` is replaced by its inferred `TypeSet`.

Load/Store
- If an `IRStore` is writing a value of set type that is different from the pointer's lowered type, then we cast the result (by reinterpreting the value and mapping the tag) before writing.
- If an `IRLoad` is reading a value of set type that is different from the pointer's lowered type, we cast the loaded value before using it.

Specialization of Generics With Set Operands:
The specialization pass can create specialized versions of generics that use sets instead of concrete elements.
This PR introduces an alternative approach to specialization, `specializeGenericWithSetArgs`, that handles such `IRSpecialize` insts.
Unlike the existing `specializeGeneric`, `specializeGenericWithSetArgs` introduces additional tag variables for each witness table set operand to keep track of the index during execution.
Example:
This process of adding tag parameters is only applied for generic functions. For generic struct types, the usual specialization is applied, which simply substitutes the set itself (since the contents of the specialized struct type cannot depend on anything truly dynamic).
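The tagged-union and dispatcher operations described in this section can be modelled in a few lines. The following is a rough Python sketch under stated assumptions, not the compiler's IR: witness tables are plain dictionaries, a tag is simply the position of a table in an ordered set, and the function names mirror the op names only for readability. `CIRCLE_TABLE`, `SQUARE_TABLE` and the `"area"` key are hypothetical.

```python
def make_tagged_union(tag, value):
    """Model MakeTaggedUnion: a tagged union is logically a (tag, value) tuple."""
    return (tag, value)


def get_tag_from_tagged_union(tagged):
    """Model GetTagFromTaggedUnion: extract the witness-table tag."""
    return tagged[0]


def get_value_from_tagged_union(tagged):
    """Model GetValueFromTaggedUnion: extract the payload value."""
    return tagged[1]


def get_dispatcher(table_set, key):
    """Model GetDispatcher: build a callable that picks the implementation
    for `key` using the table tag directly (no separate func-tag mapping)."""
    impls = [table[key] for table in table_set]

    def dispatch(tag, *args):
        return impls[tag](*args)

    return dispatch


# Hypothetical witness tables for two types conforming to one interface.
CIRCLE_TABLE = {"area": lambda radius: 3 * radius * radius}
SQUARE_TABLE = {"area": lambda side: side * side}
TABLE_SET = [CIRCLE_TABLE, SQUARE_TABLE]  # assumption: tag = index in this list

area = get_dispatcher(TABLE_SET, "area")
square = make_tagged_union(1, 4)
circle = make_tagged_union(0, 2)

square_area = area(get_tag_from_tagged_union(square),
                   get_value_from_tagged_union(square))
circle_area = area(get_tag_from_tagged_union(circle),
                   get_value_from_tagged_union(circle))
```

The point of the sketch is that once the set of tables is known, a call through an interface needs only the tag and the payload; no table pointer travels with the value.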
Lowering: Generate integer maps, dispatch functions and any-value-types
The Analysis and Specialization parts are run entirely within the specialization loop through `specializeDynamicInsts()`.
Lowering, on the other hand, happens after the main specialization loop is complete. Most instructions are lowered just before the re-interpret lowering pass. The sequential ID mapping instructions are lowered with the generics lowering step since the sequential IDs need to be assigned first.
Key insts lowered here are:
- … `AnyValueType` using the max size of the types in the type set
- … by using a mapping from set elements to integers. The current logic assigns the same integer ID to an element, independent of which set it appears in, so `GetTagForSuperSet` and `GetTagForSubSet` become no-ops, while `GetTagForMappedSet` becomes a switch statement from the source set's integers to the destination set's integers.
- `GetTagForSpecializedSet` should be consumed during the specialization step and turn into a `GetSpecializedDispatcher` instead, so we do not expect to see it during lowering.
- … `MakeTuple` and `GetTupleElement` operations.

Examples:
Here is an end-to-end example of analysis, specialization and lowering applied to a function that uses dynamic insts (most types are inlined for readability)
After the Analysis step, these are the assignments that we will hold on to in the
dictionary. Note: nothing in the function is modified at this point, though set
insts may be created in the global scope.
After Specialization, the insts have their types rewritten, and any dynamic uses (e.g. calls, loads, stores, etc.)
are rewritten accordingly
After lowering
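As a rough stand-in for the lowering rules listed in the Lowering section (and not the IR the compiler produces), the following Python sketch shows why a single global integer per element makes super-set/sub-set tag conversions free, and how a mapped-set conversion reduces to a switch-like table. All names here (`assign_global_ids`, `lower_mapped_set`, `any_value_size`, the table and type names, and the byte sizes) are hypothetical.

```python
def assign_global_ids(elements):
    """Give every element one integer ID, independent of which set holds it."""
    return {name: index for index, name in enumerate(sorted(elements))}


def lower_mapped_set(table_set, key, witness, ids):
    """Model lowering GetTagForMappedSet: a dict standing in for a switch
    from source-set integers (tables) to destination-set integers (funcs)."""
    return {ids[table]: ids[witness[table][key]] for table in table_set}


def any_value_size(type_set, sizes):
    """Model sizing an any-value payload: the max size over the type set."""
    return max(sizes[type_name] for type_name in type_set)


# Hypothetical tables, functions and sizes.
WITNESS = {
    "CircleTable": {"area": "Circle.area"},
    "SquareTable": {"area": "Square.area"},
}
ids = assign_global_ids(
    {"CircleTable", "SquareTable", "Circle.area", "Square.area"}
)

# A tag valid for the sub-set {CircleTable} is the same integer in the
# super-set {CircleTable, SquareTable}, so that conversion is a no-op.
sub_tag = ids["CircleTable"]
super_tag = sub_tag

switch = lower_mapped_set(["CircleTable", "SquareTable"], "area", WITNESS, ids)
payload_bytes = any_value_size({"Circle", "Square"}, {"Circle": 4, "Square": 8})
```

A per-set numbering would give denser tags, but every sub-set/super-set conversion would then need its own mapping; the sketch follows the text in trading density for free conversions.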
Breaking Changes
While this PR does not change the language or spec, it enforces some of the rules around existentials more strictly, which can cause old code to fail with diagnostic errors rather than silently generating code that might have undefined behavior at runtime.
Summary of patterns to look out for:
- Initializing interface-type objects with default constructor `{}`. E.g. `IFoo foo = {};`
  - … `IFoo foo = MyDefaultFooImpl();`. Create your own default concrete type that conforms to `IFoo`.
  - … `Optional<IFoo> foo = {};` which initializes to `none`. Uses of `foo` need to change to `foo.value` and should check for `foo.hasValue` first.
- Potentially uninitialized interface-type objects:
  - … `Optional<IFoo> foo = {};` when creating the object.