Add IR instructions for cooperative matrix/vector ops #10643
jkwak-work merged 5 commits into shader-slang:master from
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough

Adds five cooperative matrix/vector IR opcodes and three new scalar kinds (BFloat16, FloatE4M3, FloatE5M2); implements IR validation, reflection/bindings, HLSL/SPIR-V/CUDA (OptiX) codegen paths, tests, and tooling updates to support these additions.
🚥 Pre-merge checks: 1 passed, 2 flagged (1 warning, 1 inconclusive).
Pull request overview
Adds new IR instructions and codegen paths for cooperative matrix/vector operations (SPIR-V NV/KHR and CUDA/OptiX), while extending reflection/support tooling to recognize additional “special” scalar types (bfloat16/float8 variants) and adding regression/codegen tests.
Changes:
- Introduces new cooperative matrix/vector IR ops (mul-add, matrix-mul, outer-product accumulate, reduce-sum accumulate) plus IR validation and serialization/stable-name updates.
- Extends reflection/type utilities and test infrastructure to support `BFloat16`, `FloatE4M3`, and `FloatE5M2` scalar types.
- Adds/updates backend emission for SPIR-V, HLSL, and CUDA/OptiX, with new/updated `.slang` tests for codegen and diagnostics.
Reviewed changes
Copilot reviewed 31 out of 31 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| tools/slang-unit-test/unit-test-special-scalar-reflection.cpp | New unit test covering reflection for bfloat16/float8 scalar + vector element types. |
| tools/slang-test/slang-test-main.cpp | Enables float comparisons for new special scalar types in output diffing. |
| tools/render-test/shader-input-layout.cpp | Adds print/texture-data handling for bfloat16 and float8 element types. |
| tools/gfx/slang.slang | Extends public SlangScalarType enum for gfx bindings. |
| tests/cuda/optix-coopvec.slang | Updates OptiX coopvec filecheck expectations. |
| tests/cuda/optix-coopvec-packed-input-diagnostic.slang | Adds diagnostic test for unsupported packed-input coopvec matmul on CUDA/OptiX. |
| tests/cooperative-vector/training-spirv-codegen.slang | New SPIR-V codegen test for cooperative vector training ops. |
| tests/cooperative-vector/training-hlsl-codegen.slang | New HLSL codegen test for cooperative vector training ops. |
| tests/cooperative-vector/training-cuda-codegen.slang | New CUDA/OptiX codegen test for cooperative vector training ops. |
| tests/cooperative-vector/matrix-mul-spirv-codegen.slang | New SPIR-V codegen test for coopvec matrix mul/mul-add ops. |
| tests/cooperative-vector/matrix-mul-hlsl-codegen.slang | New HLSL codegen test for coopvec matrix mul/mul-add (packed + non-packed). |
| tests/cooperative-matrix/mat-mul-add-cuda-codegen.slang | New CUDA codegen test for cooperative matrix mul-add. |
| source/slang/slang-reflection-json.cpp | Emits JSON reflection strings for new scalar types. |
| source/slang/slang-reflection-api.cpp | Extends scalar reflection to recognize non-basic scalar types (bfloat16/float8). |
| source/slang/slang-ir.h | Bumps max supported IR module version for new instructions. |
| source/slang/slang-ir.cpp | Marks new cooperative ops as side-effect-free where appropriate. |
| source/slang/slang-ir-validate.h | Declares new cooperative IR validation entrypoint. |
| source/slang/slang-ir-validate.cpp | Implements validation for new cooperative IR instructions. |
| source/slang/slang-ir-insts.lua | Defines new IR instruction opcodes and operand lists. |
| source/slang/slang-ir-insts-stable-names.lua | Assigns stable IDs for new IR instructions. |
| source/slang/slang-emit.cpp | Runs new cooperative IR validation pass during linking/optimization. |
| source/slang/slang-emit-spirv.cpp | Adds SPIR-V emission for new cooperative ops and enum/value mappings. |
| source/slang/slang-emit-hlsl.h | Declares helpers for mapping coopvec enums during HLSL emission. |
| source/slang/slang-emit-hlsl.cpp | Emits HLSL builtins for new cooperative vector operations. |
| source/slang/slang-emit-cuda.cpp | Emits CUDA WMMA coopmat mul-add and OptiX coopvec operations. |
| source/slang/slang-emit-c-like.cpp | Ensures coopvec ops are emitted as statements (not folded). |
| source/slang/hlsl.meta.slang | Adds new intrinsic ops and rewires coopvec/coopmat implementations to IR instructions. |
| source/slang-wasm/slang-wasm-bindings.cpp | Exposes new scalar types to WASM bindings. |
| source/core/slang-type-text-util.cpp | Adds type-name ↔ scalar-type mappings for new scalar types. |
| prelude/slang-cuda-prelude.h | Updates OptiX coopvec wrapper templates (enum types, mapping removal). |
| include/slang.h | Adds public enums for cooperative matrix/vector metadata and extends scalar type enums. |
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
prelude/slang-cuda-prelude.h (1)
6487-6491: ⚠️ Potential issue | 🔴 Critical

Fix the first `slangOptixCoopVecMatMul` overload to honor the `transpose` parameter.

The first overload (lines 6487–6505) accepts `bool transpose` as a runtime parameter, but the implementation hardcodes `false` when dispatching to `optixCoopVecMatMul` (line 6501). The CUDA emit code in `source/slang/slang-emit-cuda.cpp` forwards the transpose operand for non-StructuredBuffer matrices; when `transpose=true`, the wrapper silently selects the non-transposed code path instead. Dispatch to the appropriate `true`/`false` specialization based on the runtime value.

🐛 Proposed fix

```diff
-    return optixCoopVecMatMul<
-        VecTOut,
-        VecTIn,
-        inputInterpretation,
-        matrixLayout,
-        false,
-        N,
-        K,
-        matrixInterpretation>(inputVector, matrix, matrixOffset, matrixStride);
+    if (transpose)
+    {
+        return optixCoopVecMatMul<
+            VecTOut,
+            VecTIn,
+            inputInterpretation,
+            matrixLayout,
+            true,
+            N,
+            K,
+            matrixInterpretation>(inputVector, matrix, matrixOffset, matrixStride);
+    }
+    return optixCoopVecMatMul<
+        VecTOut,
+        VecTIn,
+        inputInterpretation,
+        matrixLayout,
+        false,
+        N,
+        K,
+        matrixInterpretation>(inputVector, matrix, matrixOffset, matrixStride);
```
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 86407eaa-1bb4-4f58-a1ee-8c917b3f600a
📒 Files selected for processing (31)
include/slang.h, prelude/slang-cuda-prelude.h, source/core/slang-type-text-util.cpp, source/slang-wasm/slang-wasm-bindings.cpp, source/slang/hlsl.meta.slang, source/slang/slang-emit-c-like.cpp, source/slang/slang-emit-cuda.cpp, source/slang/slang-emit-hlsl.cpp, source/slang/slang-emit-hlsl.h, source/slang/slang-emit-spirv.cpp, source/slang/slang-emit.cpp, source/slang/slang-ir-insts-stable-names.lua, source/slang/slang-ir-insts.lua, source/slang/slang-ir-validate.cpp, source/slang/slang-ir-validate.h, source/slang/slang-ir.cpp, source/slang/slang-ir.h, source/slang/slang-reflection-api.cpp, source/slang/slang-reflection-json.cpp, tests/cooperative-matrix/mat-mul-add-cuda-codegen.slang, tests/cooperative-vector/matrix-mul-hlsl-codegen.slang, tests/cooperative-vector/matrix-mul-spirv-codegen.slang, tests/cooperative-vector/training-cuda-codegen.slang, tests/cooperative-vector/training-hlsl-codegen.slang, tests/cooperative-vector/training-spirv-codegen.slang, tests/cuda/optix-coopvec-packed-input-diagnostic.slang, tests/cuda/optix-coopvec.slang, tools/gfx/slang.slang, tools/render-test/shader-input-layout.cpp, tools/slang-test/slang-test-main.cpp, tools/slang-unit-test/unit-test-special-scalar-reflection.cpp
Force-pushed from 55caea3 to 5995e4f (Compare)
Actionable comments posted: 12
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
prelude/slang-cuda-prelude.h (2)
6486-6504: ⚠️ Potential issue | 🔴 Critical

Fix the 5-parameter `slangOptixCoopVecMatMul` overload to respect the runtime `transpose` parameter.

The function accepts a `transpose` parameter but hardcodes it to `false` when calling `optixCoopVecMatMul`. The CUDA compiler (slang-emit-cuda.cpp) actively validates and extracts this parameter, emitting it at runtime, but the wrapper ignores it entirely: any caller passing `transpose=true` will silently receive incorrect results. Since OptiX requires `transpose` as a template parameter, add a runtime branch to instantiate the correct template variant.

The 6-parameter overload (with bias) has the same issue and must also be fixed.

🔧 Preserve the current API by branching to the correct OptiX instantiation

```diff
 __forceinline__ __device__ VecTOut slangOptixCoopVecMatMul(
     const VecTIn& inputVector,
     CUdeviceptr matrix,
     unsigned matrixOffset,
     bool transpose,
     unsigned matrixStride)
 {
     constexpr unsigned N = OptixCoopVecTraits<VecTOut>::size; // Output vector size
     constexpr unsigned K = OptixCoopVecTraits<VecTIn>::size;  // Input vector size
-    return optixCoopVecMatMul<
-        VecTOut,
-        VecTIn,
-        inputInterpretation,
-        matrixLayout,
-        false,
-        N,
-        K,
-        matrixInterpretation>(inputVector, matrix, matrixOffset, matrixStride);
+    if (transpose)
+    {
+        return optixCoopVecMatMul<
+            VecTOut,
+            VecTIn,
+            inputInterpretation,
+            matrixLayout,
+            true,
+            N,
+            K,
+            matrixInterpretation>(inputVector, matrix, matrixOffset, matrixStride);
+    }
+
+    return optixCoopVecMatMul<
+        VecTOut,
+        VecTIn,
+        inputInterpretation,
+        matrixLayout,
+        false,
+        N,
+        K,
+        matrixInterpretation>(inputVector, matrix, matrixOffset, matrixStride);
 }
```
6472-6477: ⚠️ Potential issue | 🔴 Critical

Align the cooperative-vector version gating with the wrappers and fix the ignored transpose parameter.

The `OptixCoopVecTraits<OptixCoopVec<T, N>>` specialization is guarded by `OPTIX_VERSION > 90000` (line 6472), but the three `slangOptixCoopVecMatMul` wrapper functions are guarded by `OPTIX_VERSION >= 90000` (line 6462). This creates a breaking mismatch: OptiX 9.0 code can instantiate these wrappers but will fail when accessing `OptixCoopVecTraits<VecTOut>::size` and `OptixCoopVecTraits<VecTIn>::size`. Align both guards to `>= 90000`.

Additionally, the first wrapper (line 6486) accepts a `bool transpose` parameter but always passes `false` to the native `optixCoopVecMatMul` call (line 6497), completely ignoring the input. Either remove the parameter or use it correctly.

🔧 Guard alignment fix

```diff
-#if defined(OPTIX_VERSION) && OPTIX_VERSION > 90000
+#if defined(OPTIX_VERSION) && OPTIX_VERSION >= 90000
 template<typename T, unsigned int N>
 struct OptixCoopVecTraits<OptixCoopVec<T, N>>
 {
     static constexpr unsigned int size = N;
 };
 #endif
```

As per coding guidelines, `prelude/**`: Built-in language definitions and intrinsics. Changes here affect all Slang programs. Backward compatibility and all target backends must handle these intrinsics correctly.

Also applies to: 6486-6510, 6515-6539, 6547-6568
♻️ Duplicate comments (2)
source/slang/slang-emit-cuda.cpp (1)
824-825: 🧹 Nitpick | 🔵 Trivial

Minor inconsistency: mixed use of `as<>` + `SLANG_ASSERT` vs `cast<>`.

Line 824 uses `as<>` followed by `SLANG_ASSERT`, while line 851 uses `cast<>` directly. The `cast<>` pattern is preferred when the switch case guarantees the IR opcode, as it provides built-in debug assertions.

Also applies to: 851-851
source/slang/slang-emit-hlsl.cpp (1)
711-724: 🧹 Nitpick | 🔵 Trivial

`emitMappedCoopVecMatrixLayout` doesn't diagnose unmapped layout values.

Unlike `emitMappedCoopVecComponentType`, which explicitly diagnoses unsupported types (e.g., BFloat16), `emitMappedCoopVecMatrixLayout` relies solely on `SLANG_UNEXPECTED` in the mapping function for invalid values. While all current layout enum values are mapped, adding explicit validation would provide better diagnostics for future enum additions.

♻️ Suggested validation pattern

```diff
 void HLSLSourceEmitter::emitMappedCoopVecMatrixLayout(IRInst* operand)
 {
     auto intLit = as<IRIntLit>(operand);
     if (!intLit)
     {
         getSink()->diagnose(Diagnostics::UnsupportedTargetIntrinsic{
             .operation = "cooperative vector matrix layout (non-constant operand)",
             .location = operand->sourceLoc});
         m_writer->emit("0");
         return;
     }
+    auto layoutValue = (int32_t)intLit->getValue();
+    switch (layoutValue)
+    {
+    case SLANG_COOPERATIVE_VECTOR_MATRIX_LAYOUT_ROW_MAJOR:
+    case SLANG_COOPERATIVE_VECTOR_MATRIX_LAYOUT_COLUMN_MAJOR:
+    case SLANG_COOPERATIVE_VECTOR_MATRIX_LAYOUT_INFERENCING_OPTIMAL:
+    case SLANG_COOPERATIVE_VECTOR_MATRIX_LAYOUT_TRAINING_OPTIMAL:
+        break;
+    default:
+        getSink()->diagnose(Diagnostics::UnsupportedTargetIntrinsic{
+            .operation = "cooperative vector matrix layout (unsupported value)",
+            .location = operand->sourceLoc});
+        m_writer->emit("0");
+        return;
+    }
+
-    m_writer->emit(_mapSlangCoopVecMatrixLayoutToHLSL((int32_t)intLit->getValue()));
+    m_writer->emit(_mapSlangCoopVecMatrixLayoutToHLSL(layoutValue));
 }
```

As per coding guidelines, `source/slang/**`: "(5) Null pointer safety and proper error handling via diagnostics."
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: b94505b9-56da-4edc-a1d3-98207cfcdbf3
📒 Files selected for processing (32)
include/slang.h, prelude/slang-cuda-prelude.h, source/core/slang-type-text-util.cpp, source/slang-wasm/slang-wasm-bindings.cpp, source/slang/hlsl.meta.slang, source/slang/slang-emit-c-like.cpp, source/slang/slang-emit-cuda.cpp, source/slang/slang-emit-hlsl.cpp, source/slang/slang-emit-hlsl.h, source/slang/slang-emit-spirv.cpp, source/slang/slang-emit.cpp, source/slang/slang-ir-insts-stable-names.lua, source/slang/slang-ir-insts.lua, source/slang/slang-ir-validate.cpp, source/slang/slang-ir-validate.h, source/slang/slang-ir.cpp, source/slang/slang-ir.h, source/slang/slang-reflection-api.cpp, source/slang/slang-reflection-json.cpp, tests/cooperative-matrix/mat-mul-add-cuda-codegen.slang, tests/cooperative-vector/matrix-mul-hlsl-codegen.slang, tests/cooperative-vector/matrix-mul-spirv-codegen.slang, tests/cooperative-vector/training-cuda-codegen.slang, tests/cooperative-vector/training-hlsl-codegen.slang, tests/cooperative-vector/training-spirv-codegen.slang, tests/cuda/optix-coopvec-packed-input-diagnostic.slang, tests/cuda/optix-coopvec-transpose-diagnostic.slang, tests/cuda/optix-coopvec.slang, tools/gfx/slang.slang, tools/render-test/shader-input-layout.cpp, tools/slang-test/slang-test-main.cpp, tools/slang-unit-test/unit-test-special-scalar-reflection.cpp
Force-pushed from 5995e4f to 10f5ad3 (Compare)
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tools/slang-test/slang-test-main.cpp (1)
4009-4019: ⚠️ Potential issue | 🟠 Major

Use format-aware epsilon for low-precision float comparisons.

Adding `BFloat16`/`FloatE4M3`/`FloatE5M2` here is the correct direction, but using the same `differenceThreshold` (1e-4) as higher-precision floats is too strict for these formats and can cause flaky/false test failures. Pick epsilon by `scalarType` (or use ULP-based comparison) before `Math::AreNearlyEqual`.
Proposed change

```diff
-            if (!Math::AreNearlyEqual(valueA, valueB, differenceThreshold))
+            double epsilon = differenceThreshold;
+            switch (scalarType)
+            {
+            case ScalarType::FloatE4M3: epsilon = 1e-1; break;
+            case ScalarType::FloatE5M2: epsilon = 5e-2; break;
+            case ScalarType::BFloat16:  epsilon = 1e-2; break;
+            default: break;
+            }
+            if (!Math::AreNearlyEqual(valueA, valueB, epsilon))
             {
                 return SLANG_FAIL;
             }
```

prelude/slang-cuda-prelude.h (1)
6486-6505: ⚠️ Potential issue | 🔴 Critical

Make `transpose` a template parameter instead of silently ignoring it.

The function accepts `bool transpose` as a parameter, but hardcodes `false` when forwarding to `optixCoopVecMatMul`. This causes wrong-code behavior: a caller passing `transpose=true` silently executes the non-transposed operation instead of failing or applying the transpose. OptiX exposes `transpose` as a real template parameter, so either expose it as a template parameter on the wrapper or reject `transpose=true` at compile time.

Additionally, the wrapper block is guarded by `#if (OPTIX_VERSION >= 90000)` while the required `OptixCoopVecTraits<OptixCoopVec<T, N>>` specialization is guarded by `#if defined(OPTIX_VERSION) && OPTIX_VERSION > 90000`. This creates a gap at exactly `OPTIX_VERSION == 90000` where the wrappers would fail to compile due to undefined traits.
♻️ Duplicate comments (6)
source/slang/slang-reflection-json.cpp (1)
471-473: ⚠️ Potential issue | 🟠 Major

Add missing `IntPtr`/`UIntPtr` scalar JSON mappings.

The switch now handles new packed-float scalars, but pointer-sized integer scalars are still not mapped, so they fall through to `"unknown"` in JSON output. As per coding guidelines, "Cross-backend consistency — changes to one emitter may need parallel changes in others."

Proposed fix

```diff
     CASE(Float16, float16);
     CASE(Float32, float32);
     CASE(Float64, float64);
+    CASE(IntPtr, intptr);
+    CASE(UIntPtr, uintptr);
     CASE(BFloat16, bfloat16);
     CASE(FloatE4M3, float_e4m3);
     CASE(FloatE5M2, float_e5m2);
```

source/slang/slang-ir-validate.cpp (1)
698-705: ⚠️ Potential issue | 🟡 Minor

Potential null dereference if operand type is null.

The chain `getSaturatingAccumulation()->getDataType()->getOp()` assumes `getDataType()` returns non-null. While the operand count is validated, malformed IR could have an operand with a null type. Add a null check for defensive safety.

🛡️ Suggested defensive check

```diff
-    if (coopMatMulAdd->getSaturatingAccumulation()->getDataType()->getOp() != kIROp_BoolType)
+    auto satAccumType = coopMatMulAdd->getSaturatingAccumulation()->getDataType();
+    if (!satAccumType || satAccumType->getOp() != kIROp_BoolType)
```
source/slang/hlsl.meta.slang (2)

30950-30975: ⚠️ Potential issue | 🟠 Major

Packed `matrixInterpretation` and `biasInterpretation` are still erased on the IR path.

`__getCoopVecComponentScalarType()` folds `SignedInt8Packed` and `UnsignedInt8Packed` into the same scalar-type IDs as the unpacked forms, but the new intrinsic surface only carries a packed bit for `inputInterpretation`. Packed matrix or bias interpretations therefore become indistinguishable from unpacked storage once lowered. Please either thread explicit packed flags for matrix/bias too, or reject packed matrix/bias interpretations before calling these intrinsics.

As per coding guidelines, `source/slang/**`: "IR pass correctness — ensure SSA form and type invariants are maintained."

Also applies to: 31040-31045, 31068-31075, 31724-31729, 31738-31745
31042-31043: ⚠️ Potential issue | 🟠 Major

Don't retype raw matrix and bias storage as `Ptr<T[]>`.

The pointer overloads still encode matrix and bias storage as `Ptr<T[]>`, where `T` is the result element type. That mis-types mixed-element cases like `half×half→float` or `int8×int8→int32`, so the IR pointer operand no longer matches the actual backing storage. Keep these operands erased, or add separate matrix/bias pointee-type generics instead of reusing `T`.

As per coding guidelines, `source/slang/**`: "IR pass correctness — ensure SSA form and type invariants are maintained."

Also applies to: 31070-31073, 32039-32044, 32101-32112
source/slang/slang-emit-cuda.cpp (2)
824-825: 🧹 Nitpick | 🔵 Trivial

Consider using `cast<>` for consistency with line 851.

The switch case guarantees the opcode, so the downcast should always succeed. Line 851 already uses `cast<IRCoopVecReduceSumAccumulate>` without `SLANG_ASSERT`, while this block uses `as<>` + `SLANG_ASSERT`. Using `cast<>` here would be consistent and provides a built-in debug assertion.

Suggested change

```diff
-    auto outerProduct = as<IRCoopVecOuterProductAccumulate>(inst);
-    SLANG_ASSERT(outerProduct);
+    auto outerProduct = cast<IRCoopVecOuterProductAccumulate>(inst);
```
1349-1375: ⚠️ Potential issue | 🟠 Major

Missing transpose emission in `kIROp_CoopVecMatMulAdd`.

The `kIROp_CoopVecMatMul` case (lines 1180-1196) conditionally emits the `transpose` operand for non-StructuredBuffer matrices, but `kIROp_CoopVecMatMulAdd` validates the transpose constraint (lines 1305-1323) without ever emitting it in the final call. This creates an inconsistency where transpose is validated but ignored.

Add the same StructuredBuffer check and conditional transpose emission to match MatMul.

Suggested fix

```diff
+    bool isStructuredBufferMatrix =
+        as<IRHLSLStructuredBufferTypeBase>(coopVecMatMulAdd->getMatrixPtr()->getDataType()) !=
+        nullptr;
+
     m_writer->emit("(");
     m_writer->emit("slangOptixCoopVecMatMul<");
     emitType(inst->getDataType());
     m_writer->emit(", ");
     emitType(coopVecMatMulAdd->getInput()->getDataType());
     m_writer->emit(", ");
     m_writer->emit(inputInterpretation);
     m_writer->emit(", ");
     m_writer->emit(matrixInterpretation);
     m_writer->emit(", ");
     m_writer->emit(matrixLayout);
     m_writer->emit(", ");
     m_writer->emit(biasInterpretation);
     m_writer->emit(">((");
     emitOperand(coopVecMatMulAdd->getInput(), getInfo(EmitOp::General));
     m_writer->emit("), (CUdeviceptr)(&((");
     emitOperand(coopVecMatMulAdd->getMatrixPtr(), getInfo(EmitOp::General));
     m_writer->emit("))), ");
     emitOperand(coopVecMatMulAdd->getMatrixOffset(), getInfo(EmitOp::General));
+    if (!isStructuredBufferMatrix)
+    {
+        m_writer->emit(", ");
+        emitOperand(coopVecMatMulAdd->getTranspose(), getInfo(EmitOp::General));
+    }
     m_writer->emit(", (CUdeviceptr)(&((");
     emitOperand(coopVecMatMulAdd->getBiasPtr(), getInfo(EmitOp::General));
```

Verify that the OptiX wrapper template `slangOptixCoopVecMatMul` with bias parameters accepts a transpose argument:

```bash
#!/bin/bash
# Search for the slangOptixCoopVecMatMul template overload that takes bias parameters
rg -n "slangOptixCoopVecMatMul" --type cpp -A 20 | head -100
```
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: cbac7f26-7b5a-4d9e-bdec-2c96ff7c33a4
📒 Files selected for processing (32)
include/slang.h, prelude/slang-cuda-prelude.h, source/core/slang-type-text-util.cpp, source/slang-wasm/slang-wasm-bindings.cpp, source/slang/hlsl.meta.slang, source/slang/slang-emit-c-like.cpp, source/slang/slang-emit-cuda.cpp, source/slang/slang-emit-hlsl.cpp, source/slang/slang-emit-hlsl.h, source/slang/slang-emit-spirv.cpp, source/slang/slang-emit.cpp, source/slang/slang-ir-insts-stable-names.lua, source/slang/slang-ir-insts.lua, source/slang/slang-ir-validate.cpp, source/slang/slang-ir-validate.h, source/slang/slang-ir.cpp, source/slang/slang-ir.h, source/slang/slang-reflection-api.cpp, source/slang/slang-reflection-json.cpp, tests/cooperative-matrix/mat-mul-add-cuda-codegen.slang, tests/cooperative-vector/matrix-mul-hlsl-codegen.slang, tests/cooperative-vector/matrix-mul-spirv-codegen.slang, tests/cooperative-vector/training-cuda-codegen.slang, tests/cooperative-vector/training-hlsl-codegen.slang, tests/cooperative-vector/training-spirv-codegen.slang, tests/cuda/optix-coopvec-packed-input-diagnostic.slang, tests/cuda/optix-coopvec-transpose-diagnostic.slang, tests/cuda/optix-coopvec.slang, tools/gfx/slang.slang, tools/render-test/shader-input-layout.cpp, tools/slang-test/slang-test-main.cpp, tools/slang-unit-test/unit-test-special-scalar-reflection.cpp
Force-pushed from 10f5ad3 to 16e8924 (Compare)
Actionable comments posted: 7
♻️ Duplicate comments (14)
source/core/slang-type-text-util.cpp (1)
28-31: ⚠️ Potential issue | 🟠 Major

Add missing pointer-sized scalar mappings to keep round-trip behavior complete.

`SLANG_SCALAR_TYPES` still omits `IntPtr`/`UIntPtr`, so scalar-name conversion is incomplete for pointer-sized integer scalar types.

Proposed fix

```diff
 #define SLANG_SCALAR_TYPES(x) \
     x(None, none) \
     x(Void, void) \
     x(Bool, bool) \
     x(Float16, half) \
     x(UInt8, uint8_t) \
     x(Int8, int8_t) \
     x(UInt16, uint16_t) \
     x(Int16, int16_t) \
     x(UInt32, uint32_t) \
     x(Int32, int32_t) \
     x(Int64, int64_t) \
     x(UInt64, uint64_t) \
+    x(IntPtr, intptr_t) \
+    x(UIntPtr, uintptr_t) \
     x(Float32, float) \
     x(Float64, double) \
     x(BFloat16, bfloat16) \
     x(FloatE4M3, float_e4m3) \
     x(FloatE5M2, float_e5m2)
```

source/slang/slang-ir.cpp (1)
9013-9015: 🧹 Nitpick | 🔵 Trivial

Document why accumulate cooperative ops are intentionally excluded from this side-effect-free list.

Lines 9013-9015 look correct for pure value-producing ops, but this block is easy to misread as incomplete. Please add a brief comment clarifying that `kIROp_CoopVecOuterProductAccumulate` and `kIROp_CoopVecReduceSumAccumulate` are intentionally excluded because they write to memory and must retain side effects.

As per coding guidelines, `source/slang/**`: "IR pass correctness — ensure SSA form and type invariants are maintained."

tests/cooperative-matrix/mat-mul-add-cuda-codegen.slang (1)
1-17: 🧹 Nitpick | 🔵 Trivial

Strengthen the FileCheck assertion to validate template arguments.

The current CHECK pattern only verifies the function symbol exists but doesn't validate the critical template arguments (shape parameters and saturating flag). Since the shader specifically uses `16 x 16 x 16` dimensions with `saturate=false`, the test would still pass even if CUDA emission dropped or mangled these parameters.

Consider tightening the pattern:

```diff
-// CHECK: Slang_CUDA_WMMA::coopMatMulAdd<
+// CHECK: Slang_CUDA_WMMA::coopMatMulAdd<{{.*}}, 16, 16, 16, false>(
```

As per coding guidelines, "tests/**: Check that expected outputs match the intended behavior, not just current behavior."
tests/cooperative-vector/matrix-mul-spirv-codegen.slang (1)
1-41: 🧹 Nitpick | 🔵 Trivial

Consider adding a mixed signedness test case for better coverage.

The test currently only exercises the signed+signed combination (`int8_t` input with `int32_t` result), which always produces the same operand mask (`MatrixBSignedComponentsKHR | MatrixResultSignedComponentsKHR`). The SPIR-V emitter's operand mask logic handles four independent signed flags, but only one combination is validated.

Adding a case with unsigned input and signed result (or vice versa) would improve coverage:

```slang
// Unsigned input, signed result -> only MatrixResultSignedComponentsKHR
CoopVec<uint8_t, 4> uvec = coopVecLoad<4, uint8_t>(input);
let mixedResult = coopVecMatMul<int32_t, 4, 4>(
    uvec,
    CoopVecComponentType::UnsignedInt8,
    ...);
// CHECK: OpCooperativeVectorMatrixMulNV {{.*}} MatrixResultSignedComponentsKHR
```

As per coding guidelines, "tests/**: Ensure new features have corresponding tests."
tests/cooperative-vector/training-spirv-codegen.slang (1)
1-38: 🧹 Nitpick | 🔵 Trivial

Consider strengthening SPIR-V checks to validate operands.

The test only verifies the capability and instruction opcodes exist but doesn't validate critical operand values like offsets, stride, or the encoded layout/component type constants. This means regressions in operand lowering could go undetected.

Consider extending the CHECK patterns to validate at least some operands:

```
// CHECK: OpCooperativeVectorOuterProductAccumulateNV {{%[0-9]+}} {{%[0-9]+}} {{%[0-9]+}} {{%[0-9]+}} %int_0
// CHECK: OpCooperativeVectorReduceSumAccumulateNV {{%[0-9]+}} {{%[0-9]+}} %int_0
```

As per coding guidelines, "tests/**: Check that expected outputs match the intended behavior, not just current behavior."
tests/cuda/optix-coopvec-packed-input-diagnostic.slang (1)
11-27: ⚠️ Potential issue | 🟠 Major

Add packed-input diagnostic coverage for `coopVecMatMulAddPacked`.

This test only exercises `coopVecMatMulPacked`; a regression in `coopVecMatMulAddPacked` packed-input rejection would go uncaught. Please add a second diagnostic case (same file or sibling test) that expects the multiply-add packed-input unsupported diagnostic.

As per coding guidelines, `tests/**`: "Ensure new features have corresponding tests."
tests/**: “Ensure new features have corresponding tests.”source/slang/slang-emit-spirv.cpp (2)
9013-9015: ⚠️ Potential issue | 🟠 Major

Normalize cooperative-op metadata before `cast<>`ing it.

These paths assume raw `IRIntLit`/`IRBoolLit`, but same-file constant handling already has to unwrap `IRGlobalValueRef` before reading literal values. If a wrapped or dynamic metadata operand reaches SPIR-V emission, this turns a user error into an assertion/null-deref instead of a diagnostic. Please unwrap once and diagnose unsupported non-constant metadata for `saturatingAccumulation`, coop-vector interpretations, `inputInterpretationIsPacked`, `k`, and `transpose`.

As per coding guidelines, `source/slang/**`: Null pointer safety and proper error handling via diagnostics.

Also applies to: 9136-9153, 9177-9178, 9221-9222
9180-9187: ⚠️ Potential issue | 🟠 Major

Build the cooperative-vector signedness mask from the interpretation operands.

`operandsMask` is derived from the storage element types of the input/result coop-vectors, while the instruction encodes explicit `inputInterpretation`/`matrixInterpretation`/`biasInterpretation` operands. Packed or reinterpreted cases can make those disagree, so the component-type operands and signedness bits end up describing different math. The mask should be computed from the same interpretation operands that are emitted into the SPIR-V instruction.

As per coding guidelines, `source/slang/**`: IR pass correctness — ensure SSA form and type invariants are maintained.

Also applies to: 9224-9231
source/slang/slang-emit-cuda.cpp (1)
1238-1362: ⚠️ Potential issue | 🟠 Major

Forward `transpose` in the OptiX `CoopVecMatMulAdd` emission.

Lines 1303-1319 validate `getTranspose()`, but lines 1336-1361 never serialize that operand into the emitted `slangOptixCoopVecMatMul<...>` call. That means transposed `kIROp_CoopVecMatMulAdd` requests are ignored. This path should mirror `kIROp_CoopVecMatMul` and only omit `transpose` for the StructuredBuffer overload.

As per coding guidelines, `source/slang/**`: "(6) Cross-backend consistency — changes to one emitter may need parallel changes in others."

source/slang/slang-emit-hlsl.cpp (1)
32-101: ⚠️ Potential issue | 🟠 Major

Don't encode unsupported coop-vector tags as `0`.

Lines 45-47, 79-81, and 97-99 all use `0` as the fallback mapping value, and line 678 does the same after diagnosing `BFloat16`. The new `__builtin_*` emitters then keep serializing those forged tags into HLSL; for matrix layouts this is especially dangerous because `0` is already RowMajor. Please return success/failure from these helpers and have the statement emitters stop after the diagnostic instead of continuing with a bogus enum value.

As per coding guidelines, `source/slang/**`: "(5) Null pointer safety and proper error handling via diagnostics."

Also applies to: 667-698, 887-1031
prelude/slang-cuda-prelude.h (1)
6483-6485: ⚠️ Potential issue | 🔴 Critical

Fix the OptiX 9.0 trait guard before instantiating these wrappers.

The adjacent guard at line 6472 still uses `OPTIX_VERSION > 90000`, but all three overloads compile for `OPTIX_VERSION >= 90000` and immediately dereference `OptixCoopVecTraits` on lines 6493/6494, 6523/6524, and 6553/6554. OptiX 9.0 therefore loses the only specialization and these wrappers stop compiling.

Suggested fix

```diff
-#if defined(OPTIX_VERSION) && OPTIX_VERSION > 90000
+#if defined(OPTIX_VERSION) && OPTIX_VERSION >= 90000
 template<typename T, unsigned int N>
 struct OptixCoopVecTraits<OptixCoopVec<T, N>>
 {
     static constexpr unsigned int size = N;
 };
 #endif
```

As per coding guidelines, `prelude/**`: Built-in language definitions and intrinsics. Changes here affect all Slang programs. Verify backward compatibility and check that all target backends handle new intrinsics.

```bash
#!/bin/bash
set -euo pipefail

echo "Version guards and trait uses around the cooperative-vector wrappers:"
rg -n 'OPTIX_VERSION > 90000|OPTIX_VERSION >= 90000|OptixCoopVecTraits<OptixCoopVec|OptixCoopVecTraits<VecT' prelude/slang-cuda-prelude.h

echo
echo "Relevant source window:"
sed -n '6462,6566p' prelude/slang-cuda-prelude.h
```

Expected result: the trait specialization remains guarded by `> 90000` while the three wrapper overloads using `OptixCoopVecTraits` are compiled under `>= 90000`.

Also applies to: 6511-6514, 6544-6546
source/slang/slang-ir-validate.cpp (1)
687-704:⚠️ Potential issue | 🟠 MajorGuard operand/type dereferences before validating.
getOperandCount()does not prove every operand slot is populated, and this file already tolerates null operands in malformed IR. The direct->getDataType()/->getOp()chains here can crash the validator instead of emittingIrValidationFailed.As per coding guidelines, "Null pointer safety and proper error handling via diagnostics."
Also applies to: 793-802, 894-903, 1001-1028, 1048-1060
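The guarded pattern the review asks for can be sketched as follows. This is a minimal illustration only: `IRInst` here is a toy stand-in (the real type in `source/slang/slang-ir.h` is far richer), and the `ValidationResult` enum is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for the real IRInst; illustration only.
struct IRInst
{
    IRInst* type = nullptr;
    IRInst* getDataType() { return type; }
};

// Hypothetical result kind standing in for a diagnostic-sink report.
enum ValidationResult
{
    Ok,
    IrValidationFailed
};

// Check the operand and its type for null before dereferencing, and report
// a validation failure instead of crashing on malformed IR.
ValidationResult validateOperandType(IRInst* operand)
{
    if (!operand)
        return IrValidationFailed; // unpopulated operand slot
    if (!operand->getDataType())
        return IrValidationFailed; // malformed IR: missing type
    // ...type-specific checks would follow here...
    return Ok;
}
```

In the real validator these early returns would route through the diagnostic sink rather than a return code, but the ordering (null checks first, dereferences second) is the point.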
source/slang/hlsl.meta.slang (2)
30950-31096: ⚠️ Potential issue | 🟠 Major

Don't erase packedness from `matrixInterpretation` and `biasInterpretation`.

`__getCoopVecComponentScalarType()` still collapses `SignedInt8Packed`/`UnsignedInt8Packed` into the same scalar IDs as unpacked int8/uint8, while the new `__coopVecMatMul*` IR surface only preserves packedness for `inputInterpretation`. Packed matrix/bias operands will therefore lower indistinguishably from unpacked ones. Please thread separate packed flags for matrix/bias too, or reject packed matrix/bias before calling these intrinsics. As per coding guidelines, source/slang/**: "IR pass correctness — ensure SSA form and type invariants are maintained."

Also applies to: 31720-31749
31720-31767: ⚠️ Potential issue | 🟠 Major

Keep the pointer path storage-erased like the buffer path.

The structured-buffer overloads intentionally erase backing element types via `Ignored*BufferElementType`, but the new array/pointer helpers below immediately retype raw storage as `Ptr<T[]>`. In mixed-precision cases like half×half→float or int8×int8→int32, that pointee type no longer matches the actual memory layout handed to the IR op. Carry an explicit storage element type on the pointer forms too, or keep them raw/erased. As per coding guidelines, source/slang/**: "IR pass correctness — ensure SSA form and type invariants are maintained."

Also applies to: 31952-32116
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: caac342d-47f8-47ab-852a-b3d5c1570856
📒 Files selected for processing (32)
- include/slang.h
- prelude/slang-cuda-prelude.h
- source/core/slang-type-text-util.cpp
- source/slang-wasm/slang-wasm-bindings.cpp
- source/slang/hlsl.meta.slang
- source/slang/slang-emit-c-like.cpp
- source/slang/slang-emit-cuda.cpp
- source/slang/slang-emit-hlsl.cpp
- source/slang/slang-emit-hlsl.h
- source/slang/slang-emit-spirv.cpp
- source/slang/slang-emit.cpp
- source/slang/slang-ir-insts-stable-names.lua
- source/slang/slang-ir-insts.lua
- source/slang/slang-ir-validate.cpp
- source/slang/slang-ir-validate.h
- source/slang/slang-ir.cpp
- source/slang/slang-ir.h
- source/slang/slang-reflection-api.cpp
- source/slang/slang-reflection-json.cpp
- tests/cooperative-matrix/mat-mul-add-cuda-codegen.slang
- tests/cooperative-vector/matrix-mul-hlsl-codegen.slang
- tests/cooperative-vector/matrix-mul-spirv-codegen.slang
- tests/cooperative-vector/training-cuda-codegen.slang
- tests/cooperative-vector/training-hlsl-codegen.slang
- tests/cooperative-vector/training-spirv-codegen.slang
- tests/cuda/optix-coopvec-packed-input-diagnostic.slang
- tests/cuda/optix-coopvec-transpose-diagnostic.slang
- tests/cuda/optix-coopvec.slang
- tools/gfx/slang.slang
- tools/render-test/shader-input-layout.cpp
- tools/slang-test/slang-test-main.cpp
- tools/slang-unit-test/unit-test-special-scalar-reflection.cpp
Force-pushed from 16e8924 to c6f8975
Actionable comments posted: 6
♻️ Duplicate comments (12)
source/slang/slang-reflection-json.cpp (1)
459-473: ⚠️ Potential issue | 🟠 Major

Include `IntPtr`/`UIntPtr` in the scalar JSON mapping.

This switch still omits pointer-sized integer scalar kinds, so they can fall back to "unknown" (and trigger the debug assert path).

Suggested patch

```diff
     CASE(Int64, int64);
     CASE(UInt64, uint64);
+    CASE(IntPtr, intptr);
+    CASE(UIntPtr, uintptr);
     CASE(Float16, float16);
     CASE(Float32, float32);
     CASE(Float64, float64);
     CASE(BFloat16, bfloat16);
```

As per coding guidelines, "Cross-backend consistency — changes to one emitter may need parallel changes in others."
source/slang/slang-ir.h (1)
2135-2137: ⚠️ Potential issue | 🔴 Critical

Add explicit module-version bounds checks in the deserialize path.

Line 2136 bumps the max IR module version, but deserialization still appears to accept `m_version` without enforcing `[k_minSupportedModuleVersion, k_maxSupportedModuleVersion]` before consuming IR. Please reject out-of-range versions with a diagnostic in the read path (`readSerializedModuleInfo`/`readSerializedModuleIR_`).

As per coding guidelines, "Null pointer safety and proper error handling via diagnostics."
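The requested guard is a simple range check applied before any IR is consumed. A minimal sketch, assuming illustrative bounds (the real `k_minSupportedModuleVersion`/`k_maxSupportedModuleVersion` constants live in slang-ir.h and may differ):

```cpp
#include <cassert>
#include <cstdint>

// Illustrative bounds only; the real constants are defined in slang-ir.h.
static const uint32_t k_minSupportedModuleVersion = 10;
static const uint32_t k_maxSupportedModuleVersion = 13;

// Reject out-of-range module versions before any serialized IR is read;
// the caller would turn a `false` result into a diagnostic.
inline bool isSupportedModuleVersion(uint32_t version)
{
    return version >= k_minSupportedModuleVersion &&
           version <= k_maxSupportedModuleVersion;
}
```

Versions below the minimum (pre-upgrade modules) and above the maximum (produced by a newer compiler) are both rejected, which is the behavior the review asks the read path to enforce.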
source/slang/slang-ir.cpp (1)
9013-9015: 🧹 Nitpick | 🔵 Trivial

Re-raise: clarify the intentional omission of the accumulate coop ops.

Please add a short comment here stating that `kIROp_CoopVecOuterProductAccumulate` and `kIROp_CoopVecReduceSumAccumulate` are intentionally excluded because they write memory and must remain side-effecting.

Suggested patch

```diff
+    // `kIROp_CoopVecOuterProductAccumulate` and `kIROp_CoopVecReduceSumAccumulate`
+    // are intentionally excluded: they write to memory and must be side-effecting.
     case kIROp_CoopMatMulAdd:
     case kIROp_CoopVecMatMul:
     case kIROp_CoopVecMatMulAdd:
```

As per coding guidelines, "IR pass correctness — ensure SSA form and type invariants are maintained."
source/core/slang-type-text-util.cpp (1)
14-31: ⚠️ Potential issue | 🟠 Major

`IntPtr`/`UIntPtr` are still missing from the scalar text mappings.

`SLANG_SCALAR_TYPES` (Line 14) is the single source for both `findScalarType()` and `getScalarTypeName()`. `IntPtr`/`UIntPtr` are present in the public scalar enum but still not mapped here, so `intptr_t`/`uintptr_t` won't round-trip through the text utilities.

As per coding guidelines, source/core/**: "Core utilities shared across the compiler. Check for: ... (3) Platform portability (Windows, Linux, macOS)."

🔧 Proposed fix

```diff
 #define SLANG_SCALAR_TYPES(x) \
     x(None, none) \
     x(Void, void) \
     x(Bool, bool) \
     x(Float16, half) \
     x(UInt8, uint8_t) \
     x(Int8, int8_t) \
     x(UInt16, uint16_t) \
     x(Int16, int16_t) \
     x(UInt32, uint32_t) \
     x(Int32, int32_t) \
     x(Int64, int64_t) \
     x(UInt64, uint64_t) \
+    x(IntPtr, intptr_t) \
+    x(UIntPtr, uintptr_t) \
     x(Float32, float) \
     x(Float64, double) \
     x(BFloat16, bfloat16) \
     x(FloatE4M3, float_e4m3) \
     x(FloatE5M2, float_e5m2)
```

tests/cooperative-vector/training-spirv-codegen.slang (1)
3-5: ⚠️ Potential issue | 🟠 Major

The SPIR-V check is still too weak.

These assertions only prove that the training capability and opcodes appear. They still won't catch regressions in the encoded offset/stride/layout/component operands, and they also don't assert the base `CooperativeVectorNV` capability that becomes necessary once `OpTypeCooperativeVectorNV` is emitted.

As per coding guidelines, "tests/**: Verify test correctness. Tests use //TEST directives: COMPARE_COMPUTE for GPU compute tests, INTERPRET for CPU interpreter tests. Ensure new features have corresponding tests. Check that expected outputs match the intended behavior, not just current behavior."
tests/cuda/optix-coopvec-packed-input-diagnostic.slang (1)
11-27: ⚠️ Potential issue | 🟠 Major

Add the packed `MatMulAdd` diagnostic case too.

This file only exercises `coopVecMatMulPacked`, but the OptiX CUDA path has a parallel packed-input rejection for `coopVecMatMulAddPacked` as well. If the add variant stops diagnosing packed input, this test still passes.

As per coding guidelines, "tests/**: Verify test correctness. Tests use //TEST directives: COMPARE_COMPUTE for GPU compute tests, INTERPRET for CPU interpreter tests. Ensure new features have corresponding tests. Check that expected outputs match the intended behavior, not just current behavior."
tests/cooperative-vector/matrix-mul-hlsl-codegen.slang (1)
5-8: ⚠️ Potential issue | 🟠 Major

These checks still can't detect a swapped `elementCount`/`k`.

Every expectation uses `int(4), int(4)`, so an emission bug that flips those two arguments still passes FileCheck. Make at least one matmul and one matmul-add case use distinct values so the argument order is actually verified.

As per coding guidelines, "tests/**: Verify test correctness. Tests use //TEST directives: COMPARE_COMPUTE for GPU compute tests, INTERPRET for CPU interpreter tests. Ensure new features have corresponding tests. Check that expected outputs match the intended behavior, not just current behavior."
tests/cooperative-vector/training-cuda-codegen.slang (1)
3-4: ⚠️ Potential issue | 🟡 Minor

Loosen the literal matching in these CHECKs.

These assertions still depend on one pretty-printing of the offset and stride. A harmless change from `int(0)` to `0`, or from `32U` to `32`, would fail the test without changing the generated call semantics.

♻️ Suggested CHECK update

```diff
-// CHECK: optixCoopVecOuterProductAccumulate({{.*}}, {{.*}}, (CUdeviceptr)(&({{.*}})), int(0), 32U)
-// CHECK: optixCoopVecReduceSumAccumulate({{.*}}, (CUdeviceptr)(&({{.*}})), int(0))
+// CHECK: optixCoopVecOuterProductAccumulate({{.*}}, {{.*}}, (CUdeviceptr)(&({{.*}})), {{(int\()?0\)?}}, {{32U?}})
+// CHECK: optixCoopVecReduceSumAccumulate({{.*}}, (CUdeviceptr)(&({{.*}})), {{(int\()?0\)?}})
```

As per coding guidelines, tests/**: "Check that expected outputs match the intended behavior, not just current behavior."

tests/cooperative-vector/matrix-mul-spirv-codegen.slang (1)
15-38: ⚠️ Potential issue | 🟡 Minor

Add a mixed-signedness operand-mask case.

This setup only exercises the fully signed path, so it never validates the operand-mask logic when input, matrix, bias, and result signedness differ. A bug there would still pass this test. A second case like unsigned input with signed result would cover the missing branch.

As per coding guidelines, tests/**: "Ensure new features have corresponding tests."

source/slang/slang-emit-spirv.cpp (2)
9132-9163: ⚠️ Potential issue | 🟡 Minor

Unwrap constant refs before remapping coop-vector enums.

These helpers still assume raw `IRIntLit`/`IRBoolLit`. If legalization leaves a wrapped constant here (for example via `IRGlobalValueRef`), SPIR-V emission hits the cast even though the operand is compile-time constant. Normalize once up front so the mapped enum operands and packed flag follow the same constant-handling path as the rest of the emitter.

Patch sketch

```diff
+    IRInst* unwrapCoopVecEnumOperand(IRInst* operand)
+    {
+        if (auto globalValueRef = as<IRGlobalValueRef>(operand))
+            return globalValueRef->getValue();
+        return operand;
+    }
+
     void emitMappedCoopVecMatrixLayoutOperand(IRInst* operand)
     {
+        operand = unwrapCoopVecEnumOperand(operand);
         auto intLit = cast<IRIntLit>(operand);
         emitOperand(emitIntConstant(
             mapSlangCoopVecMatrixLayoutToSpv(intLit->getValue()), operand->getDataType()));
     }
@@
     void emitMappedCoopVecComponentTypeOperand(
         IRInst* operand,
         IRInst* inputInterpretationIsPacked = nullptr)
     {
+        operand = unwrapCoopVecEnumOperand(operand);
         auto intLit = cast<IRIntLit>(operand);
         bool isPacked = false;
         if (inputInterpretationIsPacked)
         {
-            isPacked = cast<IRBoolLit>(inputInterpretationIsPacked)->getValue();
+            auto packedOperand = unwrapCoopVecEnumOperand(inputInterpretationIsPacked);
+            isPacked = cast<IRBoolLit>(packedOperand)->getValue();
         }
```

As per coding guidelines, source/slang/**: "IR pass correctness — ensure SSA form and type invariants are maintained."
9177-9184: ⚠️ Potential issue | 🟠 Major

Build the cooperative-vector signedness mask from the interpretation operands.

`operandsMask` is still derived from the storage element types, while the instruction explicitly emits `inputInterpretation`, `matrixInterpretation`, and `biasInterpretation`. Packed/reinterpreted paths can make those disagree, and `biasInterpretation` never contributes to `MatrixC` signedness here. Use the interpretation operands as the source of truth so the enum operands and signedness bits describe the same math.

As per coding guidelines, source/slang/**: "IR pass correctness — ensure SSA form and type invariants are maintained."

Also applies to: 9221-9228
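The suggested direction can be sketched as a pure mask builder. Everything below is a hedged stand-in: the bit positions, the interpretation codes, and `isSignedInterpretation` are illustrative assumptions, not the real SPIR-V emitter's definitions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical signedness bits; the real mask bits come from the SPIR-V
// cooperative-matrix operands enumeration.
const uint32_t kMatrixASignedBit = 0x1; // input
const uint32_t kMatrixBSignedBit = 0x2; // matrix
const uint32_t kMatrixCSignedBit = 0x4; // bias/accumulator

// Hypothetical interpretation codes for illustration.
enum Interp { UnsignedInt8 = 0, SignedInt8 = 1 };

bool isSignedInterpretation(int interp)
{
    return interp == SignedInt8;
}

// Derive the signedness mask from the interpretation operands the
// instruction already carries, not from the storage element types.
uint32_t buildOperandsMask(int inputInterp, int matrixInterp, int biasInterp)
{
    uint32_t mask = 0;
    if (isSignedInterpretation(inputInterp))
        mask |= kMatrixASignedBit;
    if (isSignedInterpretation(matrixInterp))
        mask |= kMatrixBSignedBit;
    if (isSignedInterpretation(biasInterp))
        mask |= kMatrixCSignedBit;
    return mask;
}
```

The key property is that packed or reinterpreted storage can no longer desynchronize the mask from the enum operands, because both are computed from the same interpretation inputs.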
prelude/slang-cuda-prelude.h (1)
6483-6504: ⚠️ Potential issue | 🔴 Critical

Keep the `OptixCoopVecTraits` specialization visible in OptiX 9.0.

Line 6472 still guards the only `OptixCoopVecTraits<OptixCoopVec<T, N>>` specialization with `OPTIX_VERSION > 90000`, while all three wrappers here are enabled from `OPTIX_VERSION >= 90000`. On OptiX 9.0, `OptixCoopVecTraits<VecTOut>::size`/`OptixCoopVecTraits<VecTIn>::size` therefore remain unspecialized, so any instantiation of these overloads fails.

Suggested fix

```diff
-#if defined(OPTIX_VERSION) && OPTIX_VERSION > 90000
+#if defined(OPTIX_VERSION) && OPTIX_VERSION >= 90000
 template<typename T, unsigned int N>
 struct OptixCoopVecTraits<OptixCoopVec<T, N>>
 {
     static constexpr unsigned int size = N;
 };
 #endif
```

As per coding guidelines, prelude/**: "Built-in language definitions and intrinsics. Changes here affect all Slang programs. Verify backward compatibility and check that all target backends handle new intrinsics."

```bash
#!/bin/bash
set -euo pipefail
sed -n '6464,6566p' prelude/slang-cuda-prelude.h
```

Also applies to: 6511-6536, 6544-6565
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: b0b9e224-0c88-4d6b-beb7-95d796faec58
csyonghe
left a comment
Looks like we also need a new test to verify that the metadata reporting is correct.
jkwak-work
left a comment
Good job overall.
I left a few comments.
@cmarcelo, can you share your update or an ETA for this PR? Please let us know if you need help.
Force-pushed from 46e92ec to bb394be
Force-pushed from bb394be to 72394b8
I see the following compilation error from the slangpy test:

@cmarcelo, can you address this problem?
The Falcor test shows the following compilation error as well:
I think you are referring to my comment here.

I will pull and build to reproduce the compilation error you are talking about.
We are trying to add more types to ScalarType (IntPtr, UIntPtr, and BFloat16) in shader-slang/slang#10643. The slang-rhi build prints a warning that the new types are not handled in a switch statement for WGPU. Because WGPU cannot represent those types, this PR converts the new types to Undefined and avoids the compilation warnings, which are treated as errors on CI machines.
I have a fix for the slang-rhi build warning.
Force-pushed from 72394b8 to 9ab3f26
Force-pushed from 9ab3f26 to 3c3d0f1
@cmarcelo, can you merge my change to fix the compiler warning and the failing tests?
Force-pushed from fc3c27e to 6496926
Done.
OpCooperativeVectorMatrixMulNV (and related NV ops) require the matrix/buffer operand to be a pointer to an array type. After SPIR-V legalization, ByteAddressBuffer/StructuredBuffer global params become ptr-to-struct, which fails the SPIR-V validator with "Pointer's Type must be an array type".

Add emitBufferPtrAsArrayPtr() that emits OpAccessChain index 0 to pierce through the wrapper struct when needed. For Ptr<T[]> inputs that already point directly to an unsized array, the pointer is returned as-is. Apply the helper in emitCoopVecMatMulAdd (matrix and bias operands), emitCoopVecOuterProductAccumulate (matrix operand), and emitCoopVecReduceSumAccumulate (buffer operand).
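The commit's decision logic reduces to one question per operand: does the pointee already have array type, or is it the legalized wrapper struct? A conceptual sketch with stand-in types (the real helper operates on SPIR-V instructions in slang-emit-spirv.cpp, not this toy enum):

```cpp
#include <cassert>

// Toy classification of what a buffer pointer points at after legalization.
enum class PointeeKind
{
    UnsizedArray,  // Ptr<T[]> already points at an array: usable as-is
    WrapperStruct, // legalized buffer: struct { T elements[]; }
};

// Only the wrapper-struct case needs an extra OpAccessChain with index 0
// to reach the nested array; array-typed pointers pass through unchanged.
bool needsAccessChainToArray(PointeeKind pointee)
{
    return pointee == PointeeKind::WrapperStruct;
}
```

Keeping the pass-through case explicit matters: emitting an access chain on a pointer that is already array-typed would itself fail validation.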
Force-pushed from 6496926 to 2fc068e
Verdict: ✅ Clean — no significant issues found. 2 minor gaps noted below.
This PR replaces inline __target_switch / spirv_asm / __intrinsic_asm blocks in hlsl.meta.slang with 4 new IR instructions (CoopMatMulAdd, CoopVecMatMulAdd, CoopVecOuterProductAccumulate, CoopVecReduceSumAccumulate) and moves target-specific codegen into C++ emitters. The architecture is sound — enum mappings are centralized through SLANG_SCALAR_TYPE_* / SLANG_COOPERATIVE_* constants, side-effect analysis correctly marks the two accumulation ops as having side effects while the two multiply ops are pure, and a new ~400-line IR validation pass catches malformed cooperative operations before they reach backend emitters.
Changes Overview
New IR Instructions (slang-ir-insts.lua, slang-ir-insts-stable-names.lua, slang-ir.cpp, slang-ir.h)
- What changed: 4 new instruction definitions with named operands (including optional bias params for CoopVecMatMulAdd), stable name entries 779–782, mightHaveSideEffects() updated to mark CoopMatMulAdd/CoopVecMatMulAdd as side-effect-free, module version bumped 12→13.
Standard Library Refactor (hlsl.meta.slang)
- What changed: Removed ~500 lines of per-target __target_switch/spirv_asm/__intrinsic_asm blocks for cooperative matrix/vector multiply, outer product, and reduce-sum operations. Replaced with __intrinsic_op declarations that lower to the new IR instructions. Added __getCoopVecComponentScalarType to map CoopVecComponentType to SLANG_SCALAR_TYPE_*. Renamed __inputInterpretationPackingFactor → __componentPackingFactor.
SPIRV Backend (slang-emit-spirv.cpp)
- What changed: Added emitCoopMatMulAdd (KHR), emitCoopVecMatMulAdd (NV), emitCoopVecOuterProductAccumulate (NV), emitCoopVecReduceSumAccumulate (NV). Added an emitBufferPtrAsArrayPtr helper for SPIR-V buffer→array-ptr legalization. Added enum mapping functions (mapSlangCooperativeMatrixUseToSpv, mapSlangCoopVecMatrixLayoutToSpv, mapSlangCoopVecComponentTypeToSpv). Cooperative matrix use values are now mapped through mapSlangCooperativeMatrixUseToSpv instead of passing raw enum values.
HLSL Backend (slang-emit-hlsl.cpp, slang-emit-hlsl.h)
- What changed: Added handlers for CoopVecMatMulAdd (emitting __builtin_MatVecMul/__builtin_MatVecMulAdd), CoopVecOuterProductAccumulate (__builtin_OuterProductAccumulate), and CoopVecReduceSumAccumulate (__builtin_VectorAccumulate). Added _mapSlangCoopVecComponentTypeToHLSL and _mapSlangCoopVecMatrixLayoutToHLSL mapping functions.
CUDA/OptiX Backend (slang-emit-cuda.cpp, slang-cuda-prelude.h)
- What changed: Removed the runtime slangToOptixComponentType/slangToOptixMatrixLayout constexpr mappers from the prelude — mapping now happens at compile time in the C++ emitter. Template parameters changed from unsigned to direct OptiX enum types. Added getOptixCoopVecComponentTypeName/getOptixCoopVecMatrixLayoutName mappers. Added handlers for all 4 IR ops with OptiX-specific constraint validation (training-optimal layout, float16 interpretation for outer product).
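Moving the mapping into the emitter means the generated CUDA source simply names the target enum constant directly. A hedged sketch of such a name mapper — the scalar codes below are stand-ins, not the real SLANG_SCALAR_TYPE_* values, and only a few OptiX element-type names are shown:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical reconstruction of a compile-time component-type name mapper.
// Input codes are illustrative placeholders for SLANG_SCALAR_TYPE_* values.
const char* getOptixCoopVecComponentTypeName(int slangScalarType)
{
    switch (slangScalarType)
    {
    case 0: return "OPTIX_COOP_VEC_ELEM_TYPE_FLOAT16";
    case 1: return "OPTIX_COOP_VEC_ELEM_TYPE_FLOAT32";
    case 2: return "OPTIX_COOP_VEC_ELEM_TYPE_INT8";
    default:
        return nullptr; // caller emits a diagnostic for unsupported types
    }
}
```

Returning nullptr for unsupported inputs lets the emitter surface a diagnostic at compile time instead of shipping a runtime switch into every generated kernel, which is the design change this bullet describes.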
IR Validation (slang-ir-validate.cpp, slang-ir-validate.h, slang-emit.cpp)
- What changed: Added a validateCooperativeOperations pass (~400 lines) checking operand counts, type compatibility, dimension constraints, and input interpretation validity for all 4 new IR ops. Registered as SLANG_PASS(validateCooperativeOperations, sink) in the emit pipeline.
Public API & Reflection (slang.h, slang-reflection-api.cpp, slang-reflection-json.cpp, slang-type-text-util.cpp, slang-wasm-bindings.cpp)
- What changed: Added SlangCooperativeMatrixUse and SlangCooperativeVectorMatrixLayout enums. Added SLANG_SCALAR_TYPE_BFLOAT16, SLANG_SCALAR_TYPE_FLOAT_E4M3, and SLANG_SCALAR_TYPE_FLOAT_E5M2 scalar types. Extended TypeReflection::ScalarType with IntPtr, UIntPtr, BFloat16, FloatE4M3, FloatE5M2. Updated the reflection API to recognize the special scalar types.
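Reflection consumers typically need a stable textual name for each new scalar kind. A minimal sketch, assuming hypothetical numeric codes (the real SLANG_SCALAR_TYPE_* constants live in include/slang.h); the names match the text-util mapping used elsewhere in this PR (bfloat16 / float_e4m3 / float_e5m2):

```cpp
#include <cassert>
#include <cstring>

// Hypothetical enum values; illustrative only, not the real constants.
enum SpecialScalarKind
{
    kBFloat16 = 17,
    kFloatE4M3 = 18,
    kFloatE5M2 = 19,
};

// Map a special scalar kind to its display name; unknown kinds fall back
// to "unknown" rather than asserting, mirroring the JSON path's concern.
const char* specialScalarTypeName(int kind)
{
    switch (kind)
    {
    case kBFloat16:
        return "bfloat16";
    case kFloatE4M3:
        return "float_e4m3";
    case kFloatE5M2:
        return "float_e5m2";
    default:
        return "unknown";
    }
}
```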
Tests (10 new test files)
- What changed: Added codegen tests for SPIRV, HLSL, CUDA across cooperative matrix and vector operations. Added 2 CUDA diagnostic tests for packed-input and transpose constraints. Added unit test for special scalar type reflection.
Findings (2 total)
| Severity | Location | Finding |
|---|---|---|
| 🟡 Gap | hlsl.meta.slang:~30985 | BFloat16 missing from `__getCoopVecComponentScalarType` — backend-specific diagnostics in SPIRV/HLSL emitters are unreachable dead code |
| 🟡 Gap | slang-ir-validate.cpp:~1196 | Validation pass error paths lack negative/diagnostic test coverage |
jkwak-work
left a comment
Looks good to me.
It is a big change but I reviewed it multiple times over multiple days.
No description provided.