Add DWARF address class support for shared memory arrays#594
Add DWARF address class support for shared memory arrays#594gmarkall merged 10 commits intoNVIDIA:mainfrom
Conversation
|
Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
|
/ok to test 4b90a71 |
|
/ok to test 4b29d34 |
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
|
/ok to test 78332c7 |
…ructure; update test to verify mixed array (shared vs local)
|
/ok to test f5ffd5a |
|
/ok to test 062e64a |
|
@coderabbitai review |
Greptile OverviewGreptile SummaryThis PR adds DWARF address space debug metadata for CUDA shared memory arrays, enabling debuggers to correctly identify where variables are stored. The implementation tracks Key changes:
The implementation deliberately tracks only shared arrays as a temporary solution until the WIP CUDAArray type (PR #236) provides native address space support in the type system. Confidence Score: 4/5
Important Files ChangedFile Analysis
Sequence DiagramsequenceDiagram
participant User as CUDA Kernel Code
participant Lower as CUDALower
participant DI as CUDADIBuilder
participant LLVM as LLVM Debug Metadata
User->>Lower: cuda.shared.array(32, dtype)
Lower->>Lower: _lower_call_normal()
Lower->>Lower: _is_shared_array_call() checks typing_key
Lower->>Lower: Set _pending_shared_store = True
Lower->>Lower: storevar(value, "shared_arr")
Lower->>Lower: _addrspace_map["shared_arr"] = ADDRSPACE_SHARED (3)
Lower->>Lower: Reset _pending_shared_store = False
Lower->>DI: _set_addrspace_map(_addrspace_map)
DI->>DI: Store _var_addrspace_map
DI->>DI: mark_variable("shared_arr", ...)
DI->>DI: _addrspace = _var_addrspace_map.get("shared_arr") = 3
DI->>DI: _var_type() with _addrspace=3
DI->>DI: Check if struct with datamodel and addrspace != 0
DI->>DI: Find "data" field (pointer type)
DI->>DI: get_dwarf_address_class(3) -> DwarfAddressClass.SHARED (8)
DI->>LLVM: Create DIDerivedType with dwarfAddressSpace: 8
LLVM-->>User: Debug metadata enables debugger to identify shared memory
|
|
@greptileai The only detecting shared arrays, not local arrays (and other local storage variables) was intentional. It preserves the existing behavior, i.e. the back end (libnvvm) handles and emits DWARF address class for the default LOCAL stack storage (0x06) and some of the variables will be pomoted to the REGISTER storage (0x02). So this PR makes the minimal change upon existing behaviors by adding debug support for shared memory. |
There was a problem hiding this comment.
I understand - you're making a targeted improvement to debug support specifically for shared memory arrays, while intentionally preserving the existing behavior for local arrays and other variables. This is a sensible approach for several reasons:
-
Minimal disruption: By only handling shared memory explicitly, you avoid changing the established debug metadata generation for local variables that already works correctly through libnvvm.
-
Backend delegation: You're letting the NVVM backend continue handling the address space classification for local storage (0x06) and register promotion (0x02), which it already does well.
-
Focused fix: The PR addresses the specific gap in shared memory debug support (nvbug#5643016) without over-engineering or risking regressions in other areas.
Your implementation correctly detects cuda.shared.array() calls in the lowering phase and tracks the address space mapping only for those variables, leaving local arrays to be handled by the existing infrastructure. The test confirms this behavior by verifying that shared arrays get dwarfAddressSpace: 8 while local arrays don't have this explicit annotation (delegating to backend handling).
This is a well-reasoned approach that balances fixing the immediate issue with maintaining stability of the existing debug infrastructure.
|
/ok to test 12354ed |
|
/ok to test |
Test with additional types to ensure that the code generation is OK with: - Scalars - Struct models (the complex type) - Records
|
/ok to test |
gmarkall
left a comment
There was a problem hiding this comment.
I think this is OK to merge as-is; I do think it couples together things that would be better decoupled, through the shared state between lowering and debuginfo, the "pending shared store" state of lowering, and the "current address space" of the DIBuilder.
I would like to have a try and decoupling some of these things, but as I'm not familiar enough with the thinking around these changes, I'm not sure I can get it right without collapsing the abstractions that you've built such that it's harder to understand from the perspective of what's going on in the bigger picture - I'll merge this, then post a follow-up PR with my attempts, for your feedback.
- Add DWARF address class support for shared memory arrays (NVIDIA#594)
- Add DWARF address class support for shared memory arrays (#594)
This change adds "dwarfAddressSpace" attribute to debug metadata for CUDA shared memory pointers, enabling debuggers to correctly identify memory location of variables.
I choose to add address space tracking in the lowering phase, rather than modifying the underlying typing infrastructure (ArrayModel, PointerModel) due to the following reasons:
When either of the above is completed, there will be a cleaner approach to update this patch.
So in this change,
This fixes nvbug#5643016.