Add DWARF address class support for shared memory arrays#594

Merged
gmarkall merged 10 commits into NVIDIA:main from jiel-nv:dwarf-address-class
Dec 2, 2025

Conversation

@jiel-nv
Contributor

@jiel-nv jiel-nv commented Nov 18, 2025

This change adds the "dwarfAddressSpace" attribute to the debug metadata for CUDA shared memory pointers, enabling debuggers to correctly identify the memory location of variables.

I chose to add address space tracking in the lowering phase, rather than modifying the underlying typing infrastructure (ArrayModel, PointerModel), for the following reasons:

  1. There is an ongoing effort to decouple from Numba's typing system, but the default behavior still redirects to Numba;
  2. There is a WIP PR #236 introducing a CUDAArray type and implementation with address space information.

When either of the above is completed, there will be a cleaner way to update this patch.

So in this change:

  1. Detection is added in CUDALower for Numba ir.Call nodes to find cuda.shared.array() calls; a flag is set so that the subsequent storevar() records the name / addrspace mapping; the address space map is later referenced when emitting debug info.
  2. A mapping from NVVM address space to DWARF address class is added in order to emit "dwarfAddressSpace" on the DIDerivedType for the pointer member "data" of the CUDA array descriptor.
  3. A new test is added to make sure shared arrays and regular local arrays are distinguished.

This fixes nvbug#5643016.
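The NVVM-to-DWARF mapping described in point 2 can be sketched as below. This is a hypothetical illustration, not the actual numba-cuda code: the value SHARED = 8 and the LOCAL (0x06) / REGISTER (0x02) classes come from the discussion in this PR, while the helper name `get_dwarf_address_class` and the exact shape of the mapping are assumptions.

```python
from enum import IntEnum

# NVVM address spaces as defined by the NVVM IR specification.
ADDRSPACE_GENERIC = 0
ADDRSPACE_SHARED = 3
ADDRSPACE_LOCAL = 5


class DwarfAddressClass(IntEnum):
    # DWARF address classes referenced in this PR's discussion.
    REGISTER = 0x02
    LOCAL = 0x06
    SHARED = 0x08


# Only shared memory is mapped explicitly in this PR; other address
# spaces are left to the backend (libnvvm) to classify.
_NVVM_TO_DWARF = {
    ADDRSPACE_SHARED: DwarfAddressClass.SHARED,
}


def get_dwarf_address_class(nvvm_addrspace):
    # Return the DWARF address class for a tracked NVVM address space,
    # or None so the backend keeps its default handling.
    return _NVVM_TO_DWARF.get(nvvm_addrspace)
```

Returning None for untracked spaces mirrors the PR's stated intent of making a minimal change: local and register storage keep their existing backend-driven classification.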

@copy-pr-bot

copy-pr-bot bot commented Nov 18, 2025

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@jiel-nv
Contributor Author

jiel-nv commented Nov 18, 2025

/ok to test 4b90a71

@jiel-nv
Contributor Author

jiel-nv commented Nov 18, 2025

/ok to test 4b29d34

@jiel-nv jiel-nv added the 2 - In Progress Currently a work in progress label Nov 18, 2025
@jiel-nv jiel-nv changed the title from "Add DWARF Address Class Support for Shared Memory Arrays" to "[WIP] Add DWARF Address Class Support for Shared Memory Arrays" Nov 18, 2025
@jiel-nv jiel-nv marked this pull request as draft November 18, 2025 16:30
@copy-pr-bot

copy-pr-bot bot commented Nov 18, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@jiel-nv
Contributor Author

jiel-nv commented Nov 18, 2025

/ok to test 78332c7

@jiel-nv
Contributor Author

jiel-nv commented Nov 20, 2025

/ok to test f5ffd5a

@jiel-nv
Contributor Author

jiel-nv commented Nov 20, 2025

/ok to test 062e64a

@jiel-nv jiel-nv marked this pull request as ready for review November 20, 2025 01:31
@jiel-nv jiel-nv requested a review from gmarkall November 20, 2025 01:40
@jiel-nv jiel-nv added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Nov 20, 2025
@gmarkall
Contributor

@coderabbitai review

@greptile-apps
Contributor

greptile-apps bot commented Nov 20, 2025

Greptile Overview

Greptile Summary

This PR adds DWARF address space debug metadata for CUDA shared memory arrays, enabling debuggers to correctly identify where variables are stored. The implementation tracks cuda.shared.array() calls during lowering and annotates the resulting pointer debug metadata with dwarfAddressSpace: 8.

Key changes:

  • Added DwarfAddressClass enum mapping NVVM address spaces to DWARF address classes
  • Modified CUDALower to detect cuda.shared.array() calls and track variable address spaces
  • Enhanced CUDADIBuilder._var_type() to emit dwarfAddressSpace attribute for pointer members
  • Added comprehensive tests verifying shared arrays get address class 8 while local arrays don't

The implementation deliberately tracks only shared arrays as a temporary solution until the WIP CUDAArray type (PR #236) provides native address space support in the type system.
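The metadata-emission step summarized above can be illustrated with a small sketch of how a DIBuilder might attach `dwarfAddressSpace` to the "data" pointer member of an array descriptor. All names here (`pointer_member_operands`, the operand keys as plain dict entries) are illustrative stand-ins, not the real CUDADIBuilder API.

```python
ADDRSPACE_SHARED = 3        # NVVM shared memory address space
DWARF_CLASS_SHARED = 8      # DWARF address class for shared memory

_ADDRSPACE_TO_DWARF = {ADDRSPACE_SHARED: DWARF_CLASS_SHARED}


def pointer_member_operands(base_type_ref, addrspace=None):
    # Build the operands of a DIDerivedType with tag DW_TAG_pointer_type
    # for the descriptor's "data" member. The dwarfAddressSpace operand
    # is only added when the NVVM address space maps to a DWARF class,
    # preserving the default (backend-handled) behavior otherwise.
    operands = {
        "tag": "DW_TAG_pointer_type",
        "baseType": base_type_ref,
        "size": 64,
    }
    dwarf_class = _ADDRSPACE_TO_DWARF.get(addrspace)
    if dwarf_class is not None:
        operands["dwarfAddressSpace"] = dwarf_class
    return operands
```

For a shared array, `pointer_member_operands("!11", addrspace=3)` includes `"dwarfAddressSpace": 8`; for a local array, the operand is simply absent.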

Confidence Score: 4/5

  • This PR is safe to merge with minor considerations about incomplete coverage
  • The implementation is correct and well-tested for shared memory arrays. The score reflects that only cuda.shared.array() is tracked, while cuda.local.array() could also benefit from address space tracking (LOCAL maps to DWARF class 0x06), as noted in previous review comments. The approach is explicitly acknowledged as temporary, pending PR #236 ([WIP] Add CUDAArray type and implementation with address space information).
  • No files require special attention - implementation is clean and focused

Important Files Changed

File Analysis

  • numba_cuda/numba/cuda/lowering.py (4/5): Added tracking for cuda.shared.array() calls to record address space in _addrspace_map for debug info
  • numba_cuda/numba/cuda/debuginfo.py (5/5): Added DwarfAddressClass enum, address space mapping logic, and DWARF metadata emission for pointer types
  • numba_cuda/numba/cuda/tests/cudapy/test_debuginfo.py (5/5): Added comprehensive tests verifying shared arrays get dwarfAddressSpace: 8 while local arrays don't

Sequence Diagram

sequenceDiagram
    participant User as CUDA Kernel Code
    participant Lower as CUDALower
    participant DI as CUDADIBuilder
    participant LLVM as LLVM Debug Metadata

    User->>Lower: cuda.shared.array(32, dtype)
    Lower->>Lower: _lower_call_normal()
    Lower->>Lower: _is_shared_array_call() checks typing_key
    Lower->>Lower: Set _pending_shared_store = True
    
    Lower->>Lower: storevar(value, "shared_arr")
    Lower->>Lower: _addrspace_map["shared_arr"] = ADDRSPACE_SHARED (3)
    Lower->>Lower: Reset _pending_shared_store = False
    
    Lower->>DI: _set_addrspace_map(_addrspace_map)
    DI->>DI: Store _var_addrspace_map
    
    DI->>DI: mark_variable("shared_arr", ...)
    DI->>DI: _addrspace = _var_addrspace_map.get("shared_arr") = 3
    
    DI->>DI: _var_type() with _addrspace=3
    DI->>DI: Check if struct with datamodel and addrspace != 0
    DI->>DI: Find "data" field (pointer type)
    DI->>DI: get_dwarf_address_class(3) -> DwarfAddressClass.SHARED (8)
    
    DI->>LLVM: Create DIDerivedType with dwarfAddressSpace: 8
    LLVM-->>User: Debug metadata enables debugger to identify shared memory
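The lowering-side flow in the diagram above can be sketched as a small stateful class in plain Python. The names (`_pending_shared_store`, `_addrspace_map`) mirror the diagram; this is a stand-alone illustration, not the actual CUDALower implementation.

```python
ADDRSPACE_SHARED = 3  # NVVM shared memory address space


class LoweringSketch:
    def __init__(self):
        # Flag set when a cuda.shared.array() call has just been lowered.
        self._pending_shared_store = False
        # Variable name -> NVVM address space, later handed to the DIBuilder.
        self._addrspace_map = {}

    def lower_call(self, is_shared_array_call):
        # Step 1: detect cuda.shared.array() and flag the next store.
        if is_shared_array_call:
            self._pending_shared_store = True

    def storevar(self, name):
        # Step 2: the storevar() following the flagged call records the
        # name-to-address-space mapping, then clears the flag.
        if self._pending_shared_store:
            self._addrspace_map[name] = ADDRSPACE_SHARED
            self._pending_shared_store = False


lower = LoweringSketch()
lower.lower_call(is_shared_array_call=True)
lower.storevar("shared_arr")      # recorded with address space 3
lower.lower_call(is_shared_array_call=False)
lower.storevar("local_arr")       # not recorded; backend handles it
```

After this sequence, `lower._addrspace_map` holds only `{"shared_arr": 3}`, matching the diagram's behavior of tracking shared arrays exclusively.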

Contributor

@greptile-apps greptile-apps bot left a comment


3 files reviewed, no comments


@jiel-nv jiel-nv changed the title from "[WIP] Add DWARF Address Class Support for Shared Memory Arrays" to "Add DWARF address class support for shared memory arrays" Nov 20, 2025
@jiel-nv
Copy link
Contributor Author

jiel-nv commented Nov 21, 2025

@greptileai Detecting only shared arrays, not local arrays (and other local storage variables), was intentional. It preserves the existing behavior: the back end (libnvvm) handles and emits the DWARF address class for the default LOCAL stack storage (0x06), and some variables will be promoted to REGISTER storage (0x02). So this PR makes a minimal change on top of the existing behavior by adding debug support for shared memory.

Contributor

@greptile-apps greptile-apps bot left a comment


I understand - you're making a targeted improvement to debug support specifically for shared memory arrays, while intentionally preserving the existing behavior for local arrays and other variables. This is a sensible approach for several reasons:

  1. Minimal disruption: By only handling shared memory explicitly, you avoid changing the established debug metadata generation for local variables that already works correctly through libnvvm.

  2. Backend delegation: You're letting the NVVM backend continue handling the address space classification for local storage (0x06) and register promotion (0x02), which it already does well.

  3. Focused fix: The PR addresses the specific gap in shared memory debug support (nvbug#5643016) without over-engineering or risking regressions in other areas.

Your implementation correctly detects cuda.shared.array() calls in the lowering phase and tracks the address space mapping only for those variables, leaving local arrays to be handled by the existing infrastructure. The test confirms this behavior by verifying that shared arrays get dwarfAddressSpace: 8 while local arrays don't have this explicit annotation (delegating to backend handling).
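A test along these lines can be sketched by scanning the textual LLVM IR for the attribute. The IR strings and the helper name below are hypothetical stand-ins; the real test in test_debuginfo.py may inspect the compiled kernel differently.

```python
import re

# Stand-in debug-metadata lines, representing IR produced for the
# "data" pointer member with and without the shared address class.
IR_SHARED = (
    '!10 = !DIDerivedType(tag: DW_TAG_pointer_type, '
    'baseType: !11, size: 64, dwarfAddressSpace: 8)'
)
IR_LOCAL = (
    '!12 = !DIDerivedType(tag: DW_TAG_pointer_type, '
    'baseType: !11, size: 64)'
)


def has_shared_address_space(ir_text):
    # True if any debug-metadata line carries dwarfAddressSpace: 8,
    # i.e. the pointer is annotated as pointing into shared memory.
    return re.search(r'dwarfAddressSpace:\s*8\b', ir_text) is not None
```

With these inputs, `has_shared_address_space(IR_SHARED)` is true while `has_shared_address_space(IR_LOCAL)` is false, capturing the shared-vs-local distinction the test verifies.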

This is a well-reasoned approach that balances fixing the immediate issue with maintaining stability of the existing debug infrastructure.

Contributor

@greptile-apps greptile-apps bot left a comment


3 files reviewed, no comments


@jiel-nv
Contributor Author

jiel-nv commented Nov 26, 2025

/ok to test 12354ed

@gmarkall
Contributor

gmarkall commented Dec 2, 2025

/ok to test

Contributor

@greptile-apps greptile-apps bot left a comment


3 files reviewed, 2 comments


Test with additional types to ensure that the code generation is OK
with:

- Scalars
- Struct models (the complex type)
- Records
@gmarkall
Contributor

gmarkall commented Dec 2, 2025

/ok to test

Contributor

@greptile-apps greptile-apps bot left a comment


3 files reviewed, no comments


Contributor

@gmarkall gmarkall left a comment


I think this is OK to merge as-is; I do think it couples together things that would be better decoupled: the shared state between lowering and debuginfo, the "pending shared store" state in lowering, and the "current address space" in the DIBuilder.

I would like to have a try at decoupling some of these things, but as I'm not familiar enough with the thinking behind these changes, I'm not sure I can get it right without collapsing the abstractions you've built to the point that the bigger picture becomes harder to understand. I'll merge this, then post a follow-up PR with my attempts, for your feedback.

@gmarkall gmarkall merged commit 9e0a986 into NVIDIA:main Dec 2, 2025
71 checks passed
gmarkall added a commit to gmarkall/numba-cuda that referenced this pull request Dec 2, 2025
- Add DWARF address class support for shared memory arrays (NVIDIA#594)
@gmarkall gmarkall mentioned this pull request Dec 2, 2025
gmarkall added a commit that referenced this pull request Dec 3, 2025
- Add DWARF address class support for shared memory arrays (#594)

Labels

3 - Ready for Review Ready for review by team
