Add DWARF variant part support for polymorphic variables in CUDA debug info #544

jiel-nv · 2025-10-23T19:34:39Z

Overview

This PR implements proper DWARF debug information generation for polymorphic variables in Numba CUDA kernels using DW_TAG_variant_part (discriminated unions). This enables debuggers like cuda-gdb to correctly inspect polymorphic variables that can hold different types throughout their lifetime.

Key Changes

1. Enhanced Polymorphic Variable Lowering (`lowering.py`)

With CUDA_DEBUG_POLY enabled:
- Allocates storage for discriminant + data
- Stores discriminant at offset 0 and data at offset sizeof(fetype)
- Retrieves data pointer with proper offset calculation
- Memory layout:
  [discriminant: 8 bits][padding][data: varies by type]
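The layout above can be modelled in plain Python. This is a minimal sketch, not the PR's lowering code: `FORMATS`, `store_poly`, and `load_poly` are hypothetical names, a `bytearray` stands in for the stack slot, and the storage size of twice the widest member is taken from the review discussion rather than from the diff itself.

```python
import struct

# Hypothetical member table: type name -> (discriminant, struct format).
# The discriminant values follow the example numbering in this description.
FORMATS = {
    "bool": (0, "<?"),
    "float64": (1, "<d"),
    "int32": (2, "<i"),
    "int64": (3, "<q"),
}

# Assumed storage size: twice the widest member, one slot for the
# discriminant plus padding and one slot for the data.
MAX_SIZE = max(struct.calcsize(fmt) for _, fmt in FORMATS.values())
SLOT_BYTES = 2 * MAX_SIZE


def store_poly(buf, type_name, value):
    """Write the discriminant at offset 0 and the data at sizeof(type)."""
    disc, fmt = FORMATS[type_name]
    struct.pack_into("<B", buf, 0, disc)
    struct.pack_into(fmt, buf, struct.calcsize(fmt), value)


def load_poly(buf, type_name):
    """Read the data back from the same type-dependent offset."""
    _, fmt = FORMATS[type_name]
    return struct.unpack_from(fmt, buf, struct.calcsize(fmt))[0]


buf = bytearray(SLOT_BYTES)
store_poly(buf, "int32", 42)
print(buf[0], load_poly(buf, "int32"))
```

Because the data offset equals the size of the stored type, a debugger that reads the discriminant first knows both which member is live and where it starts.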

2. DWARF Variant Part Metadata Generation (`debuginfo.py`)

Generate proper debug info metadata for variant types:
- Creates DW_TAG_variant_part with discriminator field
- Emits variant members with extraData field containing discriminant values
- Wraps variant part in a structure type with proper size calculations
- Each variant member includes correct offset (equals element's width in bits)
The resulting DWARF structure looks like this, for example:
```
variant_wrapper_struct (DW_TAG_structure_type)
├── discriminator (DW_TAG_member, 8 bits)
└── variant_part (DW_TAG_variant_part)
    ├── _bool (DW_TAG_member, extraData: 0, offset: 8)
    ├── _float64 (DW_TAG_member, extraData: 1, offset: 64)
    ├── _int32 (DW_TAG_member, extraData: 2, offset: 32)
    └── _int64 (DW_TAG_member, extraData: 3, offset: 64)
```
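The tree above can be derived mechanically from a list of member types. The following is a minimal sketch under stated assumptions: `build_variant_layout` is a hypothetical helper, discriminants are assumed to be assigned in list order, and the wrapper size of twice the widest member comes from the review comments, not from the diff.

```python
def build_variant_layout(members):
    """Describe the wrapper struct for a list of (name, width_in_bits).

    Discriminants are assigned in list order (an assumption made for this
    sketch); each member's offset equals its own width in bits.
    """
    variants = [
        {"name": f"_{name}", "extraData": disc, "size": bits, "offset": bits}
        for disc, (name, bits) in enumerate(members)
    ]
    max_bits = max(bits for _, bits in members)
    return {
        "discriminator_bits": 8,
        "wrapper_size_bits": 2 * max_bits,
        "variants": variants,
    }


layout = build_variant_layout(
    [("bool", 8), ("float64", 64), ("int32", 32), ("int64", 64)]
)
for variant in layout["variants"]:
    print(variant)
```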
Added llvmlite version-dependent logic for extraData emission:
- Typed constant (i8 N) for newer llvmlite versions (> 0.45)
- Metadata node reference for older versions (<= 0.45); this path requires an in-progress NVVM patch
- Controlled by CUDA_DEBUG_POLY_USE_TYPED_CONST config flag
Added CTK version threshold check:
- Feature enabled: When CTK > 13.1
- Feature disabled: When CTK <= 13.1
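The two thresholds can be expressed as tuple comparisons. This is a sketch only: `poly_debug_flags` is a hypothetical name, and it assumes both versions are already available as `(major, minor)` tuples; how the PR obtains them is not shown here.

```python
def poly_debug_flags(ctk_version, llvmlite_version):
    """Return (supported, use_typed_const) for (major, minor) version tuples.

    Pass exactly two components per version: Python compares tuples
    element by element, so (13, 1, 0) would compare greater than (13, 1).
    """
    supported = ctk_version > (13, 1)
    use_typed_const = llvmlite_version > (0, 45)
    return supported, use_typed_const


print(poly_debug_flags((13, 1), (0, 45)))
print(poly_debug_flags((13, 2), (0, 46)))
```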

3. Test Coverage (`test_debuginfo.py`)

Added a new test test_poly_variant_part() to verify DWARF output:
- Opt-in: runs only when the CUDA_DEBUG_POLY configuration flag is enabled
- Validates complete hierarchical structure: DILocalVariable → wrapper struct → elements list → discriminator + variant_part → variant members
- Dynamically adapts FileCheck patterns based on CUDA_DEBUG_POLY_USE_TYPED_CONST
- Verifies all variant members (_bool, _float64, _int32, _int64) with correct offsets
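To illustrate how the FileCheck patterns might adapt to the two `extraData` encodings, here is a hedged sketch. `variant_member_checks` is a hypothetical helper, and the exact pattern text is an assumption based on LLVM's textual `DIDerivedType` syntax; the real patterns in `test_poly_variant_part` may differ.

```python
ANY = "{{.*}}"            # FileCheck regex block matching any text
NODE_REF = "!{{[0-9]+}}"  # FileCheck regex for a metadata node id such as !42


def variant_member_checks(names, use_typed_const):
    """Build one CHECK-DAG line per variant member, in discriminant order."""
    checks = []
    for disc, name in enumerate(names):
        # Typed constant: "extraData: i8 N". Node reference: "extraData: !N".
        extra = f"i8 {disc}" if use_typed_const else NODE_REF
        checks.append(
            f'CHECK-DAG: !DIDerivedType({ANY}name: "{name}"{ANY}extraData: {extra}'
        )
    return "\n".join(checks)


print(variant_member_checks(["_bool", "_float64", "_int32", "_int64"], True))
```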

4. Backward Compatibility

Without CUDA_DEBUG_POLY: falls back to the existing behavior, i.e. the user has to mentally pick the runtime type from a union-like wrapper of types during a debugging session.
The older test (test_union_poly_types) is preserved and conditionally skipped.
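The conditional skipping can be sketched with the standard `unittest` decorators. This is an illustration, not the PR's test code: the `config` object below is a `SimpleNamespace` stand-in for Numba's config module, and the test bodies are placeholders.

```python
import unittest
from types import SimpleNamespace

# Stand-in for the real config module; the flag is off by default here.
config = SimpleNamespace(CUDA_DEBUG_POLY=False)


class DebugInfoTests(unittest.TestCase):
    @unittest.skipIf(config.CUDA_DEBUG_POLY, "legacy union layout not emitted")
    def test_union_poly_types(self):
        self.assertTrue(True)  # placeholder for the legacy union checks

    @unittest.skipUnless(config.CUDA_DEBUG_POLY, "requires CUDA_DEBUG_POLY")
    def test_poly_variant_part(self):
        self.assertTrue(True)  # placeholder for the variant_part checks


suite = unittest.defaultTestLoader.loadTestsFromTestCase(DebugInfoTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(len(result.skipped), "test(s) skipped")
```

With the flag off, only the variant-part test is skipped; flipping the flag before the class is defined reverses which test runs.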

This change implements issue #520.

Summary by CodeRabbit

New Features
- Added support for polymorphic debug information for CUDA union types, enabling enhanced debugging capabilities with structured variant metadata.
- Introduced new configuration flags to control polymorphic debug behavior and data representation modes.
Tests
- Added test coverage for polymorphic variant-part debugging information under various configuration scenarios.

copy-pr-bot · 2025-10-23T19:34:43Z

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.


jiel-nv · 2025-10-23T19:38:05Z

/ok to test 387f756

jiel-nv · 2025-10-23T19:47:54Z

/ok to test 98e1283

coderabbitai · 2025-10-23T19:57:58Z

Walkthrough

This PR introduces polymorphic debug information support for CUDA debugging. It adds version detection for llvmlite support, enhanced debug metadata generation for UnionModel payloads using DWARF5 variant_part structures, and corresponding storage/retrieval logic for polymorphic variables with discriminant handling in the lowering layer.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Polymorphic debug capability detection `numba_cuda/numba/cuda/debuginfo.py` | Added `_get_llvmlite_version()` and `_check_polymorphic_debug_info_support()` to detect support based on CUDA/CTK and llvmlite versions. Introduced module-level config flags `DEBUG_POLY_SUPPORTED` and `DEBUG_POLY_USE_TYPED_CONST`. |
| Debug metadata generation for polymorphic unions `numba_cuda/numba/cuda/debuginfo.py` | Enhanced `CUDADIBuilder._var_type` to handle UnionModel payloads by accumulating member metadata dictionaries and generating DIDerivedType entries. When `CUDA_DEBUG_POLY` is enabled, constructs wrapper struct with discriminator and variant_part; otherwise preserves legacy union-based path. |
| Polymorphic variable storage/retrieval `numba_cuda/numba/cuda/lowering.py` | Extended `CUDALower.storevar`, `_alloca_var`, and `getvar` to handle polymorphic variables when `CUDA_DEBUG_POLY` is active: allocates two-slot aggregates, writes discriminants, and performs offset-based reads/writes using `i8*` bitcasts and GEP. Falls back to original paths for non-polymorphic variables. |
| Test coverage for polymorphic debug info `numba_cuda/numba/cuda/tests/cudapy/test_debuginfo.py` | Added conditional skip guards for existing union tests based on `CUDA_DEBUG_POLY` state. Introduced `test_poly_variant_part` to validate DWARF5 variant_part debug info with dynamic FileCheck patterns based on typed constant vs node reference configuration. |

Sequence Diagram

sequenceDiagram
    participant Config as Configuration
    participant DebugInfo as DebugInfo Builder
    participant Lowering as CUDA Lowering
    participant Runtime as Runtime Execution

    Config->>DebugInfo: Initialize CUDA_DEBUG_POLY flag
    Note over DebugInfo: _check_polymorphic_debug_info_support()<br/>detects llvmlite version & capabilities
    
    alt CUDA_DEBUG_POLY enabled
        Lowering->>Lowering: _alloca_var: allocate 2-slot aggregate<br/>(discriminant + data)
        Lowering->>Runtime: storevar: write discriminant at offset 0
        Lowering->>Runtime: storevar: write data at offset sizeof(fetype)
        Runtime->>Lowering: getvar: read from offset sizeof(fetype)
        DebugInfo->>DebugInfo: _var_type: generate variant_part<br/>wrapper struct metadata
    else CUDA_DEBUG_POLY disabled
        Lowering->>Lowering: _alloca_var: allocate single slot
        Lowering->>Runtime: storevar: standard path
        Runtime->>Lowering: getvar: simple bitcast
        DebugInfo->>DebugInfo: _var_type: emit legacy union<br/>(DICompositeType)
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

The changes introduce new detection logic, multiple conditional paths in core debug metadata generation and variable storage/retrieval, and span three interconnected files with varying logic density. The polymorphic variant handling in _var_type and the offset-based reads/writes in lowering require careful verification, though the conditional structure preserves backward compatibility for non-polymorphic paths.

Poem

🐰 Hop into the debug realm so deep,
Where variants and types their secrets keep!
With discriminants marking each polymorphic way,
Your CUDA code's mysteries shine bright today! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The PR title "Add DWARF Variant Part Support for Polymorphic Variables in CUDA Debug Info" directly captures the main objective of the changeset. The title explicitly identifies the primary feature being introduced (DWARF Variant Part Support), the target use case (Polymorphic Variables), and the context (CUDA Debug Info). The changes across debuginfo.py, lowering.py, and tests all focus on implementing this exact functionality, and the title accurately reflects this scope. The title is concise, specific, and avoids vague terminology, making it clear to developers scanning the history that this PR adds debug info support for polymorphic variables. |


jiel-nv · 2025-10-23T19:58:10Z

/ok to test a7f5393

gmarkall · 2025-10-24T13:37:26Z

@coderabbitai review

coderabbitai · 2025-10-24T13:37:52Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (5)

numba_cuda/numba/cuda/tests/cudapy/test_debuginfo.py (1)
467-506: Tighten FileCheck and clean up unused noqa markers

Patterns look correct and adapt to typed-const vs node-ref. LGTM.

Static analysis flags unused “# noqa: F841” on Lines 483–485. Remove them; these assignments are already intentionally unused in a test context.

Apply this minimal cleanup:
-            foo = 3.14  # noqa: F841
-            foo = True  # noqa: F841
-            foo = np.int32(42)  # noqa: F841
+            foo = 3.14
+            foo = True
+            foo = np.int32(42)
numba_cuda/numba/cuda/lowering.py (2)
1863-1867: Alloc size choice matches test expectations; consider documenting alignment rationale

Doubling max element size to house [discriminant + data] matches the 128‑bit wrapper check. A brief comment on why data lives at offset sizeof(fetype) (not a fixed base) would help future maintainers reason about padding/alignment and the DWARF offsets.

1920-1939: Offset read mirrors store; ensure index type width is intentional

Using i32 indices in GEP on i8* works, but LLVM permits any integer width. If targeting 64‑bit consistently, consider IntType(64) to avoid future assertions in newer llvmlite.
-                    [llvm_ir.Constant(llvm_ir.IntType(32), sizeof_bytes)],
+                    [llvm_ir.Constant(llvm_ir.IntType(64), sizeof_bytes)],
numba_cuda/numba/cuda/debuginfo.py (2)

61-70: Config mutation at import time

Setting config defaults here is convenient but surprising. Consider centralizing config defaults in a single module (or guard behind explicit env/config read) to avoid hidden side‑effects on import.
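One way to avoid mutating config as an import side effect is an explicit helper that only fills in flags the user has not set. This is a sketch of that idea, not the PR's code: `apply_defaults` is a hypothetical function, the `SimpleNamespace` stands in for the config module, and the default values shown are placeholders.

```python
from types import SimpleNamespace


def apply_defaults(cfg, defaults):
    """Set each default only if cfg lacks the attribute; report what was set."""
    applied = {}
    for name, value in defaults.items():
        if not hasattr(cfg, name):
            setattr(cfg, name, value)
            applied[name] = value
    return applied


# Pretend the user already opted in to CUDA_DEBUG_POLY.
cfg = SimpleNamespace(CUDA_DEBUG_POLY=True)
applied = apply_defaults(
    cfg,
    {"CUDA_DEBUG_POLY": False, "CUDA_DEBUG_POLY_USE_TYPED_CONST": True},
)
print(applied)
```

Calling such a helper from one well-known initialization point keeps user-provided values intact and makes the side effect visible to readers.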

679-794: DWARF variant_part emission looks consistent; add two clarifying comments

Offsets: you set each variant member’s offset to its own width (bits). This aligns with the storage scheme (data at sizeof(fetype)). Add a short comment to tie the metadata to the layout decision.

Discriminator: mark as artificial and 8 bits — good. Consider clarifying that wrapper size = 2*maxwidth to reserve [discriminant+padding]+[max variant] as implemented in lowering.

Add comments near metadata_dict["offset"] and wrapper_struct_size explaining the memory/layout rationale.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2567b28 and a7f5393.

📒 Files selected for processing (3)

numba_cuda/numba/cuda/debuginfo.py (3 hunks)
numba_cuda/numba/cuda/lowering.py (3 hunks)
numba_cuda/numba/cuda/tests/cudapy/test_debuginfo.py (3 hunks)

🧰 Additional context used

🪛 Ruff (0.14.1)

numba_cuda/numba/cuda/debuginfo.py

27-27: Consider moving this statement to an else block

(TRY300)

28-28: Do not catch blind exception: Exception

(BLE001)

54-54: Consider moving this statement to an else block

(TRY300)

56-56: Do not catch blind exception: Exception

(BLE001)

numba_cuda/numba/cuda/tests/cudapy/test_debuginfo.py

483-483: Unused noqa directive (unused: F841)

Remove unused noqa directive

(RUF100)

484-484: Unused noqa directive (unused: F841)

Remove unused noqa directive

(RUF100)

485-485: Unused noqa directive (unused: F841)

Remove unused noqa directive

(RUF100)

🔇 Additional comments (3)

numba_cuda/numba/cuda/tests/cudapy/test_debuginfo.py (2)

10-10: Import of config is appropriate

Needed for gating on CUDA_DEBUG_POLY. No issues.

407-409: Good guard for legacy-union test

Skipping union-format test when variant_part is active avoids false failures.

numba_cuda/numba/cuda/lowering.py (1)

1692-1735: Polymorphic store bypasses parent decref—low risk today, but verify NRT remains disabled

The early return at line 1735 does skip the parent's decref(fetype, old) call (line 1580 in Lower.storevar). This is harmless today because NRT defaults to disabled—decref becomes a no-op—but if CUDA_ENABLE_NRT is ever toggled, old values would leak.

NRT is disabled by default in CUDA targets via the enable_nrt property, which returns getattr(config, "CUDA_ENABLE_NRT", False), and both incref and decref guard with if not self.context.enable_nrt: return. The bypass is therefore safe under current defaults.

To defensively prevent future issues, consider guarding the early return with a check to ensure it only bypasses when NRT is disabled, and add a comment explaining why parent logic must run if NRT is ever enabled.
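The suggested guard can be illustrated with a toy model. The classes below are hypothetical stand-ins, not Numba's lowering classes: they only show the control flow in which the early return is taken while NRT is disabled and the parent logic still runs once it is enabled.

```python
class ParentLower:
    """Toy stand-in for the base lowering class; not Numba's real API."""

    def __init__(self, enable_nrt):
        self.enable_nrt = enable_nrt
        self.released = []

    def decref(self, value):
        # Mirrors the guard described above: a no-op while NRT is disabled.
        if not self.enable_nrt:
            return
        self.released.append(value)

    def storevar(self, slot, value):
        old = slot.get("value")
        slot["value"] = value
        if old is not None:
            self.decref(old)


class PolyLower(ParentLower):
    def storevar(self, slot, value, discriminant=None):
        if discriminant is not None:
            slot["discriminant"] = discriminant
            if not self.enable_nrt:
                # Fast path: nothing to release, so returning early is safe.
                slot["value"] = value
                return
        # With NRT enabled (or for ordinary variables) the parent logic
        # runs, so the previous value is still released.
        super().storevar(slot, value)


lower = PolyLower(enable_nrt=True)
slot = {}
lower.storevar(slot, "first", discriminant=0)
lower.storevar(slot, "second", discriminant=1)
print(lower.released, slot)
```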


jiel-nv · 2025-10-28T21:42:00Z

/ok to test 9927acb

jiel-nv · 2025-10-28T21:47:28Z

/ok to test f64dddc

jiel-nv · 2025-10-28T22:04:57Z

/ok to test 2388791

jiel-nv · 2025-11-04T17:48:48Z

/ok to test d1f1dc3

jiel-nv · 2025-11-04T17:57:48Z

/ok to test c8dc398

jiel-nv · 2025-11-04T18:31:14Z

/ok to test 9932955

jiel-nv · 2025-11-05T18:09:58Z

@gmarkall Could you please take another look at this PR?

gmarkall

A couple of small nits on the diff. I'm going to merge main, fix these up and then run CI, because I think this will then be good to merge.

gmarkall · 2025-11-19T11:55:22Z

numba_cuda/numba/cuda/debuginfo.py

+def _get_llvmlite_version():
+    """Get llvmlite version as tuple (major, minor)."""
+    try:
+        import llvmlite


This import can be at the top level - llvmlite will always be available in a working Numba-CUDA installation.

gmarkall · 2025-11-19T11:59:23Z

numba_cuda/numba/cuda/debuginfo.py

+        major = int(parts[0])
+        minor = int(parts[1])
+        return (major, minor)
+    except (ImportError, AttributeError, ValueError):


With the import at the top level an ImportError should not be expected here. If the version splits into fewer than two parts (e.g. a single integer for some reason), we could also get an IndexError:

Suggested change

-    except (ImportError, AttributeError, ValueError):
+    except (IndexError, AttributeError, ValueError):
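A small sketch shows why each of the three exceptions in the suggestion can occur. `parse_major_minor` is a hypothetical helper that takes the version string as an argument, and returning `None` on failure is an assumption; the fallback used by the real `_get_llvmlite_version()` is not shown in this thread.

```python
def parse_major_minor(version):
    """Parse 'MAJOR.MINOR[.PATCH...]' into (major, minor), or None on failure."""
    try:
        parts = version.split(".")  # AttributeError if version is not a str
        major = int(parts[0])       # ValueError if not numeric
        minor = int(parts[1])       # IndexError if there is only one part
        return (major, minor)
    except (IndexError, AttributeError, ValueError):
        return None


for candidate in ("0.45.0", "1", "a.b", None):
    print(repr(candidate), "->", parse_major_minor(candidate))
```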

gmarkall · 2025-11-19T12:44:57Z

/ok to test

gmarkall · 2025-11-19T12:45:23Z

(Set to auto merge)

- Add support for cache-hinted load and store operations (NVIDIA#587)
- Add more thirdparty tests (NVIDIA#586)
- Add sphinx-lint to pre-commit and fix errors (NVIDIA#597)
- Add DWARF variant part support for polymorphic variables in CUDA debug info (NVIDIA#544)
- chore: clean up dead workaround for unavailable `lru_cache` (NVIDIA#598)
- chore(docs): format types docs (NVIDIA#596)
- refactor: decouple `Context` from `Stream` and `Event` objects (NVIDIA#579)
- Fix freezing in of constant arrays with negative strides (NVIDIA#589)
- Update tests to accept variants of generated PTX (NVIDIA#585)
- refactor: replace device functionality with `cuda.core` APIs (NVIDIA#581)
- Move frontend tests to `cudapy` namespace (NVIDIA#558)
- Generalize the concurrency group for main merges (NVIDIA#582)
- ci: move pre-commit checks to pre commit action (NVIDIA#577)
- chore(pixi): set up doc builds; remove most `build-conda` dependencies (NVIDIA#574)
- ci: ensure that python version in ci matches matrix (NVIDIA#575)
- Fix the `cuda.is_supported_version()` API (NVIDIA#571)
- Fix checks on main (NVIDIA#576)
- feat: add `math.nextafter` (NVIDIA#543)
- ci: replace conda testing with pixi (NVIDIA#554)
- [CI] Run PR workflow on merge to main (NVIDIA#572)
- Propose Alternative Module Path for `ext_types` and Maintain `numba.cuda.types.bfloat16` Import API (NVIDIA#569)
- test: enable fail-on-warn and clean up resulting failures (NVIDIA#529)
- [Refactor][NFC] Vendor-in compiler_lock for future CUDA-specific changes (NVIDIA#565)
- Fix registration with Numba, vendor MakeFunctionToJITFunction tests (NVIDIA#566)
- [Refactor][NFC][Cleanups] Update imports to upstream numba to use the numba.cuda modules (NVIDIA#561)
- test: refactor process-based tests to use concurrent futures in order to simplify tests (NVIDIA#550)
- test: revert back to ipc futures that await each iteration (NVIDIA#564)
- chore(deps): move to self-contained pixi.toml to avoid mixed-pypi-pixi environments (NVIDIA#551)
- [Refactor][NFC] Vendor-in errors for future CUDA-specific changes (NVIDIA#534)
- Remove dependencies on target_extension for CUDA target (NVIDIA#555)
- Relax the pinning to `cuda-core` to allow it floating across minor releases (NVIDIA#559)
- [WIP] Port numpy reduction tests to CUDA (NVIDIA#523)
- ci: add timeout to avoid blocking the job queue (NVIDIA#556)
- Handle `cuda.core.Stream` in driver operations (NVIDIA#401)
- feat: add support for `math.exp2` (NVIDIA#541)
- Vendor in types and datamodel for CUDA-specific changes (NVIDIA#533)
- refactor: cleanup device constructor (NVIDIA#548)
- bench: add cupy to array constructor kernel launch benchmarks (NVIDIA#547)
- perf: cache dimension computations (NVIDIA#542)
- perf: remove duplicated size computation (NVIDIA#537)
- chore(perf): add torch to benchmark (NVIDIA#539)
- test: speed up ipc tests by ~6.5x (NVIDIA#527)
- perf: speed up kernel launch (NVIDIA#510)
- perf: remove context threading in various pointer abstractions (NVIDIA#536)
- perf: reduce the number of `__cuda_array_interface__` accesses (NVIDIA#538)
- refactor: remove unnecessary custom map and set implementations (NVIDIA#530)
- [Refactor][NFC] Vendor-in vectorize decorators for future CUDA-specific changes (NVIDIA#513)
- test: add benchmarks for kernel launch for reproducibility (NVIDIA#528)
- test(pixi): update pixi testing command to work with the new `testing` directory (NVIDIA#522)
- refactor: fully remove `USE_NV_BINDING` (NVIDIA#525)
- Draft: Vendor in the IR module (NVIDIA#439)
- pyproject.toml: add search path for Pyrefly (NVIDIA#524)
- Vendor in numba.core.typing for CUDA-specific changes (NVIDIA#473)
- Use numba.config when available, otherwise use numba.cuda.config (NVIDIA#497)
- [MNT] Drop NUMBA_CUDA_USE_NVIDIA_BINDING; always use cuda.core and cuda.bindings as fallback (NVIDIA#479)
- Vendor in dispatcher, entrypoints, pretty_annotate for CUDA-specific changes (NVIDIA#502)
- build: allow parallelization of nvcc testing builds (NVIDIA#521)
- chore(dev-deps): add pixi (NVIDIA#505)
- Vendor the imputils module for CUDA refactoring (NVIDIA#448)
- Don't use `MemoryLeakMixin` for tests that don't use NRT (NVIDIA#519)
- Switch back to stable cuDF release in thirdparty tests (NVIDIA#518)
- Updating .gitignore with binaries in the `testing` folder (NVIDIA#516)
- Remove some unnecessary uses of ContextResettingTestCase (NVIDIA#507)
- Vendor in _helperlib cext for CUDA-specific changes (NVIDIA#512)
- Vendor in typeconv for future CUDA-specific changes (NVIDIA#499)
- [Refactor][NFC] Vendor-in numba.cpython modules for future CUDA-specific changes (NVIDIA#493)
- [Refactor][NFC] Vendor-in numba.np modules for future CUDA-specific changes (NVIDIA#494)
- Make the CUDA target the default for CUDA overload decorators (NVIDIA#511)
- Remove C extension loading hacks (NVIDIA#506)
- Ensure NUMBA can manipulate memory from CUDA graphs before the graph is launched (NVIDIA#437)
- [Refactor][NFC] Vendor-in core Numba analysis utils for CUDA-specific changes (NVIDIA#433)
- Fix Bf16 Test OB Error (NVIDIA#509)
- Vendor in components from numba.core.runtime for CUDA-specific changes (NVIDIA#498)
- [Refactor] Vendor in _dispatcher, _devicearray, mviewbuf C extension for CUDA-specific customization (NVIDIA#373)
- [MNT] Managed UM memset fallback and skip CUDA IPC tests on WSL2 (NVIDIA#488)
- Improve debug value range coverage (NVIDIA#461)
- Add `compile_all` API (NVIDIA#484)
- Vendor in core.registry for CUDA-specific changes (NVIDIA#485)
- [Refactor][NFC] Vendor in numba.misc for CUDA-specific changes (NVIDIA#457)
- Vendor in optional, boxing for CUDA-specific changes, fix dangling imports (NVIDIA#476)
- [test] Remove dependency on cpu_target (NVIDIA#490)
- Change dangling imports of numba.core.lowering to numba.cuda.lowering (NVIDIA#475)
- [test] Use numpy's tolerance for float16 (NVIDIA#491)
- [Refactor][NFC] Vendor-in numba.extending for future CUDA-specific changes (NVIDIA#466)
- [Refactor][NFC] Vendor-in more cpython registries for future CUDA-specific changes (NVIDIA#478)


jiel-nv added the 2 - In Progress Currently a work in progress label Oct 23, 2025

jiel-nv self-assigned this Oct 23, 2025

coderabbitai bot reviewed Oct 24, 2025


gmarkall reviewed Oct 24, 2025


jiel-nv added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Oct 24, 2025

jiel-nv changed the title ~~Add DWARF Variant Part Support for Polymorphic Variables in CUDA Debug Info~~ Add DWARF variant part support for polymorphic variables in CUDA debug Info Oct 24, 2025

jiel-nv changed the title ~~Add DWARF variant part support for polymorphic variables in CUDA debug Info~~ Add DWARF variant part support for polymorphic variables in CUDA debug info Oct 24, 2025

gmarkall reviewed Oct 27, 2025


jiel-nv added 6 commits October 28, 2025 17:10

Add DWARF Variant Part Support for Polymorphic Variables in CUDA Debu…

ebd63cc

…g Info

fix ruff format

f052932

another fix for ruff format

65ee864

query rumtime CTK version from libcudart.so via cuda.bindings

817e875

remove debugging statement

01d9dbb

Add back the try / except in llvmlite version detection

9fb6f3a

jiel-nv force-pushed the debug-dynamic branch from e521ac0 to 9fb6f3a Compare October 28, 2025 17:11

jiel-nv added 2 commits October 28, 2025 18:34

fix the ctk version detection

2641d33

uniquify the discrimintor name

9927acb

fix ruff format

f64dddc

another ruff format fix

2388791

update the test according to the discriminator name change

0bcd929

laxmansole mentioned this pull request Oct 29, 2025

[LLVM][DebugInfo] Allow ExtraData field to be a node reference llvm/llvm-project#165023

Merged

Merge branch 'main' into debug-dynamic

8a58c48

jiel-nv requested a review from gmarkall October 30, 2025 17:32

Merge branch 'main' into debug-dynamic

d1f1dc3

fix accidental line while resolving the merge conflict

c8dc398

fix the GridGroup import

9932955

Merge remote-tracking branch 'NVIDIA/main' into debug-dynamic

753883f

gmarkall approved these changes Nov 19, 2025


Minor fixups in llvmlite version check

b5eeb98

gmarkall enabled auto-merge (squash) November 19, 2025 12:45

gmarkall added 4 - Waiting on CI Waiting for a CI run to finish successfully and removed 3 - Ready for Review Ready for review by team labels Nov 19, 2025

gmarkall merged commit 911dafd into NVIDIA:main Nov 19, 2025
136 of 138 checks passed

gmarkall mentioned this pull request Nov 20, 2025

Bump version to 0.21.0 #602

Merged
