Fix Issue #588: separate compilation of NVVM IR modules when generating debuginfo #591

gmarkall · 2025-11-17T10:26:30Z

One test still fails, because the C ABI wrapper generator generates no debug info, and the separate compilation seems to lead NVVM to not generate a debug section for it. This should probably be addressed by generating debug info for the C ABI wrapper.

Fixes #588.
Fixes NVBugs: 5196888, 5227483, 5639364.

copy-pr-bot · 2025-11-17T10:26:34Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

gmarkall · 2025-11-17T10:26:39Z

/ok to test

greptile-apps · 2025-11-20T11:50:13Z

Greptile Overview

Greptile Summary

This PR fixes Issue #588 by implementing separate compilation of NVVM IR modules when generating debug information. The change reverts to pre-PR#8594 behavior specifically for debug mode compilation while maintaining the optimized combined compilation for lineinfo mode.

The core modification adds a new get_asm_strs() method that returns a list of PTX strings instead of a single concatenated string. When the "g" debug option is present, each IR module compiles separately to produce one PTX file per module. For lineinfo mode (no "g" option), the existing combined compilation approach is preserved for performance. The linkage type also changes from linkonce_odr to weak_odr in debug mode to handle symbol resolution across separately compiled modules.

This addresses NVVM's official requirement that debug compilation should only have a single debug compile unit, as combining multiple modules with debug info can produce invalid PTX with duplicate debug sections that the linker rejects.
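As a rough illustration of the dispatch described in the summary above, the following sketch chooses between per-module and combined compilation based on the `"g"` option. It is not the code from this PR: `fake_compile_ir` is a hypothetical stand-in for the real NVVM compile call, and the signatures of `get_asm_strs` and `linkage_for` here are assumptions made for the example.

```python
def fake_compile_ir(irs, **options):
    """Hypothetical stand-in for an NVVM compile call; returns a fake PTX string."""
    if isinstance(irs, str):
        irs = [irs]
    return "// ptx for: " + ", ".join(irs)


def get_asm_strs(irs, compile_ir=fake_compile_ir, **options):
    """Return a list of PTX strings for the given IR modules (illustrative only)."""
    if options.get("g"):
        # Debug mode: compile each IR module on its own, so each resulting
        # PTX string carries at most one debug compile unit.
        return [compile_ir(ir, **options) for ir in irs]
    # Lineinfo / non-debug mode: compile all modules together into one PTX string.
    return [compile_ir(irs, **options)]


def linkage_for(options):
    """Pick the linkage named in the summary for each mode (illustrative only)."""
    return "weak_odr" if options.get("g") else "linkonce_odr"
```

With two modules, the debug path in this sketch yields two PTX strings and the non-debug path yields one, which is why callers need to handle a list rather than a single string.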

Important Files Changed

| Filename | Score | Overview |
| --- | --- | --- |
| `numba_cuda/numba/cuda/codegen.py` | 2/5 | Refactors PTX compilation to support separate compilation in debug mode, but contains a critical bug: the `**options` parameter is missing on line 225 |
| `numba_cuda/numba/cuda/compiler.py` | 4/5 | Updates the compilation process to handle multiple PTX codes using the new `get_asm_strs()` method, with proper error handling |
| `numba_cuda/numba/cuda/tests/cudapy/test_compiler.py` | 4/5 | Marks a test as an expected failure, with detailed documentation explaining why C ABI wrapper debug info is no longer generated |

Confidence score: 2/5

- This PR contains a critical bug that will likely cause compilation failures in debug mode.
- The score reflects the missing `**options` parameter in `codegen.py` line 225, which will drop architecture and fastmath options during separate IR compilation.
- Pay close attention to `numba_cuda/numba/cuda/codegen.py`: the bug on line 225 needs to be fixed before merging.

greptile-apps

2 files reviewed, 2 comments

numba_cuda/numba/cuda/codegen.py

numba_cuda/numba/cuda/compiler.py

- logic: missing `**options` parameter; `arch` and other compilation options won't be passed when compiling with debug info
- syntax: missing space in error message

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
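The first review point can be illustrated with a small sketch of how keyword options get lost when a per-module loop does not forward `**options`. The names `compile_stub`, `compile_each_buggy`, and `compile_each_fixed` are hypothetical and exist only for this example; they are not functions from numba-cuda.

```python
def compile_stub(ir, arch="default", fastmath=False):
    """Hypothetical compile call that records which options it received."""
    return f"{ir}:{arch}:{fastmath}"


def compile_each_buggy(irs, **options):
    # Bug: options are accepted but never forwarded, so every module is
    # compiled with the stub's defaults.
    return [compile_stub(ir) for ir in irs]


def compile_each_fixed(irs, **options):
    # Fix: forward the caller's options to each per-module compilation.
    return [compile_stub(ir, **options) for ir in irs]
```

Calling both helpers with `arch="sm_80"` shows the difference: only the fixed version passes the requested architecture through to the stub.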

brandon-b-miller · 2025-12-16T20:37:44Z

/ok to test 34c4fb1

gmarkall · 2025-12-16T20:47:43Z

/ok to test

copy-pr-bot · 2025-12-16T20:47:52Z

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

greptile-apps

3 files reviewed, no comments

v0.23.0

- Capture global device arrays in kernels and device functions (NVIDIA#666)
- Fix NVIDIA#624: Accept Numba IR nodes in all places Numba-CUDA IR nodes are expected (NVIDIA#643)
- Fix Issue NVIDIA#588: separate compilation of NVVM IR modules when generating debuginfo (NVIDIA#591)
- feat: allow printing nested tuples (NVIDIA#667)
- build(deps): bump actions/setup-python from 5.6.0 to 6.1.0 (NVIDIA#655)
- build(deps): bump actions/upload-artifact from 4 to 5 (NVIDIA#652)
- Test RAPIDS 25.12 (NVIDIA#661)
- Do not manually set DUMP_ASSEMBLY in `nvjitlink` tests (NVIDIA#662)
- feat: add print support for int64 tuples (NVIDIA#663)
- Only run dependabot monthly and open fewer PRs (NVIDIA#658)
- test: fix bogus `self` argument to `Context` (NVIDIA#656)
- Fix false negative NRT link decision when NRT was previously toggled on (NVIDIA#650)
- Add support for dependabot (NVIDIA#647)
- refactor: cull dead linker objects (NVIDIA#649)
- Migrate numba-cuda driver to use cuda.core.launch API (NVIDIA#609)
- feat: add set_shared_memory_carveout (NVIDIA#629)
- chore: bump version in pixi.toml (NVIDIA#641)
- refactor: remove devicearray code to reduce complexity (NVIDIA#600)

Fix Issue NVIDIA#588

4968438

gmarkall added the "2 - In Progress" (Currently a work in progress) label on Nov 17, 2025

greptile-apps bot reviewed Nov 20, 2025

numba_cuda/numba/cuda/codegen.py (outdated, resolved)

numba_cuda/numba/cuda/compiler.py (resolved)

gmarkall and others added 2 commits November 20, 2025 13:19

Merge branch 'main' into issue-588

34c4fb1

brandon-b-miller approved these changes Dec 16, 2025

xfail test_device_function_with_debug

c809e00

gmarkall added the "3 - Ready for Review" (Ready for review by team) label and removed the "2 - In Progress" (Currently a work in progress) label on Dec 16, 2025

gmarkall marked this pull request as ready for review December 16, 2025 20:47

gmarkall changed the title ~~[WIP] Fix Issue #588: separate compilation of NVVM IR modules when generating debuginfo~~ Fix Issue #588: separate compilation of NVVM IR modules when generating debuginfo Dec 16, 2025

greptile-apps bot reviewed Dec 16, 2025

brandon-b-miller added the "5 - Ready to merge" (Testing and reviews complete, ready to merge) label and removed the "3 - Ready for Review" (Ready for review by team) label on Dec 16, 2025

brandon-b-miller merged commit dd396c8 into NVIDIA:main Dec 16, 2025
71 checks passed

gmarkall mentioned this pull request Dec 17, 2025

Bump version to 0.23.0 #668

Merged
