Enable LTO by default when pynvjitlink is available by gmarkall · Pull Request #310 · NVIDIA/numba-cuda

gmarkall · 2025-06-25T13:24:06Z

Enabling LTO by default when pynvjitlink is available should:

- Provide a general improvement in performance for various use cases, particularly those linking external code. This ought to be benchmarked, but I'm making an assumption that it helps for now based on prior anecdotal / informal experience.
- Make the case where users link LTO-IR to kernels or as part of device function declarations "just work" as long as pynvjitlink is installed.

A further improvement would be to error out when a user tries to link LTO-IR while pynvjitlink is not installed - that is left for a future PR.
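The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `pynvjitlink_available`, `resolve_lto`, and `check_ltoir_link` are hypothetical helper names, and identifying LTO-IR inputs by a `.ltoir` file suffix is an assumption made for the example. The idea is that `lto=None` means "not chosen by the user", so the default follows pynvjitlink availability, while an explicit `True` or `False` is respected.

```python
import importlib.util


def pynvjitlink_available():
    """Return True if the pynvjitlink package can be found for import."""
    return importlib.util.find_spec("pynvjitlink") is not None


def resolve_lto(lto=None, available=None):
    """Hypothetical helper: choose the effective LTO setting.

    lto=None means the user expressed no preference, so LTO is enabled
    exactly when pynvjitlink is available. An explicit value wins.
    """
    if available is None:
        available = pynvjitlink_available()
    if lto is None:
        return available
    return bool(lto)


def check_ltoir_link(link_files, available=None):
    """Hypothetical check for the follow-up idea: reject LTO-IR inputs
    when pynvjitlink is not installed.

    Assumption: LTO-IR inputs are recognised by a ".ltoir" suffix.
    """
    if available is None:
        available = pynvjitlink_available()
    ltoir_files = [f for f in link_files if str(f).endswith(".ltoir")]
    if ltoir_files and not available:
        raise RuntimeError(
            "Linking LTO-IR requires pynvjitlink: " + ", ".join(ltoir_files)
        )
    return ltoir_files
```

Passing `available` explicitly keeps the helpers easy to exercise without depending on what is installed in the current environment.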


copy-pr-bot · 2025-06-25T13:24:10Z

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

gmarkall · 2025-06-25T13:39:02Z

/ok to test

tpn · 2025-06-25T15:50:32Z

This looks like it'll fix the recent cuda.coop issues I ran into, so, +1 from me.

brandon-b-miller

LGTM

brandon-b-miller · 2025-06-25T18:46:21Z

Looking into the failures here

numba_cuda/numba/cuda/decorators.py

brandon-b-miller · 2025-06-25T19:14:10Z

To fix the simulator test we need to add `lto=None` to `numba_cuda/numba/cuda/simulator/api.py`, but locally there's more needed to fix the failing test; I'm still tracking down exactly what.

numba_cuda/numba/cuda/tests/cudapy/test_errors.py

brandon-b-miller · 2025-06-25T19:33:09Z

> To fix the simulator test we need to add lto=None to numba_cuda/numba/cuda/simulator/api.py but locally for me there's more needed to fix the failing test, still tracking down exactly what.

Actually I think it's just https://github.com/NVIDIA/numba-cuda/pull/310/files#r2167468144
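For readers unfamiliar with the simulator issue, the `lto=None` fix mentioned above amounts to letting the simulator's decorator accept the new keyword even though it has nothing to link. The sketch below is a simplified stand-in and not the actual contents of `simulator/api.py`; the parameter list and the pass-through behaviour are assumptions chosen for illustration.

```python
def jit(func_or_sig=None, device=False, debug=False, lto=None, **kwargs):
    """Hypothetical simulator-style decorator.

    It accepts `lto` so that code written for the real target still
    decorates cleanly, but ignores it because nothing is linked here.
    """
    def wrap(fn):
        # A real simulator would wrap fn in a fake kernel object;
        # this sketch simply returns the Python function unchanged.
        return fn

    # Support both bare use (@jit) and use with arguments (@jit(...)).
    if callable(func_or_sig):
        return wrap(func_or_sig)
    return wrap


@jit(lto=None)
def add_one(x):
    return x + 1


@jit
def double(x):
    return 2 * x
```

Without the `lto` parameter (or a catch-all for keywords), a call such as `jit(lto=None)` would fail with a `TypeError` before any test body ran.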


gmarkall · 2025-06-26T10:00:53Z

/ok to test

- Updates for recent API changes (NVIDIA#313)
- Fix lineinfo generation when compile_internal used (NVIDIA#271) (NVIDIA#287)
- Build docs with NVIDIA Sphinx theme (NVIDIA#312)
- Don't skip debug tests when LTO enabled by default (NVIDIA#311)
- Use `cuda.bindings` and `cuda.core` for `Linker` (NVIDIA#133)
- Enable LTO by default when pynvjitlink is available (NVIDIA#310)


brandon-b-miller approved these changes Jun 25, 2025


Update tests to match LTO-by-default message

bb8ce7a

Also skip an irrelevant test on cudasim.

gmarkall merged commit ae94550 into NVIDIA:main Jun 26, 2025
39 checks passed

gmarkall mentioned this pull request Jul 2, 2025

Bump version to 0.16.0 #315

Merged

gmarkall merged 2 commits into NVIDIA:main from gmarkall:lto-default
