Add LTO Support with BF16 by isVoid · Pull Request #253 · NVIDIA/numba-cuda

isVoid · 2025-05-14T21:12:56Z

In #245, we added bfloat16 API bindings, but we missed testing the bindings with lto=True. This PR adds that test.

Co-authored-by: Graham Markall <535640+gmarkall@users.noreply.github.com>

…mba-cuda into fea-bfloat16-highlevel

copy-pr-bot · 2025-05-14T21:12:59Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

isVoid · 2025-05-14T21:18:23Z

@gmarkall I wonder whether we should adopt a similar approach to #240 and enable lto=True by default when we use Numbast-generated bindings. It occurred to me that the overhead of non-LTO FFI is quite high.

isVoid · 2025-05-14T21:18:33Z

/ok to test

gmarkall · 2025-05-14T22:29:39Z

> @gmarkall I wonder whether we should adopt a similar approach to #240 and enable lto=True by default when we use Numbast-generated bindings. It occurred to me that the overhead of non-LTO FFI is quite high.

I think we should have a separate PR to enable LTO by default in general as long as pynvjitlink is available and it is new enough for the current GPU. There might be some caveats / nuance to this behaviour, but I think the general approach is that we should be doing LTO as much as possible because it's such a performance win with any external code.
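For illustration only, the condition described above ("pynvjitlink is available and it is new enough for the current GPU") could be sketched as a small pure-Python policy function. The function name `default_lto` and both of its parameters are hypothetical and are not part of numba-cuda or pynvjitlink. Reading "new enough" as "the linker's highest supported compute capability is at least the device's" is also an assumption.

```python
from typing import Optional, Tuple

ComputeCapability = Tuple[int, int]


def default_lto(
    linker_max_cc: Optional[ComputeCapability],
    device_cc: ComputeCapability,
) -> bool:
    """Decide whether LTO should be on by default (hypothetical policy).

    linker_max_cc: highest compute capability the installed LTO-capable
        linker supports, or None if no such linker is installed.
    device_cc: compute capability of the current GPU.
    """
    if linker_max_cc is None:
        # No LTO-capable linker available: fall back to non-LTO linking.
        return False
    # Tuples compare lexicographically, so (9, 0) >= (8, 6) is True.
    return linker_max_cc >= device_cc
```

A caller would pass the result as the default for `lto` and still let users override it explicitly, which leaves room for the caveats mentioned above.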

isVoid · 2025-07-31T15:45:54Z

Closing: LTO is now on by default, which means the bfloat16 bindings are already tested under LTO. This PR now adds little value.

isVoid and others added 17 commits May 8, 2025 20:35

initial

87dad95

update generation script

79c4f19

enable exp2 for 3.10+

a37f97a

add tanh / tanh_approx bindings test

ca8ec01

bind to math.tanh

e834c4e

add tanh documentation

70824c5

Drop ctk11 support, handle py3.9 and py3.10 breakage

970e08b

add cuda_version12.0 to supported bfloat16 helper

190d2fb

Merge branch 'main' into fea-bfloat16-highlevel

f781ae7

Apply suggestions from code review

5af2cb7

Merge branch 'main' into fea-bfloat16-highlevel

87a42c8

Batch address documentation review comments

52df43b

Co-authored-by: Graham Markall <535640+gmarkall@users.noreply.github.com>

Apply suggestions from code review

2e0ff0d

Co-authored-by: Graham Markall <535640+gmarkall@users.noreply.github.com>

update h -> b

709ecf9

update the unit test with runtime skips

144497e

Merge branch 'fea-bfloat16-highlevel' of https://github.com/isVoid/nu…

de92534

…mba-cuda into fea-bfloat16-highlevel

add test to use bf16 with lto

2edf5d5

gmarkall added the "2 - In Progress" (currently a work in progress) label May 14, 2025

isVoid added 2 commits July 23, 2025 11:01

Merge branch 'main' of github.com:NVIDIA/numba-cuda into fea-bf16-lto…

3fe8816

…-support

Merge branch 'main' of github.com:NVIDIA/numba-cuda into fea-bf16-lto…

4a6999f

…-support

isVoid closed this Jul 31, 2025

isVoid wants to merge 19 commits into NVIDIA:main from isVoid:fea-bf16-lto-support

2 participants
