Merged
60 commits
987617d
update config file with use separate registries
isVoid Aug 4, 2025
feb8a09
regenerate bfloat16 bindings with lakshayg/Numbast@6282df4
isVoid Aug 4, 2025
ae6de8c
remove re-import of bfloat16 type
isVoid Aug 4, 2025
1c8de89
Merge branch 'main' of github.com:NVIDIA/numba-cuda into imprv-bf16-t…
isVoid Aug 5, 2025
8498a99
implement custom bfloat16 type object; insert type registry into cuda…
isVoid Aug 5, 2025
f79f0bf
update bfloat16 bindings
isVoid Aug 7, 2025
1b3598f
export typing and target registries in bf16
isVoid Aug 7, 2025
efc32f0
manually implement the lower_cast for float16 to bfloat16
isVoid Aug 7, 2025
b0f76e9
add converting rules and unify rules
isVoid Aug 7, 2025
0418625
choose irType based on compute capability
isVoid Aug 14, 2025
6ffa696
vend ctk13 code
isVoid Aug 14, 2025
834a905
regenerate with ctk13
isVoid Aug 14, 2025
577f00a
explicitly test against bfloat16 type
isVoid Aug 14, 2025
765f8ee
hand write lower cast fp16->bf16
isVoid Aug 15, 2025
c443e4d
ptx test for several basic ptx
isVoid Aug 15, 2025
166c9ae
add double underscore intrinsics
isVoid Aug 15, 2025
6d8fd66
regenerate with globals
isVoid Aug 15, 2025
c647ae3
apply binding patches
isVoid Aug 15, 2025
4b262e9
generate the bindings
isVoid Aug 15, 2025
9e79d37
apply binding patches
isVoid Aug 15, 2025
667d9fa
generate bindings
isVoid Aug 15, 2025
c4cf685
apply binding patches
isVoid Aug 15, 2025
7a89d3e
generate the bindings
isVoid Aug 15, 2025
da312aa
apply binding patches
isVoid Aug 15, 2025
bc7dbaa
re-imports the bf16 intrinsics
isVoid Aug 15, 2025
04823e8
Add documentation for arithmetic operations
isVoid Aug 15, 2025
b7e0e8b
add logical intrinsics
isVoid Aug 15, 2025
3407e19
make bfloat16 usable on host if ml_dtypes is installed
isVoid Aug 15, 2025
2ce64ed
add comparison operators
isVoid Aug 15, 2025
7d289b2
add basic conversion: float, int bidirectional conversion intrinsics
isVoid Aug 15, 2025
9317e7a
add numerical precision cast and tests
isVoid Aug 15, 2025
55d2220
add documentation for conversions
isVoid Aug 15, 2025
702b8ca
removing cuda_bf16 vended headers
isVoid Aug 15, 2025
8b569c6
update format constant method for BfloatType
isVoid Aug 15, 2025
2148be9
implement printing support for bfloat16
isVoid Aug 18, 2025
07b9c1e
implement to int conversion tests
isVoid Aug 18, 2025
0834f6d
add from integer conversion test
isVoid Aug 18, 2025
264f069
testing bitcast operations
isVoid Aug 18, 2025
abaa44d
Merge branch 'main' of github.com:NVIDIA/numba-cuda into imprv-bf16-t…
isVoid Aug 19, 2025
88ac53e
add fp16, bf16 vended headers
isVoid Aug 19, 2025
e33c7cc
update doc
isVoid Aug 19, 2025
bc3b27d
remove bfloattype custom impl
isVoid Aug 19, 2025
edea3c3
add print tests
isVoid Aug 19, 2025
0fa0174
add documentation for bfloat16 type
isVoid Aug 19, 2025
e09ffc6
update ci script and pyproject toml to make ml_dtypes a test time dep…
isVoid Aug 19, 2025
14b9fd5
add manual implementation of bf16->fp64, litint->bf16
isVoid Aug 20, 2025
b7b70c6
Maintain original overload resolution for all native operations
isVoid Aug 20, 2025
220287a
remove operator function exposure (a numbast bug)
isVoid Aug 20, 2025
8a504d8
Merge branch 'main' of github.com:NVIDIA/numba-cuda into imprv-bf16-t…
isVoid Aug 20, 2025
0f6683e
remove ml_dtypes dependency in core
isVoid Aug 21, 2025
c309442
use builtin not old_builtin
isVoid Aug 21, 2025
94bf193
Merge branch 'main' of github.com:NVIDIA/numba-cuda into imprv-bf16-t…
isVoid Aug 21, 2025
6ed0e81
add ml_dtypes to simulator ci
isVoid Aug 21, 2025
c0250cb
fix sub-sub section headers
isVoid Aug 21, 2025
13d7cb7
skip simulator for roundtrip
isVoid Aug 21, 2025
d41c67c
use numba typing templates
isVoid Aug 21, 2025
7d77ada
skip lto test without nvjitlink
isVoid Aug 21, 2025
212f4f0
skip cuda sim for bfloat16 tests
isVoid Aug 21, 2025
ac576be
update simulator tests
isVoid Aug 21, 2025
f3946db
skip simulator test on host
isVoid Aug 22, 2025
1 change: 1 addition & 0 deletions ci/test_conda.sh
@@ -36,6 +36,7 @@ DEPENDENCIES=(
   "pytest"
   "pytest-xdist"
   "cffi"
+  "ml_dtypes"
   "python=${RAPIDS_PY_VERSION}"
 )
 # Constrain oldest supported dependencies for testing
1 change: 1 addition & 0 deletions ci/test_conda_ctypes_binding.sh
@@ -26,6 +26,7 @@ DEPENDENCIES=(
   "pytest"
   "pytest-xdist"
   "cffi"
+  "ml_dtypes"
   "python=${RAPIDS_PY_VERSION}"
   "numba-cuda"
 )
1 change: 1 addition & 0 deletions ci/test_simulator.sh
@@ -13,6 +13,7 @@ DEPENDENCIES=(
   "pytest"
   "pytest-xdist"
   "cffi"
+  "ml_dtypes"
   "python=${RAPIDS_PY_VERSION}"
   "numba-cuda"
 )
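The CI script changes above add ml_dtypes as a test-time dependency only (the commit "remove ml_dtypes dependency in core" keeps it out of the runtime requirements), so the package has to cope with its absence. A minimal sketch of such an optional-dependency guard — the `HAS_ML_DTYPES` and `host_bfloat16_dtype` names are illustrative, not taken from the PR:

```python
import importlib.util

# Detect the optional host-side bfloat16 provider without importing it.
HAS_ML_DTYPES = importlib.util.find_spec("ml_dtypes") is not None

def host_bfloat16_dtype():
    """Return ml_dtypes.bfloat16 when available, else None."""
    if not HAS_ML_DTYPES:
        return None
    import ml_dtypes
    return ml_dtypes.bfloat16
```

In a pytest suite the same effect is commonly achieved with `pytest.importorskip("ml_dtypes")` at the top of the bfloat16 host tests.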
12 changes: 6 additions & 6 deletions configs/cuda_bf16.yml
@@ -1,10 +1,12 @@
 # SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 # SPDX-License-Identifier: BSD-2-Clause
 Name: Numba Bfloat16
-Version: 0.0.1
-Entry Point: ./numba_cuda/numba/cuda/include/12/cuda_bf16.h
+Version: 0.0.2
+GPU Arch:
+  - sm_80 # sm_80 is the first CUDA architecture that supports bfloat16
+Entry Point: ./numba_cuda/numba/cuda/include/13/cuda_bf16.h
 File List:
-  - ./numba_cuda/numba/cuda/include/12/cuda_bf16.h
+  - ./numba_cuda/numba/cuda/include/13/cuda_bf16.h
 Exclude: {}
 Types:
   __nv_bfloat16_raw: Number
@@ -21,6 +23,4 @@ Data Models:
   __nv_bfloat162: StructModel
   nv_bfloat162: StructModel
 Shim Include Override: "\"cuda_bf16.h\""
-Additional Import:
-  - os
-Require Pynvjitlink: False
+Use Separate Registry: True
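Applying the two hunks above, the changed portions of configs/cuda_bf16.yml end up as follows (unchanged neighbors reproduced from the diff context; lines between the hunks elided as `# ...`):

```yaml
Name: Numba Bfloat16
Version: 0.0.2
GPU Arch:
  - sm_80 # sm_80 is the first CUDA architecture that supports bfloat16
Entry Point: ./numba_cuda/numba/cuda/include/13/cuda_bf16.h
File List:
  - ./numba_cuda/numba/cuda/include/13/cuda_bf16.h
Exclude: {}
# ...
Shim Include Override: "\"cuda_bf16.h\""
Use Separate Registry: True
```

The net effect matches the commit messages: the vended header moves from the CTK 12 to the CTK 13 copy, the config declares sm_80 as the minimum architecture, and the Numbast-specific `Use Separate Registry` flag replaces the old `Additional Import` / `Require Pynvjitlink` keys.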