Merged

27 commits
a250d4b  Initial attempt at CUDA 13 support / drop CUDA 11 (gmarkall, Jul 31, 2025)
1e6a534  Update dependency naming for CUDA 13 changes (gmarkall, Aug 5, 2025)
8ebf865  Remove MVC env var from docs (gmarkall, Aug 5, 2025)
0e75f18  Merge remote-tracking branch 'NVIDIA/main' into cuda-13 (gmarkall, Aug 7, 2025)
d1dc13c  Update pyproject.toml (gmarkall, Aug 7, 2025)
c1d42c8  Apply suggestions from code review (gmarkall, Aug 12, 2025)
3394a19  Merge remote-tracking branch 'NVIDIA/main' into cuda-13 (gmarkall, Aug 12, 2025)
28957de  Filter the coverage-report job on a valid config (gmarkall, Aug 12, 2025)
57d5047  Try to WAR no ci-conda 13 images (gmarkall, Aug 12, 2025)
61380c3  Don't build test binaries for CC 7.0 (gmarkall, Aug 12, 2025)
03943a6  Fix check for when to use apt-get (gmarkall, Aug 12, 2025)
2c66e51  Fix nvrtc import in generate_raw_ltoir for CUDA 13 (gmarkall, Aug 12, 2025)
1bd3b22  Fix check for Ubuntu (gmarkall, Aug 12, 2025)
49fcff1  Revert "Try to WAR no ci-conda 13 images" (gmarkall, Aug 13, 2025)
e9ba0a4  Clarify CTK supported CCs (gmarkall, Aug 13, 2025)
1c00df4  Apply suggestions from code review (gmarkall, Aug 13, 2025)
39e400b  Attempt to accommodate CCCL include path (gmarkall, Aug 13, 2025)
e5be30f  Remove CTK_CURAND_VMAP (gmarkall, Aug 13, 2025)
1420c67  Only test third party deps on CUDA 12 (gmarkall, Aug 13, 2025)
082b618  Don't load libnvrtc.so when using the cuda-python bindings (gmarkall, Aug 15, 2025)
9306ceb  Merge remote-tracking branch 'NVIDIA/main' into cuda-13 (gmarkall, Aug 15, 2025)
9b973ba  Merge remote-tracking branch 'NVIDIA/main' into cuda-13 (gmarkall, Aug 15, 2025)
55991cf  Delete vendored CUDA 11 includes (gmarkall, Aug 15, 2025)
775a9c1  Use cuda-core >= 0.3.2 for CUDA 13 (gmarkall, Aug 15, 2025)
32a453c  Modify Ubuntu version coverage in CI (gmarkall, Aug 15, 2025)
239ab8b  Add TODO for 3rd party deps CUDA 13 testing (gmarkall, Aug 15, 2025)
d8843ef  Add TODO to use cudatoolkit package for dependencies on CUDA 13 (gmarkall, Aug 15, 2025)
5 changes: 3 additions & 2 deletions .github/workflows/pr.yaml
@@ -146,7 +146,8 @@ jobs:
build_type: pull-request
script: "ci/test_thirdparty.sh"
matrix: ${{ needs.compute-matrix.outputs.TEST_MATRIX }}
matrix_filter: map(select(.ARCH == "amd64" and (.CUDA_VER | split(".") | .[0] | tonumber >= 12))) | group_by(.CUDA_VER|split(".")|map(tonumber)|.[0]) | map(max_by([(.PY_VER|split(".")|map(tonumber)), (.CUDA_VER|split(".")|map(tonumber))]))
# TODO: Enable for CUDA 13 when a supporting version of cuDF is available
matrix_filter: map(select(.ARCH == "amd64" and (.CUDA_VER | split(".") | .[0] | tonumber == 12))) | group_by(.CUDA_VER|split(".")|map(tonumber)|.[0]) | map(max_by([(.PY_VER|split(".")|map(tonumber)), (.CUDA_VER|split(".")|map(tonumber))]))
Contributor:
Do we need a TODO to re-enable tests with CUDA 13? What requirements are missing? (e.g. cupy, …?)

Contributor Author:
The thirdparty test is presently cuDF, so having a version of cuDF that officially supports CUDA 13 is probably a good criterion for this - I don't think there's a CUDA 13-supporting version yet?

Contributor:
Correct. We are waiting on numba-cuda. 🙃 We'll get this sorted out soon enough.
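
A rough Python sketch of what the jq ``matrix_filter`` above computes may help readers less familiar with jq. The names ``filter_thirdparty_matrix`` and ``vtuple`` are illustrative only; CI evaluates the jq expression directly against ``TEST_MATRIX``.

.. code:: python

    from itertools import groupby


    def vtuple(version):
        # "12.9.1" -> (12, 9, 1), so versions compare numerically
        return tuple(int(part) for part in version.split("."))


    def filter_thirdparty_matrix(entries):
        # select(.ARCH == "amd64" and CUDA major version == 12)
        selected = [
            e for e in entries
            if e["ARCH"] == "amd64" and vtuple(e["CUDA_VER"])[0] == 12
        ]
        # group_by the CUDA major version (jq's group_by sorts first), then keep
        # the entry with the newest (PY_VER, CUDA_VER) pair from each group
        selected.sort(key=lambda e: vtuple(e["CUDA_VER"])[0])
        return [
            max(group, key=lambda e: (vtuple(e["PY_VER"]), vtuple(e["CUDA_VER"])))
            for _, group in groupby(selected, key=lambda e: vtuple(e["CUDA_VER"])[0])
        ]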

build-docs:
needs:
- build-conda
@@ -162,4 +163,4 @@ jobs:
build_type: pull-request
script: "ci/coverage_report.sh"
matrix: ${{ needs.compute-matrix.outputs.TEST_MATRIX }}
matrix_filter: 'map(select(.ARCH == "amd64" and .CUDA_VER == "12.8.0" and .PY_VER == "3.13")) | .[0:1]'
matrix_filter: 'map(select(.ARCH == "amd64" and .CUDA_VER == "12.9.1" and .PY_VER == "3.11")) | .[0:1]'
Contributor:
Why was Python downgraded?

Contributor Author:
Because the matrix changed so that the newer Pythons were tested with CUDA 13, and I wanted to keep the coverage report on a CUDA 12 version.

20 changes: 10 additions & 10 deletions ci/matrix.yml
@@ -3,13 +3,13 @@ build-matrix:
simulator-matrix:
- { ARCH: 'amd64', PY_VER: '3.12', CUDA_VER: '12.8.0', LINUX_VER: 'rockylinux8' }
test-matrix:
- { CUDA_VER: '11.8.0', ARCH: 'amd64', PY_VER: '3.9', LINUX_VER: 'rockylinux8', GPU: 'l4', DRIVER: 'earliest' }
- { CUDA_VER: '11.8.0', ARCH: 'amd64', PY_VER: '3.10', LINUX_VER: 'ubuntu20.04', GPU: 'l4', DRIVER: 'latest' }
- { CUDA_VER: '12.0.1', ARCH: 'amd64', PY_VER: '3.11', LINUX_VER: 'ubuntu22.04', GPU: 'l4', DRIVER: 'latest' }
- { CUDA_VER: '12.2.2', ARCH: 'amd64', PY_VER: '3.12', LINUX_VER: 'ubuntu22.04', GPU: 'l4', DRIVER: 'latest' }
- { CUDA_VER: '12.8.0', ARCH: 'amd64', PY_VER: '3.13', LINUX_VER: 'ubuntu24.04', GPU: 'l4', DRIVER: 'latest' }
- { CUDA_VER: '11.8.0', ARCH: 'arm64', PY_VER: '3.9', LINUX_VER: 'rockylinux8', GPU: 'a100', DRIVER: 'earliest' }
- { CUDA_VER: '11.8.0', ARCH: 'arm64', PY_VER: '3.10', LINUX_VER: 'ubuntu20.04', GPU: 'a100', DRIVER: 'latest' }
- { CUDA_VER: '12.2.2', ARCH: 'arm64', PY_VER: '3.11', LINUX_VER: 'ubuntu22.04', GPU: 'a100', DRIVER: 'latest' }
- { CUDA_VER: '12.2.2', ARCH: 'arm64', PY_VER: '3.12', LINUX_VER: 'ubuntu22.04', GPU: 'a100', DRIVER: 'latest' }
- { CUDA_VER: '12.8.0', ARCH: 'arm64', PY_VER: '3.13', LINUX_VER: 'ubuntu24.04', GPU: 'a100', DRIVER: 'latest' }
- { CUDA_VER: '12.0.1', ARCH: 'amd64', PY_VER: '3.9', LINUX_VER: 'rockylinux8', GPU: 'l4', DRIVER: 'earliest' }
- { CUDA_VER: '12.2.2', ARCH: 'amd64', PY_VER: '3.10', LINUX_VER: 'ubuntu22.04', GPU: 'l4', DRIVER: 'latest' }
- { CUDA_VER: '12.9.1', ARCH: 'amd64', PY_VER: '3.11', LINUX_VER: 'ubuntu24.04', GPU: 'l4', DRIVER: 'latest' }
- { CUDA_VER: '13.0.0', ARCH: 'amd64', PY_VER: '3.12', LINUX_VER: 'ubuntu22.04', GPU: 'l4', DRIVER: 'latest' }
- { CUDA_VER: '13.0.0', ARCH: 'amd64', PY_VER: '3.13', LINUX_VER: 'ubuntu24.04', GPU: 'l4', DRIVER: 'latest' }
- { CUDA_VER: '12.0.1', ARCH: 'arm64', PY_VER: '3.9', LINUX_VER: 'rockylinux8', GPU: 'a100', DRIVER: 'earliest' }
- { CUDA_VER: '12.2.2', ARCH: 'arm64', PY_VER: '3.10', LINUX_VER: 'ubuntu22.04', GPU: 'a100', DRIVER: 'latest' }
- { CUDA_VER: '12.9.1', ARCH: 'arm64', PY_VER: '3.11', LINUX_VER: 'ubuntu24.04', GPU: 'a100', DRIVER: 'latest' }
- { CUDA_VER: '13.0.0', ARCH: 'arm64', PY_VER: '3.12', LINUX_VER: 'ubuntu22.04', GPU: 'a100', DRIVER: 'latest' }
- { CUDA_VER: '13.0.0', ARCH: 'arm64', PY_VER: '3.13', LINUX_VER: 'ubuntu24.04', GPU: 'a100', DRIVER: 'latest' }
9 changes: 5 additions & 4 deletions ci/test_conda.sh
@@ -5,10 +5,11 @@ set -euo pipefail

. /opt/conda/etc/profile.d/conda.sh

if [ "${CUDA_VER%.*.*}" = "11" ]; then
CTK_PACKAGES="cudatoolkit=11"
else
CTK_PACKAGES="cuda-cccl cuda-nvcc-impl cuda-nvrtc libcurand-dev cuda-cuobjdump"
CTK_PACKAGES="cuda-cccl cuda-nvcc-impl cuda-nvrtc libcurand-dev cuda-cuobjdump"

DISTRO=`cat /etc/os-release | grep "^ID=" | awk 'BEGIN {FS="="} { print $2 }'`

if [ "$DISTRO" = "ubuntu" ]; then
apt-get update
apt remove --purge `dpkg --get-selections | grep cuda-nvvm | awk '{print $1}'` -y
apt remove --purge `dpkg --get-selections | grep cuda-nvrtc | awk '{print $1}'` -y
4 changes: 0 additions & 4 deletions ci/test_wheel_deps_wheels.sh
@@ -3,12 +3,8 @@

set -euo pipefail

# cuRAND versions don't follow the toolkit versions - map toolkit versions to
# appropriate cuRAND versions
declare -A CTK_CURAND_VMAP=( ["12.8"]="10.3.9" ["12.9"]="10.3.10")
CUDA_VER_MAJOR_MINOR=${CUDA_VER%.*}
CUDA_VER_MAJOR=${CUDA_VER%.*.*}
CURAND_VER="${CTK_CURAND_VMAP[${CUDA_VER_MAJOR_MINOR}]}"

rapids-logger "Install wheel with test dependencies"
package=$(realpath wheel/numba_cuda*.whl)
11 changes: 2 additions & 9 deletions docs/source/reference/envvars.rst
@@ -84,8 +84,8 @@ target.

.. seealso::

The `Default Stream section
<https://nvidia.github.io/cuda-python/release/11.6.0-notes.html#default-stream>`_
The `Runtime Environment Variables section
<https://nvidia.github.io/cuda-python/cuda-bindings/latest/environment_variables.html#runtime-environment-variables>`_
in the NVIDIA Bindings documentation.

.. envvar:: NUMBA_CUDA_LOW_OCCUPANCY_WARNINGS
@@ -119,13 +119,6 @@ target.
``/usr/local/cuda/include``. On Windows, the default is
``$env:CUDA_PATH\include``.

.. envvar:: NUMBA_CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY

Enable minor version compatibility for the CUDA driver. Requires the
``cubinlinker`` and ``ptxcompiler`` packages to be installed. Provides minor
version compatibility for driver versions less than 12.0.


.. envvar:: NUMBA_CUDA_NVRTC_EXTRA_SEARCH_PATHS

A colon separated list of paths that Numba's NVRTC should search for when compiling
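
As an aside, a hedged illustration of using ``NUMBA_CUDA_NVRTC_EXTRA_SEARCH_PATHS`` from Python follows; the include directory shown is a made-up placeholder, and the sketch assumes the variable is set before Numba's CUDA support is first used.

.. code:: python

    # Illustrative only: the directory is a placeholder. Set the variable before
    # Numba's NVRTC support is first used so the extra search path is picked up.
    import os

    os.environ["NUMBA_CUDA_NVRTC_EXTRA_SEARCH_PATHS"] = "/opt/example/include"

    from numba import cuda  # imported after setting the environment variable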
3 changes: 1 addition & 2 deletions docs/source/user/cudapysupported.rst
@@ -134,8 +134,7 @@ The following built-in types support are inherited from CPU nopython mode.
See :ref:`nopython built-in types <pysupported-builtin-types>`.

There is also some very limited support for character sequences (bytes and
unicode strings) used in NumPy arrays. Note that this support can only be used
with CUDA 11.2 onwards.
unicode strings) used in NumPy arrays.

Built-in functions
==================
25 changes: 11 additions & 14 deletions docs/source/user/installation.rst
@@ -11,17 +11,16 @@ Supported GPUs
--------------

Numba supports all NVIDIA GPUs that are supported by the CUDA Toolkit it uses.
Presently for CUDA 11 this ranges from Compute Capabilities 3.5 to 9.0, and for
CUDA 12 this ranges from 5.0 to 12.1, depending on the exact installed version.
Presently for CUDA 12 this ranges from Compute Capabilities 5.0 to 12.1
depending on the exact installed version, and for CUDA 13 this ranges from 7.5
to 12.1 (the latest as of CUDA 13.0).


Supported CUDA Toolkits
-----------------------

Numba-CUDA aims to support all minor versions of the two most recent CUDA
Toolkit releases. Presently 11 and 12 are supported; CUDA 11.2 is the minimum
required, because older releases (11.0 and 11.1) have a version of NVVM based on
a previous and incompatible LLVM version.
Toolkit releases. Presently 12 and 13 are supported.

For further information about version compatibility between toolkit and driver
versions, refer to :ref:`minor-version-compatibility`.
@@ -30,23 +29,21 @@ versions, refer to :ref:`minor-version-compatibility`.
Installation with a Python package manager
==========================================

Conda users can install the CUDA Toolkit into a conda environment.
Conda users can install the CUDA Toolkit into a conda environment::

For CUDA 12::
$ conda install -c conda-forge numba-cuda "cuda-version=12"

$ conda install -c conda-forge numba-cuda "cuda-version>=12.0"
Or for CUDA 13::

$ conda install -c conda-forge numba-cuda "cuda-version=13"

Alternatively, you can install all CUDA 12 dependencies from PyPI via ``pip``::

$ pip install numba-cuda[cu12]

For CUDA 11, ``cudatoolkit`` is required::

$ conda install -c conda-forge numba-cuda "cuda-version>=11.2,<12.0"

or::
CUDA 13 dependencies can be installed via ``pip`` with::

$ pip install numba-cuda[cu11]
$ pip install numba-cuda[cu13]

If you are not using Conda/pip or if you want to use a different version of CUDA
toolkit, :ref:`cudatoolkit-lookup` describes how Numba searches for a CUDA toolkit.
67 changes: 4 additions & 63 deletions docs/source/user/minor_version_compatibility.rst
@@ -7,68 +7,11 @@ CUDA `Minor Version Compatibility
<https://docs.nvidia.com/deploy/cuda-compatibility/index.html#minor-version-compatibility>`_
(MVC) enables the use of a newer CUDA Toolkit version than the CUDA version
supported by the driver, provided that the Toolkit and driver both have the same
major version. For example, use of CUDA Toolkit 11.5 with CUDA driver 450 (CUDA
version 11.0) is supported through MVC.

Numba supports MVC for CUDA 12 on Linux using the `nvjitlink` library.

Numba supports MVC for CUDA 11 on Linux using the external ``cubinlinker`` and
``ptxcompiler`` packages, subject to the following limitations:

- Linking of archives is unsupported.
- Cooperative Groups are unsupported, because they require an archive to be
linked.

MVC is not supported on Windows.


Installation
------------

CUDA 11
~~~~~~~

To use MVC support, the ``cubinlinker`` and ``ptxcompiler`` compiler packages
must be installed from the appropriate channels. To install using conda, use:

.. code:: bash

conda install -c rapidsai -c conda-forge cubinlinker ptxcompiler

To install with pip, use the NVIDIA package index:

.. code:: bash

pip install --extra-index-url https://pypi.nvidia.com ptxcompiler-cu11 cubinlinker-cu11

CUDA 12
~~~~~~~

For CUDA 12, MVC is provided by default through the ``nvjitlink`` package,
which ``numba-cuda[cu12]`` depends on directly, so no additional installation
steps are required.

Enabling MVC Support
--------------------

CUDA 11
~~~~~~~

MVC support is enabled by setting the environment variable:

.. code:: bash

export NUMBA_CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY=1

or by setting a configuration variable prior to using any CUDA functionality in
Numba:

.. code:: python

from numba import config
config.CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY = True

major version. For example, use of CUDA Toolkit 12.9 with CUDA driver 570 (CUDA
version 12.8) is supported through MVC.

Numba supports MVC using the linker in the NVIDIA CUDA Python bindings, which
uses ``nvjitlink`` to provide MVC.


References
@@ -78,5 +21,3 @@ Further information about Minor Version Compatibility may be found in:

- The `CUDA Compatibility Guide
<https://docs.nvidia.com/deploy/cuda-compatibility/index.html>`_.
- The `README for ptxcompiler
<https://github.com/rapidsai/ptxcompiler/blob/main/README.md>`_.
9 changes: 2 additions & 7 deletions numba_cuda/numba/cuda/api.py
@@ -10,7 +10,6 @@
from .cudadrv import devicearray, devices, driver
from numba.core import config
from numba.cuda.api_util import prepare_shape_strides_dtype
from numba.cuda.cudadrv.runtime import get_version

# NDarray device helper

@@ -99,13 +98,9 @@ def is_float16_supported():
def is_bfloat16_supported():
"""Whether bfloat16 are supported.

bfloat16 are only supported on devices with compute capability >= 8.0 and cuda version >= 12.0
bfloat16 is only supported on devices with compute capability >= 8.0
"""
cuda_version = get_version()
return current_context().device.supports_bfloat16 and cuda_version >= (
12,
0,
)
return current_context().device.supports_bfloat16


@require_context
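
A brief usage sketch of the simplified check above follows; it assumes a CUDA-capable GPU and driver are present so a context can be created.

.. code:: python

    # Sketch: query bfloat16 support through the public API. Requires a working
    # CUDA context; the reported compute capability varies by machine.
    from numba import cuda

    if cuda.is_bfloat16_supported():
        cc = cuda.current_context().device.compute_capability
        print("bfloat16 available, device compute capability:", cc)
    else:
        print("bfloat16 needs a device with compute capability >= 8.0")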
26 changes: 9 additions & 17 deletions numba_cuda/numba/cuda/cuda_paths.py
@@ -148,7 +148,6 @@ def get_nvrtc_dso_path():
# Check for each version of the NVRTC DLL, preferring the most
# recent.
versions = (
"112" if IS_WIN32 else "11.2",
"120" if IS_WIN32 else "12",
"130" if IS_WIN32 else "13",
)
@@ -303,16 +302,16 @@ def get_nvidia_nvvm_ctk():

# Assume the existence of NVVM in the conda env implies that a CUDA toolkit
# conda package is installed.
if IS_WIN32:
# The path used on Windows
libdir = os.path.join(sys.prefix, "Library", "nvvm", _cudalib_path())
else:
# The path used on Linux is different to that on Windows
libdir = os.path.join(sys.prefix, "nvvm", _cudalib_path())

# First, try the location used on Linux and the Windows 11.x packages
libdir = os.path.join(sys.prefix, "nvvm", _cudalib_path())
if not os.path.exists(libdir) or not os.path.isdir(libdir):
# If that fails, try the location used for Windows 12.x packages
libdir = os.path.join(sys.prefix, "Library", "nvvm", _cudalib_path())
if not os.path.exists(libdir) or not os.path.isdir(libdir):
# If that doesn't exist either, assume we don't have the NVIDIA
# conda package
return
# If the path doesn't exist, we didn't find the NVIDIA conda package
return

paths = find_lib("nvvm", libdir=libdir)
if not paths:
@@ -346,15 +345,8 @@ def get_nvidia_static_cudalib_ctk():
if not nvvm_ctk:
return

if IS_WIN32 and ("Library" not in nvvm_ctk):
# Location specific to CUDA 11.x packages on Windows
dirs = ("Lib", "x64")
else:
# Linux, or Windows with CUDA 12.x packages
dirs = ("lib",)

env_dir = os.path.dirname(os.path.dirname(nvvm_ctk))
return os.path.join(env_dir, *dirs)
return os.path.join(env_dir, "lib")


def get_cuda_home(*subdirs):
15 changes: 3 additions & 12 deletions numba_cuda/numba/cuda/cudadrv/driver.py
@@ -2802,19 +2802,10 @@ def new(
lto=None,
additional_flags=None,
):
driver_ver = driver.get_version()
if driver_ver < (12, 0):
if config.CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY:
linker = MVCLinker
elif USE_NV_BINDING:
linker = _Linker
else:
linker = CtypesLinker
if USE_NV_BINDING:
linker = _Linker
else:
if USE_NV_BINDING:
linker = _Linker
else:
linker = CtypesLinker
linker = CtypesLinker

params = (max_registers, lineinfo, cc)
if linker is _Linker:
3 changes: 1 addition & 2 deletions numba_cuda/numba/cuda/cudadrv/libs.py
@@ -2,8 +2,7 @@

CUDA Toolkit libraries can be available via either:

- the `cuda-nvcc` and `cuda-nvrtc` conda packages for CUDA 12,
- the `cudatoolkit` conda package for CUDA 11,
- the `cuda-nvcc` and `cuda-nvrtc` conda packages,
- a user supplied location from CUDA_HOME,
- a system wide location,
- package-specific locations (e.g. the Debian NVIDIA packages),