Merged
18 changes: 12 additions & 6 deletions docs/source/index.rst
@@ -3,18 +3,24 @@
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.

Numba CUDA
Numba-CUDA
==========

This is the documentation for Numba's CUDA target.
Numba-CUDA provides a CUDA target for the Numba Python JIT Compiler. It is used
for writing SIMT kernels in Python, for providing Python bindings for
accelerated device libraries, and as a compiler for user-defined functions in
accelerated libraries like `RAPIDS <https://rapids.ai>`_.

This is presently a work-in-progress - the user guide and reference
documentation are presently direct copies of the `upstream Numba CUDA
documentation <https://numba.readthedocs.io/en/0.60.0/>`_.
* To install Numba-CUDA, see: :ref:`numba-cuda-installation`.
* To get started writing CUDA kernels in Python with Numba, see
:ref:`writing-cuda-kernels`.
* Browse the :ref:`numba-cuda-examples` to see a variety of use cases of Numba-CUDA.

Contents
========

.. toctree::
:maxdepth: 2
:hidden:

user/index.rst
reference/index.rst
1 change: 1 addition & 0 deletions docs/source/user/examples.rst
@@ -1,3 +1,4 @@
.. _numba-cuda-examples:

========
Examples
2 changes: 1 addition & 1 deletion docs/source/user/index.rst
@@ -6,7 +6,7 @@ User guide

.. toctree::

overview.rst
installation.rst
kernels.rst
memory.rst
device-functions.rst
105 changes: 105 additions & 0 deletions docs/source/user/installation.rst
@@ -0,0 +1,105 @@
.. _numba-cuda-installation:

============
Installation
============

Requirements
============

Supported GPUs
--------------

Numba supports all NVIDIA GPUs that are supported by the CUDA Toolkit it uses.
For CUDA 11 this presently spans compute capabilities 3.5 to 9.0, and for CUDA
12 it spans 5.0 to 12.1, depending on the exact toolkit version installed.


Supported CUDA Toolkits
-----------------------

Numba-CUDA aims to support all minor versions of the two most recent CUDA
Toolkit releases. Presently 11 and 12 are supported; CUDA 11.2 is the minimum
required, because older releases (11.0 and 11.1) have a version of NVVM based on
a previous and incompatible LLVM version.

For further information about version compatibility between toolkit and driver
versions, refer to :ref:`minor-version-compatibility`.


Installation with a Python package manager
==========================================

Conda users can install the CUDA Toolkit into a conda environment.

For CUDA 12::

$ conda install -c conda-forge numba-cuda "cuda-version>=12.0"

Alternatively, you can install all CUDA 12 dependencies from PyPI via ``pip``::

$ pip install numba-cuda[cu12]

For CUDA 11, ``cudatoolkit`` is required::

$ conda install -c conda-forge numba-cuda "cuda-version>=11.2,<12.0"

or::

$ pip install numba-cuda[cu11]

If you are not using conda or pip, or if you want to use a different version of
the CUDA Toolkit, :ref:`cudatoolkit-lookup` describes how Numba searches for a
CUDA toolkit.


Configuration
=============

.. _cuda-bindings:

CUDA Bindings
-------------

Numba supports interacting with the CUDA Driver API via either the `NVIDIA CUDA
Python bindings <https://nvidia.github.io/cuda-python/>`_ or its own ctypes-based
bindings. Functionality is equivalent between the two binding choices. The
NVIDIA bindings are the default, and the ctypes bindings are now deprecated.

If you do not want to use the NVIDIA bindings, the (deprecated) ctypes bindings
can be enabled by setting the environment variable
:envvar:`NUMBA_CUDA_USE_NVIDIA_BINDING` to ``"0"``.


.. _cudatoolkit-lookup:

CUDA Driver and Toolkit search paths
------------------------------------

Default behavior
~~~~~~~~~~~~~~~~

When using the NVIDIA bindings, the CUDA driver and toolkit libraries are
located using the bindings' `built-in path-finding logic <https://github.com/NVIDIA/cuda-python/tree/main/cuda_bindings/cuda/bindings/_path_finder>`_.

Ctypes bindings (deprecated) behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When using the ctypes bindings, Numba searches for a CUDA toolkit installation
in the following order:

1. Conda-installed CUDA Toolkit packages
2. Pip-installed CUDA Toolkit packages
3. The environment variable ``CUDA_HOME``, which points to the directory of the
   installed CUDA toolkit (e.g. ``/home/user/cuda-12``)
4. System-wide installation at exactly ``/usr/local/cuda`` on Linux platforms.
   Versioned installation paths (e.g. ``/usr/local/cuda-12.0``) are
   intentionally ignored. Users can use ``CUDA_HOME`` to select specific
   versions.
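As an illustration, a specific toolkit installation could be selected for the ctypes bindings like this (the path is hypothetical; adjust it to match your system):

```shell
# Point the ctypes bindings at one particular CUDA Toolkit installation.
# CUDA_HOME takes the toolkit's directory path.
export CUDA_HOME=/usr/local/cuda-12.0
```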

In addition to the CUDA toolkit libraries, which can be installed by conda into
an environment or installed system-wide by the `CUDA SDK installer
<https://developer.nvidia.com/cuda-downloads>`_, the CUDA target in Numba also
requires an up-to-date NVIDIA driver. Updated NVIDIA drivers are also installed
by the CUDA SDK installer, so there is no need to do both. If the ``libcuda``
library is in a non-standard location, users can set environment variable
:envvar:`NUMBA_CUDA_DRIVER` to the file path (not the directory path) of the
shared library file.
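For instance, if ``libcuda`` lives outside the standard search locations, it could be selected as follows (the path shown is hypothetical; note the variable takes the shared library's file path, not its directory):

```shell
# Point Numba at a non-standard libcuda (file path, not directory path).
export NUMBA_CUDA_DRIVER=/opt/nvidia/lib64/libcuda.so
```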
28 changes: 28 additions & 0 deletions docs/source/user/kernels.rst
@@ -1,8 +1,14 @@
.. _writing-cuda-kernels:

====================
Writing CUDA Kernels
====================

Numba-CUDA supports programming NVIDIA CUDA GPUs by directly compiling a
restricted subset of Python code into CUDA kernels and device functions
following the CUDA execution model.


Introduction
============

@@ -24,6 +30,28 @@ consider how to use and access memory in order to minimize bandwidth
requirements and contention.


Terminology
===========

Several important terms used in CUDA programming are listed here:

- *host*: the CPU
- *device*: the GPU
- *host memory*: the system main memory
- *device memory*: onboard memory on a GPU card
- *kernel*: a GPU function launched by the host and executed on the device
- *device function*: a GPU function executed on the device which can only be
called from the device (i.e. from a kernel or another device function)


Programming model
=================

Most CUDA programming facilities exposed by Numba map directly to the CUDA
C language offered by NVIDIA. Therefore, it is recommended that you read the
official `CUDA C programming guide <http://docs.nvidia.com/cuda/cuda-c-programming-guide>`_.


Kernel declaration
==================

142 changes: 0 additions & 142 deletions docs/source/user/overview.rst

This file was deleted.