52 changes: 0 additions & 52 deletions .flake8

This file was deleted.

2 changes: 2 additions & 0 deletions .git-blame-ignore-revs
@@ -0,0 +1,2 @@
# Migrate code style to ruff
06b62024f77bb92b585315fe61b9ba15e0885d71
24 changes: 21 additions & 3 deletions .pre-commit-config.yaml
@@ -1,5 +1,23 @@
repos:
- repo: https://github.com/PyCQA/flake8
rev: 7.1.0
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v5.0.0 # Use the latest version or a specific tag
hooks:
- id: flake8
- id: check-added-large-files
- id: check-ast
- id: check-json
- id: check-merge-conflict
- id: check-toml
- id: check-yaml
exclude: ^conda/recipes/numba-cuda/meta.yaml
- id: debug-statements
- id: end-of-file-fixer
- id: requirements-txt-fixer
- id: trailing-whitespace
- id: mixed-line-ending
args: ['--fix=lf']
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.11.2
hooks:
- id: ruff
args: [--fix]
- id: ruff-format
78 changes: 39 additions & 39 deletions docs/make.bat
@@ -1,39 +1,39 @@
@ECHO OFF
pushd %~dp0
REM Command file for Sphinx documentation
if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=source
set BUILDDIR=build
%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.https://www.sphinx-doc.org/
exit /b 1
)
if "%1" == "" goto help
if "%SPHINXOPTS%" == "" (
set SPHINXOPTS=-W
)
%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end
:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
:end
popd
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=source
set BUILDDIR=build

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.https://www.sphinx-doc.org/
exit /b 1
)

if "%1" == "" goto help

if "%SPHINXOPTS%" == "" (
set SPHINXOPTS=-W
)

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd
23 changes: 12 additions & 11 deletions docs/source/conf.py
@@ -6,39 +6,40 @@
# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

project = 'Numba CUDA'
copyright = '2012-2024 Anaconda Inc. 2024, NVIDIA Corporation.'
author = 'NVIDIA Corporation'
project = "Numba CUDA"
copyright = "2012-2024 Anaconda Inc. 2024, NVIDIA Corporation."
author = "NVIDIA Corporation"

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration

extensions = ['numpydoc', 'sphinx.ext.intersphinx', 'sphinx.ext.autodoc']
extensions = ["numpydoc", "sphinx.ext.intersphinx", "sphinx.ext.autodoc"]

templates_path = ['_templates']
templates_path = ["_templates"]
exclude_patterns = []

intersphinx_mapping = {
'python': ('https://docs.python.org/3', None),
'numpy': ('https://numpy.org/doc/stable/', None),
'llvmlite': ('https://llvmlite.readthedocs.io/en/latest/', None),
'numba': ('https://numba.readthedocs.io/en/latest/', None),
"python": ("https://docs.python.org/3", None),
"numpy": ("https://numpy.org/doc/stable/", None),
"llvmlite": ("https://llvmlite.readthedocs.io/en/latest/", None),
"numba": ("https://numba.readthedocs.io/en/latest/", None),
}

# To prevent autosummary warnings
numpydoc_show_class_members = False

autodoc_typehints = 'none'
autodoc_typehints = "none"

# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

try:
import nvidia_sphinx_theme # noqa: F401

html_theme = "nvidia_sphinx_theme"
except ImportError:
html_theme = "sphinx_rtd_theme"

html_static_path = ['_static']
html_static_path = ["_static"]
html_favicon = "_static/numba-green-icon-rgb.svg"
html_show_sphinx = False
4 changes: 2 additions & 2 deletions docs/source/reference/types.rst
@@ -19,7 +19,7 @@ this is the recommended way to instantiate vector types.

For convenience, users adapting existing kernels from CUDA C/C++ to Python may use
aliases consistent with the C/C++ namings. For example, ``float3`` aliases ``float32x3``,
``long3`` aliases ``int32x3`` or ``int64x3`` (depending on the platform), etc.
``long3`` aliases ``int32x3`` or ``int64x3`` (depending on the platform), etc.

Second, unlike CUDA C/C++ where factory functions are used, vector types are constructed directly
with their constructor. For example, to construct a ``float32x3``:
@@ -44,7 +44,7 @@ vector type. For example, all of the following constructions are valid:
# Construct a 4-component vector with 2 2-component vectors
u4 = uint32x4(u2, u2)

The 1st, 2nd, 3rd and 4th component of the vector type can be accessed through fields
The 1st, 2nd, 3rd and 4th component of the vector type can be accessed through fields
``x``, ``y``, ``z``, and ``w`` respectively. The components are immutable after
construction in the present version of Numba; it is expected that support for
mutating vector components will be added in a future release.
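As a rough illustration of the component access just described, here is a pure-Python stand-in built on ``namedtuple`` (an analogue for this page only; the actual CUDA vector types are usable inside kernels, and the name ``Float32x3`` here is hypothetical):

```python
from collections import namedtuple

# Hypothetical stand-in for the CUDA float32x3 vector type, used only to
# illustrate the x/y/z component access described above.
Float32x3 = namedtuple("Float32x3", ["x", "y", "z"])

v = Float32x3(1.0, 2.0, 3.0)
print(v.x, v.y, v.z)  # components read via named fields

# namedtuple fields are immutable, mirroring the current CUDA behaviour:
# attempting `v.x = 5.0` raises AttributeError.
```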
2 changes: 1 addition & 1 deletion docs/source/user/cooperative_groups.rst
@@ -50,7 +50,7 @@ overloads:
This can be used to ensure that the kernel is launched with no more than the
maximum number of blocks. Exceeding the maximum number of blocks for the
cooperative launch will result in a ``CUDA_ERROR_COOPERATIVE_LAUNCH_TOO_LARGE``
error.
error.


Applications and Example
1 change: 0 additions & 1 deletion docs/source/user/device-management.rst
@@ -89,4 +89,3 @@ For example, to obtain the UUID of the current device:
dev = cuda.current_context().device
# prints e.g. "GPU-e6489c45-5b68-3b03-bab7-0e7c8e809643"
print(dev.uuid)

32 changes: 16 additions & 16 deletions docs/source/user/examples.rst
@@ -101,17 +101,17 @@ propagates through an object over time. It works by discretizing the problem in
1. The domain is partitioned into a mesh of points that each have an individual temperature.
2. Time is partitioned into discrete intervals that are advanced forward sequentially.

Then, the following assumption is applied: The temperature of a point after some interval
Then, the following assumption is applied: The temperature of a point after some interval
has passed is some weighted average of the temperature of the points that are directly
adjacent to it. Intuitively, if all the points in the domain are very hot
and a single point in the middle is very cold, as time passes, the hot points will cause
the cold one to heat up and the cold point will cause the surrounding hot pieces to cool
slightly. Simply put, the heat spreads throughout the object.

We can implement this simulation using a Numba kernel. Let's start simple by assuming
we have a one dimensional object which we'll represent with an array of values. The position
we have a one dimensional object which we'll represent with an array of values. The position
of the element in the array is the position of a point within the object, and the value
of the element represents the temperature.
of the element represents the temperature.
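To make the update scheme concrete before reading the kernel, here is a minimal pure-Python sketch of one timestep (a CPU analogue, assuming fixed boundary temperatures and an illustrative 1/4–1/2–1/4 weighting; the actual example uses a CUDA kernel with two device buffers):

```python
def heat_step(temps):
    """One timestep of the 1-D update: each interior point becomes a
    weighted average of itself and its two neighbours; the boundary
    points are held fixed."""
    new = temps[:]  # second buffer, so reads and writes never overlap
    for i in range(1, len(temps) - 1):
        new[i] = 0.25 * temps[i - 1] + 0.5 * temps[i] + 0.25 * temps[i + 1]
    return new

# A cold object with a single hot point in the middle
temps = [0.0, 0.0, 100.0, 0.0, 0.0]
temps = heat_step(temps)
print(temps)  # → [0.0, 25.0, 50.0, 25.0, 0.0]
```

The copy into ``new`` plays the same role as the swapped buffers in the kernel: every read sees the previous timestep, avoiding a race between neighbouring updates.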

.. literalinclude:: ../../../numba_cuda/numba/cuda/tests/doc_examples/test_laplace.py
:language: python
@@ -138,7 +138,7 @@ The initial state of the problem can be visualized as:

In our kernel each thread will be responsible for managing the temperature update for a single element
in a loop over the desired number of timesteps. The kernel is below. Note the use of cooperative group
synchronization and the use of two buffers swapped at each iteration to avoid race conditions. See
synchronization and the use of two buffers swapped at each iteration to avoid race conditions. See
:func:`numba.cuda.cg.this_grid() <numba.cuda.cg.this_grid>` for details.

.. literalinclude:: ../../../numba_cuda/numba/cuda/tests/doc_examples/test_laplace.py
@@ -237,15 +237,15 @@ A common problem in business analytics is that of grouping the activity of users
sessions, called "sessionization". The idea is that users generally traverse through a website and perform
various actions (clicking something, filling out a form, etc.) in discrete groups. Perhaps a customer spends
some time shopping for an item in the morning and then again at night - often the business is interested in
treating these periods as separate interactions with their service, and this creates the problem of
treating these periods as separate interactions with their service, and this creates the problem of
programmatically splitting up activity in some agreed-upon way.

Here we'll illustrate how to write a Numba kernel to solve this problem. We'll start with data
containing two fields: let ``user_id`` represent a unique ID corresponding to an individual customer, and let
``action_time`` be a time that some unknown action was taken on the service. Right now, we'll assume there's
Here we'll illustrate how to write a Numba kernel to solve this problem. We'll start with data
containing two fields: let ``user_id`` represent a unique ID corresponding to an individual customer, and let
``action_time`` be a time that some unknown action was taken on the service. Right now, we'll assume there's
only one type of action, so all there is to know is when it happened.

Our goal will be to create a new column called ``session_id``, which contains a label corresponding to a unique
Our goal will be to create a new column called ``session_id``, which contains a label corresponding to a unique
session. We'll define the boundary between sessions as when there has been at least one hour between clicks.
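The boundary rule can be sketched in plain Python (a CPU sketch assuming a single user's times, sorted ascending and expressed in seconds; the names here are illustrative, not from the test module):

```python
SESSION_GAP = 3600  # one hour, in seconds

def sessionize(action_times):
    """Assign a session label to each action: a new session starts
    whenever at least SESSION_GAP seconds have passed since the
    previous action. Assumes one user's times, sorted ascending."""
    session_ids = []
    current = 0
    for i, t in enumerate(action_times):
        if i > 0 and t - action_times[i - 1] >= SESSION_GAP:
            current += 1  # gap of an hour or more: start a new session
        session_ids.append(current)
    return session_ids

print(sessionize([0, 100, 200, 5000, 5100]))  # → [0, 0, 0, 1, 1]
```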


@@ -256,7 +256,7 @@ session. We'll define the boundary between sessions as when there has been at least one hour between clicks.
:end-before: ex_sessionize.import.end
:dedent: 8
:linenos:

Here is a solution using Numba:

.. literalinclude:: ../../../numba_cuda/numba/cuda/tests/doc_examples/test_sessionize.py
@@ -285,8 +285,8 @@ and a similar pattern is seen throughout.
JIT Function CPU-GPU Compatibility
==================================

This example demonstrates how ``numba.jit`` can be used to jit compile a function for the CPU, while at the same time making
it available for use inside CUDA kernels. This can be very useful for users that are migrating workflows from CPU to GPU as
This example demonstrates how ``numba.jit`` can be used to jit compile a function for the CPU, while at the same time making
it available for use inside CUDA kernels. This can be very useful for users that are migrating workflows from CPU to GPU as
they can directly reuse potential business logic with fewer code changes.

Take the following example function:
@@ -309,7 +309,7 @@ The function ``business_logic`` can be run standalone in compiled form on the CPU
:dedent: 8
:linenos:

It can also be directly reused threadwise inside a GPU kernel. For example one may
It can also be directly reused threadwise inside a GPU kernel. For example one may
generate some vectors to represent ``x``, ``y``, and ``z``:

.. literalinclude:: ../../../numba_cuda/numba/cuda/tests/doc_examples/test_cpu_gpu_compat.py
@@ -345,12 +345,12 @@ This kernel can be invoked in the normal way:
Monte Carlo Integration
=======================

This example shows how to use Numba to approximate the value of a definite integral by rapidly generating
This example shows how to use Numba to approximate the value of a definite integral by rapidly generating
random numbers on the GPU. A detailed description of the mathematical mechanics of Monte Carlo integration
is out of the scope of the example, but it can briefly be described as an averaging process where the area
is out of the scope of the example, but it can briefly be described as an averaging process where the area
under the curve is approximated by taking the average of many rectangles formed by its function values.
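The averaging process can be sketched on the CPU with the standard library's ``random`` module (an illustrative analogue only; the actual example generates random numbers on the GPU and reduces the results there):

```python
import random

def monte_carlo_integrate(f, a, b, n=100_000, seed=42):
    """Approximate the definite integral of f over [a, b] as the
    average of n sampled function values times the interval width."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += f(a + (b - a) * rng.random())
    return (b - a) * total / n

# Integral of x^2 over [0, 1] is exactly 1/3
approx = monte_carlo_integrate(lambda x: x * x, 0.0, 1.0)
```

Each sampled ``f`` value is the height of one of the rectangles being averaged; multiplying the mean height by ``b - a`` gives the approximate area under the curve.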

In addition, this example shows how to perform reductions in numba using the
In addition, this example shows how to perform reductions in numba using the
:func:`cuda.reduce() <numba.cuda.Reduce>` API.

.. literalinclude:: ../../../numba_cuda/numba/cuda/tests/doc_examples/test_montecarlo.py
2 changes: 1 addition & 1 deletion docs/source/user/external-memory.rst
@@ -52,7 +52,7 @@ sections, using the :func:`~numba.cuda.defer_cleanup` context manager.
When an EMM Plugin is in use, the deallocation strategy is implemented by the
EMM, and Numba's internal deallocation mechanism is not used. The EMM
Plugin could implement:

- A similar strategy to the Numba deallocation behaviour, or
- Something more appropriate to the plugin - for example, deallocated memory
might immediately be returned to a memory pool.
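A toy sketch of the second strategy, a free list that returns deallocated blocks to a pool for reuse (the class and method names here are hypothetical and are not part of the EMM Plugin interface):

```python
class SimplePool:
    """Toy free-list allocator illustrating the pooling strategy:
    freed blocks are kept per-size and reused on the next request
    instead of being released back to the system."""

    def __init__(self):
        self.free = {}  # block size -> list of reusable blocks

    def allocate(self, size):
        blocks = self.free.get(size)
        if blocks:
            return blocks.pop()   # reuse a previously freed block
        return bytearray(size)    # otherwise "allocate" fresh memory

    def deallocate(self, block):
        # Returned to the pool immediately, not to the system.
        self.free.setdefault(len(block), []).append(block)

pool = SimplePool()
a = pool.allocate(256)
pool.deallocate(a)
b = pool.allocate(256)
assert b is a  # the freed block was reused
```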
2 changes: 0 additions & 2 deletions docs/source/user/intrinsics.rst
@@ -54,5 +54,3 @@ Multiple dimension arrays are supported by using a tuple of ints for the index::
result = np.zeros((3, 3, 3), dtype=np.float64)
max_example_3d[(2, 2, 2), (5, 5, 5)](result, arr)
print(result[0, 1, 2], '==', np.max(arr))

