Conversation

@brandon-b-miller brandon-b-miller commented Oct 24, 2025

Part of #471

  • Adds a DeprecatedNDArrayAPIWarning emitted from all user-facing functions for moving data around (cuda.to_device, driver.host_to_device, device_to_host, as well as as_cuda_array, is_cuda_array, etc.)
  • Separates the existing, now-deprecated APIs into internal non-warning versions and external warning versions
  • Adds a deprecation warning to the DeviceNDArray ctor
  • Adds DeviceNDArray._create_nowarn
  • Removes as many usages of the deprecated APIs as possible from the test suite in favor of CuPy arrays
  • Catches warnings in tests of the currently exposed, now-deprecated APIs
  • Where absolutely necessary, tests call the internal non-warning versions of the deprecated APIs
  • Reworks tests to avoid these APIs as much as possible


copy-pr-bot bot commented Oct 24, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


@functools.wraps(func)
def wrapper(*args, **kwargs):
    warnings.warn(
        f"{func.__name__} api is deprecated. Please prefer cupy for array functions",
Contributor

cupy arrays are much slower than DeviceNDArray because they require creating an external (i.e., non-numba-cuda-created) stream, so I'm not sure recommending them is what we should do right now.

I was thinking that we can keep the top-level APIs (device_array etc.) and replace their internals with StridedMemoryView or something similar, in an effort to allow folks to as-cheaply-as-possible construct arrays.
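For reference, the warning-emitting wrapper quoted at the top of this thread can be expanded into a runnable sketch. The warning class and decorator names follow the PR description; the to_device stand-in body is purely illustrative:

```python
import functools
import warnings


class DeprecatedDeviceArrayApiWarning(FutureWarning):
    """Warning category used by the deprecated device-array APIs."""


def deprecated_array_api(func):
    """Wrap a public API function so every call emits a deprecation warning."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{func.__name__} is deprecated. Please prefer CuPy for array functions",
            DeprecatedDeviceArrayApiWarning,
            stacklevel=2,
        )
        return func(*args, **kwargs)

    return wrapper


@deprecated_array_api
def to_device(ary):
    # Stand-in body; the real to_device copies a host array to the device.
    return ary
```

Using functools.wraps keeps the wrapped function's name and docstring, and stacklevel=2 points the warning at the caller rather than at the wrapper itself.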

Contributor

Here's the current state of the art:

(screenshot: benchmark results)

Contributor Author

I concur that a lightweight device-array-like container should exist; I'm just not sure that numba-cuda should necessarily be the library providing it publicly. I think we should nudge users away from using numba-cuda for such tasks, like moving data from host to device. That said, I'm open to suggestions on what we should recommend.

@gmarkall gmarkall added the 2 - In Progress Currently a work in progress label Oct 24, 2025
@rparolin rparolin added this to the next milestone Oct 24, 2025
@brandon-b-miller
Contributor Author

/ok to test

@brandon-b-miller brandon-b-miller marked this pull request as ready for review January 5, 2026 16:29

copy-pr-bot bot commented Jan 5, 2026

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.



greptile-apps bot commented Jan 5, 2026

Greptile Summary

This PR deprecates the DeviceNDArray class and all public APIs for host-side device array operations (to_device, device_array, as_cuda_array, etc.) in favor of CuPy, addressing issue #471.

Key Changes:

  • Introduces DeprecatedDeviceArrayApiWarning with @deprecated_array_api decorator for public APIs
  • Separates deprecated public APIs in api.py from internal non-warning implementations in _api.py
  • Adds DeviceNDArray._create_nowarn() factory method for internal use
  • Updates 17+ test files to use new DeprecatedDeviceArrayApiTest base class that suppresses warnings
  • Adds CuPy as test dependency and updates documentation with deprecation notices
  • Systematically replaces direct DeviceNDArray() constructor calls with _create_nowarn() throughout codebase
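The _create_nowarn() factory listed above can be illustrated with a minimal, self-contained sketch. The class below is a toy stand-in (the real DeviceNDArray takes shape, strides, dtype, and more), and the bypass-__init__ approach is an assumption about how such a factory might avoid the warning:

```python
import warnings


class DeprecatedDeviceArrayApiWarning(FutureWarning):
    pass


class DeviceNDArray:
    """Toy stand-in for the real class, which takes shape/strides/dtype."""

    def __init__(self, shape):
        warnings.warn(
            "DeviceNDArray is deprecated. Please prefer CuPy for array functions",
            DeprecatedDeviceArrayApiWarning,
            stacklevel=2,
        )
        self.shape = shape

    @classmethod
    def _create_nowarn(cls, shape):
        # Internal factory: construct the instance without going through
        # the warning-emitting __init__.
        obj = object.__new__(cls)
        obj.shape = shape
        return obj
```

Internal code paths call DeviceNDArray._create_nowarn(...) and never trigger the warning, while user code going through DeviceNDArray(...) still does.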

Issues Found:

  • reduction.py:262-264 introduces unnecessary complexity by converting already-sliceable device arrays through __cuda_array_interface__

Confidence Score: 4/5

  • Safe to merge with one logic issue in reduction.py that should be addressed
  • The deprecation infrastructure is well-designed with clear separation between public/internal APIs. Test coverage is comprehensive with proper warning suppression. However, the reduction.py change introduces unnecessary complexity that could impact performance and should be simplified before merging.
  • Pay close attention to numba_cuda/numba/cuda/kernels/reduction.py - the result handling logic was unnecessarily complicated

Important Files Changed

Filename Overview

  • numba_cuda/numba/cuda/cudadrv/devicearray.py - Adds deprecation infrastructure: DeprecatedDeviceArrayApiWarning, the deprecated_array_api decorator, the DeviceNDArray._create_nowarn() factory method; marks public methods (split, squeeze, view, get_ipc_handle) as deprecated
  • numba_cuda/numba/cuda/api.py - Wraps all public device array APIs (to_device, device_array, managed_array, pinned_array, mapped_array, as_cuda_array, is_cuda_array) with deprecation warnings, delegating to internal _api module implementations
  • numba_cuda/numba/cuda/_api.py - Introduces internal non-warning implementations (_from_cuda_array_interface, _as_cuda_array, _is_cuda_array, _to_device, _device_array, etc.) for use within the library without triggering deprecation warnings
  • numba_cuda/numba/cuda/testing.py - Adds the DeprecatedDeviceArrayApiTest base class, which automatically suppresses DeprecatedDeviceArrayApiWarning in setUp/tearDown for tests that need to use deprecated APIs
  • numba_cuda/numba/cuda/kernels/reduction.py - Replaces cuda.device_array() with _api._device_array() and modifies result handling to use _from_cuda_array_interface(); a complex change to slicing logic that may need verification
  • docs/source/user/memory.rst - Adds deprecation notes recommending CuPy for all device array operations, including memory transfers and pinned/mapped/managed memory

@greptile-apps greptile-apps bot left a comment

Additional Comments (15)

  1. numba_cuda/numba/cuda/kernels/transpose.py, line 24-25 (link)

    style: The deprecation message mentions 'transpose method' but this function is not a method - it's a standalone function. Consider rewording to 'transpose function and DeviceNDArray class are deprecated.'

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

  2. numba_cuda/numba/cuda/tests/cudadrv/test_profiler.py, line 19 (link)

    style: Array size changed from 100 to 10. Was this intentional, or should it remain 100 to preserve the original test's different allocation sizes?

  3. numba_cuda/numba/cuda/api.py, line 69-70 (link)

    logic: Dead code: line 70 is unreachable after the return statement on line 69

  4. numba_cuda/numba/cuda/vectorizers.py, line 121 (link)

    logic: This line still uses the deprecated cuda.as_cuda_array instead of _api._as_cuda_array like line 186

  5. numba_cuda/numba/cuda/tests/cudapy/test_random.py, line 22 (link)

    style: CuPy is imported but never used. Either remove the unused import or complete the migration to use CuPy arrays instead of deprecated DeviceNDArray APIs. Is this import intended for future work, or should the migration to CuPy arrays be completed in this PR?

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

  6. numba_cuda/numba/cuda/testing.py, line 197 (link)

    logic: Using warnings.resetwarnings() clears all warning filters, not just the ones added in setUp. This could affect other tests running in the same process. Should this use a more targeted approach to only reset the specific filter added in setUp?

  7. numba_cuda/numba/cuda/tests/doc_examples/test_reduction.py, line 73 (link)

    style: This line accesses a[0] directly on the GPU array, which works with CuPy but the assertion on line 77 uses a.get()[0]. Consider using a[0].get() here for consistency.

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

  8. numba_cuda/numba/cuda/_api.py, line 324-330 (link)

    logic: This function calls the public device_array instead of the internal _device_array, which will emit deprecation warnings when used internally

    Should this call _device_array to avoid deprecation warnings when used internally?

  9. numba_cuda/numba/cuda/_api.py, line 340-348 (link)

    logic: This function calls the public mapped_array instead of the internal _mapped_array, which will emit deprecation warnings when used internally

    Should this call _mapped_array to avoid deprecation warnings when used internally?

  10. numba_cuda/numba/cuda/_api.py, line 358-360 (link)

    logic: This function calls the public pinned_array instead of the internal _pinned_array, which will emit deprecation warnings when used internally

    Should this call _pinned_array to avoid deprecation warnings when used internally?

  11. numba_cuda/numba/cuda/tests/cudapy/test_multithreads.py, line 66 (link)

    style: inconsistent with migration goals - still uses deprecated cuda.to_device(). Are these methods intentionally testing the deprecated API, or should they also be migrated to CuPy?

  12. numba_cuda/numba/cuda/tests/cudapy/test_multithreads.py, line 75-76 (link)

    style: inconsistent with migration goals - still uses deprecated cuda.to_device(). Should these test methods also migrate to CuPy arrays for consistency?

  13. numba_cuda/numba/cuda/tests/doc_examples/test_globals.py, line 47 (link)

    logic: cp.asarray(5, dtype=np.float64) creates a scalar array with value 5, not a 5-element array. Should be cp.zeros(5, dtype=np.float64) to match original behavior.

  14. numba_cuda/numba/cuda/tests/cudapy/test_vectorize.py, line 89 (link)

    logic: Line 89 creates a CUDA stream that is never used - it's immediately overridden on line 94

  15. numba_cuda/numba/cuda/tests/cudapy/test_vectorize.py, line 94-96 (link)

    logic: Stream creation moved outside the loop but needs to be inside for proper isolation between test iterations. Should each iteration use a separate stream for proper test isolation?
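Regarding the cp.asarray(5, ...) pitfall in comment 13: CuPy mirrors NumPy's semantics here, so the difference can be demonstrated on the CPU with NumPy alone:

```python
import numpy as np

# asarray(5) wraps the scalar 5 in a 0-d array; it does NOT allocate
# five elements the way zeros(5) does. CuPy's cp.asarray/cp.zeros
# follow the same rule.
scalar = np.asarray(5, dtype=np.float64)
vector = np.zeros(5, dtype=np.float64)

assert scalar.shape == ()      # zero-dimensional, single value
assert vector.shape == (5,)    # five elements, all zero
```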

74 files reviewed, 15 comments


@brandon-b-miller
Copy link
Contributor Author

/ok to test

@greptile-apps greptile-apps bot left a comment

Additional Comments (3)

  1. numba_cuda/numba/cuda/cudadrv/devicearray.py, line 773-779 (link)

    logic: type(self)(...) will trigger the deprecation warning during internal operations

  2. numba_cuda/numba/cuda/_api.py, line 205-208 (link)

    logic: Internal _pinned_array should not emit deprecation warnings. Remove this warning since this is the internal implementation

  3. numba_cuda/numba/cuda/_api.py, line 241-244 (link)

    logic: Internal _mapped_array should not emit deprecation warnings. Remove this warning since this is the internal implementation

74 files reviewed, 3 comments


@cpcloud cpcloud left a comment

Couple of minor changes requested, but overall LGTM.

xoroshiro128p_normal_float64,
)

import cupy as cp
Contributor

Why is this import needed here? It doesn't appear to be used and nothing else was changed.

Comment on lines 278 to 279
setattr(driver, "cuMemcpyHtoD", raising_transfer)
setattr(driver, "cuMemcpyDtoH", raising_transfer)
Contributor

Why are we using setattr here? We know the attribute name, so there's no obvious reason to set this attribute dynamically.

Suggested change
- setattr(driver, "cuMemcpyHtoD", raising_transfer)
- setattr(driver, "cuMemcpyDtoH", raising_transfer)
+ driver.cuMemcpyHtoD = driver.cuMemcpyDtoH = raising_transfer

else:
    del driver.cuMemcpyHtoD
if self.old_DtoH is not None:
    setattr(driver, "cuMemcpyDtoH", self.old_DtoH)
Contributor

Suggested change
- setattr(driver, "cuMemcpyDtoH", self.old_DtoH)
+ driver.cuMemcpyDtoH = self.old_DtoH

old_DtoH = getattr(driver, "cuMemcpyDtoH", None)

def tearDown(self):
    if self.old_HtoD is not None:
        setattr(driver, "cuMemcpyHtoD", self.old_HtoD)
Contributor

Suggested change
- setattr(driver, "cuMemcpyHtoD", self.old_HtoD)
+ driver.cuMemcpyHtoD = self.old_HtoD

@greptile-apps greptile-apps bot left a comment

Additional Comments (10)

  1. numba_cuda/numba/cuda/tests/cudapy/test_random.py, line 22 (link)

    style: CuPy import is unused in this file. Is this import intended for future use, or should it be removed until needed?

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

  2. numba_cuda/numba/cuda/tests/doc_examples/test_laplace.py, line 67 (link)

    logic: This still calls copy_to_host() on a CuPy array, but CuPy arrays don't have this method - should use .get() instead

  3. numba_cuda/numba/cuda/_api.py, line 205-208 (link)

    logic: Internal function _pinned_array should not emit deprecation warnings - this defeats the purpose of having separate internal implementations

  4. numba_cuda/numba/cuda/_api.py, line 241-244 (link)

    logic: Internal function _mapped_array should not emit deprecation warnings - this defeats the purpose of having separate internal implementations

  5. numba_cuda/numba/cuda/tests/cudapy/test_gufunc.py, line 130 (link)

    logic: This cuda.device_array call should also be wrapped in the warning context since it's a deprecated API

  6. numba_cuda/numba/cuda/tests/cudapy/test_device_array_capture.py, line 23 (link)

    style: Function name and docstring are now misleading - creates CuPy array, not Numba device array

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

  7. numba_cuda/numba/cuda/tests/cudapy/test_device_array_capture.py, line 28 (link)

    style: Docstring refers to 'Numba device array' but now wraps CuPy array

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

  8. numba_cuda/numba/cuda/vectorizers.py, line 121 (link)

    style: Inconsistency: this uses the public cuda.as_cuda_array() while line 186 uses the internal _api._as_cuda_array() for the same operation

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

  9. numba_cuda/numba/cuda/tests/cudapy/test_gufunc_scalar.py, line 126 (link)

    syntax: Missing import for pytest module which is required for pytest.warns()

  10. numba_cuda/numba/cuda/cudadrv/devicearray.py, line 773-779 (link)

    logic: Using the deprecated constructor will trigger warnings during internal __setitem__ operations. Should use type(self)._create_nowarn() instead

74 files reviewed, 10 comments


@brandon-b-miller
Contributor Author

/ok to test

@brandon-b-miller
Contributor Author

/ok to test

@greptile-apps greptile-apps bot left a comment

74 files reviewed, no comments



greptile-apps bot commented Jan 5, 2026

Greptile's behavior is changing!

From now on, if a review finishes with no comments, we will not post an additional "statistics" comment to confirm that our review found nothing to comment on. However, you can confirm that we reviewed your changes in the status check section.

This feature can be toggled off in your Code Review Settings by deselecting "Create a status check for each PR".

@brandon-b-miller
Contributor Author

/ok to test

@greptile-apps greptile-apps bot left a comment

Additional Comments (5)

  1. numba_cuda/numba/cuda/tests/cudapy/test_transpose.py, line 9 (link)

    style: CuPy import is unused in this file

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

  2. numba_cuda/numba/cuda/tests/cudadrv/test_cuda_devicerecord.py, line 116 (link)

    style: Should use super().setUp() for consistency with line 42

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

  3. numba_cuda/numba/cuda/tests/cudapy/test_gufunc_scalar.py, line 58 (link)

    logic: inconsistent migration - dev_out2 still uses deprecated copy_to_host() method while dev_out1 was migrated to CuPy's .get(). Should be out2 = dev_out2.get()

  4. numba_cuda/numba/cuda/vectorizers.py, line 119 (link)

    style: Line uses public cuda._api._is_cuda_ndarray but should use internal _api._is_cuda_ndarray for consistency

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

  5. numba_cuda/numba/cuda/vectorizers.py, line 184 (link)

    style: Still using deprecated public cuda.cudadrv.devicearray.is_cuda_ndarray - should use _api._is_cuda_ndarray

74 files reviewed, 5 comments


@greptile-apps greptile-apps bot left a comment

Additional Comments (2)

  1. numba_cuda/numba/cuda/api.py, line 148-154 (link)

    style: duplicates _api._device_array() - consider importing from _api instead to reduce code duplication

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

  2. numba_cuda/numba/cuda/api.py, line 293-320 (link)

    style: duplicates _api._contiguous_strides_like_array() - consider importing from _api module

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

74 files reviewed, 2 comments


@brandon-b-miller
Contributor Author

/ok to test

@greptile-apps greptile-apps bot left a comment

Additional Comments (1)

  1. numba_cuda/numba/cuda/kernels/reduction.py, line 262-264 (link)

    logic: Unnecessary conversion here - res is already a device array with slicing support. The original res[:1].copy_to_device(partials[:1], stream=stream) was simpler and more efficient.

74 files reviewed, 1 comment


@cpcloud cpcloud left a comment

LGTM.



class TestPinned(CUDATestCase):
    # TODO
Contributor

non-blocking: is there a specific todo here?

Labels

2 - In Progress Currently a work in progress


4 participants