Conversation

@gmarkall (Contributor)

This is an incomplete follow-on from #295. Putting this PR up so others can see the general idea of it.

@copy-pr-bot (bot) commented Jun 13, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


@gmarkall (Contributor, Author)

/ok to test

@gmarkall added the 2 - In Progress label on Jun 26, 2025
@gmarkall force-pushed the fix-bindings-streams-events branch from a8939c1 to cb37c69 on June 27, 2025 14:14
@gmarkall (Contributor, Author)

/ok to test

@brandon-b-miller (Contributor) left a comment

This seems like it's almost there. Locally, tests pass for me with the following patch on top of this:

diff --git a/numba_cuda/numba/cuda/cudadrv/driver.py b/numba_cuda/numba/cuda/cudadrv/driver.py
index 0dfe177..6b3d0af 100644
--- a/numba_cuda/numba/cuda/cudadrv/driver.py
+++ b/numba_cuda/numba/cuda/cudadrv/driver.py
@@ -1256,13 +1256,8 @@ class _PendingDeallocs(object):
             while self._cons:
                 [dtor, handle, size] = self._cons.popleft()
                 _logger.info("dealloc: %s %s bytes", dtor.__name__, size)
-                # The EMM plugin interface uses CUdeviceptr instances when the
-                # NVIDIA binding is enabled, but the other interfaces use ctypes
-                # objects, so we have to check what kind of object we have here.
-                # binding_types = (binding.CUdeviceptr, binding.CUmodule)
-                # if USE_NV_BINDING and not isinstance(handle, binding_types):
-                #    handle = handle.value
                 dtor(handle)
+
             self._size = 0
 
     @contextlib.contextmanager
@@ -1789,14 +1784,14 @@ def _pin_finalizer(memory_manager, ptr, alloc_key, mapped):
 
 def _event_finalizer(deallocs, handle):
     def core():
-        deallocs.add_item(driver.cuEventDestroy, handle)
+        deallocs.add_item(driver.cuEventDestroy, handle.value)
 
     return core
 
 
 def _stream_finalizer(deallocs, handle):
     def core():
-        deallocs.add_item(driver.cuStreamDestroy, handle)
+        deallocs.add_item(driver.cuStreamDestroy, handle.value)
 
     return core
 
@@ -2406,7 +2401,7 @@ class Stream(object):
             stream_callback = binding.CUstreamCallback(ptr)
             # The callback needs to receive a pointer to the data PyObject
             data = id(data)
-            handle = int(self.handle)
+            handle = self.handle.value
         else:
             stream_callback = self._stream_callback
             handle = self.handle
diff --git a/numba_cuda/numba/cuda/memory_management/nrt.py b/numba_cuda/numba/cuda/memory_management/nrt.py
index d6e4f53..1dff61d 100644
--- a/numba_cuda/numba/cuda/memory_management/nrt.py
+++ b/numba_cuda/numba/cuda/memory_management/nrt.py
@@ -143,7 +143,7 @@ class _Runtime:
             1,
             1,
             0,
-            stream.handle,
+            stream.handle.value,
             params,
             cooperative=False,
         )
diff --git a/numba_cuda/numba/cuda/tests/cudapy/test_stream_api.py b/numba_cuda/numba/cuda/tests/cudapy/test_stream_api.py
index 8367b46..c79bbfb 100644
--- a/numba_cuda/numba/cuda/tests/cudapy/test_stream_api.py
+++ b/numba_cuda/numba/cuda/tests/cudapy/test_stream_api.py
@@ -38,10 +38,7 @@ class TestStreamAPI(CUDATestCase):
         # We don't test synchronization on the stream because it's not a real
         # stream - we used a dummy pointer for testing the API, so we just
         # ensure that the stream handle matches the external stream pointer.
-        if config.CUDA_USE_NVIDIA_BINDING:
-            value = int(s.handle)
-        else:
-            value = s.handle.value
+        value = s.handle.value
         self.assertEqual(ptr, value)
 
     @skip_unless_cudasim("External streams are usable with hardware")

The APIs that create streams and events now always return ctypes objects, so moving the .value access to the finalizer should work and might be a little cleaner since you don't have to puzzle through each object during clear. Beyond that, there's a small update in Stream.add_callback that assumes the stream handle is now a ctypes object and similar changes in the tests and NRT code.
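As a minimal, self-contained sketch of that idea (toy names, not the numba-cuda code: PendingDeallocs stands in for _PendingDeallocs, and fake_stream_destroy stands in for driver.cuStreamDestroy):

import ctypes
from collections import deque


class PendingDeallocs:
    """Toy stand-in for _PendingDeallocs: queues (dtor, handle) pairs."""

    def __init__(self):
        self._cons = deque()

    def add_item(self, dtor, handle):
        self._cons.append((dtor, handle))

    def clear(self):
        # Every queued handle is already a plain int, so no type checks
        # are needed here.
        while self._cons:
            dtor, handle = self._cons.popleft()
            dtor(handle)


def fake_stream_destroy(handle):
    print(f"destroying stream 0x{handle:x}")


def make_stream_finalizer(deallocs, handle):
    # Take .value once, when the finalizer is created, so core() closes
    # over a plain int rather than a ctypes object.
    raw = handle.value

    def core():
        deallocs.add_item(fake_stream_destroy, raw)

    return core


deallocs = PendingDeallocs()
finalizer = make_stream_finalizer(deallocs, ctypes.c_void_p(0xDEADBEEF))
finalizer()
deallocs.clear()  # prints: destroying stream 0xdeadbeef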

If you think this is a reasonable way forward I can push this and debug from there.

@gmarkall force-pushed the fix-bindings-streams-events branch from 5b30ebf to e61b61d on July 3, 2025 12:06
@gmarkall (Contributor, Author) commented Jul 3, 2025

/ok to test

@gmarkall (Contributor, Author) commented Jul 3, 2025

@brandon-b-miller Thanks for the fixes! I think the current state / notes are:

  • I think your solution looks good and works for the Numba-CUDA internal cases.
  • I can't yet tell whether these changes will cause an issue for EMM plugins, so I'm going to try RMM with both bindings locally to investigate.
  • There are still some places where the test suite splits between the NVIDIA binding and the ctypes one (see the sketch after this comment):
    • The IPC tests (test_ipc.py). This is a legacy API and I'd be surprised if it's in use anywhere, so I think these tests can just be left as they are.
    • Some CUDA driver tests:
      • test_cuda_memory.py: This is because we haven't touched the EMM plugin interface, so that we don't break RMM.
      • test_context_stack.py: These either directly call driver APIs that need separate handling (test_attached_primary) or test internal parts of the API where we can't (and don't need to) maintain consistency between the two bindings (test_attached_non_primary).
      • test_managed_alloc.py: We're calling driver functions directly, so we need to handle the differences.
      • test_cuda_driver.py: These either use the driver's launch_kernel() function, an internal API that handles the bindings differently (test_cuda_driver_basic), or call the driver functions directly (test_cuda_driver_external_stream).

Assuming this is also OK with RMM, I think this completes the work we need to (and will) do in the near term to make the APIs between the two bindings consistent.
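For reference, a minimal sketch of the kind of per-binding handle normalisation those driver tests still need (hypothetical code, not from this PR: USE_NV_BINDING stands in for config.CUDA_USE_NVIDIA_BINDING, and it assumes the NVIDIA binding's handle wrappers are int-convertible, as the test diff above relies on):

import ctypes

USE_NV_BINDING = False  # stand-in for config.CUDA_USE_NVIDIA_BINDING


def raw_handle(handle):
    # The NVIDIA binding wraps handles in int-convertible objects
    # (e.g. binding.CUstream), while the ctypes binding carries the
    # pointer in .value; code calling driver APIs directly has to
    # normalise both forms to a plain int.
    if USE_NV_BINDING:
        return int(handle)
    return handle.value


print(hex(raw_handle(ctypes.c_void_p(0x1234))))  # prints: 0x1234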

@gmarkall gmarkall added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Jul 7, 2025
@gmarkall gmarkall marked this pull request as ready for review July 7, 2025 12:01
@copy-pr-bot (bot) commented Jul 7, 2025

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.


@gmarkall (Contributor, Author) commented Jul 7, 2025

@brandon-b-miller This is OK with RMM, given the fix from @wence- in #317 (comment) (which addresses what I think is a deadlock caused by a combination of the behaviour of RMM and cuda-python).

The IPC tests fail when using RMM and not using cuda-python, but I don't think that is an issue resulting from this PR.

@gmarkall (Contributor, Author) commented Jul 7, 2025

/ok to test

@gmarkall changed the title from "[WIP] Fix bindings: consistency of contexts, streams, and events, similar to #295" to "Fix bindings: consistency of contexts, streams, and events, similar to #295" on Jul 7, 2025
@gmarkall added the 5 - Ready to merge label and removed the 3 - Ready for Review label on Jul 7, 2025
@gmarkall merged commit 20a2e3b into NVIDIA:main on Jul 7, 2025 (39 checks passed)
gmarkall added a commit to gmarkall/numba-cuda that referenced this pull request on Jul 18, 2025:
- Add deadlock warnings to Stream.add_callback and Stream.async_done docstrings (NVIDIA#321)
- `MemoryPointer`: ensure `CUdeviceptr` used with NVIDIA binding (NVIDIA#328)
- Fix indexing GPUs with CUdevice object (NVIDIA#319)
- Fix bindings: consistency of contexts, streams, and events, similar to NVIDIA#295 (NVIDIA#296)
- Fix nvrtc resolution when CUDA_HOME env is set (NVIDIA#314)
@gmarkall mentioned this pull request on Jul 18, 2025
gmarkall added a commit that referenced this pull request on Jul 18, 2025 (same changelog as above)
atmnp pushed a commit to atmnp/numba-cuda that referenced this pull request on Jul 21, 2025 (same changelog as above)