Fix bindings: consistency of contexts, streams, and events, similar to #295 (#296)
Conversation
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

/ok to test
It should not change based on which binding is in use.
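Assuming "it" here is the handle type a caller sees, this is the kind of binding-dependent branching the change is meant to eliminate. `handle_value` below is a purely hypothetical helper, not a Numba API; it assumes ctypes handles expose `.value` and that the NVIDIA binding's handle types convert with `int()`:

```python
import ctypes

def handle_value(handle):
    # Hypothetical helper showing the binding-dependent branching this
    # change aims to make unnecessary: previously the handle's type varied
    # with the active binding, so callers had to sniff it.
    if isinstance(handle, ctypes.c_void_p):
        # ctypes binding: the raw address lives in .value (None for NULL)
        return handle.value or 0
    # NVIDIA binding (cuda.bindings): handle objects such as CUstream
    # convert to the raw address with int()
    return int(handle)
```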
Force-pushed a8939c1 to cb37c69.

/ok to test
This seems like it's almost there. Locally I pass tests with the following patch on top of this:
```diff
diff --git a/numba_cuda/numba/cuda/cudadrv/driver.py b/numba_cuda/numba/cuda/cudadrv/driver.py
index 0dfe177..6b3d0af 100644
--- a/numba_cuda/numba/cuda/cudadrv/driver.py
+++ b/numba_cuda/numba/cuda/cudadrv/driver.py
@@ -1256,13 +1256,8 @@ class _PendingDeallocs(object):
             while self._cons:
                 [dtor, handle, size] = self._cons.popleft()
                 _logger.info("dealloc: %s %s bytes", dtor.__name__, size)
-                # The EMM plugin interface uses CUdeviceptr instances when the
-                # NVIDIA binding is enabled, but the other interfaces use ctypes
-                # objects, so we have to check what kind of object we have here.
-                # binding_types = (binding.CUdeviceptr, binding.CUmodule)
-                # if USE_NV_BINDING and not isinstance(handle, binding_types):
-                #     handle = handle.value
                 dtor(handle)
+
             self._size = 0
 
 
 @contextlib.contextmanager
@@ -1789,14 +1784,14 @@ def _pin_finalizer(memory_manager, ptr, alloc_key, mapped):
 
 def _event_finalizer(deallocs, handle):
     def core():
-        deallocs.add_item(driver.cuEventDestroy, handle)
+        deallocs.add_item(driver.cuEventDestroy, handle.value)
 
     return core
 
 
 def _stream_finalizer(deallocs, handle):
     def core():
-        deallocs.add_item(driver.cuStreamDestroy, handle)
+        deallocs.add_item(driver.cuStreamDestroy, handle.value)
 
     return core
@@ -2406,7 +2401,7 @@ class Stream(object):
             stream_callback = binding.CUstreamCallback(ptr)
             # The callback needs to receive a pointer to the data PyObject
             data = id(data)
-            handle = int(self.handle)
+            handle = self.handle.value
         else:
             stream_callback = self._stream_callback
             handle = self.handle
diff --git a/numba_cuda/numba/cuda/memory_management/nrt.py b/numba_cuda/numba/cuda/memory_management/nrt.py
index d6e4f53..1dff61d 100644
--- a/numba_cuda/numba/cuda/memory_management/nrt.py
+++ b/numba_cuda/numba/cuda/memory_management/nrt.py
@@ -143,7 +143,7 @@ class _Runtime:
             1,
             1,
             0,
-            stream.handle,
+            stream.handle.value,
             params,
             cooperative=False,
         )
diff --git a/numba_cuda/numba/cuda/tests/cudapy/test_stream_api.py b/numba_cuda/numba/cuda/tests/cudapy/test_stream_api.py
index 8367b46..c79bbfb 100644
--- a/numba_cuda/numba/cuda/tests/cudapy/test_stream_api.py
+++ b/numba_cuda/numba/cuda/tests/cudapy/test_stream_api.py
@@ -38,10 +38,7 @@ class TestStreamAPI(CUDATestCase):
         # We don't test synchronization on the stream because it's not a real
         # stream - we used a dummy pointer for testing the API, so we just
         # ensure that the stream handle matches the external stream pointer.
-        if config.CUDA_USE_NVIDIA_BINDING:
-            value = int(s.handle)
-        else:
-            value = s.handle.value
+        value = s.handle.value
         self.assertEqual(ptr, value)
 
     @skip_unless_cudasim("External streams are usable with hardware")
```
The APIs that create streams and events now always return ctypes objects, so moving the .value access into the finalizers should work, and it might be a little cleaner since you don't have to puzzle out each object's type during clear(). Beyond that, there's a small update in Stream.add_callback that assumes the stream handle is now a ctypes object, and similar changes in the tests and the NRT code.
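To make that pattern concrete, here's a minimal, self-contained sketch; the deque and `fake_event_destroy` below are stand-ins for Numba's `_PendingDeallocs` and the real `driver.cuEventDestroy`, not the actual implementation:

```python
import ctypes
from collections import deque

# Stand-in for the _PendingDeallocs queue.
pending = deque()

def fake_event_destroy(raw_handle):
    # Stand-in for driver.cuEventDestroy; only the call shape matters here.
    print(f"cuEventDestroy({raw_handle:#x})")

def event_finalizer(handle):
    # `handle` is a ctypes object; take .value once, at enqueue time...
    def core():
        pending.append((fake_event_destroy, handle.value))
    return core

def clear():
    # ...so the drain loop only ever sees plain ints and never needs a
    # binding-specific isinstance check on each queued handle.
    while pending:
        dtor, raw = pending.popleft()
        dtor(raw)

event_finalizer(ctypes.c_void_p(0xDEADBEEF))()
clear()  # prints: cuEventDestroy(0xdeadbeef)
```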
If you think this is a reasonable way forward I can push this and debug from there.
Force-pushed 5b30ebf to e61b61d.
/ok to test
@brandon-b-miller Thanks for the fixes! I think the current state / notes are:

So assuming this is also OK with RMM, then I think this completes the work we need to / will do in the near term to make the APIs between the two bindings consistent.
Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.
@brandon-b-miller This is OK with RMM, with the fix from @wence- in #317 (comment) (which I think is a deadlock caused by a combination of the behaviour of RMM and cuda-python). The IPC tests fail when using RMM and not using cuda-python, but I don't think that is an issue resulting from this PR.
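As background on why stream callbacks are prone to this: the CUDA driver documents that a callback registered through cuStreamAddCallback must not make any CUDA API calls, so blocking inside one on anything that itself touches CUDA can hang the stream. A rough usage sketch, assuming Numba's documented `Stream.add_callback(callback, arg)` signature, where the callback receives `(stream, status, arg)`:

```python
from numba import cuda

def on_complete(stream, status, arg):
    # Per the CUDA driver docs, do not make CUDA API calls in here and do
    # not block on anything that might; that is the deadlock the docstring
    # warnings added in #321 are about.
    print("stream work finished; status:", status, "arg:", arg)

s = cuda.stream()
# ... enqueue kernels or copies on s here ...
s.add_callback(on_complete, 42)
s.synchronize()
```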
/ok to test
- Add deadlock warnings to Stream.add_callback and Stream.async_done docstrings (#321)
- `MemoryPointer`: ensure `CUdeviceptr` used with NVIDIA binding (#328)
- Fix indexing GPUs with CUdevice object (#319)
- Fix bindings: consistency of contexts, streams, and events, similar to #295 (#296)
- Fix nvrtc resolution when CUDA_HOME env is set (#314)
This is an incomplete follow-on from #295. Putting this PR up so others can see the general idea of it.