20 changes: 1 addition & 19 deletions numba_cuda/numba/cuda/testing.py
@@ -32,10 +32,7 @@
@pytest.mark.usefixtures("initialize_from_pytest_config")
class CUDATestCase(TestCase):
"""
For tests that use a CUDA device. Test methods in a CUDATestCase must not
be run out of module order, because the ContextResettingTestCase may reset
the context and destroy resources used by a normal CUDATestCase if any of
its tests are run between tests from a CUDATestCase.
For tests that use a CUDA device.

Methods assertFileCheckAsm and assertFileCheckLLVM will inspect a
CUDADispatcher and assert that the compilation artifacts match the
@@ -187,21 +184,6 @@ def assertFileCheckMatches(
)


class ContextResettingTestCase(CUDATestCase):
"""
For tests where the context needs to be reset after each test. Typically
these inspect or modify parts of the context that would usually be expected
to be internal implementation details (such as the state of allocations and
deallocations, etc.).
"""

def tearDown(self):
super().tearDown()
from numba.cuda.cudadrv.devices import reset

reset()


def skip_on_cudasim(reason):
"""Skip this test if running on the CUDA simulator"""
return unittest.skipIf(config.ENABLE_CUDASIM, reason)
11 changes: 3 additions & 8 deletions numba_cuda/numba/cuda/tests/cudadrv/test_context_stack.py
@@ -14,12 +14,7 @@ class TestContextStack(CUDATestCase):
def setUp(self):
super().setUp()
# Reset before testing
cuda.close()

def test_gpus_current(self):
Contributor: This test seems to rely on there being no active context at all, which is hard to guarantee without the hard cuda.close().

self.assertIs(cuda.gpus.current, None)
with cuda.gpus[0]:
self.assertEqual(int(cuda.gpus.current.id), 0)
cuda.current_context().reset()
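
For reference, the two teardown styles this PR swaps between, as a hedged sketch (behavior as summarized in the review discussion, not a definitive statement of the driver semantics):

    from numba import cuda

    # Hard teardown: destroys the current thread's context entirely;
    # afterwards cuda.gpus.current is None until a new context is created.
    cuda.close()

    # Soft teardown: keeps the context alive but releases everything it
    # allocated (device memory, loaded modules, pending deallocations).
    cuda.current_context().reset()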

def test_gpus_len(self):
self.assertGreater(len(cuda.gpus), 0)
@@ -45,7 +40,7 @@ def test_gpus_cudevice_indexing(self):
class TestContextAPI(CUDATestCase):
def tearDown(self):
super().tearDown()
cuda.close()
cuda.current_context().reset()

def test_context_memory(self):
try:
@@ -91,7 +86,7 @@ def switch_gpu():
class Test3rdPartyContext(CUDATestCase):
def tearDown(self):
super().tearDown()
cuda.close()
cuda.current_context().reset()

def test_attached_primary(self, extra_work=lambda: None):
# Emulate primary context creation by 3rd party
9 changes: 5 additions & 4 deletions numba_cuda/numba/cuda/tests/cudadrv/test_cuda_memory.py
@@ -6,17 +6,18 @@
import numpy as np

from numba.cuda.cudadrv import driver, drvapi, devices
from numba.cuda.testing import unittest, ContextResettingTestCase
from numba.cuda.testing import unittest, CUDATestCase
from numba.cuda.testing import skip_on_cudasim


@skip_on_cudasim("CUDA Memory API unsupported in the simulator")
class TestCudaMemory(ContextResettingTestCase):
class TestCudaMemory(CUDATestCase):
def setUp(self):
super().setUp()
self.context = devices.get_context()

def tearDown(self):
self.context.reset()
Contributor: cc @gmarkall I think this should clear things safely; it appears to free the resources within the context, but is less scorched-earth than a full reset.

Contributor Author: Does this mean it will no longer invalidate assumptions made by cuda-core?

Contributor: @gmarkall If you are asking whether cuda.core protects against the context being reset in some way, the answer is no. Resetting the context would 100% leave any existing resources dangling and would probably result in segfaults.

Contributor: The name of this context method, reset, is perhaps a little misleading. Step by step, this method:

1. Calls reset on the context's MemoryManager, in this case a NumbaCUDAMemoryManager instance whose implementation is inherited from HostOnlyCUDAMemoryManager. This in turn calls clear on two objects attached to the instance: first the allocations, then the deallocations. TL;DR: clearing the allocations member has the effect of adding everything the context allocated to its pending deallocs, and deallocations.clear() then eventually calls each dtor, which maps to a true CUDA free through the driver.
2. Does the same for all modules (terminating in cuModuleUnload instead).
3. Does the same for anything left in its own _PendingDeallocs list.

My reading is that the net effect is that all resources owned by this context object are released, but it only releases resources it itself allocated. This is distinct from the full GPU reset function called by ContextResettingTestCase, which destroys all contexts on the device. If cuda.core has cached contexts that it interacts with internally after that is called, I can imagine things going wrong... but this particular kind of reset should just clean up objects that are being carefully tracked.

Contributor Author: Thanks for clarifying - this makes sense.

del self.context
super(TestCudaMemory, self).tearDown()
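
A rough sketch of the reset flow described in the thread above (simplified pseudocode; the names follow the description, not the actual driver sources):

    class PendingDeallocs:
        """Sketch of _PendingDeallocs: collects (dtor, handle) pairs and
        runs them in a batch when cleared."""
        def __init__(self):
            self._items = []

        def add_item(self, dtor, handle):
            self._items.append((dtor, handle))

        def clear(self):
            while self._items:
                dtor, handle = self._items.pop()
                dtor(handle)  # in the driver this ends in cuMemFree etc.

    # Step 1 of reset(): clearing `allocations` moves every live allocation
    # onto the pending list, and draining the list issues the real frees:
    pending = PendingDeallocs()
    pending.add_item(lambda h: print(f"freeing {h:#x}"), 0xDEAD)
    pending.clear()  # prints "freeing 0xdead"
    # Steps 2 and 3 repeat the same pattern for loaded modules
    # (cuModuleUnload) and for the context's own _PendingDeallocs.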

@@ -107,7 +108,7 @@ def dtor():
self.assertEqual(dtor_invoked[0], 2)


class TestCudaMemoryFunctions(ContextResettingTestCase):
class TestCudaMemoryFunctions(CUDATestCase):
def setUp(self):
super().setUp()
self.context = devices.get_context()
@@ -153,7 +154,7 @@ def test_d2d(self):


@skip_on_cudasim("CUDA Memory API unsupported in the simulator")
class TestMVExtent(ContextResettingTestCase):
class TestMVExtent(CUDATestCase):
def test_c_contiguous_array(self):
ary = np.arange(100)
arysz = ary.dtype.itemsize * ary.size
13 changes: 8 additions & 5 deletions numba_cuda/numba/cuda/tests/cudadrv/test_emm_plugins.py
@@ -112,14 +112,16 @@ class TestDeviceOnlyEMMPlugin(CUDATestCase):
def setUp(self):
super().setUp()
# Always start afresh with a new context and memory manager
cuda.close()
cuda.set_memory_manager(DeviceOnlyEMMPlugin)
ctx = cuda.current_context()
ctx.reset()
self._initial_memory_manager = ctx.memory_manager
ctx.memory_manager = DeviceOnlyEMMPlugin(context=ctx)

def tearDown(self):
super().tearDown()
# Unset the memory manager for subsequent tests
cuda.close()
cuda.cudadrv.driver._memory_manager = None
ctx = cuda.current_context()
ctx.reset()
ctx.memory_manager = self._initial_memory_manager

def test_memalloc(self):
mgr = cuda.current_context().memory_manager
@@ -129,6 +131,7 @@ def test_memalloc(self):
arr_1 = np.arange(10)
d_arr_1 = cuda.device_array_like(arr_1)
self.assertTrue(mgr.memalloc_called)

self.assertEqual(mgr.count, 1)
self.assertEqual(mgr.allocations[1], arr_1.nbytes)

7 changes: 5 additions & 2 deletions numba_cuda/numba/cuda/tests/cudadrv/test_host_alloc.py
@@ -4,10 +4,13 @@
import numpy as np
from numba.cuda.cudadrv import driver
from numba import cuda
from numba.cuda.testing import unittest, ContextResettingTestCase
from numba.cuda.testing import unittest, CUDATestCase


class TestHostAlloc(ContextResettingTestCase):
class TestHostAlloc(CUDATestCase):
Contributor: A lot of these are fairly old tests from the upstream Numba code base, where it looks like many tests simply reset the device every time in an effort to be extra safe and start with a clean slate. That may have made sense at the time, but might not anymore.

def tearDown(self):
cuda.current_context().reset()

def test_host_alloc_driver(self):
n = 32
mem = cuda.current_context().memhostalloc(n, mapped=True)
4 changes: 2 additions & 2 deletions numba_cuda/numba/cuda/tests/cudadrv/test_inline_ptx.py
@@ -4,12 +4,12 @@
from llvmlite import ir

from numba.cuda.cudadrv import nvvm
from numba.cuda.testing import unittest, ContextResettingTestCase
from numba.cuda.testing import unittest, CUDATestCase
from numba.cuda.testing import skip_on_cudasim


@skip_on_cudasim("Inline PTX cannot be used in the simulator")
class TestCudaInlineAsm(ContextResettingTestCase):
class TestCudaInlineAsm(CUDATestCase):
def test_inline_rsqrt(self):
mod = ir.Module(__name__)
mod.triple = "nvptx64-nvidia-cuda"
8 changes: 6 additions & 2 deletions numba_cuda/numba/cuda/tests/cudadrv/test_managed_alloc.py
@@ -5,15 +5,19 @@
from ctypes import byref, c_size_t
from numba.cuda.cudadrv.driver import device_memset, driver, USE_NV_BINDING
from numba import cuda
from numba.cuda.testing import unittest, ContextResettingTestCase
from numba.cuda.testing import unittest, CUDATestCase
from numba.cuda.testing import skip_on_cudasim, skip_on_arm
from numba.cuda.tests.support import linux_only


@skip_on_cudasim("CUDA Driver API unsupported in the simulator")
@linux_only
@skip_on_arm("Managed Alloc support is experimental/untested on ARM")
class TestManagedAlloc(ContextResettingTestCase):
class TestManagedAlloc(CUDATestCase):
def tearDown(self):
super().tearDown()
cuda.current_context().reset()

def get_total_gpu_memory(self):
# We use a driver function to directly get the total GPU memory because
# an EMM plugin may report something different (or not implement
3 changes: 1 addition & 2 deletions numba_cuda/numba/cuda/tests/cudadrv/test_module_callbacks.py
@@ -11,7 +11,6 @@
from numba.cuda.cudadrv.linkable_code import CUSource
from numba.cuda.testing import (
CUDATestCase,
ContextResettingTestCase,
skip_on_cudasim,
)

@@ -42,7 +41,7 @@ def get_hashable_handle_value(handle):


@skip_on_cudasim("Module loading not implemented in the simulator")
class TestModuleCallbacksBasic(ContextResettingTestCase):
class TestModuleCallbacksBasic(CUDATestCase):
Contributor: Connected with @isVoid about these; they should be safe to remove.

Contributor Author: One of the reasons I left this was that I hadn't dug into why it was used, and it wasn't obvious to me. Can you share the reasoning, please?

Contributor: I believe it is superfluous. context.reset should have the effect of calling the finalizer on any modules.
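
One way to see that claim in action (hedged illustration; the trivial kernel is invented for the example):

    import numpy as np
    from numba import cuda

    @cuda.jit
    def noop(x):
        pass

    noop[1, 1](cuda.to_device(np.zeros(1)))  # compiling and launching loads a
                                             # module into the current context
    cuda.current_context().reset()           # unloads the context's modules, so
                                             # their finalizers/teardowns run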

def test_basic(self):
counter = 0

4 changes: 2 additions & 2 deletions numba_cuda/numba/cuda/tests/cudadrv/test_pinned.py
@@ -5,10 +5,10 @@
import platform

from numba import cuda
from numba.cuda.testing import unittest, ContextResettingTestCase
from numba.cuda.testing import unittest, CUDATestCase


class TestPinned(ContextResettingTestCase):
class TestPinned(CUDATestCase):
def _run_copies(self, A):
A0 = np.copy(A)

4 changes: 2 additions & 2 deletions numba_cuda/numba/cuda/tests/cudadrv/test_profiler.py
@@ -2,13 +2,13 @@
# SPDX-License-Identifier: BSD-2-Clause

import unittest
from numba.cuda.testing import ContextResettingTestCase
from numba.cuda.testing import CUDATestCase
from numba import cuda
from numba.cuda.testing import skip_on_cudasim


@skip_on_cudasim("CUDA Profiler unsupported in the simulator")
class TestProfiler(ContextResettingTestCase):
class TestProfiler(CUDATestCase):
def test_profiling(self):
with cuda.profiling():
a = cuda.device_array(10)
4 changes: 2 additions & 2 deletions numba_cuda/numba/cuda/tests/cudadrv/test_reset_device.py
@@ -4,11 +4,11 @@
import threading
from numba import cuda
from numba.cuda.cudadrv.driver import driver
from numba.cuda.testing import unittest, ContextResettingTestCase
from numba.cuda.testing import unittest, CUDATestCase
from queue import Queue


class TestResetDevice(ContextResettingTestCase):
class TestResetDevice(CUDATestCase):
Contributor Author: The comment (and code) below suggest that the context on the main thread is unaffected (the tests create new threads), so I think there should be no need to reset the context again after the test has run.
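
To illustrate the reasoning: contexts in numba.cuda are per-thread, so tearing one down inside a worker thread leaves the main thread's context alone (a hedged sketch, modeled on the test below):

    import threading
    from queue import Queue
    from numba import cuda

    def newthread(exception_queue):
        try:
            cuda.select_device(0)
            cuda.close()  # tears down this worker thread's context only
        except Exception as e:
            exception_queue.put(e)

    exceptions = Queue()
    t = threading.Thread(target=newthread, args=(exceptions,))
    t.start()
    t.join()
    assert exceptions.empty()  # the main thread's context is still usable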

def test_reset_device(self):
def newthread(exception_queue):
try:
6 changes: 3 additions & 3 deletions numba_cuda/numba/cuda/tests/cudadrv/test_select_device.py
@@ -9,7 +9,7 @@

import numpy as np
from numba import cuda
from numba.cuda.testing import unittest, ContextResettingTestCase
from numba.cuda.testing import unittest, CUDATestCase


def newthread(exception_queue):
@@ -21,12 +21,12 @@ def newthread(exception_queue):
stream.synchronize()
del dA
del stream
cuda.close()
cuda.synchronize()
Contributor Author: I think the intention of cuda.close() was to make sure any errors from asynchronous operations were detected; cuda.synchronize() should be sufficient for that purpose.

except Exception as e:
exception_queue.put(e)
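
To illustrate why the synchronize is enough (hedged; the kernel here is invented for the example): kernel launches are asynchronous, and deferred driver errors surface at the next synchronization point, so no context teardown is needed just to flush them.

    import numpy as np
    from numba import cuda

    @cuda.jit
    def add_one(a):
        i = cuda.grid(1)
        if i < a.size:
            a[i] += 1

    d_a = cuda.to_device(np.zeros(32))
    add_one[1, 32](d_a)  # returns immediately; the kernel runs asynchronously
    cuda.synchronize()   # blocks until done; errors from the launch raise here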


class TestSelectDevice(ContextResettingTestCase):
class TestSelectDevice(CUDATestCase):
def test_select_device(self):
exception_queue = Queue()
for i in range(10):
numba_cuda/numba/cuda/tests/cudapy/test_cuda_array_interface.py
@@ -5,14 +5,14 @@

from numba import vectorize, guvectorize
from numba import cuda
from numba.cuda.testing import unittest, ContextResettingTestCase, ForeignArray
from numba.cuda.testing import unittest, CUDATestCase, ForeignArray
from numba.cuda.testing import skip_on_cudasim, skip_if_external_memmgr
from numba.cuda.tests.support import linux_only, override_config
from unittest.mock import call, patch


@skip_on_cudasim("CUDA Array Interface is not supported in the simulator")
class TestCudaArrayInterface(ContextResettingTestCase):
class TestCudaArrayInterface(CUDATestCase):
Contributor Author: I don't think context resets are needed here. Some tests check the list of pending deallocations and flush it, but that should not require a context reset.
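
A sketch of the flush-without-reset alternative mentioned above (this assumes the context's deallocations list, an internal detail that tests in this suite poke at):

    import numpy as np
    from numba import cuda

    ctx = cuda.current_context()
    d_arr = cuda.to_device(np.arange(10))
    del d_arr                  # the underlying free may be queued, not immediate
    ctx.deallocations.clear()  # run any pending frees now; no context reset needed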

def assertPointersEqual(self, a, b):
self.assertEqual(
a.device_ctypes_pointer.value, b.device_ctypes_pointer.value
8 changes: 4 additions & 4 deletions numba_cuda/numba/cuda/tests/cudapy/test_ipc.py
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@
skip_on_cudasim,
skip_under_cuda_memcheck,
skip_on_wsl2,
ContextResettingTestCase,
CUDATestCase,
ForeignArray,
)
from numba.cuda.tests.support import linux_only, windows_only
@@ -95,7 +95,7 @@ def ipc_array_test(ipcarr, result_queue):
@skip_on_cudasim("Ipc not available in CUDASIM")
@skip_on_arm("CUDA IPC not supported on ARM in Numba")
@skip_on_wsl2("CUDA IPC unreliable on WSL2; skipping IPC tests")
class TestIpcMemory(ContextResettingTestCase):
class TestIpcMemory(CUDATestCase):
def test_ipc_handle(self):
# prepare data for IPC
arr = np.arange(10, dtype=np.intp)
@@ -264,7 +264,7 @@ def staged_ipc_array_test(ipcarr, device_num, result_queue):
@skip_on_cudasim("Ipc not available in CUDASIM")
@skip_on_arm("CUDA IPC not supported on ARM in Numba")
@skip_on_wsl2("CUDA IPC unreliable on WSL2; skipping IPC tests")
class TestIpcStaged(ContextResettingTestCase):
class TestIpcStaged(CUDATestCase):
def test_staged(self):
# prepare data for IPC
arr = np.arange(10, dtype=np.intp)
@@ -324,7 +324,7 @@ def test_ipc_array(self):

@windows_only
@skip_on_cudasim("Ipc not available in CUDASIM")
class TestIpcNotSupported(ContextResettingTestCase):
class TestIpcNotSupported(CUDATestCase):
def test_unsupported(self):
arr = np.arange(10, dtype=np.intp)
devarr = cuda.to_device(arr)