Conversation

@rparolin (Collaborator) commented Sep 29, 2025

Skip IPC mempool tests on WSL

  • Skip IPC mempool tests on WSL using pytest.mark.skipif(IS_WSL, ...).

On WSL2, cuMemPoolCreate with POSIX handle returns CUDA_ERROR_INVALID_VALUE despite capability flags indicating support. This is a platform/driver limitation, not a bug in our code.

I confirmed the failures are driver-related rather than a bug in our code. The host is WSL2 (kernel “microsoft-standard-WSL2”) with NVIDIA driver 581.15 (CUDA 13.0). Device attributes report mempools and POSIX FD handle support, yet a minimal, direct driver repro calling cuMemPoolCreate with handleTypes=CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR consistently returns CUDA_ERROR_INVALID_VALUE, while the same call with handleTypes=CU_MEM_HANDLE_TYPE_NONE succeeds.

# Minimal direct-driver repro of IPC mempool creation behavior

try:
    from cuda.bindings import driver
except ImportError:
    from cuda import cuda as driver

driver.cuInit(0)

loc = driver.CUmemLocation()
loc.type = driver.CUmemLocationType.CU_MEM_LOCATION_TYPE_DEVICE
loc.id = 0

def create_pool(handle_type):
    props = driver.CUmemPoolProps()
    props.allocType = driver.CUmemAllocationType.CU_MEM_ALLOCATION_TYPE_PINNED
    props.handleTypes = handle_type
    props.location = loc
    props.maxSize = 2_097_152  # 2 MiB
    props.win32SecurityAttributes = 0
    props.usage = 0
    # cuda-python returns a (CUresult, result) tuple from each driver call
    return driver.cuMemPoolCreate(props)

print("POSIX_FD:", create_pool(driver.CUmemAllocationHandleType.CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR)[0].name)
print("NONE    :", create_pool(driver.CUmemAllocationHandleType.CU_MEM_HANDLE_TYPE_NONE)[0].name)

Output:

POSIX_FD: CUDA_ERROR_INVALID_VALUE
NONE    : CUDA_SUCCESS

Because this reproduces outside our code with the same CUmemPoolProps we set, the issue lies in the driver/runtime path under WSL2 (consistent with known IPC limitations there), not our implementation. Therefore, we skip IPC mempool tests on WSL to keep the suite portable and green, while leaving the tests enabled on native Linux and other environments where the driver accepts IPC pools.

@copy-pr-bot (Contributor) commented Sep 29, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@rparolin rparolin marked this pull request as draft September 29, 2025 15:19
@rparolin rparolin requested a review from leofang September 29, 2025 15:42
@rparolin (Collaborator, Author) commented Sep 29, 2025

Wow, you both jumped on this PR fast. LOL. I had to double check to make sure I didn't mark this as ready by mistake.

@rparolin rparolin requested a review from cpcloud September 29, 2025 15:56
@leofang (Member) left a comment


In general we avoid skipping tests in cuda.core based only on the platform or other hard-coded conditions. What we should do is query the driver and see whether a functionality is supported. The reason is that the driver can be updated to gain new capabilities, and hard-coded checks like this would not keep up.

@rparolin could you check what the returned values are on WSL?

>>> dev = Device()
>>> dev.properties.mempool_supported_handle_types
9       
>>> dev.properties.handle_type_posix_file_descriptor_supported
True
>>> dev.properties.handle_type_win32_handle_supported
False
>>> dev.properties.handle_type_win32_kmt_handle_supported
False

It could be that our existing skip condition is not sufficient:

@pytest.fixture(scope="function")
def ipc_device():
    """Obtains a device suitable for IPC-enabled mempool tests, or skips."""
    # Check if IPC is supported on this platform/device
    device = Device()
    device.set_current()
    if not device.properties.memory_pools_supported:
        pytest.skip("Device does not support mempool operations")
    # Note: Linux specific. Once Windows support for IPC is implemented, this
    # test should be updated.
    if not device.properties.handle_type_posix_file_descriptor_supported:
        pytest.skip("Device does not support IPC")
    return device

cpcloud previously approved these changes Sep 29, 2025
@rwgk (Collaborator) commented Sep 29, 2025

> could you check what the returned values are on WSL?

This is what I'm seeing on my Windows 11 24H2 WSL2 Ubuntu 24.04 workstation:

>>> dev = Device()
>>> dev.properties.mempool_supported_handle_types
0
>>> dev.properties.handle_type_posix_file_descriptor_supported
True
>>> dev.properties.handle_type_win32_handle_supported
False
>>> dev.properties.handle_type_win32_kmt_handle_supported
False

Using current cuda-python main (at commit dbde2b4).

Ubuntu side:

(WslLocalCudaVenv) rwgk-win11.localdomain:~/forked/cuda-python $ nvidia-smi
Mon Sep 29 11:21:40 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.82.02              Driver Version: 581.15         CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX A6000               On  |   00000000:C1:00.0 Off |                  Off |
| 30%   29C    P8             20W /  300W |    1652MiB /  49140MiB |      3%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

Windows side:

PS C:\Users\rgrossekunst> nvidia-smi
Mon Sep 29 11:22:45 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 581.15                 Driver Version: 581.15         CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX A6000             WDDM  |   00000000:C1:00.0 Off |                  Off |
| 30%   31C    P8             42W /  300W |    1729MiB /  49140MiB |      3%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

@leofang (Member) commented Sep 29, 2025

>>> dev.properties.mempool_supported_handle_types
0
>>> dev.properties.handle_type_posix_file_descriptor_supported
True

mempool_supported_handle_types seems to be correct (it is a bitmask encoding the supported handle types), and handle_type_posix_file_descriptor_supported is where it went south. @Andy-Jost let's check whether we messed up the implementation of handle_type_posix_file_descriptor_supported, or whether the WSL driver is being inconsistent here.
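For illustration, a bitmask check like the one described could look as follows. The bit values are assumed to match the CUDA driver's CUmemAllocationHandleType enum (POSIX_FILE_DESCRIPTOR = 0x1, WIN32 = 0x2, WIN32_KMT = 0x4, FABRIC = 0x8), so the native-Linux mask of 9 above would decode as POSIX FD plus fabric handles, while the WSL2 mask of 0 would decode as no supported handle types:

```python
# Assumed CUmemAllocationHandleType bit values (from the CUDA driver API):
POSIX_FILE_DESCRIPTOR = 0x1
WIN32 = 0x2
WIN32_KMT = 0x4
FABRIC = 0x8


def handle_type_supported(mask: int, handle_type: int) -> bool:
    # A handle type is supported iff its bit is set in the mask.
    return bool(mask & handle_type)


# Native Linux report above: mask = 9 = POSIX_FILE_DESCRIPTOR | FABRIC
assert handle_type_supported(9, POSIX_FILE_DESCRIPTOR) is True
# WSL2 report above: mask = 0, so no handle type should be reported as supported
assert handle_type_supported(0, POSIX_FILE_DESCRIPTOR) is False
```

If handle_type_posix_file_descriptor_supported were derived from the mask this way, it could not return True when the mask is 0, which is why the two properties disagreeing points at the implementation or the driver.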

@rparolin (Collaborator, Author) commented

/ok to test 57ead02

@rparolin (Collaborator, Author) commented

/ok to test 0a9be40

@rparolin (Collaborator, Author) commented

/ok to test f7deec9

@rparolin (Collaborator, Author) commented Oct 8, 2025

/ok to test 90590ec

@leofang (Member) left a comment


@kkraus14 Sorry, but I think making this an installable package is very poor UX. It means every new contributor would stumble during their first test run and have to learn that there is a separate package that must be installed locally first.

Also, by doing so we need to touch the CI workflows so that it is installed before testing. We seem to be shooting ourselves in the foot without justifiable gain?

@rparolin (Collaborator, Author) commented Oct 8, 2025

> @kkraus14 Sorry, but I think making this an installable package is very poor UX. It means every new contributor would stumble during their first test run and have to learn that there is a separate package that must be installed locally first.
>
> Also, by doing so we need to touch the CI workflows so that it is installed before testing. We seem to be shooting ourselves in the foot without justifiable gain?

I had similar thoughts, but figured it could be mitigated by our script/run_tests.sh script, which can auto-install for you. My tendency is to have my build/dev environment tooling script away all the complexity so others don't step on landmines, but I haven't developed an intuition for what is [un]familiar to Python programmers yet.

@rparolin (Collaborator, Author) commented Oct 8, 2025

/ok to test cd6b714

@leofang (Member) commented Oct 8, 2025

The CI shows what I noted earlier:

> Also, by doing so we need to touch the CI workflows so that it is installed before testing.

@rparolin (Collaborator, Author) commented Oct 8, 2025

/ok to test baac405

@leofang (Member) commented Oct 8, 2025

/ok to test bbe82c8

@leofang (Member) commented Oct 8, 2025

/ok to test 7726e05

@rparolin rparolin enabled auto-merge (squash) October 8, 2025 22:15
@leofang leofang added this to the cuda.core beta 7 milestone Oct 8, 2025
@leofang (Member) commented Oct 8, 2025

I pushed commit bbe82c8 so that we don't get blocked by this package discussion. If the helper module is installed as a package, it's used. Otherwise, we find it via relative path. We can revisit this discussion later and find a way to avoid friction with local development and CI testing.

@kkraus14 (Collaborator) commented Oct 8, 2025

I agree that packaging this as a separate package is a poor developer UX. Maybe we should introduce a cuda.core.testing module or something similar?

Regardless, we can do that in a follow up PR.

@leofang leofang disabled auto-merge October 9, 2025 00:00
@leofang leofang merged commit 7028804 into main Oct 9, 2025
71 checks passed
@leofang leofang deleted the rparolin/skip_ipc_on_wsl branch October 9, 2025 00:01
@github-actions bot commented Oct 9, 2025

Doc Preview CI
Preview removed because the pull request was closed or merged.


Labels

  • cuda.core: Everything related to the cuda.core module
  • enhancement: Any code-related improvements
  • P1: Medium priority - Should do
  • test: Improvements or additions to tests

8 participants