Improve dtype handling. #4004

tpn · 2025-03-04T03:05:57Z

This PR improves dtype handling for cuda.cooperative routines. Any Pythonic CUB routine that accepts a `dtype` parameter now takes numba types, numpy dtypes, and type-name strings. Per the normalize_dtype_param() docstring, the logic is as follows:

    - If the dtype is already a numba type, return it as is.
    - If the dtype is a valid numpy dtype, convert it to the corresponding
      numba type.  Note that this applies to both `np.int32` and
      `np.dtype(np.int32)`.
    - If the dtype is a string:
        - If there's a period in the string, ensure it's because the string
          starts with "np." and is followed by a valid numpy dtype.  Otherwise,
          raise a ValueError in both cases: if there's a period belonging to
          something other than the leading "np.", or if the following numpy
          type isn't valid.
        - If there's no period, assume the type is referring to a numba type.
          If not, raise a ValueError.  If it is, return the corresponding numba
          type.
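
The string branch of the rules above can be sketched without numpy or numba. This is only an illustration, not the code in `_common.py`: `classify_dtype_string`, `NUMPY_NAMES`, and `NUMBA_NAMES` are hypothetical names, and the two small sets stand in for the real numpy and numba namespaces.

```python
# Hypothetical lookup tables standing in for the real numpy and numba
# namespaces; the actual routine resolves names against those libraries.
NUMPY_NAMES = {"int32", "int64", "float32", "float64"}
NUMBA_NAMES = {"int32", "int64", "float32", "float64"}


def classify_dtype_string(dtype):
    """Return (namespace, type_name) for a dtype string or raise ValueError."""
    if "." in dtype:
        # A period is only allowed as part of a single leading "np." prefix.
        prefix, _, name = dtype.partition(".")
        if prefix != "np" or "." in name:
            raise ValueError(f"Invalid dtype string: {dtype!r}")
        # The part after "np." must name a valid numpy dtype.
        if name not in NUMPY_NAMES:
            raise ValueError(f"Invalid numpy dtype: {dtype!r}")
        return ("numpy", name)
    # No period: the string is assumed to name a numba type.
    if dtype not in NUMBA_NAMES:
        raise ValueError(f"Invalid numba type: {dtype!r}")
    return ("numba", dtype)
```

Under these assumptions, `classify_dtype_string("np.int32")` resolves in the numpy namespace and `classify_dtype_string("float64")` in the numba namespace, while `"numpy.int32"`, `"np.foo"`, and `"foo"` each raise a ValueError.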

This is for #3914.

github-actions · 2025-03-04T04:11:19Z

🟩 CI finished in 1h 00m: Pass: 100%/1 | Total: 1h 00m | Avg: 1h 00m | Max: 1h 00m

🟩 python: Pass: 100%/1 | Total: 1h 00m | Avg: 1h 00m | Max: 1h 00m

🟩 cpu
  🟩 amd64              Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
🟩 ctk
  🟩 12.8               Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
🟩 cudacxx
  🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
🟩 cudacxx_family
  🟩 nvcc               Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
🟩 cxx
  🟩 GCC13              Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
🟩 cxx_family
  🟩 GCC                Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
🟩 gpu
  🟩 rtx2080            Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
🟩 jobs
  🟩 Test               Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m

👃 Inspect Changes

Modifications in project?

	Project
	CCCL Infrastructure
	libcu++
	CUB
	Thrust
	CUDA Experimental
+/-	python
	CCCL C Parallel Library
	Catch2Helper

Modifications in project or dependencies?

	Project
	CCCL Infrastructure
	libcu++
	CUB
	Thrust
	CUDA Experimental
+/-	python
	CCCL C Parallel Library
	Catch2Helper

🏃‍ Runner counts (total jobs: 1)

#	Runner
1	`linux-amd64-gpu-rtx2080-latest-1`

brycelelbach

Looks good except for lowering int->int32 and float->float32

python/cuda_cooperative/cuda/cooperative/experimental/_common.py

python/cuda_cooperative/cuda/cooperative/experimental/block/_block_scan.py

python/cuda_cooperative/tests/test_common.py

github-actions · 2025-03-04T20:22:07Z

🟩 CI finished in 58m 48s: Pass: 100%/1 | Total: 58m 48s | Avg: 58m 48s | Max: 58m 48s

🟩 python: Pass: 100%/1 | Total: 58m 48s | Avg: 58m 48s | Max: 58m 48s

🟩 cpu
  🟩 amd64              Pass: 100%/1   | Total: 58m 48s | Avg: 58m 48s | Max: 58m 48s
🟩 ctk
  🟩 12.8               Pass: 100%/1   | Total: 58m 48s | Avg: 58m 48s | Max: 58m 48s
🟩 cudacxx
  🟩 nvcc12.8           Pass: 100%/1   | Total: 58m 48s | Avg: 58m 48s | Max: 58m 48s
🟩 cudacxx_family
  🟩 nvcc               Pass: 100%/1   | Total: 58m 48s | Avg: 58m 48s | Max: 58m 48s
🟩 cxx
  🟩 GCC13              Pass: 100%/1   | Total: 58m 48s | Avg: 58m 48s | Max: 58m 48s
🟩 cxx_family
  🟩 GCC                Pass: 100%/1   | Total: 58m 48s | Avg: 58m 48s | Max: 58m 48s
🟩 gpu
  🟩 rtx2080            Pass: 100%/1   | Total: 58m 48s | Avg: 58m 48s | Max: 58m 48s
🟩 jobs
  🟩 Test               Pass: 100%/1   | Total: 58m 48s | Avg: 58m 48s | Max: 58m 48s

👃 Inspect Changes

Modifications in project?

	Project
	CCCL Infrastructure
	libcu++
	CUB
	Thrust
	CUDA Experimental
+/-	python
	CCCL C Parallel Library
	Catch2Helper

Modifications in project or dependencies?

	Project
	CCCL Infrastructure
	libcu++
	CUB
	Thrust
	CUDA Experimental
+/-	python
	CCCL C Parallel Library
	Catch2Helper

🏃‍ Runner counts (total jobs: 1)

#	Runner
1	`linux-amd64-gpu-rtx2080-latest-1`

tpn self-assigned this Mar 4, 2025

tpn requested a review from a team as a code owner March 4, 2025 03:05

tpn requested review from rwgk, brycelelbach and leofang March 4, 2025 03:05

tpn force-pushed the 3914-accept-numpy-types-in-cuda-cooperative branch from c995236 to 1f2fa7d Compare March 4, 2025 03:09

brycelelbach requested changes Mar 4, 2025

tpn added 3 commits March 4, 2025 11:05

Implement normalize_dtype_param() routine.

5845fa5

Add supporting tests.

Use new normalize_dtype_param() routine.

f857106

PR feedback: remove support for native Python types.

cf2f269

tpn force-pushed the 3914-accept-numpy-types-in-cuda-cooperative branch from 1f2fa7d to cf2f269 Compare March 4, 2025 19:13

PR feedback: use @pytest.mark.parametrize().

c85b70a

tpn force-pushed the 3914-accept-numpy-types-in-cuda-cooperative branch from 4835992 to c85b70a Compare March 4, 2025 19:20

brycelelbach approved these changes Mar 4, 2025

rwgk approved these changes Mar 4, 2025

tpn merged commit e2d73f1 into NVIDIA:main Mar 4, 2025
20 of 23 checks passed

tpn deleted the 3914-accept-numpy-types-in-cuda-cooperative branch March 4, 2025 21:29

This was referenced Mar 4, 2025

Accept NumPy types in cuda.cooperative #3914

Closed

[BUG]: cuda.cooperative passes string dtype parameters through to C++ #3912

Closed
