
Improve dtype handling. #4004

Merged
tpn merged 4 commits into NVIDIA:main from 3914-accept-numpy-types-in-cuda-cooperative on Mar 4, 2025

Conversation

tpn
Contributor

@tpn tpn commented Mar 4, 2025

This PR improves dtype handling for the cuda.cooperative routines: any of the Pythonic CUB routines that accept a `dtype` parameter can now be given numba types, numpy dtypes, or strings naming either. Per the normalize_dtype_param() docstring, the logic is as follows:

    - If the dtype is already a numba type, return it as is.
    - If the dtype is a valid numpy dtype, convert it to the corresponding
      numba type.  Note that this applies to both `np.int32` and
      `np.dtype(np.int32)`.
    - If the dtype is a string:
        - If there's a period in the string, ensure it's because the string
          starts with "np." and is followed by a valid numpy dtype.  Otherwise,
          raise a ValueError in both cases: if there's a period belonging to
          something other than the leading "np.", or if the following numpy
          type isn't valid.
        - If there's no period, assume the type is referring to a numba type.
          If not, raise a ValueError.  If it is, return the corresponding numba
          type.

This is for #3914.

@tpn tpn self-assigned this Mar 4, 2025
@tpn tpn requested a review from a team as a code owner March 4, 2025 03:05
@tpn tpn requested review from rwgk, brycelelbach and leofang March 4, 2025 03:05
@tpn tpn force-pushed the 3914-accept-numpy-types-in-cuda-cooperative branch from c995236 to 1f2fa7d Compare March 4, 2025 03:09
Contributor

github-actions bot commented Mar 4, 2025

🟩 CI finished in 1h 00m: Pass: 100%/1 | Total: 1h 00m | Avg: 1h 00m | Max: 1h 00m
  • 🟩 python: Pass: 100%/1 | Total: 1h 00m | Avg: 1h 00m | Max: 1h 00m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
+/- python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
+/- python
CCCL C Parallel Library
Catch2Helper

🏃‍ Runner counts (total jobs: 1)

# Runner
1 linux-amd64-gpu-rtx2080-latest-1

Contributor

@brycelelbach brycelelbach left a comment

Looks good except for lowering int->int32 and float->float32

@tpn tpn force-pushed the 3914-accept-numpy-types-in-cuda-cooperative branch from 1f2fa7d to cf2f269 Compare March 4, 2025 19:13
@tpn tpn force-pushed the 3914-accept-numpy-types-in-cuda-cooperative branch from 4835992 to c85b70a Compare March 4, 2025 19:20
Contributor

github-actions bot commented Mar 4, 2025

🟩 CI finished in 58m 48s: Pass: 100%/1 | Total: 58m 48s | Avg: 58m 48s | Max: 58m 48s
  • 🟩 python: Pass: 100%/1 | Total: 58m 48s | Avg: 58m 48s | Max: 58m 48s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 58m 48s | Avg: 58m 48s | Max: 58m 48s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 58m 48s | Avg: 58m 48s | Max: 58m 48s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 58m 48s | Avg: 58m 48s | Max: 58m 48s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 58m 48s | Avg: 58m 48s | Max: 58m 48s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 58m 48s | Avg: 58m 48s | Max: 58m 48s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 58m 48s | Avg: 58m 48s | Max: 58m 48s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 58m 48s | Avg: 58m 48s | Max: 58m 48s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 58m 48s | Avg: 58m 48s | Max: 58m 48s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
+/- python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
+/- python
CCCL C Parallel Library
Catch2Helper

🏃‍ Runner counts (total jobs: 1)

# Runner
1 linux-amd64-gpu-rtx2080-latest-1

@tpn tpn merged commit e2d73f1 into NVIDIA:main Mar 4, 2025
20 of 23 checks passed
@tpn tpn deleted the 3914-accept-numpy-types-in-cuda-cooperative branch March 4, 2025 21:29
3 participants