@cpcloud
Contributor

@cpcloud cpcloud commented Dec 12, 2025

Add support for printing tuples of int64. This is useful for printing array shapes in kernels.

@copy-pr-bot

copy-pr-bot bot commented Dec 12, 2025

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@greptile-apps
Contributor

greptile-apps bot commented Dec 12, 2025

Greptile Overview

Greptile Summary

This PR adds support for printing UniTuple(int64, n) tuples in CUDA kernels, enabling convenient printing of array shapes and other int64 tuple data.

Key changes:

  • Implemented tuple_print_impl handler registered for types.UniTuple that formats tuples with parentheses and comma-separated values
  • Uses %lld format specifier appropriate for int64 elements, consistent with how other int64 values are printed
  • Follows the same pattern as the existing Dim3 print implementation (extracting values and building format string)
  • Includes proper error handling for unsupported tuple element types
  • Added comprehensive test case validating the output format

Implementation notes:

  • Only supports int64 element type; other types raise NotImplementedError with a clear message
  • The implementation correctly uses builder.extract_value() to extract individual tuple elements
  • Format string generation is dynamic based on tuple size, handling any number of elements including empty tuples
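The format-string construction described above can be sketched in plain Python. This is a host-side illustration only, not the code from the PR: the helper names `build_tuple_format` and `render` are hypothetical, and the exact output for single-element tuples (for example, whether a trailing comma is printed) is an assumption that the summary does not confirm.

```python
def build_tuple_format(n: int) -> str:
    """Return a printf-style format string for a tuple of n int64 values.

    Hypothetical helper: one %lld specifier per element, comma-separated,
    wrapped in parentheses. An empty tuple yields "()".
    """
    if n < 0:
        raise ValueError("tuple length must be non-negative")
    return "(" + ", ".join(["%lld"] * n) + ")"


def render(values) -> str:
    """Host-side stand-in for the device-side printf call.

    Python's % operator does not accept the C "ll" length modifier,
    so %lld is swapped for %d before formatting.
    """
    fmt = build_tuple_format(len(values))
    return fmt.replace("%lld", "%d") % tuple(values)


if __name__ == "__main__":
    print(build_tuple_format(3))
    print(render((128, 64, 3)))
```

Because the tuple length is known at compile time, a string like this can be built once per call site and the extracted elements passed as the printf arguments.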

Confidence Score: 4/5

  • This PR is safe to merge with minimal risk
  • The implementation follows established patterns in the codebase (mirrors Dim3 handling), includes proper type checking and error handling, and has test coverage. The only minor issue is a style suggestion for improved error message clarity, which doesn't affect functionality.
  • No files require special attention

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| numba_cuda/numba/cuda/printimpl.py | 4/5 | Added print support for int64 UniTuples following the existing Dim3 pattern; minor issue with error message clarity |
| numba_cuda/numba/cuda/tests/cudapy/test_print.py | 5/5 | Added test case for printing int64 tuples with proper expected output validation |

Contributor

@greptile-apps greptile-apps bot left a comment

Additional Comments (1)

  1. numba_cuda/numba/cuda/printimpl.py, line 39 (link)

    style: Error message shows full tuple type instead of element type. Consider using ty.dtype instead of ty for clarity
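A small sketch of the difference this suggestion makes, using a stand-in class rather than Numba's real type objects. `FakeUniTuple` and `check_printable` are hypothetical names for illustration, and the message wording is an assumption, not the text used in the PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeUniTuple:
    """Hypothetical stand-in for a homogeneous tuple type (not Numba's class)."""
    dtype: str   # element type name, e.g. "int64"
    count: int   # number of elements

    def __str__(self) -> str:
        return f"UniTuple({self.dtype} x {self.count})"


def check_printable(ty: FakeUniTuple) -> None:
    """Raise if the tuple's element type is not int64.

    Reporting ty.dtype names the unsupported element type directly,
    instead of echoing the whole tuple type.
    """
    if ty.dtype != "int64":
        raise NotImplementedError(
            f"printing tuples of {ty.dtype} is not supported; "
            "only int64 elements are"
        )


if __name__ == "__main__":
    check_printable(FakeUniTuple("int64", 3))  # accepted, no error
    try:
        check_printable(FakeUniTuple("float32", 2))
    except NotImplementedError as exc:
        print(exc)
```

With the element type in the message, the reader sees which part of the tuple is unsupported without having to parse the full type string.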

2 files reviewed, 1 comment

@cpcloud
Contributor Author

cpcloud commented Dec 12, 2025

/ok to test

@gmarkall gmarkall merged commit a27070f into NVIDIA:main Dec 12, 2025
71 checks passed
@cpcloud cpcloud deleted the print-int64-tuples branch December 12, 2025 19:41
gmarkall added a commit to gmarkall/numba-cuda that referenced this pull request Dec 17, 2025
- Fix NVIDIA#624: Accept Numba IR nodes in all places Numba-CUDA IR nodes are expected (NVIDIA#643)
- Fix Issue NVIDIA#588: separate compilation of NVVM IR modules when generating debuginfo (NVIDIA#591)
- feat: allow printing nested tuples (NVIDIA#667)
- build(deps): bump actions/setup-python from 5.6.0 to 6.1.0 (NVIDIA#655)
- build(deps): bump actions/upload-artifact from 4 to 5 (NVIDIA#652)
- Test RAPIDS 25.12 (NVIDIA#661)
- Do not manually set DUMP_ASSEMBLY in `nvjitlink` tests (NVIDIA#662)
- feat: add print support for int64 tuples (NVIDIA#663)
- Only run dependabot monthly and open fewer PRs (NVIDIA#658)
- test: fix bogus `self` argument to `Context` (NVIDIA#656)
- Fix false negative NRT link decision when NRT was previously toggled on (NVIDIA#650)
- Add support for dependabot (NVIDIA#647)
- refactor: cull dead linker objects (NVIDIA#649)
- Migrate numba-cuda driver to use cuda.core.launch API (NVIDIA#609)
- feat: add set_shared_memory_carveout (NVIDIA#629)
- chore: bump version in pixi.toml (NVIDIA#641)
- refactor: remove devicearray code to reduce complexity (NVIDIA#600)
@gmarkall gmarkall mentioned this pull request Dec 17, 2025
gmarkall added a commit that referenced this pull request Dec 17, 2025
- Capture global device arrays in kernels and device functions (#666)
- Fix #624: Accept Numba IR nodes in all places Numba-CUDA IR nodes are expected (#643)
- Fix Issue #588: separate compilation of NVVM IR modules when generating debuginfo (#591)
- feat: allow printing nested tuples (#667)
- build(deps): bump actions/setup-python from 5.6.0 to 6.1.0 (#655)
- build(deps): bump actions/upload-artifact from 4 to 5 (#652)
- Test RAPIDS 25.12 (#661)
- Do not manually set DUMP_ASSEMBLY in `nvjitlink` tests (#662)
- feat: add print support for int64 tuples (#663)
- Only run dependabot monthly and open fewer PRs (#658)
- test: fix bogus `self` argument to `Context` (#656)
- Fix false negative NRT link decision when NRT was previously toggled on (#650)
- Add support for dependabot (#647)
- refactor: cull dead linker objects (#649)
- Migrate numba-cuda driver to use cuda.core.launch API (#609)
- feat: add set_shared_memory_carveout (#629)
- chore: bump version in pixi.toml (#641)
- refactor: remove devicearray code to reduce complexity (#600)
ZzEeKkAa added a commit to ZzEeKkAa/numba-cuda that referenced this pull request Jan 8, 2026
v0.23.0

- Capture global device arrays in kernels and device functions (NVIDIA#666)
- Fix NVIDIA#624: Accept Numba IR nodes in all places Numba-CUDA IR nodes are expected (NVIDIA#643)
- Fix Issue NVIDIA#588: separate compilation of NVVM IR modules when generating debuginfo (NVIDIA#591)
- feat: allow printing nested tuples (NVIDIA#667)
- build(deps): bump actions/setup-python from 5.6.0 to 6.1.0 (NVIDIA#655)
- build(deps): bump actions/upload-artifact from 4 to 5 (NVIDIA#652)
- Test RAPIDS 25.12 (NVIDIA#661)
- Do not manually set DUMP_ASSEMBLY in `nvjitlink` tests (NVIDIA#662)
- feat: add print support for int64 tuples (NVIDIA#663)
- Only run dependabot monthly and open fewer PRs (NVIDIA#658)
- test: fix bogus `self` argument to `Context` (NVIDIA#656)
- Fix false negative NRT link decision when NRT was previously toggled on (NVIDIA#650)
- Add support for dependabot (NVIDIA#647)
- refactor: cull dead linker objects (NVIDIA#649)
- Migrate numba-cuda driver to use cuda.core.launch API (NVIDIA#609)
- feat: add set_shared_memory_carveout (NVIDIA#629)
- chore: bump version in pixi.toml (NVIDIA#641)
- refactor: remove devicearray code to reduce complexity (NVIDIA#600)