Conversation

@cpcloud
Contributor

@cpcloud cpcloud commented Jan 6, 2026

Add "PERF" lints to the ruff config. For our current kernel launch benchmarks there is no change. Most of the changes here would not be incredibly impactful on their own.

@copy-pr-bot

copy-pr-bot bot commented Jan 6, 2026

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@cpcloud
Contributor Author

cpcloud commented Jan 6, 2026

/ok to test

@greptile-apps
Contributor

greptile-apps bot commented Jan 6, 2026

Greptile Summary

This PR adds PERF lints to the ruff configuration and applies the resulting auto-fixes across the codebase. The changes are primarily mechanical performance improvements.

Key Changes

  • Added extend-select = ["PERF"] to ruff configuration in pyproject.toml
  • Excluded test files and simulator code from PERF lints to avoid unnecessary churn
  • Applied automatic PERF lint fixes across 29 code files

Types of Performance Improvements

  • Converted loops with .append() to list/dict comprehensions (more Pythonic and efficient)
  • Replaced indexed loops with direct iteration (e.g., for item in items instead of for i in range(len(items)))
  • Changed loops with .append() to .extend() calls for batch operations
  • Used walrus operator (:=) to eliminate try/except AttributeError patterns
  • Removed unused loop variables (e.g., for k, v in dict.items() → for v in dict.values() when k is unused)
  • Added # noqa: PERF203 comments for legitimate try/except blocks inside loops that should be left as they are
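
The loop rewrites listed above can be sketched roughly as follows. This is a minimal illustration, not code from the PR: every function and variable name here is hypothetical, and the rule codes in the comments (PERF102, PERF401, PERF403) are the ruff rules these rewrites most likely correspond to.

```python
# Hypothetical sample data; none of these names come from the PR.
items = [3, 1, 4, 1, 5]
table = {"a": 1, "b": 2, "c": 3}


# Manual append loop -> list comprehension (ruff PERF401).
def squares_loop(values):
    out = []
    for v in values:
        out.append(v * v)
    return out


def squares_comprehension(values):
    return [v * v for v in values]


# Appending onto an existing list in a loop -> a single extend() call.
def add_evens_loop(dest, values):
    for v in values:
        if v % 2 == 0:
            dest.append(v)
    return dest


def add_evens_extend(dest, values):
    dest.extend(v for v in values if v % 2 == 0)
    return dest


# Building a dict in a loop -> dict comprehension (ruff PERF403).
def invert_loop(mapping):
    out = {}
    for k, v in mapping.items():
        out[v] = k
    return out


def invert_comprehension(mapping):
    return {v: k for k, v in mapping.items()}


# Iterating .items() with an unused key -> iterate .values() (ruff PERF102).
def total_items(mapping):
    total = 0
    for _k, v in mapping.items():
        total += v
    return total


def total_values(mapping):
    total = 0
    for v in mapping.values():
        total += v
    return total
```

Each pair returns the same result for the same input, which is why the rewrites can be treated as behavior-preserving.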

All changes are automated refactorings that preserve existing behavior while following Python performance best practices.

Confidence Score: 5/5

  • This PR is safe to merge - all changes are automated lint fixes that preserve behavior
  • The changes are entirely automated refactorings from enabling PERF lints. The transformations are well-established Python idioms (comprehensions, walrus operator, direct iteration) that preserve semantics. No logical changes were made, only syntactic improvements. The author appropriately added noqa comments where exception handling patterns should remain unchanged.
  • No files require special attention

Important Files Changed

| Filename | Overview |
|---|---|
| pyproject.toml | Added PERF lints to ruff config with exclusions for test and simulator files |
| numba_cuda/numba/cuda/core/controlflow.py | Replaced loop appends with extend() calls for better performance |
| numba_cuda/numba/cuda/cudadrv/driver.py | Used walrus operator to eliminate try/except AttributeError pattern, added noqa: PERF203 for legitimate exception handling |
| numba_cuda/numba/cuda/core/ir_utils.py | Converted multiple loops to dict/list comprehensions, replaced loop iteration over unused keys with .keys() and .values() |
| numba_cuda/numba/cuda/core/interpreter.py | Multiple perf improvements: removed unused loop variables, converted loops to comprehensions and dict updates |
| numba_cuda/numba/cuda/core/base.py | Converted loop to dict comprehension and used walrus operator for attribute checking |
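
The walrus-operator and noqa: PERF203 entries above might look something like the sketch below. It is an assumption about the general shape of such a change, not the actual code in driver.py or base.py; the classes, the `handle` attribute, and the function names are all hypothetical.

```python
# Hypothetical classes standing in for objects that may or may not
# expose an attribute; these are not taken from numba-cuda.
class WithHandle:
    handle = 42


class WithoutHandle:
    pass


# Before: exception control flow to probe for an attribute.
def handle_try(obj):
    try:
        return obj.handle
    except AttributeError:
        return None


# After: getattr() with a default, bound with the walrus operator,
# so no exception is raised on the missing-attribute path.
def handle_walrus(obj):
    if (h := getattr(obj, "handle", None)) is not None:
        return h
    return None


# A try/except inside a loop is what ruff's PERF203 flags. When the
# per-item handling is intentional, the pattern can be kept and the
# lint suppressed (the comment placement here is an assumption).
def parse_all(strings):
    parsed = []
    for s in strings:
        try:
            parsed.append(int(s))
        except ValueError:  # noqa: PERF203
            parsed.append(None)
    return parsed
```

The suppression is appropriate when each item must be handled independently, as in `parse_all`, where one bad input should not abort the whole loop.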

@rparolin rparolin requested a review from gmarkall January 6, 2026 23:25
@rparolin
Contributor

rparolin commented Jan 6, 2026

@gmarkall for visibility

@cpcloud cpcloud merged commit b05dfbe into NVIDIA:main Jan 7, 2026
130 checks passed
@cpcloud cpcloud deleted the perf-lint branch January 7, 2026 00:35
gmarkall added a commit to gmarkall/numba-cuda that referenced this pull request Jan 12, 2026
- Add arch specific target support (NVIDIA#549)
- chore: disable `locked` flag to bypass prefix-dev/pixi#5256 (NVIDIA#714)
- ci: relock pixi (NVIDIA#712)
- ci: remove redundant conda build in ci (NVIDIA#711)
- chore(deps): bump numba-cuda version and relock pixi (NVIDIA#707)
- Dropping bits in the old CI & Propagating recent changes from cuda-python (NVIDIA#683)
- Fix `test_wheel_deps_wheels.sh` to actually uninstall `nvvm` and `nvrtc` packages for CUDA 13 (NVIDIA#701)
- perf: remove some exception control flow and buffer-exception penalization for arrays (NVIDIA#700)
- perf: let CAI fall through instead of calling from_cuda_array_interface (NVIDIA#694)
- chore: perf lint (NVIDIA#697)
- chore(deps): bump deps in pixi lockfile (NVIDIA#693)
- fix: use freethreading-supported `_PySet_NextItemRef` where possible (NVIDIA#682)
- Support python `3.14` (NVIDIA#599)
- Remove customized address space tracking and address class emission in debug info (NVIDIA#669)
- Drop `experimental` from cuda.core namespace imports (NVIDIA#676)
- Remove dangling references to NUMBA_CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY (NVIDIA#675)
- Use `rapidsai/sccache` in CI (NVIDIA#674)
- chore(dev-deps): remove ipython and pyinstrument (NVIDIA#670)
- Set up a new VM-based CI infrastructure  (NVIDIA#604)
@gmarkall gmarkall mentioned this pull request Jan 12, 2026
gmarkall added a commit that referenced this pull request Jan 12, 2026
- Add arch specific target support (#549)
- chore: disable `locked` flag to bypass
prefix-dev/pixi#5256 (#714)
- ci: relock pixi (#712)
- ci: remove redundant conda build in ci (#711)
- chore(deps): bump numba-cuda version and relock pixi (#707)
- Dropping bits in the old CI & Propagating recent changes from
cuda-python (#683)
- Fix `test_wheel_deps_wheels.sh` to actually uninstall `nvvm` and
`nvrtc` packages for CUDA 13 (#701)
- perf: remove some exception control flow and buffer-exception
penalization for arrays (#700)
- perf: let CAI fall through instead of calling
from_cuda_array_interface (#694)
- chore: perf lint (#697)
- chore(deps): bump deps in pixi lockfile (#693)
- fix: use freethreading-supported `_PySet_NextItemRef` where possible
(#682)
- Support python `3.14` (#599)
- Remove customized address space tracking and address class emission in
debug info (#669)
- Drop `experimental` from cuda.core namespace imports (#676)
- Remove dangling references to
NUMBA_CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY (#675)
- Use `rapidsai/sccache` in CI (#674)
- chore(dev-deps): remove ipython and pyinstrument (#670)
- Set up a new VM-based CI infrastructure  (#604)