
fix: cache config enum #628

Merged
gmarkall merged 3 commits into NVIDIA:main from kaeun97:kaeun97/fix-cache-config on Dec 4, 2025

Conversation

@kaeun97
Contributor

@kaeun97 kaeun97 commented Dec 3, 2025

This will be followed by a sequence of work on improving the L1 cache/shared memory configuration implementation, as stated here.

This PR fixes the following bug:

Traceback (most recent call last):
  File "/home/gmarkall/numbadev/issues/discourse-3080/repro.py", line 13, in <module>
    cufunc.cache_config(prefer_shared=True)
  File "/home/gmarkall/numbadev/numba-cuda/numba_cuda/numba/cuda/cudadrv/driver.py", line 2418, in cache_config
    flag = attr.CU_FUNC_CACHE_PREFER_SHARED
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: type object 'CUfunction_attribute' has no attribute 'CU_FUNC_CACHE_PREFER_SHARED'

It also adds a test to ensure this won't regress.
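
For reference, here is a minimal sketch of the incorrect versus corrected enum lookup, assuming the cuda-python bindings (cuda.bindings.driver) are installed; the attribute names follow the traceback above and the review summary below, and this is illustrative rather than the patched code itself:

    from cuda.bindings import driver as binding

    # Before the fix: the constant was looked up on CUfunction_attribute, which
    # has no CU_FUNC_CACHE_PREFER_SHARED member, so the lookup raised
    # AttributeError:
    # flag = binding.CUfunction_attribute.CU_FUNC_CACHE_PREFER_SHARED

    # After the fix: the cache-preference constants live on CUfunc_cache.
    flag = binding.CUfunc_cache.CU_FUNC_CACHE_PREFER_SHARED
    print(int(flag))  # 1 (0x01), the value passed on to cuFuncSetCacheConfig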

@copy-pr-bot

copy-pr-bot bot commented Dec 3, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@greptile-apps
Contributor

greptile-apps bot commented Dec 3, 2025

Greptile Overview

Greptile Summary

Fixed AttributeError in cache_config() by correcting the enum type from binding.CUfunction_attribute to binding.CUfunc_cache.

  • Changed driver.py:2412 to use the correct CUfunc_cache enum, which contains the cache configuration constants (CU_FUNC_CACHE_PREFER_SHARED, CU_FUNC_CACHE_PREFER_L1, etc.); see the sketch after this list
  • Added test coverage with test_cuda_cache_config, which validates all four cache configuration modes
  • The test ensures the kernel executes correctly after setting the cache configuration
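
As a rough illustration, the cache_config keyword arguments could map onto CUfunc_cache values as follows; this mirrors the spirit of the fixed logic in driver.py rather than reproducing it, and assumes the cuda-python bindings are available:

    from cuda.bindings import driver as binding

    def _cache_flag(prefer_equal=False, prefer_cache=False, prefer_shared=False):
        # Resolve the keyword arguments to a CUfunc_cache value, in the same
        # spirit as the fixed cache_config() in driver.py (illustrative only).
        attr = binding.CUfunc_cache
        if prefer_equal:
            return attr.CU_FUNC_CACHE_PREFER_EQUAL
        elif prefer_cache:
            return attr.CU_FUNC_CACHE_PREFER_L1
        elif prefer_shared:
            return attr.CU_FUNC_CACHE_PREFER_SHARED
        else:
            return attr.CU_FUNC_CACHE_PREFER_NONE

    print(_cache_flag(prefer_shared=True))  # CUfunc_cache.CU_FUNC_CACHE_PREFER_SHARED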

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • Simple one-line fix correcting an enum type error, with comprehensive test coverage added. The change aligns with the existing CtypesFunction class pattern and directly addresses the reported bug.
  • No files require special attention

Important Files Changed

File Analysis

Filename | Score | Overview
numba_cuda/numba/cuda/cudadrv/driver.py | 5/5 | Fixed enum type from CUfunction_attribute to CUfunc_cache for cache configuration flags
numba_cuda/numba/cuda/tests/cudadrv/test_cuda_driver.py | 5/5 | Added comprehensive test for all cache config modes (prefer_shared, prefer_cache, prefer_equal, default) with functional validation

Sequence Diagram

sequenceDiagram
    participant User
    participant Kernel as CUDA Kernel
    participant CudaPythonFunction
    participant Binding as cuda.bindings.driver
    participant Driver as CUDA Driver
    
    User->>Kernel: kernel.overloads[sig]
    Kernel->>CudaPythonFunction: get_cufunc()
    User->>CudaPythonFunction: cache_config(prefer_shared=True)
    CudaPythonFunction->>Binding: CUfunc_cache.CU_FUNC_CACHE_PREFER_SHARED
    Binding-->>CudaPythonFunction: flag value (0x01)
    CudaPythonFunction->>Driver: cuFuncSetCacheConfig(handle, flag)
    Driver-->>CudaPythonFunction: success
    CudaPythonFunction-->>User: configuration applied
    User->>Kernel: kernel[grid, block](args)
    Kernel->>Driver: launch with cache config
    Driver-->>User: execution complete
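
The flow in the diagram can be sketched as a small helper. Here, cufunc and launch_and_check are placeholders (the real test obtains them from a compiled numba.cuda kernel), so this is illustrative rather than the actual test_cuda_cache_config:

    def check_cache_configs(cufunc, launch_and_check):
        # Exercise each cache configuration mode and confirm the kernel still
        # runs correctly afterwards, mirroring the diagram above.
        for kwargs in (
            {"prefer_shared": True},   # CU_FUNC_CACHE_PREFER_SHARED
            {"prefer_cache": True},    # CU_FUNC_CACHE_PREFER_L1
            {"prefer_equal": True},    # CU_FUNC_CACHE_PREFER_EQUAL
            {},                        # default: CU_FUNC_CACHE_PREFER_NONE
        ):
            cufunc.cache_config(**kwargs)  # calls cuFuncSetCacheConfig under the hood
            launch_and_check()             # kernel should still execute correctly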

Contributor

@greptile-apps greptile-apps bot left a comment


2 files reviewed, no comments


@kaeun97 kaeun97 force-pushed the kaeun97/fix-cache-config branch from 012c318 to c221592 on December 3, 2025 01:08
Contributor

@greptile-apps greptile-apps bot left a comment


2 files reviewed, no comments


@gmarkall
Contributor

gmarkall commented Dec 3, 2025

/ok to test a23a794

@gmarkall gmarkall added the 3 - Ready for Review (Ready for review by team) label Dec 3, 2025
@gmarkall
Contributor

gmarkall commented Dec 4, 2025

/ok to test

@copy-pr-bot

copy-pr-bot bot commented Dec 4, 2025

/ok to test

@gmarkall, there was an error processing your request: E1

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/

Contributor

@greptile-apps greptile-apps bot left a comment


2 files reviewed, no comments


@gmarkall
Contributor

gmarkall commented Dec 4, 2025

/ok to test 3c14514

Contributor

@gmarkall gmarkall left a comment


Many thanks! I've checked with Nsight Compute that setting the attributes takes effect.

@gmarkall gmarkall added the 4 - Waiting on CI label and removed the 3 - Ready for Review label Dec 4, 2025
@gmarkall gmarkall enabled auto-merge (squash) December 4, 2025 14:12
@gmarkall gmarkall merged commit ea1779e into NVIDIA:main Dec 4, 2025
71 checks passed

Labels

4 - Waiting on CI (Waiting for a CI run to finish successfully)


2 participants