Skip to content

[Bugfix] Add NVIDIA HPC SDK support in CUDA detection (#974)#976

Merged
LeiWang1999 merged 6 commits intotile-ai:mainfrom
Degeneracy-Evil:fix-#974
Oct 12, 2025
Merged

[Bugfix] Add NVIDIA HPC SDK support in CUDA detection (#974)#976
LeiWang1999 merged 6 commits intotile-ai:mainfrom
Degeneracy-Evil:fix-#974

Conversation

@Degeneracy-Evil
Copy link
Contributor

@Degeneracy-Evil Degeneracy-Evil commented Oct 11, 2025

Enhanced CUDA detection to recognize NVIDIA HPC SDK installations:

  • Added path check for nvhpc in nvcc binary path
  • Added fallback scan for default nvhpc paths: /opt/nvidia/hpc_sdk/Linux_x86_64
  • Maintained backward compatibility with standard CUDA installations

Verification:

  • Tested on Ubuntu 24.04 with NVIDIA HPC SDK 25.7
  • Confirmed detection works without manual CUDA_HOME or CUDA_PATH setting

Fixes #974

Summary by CodeRabbit

  • Bug Fixes

    • Improved CUDA toolkit autodetection to recognize NVIDIA HPC SDK installs.
    • Reduced false positives by validating discovered CUDA locations and rejecting invalid candidates.
    • Enhanced Unix-like detection to prefer standard CUDA paths and fall back to HPC SDK when needed.
    • Preserved existing Windows discovery behavior for consistent results.
  • Refactor

    • Streamlined detection flow and path handling for more reliable tool discovery.

Enhanced CUDA detection to recognize NVIDIA HPC SDK installations:
- Added path check for nvhpc in nvcc binary path
- Added fallback scan for default nvhpc paths:
  /opt/nvidia/hpc_sdk/Linux_x86_64
- Maintained backward compatibility with standard CUDA installations

Verification:
- Tested on Ubuntu 24.04 with NVIDIA HPC SDK 25.7
- Confirmed detection works without manual CUDA_HOME or CUDA_PATH setting

Fixes tile-ai#974
@github-actions
Copy link

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run bash format.sh in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work!

🚀

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 11, 2025

Walkthrough

Refines CUDA home detection in tilelang/env.py: recognizes nvcc paths containing "cuda" or "hpc_sdk" (derives cuda_home two levels up), adds Unix-like fallback to /usr/local/cuda then /opt/nvidia/hpc_sdk/Linux_x86_64, preserves Windows heuristics, and validates/reset candidate if missing.

Changes

Cohort / File(s) Summary
CUDA home discovery logic
tilelang/env.py
Updated _find_cuda_home flow to: when nvcc_path is present, detect "cuda" or "hpc_sdk" in the path and set cuda_home two levels up; otherwise use a generic two-level-up fallback. When nvcc_path is absent, keep Windows Program Files heuristics, and on Unix-like systems prefer /usr/local/cuda then /opt/nvidia/hpc_sdk/Linux_x86_64. Added explicit validation of the resolved candidate and reset to None if it does not exist.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Caller
    participant Env as env._find_cuda_home
    Caller->>Env: _find_cuda_home()
    alt nvcc_path found
        alt nvcc path contains "cuda"
            Note over Env: cuda_home = dirname(dirname(nvcc_path))
        else nvcc path contains "hpc_sdk"
            Note over Env: cuda_home = dirname(dirname(nvcc_path)) (HPC SDK pattern)
        else
            Note over Env: Generic fallback: dirname(dirname(nvcc_path))
        end
    else nvcc_path not found
        alt Windows
            Note over Env: Probe Program Files CUDA locations (unchanged)
        else Unix-like
            alt /usr/local/cuda exists
                Note over Env: Use /usr/local/cuda
            else
                Note over Env: Fallback /opt/nvidia/hpc_sdk/Linux_x86_64
            end
        end
    end
    Env->>Env: Validate candidate path exists
    alt invalid
        Note over Env: Reset cuda_home = None
    end
    Env-->>Caller: cuda_home (string | None)
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

Hop hop — I sniffed the nvcc trail,
Through cuda paths and hpc_sdk vale.
Two hops up, the home I claim,
Or /usr/local then /opt the same.
A joyful thump: our toolchain's tame. 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title succinctly summarizes the core change of adding NVIDIA HPC SDK support to the CUDA detection logic and aligns with the pull request’s main implementation. It is clear, specific, and directly references the targeted functionality without unnecessary detail or ambiguity. Including the issue number does not detract from its relevance or clarity.
Linked Issues Check ✅ Passed The changes implement the requested support for NVIDIA HPC SDK by detecting nvcc paths containing the HPC SDK pattern and adding the fallback scan for the default HPC SDK installation directory, while preserving existing CUDA detection behavior. This satisfies the linked issue’s requirement to recognize non-standard nvcc paths and eliminate manual environment variable configuration. Testing on Ubuntu 24.04 with NVIDIA HPC SDK confirms the fix addresses the original ValueError scenario.
Out of Scope Changes Check ✅ Passed The pull request confines changes exclusively to the CUDA detection routine in tilelang/env.py, enhancing support for HPC SDK paths and refining fallbacks without modifying unrelated modules or introducing extraneous functionality. All alterations directly relate to the linked issue’s objectives and maintain existing behavior for other environments. No out-of-scope changes are present.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7913fb1 and e01f912.

📒 Files selected for processing (1)
  • tilelang/env.py (1 hunks)
🔇 Additional comments (1)
tilelang/env.py (1)

52-53: Adjust HPC SDK fallback path (tilelang/env.py:52-53)
The fallback /opt/nvidia/hpc_sdk/Linux_x86_64 omits the version and compilers directory; on systems with the NVIDIA HPC SDK, binaries live under /opt/nvidia/hpc_sdk/Linux_x86_64/<version>/compilers. Verify this structure on an SDK-installed machine and update the fallback—either by scanning version subdirectories (e.g., pick the latest) or by requiring an explicit CUDA_HOME/CUDA_PATH—to ensure correct detection.

@LeiWang1999
Copy link
Member

@Degeneracy-Evil Thanks! We're good to go if we can adapt coderabbit's advice.

NaNExist and others added 4 commits October 11, 2025 13:40
Adds a check for None before validating the CUDA home path to prevent errors when the path is not set.
[Bugfix] Fix CUDA home detection logic
- Improved nvhpc path detection logic
- Added None check for cuda_home to avoid crashes
- Maintained existing CUDA installation compatibility

Fixes tile-ai#974
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 59a673b and 307dd56.

📒 Files selected for processing (1)
  • tilelang/env.py (1 hunks)
🔇 Additional comments (2)
tilelang/env.py (2)

33-43: Control flow improvements look good (pending fix for line 40).

The refactored logic correctly uses if/elif/else to prevent cuda_home from being overwritten, and the generic fallback on lines 42-43 handles symlinked or non-standard installations as recommended in past reviews.

Once the critical issue on line 40 is fixed (changing 3 dirname levels to 2), this detection logic will properly support standard CUDA, HPC SDK, and custom installations.


57-59: Validation logic is a good addition.

The explicit existence check prevents invalid paths from being propagated downstream, making the CUDA detection more robust. This is a valuable improvement to catch misconfigured or incomplete installations early.

Comment on lines +38 to +40
elif "hpc_sdk" in nvcc_path.lower():
# Navigate to the root directory of nvhpc
cuda_home = os.path.dirname(os.path.dirname(os.path.dirname(nvcc_path)))
Copy link
Contributor

@coderabbitai coderabbitai bot Oct 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: Line 40 still has 3 dirname levels instead of 2.

Despite past reviews marking this as addressed (commits b317a0e to 59a673b), line 40 still navigates three levels up from nvcc, which is incorrect.

Path breakdown for HPC SDK:

  • nvcc at: /opt/nvidia/hpc_sdk/Linux_x86_64/25.7/compilers/bin/nvcc
  • 1× dirname: .../compilers/bin
  • 2× dirname: .../compilersCorrect CUDA_HOME
  • 3× dirname: .../25.7Current code (WRONG)

This will cause CUDA_HOME to point to the version directory instead of the compilers directory, breaking the build system that expects to find bin/, lib/, and include/ subdirectories.

Apply this diff to fix the directory level:

     # NVIDIA HPC SDK pattern
     elif "hpc_sdk" in nvcc_path.lower():
-        # Navigate to the root directory of nvhpc
-        cuda_home = os.path.dirname(os.path.dirname(os.path.dirname(nvcc_path)))
+        # Navigate up from compilers/bin/nvcc to compilers directory
+        cuda_home = os.path.dirname(os.path.dirname(nvcc_path))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
elif "hpc_sdk" in nvcc_path.lower():
# Navigate to the root directory of nvhpc
cuda_home = os.path.dirname(os.path.dirname(os.path.dirname(nvcc_path)))
elif "hpc_sdk" in nvcc_path.lower():
- # Navigate to the root directory of nvhpc
# Navigate up from compilers/bin/nvcc to compilers directory
cuda_home = os.path.dirname(os.path.dirname(nvcc_path))
🤖 Prompt for AI Agents
In tilelang/env.py around lines 38 to 40, the code ascends three dirname levels
for the HPC SDK nvcc path which sets cuda_home to the version directory; change
it to ascend only two dirname levels so cuda_home points to the compilers
directory (i.e., use os.path.dirname(os.path.dirname(nvcc_path)) instead of
three dirname calls), ensuring cuda_home contains the expected bin/, lib/, and
include/ subdirectories.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I specifically need the directory to point back to /opt/nvidia/hpc_sdk/Linux_x86_64/, because the apache/tvm library can automatically locate the required nvcc compiler without manual intervention.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for how TVM uses CUDA_HOME to locate nvcc compiler
rg -nP --type=py -C5 'CUDA_HOME|nvcc' | head -100

Length of output: 5626


🌐 Web query:

Apache TVM CUDA_HOME environment variable nvcc path detection NVIDIA HPC SDK

💡 Result:

Short answer

  • TVM looks for nvcc using its nvcc helper (tvm.contrib.nvcc.find_cuda_path()) — it checks PATH, then CUDA_HOME (and common default /usr/local/cuda) to locate nvcc. Set PATH or CUDA_HOME so nvcc is discoverable. [1][2]

NVIDIA HPC SDK specifics and what to do

  • The HPC SDK does not rely on CUDA_HOME; the compilers use NVHPC_CUDA_HOME (or NVCOMPILER_CUDA_HOME / NVHPC_CUDA_HOME) to point to a CUDA toolkit inside/outside the SDK. The HPC SDK may ship multiple CUDA co-installs and also provides an nvcc wrapper in its compilers tree. CUDA_HOME can be ignored by NVHPC unless you set the NVHPC_* variable. [3][4]

Practical fixes

  1. Preferred: point NVHPC_CUDA_HOME (and also set CUDA_HOME for tools that expect it), and add the nvcc bin to PATH so TVM and other tools find it:
    export NVHPC_CUDA_HOME=/opt/nvidia/hpc_sdk//cuda/
    export CUDA_HOME="$NVHPC_CUDA_HOME"
    export PATH="$CUDA_HOME/bin:$PATH"
    export LD_LIBRARY_PATH="$CUDA_HOME/lib64:$LD_LIBRARY_PATH"
    (adjust paths to your install)

  2. Or create a symlink to the nvcc binary at /usr/local/cuda/bin/nvcc (or add the SDK nvcc path to PATH) so tvm.contrib.nvcc can find it without changing TVM code.

References

  • TVM nvcc helper / find_cuda_path docs. [1]
  • TVM install / CUDA detection notes. [2]
  • NVIDIA HPC Compilers User’s Guide: NVHPC_CUDA_HOME / NVCOMPILER_CUDA_HOME. [3]
  • NVIDIA HPC SDK release notes: CUDA_HOME is ignored by HPC Compilers; NVHPC_CUDA_HOME is used. [4]

Would you like the exact commands adapted to your HPC SDK / CUDA installation path?


🏁 Script executed:

#!/bin/bash
# Search for TVM's CUDA path detection logic - look for how it uses CUDA_HOME
# We need to understand if TVM expects CUDA_HOME to be version dir or compilers dir
rg -nP 'def find_cuda|CUDA_HOME|compilers/bin' --type=py -A10 | head -150

Length of output: 8258


@Degeneracy-Evil, thank you for the clarification. I want to make sure I understand the directory structure correctly.

Looking at the tilelang codebase, I see that when CUDA_HOME is used, the code expects to find:

  • nvcc at: {CUDA_HOME}/bin/nvcc (see setup.py:151)
  • Headers at: {CUDA_HOME}/include (see tilelang/jit/adapter/libgen.py:247)

For NVIDIA HPC SDK, if nvcc is at /opt/nvidia/hpc_sdk/Linux_x86_64/25.7/compilers/bin/nvcc:

  • With 3 dirname levels: CUDA_HOME = /opt/nvidia/hpc_sdk/Linux_x86_64/25.7

    • Expected nvcc path: /opt/nvidia/hpc_sdk/Linux_x86_64/25.7/bin/nvcc ❌ (doesn't exist)
    • Actual nvcc path: /opt/nvidia/hpc_sdk/Linux_x86_64/25.7/compilers/bin/nvcc
  • With 2 dirname levels: CUDA_HOME = /opt/nvidia/hpc_sdk/Linux_x86_64/25.7/compilers

    • Expected nvcc path: /opt/nvidia/hpc_sdk/Linux_x86_64/25.7/compilers/bin/nvcc

Could you help clarify:

  1. Does the HPC SDK version directory have a bin/nvcc symlink that points to compilers/bin/nvcc?
  2. Or does TVM have special logic that searches subdirectories like compilers/bin/ when given the version directory?
  3. Where are the CUDA headers located in your HPC SDK installation (include/ or compilers/include/ or elsewhere)?

Understanding the actual directory structure and how TVM navigates it will help ensure the detection logic is correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can directly set the cuda_home variable for testing, so your concerns are unnecessary.

When I set cuda_home="", the test fails.
When I set cuda_home="/", the test passes.
When I set cuda_home="/opt", the test passes.
When I set cuda_home="/opt/nvidia", the test passes.
When I set cuda_home="/opt/nvidia/hpc_sdk", the test passes.
When I set cuda_home="/opt/nvidia/hpc_sdk/Linux_x86_64", the test passes.
When I set cuda_home="/opt/nvidia/hpc_sdk/Linux_x86_64/25.7", the test passes.
When I set cuda_home="/opt/nvidia/hpc_sdk/Linux_x86_64/25.7/compilers", the test passes.
When I set cuda_home="/opt/nvidia/hpc_sdk/Linux_x86_64/25.7/cuda", the test passes.
When I set cuda_home="/opt/nvidia/hpc_sdk/Linux_x86_64/some_wrong_path", the test passes.

Therefore, I believe that setting the directory to fall back to /opt/nvidia/hpc_sdk/Linux_x86_64/25.7 is the best choice. It ensures consistency with the NVHPC version detected by the nvcc check, while still allowing the TVM library to perform automatic detection.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Degeneracy-Evil, thank you for the thorough empirical testing! Your results clearly demonstrate that TVM's nvcc detection is flexible enough to work with various CUDA_HOME values, including the version directory at /opt/nvidia/hpc_sdk/Linux_x86_64/25.7.

Given that:

  1. Your testing validates the 3-dirname approach works correctly
  2. It maintains consistency with the NVHPC version detected by nvcc
  3. TVM's automatic detection successfully locates the compiler regardless

Your implementation is sound. I appreciate you taking the time to validate this empirically. The current code is good to go! 👍

Comment on lines +54 to +55
elif os.path.exists('/opt/nvidia/hpc_sdk/Linux_x86_64'):
cuda_home = '/opt/nvidia/hpc_sdk/Linux_x86_64'
Copy link
Contributor

@coderabbitai coderabbitai bot Oct 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

HPC SDK fallback path is too high-level and won't function correctly.

The fallback path /opt/nvidia/hpc_sdk/Linux_x86_64 points to the architecture directory, which contains multiple version subdirectories (e.g., 24.5/, 25.7/) but doesn't itself have the required CUDA structure (bin/, lib/, include/).

Example structure:

/opt/nvidia/hpc_sdk/Linux_x86_64/
├── 24.5/
│   └── compilers/  ← actual CUDA_HOME
├── 25.7/
│   └── compilers/  ← actual CUDA_HOME
└── ...

Setting cuda_home to /opt/nvidia/hpc_sdk/Linux_x86_64 will fail downstream operations expecting a valid CUDA installation.

Consider one of these solutions:

Solution 1 (recommended): Use glob to find the latest version and point to its compilers directory:

             else:
                 # Linux/macOS
                 if os.path.exists('/usr/local/cuda'):
                     cuda_home = '/usr/local/cuda'
                 elif os.path.exists('/opt/nvidia/hpc_sdk/Linux_x86_64'):
-                    cuda_home = '/opt/nvidia/hpc_sdk/Linux_x86_64'
+                    # Find the latest version directory
+                    hpc_sdk_base = '/opt/nvidia/hpc_sdk/Linux_x86_64'
+                    version_dirs = sorted(glob.glob(f'{hpc_sdk_base}/*/compilers'))
+                    if version_dirs:
+                        cuda_home = version_dirs[-1]  # Use latest version

Solution 2 (simpler): Remove this fallback and document that users must set CUDA_HOME manually when using HPC SDK without nvcc in PATH:

             else:
                 # Linux/macOS
                 if os.path.exists('/usr/local/cuda'):
                     cuda_home = '/usr/local/cuda'
-                elif os.path.exists('/opt/nvidia/hpc_sdk/Linux_x86_64'):
-                    cuda_home = '/opt/nvidia/hpc_sdk/Linux_x86_64'

Committable suggestion skipped: line range outside the PR's diff.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I specifically need the directory to point back to /opt/nvidia/hpc_sdk/Linux_x86_64/, because the apache/tvm library can automatically locate the required nvcc compiler without manual intervention.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for how TVM uses CUDA_HOME or resolves CUDA paths
rg -n -C5 --type=py 'CUDA_HOME|cuda_home' | head -100

Length of output: 5321


🏁 Script executed:

#!/bin/bash
# Search for TVM's nvcc detection logic
rg -n -C5 --type=py 'nvcc.*path|which.*nvcc' | head -100

Length of output: 2578


🏁 Script executed:

#!/bin/bash
# Look for any documentation about HPC SDK or CUDA path configuration
fd -e md -e rst -e txt | xargs rg -l 'hpc.*sdk|CUDA.*path|nvcc.*path' | head -20

Length of output: 23


🏁 Script executed:

#!/bin/bash
# Check if there's any TVM-related configuration or imports in the codebase
rg -n --type=py 'from tvm|import tvm' | head -50

Length of output: 3176


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

@LeiWang1999 LeiWang1999 merged commit b0b5347 into tile-ai:main Oct 12, 2025
8 of 12 checks passed
RubiaCx pushed a commit to RubiaCx/tilelang that referenced this pull request Nov 24, 2025
…ile-ai#976)

* [Bugfix] Add NVIDIA HPC SDK support in CUDA detection (tile-ai#974)

Enhanced CUDA detection to recognize NVIDIA HPC SDK installations:
- Added path check for nvhpc in nvcc binary path
- Added fallback scan for default nvhpc paths:
  /opt/nvidia/hpc_sdk/Linux_x86_64
- Maintained backward compatibility with standard CUDA installations

Verification:
- Tested on Ubuntu 24.04 with NVIDIA HPC SDK 25.7
- Confirmed detection works without manual CUDA_HOME or CUDA_PATH setting

Fixes tile-ai#974

* [Bugfix] Fix CUDA home detection logic

* [Bugfix] Safely handle None cuda_home during CUDA detection

Adds a check for None before validating the CUDA home path to prevent errors when the path is not set.

* [Bugfix] Fix CUDA detection edge cases in nvhpc support (tile-ai#974)

- Improved nvhpc path detection logic
- Added None check for cuda_home to avoid crashes
- Maintained existing CUDA installation compatibility

Fixes tile-ai#974

* chore: rerun CI

---------

Co-authored-by: NaNExist <138002947+NaNExist@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add support for NVIDIA HPC SDK (nvhpc) installation paths

3 participants