
[Bugfix] Auto-configure TRITON_PTXAS_PATH for new GPU architectures#32704

Open
danielostrow wants to merge 1 commit into vllm-project:main from danielostrow:fix-triton-ptxas-new-gpus

Conversation

@danielostrow

@danielostrow danielostrow commented Jan 20, 2026

Summary

Triton bundles a ptxas binary from CUDA 12.8 that does not support GPU architectures sm_110a (Jetson Thor) or sm_121a (DGX Spark GB10). This causes Triton kernel compilation to fail with:

ptxas fatal: Value 'sm_121a' is not defined for option 'gpu-name'

This PR adds automatic detection of new GPU architectures and configures TRITON_PTXAS_PATH to use the system CUDA toolkit's ptxas when needed.

Changes

  • Add _configure_triton_ptxas_for_new_gpus() function to vllm/triton_utils/importing.py
  • Uses Triton's native backend detection (triton.backends.backends) to get GPU architecture
  • Sets TRITON_PTXAS_PATH for GPUs with arch >= 110 (compute capability 11.0+)
  • Respects user-configured TRITON_PTXAS_PATH if already set
  • Fails gracefully if detection is unavailable
  • Add unit tests for the new functionality

Testing

Tested on NVIDIA GB10 (DGX Spark) with:

  • CUDA 13.0 (V13.0.88)
  • Triton 3.5.1
  • PyTorch 2.9.1+cu130

Verified that:

  • Triton kernels compile and execute successfully with the fix
  • Triton kernels fail without the fix (expected ptxas error)

Related Issues

Fixes #31269
Fixes #32093

Related to #29469

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only fastcheck CI runs, which starts a small and essential subset of CI tests to quickly catch errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

@mergify mergify bot added the bug Something isn't working label Jan 20, 2026
@danielostrow danielostrow force-pushed the fix-triton-ptxas-new-gpus branch from 6e0d8d4 to 2edfbec Compare January 20, 2026 17:13
@mergify

mergify bot commented Jan 20, 2026

Hi @danielostrow, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?
mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

@danielostrow danielostrow force-pushed the fix-triton-ptxas-new-gpus branch from 2edfbec to 641bf35 Compare January 20, 2026 17:27
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a helpful auto-configuration mechanism for TRITON_PTXAS_PATH to support new GPU architectures. The implementation is sound and the accompanying tests are thorough. My review focuses on improving the debuggability of the new logic by adding logging to error-handling paths that currently fail silently. These changes will make it easier to diagnose issues if the auto-configuration does not behave as expected.

Triton bundles a ptxas binary from CUDA 12.8 that does not support
GPU architectures sm_110a (Jetson Thor) or sm_121a (DGX Spark GB10).
This causes Triton kernel compilation to fail with:

    ptxas fatal: Value 'sm_121a' is not defined for option 'gpu-name'

This change adds automatic detection of new GPU architectures using
Triton's native backend detection and configures TRITON_PTXAS_PATH
to use the system CUDA toolkit's ptxas when needed.

The fix:
- Uses triton.backends.backends to detect GPU architecture
- Sets TRITON_PTXAS_PATH for GPUs with arch >= 110 (CC 11.0+)
- Respects user-configured TRITON_PTXAS_PATH if already set
- Fails gracefully if detection is unavailable

Tested on NVIDIA GB10 (DGX Spark) with CUDA 13.0 and Triton 3.5.1.

Related issues: vllm-project#31269, vllm-project#29469, vllm-project#32093

Signed-off-by: Daniel Ostrow <daniel@neuralintellect.com>
@danielostrow danielostrow force-pushed the fix-triton-ptxas-new-gpus branch from 641bf35 to ad16030 Compare January 20, 2026 17:30
@danielostrow
Author

How are we looking on this? Is it still relevant?

@changqingla

I'm experiencing the same problem and hope this PR can be merged as soon as possible.

Author


Please note: this change acts as a stopgap for when Triton updates lag behind CUDA updates. In the future, once Triton catches up, it may become unnecessary.

seli-equinix added a commit to seli-equinix/vllm that referenced this pull request Jan 29, 2026
Cherry-pick from PR vllm-project#32704 - auto-detects GPU arch >= 110 and
configures TRITON_PTXAS_PATH to use system CUDA toolkit's ptxas
instead of Triton's bundled version (CUDA 12.8) which doesn't
support sm_121a.

This ensures Triton kernels compile correctly on DGX Spark GB10.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: seli-equinix <seli@equinix.com>
seli-equinix added a commit to seli-equinix/vllm that referenced this pull request Feb 3, 2026
@mgoin
Member

mgoin commented Feb 4, 2026

I think it is likely that this will be resolved by the torch==2.10.0 update in #30525, since that pins triton==3.6.0

@Kaweees

Kaweees commented Feb 12, 2026

I think it is likely that this will be resolved by torch==2.10.0 update here #30525 since that pins to triton==3.6.0

@mgoin I could be completely wrong, but my configuration in #34470 uses torch>=2.10.0 and triton>=3.6.0 but still has issues

seli-equinix added a commit to seli-equinix/vllm that referenced this pull request Feb 16, 2026
seli-equinix added a commit to seli-equinix/vllm that referenced this pull request Feb 16, 2026
@johnnynunez
Contributor

johnnynunez commented Feb 24, 2026

I think it is likely that this will be resolved by torch==2.10.0 update here #30525 since that pins to triton==3.6.0

@mgoin I could be completely wrong, but my configuration in #34470 uses torch>=2.10.0 and triton>=3.6.0 but still has issues

Because that is only fixed in the upcoming Triton 3.7.0. You have to build Triton from main (the release branch has not been cut yet) and use https://dev-discuss.pytorch.org/t/pytorch-2-11-rc1-produced-for-pytorch-torchvision/3316

Then build vLLM from upstream.

Yesterday I had Nemotron NVFP4 running on Jetson AGX Thor and DGX Spark.

seli-equinix added a commit to seli-equinix/vllm that referenced this pull request Mar 5, 2026
seli-equinix added a commit to seli-equinix/vllm that referenced this pull request Mar 11, 2026
@mergify mergify bot added the intel-gpu Related to Intel GPU label Mar 31, 2026

Labels

bug (Something isn't working), intel-gpu (Related to Intel GPU)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Nemotron Nano V3 FP16 on Jetson THOR
[Feature]: Add support for NVIDIA Jetson AGX Thor

5 participants