[flashinfer][fix] do not check nvcc availability when using pre-downloaded cubins#27990

Merged
houseroad merged 3 commits into vllm-project:main from mxz297:export-D86104899
Nov 8, 2025
Conversation

@mxz297
Contributor

@mxz297 mxz297 commented Nov 3, 2025

Summary: #26443 added an nvcc availability check as a condition for enabling FlashInfer MoE. Our deployment environment has no nvcc, so FlashInfer MoE is disabled there.
Differential Revision: D86104899

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request aims to enable FlashInfer in environments where nvcc is not available by removing the nvcc availability check. While this addresses the issue for environments with pre-compiled kernels, it could introduce runtime crashes for users who lack both nvcc and pre-compiled kernels. I've suggested a safer alternative that makes the nvcc check conditional on the VLLM_HAS_FLASHINFER_CUBIN environment variable. This approach provides the desired flexibility for production environments while preserving the safeguard for other users.
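The conditional check suggested here could be sketched as follows. This is a minimal illustration, not the actual vLLM code: the function names `has_flashinfer_cubin`, `has_nvcc`, and `can_enable_flashinfer_moe` are hypothetical stand-ins for whatever the real gating logic is called.

```python
import os
import shutil


def has_flashinfer_cubin() -> bool:
    # Pre-downloaded cubins are signalled via the env var discussed in
    # this thread (assumed convention: "1" means cubins are present).
    return os.environ.get("VLLM_HAS_FLASHINFER_CUBIN") == "1"


def has_nvcc() -> bool:
    # nvcc is only needed when FlashInfer must JIT-compile kernels.
    return shutil.which("nvcc") is not None


def can_enable_flashinfer_moe() -> bool:
    # With pre-downloaded cubins, skip the nvcc check entirely;
    # otherwise JIT compilation requires nvcc on PATH.
    return has_flashinfer_cubin() or has_nvcc()
```

This keeps the safeguard for JIT users (no nvcc and no cubins still means FlashInfer MoE stays off) while letting AOT deployments opt in via the environment variable.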

@mxz297
Contributor Author

mxz297 commented Nov 3, 2025

@mgoin our internal prod environment uses flashinfer in an AOT fashion and does not have nvcc. So right now we are seeing flashinfer moe being disabled internally, causing a perf regression.

@alecsolder
Contributor

Is there a way we can add unit tests to ensure this doesn't get turned off accidentally again for the model?
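A regression test along these lines could guard against the check being reintroduced unconditionally. This is a self-contained sketch: `can_enable_flashinfer_moe` is a hypothetical stand-in for the real vLLM gating function, and the test would need to target the actual code path instead.

```python
import os
import shutil
from unittest import mock


def can_enable_flashinfer_moe() -> bool:
    # Hypothetical stand-in for the real check: pre-downloaded cubins
    # bypass the nvcc requirement; otherwise JIT needs nvcc on PATH.
    if os.environ.get("VLLM_HAS_FLASHINFER_CUBIN") == "1":
        return True
    return shutil.which("nvcc") is not None


def test_moe_enabled_without_nvcc():
    # Simulate an AOT environment: cubins present, nvcc missing.
    with mock.patch.dict(os.environ, {"VLLM_HAS_FLASHINFER_CUBIN": "1"}):
        with mock.patch("shutil.which", return_value=None):
            assert can_enable_flashinfer_moe()
```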

@mxz297 mxz297 changed the title do not check nvcc availability [flashinfer][fix] do not check nvcc availability Nov 3, 2025
@heheda12345
Collaborator

Is nvcc required by the jit compilation of FlashInfer?

@mxz297 mxz297 changed the title [flashinfer][fix] do not check nvcc availability [flashinfer][fix] do not check nvcc availability when using pre-downloaded cubins Nov 7, 2025
@mxz297
Contributor Author

mxz297 commented Nov 7, 2025

Is nvcc required by the jit compilation of FlashInfer?

Yes, so it makes sense to take gemini-code-assist's recommendation and guard the check with VLLM_HAS_FLASHINFER_CUBIN.

@gemini-code-assist
Contributor

That's correct. Guarding the nvcc check with VLLM_HAS_FLASHINFER_CUBIN ensures that FlashInfer JIT compilation only attempts to use nvcc when pre-downloaded cubins are not available, which aligns with the goal of supporting AOT environments. This change effectively addresses the issue of FlashInfer being disabled in environments without nvcc.

…-project#27990)

Summary:

vllm-project#26443 added an nvcc availability check as a condition for enabling flashinfer moe. On devgpus, nvcc may be present, so there is no issue. But in tw jobs, there is no nvcc, so flashinfer moe is disabled.

Differential Revision: D86104899

Signed-off-by: Xiaozhu <mxz297@gmail.com>
Collaborator

@houseroad houseroad left a comment


Looks good.

@houseroad houseroad added the ready label (ONLY add when PR is ready to merge/full CI is needed) Nov 7, 2025
@houseroad houseroad merged commit 4a36681 into vllm-project:main Nov 8, 2025
45 checks passed
@mxz297 mxz297 deleted the export-D86104899 branch November 8, 2025 18:10
@mgoin mgoin added the nvidia label Nov 10, 2025
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025
…oaded cubins (vllm-project#27990)

Signed-off-by: Xiaozhu <mxz297@gmail.com>
Co-authored-by: Lu Fang <30275821+houseroad@users.noreply.github.com>

Labels

nvidia, ready (ONLY add when PR is ready to merge/full CI is needed)


5 participants