[flashinfer][fix] do not check nvcc availability when using pre-downloaded cubins#27990
houseroad merged 3 commits into vllm-project:main
Code Review
This pull request aims to enable FlashInfer in environments where nvcc is not available by removing the nvcc availability check. While this addresses the issue for environments with pre-compiled kernels, it could introduce runtime crashes for users who lack both nvcc and pre-compiled kernels. I've suggested a safer alternative that makes the nvcc check conditional on the VLLM_HAS_FLASHINFER_CUBIN environment variable. This approach provides the desired flexibility for production environments while preserving the safeguard for other users.
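The conditional check suggested in this review can be sketched as follows. This is only an illustration: the helper names `has_precompiled_cubins` and `nvcc_requirement_met`, and the way `VLLM_HAS_FLASHINFER_CUBIN` is parsed, are assumptions made for the example and are not taken from the vLLM code base.

```python
import os
import shutil


def has_precompiled_cubins() -> bool:
    """Return True when VLLM_HAS_FLASHINFER_CUBIN is truthy (assumed parsing)."""
    value = os.environ.get("VLLM_HAS_FLASHINFER_CUBIN", "0")
    return value.strip().lower() in ("1", "true")


def nvcc_requirement_met() -> bool:
    """Hypothetical helper: can FlashInfer kernels be obtained here?"""
    # Pre-downloaded cubins mean nothing is JIT-compiled, so nvcc is not needed.
    if has_precompiled_cubins():
        return True
    # Without cubins, kernels are JIT-compiled, which needs nvcc on PATH.
    return shutil.which("nvcc") is not None
```

With this shape, an AOT deployment without nvcc passes the check by setting the variable, while users with neither nvcc nor cubins still get FlashInfer disabled up front rather than a crash at runtime.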
@mgoin our internal prod environment uses flashinfer in an AOT fashion and does not have nvcc. So right now we are seeing flashinfer moe being disabled internally, causing a perf regression.
Is there a way we can add unit tests to ensure this doesn't get turned off accidentally again for the model?
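One possible shape for such a regression test, as a rough sketch: it simulates a machine without nvcc by patching `shutil.which` and asserts that the availability check still passes when `VLLM_HAS_FLASHINFER_CUBIN` is set. The function `flashinfer_moe_available` is a hypothetical stand-in defined here only so the example runs on its own; a real test would import the project's actual check instead.

```python
import os
import shutil
import unittest
from unittest import mock


def flashinfer_moe_available() -> bool:
    """Stand-in for the real availability check (hypothetical name and logic)."""
    if os.environ.get("VLLM_HAS_FLASHINFER_CUBIN", "0") == "1":
        return True
    return shutil.which("nvcc") is not None


class FlashInferAvailabilityTest(unittest.TestCase):
    @mock.patch("shutil.which", return_value=None)  # simulate a host without nvcc
    def test_enabled_with_cubins_and_no_nvcc(self, _which):
        with mock.patch.dict(os.environ, {"VLLM_HAS_FLASHINFER_CUBIN": "1"}):
            self.assertTrue(flashinfer_moe_available())

    @mock.patch("shutil.which", return_value=None)  # simulate a host without nvcc
    def test_disabled_without_cubins_or_nvcc(self, _which):
        with mock.patch.dict(os.environ, {"VLLM_HAS_FLASHINFER_CUBIN": "0"}):
            self.assertFalse(flashinfer_moe_available())
```

Patching `shutil.which` keeps the test independent of whether the CI machine happens to have a CUDA toolkit installed.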
Is nvcc required for FlashInfer's JIT compilation?
d0134bf to 7e39e1e
Yes, so it makes sense to take the recommendation from gemini-code-assist and guard the check with VLLM_HAS_FLASHINFER_CUBIN.
That's correct. Guarding the …
…-project#27990)
Summary: vllm-project#26443 adds a check for nvcc availability as a condition for enabling flashinfer moe. On devgpus we may have nvcc, so there is no issue. But in tw jobs there is no nvcc, so flashinfer moe is disabled.
Differential Revision: D86104899
Signed-off-by: Xiaozhu <mxz297@gmail.com>
7e39e1e to ac594ea
…oaded cubins (vllm-project#27990)
Signed-off-by: Xiaozhu <mxz297@gmail.com>
Co-authored-by: Lu Fang <30275821+houseroad@users.noreply.github.com>
Summary: #26443 adds a check for nvcc availability as a condition for enabling flashinfer moe. In our deployment env there is no nvcc, so flashinfer moe is disabled.
Differential Revision: D86104899