fix: error due to FA2 when building #3266
AlpinDale wants to merge 3 commits into vllm-project:main
Conversation
@AlpinDale thank you for this! This almost worked out of the box for me, but I got an error.
Co-authored-by: Michael Goin <michael@neuralmagic.com>
Thanks for the quick fix, @mgoin.
I spoke too soon; it seems like the build succeeds, but in actuality flash-attn just fails to install.
That's odd. I wonder if there's a way to specify in requirements.txt that a module should be installed without external dependencies. That should be the only reason we need to do this for flash attention.
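For context, a minimal sketch of the workaround being discussed, assuming the failure is that pip's isolated build environment lacks torch (which flash-attn imports at build time). Requirement files have no per-package `--no-build-isolation` option, so the package would need its own pip invocation outside requirements.txt:

```python
# Hypothetical sketch: install flash-attn in a separate step, outside
# requirements.txt, with pip's build isolation disabled so the build
# can see the torch already installed in the environment.
import subprocess
import sys

subprocess.check_call([
    sys.executable, "-m", "pip", "install",
    "flash-attn",
    "--no-build-isolation",  # build against the existing env, not an isolated one
])
```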
Looks like there's no way to do this reliably. @WoosukKwon, can we instead import the flash attention forward kernels directly in vLLM? I'm unsure why they're needed in the first place; I noticed zero performance improvement with FlashAttention-2 in place of xFormers.
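As a rough illustration of what calling the forward kernel directly might look like (a sketch only, not vLLM's actual integration; it assumes the flash-attn package and a CUDA device are available):

```python
# Sketch: invoking the FlashAttention-2 forward function directly.
import torch
from flash_attn import flash_attn_func

batch, seqlen, nheads, headdim = 1, 128, 8, 64
# flash-attn expects fp16/bf16 CUDA tensors shaped (batch, seqlen, nheads, headdim).
q = torch.randn(batch, seqlen, nheads, headdim, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

out = flash_attn_func(q, k, v, causal=True)  # same shape as q
```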
Closing due to #3269 being a better solution.
The #3005 PR introduced an issue where the Python environment can't find `pip` under certain conditions. This PR uses `ensurepip` to bootstrap `pip` into the existing environment. Resolves #3265.
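A minimal sketch of the `ensurepip` approach described above (the PR itself may invoke it differently, e.g. via `python -m ensurepip` in a build script):

```python
# Bootstrap pip with the stdlib ensurepip module if it is missing.
import importlib.util

if importlib.util.find_spec("pip") is None:
    import ensurepip
    # Installs the pip wheel bundled with the interpreter into the
    # current environment; upgrade=True replaces an outdated bundled pip.
    ensurepip.bootstrap(upgrade=True)
```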