
[Core] Remove xformers dependency#28287

Closed
ywang96 wants to merge 1 commit into vllm-project:main from ywang96:remove-xformers

Member

@ywang96 ywang96 commented Nov 7, 2025

Purpose

This PR completely removes the dependency on the xformers library and should only be merged after the v0.11.1 release. The rationale for removing xformers is as follows:

  1. xformers is used for multimodal attention (MHA), but alternative attention backends can replace it.
  2. We have an xformers attention backend for decoder LMs, but it's no longer used for anything.
  3. Every additional external dependency adds risk to our releases, a hard lesson we learned while working on the upgrade to PyTorch 2.9.

For 1, the alternatives are:

  • Compile our fork of FA (FlashAttention) for common MHA head sizes so we don't need xformers as a fallback
  • Triton
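
To make the two alternatives concrete, here is a minimal sketch of a fallback chain that never consults xformers. It is not vLLM's actual backend-selection code: the module names `flash_attn` and `triton` as probe targets, the backend labels, and the function names are all assumptions chosen for illustration.

```python
from importlib.util import find_spec

# Hypothetical preference order (illustrative labels, not vLLM's real backend names).
# Each entry is (top-level module to probe, label for the backend it would enable).
CANDIDATES = (
    ("flash_attn", "flash-attention"),  # FA build covering common MHA head sizes
    ("triton", "triton-kernel"),        # Triton-based attention kernel
)


def is_installed(module: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return find_spec(module) is not None


def pick_mha_backend(probe=is_installed) -> str:
    """Return the label of the first available backend in preference order.

    `probe` is injectable so the selection logic can be exercised
    without the optional packages being installed.
    """
    for module, label in CANDIDATES:
        if probe(module):
            return label
    raise RuntimeError("no attention backend available for multimodal MHA")
```

Calling `pick_mha_backend()` with the default probe reports whichever of the two candidates is installed locally. With xformers absent from the chain, a missing FA build falls through to Triton rather than to a third-party package.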

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.


mergify bot commented Nov 7, 2025

Documentation preview: https://vllm--28287.org.readthedocs.build/en/28287/

@mergify mergify bot added the documentation, ci/build, qwen, and v1 labels Nov 7, 2025
Signed-off-by: Roger Wang <hey@rogerw.io>

mergify bot commented Nov 11, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @ywang96.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Member Author

ywang96 commented Nov 12, 2025

Going to create a new PR for this since there are quite a few conflicts.

@ywang96 ywang96 closed this Nov 12, 2025
@github-project-automation github-project-automation bot moved this to Done in NVIDIA Nov 12, 2025
This was referenced Nov 20, 2025

Labels

ci/build, documentation, needs-rebase, nvidia, qwen, v1

Projects

Status: Done


2 participants