[Core] Remove `xformers` dependency by ywang96 · Pull Request #28287 · vllm-project/vllm

ywang96 · 2025-11-07T10:13:18Z

Purpose

This PR completely removes the dependency on the xformers library and should only be merged after the v0.11.1 release. The rationale for removing xformers is:

1. xformers is used for multimodal attention (MHA), but alternative attention backends can replace it.
2. We have an xformers attention backend for decoder LMs, but it is no longer used for anything.
3. Every additional external dependency adds risk to our releases, a hard lesson we learned while working on the PyTorch 2.9 upgrade.

For 1, the alternatives are:

Compile for common head sizes of MHA in our fork of FA so we don't need to use xformers as a fallback
Triton
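
To make the fallback idea concrete, here is a minimal sketch of how a backend could be chosen for MHA once xformers is gone. Everything in it is an assumption for illustration: the `MHABackend` names, the `select_mha_backend` helper, and the placeholder head sizes are hypothetical and are not taken from this PR or from vLLM's code. It also assumes PyTorch SDPA as a last resort, which this PR does not propose.

```python
from enum import Enum


class MHABackend(Enum):
    """Hypothetical backend names, for illustration only."""
    FLASH_ATTN = "flash_attn"
    TRITON = "triton"
    TORCH_SDPA = "torch_sdpa"


# Assumption: placeholder head sizes that the FA fork would be compiled for.
# The real list is not stated in this PR.
ASSUMED_FA_HEAD_SIZES = frozenset({64, 80, 128})


def select_mha_backend(head_size, fa_available=True, triton_available=True,
                       fa_head_sizes=ASSUMED_FA_HEAD_SIZES):
    """Pick an MHA backend without ever falling back to xformers."""
    if head_size <= 0:
        raise ValueError("head_size must be positive")
    # Alternative 1: use FA when it was compiled for this head size.
    if fa_available and head_size in fa_head_sizes:
        return MHABackend.FLASH_ATTN
    # Alternative 2: a Triton kernel covers the remaining head sizes.
    if triton_available:
        return MHABackend.TRITON
    # Assumed last resort: PyTorch's built-in scaled dot product attention.
    return MHABackend.TORCH_SDPA
```

With these placeholder values, `select_mha_backend(128)` picks the FA path, while an uncommon head size such as 72 goes to Triton instead of xformers.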

Test Plan

Test Result

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

mergify · 2025-11-07T10:13:54Z

Documentation preview: https://vllm--28287.org.readthedocs.build/en/28287/

mergify · 2025-11-11T05:14:10Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @ywang96.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

ywang96 · 2025-11-12T09:24:03Z

Going to create a new PR for this since there are quite a few conflicts.

mergify bot added the documentation, ci/build, qwen, and v1 labels Nov 7, 2025

Delete

e25ab7b

Signed-off-by: Roger Wang <hey@rogerw.io>

ywang96 force-pushed the remove-xformers branch from 1dc5055 to e25ab7b Compare November 7, 2025 10:35

zhewenl reviewed Nov 7, 2025

tests/kernels/attention/test_prefix_prefill.py (resolved)

zhewenl mentioned this pull request Nov 11, 2025

[CI/Build] Refactor Attention backend for test_prefix_prefill from xformers to SDPA #28424

Merged

mergify bot added needs-rebase nvidia labels Nov 11, 2025

github-project-automation bot added this to NVIDIA Nov 11, 2025

ywang96 closed this Nov 12, 2025

github-project-automation bot moved this to Done in NVIDIA Nov 12, 2025

This was referenced Nov 20, 2025

[Core] Deprecate xformers #29080

Closed

[Core] Deprecate xformers #29262

Merged

ywang96 wants to merge 1 commit into vllm-project:main from ywang96:remove-xformers
