[Attention] Bump FA for removed method #28429
Conversation
Signed-off-by: Matthew Bonanni <[email protected]>
This pull request has merge conflicts that must be resolved before it can be merged.
Code Review
This pull request updates the flash-attention dependency. However, this change introduces two critical issues. First, the new dependency version removes the flash_attn_with_kvcache function, but this function is still used in the test file tests/kernels/attention/test_flash_attn.py, which will likely cause the tests to fail. The test code needs to be updated to reflect this change in the dependency. Second, the dependency now points to a temporary branch on a personal fork. This is a major risk for build stability and security, and it should be pointed to an official repository and a stable commit/tag before merging into a main branch.
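If the test is to keep passing against the bumped dependency, the simplest change is to drop the module-level import of the removed function and, optionally, assert that it is no longer exported. A minimal sketch, assuming the vllm.vllm_flash_attn package path and the flash_attn_varlen_func entry point based on how existing kernel tests import the bindings (both should be confirmed against the bumped commit):

```python
# Hypothetical guard test (sketch only); the module path is an assumption
# taken from how existing vLLM kernel tests import the flash-attention bindings.
import pytest

# Skip cleanly on builds without the compiled flash-attention extension.
fa = pytest.importorskip("vllm.vllm_flash_attn")


def test_flash_attn_with_kvcache_removed():
    # The bumped dependency drops flash_attn_with_kvcache, so the binding
    # module should no longer export it; flash_attn_varlen_func remains.
    assert not hasattr(fa, "flash_attn_with_kvcache")
    assert hasattr(fa, "flash_attn_varlen_func")
```

Whether the existing kvcache test cases should be deleted outright or ported to flash_attn_varlen_func is a separate decision for the PR author.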
Signed-off-by: Matthew Bonanni <[email protected]>
Signed-off-by: Matthew Bonanni <[email protected]>
Signed-off-by: Matthew Bonanni <[email protected]>
Signed-off-by: Matthew Bonanni <[email protected]>
Wow, it looks like we see a 20% reduction in wheel size (481 MB --> 380 MB).
Signed-off-by: Matthew Bonanni <[email protected]> Co-authored-by: Cyrus Leung <[email protected]> Signed-off-by: George D. Torres <[email protected]>
Signed-off-by: Matthew Bonanni <[email protected]> Co-authored-by: Cyrus Leung <[email protected]> Signed-off-by: Bram Wasti <[email protected]>
Signed-off-by: Matthew Bonanni <[email protected]> Co-authored-by: Cyrus Leung <[email protected]>
Signed-off-by: Matthew Bonanni <[email protected]> Co-authored-by: Cyrus Leung <[email protected]>
Purpose
vLLM-side PR for vllm-project/flash-attention#107. Removes the unused flash_attn_with_kvcache and sets flags for build speedup and size reduction. It looks like we see a 20% reduction in wheel size (481 MB --> 380 MB).
Recent commit on main:
This PR:
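As a quick local check of the wheel-size numbers quoted above ((481 - 380) / 481 is roughly a 21% reduction), the built artifacts can be listed directly; a small sketch, with the dist/ path and wheel name pattern as illustrative assumptions rather than details from this PR's build setup:

```python
# Rough wheel-size check (sketch); the dist/ directory and the wheel glob
# pattern are assumptions, not taken from this PR.
from pathlib import Path

for wheel in sorted(Path("dist").glob("vllm-*.whl")):
    size_mib = wheel.stat().st_size / (1024 * 1024)
    print(f"{wheel.name}: {size_mib:.0f} MiB")
```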
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
(Optional) Documentation update, such as supported_models.md and examples for a new model.