[Attention][MLA] Re-enable FA4 as default MLA prefill backend#38819

Merged
LucasWilkinson merged 2 commits into main from revert-38562-fi_mla_prefill_default on Apr 6, 2026
Conversation

@MatthewBonanni (Collaborator) commented Apr 2, 2026

Reverts #38562

The NaN issue causing correctness problems for MLA models (#36763) has been resolved by updating FA4 (#38690) to pick up the upstream fix Dao-AILab/flash-attention@0293155.

This PR makes FA4 the default again due to its superior performance (see benchmarks in #34732).

@MatthewBonanni MatthewBonanni added the ready ONLY add when PR is ready to merge/full CI is needed label Apr 2, 2026
@MatthewBonanni MatthewBonanni changed the title Re-enable FA4 as default MLA prefill backend [Do Not Merge] Re-enable FA4 as default MLA prefill backend Apr 2, 2026
@gemini-code-assist (Contributor, bot) left a comment

Code Review

This pull request updates the AttentionConfig in vllm/config/attention.py by changing the default value of use_trtllm_ragged_deepseek_prefill from True to False. I have no feedback to provide.
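The flag flip the bot describes can be sketched as a one-line default change. This is a hypothetical reconstruction, not vLLM's actual source: the class name `AttentionConfig` and field name `use_trtllm_ragged_deepseek_prefill` come from the comment above, but the dataclass layout around them is an assumption.

```python
# Sketch (assumed structure) of the change in vllm/config/attention.py:
# the default of use_trtllm_ragged_deepseek_prefill flips from True to False,
# so MLA prefill falls back to FA4 unless the user opts back in.
from dataclasses import dataclass


@dataclass
class AttentionConfig:
    # Before this PR the default was True (TRT-LLM ragged DeepSeek prefill);
    # after this PR it is False, re-enabling FA4 as the default MLA prefill.
    use_trtllm_ragged_deepseek_prefill: bool = False


cfg = AttentionConfig()
print(cfg.use_trtllm_ragged_deepseek_prefill)  # False
```

Users who still want the TRT-LLM ragged prefill path would construct the config with the field set to `True` explicitly.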

@MatthewBonanni MatthewBonanni changed the title [Do Not Merge] Re-enable FA4 as default MLA prefill backend [Attention][MLA] Re-enable FA4 as default MLA prefill backend Apr 2, 2026
@yewentao256 (Member) left a comment


LGTM, thanks for the work!

@LucasWilkinson LucasWilkinson merged commit 9c81f35 into main on Apr 6, 2026
57 checks passed
@LucasWilkinson LucasWilkinson deleted the revert-38562-fi_mla_prefill_default branch April 6, 2026 21:51
askliar pushed a commit to netanel-haber/vllm that referenced this pull request Apr 7, 2026
jacob-lou pushed a commit to jacob-lou/vllm that referenced this pull request Apr 7, 2026
USTCKAY pushed a commit to USTCKAY/vllm that referenced this pull request Apr 7, 2026
@mgoin mgoin added the nvidia label Apr 7, 2026
rishitdholakia13 pushed a commit to rishitdholakia13/vllm that referenced this pull request Apr 7, 2026
…roject#38819)

Signed-off-by: rishitdholakia13 <rishit+github@cohere.com>
puririshi98 pushed a commit to puririshi98/vllm that referenced this pull request Apr 7, 2026
Labels

nvidia · ready (ONLY add when PR is ready to merge / full CI is needed)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants