
[Bugfix][MLA] Change default SM100 MLA prefill backend back to TRT-LLM#38562

Merged
vllm-bot merged 1 commit into vllm-project:main from MatthewBonanni:fi_mla_prefill_default on Mar 30, 2026

Conversation

Collaborator

@MatthewBonanni MatthewBonanni commented Mar 30, 2026

FIX: #36763

Purpose

On SM100, FA4 MLA prefill appears to cause unusable output on Kimi-K2.5. This PR changes the default MLA prefill backend back to TRT-LLM while we resolve the issues with FA4.

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>

@claude claude bot left a comment


Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@mergify mergify bot added the bug Something isn't working label Mar 30, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request updates the default value of use_trtllm_ragged_deepseek_prefill to True in the attention configuration. The reviewer suggests renaming this flag to use_trtllm_mla_prefill to better reflect its general purpose for MLA prefill backends and improve maintainability, as the current name is overly specific to DeepSeek.

@MatthewBonanni changed the title from "[Bugfix][MLA] Change default MLA prefill backend back to TRT-LLM" to "[Bugfix][MLA] Change default SM100 MLA prefill backend back to TRT-LLM" on Mar 30, 2026
Collaborator

@LucasWilkinson LucasWilkinson left a comment


Thank you for the quick fix!

@LucasWilkinson LucasWilkinson added the ready ONLY add when PR is ready to merge/full CI is needed label Mar 30, 2026
@LucasWilkinson LucasWilkinson added this to the v0.18.0 cherry picks milestone Mar 30, 2026
"""Whether to use cudnn prefill."""

use_trtllm_ragged_deepseek_prefill: bool = False
use_trtllm_ragged_deepseek_prefill: bool = True
Member


Where do we control FA4 MLA prefill? I don't see a similar entry for it

Collaborator Author


It falls through to FA4 when TRT-LLM isn't enabled. It's a messy interface; #32623 will clean this up.
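The fallthrough behavior described above can be sketched as follows. This is a hypothetical illustration, not vLLM's actual interface: the class name, function name, and backend strings are illustrative, with only the `use_trtllm_ragged_deepseek_prefill` flag taken from the diff in this PR.

```python
from dataclasses import dataclass


@dataclass
class AttentionConfig:
    # This PR flips the default to True so TRT-LLM is preferred on SM100.
    use_trtllm_ragged_deepseek_prefill: bool = True


def select_mla_prefill_backend(config: AttentionConfig) -> str:
    """Pick the MLA prefill backend.

    There is no dedicated flag for FA4: if the TRT-LLM ragged prefill
    flag is unset, selection falls through to FA4.
    """
    if config.use_trtllm_ragged_deepseek_prefill:
        return "TRTLLM_RAGGED"
    return "FA4"
```

With the new default, `select_mla_prefill_backend(AttentionConfig())` picks TRT-LLM; users must explicitly disable the flag to fall through to FA4.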

Collaborator Author


Relevant code block is here:

@vllm-bot vllm-bot merged commit 2c734ed into vllm-project:main Mar 30, 2026
41 of 55 checks passed
@MatthewBonanni MatthewBonanni deleted the fi_mla_prefill_default branch March 30, 2026 18:05
khluu pushed a commit that referenced this pull request Mar 30, 2026

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
(cherry picked from commit 2c734ed)

Commits referencing this pull request were also pushed to the following forks:
  • benenzhu/vllm, Mar 31, 2026 (Signed-off-by: zhutaoyu <zhutaoyu97@gmail.com>)
  • vllm-agent/vllm, Mar 31, 2026
  • neweyes/vllm, Mar 31, 2026 (Signed-off-by: neweyes <328719365@qq.com>)
  • EricccYang/vllm, Apr 1, 2026 (Signed-off-by: EricccYang <yangyang4991@gmail.com>)
  • Bharatgen-Tech/vllm (pushed by bhargav-patel-29), Apr 1, 2026 (Signed-off-by: bhargav-patel-29 <bhargav.patel@tihiitb.org>)
  • yzong-rh/vllm, Apr 3, 2026
  • liuchenbing2026/vllm, Apr 4, 2026
  • rishitdholakia13/vllm, Apr 7, 2026 (Signed-off-by: rishitdholakia13 <rishit+github@cohere.com>)
  • puririshi98/vllm, Apr 7, 2026 (Signed-off-by: Rishi Puri <riship@nvidia.com>)
  • EmbeddedLLM/vllm (pushed by big-yellow-duck), Apr 8, 2026
  • blackfuel-ai/vllm (pushed by mtparet), Apr 9, 2026

Labels

bug Something isn't working ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Kimi-K2.5 outputs only '!!!!!!!!!!' in reasoning field, content is always null

5 participants