[Bugfix][MLA] Change default SM100 MLA prefill backend back to TRT-LLM #38562
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Code Review
This pull request updates the default value of use_trtllm_ragged_deepseek_prefill to True in the attention configuration. The reviewer suggests renaming this flag to use_trtllm_mla_prefill to better reflect its general purpose for MLA prefill backends and improve maintainability, as the current name is overly specific to DeepSeek.
LucasWilkinson left a comment
Thank you for the quick fix!
      """Whether to use cudnn prefill."""

    - use_trtllm_ragged_deepseek_prefill: bool = False
    + use_trtllm_ragged_deepseek_prefill: bool = True
Where do we control FA4 MLA prefill? I don't see a similar entry for it
It falls through to FA4 when TRT-LLM isn't enabled. It's a messy interface; #32623 will clean this up.
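For readers following along, here is a rough sketch of the fall-through behavior described in this reply. It is not the actual vLLM selection code: `AttentionConfigSketch` and `pick_mla_prefill_backend` are hypothetical names made up for illustration, and only the flag name `use_trtllm_ragged_deepseek_prefill` and its new default come from the diff in this PR.

```python
from dataclasses import dataclass


@dataclass
class AttentionConfigSketch:
    """Hypothetical stand-in for the attention config; not vLLM's real class."""

    # Flag name taken from the diff in this PR; True is the default after the change.
    use_trtllm_ragged_deepseek_prefill: bool = True


def pick_mla_prefill_backend(cfg: AttentionConfigSketch) -> str:
    """Illustrative only: return the name of the prefill backend that would be used.

    Assumption: there is no dedicated FA4 flag, so FA4 is simply what is left
    when the TRT-LLM flag is off.
    """
    if cfg.use_trtllm_ragged_deepseek_prefill:
        return "TRT-LLM"
    # No explicit FA4 switch: the selection falls through to FA4.
    return "FA4"


if __name__ == "__main__":
    print(pick_mla_prefill_backend(AttentionConfigSketch()))
    print(pick_mla_prefill_backend(
        AttentionConfigSketch(use_trtllm_ragged_deepseek_prefill=False)
    ))
```

Under this reading, flipping the default to `True` is enough to steer SM100 away from FA4 without adding a separate FA4 switch, which matches why the diff is a one-line change.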
Relevant code block is here:
vllm-project#38562) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: zhutaoyu <zhutaoyu97@gmail.com>
…o TRT-LLM (vllm-project#38562)" This reverts commit 2c734ed.
vllm-project#38562) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: neweyes <328719365@qq.com>
vllm-project#38562) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: EricccYang <yangyang4991@gmail.com>
vllm-project#38562) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: bhargav-patel-29 <bhargav.patel@tihiitb.org>
vllm-project#38562) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
vllm-project#38562) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: rishitdholakia13 <rishit+github@cohere.com>
vllm-project#38562) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: Rishi Puri <riship@nvidia.com>
FIX: #36763
Purpose
On SM100, FA4 MLA prefill appears to produce unusable output on Kimi-K2.5. This PR changes the default MLA prefill backend back to TRT-LLM while we resolve the issues with FA4.
Test Plan
Test Result