
[V0.9.1][BugFix] Address PrefillCacheHit state to fix prefix cache accuracy bug #1492

Merged
wangxiyuan merged 1 commit into vllm-project:v0.9.1-dev from whx-sjtu:fix_prefix_cache_accu_091
Jun 28, 2025

Conversation

@whx-sjtu (Collaborator) commented Jun 28, 2025

When AscendScheduler is used with prefix caching enabled and chunked prefill disabled, there is an accuracy problem because mla_v1 has no branch that handles this scenario (the PrefillCacheHit state). This PR adds the missing handling.
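To illustrate the kind of gap described above, here is a minimal sketch of a dispatcher that selects an attention path from a state enum. It does not reproduce the actual mla_v1 code: `AttnState`, `select_attention_path`, and the returned path names are hypothetical, chosen only to show why a state without its own branch is a problem and how an explicit branch plus a loud failure for unknown states avoids silent misrouting.

```python
from enum import Enum, auto


class AttnState(Enum):
    """Hypothetical stand-in for the backend's attention-state enum."""
    PREFILL_NO_CACHE = auto()   # prefill, no cached prefix
    PREFILL_CACHE_HIT = auto()  # prefill, part of the prompt is already cached
    DECODE_ONLY = auto()        # decode step


def select_attention_path(state: AttnState) -> str:
    """Return the name of the (hypothetical) code path for a given state."""
    if state is AttnState.PREFILL_NO_CACHE:
        return "prefill_over_new_tokens"
    if state is AttnState.PREFILL_CACHE_HIT:
        # Assumed behavior: this path must also attend to the cached prefix.
        return "prefill_with_cached_prefix"
    if state is AttnState.DECODE_ONLY:
        return "decode"
    # Fail loudly rather than fall through to a path that ignores the state.
    raise NotImplementedError(f"no attention path for {state!r}")


print(select_attention_path(AttnState.PREFILL_CACHE_HIT))
```

Raising on an unhandled state is a design choice of this sketch, not something the PR is stated to do; it turns a silent accuracy problem into an immediate error.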

Backport: #1498

Signed-off-by: whx-sjtu <2952154980@qq.com>

@MengqingCao (Collaborator) commented

Thanks for this fix. Please also backport it to main.

@wangxiyuan merged commit 9acc082 into vllm-project:v0.9.1-dev Jun 28, 2025
15 checks passed
@whx-sjtu deleted the fix_prefix_cache_accu_091 branch June 28, 2025 06:53
@Yikun changed the title from "[BugFix] Fix accuray bug of prefix-caching." to "[V0.9.1][BugFix] Address PrefillCacheHit state to fix prefix cache accuracy bug" Jun 29, 2025
@Yikun added the no-main label Jul 14, 2025