[0.9.1][bugfix] fix deepseek memory bug by zzzzwwjj · Pull Request #1551 · vllm-project/vllm-ascend

zzzzwwjj · 2025-07-01T07:17:58Z

What this PR does / why we need it?

Fix an OOM error that occurs when `chunked_prefill_for_mla` is enabled and the input is long.

Does this PR introduce any user-facing change?

How was this patch tested?

Signed-off-by: zzzzwwjj <1183291235@qq.com>

ApsarasX · 2025-07-01T11:17:28Z

Can a compressed mask be used? If the sequence is too long, it might cause memory waste here.

Refer to #1100

ringmla's mask size is equal to the chunk size, so it won't cause memory waste.
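To put rough numbers on the mask-size point, here is a minimal sketch that assumes a dense square mask with one byte per element. The helper name, the element size, and the example lengths are illustrative assumptions, not values taken from vllm-ascend.

```python
def square_mask_bytes(side_len: int, bytes_per_elem: int = 1) -> int:
    """Bytes needed for a dense side_len x side_len attention mask."""
    return side_len * side_len * bytes_per_elem


# Hypothetical sizes, chosen only for illustration.
seq_len = 131072   # a long input (128K tokens)
chunk_size = 512   # a prefill chunk size

full_mask = square_mask_bytes(seq_len)      # mask sized by sequence length
chunk_mask = square_mask_bytes(chunk_size)  # mask sized by chunk size

print(f"mask sized by sequence length: {full_mask / 2**30:.1f} GiB")
print(f"mask sized by chunk size:      {chunk_mask / 2**10:.1f} KiB")
```

Under these assumptions, a mask sized by sequence length grows quadratically with the input, while a mask sized by chunk size stays constant regardless of input length.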

ganyi1996ppo · 2025-07-02T06:26:51Z

Looks good, please cherry-pick this change back to main

jianzs · 2025-07-29T11:41:39Z

Should we consider which scheduler is being used?

`self.use_ring_mla = ascend_config.chunked_prefill_for_mla or not ascend_config.ascend_scheduler_config.enabled`
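The effect of the suggested expression can be checked with a small truth table. This is only a sketch: the two dataclasses are hypothetical stand-ins for the real config objects, and only the field names that appear in the expression above are taken from the thread.

```python
from dataclasses import dataclass, field


@dataclass
class SchedulerConfig:
    """Hypothetical stand-in for ascend_config.ascend_scheduler_config."""
    enabled: bool = False


@dataclass
class AscendConfig:
    """Hypothetical stand-in for ascend_config."""
    chunked_prefill_for_mla: bool = False
    ascend_scheduler_config: SchedulerConfig = field(default_factory=SchedulerConfig)


def use_ring_mla(cfg: AscendConfig) -> bool:
    # Same boolean expression as the one proposed in the review comment.
    return cfg.chunked_prefill_for_mla or not cfg.ascend_scheduler_config.enabled


# Enumerate all four flag combinations.
results = {}
for chunked in (False, True):
    for scheduler_enabled in (False, True):
        cfg = AscendConfig(chunked, SchedulerConfig(scheduler_enabled))
        results[(chunked, scheduler_enabled)] = use_ring_mla(cfg)
        print(f"chunked_prefill_for_mla={chunked!s:5} "
              f"scheduler.enabled={scheduler_enabled!s:5} "
              f"-> use_ring_mla={results[(chunked, scheduler_enabled)]}")
```

With this expression, the flag is false only when `chunked_prefill_for_mla` is off and the Ascend scheduler is enabled.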

Fix an OOM error that occurs when `chunked_prefill_for_mla` is enabled and the input is long.
Signed-off-by: zzzzwwjj <1183291235@qq.com>
Signed-off-by: Jade Zheng <zheng.shoujian@outlook.com>

[bugfix] fix deepseek memory bug

4c87236

Signed-off-by: zzzzwwjj <1183291235@qq.com>

github-actions Bot added the module:ops label Jul 1, 2025

ApsarasX reviewed Jul 1, 2025

wangxiyuan changed the title ~~[bugfix] fix deepseek memory bug~~ [0.9.1][bugfix] fix deepseek memory bug Jul 2, 2025

ganyi1996ppo approved these changes Jul 2, 2025

ganyi1996ppo merged commit 129a472 into vllm-project:v0.9.1-dev Jul 2, 2025
15 checks passed

Yikun added the no-main label Jul 14, 2025

jianzs mentioned this pull request Jul 21, 2025

[Bugfix]: Correct handling of cos_sin_cache length #1900

Closed

jianzs reviewed Jul 29, 2025

jianzs pushed a commit to jianzs/vllm-ascend that referenced this pull request Jul 29, 2025

[0.9.1][bugfix] fix deepseek memory bug (vllm-project#1551)

20683f4

Fix an OOM error that occurs when `chunked_prefill_for_mla` is enabled and the input is long.
Signed-off-by: zzzzwwjj <1183291235@qq.com>

wangxiyuan mentioned this pull request Sep 4, 2025

[bugfix] fix deepseek rope sincoscache re-generation #2744

Merged

ganyi1996ppo merged 1 commit into vllm-project:v0.9.1-dev from zzzzwwjj:v0.9.1-dev
