[Bugfix][DP] Add with_prefill_across_dp to AscendMetadata to fix dp#1094
Merged
wangxiyuan merged 1 commit intovllm-project:mainfrom Jun 6, 2025
Merged
[Bugfix][DP] Add with_prefill_across_dp to AscendMetadata to fix dp#1094wangxiyuan merged 1 commit intovllm-project:mainfrom
wangxiyuan merged 1 commit intovllm-project:mainfrom
Conversation
Signed-off-by: MengqingCao <cmq0113@163.com>
wangxiyuan
approved these changes
Jun 6, 2025
ApsarasX
approved these changes
Jun 6, 2025
76 tasks
Yikun
pushed a commit
that referenced
this pull request
Jun 26, 2025
### What this PR does / why we need it? After #1094, decode might be executed with non-compiled mode, despite of `torchair_graph_config.enabled`, causing multistream mla to fail, which assumes torchair compiled mode for decode when `torchair_graph_config.enabled == True`. Augment that assumption to fix this. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Tested both offline, and by graph mode mla e2e testcase. --------- Signed-off-by: sdmyzlp <lrwei2@petalmail.com>
weijinqian0
pushed a commit
to weijinqian0/vllm-ascend
that referenced
this pull request
Jun 30, 2025
### What this PR does / why we need it? After vllm-project#1094, decode might be executed with non-compiled mode, despite of `torchair_graph_config.enabled`, causing multistream mla to fail, which assumes torchair compiled mode for decode when `torchair_graph_config.enabled == True`. Augment that assumption to fix this. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Tested both offline, and by graph mode mla e2e testcase. --------- Signed-off-by: sdmyzlp <lrwei2@petalmail.com>
chopper0126
pushed a commit
to chopper0126/vllm-ascend
that referenced
this pull request
Oct 16, 2025
…llm-project#1094) ### What this PR does / why we need it? Add `with_prefill_across_dp` to AscendMetadata to fix dp This pr fixes the bug introduced by vllm-project#1012, which add an arg `with_prefill_across_dp` when dp_size > 1. Signed-off-by: MengqingCao <cmq0113@163.com>
chopper0126
pushed a commit
to chopper0126/vllm-ascend
that referenced
this pull request
Oct 16, 2025
### What this PR does / why we need it? After vllm-project#1094, decode might be executed with non-compiled mode, despite of `torchair_graph_config.enabled`, causing multistream mla to fail, which assumes torchair compiled mode for decode when `torchair_graph_config.enabled == True`. Augment that assumption to fix this. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Tested both offline, and by graph mode mla e2e testcase. --------- Signed-off-by: sdmyzlp <lrwei2@petalmail.com>
Angazenn
pushed a commit
to Angazenn/vllm-ascend
that referenced
this pull request
Oct 21, 2025
…llm-project#1094) ### What this PR does / why we need it? Add `with_prefill_across_dp` to AscendMetadata to fix dp This pr fixes the bug introduced by vllm-project#1012, which add an arg `with_prefill_across_dp` when dp_size > 1. Signed-off-by: MengqingCao <cmq0113@163.com>
Angazenn
pushed a commit
to Angazenn/vllm-ascend
that referenced
this pull request
Oct 21, 2025
### What this PR does / why we need it? After vllm-project#1094, decode might be executed with non-compiled mode, despite of `torchair_graph_config.enabled`, causing multistream mla to fail, which assumes torchair compiled mode for decode when `torchair_graph_config.enabled == True`. Augment that assumption to fix this. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Tested both offline, and by graph mode mla e2e testcase. --------- Signed-off-by: sdmyzlp <lrwei2@petalmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What this PR does / why we need it?
Add
with_prefill_across_dpto AscendMetadata to fix dpThis pr fixes the bug introduced by #1012, which add an arg
with_prefill_across_dpwhen dp_size > 1.