
[Misc][v0.13.0] Removes unnecessary graph size re-initialization #6281

Merged
wangxiyuan merged 2 commits into vllm-project:releases/v0.13.0 from Angazenn:bugfix_dev
Jan 27, 2026
Conversation

@Angazenn
Collaborator

@Angazenn Angazenn commented Jan 26, 2026

What this PR does / why we need it?

Cherry-pick from #6280.
This PR removes `update_default_aclgraph_sizes`. In earlier versions, we added this function to override the default `cudagraph_capture_sizes` because `_npu_paged_attention` degraded significantly on certain shapes that are included in vLLM's default `cudagraph_capture_sizes`. Now that FIA is the default attention op (which does not show this degradation), the override is no longer needed. Worse, it can conflict with a user-specified `cudagraph_capture_sizes` containing values < 20.
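For context, the removed helper amounted to rewriting the capture-size list before graph capture. The sketch below is a hypothetical, simplified reconstruction of that behavior; the function name is taken from the PR, but the size-20 floor and the list-filtering logic are illustrative assumptions, not the actual code in `vllm_ascend/utils.py` (which also special-cased Qwen3-MoE dp settings):

```python
# Illustrative sketch only: the real update_default_aclgraph_sizes in
# vllm_ascend/utils.py was more involved; the floor of 20 here stands in
# for the shapes on which _npu_paged_attention degraded.
DEFAULT_CAPTURE_SIZES = [1, 2, 4, 8, 16, 24, 32, 40, 48, 56, 64]

def update_default_aclgraph_sizes(capture_sizes, min_size=20):
    """Drop capture sizes below min_size, where the old paged-attention
    kernel was slow on Ascend hardware."""
    return [s for s in capture_sizes if s >= min_size]

# Applied to the defaults, the override simply trims the small sizes:
print(update_default_aclgraph_sizes(DEFAULT_CAPTURE_SIZES))
# But a user who deliberately configures only small sizes (< 20) ends up
# with an empty list -- the conflict the PR description mentions:
print(update_default_aclgraph_sizes([1, 2, 4, 8, 16]))
```

With FIA as the default attention op the small sizes no longer need filtering, so deleting the override removes this failure mode entirely.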

Does this PR introduce any user-facing change?

How was this patch tested?

Signed-off-by: Angazenn <supperccell@163.com>
@Angazenn Angazenn added the ready (ready for review) label Jan 26, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

The pull request removes the update_default_aclgraph_sizes function and its associated test patch. While the PR title suggests this re-initialization is unnecessary, the original comments within the removed code indicated that this function was crucial for optimizing ACL graph capture sizes for Ascend hardware and specifically addressed performance degradation for Qwen3-MoE models. If these underlying issues have been resolved through other mechanisms, it would be beneficial to explicitly state this to ensure that performance and correctness are not negatively impacted by this removal.


vllm_ascend/platform.py (219-222)

critical

The removal of update_default_aclgraph_sizes(vllm_config) is concerning given the original comment stating its purpose was to "improve default performance" and address cases where "default cudagraph_capture_sizes are not friendly to ascend ops && hardwares." Without this update, there's a high risk of performance degradation or incorrect behavior if the default sizes are indeed suboptimal for Ascend. Please confirm that the issues previously addressed by this function are now resolved by other means, or that the function was indeed redundant.

vllm_ascend/utils.py (432-485)

critical

The complete removal of update_default_aclgraph_sizes and _is_default_capture_sizes is a critical change. The update_default_aclgraph_sizes function explicitly aimed to make ACL graph sizes "more friendly to ascend ops && hardware" and specifically handled performance issues for "Qwen3-MoE models on dp settings." If the problems this function was designed to solve still exist, its removal could lead to severe performance regressions or functional errors. It's crucial to confirm that the issues previously addressed by this function are now resolved by other means, or that the function was indeed redundant.

@Angazenn Angazenn added the ready-for-test (start test by label for PR) label Jan 26, 2026
Signed-off-by: Angazenn <supperccell@163.com>
@wangxiyuan wangxiyuan merged commit 744d1e6 into vllm-project:releases/v0.13.0 Jan 27, 2026
12 checks passed
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
3 participants