Revert "[BugFix][Fusion] Fix graph fusion failure problem (#5253)" by Fager10086 · Pull Request #5667 · vllm-project/vllm-ascend

Fager10086 · 2026-01-06T13:44:40Z

This reverts commit e7b623b.

What this PR does / why we need it?

Revert PR #5253 to fix the smoke test failure.

Does this PR introduce any user-facing change?

Does not.

How was this patch tested?

Tested against the failing case.

vLLM version: v0.13.0
vLLM main: vllm-project/vllm@2f4e654

…ct#5253)" This reverts commit e7b623b. Signed-off-by: Rifa <865071616@qq.com>

gemini-code-assist

Code Review

This pull request reverts a previous commit, primarily addressing a bug related to graph fusion failure. The changes involve modifying how compilation ranges and runtime shapes are handled within the Ascend compilation interface and graph fusion pass manager. Additionally, the test suite for the worker has been updated to reflect the new expected call counts for model warmup. A new patch file patch_compile_backend.py has been introduced to recover dynamic shape compilation behavior in the piecewise backend.

I am having trouble creating individual review comments.

vllm_ascend/compilation/graph_fusion_pass_manager.py (42)

The removal of graph.recompile() could lead to issues where graph modifications made by the fusion passes are not properly applied or reflected in the GraphModule's executable code. If the pattern_match_passes.apply(graph) method modifies the underlying fx.Graph object, graph.recompile() is typically necessary to update the GraphModule's internal state (e.g., _code and _wrapped_fns). Without this, subsequent execution of the GraphModule might not use the optimized graph, potentially causing correctness issues or negating the performance benefits of the fusion passes.
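The stale-code concern above can be reproduced outside vllm-ascend with plain torch.fx. The following is a minimal sketch, not the project's fusion pass: `AddTwo` and `swap_add_for_mul` are hypothetical names invented for illustration, and it assumes the `graph` object in the reviewed file is a `torch.fx.GraphModule` (the class that provides `recompile()`).

```python
import operator

import torch
import torch.fx as fx


class AddTwo(torch.nn.Module):
    """Tiny module whose traced graph contains a single add node."""

    def forward(self, x):
        return x + 2


def swap_add_for_mul(gm: fx.GraphModule) -> None:
    """Hypothetical stand-in for a fusion pass: rewrite add nodes into mul nodes."""
    for node in gm.graph.nodes:
        if node.op == "call_function" and node.target is operator.add:
            node.target = operator.mul


gm = fx.symbolic_trace(AddTwo())
x = torch.tensor([2.0, 3.0])

# Edit the fx.Graph in place, as a pattern-matching pass would.
swap_add_for_mul(gm)

# forward() was generated before the edit, so this call still adds.
stale = gm(x)

# Regenerate forward() from the edited graph.
gm.recompile()

# Now the call runs the rewritten graph and multiplies.
fresh = gm(x)
```

In this sketch `stale` and `fresh` differ even though both calls happen after the graph was edited, which is the behavior the review comment warns about. Whether the reverted code path is affected depends on whether another component recompiles the module later, which this example does not show.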

vllm_ascend/worker/worker.py (383-392)

The removed logic was responsible for ensuring that all defined compile_ranges were covered by the warmup process, either explicitly by warmup_sizes or cudagraph_capture_sizes, or by adding the end of the range. Without this mechanism, it's possible that certain batch sizes within the compile_ranges might not be warmed up, leading to potential performance degradation or unexpected behavior when the model is run with those specific input sizes. It's crucial to ensure comprehensive warmup across all expected operational ranges.
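The coverage rule described in this comment can be sketched in a few lines. This is an illustration under stated assumptions, not the code removed from worker.py: `cover_compile_ranges` is a hypothetical helper, ranges are assumed to be inclusive `(start, end)` pairs, and the argument names only mirror the terms used in the comment.

```python
from typing import Iterable, List, Tuple


def cover_compile_ranges(
    compile_ranges: Iterable[Tuple[int, int]],
    warmup_sizes: Iterable[int],
    capture_sizes: Iterable[int],
) -> List[int]:
    """Return warmup sizes extended so every compile range is exercised.

    A range counts as covered when a warmup size or a graph capture size
    falls inside it (bounds assumed inclusive). Otherwise the end of the
    range is added as an extra warmup size.
    """
    sizes = set(warmup_sizes)
    known = sizes | set(capture_sizes)
    for start, end in compile_ranges:
        if not any(start <= size <= end for size in known):
            sizes.add(end)
            known.add(end)
    return sorted(sizes)


# The first range is covered by warmup size 4 and the second by capture
# size 16; nothing falls in the third, so its end is added.
extended = cover_compile_ranges([(1, 8), (9, 64), (65, 512)], [4], [16, 32])
```

With the revert applied, no equivalent step runs, so a range with no warmup or capture size inside it would first be compiled when a request of that size arrives.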

github-actions · 2026-01-06T14:41:28Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
Write the commit message by filling in the PR description to help reviewers and future developers understand the change.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

…ct#5253)" (vllm-project#5667) ### What this PR does / why we need it? Revert PR 5253 to fix the smoking problem ### Does this PR introduce _any_ user-facing change? Does not. ### How was this patch tested? It was tested in the failure case. Signed-off-by: Rifa <865071616@qq.com>

…ct#5253)" (vllm-project#5667) ### What this PR does / why we need it? Revert PR 5253 to fix the smoking problem ### Does this PR introduce _any_ user-facing change? Does not. ### How was this patch tested? It was tested in the failure case. Signed-off-by: Rifa <865071616@qq.com> Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>


Revert "[BugFix][Fusion] Fix graph fusion failure problem (vllm-proje…

aaf9118


gemini-code-assist bot reviewed Jan 6, 2026

wangxiyuan merged commit 77a0299 into vllm-project:main Jan 6, 2026
11 checks passed

github-actions bot added the module:tests label Jan 6, 2026

wangxiyuan merged 1 commit into vllm-project:main from Fager10086:main
