
Revert "[BugFix][Fusion] Fix graph fusion failure problem (#5253)" #5667

Merged: wangxiyuan merged 1 commit into vllm-project:main from Fager10086:main on Jan 6, 2026

Conversation

Fager10086 (Contributor) commented Jan 6, 2026

This reverts commit e7b623b.

What this PR does / why we need it?

Revert PR #5253 to fix the smoke test failure.

Does this PR introduce any user-facing change?

Does not.

How was this patch tested?

It was tested in the failure case.

…ct#5253)"

This reverts commit e7b623b.

Signed-off-by: Rifa <865071616@qq.com>
gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request reverts a previous commit, primarily addressing a bug related to graph fusion failure. The changes involve modifying how compilation ranges and runtime shapes are handled within the Ascend compilation interface and graph fusion pass manager. Additionally, the test suite for the worker has been updated to reflect the new expected call counts for model warmup. A new patch file patch_compile_backend.py has been introduced to recover dynamic shape compilation behavior in the piecewise backend.

I am having trouble creating individual review comments, so my feedback is listed here.

vllm_ascend/compilation/graph_fusion_pass_manager.py (42)

critical

The removal of graph.recompile() could lead to issues where graph modifications made by the fusion passes are not properly applied or reflected in the GraphModule's executable code. If the pattern_match_passes.apply(graph) method modifies the underlying fx.Graph object, graph.recompile() is typically necessary to update the GraphModule's internal state (e.g., _code and _wrapped_fns). Without this, subsequent execution of the GraphModule might not use the optimized graph, potentially causing correctness issues or negating the performance benefits of the fusion passes.
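The concern above can be illustrated with a toy model. This is a minimal sketch, not the vllm-ascend or `torch.fx` code: `TinyGraphModule` and `fuse_add_add` are hypothetical names invented for this example. The sketch only mirrors the general mechanism the comment describes, where executable code is generated from a graph and must be regenerated after the graph is edited.

```python
class TinyGraphModule:
    """Toy stand-in for a graph module: a list of ops plus code generated from it."""

    def __init__(self, ops):
        # ops is a list of (operator, constant) pairs applied to the input in order
        self.ops = list(ops)
        self.recompile()

    def recompile(self):
        # Regenerate Python source from the current op list and rebuild forward()
        lines = ["def forward(x):"]
        for op, const in self.ops:
            lines.append(f"    x = x {op} {const!r}")
        lines.append("    return x")
        self.code = "\n".join(lines)
        namespace = {}
        exec(self.code, namespace)
        self.forward = namespace["forward"]


def fuse_add_add(gm):
    """Hypothetical fusion pass: merge consecutive additions into one."""
    fused = []
    for op, const in gm.ops:
        if op == "+" and fused and fused[-1][0] == "+":
            fused[-1] = ("+", fused[-1][1] + const)
        else:
            fused.append((op, const))
    gm.ops = fused


gm = TinyGraphModule([("+", 1), ("+", 2), ("*", 3)])
fuse_add_add(gm)        # the op list is now fused
stale_code = gm.code    # generated code still contains both additions
gm.recompile()
fresh_code = gm.code    # generated code now contains the single fused addition
```

In this toy model the stale and fresh code compute the same result, so skipping the recompile step costs the benefit of the fusion rather than correctness. Whether that also holds for the real fusion passes depends on what they change, which this sketch does not attempt to model.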

vllm_ascend/worker/worker.py (383-392)

high

The removed logic was responsible for ensuring that all defined compile_ranges were covered by the warmup process, either explicitly by warmup_sizes or cudagraph_capture_sizes, or by adding the end of the range. Without this mechanism, it's possible that certain batch sizes within the compile_ranges might not be warmed up, leading to potential performance degradation or unexpected behavior when the model is run with those specific input sizes. It's crucial to ensure comprehensive warmup across all expected operational ranges.
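The coverage rule described above can be sketched as follows. This is an illustration under stated assumptions, not the code removed from `worker.py`: the function name `sizes_to_warm_up` is hypothetical, and the sketch assumes each compile range is an inclusive `(start, end)` pair.

```python
def sizes_to_warm_up(compile_ranges, warmup_sizes, capture_sizes):
    """Return warmup sizes extended so that every compile range is covered.

    Assumption: a range counts as covered when at least one warmup size or
    capture size falls inside it; otherwise the end of the range is added.
    """
    covered = set(warmup_sizes) | set(capture_sizes)
    extra = []
    for start, end in compile_ranges:
        if not any(start <= size <= end for size in covered):
            extra.append(end)
    return sorted(set(warmup_sizes) | set(extra))


# Example values chosen for illustration only
result = sizes_to_warm_up(
    compile_ranges=[(1, 8), (9, 64), (65, 512)],
    warmup_sizes=[4],
    capture_sizes=[16, 32],
)
```

With these example inputs, the first two ranges are already covered by sizes 4 and 16, so only the end of the third range is added to the warmup list.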

wangxiyuan merged commit 77a0299 into vllm-project:main on Jan 6, 2026 (11 checks passed)
github-actions bot (Contributor) commented Jan 6, 2026

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling in the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.

Rozwel-dx pushed a commit to Rozwel-dx/vllm-ascend that referenced this pull request Jan 8, 2026
…ct#5253)" (vllm-project#5667)

aipaes pushed a commit to aipaes/vllm-ascend that referenced this pull request Jan 15, 2026
…ct#5253)" (vllm-project#5667)

ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Feb 28, 2026
…ct#5253)" (vllm-project#5667)

Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request Mar 2, 2026
…ct#5253)" (vllm-project#5667)

ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Mar 4, 2026
…ct#5253)" (vllm-project#5667)

Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Mar 7, 2026
…ct#5253)" (vllm-project#5667)
