Revert "[BugFix][Fusion] Fix graph fusion failure problem (#5253)"#5667
Revert "[BugFix][Fusion] Fix graph fusion failure problem (#5253)"#5667wangxiyuan merged 1 commit intovllm-project:mainfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request reverts a previous commit, primarily addressing a bug related to graph fusion failure. The changes involve modifying how compilation ranges and runtime shapes are handled within the Ascend compilation interface and graph fusion pass manager. Additionally, the test suite for the worker has been updated to reflect the new expected call counts for model warmup. A new patch file patch_compile_backend.py has been introduced to recover dynamic shape compilation behavior in the piecewise backend.
I am having trouble creating individual review comments. Click here to see my feedback.
vllm_ascend/compilation/graph_fusion_pass_manager.py (42)
The removal of graph.recompile() could lead to issues where graph modifications made by the fusion passes are not properly applied or reflected in the GraphModule's executable code. If the pattern_match_passes.apply(graph) method modifies the underlying fx.Graph object, graph.recompile() is typically necessary to update the GraphModule's internal state (e.g., _code and _wrapped_fns). Without this, subsequent execution of the GraphModule might not use the optimized graph, potentially causing correctness issues or negating the performance benefits of the fusion passes.
vllm_ascend/worker/worker.py (383-392)
The removed logic was responsible for ensuring that all defined compile_ranges were covered by the warmup process, either explicitly by warmup_sizes or cudagraph_capture_sizes, or by adding the end of the range. Without this mechanism, it's possible that certain batch sizes within the compile_ranges might not be warmed up, leading to potential performance degradation or unexpected behavior when the model is run with those specific input sizes. It's crucial to ensure comprehensive warmup across all expected operational ranges.
|
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according Contributing and Testing. |
…ct#5253)" (vllm-project#5667) ### What this PR does / why we need it? Revert PR 5253 to fix the smoking problem ### Does this PR introduce _any_ user-facing change? Does not. ### How was this patch tested? It was tested in the failure case. Signed-off-by: Rifa <865071616@qq.com>
…ct#5253)" (vllm-project#5667) ### What this PR does / why we need it? Revert PR 5253 to fix the smoking problem ### Does this PR introduce _any_ user-facing change? Does not. ### How was this patch tested? It was tested in the failure case. Signed-off-by: Rifa <865071616@qq.com>
…ct#5253)" (vllm-project#5667) ### What this PR does / why we need it? Revert PR 5253 to fix the smoking problem ### Does this PR introduce _any_ user-facing change? Does not. ### How was this patch tested? It was tested in the failure case. Signed-off-by: Rifa <865071616@qq.com> Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
…ct#5253)" (vllm-project#5667) ### What this PR does / why we need it? Revert PR 5253 to fix the smoking problem ### Does this PR introduce _any_ user-facing change? Does not. ### How was this patch tested? It was tested in the failure case. Signed-off-by: Rifa <865071616@qq.com>
…ct#5253)" (vllm-project#5667) ### What this PR does / why we need it? Revert PR 5253 to fix the smoking problem ### Does this PR introduce _any_ user-facing change? Does not. ### How was this patch tested? It was tested in the failure case. Signed-off-by: Rifa <865071616@qq.com> Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
…ct#5253)" (vllm-project#5667) ### What this PR does / why we need it? Revert PR 5253 to fix the smoking problem ### Does this PR introduce _any_ user-facing change? Does not. ### How was this patch tested? It was tested in the failure case. Signed-off-by: Rifa <865071616@qq.com>
This reverts commit e7b623b.
What this PR does / why we need it?
Revert PR 5253 to fix the smoking problem
Does this PR introduce any user-facing change?
Does not.
How was this patch tested?
It was tested in the failure case.