[v0.13.0][Bugfix] Fix XliteModelRunner init failed when aclgraph is enabled#5887
Conversation
There was a problem hiding this comment.
Code Review
This pull request addresses a bug in XliteModelRunner that led to an initialization failure when aclgraph is enabled. The problem was that the inherited capture_model method was incorrectly identifying a module for patching, which is essential for Ascend NPU graph capturing. The fix correctly overrides capture_model in XliteModelRunner to locate the appropriate module. The change is correct and necessary. I have included one suggestion to enhance the implementation's robustness and maintainability by avoiding string-based class lookups.
| gpu_runner_cls = next((cls for cls in self.__class__.__mro__ | ||
| if cls.__name__ == "GPUModelRunner"), None) | ||
| if gpu_runner_cls is None: | ||
| raise TypeError("Could not find GPUModelRunner in the MRO. " | ||
| "The class hierarchy may have changed.") | ||
| parent_module_name = gpu_runner_cls.__module__ |
There was a problem hiding this comment.
Since GPUModelRunner is imported, you can simplify this logic by using the class object directly instead of searching for it in the MRO by its string name. This approach is more robust and less likely to break if the class is ever renamed.
You can replace this block with a simple isinstance check and direct access to GPUModelRunner.__module__.
if not isinstance(self, GPUModelRunner):
raise TypeError("Could not find GPUModelRunner in the MRO. "
"The class hierarchy may have changed.")
parent_module_name = GPUModelRunner.__module__c848e46 to
891fc7c
Compare
Signed-off-by: changdawei1 <changdawei3@huawei.com>
…lm-ascend into FIA_v0.13.0 * 'releases/v0.13.0' of https://github.com/vllm-project/vllm-ascend: [0.13.0][Bugfix] Fix setting of `speculative_config.enforce_eager` for dsv32 (vllm-project#5958) [v0.13.0][Bugfix] Fix XliteModelRunner init failed when aclgraph is enabled (vllm-project#5887) [0.13.0][Bugfix] Fixed an problem related to embeddings sharing (vllm-project#5972) [Bugfix]Fixed precision issues caused by pooled request pooling (vllm-project#6057) [0.13.0][Bugfix] fix pcp aclgraph qwen FIA bug (vllm-project#6038) [0.13.0][cherry-pick][bugfix] fix bug of triton mrope (vllm-project#6009) 【0.13.0】【bugfix】Resolved memory deallocation failure in the pooling layer under re-computation workloads. (vllm-project#6056)
…nabled (vllm-project#5887) ### What this PR does / why we need it? Fix XliteModelRunner init failed when aclgraph is enabled. Ensure function graph_capture of vllm.v1.worker.gpu_model_runner is replaced. ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? Signed-off-by: changdawei1 <changdawei3@huawei.com>
…nabled (vllm-project#5887) ### What this PR does / why we need it? Fix XliteModelRunner init failed when aclgraph is enabled. Ensure function graph_capture of vllm.v1.worker.gpu_model_runner is replaced. ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? Signed-off-by: changdawei1 <changdawei3@huawei.com>
What this PR does / why we need it?
Fix XliteModelRunner init failed when aclgraph is enabled. Ensure function graph_capture of vllm.v1.worker.gpu_model_runner is replaced.
Does this PR introduce any user-facing change?
How was this patch tested?