[Bugfix] Fix the bug in initializing the shared_weight communication domain in sfa-cp, and fix the mtp weight load in pp>1 situation #4913
Code Review
This pull request correctly fixes a bug where initializing the shared_weight communication domain would fail due to undefined variables. The change moves the variable definitions to a higher scope, ensuring they are always available when needed. While this fixes the immediate bug, I've identified a potential resource leak related to the _SHARED_WEIGHT group initialization that this change makes more likely to occur. Please see my detailed comment.
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
This pull request has conflicts; please resolve them before we can evaluate the pull request.
Signed-off-by: zzhx1 <zzh_201018@outlook.com>
…domain in sfa-cp, and fix the mtp weight load in pp>1 situation (vllm-project#4913) ### What this PR does / why we need it? In PR vllm-project#4188, a small bug was introduced that caused sfa-cp to be unable to find the global_pp_size parameter during initialization, and this PR fixed the issue. - vLLM version: v0.12.0 - vLLM main: vllm-project/vllm@ad32e3e Signed-off-by: zzhx1 <zzh_201018@outlook.com> Co-authored-by: Jade Zheng <zheng.shoujian@outlook.com>
…domain in sfa-cp, and fix the mtp weight load in pp>1 situation (vllm-project#4913) ### What this PR does / why we need it? In PR vllm-project#4188, a small bug was introduced that caused sfa-cp to be unable to find the global_pp_size parameter during initialization, and this PR fixed the issue. - vLLM version: v0.12.0 - vLLM main: vllm-project/vllm@ad32e3e Signed-off-by: zzhx1 <zzh_201018@outlook.com> Co-authored-by: Jade Zheng <zheng.shoujian@outlook.com> Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
What this PR does / why we need it?
PR #4188 introduced a small bug that left sfa-cp unable to find the global_pp_size parameter during initialization. This PR fixes the issue.
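For readers unfamiliar with this class of bug, here is a minimal sketch of how a variable defined only inside one branch can be missing when another branch needs it, and how hoisting the definition avoids that. This is not the actual vllm-ascend code: apart from `global_pp_size`, every name here (`init_groups_buggy`, `init_groups_fixed`, the flags, and the dictionary keys) is hypothetical and chosen only for illustration.

```python
def init_groups_buggy(pp_size: int, other_feature: bool, shared_weight: bool) -> dict:
    """Hypothetical initializer that reproduces the scoping problem."""
    groups = {}
    if other_feature:
        # global_pp_size is only assigned when this branch runs.
        global_pp_size = pp_size
        groups["other"] = global_pp_size
    if shared_weight:
        # Raises UnboundLocalError when other_feature is False,
        # because global_pp_size was never assigned on this path.
        groups["shared_weight"] = global_pp_size
    return groups


def init_groups_fixed(pp_size: int, other_feature: bool, shared_weight: bool) -> dict:
    """Same hypothetical initializer with the definition hoisted."""
    # Assign once, before any branch, so every later branch can use it.
    global_pp_size = pp_size
    groups = {}
    if other_feature:
        groups["other"] = global_pp_size
    if shared_weight:
        groups["shared_weight"] = global_pp_size
    return groups
```

Calling `init_groups_buggy(2, False, True)` fails with an `UnboundLocalError`, while `init_groups_fixed(2, False, True)` succeeds, which mirrors the idea of moving the variable definitions to a higher scope.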
Does this PR introduce any user-facing change?
How was this patch tested?