[Bugfix] Fix broken deepseek fp8 TP weights loading#24367
vllm-bot merged 1 commit into vllm-project:main
Conversation
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Code Review
This pull request aims to fix an issue with loading DeepSeek FP8 weights with tensor parallelism. The changes involve renaming __post_init__ to update_param_tp_status in LinearBase and calling it explicitly in ColumnParallelLinear and RowParallelLinear, which is a correct approach since nn.Module doesn't support __post_init__. Additionally, the logic for retrieving tp_size in Fp8LinearMethod is updated to respect layer-specific tensor parallelism settings.
My main feedback is a potential omission in update_param_tp_status. While the PR description mentions updating both tp_size and tp_rank, the implementation only seems to update tp_rank. This could lead to issues if tp_size is required by the parameter loading logic. I've added a specific comment with a suggestion to address this.
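As a rough illustration of the tp_size retrieval change the review describes, here is a minimal sketch (hypothetical names, not the actual vLLM implementation): the idea is to prefer a layer-specific tp_size attribute when one is set, and otherwise fall back to the global tensor-parallel world size.

```python
# Hypothetical sketch of layer-aware tp_size resolution. The helper below
# stands in for vLLM's distributed utility and is assumed for illustration.

def get_tensor_model_parallel_world_size() -> int:
    # Stand-in for the global TP world size (assume 4 ranks here).
    return 4

class Layer:
    """Bare stand-in for a linear layer object."""
    pass

def resolve_tp_size(layer) -> int:
    # Respect a layer-level override (e.g. a layer replicated across
    # ranks), falling back to the global setting when none is set.
    return getattr(layer, "tp_size", get_tensor_model_parallel_world_size())

replicated = Layer()
replicated.tp_size = 1   # this layer opts out of sharding
sharded = Layer()        # no override: use the global world size

print(resolve_tp_size(replicated))  # 1
print(resolve_tp_size(sharded))     # 4
```

The getattr fallback keeps existing layers unaffected while letting individual layers carry their own tensor-parallel settings.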
Purpose
Explicitly update tp_size and tp_rank through self.update_param_tp_status(). (I just realized that __post_init__ is never called for nn.Module because it's not a dataclass. My bad. 😅)
Also cc @minosfuture, can you check if this PR also fixed the issue on your side? Thanks!
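The pitfall above can be shown with a plain-Python sketch (class and method names are illustrative stand-ins, not the actual vLLM code): __post_init__ is a dataclasses hook, so it only fires for @dataclass classes; an ordinary class such as an nn.Module subclass never calls it automatically, which is why the parameter update must be invoked explicitly in __init__.

```python
from dataclasses import dataclass

calls = []

@dataclass
class DataclassLayer:
    tp_rank: int = 0
    def __post_init__(self):           # runs automatically for dataclasses
        calls.append("dataclass")

class LinearBase:                      # stand-in for an nn.Module subclass
    def __init__(self, tp_size: int, tp_rank: int):
        self.tp_size = tp_size
        self.tp_rank = tp_rank
        self.params = [{}]             # stand-in for real parameters

    def __post_init__(self):           # NEVER called automatically here
        calls.append("module")

    def update_param_tp_status(self):
        # Propagate the layer's TP settings onto each parameter so a
        # weight loader can shard correctly.
        for p in self.params:
            p["tp_size"] = self.tp_size
            p["tp_rank"] = self.tp_rank

class ColumnParallelLinear(LinearBase):
    def __init__(self, tp_size: int, tp_rank: int):
        super().__init__(tp_size, tp_rank)
        self.update_param_tp_status()  # the fix: call explicitly

DataclassLayer()
LinearBase(2, 0)
layer = ColumnParallelLinear(tp_size=2, tp_rank=1)
print(calls)           # only the dataclass hook fired
print(layer.params[0]) # parameters now carry tp_size and tp_rank
```

Running this shows that only the dataclass hook fires, while the nn.Module-style class silently skips __post_init__ and needs the explicit call.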
Test Plan
Test Result
Verified on a dummy tiny DeepSeek-V3 FP8 checkpoint with 1 MLA layer.