[DSV4] Guard megamoe flag with Pure TP #41522
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
Code Review
This pull request updates the initialization logic for DeepSeek V4 models to support the deep_gemm_mega_moe backend. A guard was added to DeepseekV4MoE to ensure expert parallel is enabled when MegaMoE is used, but the same check is missing in DeepseekV4Model; the reviewer suggests adding it there as well, both for consistency and so the model fails early during initialization.
```python
self.use_mega_moe = (
    vllm_config.kernel_config.moe_backend == "deep_gemm_mega_moe"
)
```
The guard against using MegaMoE without expert parallel is missing here in DeepseekV4Model, although it was added to DeepseekV4MoE. For consistency and to ensure the model fails early during initialization (before creating layers), the same guard should be applied here. This also ensures that self.use_mega_moe is only True when the configuration is valid, which is important as this flag is used in the forward pass and for expert mapping logic.
```diff
 self.use_mega_moe = (
     vllm_config.kernel_config.moe_backend == "deep_gemm_mega_moe"
 )
+if self.use_mega_moe and not vllm_config.parallel_config.enable_expert_parallel:
+    raise NotImplementedError(
+        "DeepSeek V4 MegaMoE currently requires expert parallel. "
+        "Enable it with --enable-expert-parallel, or pick a different "
+        "--moe-backend."
+    )
```
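The suggested guard can be reduced to a minimal, self-contained sketch of the fail-early pattern. The `KernelConfig`, `ParallelConfig`, and `VllmConfig` dataclasses below are hypothetical stand-ins for vLLM's real config objects, used only to show how the flag is derived and validated before any layers are built:

```python
from dataclasses import dataclass


@dataclass
class KernelConfig:
    # Stand-in for vLLM's kernel config; only the field used by the guard.
    moe_backend: str = "triton"


@dataclass
class ParallelConfig:
    # Stand-in for vLLM's parallel config.
    enable_expert_parallel: bool = False


@dataclass
class VllmConfig:
    kernel_config: KernelConfig
    parallel_config: ParallelConfig


class DeepseekV4Model:
    def __init__(self, vllm_config: VllmConfig) -> None:
        # Derive the flag once, then validate it immediately, so an
        # invalid configuration raises before any layers are created
        # and self.use_mega_moe is only True when the config is valid.
        self.use_mega_moe = (
            vllm_config.kernel_config.moe_backend == "deep_gemm_mega_moe"
        )
        if self.use_mega_moe and not vllm_config.parallel_config.enable_expert_parallel:
            raise NotImplementedError(
                "DeepSeek V4 MegaMoE currently requires expert parallel. "
                "Enable it with --enable-expert-parallel, or pick a different "
                "--moe-backend."
            )
```

With this placement, a MegaMoE backend without expert parallel fails at model construction rather than surfacing later in the forward pass or expert-mapping logic.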
Signed-off-by: Joachim Studnia <joachim@mistral.ai>
Co-authored-by: hongbolv <33214277+hongbolv@users.noreply.github.com>
Purpose
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.