Revert "[EPLB][refactor] Modification of the initialization logic for expert_map and log2phy(depend on pr5285) (#5311)" #5506
… expert_map and log2phy(depend on pr5285) (vllm-project#5311)" This reverts commit f81cf69. Signed-off-by: hfadzxy <starmoon_zhang@163.com>
c106590 to fb3cfdb
Code Review
This pull request appears to be a significant refactoring of the Expert Parallelism Load Balancer (EPLB) initialization logic, despite the 'Revert' in the title. The changes introduce a new ExpertLoadBalancer class to manage static expert maps from a file, and streamline the logic for both static and dynamic EPLB.
My review has identified a critical issue regarding inconsistent state management of local_num_experts in fused_moe.py, which could lead to runtime errors. Additionally, I've pointed out a couple of high-severity issues related to the use of random.choice in the log2phy map generation, which introduces non-determinism and can affect reproducibility. The corresponding tests for this functionality even mock the random choice mechanism, which is a strong indicator that this should be addressed in the production code as well.
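To illustrate the reproducibility concern, here is a minimal pure-Python sketch contrasting a random replica pick with a deterministic one. It is not the vllm-ascend implementation: the function names, the list-of-lists placement layout, and the physical id convention (rank * slots_per_rank + slot) are all assumptions made for illustration.

```python
import random


def replicas_by_expert(placement):
    """Map each logical expert id to the physical slot ids hosting it.

    `placement[r]` lists the logical expert ids held by rank r, in slot
    order (hypothetical layout). Physical ids are assumed to be
    rank * slots_per_rank + slot, so each list comes out ascending.
    """
    slots_per_rank = len(placement[0])
    replicas = {}
    for rank, experts in enumerate(placement):
        for slot, logical_id in enumerate(experts):
            physical_id = rank * slots_per_rank + slot
            replicas.setdefault(logical_id, []).append(physical_id)
    return replicas


def log2phy_random(placement, rng=random):
    """Pick one replica per logical expert at random.

    Two calls (or two processes) can disagree unless the RNG is seeded
    identically everywhere, which is the non-determinism flagged above.
    """
    return {e: rng.choice(p) for e, p in replicas_by_expert(placement).items()}


def log2phy_deterministic(placement, rank):
    """Pick one replica per logical expert by round-robin on the rank.

    The result depends only on `placement` and `rank`, so it is
    reproducible across runs and needs no mocking in tests.
    """
    return {e: p[rank % len(p)] for e, p in replicas_by_expert(placement).items()}


# Three ranks with two slots each; every logical expert has two replicas.
placement = [[0, 1], [2, 0], [1, 2]]
print(log2phy_deterministic(placement, rank=1))
```

The round-robin rule is only one possible deterministic choice. A real implementation would likely also prefer a replica on the local rank, which this sketch leaves out for brevity.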
Overall, the refactoring improves the structure and clarity of the EPLB logic. Addressing the identified issues will enhance the robustness and predictability of the implementation.
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
fix accuracy test by #5521
This reverts commit f81cf69.
What this PR does / why we need it?
This reverts commit f81cf69 to fix the accuracy test for qwen3-30B:
https://github.com/vllm-project/vllm-ascend/actions/runs/20577361161/job/59119089426#logs
Does this PR introduce any user-facing change?
How was this patch tested?
The accuracy test passes: https://github.com/vllm-project/vllm-ascend/actions/runs/20589189595?pr=5506