[EPLB] Add log Info for moe_load Imbalance Ratio#4482
[EPLB] Add log Info for moe_load Imbalance Ratio#4482weijinqian0 merged 6 commits intovllm-project:mainfrom
Conversation
Signed-off-by: daishixun <dsxsteven@sina.com>
|
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according Contributing and Testing. |
There was a problem hiding this comment.
Code Review
This pull request introduces logging for the Mixture-of-Experts (MoE) load imbalance ratio, which is a useful metric for performance analysis. The implementation correctly calculates the peak-to-average load ratio for each layer and then provides a summary across all layers. The logic is executed only on rank 0 to prevent excessive logging in a distributed setup. I have identified a minor robustness issue that could, under specific circumstances, lead to a ZeroDivisionError. I've provided a suggestion to make the code more resilient.
| if len(self.moe_imbalance_dict) == 0: | ||
| logger.info("[MOE_load_stats] No data available.") | ||
| return | ||
|
|
||
| values = list(self.moe_imbalance_dict.values()) | ||
|
|
||
| avg_imbalance = sum(values) / len(values) |
There was a problem hiding this comment.
There is a potential for a ZeroDivisionError on line 207. Although self.moe_imbalance_dict is checked for emptiness on line 201, if it were to be modified by another thread between the check and when values is used, it could become empty, leading to a crash. While this is not an issue with the current single-threaded usage, it's safer to fetch the values first and then check for emptiness. This makes the code more robust against future changes and potential concurrency, and is also more idiomatic Python.
| if len(self.moe_imbalance_dict) == 0: | |
| logger.info("[MOE_load_stats] No data available.") | |
| return | |
| values = list(self.moe_imbalance_dict.values()) | |
| avg_imbalance = sum(values) / len(values) | |
| values = list(self.moe_imbalance_dict.values()) | |
| if not values: | |
| logger.info("[MOE_load_stats] No data available.") | |
| return | |
| avg_imbalance = sum(values) / len(values) |
Signed-off-by: daishixun <dsxsteven@sina.com>
Signed-off-by: daishixun <dsxsteven@sina.com>
Signed-off-by: daishixun <dsxsteven@sina.com>
### What this PR does / why we need it? Add log Info for MOE_load Imbalance Ratio ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - vLLM version: v0.12.0 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.12.0 --------- Signed-off-by: daishixun <dsxsteven@sina.com> Co-authored-by: weijinqian0 <1184188277@qq.com> Signed-off-by: tanqingshan (A) <50050625@china.huawei.com>
### What this PR does / why we need it? Add log Info for MOE_load Imbalance Ratio ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - vLLM version: v0.12.0 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.12.0 --------- Signed-off-by: daishixun <dsxsteven@sina.com> Co-authored-by: weijinqian0 <1184188277@qq.com> Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>
### What this PR does / why we need it? Add log Info for MOE_load Imbalance Ratio ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - vLLM version: v0.12.0 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.12.0 --------- Signed-off-by: daishixun <dsxsteven@sina.com> Co-authored-by: weijinqian0 <1184188277@qq.com>
### What this PR does / why we need it? Add log Info for MOE_load Imbalance Ratio ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - vLLM version: v0.12.0 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.12.0 --------- Signed-off-by: daishixun <dsxsteven@sina.com> Co-authored-by: weijinqian0 <1184188277@qq.com>
What this PR does / why we need it?
Add log Info for MOE_load Imbalance Ratio
Does this PR introduce any user-facing change?
No
How was this patch tested?