Fix RoutingMethodType.from_topk softmax+renormalize mapping#33792#33836
Fix RoutingMethodType.from_topk softmax+renormalize mapping#33792#33836baonudesifeizhai wants to merge 7 commits intovllm-project:mainfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request correctly fixes a bug in the routing method mapping for softmax + renormalize by introducing a centralized from_topk static method. The refactoring to use this new method across different routers and quantization layers is a good improvement, and the addition of unit tests is appreciated. I have one suggestion to make the logic in the new from_topk method more robust against unsupported top_k values.
| if scoring_func == "sigmoid": | ||
| return ( | ||
| RoutingMethodType.Llama4 if top_k == 1 else RoutingMethodType.DeepSeekV3 | ||
| ) | ||
| if scoring_func == "softmax": | ||
| return ( | ||
| RoutingMethodType.Renormalize | ||
| if renormalize | ||
| else RoutingMethodType.Default | ||
| ) | ||
| raise ValueError(f"Unsupported scoring function: {scoring_func}") |
There was a problem hiding this comment.
The logic for sigmoid is too broad. It implicitly maps any top_k value other than 1 to RoutingMethodType.DeepSeekV3. This includes potentially invalid values like 0 or negative numbers, and top_k > 2 which may not be correct for DeepSeekV3. The tests only cover top_k=1 and top_k=2. It would be more robust to explicitly check for supported top_k values and raise an error for unsupported ones.
| if scoring_func == "sigmoid": | |
| return ( | |
| RoutingMethodType.Llama4 if top_k == 1 else RoutingMethodType.DeepSeekV3 | |
| ) | |
| if scoring_func == "softmax": | |
| return ( | |
| RoutingMethodType.Renormalize | |
| if renormalize | |
| else RoutingMethodType.Default | |
| ) | |
| raise ValueError(f"Unsupported scoring function: {scoring_func}") | |
| if scoring_func == "sigmoid": | |
| if top_k == 1: | |
| return RoutingMethodType.Llama4 | |
| if top_k == 2: | |
| return RoutingMethodType.DeepSeekV3 | |
| elif scoring_func == "softmax": | |
| return ( | |
| RoutingMethodType.Renormalize | |
| if renormalize | |
| else RoutingMethodType.Default | |
| ) | |
| raise ValueError( | |
| f"Unsupported scoring function '{scoring_func}' or top_k '{top_k}'") |
|
Do you have some model evals you could run to check that the changes are expressed and correct for the flashinfer kernels? |
this branch model.layers.31.block_sparse_moe.experts scoring_func=softmax renormalize=True routing_method_type=1 for lmeval .. c
|
|
For which model are you trying to enable the Flashinfer TRTLLM kernels? |
|
mistralai/Mixtral-8x7B-Instruct-v0.1
|
Purpose
#33792
Align from_topk with routing semantics: softmax + renormalize=True now maps to RoutingMethodType.Renormalize (not RenormalizeNaive).
Add a small unit test to cover from_topk mapping and invalid scoring functions.
Test Plan
Test Result
``python -m pytest tests/model_executor/test_routed_experts_capture.py -v
passedEssential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.