Commit fbdb439
Enable renormalize(naive) routing for fp8 per-tensor (#2030)
<!-- .github/pull_request_template.md -->
## 📌 Description
Disable expert weights in FC1 except for Llama4 routing.
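
As a rough illustration of why the placement of the expert weight matters, the toy sketch below compares scaling the expert's input (before FC1) with scaling its output. It assumes that "expert weights in FC1" means the per-token routing weight multiplies the activations fed into the first expert GEMM; the scalar SwiGLU-style `expert` function and all names here are hypothetical and are not taken from the FlashInfer kernels.

```python
import math


def silu(x: float) -> float:
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def expert(x: float, w_gate: float = 1.0, w_up: float = 1.0, w_down: float = 1.0) -> float:
    """Scalar toy expert (hypothetical): FC1 produces gate and up values,
    a SwiGLU-style product combines them, and FC2 projects back down."""
    return w_down * silu(w_gate * x) * (w_up * x)


def weight_on_input(x: float, s: float) -> float:
    """Routing weight applied before FC1 (assumed Llama4-style placement)."""
    return expert(s * x)


def weight_on_output(x: float, s: float) -> float:
    """Routing weight applied to the expert output when results are combined."""
    return s * expert(x)


x, s = 2.0, 0.5
on_input = weight_on_input(x, s)
on_output = weight_on_output(x, s)
# The expert is nonlinear, so the two placements give different results
# unless the routing weight is exactly 1.
print(on_input, on_output)
```

Because the two placements are not interchangeable, applying the weight in FC1 for a routing method that expects it at the output would change the result, which is consistent with restricting it to one routing method.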
## 🔍 Related Issues
<!-- Link any related issues here -->
## 🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull
request, please make sure the following items are complete.
### ✅ Pre-commit Checks
- [ ] I have installed `pre-commit` by running `pip install pre-commit`
(or used your preferred method).
- [ ] I have installed the hooks with `pre-commit install`.
- [ ] I have run the hooks manually with `pre-commit run --all-files`
and fixed any reported issues.
> If you are unsure about how to set up `pre-commit`, see [the
pre-commit documentation](https://pre-commit.com/).
## 🧪 Tests
- [ ] Tests have been added or updated as needed.
- [ ] All tests are passing (`unittest`, etc.).
## Reviewer Notes
<!-- Optional: anything you'd like reviewers to focus on, concerns, etc.
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
  * Re-enabled Renormalize routing that was previously blocked.
  * Made token_scales available for Llama4 routing.
  * Corrected GEMM1 input so the proper data source is used during MoE processing.
* **Tests**
  * Added FP8PerTensorMoe to test parameterization.
  * Expanded Renormalize and DeepSeekV3 test coverage and removed related skips.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
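
For readers unfamiliar with the routing variants named in the title, here is a minimal pure-Python sketch. It assumes the commonly used definitions: Renormalize takes the top-k logits and applies softmax over only those, while the naive variant applies softmax over all experts, takes the top-k, and rescales the selected probabilities to sum to 1. The function names are hypothetical, and this is not the kernel implementation.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_indices(xs, k):
    """Indices of the k largest values, largest first; ties go to the lower index."""
    return sorted(range(len(xs)), key=lambda i: (-xs[i], i))[:k]


def renormalize(logits, k):
    """Assumed 'Renormalize': top-k on raw logits, then softmax over the selection."""
    idx = top_k_indices(logits, k)
    return idx, softmax([logits[i] for i in idx])


def renormalize_naive(logits, k):
    """Assumed 'naive' variant: softmax over all experts, top-k, then rescale."""
    probs = softmax(logits)
    idx = top_k_indices(probs, k)
    total = sum(probs[i] for i in idx)
    return idx, [probs[i] / total for i in idx]


logits = [0.1, 2.0, -1.0, 1.5]  # router scores for 4 experts, one token
idx_a, w_a = renormalize(logits, k=2)
idx_b, w_b = renormalize_naive(logits, k=2)
print(idx_a, w_a)
print(idx_b, w_b)
```

Under these definitions the two variants select the same experts and produce the same weights up to floating-point error, because softmax preserves ordering and the shared normalizer cancels when the selected probabilities are rescaled.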
Signed-off-by: siyuanf <[email protected]>

1 parent d42fb90 commit fbdb439
File tree
- csrc
- include/flashinfer/trtllm/fused_moe
- tests/moe

4 files changed, +27 -4 lines changed

Diff summary (file names and line contents not captured):

| File | Original lines removed | New lines added | Change |
|---|---|---|---|
| 1 | none | 587-589 | 3 lines added after original line 586 |
| 2 | 521 | 521 | 1 line replaced |
| 3 | none | 308-309, 311-312 | 4 lines added around original line 308 |
| 4 | none | 2278 | 1 line added after original line 2277 |
| 4 | 2296 | 2297-2302 | 1 line replaced by 6 |
| 4 | 2311 | 2317-2322 | 1 line replaced by 6 |
| 4 | 2326 | 2337-2342 | 1 line replaced by 6 |