[FlashInfer v0.6.4] [RL] Integrate FlashInfer mxfp8 gemm, MoE, and routed MoE #19537
Merged
Fridge003 merged 24 commits into sgl-project:main from zianglih:agent-flashinfer-mxfp8-moe on Mar 10, 2026
Commits
da4444c Initial flashinfer mxfp8 integration (zianglih)
acdab35 Clean up (zianglih)
3dd4f78 Refactor with `copy_or_rebind_param` (zianglih)
49af4e7 Fix `_handle_moe_kernel_config` (zianglih)
7cee453 Clean up (zianglih)
5194a31 Add doc (zianglih)
8166e6d Clean up (zianglih)
cad547b Expand test to include mxfp8 and flashinfer_trtllm_routed (zianglih)
1e8a692 Fix flashinfer_trtllm_routed EP (zianglih)
ae15808 Expand test_fp8_blockwise_gemm.py (zianglih)
97492bb Revert "Expand test to include mxfp8 and flashinfer_trtllm_routed" (zianglih)
a7c5801 Refactor test_flashinfer_trtllm_gen_moe_backend.py changes (zianglih)
bc0ec5f Fix bad `moe_runner_backend` override (zianglih)
863f9d5 Fix piece wise cuda graph (zianglih)
6ed9c53 Add raw logits topk (zianglih)
86ded54 Use flashinfer mxfp8 in test for now since triton mxfp8 is not yet PC… (zianglih)
c8cb1e1 Minor fix for DeepSeek (zianglih)
4e3f31e Merge branch 'main' into agent-flashinfer-mxfp8-moe (zianglih)
ae08f23 Add docs (zianglih)
5f65dc2 Lazy import `block_scale_interleave` (zianglih)
542775a Add comments (zianglih)
1febb3e Refine docs (zianglih)
9cfe5c9 Rename for pytest compatibility (zianglih)
181c130 Merge branch 'main' into agent-flashinfer-mxfp8-moe (zianglih)