[Feature][OCP MX] Support mxfp6 and mixed mxfp6-mxfp4 #21166
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Status: Merged

Changes from all commits (50 commits):
2349099 mxfp6 support (fxmarty-amd)
d538659 refactor mxfp4 to accomodate mxfp6 (fxmarty-amd)
49b2bbe add todo (fxmarty-amd)
a016c18 rename mxfp4_utils to ocp_mx_utils and add fp6 dequant function (fxmarty-amd)
a5835ac use str instead of enum as torch.library infer_schema does not suppor… (fxmarty-amd)
08da3e0 fix a few remaining bugs (fxmarty-amd)
43f0ae8 simulate on mi350 as well for now (fxmarty-amd)
c537c76 fix e2m3/e3m2 bug (fxmarty-amd)
13dea3e Merge branch 'main' into mxfp6_mixed (fxmarty-amd)
a37ef27 wip update tests (fxmarty-amd)
de19714 Merge branch 'main' into mxfp6_mixed (fxmarty-amd)
b106df5 Merge branch 'main' into mxfp6_mixed (fxmarty-amd)
f07c3a8 update tests (fxmarty-amd)
b9f9124 update documentation (fxmarty-amd)
9c1a90f address review comments (fxmarty-amd)
e4aa06e linting (fxmarty-amd)
1e53ab9 linting 2 (fxmarty-amd)
5bfa8cb linting 3 (fxmarty-amd)
33e431f linting 4... if only mypy would run locally (fxmarty-amd)
bccb5e3 Merge branch 'main' into mxfp6_mixed_updated (fxmarty-amd)
bad17cc undo current_platform.supports_mx() change, moved to standalone #22355 (fxmarty-amd)
7bbfbc7 Merge branch 'main' into mxfp6_mixed (fxmarty-amd)
ef895ca post merge fixes (fxmarty-amd)
160d5c3 fix issues (fxmarty-amd)
e53c5c7 edit reference (fxmarty-amd)
df3d964 linting (fxmarty-amd)
5024d70 linting (fxmarty-amd)
28473a8 address comments (fxmarty-amd)
4291d3a fix mxfp4/fp4 typos (fxmarty-amd)
4ed70f6 Merge branch 'main' into mxfp6_mixed (fxmarty-amd)
37326f0 post-merge cleanup (fxmarty-amd)
d2ef885 linting & fixes (fxmarty-amd)
3b7260f cleanup (fxmarty-amd)
bdb3706 disable check_model test as it is not working well with v1 (fxmarty-amd)
f374514 linting (fxmarty-amd)
868d5a9 linting (fxmarty-amd)
28aa39c address review comments (fxmarty-amd)
b957668 Merge branch 'main' into mxfp6_mixed (fxmarty-amd)
397722d post merge fixes (fxmarty-amd)
cebca37 reset files (fxmarty-amd)
8ef7e00 lint (fxmarty-amd)
996ddd9 Merge branch 'main' into mxfp6_mixed (fxmarty-amd)
76e6ee7 prefix with 'mx' everywhere as suggested (fxmarty-amd)
28b995b fix remaining issues (fxmarty-amd)
dc246fa linting (fxmarty-amd)
5cd1b2e typo (fxmarty-amd)
1288977 fix typo (fxmarty-amd)
ec51387 fix tests (fxmarty-amd)
eb5f8f6 linting (fxmarty-amd)
fb0d1ac skip test if amd-quark is not installed (fxmarty-amd)
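The commits above add an fp6 dequantization path and fix an e2m3/e3m2 bug. For context, OCP MX FP6 elements come in two 6-bit layouts, E2M3 (1 sign, 2 exponent, 3 mantissa bits, bias 1, max 7.5) and E3M2 (1 sign, 3 exponent, 2 mantissa bits, bias 3, max 28), with a shared E8M0 scale per 32-element block in the full MX format. The sketch below decodes a single 6-bit element; `dequant_fp6` is a hypothetical helper for illustration, not the PR's actual code.

```python
def dequant_fp6(code: int, fmt: str = "e2m3") -> float:
    """Decode one 6-bit OCP MX FP6 element code to a Python float.

    Illustrative only; the block's shared E8M0 scale would be applied
    on top of the returned element value.
    """
    if fmt == "e2m3":
        exp_bits, man_bits, bias = 2, 3, 1
    elif fmt == "e3m2":
        exp_bits, man_bits, bias = 3, 2, 3
    else:
        raise ValueError(f"unknown fp6 format: {fmt}")
    sign = -1.0 if (code >> (exp_bits + man_bits)) & 1 else 1.0
    exp = (code >> man_bits) & ((1 << exp_bits) - 1)
    man = code & ((1 << man_bits) - 1)
    if exp == 0:
        # subnormal: 2^(1 - bias) * (mantissa / 2^man_bits)
        return sign * 2.0 ** (1 - bias) * (man / (1 << man_bits))
    # normal: 2^(exp - bias) * (1 + mantissa / 2^man_bits)
    return sign * 2.0 ** (exp - bias) * (1.0 + man / (1 << man_bits))
```

For example, `0b001000` decodes to 1.0 in both layouts' terms of its own fields, and the e2m3/e3m2 distinction matters at the extremes: the same all-ones magnitude bits give 7.5 under E2M3 but 28.0 under E3M2, which is the kind of mismatch the "fix e2m3/e3m2 bug" commit suggests.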
Review comment:

> `FusedMoEParallelConfig` currently assumes a common dtype for weights and activations, namely `quant_dtype`. I added this `weight_dtype` to hopefully not break anything, but it is not clean. Maybe `quant_dtype` is too vague.

Reply:

> I think this is fine. Each `FusedMoEQuantDesc` has its own dtype. `quant_dtype` is meant for activations. The weights can have their own types, which don't need to be the same.
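The exchange above is about letting weights and activations carry independent quantized dtypes, which is what makes the mixed mxfp6-mxfp4 case (e.g. mxfp4 weights with mxfp6 activations) expressible. A minimal sketch of that split, using hypothetical names rather than vLLM's actual config classes:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MoEQuantConfigSketch:
    """Illustrative config split: quant_dtype for activations,
    weight_dtype as an optional override for weights."""

    quant_dtype: Optional[str] = None   # activation dtype, e.g. "mxfp6_e2m3"
    weight_dtype: Optional[str] = None  # weight dtype, e.g. "mxfp4"

    def effective_weight_dtype(self) -> Optional[str]:
        # Fall back to the shared dtype when no weight override is given,
        # preserving the old common-dtype behavior.
        return self.weight_dtype if self.weight_dtype is not None else self.quant_dtype
```

With this shape, a uniform config leaves `weight_dtype` unset and behaves as before, while a mixed config such as `MoEQuantConfigSketch("mxfp6_e2m3", "mxfp4")` resolves weights to mxfp4 without touching the activation path.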