ggml-cuda: Add generic NVFP4 MMQ kernel #21074
Merged
19 commits
94e58be Introduced NVFP4 generic MMQ kernel (michaelw9999)
2761dca Added extra FP8 guard, hope to solve ci HIP failure (michaelw9999)
cbd9fba Rename tiles and use HIP_FP8_AVAILABLE (michaelw9999)
0d9292c Removed remaning FP8 straggler and added const int (michaelw9999)
0018ce8 Const (michaelw9999)
592e18c Removed DECL_MMQ_CASE artifact (michaelw9999)
1489ea5 Removed newline (michaelw9999)
3177030 Removed space after else (michaelw9999)
ebe28e9 Changed HIP FP8 NVFP4 conversion gate (michaelw9999)
aa55cb3 Added new line to bottom of mmq.cu 270 (michaelw9999)
8af4325 Removed extra spaces (michaelw9999)
d8c5b7b Removed single space in front of else on line 814 (michaelw9999)
cba8605 Added NVFP4 to generate cu script so HIP can see it, further tightene… (michaelw9999)
a2f724d Include generated mmq-instance-nvfp4.cu (michaelw9999)
4be4b92 Added NVFP4 mmq to HIP Check ignore list (michaelw9999)
30d7c8c Update ggml/src/ggml-cuda/mmq.cuh (michaelw9999)
145d8f1 Update ggml/src/ggml-cuda/mmq.cuh (michaelw9999)
bf496f6 Update ggml/src/ggml-cuda/mmq.cuh (michaelw9999)
e2babc3 Added function names to closing endif (michaelw9999)