refactor: refactoring cuda code to cute-dsl (part 1) #2428
Merged
20 commits
6275947 upd (yzh119)
ec57a7c remove contiguous (yzh119)
946ad76 upd (yzh119)
928a2ce upd (yzh119)
71631f1 upd (yzh119)
0e9dd6c upd (yzh119)
e21b28f upd (yzh119)
d4d53d5 upd (yzh119)
ba1a645 refactor (yzh119)
fcd5c5d add pdl for all (yzh119)
510b6f6 reduce cpu overhead (cyx-6)
ef19fb9 reduce host overhead (cyx-6)
6d78f6f Merge commit 'f521fe19ac387e8baffd7b5c925ef59d9f2ecc0c' into cute-dsl… (cyx-6)
62764d6 cleean code (cyx-6)
11649ab clean code (cyx-6)
68ea276 Fix incorrect input ordering (bkryu)
a7690ec Address comment by absorbing PR 2459 (bkryu)
2dc2439 Remove dead allocations for sGamma and sBeta that are discarded immed… (bkryu)
9a8a379 Merge branch 'main' into cute-dsl-part-1 (bkryu)
6124c18 Add SW emulated fp8 typecasting for SM < 89 (bkryu)
Expose quantized norm APIs at the package level.
Lines 97-101 export `rmsnorm` and `fused_add_rmsnorm`, but the new quantized variants (`rmsnorm_quant`, `fused_add_rmsnorm_quant`) from `flashinfer.norm` are still missing at the top level. Consider exporting them here so `flashinfer.rmsnorm_quant` works consistently.
As per coding guidelines: Export new operations in `flashinfer/__init__.py` to make them available at package level.
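The suggested diff itself did not survive on this page, so the following is only a rough sketch of the re-export pattern the comment describes, not the actual change. It builds a throwaway package (the hypothetical name `demo_norm_pkg`, with stub functions standing in for the real kernels) whose `__init__.py` re-exports all four names from its `norm` submodule. The `import X as X` alias style is an assumption about the project's convention.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical stand-in package name; the real target would be flashinfer/__init__.py.
PKG = "demo_norm_pkg"


def build_demo_package(root: Path) -> None:
    """Write a tiny package whose __init__.py re-exports names from .norm."""
    pkg_dir = root / PKG
    pkg_dir.mkdir()
    # Stub submodule: placeholder bodies only, not the real normalization kernels.
    (pkg_dir / "norm.py").write_text(textwrap.dedent('''
        def rmsnorm(*args, **kwargs): return "rmsnorm"
        def fused_add_rmsnorm(*args, **kwargs): return "fused_add_rmsnorm"
        def rmsnorm_quant(*args, **kwargs): return "rmsnorm_quant"
        def fused_add_rmsnorm_quant(*args, **kwargs): return "fused_add_rmsnorm_quant"
    '''))
    # Package-level exports: the two *_quant lines are the kind of addition
    # the review comment asks for (alias form is an assumed convention).
    (pkg_dir / "__init__.py").write_text(textwrap.dedent('''
        from .norm import fused_add_rmsnorm as fused_add_rmsnorm
        from .norm import fused_add_rmsnorm_quant as fused_add_rmsnorm_quant
        from .norm import rmsnorm as rmsnorm
        from .norm import rmsnorm_quant as rmsnorm_quant
    '''))


def load_demo_package():
    """Create the package in a temp directory and import it."""
    root = Path(tempfile.mkdtemp())
    build_demo_package(root)
    sys.path.insert(0, str(root))
    try:
        importlib.invalidate_caches()
        return importlib.import_module(PKG)
    finally:
        sys.path.remove(str(root))


pkg = load_demo_package()

# The top-level name and the submodule name refer to the same function object.
print(pkg.rmsnorm_quant is pkg.norm.rmsnorm_quant)
```

With exports like these in place, callers could write `pkg.rmsnorm_quant(...)` without importing the `norm` submodule explicitly, which is the consistency the comment is after.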