🚨 [FA4] Initial support
#42435
Merged
Commits
42 commits
e82beeb initial implementation (vasqu)
1a45b34 Merge branch 'main' into fa4-support (vasqu)
30c6682 CB support (vasqu)
e9cdeea change how we call item on max_seq_len_q/k (vasqu)
40168b4 fix (vasqu)
91a1b3b tests (vasqu)
8d3dc6c fix fa2 clash (vasqu)
bf1d589 unify the fa dispatch (vasqu)
f5b7f9c fix (vasqu)
6288f44 modernbert... (vasqu)
15ed2eb oops (vasqu)
6be5bbe parity test (vasqu)
dad1b04 style (vasqu)
34c15c2 nit (vasqu)
ac8e309 Merge branch 'main' into fa4-support (vasqu)
776a1af fixup imports for fa4 (vasqu)
cc7a1b7 enable attention sinks, fixup logits checks in parity test (vasqu)
ca26ecf Merge branch 'main' into fa4-support (vasqu)
65912b2 style (vasqu)
d07749f change dispatch logic and introduce lower bound for FA (vasqu)
95d644e Merge branch 'main' into fa4-support (vasqu)
ed88dcc style (vasqu)
7fba6df fix test (vasqu)
27acafe min fa2, avoid 2x device sync (vasqu)
ba9fb59 Merge branch 'main' into fa4-support (vasqu)
afa0940 style (vasqu)
7223fe6 simple min version instead of list (vasqu)
da88dcf fixup error message on non init check (vasqu)
654db43 fixup up non init check a tad more (vasqu)
a690e55 Merge branch 'main' into fa4-support (vasqu)
d3485da refactor some FA constants out to main fa utils (vasqu)
476789f new marker for all fas needed (vasqu)
19e4c44 oops (vasqu)
08445b6 style and make the fa kernel fallback generalized (vasqu)
920bef7 default none... (vasqu)
8ee8c56 more refactors (vasqu)
cd2a9b3 style (vasqu)
27e0d58 fix (vasqu)
043f11f this test faulty even on main, xformers can handle any shape apparent… (vasqu)
b0485b5 lets make this more robust, we should check for none within... (vasqu)
eae216e fix (vasqu)
15f6ba9 oops (vasqu)
Why do we also check `is_tracing` here? I believe it would make this fail for previous FA versions when not compiling.
This doesn't fail for any FA version. In non-compile environments, FA can handle a scalar int tensor here, and passing it through as-is lets us avoid device syncs. Only compile fails: the fake ops and the underlying signature force a base int, which dynamo cannot reason about when it receives a scalar int tensor instead.
Initially, this change was specifically for FA4, which did not have max_seqlen_q/k, but at some point it was added back with the same signature issue. This way we at least speed up the non-compile forward pass.
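A minimal sketch of the pattern described in the reply above, not the code from this PR. `FakeScalar` and `prepare_max_seqlen` are hypothetical names, and the `tracing` flag stands in for whatever `is_tracing` check the real code performs. `FakeScalar` imitates a scalar int tensor so the example runs without PyTorch and can count how often `.item()` (the call that would force a device sync) is invoked.

```python
class FakeScalar:
    """Hypothetical stand-in for a 0-dim integer tensor on an accelerator."""

    def __init__(self, value):
        self._value = value
        self.item_calls = 0  # counts simulated device syncs

    def item(self):
        # On a real device tensor, .item() copies to host and forces a sync.
        self.item_calls += 1
        return self._value


def prepare_max_seqlen(max_seqlen, tracing):
    """Convert to a plain int only when tracing/compiling; otherwise pass through."""
    if tracing and hasattr(max_seqlen, "item"):
        # Compile path: the kernel signature expects a base int.
        return max_seqlen.item()
    # Eager path: hand the scalar over untouched, so no sync happens here.
    return max_seqlen


eager_value = FakeScalar(128)
eager_result = prepare_max_seqlen(eager_value, tracing=False)

traced_value = FakeScalar(128)
traced_result = prepare_max_seqlen(traced_value, tracing=True)
```

In the eager case the scalar is returned unchanged and `.item()` is never called; in the traced case the value is converted exactly once.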