This repository was archived by the owner on Mar 21, 2026. It is now read-only.
Efficient Transformers backend support #2858
Closed
27 commits
ade0f44 add transformers_flash (Cyrilvallez)
da22290 inits (Cyrilvallez)
b3b0747 switch version to make it work (Cyrilvallez)
738f0b0 Update Makefile-flash-att-v2 (Cyrilvallez)
a84ecf2 Update Makefile-flash-att-v2 (Cyrilvallez)
372799a Update Makefile-flash-att-v2 (Cyrilvallez)
a0035e6 Update Makefile-flash-att-v2 (Cyrilvallez)
e69a384 Update Makefile-flash-att-v2 (Cyrilvallez)
3a636ed Update Makefile-flash-att-v2 (Cyrilvallez)
649cb1f runnable version
490ca0e working
f843b62 push change (Cyrilvallez)
715b2d1 fix high dim (Cyrilvallez)
e93ab92 init (Cyrilvallez)
f4c60ca default (Cyrilvallez)
2e2631e latest transformers changes (Cyrilvallez)
44b3679 revert (Cyrilvallez)
266377b simplify check (Cyrilvallez)
32488c1 remove flag (Cyrilvallez)
ac62bd1 improve type hints + required args (Cyrilvallez)
b03d7ae Update based on transformers PR (Cyrilvallez)
b40c889 small fix (Cyrilvallez)
42ae6de Remove Warpers for Processor (Cyrilvallez)
f01014d fix compatibility version issue (Cyrilvallez)
2659b59 raise error if needed (Cyrilvallez)
a2fe842 Simplify with monkey patch (Cyrilvallez)
6e0f37c revert + style + minor improvements (Cyrilvallez)
In general, I like to remove indirections.
Here, transformers_causal_lm_class is not known to the reader, who has to look up where it is defined, which makes following the flow of the code hard. We already know which models support flex attention, so we can hardcode the mapping CausalLM -> TransformersFlashCausalLM. That removes the need to "guess" and the dependency on the private bit.
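The suggestion above could look something like the following minimal sketch. All names here (the model list, the resolver function, the class-name strings) are illustrative assumptions, not the actual TGI implementation:

```python
# Hypothetical sketch of the "hardcode it" suggestion: list the model
# architectures known to support flex attention explicitly, and map them
# to the flash-capable class without inspecting any private transformers
# attribute. The set contents and function name are assumptions.

FLEX_ATTENTION_MODELS = {"llama", "mistral", "gemma"}  # assumed, illustrative

def resolve_causal_lm_class(model_type: str) -> str:
    """Pick the backend class name directly from a hardcoded table."""
    if model_type in FLEX_ATTENTION_MODELS:
        return "TransformersFlashCausalLM"  # flash-attention-capable backend
    return "CausalLM"  # standard fallback backend
```

The trade-off the reviewer points at: the table is trivially readable, but it must be updated by hand whenever transformers adds flex-attention support for a new architecture.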
IMO the dynamic behavior is simpler, since support for more and more models will keep rolling out in transformers.
But can obviously be changed if this is a blocker on your side 😁