Improve enable chunked_prefill & prefix_caching logic. #26623
Merged
Changes from 59 commits
72 commits
aa55dab init (noooop)
4e09700 Merge branch 'main' into chunked_prefill_logic (noooop)
2deeb18 Merge branch 'main' into chunked_prefill_logic (noooop)
0c624ba + BoolWithReason (noooop)
0eb45a1 fix (noooop)
5588a84 Merge branch 'main' into chunked_prefill_logic (noooop)
f51e7d9 state -> value (noooop)
0397175 Merge branch 'main' into chunked_prefill_logic (noooop)
8174b3c + @attn_type("encoder_only") (noooop)
43a8616 fix clip (noooop)
e944168 fix (noooop)
824dd36 fix (noooop)
e6b2a3b add AttnTypeStr & PoolingTypeStr (noooop)
ce19005 ruff (noooop)
909cceb TYPE_CHECKING (noooop)
2aec810 fix (noooop)
f8757ca fix (noooop)
0597911 Merge branch 'main' into chunked_prefill_logic (noooop)
2c567cd + BoolWithReasonGroup (noooop)
1705fbf + BoolWithReasonGroup (noooop)
7a6ac0c fix (noooop)
862618e constants as class variables (noooop)
4aad642 E501 (noooop)
aa3fcc5 Merge branch 'main' into chunked_prefill_logic (noooop)
bf5578f warning_if_false (noooop)
acda010 fix (noooop)
d2301ee Merge branch 'main' into chunked_prefill_logic (noooop)
0b798e6 + FIXME (noooop)
1375cfb Merge branch 'main' into chunked_prefill_logic (noooop)
84665f2 update (noooop)
7b2ad12 update (noooop)
896e407 update (noooop)
f07f422 SIM102 (noooop)
867ba67 SIM102 (noooop)
c844a78 SIM102 (noooop)
524ee08 cache_kwargs["enable_prefix_caching"]['default'] = None (noooop)
7c25dea Merge branch 'main' into chunked_prefill_logic (noooop)
563c458 fix (noooop)
d1a9f7d fix (noooop)
61c866f fix (noooop)
f2a4a05 fix (noooop)
34d8dbe fix (noooop)
4a0d865 fix (noooop)
953a5f2 Merge branch 'main' into chunked_prefill_logic (noooop)
a32ab75 + attention_free (noooop)
d7172fc fix (noooop)
ed25df5 fix (noooop)
bd0cbab fix (noooop)
0928d57 fix (noooop)
0c80c9e fix (noooop)
3a28ae1 + attn_type_to_reason_map (noooop)
f3a317a fix (noooop)
183b0f6 fix (noooop)
0a00f3f E501 (noooop)
7623684 Merge branch 'main' into chunked_prefill_logic (noooop)
dbf3ebf - BoolWithReason (noooop)
ea5df7a - Unnecessary modifications (noooop)
2fcc5e9 Merge branch 'main' into chunked_prefill_logic (noooop)
f210656 Merge branch 'main' into chunked_prefill_logic (DarkLight1337)
eaa4c96 - Unnecessary modifications (noooop)
73212be Merge branch 'main' into chunked_prefill_logic (noooop)
c620f45 update (noooop)
df8bba6 Merge branch 'main' into chunked_prefill_logic (DarkLight1337)
084bc8f + all pooling and step pooling (noooop)
3abb7fa SIM103 (noooop)
466d36d Merge branch 'main' into chunked_prefill_logic (noooop)
dad2bbd fix (noooop)
4d816b4 fix (noooop)
ba80b42 fix (noooop)
792f33a Merge branch 'main' into chunked_prefill_logic (noooop)
ac53016 fix (noooop)
08a2f8f Merge branch 'main' into chunked_prefill_logic (noooop)