[Performance] Zero-bubble async scheduling and spec decoding #7640
Merged
Changes from 46 commits
53 commits
a5bfa27
[Async][spec decode] Zero-bubble async scheduling +spec decoding
e599872
[Async][spec decode] Zero-bubble async scheduling +spec decoding
33a6d13
[Async][spec decode] Zero-bubble async scheduling +spec decoding
0a60c26
optimize
9248184
update to 0324
22dimensions 1dfa935
fix: add vllm_is_batch_invariant compatibility wrapper
claude 29eb701
Merge branch 'main' into zero_bubble_async_spec
HF-001 c82dad6
fix
62c38ee
fix format
74e2526
Merge branch 'main' into zero_bubble_async_spec
HF-001 dfe1b6d
fix
5f157ba
fix
9b0ff73
fix
c39d214
fix
013dcbe
fix
f28aacd
Merge branch 'main' into zero_bubble_async_spec
HF-001 8ca4fcd
Merge branch 'main' into zero_bubble_async_spec
HF-001 69bf146
fix
HF-001 fce6a54
Merge branch 'main' into zero_bubble_async_spec
HF-001 c45a066
fix ut test
2acc473
fix ut test
946107d
fix ut test
80a604b
fix ut test
5c88ee5
fix
c1e05db
fix
f36c75c
Merge branch 'main' into zero_bubble_async_spec
HF-001 eba0a00
Merge branch 'main' into zero_bubble_async_spec
HF-001 687e8c1
fix kvcache
a9bac6f
fix ci
e931e98
fix
4363d13
fix
5609a66
fix
7f441b0
fix
HF-001 e01874d
Merge branch 'main' into zero_bubble_async_spec
HF-001 bcbee07
fix
c83ea55
fix
bd21d43
Merge branch 'main' into zero_bubble_async_spec
HF-001 c0239cc
fix
b4b1dfb
Merge branch 'main' into zero_bubble_async_spec
HF-001 39f2498
fix
HF-001 e27c900
fix
HF-001 fea2844
fix
HF-001 a100226
fix ci conf
d860dc7
Merge branch 'main' into zero_bubble_async_spec
HF-001 0c2d0b0
del batch_invariant func
39aa20c
Merge branch 'main' into zero_bubble_async_spec
HF-001 1fe1e17
fix
1743abb
Merge branch 'main' into zero_bubble_async_spec
HF-001 43bbc01
fix
cd4bb98
fix
HF-001 f90bce7
fix
HF-001 5ae0787
fix
HF-001 45e3282
Merge branch 'main' into zero_bubble_async_spec
HF-001
Why add this?
@weijinqian0 I am still optimizing this part and will update it later.