[Core] NGram GPU Implementation compatible with Async Scheduler #29184
Merged
121 commits
c92e4b8
[CI Failure] Fix Gemma3 RoPE configuration for sliding attention laye…
hl475 d183dcb
fix typo error
293e3ae
fix return values in ngram gpu
4534c88
python3.13 pre-commit check
07e6b8a
fix pre-commit and sign-off
2f08629
Merge branch 'main' into patchy/async_ngram
PatchouliTIS e70b060
fix ngram gpu kernel compile issue
cde94b2
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 33c4437
Merge branch 'patchy/async_ngram' of https://github.com/PatchouliTIS/…
25d36b1
fix docs bug
71b0dca
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 183556e
v.01
f6f871f
test
1fbf296
fix large batch performance.
b5243ec
refactor ngram gpu
0081487
modify nvtx
bcf454f
change copy to async
0d2638b
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 34cc523
remove irrelevant files
c9f2724
use discard_request_mask in ngram
16eb87c
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 82ff639
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 3abd884
remove irrelevant computations
cd9ecc9
Merge branch 'patchy/async_ngram' of https://github.com/PatchouliTIS/…
38cf7fd
Merge branch 'main' into patchy/async_ngram
PatchouliTIS b518ef2
remove irrelevant comments
d07f4a7
Merge branch 'patchy/async_ngram' of https://github.com/PatchouliTIS/…
3d28827
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 8920a59
move token ids tensor gpu init inline
6967bb2
Merge branch 'patchy/async_ngram' of https://github.com/PatchouliTIS/…
25d6b1f
remove unused status check
3a6df84
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 65260a4
detailed comments in ngram gpu
37b1bb2
remove irrlevant input params for _dummy_run
14243fb
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 430fc13
move the preprocess of token_ids_gpu_tensor and mask tensor into Ngra…
4e0eca7
Merge branch 'patchy/async_ngram' of https://github.com/PatchouliTIS/…
ddf24aa
merge conflicts fixed
63180be
change the CompileConfig to match latest vllm config
30b463a
fix documents
e1a44d1
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 9e7b089
fix vllm config in ngram gpu
3d0510e
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 8b7865c
Merge branch 'patchy/async_ngram' of https://github.com/PatchouliTIS/…
2769039
enable ngram gpu in sync mode
6a3a26c
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 588bb65
Merge branch 'patchy/async_ngram' of https://github.com/PatchouliTIS/…
b28ffd3
merge conflicts fixed
732ce0c
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 8ef01b7
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 4538bea
Merge branch 'main' into patchy/async_ngram
PatchouliTIS f505d97
merge conflicts fixed
3cff47f
Merge branch 'patchy/async_ngram' of https://github.com/PatchouliTIS/…
cc3700a
Merge branch 'main' into patchy/async_ngram
PatchouliTIS f141cc1
merge conflicts fixed
1d94b70
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 064707c
modify ngram gpu process
93375ff
Merge branch 'patchy/async_ngram' of https://github.com/PatchouliTIS/…
eac7085
remove irrelevant codes
6b46372
merge conflicts
2a14605
vllm async conf check
c683c56
merge conflicts fixed
ab2b0d5
change sync data access to async
30824ed
Merge branch 'main' into patchy/async_ngram
PatchouliTIS a87fe7d
pre-commit fixed
879c488
merge conflicts fixed
4b9511b
Merge branch 'vllm-project:main' into patchy/async_ngram
PatchouliTIS ff33c28
comments resolved, redundent codes removed.
1ee4cce
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 3faf03e
Merge branch 'main' into patchy/async_ngram
PatchouliTIS baf359c
Merge branch 'main' of github.com:vllm-project/vllm into patchy/async…
4451866
remove overcomments and disorganized codes
e9d510f
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 7c16ba0
typo error fixed in gpu_model_runner
7050336
Merge branch 'patchy/async_ngram' of https://github.com/PatchouliTIS/…
1dd58e6
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 3764a0f
merge conflicts fixed
64737f3
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 2f8ab86
Merge branch 'main' of github.com:vllm-project/vllm into patchy/async…
b0104b2
Merge branch 'main' into patchy/async_ngram
PatchouliTIS c345829
merge conflicts fixed
4f467f5
Merge branch 'patchy/async_ngram' of https://github.com/PatchouliTIS/…
9a9b35e
pre-commits error fixed
c188d8b
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 2d5edf9
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 85de9af
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 2db87e8
merge conflicts fixed
234ff7b
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 3221bb0
merge conflicts
e42ecd3
Merge branch 'main' into patchy/async_ngram
PatchouliTIS c68efc7
Merge branch 'main' into patchy/async_ngram
PatchouliTIS a356135
Merge branch 'main' into patchy/async_ngram
PatchouliTIS eb20909
merge conflicts fixed
18e462d
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 3dc6545
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 3388786
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 67394d7
fixed bugs during preemption
a34c5be
Merge branch 'patchy/async_ngram' of https://github.com/PatchouliTIS/…
c8b8d71
fix bugs in preemption and add GSM8k Tests
0eab1f9
Merge branch 'main' of github.com:vllm-project/vllm into patchy/async…
8ae962b
fix merge conflicts in gpu_model_runner
1e12b12
fix merge conflicts in gpu_model_runner
3bf97f2
fix bugs
a2d216f
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 33a8aab
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 7e7ecac
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 5fdb7bc
merge conflicts
cb4fa70
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 6748677
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 21e26fc
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 0f1046c
Merge branch 'main' of github.com:vllm-project/vllm into patchy/async…
afd7933
Merge branch 'main' into patchy/async_ngram
PatchouliTIS 942c8ae
remove cuda api call
b9e9c8a
Merge branch 'patchy/async_ngram' of https://github.com/PatchouliTIS/…
9da1866
Merge branch 'main' of github.com:vllm-project/vllm into patchy/async…
a5e7bb3
pre-commits fixed
07fa301
Merge branch 'main' into patchy/async_ngram
PatchouliTIS e48c64f
Merge branch 'main' into patchy/async_ngram
PatchouliTIS cb508b6
Merge branch 'main' into patchy/async_ngram
PatchouliTIS bc71da2
Merge branch 'main' into patchy/async_ngram
PatchouliTIS fc64156
Merge branch 'main' into patchy/async_ngram
PatchouliTIS