forked from vllm-project/vllm
[P/D Disagg] [1/N] Support Homogeneous TP > 1 #65
Merged: robertgshaw2-redhat merged 145 commits into neuralmagic:disagg_pd_dev from robertgshaw2-redhat:tp-gt-1 on May 4, 2025
Commits
4730522
[Update] LMcache connector v1 implementation
ApostaC 4162650
[Add] examples for disaggregated prefill
ApostaC 3ccd34c
[add] extra information about evns
ApostaC 161010c
Initial stubs for P/D scheduling changes
tlrmchlsmth 38a2eb8
Merge branch 'main' into local-dev/lmcache-v1-connector-pr
tlrmchlsmth 6c3191f
Merge branch 'local-dev/lmcache-v1-connector-pr' into pd_scheduling_l…
tlrmchlsmth 1f708e9
Updates
tlrmchlsmth 038f2f8
Rs branch (#3)
robertgshaw2-redhat 5c4fc6f
Rs branch (#5)
robertgshaw2-redhat 1800689
Remove Unneeded Arguments (#7)
robertgshaw2-redhat 7a1f25f
Improve disagg-example.sh (#8)
tlrmchlsmth 2385d8e
updated
robertgshaw2-redhat 6eeb47c
updated
robertgshaw2-redhat 266fcee
updated
robertgshaw2-redhat f7e16f1
updated
robertgshaw2-redhat f591b8e
added connector
robertgshaw2-redhat 184d0b6
updated
robertgshaw2-redhat d4a9e5b
updated
robertgshaw2-redhat 4b0d1dc
updated
robertgshaw2-redhat bfef039
updated
robertgshaw2-redhat 54f4a43
updated
robertgshaw2-redhat e604b09
updated
robertgshaw2-redhat 2fc00ad
updated
robertgshaw2-redhat e5967b6
updated
robertgshaw2-redhat f1bc0f7
updated
robertgshaw2-redhat 1cea2bb
updated
robertgshaw2-redhat 489e4c0
updated
robertgshaw2-redhat 437ac91
updated
robertgshaw2-redhat ea47af7
updated
robertgshaw2-redhat 554b27d
updated
robertgshaw2-redhat 1aea5ba
updated
robertgshaw2-redhat e0c112b
updated
robertgshaw2-redhat c7717c1
update
robertgshaw2-redhat e0af1db
remove
robertgshaw2-redhat 9533471
updated
robertgshaw2-redhat 2eb068e
updated
robertgshaw2-redhat 0f2b7e3
updated
robertgshaw2-redhat 6127cb8
updated
robertgshaw2-redhat 568249e
updated
robertgshaw2-redhat ccb44ea
seems to load properly
robertgshaw2-redhat 3785905
updated
robertgshaw2-redhat 8a94b2e
updated
robertgshaw2-redhat ac19437
updated
robertgshaw2-redhat 6391ec9
updated
robertgshaw2-redhat 7dd764b
updated
robertgshaw2-redhat 97316d9
updated
robertgshaw2-redhat 2771353
Revert "updated"
robertgshaw2-redhat baed1bf
updated
robertgshaw2-redhat d0ad6d9
updated
robertgshaw2-redhat 055885e
updated
robertgshaw2-redhat 5ed3806
updated
robertgshaw2-redhat 58266b5
updated
robertgshaw2-redhat 344d9da
stash
robertgshaw2-redhat 2996638
added
robertgshaw2-redhat bcc88dc
diffs for local dev on macos
62205ae
updated
b4609a5
update
5d78ba6
updaed
c1f26b9
updated
9b9ef36
updated
c60639e
Checkpoint.
tlrmchlsmth 006dda3
Merge branch 'pd_scheduling_nixl' of https://github.com/robertgshaw2-…
tlrmchlsmth c5e023e
updated
8b0c93c
Cleanup
tlrmchlsmth 5e45d90
WIP
tlrmchlsmth 20a5491
updated
cee3c61
updated
5972571
updated on scheduler side
1b69d33
updated
74e105a
Merge remote-tracking branch 'rs/pd_scheduling_rob_dev' into nixl_int…
tlrmchlsmth 8adf1ad
updated
21ab3d9
updated
3a27bbc
updated
f252df9
updated
8104803
updated
10bbe21
Hacking away
tlrmchlsmth a14278c
Merge remote-tracking branch 'rs/pd_scheduling_rob_dev_2' into nixl_i…
tlrmchlsmth 65ea91f
cleanup
f2550ef
ensure request removed from running list
985bac3
Runs E2E. Garbage output. Crashes on 2nd request
tlrmchlsmth bf37a7d
update
tlrmchlsmth ebe1263
updated
a008aa3
updated
195dceb
rename files
e2cc365
updated
2324a50
Merge remote-tracking branch 'rs/pd_scheduling_rob_dev_2' into nixl_i…
tlrmchlsmth b4b64fe
updated
6686397
updated
8736043
updated
dcbf6e5
updated
7c8e21a
update
a4855d2
Second request no longer crashes
tlrmchlsmth 0914040
Merge remote-tracking branch 'rs/pd_scheduling_rob_dev_2' into nixl_i…
tlrmchlsmth c5b3053
Remove gpu_model_runner hacks
tlrmchlsmth 7502819
Clean up Justfile
tlrmchlsmth 7768b96
[Bugfix] Stale finished requests in EMPTY_MODEL_RUNNER_OUTPUT
tlrmchlsmth a5950b7
update
tlrmchlsmth 610a357
justfile edits
tlrmchlsmth 5b026ab
Update
tlrmchlsmth f2fadd6
Fixes - lm_eval gsm8k has correctness
tlrmchlsmth 4060f86
"just delete the assert"
tlrmchlsmth bfe9d19
fixup precommit issues
tlrmchlsmth ced529a
Fixes
tlrmchlsmth 83f2872
updated (#12)
robertgshaw2-redhat e853b3c
Add Accuracy Test (#13)
robertgshaw2-redhat 1c45ed1
Preemption Bugfixes (#15)
robertgshaw2-redhat a45a694
updated (#16)
robertgshaw2-redhat f6d0ac5
Merge branch 'main' into nixl_integration
tlrmchlsmth 2f9a3f3
Fix Bad Merge | Fix Memory Leak in Upstream (#18)
robertgshaw2-redhat 90ba831
updated
robertgshaw2-redhat 9378594
cleanup code
robertgshaw2-redhat 790c1b2
cleanup code
robertgshaw2-redhat e4802fd
updated
robertgshaw2-redhat f4c2915
updated
robertgshaw2-redhat 6346a64
updated
robertgshaw2-redhat a8832ec
stash
robertgshaw2-redhat dd0935a
complete merge
robertgshaw2-redhat 42a28ff
updated
robertgshaw2-redhat 422a9ac
updated
robertgshaw2-redhat 836e76b
updatted
robertgshaw2-redhat 0aafe4a
updated
robertgshaw2-redhat 4fe1829
updated
robertgshaw2-redhat d6b2531
Merge remote-tracking branch 'nm-fork/disagg_pd_dev' into tp-gt-1
robertgshaw2-redhat 1bbd623
revert
robertgshaw2-redhat afdcd2f
more spurious changes
robertgshaw2-redhat 6790c00
updated
robertgshaw2-redhat 87277d6
updated
robertgshaw2-redhat 8ff421e
updated
robertgshaw2-redhat 79af352
updated
robertgshaw2-redhat e21f5f9
updated
robertgshaw2-redhat 39fee21
updated
robertgshaw2-redhat 99a5afd
updated
robertgshaw2-redhat 1e0db0b
updated
robertgshaw2-redhat 9bdbe38
updated
robertgshaw2-redhat 911e480
updated
robertgshaw2-redhat 357bd03
updated
robertgshaw2-redhat 93a32eb
updated
robertgshaw2-redhat 181d68d
updated
robertgshaw2-redhat 01e5864
updated
robertgshaw2-redhat 04cba85
updated
robertgshaw2-redhat 06c5c39
updated
robertgshaw2-redhat 9a87c34
updated
robertgshaw2-redhat 027689d
updated
robertgshaw2-redhat 48add56
Update vllm/distributed/kv_transfer/kv_connector/v1/nixl_connector.py
robertgshaw2-redhat ed6fd4f
Update vllm/distributed/kv_transfer/kv_connector/v1/nixl_connector.py
robertgshaw2-redhat
this is how Dynamo does it (with the tp_group)
I wonder if there is a better way. cc @njhill
@robertgshaw2-redhat here is an alternative to consider: robertgshaw2-redhat#7
Guess this might be preferable latency-wise since we don't have an additional gather collective, but not sure (since now the scheduler needs to receive from all ranks, though it was doing this anyhow until recently).
let's just time things and see which one is faster
Looks like for TP=2, the setup I have is taking <1ms, so I think this is good enough for now, as I would prefer to keep the changes in this file if possible.
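For anyone who wants to repeat the "time both and compare" check from this thread, here is a rough sketch of a timing harness. It is not the code from this PR: `gather_then_report` and `scheduler_counts` are hypothetical pure-Python stand-ins for the two options discussed (a worker-side gather versus the scheduler receiving from all ranks), and it assumes that what each rank reports is a list of finished request IDs. No real collectives are run, so the numbers only show how the comparison could be set up.

```python
import time
from statistics import median


def time_call(fn, *args, repeats=200):
    """Return the median wall-clock time of fn(*args) in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        samples.append((time.perf_counter() - start) * 1e3)
    return median(samples)


# Hypothetical stand-in for option A: ranks gather to one rank, which
# reports only the request IDs that every rank has marked as finished.
def gather_then_report(per_rank_done):
    done_everywhere = set(per_rank_done[0])
    for rank_done in per_rank_done[1:]:
        done_everywhere &= set(rank_done)
    return done_everywhere


# Hypothetical stand-in for option B: the scheduler receives every rank's
# output and counts how many ranks have reported each request ID.
# Assumes a rank reports a given ID at most once.
def scheduler_counts(per_rank_done):
    counts = {}
    for rank_done in per_rank_done:
        for req_id in rank_done:
            counts[req_id] = counts.get(req_id, 0) + 1
    world_size = len(per_rank_done)
    return {req_id for req_id, n in counts.items() if n == world_size}


# Made-up TP=2 input: only "req-1" is finished on both ranks.
per_rank_done = [["req-0", "req-1"], ["req-1", "req-2"]]

gather_ms = time_call(gather_then_report, per_rank_done)
count_ms = time_call(scheduler_counts, per_rank_done)
print(f"gather_then_report: {gather_ms:.4f} ms (median)")
print(f"scheduler_counts:   {count_ms:.4f} ms (median)")
```

Using the median over many repeats keeps a single slow iteration from skewing the comparison. To measure the real paths, the two stand-ins would be swapped for the actual gather call and the actual scheduler-side receive.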