[V1] Support multiple kv connectors #17564
Signed-off-by: mgoin <[email protected]> Signed-off-by: Nick Hill <[email protected]>
Dependent on #17686 landing first for |
From PR vllm-project#17564 Co-authored-by: Nick Hill <[email protected]> Signed-off-by: mgoin <[email protected]> Signed-off-by: Nick Hill <[email protected]>
tlrmchlsmth
left a comment
LGTM. My main comment is that we should make sure to document the behavior of the loads, i.e. make sure the precedence order is written down somewhere.
This was there originally but inadvertently dropped Signed-off-by: Nick Hill <[email protected]>
@maobaolong We'll add better doc/comments for this but the logic initially is to:
|
@njhill Thanks for the explanation. I'd like to know more about the motivation and the scenario; some more description would help. So I guess this PR cannot solve the coexistence issue for a P/D transfer connector and an offload-reuse connector?
@maobaolong this is intended to be a starting point. It can work with P/D + offload-reuse if the P/D connector uses the kv transfer params passed in the request to determine whether it should offer tokens to load. This is how the current NixlConnector works: it will only return a nonzero token count from We can extend it with other load-selection logic if you have additional ideas...
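The exact selection steps are not spelled out in this thread, so the following is only one possible reading of the precedence order discussed above: connectors are consulted in list order, and the first one that offers a nonzero token count is used for loading. Everything here is a hypothetical stand-in (`StubConnector`, `ParamGatedConnector`, `offered_tokens`, `choose_loader`, and the `"kv_transfer_params"` key); none of it is vLLM's actual API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class StubConnector:
    """Hypothetical stand-in for a KV connector; not a vLLM class."""
    name: str
    tokens_available: int

    def offered_tokens(self, request_params: dict) -> int:
        # Hypothetical method: how many tokens this connector offers to load.
        return self.tokens_available


@dataclass
class ParamGatedConnector(StubConnector):
    """Offers tokens only when the request carries kv transfer params."""

    def offered_tokens(self, request_params: dict) -> int:
        # Assumed key name; the thread only says "kv transfer params".
        if not request_params.get("kv_transfer_params"):
            return 0
        return self.tokens_available


def choose_loader(
    connectors: Sequence[StubConnector], request_params: dict
) -> Optional[Tuple[str, int]]:
    """Return (name, token count) of the first connector offering tokens."""
    for connector in connectors:
        count = connector.offered_tokens(request_params)
        if count > 0:
            return connector.name, count
    return None  # no connector has anything to load


# P/D-style connector listed first, offload-style connector second.
connectors = [ParamGatedConnector("pd", 128), StubConnector("offload", 64)]
```

Under this reading, a request without transfer params falls through to the second connector, while a request that carries them is served by the first. That is how a P/D connector and an offload-reuse connector could coexist in one ordered list.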
Signed-off-by: mgoin <[email protected]> Signed-off-by: Nick Hill <[email protected]> Co-authored-by: Nick Hill <[email protected]> Signed-off-by: Yuqi Zhang <[email protected]>
Co-authored with @njhill.
We take advantage of the kv_connector_extra_config: dict[str, Any] already present in KVTransferConfig to stash all the connectors we want in an ordered list of kwargs. The new MultiConnector then creates a connector from each item in its list of KVTransferConfig.kv_connector_extra_config["connectors"]:

Example usage:
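The original example is not preserved in this capture. As a minimal sketch of the nesting described above, the config can be built as a plain dictionary and serialized to JSON. Only `MultiConnector`, `NixlConnector`, `kv_connector_extra_config`, and the `"connectors"` key come from this thread. The `kv_connector` and `kv_role` field names, the `"kv_both"` value, `SharedStorageConnector`, and `shared_storage_path` are assumptions used for illustration.

```python
import json

# Each list entry is assumed to mirror the fields of a KVTransferConfig.
# List order matters: it is the order in which MultiConnector sees them.
multi_connector_config = {
    "kv_connector": "MultiConnector",
    "kv_role": "kv_both",  # assumed role value
    "kv_connector_extra_config": {
        "connectors": [
            {"kv_connector": "NixlConnector", "kv_role": "kv_both"},
            {
                # Assumed second connector and its option, for illustration.
                "kv_connector": "SharedStorageConnector",
                "kv_role": "kv_both",
                "kv_connector_extra_config": {
                    "shared_storage_path": "local_storage"
                },
            },
        ]
    },
}

# Serialize for use wherever a KV transfer config is accepted as JSON.
kv_transfer_config_json = json.dumps(multi_connector_config)

# The ordered connector names, as MultiConnector would encounter them.
connector_names = [
    entry["kv_connector"]
    for entry in multi_connector_config["kv_connector_extra_config"]["connectors"]
]
```

The resulting JSON string could then be supplied as the KV transfer config when launching the server, for example through a `--kv-transfer-config` argument, assuming the deployment accepts the config in that form.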