[Feature] add model aware kv ops helper #16020
Signed-off-by: billishyahao <[email protected]>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
KuntaiDu
left a comment
I like this PR! Some comments on naming stuff, but functionality LGTM!
vllm/distributed/kv_transfer/kv_connector/mooncake_store_connector.py
Outdated
vllm/distributed/kv_transfer/kv_connector/mooncake_store_connector.py
Outdated
vllm/distributed/kv_transfer/kv_connector/model_aware_kv_ops.py
Outdated
ShangmingCai
left a comment
I like that this PR modularizes this part of the code to reduce duplication and adapt to all connectors, but the filename model_aware_kv_ops.py seems a bit confusing; maybe the helper should be placed in a utils.py file. Otherwise, LGTM.
vllm/distributed/kv_transfer/kv_connector/mooncake_store_connector.py
Outdated
Signed-off-by: billishyahao <[email protected]>
ShangmingCai
left a comment
LGTM. But maybe a shorter name like "utils.py" will be better? So that we can put more util functions or helpers all in this file as well instead of creating so many files in the future. I suggest this because I see some "utils.py" in many sub-directories of vllm.
Yes, that makes sense. I renamed it in the latest commit, 07c73ea. Thanks!
Signed-off-by: billishyahao <[email protected]>
ShangmingCai
left a comment
@billishyahao LGTM now. You can ping @KuntaiDu to review it again.
KuntaiDu
left a comment
LGTM!
Can you merge from main to fix the CI failures?
@DarkLight1337 This could probably use a force-merge, since it only changes files under the vllm/distributed/kv_transfer/kv_connector directory, and the disaggregated serving feature doesn't have CI yet.
Signed-off-by: billishyahao <[email protected]> Signed-off-by: Yang Wang <[email protected]>
Signed-off-by: billishyahao <[email protected]> Signed-off-by: Mu Huai <[email protected]>
This patch provides
Tested on both AMD and NVIDIA data center GPUs to verify correctness in both the simple connector 1P1D case and the Mooncake store connector XPYD case.
XPYD:
1P1D