[KVConnector]: Enable Cross-layers KV cache layout for MultiConnector#30761
[KVConnector]: Enable Cross-layers KV cache layout for MultiConnector#30761NickLucche merged 2 commits intovllm-project:mainfrom
Conversation
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
There was a problem hiding this comment.
Code Review
This pull request enables the MultiConnector to support the new continuous cross-layer KV cache layout. The changes correctly propagate the prefer_cross_layer_blocks property and the register_cross_layers_kv_cache method to the underlying connectors. My review identifies a potential edge case in the implementation of prefer_cross_layer_blocks where it could return an incorrect value if no connectors are configured, which could lead to unexpected behavior. I've provided a suggestion to make the implementation more robust.
vllm/distributed/kv_transfer/kv_connector/v1/multi_connector.py
Outdated
Show resolved
Hide resolved
|
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run You ask your reviewers to trigger select CI tests on top of Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add If you have any questions, please reach out to us on Slack at https://slack.vllm.ai. 🚀 |
|
Hi @kfirtoledo, the pre-commit checks have failed. Please run: uv pip install pre-commit
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
|
@NickLucche per your suggestion here we defined |
83fdf01 to
cd61201
Compare
NickLucche
left a comment
There was a problem hiding this comment.
Thanks for doing this @kfirtoledo!
And yes the property change is fine here @orozery , given this feature was just introduced.
Would you mind adding a small unit test to keep this feature in check? It looks trivial, but we had similar issues with MC before..
Other than that this is lgtm.
cd61201 to
517698a
Compare
517698a to
c842eee
Compare
|
@NickLucche and @orozery, I added the unit tests, PTAL |
vllm/distributed/kv_transfer/kv_connector/v1/multi_connector.py
Outdated
Show resolved
Hide resolved
c842eee to
2f7a452
Compare
|
Hi @kfirtoledo, the pre-commit checks have failed. Please run: uv pip install pre-commit
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>
2f7a452 to
288e3a1
Compare
NickLucche
left a comment
There was a problem hiding this comment.
Thanks @kfirtoledo !
…vllm-project#30761) Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>
…vllm-project#30761) Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>
…vllm-project#30761) Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com> Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
…vllm-project#30761) Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>
Enable MultiConnector to use the new continuous cross-layer KV cache layout
described in RFC #27742 and implemented in #27743.
The MultiConnector aggregates cross-layer support across all underlying connectors
and returns true only if all connectors support it.