[KVConnector] Handle KV events from multiple connectors#31811
[KVConnector] Handle KV events from multiple connectors#31811hickeyma wants to merge 9 commits intovllm-project:mainfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request adds support for handling KV cache events from multiple connectors, which is a valuable enhancement. The implementation introduces MultiConnectorKVEvents and updates the aggregation logic accordingly. While the overall approach is sound, I've identified several areas for improvement. My review focuses on simplifying the new tests, which are currently overly complex, removing redundant code in the event aggregation logic, and addressing a potential performance bottleneck by replacing a deepcopy operation with a more efficient alternative. These changes will improve the code's readability, maintainability, and performance.
vllm/distributed/kv_transfer/kv_connector/v1/multi_connector.py
Outdated
Show resolved
Hide resolved
8e5aaf7 to
a1f3883
Compare
|
Hi @hickeyma, the pre-commit checks have failed. Please run: uv pip install pre-commit
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
a1f3883 to
1041602
Compare
|
This pull request has merge conflicts that must be resolved before it can be |
1041602 to
866588d
Compare
8fa9c0d to
00e3b50
Compare
a426db6 to
a87f816
Compare
b34f6a8 to
6041228
Compare
NickLucche
left a comment
There was a problem hiding this comment.
Thanks for the followup @hickeyma .
It appears current impl is devaiating somewhat from my expected model (eg MultiKVConnectorStats). What's preventing us from implementing the abstract methods in MultiConnectorKVEvents and let the system program to that interface as it was no different from an LMCacheKVEvents instance?
vllm/distributed/kv_transfer/kv_connector/v1/multi_connector.py
Outdated
Show resolved
Hide resolved
vllm/distributed/kv_transfer/kv_connector/v1/multi_connector.py
Outdated
Show resolved
Hide resolved
vllm/distributed/kv_transfer/kv_connector/v1/multi_connector.py
Outdated
Show resolved
Hide resolved
43b8ec7 to
4a770fc
Compare
|
Hi @hickeyma, the pre-commit checks have failed. Please run: uv pip install pre-commit
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
This is to handle Multiple connectors (different types) that generate KV events. Each connector type may process thier events differently before making them available to scheduler for publishing and therefore the specific connector |
Address PR feedback: the current merge logic doesn't properly group events by connector. Remove the implementation and restore the TODO comment pointing to PR vllm-project#31811 which implements this properly. Signed-off-by: Seiji Eicher <seiji@anyscale.com>
84f7c28 to
5b61971
Compare
Co-authored with gemini-code-assist ion: - vllm-project#31811 (comment) - vllm-project#31811 (comment) Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
Co-authored with cursor: - vllm-project#31811 (comment) Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
Review comments: - vllm-project#31811 (review) Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
…KVEvents Add support for KV cache events in MultiConnector by implementing the relevant KVConnectorKVEvents interface methods. Changes: - Add ConnectorKVCacheEvents class that wraps events per connector type, extending KVCacheEvent for serialization compatibility - Implement get_all_events() to return events grouped by connector as a list of ConnectorKVCacheEvents - Implement add_events() to dispatch incoming ConnectorKVCacheEvents to their respective underlying connector event containers - Implement increment_workers() as no-op since worker tracking is delegated to individual connectors via add_events() - Add get_connector_events() helper method This allows KVOutputAggregator to handle MultiConnectorKVEvents using the standard interface without adding special logic for multi-connectors, simplifying the aggregation logic in utils.py. Update following review comment: - vllm-project#31811 (comment) Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
Comments: - vllm-project#31811 (review) Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
77ee8fc to
dbe9fb1
Compare
Co-authored with gemini-code-assist ion: - vllm-project#31811 (comment) - vllm-project#31811 (comment) Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
Co-authored with cursor: - vllm-project#31811 (comment) Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
Review comments: - vllm-project#31811 (review) Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
…KVEvents Add support for KV cache events in MultiConnector by implementing the relevant KVConnectorKVEvents interface methods. Changes: - Add ConnectorKVCacheEvents class that wraps events per connector type, extending KVCacheEvent for serialization compatibility - Implement get_all_events() to return events grouped by connector as a list of ConnectorKVCacheEvents - Implement add_events() to dispatch incoming ConnectorKVCacheEvents to their respective underlying connector event containers - Implement increment_workers() as no-op since worker tracking is delegated to individual connectors via add_events() - Add get_connector_events() helper method This allows KVOutputAggregator to handle MultiConnectorKVEvents using the standard interface without adding special logic for multi-connectors, simplifying the aggregation logic in utils.py. Update following review comment: - vllm-project#31811 (comment) Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
Comments: - vllm-project#31811 (review) Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
|
This pull request has merge conflicts that must be resolved before it can be |
This is a follow on PR to vllm-project#28309. It was raised by @NickLucche in vllm-project#28309 (review) to provide support for KV events for multiple conectors. Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
The use of copy.deepcopy() on connector_output.kv_cache_events can lead to significant performance overhead, especially if the number of events is large. A better approach is to avoid modifying the connector_output object directly in the loop. Instead, create a temporary, shallow copy of connector_output for each sub-connector, replacing kv_cache_events with the appropriate subset of events. This can be done efficiently using dataclasses.replace. This avoids the expensive deep copy and makes the code's intent clearer. Co-authored with gemini-code-assist Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
Co-authored with gemini-code-assist ion: - vllm-project#31811 (comment) - vllm-project#31811 (comment) Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
Rempve unnecessary patching from test functions. Co-authored with gemini-code-assist Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
Co-authored with cursor: - vllm-project#31811 (comment) Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
Review comments: - vllm-project#31811 (review) Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
…KVEvents Add support for KV cache events in MultiConnector by implementing the relevant KVConnectorKVEvents interface methods. Changes: - Add ConnectorKVCacheEvents class that wraps events per connector type, extending KVCacheEvent for serialization compatibility - Implement get_all_events() to return events grouped by connector as a list of ConnectorKVCacheEvents - Implement add_events() to dispatch incoming ConnectorKVCacheEvents to their respective underlying connector event containers - Implement increment_workers() as no-op since worker tracking is delegated to individual connectors via add_events() - Add get_connector_events() helper method This allows KVOutputAggregator to handle MultiConnectorKVEvents using the standard interface without adding special logic for multi-connectors, simplifying the aggregation logic in utils.py. Update following review comment: - vllm-project#31811 (comment) Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
Comments: - vllm-project#31811 (review) Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
dbe9fb1 to
2193699
Compare
From code review comment: - vllm-project#31811 (comment) Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
|
Superseded by #34522 |
Purpose
This is a follow on PR to #28309. It was raised by @NickLucche in #28309 (review) to request support for KV events for multiple connectors.
Test Plan
Test using vLLM and LMCache from CLI:
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.Note
Adds per-connector KV event handling and aggregation for
MultiConnectorand across workers.MultiConnectorKVEventsto hold KV events keyed by connector typeMultiConnector.get_kv_connector_kv_cache_events()to group events by connector;update_connector_output()now dispatches per-connector events downstreamKVOutputAggregatorto mergeMultiConnectorKVEventsacross workers while preserving per-connector boundariestest_multi_connector.pyWritten by Cursor Bugbot for commit b34f6a8a3760a496e0181ad4a5c399e4bf5ceb8a. This will update automatically on new commits. Configure here.
Note
Cursor Bugbot is generating a summary for commit 6041228d3729cc70f0c4d92534f26913729453e1. Configure here.
Note
Introduces per-connector KV cache event plumbing for multi-connector setups and aggregates events across workers while preserving connector boundaries.
MultiConnectorKVEventsto hold per-connectorKVConnectorKVEventsMultiConnector.get_kv_connector_kv_cache_events()groups events by connector type;update_connector_output()dispatches per-connector events to each connectorKVOutputAggregatorinkv_connector/utils.pymergesMultiConnectorKVEventsacross workers (including mixed single/multi cases)tests/v1/kv_connector/unit/test_multi_connector.pyfor event grouping, merging, and edge casesWritten by Cursor Bugbot for commit 6041228d3729cc70f0c4d92534f26913729453e1. This will update automatically on new commits. Configure here.
Note
Adds per-connector KV cache event plumbing to multi-connector setups and preserves connector boundaries during aggregation.
MultiConnectorKVEventscontainer to hold per-connectorKVConnectorKVEventsMultiConnector.get_kv_connector_kv_cache_events()groups events by connector type;update_connector_output()dispatches per-connector events to each connectorKVOutputAggregatorinkv_connector/utils.pymergesMultiConnectorKVEventsacross workers (also handles mixed single/multi cases)test_multi_connector.pyfor event grouping, merging, and edge casesWritten by Cursor Bugbot for commit 43b8ec71f51294bc09afb37f73df1a6ab1b165ba. This will update automatically on new commits. Configure here.