[P/D] Update output of get_finished to newly defined class KVConnectorFinishOutput#21790
[P/D] Update output of get_finished to newly defined class KVConnectorFinishOutput#21790lk-chen wants to merge 20 commits intovllm-project:mainfrom
get_finished to newly defined class KVConnectorFinishOutput#21790Conversation
Signed-off-by: Linkun Chen <github@lkchen.net>
Signed-off-by: Linkun Chen <github@lkchen.net>
|
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
|
This pull request has merge conflicts that must be resolved before it can be |
There was a problem hiding this comment.
Code Review
This pull request introduces a new KVConnectorFinishOutput class to standardize the output from kv_connector implementations, which is a great step towards improving code clarity and maintainability. The removal of the V0 connector code is also a welcome cleanup.
I've found a few critical issues that need to be addressed. The __init__ method of the new KVConnectorFinishOutput class has an incorrect return type hint. Additionally, several connectors are not providing all the required arguments when instantiating this new class, which would lead to runtime errors. I've provided specific suggestions for these issues in the comments. Addressing these will ensure the refactoring is robust and correct.
| finished_sending: set[str], | ||
| finished_recving: set[str], | ||
| finished_loading_num_tokens: dict[str, int], | ||
| ) -> "KVConnectorBase_V1.KVConnectorFinishOutput": |
| finished_sending, finished_recving = self._lmcache_engine.get_finished( | ||
| finished_req_ids) | ||
| return KVConnectorBase_V1.KVConnectorFinishOutput( | ||
| finished_sending=finished_sending, | ||
| finished_recving=finished_recving) |
There was a problem hiding this comment.
The KVConnectorFinishOutput constructor is missing the required keyword argument finished_loading_num_tokens. This will cause a TypeError at runtime.
return KVConnectorBase_V1.KVConnectorFinishOutput(
finished_sending=finished_sending,
finished_recving=finished_recving,
finished_loading_num_tokens={})| finished_sending: set[str] = set() | ||
| finished_recving: set[str] = set() | ||
| for c in self._connectors: |
There was a problem hiding this comment.
With the change of the return type of get_finished to KVConnectorFinishOutput, the unpacking on line 114 (sending, recving = c.get_finished(finished_req_ids)) will raise a TypeError because get_finished no longer returns a tuple. You need to update the loop body to handle the new return type.
For example:
output = c.get_finished(finished_req_ids)
sending = output.finished_sending
recving = output.finished_recving| return KVConnectorBase_V1.KVConnectorFinishOutput( | ||
| finished_sending=finished_sending, | ||
| finished_recving=finished_recving) |
There was a problem hiding this comment.
The KVConnectorFinishOutput constructor is missing the required keyword argument finished_loading_num_tokens. This will cause a TypeError at runtime.
| return KVConnectorBase_V1.KVConnectorFinishOutput( | |
| finished_sending=finished_sending, | |
| finished_recving=finished_recving) | |
| return KVConnectorBase_V1.KVConnectorFinishOutput( | |
| finished_sending=finished_sending, | |
| finished_recving=finished_recving, | |
| finished_loading_num_tokens={}) |
| return KVConnectorBase_V1.KVConnectorFinishOutput( | ||
| finished_sending=done_sending, finished_recving=done_recving) |
There was a problem hiding this comment.
The KVConnectorFinishOutput constructor is missing the required keyword argument finished_loading_num_tokens. This will cause a TypeError at runtime.
| return KVConnectorBase_V1.KVConnectorFinishOutput( | |
| finished_sending=done_sending, finished_recving=done_recving) | |
| return KVConnectorBase_V1.KVConnectorFinishOutput( | |
| finished_sending=done_sending, finished_recving=done_recving, finished_loading_num_tokens={}) |
| finished_sending, finished_recving = self.p2p_nccl_engine.get_finished( | ||
| finished_req_ids, no_compile_layers) | ||
| return KVConnectorBase_V1.KVConnectorFinishOutput( | ||
| finished_sending=finished_sending or set(), | ||
| finished_recving=finished_recving or set(), | ||
| ) |
There was a problem hiding this comment.
The KVConnectorFinishOutput constructor is missing the required keyword argument finished_loading_num_tokens. This will cause a TypeError at runtime.
return KVConnectorBase_V1.KVConnectorFinishOutput(
finished_sending=finished_sending or set(),
finished_recving=finished_recving or set(),
finished_loading_num_tokens={},
)Signed-off-by: Linkun Chen <github@lkchen.net>
Signed-off-by: Linkun Chen <github@lkchen.net>
Signed-off-by: Linkun Chen <github@lkchen.net>
|
Hi @lk-chen, I completely agree on the need to encapsulate KV connector output in a structured way. I introduced a similar concept in #19330 — see KVConnectorOutput. In that PR, I didn’t yet propagate this class through ModelRunnerOutput or IntermediateTensors, as doing so would require broader refactoring that goes beyond the scope of the current change. There are also ongoing efforts to generalize and unify the structure of these fields, which are being discussed in #19330 and #19555. Would be great to align these directions going forward. |
Signed-off-by: Linkun Chen <github@lkchen.net>
Signed-off-by: Linkun Chen <github@lkchen.net>
Signed-off-by: Linkun Chen <github@lkchen.net>
Signed-off-by: Linkun Chen <github@lkchen.net>
Signed-off-by: Linkun Chen <github@lkchen.net>
Signed-off-by: Linkun Chen <github@lkchen.net>
|
This pull request has merge conflicts that must be resolved before it can be |
|
|
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.This PR is on top of #21785
Purpose
We need to pipe more information from
kv_connectorimplementations tomodel_runner(e.g.finished_loading_dictin #21534 needed by #19329 , and metrics to be added in #21784 ). It's not sustainable to append ad-hoc fields along with existing ones everywhere (ModelRunnerOutput,IntermediateTensorsetc.). This PR aggregates all these info into a new class.Test Plan
CI unit tests
Test Result
(Optional) Documentation Update