[KVConnector][1/N] PP-aware handshake aggregation and intermediate-PP output plumbing #43720
njhill
left a comment
Thanks @zixi-qi, this looks mostly good to me.
However, I think the SupportsPP interface is a bit heavy. It would be cleaner to manage this inside the KVConnectorBase_V1 class and just check for the existence of the pp_aware method on self rather than using a marker class.
Would be good for @NickLucche to bless the TransferTopology updates too when he has time!
Updated to remove the SupportsPP interface, thanks for the suggestion!
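For readers following the thread, here is a minimal sketch of the suggestion discussed above: checking for a method on the instance instead of inheriting from a marker class. It is not the code from this PR. `ConnectorBase`, `PlainConnector`, `PPConnector`, and `is_pp_aware` are hypothetical stand-ins, and the signature and return value of `pp_aware` are assumed.

```python
class ConnectorBase:
    """Hypothetical stand-in for KVConnectorBase_V1 (not the real vLLM class)."""

    def is_pp_aware(self) -> bool:
        # Duck-typed capability check: treat the connector as PP-aware
        # if it defines a callable attribute named pp_aware.
        return callable(getattr(self, "pp_aware", None))


class PlainConnector(ConnectorBase):
    """A connector that does not opt in to PP support."""


class PPConnector(ConnectorBase):
    """A connector that opts in by defining pp_aware (assumed signature)."""

    def pp_aware(self) -> bool:
        return True


if __name__ == "__main__":
    print(PlainConnector().is_pp_aware())
    print(PPConnector().is_pp_aware())
```

With this shape a connector opts in by defining one method, and the base class owns the check, so no extra interface has to be imported or subclassed.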
njhill
left a comment
Thanks @zixi-qi that looks much cleaner. I think we can clean up some comments further though.
One question: what happens if a non-PP-aware connector is used in a PP context? Can that still be valid? If not, we should think about where to validate this.
Thanks for the suggestion! I have cleaned up the excessive comments and will check the stacked PRs to do the same.
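The thread does not settle whether a non-PP-aware connector can be valid under PP. If it turns out not to be, one possible shape for an early check is sketched below. Everything here is an assumption for illustration: `validate_connector_for_pp`, the `pp_size` parameter, and the two dummy connector classes are hypothetical names, not vLLM APIs.

```python
def validate_connector_for_pp(connector: object, pp_size: int) -> None:
    """Hypothetical early check: reject non-PP-aware connectors when PP is on."""
    if pp_size <= 1:
        # No pipeline parallelism, so any connector is acceptable.
        return
    if not callable(getattr(connector, "pp_aware", None)):
        raise ValueError(
            f"{type(connector).__name__} is not PP-aware but "
            f"pipeline parallel size is {pp_size}"
        )


class LegacyConnector:
    """Dummy connector without a pp_aware method."""


class AwareConnector:
    """Dummy connector that defines pp_aware (assumed signature)."""

    def pp_aware(self) -> bool:
        return True


if __name__ == "__main__":
    validate_connector_for_pp(LegacyConnector(), pp_size=1)  # accepted
    validate_connector_for_pp(AwareConnector(), pp_size=4)   # accepted
    try:
        validate_connector_for_pp(LegacyConnector(), pp_size=4)
    except ValueError as exc:
        print(f"rejected: {exc}")
```

Running a check like this at configuration or connector-construction time would surface the mismatch before any KV transfer starts. Where it belongs in the real code base is the open question raised above.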
cc @NickLucche would be great if you could take a look as well :)
Co-authored-by: Claude Signed-off-by: zixi-qi <zixi@inferact.ai>
Purpose
The goal of this PR is to make the KV connector, engine, model runner, and GPU worker PP-aware, laying the foundation for supporting pipeline parallelism (PP) together with PD disaggregation.
Changes
Test Plan
AI-assisted PR. The submitter has reviewed every line.