Conversation


robertgshaw2-redhat (Collaborator) commented May 3, 2025

SUMMARY:

  • clean up the NixlConnector APIs and add better typing / comments
  • enable each rank to exchange metadata with its counterpart rank
  • rank 0 keeps track of which ranks have finished txns (see the sketch after the TODO list)

TODO (longer term):

  • add support for heterogeneous TP
  • use etcd rather than a side channel for metadata exchange
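For illustration, a minimal sketch of the rank-0 bookkeeping described in the summary, assuming each worker rank reports its locally finished transfer IDs to rank 0 over the TP group (the helper name and the send_object/recv_object calls are assumptions for this sketch, not the PR's actual API):

# Hypothetical sketch: rank 0 aggregates finished request IDs from all TP ranks.
def gather_finished_req_ids(tp_group, rank: int, world_size: int,
                            local_finished: list[str]) -> list[str] | None:
    if rank == 0:
        finished = list(local_finished)
        for src in range(1, world_size):
            # Collect the IDs each other rank has finished transferring.
            finished.extend(tp_group.recv_object(src=src))
        return finished
    # Non-zero ranks report their finished IDs to rank 0.
    tp_group.send_object(local_finished, dst=0)
    return None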

ApostaC and others added 30 commits on April 17, 2025 at 13:56, including:

- fix spelling
- CUDA_VISIBLE_DEVICES should be set externally

Signed-off-by: Tyler Michael Smith <[email protected]>
robertgshaw2-redhat changed the title from "[P/D] Support TP g.t. 1" to "[P/D Disagg] [1/N] Support Homogeneous TP > 1" on May 3, 2025
from vllm.config import VllmConfig
from vllm.distributed.kv_transfer.kv_connector.v1.base import (
    KVConnectorBase_V1, KVConnectorMetadata, KVConnectorRole)
from vllm.distributed.parallel_state import get_tensor_model_parallel_rank
robertgshaw2-redhat (Collaborator, Author) commented:

In a follow-up: do this by passing the rank to the Connector directly rather than grabbing it here.
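As a rough sketch of that follow-up idea (the constructor signature here is assumed, not the PR's actual code), the connector could receive the rank from its caller instead of reading global parallel state:

# Assumed sketch: inject the TP rank rather than calling
# get_tensor_model_parallel_rank() inside the connector.
class NixlConnector(KVConnectorBase_V1):
    def __init__(self, vllm_config: VllmConfig, role: KVConnectorRole,
                 tp_rank: int):  # rank passed in by the caller (hypothetical)
        super().__init__(vllm_config=vllm_config, role=role)
        self.tp_rank = tp_rank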

other_ranks_finished_ids: list[str] = []
for i in range(1, self.world_size):
    other_ranks_finished_ids.extend(
        self.tp_group.recv_object(src=i))
robertgshaw2-redhat (Collaborator, Author) commented:

This is how Dynamo does it (with the tp_group).

I wonder if there is a better way. cc @njhill
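For context, a rough sketch of the full send/receive pairing this snippet is part of, assuming the TP group also exposes a send_object counterpart (self.rank and local_finished_ids are illustrative names, not from the PR):

# Hypothetical pairing: rank 0 receives, every other rank sends.
if self.rank == 0:
    other_ranks_finished_ids: list[str] = []
    for i in range(1, self.world_size):
        # Rank 0 collects finished request IDs from each other rank.
        other_ranks_finished_ids.extend(self.tp_group.recv_object(src=i))
else:
    # Non-zero ranks report their locally finished request IDs to rank 0.
    self.tp_group.send_object(local_finished_ids, dst=0)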

Member commented:

@robertgshaw2-redhat here is an alternative to consider: robertgshaw2-redhat#7

I guess this might be preferable latency-wise since we don't have the additional gather collective, but I'm not sure (the scheduler now needs to receive from all ranks, though it was doing this anyhow until recently).

robertgshaw2-redhat (Collaborator, Author) commented:

Let's just time things and see which one is faster.

robertgshaw2-redhat (Collaborator, Author) commented May 4, 2025:

Looks like for TP=2 the setup I have is taking < 1 ms, so I think this is good enough for now, as I would prefer to keep the changes in this file if possible.
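For reference, one generic way to time this kind of aggregation (a micro-timing sketch, not the benchmark actually used here):

import time

# Time the rank-0 aggregation loop shown earlier in the diff.
start = time.perf_counter()
other_ranks_finished_ids: list[str] = []
for i in range(1, self.world_size):
    other_ranks_finished_ids.extend(self.tp_group.recv_object(src=i))
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"finished-id aggregation took {elapsed_ms:.3f} ms")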

robertgshaw2-redhat marked this pull request as ready for review on May 4, 2025 at 19:18.
robertgshaw2-redhat merged commit 06847be into neuralmagic:disagg_pd_dev on May 4, 2025; 2 checks passed.