17 changes: 7 additions & 10 deletions vllm/executor/ray_gpu_executor.py
@@ -224,16 +224,13 @@ def _init_workers_ray(self, placement_group: "PlacementGroup",
         # broadcasted to.
         self.non_driver_workers: List[RayWorkerWrapper] = []

-        for pp_rank in range(self.parallel_config.pipeline_parallel_size):
-            for tp_rank in range(self.parallel_config.tensor_parallel_size):
-                rank = (pp_rank *
-                        self.parallel_config.tensor_parallel_size) + tp_rank
-                if rank == 0:
-                    pass
-                elif rank % self.parallel_config.tensor_parallel_size == 0:
-                    self.tp_driver_workers.append(self.workers[rank - 1])
-                else:
-                    self.non_driver_workers.append(self.workers[rank - 1])
+        for idx, rank in enumerate(worker_ranks[1:]):
Member:

We can iterate over worker_ranks directly? I think worker_ranks[0] == 0 is always true.

Collaborator Author:

Yes, but then we would need to do idx - 1, which is a little less clean. We can get rid of the first if condition as well if you prefer that style, but I kept it for readability.

Member:

Let's get rid of the first if condition as well, and put a comment here saying that worker_ranks[0] == 0.

Collaborator Author:

Done

+            # We need to skip the driver worker, which we
+            # do by skipping worker_ranks[0], which is always 0.
+            if rank % self.parallel_config.tensor_parallel_size == 0:
+                self.tp_driver_workers.append(self.workers[idx])
+            else:
+                self.non_driver_workers.append(self.workers[idx])

def _driver_execute_model(
self, execute_model_req: Optional[ExecuteModelRequest]
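The partitioning logic in the new loop can be checked with a minimal standalone sketch. Note the names here (`partition_workers`, the string worker placeholders) are hypothetical, not part of the PR; per the discussion above, `worker_ranks` is assumed to be sorted with `worker_ranks[0] == 0` (the driver), and `workers` is assumed to hold only the non-driver workers in rank order:

```python
# Hypothetical standalone sketch of the worker-partitioning logic.
# Assumes worker_ranks is sorted and worker_ranks[0] == 0 is the driver,
# and that workers[idx] aligns with worker_ranks[idx + 1].
def partition_workers(worker_ranks, workers, tensor_parallel_size):
    tp_driver_workers = []
    non_driver_workers = []
    # Skip the driver worker by skipping worker_ranks[0], which is always 0.
    for idx, rank in enumerate(worker_ranks[1:]):
        if rank % tensor_parallel_size == 0:
            # First rank of each pipeline stage drives its TP group.
            tp_driver_workers.append(workers[idx])
        else:
            non_driver_workers.append(workers[idx])
    return tp_driver_workers, non_driver_workers


# Example: 2 pipeline stages x TP size 4 gives ranks 0..7; rank 0 is the
# driver, rank 4 drives the second stage's TP group.
tp_drv, non_drv = partition_workers(
    list(range(8)), [f"w{r}" for r in range(1, 8)], tensor_parallel_size=4)
```

With this layout, `tp_drv` ends up holding only `"w4"` (rank 4), and the remaining six workers land in `non_drv`, mirroring how the PR splits `tp_driver_workers` from `non_driver_workers`.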