Commit aa1b9a0

Fix: the scheduler could start on rank > 0
Signed-off-by: richardhuo-nv <[email protected]>
1 parent 0126e6f commit aa1b9a0

1 file changed, +2 -1 lines changed

tensorrt_llm/_torch/pyexecutor/py_executor_creator.py

Lines changed: 2 additions & 1 deletion
@@ -404,14 +404,15 @@ def create_py_executor(
     scheduler_cls = getattr(
         module, kv_connector_config.connector_scheduler_class)

+    rank = tensorrt_llm.mpi_rank()
     # Some connector API implementations may need to establish out-of-band communication between the scheduler and workers.
     # In this case, the worker may be dependent on the scheduler, or vice-versa.
     # To deal with cases like this, we instantiate them both concurrently.
     with ThreadPoolExecutor(max_workers=2) as executor:
         connector_worker_task = executor.submit(worker_cls,
                                                 executor_config)

-        if scheduler_cls is not None:
+        if scheduler_cls is not None and rank == 0:
             connector_scheduler_task = executor.submit(
                 scheduler_cls, executor_config)
             connector_scheduler = connector_scheduler_task.result()
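
The rank-gated pattern in this hunk can be tried in isolation with the minimal sketch below. It is not the TensorRT-LLM implementation: `DemoWorker`, `DemoScheduler`, and `create_connector_parts` are hypothetical names, `rank` is passed in as a plain argument instead of being read from `tensorrt_llm.mpi_rank()`, and returning `None` for the scheduler on non-zero ranks is an assumption about what the surrounding code does.

```python
from concurrent.futures import ThreadPoolExecutor


class DemoWorker:
    """Hypothetical stand-in for the connector worker class."""

    def __init__(self, config):
        self.config = config


class DemoScheduler:
    """Hypothetical stand-in for the connector scheduler class."""

    def __init__(self, config):
        self.config = config


def create_connector_parts(worker_cls, scheduler_cls, config, rank):
    """Build the worker on every rank and the scheduler only on rank 0.

    Both constructors are submitted to a thread pool so that either one
    may block while waiting for the other to come up.
    """
    with ThreadPoolExecutor(max_workers=2) as executor:
        worker_task = executor.submit(worker_cls, config)

        # Assumption: ranks other than 0 simply carry no scheduler.
        scheduler = None
        if scheduler_cls is not None and rank == 0:
            scheduler_task = executor.submit(scheduler_cls, config)
            scheduler = scheduler_task.result()

        worker = worker_task.result()
    return worker, scheduler
```

Calling `create_connector_parts(DemoWorker, DemoScheduler, {}, rank=1)` yields a worker and no scheduler, while `rank=0` yields both, which is the behaviour the commit title describes.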
