@@ -73,7 +73,7 @@ python -m dynamo.vllm \
 
 Deploy prefill and decode workers on separate nodes for optimized resource utilization:
 
-**Node 1**: Run ingress and prefill workers
+**Node 1**: Run ingress and decode worker
 ```bash
 # Start ingress
 python -m dynamo.frontend --router-mode kv &
@@ -85,7 +85,7 @@ python -m dynamo.vllm \
     --enforce-eager
 ```
 
-**Node 2**: Run decode workers
+**Node 2**: Run prefill worker
 ```bash
 # Start decode worker
 python -m dynamo.vllm \
@@ -94,14 +94,3 @@ python -m dynamo.vllm \
     --enforce-eager \
     --is-prefill-worker
 ```
-
-## Large Model Deployment
-
-For models requiring more GPUs than available on a single node such as tensor-parallel-size 16:
-
-**Node 1**: First part of tensor-parallel model
-```bash
-# Start ingress
-python -m dynamo.frontend --router-mode kv &
-```
-