Commit d4ff6f0

docs: Fix typos and remove incomplete section from vllm multinode doc (#3588)
Signed-off-by: Ryan McCormick <[email protected]>
1 parent 8dd104d commit d4ff6f0

1 file changed: +2 -13 lines changed

docs/backends/vllm/multi-node.md

Lines changed: 2 additions & 13 deletions
@@ -73,7 +73,7 @@ python -m dynamo.vllm \
 
 Deploy prefill and decode workers on separate nodes for optimized resource utilization:
 
-**Node 1**: Run ingress and prefill workers
+**Node 1**: Run ingress and decode worker
 ```bash
 # Start ingress
 python -m dynamo.frontend --router-mode kv &
@@ -85,7 +85,7 @@ python -m dynamo.vllm \
 --enforce-eager
 ```
 
-**Node 2**: Run decode workers
+**Node 2**: Run prefill worker
 ```bash
 # Start decode worker
 python -m dynamo.vllm \
@@ -94,14 +94,3 @@ python -m dynamo.vllm \
 --enforce-eager \
 --is-prefill-worker
 ```
-
-## Large Model Deployment
-
-For models requiring more GPUs than available on a single node such as tensor-parallel-size 16:
-
-**Node 1**: First part of tensor-parallel model
-```bash
-# Start ingress
-python -m dynamo.frontend --router-mode kv &
-```
-