1 file changed: +2 -13 lines changed
@@ -73,7 +73,7 @@ python -m dynamo.vllm \

Deploy prefill and decode workers on separate nodes for optimized resource utilization:

-**Node 1**: Run ingress and prefill workers
+**Node 1**: Run ingress and decode worker
```bash
# Start ingress
python -m dynamo.frontend --router-mode kv &
@@ -85,7 +85,7 @@ python -m dynamo.vllm \
  --enforce-eager
```

-**Node 2**: Run decode workers
+**Node 2**: Run prefill worker
```bash
# Start decode worker
python -m dynamo.vllm \
@@ -94,14 +94,3 @@ python -m dynamo.vllm \
  --enforce-eager \
  --is-prefill-worker
```
-
-## Large Model Deployment
-
-For models requiring more GPUs than available on a single node such as tensor-parallel-size 16:
-
-**Node 1**: First part of tensor-parallel model
-```bash
-# Start ingress
-python -m dynamo.frontend --router-mode kv &
-```
-