Commit 90ee473

chore: update LLM Router configuration files and README for improved deployment instructions

- Added SPDX license headers to llm-router-values-override.yaml.
- Updated imageRegistry placeholder in llm-router-values-override.yaml for clarity.
- Revised README.md to reflect changes in directory structure and emphasize the need to update imageRegistry and imagePullSecrets.
- Adjusted paths in README.md for configuration file references to ensure accuracy.
- Modified router-config-dynamo.yaml to enhance model routing strategies and updated model names for better clarity.
1 parent 3bb2b21 commit 90ee473

3 files changed: +61 -59 lines changed

examples/deployments/LLM Router/README.md

Lines changed: 21 additions & 8 deletions
@@ -562,8 +562,8 @@ docker pull $DYNAMO_IMAGE
 ### Validate Configuration Files

 ```bash
-# Navigate to the customization directory
-cd customizations/LLM\ Router
+# Navigate to the deployment directory
+cd examples/deployments/LLM\ Router

 # Check that required files exist
 ls -la frontend.yaml agg.yaml disagg.yaml router-config-dynamo.yaml llm-router-values-override.yaml
@@ -704,7 +704,7 @@ kubectl create secret generic hf-token-secret \
 -n ${NAMESPACE}

 # 2. Navigate to your LLM Router directory (where agg.yaml/disagg.yaml are located)
-cd "customizations/LLM Router/"
+cd "examples/deployments/LLM Router/"
 ```

 #### Shared Frontend Deployment
@@ -884,9 +884,22 @@ kubectl get secrets -n llm-router
 git clone https://github.com/NVIDIA-AI-Blueprints/llm-router.git
 cd llm-router

-# 2. Use official NVIDIA LLM Router images (no building required)
-# Our values file is configured to use the official images from nvcr.io/nvidian/sae/
-# If you need custom images, build and push them to your registry:
+# 2. Configure Docker Registry (REQUIRED)
+# IMPORTANT: Update the imageRegistry in llm-router-values-override.yaml before deployment
+# The file contains a placeholder "YOUR_REGISTRY_HERE/" that MUST be replaced.
+
+# Edit the values file:
+nano ../examples/deployments/LLM\ Router/llm-router-values-override.yaml
+
+# Update line ~34: Replace "YOUR_REGISTRY_HERE/" with your actual registry:
+# Examples:
+# - "nvcr.io/nvidia/" (if you have access to NVIDIA's public registry)
+# - "your-company-registry.com/llm-router/" (for private registries)
+# - "docker.io/your-username/" (for Docker Hub)
+
+# Also update imagePullSecrets name to match your registry credentials
+
+# If you need to build custom images, use:
 # docker build -t <your-registry>/router-server:latest -f src/router-server/router-server.dockerfile .
 # docker build -t <your-registry>/router-controller:latest -f src/router-controller/router-controller.dockerfile .
 # docker push <your-registry>/router-server:latest
@@ -896,7 +909,7 @@ cd llm-router
 # 3. Create router configuration ConfigMap using official External ConfigMap strategy
 # The official Helm chart now supports external ConfigMaps natively
 kubectl create configmap router-config-dynamo \
---from-file=config.yaml=router-config-dynamo.yaml \
+--from-file=config.yaml=../examples/deployments/LLM\ Router/router-config-dynamo.yaml \
 --namespace=llm-router

 # 4. Prepare router models (download from NGC)
@@ -954,7 +967,7 @@ kubectl create secret generic llm-api-keys \
 cd deploy/helm/llm-router
 helm upgrade --install llm-router . \
 --namespace llm-router \
---values ../../../llm-router-values-override.yaml \
+--values ../../../../examples/deployments/LLM\ Router/llm-router-values-override.yaml \
 --wait --timeout=10m

 # 6. Verify LLM Router deployment
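
The new relative paths in the README hunks above are run from two different working directories (the clone root for `nano`, and `deploy/helm/llm-router` for `helm`). They only agree if the cloned `llm-router` repository sits next to the `examples/` directory. A minimal sketch that checks this without touching the disk, assuming a hypothetical checkout root of `/work` (the README does not name one):

```python
import posixpath

# Hypothetical layout: /work/examples/... with llm-router cloned into /work/llm-router.
CLONE_ROOT = "/work/llm-router"
HELM_DIR = posixpath.join(CLONE_ROOT, "deploy/helm/llm-router")
VALUES_FILE = "examples/deployments/LLM Router/llm-router-values-override.yaml"


def resolve(cwd: str, relative: str) -> str:
    """Resolve a relative path against a working directory (pure string math)."""
    return posixpath.normpath(posixpath.join(cwd, relative))


# Path opened with `nano`, run from the clone root.
from_clone_root = resolve(CLONE_ROOT, "../" + VALUES_FILE)
# Path passed to `helm --values`, run from deploy/helm/llm-router.
from_helm_dir = resolve(HELM_DIR, "../../../../" + VALUES_FILE)

print(from_clone_root)
print(from_helm_dir)
```

Both lines print the same path under this layout. If the clone lives elsewhere, the `../` prefixes in the README would need adjusting.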

examples/deployments/LLM Router/llm-router-values-override.yaml

Lines changed: 8 additions & 2 deletions
@@ -1,13 +1,19 @@
+##
+# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+##
+
 # LLM Router Helm Values for NVIDIA Dynamo Cloud Platform Integration
 # Based on official sample: https://github.com/NVIDIA-AI-Blueprints/llm-router/blob/main/deploy/helm/llm-router/values.override.yaml.sample
 # Uses official External ConfigMap strategy for custom configuration

 # Global configuration (following official sample structure)
+# NOTE: Update imageRegistry and imagePullSecrets before deployment (see README Step 6)
 global:
 storageClass: "standard"
-imageRegistry: "nvcr.io/nvidian/sae/"
+imageRegistry: "YOUR_REGISTRY_HERE/" # REPLACE with your Docker registry
 imagePullSecrets:
-- name: nvcr-secret
+- name: nvcr-secret # UPDATE to match your registry credentials

 # Router Controller Configuration
 routerController:
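
Because the values file now ships with a placeholder registry, a deployment that skips the edit would try to pull images from `YOUR_REGISTRY_HERE/`. The following is a rough pre-flight check, not part of the commit. The helper name `find_placeholder_lines` and the indentation of the sample text are assumptions made for illustration.

```python
PLACEHOLDER = "YOUR_REGISTRY_HERE"


def find_placeholder_lines(values_text: str, placeholder: str = PLACEHOLDER) -> list[int]:
    """Return 1-based line numbers whose non-comment part still holds the placeholder."""
    hits = []
    for number, line in enumerate(values_text.splitlines(), start=1):
        # Naive comment stripping: assumes '#' never appears inside a quoted value.
        setting = line.split("#", 1)[0]
        if placeholder in setting:
            hits.append(number)
    return hits


# Sample mirrors the `global:` block from the diff; the indentation is assumed.
sample = '''global:
  storageClass: "standard"
  imageRegistry: "YOUR_REGISTRY_HERE/"  # REPLACE with your Docker registry
  imagePullSecrets:
    - name: nvcr-secret
'''

print(find_placeholder_lines(sample))
```

In practice the text would come from reading `llm-router-values-override.yaml`, and a non-empty result could abort the deployment before `helm upgrade --install` is run.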

examples/deployments/LLM Router/router-config-dynamo.yaml

Lines changed: 32 additions & 49 deletions
@@ -41,99 +41,82 @@ policies:
 - name: "task_router"
 url: http://llm-router-router-server.llm-router.svc.cluster.local:8000/v2/models/task_router_ensemble/infer
 llms:
-# === INTELLIGENT ROUTING STRATEGY ===
-# Route to appropriate models based on task complexity
-
-# Simple tasks → Fast 8B model
-- name: "Closed QA"
+- name: Brainstorming
 api_base: ${DYNAMO_API_BASE}
 api_key: ${DYNAMO_API_KEY}
-model: meta-llama/Llama-3.1-8B-Instruct
-- name: Classification
+model: meta-llama/Llama-3.1-70B-Instruct
+- name: Chatbot
 api_base: ${DYNAMO_API_BASE}
 api_key: ${DYNAMO_API_KEY}
-model: meta-llama/Llama-3.1-8B-Instruct
-- name: Extraction
+model: mistralai/Mixtral-8x22B-Instruct-v0.1
+- name: Classification
 api_base: ${DYNAMO_API_BASE}
 api_key: ${DYNAMO_API_KEY}
 model: meta-llama/Llama-3.1-8B-Instruct
-- name: Rewrite
+- name: Closed QA
 api_base: ${DYNAMO_API_BASE}
 api_key: ${DYNAMO_API_KEY}
-model: meta-llama/Llama-3.1-8B-Instruct
-- name: Summarization
+model: meta-llama/Llama-3.1-70B-Instruct
+- name: Code Generation
 api_base: ${DYNAMO_API_BASE}
 api_key: ${DYNAMO_API_KEY}
-model: meta-llama/Llama-3.1-8B-Instruct
-- name: Unknown
+model: meta-llama/Llama-3.1-70B-Instruct
+- name: Extraction
 api_base: ${DYNAMO_API_BASE}
 api_key: ${DYNAMO_API_KEY}
 model: meta-llama/Llama-3.1-8B-Instruct
-
-# Complex tasks → Powerful 70B model
-- name: Brainstorming
+- name: Open QA
 api_base: ${DYNAMO_API_BASE}
 api_key: ${DYNAMO_API_KEY}
 model: meta-llama/Llama-3.1-70B-Instruct
-- name: "Code Generation"
+- name: Other
 api_base: ${DYNAMO_API_BASE}
 api_key: ${DYNAMO_API_KEY}
-model: meta-llama/Llama-3.1-70B-Instruct
-- name: "Open QA"
+model: mistralai/Mixtral-8x22B-Instruct-v0.1
+- name: Rewrite
 api_base: ${DYNAMO_API_BASE}
 api_key: ${DYNAMO_API_KEY}
-model: meta-llama/Llama-3.1-70B-Instruct
-- name: Other
+model: meta-llama/Llama-3.1-8B-Instruct
+- name: Summarization
 api_base: ${DYNAMO_API_BASE}
 api_key: ${DYNAMO_API_KEY}
-model: mistralai/Mixtral-8x22B-Instruct-v0.1
-
-# Creative/Conversational tasks → Mixtral model
-- name: Chatbot
+model: meta-llama/Llama-3.1-70B-Instruct
+- name: Text Generation
 api_base: ${DYNAMO_API_BASE}
 api_key: ${DYNAMO_API_KEY}
 model: mistralai/Mixtral-8x22B-Instruct-v0.1
-- name: "Text Generation"
+- name: Unknown
 api_base: ${DYNAMO_API_BASE}
 api_key: ${DYNAMO_API_KEY}
-model: mistralai/Mixtral-8x22B-Instruct-v0.1
-
+model: meta-llama/Llama-3.1-8B-Instruct
 - name: "complexity_router"
 url: http://llm-router-router-server.llm-router.svc.cluster.local:8000/v2/models/complexity_router_ensemble/infer
 llms:
-# === INTELLIGENT COMPLEXITY ROUTING ===
-# Route to appropriate models based on complexity level
-
-# Simple complexity → Fast 8B model
-- name: "Contextual-Knowledge"
+- name: Creativity
 api_base: ${DYNAMO_API_BASE}
 api_key: ${DYNAMO_API_KEY}
-model: meta-llama/Llama-3.1-8B-Instruct
-- name: "No-Label-Reason"
+model: meta-llama/Llama-3.1-70B-Instruct
+- name: Reasoning
 api_base: ${DYNAMO_API_BASE}
 api_key: ${DYNAMO_API_KEY}
-model: meta-llama/Llama-3.1-8B-Instruct
-- name: Constraint
+model: meta-llama/Llama-3.1-70B-Instruct
+- name: Contextual-Knowledge
 api_base: ${DYNAMO_API_BASE}
 api_key: ${DYNAMO_API_KEY}
 model: meta-llama/Llama-3.1-8B-Instruct
-
-# High complexity → Powerful 70B model
-- name: Creativity
+- name: Few-Shot
 api_base: ${DYNAMO_API_BASE}
 api_key: ${DYNAMO_API_KEY}
 model: meta-llama/Llama-3.1-70B-Instruct
-- name: Reasoning
+- name: Domain-Knowledge
 api_base: ${DYNAMO_API_BASE}
 api_key: ${DYNAMO_API_KEY}
-model: meta-llama/Llama-3.1-70B-Instruct
-- name: "Few-Shot"
+model: mistralai/Mixtral-8x22B-Instruct-v0.1
+- name: No-Label-Reason
 api_base: ${DYNAMO_API_BASE}
 api_key: ${DYNAMO_API_KEY}
-model: meta-llama/Llama-3.1-70B-Instruct
-
-# Creative/Domain complexity → Mixtral model
-- name: "Domain-Knowledge"
+model: meta-llama/Llama-3.1-8B-Instruct
+- name: Constraint
 api_base: ${DYNAMO_API_BASE}
 api_key: ${DYNAMO_API_KEY}
-model: mistralai/Mixtral-8x22B-Instruct-v0.1
+model: meta-llama/Llama-3.1-8B-Instruct
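
With entries reordered and the grouping comments removed, the resulting routing is hard to read off the diff. The sketch below transcribes the post-commit `name` to `model` pairs from the added and context lines above and groups them by model. The dictionary and function names are illustrative only and do not exist in the repository.

```python
from collections import defaultdict

LLAMA_8B = "meta-llama/Llama-3.1-8B-Instruct"
LLAMA_70B = "meta-llama/Llama-3.1-70B-Instruct"
MIXTRAL = "mistralai/Mixtral-8x22B-Instruct-v0.1"

# Post-commit task_router entries, in file order.
TASK_ROUTER = {
    "Brainstorming": LLAMA_70B,
    "Chatbot": MIXTRAL,
    "Classification": LLAMA_8B,
    "Closed QA": LLAMA_70B,
    "Code Generation": LLAMA_70B,
    "Extraction": LLAMA_8B,
    "Open QA": LLAMA_70B,
    "Other": MIXTRAL,
    "Rewrite": LLAMA_8B,
    "Summarization": LLAMA_70B,
    "Text Generation": MIXTRAL,
    "Unknown": LLAMA_8B,
}

# Post-commit complexity_router entries, in file order.
COMPLEXITY_ROUTER = {
    "Creativity": LLAMA_70B,
    "Reasoning": LLAMA_70B,
    "Contextual-Knowledge": LLAMA_8B,
    "Few-Shot": LLAMA_70B,
    "Domain-Knowledge": MIXTRAL,
    "No-Label-Reason": LLAMA_8B,
    "Constraint": LLAMA_8B,
}


def group_by_model(policy: dict[str, str]) -> dict[str, list[str]]:
    """Invert a name -> model mapping into model -> sorted list of names."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name, model in policy.items():
        groups[model].append(name)
    return {model: sorted(names) for model, names in groups.items()}


for title, policy in (("task_router", TASK_ROUTER), ("complexity_router", COMPLEXITY_ROUTER)):
    print(title)
    for model, names in group_by_model(policy).items():
        print(f"  {model}: {', '.join(names)}")
```

Compared with the removed lines, the only `task_router` assignments that change are `Closed QA` and `Summarization`, which move from the 8B model to the 70B model. The `complexity_router` assignments are unchanged apart from ordering and quoting.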
