Merged
Changes from 3 commits
17 changes: 9 additions & 8 deletions README.md
@@ -68,12 +68,13 @@ pip install "ai-dynamo[all]"

### Building the Dynamo Base Image

Although not needed for local development, deploying your Dynamo pipelines to Kubernetes will require you to build and push a Dynamo base image to your container registry. You can use any container registry of your choice, such as:
Although not needed for local development, deploying your Dynamo pipelines to Kubernetes requires a Dynamo base image available in your container registry. You can use any container registry of your choice, such as:
- Docker Hub (docker.io)
- NVIDIA NGC Container Registry (nvcr.io)
- Any private registry

Here's how to build it:
We publish ready-to-use images on nvcr.io.
Alternatively, you can build and push an image from source:

```bash
./container/build.sh
@@ -83,8 +84,10 @@ docker push <your-registry>/dynamo-base:latest-vllm
```

Notes about builds for specific frameworks:
- For specific details on the `--framework vllm` build, see [here](examples/vllm/README.md).
- For specific details on the `--framework tensorrtllm` build, see [here](examples/tensorrt_llm/README.md).
- For specific details on the `--framework vllm` build, see the [vLLM backend README](components/backends/vllm/README.md).
- For specific details on the `--framework tensorrtllm` build, see the [TensorRT-LLM backend README](components/backends/trtllm/README.md).

Note about AWS environments:
- If deploying Dynamo in AWS, make sure to build the container with EFA support using the `--make-efa` flag.
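As a sketch, the full build-tag-push sequence might look like the following. The registry path here is a placeholder to adapt; the `docker tag`/`docker push` and `--make-efa` steps are commented out since they require Docker and build access:

```shell
# Hypothetical registry; replace with your own (e.g. docker.io/<user> or nvcr.io/<org>).
REGISTRY="myregistry.example.com/myteam"
FRAMEWORK="vllm"

# Build the base image for the chosen framework (add --make-efa on AWS with EFA).
# ./container/build.sh --framework ${FRAMEWORK} --make-efa

# Compose the fully qualified image name, then tag and push it.
IMAGE="${REGISTRY}/dynamo-base:latest-${FRAMEWORK}"
echo "${IMAGE}"
# docker tag dynamo-base:latest-${FRAMEWORK} "${IMAGE}"
# docker push "${IMAGE}"
```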
@@ -197,8 +200,6 @@ pip install .
cd ../../../
pip install ".[all]"

# To test
docker compose -f deploy/metrics/docker-compose.yml up -d
cd examples/llm
dynamo serve graphs.agg:Frontend -f configs/agg.yaml
Follow the [Quickstart Guide](docs/guides/dynamo_deploy/quickstart.md)

```
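Once `dynamo serve` is running, a quick smoke test is to send a request to the OpenAI-compatible frontend. This is a sketch: the port (8000) is an assumption to match to your config, and the model name is reused from the deployment manifests in this PR:

```shell
# Query the running frontend; adjust host/port to your configuration.
curl -s localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Qwen/Qwen3-0.6B",
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 32
      }'
```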
4 changes: 2 additions & 2 deletions components/backends/vllm/deploy/disagg.yaml
@@ -80,7 +80,7 @@ spec:
image: nvcr.io/nvidian/nim-llm-dev/vllm_v1-runtime:dep-216.4
workingDir: /workspace/components/backends/vllm
args:
- "python3 components/main.py --model Qwen/Qwen3-0.6B --enforce-eager 2>&1 | tee /tmp/vllm.log"
- "python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager 2>&1 | tee /tmp/vllm.log"
VllmPrefillWorker:
dynamoNamespace: vllm-v1-disagg
envFromSecret: hf-token-secret
@@ -119,4 +119,4 @@ spec:
image: nvcr.io/nvidian/nim-llm-dev/vllm_v1-runtime:dep-216.4
workingDir: /workspace/components/backends/vllm
args:
- "python3 components/main.py --model Qwen/Qwen3-0.6B --enforce-eager --is-prefill-worker 2>&1 | tee /tmp/vllm.log"
- "python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager --is-prefill-worker 2>&1 | tee /tmp/vllm.log"
2 changes: 1 addition & 1 deletion components/backends/vllm/deploy/disagg_planner.yaml
@@ -119,4 +119,4 @@ spec:
image: nvcr.io/nvidian/nim-llm-dev/vllm_v1-runtime:dep-216.4
workingDir: /workspace/components/backends/vllm
args:
- "python3 components/main.py --model Qwen/Qwen3-0.6B --enforce-eager --is-prefill-worker 2>&1 | tee /tmp/vllm.log"
- "python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager --is-prefill-worker 2>&1 | tee /tmp/vllm.log"
6 changes: 4 additions & 2 deletions deploy/helm/README.md
@@ -29,8 +29,10 @@ This approach allows you to install Dynamo directly using a DynamoGraphDeploymen

### Basic Installation

Here is how to install a VLLM inference backend example.

```bash
helm upgrade --install dynamo-graph ./deploy/helm/chart -n dynamo-cloud -f ./examples/vllm/deploy/agg.yaml
helm upgrade --install dynamo-graph ./deploy/helm/chart -n dynamo-cloud -f ./components/backends/vllm/deploy/agg.yaml
```

### Customizable Properties
@@ -39,7 +41,7 @@ You can override the default configuration by setting the following properties:

```bash
helm upgrade --install dynamo-graph ./deploy/helm/chart -n dynamo-cloud \
-f ./examples/vllm/deploy/agg.yaml \
-f ./components/backends/vllm/deploy/agg.yaml \
--set "imagePullSecrets[0].name=docker-secret-1" \
--set etcdAddr="my-etcd-service:2379" \
--set natsAddr="nats://my-nats-service:4222"
8 changes: 2 additions & 6 deletions deploy/inference-gateway/example/README.md
@@ -16,11 +16,7 @@ This guide provides instructions for setting up the Inference Gateway with Dynam
[See Quickstart Guide](../../../docs/guides/dynamo_deploy/quickstart.md) to install Dynamo Cloud.


2. **Launch Dynamo Deployments**

[See VLLM Example](../../../examples/vllm/README.md)

3. **Deploy Inference Gateway**
2. **Deploy Inference Gateway**

First, deploy an inference gateway service. In this example, we'll install `kgateway` based gateway implementation.

@@ -54,7 +50,7 @@ kubectl get gateway inference-gateway
# inference-gateway kgateway True 1m
```

4. **Apply Dynamo-specific manifests**
3. **Apply Dynamo-specific manifests**

The Inference Gateway is configured through the `inference-gateway-resources.yaml` file.

4 changes: 2 additions & 2 deletions docs/examples/README.md
@@ -2,7 +2,7 @@

## Serving examples locally

TODO: Follow individual examples to serve models locally.
Follow the individual examples under `components/backends/` to serve models locally.


## Deploying Examples to Kubernetes
@@ -38,7 +38,7 @@ export NAMESPACE=<your-namespace> # the namespace you used to deploy Dynamo clou
Deploying an example consists of the simple `kubectl apply -f ... -n ${NAMESPACE}` command. For example:

```bash
kubectl apply -f examples/vllm/deploy/agg.yaml -n ${NAMESPACE}
kubectl apply -f components/backends/vllm/deploy/agg.yaml -n ${NAMESPACE}
```

You can use `kubectl get dynamoGraphDeployment -n ${NAMESPACE}` to view your deployment.
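To reach a deployment once it is ready, one option is to port-forward its frontend service and send a test request. This is a sketch under assumptions: the service name is a placeholder, and the port may differ in your deployment; check `kubectl get svc -n ${NAMESPACE}` for the actual values.

```shell
# List services created by the deployment to find the frontend.
kubectl get svc -n ${NAMESPACE}

# Forward a local port to it (<frontend-service> is a placeholder name).
kubectl port-forward svc/<frontend-service> 8000:8000 -n ${NAMESPACE}
```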
2 changes: 1 addition & 1 deletion docs/get_started.md
@@ -167,7 +167,7 @@ docker compose -f deploy/docker-compose.yml up -d

### Start Dynamo LLM Serving Components

[Explore the VLLM Example](../examples/vllm/README.md)
[Explore the VLLM Example](../components/backends/vllm/README.md)


## Local Development
13 changes: 0 additions & 13 deletions docs/guides/dynamo_deploy/quickstart.md
@@ -187,19 +187,6 @@ We provide a script to uninstall CRDs should you need a clean start.

## Explore Examples

Pick your deployment destination.

If local

```bash
export DYNAMO_CLOUD=http://localhost:8080
```

If kubernetes
```bash
export DYNAMO_CLOUD=https://dynamo-cloud.nvidia.com
```

If deploying to Kubernetes, create a Kubernetes secret containing your sensitive values if needed:

```bash