Conversation


@hhzhang16 hhzhang16 commented Oct 9, 2025

Overview:

This PR adds a DynamoModel resource, following this enhancement proposal.

Details:

  • Adds the DynamoModel CRD
  • Adds a DynamoModel controller that downloads models to a cache
  • Updates the DGD controller to reference the downloaded model and wait for it to become "ready" before proceeding (see the status sketch below)
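
For context, here is a minimal sketch of the readiness signal the DGD controller could gate on, assuming a standard Kubernetes Conditions pattern on the DynamoModel status; the actual field names are defined by the enhancement proposal and may differ:

# Hypothetical DynamoModel status after a successful download (field names are assumptions)
status:
  conditions:
    - type: Ready                 # the DGD controller waits for this condition
      status: "True"
      reason: ModelDownloaded
      lastTransitionTime: "2025-10-09T17:30:00Z"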

Still TODO:

  • Make PVC creation optional; without a PVC, models download to the node-local filesystem (possible race conditions?)
  • Automatic model path argument injection for all backends (see the sketch after this list)
  • Testing
  • Update examples
  • Add a support matrix -- currently HuggingFace sources only, RWX PVCs only
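
For illustration, the intended argument injection for a vLLM worker could end up looking roughly like the sketch below; the cache mount path and the exact flags the controller injects are assumptions, not the final design (vLLM does accept a --model argument, but how the controller wires it up is still TODO):

# Hypothetical container spec fragment the controller might inject into VllmWorker
containers:
  - name: vllm-worker
    args:
      - "--model"
      - "/model-cache/meta-llama/Llama-3.3-70B-Instruct"  # assumed layout inside the model cache PVC
    volumeMounts:
      - name: model-cache
        mountPath: /model-cache
        readOnly: true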

Example CRDs:

# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

# Basic DynamoModel example with HuggingFace public model
apiVersion: nvidia.com/v1alpha1
kind: DynamoModel
metadata:
  name: llama-3-70b-instruct-v1
  namespace: dynamo-cloud
spec:
  # Canonical model name from HuggingFace
  name: meta-llama/Llama-3.3-70B-Instruct
  
  # Version pin (optional but recommended for production)
  # This ensures all deployments use the exact same model artifact
  version: main
  
  # Source URL - supports hf://, s3://, ngc://, or https://
  sourceURL: hf://meta-llama/Llama-3.3-70B-Instruct
  
  # Secret reference for authentication (required for private models)
  secretRef: hf-token-secret
  
  # PVC configuration for model storage
  pvc:
    create: true
    storageClass: standard  # Update with your storage class
    size: 200Gi  # Adjust based on model size
    volumeAccessMode: ReadWriteMany  # Required for multi-replica deployments

---
# DynamoGraphDeployment that uses the model
apiVersion: nvidia.com/v1alpha1
kind: DynamoGraphDeployment
metadata:
  name: vllm-with-model
  namespace: dynamo-cloud
spec:
  # Reference the DynamoModel at top-level (applies to all services)
  modelRef: llama-3-70b-instruct-v1
  backendFramework: vllm
  services:
    VllmWorker:
      replicas: 2
      resources:
        limits:
          nvidia.com/gpu: "2"
      # Model arguments will be auto-injected by the controller
    Frontend:
      replicas: 1
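
The secretRef: hf-token-secret above assumes a Secret in the same namespace holding a HuggingFace access token. A minimal sketch follows; the key name (HF_TOKEN here) is an assumption about what the controller reads:

apiVersion: v1
kind: Secret
metadata:
  name: hf-token-secret
  namespace: dynamo-cloud
type: Opaque
stringData:
  HF_TOKEN: <huggingface-access-token>  # key name is an assumption, not confirmed in this PR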

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

@hhzhang16 hhzhang16 requested a review from biswapanda October 9, 2025 17:30
@hhzhang16 hhzhang16 self-assigned this Oct 9, 2025
@github-actions github-actions bot added the feat label Oct 9, 2025