feat(omni): add Cosmos3 support to vLLM-Omni backend by ayushag-nv · Pull Request #10132 · ai-dynamo/dynamo

ayushag-nv · 2026-05-29T17:36:16Z

Summary

Adds Dynamo vLLM-Omni backend support for NVIDIA Cosmos3 (nvidia/Cosmos3-Nano / -Super) — text-to-image, text-to-video, and image-to-video — backed by the native Cosmos3 pipeline in vllm-project/vllm-omni#3454.

Changes

Worker integration (components/src/dynamo/vllm/omni, common/utils)

--cosmos3-guardrails / --no-cosmos3-guardrails flag, routed into AsyncOmni(model_config={"guardrails": False}) so the Cosmos3 safety-guardrail models can be skipped at startup.
normalize_image_frames() + the image output path: the native Cosmos3 pipeline returns numpy [batch, frames, H, W, C] arrays (not PIL), so the formatter normalizes them before PNG-encoding /v1/images/generations responses.
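
For reviewers, the flag-to-`model_config` routing described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `build_parser` and `build_omni_kwargs` are hypothetical names, and the only details taken from this PR are the flag pair, its default-enabled behavior, and the `model_config={"guardrails": False}` payload.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing only the guardrails toggle."""
    parser = argparse.ArgumentParser()
    # BooleanOptionalAction (Python 3.9+) generates both
    # --cosmos3-guardrails and --no-cosmos3-guardrails from one declaration.
    parser.add_argument(
        "--cosmos3-guardrails",
        action=argparse.BooleanOptionalAction,
        default=True,
        help="Load the Cosmos3 safety-guardrail models at startup.",
    )
    return parser


def build_omni_kwargs(cosmos3_guardrails: bool) -> dict:
    """Hypothetical helper: extra engine kwargs derived from the flag.

    Nothing is injected when guardrails stay enabled, so the engine
    keeps its own default behavior.
    """
    kwargs: dict = {}
    if not cosmos3_guardrails:
        kwargs["model_config"] = {"guardrails": False}
    return kwargs


if __name__ == "__main__":
    args = build_parser().parse_args(["--no-cosmos3-guardrails"])
    print(build_omni_kwargs(args.cosmos3_guardrails))
```

Injecting the override only when the flag is disabled keeps the default path identical to a worker that never heard of Cosmos3.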

Examples & docs

Launch scripts agg_omni_cosmos3_{image,video,i2v}.sh (one modality per worker).
Sample request payloads under examples/backends/vllm/launch/cosmos3/ (official Cosmos3 prompts mapped to the Dynamo request schema).
Guide docs/backends/vllm/cosmos3.md (install, serving, request formats, gotchas).

Tests for normalize_image_frames, the guardrails arg, and the model_config passthrough.

Dependency

Requires the Cosmos3 pipeline from vllm-omni#3454 (not yet in a released vLLM-Omni). This PR is the Dynamo-side integration only — container pinning of vLLM-Omni is intentionally not included; install vLLM-Omni from that PR per the guide.

Notes

One modality per worker (--output-modalities image|video) — request type derives from the worker's configured modality, not the HTTP endpoint.
Image size is the OpenAI enum; per-request num_inference_steps / guidance_scale / seed go under nvext; i2v input_reference must be an http(s) URL or a data: URI (local paths rejected).
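
The `input_reference` rule and the `nvext` placement in the notes above could be checked client-side with something like the sketch below. The helper names and the exact payload layout are assumptions made for illustration; only the `nvext` keys, the `input_reference` field name, and the http(s)/`data:` restriction come from this PR description.

```python
from urllib.parse import urlparse

# Schemes the notes above say are accepted for i2v input_reference.
ALLOWED_SCHEMES = {"http", "https", "data"}
# Per-request sampling knobs that go under "nvext".
NVEXT_KEYS = {"num_inference_steps", "guidance_scale", "seed"}


def is_allowed_input_reference(ref: str) -> bool:
    """True for http(s) URLs and data: URIs; local paths have no such scheme."""
    return urlparse(ref).scheme.lower() in ALLOWED_SCHEMES


def build_i2v_payload(model: str, prompt: str, input_reference: str, **sampling) -> dict:
    """Hypothetical request builder; the top-level layout is an assumption."""
    if not is_allowed_input_reference(input_reference):
        raise ValueError("input_reference must be an http(s) URL or a data: URI")
    unknown = set(sampling) - NVEXT_KEYS
    if unknown:
        raise ValueError(f"unsupported nvext keys: {sorted(unknown)}")
    return {
        "model": model,
        "prompt": prompt,
        "input_reference": input_reference,
        "nvext": dict(sampling),
    }


if __name__ == "__main__":
    print(build_i2v_payload(
        "nvidia/Cosmos3-Nano",
        "placeholder prompt",
        "https://example.com/frame.png",  # placeholder URL
        seed=0,
    ))
```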

Verified t2i / t2v / i2v end-to-end through the frontend.

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

New Features
- Added support for NVIDIA Cosmos3 omni model via vLLM backend with new --cosmos3-guardrails CLI flag to control optional safety features.
- Introduced image frame normalization for improved handling of diffusion pipeline outputs.
Improvements
- MP4 video encoding now uses h264_nvenc codec for better GPU efficiency.
- Updated FFmpeg configuration for enhanced media support.
Documentation
- New guide for running Cosmos3 text-to-image, text-to-video, and image-to-video generation.
- Added example launch scripts and request payloads for Cosmos3 workflows.
Tests
- Expanded test coverage for frame normalization and Cosmos3 guardrails configuration.

Signed-off-by: ayushag <ayushag@nvidia.com>

copy-pr-bot · 2026-05-29T17:36:20Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.


fferroni · 2026-05-29T17:41:41Z

+  -H 'Content-Type: application/json' \\
+  -d '{
+    "model": "${MODEL}",
+    "prompt": "A robot standing in a bright laboratory",


imo, for the examples, I think we should provide an appropriate JSON caption, not a dense one.
currently, we don't have any upsampling within the container, so our example captions should only be JSON strings.

If later on we add JSON upsampling within the container, we can have a normal "dense" prompt as an example and then an extra parameter like "upsample_prompt=True" or whatever.

…stall Signed-off-by: ayushag <ayushag@nvidia.com>

Signed-off-by: ayushag <ayushag@nvidia.com>

GuanLuo · 2026-05-29T20:47:36Z

@@ -0,0 +1,12 @@
+{


This is too long for a sample request. Where is this from? If it is a web download, can we just point to the URL instead of having this in the repo?

@GuanLuo This is from the Cosmos private repo, which only has a release branch. We can clean this up before merging. Till then more examples of payloads will be published publicly.

Cosmos3 pipelines are only in the unreleased vllm-omni PR vllm-project/vllm-omni#3454, not in any released wheel. Re-enable the git-install mechanism (reverted in 7744835) so the vllm-runtime container installs vllm-omni from the canonical repo pinned to the current PR head SHA (65b83d87, == refs/pull/3454/head). When vllm_omni_git_url is set, install_vllm_omni.sh installs "vllm-omni @ git+<url>@<ref>"; otherwise it falls back to the released "vllm-omni==<ref>" wheel. Signed-off-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com> Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
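
The git-versus-wheel selection this commit message describes amounts to the logic sketched below. It is written in Python purely for illustration (the real logic lives in the `install_vllm_omni.sh` shell script), and the function name, the repository URL, and the `0.0.0` version are placeholders assumed for the example.

```python
def vllm_omni_spec(ref: str, git_url: str = "") -> str:
    """Return the pip requirement string for vLLM-Omni.

    With a git URL, `ref` is treated as a commit/branch/tag and a direct
    reference is produced; otherwise `ref` is treated as a released version.
    """
    if git_url:
        return f"vllm-omni @ git+{git_url}@{ref}"
    return f"vllm-omni=={ref}"


if __name__ == "__main__":
    # Assumed URL for the canonical repo; the SHA is the pin named above.
    print(vllm_omni_spec("65b83d87", "https://github.com/vllm-project/vllm-omni.git"))
    # Placeholder version for the released-wheel fallback.
    print(vllm_omni_spec("0.0.0"))
```

Reusing one `ref` value for both paths means reverting to a released wheel later only requires clearing the URL and changing the pin.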

…it (#10091) Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> (cherry picked from commit dc2f352)

saturley-hall · 2026-05-31T08:33:26Z

/ok to test 22d56b9

github-actions · 2026-05-31T08:35:53Z

🌿 Fern Docs Preview: https://nvidia-preview-3f0e77a2-87cc-4014-8759-de5535f5c52e.docs.buildwithfern.com/dynamo/dev

coderabbitai · 2026-05-31T08:38:31Z

Walkthrough

This PR integrates NVIDIA Cosmos3 omni model support into Dynamo's vLLM-Omni backend, including codec migration from libx264 to h264_nvenc for MP4 encoding, Cosmos3 guardrails toggleable configuration, FFmpeg build infrastructure with NVENC/VP9 encoder support, updated container runtime images to deploy LGPL ffmpeg instead of bundled GPL binary wheels, vLLM-Omni git-based installation support, and comprehensive documentation with launch examples for text-to-image, text-to-video, and image-to-video generation.

Changes

Cosmos3 Model Integration

| Layer / File(s) | Summary |
| --- | --- |
| Video normalization and MP4 codec migration: `components/src/dynamo/common/utils/video_utils.py`, `components/src/dynamo/common/tests/test_video_utils.py`, `components/src/dynamo/sglang/request_handlers/video_generation/video_generation_handler.py`, `components/src/dynamo/vllm/omni/output_formatter.py` | Adds `normalize_image_frames()` helper to flatten diffusion outputs into ordered PIL frames, migrates MP4 encoding from libx264 to h264_nvenc codec across video utilities and handlers, and integrates normalization into image output formatter. Tests validate codec selection and normalization behavior for PIL passthrough, uint8/float numpy arrays, and multi-dimensional Cosmos3 inputs. |
| Cosmos3 guardrails configuration: `components/src/dynamo/vllm/omni/args.py`, `components/src/dynamo/vllm/omni/base_handler.py`, `components/src/dynamo/vllm/tests/omni/test_omni_args.py`, `components/src/dynamo/vllm/tests/omni/test_omni_base_handler.py` | Adds `--cosmos3-guardrails` CLI flag (default enabled) and `OmniConfig.cosmos3_guardrails` field; BaseOmniHandler conditionally injects `model_config={"guardrails": False}` into AsyncOmni kwargs when disabled; tests verify toggle behavior and configuration validation. |
| FFmpeg NVENC and VP9 build support: `container/templates/wheel_builder.Dockerfile`, `container/context.yaml`, `container/templates/args.Dockerfile` | Extends wheel_builder compilation to include NVENC and VP9 codecs: adds build args for nv_codec_headers and libvpx versions, installs source builds for both codec libraries in CUDA and non-CUDA environments, configures FFmpeg with h264_nvenc and libvpx_vp9 encoder support while maintaining LGPL-only licensing; updates build arg declarations and version pins in context and templates. |
| Runtime FFmpeg deployment and imageio source installation: `container/deps/requirements.common.txt`, `container/deps/requirements.sglang.txt`, `container/deps/requirements.trtllm.txt`, `container/deps/requirements.vllm.txt`, `container/templates/dynamo_runtime.Dockerfile`, `container/templates/sglang_runtime.Dockerfile`, `container/templates/trtllm_runtime.Dockerfile` | Updates all runtime images to copy LGPL ffmpeg binary and libraries from wheel_builder stage into /usr/local, runs ldconfig, and sets IMAGEIO_FFMPEG_EXE environment variable; enforces source installation of imageio-ffmpeg via --no-binary directive across all requirements files to avoid bundled GPL binary wheel, allowing imageio to use the in-tree LGPL CLI via environment variable. |
| vLLM-Omni git-based installation: `container/deps/vllm/install_vllm_omni.sh`, `container/context.yaml`, `container/templates/args.Dockerfile`, `container/templates/vllm_runtime.Dockerfile`, `container/templates/dynamo_base.Dockerfile` | Refactors install_vllm_omni.sh to support git checkout via VLLM_OMNI_GIT_URL: computes unified VLLM_OMNI_SPEC that selects git or PyPI installation based on URL presence; updates vllm_runtime and base Dockerfiles with new build args; context.yaml pins vLLM-Omni to specific commit SHA; removes SCCACHE_VERSION default and adds AWS_SDK_CPP_VERSION arg to enable caller-supplied build configuration. |
| Cosmos3 documentation and launch examples: `docs/backends/vllm/cosmos3.md`, `docs/backends/trtllm/trtllm-diffusion.md`, `examples/backends/vllm/launch/agg_omni_cosmos3_image.sh`, `examples/backends/vllm/launch/agg_omni_cosmos3_video.sh`, `examples/backends/vllm/launch/agg_omni_cosmos3_i2v.sh`, `examples/backends/vllm/launch/cosmos3/t2i.json`, `examples/backends/vllm/launch/cosmos3/t2v.json`, `examples/backends/vllm/launch/cosmos3/i2v.json` | Comprehensive Cosmos3 documentation covering checkpoint selection, supported modalities, setup from cosmos3-omni-integration branch with pinned vLLM-Omni commit, and request format specifications; three launch scripts demonstrate aggregated inference with guardrails disabled and media output to filesystem; JSON payload examples show prompt structures and generation parameters for text-to-image, text-to-video, and image-to-video workflows; TensorRT-LLM documentation clarified with container-aware ffmpeg/NVENC setup guidance. |

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 53.85% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'feat(omni): add Cosmos3 support to vLLM-Omni backend' clearly and concisely summarizes the main change: adding Cosmos3 model support to the vLLM-Omni backend for text-to-image, text-to-video, and image-to-video tasks. |
| Description check | ✅ Passed | The PR description includes Overview (Summary section), Details (Changes section covering worker integration, examples & docs, tests, and dependencies), and Related Issues; however, it lacks a dedicated 'Where should the reviewer start?' section explicitly calling out specific files. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


coderabbitai

Actionable comments posted: 7

🧹 Nitpick comments (2)

components/src/dynamo/common/utils/video_utils.py (1)
93-93: 💤 Low value

Hoist the PIL import to module top.

from PIL import Image is imported inside the function. The module already implicitly relies on PIL (frames_to_numpy operates on PIL Images), so moving this to the top-level imports is safe and aligns with the repo convention.
♻️ Proposed change
 import numpy as np
+from PIL import Image
-    from PIL import Image
-
     out: list = []
As per coding guidelines: "Keep all imports at module top (no imports inside functions/classes)."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/src/dynamo/common/utils/video_utils.py` at line 93, Move the local
import "from PIL import Image" out of the function and add it to the module's
top-level imports in components/src/dynamo/common/utils/video_utils.py;
specifically remove the in-function import (the one used by frames_to_numpy) and
place "from PIL import Image" alongside the other imports at file top so
frames_to_numpy and any other PIL-dependent functions reference the module-level
Image symbol.
container/templates/trtllm_runtime.Dockerfile (1)
154-161: 💤 Low value

Inconsistent error handling for libav/libsw copies compared to sglang.

In sglang_runtime.Dockerfile, the libav*.so* and libsw*.so* copies (line 43) will fail the build if missing, but here (lines 155-156) they silently succeed with || true. If FFmpeg encoding is required for TRT-LLM diffusion (as the comment states), these libraries are also required for the ffmpeg binary to function.

Consider removing 2>/dev/null || true from lines 155-156 to match the sglang behavior and fail fast if the wheel_builder stage is missing required libraries.
Suggested change for consistency
 RUN --mount=type=bind,from=wheel_builder,source=/usr/local/,target=/tmp/usr/local/ \
-    cp -nL /tmp/usr/local/lib/libav*.so* /usr/local/lib/ 2>/dev/null || true && \
-    cp -nL /tmp/usr/local/lib/libsw*.so* /usr/local/lib/ 2>/dev/null || true && \
+    cp -nL /tmp/usr/local/lib/libav*.so* /usr/local/lib/ && \
+    cp -nL /tmp/usr/local/lib/libsw*.so* /usr/local/lib/ && \
     cp -nL /tmp/usr/local/lib/lib*vpx*.so* /usr/local/lib/ 2>/dev/null || true && \
     cp -nL /tmp/usr/local/bin/ffmpeg /usr/local/bin/ffmpeg && \
     cp -r /tmp/usr/local/src/ffmpeg /usr/local/src/ && \
     ldconfig
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@container/templates/trtllm_runtime.Dockerfile` around lines 154 - 161, The
RUN step currently silences failures for critical libs by appending "2>/dev/null
|| true" to the cp of libav*.so* and libsw*.so*; remove the "2>/dev/null ||
true" (and optional "2>/dev/null") from the cp commands that match "cp -nL
/tmp/usr/local/lib/libav*.so*" and "cp -nL /tmp/usr/local/lib/libsw*.so*" so the
build fails fast if those libraries are missing (keeping the cp for lib*vpx*.so*
and the ffmpeg cp/ldconfig as-is), since ffmpeg (ENV
IMAGEIO_FFMPEG_EXE=/usr/local/bin/ffmpeg) requires these libs to function.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4d3be066-b46b-43d3-8b8b-18d9296de8e8

📥 Commits

Reviewing files that changed from the base of the PR and between 5b4bc1d and 22d56b9.

📒 Files selected for processing (29)

components/src/dynamo/common/tests/test_video_utils.py
components/src/dynamo/common/utils/video_utils.py
components/src/dynamo/sglang/request_handlers/video_generation/video_generation_handler.py
components/src/dynamo/vllm/omni/args.py
components/src/dynamo/vllm/omni/base_handler.py
components/src/dynamo/vllm/omni/output_formatter.py
components/src/dynamo/vllm/tests/omni/test_omni_args.py
components/src/dynamo/vllm/tests/omni/test_omni_base_handler.py
container/context.yaml
container/deps/requirements.common.txt
container/deps/requirements.sglang.txt
container/deps/requirements.trtllm.txt
container/deps/requirements.vllm.txt
container/deps/vllm/install_vllm_omni.sh
container/templates/args.Dockerfile
container/templates/dynamo_base.Dockerfile
container/templates/dynamo_runtime.Dockerfile
container/templates/sglang_runtime.Dockerfile
container/templates/trtllm_runtime.Dockerfile
container/templates/vllm_runtime.Dockerfile
container/templates/wheel_builder.Dockerfile
docs/backends/trtllm/trtllm-diffusion.md
docs/backends/vllm/cosmos3.md
examples/backends/vllm/launch/agg_omni_cosmos3_i2v.sh
examples/backends/vllm/launch/agg_omni_cosmos3_image.sh
examples/backends/vllm/launch/agg_omni_cosmos3_video.sh
examples/backends/vllm/launch/cosmos3/i2v.json
examples/backends/vllm/launch/cosmos3/t2i.json
examples/backends/vllm/launch/cosmos3/t2v.json

coderabbitai · 2026-05-31T08:38:34Z

+| `nvidia/Cosmos3-Nano` | Smaller, faster — default in the Dynamo launch scripts below | [link](https://huggingface.co/nvidia/Cosmos3-Nano) |
+| `nvidia/Cosmos3-Super` | Larger, higher quality | [link](https://huggingface.co/nvidia/Cosmos3-Super) |


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use descriptive link labels for checkpoint URLs.

[link] is too generic and already flagged by markdownlint. Use labels like Cosmos3-Nano / Cosmos3-Super.

As per coding guidelines, for **/*.md documentation quality should be maintained; replacing non-descriptive link text improves clarity and lint compliance.

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 31-31: Link text should be descriptive

(MD059, descriptive-link-text)

[warning] 32-32: Link text should be descriptive

(MD059, descriptive-link-text)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/backends/vllm/cosmos3.md` around lines 31 - 32, Replace the generic link text "[link]" with descriptive labels matching the checkpoint names so the table rows for `nvidia/Cosmos3-Nano` and `nvidia/Cosmos3-Super` use link text like "Cosmos3-Nano" and "Cosmos3-Super" respectively; update the markdown links in the table to read `[Cosmos3-Nano](https://huggingface.co/nvidia/Cosmos3-Nano)` and `[Cosmos3-Super](https://huggingface.co/nvidia/Cosmos3-Super)` so linting passes and the labels clearly identify the checkpoints.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix the checkpoint links that currently fail docs link-check CI.

The current Hugging Face checkpoint URLs are failing lychee with 401, which blocks docs checks. Please switch these to URLs that pass CI (or update the docs-link-check allowlist for these exact domains/statuses).

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/backends/vllm/cosmos3.md` around lines 31 - 32, The HF checkpoint links for the model names `nvidia/Cosmos3-Nano` and `nvidia/Cosmos3-Super` in the docs page are returning 401 in link-check CI; update the two Markdown link targets for those model entries so they point to publicly accessible URLs that pass docs link-check (for example swap the current https://huggingface.co/... checkpoint links for the public model hub pages or an official NVIDIA/public docs page), or alternatively add those exact HF URLs/statuses to the docs-link-check allowlist; change the two link targets referenced alongside the `nvidia/Cosmos3-Nano` and `nvidia/Cosmos3-Super` entries to the new URLs or add them to the allowlist so CI no longer fails.

coderabbitai · 2026-05-31T08:38:34Z

+    --output-modalities image \            # or: video
+    --no-cosmos3-guardrails \              # skip loading the safety guardrail models
+    --media-output-fs-url file:///tmp/dynamo_media


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

The multiline shell example is not copy/paste-safe.

The inline comments after line-continuation backslashes break the command. Move those comments to separate lines (or provide separate command variants) so the snippet executes as documented.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/backends/vllm/cosmos3.md` around lines 112 - 114, The multiline shell snippet in the docs uses backslash line continuations with inline comments after the backslashes which breaks copy/paste execution; update the example around the flags --output-modalities, --no-cosmos3-guardrails, and --media-output-fs-url so comments are on their own lines (or provide separate full-command variants) instead of trailing the backslashes, ensuring each continued line ends only with the backslash and the flag text so the shell command is copy/paste-safe.

coderabbitai · 2026-05-31T08:38:34Z

+python -m dynamo.frontend &
+FRONTEND_PID=$!
+
+sleep 2


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove fixed readiness sleep and use shared health-check orchestration.

sleep 2 is a fragile startup gate and violates the launch-script convention for readiness handling.

As per coding guidelines, launch scripts should “Avoid readiness sleeps/polls; rely on the shared framework health-check patterns instead.”

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@examples/backends/vllm/launch/agg_omni_cosmos3_i2v.sh` at line 50, Remove the fragile fixed wait ("sleep 2") from agg_omni_cosmos3_i2v.sh and replace it with the project’s shared health-check orchestration: remove the "sleep 2" line and invoke the centralized readiness check (use the launch framework's health-check helper or wait-for-ready wrapper used by other launch scripts) to block until the service reports healthy; ensure you call the same health-check entrypoint used elsewhere in the repo so the script follows the launch-script convention for readiness handling.

coderabbitai · 2026-05-31T08:38:34Z

+source "$SCRIPT_DIR/../../../common/launch_utils.sh"
+


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Align this launcher with shared vLLM GPU-memory utilities.

This script skips gpu_utils.sh and does not use build_vllm_gpu_mem_args, so users can’t control VRAM behavior consistently with the other Cosmos3 launchers.

As per coding guidelines, launchers should source gpu_utils.sh and “Use build_vllm_gpu_mem_args() to construct GPU memory CLI flags for vLLM.”

Also applies to: 50-57

🧰 Tools

🪛 Shellcheck (0.11.0)

[info] 15-15: Not following: ./../../../common/launch_utils.sh was not specified as input (see shellcheck -x).

(SC1091)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@examples/backends/vllm/launch/agg_omni_cosmos3_image.sh` around lines 15 - 16, The launcher missing gpu_utils integration should source gpu_utils.sh (via SCRIPT_DIR/../../../common/gpu_utils.sh) and use build_vllm_gpu_mem_args() when constructing the vLLM CLI invocation in agg_omni_cosmos3_image.sh; update the script to source gpu_utils.sh near the other shared utils and insert the output of build_vllm_gpu_mem_args into the vLLM/vllm-server command-line assembly so GPU memory flags are consistent with other Cosmos3 launchers.

coderabbitai · 2026-05-31T08:38:34Z

+python -m dynamo.frontend &
+FRONTEND_PID=$!
+
+sleep 2


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Replace fixed startup sleep with framework readiness handling.

Using sleep 2 introduces flaky startup behavior across machines.

As per coding guidelines, launch scripts should “Avoid readiness sleeps/polls; rely on the shared framework health-check patterns instead.”

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@examples/backends/vllm/launch/agg_omni_cosmos3_image.sh` at line 48, Replace the fixed "sleep 2" with a proper readiness check: remove the "sleep 2" line and instead call the shared framework health-check/ready helper (e.g., a common script or function like wait_for_framework_ready or check_framework_health) in a loop with a timeout and non-zero exit if not ready; ensure the script waits for the specific service(s) the launch depends on and logs progress/errors so startup is deterministic and not flaky.

coderabbitai · 2026-05-31T08:38:34Z

+python -m dynamo.frontend &
+FRONTEND_PID=$!
+
+sleep 2


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use shared readiness checks instead of a fixed sleep.

sleep 2 is not reliable for service readiness and can fail under slower startup conditions.

As per coding guidelines, launch scripts should “Avoid readiness sleeps/polls; rely on the shared framework health-check patterns instead.”

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@examples/backends/vllm/launch/agg_omni_cosmos3_video.sh` at line 49, Replace the fixed "sleep 2" with a call to the repository's shared readiness-check helper (instead of a blind sleep, invoke the common wait-for-ready/health-check script or function used elsewhere), passing the service endpoint/port or health URL for the component started in this script and fail the launch if the check returns non-zero; specifically remove the "sleep 2" line and invoke the shared readiness checker (e.g., wait_for_service or wait-for-ready) with the correct args so the script blocks until a successful health response and exits on timeout/error.

dynamo-ops · 2026-05-31T08:39:38Z

+            out.append(item)
+            continue
+        arr = np.asarray(item)
+        while arr.ndim > 4:  # [batch, frames, H, W, C] -> [frames, H, W, C]


normalize_image_frames collapses a [B, F, H, W, C] Cosmos3 array by taking arr[0], so image requests with n > 1 silently drop every generated batch after the first. Fix: preserve and flatten all leading batch/frame dimensions before converting frames to PIL images.

🤖 AI Fix

In components/src/dynamo/common/utils/video_utils.py, update normalize_image_frames to replace the while arr.ndim > 4: arr = arr[0] logic with validation that the last three dimensions are H, W, C and arr = arr.reshape((-1, *arr.shape[-3:])) so all [B, F, H, W, C] outputs are emitted.
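
A minimal sketch of the reshape-based flattening suggested above, assuming channel-last arrays. It returns a numpy stack rather than PIL images to stay dependency-light, and `flatten_frames` is a hypothetical name, not the function in `video_utils.py`.

```python
import numpy as np


def flatten_frames(arr) -> np.ndarray:
    """Collapse all leading batch/frame axes into one: [..., H, W, C] -> [N, H, W, C]."""
    arr = np.asarray(arr)
    if arr.ndim < 3:
        raise ValueError(f"expected at least [H, W, C], got shape {arr.shape}")
    if arr.shape[-1] not in (1, 3, 4):
        # Assumption: grayscale, RGB, or RGBA channel-last frames.
        raise ValueError(f"last axis should be channels, got shape {arr.shape}")
    # C-order reshape keeps batch 0's frames first, then batch 1's, and so on.
    return arr.reshape((-1, *arr.shape[-3:]))


if __name__ == "__main__":
    batch = np.zeros((2, 3, 8, 8, 3), dtype=np.uint8)  # [B, F, H, W, C]
    print(flatten_frames(batch).shape)  # all 6 frames kept
    print(batch[0].shape)               # the arr[0] approach keeps only 3
```

Unlike repeatedly indexing `arr[0]`, the reshape never discards data, so a request with `n > 1` yields every generated frame in batch order.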

dynamo-ops · 2026-05-31T08:39:38Z

+RUN --mount=type=bind,source=./container/deps/requirements.vllm.txt,target=/tmp/requirements.vllm.txt \
+    --mount=type=cache,target=/root/.cache/uv,sharing=locked \
+    export UV_CACHE_DIR=/root/.cache/uv && \
+    uv pip install {{ pip_target }} --reinstall-package imageio-ffmpeg --no-deps \


Reinstalling imageio-ffmpeg from source removes the bundled ffmpeg from the vLLM image, but vLLM-Omni video formatting still calls diffusers.export_to_video through imageio's ffmpeg writer and will fail without a configured ffmpeg/h264 encoder. Fix: copy the in-tree ffmpeg CLI/libs into the vLLM image, set IMAGEIO_FFMPEG_EXE, and ensure the vLLM video formatter uses h264_nvenc instead of imageio's default libx264.

🤖 AI Fix

In container/templates/vllm_runtime.Dockerfile, copy /usr/local/bin/ffmpeg plus libav*.so*, libsw*.so*, and lib*vpx*.so* from wheel_builder, run ldconfig, and set ENV IMAGEIO_FFMPEG_EXE=/usr/local/bin/ffmpeg; in components/src/dynamo/vllm/omni/output_formatter.py DiffusionFormatter._encode_video, replace diffusers.export_to_video with the shared encode_to_video_bytes(..., output_format="mp4") path so the codec is h264_nvenc.
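A startup preflight along these lines could surface a missing ffmpeg binary before the first video request fails. It is a stdlib-only sketch: `resolve_ffmpeg_exe` is a hypothetical helper, not part of Dynamo or imageio, and it only mirrors the idea of preferring `IMAGEIO_FFMPEG_EXE` over a `PATH` lookup. It does not check which encoders the binary supports.

```python
import os
import shutil


def resolve_ffmpeg_exe(env=None):
    """Return the ffmpeg path to use, preferring IMAGEIO_FFMPEG_EXE.

    Hypothetical preflight helper; raises FileNotFoundError if nothing usable
    is found.
    """
    env = os.environ if env is None else env
    configured = env.get("IMAGEIO_FFMPEG_EXE")
    if configured:
        # An explicit setting must point at an executable file.
        if os.path.isfile(configured) and os.access(configured, os.X_OK):
            return configured
        raise FileNotFoundError(
            f"IMAGEIO_FFMPEG_EXE={configured!r} is not an executable file"
        )
    # Fall back to a normal PATH search.
    found = shutil.which("ffmpeg", path=env.get("PATH"))
    if found is None:
        raise FileNotFoundError("ffmpeg not found on PATH")
    return found


if __name__ == "__main__":
    try:
        print("using ffmpeg at", resolve_ffmpeg_exe())
    except FileNotFoundError as exc:
        print("ffmpeg preflight failed:", exc)
```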

The vllm-runtime build failed at install_vllm_omni.sh with "Git executable not found" because uv needs git to fetch the vllm-omni PR pin (git+https://...@65b83d87), but the upstream vllm/vllm-openai runtime image does not ship git. The released-wheel install never needed it. Add git to the existing omni apt step, gated on VLLM_OMNI_GIT_URL via ${VLLM_OMNI_GIT_URL:+git} so the PyPI-wheel path (and the eventual revert) keeps the runtime image lean.

Signed-off-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

saturley-hall · 2026-05-31T09:16:38Z

/ok to test 2c48064

The diffusion image tests fed bare MagicMock() objects as images. Since ebe6779 routed _prepare_images through normalize_image_frames(), a non-PIL input takes the np.asarray(item).max() path; MagicMock.__iter__ defaults to empty, so np.asarray(MagicMock()) is a zero-size array and arr.max() raises "zero-size array to reduction operation maximum". These 8 tests only ran in CI once the runtime image built, exposing the failure. Swap the MagicMock image doubles for real PIL images via a _make_pil_image() helper, so they hit the isinstance(item, Image.Image) pass-through and img.save(buf, format="PNG") produces real PNG bytes. Assertions unchanged.

Signed-off-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
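A real-image test double of the kind described here might look like the sketch below. Only the name `_make_pil_image` comes from the commit message; its signature, default size, and color are assumptions, and the actual helper in the test suite may differ.

```python
import io

from PIL import Image


def _make_pil_image(size=(8, 8), color=(255, 0, 0)):
    """Build a small solid-color RGB image to use as a test double.

    Assumed signature; the helper in the real tests may take other arguments.
    """
    return Image.new("RGB", size, color)


img = _make_pil_image()

# A real PIL image satisfies the isinstance pass-through check...
is_pil = isinstance(img, Image.Image)

# ...and encodes to genuine PNG bytes, unlike a bare MagicMock.
buf = io.BytesIO()
img.save(buf, format="PNG")
png_bytes = buf.getvalue()

print(is_pil)
print(png_bytes[:8])  # PNG files start with a fixed 8-byte signature
```

Because the double is a genuine image, assertions on the encoded response can stay unchanged while the code under test exercises the same path as production.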

saturley-hall · 2026-05-31T14:23:41Z

/ok to test 271214e

ayushag-nv added 3 commits May 29, 2026 10:35

feat(omni): add Cosmos3 image generation support

ebe6779

Signed-off-by: ayushag <ayushag@nvidia.com>

feat(examples): add Cosmos3 omni image/video launch scripts

b9b9ca3

Signed-off-by: ayushag <ayushag@nvidia.com>

feat(examples): add Cosmos3 omni image-to-video launch script

22812d0

Signed-off-by: ayushag <ayushag@nvidia.com>

pull-request-size Bot added the size/L label May 29, 2026

github-actions Bot added the feat, backend::vllm (Relates to the vllm backend), multimodal, and container labels May 29, 2026

fferroni reviewed May 29, 2026

ayushag-nv changed the title ~~feat(omni): add Cosmos3-Nano support to vLLM-Omni backend~~ feat(omni): add Cosmos3 support to vLLM-Omni backend May 29, 2026

chore(cosmos3): add docs and sample payloads; revert container git install

7744835

Signed-off-by: ayushag <ayushag@nvidia.com>

github-actions Bot added the documentation (Improvements or additions to documentation) label May 29, 2026

test(omni): add Cosmos3 tests and refine guide

0034bee

Signed-off-by: ayushag <ayushag@nvidia.com>

pull-request-size Bot added size/XL and removed size/L labels May 29, 2026

GuanLuo reviewed May 29, 2026

saturley-hall and others added 2 commits May 31, 2026 04:09

chore(container): build in-tree ffmpeg CLI and route imageio through it (#10091)

22d56b9

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
(cherry picked from commit dc2f352)

github-actions Bot added the backend::sglang (Relates to the sglang backend) and backend::trtllm (Relates to the trtllm backend) labels May 31, 2026

saturley-hall marked this pull request as ready for review May 31, 2026 08:27

saturley-hall requested review from a team as code owners May 31, 2026 08:27

saturley-hall added the blocked (Waiting on external dependency) label May 31, 2026

coderabbitai Bot reviewed May 31, 2026

dynamo-ops reviewed May 31, 2026

copy-pr-bot Bot temporarily deployed to GITLAB May 31, 2026 09:16 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB May 31, 2026 09:17 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB May 31, 2026 14:23 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB May 31, 2026 14:27 Inactive

| `nvidia/Cosmos3-Nano` | Smaller, faster; default in the Dynamo launch scripts below | [link](https://huggingface.co/nvidia/Cosmos3-Nano) |
| `nvidia/Cosmos3-Super` | Larger, higher quality | [link](https://huggingface.co/nvidia/Cosmos3-Super) |
