feat(omni): add Cosmos3 support to vLLM-Omni backend #10132
Signed-off-by: ayushag <ayushag@nvidia.com>
-H 'Content-Type: application/json' \
-d '{
  "model": "${MODEL}",
  "prompt": "A robot standing in a bright laboratory",
IMO, the examples should use an appropriate JSON caption, not a dense one.
Currently we don't have any upsampling within the container, so our example captions should only be JSON strings.
If we later add JSON upsampling within the container, we can use a normal "dense" prompt as an example, plus an extra parameter such as "upsample_prompt=True" or similar.
…stall Signed-off-by: ayushag <ayushag@nvidia.com>
@@ -0,0 +1,12 @@
{
This is too long for a sample request. Where is it from? If it was downloaded from the web, can we point to the URL instead of keeping it in the repo?
@GuanLuo This is from the private Cosmos repo, which only has a release branch. We can clean this up before merging. In the meantime, more example payloads will be published publicly.
Cosmos3 pipelines are only in the unreleased vllm-omni PR vllm-project/vllm-omni#3454, not in any released wheel. Re-enable the git-install mechanism (reverted in 7744835) so the vllm-runtime container installs vllm-omni from the canonical repo pinned to the current PR head SHA (65b83d87, == refs/pull/3454/head).
When vllm_omni_git_url is set, install_vllm_omni.sh installs "vllm-omni @ git+<url>@<ref>"; otherwise it falls back to the released "vllm-omni==<ref>" wheel.
Signed-off-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
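The selection rule described in this commit message can be sketched in Python to show which requirement string a given configuration yields. This is an illustration only: the real logic lives in the shell script install_vllm_omni.sh, the function name `vllm_omni_requirement` and its parameters are hypothetical, and the URL and version number below are placeholders.

```python
from typing import Optional


def vllm_omni_requirement(ref: str, git_url: Optional[str] = None) -> str:
    """Return a pip/uv requirement string for vllm-omni.

    Hypothetical helper mirroring the rule in the commit message: a git URL
    selects a direct git reference pinned to `ref`; without one, `ref` is
    treated as a released version number.
    """
    if git_url:
        # Direct reference pinned to a commit SHA, tag, or branch.
        return f"vllm-omni @ git+{git_url}@{ref}"
    # No git URL configured: fall back to the released wheel.
    return f"vllm-omni=={ref}"


if __name__ == "__main__":
    # Placeholder URL; the real repository URL is set in the container config.
    print(vllm_omni_requirement("65b83d87", "https://example.com/vllm-omni.git"))
    # Placeholder version number.
    print(vllm_omni_requirement("1.2.3"))
```

Treating an empty string the same as an unset URL keeps the fallback path reachable when the build argument is defined but blank.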
/ok to test 22d56b9
Walkthrough
This PR integrates NVIDIA Cosmos3 omni model support into Dynamo's vLLM-Omni backend, including codec migration from libx264 to h264_nvenc for MP4 encoding, Cosmos3 guardrails toggleable configuration, FFmpeg build infrastructure with NVENC/VP9 encoder support, updated container runtime images to deploy LGPL ffmpeg instead of bundled GPL binary wheels, vLLM-Omni git-based installation support, and comprehensive documentation with launch examples for text-to-image, text-to-video, and image-to-video generation.
Changes
Cosmos3 Model Integration
Actionable comments posted: 7
🧹 Nitpick comments (2)
components/src/dynamo/common/utils/video_utils.py (1)
93-93: 💤 Low value
Hoist the PIL import to module top.
`from PIL import Image` is imported inside the function. The module already implicitly relies on PIL (`frames_to_numpy` operates on PIL Images), so moving this to the top-level imports is safe and aligns with the repo convention.
♻️ Proposed change
 import numpy as np
+from PIL import Image
-    from PIL import Image
-
     out: list = []
As per coding guidelines: "Keep all imports at module top (no imports inside functions/classes)."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@components/src/dynamo/common/utils/video_utils.py` at line 93, Move the local import "from PIL import Image" out of the function and add it to the module's top-level imports in components/src/dynamo/common/utils/video_utils.py; specifically remove the in-function import (the one used by frames_to_numpy) and place "from PIL import Image" alongside the other imports at file top so frames_to_numpy and any other PIL-dependent functions reference the module-level Image symbol.
container/templates/trtllm_runtime.Dockerfile (1)
154-161: 💤 Low value
Inconsistent error handling for libav/libsw copies compared to sglang.
In `sglang_runtime.Dockerfile`, the `libav*.so*` and `libsw*.so*` copies (line 43) will fail the build if missing, but here (lines 155-156) they silently succeed with `|| true`. If FFmpeg encoding is required for TRT-LLM diffusion (as the comment states), these libraries are also required for the `ffmpeg` binary to function.
Consider removing `2>/dev/null || true` from lines 155-156 to match the sglang behavior and fail fast if the wheel_builder stage is missing required libraries.
Suggested change for consistency
 RUN --mount=type=bind,from=wheel_builder,source=/usr/local/,target=/tmp/usr/local/ \
-    cp -nL /tmp/usr/local/lib/libav*.so* /usr/local/lib/ 2>/dev/null || true && \
-    cp -nL /tmp/usr/local/lib/libsw*.so* /usr/local/lib/ 2>/dev/null || true && \
+    cp -nL /tmp/usr/local/lib/libav*.so* /usr/local/lib/ && \
+    cp -nL /tmp/usr/local/lib/libsw*.so* /usr/local/lib/ && \
     cp -nL /tmp/usr/local/lib/lib*vpx*.so* /usr/local/lib/ 2>/dev/null || true && \
     cp -nL /tmp/usr/local/bin/ffmpeg /usr/local/bin/ffmpeg && \
     cp -r /tmp/usr/local/src/ffmpeg /usr/local/src/ && \
     ldconfig
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@container/templates/trtllm_runtime.Dockerfile` around lines 154 - 161, The RUN step currently silences failures for critical libs by appending "2>/dev/null || true" to the cp of libav*.so* and libsw*.so*; remove the "2>/dev/null || true" (and optional "2>/dev/null") from the cp commands that match "cp -nL /tmp/usr/local/lib/libav*.so*" and "cp -nL /tmp/usr/local/lib/libsw*.so*" so the build fails fast if those libraries are missing (keeping the cp for lib*vpx*.so* and the ffmpeg cp/ldconfig as-is), since ffmpeg (ENV IMAGEIO_FFMPEG_EXE=/usr/local/bin/ffmpeg) requires these libs to function.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/backends/vllm/cosmos3.md`:
- Around line 112-114: The multiline shell snippet in the docs uses backslash
line continuations with inline comments after the backslashes which breaks
copy/paste execution; update the example around the flags --output-modalities,
--no-cosmos3-guardrails, and --media-output-fs-url so comments are on their own
lines (or provide separate full-command variants) instead of trailing the
backslashes, ensuring each continued line ends only with the backslash and the
flag text so the shell command is copy/paste-safe.
- Around line 31-32: Replace the generic link text "[link]" with descriptive
labels matching the checkpoint names so the table rows for `nvidia/Cosmos3-Nano`
and `nvidia/Cosmos3-Super` use link text like "Cosmos3-Nano" and "Cosmos3-Super"
respectively; update the markdown links in the table to read
`[Cosmos3-Nano](https://huggingface.co/nvidia/Cosmos3-Nano)` and
`[Cosmos3-Super](https://huggingface.co/nvidia/Cosmos3-Super)` so linting passes
and the labels clearly identify the checkpoints.
- Around line 31-32: The HF checkpoint links for the model names
`nvidia/Cosmos3-Nano` and `nvidia/Cosmos3-Super` in the docs page are returning
401 in link-check CI; update the two Markdown link targets for those model
entries so they point to publicly accessible URLs that pass docs link-check (for
example swap the current https://huggingface.co/... checkpoint links for the
public model hub pages or an official NVIDIA/public docs page), or alternatively
add those exact HF URLs/statuses to the docs-link-check allowlist; change the
two link targets referenced alongside the `nvidia/Cosmos3-Nano` and
`nvidia/Cosmos3-Super` entries to the new URLs or add them to the allowlist so
CI no longer fails.
In `@examples/backends/vllm/launch/agg_omni_cosmos3_i2v.sh`:
- Line 50: Remove the fragile fixed wait ("sleep 2") from
agg_omni_cosmos3_i2v.sh and replace it with the project’s shared health-check
orchestration: remove the "sleep 2" line and invoke the centralized readiness
check (use the launch framework's health-check helper or wait-for-ready wrapper
used by other launch scripts) to block until the service reports healthy; ensure
you call the same health-check entrypoint used elsewhere in the repo so the
script follows the launch-script convention for readiness handling.
In `@examples/backends/vllm/launch/agg_omni_cosmos3_image.sh`:
- Around line 15-16: The launcher missing gpu_utils integration should source
gpu_utils.sh (via SCRIPT_DIR/../../../common/gpu_utils.sh) and use
build_vllm_gpu_mem_args() when constructing the vLLM CLI invocation in
agg_omni_cosmos3_image.sh; update the script to source gpu_utils.sh near the
other shared utils and insert the output of build_vllm_gpu_mem_args into the
vLLM/vllm-server command-line assembly so GPU memory flags are consistent with
other Cosmos3 launchers.
- Line 48: Replace the fixed "sleep 2" with a proper readiness check: remove the
"sleep 2" line and instead call the shared framework health-check/ready helper
(e.g., a common script or function like wait_for_framework_ready or
check_framework_health) in a loop with a timeout and non-zero exit if not ready;
ensure the script waits for the specific service(s) the launch depends on and
logs progress/errors so startup is deterministic and not flaky.
In `@examples/backends/vllm/launch/agg_omni_cosmos3_video.sh`:
- Line 49: Replace the fixed "sleep 2" with a call to the repository's shared
readiness-check helper (instead of a blind sleep, invoke the common
wait-for-ready/health-check script or function used elsewhere), passing the
service endpoint/port or health URL for the component started in this script and
fail the launch if the check returns non-zero; specifically remove the "sleep 2"
line and invoke the shared readiness checker (e.g., wait_for_service or
wait-for-ready) with the correct args so the script blocks until a successful
health response and exits on timeout/error.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 4d3be066-b46b-43d3-8b8b-18d9296de8e8
📒 Files selected for processing (29)
components/src/dynamo/common/tests/test_video_utils.py
components/src/dynamo/common/utils/video_utils.py
components/src/dynamo/sglang/request_handlers/video_generation/video_generation_handler.py
components/src/dynamo/vllm/omni/args.py
components/src/dynamo/vllm/omni/base_handler.py
components/src/dynamo/vllm/omni/output_formatter.py
components/src/dynamo/vllm/tests/omni/test_omni_args.py
components/src/dynamo/vllm/tests/omni/test_omni_base_handler.py
container/context.yaml
container/deps/requirements.common.txt
container/deps/requirements.sglang.txt
container/deps/requirements.trtllm.txt
container/deps/requirements.vllm.txt
container/deps/vllm/install_vllm_omni.sh
container/templates/args.Dockerfile
container/templates/dynamo_base.Dockerfile
container/templates/dynamo_runtime.Dockerfile
container/templates/sglang_runtime.Dockerfile
container/templates/trtllm_runtime.Dockerfile
container/templates/vllm_runtime.Dockerfile
container/templates/wheel_builder.Dockerfile
docs/backends/trtllm/trtllm-diffusion.md
docs/backends/vllm/cosmos3.md
examples/backends/vllm/launch/agg_omni_cosmos3_i2v.sh
examples/backends/vllm/launch/agg_omni_cosmos3_image.sh
examples/backends/vllm/launch/agg_omni_cosmos3_video.sh
examples/backends/vllm/launch/cosmos3/i2v.json
examples/backends/vllm/launch/cosmos3/t2i.json
examples/backends/vllm/launch/cosmos3/t2v.json
| | `nvidia/Cosmos3-Nano` | Smaller, faster — default in the Dynamo launch scripts below | [link](https://huggingface.co/nvidia/Cosmos3-Nano) | | ||
| | `nvidia/Cosmos3-Super` | Larger, higher quality | [link](https://huggingface.co/nvidia/Cosmos3-Super) | |
Use descriptive link labels for checkpoint URLs.
[link] is too generic and already flagged by markdownlint. Use labels like Cosmos3-Nano / Cosmos3-Super.
As per coding guidelines, for **/*.md documentation quality should be maintained; replacing non-descriptive link text improves clarity and lint compliance.
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 31-31: Link text should be descriptive
(MD059, descriptive-link-text)
[warning] 32-32: Link text should be descriptive
(MD059, descriptive-link-text)
Fix the checkpoint links that currently fail docs link-check CI.
The current Hugging Face checkpoint URLs are failing lychee with 401, which blocks docs checks. Please switch these to URLs that pass CI (or update the docs-link-check allowlist for these exact domains/statuses).
--output-modalities image \ # or: video
--no-cosmos3-guardrails \ # skip loading the safety guardrail models
--media-output-fs-url file:///tmp/dynamo_media
The multiline shell example is not copy/paste-safe.
The inline comments after line-continuation backslashes break the command. Move those comments to separate lines (or provide separate command variants) so the snippet executes as documented.
python -m dynamo.frontend &
FRONTEND_PID=$!
sleep 2
Remove fixed readiness sleep and use shared health-check orchestration.
sleep 2 is a fragile startup gate and violates the launch-script convention for readiness handling.
As per coding guidelines, launch scripts should “Avoid readiness sleeps/polls; rely on the shared framework health-check patterns instead.”
source "$SCRIPT_DIR/../../../common/launch_utils.sh"
Align this launcher with shared vLLM GPU-memory utilities.
This script skips gpu_utils.sh and does not use build_vllm_gpu_mem_args, so users can’t control VRAM behavior consistently with the other Cosmos3 launchers.
As per coding guidelines, launchers should source gpu_utils.sh and “Use build_vllm_gpu_mem_args() to construct GPU memory CLI flags for vLLM.”
Also applies to: 50-57
🧰 Tools
🪛 Shellcheck (0.11.0)
[info] 15-15: Not following: ./../../../common/launch_utils.sh was not specified as input (see shellcheck -x).
(SC1091)
python -m dynamo.frontend &
FRONTEND_PID=$!
sleep 2
Replace fixed startup sleep with framework readiness handling.
Using sleep 2 introduces flaky startup behavior across machines.
As per coding guidelines, launch scripts should “Avoid readiness sleeps/polls; rely on the shared framework health-check patterns instead.”
python -m dynamo.frontend &
FRONTEND_PID=$!
sleep 2
Use shared readiness checks instead of a fixed sleep.
sleep 2 is not reliable for service readiness and can fail under slower startup conditions.
As per coding guidelines, launch scripts should “Avoid readiness sleeps/polls; rely on the shared framework health-check patterns instead.”
        out.append(item)
        continue
    arr = np.asarray(item)
    while arr.ndim > 4:  # [batch, frames, H, W, C] -> [frames, H, W, C]
normalize_image_frames collapses a [B, F, H, W, C] Cosmos3 array by taking arr[0], so image requests with n > 1 silently drop every generated batch after the first. Fix: preserve and flatten all leading batch/frame dimensions before converting frames to PIL images.
🤖 AI Fix
In components/src/dynamo/common/utils/video_utils.py, update normalize_image_frames to replace the while arr.ndim > 4: arr = arr[0] logic with validation that the last three dimensions are H, W, C and arr = arr.reshape((-1, *arr.shape[-3:])) so all [B, F, H, W, C] outputs are emitted.
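The difference between indexing with `arr[0]` and reshaping can be seen in a small standalone sketch. It is not the actual `normalize_image_frames` code: the helper name `flatten_leading_dims` is hypothetical, and the channel check (1, 3, or 4 channels in the last axis) is an assumption about what "validation that the last three dimensions are H, W, C" might look like.

```python
import numpy as np


def flatten_leading_dims(arr: np.ndarray) -> np.ndarray:
    """Collapse all leading dimensions into one frame axis.

    [..., H, W, C] -> [N, H, W, C], where N is the product of the leading
    dimensions (1 when the input is a single [H, W, C] image).
    """
    if arr.ndim < 3:
        raise ValueError(f"expected at least [H, W, C], got shape {arr.shape}")
    # Assumed sanity check: grayscale, RGB, or RGBA in the last axis.
    if arr.shape[-1] not in (1, 3, 4):
        raise ValueError(f"expected channels-last data, got shape {arr.shape}")
    return arr.reshape((-1, *arr.shape[-3:]))


if __name__ == "__main__":
    batch = np.zeros((2, 3, 8, 8, 3), dtype=np.uint8)  # [B, F, H, W, C]
    # Indexing keeps only the frames of the first batch entry.
    print("arr[0]:", batch[0].shape)
    # Reshaping keeps every frame of every batch entry.
    print("reshape:", flatten_leading_dims(batch).shape)
```

Because `reshape` preserves row-major order, frames come out grouped by batch entry, so a caller that needs per-request grouping can still split the result every F frames.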
RUN --mount=type=bind,source=./container/deps/requirements.vllm.txt,target=/tmp/requirements.vllm.txt \
    --mount=type=cache,target=/root/.cache/uv,sharing=locked \
    export UV_CACHE_DIR=/root/.cache/uv && \
    uv pip install {{ pip_target }} --reinstall-package imageio-ffmpeg --no-deps \
Reinstalling imageio-ffmpeg from source removes the bundled ffmpeg from the vLLM image, but vLLM-Omni video formatting still calls diffusers.export_to_video through imageio's ffmpeg writer and will fail without a configured ffmpeg/h264 encoder. Fix: copy the in-tree ffmpeg CLI/libs into the vLLM image, set IMAGEIO_FFMPEG_EXE, and ensure the vLLM video formatter uses h264_nvenc instead of imageio's default libx264.
🤖 AI Fix
In container/templates/vllm_runtime.Dockerfile, copy /usr/local/bin/ffmpeg plus libav*.so*, libsw*.so*, and lib*vpx*.so* from wheel_builder, run ldconfig, and set ENV IMAGEIO_FFMPEG_EXE=/usr/local/bin/ffmpeg; in components/src/dynamo/vllm/omni/output_formatter.py DiffusionFormatter._encode_video, replace diffusers.export_to_video with the shared encode_to_video_bytes(..., output_format="mp4") path so the codec is h264_nvenc.
The vllm-runtime build failed at install_vllm_omni.sh with "Git executable
not found" because uv needs git to fetch the vllm-omni PR pin
(git+https://...@65b83d87), but the upstream vllm/vllm-openai runtime image
does not ship git. The released-wheel install never needed it.
Add git to the existing omni apt step, gated on VLLM_OMNI_GIT_URL via
${VLLM_OMNI_GIT_URL:+git} so the PyPI-wheel path (and the eventual revert)
keeps the runtime image lean.
Signed-off-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
/ok to test 2c48064
The diffusion image tests fed bare MagicMock() objects as images. Since ebe6779 routed _prepare_images through normalize_image_frames(), a non-PIL input takes the np.asarray(item).max() path; MagicMock.__iter__ defaults to empty, so np.asarray(MagicMock()) is a zero-size array and arr.max() raises "zero-size array to reduction operation maximum". These 8 tests only ran in CI once the runtime image built, exposing the failure.
Swap the MagicMock image doubles for real PIL images via a _make_pil_image() helper, so they hit the isinstance(item, Image.Image) pass-through and img.save(buf, format="PNG") produces real PNG bytes. Assertions unchanged.
Signed-off-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
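The failure mode in this commit message can be reproduced in isolation. The sketch below only shows the coercion step; `describe_coercion` is a hypothetical helper and does not call any Dynamo code.

```python
from unittest.mock import MagicMock

import numpy as np


def describe_coercion(obj) -> str:
    """Report what np.asarray makes of obj and whether .max() works on it."""
    arr = np.asarray(obj)
    try:
        arr.max()
    except ValueError as exc:
        # Reductions without an identity fail on zero-size arrays.
        return f"size={arr.size}, max() failed: {exc}"
    return f"size={arr.size}, max() ok"


if __name__ == "__main__":
    # A bare MagicMock iterates as empty, so it coerces to an empty array.
    print(describe_coercion(MagicMock()))
    # A real image-shaped array reduces without error.
    print(describe_coercion(np.zeros((4, 4, 3), dtype=np.uint8)))
```

This is why a test double that never reaches `np.asarray`, such as a real PIL image that takes the `isinstance` pass-through, avoids the error.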
/ok to test 271214e
Summary
Adds Dynamo vLLM-Omni backend support for NVIDIA Cosmos3 (`nvidia/Cosmos3-Nano` / `-Super`): text-to-image, text-to-video, and image-to-video, backed by the native Cosmos3 pipeline in vllm-project/vllm-omni#3454.
Changes
Worker integration (`components/src/dynamo/vllm/omni`, `common/utils`)
- `--cosmos3-guardrails` / `--no-cosmos3-guardrails` flag, routed into `AsyncOmni(model_config={"guardrails": False})` so the Cosmos3 safety-guardrail models can be skipped at startup.
- `normalize_image_frames()` + the image output path: the native Cosmos3 pipeline returns numpy `[batch, frames, H, W, C]` arrays (not PIL), so the formatter normalizes them before PNG-encoding `/v1/images/generations` responses.
Examples & docs
- `agg_omni_cosmos3_{image,video,i2v}.sh` (one modality per worker).
- `examples/backends/vllm/launch/cosmos3/` (official Cosmos3 prompts mapped to the Dynamo request schema).
- `docs/backends/vllm/cosmos3.md` (install, serving, request formats, gotchas).
Tests for `normalize_image_frames`, the guardrails arg, and the `model_config` passthrough.
Dependency
Requires the Cosmos3 pipeline from vllm-omni#3454 (not yet in a released vLLM-Omni). This PR is the Dynamo-side integration only; container pinning of vLLM-Omni is intentionally not included. Install vLLM-Omni from that PR per the guide.
Notes
- (`--output-modalities image|video`) request type derives from the worker's configured modality, not the HTTP endpoint.
- `size` is the OpenAI enum; per-request `num_inference_steps`/`guidance_scale`/`seed` go under `nvext`; i2v `input_reference` must be an http(s) URL or a `data:` URI (local paths rejected).
Verified t2i / t2v / i2v end-to-end through the frontend.
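The request-shape rules in the Notes can be sketched as a client-side pre-check. This is a rough approximation under assumptions: the helper names are hypothetical, the size string and prompt are placeholders, and the server's actual validation of `input_reference` may differ from this scheme test.

```python
import json
from urllib.parse import urlparse


def is_allowed_input_reference(ref: str) -> bool:
    """Accept http(s) URLs and data: URIs; reject local paths and file: URLs."""
    return urlparse(ref).scheme.lower() in ("http", "https", "data")


def build_i2v_request(model: str, prompt: str, input_reference: str,
                      size: str, **sampling) -> dict:
    """Assemble an image-to-video payload with sampling knobs under nvext."""
    if not is_allowed_input_reference(input_reference):
        raise ValueError(f"input_reference must be http(s) or data: URI: {input_reference!r}")
    return {
        "model": model,
        "prompt": prompt,
        "input_reference": input_reference,
        "size": size,
        # Per-request num_inference_steps / guidance_scale / seed go here.
        "nvext": dict(sampling),
    }


if __name__ == "__main__":
    payload = build_i2v_request(
        "nvidia/Cosmos3-Nano",
        "placeholder prompt",            # placeholder, not an official caption
        "https://example.com/first_frame.png",
        "1280x720",                      # placeholder size value
        num_inference_steps=30,
        guidance_scale=5.0,
        seed=42,
    )
    print(json.dumps(payload, indent=2))
```

Checking the reference before sending gives a clearer local error than a rejected request, at the cost of duplicating a rule the server already enforces.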
🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes
New Features
`--cosmos3-guardrails` CLI flag to control optional safety features.
Improvements
Documentation
Tests