fix(container,media): purge GPL ffmpeg from vllm-runtime, route omni video through LGPL h264_nvenc#10301
fix(container,media): purge GPL ffmpeg from vllm-runtime, route omni video through LGPL h264_nvenc#10301saturley-hall wants to merge 1 commit into
Conversation
…ni video through LGPL h264_nvenc PR #10091 only stripped the imageio-ffmpeg wheel's GPL binary on the vllm image; the base vllm/vllm-openai apt ffmpeg stack (libav*/libx264/libx265/ libmp3lame) and the explicitly-installed sox/libsox-fmt-all were left in place, so the image still ships GPL codecs. Because main can run omni video models, simply purging would break video: the in-tree LGPL ffmpeg has no libx264 (which diffusers.export_to_video defaults to) and its libswscale RGB->YUV path is broken (greens render magenta). So this forward-ports the full cosmos3 fix, not just the compliance subset. Container (vllm_runtime.Dockerfile): - Drop sox/libsox-fmt-all (vLLM-Omni's audio path is pure-numpy now). - Purge the base GPL ffmpeg/codec apt stack, pinning CUDA libs as manually-installed first so autoremove cannot delete libcublas/libcusolver/ libcusparse and break GPU inference. - Copy the LGPL in-tree ffmpeg (libs + CLI, already built unconditionally in wheel_builder) and set IMAGEIO_FFMPEG_EXE. The imageio-ffmpeg --no-binary reinstall is unchanged. Media (encoder, so video stays correct on the LGPL build): - output_formatter._encode_video now encodes via video_utils.encode_to_video_bytes (h264_nvenc / libvpx-vp9) instead of diffusers.export_to_video. - encode_to_video_bytes pre-converts RGB->YUV420p in numpy (BT.601 full range) and pipes planar YUV to ffmpeg, bypassing the broken libswscale path (#10159). - normalize_image_frames flattens Cosmos3-style [B,F,H,W,C] numpy to PIL. Tests updated to match (subprocess-based encode + normalize_image_frames); ATTRIBUTIONS snapshot from the source commits intentionally omitted. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
WalkthroughThis pull request migrates video encoding from an ChangesVideo encoding pipeline overhaul
Docker runtime environment
🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (3)
components/src/dynamo/common/tests/test_video_utils.py (1)
118-123: ⚡ Quick winHoist the new test imports to module scope.
The added
PILandvideo_utilsimports are inside test functions even though they are not guarded by an optional-dependency skip path. Please keep them at the module top for consistency with the repo’s Python rules.As per coding guidelines, "Keep all imports at module top (flag any import inside functions/classes)."
Also applies to: 131-135, 144-148, 161-163
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@components/src/dynamo/common/tests/test_video_utils.py` around lines 118 - 123, Move the inline imports for PIL and the video utils out of the test functions and hoist them to module scope: remove the from PIL import Image and from dynamo.common.utils.video_utils import normalize_image_frames (and similar imports found in the other test functions at 131-135, 144-148, 161-163) and place them at the top of test_video_utils.py so all tests (including test_pil_inputs_returned_by_identity) use module-level imports in accordance with the repository import rule.components/src/dynamo/vllm/omni/output_formatter.py (1)
147-153: ⚡ Quick winMove the full frame-prep pipeline off the event loop.
normalize_image_frames()andframes_to_numpy()can do a lot of PIL/NumPy copying before you hop toasyncio.to_thread(). On larger videos that blocks the loop and serializes unrelated requests. Wrap the normalization/conversion together withencode_to_video_bytes()in the worker thread.Suggested shape
- frames_np = frames_to_numpy(normalize_image_frames(images)) - video_bytes = await asyncio.to_thread( - encode_to_video_bytes, - frames_np, - fps=fps, - output_format=output_format, - ) + def _encode() -> bytes: + frames_np = frames_to_numpy(normalize_image_frames(images)) + return encode_to_video_bytes( + frames_np, + fps=fps, + output_format=output_format, + ) + + video_bytes = await asyncio.to_thread(_encode)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@components/src/dynamo/vllm/omni/output_formatter.py` around lines 147 - 153, The current code calls normalize_image_frames() and frames_to_numpy() on the event loop before awaiting asyncio.to_thread(encode_to_video_bytes...), which can block the loop for large videos; fix this by moving the entire frame-prep pipeline into the worker thread: wrap normalize_image_frames(images), frames_to_numpy(...) and encode_to_video_bytes(...) together inside the callable passed to asyncio.to_thread (or a small helper function), passing fps and output_format through so that the event loop only awaits the thread result and does no heavy PIL/NumPy work; update references to frames_np/video_bytes accordingly and keep parameter names (images, fps, output_format) intact.components/src/dynamo/common/utils/video_utils.py (1)
84-92: ⚡ Quick winMove these imports to module scope.
PIL.Image,subprocess, andtempfileare imported inside hot-path helpers. That violates the repo’s Python rule and makes the dependency/patch surface less predictable than necessary.As per coding guidelines, "Keep all imports at module top (flag any import inside functions/classes)."
Also applies to: 249-250
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@components/src/dynamo/common/utils/video_utils.py` around lines 84 - 92, Move the inline imports out of hot-path helper functions and place them at module scope: import PIL.Image as Image (or from PIL import Image), import subprocess, and import tempfile at the top of the file, then remove the corresponding in-function imports in normalize_image_frames and the other helper(s) referenced around lines 249-250; ensure functions like normalize_image_frames reference the module-level names (Image, subprocess, tempfile) instead of importing them locally.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@components/src/dynamo/common/utils/video_utils.py`:
- Around line 275-281: Add a timeout to the subprocess.run call and handle
subprocess.TimeoutExpired the same way as subprocess.CalledProcessError: update
the call to subprocess.run(cmd + [tmp.name], input=yuv, check=True,
capture_output=True, timeout=...) (use a reasonable constant like
VIDEO_ENCODE_TIMEOUT) and add an except subprocess.TimeoutExpired as e block
that raises RuntimeError with the same message format using e.stderr (decode
with errors='replace') and chain the exception (from e) so timeouts map to the
existing failure path; keep the existing except subprocess.CalledProcessError
handling unchanged.
In `@container/templates/vllm_runtime.Dockerfile`:
- Around line 234-239: The current RUN chain swallows failures for the critical
libav/libsw copies because the trailing `|| true` is applied to the whole
left-associative chain; remove `|| true` from the `cp -nL
/tmp/usr/local/lib/lib*vpx*.so* /usr/local/lib/` group and instead scope it only
to the optional vpx copy so failures in `cp -nL /tmp/usr/local/lib/libav*.so*
/tmp/usr/local/lib/libsw*.so* /usr/local/lib/` (and the subsequent
pkgconfig/bin/src copies and `ldconfig`) still cause the build to fail; you can
do this by splitting the vpx copy into its own command or by wrapping only that
cp with `(...) || true`, keeping the rest of the cp lines and `ldconfig`
unchanged so missing critical libraries trigger a non-zero exit.
---
Nitpick comments:
In `@components/src/dynamo/common/tests/test_video_utils.py`:
- Around line 118-123: Move the inline imports for PIL and the video utils out
of the test functions and hoist them to module scope: remove the from PIL import
Image and from dynamo.common.utils.video_utils import normalize_image_frames
(and similar imports found in the other test functions at 131-135, 144-148,
161-163) and place them at the top of test_video_utils.py so all tests
(including test_pil_inputs_returned_by_identity) use module-level imports in
accordance with the repository import rule.
In `@components/src/dynamo/common/utils/video_utils.py`:
- Around line 84-92: Move the inline imports out of hot-path helper functions
and place them at module scope: import PIL.Image as Image (or from PIL import
Image), import subprocess, and import tempfile at the top of the file, then
remove the corresponding in-function imports in normalize_image_frames and the
other helper(s) referenced around lines 249-250; ensure functions like
normalize_image_frames reference the module-level names (Image, subprocess,
tempfile) instead of importing them locally.
In `@components/src/dynamo/vllm/omni/output_formatter.py`:
- Around line 147-153: The current code calls normalize_image_frames() and
frames_to_numpy() on the event loop before awaiting
asyncio.to_thread(encode_to_video_bytes...), which can block the loop for large
videos; fix this by moving the entire frame-prep pipeline into the worker
thread: wrap normalize_image_frames(images), frames_to_numpy(...) and
encode_to_video_bytes(...) together inside the callable passed to
asyncio.to_thread (or a small helper function), passing fps and output_format
through so that the event loop only awaits the thread result and does no heavy
PIL/NumPy work; update references to frames_np/video_bytes accordingly and keep
parameter names (images, fps, output_format) intact.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 802ceb91-00d6-4b16-9049-2293eb407fb1
📒 Files selected for processing (5)
components/src/dynamo/common/tests/test_video_utils.pycomponents/src/dynamo/common/utils/video_utils.pycomponents/src/dynamo/vllm/omni/output_formatter.pycomponents/src/dynamo/vllm/tests/omni/test_output_formatter.pycontainer/templates/vllm_runtime.Dockerfile
| with tempfile.NamedTemporaryFile(suffix=f".{output_format}") as tmp: | ||
| try: | ||
| import imageio as iio # type: ignore[no-redef] | ||
| except ImportError: | ||
| raise ImportError( | ||
| "imageio is required for video encoding. " | ||
| "Install with: pip install imageio[ffmpeg]" | ||
| ) | ||
|
|
||
| logger.info(f"Encoding {len(frames)} frames to {output_format} bytes at {fps} fps") | ||
|
|
||
| try: | ||
| buffer = io.BytesIO() | ||
|
|
||
| kwargs: dict = {"fps": fps} | ||
| if output_format == "webm": | ||
| kwargs["codec"] = "libvpx-vp9" | ||
| elif output_format == "mp4": | ||
| kwargs["codec"] = "h264_nvenc" | ||
| else: | ||
| raise ValueError(f"No codec specified for response format: {output_format}") | ||
|
|
||
| if hasattr(iio, "imwrite"): | ||
| # v3 API | ||
| iio.imwrite(buffer, frames, extension=f".{output_format}", **kwargs) | ||
| else: | ||
| # v2 API | ||
| writer = iio.get_writer( # type: ignore[attr-defined] | ||
| buffer, format="FFMPEG", mode="I", **kwargs | ||
| ) | ||
| try: | ||
| for frame in frames: | ||
| writer.append_data(frame) | ||
| finally: | ||
| writer.close() | ||
|
|
||
| video_bytes = buffer.getvalue() | ||
| logger.info(f"Encoded video to {len(video_bytes)} bytes") | ||
| return video_bytes | ||
|
|
||
| except Exception as e: | ||
| logger.error(f"Failed to encode video to bytes: {e}") | ||
| raise RuntimeError(f"Video encoding to bytes failed: {e}") from e | ||
| subprocess.run(cmd + [tmp.name], input=yuv, check=True, capture_output=True) | ||
| except subprocess.CalledProcessError as e: | ||
| raise RuntimeError( | ||
| f"Video encoding to bytes failed: {e.stderr.decode(errors='replace')}" | ||
| ) from e |
There was a problem hiding this comment.
Add a timeout around the ffmpeg subprocess.
A wedged ffmpeg process will block this request path indefinitely because subprocess.run(...) has no timeout. Please bound the call and translate TimeoutExpired into the same failure path as CalledProcessError.
Suggested fix
try:
- subprocess.run(cmd + [tmp.name], input=yuv, check=True, capture_output=True)
+ subprocess.run(
+ [*cmd, tmp.name],
+ input=yuv,
+ check=True,
+ capture_output=True,
+ timeout=30,
+ )
except subprocess.CalledProcessError as e:
raise RuntimeError(
f"Video encoding to bytes failed: {e.stderr.decode(errors='replace')}"
) from e
+ except subprocess.TimeoutExpired as e:
+ raise RuntimeError("Video encoding to bytes timed out") from e🧰 Tools
🪛 Ruff (0.15.15)
[error] 277-277: subprocess call: check for execution of untrusted input
(S603)
[warning] 277-277: Consider [*cmd, tmp.name] instead of concatenation
Replace with [*cmd, tmp.name]
(RUF005)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@components/src/dynamo/common/utils/video_utils.py` around lines 275 - 281,
Add a timeout to the subprocess.run call and handle subprocess.TimeoutExpired
the same way as subprocess.CalledProcessError: update the call to
subprocess.run(cmd + [tmp.name], input=yuv, check=True, capture_output=True,
timeout=...) (use a reasonable constant like VIDEO_ENCODE_TIMEOUT) and add an
except subprocess.TimeoutExpired as e block that raises RuntimeError with the
same message format using e.stderr (decode with errors='replace') and chain the
exception (from e) so timeouts map to the existing failure path; keep the
existing except subprocess.CalledProcessError handling unchanged.
| cp -nL /tmp/usr/local/lib/libav*.so* /tmp/usr/local/lib/libsw*.so* /usr/local/lib/ && \ | ||
| cp -nL /tmp/usr/local/lib/lib*vpx*.so* /usr/local/lib/ 2>/dev/null || true && \ | ||
| cp -nL /tmp/usr/local/lib/pkgconfig/libav*.pc /tmp/usr/local/lib/pkgconfig/libsw*.pc /usr/local/lib/pkgconfig/ && \ | ||
| cp -r /tmp/usr/local/src/ffmpeg /usr/local/src/ | ||
| {% endif %} | ||
| cp -nL /tmp/usr/local/bin/ffmpeg /usr/local/bin/ffmpeg && \ | ||
| cp -r /tmp/usr/local/src/ffmpeg /usr/local/src/ && \ | ||
| ldconfig |
There was a problem hiding this comment.
|| true masks failures of the critical libav/libsw copies, not just the optional vpx copy.
This RUN has no set -e, so it relies on the && chain for fail-fast. Because && and || share precedence and are left-associative, the || true on Line 235 applies to the entire preceding group (mkdir … && cp …libav*.so*/libsw*.so* …). If the critical libav*.so*/libsw*.so* copy fails (e.g. wheel_builder didn't emit them), the failure is swallowed, ldconfig still runs, and the image builds green with a broken/missing LGPL ffmpeg runtime — surfacing only later as a failed omni video-export path.
Isolate || true to just the optional vpx copy:
🔒 Proposed fix to scope || true to the vpx copy
- cp -nL /tmp/usr/local/lib/libav*.so* /tmp/usr/local/lib/libsw*.so* /usr/local/lib/ && \
- cp -nL /tmp/usr/local/lib/lib*vpx*.so* /usr/local/lib/ 2>/dev/null || true && \
+ cp -nL /tmp/usr/local/lib/libav*.so* /tmp/usr/local/lib/libsw*.so* /usr/local/lib/ && \
+ { cp -nL /tmp/usr/local/lib/lib*vpx*.so* /usr/local/lib/ 2>/dev/null || true; } && \🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@container/templates/vllm_runtime.Dockerfile` around lines 234 - 239, The
current RUN chain swallows failures for the critical libav/libsw copies because
the trailing `|| true` is applied to the whole left-associative chain; remove
`|| true` from the `cp -nL /tmp/usr/local/lib/lib*vpx*.so* /usr/local/lib/`
group and instead scope it only to the optional vpx copy so failures in `cp -nL
/tmp/usr/local/lib/libav*.so* /tmp/usr/local/lib/libsw*.so* /usr/local/lib/`
(and the subsequent pkgconfig/bin/src copies and `ldconfig`) still cause the
build to fail; you can do this by splitting the vpx copy into its own command or
by wrapping only that cp with `(...) || true`, keeping the rest of the cp lines
and `ldconfig` unchanged so missing critical libraries trigger a non-zero exit.
|
|
||
| # Pre-convert RGB->YUV420p in numpy and feed planar YUV directly, bypassing | ||
| # the in-tree ffmpeg's broken libswscale RGB->YUV path. | ||
| yuv = _rgb_to_yuv420p(frames) |
There was a problem hiding this comment.
The encoder materializes the whole raw YUV video in memory before ffmpeg starts, while _rgb_to_yuv420p also allocates full-video float32 RGB/Y/U/V arrays, so default-sized video requests can consume well over 1 GB transient memory per request. Fix: convert and write frames to ffmpeg stdin incrementally instead of returning one whole-video bytes payload.
🤖 AI Fix
In components/src/dynamo/common/utils/video_utils.py, refactor encode_to_video_bytes and _rgb_to_yuv420p to convert RGB to YUV420p per frame or bounded chunk and feed ffmpeg through subprocess.Popen(..., stdin=subprocess.PIPE, ...), then close stdin, check the return code, and read the temp output file.
| | grep -E '^(libcu|libnv|libnccl|cuda)' || true); \ | ||
| if [ -n "$keep" ]; then apt-mark manual $keep >/dev/null; fi; \ | ||
| purge=$(dpkg-query -W -f='${Package}\n' 2>/dev/null \ | ||
| | grep -E '^(ffmpeg|libav[a-z]|libsw[a-z]|libpostproc|libx264|libx265|libmp3lame|libaom|libdav1d|libvpx|libtheora|libvorbis|libopus|libsoxr)' \ |
There was a problem hiding this comment.
The purge regex matches non-ffmpeg packages such as libavahi* and libavif*, so the build can remove unrelated runtime libraries from the base image. Fix: enumerate only the intended FFmpeg library package prefixes.
🤖 AI Fix
In container/templates/vllm_runtime.Dockerfile, change the grep -E pattern in the purge=$(...) assignment to match only libavcodec, libavdevice, libavfilter, libavformat, libavutil, libswresample, and libswscale instead of the broad libav[a-z] and libsw[a-z] patterns.
| cp -rnL /tmp/usr/local/include/libav* /tmp/usr/local/include/libsw* /usr/local/include/ && \ | ||
| cp -nL /tmp/usr/local/lib/libav*.so /tmp/usr/local/lib/libsw*.so /usr/local/lib/ && \ | ||
| cp -nL /tmp/usr/local/lib/libav*.so* /tmp/usr/local/lib/libsw*.so* /usr/local/lib/ && \ | ||
| cp -nL /tmp/usr/local/lib/lib*vpx*.so* /usr/local/lib/ 2>/dev/null || true && \ |
There was a problem hiding this comment.
wheel_builder builds FFmpeg with shared --enable-libvpx, so treating the lib*vpx*.so* copy as optional can leave /usr/local/bin/ffmpeg or libavcodec.so with an unsatisfied libvpx dependency after the apt purge. Fix: make the libvpx copy required and fail the image build if it is absent.
🤖 AI Fix
In container/templates/vllm_runtime.Dockerfile, remove 2>/dev/null || true from the cp -nL /tmp/usr/local/lib/lib*vpx*.so* /usr/local/lib/ command so missing libvpx from wheel_builder fails the build.
What
Closes the GPL-ffmpeg compliance gap on the vllm runtime image on
mainand keeps omni video generation working on the LGPL build. This is the full forward-port of the cosmos3 fix (not the text-only subset used on the kimi/nemotron release branches), becausemaincan run omni video models.Why
PR #10091 only stripped the
imageio-ffmpegwheel's GPL binary. The vllm image still ships:vllm/vllm-openaiapt GPL ffmpeg stack (libav*,libx264/5,libmp3lame) — unpurgedsox+libsox-fmt-all— dead weight (vLLM-Omni's audio path is pure-numpy now)A purge-only change would break video: the in-tree LGPL ffmpeg has no libx264 (which
diffusers.export_to_videodefaults to for MP4) and its libswscale RGB→YUV path is broken (greens render magenta). So the encoder path must change too.Changes
Container (
vllm_runtime.Dockerfile):sox/libsox-fmt-all.libcu*/libnv*/libnccl*/cuda*) as manually-installed first soautoremovecan't deletelibcublas/libcusolver/libcusparseand break GPU inference.wheel_builder) and setIMAGEIO_FFMPEG_EXE. The existingimageio-ffmpeg --no-binaryreinstall is unchanged.Media (encoder):
output_formatter._encode_videonow encodes viavideo_utils.encode_to_video_bytes(h264_nvenc/libvpx-vp9) instead ofdiffusers.export_to_video.encode_to_video_bytespre-converts RGB→YUV420p in numpy (BT.601 full range) and pipes planar YUV to ffmpeg, bypassing the broken libswscale path (fix(media): pre-convert RGB→YUV420p in numpy to fix corrupted video colors #10159).normalize_image_framesflattens Cosmos3-style[B,F,H,W,C]numpy to PIL.normalize_image_frames).Testing
container/render.py --framework vllm --device cuda --cuda-version 13.0 --target runtimerenders cleanly; verified purge/CUDA-protect, LGPL CLI copy +IMAGEIO_FFMPEG_EXE, and nosoxinstall.video_utilsencoder +normalize_image_frameslogic validated standalone (subprocess/tempfile mocked):h264_nvenc/libvpx-vp9codec selection, YUV420p plane sizing, even-dim cropping, error paths, PIL/5D/float normalization. Full pytest collection needs the compileddynamo.llm(CI).main's generator owns that.🤖 Generated with Claude Code
Summary by CodeRabbit
Tests
Chores