chore(container): build in-tree ffmpeg CLI and route imageio through it #10091
Extends wheel_builder's existing ffmpeg build to also produce the CLI binary, with the NVIDIA NVENC encoder for H.264 (h264_nvenc) and libvpx for VP9 (libvpx_vp9). The CLI plus its runtime libs land in the SGLang and TRT-LLM runtime images, and IMAGEIO_FFMPEG_EXE points imageio at it. Python video encoding paths switch from libx264 to h264_nvenc to use the new HW encoder; the WebM/VP9 path is unchanged. imageio-ffmpeg is now installed from source — requirements.common.txt and requirements.trtllm.txt carry a `--no-binary imageio-ffmpeg` directive, and the SGLang and vLLM runtimes reinstall it from source after the upstream base image's install so all four images agree on the in-tree ffmpeg as the sole encoder. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@container/templates/sglang_runtime.Dockerfile`:
- Around line 36-45: The RUN layer copies only unversioned FFmpeg symlinks
(libav*.so and libsw*.so) which can miss the actual versioned SONAME files;
update the cp patterns in the RUN command that currently reference libav*.so and
libsw*.so to copy the versioned shared objects (libav*.so* and libsw*.so*) using
the same flags (cp -nL ...) and preserve the existing error-tolerant redirects
(e.g., 2>/dev/null || true) so the Docker build still succeeds if files are
absent; locate and modify the cp invocation inside the RUN --mount... block that
precedes ENV IMAGEIO_FFMPEG_EXE to use the *.so* glob instead of just *.so.
In `@container/templates/trtllm_runtime.Dockerfile`:
- Around line 149-161: The ENV IMAGEIO_FFMPEG_EXE set in the runtime_full stage
is not carried into the final runtime stage, causing imageio to lose the pinned
/usr/local/bin/ffmpeg; add the same ENV IMAGEIO_FFMPEG_EXE=/usr/local/bin/ffmpeg
to the final runtime stage (the later "runtime" stage that rebases from
runtime_full) so the runtime image preserves the IMAGEIO_FFMPEG_EXE variable
used by the TRT-LLM encoder; ensure you update both occurrences mentioned (the
runtime stage and the later block around lines 195-203) so IMAGEIO_FFMPEG_EXE is
present in the final image.
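The difference between the two glob patterns in the first inline comment can be checked with a small sketch. The file names below are a hypothetical directory listing (real SONAME version numbers depend on the FFmpeg release), and `fnmatchcase` only approximates shell globbing for these simple patterns.

```python
from fnmatch import fnmatchcase

# Hypothetical listing of an FFmpeg lib directory; version suffixes are made up.
files = [
    "libavcodec.so",
    "libavcodec.so.61",
    "libavcodec.so.61.3.100",
    "libswscale.so",
    "libswscale.so.8",
]


def matched(pattern, names):
    """Return the names that the glob pattern matches in full."""
    return [name for name in names if fnmatchcase(name, pattern)]


# "libav*.so" must end in ".so", so the versioned files are skipped.
unversioned = matched("libav*.so", files)
# "libav*.so*" also accepts anything after ".so", including version suffixes.
versioned = matched("libav*.so*", files)

print(unversioned)
print(versioned)
```

With the narrower pattern only the unversioned name is selected, which is the gap the review comment points at.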
📒 Files selected for processing (10)
- components/src/dynamo/common/tests/test_video_utils.py
- components/src/dynamo/common/utils/video_utils.py
- components/src/dynamo/sglang/request_handlers/video_generation/video_generation_handler.py
- container/deps/requirements.common.txt
- container/deps/requirements.trtllm.txt
- container/templates/sglang_runtime.Dockerfile
- container/templates/trtllm_runtime.Dockerfile
- container/templates/vllm_runtime.Dockerfile
- container/templates/wheel_builder.Dockerfile
- docs/backends/trtllm/trtllm-diffusion.md
The wheel_builder base images ship `yasm` but not `nasm`, and modern ffmpeg requires nasm (yasm is deprecated) when --enable-x86asm is in effect. Build was failing with "nasm not found or too old". Restoring --disable-x86asm matches the prior behavior and is fine for our encoder paths: H.264 goes through NVENC (hardware encoder, no CPU asm needed) and VP9 goes through libvpx (which builds its own asm via yasm in a separate compilation unit). FFmpeg's own libav* C fallbacks are sufficient for the mux/demux work this build does. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Resolves conflict in container/deps/requirements.trtllm.txt: main removed the huggingface-hub pin (no longer needed after upstream changes); we kept the --no-binary imageio-ffmpeg directive added on this branch. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… line
The requirements-txt-fixer pre-commit hook sorts --no-binary directives alphabetically (ASCII-wise `-` < letters), so the directive sits at the top of the file while imageio-ffmpeg>=0.6.0 sits alphabetically among the packages. Adds a trailing inline comment on the package line to make the cross-reference obvious without fighting the hook.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
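The ordering described above follows from plain ASCII comparison, which a short sketch can show. This only illustrates the character ordering; it does not reproduce the hook's actual algorithm, and `aiohttp` is a hypothetical neighbouring entry.

```python
# Lines as they might appear in a requirements file; "aiohttp" is hypothetical.
lines = [
    "imageio-ffmpeg>=0.6.0",
    "aiohttp",
    "--no-binary imageio-ffmpeg",
]

# "-" is code point 45 and lowercase letters start at 97, so the directive
# sorts ahead of every package name under a case-insensitive ASCII sort.
ordered = sorted(lines, key=str.lower)

for line in ordered:
    print(line)
```

The directive and the package it applies to end up separated by the other entries, which is why the commit adds a cross-referencing comment on the package line.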
…time
Two CI failures from the previous push:
1. vllm-runtime build: `uv pip show/install` needed the {{ pip_target }}
Jinja variable (--system on cuda, --python /opt/venv/bin/python on
xpu/cpu). Without it uv reported "No virtual environment found" and
the imageio-ffmpeg reinstall step exited 2.
2. dynamo-runtime / rust-gpu: --enable-libvpx adds a runtime dep on
libvpx.so.9 to libavcodec.so. The existing FFmpeg copy block only
copied libav*.so / libsw*.so, so dynamo_llm-* failed to load shared
libraries during the Rust GPU check. Now copies lib*vpx*.so* too
(matching the sglang and trtllm runtime images) and runs ldconfig.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The sglang and vllm runtime templates carried inline pip/uv-pip install
commands for the imageio-ffmpeg source-reinstall step, including a
guarded `pip show ... && pip install ...` pattern. Both base images
(lmsysorg/sglang, vllm/vllm-openai) always pre-install imageio-ffmpeg,
so the guards were always-true dead code.
Routes both reinstalls through container/deps/requirements.{sglang,vllm}.txt,
matching the pattern already used by requirements.{common,trtllm}.txt where
the `--no-binary imageio-ffmpeg` directive lives inside the requirements
file. Drops the redundant `pip show` guards.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…IMAGEIO_FFMPEG_EXE
Three review fixes:
- sglang_runtime: ungate the ffmpeg copy + IMAGEIO_FFMPEG_EXE on enable_media_ffmpeg. lmsysorg/sglang always ships imageio-ffmpeg with a GPL-encumbered prebuilt binary that the previous commit's requirements.sglang.txt now replaces unconditionally, so the LGPL CLI must always be present on PATH for imageio to target. Mirrors the trtllm pattern.
- sglang_runtime: cp the versioned shared objects (libav*.so*, libsw*.so*), not just the unversioned symlinks (*.so), so the SONAME files actually land at runtime.
- trtllm_runtime: re-declare IMAGEIO_FFMPEG_EXE in the final `runtime` stage's ENV block. The variable was set in runtime_full at line 161, but the final stage rebases from the upstream image and only redeclares Dynamo-specific env, so without this the runtime image lost the pin.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Four ARGs had hardcoded version defaults in templates rather than being sourced from container/context.yaml like the rest of the version pins (FFMPEG_VERSION, NIXL_REF, NATS_VERSION, etc.):
- SCCACHE_VERSION=v0.14.0 in dynamo_base.Dockerfile
- NV_CODEC_HEADERS_REF=n13.0.19.0 in wheel_builder.Dockerfile
- LIBVPX_REF=v1.14.1 in wheel_builder.Dockerfile
- AWS_SDK_CPP_VERSION=1.11.760 in wheel_builder.Dockerfile
Adds them to context.yaml (the three ffmpeg/sccache pins under `dynamo:`, AWS_SDK_CPP_VERSION under `vllm:` since it only builds for vllm/cuda) and wires them through args.Dockerfile. The stage-local ARG declarations become bare (`ARG NAME`) so they inherit the global default, matching the existing FFMPEG_VERSION pattern.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… omni video through LGPL h264_nvenc
The vllm/vllm-openai base image ships a GPL/GPL-3.0 ffmpeg built against libx264/libx265/libmp3lame plus the full libav* codec stack. PR #10091 only stripped the imageio-ffmpeg python wheel's binary for vllm and left the base apt ffmpeg in place (enable_media_ffmpeg was "false", so the in-tree LGPL ffmpeg was never copied in), so the compliance scan still flagged the GPL stack. The sox/libsox-fmt-all packages Dynamo installed are also dead weight: vLLM-Omni replaced sox with a pure-numpy peak_normalize() and never shells out to the sox binary.
Mirror the sglang_runtime pattern for vllm-runtime:
- Drop sox/libsox-fmt-all from the apt install (keep jq + git).
- Purge the base GPL ffmpeg/codec apt stack (dpkg-query + grep, robust across version suffixes) and autoremove orphaned media deps.
- Copy the LGPL-only in-tree ffmpeg (libs + CLI) from wheel_builder and set IMAGEIO_FFMPEG_EXE so imageio/diffusers encode through it.
- output_formatter._encode_video now uses common video_utils.encode_to_video_bytes (h264_nvenc) instead of diffusers.export_to_video, which defaults to GPL libx264 for mp4 and would fail against the LGPL build.
Purge verified safe in the built image: PyAV, soundfile, torchaudio and torchvision bundle their own libraries and do not link the system ffmpeg.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
- `--enable-nvenc` (`h264_nvenc` for MP4/H.264) and `--enable-libvpx` (`libvpx_vp9` for WebM/VP9). Pulls in `nv-codec-headers` and `libvpx` from source so we don't have to track distro package names across the Ubuntu and manylinux base images.
- `IMAGEIO_FFMPEG_EXE=/usr/local/bin/ffmpeg` makes `imageio` use it.
- `codec="libx264"` → `codec="h264_nvenc"` (`common/utils/video_utils.py` × 2, `sglang/.../video_generation_handler.py`). The WebM/VP9 path is unchanged. Unit-test fixture updated.
- `--no-binary imageio-ffmpeg` directive in `requirements.common.txt` and `requirements.trtllm.txt` makes `dynamo`, `trtllm`, and `frontend` runtimes install the Python wrapper from source automatically. The SGLang and vLLM runtimes reinstall the wrapper from source after the upstream base image's install (lmsysorg/sglang and vllm/vllm-openai both pre-install `imageio-ffmpeg`), so all four images end up with the same shape: pure-Python wrapper + the in-tree ffmpeg CLI as the only encoder backend.
Test plan
- `python -c "import imageio_ffmpeg, os; print(os.listdir(os.path.join(os.path.dirname(imageio_ffmpeg.__file__), 'binaries')))"` shows the `binaries/` directory is empty or absent.
- `ffmpeg -hide_banner -encoders | grep -E 'h264_nvenc|libvpx_vp9'` lists both, and `ffmpeg -version | grep configuration:` shows `--disable-gpl --disable-nonfree --enable-nvenc`.
- (`examples/backends/sglang/launch/text-to-video-diffusion.sh`) returns a valid MP4. TRT-LLM video diffusion handler returns a valid MP4. Both runtimes require an NVENC-capable GPU.
- `pytest components/src/dynamo/common/tests/test_video_utils.py` passes.
🤖 Generated with Claude Code
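The first test-plan check (no bundled binary, ffmpeg pinned through the environment) could also be scripted. The following is a minimal sketch, not part of the PR: the helper names are hypothetical, and it assumes bundled executables sit in a `binaries/` directory inside the wrapper package with file names starting with `ffmpeg`.

```python
import os
import tempfile
from pathlib import Path
from typing import List, Mapping, Optional


def bundled_ffmpeg_binaries(package_dir: str) -> List[str]:
    """List ffmpeg executables shipped inside the wrapper package.

    Assumption: they live in <package_dir>/binaries/ and start with "ffmpeg".
    """
    binaries = Path(package_dir) / "binaries"
    if not binaries.is_dir():
        return []  # an absent directory counts as "nothing bundled"
    return sorted(p.name for p in binaries.iterdir() if p.name.startswith("ffmpeg"))


def pinned_ffmpeg(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the path pinned through IMAGEIO_FFMPEG_EXE, or None if unset."""
    return environ.get("IMAGEIO_FFMPEG_EXE")


# Demonstration against a throwaway directory instead of a real install.
with tempfile.TemporaryDirectory() as pkg:
    source_install = bundled_ffmpeg_binaries(pkg)  # no binaries/ directory yet
    (Path(pkg) / "binaries").mkdir()
    (Path(pkg) / "binaries" / "ffmpeg-hypothetical-build").touch()
    wheel_install = bundled_ffmpeg_binaries(pkg)  # now one bundled file

pinned = pinned_ffmpeg({"IMAGEIO_FFMPEG_EXE": "/usr/local/bin/ffmpeg"})
print(source_install, wheel_install, pinned)
```

Inside a built image, the package directory would be `os.path.dirname(imageio_ffmpeg.__file__)`, as in the one-liner from the test plan.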