chore(container): build in-tree ffmpeg CLI and route imageio through it #10091

Merged
saturley-hall merged 9 commits into main from harrison/ffmpeg-vendoring
May 29, 2026

Conversation

@saturley-hall
Member

@saturley-hall saturley-hall commented May 28, 2026

Summary

  • wheel_builder: extends the existing ffmpeg build so it also produces the CLI binary, with --enable-nvenc (h264_nvenc for MP4/H.264) and --enable-libvpx (libvpx_vp9 for WebM/VP9). Pulls in nv-codec-headers and libvpx from source so we don't have to track distro package names across the Ubuntu and manylinux base images.
  • runtime images: the in-tree ffmpeg CLI + its runtime libs are copied into the SGLang and TRT-LLM runtime images. IMAGEIO_FFMPEG_EXE=/usr/local/bin/ffmpeg makes imageio use it.
  • Python: three call sites switch from codec="libx264" to codec="h264_nvenc" (common/utils/video_utils.py × 2, sglang/.../video_generation_handler.py). The WebM/VP9 path is unchanged. The unit-test fixture is updated.
  • imageio-ffmpeg installation: a --no-binary imageio-ffmpeg directive in requirements.common.txt and requirements.trtllm.txt makes dynamo, trtllm, and frontend runtimes install the Python wrapper from source automatically. The SGLang and vLLM runtimes reinstall the wrapper from source after the upstream base image's install (lmsysorg/sglang and vllm/vllm-openai both pre-install imageio-ffmpeg), so all four images end up with the same shape: pure-Python wrapper + the in-tree ffmpeg CLI as the only encoder backend.
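
The IMAGEIO_FFMPEG_EXE routing described above can be sanity-checked from Python without importing imageio. The following is a minimal sketch, assuming only that the variable holds the path of an executable file; `pinned_ffmpeg` is a hypothetical helper and is not part of this PR.

```python
import os


def pinned_ffmpeg(env=None):
    """Return the ffmpeg path pinned via IMAGEIO_FFMPEG_EXE, or raise.

    Hypothetical helper: it only checks the env-var contract described in
    the summary and does not call into imageio or imageio-ffmpeg.
    """
    env = os.environ if env is None else env
    exe = env.get("IMAGEIO_FFMPEG_EXE")
    if not exe:
        raise RuntimeError("IMAGEIO_FFMPEG_EXE is not set")
    if not (os.path.isfile(exe) and os.access(exe, os.X_OK)):
        raise RuntimeError(f"IMAGEIO_FFMPEG_EXE={exe} is not an executable file")
    return exe
```

Inside a runtime image, `pinned_ffmpeg()` would be expected to return /usr/local/bin/ffmpeg; a RuntimeError suggests the variable or the binary did not make it into the final stage.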

Test plan

  • CI image builds (sglang, trtllm, vllm, dynamo runtimes) succeed.
  • In each runtime image, python -c "import imageio_ffmpeg, os; print(os.listdir(os.path.join(os.path.dirname(imageio_ffmpeg.__file__), 'binaries')))" prints an empty list, or raises FileNotFoundError if the binaries/ directory is absent.
  • In sglang and trtllm runtimes, ffmpeg -hide_banner -encoders | grep -E 'h264_nvenc|libvpx-vp9' lists both (ffmpeg prints the VP9 encoder as libvpx-vp9, with a hyphen), and ffmpeg -version | grep configuration: shows --disable-gpl --disable-nonfree --enable-nvenc.
  • End-to-end: text-to-video diffusion (examples/backends/sglang/launch/text-to-video-diffusion.sh) returns a valid MP4. TRT-LLM video diffusion handler returns a valid MP4. Both runtimes require an NVENC-capable GPU.
  • pytest components/src/dynamo/common/tests/test_video_utils.py passes.
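
The encoder check in the test plan can also be scripted. This is a sketch under stated assumptions, not the project's test code: `missing_encoders` and `check_local_ffmpeg` are hypothetical names, and because ffmpeg lists the VP9 encoder as `libvpx-vp9` while this page writes `libvpx_vp9`, the parser treats hyphens and underscores as equivalent.

```python
import shutil
import subprocess

REQUIRED_ENCODERS = ("h264_nvenc", "libvpx_vp9")


def _normalize(name):
    # Treat "libvpx_vp9" and "libvpx-vp9" as the same encoder name.
    return name.replace("-", "_")


def missing_encoders(encoders_output, required=REQUIRED_ENCODERS):
    """Return the required encoders absent from `ffmpeg -encoders` output."""
    available = set()
    for line in encoders_output.splitlines():
        fields = line.split()
        # Encoder rows start with a flags column such as "V....D",
        # followed by the encoder name; legend rows have "=" instead.
        if len(fields) >= 2 and fields[0][0] in "VAS" and fields[1] != "=":
            available.add(_normalize(fields[1]))
    return [name for name in required if _normalize(name) not in available]


def check_local_ffmpeg():
    """Run the check against whichever ffmpeg is first on PATH."""
    ffmpeg = shutil.which("ffmpeg")
    if ffmpeg is None:
        raise RuntimeError("ffmpeg not found on PATH")
    result = subprocess.run(
        [ffmpeg, "-hide_banner", "-encoders"],
        capture_output=True, text=True, check=True,
    )
    return missing_encoders(result.stdout)
```

Calling `check_local_ffmpeg()` inside the sglang or trtllm runtime image should return an empty list if both encoders were built in.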

🤖 Generated with Claude Code



Summary by CodeRabbit

  • New Features

    • Added NVIDIA NVENC H.264 codec support for accelerated MP4 video encoding, enabling GPU-powered compression.
  • Bug Fixes

    • Resolved FFmpeg licensing compliance by ensuring LGPL-only binaries are used in container environments instead of GPL-bundled versions.
  • Documentation

    • Updated video generation documentation with FFmpeg requirements and setup instructions for both containerized and standalone deployments.

Review Change Stack

Extends wheel_builder's existing ffmpeg build to also produce the CLI
binary, with the NVIDIA NVENC encoder for H.264 (h264_nvenc) and libvpx
for VP9 (libvpx_vp9). The CLI plus its runtime libs land in the SGLang
and TRT-LLM runtime images, and IMAGEIO_FFMPEG_EXE points imageio at it.

Python video encoding paths switch from libx264 to h264_nvenc to use the
new HW encoder; the WebM/VP9 path is unchanged.

imageio-ffmpeg is now installed from source — requirements.common.txt and
requirements.trtllm.txt carry a `--no-binary imageio-ffmpeg` directive,
and the SGLang and vLLM runtimes reinstall it from source after the
upstream base image's install so all four images agree on the in-tree
ffmpeg as the sole encoder.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@saturley-hall saturley-hall requested review from a team as code owners May 28, 2026 16:59
@github-actions github-actions Bot added the chore, documentation, backend::sglang, backend::trtllm, and container labels May 28, 2026
@coderabbitai
Contributor

coderabbitai Bot commented May 28, 2026

Walkthrough

This PR migrates video encoding from the libx264 codec to NVIDIA's h264_nvenc codec, rebuilds FFmpeg with NVENC and VP9 encoder support, and replaces GPL-bundled imageio-ffmpeg wheels with source-built LGPL FFmpeg across all container runtimes using IMAGEIO_FFMPEG_EXE redirection.

Changes

Video encoding and container FFmpeg refactor

| Layer | File(s) | Summary |
|---|---|---|
| Application codec switch to h264_nvenc | components/src/dynamo/common/utils/video_utils.py, components/src/dynamo/sglang/request_handlers/video_generation/video_generation_handler.py, components/src/dynamo/common/tests/test_video_utils.py | encode_to_mp4, encode_to_video_bytes, and _frames_to_video update default codec from libx264 to h264_nvenc; test assertion updated to validate the new codec. |
| Prevent GPL imageio-ffmpeg wheel installation | container/deps/requirements.common.txt, container/deps/requirements.trtllm.txt | Both requirements files add --no-binary imageio-ffmpeg directives with comments explaining GPL exposure and directing installs toward in-tree LGPL ffmpeg CLI via IMAGEIO_FFMPEG_EXE. |
| Build FFmpeg with NVENC and libvpx encoders | container/templates/wheel_builder.Dockerfile | New build arguments pin nv-codec-headers and libvpx source refs; build steps clone and compile these dependencies; FFmpeg configure flags enable NVENC (h264_nvenc) and VP9 (libvpx_vp9) encoders as shared libraries. |
| Runtime image FFmpeg integration | container/templates/sglang_runtime.Dockerfile, container/templates/trtllm_runtime.Dockerfile, container/templates/vllm_runtime.Dockerfile | Each runtime copies FFmpeg binary and libraries from wheel_builder, sets IMAGEIO_FFMPEG_EXE environment variable, and conditionally force-reinstalls imageio-ffmpeg with --no-binary to ensure no GPL artifact remains on disk. |
| Documentation for FFmpeg setup | docs/backends/trtllm/trtllm-diffusion.md | Expanded section describing LGPL-only ffmpeg location in container, clarifying absence of wheel-bundled binary, and providing setup instructions for non-container deployments with IMAGEIO_FFMPEG_EXE configuration; notes that MP4 output requires NVIDIA GPU. |

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 71.43% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and specifically describes the main change: building an in-tree ffmpeg CLI and configuring imageio to use it instead of default ffmpeg sources. |
| Description check | ✅ Passed | The description covers all required template sections with comprehensive detail: Summary section explains code changes across wheel_builder, runtime images, Python, and imageio-ffmpeg installation; Details section is thorough; Test plan provides concrete verification steps. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@container/templates/sglang_runtime.Dockerfile`:
- Around line 36-45: The RUN layer copies only unversioned FFmpeg symlinks
(libav*.so and libsw*.so) which can miss the actual versioned SONAME files;
update the cp patterns in the RUN command that currently reference libav*.so and
libsw*.so to copy the versioned shared objects (libav*.so* and libsw*.so*) using
the same flags (cp -nL ...) and preserve the existing error-tolerant redirects
(e.g., 2>/dev/null || true) so the Docker build still succeeds if files are
absent; locate and modify the cp invocation inside the RUN --mount... block that
precedes ENV IMAGEIO_FFMPEG_EXE to use the *.so* glob instead of just *.so.

In `@container/templates/trtllm_runtime.Dockerfile`:
- Around line 149-161: The ENV IMAGEIO_FFMPEG_EXE set in the runtime_full stage
is not carried into the final runtime stage, causing imageio to lose the pinned
/usr/local/bin/ffmpeg; add the same ENV IMAGEIO_FFMPEG_EXE=/usr/local/bin/ffmpeg
to the final runtime stage (the later "runtime" stage that rebases from
runtime_full) so the runtime image preserves the IMAGEIO_FFMPEG_EXE variable
used by the TRT-LLM encoder; ensure you update both occurrences mentioned (the
runtime stage and the later block around lines 195-203) so IMAGEIO_FFMPEG_EXE is
present in the final image.
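
The glob difference in the first finding can be illustrated with Python's `fnmatch`, whose wildcard rules match shell globbing for these patterns. The file names and SONAME versions below are assumptions made up for the illustration, not a listing from the actual image.

```python
from fnmatch import fnmatch

# Hypothetical library directory; the version suffixes are illustrative only.
files = [
    "libavcodec.so",
    "libavcodec.so.61",
    "libavcodec.so.61.19.100",
    "libswscale.so.8",
]


def matching(patterns):
    """Return the files that match at least one of the glob patterns."""
    return [f for f in files if any(fnmatch(f, p) for p in patterns)]


unversioned = matching(["libav*.so", "libsw*.so"])
versioned = matching(["libav*.so*", "libsw*.so*"])
print(unversioned)
print(versioned)
```

Only the unversioned name matches `*.so`, whereas the dynamic linker looks libraries up by their versioned SONAME, which is why the finding asks for the `*.so*` glob.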

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1451ab8e-4bec-4113-a39a-64fd98a3d6a4

📥 Commits

Reviewing files that changed from the base of the PR and between eb98d8b and 7136e98.

📒 Files selected for processing (10)
  • components/src/dynamo/common/tests/test_video_utils.py
  • components/src/dynamo/common/utils/video_utils.py
  • components/src/dynamo/sglang/request_handlers/video_generation/video_generation_handler.py
  • container/deps/requirements.common.txt
  • container/deps/requirements.trtllm.txt
  • container/templates/sglang_runtime.Dockerfile
  • container/templates/trtllm_runtime.Dockerfile
  • container/templates/vllm_runtime.Dockerfile
  • container/templates/wheel_builder.Dockerfile
  • docs/backends/trtllm/trtllm-diffusion.md

Comment thread container/templates/sglang_runtime.Dockerfile
Comment thread container/templates/trtllm_runtime.Dockerfile
The wheel_builder base images ship `yasm` but not `nasm`, and modern
ffmpeg requires nasm (yasm is deprecated) when --enable-x86asm is in
effect. Build was failing with "nasm not found or too old".

Restoring --disable-x86asm matches the prior behavior and is fine for
our encoder paths: H.264 goes through NVENC (hardware encoder, no CPU
asm needed) and VP9 goes through libvpx (which builds its own asm via
yasm in a separate compilation unit). FFmpeg's own libav* C fallbacks
are sufficient for the mux/demux work this build does.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Resolves conflict in container/deps/requirements.trtllm.txt: main removed
the huggingface-hub pin (no longer needed after upstream changes); we kept
the --no-binary imageio-ffmpeg directive added on this branch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… line

The requirements-txt-fixer pre-commit hook sorts --no-binary directives
alphabetically (ASCII-wise `-` < letters), so it lives at the top of the
file while imageio-ffmpeg>=0.6.0 lives alphabetically among the
packages. Adds a trailing inline comment on the package line to make
the cross-reference obvious without fighting the hook.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
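
The ordering described in the commit message above follows from plain ASCII comparison. The sketch below uses Python's default string sort as a stand-in for the hook, which is an approximation, and the `aiohttp` and `numpy` entries are hypothetical neighbours added for illustration.

```python
# Requirement lines in arbitrary order; only the two imageio-ffmpeg lines
# come from the PR, the other package names are placeholders.
lines = [
    "numpy",
    "imageio-ffmpeg>=0.6.0",
    "--no-binary imageio-ffmpeg",
    "aiohttp",
]

# "-" is 0x2D in ASCII, below every digit and letter, so the directive
# sorts ahead of all package names.
ordered = sorted(lines)
print(ordered)
```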
…time

Two CI failures from the previous push:

1. vllm-runtime build: `uv pip show/install` needed the {{ pip_target }}
   Jinja variable (--system on cuda, --python /opt/venv/bin/python on
   xpu/cpu). Without it uv reported "No virtual environment found" and
   the imageio-ffmpeg reinstall step exited 2.

2. dynamo-runtime / rust-gpu: --enable-libvpx adds a runtime dep on
   libvpx.so.9 to libavcodec.so. The existing FFmpeg copy block only
   copied libav*.so / libsw*.so, so dynamo_llm-* failed to load shared
   libraries during the Rust GPU check. Now copies lib*vpx*.so* too
   (matching the sglang and trtllm runtime images) and runs ldconfig.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
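
A missing runtime dependency such as the libvpx.so.9 case above shows up in `ldd` output as a "not found" entry. The following sketch parses that output; `missing_shared_libs` is a hypothetical helper and the sample text in the usage note is illustrative, not captured from the failing CI job.

```python
def missing_shared_libs(ldd_output):
    """Return library names that `ldd` reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "libvpx.so.9 => not found"
        name, sep, target = line.strip().partition(" => ")
        if sep and target.strip() == "not found":
            missing.append(name)
    return missing
```

For example, feeding it the text `"\tlibvpx.so.9 => not found\n\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x0000)"` returns `["libvpx.so.9"]`.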
Comment thread container/templates/sglang_runtime.Dockerfile Outdated
Comment thread container/templates/vllm_runtime.Dockerfile Outdated
Comment thread container/templates/sglang_runtime.Dockerfile Outdated
The sglang and vllm runtime templates carried inline pip/uv-pip install
commands for the imageio-ffmpeg source-reinstall step, including a
guarded `pip show ... && pip install ...` pattern. Both base images
(lmsysorg/sglang, vllm/vllm-openai) always pre-install imageio-ffmpeg,
so the guards were always-true dead code.

Routes both reinstalls through container/deps/requirements.{sglang,vllm}.txt,
matching the pattern already used by requirements.{common,trtllm}.txt where
the `--no-binary imageio-ffmpeg` directive lives inside the requirements
file. Drops the redundant `pip show` guards.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions github-actions Bot added the backend::vllm label May 29, 2026
…IMAGEIO_FFMPEG_EXE

Three review fixes:

- sglang_runtime: ungate the ffmpeg copy + IMAGEIO_FFMPEG_EXE on
  enable_media_ffmpeg. lmsysorg/sglang always ships imageio-ffmpeg with
  a GPL-encumbered prebuilt binary that the previous commit's
  requirements.sglang.txt now replaces unconditionally, so the LGPL CLI
  must always be present on PATH for imageio to target. Mirrors the
  trtllm pattern.

- sglang_runtime: cp the versioned shared objects (libav*.so*, libsw*.so*)
  not just the unversioned symlinks (*.so), so the SONAME files actually
  land at runtime.

- trtllm_runtime: re-declare IMAGEIO_FFMPEG_EXE in the final `runtime`
  stage's ENV block. The variable was set in runtime_full at line 161,
  but the final stage rebases from the upstream image and only redeclares
  Dynamo-specific env, so without this the runtime image lost the pin.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Four ARGs had hardcoded version defaults in templates rather than being
sourced from container/context.yaml like the rest of the version pins
(FFMPEG_VERSION, NIXL_REF, NATS_VERSION, etc.):

- SCCACHE_VERSION=v0.14.0 in dynamo_base.Dockerfile
- NV_CODEC_HEADERS_REF=n13.0.19.0 in wheel_builder.Dockerfile
- LIBVPX_REF=v1.14.1 in wheel_builder.Dockerfile
- AWS_SDK_CPP_VERSION=1.11.760 in wheel_builder.Dockerfile

Adds them to context.yaml (the three ffmpeg/sccache pins under `dynamo:`,
AWS_SDK_CPP_VERSION under `vllm:` since it only builds for vllm/cuda) and
wires them through args.Dockerfile. The stage-local ARG declarations
become bare (`ARG NAME`) so they inherit the global default, matching the
existing FFMPEG_VERSION pattern.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
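
One way to spot the pattern this commit cleans up is to scan a template for `ARG` lines that still carry a default. This is a hypothetical lint sketch, not tooling from the repository, and it assumes the pins appear as plain `ARG NAME=value` lines rather than being produced by Jinja expressions.

```python
import re

# Matches "ARG NAME=value"; a bare "ARG NAME" does not match.
ARG_WITH_DEFAULT = re.compile(r"^\s*ARG\s+([A-Za-z_][A-Za-z0-9_]*)=(\S.*)$")


def hardcoded_arg_defaults(dockerfile_text):
    """Return {name: default} for every ARG line that carries a default."""
    found = {}
    for line in dockerfile_text.splitlines():
        match = ARG_WITH_DEFAULT.match(line)
        if match:
            found[match.group(1)] = match.group(2).strip()
    return found
```

After the change, running this over a stage that declares `ARG LIBVPX_REF` in bare form would report nothing for that name, since the value is inherited from the global default.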
@saturley-hall saturley-hall merged commit dc2f352 into main May 29, 2026
106 checks passed
@saturley-hall saturley-hall deleted the harrison/ffmpeg-vendoring branch May 29, 2026 20:01
saturley-hall added a commit that referenced this pull request May 31, 2026
…it (#10091) (#10154)

Signed-off-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
saturley-hall added a commit that referenced this pull request May 31, 2026
…it (#10091)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
(cherry picked from commit dc2f352)
saturley-hall added a commit that referenced this pull request Jun 1, 2026
… omni video through LGPL h264_nvenc

The vllm/vllm-openai base image ships a GPL/GPL-3.0 ffmpeg built against
libx264/libx265/libmp3lame plus the full libav* codec stack. PR #10091 only
stripped the imageio-ffmpeg python wheel's binary for vllm and left the base
apt ffmpeg in place (enable_media_ffmpeg was "false", so the in-tree LGPL
ffmpeg was never copied in), so the compliance scan still flagged the GPL
stack. The sox/libsox-fmt-all packages Dynamo installed are also dead weight:
vLLM-Omni replaced sox with a pure-numpy peak_normalize() and never shells
out to the sox binary.

Mirror the sglang_runtime pattern for vllm-runtime:
- Drop sox/libsox-fmt-all from the apt install (keep jq + git).
- Purge the base GPL ffmpeg/codec apt stack (dpkg-query + grep, robust across
  version suffixes) and autoremove orphaned media deps.
- Copy the LGPL-only in-tree ffmpeg (libs + CLI) from wheel_builder and set
  IMAGEIO_FFMPEG_EXE so imageio/diffusers encode through it.
- output_formatter._encode_video now uses common video_utils.encode_to_video_bytes
  (h264_nvenc) instead of diffusers.export_to_video, which defaults to GPL
  libx264 for mp4 and would fail against the LGPL build.

Purge verified safe in the built image: PyAV, soundfile, torchaudio and
torchvision bundle their own libraries and do not link the system ffmpeg.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Labels

  • backend::sglang: Relates to the sglang backend
  • backend::trtllm: Relates to the trtllm backend
  • backend::vllm: Relates to the vllm backend
  • chore
  • container
  • documentation: Improvements or additions to documentation
  • size/L
  • xpu


4 participants