[Doc][Misc] Update DreamID-Omni Example; Add DreamID-Omni post process function#2809

Merged
princepride merged 2 commits into vllm-project:main from yuanheng-zhao:example/upd-dreamid-ex
Apr 16, 2026

Conversation

@yuanheng-zhao yuanheng-zhao commented Apr 15, 2026

Purpose

When adding support for diffusion optimization features, it took me quite a while to get DreamID-Omni running successfully.

This PR partially cleans up the generation script and adds an example use case, so that users can follow the example and get running more quickly.

This PR

  1. Added a post-process function to the DreamID-Omni pipeline (for consistency with vllm-omni), so that the pipeline output can be unpacked together with its metadata.
  2. Updated the DreamID-Omni example usage by adding a single-IP (oneip) usage example.
  3. Updated the generation script to use vllm-omni's mux_video_audio_bytes to save the video.
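
To illustrate item 1, a post-process step of this kind might bundle the raw pipeline output with metadata so that callers can unpack it uniformly. The sketch below is not the code from this PR: `PipelineOutput`, `post_process`, and the metadata keys `fps` and `sample_rate` are hypothetical names, and the values used are placeholders.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class PipelineOutput:
    """Hypothetical container pairing generated media with metadata."""
    video: Any
    audio: Any
    metadata: dict[str, Any] = field(default_factory=dict)


def post_process(raw: tuple[Any, Any], *, fps: int, sample_rate: int) -> PipelineOutput:
    """Wrap a raw (video, audio) pair with the metadata needed to save it.

    The signature and metadata keys are assumptions for illustration only.
    """
    video, audio = raw
    return PipelineOutput(
        video=video,
        audio=audio,
        metadata={"fps": fps, "sample_rate": sample_rate},
    )


# Placeholder data standing in for real frames and audio samples.
out = post_process((["frame0", "frame1"], [0.0, 0.1]), fps=24, sample_rate=16000)
print(out.metadata)
```

A caller could then read `out.video`, `out.audio`, and `out.metadata` without knowing how the pipeline laid out its raw return value.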

TODO:
For now, the model weights and some dependency files are downloaded and installed by examples/offline_inference/x_to_video_audio/download_dreamid_omni.py, which downloads the dependency repo and manually writes it into the environment. This is neither elegant nor convenient to use, so the way dependencies are installed should still be refactored in the future.

Test Plan

# Example usage for oneip; the reference media come from the official DreamID-Omni repo
python x_to_video_audio.py \
  --model /path/to/dreamid_omni \
  --prompt "<img1>: In the frame, a woman with black long hair is identified as <sub1>.\n**Overall Environment/Scene**: A lively open-kitchen café at night; stove flames flare, steam rises, and warm pendant lights swing slightly as staff move behind her. The shot is an upper-body close-up.\n**Main Characters/Subjects Appearance**: <sub1> is a young woman with thick dark wavy hair and a side part. She wears a fitted black top under a light apron, a thin gold chain necklace, and small stud earrings.\n**Main Characters/Subjects Actions**: <sub1> tastes the sauce with a spoon, then turns her face toward the camera while still holding the spoon, her expression shifting from focused to conflicted.\n<sub1> maintains eye contact, swallows as if choosing her words, and says, <S>I keep telling myself I’m fine,but some nights it feels like I’m just performing calm.<E>" \
  --image-path 9.png \
  --audio-path 9.wav \
  --video-negative-prompt "jitter, bad hands, blur, distortion" \
  --audio-negative-prompt "robotic, muffled, echo, distorted" \
  --cfg-parallel-size 2 \
  --num-inference-steps 45 \
  --height 704 \
  --width 1280 \
  --output out_dreamid_omni_oneip.mp4
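
For readers who want to see how the flags in the command above fit together, here is a minimal argparse sketch that accepts the same options. Only the flag names are taken from the command; the argument types, the absence of defaults, and the `build_parser` helper are assumptions, and the real x_to_video_audio.py may define them differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags used in the test command."""
    parser = argparse.ArgumentParser(description="DreamID-Omni oneip example (sketch)")
    parser.add_argument("--model", required=True, help="Path to the model directory")
    parser.add_argument("--prompt", required=True, help="Text prompt with <img1>/<sub1> tags")
    parser.add_argument("--image-path", help="Reference image for the subject")
    parser.add_argument("--audio-path", help="Reference audio for the subject's voice")
    parser.add_argument("--video-negative-prompt", default=None)
    parser.add_argument("--audio-negative-prompt", default=None)
    # Numeric types below are assumed from the values passed in the command.
    parser.add_argument("--cfg-parallel-size", type=int, default=None)
    parser.add_argument("--num-inference-steps", type=int, default=None)
    parser.add_argument("--height", type=int, default=None)
    parser.add_argument("--width", type=int, default=None)
    parser.add_argument("--output", default=None, help="Path of the output .mp4")
    return parser


# Parse a shortened version of the test command (prompt abbreviated here).
args = build_parser().parse_args([
    "--model", "/path/to/dreamid_omni",
    "--prompt", "<img1>: short placeholder prompt",
    "--image-path", "9.png",
    "--audio-path", "9.wav",
    "--cfg-parallel-size", "2",
    "--num-inference-steps", "45",
    "--height", "704",
    "--width", "1280",
    "--output", "out_dreamid_omni_oneip.mp4",
])
print(args.height, args.width, args.cfg_parallel_size)
```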

Test Result

out_dreamid_omni.mp4

Signed-off-by: yuanheng <jonathan.zhaoyh@gmail.com>
Signed-off-by: yuanheng <jonathan.zhaoyh@gmail.com>
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@yuanheng-zhao
Contributor Author

PTAL @Bounty-hunter , @SamitHuang

@princepride princepride enabled auto-merge (squash) April 16, 2026 03:02
@princepride princepride added the ready label to trigger buildkite CI label Apr 16, 2026
@princepride princepride merged commit de5f8a2 into vllm-project:main Apr 16, 2026
7 of 8 checks passed
@yuanheng-zhao yuanheng-zhao deleted the example/upd-dreamid-ex branch April 16, 2026 03:32
lvliang-intel pushed a commit to lvliang-intel/vllm-omni that referenced this pull request Apr 20, 2026
…s function (vllm-project#2809)

Signed-off-by: yuanheng <jonathan.zhaoyh@gmail.com>
Labels

ready label to trigger buildkite CI

2 participants