[Bugfix]: modify diffusion pipeline profiler result in videos #2647

Merged: david6666666 merged 1 commit into vllm-project:main from bjf-frz:bugfix_videos_diffusion_pipeline_profiler on Apr 10, 2026

bjf-frz (Contributor) commented Apr 9, 2026

Purpose

This PR fixes an issue where, with --enable-diffusion-pipeline-profiler enabled, the /v1/videos interface for Wan2.2 does not properly handle peak_memory and stage_durations.
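As a rough illustration of what "properly handling" these two fields can mean on the benchmark side, here is a minimal sketch. It assumes a hypothetical response payload with `peak_memory_mb` and `stage_durations` keys; the real field names and response layout used by /v1/videos may differ, and `extract_profiler_metrics` is an illustrative helper, not a function from the repository.

```python
from typing import Any, Dict, Optional, Tuple


def extract_profiler_metrics(
    payload: Dict[str, Any],
) -> Tuple[Optional[float], Dict[str, float]]:
    """Pull profiler fields out of a response payload without failing if absent.

    The key names below are assumptions made for illustration only.
    """
    peak = payload.get("peak_memory_mb")
    stages = payload.get("stage_durations") or {}

    # Treat a missing or non-numeric peak as "not reported".
    peak_value = float(peak) if isinstance(peak, (int, float)) else None

    # Keep only numeric stage timings, keyed by stage name.
    stage_values = {
        str(name): float(seconds)
        for name, seconds in stages.items()
        if isinstance(seconds, (int, float))
    }
    return peak_value, stage_values


# Profiler disabled: nothing is reported, and nothing breaks.
print(extract_profiler_metrics({}))
```

Returning `None` and an empty dict when the profiler is off lets the caller skip the "Peak Memory" and "Stage Durations" sections instead of printing zeros.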

Test Plan

Server side:
vllm serve Wan2.2-I2V-A14B-Diffusers/ --omni --port 8091

Client side:
python3 benchmarks/diffusion/diffusion_benchmark_serving.py \
  --base-url http://localhost:8091 \
  --model Wan2.2-I2V-A14B-Diffusers/ \
  --backend v1/videos \
  --dataset random \
  --task i2v \
  --num-prompts 1 \
  --max-concurrency 1 \
  --request-rate inf \
  --width 640 \
  --height 480 \
  --num-frames 81 \
  --fps 16 \
  --num-inference-steps 2

Test Result

================= Serving Benchmark Result =================
Backend:                                 v1/videos      
Model:                                   /home/admin/Wan2.2-I2V-14B-Distill-Diffusers/
Dataset:                                 random         
Task:                                    i2v            
--------------------------------------------------
Benchmark duration (s):                  12.14          
Request rate:                            inf            
Max request concurrency:                 1              
Successful requests:                     1/1              
--------------------------------------------------
Request throughput (req/s):              0.08           
Latency Mean (s):                        12.1411        
Latency Median (s):                      12.1411        
Latency P99 (s):                         12.1411        
Latency P95 (s):                         12.1411        
--------------------------------------------------
Peak Memory Max (MB):                    74204.00       
Peak Memory Mean (MB):                   74204.00       
Peak Memory Median (MB):                 74204.00       
--------------------------------------------------
Stage Durations Mean (s):
  Wan22I2VPipeline.text_encoder.forward: 0.0515         
  Wan22I2VPipeline.vae.encode:           1.0445         
  Wan22I2VPipeline.vae.decode:           1.5887
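
For readers who want to check the numbers, the sketch below shows one way the summary rows could be aggregated from per-request records. The record layout and the `summarize` helper are assumptions for illustration, not the benchmark script's actual code; the input values are copied from the result above.

```python
from statistics import mean, median


def summarize(records, benchmark_duration_s):
    """Aggregate per-request records into summary rows (hypothetical layout)."""
    latencies = [r["latency_s"] for r in records]
    peaks = [r["peak_memory_mb"] for r in records]

    # Collect each stage's timings across requests before averaging.
    per_stage = {}
    for r in records:
        for stage, seconds in r["stage_durations"].items():
            per_stage.setdefault(stage, []).append(seconds)

    return {
        "throughput_req_s": len(records) / benchmark_duration_s,
        "latency_mean_s": mean(latencies),
        "latency_median_s": median(latencies),
        "peak_memory_max_mb": max(peaks),
        "peak_memory_mean_mb": mean(peaks),
        "stage_durations_mean_s": {s: mean(v) for s, v in per_stage.items()},
    }


records = [{
    "latency_s": 12.1411,
    "peak_memory_mb": 74204.0,
    "stage_durations": {
        "Wan22I2VPipeline.text_encoder.forward": 0.0515,
        "Wan22I2VPipeline.vae.encode": 1.0445,
        "Wan22I2VPipeline.vae.decode": 1.5887,
    },
}]
summary = summarize(records, benchmark_duration_s=12.14)
print(round(summary["throughput_req_s"], 2))  # 0.08
```

With a single request, the mean, median, and percentile rows all collapse to the same value, which is why every latency row reads 12.1411. The three listed stages add up to about 2.68 s of the 12.14 s total, so most of the request time is not attributed to any listed stage.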

Signed-off-by: bjf-frz <frz123db@gmail.com>
bjf-frz requested a review from hsliuustc0106 as a code owner on April 9, 2026, 12:23

david6666666 (Collaborator) commented

please add purpose

bjf-frz (Contributor, Author) commented Apr 9, 2026

@Bounty-hunter @david6666666 @wtomin PTAL, thx!

bjf-frz (Contributor, Author) commented Apr 9, 2026

@yangjianjuan PTAL, thx

bjf-frz changed the title from "[WIP][Bugfix]: modify diffusion pipeline profiler result in videos" to "[Bugfix]: modify diffusion pipeline profiler result in videos" on Apr 10, 2026
david6666666 (Collaborator) left a comment

One non-blocking test-coverage note below.

Comment thread benchmarks/diffusion/backends.py
david6666666 (Collaborator) commented

Stage Durations Mean (s):
Wan22I2VPipeline.text_encoder.forward: 0.0515
Wan22I2VPipeline.vae.encode: 1.0445
Wan22I2VPipeline.vae.decode: 1.5887

Do we have a stage duration for the DiT?

bjf-frz (Contributor, Author) commented Apr 10, 2026

@hsliuustc0106

> Stage Durations Mean (s):
> Wan22I2VPipeline.text_encoder.forward: 0.0515
> Wan22I2VPipeline.vae.encode: 1.0445
> Wan22I2VPipeline.vae.decode: 1.5887
>
> Do we have dit Stage Durations time?

Not yet. The DiT computation in Wan2.2 is currently scattered throughout forward, so there is no single call to time. It needs to be refactored into a dedicated diffuse function before it can be profiled as its own stage. That refactoring will be handled in a separate PR.
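
To illustrate why a dedicated function matters for per-stage timing, here is a minimal sketch. It assumes a decorator-based timer and a toy pipeline; `profile_stage`, `stage_durations`, and `ToyPipeline` are hypothetical names, not the vllm-omni profiler's actual API. Once the denoising loop lives in one method, a single wrapper can time it.

```python
import time
from collections import defaultdict
from functools import wraps

# Accumulated wall-clock seconds per stage name (illustrative global store).
stage_durations = defaultdict(float)


def profile_stage(name):
    """Decorator that adds the wrapped call's duration to stage_durations[name]."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                stage_durations[name] += time.perf_counter() - start
        return wrapper
    return decorator


class ToyPipeline:
    @profile_stage("ToyPipeline.diffuse")
    def diffuse(self, latents, num_steps):
        # Stand-in for the denoising loop: halve every value at each step.
        for _ in range(num_steps):
            latents = [x * 0.5 for x in latents]
        return latents

    def forward(self, latents, num_steps=2):
        # With the loop isolated in diffuse(), one stage entry covers all of it.
        return self.diffuse(latents, num_steps)


out = ToyPipeline().forward([4.0, 8.0])
print(out)                    # [1.0, 2.0]
print(sorted(stage_durations))  # ['ToyPipeline.diffuse']
```

If the same loop were inlined across forward, there would be no single call boundary to wrap, which matches the reason given above for deferring DiT timing to a follow-up refactor.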

@david6666666
Copy link
Copy Markdown
Collaborator

LGTM

david6666666 added the "ready" label (label to trigger buildkite CI) on Apr 10, 2026
david6666666 merged commit fbb5dd5 into vllm-project:main on Apr 10, 2026
8 checks passed
david6666666 pushed a commit to david6666666/vllm-omni that referenced this pull request Apr 10, 2026
…roject#2647)

Signed-off-by: bjf-frz <frz123db@gmail.com>
(cherry picked from commit fbb5dd5)
Signed-off-by: David Chen <530634352@qq.com>
daixinning pushed a commit to daixinning/vllm-omni that referenced this pull request Apr 13, 2026
lengrongfu pushed a commit to lengrongfu/vllm-omni that referenced this pull request May 1, 2026
clodaghwalsh17 pushed a commit to clodaghwalsh17/nm-vllm-omni-ent that referenced this pull request May 12, 2026