[Metrics] Adding vllm-omni diffusion metrics support by erfgss · Pull Request #1977 · vllm-project/vllm-omni

erfgss · 2026-03-18T09:15:38Z

Adding profiling for vllm-omni

Purpose

In vllm-omni, the stats logged for Diffusion/DiT single diffusion pipeline models omit diffusion-specific information such as per-phase engine timings, image count, and resolution. This PR adds those fields to the logged stats and improves the log table formatting.

Test Plan

Diffusion/DiT Single diffusion Pipeline

Test Result glm_image

python end2end.py \
        --model-path /cy50055764/models/zai-org/GLM-Image \
        --prompt "A beautiful sunset over the ocean" \
        --output output_t2i.png \
        --enable-stats

Processed prompts: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [01:14<00:00, 74.72s/it]
INFO 03-19 02:42:42 [stats.py:519] 
INFO 03-19 02:42:42 [stats.py:519] [Overall Summary]
INFO 03-19 02:42:42 [stats.py:519] +-----------------------------+------------+
INFO 03-19 02:42:42 [stats.py:519] | Field                       |      Value |
INFO 03-19 02:42:42 [stats.py:519] +-----------------------------+------------+
INFO 03-19 02:42:42 [stats.py:519] | e2e_requests                |          1 |
INFO 03-19 02:42:42 [stats.py:519] | e2e_wall_time_ms            | 74,410.325 |
INFO 03-19 02:42:42 [stats.py:519] | e2e_total_tokens            |      1,287 |
INFO 03-19 02:42:42 [stats.py:519] | e2e_avg_time_per_request_ms | 74,410.325 |
INFO 03-19 02:42:42 [stats.py:519] | e2e_avg_tokens_per_s        |     17.296 |
INFO 03-19 02:42:42 [stats.py:519] | e2e_stage_0_wall_time_ms    | 40,252.345 |
INFO 03-19 02:42:42 [stats.py:519] | e2e_stage_1_wall_time_ms    | 34,140.089 |
INFO 03-19 02:42:42 [stats.py:519] +-----------------------------+------------+
INFO 03-19 02:42:42 [stats.py:545] 
INFO 03-19 02:42:42 [stats.py:545] [RequestE2EStats [request_id=0_501f96ba-184a-4d36-a336-3498e498c5e3]]
INFO 03-19 02:42:42 [stats.py:545] +------------------+------------+
INFO 03-19 02:42:42 [stats.py:545] | Field            |      Value |
INFO 03-19 02:42:42 [stats.py:545] +------------------+------------+
INFO 03-19 02:42:42 [stats.py:545] | e2e_total_ms     | 74,393.074 |
INFO 03-19 02:42:42 [stats.py:545] | e2e_total_tokens |      1,287 |
INFO 03-19 02:42:42 [stats.py:545] +------------------+------------+
INFO 03-19 02:42:42 [stats.py:598] 
INFO 03-19 02:42:42 [stats.py:598] [StageRequestStats [request_id=0_501f96ba-184a-4d36-a336-3498e498c5e3]]
INFO 03-19 02:42:42 [stats.py:598] +--------------------------------+------------+------------+
INFO 03-19 02:42:42 [stats.py:598] | Field                          |          0 |          1 |
INFO 03-19 02:42:42 [stats.py:598] +--------------------------------+------------+------------+
INFO 03-19 02:42:42 [stats.py:598] | batch_id                       |          1 |          1 |
INFO 03-19 02:42:42 [stats.py:598] | batch_size                     |          1 |          1 |
INFO 03-19 02:42:42 [stats.py:598] | diffusion_engine_exec_time_ms  |            | 34,137.298 |
INFO 03-19 02:42:42 [stats.py:598] | diffusion_engine_total_time_ms |            | 34,069.762 |
INFO 03-19 02:42:42 [stats.py:598] | image_num                      |            |      1.000 |
INFO 03-19 02:42:42 [stats.py:598] | num_tokens_in                  |          6 |          0 |
INFO 03-19 02:42:42 [stats.py:598] | num_tokens_out                 |      1,281 |          0 |
INFO 03-19 02:42:42 [stats.py:598] | postprocess_time_ms            |            |     66.332 |
INFO 03-19 02:42:42 [stats.py:598] | preprocess_time_ms             |            |      0.031 |
INFO 03-19 02:42:42 [stats.py:598] | preprocessing_time_ms          |            |      0.031 |
INFO 03-19 02:42:42 [stats.py:598] | resolution                     |            |    640.000 |
INFO 03-19 02:42:42 [stats.py:598] | stage_gen_time_ms              | 40,250.322 | 34,139.600 |
INFO 03-19 02:42:42 [stats.py:598] +--------------------------------+------------+------------+
INFO 03-19 02:42:42 [omni_base.py:154] [Summary] {'final_stage_id': {'*': 1},
INFO 03-19 02:42:42 [omni_base.py:154]  'overall_summary': {'e2e_requests': 1,
INFO 03-19 02:42:42 [omni_base.py:154]                      'e2e_wall_time_ms': 74410.32528877258,
INFO 03-19 02:42:42 [omni_base.py:154]                      'e2e_total_tokens': 1287,
INFO 03-19 02:42:42 [omni_base.py:154]                      'e2e_avg_time_per_request_ms': 74410.32528877258,
INFO 03-19 02:42:42 [omni_base.py:154]                      'e2e_avg_tokens_per_s': 17.29598674653542,
INFO 03-19 02:42:42 [omni_base.py:154]                      'e2e_stage_0_wall_time_ms': 40252.344608306885,
INFO 03-19 02:42:42 [omni_base.py:154]                      'e2e_stage_1_wall_time_ms': 34140.08903503418},
INFO 03-19 02:42:42 [omni_base.py:154]  'stage_table': [{'request_id': '0_501f96ba-184a-4d36-a336-3498e498c5e3',
INFO 03-19 02:42:42 [omni_base.py:154]                   'stages': [{'stage_id': 0,
INFO 03-19 02:42:42 [omni_base.py:154]                               'batch_id': 1,
INFO 03-19 02:42:42 [omni_base.py:154]                               'batch_size': 1,
INFO 03-19 02:42:42 [omni_base.py:154]                               'num_tokens_in': 6,
INFO 03-19 02:42:42 [omni_base.py:154]                               'num_tokens_out': 1281,
INFO 03-19 02:42:42 [omni_base.py:154]                               'stage_gen_time_ms': 40250.32162666321,
INFO 03-19 02:42:42 [omni_base.py:154]                               'audio_generated_frames': 0},
INFO 03-19 02:42:42 [omni_base.py:154]                              {'stage_id': 1,
INFO 03-19 02:42:42 [omni_base.py:154]                               'batch_id': 1,
INFO 03-19 02:42:42 [omni_base.py:154]                               'batch_size': 1,
INFO 03-19 02:42:42 [omni_base.py:154]                               'num_tokens_in': 0,
INFO 03-19 02:42:42 [omni_base.py:154]                               'num_tokens_out': 0,
INFO 03-19 02:42:42 [omni_base.py:154]                               'stage_gen_time_ms': 34139.60027694702,
INFO 03-19 02:42:42 [omni_base.py:154]                               'audio_generated_frames': 0,
INFO 03-19 02:42:42 [omni_base.py:154]                               'preprocess_time_ms': 0.030729999707546085,
INFO 03-19 02:42:42 [omni_base.py:154]                               'diffusion_engine_exec_time_ms': 34137.297870999646,
INFO 03-19 02:42:42 [omni_base.py:154]                               'diffusion_engine_total_time_ms': 34069.761635999836,
INFO 03-19 02:42:42 [omni_base.py:154]                               'image_num': 1.0,
INFO 03-19 02:42:42 [omni_base.py:154]                               'resolution': 640.0,
INFO 03-19 02:42:42 [omni_base.py:154]                               'postprocess_time_ms': 66.33246499995948,
INFO 03-19 02:42:42 [omni_base.py:154]                               'preprocessing_time_ms': 0.030729999707546085}]}],
INFO 03-19 02:42:42 [omni_base.py:154]  'trans_table': [{'request_id': '0_501f96ba-184a-4d36-a336-3498e498c5e3',
INFO 03-19 02:42:42 [omni_base.py:154]                   'transfers': [{'edge': '0->1',
INFO 03-19 02:42:42 [omni_base.py:154]                                  'size_kbytes': 0.0,
INFO 03-19 02:42:42 [omni_base.py:154]                                  'tx_time_ms': 0.0,
INFO 03-19 02:42:42 [omni_base.py:154]                                  'rx_decode_time_ms': 0.0,
INFO 03-19 02:42:42 [omni_base.py:154]                                  'in_flight_time_ms': 0.0}]}],
INFO 03-19 02:42:42 [omni_base.py:154]  'e2e_table': [{'request_id': '0_501f96ba-184a-4d36-a336-3498e498c5e3',
INFO 03-19 02:42:42 [omni_base.py:154]                 'e2e_total_ms': 74393.07427406311,
INFO 03-19 02:42:42 [omni_base.py:154]                 'e2e_total_tokens': 1287,
INFO 03-19 02:42:42 [omni_base.py:154]                 'transfers_total_time_ms': 0.0,
INFO 03-19 02:42:42 [omni_base.py:154]                 'transfers_total_kbytes': 0.0}]}
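
The derived throughput in the summary above can be reproduced from the raw fields. A minimal check, assuming `e2e_avg_tokens_per_s` is total tokens divided by wall time in seconds and that `e2e_total_tokens` sums input and output tokens over all stages (both formulas are inferred from the logged numbers, not taken from stats.py):

```python
# Values copied from the [Summary] dict of the GLM-Image run above.
e2e_total_tokens = 1287
e2e_wall_time_ms = 74410.32528877258

# Assumed formula: tokens per second = tokens / wall time in seconds.
tokens_per_s = e2e_total_tokens / (e2e_wall_time_ms / 1000.0)

# Stage 0 logs 6 input and 1,281 output tokens; stage 1 logs none.
stage_tokens = (6 + 1281) + (0 + 0)

print(f"tokens/s: {tokens_per_s:.3f}")
print(f"stage token sum matches e2e_total_tokens: {stage_tokens == e2e_total_tokens}")
```

For the pure diffusion runs further down, both token fields are 0, which is presumably why the token rows are absent from their summary tables.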

Test Result text_to_image

vllm-omni serve /models/Qwen/Qwen-Image --omni --port 8091 --log-stats

(omni) root@huawei:/cy50055764/cy50055764# vllm-omni serve /cy50055764/models/Qwen/Qwen-Image --omni --port 8091 --log-stats
/cy50055764/cy50055764/vllm-omni/vllm_omni/__init__.py:29: RuntimeWarning: Failed to import version from _version.py: No module named 'vllm_omni._version'
This typically happens in development mode before building.
Using fallback version 'dev'.
  from .version import __version__, __version_tuple__  # isort:skip
/cy50055764/cy50055764/omni/lib/python3.12/site-packages/pydub/utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
  warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
INFO 03-25 09:19:00 [serve.py:74] Detected diffusion model: /cy50055764/models/Qwen/Qwen-Image
INFO 03-25 09:19:00 [logo.py:45]        █     █     █▄   ▄█       ▄▀▀▀▀▄ █▄   ▄█ █▄    █ ▀█▀ 
INFO 03-25 09:19:00 [logo.py:45]  ▄▄ ▄█ █     █     █ ▀▄▀ █  ▄▄▄  █    █ █ ▀▄▀ █ █ ▀▄  █  █  
INFO 03-25 09:19:00 [logo.py:45]   █▄█▀ █     █     █     █       █    █ █     █ █   ▀▄█  █  
INFO 03-25 09:19:00 [logo.py:45]    ▀▀  ▀▀▀▀▀ ▀▀▀▀▀ ▀     ▀        ▀▀▀▀  ▀     ▀ ▀     ▀ ▀▀▀ 
INFO 03-25 09:19:00 [logo.py:45] 

100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:05<00:00,  5.88s/it]
INFO 03-25 09:19:31 [diffusion_model_runner.py:212] Peak GPU memory (this request): 58.66 GB reserved, 57.89 GB allocated, 0.77 GB pool overhead (1.3%)
WARNING 03-25 09:19:31 [diffusion_worker.py:426] SHM pack failed, falling back to raw enqueue: Got unsupported ScalarType BFloat16
(APIServer pid=2088) INFO 03-25 09:19:31 [async_omni_diffusion.py:154] AsyncOmniDiffusion initialized with model: /cy50055764/models/Qwen/Qwen-Image, batch_size: 1
(APIServer pid=2088) INFO 03-25 09:19:31 [stage_diffusion_client.py:54] [StageDiffusionClient] Stage-0 initialized (batch_size=1)
(APIServer pid=2088) INFO 03-25 09:19:31 [async_omni_engine.py:485] [AsyncOmniEngine] Stage 0 initialized (diffusion, batch_size=1)
(APIServer pid=2088) INFO 03-25 09:19:31 [orchestrator.py:158] [Orchestrator] Starting event loop
(APIServer pid=2088) INFO 03-25 09:19:31 [async_omni_engine.py:288] [AsyncOmniEngine] Orchestrator ready with 1 stages
(APIServer pid=2088) INFO 03-25 09:19:31 [omni_base.py:105] [AsyncOmni] AsyncOmniEngine initialized in 31.15 seconds
(APIServer pid=2088) INFO 03-25 09:19:31 [omni_base.py:120] [AsyncOmni] Initialized with 1 stages for model /cy50055764/models/Qwen/Qwen-Image
(APIServer pid=2088) INFO 03-25 09:19:31 [api_server.py:469] Detected pure diffusion mode (single diffusion stage)
(APIServer pid=2088) INFO 03-25 09:19:31 [api_server.py:513] Pure diffusion API server initialized for model: /cy50055764/models/Qwen/Qwen-Image
(APIServer pid=2088) INFO 03-25 09:19:31 [api_server.py:319] Starting vLLM API server (pure diffusion mode) on http://0.0.0.0:8091
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:37] Available routes are:
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /openapi.json, Methods: GET, HEAD
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /docs, Methods: GET, HEAD
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /docs/oauth2-redirect, Methods: GET, HEAD
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /redoc, Methods: GET, HEAD
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /tokenize, Methods: POST
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /detokenize, Methods: POST
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /load, Methods: GET
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /version, Methods: GET
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /health, Methods: GET
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /metrics, Methods: GET
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /ping, Methods: GET
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /ping, Methods: POST
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /invocations, Methods: POST
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /v1/responses, Methods: POST
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /v1/responses/{response_id}, Methods: GET
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /v1/responses/{response_id}/cancel, Methods: POST
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /v1/completions, Methods: POST
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /v1/messages, Methods: POST
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /v1/messages/count_tokens, Methods: POST
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /inference/v1/generate, Methods: POST
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /scale_elastic_ep, Methods: POST
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /is_scaling_elastic_ep, Methods: POST
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /v1/chat/completions/render, Methods: POST
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /v1/completions/render, Methods: POST
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /v1/chat/completions, Methods: POST
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /v1/audio/speech, Methods: POST
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /v1/audio/voices, Methods: GET
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /v1/audio/voices, Methods: POST
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /v1/audio/voices/{name}, Methods: DELETE
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /health, Methods: GET
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /v1/models, Methods: GET
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /v1/images/generations, Methods: POST
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /v1/images/edits, Methods: POST
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /v1/videos, Methods: POST
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /v1/videos, Methods: GET
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /v1/videos/{video_id}, Methods: GET
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /v1/videos/{video_id}, Methods: DELETE
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:46] Route: /v1/videos/{video_id}/content, Methods: GET
(APIServer pid=2088) INFO 03-25 09:19:31 [launcher.py:57] Route: /v1/audio/speech/stream, Endpoint: streaming_speech
(APIServer pid=2088) INFO:     Started server process [2088]
(APIServer pid=2088) INFO:     Waiting for application startup.
(APIServer pid=2088) INFO:     Application startup complete.
(APIServer pid=2088) INFO 03-25 09:21:43 [api_server.py:1263] Generating 1 image(s) 1024x1024
(APIServer pid=2088) INFO 03-25 09:21:43 [orchestrator.py:584] [Orchestrator] _handle_add_request: stage=0 req=img_gen-991542bd612369bf prompt_type=dict original_prompt_type=dict final_stage=0 num_sampling_params=1
INFO 03-25 09:21:43 [manager.py:608] Deactivating all adapters: 0 layers
WARNING 03-25 09:21:43 [kv_transfer_manager.py:381] No connector available for receiving KV cache
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:17<00:00,  2.83it/s]
INFO 03-25 09:22:01 [diffusion_model_runner.py:212] Peak GPU memory (this request): 58.66 GB reserved, 57.89 GB allocated, 0.76 GB pool overhead (1.3%)
WARNING 03-25 09:22:01 [diffusion_worker.py:426] SHM pack failed, falling back to raw enqueue: Got unsupported ScalarType BFloat16
(APIServer pid=2088) INFO 03-25 09:22:01 [diffusion_engine.py:103] Generation completed successfully.
(APIServer pid=2088) INFO 03-25 09:22:02 [diffusion_engine.py:136] Post-processing completed in 0.9100 seconds
(APIServer pid=2088) INFO 03-25 09:22:02 [diffusion_engine.py:139] DiffusionEngine.step breakdown: preprocess=0.00 ms, add_req_and_wait=17994.51 ms, postprocess=910.00 ms, total=18905.05 ms
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:533] 
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:533] [Overall Summary]
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:533] +-----------------------------+------------+
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:533] | Field                       |      Value |
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:533] +-----------------------------+------------+
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:533] | e2e_requests                |          1 |
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:533] | e2e_wall_time_ms            | 18,909.161 |
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:533] | e2e_avg_time_per_request_ms | 18,909.161 |
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:533] | e2e_stage_0_wall_time_ms    | 18,908.831 |
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:533] +-----------------------------+------------+
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:559] 
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:559] [RequestE2EStats [request_id=img_gen-991542bd612369bf]]
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:559] +--------------+------------+
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:559] | Field        |      Value |
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:559] +--------------+------------+
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:559] | e2e_total_ms | 18,908.831 |
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:559] +--------------+------------+
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:612] 
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:612] [StageRequestStats [request_id=img_gen-991542bd612369bf]]
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:612] +--------------------------------+------------+
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:612] | Field                          |          0 |
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:612] +--------------------------------+------------+
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:612] | batch_id                       |          1 |
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:612] | batch_size                     |          1 |
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:612] | diffusion_engine_exec_time_ms  | 18,905.097 |
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:612] | diffusion_engine_total_time_ms | 17,994.514 |
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:612] | image_num                      |      1.000 |
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:612] | postprocess_time_ms            |    910.005 |
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:612] | resolution                     |    640.000 |
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:612] | stage_gen_time_ms              | 18,907.687 |
(APIServer pid=2088) INFO 03-25 09:22:02 [stats.py:612] +--------------------------------+------------+
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161] [Summary] {'final_stage_id': {'*': 0},
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]  'overall_summary': {'e2e_requests': 1,
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                      'e2e_wall_time_ms': 18909.160614013672,
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                      'e2e_total_tokens': 0,
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                      'e2e_avg_time_per_request_ms': 18909.160614013672,
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                      'e2e_avg_tokens_per_s': 0.0,
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                      'e2e_stage_0_wall_time_ms': 18908.830642700195},
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]  'stage_table': [{'request_id': 'img_gen-991542bd612369bf',
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                   'stages': [{'stage_id': 0,
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                               'batch_id': 1,
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                               'batch_size': 1,
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                               'num_tokens_in': 0,
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                               'num_tokens_out': 0,
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                               'stage_gen_time_ms': 18907.686710357666,
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                               'audio_generated_frames': 0,
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                               'preprocess_time_ms': 0.0,
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                               'diffusion_engine_exec_time_ms': 18905.09681200001,
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                               'diffusion_engine_total_time_ms': 17994.513525000002,
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                               'image_num': 1.0,
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                               'resolution': 640.0,
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                               'postprocess_time_ms': 910.0048660002358}]}],
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]  'trans_table': [{'request_id': 'img_gen-991542bd612369bf', 'transfers': []}],
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]  'e2e_table': [{'request_id': 'img_gen-991542bd612369bf',
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                 'e2e_total_ms': 18908.830642700195,
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                 'e2e_total_tokens': 0,
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                 'transfers_total_time_ms': 0.0,
(APIServer pid=2088) INFO 03-25 09:22:02 [omni_base.py:161]                 'transfers_total_kbytes': 0.0}]}
(APIServer pid=2088) INFO 03-25 09:22:02 [api_server.py:1283] Successfully generated 1 image(s)
(APIServer pid=2088) INFO:     127.0.0.1:33838 - "POST /v1/images/generations HTTP/1.1" 200 OK
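
The timing fields in the three runs can be cross-checked against each other. The sketch below assumes that `diffusion_engine_exec_time_ms` covers the whole engine step and that `diffusion_engine_total_time_ms` corresponds to the `add_req_and_wait` phase in the `DiffusionEngine.step breakdown` line; this mapping is inferred from the logged numbers, not from the code.

```python
# (preprocess, engine_total, postprocess, engine_exec) in ms, copied from the
# [Summary] dicts of the three runs in this description.
runs = {
    "glm_image": (0.030729999707546085, 34069.761635999836, 66.33246499995948, 34137.297870999646),
    "text_to_image": (0.0, 17994.513525000002, 910.0048660002358, 18905.09681200001),
    "image_to_image": (68.6873839999862, 16192.174020000039, 50.57102099999611, 16312.377264999668),
}

residuals_ms = {}
for name, (pre, total, post, exec_ms) in runs.items():
    # If the assumption holds, the three phases should nearly add up to exec time.
    residuals_ms[name] = exec_ms - (pre + total + post)
    print(f"{name}: unaccounted {residuals_ms[name]:.3f} ms of {exec_ms:.3f} ms")
```

In all three runs the phases add up to the exec time with roughly 1 ms to spare, which fits the assumed mapping; the remainder is presumably untimed bookkeeping.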

Test Result image_to_image

python image_edit.py \
    --model /models/Qwen/Qwen-Image-Edit-2511 \
    --image qwen-bear.png \
    --prompt "Add a white art board written with colorful text 'vLLM-Omni' on grassland. Add a paintbrush in the bear's hands. position the bear standing in front of the art board as if painting" \
    --output output_image_edit.png \
    --num-inference-steps 50 \
    --cfg-scale 4.0 \
    --cache-backend  cache_dit \
    --log-stats

INFO 03-19 02:49:34 [stats.py:519] [Overall Summary]
INFO 03-19 02:49:34 [stats.py:519] +-----------------------------+------------+
INFO 03-19 02:49:34 [stats.py:519] | Field                       |      Value |
INFO 03-19 02:49:34 [stats.py:519] +-----------------------------+------------+
INFO 03-19 02:49:34 [stats.py:519] | e2e_requests                |          1 |
INFO 03-19 02:49:34 [stats.py:519] | e2e_wall_time_ms            | 16,315.407 |
INFO 03-19 02:49:34 [stats.py:519] | e2e_avg_time_per_request_ms | 16,315.407 |
INFO 03-19 02:49:34 [stats.py:519] | e2e_stage_0_wall_time_ms    | 16,315.180 |
INFO 03-19 02:49:34 [stats.py:519] +-----------------------------+------------+
INFO 03-19 02:49:34 [stats.py:545] 
INFO 03-19 02:49:34 [stats.py:545] [RequestE2EStats [request_id=0_7b378aac-60e6-405f-8e52-272fca96b3b3]]
INFO 03-19 02:49:34 [stats.py:545] +--------------+------------+
INFO 03-19 02:49:34 [stats.py:545] | Field        |      Value |
INFO 03-19 02:49:34 [stats.py:545] +--------------+------------+
INFO 03-19 02:49:34 [stats.py:545] | e2e_total_ms | 16,315.180 |
INFO 03-19 02:49:34 [stats.py:545] +--------------+------------+
INFO 03-19 02:49:34 [stats.py:598] 
INFO 03-19 02:49:34 [stats.py:598] [StageRequestStats [request_id=0_7b378aac-60e6-405f-8e52-272fca96b3b3]]
INFO 03-19 02:49:34 [stats.py:598] +--------------------------------+------------+
INFO 03-19 02:49:34 [stats.py:598] | Field                          |          0 |
INFO 03-19 02:49:34 [stats.py:598] +--------------------------------+------------+
INFO 03-19 02:49:34 [stats.py:598] | batch_id                       |          1 |
INFO 03-19 02:49:34 [stats.py:598] | batch_size                     |          1 |
INFO 03-19 02:49:34 [stats.py:598] | diffusion_engine_exec_time_ms  | 16,312.377 |
INFO 03-19 02:49:34 [stats.py:598] | diffusion_engine_total_time_ms | 16,192.174 |
INFO 03-19 02:49:34 [stats.py:598] | image_num                      |      1.000 |
INFO 03-19 02:49:34 [stats.py:598] | postprocess_time_ms            |     50.571 |
INFO 03-19 02:49:34 [stats.py:598] | preprocess_time_ms             |     68.687 |
INFO 03-19 02:49:34 [stats.py:598] | preprocessing_time_ms          |     68.687 |
INFO 03-19 02:49:34 [stats.py:598] | resolution                     |    640.000 |
INFO 03-19 02:49:34 [stats.py:598] | stage_gen_time_ms              | 16,313.807 |
INFO 03-19 02:49:34 [stats.py:598] +--------------------------------+------------+
INFO 03-19 02:49:34 [omni_base.py:154] [Summary] {'final_stage_id': {'*': 0},
INFO 03-19 02:49:34 [omni_base.py:154]  'overall_summary': {'e2e_requests': 1,
INFO 03-19 02:49:34 [omni_base.py:154]                      'e2e_wall_time_ms': 16315.407276153564,
INFO 03-19 02:49:34 [omni_base.py:154]                      'e2e_total_tokens': 0,
INFO 03-19 02:49:34 [omni_base.py:154]                      'e2e_avg_time_per_request_ms': 16315.407276153564,
INFO 03-19 02:49:34 [omni_base.py:154]                      'e2e_avg_tokens_per_s': 0.0,
INFO 03-19 02:49:34 [omni_base.py:154]                      'e2e_stage_0_wall_time_ms': 16315.179586410522},
INFO 03-19 02:49:34 [omni_base.py:154]  'stage_table': [{'request_id': '0_7b378aac-60e6-405f-8e52-272fca96b3b3',
INFO 03-19 02:49:34 [omni_base.py:154]                   'stages': [{'stage_id': 0,
INFO 03-19 02:49:34 [omni_base.py:154]                               'batch_id': 1,
INFO 03-19 02:49:34 [omni_base.py:154]                               'batch_size': 1,
INFO 03-19 02:49:34 [omni_base.py:154]                               'num_tokens_in': 0,
INFO 03-19 02:49:34 [omni_base.py:154]                               'num_tokens_out': 0,
INFO 03-19 02:49:34 [omni_base.py:154]                               'stage_gen_time_ms': 16313.806772232056,
INFO 03-19 02:49:34 [omni_base.py:154]                               'audio_generated_frames': 0,
INFO 03-19 02:49:34 [omni_base.py:154]                               'preprocess_time_ms': 68.6873839999862,
INFO 03-19 02:49:34 [omni_base.py:154]                               'diffusion_engine_exec_time_ms': 16312.377264999668,
INFO 03-19 02:49:34 [omni_base.py:154]                               'diffusion_engine_total_time_ms': 16192.174020000039,
INFO 03-19 02:49:34 [omni_base.py:154]                               'image_num': 1.0,
INFO 03-19 02:49:34 [omni_base.py:154]                               'resolution': 640.0,
INFO 03-19 02:49:34 [omni_base.py:154]                               'postprocess_time_ms': 50.57102099999611,
INFO 03-19 02:49:34 [omni_base.py:154]                               'preprocessing_time_ms': 68.6873839999862}]}],
INFO 03-19 02:49:34 [omni_base.py:154]  'trans_table': [{'request_id': '0_7b378aac-60e6-405f-8e52-272fca96b3b3',
INFO 03-19 02:49:34 [omni_base.py:154]                   'transfers': []}],
INFO 03-19 02:49:34 [omni_base.py:154]  'e2e_table': [{'request_id': '0_7b378aac-60e6-405f-8e52-272fca96b3b3',
INFO 03-19 02:49:34 [omni_base.py:154]                 'e2e_total_ms': 16315.179586410522,
INFO 03-19 02:49:34 [omni_base.py:154]                 'e2e_total_tokens': 0,
INFO 03-19 02:49:34 [omni_base.py:154]                 'transfers_total_time_ms': 0.0,
INFO 03-19 02:49:34 [omni_base.py:154]                 'transfers_total_kbytes': 0.0}]}

Signed-off-by: Chen Yang <2082464740@qq.com>

hsliuustc0106

Wait for the PR #1908 refactoring to be merged.

erfgss · 2026-03-18T09:28:21Z

@claude

erfgss · 2026-03-18T09:32:14Z

@Bounty-hunter @david6666666 @ZJY0516 @lishunyang12 PTAL, thanks

erfgss · 2026-03-18T09:32:22Z

wait for PR1908 refactoring merged

ok

Signed-off-by: erfgss <97771661+erfgss@users.noreply.github.com>

Removed metrics from the output representation.
Signed-off-by: erfgss <97771661+erfgss@users.noreply.github.com>

Signed-off-by: Chen Yang <2082464740@qq.com>

lishunyang12

Left a couple comments — the stats normalization looks good but the omni_base changes need a rebase.

lishunyang12 · 2026-03-19T04:46:10Z

+                stage_meta["stage_type"],
+                req_id,
+                engine_outputs,
+            )


This will conflict with main — process_stage_metrics already handles both the accumulate_diffusion_metrics call and final_output_type passing. Needs a rebase after #1908.

Signed-off-by: erfgss <97771661+erfgss@users.noreply.github.com>

Refactor output handling and metrics accumulation in the Omni request processing.
Signed-off-by: erfgss <97771661+erfgss@users.noreply.github.com>

hsliuustc0106 · 2026-03-22T06:45:14Z

I do not understand the diffusion examples: in your test results, some of them contain preprocessing fields but some do not include them. Do we have an arg/param list design for the log stats? In addition, what's the relationship with the profiler? @gcanlin

gcanlin · 2026-03-22T06:57:36Z

In addition, what's the relationship with profiler? @gcanlin

Not really related, actually. This PR focuses on coarse-grained profiling, such as e2e time. The Torch profiler is for kernel-level profiling.

hsliuustc0106 · 2026-03-22T08:50:05Z

image_num and resolution should be integers
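
One way to address this, sketched under assumptions: keep timing fields as floats and coerce only known count-like fields to `int` before they reach the table. `INTEGER_METRICS` and `coerce_count_metrics` are hypothetical names, not part of vllm-omni.

```python
from typing import Dict, Union

Number = Union[int, float]

# Hypothetical allow-list of fields that are counts or sizes, not timings.
INTEGER_METRICS = frozenset({"image_num", "resolution"})


def coerce_count_metrics(metrics: Dict[str, Number]) -> Dict[str, Number]:
    """Return a copy where count-like fields holding whole numbers become ints."""
    coerced: Dict[str, Number] = {}
    for key, value in metrics.items():
        if key in INTEGER_METRICS and float(value).is_integer():
            coerced[key] = int(value)
        else:
            # Timings such as postprocess_time_ms stay as floats.
            coerced[key] = value
    return coerced


stage_metrics = coerce_count_metrics(
    {"image_num": 1.0, "resolution": 640.0, "postprocess_time_ms": 66.332}
)
print(stage_metrics)
```

Selecting by field name rather than by value avoids turning a timing that happens to be `0.0` into an integer.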

hsliuustc0106

Review Summary

This PR adds important profiling capabilities for vLLM-Omni diffusion pipelines, but has blocking issues that must be resolved before merge.

✅ What Works Well

Clean addition of --log-stats flag to example scripts
Better type handling in metrics (int → int | float)
All CI checks pass
Good example test coverage in PR description

🚫 Blocking Issues

1. Merge Conflicts
The PR is in CONFLICTING state. Please rebase against main and resolve conflicts.

2. Missing Unit Tests
The core changes to vllm_omni/metrics/stats.py lack unit test coverage:

New _normalize_diffusion_metric_value() function
Modified accumulate_diffusion_metrics() logic
Edge cases: bool conversion, Real types, invalid types

Required: Add unit tests in tests/metrics/test_stats.py (or equivalent) covering:

Bool → int conversion
Real → float conversion
Invalid type handling (should return None)
Accumulation with various metric types
None value filtering in _as_stage_request_stats

⚠️ Code Quality Issues

3. Breaking Change: stage_durations Removal
OmniRequestOutput.stage_durations was removed but:

No deprecation warning or migration guide
Could break existing code relying on this field
Not documented in PR description

Recommendation: If this field is no longer needed, document the breaking change. If still useful, restore it.
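If the field were restored for a transition period, one possible shape is a property that warns on access. This is only a sketch: `OutputSketch` is a hypothetical stand-in, not the real `OmniRequestOutput`, and the warning text is a placeholder.

```python
import warnings
from dataclasses import dataclass, field


@dataclass
class OutputSketch:
    """Hypothetical stand-in for OmniRequestOutput; not the real class."""

    request_id: str
    _stage_durations: dict = field(default_factory=dict, repr=False)

    @property
    def stage_durations(self) -> dict:
        # Placeholder message; a real one should name the replacement.
        warnings.warn(
            "stage_durations is deprecated and may be removed in a future release",
            DeprecationWarning,
            stacklevel=2,
        )
        return self._stage_durations
```

Existing callers keep working while the warning points them at the change.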

4. Metric Accumulation Timing
In omni_base.py:252-263, accumulate_diffusion_metrics() is called before checking if finished. This means metrics may be accumulated for incomplete requests.

Recommendation: Move the accumulation call inside the if finished block to ensure we only accumulate completed requests.
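The recommended ordering can be sketched as follows. `MetricsSketch`, `handle_result`, and the `"finished"` key are hypothetical names used for illustration; only `accumulate_diffusion_metrics` and the `"metrics"` key come from the diff, and the real signature takes more arguments than shown here.

```python
from collections import defaultdict


class MetricsSketch:
    """Hypothetical stand-in for the orchestrator metrics object."""

    def __init__(self) -> None:
        # req_id -> metric name -> running total
        self.diffusion_metrics = defaultdict(lambda: defaultdict(float))

    def accumulate_diffusion_metrics(self, req_id: str, values: dict) -> None:
        for key, value in values.items():
            self.diffusion_metrics[req_id][key] += value


def handle_result(metrics: MetricsSketch, req_id: str, result: dict) -> None:
    finished = result.get("finished", False)  # assumed key
    stage_metrics = result.get("metrics")
    # Only completed requests contribute; partial results are skipped.
    if finished and stage_metrics is not None:
        metrics.accumulate_diffusion_metrics(req_id, stage_metrics)
```

With this guard, an unfinished result leaves no entry behind for its request ID.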

5. Silent Type Conversion Failures
The normalization function silently skips invalid types without logging. This could hide data quality issues.

Recommendation: Add debug-level logging when skipping unexpected types:

if normalized_value is None:
    logger.debug("Skipping unsupported metric value type: %s for key %s", type(value).__name__, key)

Next Steps

Resolve merge conflicts
Add unit tests for stats.py changes
Address the code quality issues above
Update PR description to document any breaking changes

hsliuustc0106 · 2026-03-22T08:52:47Z

        _m = result.get("metrics")
        if finished and _m is not None:
-            metrics.on_stage_metrics(stage_id, req_id, _m)
+            metrics.accumulate_diffusion_metrics(


This accumulates metrics before checking if the request is finished. Should this be moved inside the if finished and _m is not None: block below? Otherwise we might accumulate metrics for incomplete/partial requests.

hsliuustc0106 · 2026-03-22T08:52:47Z

 logger = init_logger(__name__)


+def _normalize_diffusion_metric_value(value: Any) -> int | float | None:


Consider adding debug logging when returning None:

if normalized_value is None:
    logger.debug("Skipping unsupported metric type: %s", type(value).__name__)

This helps with debugging if unexpected types appear in production.

hsliuustc0106 · 2026-03-22T08:52:47Z

        if diffusion_metrics:
            for key, value in diffusion_metrics.items():
-                self.diffusion_metrics[req_id][key] += value
+                normalized_value = _normalize_diffusion_metric_value(value)


Using pop() has side effects. Consider whether a defensive copy would be safer:

if req_id in self.diffusion_metrics:
    metrics = self.diffusion_metrics[req_id].copy()
    del self.diffusion_metrics[req_id]
    stats.diffusion_metrics = {k: normalized_value for k, v in metrics.items() ...}

This makes the mutation explicit and avoids surprises if the dict is accessed elsewhere.
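For comparison, the two patterns side by side, as a sketch with hypothetical function names and a plain dict standing in for `self.diffusion_metrics`. Both remove the per-request entry from the store; they differ only in how visible that removal is to a reader.

```python
def drain_with_pop(store: dict, req_id: str) -> dict:
    # pop() removes the entry and returns it in one step;
    # the default avoids a KeyError for unknown request IDs.
    return store.pop(req_id, {})


def drain_with_copy(store: dict, req_id: str) -> dict:
    if req_id not in store:
        return {}
    snapshot = store[req_id].copy()  # shallow copy of the per-request dict
    del store[req_id]  # the removal is spelled out as its own statement
    return snapshot
```

Since the entry is deleted in both cases, the copy mainly helps if another reference to the inner dict is still held elsewhere.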

hsliuustc0106 · 2026-03-22T08:52:47Z

            f"prompt={self.prompt!r}",
            f"latents={self.latents}",
-            f"metrics={self.metrics}",
            f"multimodal_output={self._multimodal_output}",


Breaking Change: stage_durations is removed from OmniRequestOutput.

If this field is no longer needed, please document this in the PR description as a breaking change. If it's still useful for debugging/profiling, consider restoring it or providing an alternative way to access this data.

hsliuustc0106 · 2026-03-22T09:32:04Z

I remember @LJH-LBJ wrote a doc about log-stats; please check it and update accordingly.


hsliuustc0106 · 2026-04-24T07:50:24Z

I think we can close this PR now that #3069 has been opened.

feat: add vllm-omni metrics support

adda875

Signed-off-by: Chen Yang <2082464740@qq.com>

erfgss requested a review from hsliuustc0106 as a code owner March 18, 2026 09:15

Merge branch 'main' into feat/vllmomni_metrics

84e5d6a

hsliuustc0106 reviewed Mar 18, 2026


erfgss added 3 commits March 18, 2026 18:07

Merge branch 'main' into feat/vllmomni_metrics

33efadd

Signed-off-by: erfgss <97771661+erfgss@users.noreply.github.com>

Remove metrics from output string

b7e8220

Removed metrics from the output representation. Signed-off-by: erfgss <97771661+erfgss@users.noreply.github.com>

Merge branch 'main' into feat/vllmomni_metrics

435ce80

Gaohan123 added this to the v0.18.0 milestone Mar 19, 2026

fix bug

de5e08f

Signed-off-by: Chen Yang <2082464740@qq.com>

erfgss changed the title ~~feat: add vllm-omni metrics support~~ [Profile] Adding vllm-omni metrics support Mar 19, 2026

erfgss and others added 2 commits March 19, 2026 10:26

fix omni_base.py

9680010

Signed-off-by: Chen Yang <2082464740@qq.com>

Merge branch 'main' into feat/vllmomni_metrics

b05aa0a

lishunyang12 reviewed Mar 19, 2026


erfgss added 5 commits March 19, 2026 14:20

Merge branch 'main' into feat/vllmomni_metrics

1ae7aa7

Signed-off-by: erfgss <97771661+erfgss@users.noreply.github.com>

Merge branch 'main' into feat/vllmomni_metrics

6c8bb95

Remove default value for --log-stats argument

e793a4e

Signed-off-by: erfgss <97771661+erfgss@users.noreply.github.com>

Refactor output handling in omni_base.py

db87ae4

Signed-off-by: erfgss <97771661+erfgss@users.noreply.github.com>

Simplify output return and metrics processing

b00c4e5

Refactor output handling and metrics accumulation in the Omni request processing. Signed-off-by: erfgss <97771661+erfgss@users.noreply.github.com>

hsliuustc0106 requested changes Mar 22, 2026


erfgss and others added 4 commits March 22, 2026 19:53

Merge branch 'main' into feat/vllmomni_metrics

e8a6b4a

Signed-off-by: erfgss <97771661+erfgss@users.noreply.github.com>

Merge branch 'main' into feat/vllmomni_metrics

9160a79

add something about test_stats.py

3b69300

Signed-off-by: Chen Yang <2082464740@qq.com>

Merge branch 'main' into feat/vllmomni_metrics

b255f76

erfgss and others added 5 commits March 23, 2026 10:48

Update stats.py

147f55d

Signed-off-by: erfgss <97771661+erfgss@users.noreply.github.com>

fix

ddf632e

Signed-off-by: Chen Yang <2082464740@qq.com>

fix

9feee92

Signed-off-by: Chen Yang <2082464740@qq.com>

Merge branch 'vllm-project:main' into feat/vllmomni_metrics

25337cb

fix

ccf081f

Signed-off-by: erfgss <97771661+erfgss@users.noreply.github.com>

erfgss changed the title ~~[Profile] Adding vllm-omni metrics support~~ [Metrics] Adding vllm-omni metrics support Mar 23, 2026

erfgss changed the title ~~[Metrics] Adding vllm-omni metrics support~~ [Metrics] Adding vllm-omni diffusion metrics support Mar 23, 2026

Merge branch 'main' into feat/vllmomni_metrics

dc33340

hsliuustc0106 mentioned this pull request Mar 23, 2026

[Feature]: Support VAE as a Separate Stage to Reduce GPU Memory Pressure in Diffusion Pipelines #2089

Open

1 task

erfgss and others added 5 commits March 24, 2026 09:19

Merge branch 'main' into feat/vllmomni_metrics

c03ebb7

Signed-off-by: erfgss <97771661+erfgss@users.noreply.github.com>

Merge branch 'main' into feat/vllmomni_metrics

dc48a14

Merge branch 'main' into feat/vllmomni_metrics

85a1514

Merge branch 'main' into feat/vllmomni_metrics

e69365a

Signed-off-by: erfgss <97771661+erfgss@users.noreply.github.com>

fix pre-commit

cbc5ec4

Signed-off-by: Chen Yang <2082464740@qq.com>

Gaohan123 modified the milestones: v0.18.0, v0.20.0 Apr 14, 2026

Gaohan123 removed this from the v0.20.0 milestone Apr 30, 2026
