[Profile] Adding metrics for Diffusion/DiT Single diffusion Pipeline #668
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a443eb8546
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```diff
 def close(self) -> None:
     self._finalizer()
-
-def abort(self, request_id: str | Iterable[str]) -> None:
-    # TODO implement it
-    logger.warning("DiffusionEngine abort is not implemented yet")
-    pass
-    self._finalizer()
```
Restore DiffusionEngine.abort used by async stages
The DiffusionEngine class no longer defines abort(), but AsyncOmniDiffusion.abort() (and the async stage worker’s ABORT handling) still calls self.engine.abort(...). When an abort task is issued, this will now raise AttributeError and fail to cancel requests. This is a regression for any deployment that uses abort (e.g., client cancellation or timeout handling) and should be fixed by reintroducing DiffusionEngine.abort() or updating callers to guard/route aborts.
Useful? React with 👍 / 👎.
```diff
 _recv_dequeue_ts = _time.time()
 task_type = task.get("type", OmniStageTaskType.GENERATE)
 if task_type == OmniStageTaskType.SHUTDOWN:
-    logger.info("Received shutdown signal")
+    logger.error("Received shutdown signal")
```
Handle profiler start/stop tasks in stage worker
Profiler control tasks are still submitted from omni.py (PROFILER_START/PROFILER_STOP), but the stage worker no longer handles them. As a result, these tasks fall through into the batching path, and the worker immediately accesses t["request_id"], which profiler tasks don’t include, causing KeyError and breaking profiling control. This is a functional regression for any user toggling profiling and should be addressed by reinstating the profiler-task handling or filtering those tasks before batching.
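A minimal sketch of the suggested fix: route profiler-control tasks before the batching path, so tasks without a `"request_id"` key never reach `t["request_id"]`. The task-type strings mirror the names quoted above; the profiler object and its `start`/`stop` calls are assumptions.

```python
def route_task(task: dict, profiler, batch: list) -> None:
    # Profiler-control tasks carry no request_id and must not be batched.
    task_type = task.get("type", "GENERATE")
    if task_type == "PROFILER_START":
        profiler.start()
    elif task_type == "PROFILER_STOP":
        profiler.stop()
    else:
        batch.append(task)  # only generate tasks carry "request_id"
```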
@gcanlin @lishunyang12 PTAL

cc @SamitHuang @ZJY0516.

Thanks for your contribution. Please show the test results.
```python
sum_total_ms = float(agg.get("sum_total_ms", 0.0))
samples_total = int(agg.get("total_count", 0.0))
total_mbps = (sum_bytes * 8.0) / (max(sum_total_ms, 1e-6) * 1000.0) if sum_bytes > 0 else 0.0
sum_vae_ms = float(agg.get("sum_vae_time_ms", 0.0))
```
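As a unit check on the throughput formula quoted above: bytes times 8 gives bits, and milliseconds times 1000 gives microseconds, so the ratio is bits per microsecond, which is exactly megabits per second. A small standalone reproduction (function name is illustrative):

```python
def throughput_mbps(sum_bytes: int, sum_total_ms: float) -> float:
    # bits / microseconds == Mbit/s; the max() guards against division by zero.
    return (sum_bytes * 8.0) / (max(sum_total_ms, 1e-6) * 1000.0) if sum_bytes > 0 else 0.0


# 1_000_000 bytes transferred in 1000 ms -> 8_000_000 bits per second = 8 Mbps
print(throughput_mbps(1_000_000, 1000.0))  # → 8.0
```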
Why put these into transfer_summary? Maybe adding them in another function would be better.
You should add your general design and the effect you want to achieve, as well as the test results.

@erfgss An LLM can help you generate this required information; please work with one. Next time, please describe your ideas first before submitting your PR.
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 16b1ebcedc
```python
metrics = {
    "preprocess_time_ms": preprocess_ms,
    "dit_time_ms": infer_ms,
    "denoise_time_ms": infer_ms,
    "vae_time_ms": postprocess_ms,
```
Avoid double-counting diffusion phase timings
Here both dit_time_ms and denoise_time_ms are set to the same infer_ms value, which already represents the full diffusion inference window. This makes per-phase profiling misleading (every request appears to spend the entire inference time in both phases) and inflates any summaries that interpret these as distinct phases. If per-phase timing isn’t available, consider leaving one field unset or only reporting a single aggregate to avoid double-counting.
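One hedged sketch of the single-aggregate option from this review: report `infer_ms` once under one phase name instead of twice. The field names follow the quoted diff; dropping `denoise_time_ms` and renaming the VAE field to a generic postprocess key are illustrations of the suggestion, not the PR's final choice.

```python
def build_metrics(preprocess_ms: float, infer_ms: float, postprocess_ms: float) -> dict:
    # The diffusion inference window appears exactly once, so phase sums
    # cannot double-count it.
    return {
        "preprocess_time_ms": preprocess_ms,
        "dit_time_ms": infer_ms,
        "postprocess_time_ms": postprocess_ms,
    }
```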
@wuhang2014 PTAL
```python
metrics = {
    "preprocess_time_ms": preprocess_ms,
    "dit_time_ms": infer_ms,
    "denoise_time_ms": infer_ms,
```
Why do we need these two fields with the same infer_ms?
ZJY0516 left a comment:
I don't want to introduce this now honestly.
Given that the DiT component dominates runtime in diffusion models, I'd prefer to keep our focus on total end-to-end performance for now.
```diff
-metrics={},
+metrics={
+    "preprocess_time_ms": preprocess_ms,
+    "dit_time_ms": infer_ms,
```
First, dit_time_ms seems to duplicate denoise_time_ms. Second, we'd better remove the VAE time since we cannot measure it.
The Multi-Stage Pipeline logs are spamming the output in this PR.

Agree. We should focus on e2e performance now.

Could you explain the purpose of this PR? I'm a little bit confused.

Use contextlib for a more elegant coding style; one example is https://github.com/vllm-project/vllm-ascend/blob/main/vllm_ascend/worker/model_runner_v1.py#L1496
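For illustration, a contextlib-based timing helper in the spirit of the linked example; the helper name and metrics-dict shape here are assumptions for this sketch, not vllm-ascend's actual API.

```python
import contextlib
import time


@contextlib.contextmanager
def record_ms(metrics: dict, key: str):
    # Record the elapsed wall time of the wrapped block, in milliseconds.
    start = time.perf_counter()
    try:
        yield
    finally:
        metrics[key] = (time.perf_counter() - start) * 1000


metrics: dict = {}
with record_ms(metrics, "preprocess_time_ms"):
    time.sleep(0.01)  # stands in for the real preprocessing work
```

This keeps timing bookkeeping out of the pipeline body: each phase becomes a `with record_ms(...)` block instead of paired `time.time()` calls.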
In the vllm-omni project, the logs printed by the Diffusion/DiT Single diffusion Pipeline model lack some diffusion feature information. This PR supplements this information and improves the log printing format. |
Force-pushed d37f6c1 to 2f704e4.
FYI: user feedback indicates the diffusion logs are excessive and feel like spam now (not this PR; this is on the main branch).

@LJH-LBJ ptal thx

There are two metrics in the result. Moreover, I think it would be better to split the metrics from the output and use a separate class to record all the metrics.

I think we can start by providing simple metrics, and then you can refactor them in your PR.

LGTM
```python
"preprocess_time_ms": preprocess_ms,
"dit_time_ms": infer_ms,
"denoise_time_per_step_ms": per_step_ms,
"vae_time_ms": postprocess_ms,
```
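If `per_step_ms` is derived rather than measured, one plausible derivation (an assumption; the PR may compute it differently) divides the inference window by the configured step count:

```python
def denoise_time_per_step_ms(infer_ms: float, num_inference_steps: int) -> float:
    # Average per-step denoise time; guard against a zero step count.
    return infer_ms / max(num_inference_steps, 1)


# e.g. a 2500 ms inference window over 50 steps averages 50 ms per step
print(denoise_time_per_step_ms(2500.0, 50))  # → 50.0
```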
postprocess time is not vae time; see
vllm-omni/vllm_omni/diffusion/models/qwen_image/pipeline_qwen_image.py, lines 801 to 802 in 9f552d0
lishunyang12 left a comment:

All my previous concerns are addressed. LGTM.
```python
metrics = {
    "preprocess_time_ms": round(preprocess_time * 1000, 2),
    "diffusion_engine_exec_time_ms": round((time.time() - diffusion_engine_start_time) * 1000, 2),
    "executor_time_ms": round(exec_total_time * 1000, 2),
```
There's no need to round here; the status.py file will keep three decimal places. The same applies to other similar places.
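To illustrate the reviewer's point: metrics can stay unrounded at the source, with rounding left to the presentation layer (the three-decimal behavior of status.py is taken from the comment above, not verified here).

```python
# Store the raw float; no round() at the measurement site.
metrics = {"executor_time_ms": 123.456789}

# Display-time formatting to three decimal places:
formatted = f"{metrics['executor_time_ms']:.3f}"
print(formatted)  # → 123.457
```

Rounding only at display time avoids accumulating rounding error if the raw values are later summed or averaged.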
```python
"diffusion_engine_exec_time_ms": round((time.time() - diffusion_engine_start_time) * 1000, 2),
"executor_time_ms": round(exec_total_time * 1000, 2),
```
I think diffusion_engine_total_time_ms and executor_exec_time_ms would be better names.
Please update the newly added metrics in
Signed-off-by: erfgss <97771661+erfgss@users.noreply.github.com>

Added detailed metrics for DiffusionStats including execution and processing times.

Updated formatting and added spacing for clarity in the metrics documentation.
have added
| metric | value |
| --- | --- |
| num_inference_steps | 50.000 |
| postprocess_time_ms | 67.685 |
| preprocess_time_ms | 60.106 |
| preprocessing_time_ms | 60.106 |
Is preprocessing_time_ms duplicated?
Removed duplicate preprocessing_time_ms entry.
LGTM.
…ipeline (vllm-project#668)" This reverts commit b7fcc9d. Signed-off-by: gcanlin <canlinguosdu@gmail.com>
…llm-project#668) Signed-off-by: Chen Yang <2082464740@qq.com> Signed-off-by: erfgss <97771661+erfgss@users.noreply.github.com> Signed-off-by: lishunyang <lishunyang12@163.com>
…ipeline (vllm-project#668)" (vllm-project#1724) Signed-off-by: gcanlin <canlinguosdu@gmail.com> Signed-off-by: lishunyang <lishunyang12@163.com>
Adding profiling for vllm-omni
Purpose
In the vllm-omni project, the logs printed by the Diffusion/DiT Single diffusion Pipeline model lack some diffusion feature information. This PR supplements this information and improves the log printing format.
Test Plan
Test Result glm_image
Test Result text_to_image
Test Result image_to_image
Essential Elements of an Effective PR Description Checklist

- Update `supported_models.md` and `examples` for a new model.

BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)