
[Log] Wire stat loggers into AsyncOmniEngine to match AsyncLLM #2551

Merged
gcanlin merged 11 commits into vllm-project:main from gcanlin:log-stats
Apr 12, 2026

Conversation

@gcanlin
Collaborator

@gcanlin gcanlin commented Apr 7, 2026


Purpose

Stat logging wired end-to-end (mirrors AsyncLLM; see the sketch after this list)

  • AsyncOmniEngine: derive self.log_stats from stage0 engine_args.disable_log_stats and thread it through spawn_stage_core / MultimodalOutputProcessor, instead of hardcoding False.
  • Single StatLoggerManager for the whole pipeline: one manager with engine_idxs=list(range(num_stages)), so each stage logs under its own Engine NNN label and PrometheusStatLogger is only instantiated once (avoids registry collisions between stages).
  • Orchestrator._process_stage_outputs: construct IterationStats per output batch and call logger_manager.record(scheduler_stats=..., iteration_stats=..., engine_idx=stage_id), matching AsyncLLM's output_handler loop.
  • Orchestrator.__init__: drop the redundant log_stats parameter; derive it as self.log_stats = self.logger_manager is not None.
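
A rough sketch of the wiring above (build_logger_manager is a hypothetical helper for illustration; the exact StatLoggerManager constructor lives in vLLM's vllm/v1/metrics/loggers.py and its kwargs may differ):

from vllm.v1.metrics.loggers import StatLoggerManager

def build_logger_manager(stage_configs, stage_vllm_configs):
    # Derive log_stats from the stage-0 engine args instead of
    # hardcoding False.
    stage0_args = getattr(stage_configs[0], "engine_args", None)
    log_stats = not bool(getattr(stage0_args, "disable_log_stats", False))
    if not log_stats:
        return None
    # One manager for the whole pipeline: each stage logs under its own
    # "Engine NNN" label, and PrometheusStatLogger is instantiated only
    # once, avoiding registry collisions between stages.
    return StatLoggerManager(
        vllm_config=stage_vllm_configs[0],  # illustrative kwarg
        engine_idxs=list(range(len(stage_configs))),
    )

# Per output batch in Orchestrator._process_stage_outputs:
#     logger_manager.record(scheduler_stats=..., iteration_stats=...,
#                           engine_idx=stage_id)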

Single-threaded StatLoggerManager access

StatLoggerManager is not thread-safe, and record() runs in the orchestrator thread while do_log_stats() is called from the API-server main thread — a data race on the internal accumulators.

  • AsyncOmniEngine._bootstrap_orchestrator: expose the orchestrator event loop as self.orchestrator_loop.
  • AsyncOmniEngine.do_log_stats (new): schedule manager.log() onto the orchestrator loop via asyncio.run_coroutine_threadsafe (sketched below), so all access to StatLoggerManager stays on a single thread. No-op when the manager / loop is missing or stopped.
  • AsyncOmni.do_log_stats: reduced to await self.engine.do_log_stats(). The API-server thread no longer touches orchestrator internals directly — matches AsyncLLM's facade style.
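
A minimal sketch of the hand-off (attribute names follow the bullets above; the real method bodies are condensed):

import asyncio

class AsyncOmniEngine:
    async def do_log_stats(self) -> None:
        manager = self.logger_manager
        loop = self.orchestrator_loop
        # No-op when the manager or loop is missing or stopped.
        if manager is None or loop is None or loop.is_closed():
            return

        async def _log() -> None:
            manager.log()

        # Hop to the orchestrator loop: record() and log() then always
        # run on the same thread, so the manager's accumulators never race.
        asyncio.run_coroutine_threadsafe(_log(), loop)

class AsyncOmni:
    async def do_log_stats(self) -> None:
        # Facade only: the API-server thread never touches orchestrator
        # internals directly, matching AsyncLLM's style.
        await self.engine.do_log_stats()

run_coroutine_threadsafe is safe to call from any thread; the sketch fires and forgets the returned future, matching the no-op-on-failure behavior described above.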

Test Plan

Test Result

(APIServer pid=1023600) INFO 04-07 09:32:07 [loggers.py:259] Engine 000: Avg prompt throughput: 64.8 tokens/s, Avg generation throughput: 56.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 0.0%
(APIServer pid=1023600) INFO 04-07 09:32:07 [loggers.py:259] Engine 001: Avg prompt throughput: 57.0 tokens/s, Avg generation throughput: 123.9 tokens/s, Running: 9 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 0.0%
(APIServer pid=1023600) INFO 04-07 09:32:07 [loggers.py:259] Engine 002: Avg prompt throughput: 3956.3 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 0.0%
(APIServer pid=1023600) INFO 04-07 09:32:17 [loggers.py:259] Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 0.0%
(APIServer pid=1023600) INFO 04-07 09:32:17 [loggers.py:259] Engine 001: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 9 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 0.0%
(APIServer pid=1023600) INFO 04-07 09:32:17 [loggers.py:259] Engine 002: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 0.0%


gcanlin added 2 commits April 7, 2026 08:22
@gcanlin gcanlin requested a review from hsliuustc0106 as a code owner April 7, 2026 09:36

@gcanlin
Collaborator Author

gcanlin commented Apr 7, 2026

This PR is totally vibe-coded, but it looks clean. @princepride @fake0fan @yinpeiqi Could you help take a look?

Comment thread vllm_omni/entrypoints/async_omni.py Outdated
Comment on lines +751 to +757
manager = getattr(self.engine, "logger_manager", None)
if manager is None:
return
try:
manager.log()
except Exception:
logger.exception("[AsyncOmni] do_log_stats failed")
Collaborator
Cross-thread data race on StatLoggerManager between record() and log()

StatLoggerManager is accessed from two different threads without synchronization. record() is called from the orchestrator's background thread in Orchestrator._process_stage_outputs() (vllm_omni/engine/orchestrator.py:621), while log() is called from the main caller's thread in AsyncOmni.do_log_stats() (vllm_omni/entrypoints/async_omni.py:755). The orchestrator runs in a dedicated threading.Thread created at vllm_omni/engine/async_omni_engine.py:268. In upstream vLLM's AsyncLLM, both record() and log() execute within the same asyncio event loop / thread context. Here they are split across threads, creating a data race on the internal accumulators of StatLoggerManager.
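
To make the race concrete, here is a stand-in with the same shape (TinyStatManager is hypothetical, not vLLM code):

class TinyStatManager:
    """Stand-in for StatLoggerManager's unguarded accumulators."""

    def __init__(self):
        self.pending = []

    def record(self, stat):
        # Called from the orchestrator's background thread.
        self.pending.append(stat)

    def log(self):
        # Called from the API-server main thread: this read-then-reset
        # can interleave with record(), silently dropping samples.
        drained, self.pending = self.pending, []
        return drained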

Collaborator Author

Fixed.

@gcanlin gcanlin requested a review from princepride April 7, 2026 12:16
Comment thread vllm_omni/entrypoints/async_omni.py Outdated

async def do_log_stats(self) -> None:
"""Log statistics.
"""Log statistics by flushing per-stage StatLoggerManagers.
Contributor

I think we'd better not operate on the orchestrator thread from the AsyncOmni thread. In my view, it should be:

class AsyncOmni:
    async def do_log_stats(self):
        await self.engine.do_log_stats()

class AsyncOmniEngine:
    async def do_log_stats(self):
        # let the orchestrator thread do the call

Maybe this could be regarded as a collective RPC call? I am not very sure. But we definitely shouldn't operate directly on the orchestrator thread from AsyncOmni.

self.output_processors: list[Any] = output_processors
self.stage_vllm_configs: list[Any] = stage_vllm_configs
self.log_stats = log_stats
self.logger_manager: StatLoggerManager | None = logger_manager
Contributor

Is log_stats still useful in the orchestrator? Could we just do:

self.log_stats = self.logger_manager is not None

# Mirror vLLM AsyncLLM output_handler: feed stats to the logger
# manager so LoggingStatLogger can periodically print KV cache /
# prefix cache hit rate, and PrometheusStatLogger can publish.
if self.logger_manager is not None:
Contributor

The diffusion engine doesn't go into this branch. Do we have any plan for diffusion?

Collaborator Author

Currently, I have no plan for a diffusion logger. Reusing vLLM's logger keeps this PR simple, but something like KV cache metrics isn't appropriate for diffusion.

Comment on lines 245 to +248
self.num_stages = len(self.stage_configs)
stage0_args = getattr(self.stage_configs[0], "engine_args", None) if self.num_stages > 0 else None
self.async_chunk = bool(getattr(stage0_args, "async_chunk", False))
self.log_stats = not bool(getattr(stage0_args, "disable_log_stats", False))
Contributor

Overall this looks fine to me. If the StatLoggerManager concurrency issue has been properly resolved, I don't have other blockers.

One small nit: this seems to rely too heavily on the stage0 configuration, which feels somewhat awkward. Probably okay for now, but worth cleaning up later. cc @yinpeiqi

Also, it may be worth taking another look at the logging/stat system for the diffusion path in a follow-up as well, since it seems not fully covered by the current branch yet. @chickeyton

gcanlin added 3 commits April 9, 2026 06:58
@gcanlin gcanlin added the ready label to trigger buildkite CI label Apr 9, 2026
@gcanlin
Collaborator Author

gcanlin commented Apr 9, 2026

@princepride @fake0fan @yinpeiqi Thanks for the valuable review! I've fixed them now. Please take another look.

gcanlin added 3 commits April 9, 2026 07:27
@yinpeiqi
Contributor

Overall LGTM, please fix the CI.

@gcanlin gcanlin merged commit 5d58abb into vllm-project:main Apr 12, 2026
8 checks passed
amy-why-3459 added a commit to amy-why-3459/vllm-omni that referenced this pull request Apr 13, 2026

Revert "[Log] Wire stat loggers into AsyncOmniEngine to match AsyncLLM (vllm-project#2551)"

This reverts commit 5d58abb.
daixinning pushed a commit to daixinning/vllm-omni that referenced this pull request Apr 13, 2026
