[Refactor] Refactor Diffusion Scheduler/Executor Boundaries and Request State Flow by yJader · Pull Request #1625 · vllm-project/vllm-omni

yJader · 2026-03-03T05:55:32Z

Purpose

RFC: #874

This PR refactors diffusion runtime boundaries by fully separating scheduler state management from multiprocess IPC execution.

Core goals:

Make Scheduler a pure request-state scheduler (waiting/running/finished) without owning IPC queues.
Make MultiprocDiffusionExecutor a pure IPC runtime (broadcast/result queues + worker lifecycle).
Let DiffusionEngine explicitly drive add_request -> schedule -> execute -> update_from_output.
Consolidate cross-API concurrency control into DiffusionEngine._rpc_lock, covering both add_req_and_wait_for_response and collective_rpc.

Architecture before/after

Before Refactor

┌─────────────────────────────────────────────────────────────────────────┐
│                          AsyncOmniDiffusion                             │
│  - ThreadPoolExecutor(max_workers=1)                                    │
│  - await loop.run_in_executor(self.engine.step, request)                │
└─────────────────────┬───────────────────────────────────────────────────┘
                      │ synchronous call
                      ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                           DiffusionEngine                               │
│  - step(request)                                                        │
│  - add_req_and_wait_for_response(request)                               │
│  - (directly calls) executor.add_req(request)                           │
└─────────────────────┬───────────────────────────────────────────────────┘
                      │
                      ▼
┌─────────────────────────────────────────────────────────────────────────┐
│     MultiprocDiffusionExecutor (depends on Scheduler internal IPC)      │
│  - self.scheduler = Scheduler()                                         │
│  - add_req() -> scheduler.add_req()                                     │
│  - collective_rpc() depends on scheduler._lock / scheduler.mq/result_mq │
└─────────────────────┬───────────────────────────────────────────────────┘
                      │ scheduler.add_req() -> mq.enqueue/dequeue
                      ▼
┌─────────────────────────────────────────────────────────────────────────┐
│              Scheduler (mixed state + IPC responsibilities)             │
│  - _lock, mq, result_mq                                                 │
│  - add_req(request)                                                     │
└─────────────────────┬───────────────────────────────────────────────────┘
                      │ MessageQueue broadcast
                      ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                                Workers                                  │
│  - worker.generate(request)                                             │
│  - full denoising loop executed inside workers                          │
└─────────────────────────────────────────────────────────────────────────┘

After Refactor

┌─────────────────────────────────────────────────────────────────────────┐
│                          AsyncOmniDiffusion                             │
│  - ThreadPoolExecutor(max_workers=1)                                    │
│  - await loop.run_in_executor(self.engine.step, request)                │
└─────────────────────┬───────────────────────────────────────────────────┘
                      │ synchronous call
                      ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                            DiffusionEngine                              │
│  - owns: Scheduler + Executor + _rpc_lock                               │
│  - add_req_and_wait_for_response():                                     │
│      add_request -> schedule -> executor.add_req -> update_from_output  │
│  - collective_rpc(): acquires engine._rpc_lock, then calls executor RPC │
└───────────────┬───────────────────────────────────────┬─────────────────┘
                │                                       │
                │ scheduling state flow                 │ RPC/execution flow
                ▼                                       ▼
┌─────────────────────────────────┐     ┌──────────────────────────────────┐
│ Scheduler (pure state machine)  │     │ MultiprocDiffusionExecutor       │
│  - waiting/running/finished     │     │ (pure IPC runtime)               │
│  - add_request/schedule/update  │     │  - broadcast_mq / result_mq      │
│  - no mq/result_mq ownership    │     │  - add_req / collective_rpc      │
└─────────────────────────────────┘     └─────────────────┬────────────────┘
                                                          │ MessageQueue
                                                          ▼
                                            ┌──────────────────────────────┐
                                            │            Workers           │
                                            │  - generate / RPC handlers   │
                                            └──────────────────────────────┘

Key differences:

Scheduler changes from a mixed “state + IPC” component to a pure scheduling state machine.
Scheduler is also split internally into a sched/ package:
- SchedulerInterface defines the scheduler/engine boundary.
- _BaseScheduler owns shared request state, waiting/running/finished queues, and request-id mapping.
- RequestScheduler only keeps the current request-mode scheduling policy.
MultiprocDiffusionExecutor no longer depends on scheduler internals (_lock/mq/result_mq).
Concurrency locking is moved up to DiffusionEngine._rpc_lock, covering generation and collective_rpc.
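
The ownership split above can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyScheduler, ToyExecutor, ToyEngine, the tuple/set return shapes, and the use of threading.Lock are all hypothetical. Only the method names add_request, schedule, update_from_output, add_req, add_req_and_wait_for_response and the _rpc_lock attribute are taken from the PR description.

```python
import threading
from collections import deque


class ToyScheduler:
    """Pure request-state bookkeeping: no IPC queues and no locks (hypothetical)."""

    def __init__(self):
        self._waiting = deque()
        self._running = []
        self._finished_req_ids = set()
        self._next_id = 0

    def add_request(self, request):
        sched_req_id = f"sched-{self._next_id}"
        self._next_id += 1
        self._waiting.append((sched_req_id, request))
        return sched_req_id

    def schedule(self):
        # Single-request semantics: promote at most one waiting request.
        if not self._running and self._waiting:
            self._running.append(self._waiting.popleft())
        return list(self._running)

    def update_from_output(self, sched_req_id, output):
        # Move the request from running to finished and report what finished.
        self._running = [r for r in self._running if r[0] != sched_req_id]
        self._finished_req_ids.add(sched_req_id)
        return {sched_req_id}


class ToyExecutor:
    """Stand-in for the IPC runtime; it just echoes the request (hypothetical)."""

    def add_req(self, request):
        return f"output-for-{request}"


class ToyEngine:
    """Owns scheduler, executor, and the lock, and drives the loop explicitly."""

    def __init__(self):
        self.scheduler = ToyScheduler()
        self.executor = ToyExecutor()
        self._rpc_lock = threading.Lock()  # lock type is an assumption

    def add_req_and_wait_for_response(self, request):
        with self._rpc_lock:
            target = self.scheduler.add_request(request)
            while True:
                scheduled = self.scheduler.schedule()
                if not scheduled:
                    raise RuntimeError("no runnable requests")
                sched_req_id, req = scheduled[0]
                output = self.executor.add_req(req)
                finished = self.scheduler.update_from_output(sched_req_id, output)
                if target in finished:
                    return output


engine = ToyEngine()
result = engine.add_req_and_wait_for_response("a cat")
```

In this toy version the scheduler never touches a queue or a lock, and the executor never touches request state, which is the boundary the refactor aims for.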

Main code changes:

vllm_omni/diffusion/sched/interface.py
- Defines DiffusionRequestStatus, DiffusionRequestState, NewRequestData, CachedRequestData, DiffusionSchedulerOutput, and SchedulerInterface.
- DiffusionSchedulerOutput now explicitly separates scheduled_new_reqs, scheduled_cached_reqs, and finished_req_ids as the stable engine/scheduler scheduling output.
vllm_omni/diffusion/sched/base_scheduler.py
- Extracts shared scheduler state management: _request_states, _waiting, _running, _finished_req_ids.
- Extracts request_id -> sched_req_id mapping and common finish/cleanup logic.
vllm_omni/diffusion/sched/request_scheduler.py
- Implements the current request-mode scheduling policy.
- Keeps single-request scheduling semantics (num_scheduled_reqs is currently 0/1).
vllm_omni/diffusion/sched/__init__.py
- Provides the unified scheduler export surface and keeps Scheduler = RequestScheduler as an alias.
vllm_omni/diffusion/diffusion_engine.py
- Initialize scheduler during engine construction.
- Move lock ownership to engine (_rpc_lock).
- Refactor add_req_and_wait_for_response into a scheduler-driven sync loop that advances execution and state cleanup via DiffusionSchedulerOutput.
- Make collective_rpc(timeout=...) honor end-to-end timeout across lock wait + RPC execution.
vllm_omni/diffusion/executor/multiproc_executor.py
- Executor directly manages broadcast/result queues and worker lifecycle.
- add_req directly performs generate RPC with response type/error/timeout checks.
- collective_rpc no longer relies on scheduler internals or locks.
Tests/docs
- Add tests/diffusion/test_diffusion_scheduler.py.
- Rename and refactor concurrency tests:
  tests/diffusion/test_multiproc_executor_concurrency.py ->
  tests/diffusion/test_multiproc_engine_concurrency.py.
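
One way to make a timeout cover both the lock wait and the RPC itself, as described for collective_rpc(timeout=...) above, is to compute a single deadline up front and pass only the remaining budget to the RPC. The helper below is a hypothetical sketch (call_with_deadline and its rpc callback are not names from the PR) and may differ from the actual implementation.

```python
import threading
import time


def call_with_deadline(lock, rpc, timeout=None):
    """Run rpc(remaining) under lock, charging the lock wait against timeout."""
    deadline = None if timeout is None else time.monotonic() + timeout

    # threading.Lock.acquire treats timeout=-1 as "wait forever".
    acquired = lock.acquire(timeout=-1 if timeout is None else timeout)
    if not acquired:
        raise TimeoutError("timed out waiting for the RPC lock")
    try:
        remaining = None
        if deadline is not None:
            # Whatever the lock wait consumed is no longer available to the RPC.
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                raise TimeoutError("timeout budget spent while waiting for the lock")
        return rpc(remaining)
    finally:
        lock.release()
```

With this shape, a caller that asks for a 5 second timeout and spends 4 seconds waiting for the lock leaves the RPC roughly 1 second, rather than a fresh 5.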

Test Plan

Unit/functional tests:

pytest -m diffusion tests/diffusion/test_diffusion_scheduler.py
pytest -m diffusion tests/diffusion/test_multiproc_engine_concurrency.py

Serving benchmark command:

python3 benchmarks/diffusion/diffusion_benchmark_serving.py \
  --base-url http://localhost:8099 \
  --model Qwen/Qwen-Image \
  --task t2i \
  --dataset vbench \
  --num-prompts 20

Test Result

Benchmark setup:

Backend: vllm-omni
Model: Qwen/Qwen-Image
Dataset: vbench
Task: t2i
Num prompts: 20
Max request concurrency: 1

Benchmark comparison:

| Metric | Base (`f52b5153`) | This PR (`72df5305`) | Delta (PR - Base) |
| --- | --- | --- | --- |
| Benchmark duration (s) | 406.28 | 406.21 | -0.06 |
| Request throughput (req/s) | 0.05 | 0.05 | 0.00 |
| Latency Mean (s) | 20.3133 | 20.3103 | -0.0030 |
| Latency Median (s) | 20.3292 | 20.3391 | +0.0099 |
| Latency P99 (s) | 20.4397 | 20.4303 | -0.0094 |
| Successful requests | 20/20 | 20/20 | same |

Result summary:

No throughput or stability regression is observed under this benchmark setup.
Total duration is slightly lower, P99 is slightly better, and median latency only shifts by about +0.01s; overall performance remains effectively on par.

CC List

@hsliuustc0106 @ZJY0516 @lishunyang12

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1183c10d7b

hsliuustc0106

Is this the first PR of #874?

yJader · 2026-03-03T06:15:51Z

Is this the first PR of #874?

No, the first PR is #1368. This is the second PR for RFC #874.

lishunyang12

left a comment inline

lishunyang12 · 2026-03-04T15:28:17Z

+        with self._rpc_lock:
+            target_req_id = self.scheduler.add_request(request)
+
+            while True:


Holding _rpc_lock for the entire generation blocks all other RPCs. Is that intentional?

Yes.

As discussed in #1448, production can have concurrent add_req and collective_rpc calls, and correctness issues can happen if request scheduling and IPC are not protected consistently.

Before this refactor, the call path was engine.add_req_and_wait_for_response -> executor.add_req -> scheduler.add_req, and scheduler.add_req held the lock for the whole enqueue/dequeue round-trip. So generation already serialized with other RPCs through the same lock domain.

After splitting scheduler/executor responsibilities, if we only lock inside the executor but leave scheduler.add_request/schedule/update_from_output outside that critical section, we can reintroduce request/response interleaving risks (the same class of issue as #1448).

So I intentionally moved the lock to engine level and hold it for the full add_req_and_wait_for_response cycle, to keep scheduler state transition + IPC execution atomic and preserve existing API semantics.

You are right that this blocks other RPCs during generation; this is a correctness-first tradeoff. A finer-grained solution would require changing add_req_and_wait_for_response semantics and a larger engine redesign, which may be out of scope for this PR.

Makes sense, thanks for the context. Preserving the existing lock scope is reasonable here.

lishunyang12

Scheduler as a pure state machine is a nice clean-up. Left one nit inline.

lishunyang12 · 2026-03-09T01:07:55Z

+                    "please check the stack trace above for the root cause"
+                )
+            if not isinstance(response, DiffusionOutput):
+                raise RuntimeError(f"Unexpected response type for generate: {type(response)!r}")


deadline is always None here, so the zmq.error.Again / TimeoutError catches on dequeue are dead code. Not blocking — just something to clean up or wire through a timeout param later.

True. Since this is not needed for now, it’s better to remove this dead code rather than keep it for potential future use.

ZJY0516

Overall, LGTM

ZJY0516 · 2026-03-09T02:18:59Z

+        self._running.clear()
+        self._finished_req_ids.clear()
+
+    def _make_req_id(self, request: OmniDiffusionRequest) -> str:


I remember vllm-omni sets a request ID at the API server level. What's the difference between them?

OmniDiffusionRequest.request_ids is aligned with prompts (request_ids[i] -> prompts[i]).
In contrast, DiffusionRequestState.req_id identifies the entire OmniDiffusionRequest, which is treated as a single scheduling unit.

I found that req_id here is easy to confuse with OmniDiffusionRequest.request_id, while it actually refers to the scheduler-owned request identifier.

Renaming it to sched_req_id makes the ownership explicit and reduces confusion in follow-up development.

Even if this field may disappear in a future refactor when batching is fully moved into the scheduler, the rename still improves clarity today.

(edit in 03d7da4)
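
The distinction between the two identifiers can be shown with a toy example. The class names ToyDiffusionRequest and ToyRequestState and the ID values are hypothetical; only the field names prompts, request_ids, and sched_req_id come from the discussion above.

```python
from dataclasses import dataclass


@dataclass
class ToyDiffusionRequest:
    # One API-level ID per prompt: request_ids[i] belongs to prompts[i].
    prompts: list[str]
    request_ids: list[str]


@dataclass
class ToyRequestState:
    # One scheduler-owned ID for the whole request (a single scheduling unit).
    sched_req_id: str
    request: ToyDiffusionRequest


req = ToyDiffusionRequest(prompts=["a cat", "a dog"], request_ids=["api-1", "api-2"])
state = ToyRequestState(sched_req_id="sched-0", request=req)

# Per-prompt lookups go through request_ids; scheduling goes through sched_req_id.
per_prompt = dict(zip(state.request.request_ids, state.request.prompts))
```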

yenuo26 · 2026-03-10T01:29:01Z

+    )
+
+
+def test_single_request_success_lifecycle() -> None:


Please add level and platform markers; see https://github.com/vllm-project/vllm-omni/blob/dev/vllm-align/docs/contributing/ci/tests_markers.md

Thanks, I added the core_model and cpu markers.

Based on vllm-project#1625.
- Refactor scheduler/executor boundaries and request state flow
- Implement RequestScheduler and StepScheduler architecture
- Add _max_batch_size enforcement
- Handle output error in DiffusionEngine's dummy run
Signed-off-by: jader <yjader@foxmail.com>
Signed-off-by: asukaqaq <1311722138@qq.com>

Based on vllm-project#1625.
- Refactor scheduler/executor boundaries and request state flow
- Implement RequestScheduler and StepScheduler architecture
- Add _max_batch_size enforcement
- Handle output error in DiffusionEngine's dummy run
Signed-off-by: jader <yjader@foxmail.com>

wtomin · 2026-03-13T08:20:08Z

@yJader Please take a look at the CI error: tests/diffusion/test_diffusion_scheduler.py failed.

[2026-03-11T10:14:52Z] FAILED tests/diffusion/test_diffusion_scheduler.py::test_single_request_success_lifecycle - AttributeError: 'DiffusionRequestState' object has no attribute 'req_id'
--
[2026-03-11T10:14:52Z] FAILED tests/diffusion/test_diffusion_scheduler.py::test_fifo_single_request_scheduling - AttributeError: 'DiffusionRequestState' object has no attribute 'req_id'
[2026-03-11T10:14:52Z] FAILED tests/diffusion/test_diffusion_scheduler.py::test_abort_request_for_waiting_and_running - AttributeError: 'DiffusionRequestState' object has no attribute 'req_id'

yJader · 2026-03-14T07:04:12Z

For a0c043b

While working on the follow-up step scheduler implementation, I found that the current scheduler design does not support future feature development very well. The scheduling interface, shared state management, and request-mode policy were too tightly coupled, which made it difficult to extend the scheduler cleanly and reuse common logic.

Because of that, I split the scheduler into interface / base_scheduler / request_scheduler:

interface defines the scheduler lifecycle and output contract
base_scheduler owns the shared request state, waiting/running/finished queues, and common cleanup logic
request_scheduler keeps the current request-mode scheduling policy only
This makes it easier to continue with step scheduler development without disturbing the existing request-mode execution path.

I also updated the PR README with the corresponding design notes, test status, and benchmark results. The current test and benchmark results look normal, and I did not observe regressions from this refactor.

yJader · 2026-03-15T02:39:09Z

@yJader Please take a look at the CI error tests/diffusion/test_diffusion_scheduler.py failed.

[2026-03-11T10:14:52Z] FAILED tests/diffusion/test_diffusion_scheduler.py::test_single_request_success_lifecycle - AttributeError: 'DiffusionRequestState' object has no attribute 'req_id'
--
[2026-03-11T10:14:52Z] FAILED tests/diffusion/test_diffusion_scheduler.py::test_fifo_single_request_scheduling - AttributeError: 'DiffusionRequestState' object has no attribute 'req_id'
[2026-03-11T10:14:52Z] FAILED tests/diffusion/test_diffusion_scheduler.py::test_abort_request_for_waiting_and_running - AttributeError: 'DiffusionRequestState' object has no attribute 'req_id'

Fixed

yJader · 2026-03-15T03:13:56Z

buildkite/vllm-omni-amd-ci — Build #3163 failed (42 minutes, 47 seconds)

I checked this CI failure locally and could not reproduce it.

Local run of
pytest -s tests/e2e/offline_inference/test_diffusion_cpu_offload.py
passed with:

Offload peak memory: 18668.125 MB
No offload peak memory: 21388.125 MB
Observed reduction: 2720 MB

In CI, the failure log shows:

Offload peak memory: 14036.0 MB
No offload peak memory: 16434.0 MB
Observed reduction: 2398 MB

So the CI run still shows lower peak memory with CPU offload enabled, but it missed the current threshold by about 102 MB. My guess is that this may be caused by environment-dependent GPU memory noise / measurement variance in CI.
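
The numbers quoted above can be checked with a few lines of arithmetic. The threshold value below is only inferred from the "about 102 MB" remark in this comment; it is not read from the test source.

```python
def reduction_mb(no_offload_peak_mb, offload_peak_mb):
    """Peak-memory reduction from enabling CPU offload, in MB."""
    return no_offload_peak_mb - offload_peak_mb


local_reduction = reduction_mb(21388.125, 18668.125)  # local run
ci_reduction = reduction_mb(16434.0, 14036.0)          # failing CI run

# Inferred, not confirmed: CI missed the threshold by about 102 MB.
implied_threshold = ci_reduction + 102
shortfall = implied_threshold - ci_reduction
```

Under that assumption the local run clears the implied threshold with room to spare, while the CI run falls just short of it.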

yJader · 2026-03-18T13:22:54Z

I’ve rebased this branch onto the latest upstream main. The recent major changes (#1908) do not affect this PR.

lishunyang12

Re-reviewed after the latest refactor. Previous nit (dead timeout code) is addressed. Two new items from the interface split.

lishunyang12 · 2026-03-18T14:35:02Z

+                        raise RuntimeError("Diffusion scheduler has no runnable requests.")
+                    continue
+
+                sched_req_id = sched_output.scheduled_req_ids[0]


This assumes the scheduled request is always new (scheduled_new_reqs[0]). Currently safe because _max_batch_size=1 + _rpc_lock guarantees the first schedule for the target is a new request. But if batch size ever changes, this will IndexError on cached-only schedules. Worth a guard or at least a comment explaining the invariant.

Thanks for the suggestion. Making max_batch_size > 1 would require more changes across the current execution path to support real scheduler-side batching, so a clarifying comment is the right fix for now. I added that in afe0d34.
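
For reference, the guard alternative mentioned in the review could look roughly like the sketch below. ToySchedulerOutput and first_new_request are hypothetical stand-ins; only the field names scheduled_new_reqs, scheduled_cached_reqs, and finished_req_ids come from the PR description, and the PR itself documents the invariant with a comment rather than a check like this.

```python
from dataclasses import dataclass, field


@dataclass
class ToySchedulerOutput:
    scheduled_new_reqs: list = field(default_factory=list)
    scheduled_cached_reqs: list = field(default_factory=list)
    finished_req_ids: set = field(default_factory=set)


def first_new_request(sched_output):
    """Return the single newly scheduled request, or fail loudly.

    Assumed invariant: with a batch size of 1 and the engine-level lock held,
    each schedule contains exactly one new request and no cached ones.
    """
    if len(sched_output.scheduled_new_reqs) != 1 or sched_output.scheduled_cached_reqs:
        raise RuntimeError(
            "expected exactly one new request and no cached requests, got "
            f"{len(sched_output.scheduled_new_reqs)} new and "
            f"{len(sched_output.scheduled_cached_reqs)} cached"
        )
    return sched_output.scheduled_new_reqs[0]
```

A check of this kind turns a future IndexError on cached-only schedules into an explicit error that names the broken assumption.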

- add notes in scheduler
- align to vllm-project#1908, move "step_execution" into AsyncOmniEngine._create_default_diffusion_stage_cfg
- note: due to 6bdb55a, tests can't pass
Signed-off-by: jader <yjader@foxmail.com>

- Add notes to scheduler
- Align with vllm-project#1908; move "step_execution" into `AsyncOmniEngine._create_default_diffusion_stage_cfg`
- NOTE: Due to 6bdb55a, tests are currently failing and need to be fixed later
Signed-off-by: jader <yjader@foxmail.com>

…st State Flow

Refactor diffusion runtime boundaries to separate scheduler state management from multiprocess IPC execution.

Core goals:
- Make Scheduler a pure request-state scheduler (waiting/running/finished) without owning IPC queues.
- Make MultiprocDiffusionExecutor a pure IPC runtime (broadcast/result queues + worker lifecycle).
- Let DiffusionEngine explicitly drive add_request -> schedule -> execute -> update_from_output.
- Consolidate cross-API concurrency control into DiffusionEngine._rpc_lock.

Main code changes:
- scheduler.py: introduce request status/state output types and pure scheduling APIs; remove scheduler-side IPC ownership.
- diffusion_engine.py: engine owns scheduler and _rpc_lock; refactor add_req_and_wait_for_response to scheduler-driven flow.
- multiproc_executor.py: executor directly manages IPC queues and worker lifecycle; decouple from scheduler internals.
- tests: add diffusion scheduler tests; rename/refactor multiproc concurrency test to engine-focused variant.

Test plan:
- pytest -m diffusion tests/diffusion/test_diffusion_scheduler.py
- pytest -m diffusion tests/diffusion/test_multiproc_engine_concurrency.py

Signed-off-by: jader <yjader@foxmail.com>

david6666666 · 2026-03-23T02:31:34Z

Please fix CI. If your PR is ready, please mention @wtomin or @SamitHuang.

yJader · 2026-03-23T08:52:14Z

Please fix CI. If your PR is ready, please mention @wtomin or @SamitHuang.

Fixed. Please take a look when you have time. @wtomin @SamitHuang

wtomin

All of my comments were addressed. LGTM.

…st State Flow (vllm-project#1625) Signed-off-by: jader <yjader@foxmail.com> Signed-off-by: JiangJie Zhang <76905040+yJader@users.noreply.github.com> Co-authored-by: SYLAR <125541396+lishunyang12@users.noreply.github.com> Co-authored-by: Didan Deng <33117903+wtomin@users.noreply.github.com>

yJader requested a review from hsliuustc0106 as a code owner March 3, 2026 05:55

chatgpt-codex-connector Bot reviewed Mar 3, 2026

Comment thread vllm_omni/diffusion/diffusion_engine.py

hsliuustc0106 reviewed Mar 3, 2026

lishunyang12 reviewed Mar 4, 2026

Bounty-hunter mentioned this pull request Mar 5, 2026

[Feature]: Abort request when http disconnects JiusiServe/LM-service#75

Closed

1 task

lishunyang12 reviewed Mar 9, 2026

ZJY0516 reviewed Mar 9, 2026

ZJY0516 added the ready label (label to trigger buildkite CI) Mar 9, 2026

ZJY0516 requested review from SamitHuang and wtomin March 9, 2026 02:21

wtomin reviewed Mar 9, 2026

Comment thread tests/diffusion/test_diffusion_scheduler.py

yenuo26 reviewed Mar 10, 2026

asukaqaq-s mentioned this pull request Mar 10, 2026

[Feat] Support step-boundary abort in diffusion #1769

Merged

10 tasks

yJader force-pushed the pr/scheduler branch from a95fd7f to 03d7da4 Compare March 10, 2026 15:05

david6666666 mentioned this pull request Mar 11, 2026

[RFC]: Qwen-Image、Qwen-Image-Layered、Qwen-Image-Edit-Plus、Wan2.2 Production-grade Feature Monitoring JiusiServe/vllm-omni#167

Closed

26 tasks

wtomin mentioned this pull request Mar 12, 2026

[RFC]: Diffusion Models Features Supports Plan #814

Open

54 tasks

Gaohan123 added this to the v0.18.0 milestone Mar 14, 2026

yJader force-pushed the pr/scheduler branch from a0c043b to 72df530 Compare March 18, 2026 13:17

lishunyang12 reviewed Mar 18, 2026

yJader force-pushed the pr/scheduler branch from 472843d to afe0d34 Compare March 18, 2026 16:50

yJader and others added 12 commits March 22, 2026 08:20

[Bugfix] Handle output error in DiffusionEngine's dummy run

3c65099

Signed-off-by: jader <yjader@foxmail.com>

refactor: remove dead timeout handling from add_req

0d5c838

Signed-off-by: jader <yjader@foxmail.com>

test: update pytestmark to include core_model and cpu for diffusion t…

7b3fb6b

…ests Signed-off-by: jader <yjader@foxmail.com>

refactor: rename DiffusionRequestState.req_id to sched_req_id

eb76c3d

Signed-off-by: jader <yjader@foxmail.com>

refactor(diffusion): update design for future feature

b29219a

Signed-off-by: jader <yjader@foxmail.com>

fix: Add missing @abstractmethod

d8285ea

Co-authored-by: SYLAR <125541396+lishunyang12@users.noreply.github.com> Signed-off-by: JiangJie Zhang <76905040+yJader@users.noreply.github.com>

docs(diffusion/scheduler): update comments for clarity on request han…

f53ab95

…dling Signed-off-by: jader <yjader@foxmail.com>

feat: add request_ids to OmniDiffusionRequest in dummy run

2f25a9e

Signed-off-by: jader <yjader@foxmail.com>

fix(diffusion): restore audio warmup support

d0c0163

Signed-off-by: jader <yjader@foxmail.com>

Update tests/diffusion/test_diffusion_scheduler.py

8f42b63

Co-authored-by: Didan Deng <33117903+wtomin@users.noreply.github.com> Signed-off-by: JiangJie Zhang <76905040+yJader@users.noreply.github.com>

docs(diffusion/scheduler): update diffusion scheduler design document

324476b

Signed-off-by: jader <yjader@foxmail.com>

yJader force-pushed the pr/scheduler branch 2 times, most recently from 0d6e2f7 to b4ac994 Compare March 22, 2026 15:35

fix(diffusion): update request handling to support new request structure

df3ac75

Co-authored-by: asukaqaq-s <1311722138@qq.com> Signed-off-by: jader <yjader@foxmail.com>

yJader force-pushed the pr/scheduler branch from b4ac994 to df3ac75 Compare March 23, 2026 04:31

fix(diffusion/executor): improve shutdown signal handling and resourc…

b9be68a

…e cleanup Signed-off-by: jader <yjader@foxmail.com>

yJader force-pushed the pr/scheduler branch from 39de810 to b9be68a Compare March 23, 2026 07:30

fake0fan mentioned this pull request Mar 23, 2026

[Feature][RL] Support batching for QwenImage in async mode #1593

Merged

5 tasks

wtomin approved these changes Mar 23, 2026

wtomin merged commit 61cd532 into vllm-project:main Mar 23, 2026
7 of 8 checks passed

wtomin mentioned this pull request Mar 26, 2026

[RFC]: vLLM-Omni Diffusion Module — Q2 2026 Roadmap #2226

Open

25 tasks

zzhang-fr mentioned this pull request Mar 27, 2026

[RFC]: Pipeline Parallelism & Stream Batch for Real-Time Video Generation #2280

Open

16 tasks

asukaqaq-s mentioned this pull request May 7, 2026

[RFC]: Refactor engine/runner/pipeline to support step-wise and continuos batching #874

Open

5 tasks

yJader mentioned this pull request May 12, 2026

[RFC] [Refactor]: Unify diffusion request identity around request_id #3550

Open

1 task
