[Bugfix] Fix delayed decoding bug for Bagel AR/DIT workflow (L3 test_bagel_img2img error) by natureofnature · Pull Request #2422 · vllm-project/vllm-omni

natureofnature · 2026-04-01T13:43:16Z


Purpose

When stop_after_transfer is enabled (default), the AR scheduler previously continued decoding for 1–2 extra steps after the KV transfer trigger fired. The request was only stopped in a subsequent update_from_output call, after KV extraction completed. During those extra steps, the still-running parent request consumed scheduling budget, causing companion requests (e.g., cfg_text) to receive different chunked-prefill boundaries. This led to floating-point divergence in the KV cache and visibly degraded image quality in the DiT stage.
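The effect on chunk boundaries can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the vLLM-Omni scheduler: `chunk_boundaries`, `budget_per_step`, and `reserved_per_step` are hypothetical names, and the budget is modeled as a plain per-step token count from which the still-running parent request takes its share first.

```python
def chunk_boundaries(prompt_len, budget_per_step, reserved_per_step=()):
    """Toy model: cumulative prefill boundaries for a companion request.

    reserved_per_step[i] is the number of tokens another running request
    (here, the parent that has not been stopped yet) consumes at step i.
    Steps beyond the end of the sequence reserve nothing.
    """
    if budget_per_step <= 0:
        raise ValueError("budget_per_step must be positive")
    boundaries = []
    done = 0
    step = 0
    while done < prompt_len:
        reserved = reserved_per_step[step] if step < len(reserved_per_step) else 0
        available = max(budget_per_step - reserved, 0)
        chunk = min(available, prompt_len - done)
        if chunk > 0:
            done += chunk
            boundaries.append(done)
        step += 1
    return boundaries


# Parent stopped immediately: the companion gets the full budget every step.
clean = chunk_boundaries(prompt_len=10, budget_per_step=4)

# Parent decodes one extra token on each of two extra steps (the old behavior).
delayed = chunk_boundaries(prompt_len=10, budget_per_step=4, reserved_per_step=(1, 1))

print(clean)    # [4, 8, 10]
print(delayed)  # [3, 6, 10]
```

Both runs prefill the same 10 tokens, but the chunks end at different positions. In a real attention kernel, different chunk boundaries can change the order of floating-point reductions, which is the kind of divergence described above.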

With this PR, the following modes should be supported.

| | `prefill_finished` | `special_token` |
| --- | --- | --- |
| `stop_after_transfer: true` (default) | Supported. Stops decode immediately on trigger. `waiting_for_transfer_free` holds blocks until KV extraction completes. Orchestrator forwards to the next stage via the `finished` output path. | Supported. Computes `snapshot_len`, then immediately stops decode with the same `waiting_for_transfer_free` protection. Fully aligned with `prefill_finished` semantics. |
| `stop_after_transfer: false` | Supported. Continues decoding after trigger until natural termination (e.g. `max_tokens`). `kv_ready` signal is emitted once KV extraction completes (request still running), allowing the orchestrator to forward early. | Supported. Continues decoding after the special token is detected. KV extraction triggers a `kv_ready` signal for early forwarding; the request finishes naturally and resources are freed on completion. |

All four combinations are supported. The stop_after_transfer flag applies the same stop/continue semantics uniformly across both criteria types.
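The four combinations can be summarized as a small decision function. This is a hedged sketch of the semantics described above, not the code in `omni_ar_scheduler.py`: `Criteria`, `TriggerAction`, and `on_transfer_trigger` are hypothetical names introduced for illustration, and only the `finished` / `kv_ready` paths and the `waiting_for_transfer_free` hold come from the description.

```python
from dataclasses import dataclass
from enum import Enum


class Criteria(Enum):
    # The two trigger criteria named in the PR description.
    PREFILL_FINISHED = "prefill_finished"
    SPECIAL_TOKEN = "special_token"


@dataclass(frozen=True)
class TriggerAction:
    stop_decode_now: bool            # stop the AR request on the trigger step
    waiting_for_transfer_free: bool  # hold KV blocks until extraction completes
    forward_path: str                # "finished" output path or "kv_ready" signal


def on_transfer_trigger(criteria: Criteria, stop_after_transfer: bool = True) -> TriggerAction:
    """Hypothetical summary of what happens when the KV transfer trigger fires."""
    if stop_after_transfer:
        # Same behavior for both criteria: stop at once, keep blocks alive
        # until KV extraction is done, forward via the finished output.
        return TriggerAction(True, True, "finished")
    # Request keeps decoding until natural termination; the orchestrator
    # is told early through kv_ready once extraction completes.
    return TriggerAction(False, False, "kv_ready")


for criteria in Criteria:
    for flag in (True, False):
        print(criteria.value, flag, on_transfer_trigger(criteria, flag))
```

In this sketch the criteria type does not change the outcome, which mirrors the statement that the flag applies uniformly; the `snapshot_len` computation for `special_token` is deliberately left out.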

Test Plan

Using prompt “Let the woman wear a blue dress”
L3 test

VLLM_WORKER_MULTIPROC_METHOD=spawn VLLM_TEST_CLEAN_GPU_MEMORY=1 VLLM_IMAGE_FETCH_TIMEOUT=60 pytest -s -v tests/e2e/offline_inference/test_bagel_img2img.py -m "advanced_model" --run-level "advanced_model"

Test Result

Input	output


@princepride

immediate stop after special tokens are triggered if set stop_after Signed-off-by: natureofnature <wzliu@connect.hku.hk>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 980f1650e4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Signed-off-by: natureofnature <wzliu@connect.hku.hk>

natureofnature · 2026-04-01T14:35:24Z

@codex review

natureofnature · 2026-04-01T14:36:20Z

@amy-why-3459

princepride

LGTM

chatgpt-codex-connector · 2026-04-01T14:41:16Z

Codex Review: Didn't find any major issues. Nice work!



fix L3 Bagel test_bagel_img2img.py error

980f165


natureofnature requested a review from hsliuustc0106 as a code owner April 1, 2026 13:43

chatgpt-codex-connector Bot reviewed Apr 1, 2026


Comment thread vllm_omni/core/sched/omni_ar_scheduler.py

fix conflict of need kv transfer and criteria trigger

e576d53

Signed-off-by: natureofnature <wzliu@connect.hku.hk>

princepride enabled auto-merge (squash) April 1, 2026 14:39

princepride added the ready label to trigger buildkite CI label Apr 1, 2026

princepride approved these changes Apr 1, 2026


princepride merged commit bbae904 into vllm-project:main Apr 1, 2026
7 of 8 checks passed

yenuo26 mentioned this pull request Apr 2, 2026

[CI Failure]: tests/e2e/offline_inference/test_bagel_img2img.py::test_bagel_img2img_shared_memory_connector - AssertionError: Pixel mismatch at (100, 100): expected (157, 172, 217), got (139, 155, 185) #2416

Closed

1 task

vraiti pushed a commit to vraiti/vllm-omni that referenced this pull request Apr 9, 2026

[Bugfix] Fix delayed decoding bug for Bagel AR/DIT workflow (L3 test_…

5be4352

…bagel_img2img error) (vllm-project#2422) Signed-off-by: natureofnature <wzliu@connect.hku.hk>

princepride merged 2 commits into vllm-project:main from natureofnature:bugfix/bagel/kv_transfer_opt
