[CI] Update Bagel Pixels by alex-jw-brooks · Pull Request #4081 · vllm-project/vllm-omni

alex-jw-brooks · 2026-06-02T20:35:56Z

Purpose

Re-enables the Bagel tests that were failing in CI due to incorrect handling in batched CFG. The Lance PR fixed the correctness of the CFG output, but it also introduced two changes that alter the output, so we need to update the reference pixels.

Initialization changes, i.e., _regen_init_noise_on_device was added to the pipeline. This is the main reason the output changes a lot.
Correction in the number of timesteps

        timesteps = torch.linspace(1, 0, num_timesteps, device=x_t.device)

was changed to add one more timestep

        timesteps = torch.linspace(1, 0, num_timesteps + 1, device=x_t.device)
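To make the effect of the extra point concrete, here is a minimal pure-Python sketch. The `linspace` helper is a hypothetical stand-in for `torch.linspace`, the value of `num_timesteps` is arbitrary, and the sketch assumes the sampler drops the final point (`timesteps[:-1]`) before stepping, as the upstream Bagel snippet quoted later in this thread does.

```python
def linspace(start: float, stop: float, n: int) -> list[float]:
    """Pure-Python stand-in for torch.linspace (illustration only)."""
    if n == 1:
        return [start]
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]


def denoising_steps(num_points: int) -> int:
    """Number of steps actually taken once the final point is dropped."""
    timesteps = linspace(1.0, 0.0, num_points)
    return len(timesteps[:-1])


num_timesteps = 50  # arbitrary value chosen for this example

steps_before = denoising_steps(num_timesteps)      # old code: one step short
steps_after = denoising_steps(num_timesteps + 1)   # new code: exactly num_timesteps
print(steps_before, steps_after)
```

Under these assumptions the old call yields one step fewer than requested, which is why the same `num_timesteps` produces slightly different pixels before and after the fix.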

As a result, the reference image on CUDA seems to have changed from the left one to the right one:

For img2img, the difference looks less dramatic, but there are changes as well. You can run the first commit in this PR (which reverts the fixes from Lance) to see the tests pass with the old values as a confidence check.

@Gaohan123 @lishunyang12 @zhangj1an can you please take a look? I will open a separate PR to add the batched CFG path back, but I think it's better to do this in separate PRs since the current behavior is actually correct, and generate_image is pretty messy.

Signed-off-by: Alex Brooks <albrooks@redhat.com>

alex-jw-brooks · 2026-06-02T20:36:55Z

 ]

-if current_omni_platform.is_rocm():
-    REFERENCE_PIXELS = [


Not sure why the ROCm pixels are redefined here, but since they were exactly the same, I deleted them. Unfortunately I don't have a ROCm device to test the new values on 😅

hsliuustc0106 · 2026-06-03T00:33:09Z

cc @princepride @natureofnature

princepride · 2026-06-03T07:51:49Z

Interesting, have you compared the result with the original code?

alex-jw-brooks · 2026-06-03T08:57:40Z

@princepride yup! The PR is split into two commits so it's easier to compare, since I wanted to make sure the timestep fix and initialization were the only reasons the pixels changed. The first one (e840a24c4df9f57fbeeb6a73d7c6b895f0e23d1a) reverts the Lance fixes to show that things pass with the old values. To verify, I ran the shared memory connector tests.

# on e840a24c4df9f57fbeeb6a73d7c6b895f0e23d1a
pytest tests/distributed/omni_connectors/test_bagel_shared_memory_connector.py -v -s --run-level advanced_model -s

========================== 2 passed, 21 warnings in 226.36s (0:03:46) ==========================

The second commit brings the Lance fixes back and updates the pixel values for tti/i2i to match what the tests currently produce, so it passes on CUDA with the new values.

Gaohan123 · 2026-06-03T09:50:50Z

May I ask what the Lance fixes are?

princepride · 2026-06-03T10:12:43Z

Apologies for the delayed response as I've been quite busy lately.😂 I took a closer look, and it seems a previous code change caused the pixel values to change. Why are we modifying the pixel values in this PR instead of just reverting that previous change directly? I checked the original code in the bagel repository, and the timesteps calculation is exactly the same as before.

zhangj1an · 2026-06-03T14:22:07Z

I think previously bagel was using batched CFG, then the lance PR switched bagel to use sequential CFG (because the lance model re-used bagel as part of its model structure). Alex will bring back batched CFG in #4098.

I will finish reviewing #4098 and this PR tomorrow, and also check whether it is possible to avoid changing the reference image pixels. In my attempt, I generated a cat image similar to the first image in the PR description, as shown below, so maybe it is possible (I'm not sure yet).

main branch	my personal

alex-jw-brooks · 2026-06-03T17:30:43Z

Hi @zhangj1an, thanks! That is actually a stale output. On the current main, you should get a good output since it's calling CFG sequentially, it's just not the same one. Here is a repro script:

from vllm_omni.entrypoints.omni import Omni

if __name__ == "__main__":
    omni = Omni(
        model="ByteDance-Seed/BAGEL-7B-MoT",
        enforce_eager=True,
    )

    formatted_prompt = {
        # Plain string: there are no placeholders, so no f-string is needed.
        "prompt": "<|im_start|>A cute cat<|im_end|>",
        "modalities": ["image"],
    }

    omni_outputs = list(omni.generate(prompts=[formatted_prompt], sampling_params_list=omni.default_sampling_params_list))
    omni_outputs[1].images[0].save("output_main.png")

You should get something like this:

The main reason for the large change in output is that the Lance PR added this change, so it's now regenerating the packed_init_noises on the device. If you comment out the call to this, you'll get something very similar to the expected result.

output_e840a24c4df9f57fbeeb6a73d7c6b895f0e23d1a_no_regen

The result may be off by a little though, because the Lance PR also fixed an off-by-one error in the timesteps, i.e., from the original Bagel code here:

        timesteps = torch.linspace(1, 0, num_timesteps, device=x_t.device)
        timesteps = timestep_shift * timesteps / (1 + (timestep_shift - 1) * timesteps)
        dts =  timesteps[:-1] - timesteps[1:]
        timesteps = timesteps[:-1] # will have num_timesteps - 1  elements

Lance has an identical timestep creation with the +1 fix here, so Bagel was fixed as well while porting Lance to Omni, which also shifts the values a bit.
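The four-line schedule above can be traced by hand with a small pure-Python sketch. It uses the `+ 1` variant; `linspace` is a hypothetical stand-in for `torch.linspace`, and the values of `num_timesteps` and `timestep_shift` are arbitrary choices for illustration, not the defaults used by Bagel or Lance.

```python
def linspace(start: float, stop: float, n: int) -> list[float]:
    """Pure-Python stand-in for torch.linspace (illustration only)."""
    if n == 1:
        return [start]
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]


def shift_timestep(t: float, timestep_shift: float) -> float:
    """Apply t' = s * t / (1 + (s - 1) * t), as in the snippet above."""
    return timestep_shift * t / (1 + (timestep_shift - 1) * t)


num_timesteps = 4      # arbitrary, small enough to read by eye
timestep_shift = 3.0   # arbitrary; values above 1 push interior points toward 1

raw = linspace(1.0, 0.0, num_timesteps + 1)
shifted = [shift_timestep(t, timestep_shift) for t in raw]

# Step sizes between consecutive points, then drop the final point (t = 0).
dts = [a - b for a, b in zip(shifted[:-1], shifted[1:])]
timesteps = shifted[:-1]

print(timesteps)
print(dts)
```

The shift leaves the endpoints 1 and 0 fixed, so the step sizes still sum to 1, and with the extra point the loop takes exactly `num_timesteps` steps.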

I assume the changed values will be bad for ROCm though, since it's now device-specific noise initialization. So I guess either:

We revert the call to _regen_init_noise_on_device for now so that the noise matches, and make a small adjustment to the pixel values if needed (may fail due to extra timestep, but will be on the edge of the tolerance)
or
We use the new device specific values, and only run it on CUDA for now. I can also see if I can find an AMD GPU to test with to get ground truth values

@Gaohan123 @princepride any preference?

princepride · 2026-06-04T01:38:01Z

This image should be the expected output.

alex-jw-brooks · 2026-06-04T06:26:20Z

@princepride we can match the current image, but we need to disable _regen_init_noise_on_device in Bagel to make sure the latents are the same, which feels a bit strange to me since it's just a different latent rather than a bug. Although then we can run it on CUDA and AMD at least 🤞

For now, I've commented the latent regeneration out and set num_inference_steps to 14 to account for the off by one fix in Lance, so it should pass now. FYI @lishunyang12
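A short sketch of why lowering the step count by one compensates for the fix. It assumes the reference pixels were produced by the old code with one more requested step than the new setting; the thread does not state the previous value, so that is an assumption. The `linspace` helper is a hypothetical stand-in for `torch.linspace`, and the timestep shift is omitted because it is applied pointwise and does not affect the comparison.

```python
def linspace(start: float, stop: float, n: int) -> list[float]:
    """Pure-Python stand-in for torch.linspace (illustration only)."""
    if n == 1:
        return [start]
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]


def old_schedule(num_timesteps: int) -> list[float]:
    """Pre-fix behaviour: num_timesteps points, final point dropped."""
    return linspace(1.0, 0.0, num_timesteps)[:-1]


def new_schedule(num_timesteps: int) -> list[float]:
    """Post-fix behaviour: num_timesteps + 1 points, final point dropped."""
    return linspace(1.0, 0.0, num_timesteps + 1)[:-1]


new_steps = 14  # value from this comment
schedules_match = old_schedule(new_steps + 1) == new_schedule(new_steps)
print(schedules_match, len(new_schedule(new_steps)))
```

Both calls build the same grid, so the fixed code with 14 steps walks the same timesteps as the old code asked for one more.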

zhangj1an · 2026-06-04T08:17:17Z

LGTM, good to merge.

The +1 timestep fix from the Lance PR is correct and still there (bagel_transformer.py:1700, linspace(1, 0, num_timesteps + 1)).
Bagel now uses the original seeded CPU/fp32 noise, which stays the same as before and works on both CUDA and AMD/ROCm. Lance uses its own _regen_init_noise_on_device to sample init noise on-device (CUDA + bf16), which matches its upstream Lance repo.
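Why seeded host-side noise keeps the reference pixels portable can be sketched without a GPU. This is a toy model only: `random.Random` stands in for a framework RNG, and keying the seed by backend name is a hypothetical way to model the fact that different devices can produce different random streams for the same seed. None of the names below come from vllm-omni.

```python
import random


def host_noise(seed: int, n: int) -> list[float]:
    """Noise drawn once on the host; every backend receives the same values."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def on_device_noise(seed: int, n: int, backend: str) -> list[float]:
    """Toy model of on-device sampling: the stream depends on the backend."""
    rng = random.Random(f"{backend}:{seed}")  # hypothetical backend-specific stream
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


seed, n = 42, 4

# Host-seeded latents are identical no matter where they are later copied.
same_on_host = host_noise(seed, n) == host_noise(seed, n)

# Backend-specific streams disagree even though the seed is the same.
same_on_device = (
    on_device_noise(seed, n, "cuda") == on_device_noise(seed, n, "rocm")
)
print(same_on_host, same_on_device)
```

In this model, one set of reference pixels can only cover both platforms when the initial latents come from the shared host-side stream.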

alex-jw-brooks added 2 commits June 2, 2026 20:07

disable lance changes (passing bagel tests)

e840a24

Signed-off-by: Alex Brooks <albrooks@redhat.com>

update ref pixels, bring lance changes back

0c43b71

Signed-off-by: Alex Brooks <albrooks@redhat.com>

alex-jw-brooks requested a review from yenuo26 as a code owner June 2, 2026 20:35

alex-jw-brooks changed the title ~~[CR] Update Bagel Pixels~~ [CI] Update Bagel Pixels Jun 2, 2026

Gaohan123 added this to the v0.22.0 milestone Jun 3, 2026

Gaohan123 added the high priority high priority issue, needs to be done asap label Jun 3, 2026

alex-jw-brooks mentioned this pull request Jun 3, 2026

[Perf/Fix] Reimplement Batched CFG Forward for Bagel #4098

Open

princepride approved these changes Jun 3, 2026

princepride added ready label to trigger buildkite CI and removed ready label to trigger buildkite CI labels Jun 3, 2026

cuda only for bagel tests

2d011b0

Signed-off-by: Alex Brooks <albrooks@redhat.com>

alex-jw-brooks added 2 commits June 4, 2026 05:23

re-enable back on mi325

f3f3058

Signed-off-by: Alex Brooks <albrooks@redhat.com>

revert lance on device latent changes, use 14 steps for off by one fix

5b3a9ea

Signed-off-by: Alex Brooks <albrooks@redhat.com>

alex-jw-brooks requested review from Isotr0py, RuixiangMa, SamitHuang, ZJY0516, david6666666 and wtomin as code owners June 4, 2026 06:15

comment fix

8865f50

Signed-off-by: Alex Brooks <albrooks@redhat.com>

Gaohan123 added the ready label to trigger buildkite CI label Jun 4, 2026

hsliuustc0106 added merge-test label to trigger buildkite merge test CI and removed ready label to trigger buildkite CI labels Jun 4, 2026

Merge branch 'main' into enable_bagel_test

c09ddb7

Gaohan123 enabled auto-merge (squash) June 4, 2026 09:56

Gaohan123 merged commit e10aca3 into vllm-project:main Jun 4, 2026
6 checks passed

86MaxCao pushed a commit to 86MaxCao/vllm-omni that referenced this pull request Jun 4, 2026

[CI] Update Bagel Pixels (vllm-project#4081)

d9a0275

Signed-off-by: Alex Brooks <albrooks@redhat.com>

Gaohan123 merged 7 commits into vllm-project:main from alex-jw-brooks:enable_bagel_test
