[CI][BugFix] Fix and Validate FP8 Z-Image quality gate by david6666666 · Pull Request #3929 · vllm-project/vllm-omni

david6666666 · 2026-05-28T07:03:11Z

Summary

Unskips test_quantization_quality[fp8_z_image] and updates the Z-Image FP8 quality prompt to the requested long floating-archipelago prompt.

Following PR #3279's text-encoder FP8 path, this uses regular online FP8 quantization. The text encoder is FP8 for the early/mid blocks, with only the final 8 text-encoder blocks listed in ignored_layers for BF16 fallback. The Z-Image transformer also keeps the final 15 main attention/FFN blocks in BF16 through ignored_layers.

The final PR diff is intentionally small and only changes tests/diffusion/quantization/test_quantization_quality.py. The visual artifacts are linked below and are not part of the final file diff.

E2E Result

Command:

CUDA_VISIBLE_DEVICES=0 \
VLLM_OMNI_QUALITY_OUTPUT_DIR=$PWD/tests/diffusion/quantization/artifacts/fp8_z_image_quality \
.venv/bin/python -m pytest \
  tests/diffusion/quantization/test_quantization_quality.py::test_quantization_quality[fp8_z_image] \
  -s -v -m ""

Result: passed with max_lpips=0.15.

Metric	Value
LPIPS	0.128773
PSNR	20.125577 dB
MAE	0.047712

I also tested the simpler "text encoder fully FP8 + transformer final 15 fallback" variant. It failed the requested 0.15 gate with LPIPS 0.199243, so the final config keeps text encoder layers 28-35 as BF16 fallback.
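For readers less familiar with the pixel-level metrics in the table, here is a minimal sketch of how MAE and PSNR are conventionally computed, and how a threshold such as max_lpips=0.15 separates the two variants reported above. It assumes pixel values normalized to [0, 1] and flat sequences of floats. The helper names (mae, psnr, passes_gate) are hypothetical and are not taken from test_quantization_quality.py, whose implementation may differ. Whether the real gate uses `<=` or `<` is also an assumption. LPIPS itself needs a learned network (the lpips package) and is not reimplemented here.

```python
import math


def mae(ref, test):
    """Mean absolute error between two equal-length pixel sequences."""
    if len(ref) != len(test) or not ref:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(r - t) for r, t in zip(ref, test)) / len(ref)


def psnr(ref, test, max_value=1.0):
    """Peak signal-to-noise ratio in dB; max_value=1.0 assumes [0, 1] pixels."""
    if len(ref) != len(test) or not ref:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def passes_gate(lpips_value, max_lpips=0.15):
    """Hypothetical gate: lower LPIPS is better, so pass at or below the limit."""
    return lpips_value <= max_lpips


if __name__ == "__main__":
    # Toy pixel data for illustration only, not from the PR's images.
    baseline = [0.0, 0.5, 1.0, 0.25]
    quantized = [0.1, 0.5, 0.9, 0.25]
    print("MAE:", mae(baseline, quantized))
    print("PSNR (dB):", psnr(baseline, quantized))
    # LPIPS values reported in this PR description.
    print("final config passes:", passes_gate(0.128773))
    print("fully-FP8 text encoder passes:", passes_gate(0.199243))
```

Under these assumptions, the final config (LPIPS 0.128773) is below the 0.15 limit and the fully-FP8 text encoder variant (LPIPS 0.199243) is above it, matching the pass and fail outcomes described above.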

Quantized Config

{
    "method": "fp8",
    "ignored_layers": [
        "img_mlp",
        "layers.15..29.{attention.to_qkv,attention.to_out.0,feed_forward.w13,feed_forward.w2}",
        "model.layers.28..35.{self_attn.q_proj,self_attn.k_proj,self_attn.v_proj,self_attn.o_proj,mlp.gate_proj,mlp.up_proj,mlp.down_proj}",
    ],
}

Note: img_mlp appears in Qwen-Image but not in Z-Image, so it is included for parity with PR #1034 but does not match a Z-Image layer by itself.
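To make the compact entries in ignored_layers easier to read, the sketch below expands them into explicit layer names. It assumes that `M..N` denotes an inclusive index range and that `{a,b}` denotes alternatives; this reading is inferred from the description ("final 15" blocks for 15..29, layers 28-35 for the text encoder). The function expand_pattern is hypothetical and is not the parser vllm-omni uses, which may behave differently.

```python
import itertools
import re

# Matches either an inclusive range "M..N" or a brace group "{a,b,c}".
_TOKEN = re.compile(r"(\d+)\.\.(\d+)|\{([^{}]*)\}")


def expand_pattern(pattern):
    """Expand 'prefix.M..N.{x,y}' into a list of explicit dotted names."""
    parts = []
    pos = 0
    for match in _TOKEN.finditer(pattern):
        parts.append([pattern[pos:match.start()]])  # literal text before token
        if match.group(3) is not None:
            parts.append(match.group(3).split(","))  # brace alternatives
        else:
            lo, hi = int(match.group(1)), int(match.group(2))
            parts.append([str(i) for i in range(lo, hi + 1)])  # inclusive
        pos = match.end()
    parts.append([pattern[pos:]])  # trailing literal text
    return ["".join(combo) for combo in itertools.product(*parts)]


if __name__ == "__main__":
    transformer = expand_pattern(
        "layers.15..29.{attention.to_qkv,attention.to_out.0,"
        "feed_forward.w13,feed_forward.w2}"
    )
    text_encoder = expand_pattern(
        "model.layers.28..35.{self_attn.q_proj,self_attn.k_proj,"
        "self_attn.v_proj,self_attn.o_proj,mlp.gate_proj,mlp.up_proj,"
        "mlp.down_proj}"
    )
    print(len(transformer), transformer[0], transformer[-1])
    print(len(text_encoder), text_encoder[0], text_encoder[-1])
```

Under this reading, the transformer entry covers 15 blocks with 4 modules each (60 names) and the text-encoder entry covers 8 blocks with 7 modules each (56 names); a plain entry such as img_mlp expands to itself.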

Visual Comparison

BF16 baseline:

FP8 quantized:

Validation

.venv/bin/python -m pytest tests/diffusion/quantization/test_fp8_config.py -q

test_fp8_config.py: 28 passed.

The local E2E environment used vllm==0.21.0, torch==2.11.0+cu130, lpips==0.1.4, and a single NVIDIA B300 GPU via CUDA_VISIBLE_DEVICES=0.

Signed-off-by: WeiQing Chen <david6666666@users.noreply.github.com>

congw729

Maybe upload these images to AWS S3, or move them to tests/assets with the other images/wavs?

david6666666 · 2026-05-28T09:10:07Z

@yuanheng-zhao @RuixiangMa ptal thx

Gaohan123

LGTM. Thanks

yuanheng-zhao

LGTM. Btw, are the final 15 attention/FFN blocks ignored because of their weighted influence on the generated output diff?

david6666666 · 2026-05-28T11:18:35Z

> LGTM. Btw, are the final 15 attention/FFN blocks ignored because of their weighted influence on the generated output diff?

Yes

Unskip FP8 Z-Image quality case

65784f8

Signed-off-by: WeiQing Chen <david6666666@users.noreply.github.com>

congw729 reviewed May 28, 2026

david6666666 added 2 commits May 28, 2026 07:15

Add ignored layers to FP8 Z-Image quality case

f035107

Signed-off-by: WeiQing Chen <david6666666@users.noreply.github.com>

Fix FP8 Z-Image quality gate

eb0c4b1

Signed-off-by: WeiQing Chen <david6666666@users.noreply.github.com>

david6666666 changed the title ~~[codex] Unskip FP8 Z-Image quality case~~ [codex] Fix FP8 Z-Image quality gate May 28, 2026

Use text encoder FP8 for Z-Image quality gate

8593781

Signed-off-by: WeiQing Chen <david6666666@users.noreply.github.com>

david6666666 changed the title ~~[codex] Fix FP8 Z-Image quality gate~~ [codex] Validate FP8 Z-Image quality gate May 28, 2026

Remove FP8 Z-Image quality artifacts

414d14f

Signed-off-by: WeiQing Chen <david6666666@users.noreply.github.com>

david6666666 marked this pull request as ready for review May 28, 2026 09:08

david6666666 requested a review from yenuo26 as a code owner May 28, 2026 09:08

david6666666 added the ready label (label to trigger buildkite CI) May 28, 2026

david6666666 changed the title ~~[codex] Validate FP8 Z-Image quality gate~~ [CI][BugFix] Validate FP8 Z-Image quality gate May 28, 2026

david6666666 changed the title ~~[CI][BugFix] Validate FP8 Z-Image quality gate~~ [CI][BugFix] Fix and Validate FP8 Z-Image quality gate May 28, 2026

Document PR media attachment rule

2b4a765

Signed-off-by: WeiQing Chen <david6666666@users.noreply.github.com>

david6666666 requested review from Gaohan123, hsliuustc0106 and ywang96 as code owners May 28, 2026 09:13

Remove repo-local agent notes

5ab2932

Signed-off-by: WeiQing Chen <david6666666@users.noreply.github.com>

Gaohan123 added this to the v0.22.0 milestone May 28, 2026

Gaohan123 approved these changes May 28, 2026

Gaohan123 enabled auto-merge (squash) May 28, 2026 09:34

yuanheng-zhao approved these changes May 28, 2026

Merge branch 'main' into codex/unskip-fp8-z-image-quality

4bbe813

Gaohan123 merged commit 1acd27c into main May 28, 2026
9 checks passed

Gaohan123 merged 8 commits into main from codex/unskip-fp8-z-image-quality

4 participants
