
[CI][BugFix] Fix and Validate FP8 Z-Image quality gate #3929

Merged
Gaohan123 merged 8 commits into main from codex/unskip-fp8-z-image-quality
May 28, 2026

Conversation

@david6666666
Collaborator

@david6666666 david6666666 commented May 28, 2026

Summary

Unskips test_quantization_quality[fp8_z_image] and updates the Z-Image FP8 quality prompt to the requested long floating-archipelago prompt.

Following PR #3279's text-encoder FP8 path, this uses regular online FP8 quantization. The text encoder is FP8 for the early/mid blocks, with only the final 8 text-encoder blocks listed in ignored_layers for BF16 fallback. The Z-Image transformer also keeps the final 15 main attention/FFN blocks in BF16 through ignored_layers.

The final PR diff is intentionally small and only changes tests/diffusion/quantization/test_quantization_quality.py. The visual artifacts are linked below and are not part of the final file diff.

E2E Result

Command:

CUDA_VISIBLE_DEVICES=0 \
VLLM_OMNI_QUALITY_OUTPUT_DIR=$PWD/tests/diffusion/quantization/artifacts/fp8_z_image_quality \
.venv/bin/python -m pytest \
  tests/diffusion/quantization/test_quantization_quality.py::test_quantization_quality[fp8_z_image] \
  -s -v -m ""

Result: passed with max_lpips=0.15.

| Metric | Value |
| --- | --- |
| LPIPS | 0.128773 |
| PSNR | 20.125577 dB |
| MAE | 0.047712 |
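
For readers unfamiliar with the two simpler metrics in the table, here is a minimal sketch of how MAE and PSNR are conventionally computed between a baseline and a quantized image. It is not taken from `test_quantization_quality.py`: the function names, the flat-list pixel representation, and the assumed `[0, 1]` value range are illustrative assumptions, and LPIPS is omitted because it needs a pretrained network.

```python
import math


def mae(ref, test):
    """Mean absolute error between two equal-length pixel sequences."""
    if len(ref) != len(test) or not ref:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(r - t) for r, t in zip(ref, test)) / len(ref)


def psnr(ref, test, max_value=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs.

    max_value is the assumed peak pixel value (1.0 for normalized images).
    """
    if len(ref) != len(test) or not ref:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy pixel values only; these are not the images from this PR.
baseline = [0.0, 0.5, 1.0, 0.25]
quantized = [0.1, 0.5, 0.9, 0.25]

print(f"MAE  = {mae(baseline, quantized):.6f}")
print(f"PSNR = {psnr(baseline, quantized):.6f} dB")
```

Lower MAE and LPIPS mean the quantized output is closer to the baseline, while a higher PSNR means the same.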

I also tested the simpler "text encoder fully FP8 + transformer final 15 fallback" variant. It failed the requested 0.15 gate with LPIPS 0.199243, so the final config keeps text encoder layers 28-35 as BF16 fallback.
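
The comparison above can be sketched as a simple threshold check. The two LPIPS values and the 0.15 limit come from this description; the helper name `passes_gate` and the use of `<=` (rather than `<`) are assumptions, since the actual assertion in the test file is not shown here.

```python
MAX_LPIPS = 0.15  # gate value reported in this PR

# LPIPS values reported in this PR description.
reported_lpips = {
    "text encoder layers 28-35 kept in BF16 (final config)": 0.128773,
    "text encoder fully FP8": 0.199243,
}


def passes_gate(lpips_value, max_lpips=MAX_LPIPS):
    """Return True when the LPIPS distance is within the gate.

    The inclusive comparison is an assumption for illustration.
    """
    return lpips_value <= max_lpips


for name, value in reported_lpips.items():
    status = "PASS" if passes_gate(value) else "FAIL"
    print(f"{status}: {name} (LPIPS={value})")
```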

Quantized Config

{
    "method": "fp8",
    "ignored_layers": [
        "img_mlp",
        "layers.15..29.{attention.to_qkv,attention.to_out.0,feed_forward.w13,feed_forward.w2}",
        "model.layers.28..35.{self_attn.q_proj,self_attn.k_proj,self_attn.v_proj,self_attn.o_proj,mlp.gate_proj,mlp.up_proj,mlp.down_proj}",
    ],
}

Note: img_mlp appears in Qwen-Image but not in Z-Image, so it is included for parity with PR #1034 but does not match a Z-Image layer by itself.
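
To make the compact `ignored_layers` entries easier to read, here is a hedged sketch that expands them into explicit layer names. It assumes `A..B` means an inclusive integer range and `{x,y}` means a set of alternatives, which matches the "final 15" and "layers 28-35" wording above. The helper `expand_pattern` is hypothetical and is not the matching code used by the quantization config, which may match names differently.

```python
import re
from itertools import product

_RANGE = re.compile(r"(\d+)\.\.(\d+)")
_BRACES = re.compile(r"\{([^{}]*)\}")


def expand_pattern(pattern):
    """Expand 'A..B' inclusive ranges and '{x,y}' alternatives into names."""
    # Split into literal text and expandable tokens, keeping the tokens.
    tokens = re.split(r"(\d+\.\.\d+|\{[^{}]*\})", pattern)
    choices = []
    for token in tokens:
        range_match = _RANGE.fullmatch(token)
        brace_match = _BRACES.fullmatch(token)
        if range_match:
            start, end = map(int, range_match.groups())
            choices.append([str(i) for i in range(start, end + 1)])
        elif brace_match:
            choices.append(brace_match.group(1).split(","))
        else:
            choices.append([token])
    return ["".join(parts) for parts in product(*choices)]


transformer = expand_pattern(
    "layers.15..29.{attention.to_qkv,attention.to_out.0,"
    "feed_forward.w13,feed_forward.w2}"
)
text_encoder = expand_pattern(
    "model.layers.28..35.{self_attn.q_proj,self_attn.k_proj,"
    "self_attn.v_proj,self_attn.o_proj,mlp.gate_proj,mlp.up_proj,"
    "mlp.down_proj}"
)

print(len(transformer))   # 15 blocks x 4 projections
print(len(text_encoder))  # 8 blocks x 7 projections
print(transformer[0], text_encoder[-1])
```

Under this reading, the transformer entry covers blocks 15 through 29 and the text-encoder entry covers blocks 28 through 35, which is consistent with the "final 15" and "final 8" descriptions in the summary.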

Visual Comparison

BF16 baseline vs FP8 quantized

BF16 baseline: [image: BF16 baseline]

FP8 quantized: [image: FP8 quantized]

Validation

.venv/bin/python -m pytest tests/diffusion/quantization/test_fp8_config.py -q

test_fp8_config.py: 28 passed.

The local E2E environment used vllm==0.21.0, torch==2.11.0+cu130, lpips==0.1.4, and a single NVIDIA B300 GPU via CUDA_VISIBLE_DEVICES=0.

Signed-off-by: WeiQing Chen <david6666666@users.noreply.github.com>
Collaborator

@congw729 congw729 left a comment


Maybe upload these images to AWS S3, or move them to tests/assets with the other images/wavs?

@david6666666 david6666666 changed the title [codex] Unskip FP8 Z-Image quality case [codex] Fix FP8 Z-Image quality gate May 28, 2026
@david6666666 david6666666 changed the title [codex] Fix FP8 Z-Image quality gate [codex] Validate FP8 Z-Image quality gate May 28, 2026

@david6666666 david6666666 added the ready label (label to trigger buildkite CI) May 28, 2026
@david6666666 david6666666 changed the title [codex] Validate FP8 Z-Image quality gate [CI][BugFix] Validate FP8 Z-Image quality gate May 28, 2026
@david6666666 david6666666 changed the title [CI][BugFix] Validate FP8 Z-Image quality gate [CI][BugFix] Fix and Validate FP8 Z-Image quality gate May 28, 2026
@david6666666
Collaborator Author

@yuanheng-zhao @RuixiangMa ptal thx

@Gaohan123 Gaohan123 added this to the v0.22.0 milestone May 28, 2026
Collaborator

@Gaohan123 Gaohan123 left a comment


LGTM. Thanks

@Gaohan123 Gaohan123 enabled auto-merge (squash) May 28, 2026 09:34
Collaborator

@yuanheng-zhao yuanheng-zhao left a comment


LGTM. Btw, are the final 15 attention/FFN blocks ignored because of their weighted influence on the generated output diff?

@david6666666
Collaborator Author

LGTM. Btw, are the final 15 attention/FFN blocks ignored because of their weighted influence on the generated output diff?

Yes

@Gaohan123 Gaohan123 merged commit 1acd27c into main May 28, 2026
9 checks passed
tzhouam pushed a commit that referenced this pull request May 29, 2026
Signed-off-by: WeiQing Chen <david6666666@users.noreply.github.com>
Co-authored-by: WeiQing Chen <david6666666@users.noreply.github.com>

Labels

ready (label to trigger buildkite CI)


4 participants