[Quant] 2-GPU ModelOpt FP8 DiT serve configs for Flux / Flux2-Klein / Qwen-Image / Z-Image #4207

Draft
david6666666 wants to merge 1 commit into vllm-project:main from david6666666:modelopt-fp8-image-stage-configs

@david6666666
Collaborator

What

DiT-only ModelOpt FP8 serve configs (TP=2, 2-GPU) for the image-gen models whose FP8 checkpoint adapter landed in #2913:

  • flux_dit_2gpu_fp8.yaml (FluxPipeline)
  • flux2_klein_dit_2gpu_fp8.yaml (Flux2KleinPipeline)
  • qwen_image_dit_2gpu_fp8.yaml (QwenImagePipeline)
  • z_image_dit_2gpu_fp8.yaml (ZImagePipeline)

Split out of the video-gen calibration PR (#3305) because these configs are image-gen scope. Each model_class_name was checked against the diffusion registry on main. Configs only; no code changes.
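
For reference, the registry check described above can be sketched roughly as follows. This is a minimal illustration, not the script used for this PR: the file-to-class mapping is copied from the list above, while `find_unregistered` and the `registered` set are hypothetical stand-ins for however the real diffusion registry is queried.

```python
# Mapping of stage-config file -> pipeline class, copied from the PR description.
EXPECTED = {
    "flux_dit_2gpu_fp8.yaml": "FluxPipeline",
    "flux2_klein_dit_2gpu_fp8.yaml": "Flux2KleinPipeline",
    "qwen_image_dit_2gpu_fp8.yaml": "QwenImagePipeline",
    "z_image_dit_2gpu_fp8.yaml": "ZImagePipeline",
}


def find_unregistered(expected, registered):
    """Return the config files whose pipeline class is absent from `registered`."""
    return sorted(name for name, cls in expected.items() if cls not in registered)


if __name__ == "__main__":
    # Assumption: in practice this set would be read from the project's
    # diffusion registry; here it is a placeholder built from EXPECTED itself.
    registered = set(EXPECTED.values())
    missing = find_unregistered(EXPECTED, registered)
    print("unregistered configs:", missing)
```

An empty result means every listed class name resolves; it says nothing about whether the FP8 checkpoints actually load or serve correctly under TP=2.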

…n, Qwen-Image, Z-Image

DiT-only FP8 stage configs (TP=2) for the image-gen models whose ModelOpt
FP8 adapter landed in vllm-project#2913. Split out of the video-gen calibration PR (vllm-project#3305)
since these are image-gen scope. Class names verified against main's registry.

Signed-off-by: lishunyang12 <lishunyang12@163.com>
Collaborator

@hsliuustc0106 left a comment

Config-only PR adding ModelOpt FP8 serve configs for 4 image-gen models (Flux, Flux2-Klein, Qwen-Image, Z-Image) with TP=2.

Please add verification evidence that these configs work (e.g., a test run or CI log) when ready.
