Fix default sampling params to support generator_device in image generations #2778

Open
sustech-lz wants to merge 1 commit into vllm-project:release/v0.18.0.post1 from sustech-lz:vllm_omni_generator_device

@sustech-lz

Summary

This PR fixes an inconsistency in image generation sampling params handling for diffusion models.

Previously:

  • generator_device could be passed in per-request payload (e.g. curl body),
  • but values from --default-sampling-params were not applied in /v1/images/generations.

Now:

  1. /v1/images/generations also applies stage defaults from --default-sampling-params (aligned with /v1/images/edits behavior).
  2. Default sampling param parsing supports the following aliases, all normalized to generator_device:
    • generator-device
    • generative-device
    • generative_device
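The alias handling listed above could be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `PARAM_ALIASES`, `normalize_param_name`, and `normalize_defaults` are hypothetical names, and the generic hyphen-to-underscore fallback is an assumption about how unknown hyphenated keys are treated.

```python
# Hypothetical sketch of alias normalization; all names here are illustrative.
PARAM_ALIASES = {
    "generator-device": "generator_device",
    "generative-device": "generator_device",
    "generative_device": "generator_device",
}


def normalize_param_name(param_name: str) -> str:
    """Map a known alias to its canonical key, else turn hyphens into underscores."""
    if param_name in PARAM_ALIASES:
        return PARAM_ALIASES[param_name]
    return param_name.replace("-", "_")


def normalize_defaults(stage_defaults: dict) -> dict:
    """Return a copy of one stage's defaults with every key normalized."""
    return {normalize_param_name(key): value for key, value in stage_defaults.items()}
```

With this sketch, `normalize_defaults({"generative-device": "cpu"})` yields a dict keyed by `generator_device`, so downstream code only has to look for the canonical name.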

Motivation

For Qwen-Image serving in vllm-omni, users expect generator_device to be configurable at server startup via --default-sampling-params.
Before this fix, startup defaults were ignored for /v1/images/generations, so users had to pass generator_device in every request.

Changes

  • vllm_omni/entrypoints/openai/api_server.py

    • Apply stage default sampling params in /v1/images/generations.
    • Add normalization/alias mapping in apply_stage_default_sampling_params().
  • tests/entrypoints/openai_api/test_image_server.py

    • Add test: generation endpoint picks up default generator_device.
    • Add test: alias key generator-device maps correctly to generator_device.

Example

Startup:

--default-sampling-params '{"1":{"generator_device":"cpu"}}'

@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@sustech-lz sustech-lz force-pushed the vllm_omni_generator_device branch 2 times, most recently from d5b525d to 6cfe613 Compare April 14, 2026 08:33
…rations

Signed-off-by: ZhengLI-Sustech <12232325@mail.sustech.edu.cn>
Signed-off-by: Zheng Li <12232325@mail.sustech.edu.cn>
detail="No diffusion stage found in multi-stage pipeline.",
)
diffusion_stage_id = diffusion_stage_ids[0]
apply_stage_default_sampling_params(
Collaborator

This now lets --default-sampling-params override the request-level n, because num_outputs_per_prompt is only set in the constructor before apply_stage_default_sampling_params(). The image edits path re-applies n after defaults, so can we do the same here and add a regression test for request-level n precedence?
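To make the precedence concern concrete, here is a minimal sketch of the ordering being asked for. It does not use the real vllm-omni types: `SamplingParams`, `apply_defaults`, and `build_params` are hypothetical stand-ins, and `apply_defaults` is assumed to overwrite fields unconditionally, as the comment above implies for the real helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SamplingParams:
    """Hypothetical stand-in for the diffusion sampling params object."""
    num_outputs_per_prompt: int = 1
    generator_device: Optional[str] = None


def apply_defaults(params: SamplingParams, defaults: dict) -> None:
    """Assumed behavior: stage defaults overwrite whatever is already set."""
    for key, value in defaults.items():
        if hasattr(params, key):
            setattr(params, key, value)


def build_params(request_n: Optional[int], defaults: dict) -> SamplingParams:
    # n is first set in the constructor, as on the generations path.
    params = SamplingParams(num_outputs_per_prompt=request_n or 1)
    # Server defaults may clobber it here ...
    apply_defaults(params, defaults)
    # ... so re-apply the request-level n afterwards to restore precedence.
    if request_n is not None:
        params.num_outputs_per_prompt = request_n
    return params
```

A regression test along these lines would pass a default of `num_outputs_per_prompt` that differs from the request's `n` and assert that the request value wins.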

lishunyang12 previously approved these changes Apr 16, 2026
Collaborator

@lishunyang12 lishunyang12 left a comment


Review: Fix default sampling params to support generator_device in image generations

Verdict: Approve

What this PR does

  1. Adds apply_stage_default_sampling_params() call to the /v1/images/generations endpoint, aligning it with the existing behavior in /v1/images/edits.
  2. Adds alias normalization in apply_stage_default_sampling_params() so that hyphenated keys like generator-device and generative-device are mapped to generator_device.
  3. Adds two well-targeted tests.

Correctness

  • Application order is correct: Defaults are applied right after gen_params construction, before per-request _update_if_not_none calls. This means user-supplied request values properly override the server defaults (e.g., request.generator_device at line 1357 will overwrite anything set by defaults). Good.
  • app_state_args duplication removed: The variable was previously fetched only for _check_max_generated_image_size; the PR moves it up and reuses it for both default param application and the size check. Clean.
  • Alias fallback is sound: The param_aliases dict handles known aliases explicitly, and param_name.replace("-", "_") provides a reasonable generic fallback for any future hyphenated keys. This avoids the maintenance burden of registering every alias.
  • Error path: The HTTPException for missing diffusion stage is consistent with the same check in the edits endpoint.
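The application order described in the first point can be illustrated with a small sketch. It is an approximation under stated assumptions: `update_if_not_none` mirrors the `_update_if_not_none` helper only by name (its real signature is not shown in this PR page), and `resolve`, the `seed` field, and the use of `SimpleNamespace` are purely illustrative.

```python
from types import SimpleNamespace
from typing import Any


def update_if_not_none(target: Any, name: str, value: Any) -> None:
    """Set target.<name> only when the request actually supplied a value."""
    if value is not None:
        setattr(target, name, value)


def resolve(defaults: dict, request: dict) -> SimpleNamespace:
    """Apply server defaults first, then let non-None request values override."""
    gen_params = SimpleNamespace(generator_device=None, seed=None)
    # Step 1: stage defaults from --default-sampling-params.
    for name, value in defaults.items():
        setattr(gen_params, name, value)
    # Step 2: per-request values; None means "not provided", so the default stays.
    for name, value in request.items():
        update_if_not_none(gen_params, name, value)
    return gen_params
```

In this sketch a request that sets `generator_device` replaces the default, while a field the request leaves as `None` keeps the server-level value.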

Minor observations (non-blocking)

  1. json.loads on every request: apply_stage_default_sampling_params parses default_params_json from string on every call. Since the value is static (set at startup), it could be parsed once and cached. This is negligible for typical request rates but worth noting for future optimization if this endpoint becomes high-throughput.

  2. Code duplication: The diffusion-stage-lookup block (get stage configs, find diffusion stage IDs, raise if empty) is now duplicated between generate_images and edit_images. A small helper like _get_diffusion_stage_id(stage_configs) could reduce this, but it is minor and can be addressed in a follow-up.

  3. Test fixture side effect: The async_omni_test_client fixture now includes "generative-device":"cuda" in its default sampling params. This means all existing tests using that fixture will have this default applied. I verified the existing tests (test_generate_images_async_omni_sampling_params, etc.) still pass because they either override the relevant params or don't assert on generator_device, so this is fine.
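For the first observation, one possible shape for parse-once caching is sketched below. This is only a suggestion, not code from the PR: `parse_default_sampling_params` and `get_stage_defaults` are hypothetical names, and keying the parsed JSON by stage ID as a string is an assumption based on the `{"1": {...}}` example in the PR description.

```python
import json
from functools import lru_cache


@lru_cache(maxsize=None)
def parse_default_sampling_params(default_params_json: str) -> dict:
    """Parse the startup JSON string once; repeated calls hit the cache."""
    return json.loads(default_params_json)


def get_stage_defaults(default_params_json, stage_id) -> dict:
    """Return a copy of the defaults for one stage, or an empty dict."""
    if not default_params_json:
        return {}
    parsed = parse_default_sampling_params(default_params_json)
    # Copy so callers cannot mutate the cached dict shared across requests.
    return dict(parsed.get(str(stage_id), {}))
```

Because `lru_cache` hands back the same object on every hit, the copy in `get_stage_defaults` matters: without it, one request mutating the result would leak into later requests.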

Overall this is a clean, well-scoped fix with good test coverage. LGTM.

@lishunyang12 lishunyang12 dismissed their stale review April 16, 2026 14:55

Replacing with inline comments

Development

Successfully merging this pull request may close these issues.

[Bug]: Support generator_device within --default-sampling-params for multi-modal models (Feedback on PR #2769)

3 participants