Fix default sampling params to support generator_device in image generations#2778
Conversation
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
d5b525d to
6cfe613
Compare
…rations Signed-off-by: ZhengLI-Sustech <12232325@mail.sustech.edu.cn> Signed-off-by: Zheng Li <12232325@mail.sustech.edu.cn>
6cfe613 to
876a483
Compare
| detail="No diffusion stage found in multi-stage pipeline.", | ||
| ) | ||
| diffusion_stage_id = diffusion_stage_ids[0] | ||
| apply_stage_default_sampling_params( |
There was a problem hiding this comment.
This now lets --default-sampling-params override the request-level n, because num_outputs_per_prompt is only set in the constructor before apply_stage_default_sampling_params(). The image edits path re-applies n after defaults, so can we do the same here and add a regression test for request-level n precedence?
lishunyang12
left a comment
There was a problem hiding this comment.
Review: Fix default sampling params to support generator_device in image generations
Verdict: Approve
What this PR does
- Adds
apply_stage_default_sampling_params()call to the/v1/images/generationsendpoint, aligning it with the existing behavior in/v1/images/edits. - Adds alias normalization in
apply_stage_default_sampling_params()so that hyphenated keys likegenerator-deviceandgenerative-deviceare mapped togenerator_device. - Adds two well-targeted tests.
Correctness
- Application order is correct: Defaults are applied right after
gen_paramsconstruction, before per-request_update_if_not_nonecalls. This means user-supplied request values properly override the server defaults (e.g.,request.generator_deviceat line 1357 will overwrite anything set by defaults). Good. app_state_argsduplication removed: The variable was previously fetched only for_check_max_generated_image_size; the PR moves it up and reuses it for both default param application and the size check. Clean.- Alias fallback is sound: The
param_aliasesdict handles known aliases explicitly, andparam_name.replace("-", "_")provides a reasonable generic fallback for any future hyphenated keys. This avoids a maintenance burden of having to register every alias. - Error path: The
HTTPExceptionfor missing diffusion stage is consistent with the same check in the edits endpoint.
Minor observations (non-blocking)
-
json.loadson every request:apply_stage_default_sampling_paramsparsesdefault_params_jsonfrom string on every call. Since the value is static (set at startup), it could be parsed once and cached. This is negligible for typical request rates but worth noting for future optimization if this endpoint becomes high-throughput. -
Code duplication: The diffusion-stage-lookup block (get stage configs, find diffusion stage IDs, raise if empty) is now duplicated between
generate_imagesandedit_images. A small helper like_get_diffusion_stage_id(stage_configs)could reduce this, but it is minor and can be addressed in a follow-up. -
Test fixture side effect: The
async_omni_test_clientfixture now includes"generative-device":"cuda"in its default sampling params. This means all existing tests using that fixture will have this default applied. I verified the existing tests (test_generate_images_async_omni_sampling_params, etc.) still pass because they either override the relevant params or don't assert ongenerator_device, so this is fine.
Overall this is a clean, well-scoped fix with good test coverage. LGTM.
Summary
This PR fixes an inconsistency in image generation sampling params handling for diffusion models.
Previously:
generator_devicecould be passed in per-request payload (e.g.curlbody),--default-sampling-paramswere not applied in/v1/images/generations.Now:
/v1/images/generationsalso applies stage defaults from--default-sampling-params(aligned with/v1/images/editsbehavior).generator-devicegenerative-devicegenerative_deviceall normalized to
generator_device.Motivation
For Qwen-Image serving in
vllm-omni, users expectgenerator_deviceto be configurable at server startup via:--default-sampling-params.Before this fix, startup defaults were ignored for
/v1/images/generations, so users had to pass it in every request.Changes
vllm_omni/entrypoints/openai/api_server.py/v1/images/generations.apply_stage_default_sampling_params().tests/entrypoints/openai_api/test_image_server.pygenerator_device.generator-devicemaps correctly togenerator_device.Example
Startup:
--default-sampling-params '{"1":{"generator_device":"cpu"}}'