[Doc] Improve diffusion generation parameter docs for online serving #2051
Merged: gcanlin merged 9 commits into vllm-project:main from SamitHuang:docs/diffusion-extra-body-guide on Mar 26, 2026
Commits
- 537953a [Doc] Improve diffusion generation parameter docs for online serving (SamitHuang)
- 824420d Update examples/online_serving/text_to_image/README.md (SamitHuang)
- 718be65 docs: address PR review feedback (SamitHuang)
- 77ecb0e docs: slim down diffusion_chat_api.md to avoid content duplication (SamitHuang)
- da4b4fd docs: simplify glm_image docs to avoid repeating generic request methods (SamitHuang)
- b1bbd83 Merge branch 'main' into docs/diffusion-extra-body-guide (SamitHuang)
- 337287a docs: mention dedicated endpoints support top-level parameters (SamitHuang)
- b07b9dc docs: fix diffusion parameter defaults (gcanlin)
- 4481167 Merge branch 'main' into docs/diffusion-extra-body-guide (gcanlin)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,78 @@ | ||
| # Diffusion Chat Completions API | ||
|
|
||
| vLLM-Omni supports generating and editing images via the `/v1/chat/completions` | ||
| endpoint using diffusion models. This page explains how to pass generation | ||
| parameters (such as `num_inference_steps`, `height`, `width`) to diffusion | ||
| models through this endpoint. | ||
|
|
||
| !!! tip | ||
| For dedicated endpoints that accept generation parameters as top-level | ||
| fields, see [Image Generation API](image_generation_api.md) and | ||
| [Image Edit API](image_edit_api.md). | ||
|
|
||
| ## Passing Generation Parameters | ||
|
|
||
| The `/v1/chat/completions` endpoint follows the OpenAI Chat API schema, which | ||
| does not natively include diffusion-specific fields like `num_inference_steps` | ||
| or `height`. How you pass these extra fields depends on your client. | ||
|
|
||
| ### curl / Python `requests` | ||
|
|
||
| Wrap generation parameters inside an `"extra_body"` key in the JSON body: | ||
|
|
||
| ```bash | ||
| curl -s http://localhost:8091/v1/chat/completions \ | ||
| -H "Content-Type: application/json" \ | ||
| -d '{ | ||
| "messages": [ | ||
| {"role": "user", "content": "A beautiful landscape painting"} | ||
| ], | ||
| "extra_body": { | ||
| "num_inference_steps": 50, | ||
| "seed": 42 | ||
| } | ||
| }' | ||
| ``` | ||
|
|
||
| ### OpenAI Python SDK | ||
|
|
||
| Use the `extra_body` **keyword argument**. The SDK automatically merges these | ||
| fields into the top-level request body: | ||
|
|
||
| ```python | ||
| response = client.chat.completions.create( | ||
| model="Qwen/Qwen-Image", | ||
| messages=[{"role": "user", "content": "A beautiful landscape painting"}], | ||
| extra_body={ | ||
| "num_inference_steps": 50, | ||
| "seed": 42, | ||
| }, | ||
| ) | ||
| ``` | ||
|
|
||
| !!! note "SDK `extra_body` vs. JSON `extra_body`" | ||
| These two `extra_body` usages look similar but work differently under the | ||
| hood. The SDK flattens the dict into the top-level request JSON, while the | ||
| curl/requests approach sends it as a nested `"extra_body"` key. Both are | ||
| handled correctly by the server. | ||
|
|
||
| !!! note "About the `ignored fields` warning" | ||
| You may see a log message like: | ||
|
|
||
| ``` | ||
| WARNING: The following fields were present in the request but ignored: {'height', 'width', ...} | ||
| ``` | ||
|
|
||
| This is **harmless**. It is emitted by vLLM's request validation layer | ||
| because these fields are not part of the standard OpenAI | ||
| `ChatCompletionRequest` schema. The fields are still stored internally | ||
| and correctly forwarded to the diffusion pipeline. | ||
|
|
||
| ## Model-Specific Examples | ||
|
|
||
| For complete examples with full request/response details, see the model-specific | ||
| guides: | ||
|
|
||
| - [Text-to-Image (Qwen-Image)](../user_guide/examples/online_serving/text_to_image.md) | ||
| - [Image-to-Image (Qwen-Image-Edit, Qwen-Image-Layered)](../user_guide/examples/online_serving/image_to_image.md) | ||
| - [GLM-Image](../user_guide/examples/online_serving/glm_image.md) |
|
Collaborator
Out of curiosity, have you tested glm-image recently? I remember it had some bugs after the last refactor 😂 |
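
The new page names Python `requests` alongside curl but shows only the curl form. It also notes that the OpenAI SDK flattens `extra_body` into the top-level JSON while raw clients send it nested. A minimal sketch of the two request shapes follows. The helper names (`build_nested_payload`, `build_flattened_payload`, `send`) are hypothetical and not part of vLLM-Omni or the OpenAI SDK. The URL is copied from the curl example in the diff. The flattened helper only approximates the SDK behaviour as the note describes it. Response parsing is left out because the diff does not show the response schema.

```python
import json
from typing import Any, Optional

# Assumption: same host and port as the curl example in the diff.
CHAT_URL = "http://localhost:8091/v1/chat/completions"


def _base_payload(prompt: str, model: Optional[str]) -> dict[str, Any]:
    """Fields shared by both request shapes."""
    payload: dict[str, Any] = {
        "messages": [{"role": "user", "content": prompt}],
    }
    if model is not None:
        payload["model"] = model
    return payload


def build_nested_payload(
    prompt: str, gen_params: dict[str, Any], model: Optional[str] = None
) -> dict[str, Any]:
    """curl/requests style: parameters stay under an "extra_body" key."""
    payload = _base_payload(prompt, model)
    payload["extra_body"] = dict(gen_params)
    return payload


def build_flattened_payload(
    prompt: str, gen_params: dict[str, Any], model: Optional[str] = None
) -> dict[str, Any]:
    """SDK style (approximation): parameters merged into the top level."""
    payload = _base_payload(prompt, model)
    payload.update(gen_params)
    return payload


def send(payload: dict[str, Any], url: str = CHAT_URL, timeout: float = 300.0) -> Any:
    """POST the payload as JSON and return the decoded response body."""
    import requests  # third-party; imported lazily so the builders work without it

    response = requests.post(url, json=payload, timeout=timeout)
    response.raise_for_status()
    return response.json()


if __name__ == "__main__":
    params = {"num_inference_steps": 50, "seed": 42}
    prompt = "A beautiful landscape painting"
    # Print both shapes side by side; call send(...) only with a server running.
    print(json.dumps(build_nested_payload(prompt, params), indent=2))
    print(json.dumps(build_flattened_payload(prompt, params, model="Qwen/Qwen-Image"), indent=2))
```

Per the note in the diff, the server handles either shape, so the choice depends only on which client builds the request.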