vllm-project · gcanlin · Mar 26, 2026 · Mar 20, 2026 · Mar 23, 2026 · Mar 23, 2026
@@ -6,6 +6,7 @@ nav:
     - getting_started/installation/*
   - Serving:
     - OpenAI-Compatible API:
+      - Diffusion Chat API: serving/diffusion_chat_api.md
       - Image Generation: serving/image_generation_api.md
       - Image Edit: serving/image_edit_api.md
       - Text to Speech: serving/speech_api.md

@@ -0,0 +1,78 @@
+# Diffusion Chat Completions API
+
+vLLM-Omni supports generating and editing images via the `/v1/chat/completions`
+endpoint using diffusion models. This page explains how to pass generation
+parameters (such as `num_inference_steps`, `height`, and `width`) to diffusion
+models through this endpoint.
+
+!!! tip
+    For dedicated endpoints that accept generation parameters as top-level
+    fields, see [Image Generation API](image_generation_api.md) and
+    [Image Edit API](image_edit_api.md).
+
+## Passing Generation Parameters
+
+The `/v1/chat/completions` endpoint follows the OpenAI Chat API schema, which
+does not natively include diffusion-specific fields like `num_inference_steps`
+or `height`. How you pass these extra fields depends on your client.
+
+### curl / Python `requests`
+
+Wrap generation parameters inside an `"extra_body"` key in the JSON body:
+
+```bash
+curl -s http://localhost:8091/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "messages": [
+      {"role": "user", "content": "A beautiful landscape painting"}
+    ],
+    "extra_body": {
+      "num_inference_steps": 50,
+      "seed": 42
+    }
+  }'
+```
+
+### OpenAI Python SDK
+
+Use the `extra_body` **keyword argument**. The SDK automatically merges these
+fields into the top-level request body:
+
+```python
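+from openai import OpenAI
+
+# Assumes the server from the curl example (localhost:8091). The API key is a
+# placeholder, on the assumption that the server does not enforce one.
+client = OpenAI(base_url="http://localhost:8091/v1", api_key="EMPTY")
+
+response = client.chat.completions.create(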
+    model="Qwen/Qwen-Image",
+    messages=[{"role": "user", "content": "A beautiful landscape painting"}],
+    extra_body={
+        "num_inference_steps": 50,
+        "seed": 42,
+    },
+)
+```
+
+!!! note "SDK `extra_body` vs. JSON `extra_body`"
+    These two `extra_body` usages look similar but work differently under the
+    hood. The SDK flattens the dict into the top-level request JSON, while the
+    curl/requests approach sends it as a nested `"extra_body"` key. Both are
+    handled correctly by the server.
+
+!!! note "About the `ignored fields` warning"
+    You may see a log message like:
+
+    ```
+    WARNING: The following fields were present in the request but ignored: {'height', 'width', ...}
+    ```
+
+    This is **harmless**. It is emitted by vLLM's request validation layer
+    because these fields are not part of the standard OpenAI
+    `ChatCompletionRequest` schema. The fields are still stored internally
+    and correctly forwarded to the diffusion pipeline.
+
+## Model-Specific Examples
+
+For complete examples with full request/response details, see the model-specific
+guides:
+
+- [Text-to-Image (Qwen-Image)](../user_guide/examples/online_serving/text_to_image.md)
+- [Image-to-Image (Qwen-Image-Edit, Qwen-Image-Layered)](../user_guide/examples/online_serving/image_to_image.md)
+- [GLM-Image](../user_guide/examples/online_serving/glm_image.md)
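
The "curl / Python `requests`" section above shows only a curl command. A minimal sketch of the same request with Python `requests` follows, alongside the flattened body that the note describes the OpenAI SDK as sending. The helper names (`build_nested_body`, `flatten_extra_body`, `send`) are illustrative and not part of vLLM-Omni; the address and payload are taken from the curl example.

```python
import json


def build_nested_body(prompt, **generation_params):
    """Body in the curl/requests style: parameters nested under "extra_body"."""
    body = {"messages": [{"role": "user", "content": prompt}]}
    if generation_params:
        body["extra_body"] = dict(generation_params)
    return body


def flatten_extra_body(body):
    """Body in the style the SDK is described as sending: fields at top level."""
    flat = {k: v for k, v in body.items() if k != "extra_body"}
    flat.update(body.get("extra_body", {}))
    return flat


def send(body, base_url="http://localhost:8091"):
    """POST the body to a running server (address assumed from the curl example)."""
    import requests  # third-party; imported lazily so the helpers above need only stdlib

    resp = requests.post(f"{base_url}/v1/chat/completions", json=body, timeout=600)
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    nested = build_nested_body(
        "A beautiful landscape painting", num_inference_steps=50, seed=42
    )
    print(json.dumps(nested, indent=2))
    print(json.dumps(flatten_extra_body(nested), indent=2))
    # send(nested)  # uncomment when a server is running
```

Per the note above, the server is expected to accept either shape, so `send(nested)` and `send(flatten_extra_body(nested))` should be interchangeable.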
@@ -73,98 +73,50 @@ The default yaml configuration deploys AR on GPU 0 and DiT on GPU 1. You can use
 
 ### Text-to-Image
 
-Generate images from text prompts:
-
-**Using Python client**
-
 ```bash
 python openai_chat_client.py \
     --prompt "A photorealistic mountain landscape at sunset" \
     --height 1024 \
     --width 1024 \
     --output landscape.png
-```
 
-**Using curl**
-
-```bash
-curl -s http://localhost:8091/v1/chat/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-    "messages": [
-      {"role": "user", "content": "A beautiful sunset over the ocean with sailing boats"}
-    ],
-    "extra_body": {
-      "height": 1024,
-      "width": 1024,
-      "num_inference_steps": 50,
-      "guidance_scale": 1.5,
-      "seed": 42
-    }
-  }' | jq -r '.choices[0].message.content[0].image_url.url' | cut -d',' -f2- | base64 -d > output.png
-```
-
-Or use the script:
-
-```bash
+# Or use the curl script:
 bash run_curl_text_to_image.sh "A futuristic city skyline at night"
 ```
 
 ### Image-to-Image (Image Editing)
 
-Edit images with text instructions:
-
-**Using Python client**
-
 ```bash
 python openai_chat_client.py \
     --prompt "Convert this image to watercolor style" \
     --image input.png \
     --output watercolor.png
-```
-
-**Using curl**
 
-```bash
-IMG_B64=$(base64 < input.png | tr -d '\n')
-
-curl -s http://localhost:8091/v1/chat/completions \
-  -H "Content-Type: application/json" \
-  -d @- <<EOF | jq -r '.choices[0].message.content[0].image_url.url' | cut -d',' -f2- | base64 -d > output.png
-{
-  "messages": [{
-    "role": "user",
-    "content": [
-      {"type": "text", "text": "Convert this image to watercolor style"},
-      {"type": "image_url", "image_url": {"url": "data:image/png;base64,'$IMG_B64'"}}
-    ]
-  }],
-  "extra_body": {
-    "height": 1024,
-    "width": 1024,
-    "num_inference_steps": 50,
-    "guidance_scale": 1.5,
-    "seed": 42
-  }
-}
-EOF
+# Or use the curl script:
+bash run_curl_image_edit.sh input.png "Convert to watercolor style"
 ```
 
-Or use the script:
+For general-purpose request methods (curl, OpenAI SDK, Python `requests`), see
+the [Text-to-Image](text_to_image.md) and [Image-to-Image](image_to_image.md)
+guides.
 
-```bash
-bash run_curl_image_edit.sh input.png "Convert to watercolor style"
-```
+## Generation Parameters
 
-## Generation Parameters (extra_body)
+When using `/v1/chat/completions`, pass the parameters below inside `extra_body`
+in the curl JSON, or via the `extra_body` keyword argument in the OpenAI Python
+SDK (see the [Diffusion Chat API guide](../../../../serving/diffusion_chat_api.md)).
+When using the dedicated [`/v1/images/generations`](../../../../serving/image_generation_api.md)
+or [`/v1/images/edits`](../../../../serving/image_edit_api.md) endpoints, pass
+the supported generation controls directly as top-level fields. On those
+endpoints, set image dimensions with `size` (instead of `height` and `width`)
+and the number of images with `n`.
 
 | Parameter             | Type  | Default | Description                         |
 | --------------------- | ----- | ------- | ----------------------------------- |
 | `height`              | int   | 1024    | Image height in pixels              |
 | `width`               | int   | 1024    | Image width in pixels               |
 | `num_inference_steps` | int   | 50      | Number of diffusion denoising steps |
 | `guidance_scale`      | float | 1.5     | Classifier-free guidance scale      |
-| `seed`                | int   | 42      | Random seed for reproducibility     |
+| `seed`                | int   | None    | Optional random seed; `/v1/images/*` generates one server-side if omitted |
 | `negative_prompt`     | str   | None    | Negative prompt                     |
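
The mapping from `height` and `width` to `size` for the dedicated image endpoints can be sketched as below. This assumes `size` uses the OpenAI Images API string format `"<width>x<height>"`. The helper name `to_images_request` is hypothetical and not part of vLLM-Omni.

```python
def to_images_request(prompt, width=1024, height=1024, n=1, **controls):
    """Build a /v1/images/generations body from chat-style parameters.

    Assumption: `size` follows the OpenAI Images API convention "WIDTHxHEIGHT".
    """
    body = {"prompt": prompt, "size": f"{width}x{height}", "n": n}
    # Other supported controls (e.g. num_inference_steps, seed) go at the top level.
    body.update(controls)
    return body


if __name__ == "__main__":
    print(to_images_request("A futuristic city skyline at night", seed=42))
```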
 
 ## Response Format