From 537953a31eaa0d1d99e9d7b8cec74ef141a2a005 Mon Sep 17 00:00:00 2001 From: samithuang <285365963@qq.com> Date: Fri, 20 Mar 2026 16:14:07 +0000 Subject: [PATCH 1/7] [Doc] Improve diffusion generation parameter docs for online serving Add a cross-cutting Diffusion Chat API guide explaining how to pass generation parameters (num_inference_steps, height, width, etc.) via /v1/chat/completions across different clients (curl, OpenAI SDK, Python requests). Update model-specific docs and example READMEs to add OpenAI SDK examples and cross-reference the new guide. Add Qwen-Image-Layered guidance to image_to_image docs with curl, SDK, and Python examples, covering its model-specific parameters (layers, resolution, cfg_scale) and multi-image response format. Made-with: Cursor Signed-off-by: samithuang <285365963@qq.com> --- docs/.nav.yml | 1 + docs/serving/diffusion_chat_api.md | 230 ++++++++++++++++++ .../examples/online_serving/glm_image.md | 31 ++- .../examples/online_serving/image_to_image.md | 194 ++++++++++++++- .../examples/online_serving/text_to_image.md | 51 +++- examples/online_serving/glm_image/README.md | 35 ++- .../online_serving/image_to_image/README.md | 141 ++++++++++- .../online_serving/text_to_image/README.md | 41 +++- 8 files changed, 705 insertions(+), 19 deletions(-) create mode 100644 docs/serving/diffusion_chat_api.md diff --git a/docs/.nav.yml b/docs/.nav.yml index bfa9365f6f6..14725cf1511 100644 --- a/docs/.nav.yml +++ b/docs/.nav.yml @@ -6,6 +6,7 @@ nav: - getting_started/installation/* - Serving: - OpenAI-Compatible API: + - Diffusion Chat API: serving/diffusion_chat_api.md - Image Generation: serving/image_generation_api.md - Image Edit: serving/image_edit_api.md - Text to Speech: serving/speech_api.md diff --git a/docs/serving/diffusion_chat_api.md b/docs/serving/diffusion_chat_api.md new file mode 100644 index 00000000000..41e9eada9f7 --- /dev/null +++ b/docs/serving/diffusion_chat_api.md @@ -0,0 +1,230 @@ +# Diffusion Chat Completions API + 
+vLLM-Omni supports generating images via the `/v1/chat/completions` endpoint using diffusion models. +This page explains how to pass generation parameters (such as `num_inference_steps`, `height`, `width`) +to diffusion models through this endpoint across different client libraries. + +!!! tip + For text-to-image generation without chat context, the dedicated + [`/v1/images/generations`](image_generation_api.md) endpoint accepts these + parameters as top-level fields and may be simpler for your use case. + +## API Endpoints Overview + +vLLM-Omni provides multiple endpoints for diffusion models. Each has its own parameter-passing +convention: + +| Endpoint | Use Case | Parameter Format | +|----------|----------|-----------------| +| `/v1/chat/completions` | Image gen/edit via chat | Generation params in `extra_body` (see below) | +| `/v1/images/generations` | Dedicated text-to-image | Top-level JSON fields | +| `/v1/images/edits` | Dedicated image editing | Multipart form fields | +| `/v1/videos` | Video generation | Multipart form fields | + +## Passing Generation Parameters via `/v1/chat/completions` + +The `/v1/chat/completions` endpoint follows the OpenAI Chat API schema, which does not natively +include diffusion-specific fields like `num_inference_steps` or `height`. vLLM-Omni accepts +these as **extra fields** on the request body. 
+ +There are two supported methods depending on your client: + +### Method 1: Using curl or Python `requests` + +Put generation parameters as **top-level fields** in the JSON body alongside `messages`: + +=== "curl" + + ```bash + curl -s http://localhost:8091/v1/chat/completions \ + -H "Content-Type: application/json" \ + -d '{ + "messages": [ + {"role": "user", "content": "A beautiful landscape painting"} + ], + "height": 1024, + "width": 1024, + "num_inference_steps": 50, + "true_cfg_scale": 4.0, + "seed": 42 + }' | jq -r '.choices[0].message.content[0].image_url.url' \ + | cut -d',' -f2- | base64 -d > output.png + ``` + +=== "Python requests" + + ```python + import requests + import base64 + + payload = { + "messages": [ + {"role": "user", "content": "A beautiful landscape painting"} + ], + "height": 1024, + "width": 1024, + "num_inference_steps": 50, + "true_cfg_scale": 4.0, + "seed": 42, + } + + resp = requests.post( + "http://localhost:8091/v1/chat/completions", + json=payload, + timeout=300, + ) + data = resp.json() + + img_url = data["choices"][0]["message"]["content"][0]["image_url"]["url"] + _, b64_data = img_url.split(",", 1) + with open("output.png", "wb") as f: + f.write(base64.b64decode(b64_data)) + ``` + +### Method 2: Using the OpenAI Python SDK + +The OpenAI Python SDK uses the `extra_body` keyword argument to pass non-standard fields. 
+The SDK automatically merges these into the top-level request body: + +```python +from openai import OpenAI +import base64 + +client = OpenAI(base_url="http://localhost:8091/v1", api_key="none") + +response = client.chat.completions.create( + model="Qwen/Qwen-Image", + messages=[ + {"role": "user", "content": "A beautiful landscape painting"} + ], + extra_body={ + "height": 1024, + "width": 1024, + "num_inference_steps": 50, + "true_cfg_scale": 4.0, + "seed": 42, + }, +) + +img_url = response.choices[0].message.content[0].image_url.url +_, b64_data = img_url.split(",", 1) +with open("output.png", "wb") as f: + f.write(base64.b64decode(b64_data)) +``` + +### Legacy Format: Nested `extra_body` in JSON + +You may see examples that nest generation parameters inside an `"extra_body"` key in the +JSON body. This format is still supported for backward compatibility: + +```json +{ + "messages": [{"role": "user", "content": "A beautiful landscape painting"}], + "extra_body": { + "height": 1024, + "width": 1024, + "num_inference_steps": 50 + } +} +``` + +Both formats (top-level fields and nested `extra_body`) are accepted. + +!!! note "About the `ignored fields` warning" + When sending non-standard fields, you may see a log message like: + + ``` + WARNING: The following fields were present in the request but ignored: {'height', 'width', ...} + ``` + + This warning is **harmless**. It is emitted by vLLM's request validation layer because + these fields are not part of the standard OpenAI `ChatCompletionRequest` schema. + The fields are still stored internally and correctly forwarded to the diffusion pipeline. 
+ +## Image Editing (Image-to-Image) + +For image editing, include both text and image in the message content: + +=== "curl" + + ```bash + IMG_B64=$(base64 -w0 input.png) + + curl -s http://localhost:8092/v1/chat/completions \ + -H "Content-Type: application/json" \ + -d '{ + "messages": [{ + "role": "user", + "content": [ + {"type": "text", "text": "Convert to watercolor style"}, + {"type": "image_url", "image_url": {"url": "data:image/png;base64,'"$IMG_B64"'"}} + ] + }], + "num_inference_steps": 50, + "guidance_scale": 1, + "seed": 42 + }' | jq -r '.choices[0].message.content[0].image_url.url' \ + | cut -d',' -f2 | base64 -d > output.png + ``` + +=== "OpenAI SDK" + + ```python + import base64 + from openai import OpenAI + + client = OpenAI(base_url="http://localhost:8092/v1", api_key="none") + + with open("input.png", "rb") as f: + img_b64 = base64.b64encode(f.read()).decode() + + response = client.chat.completions.create( + model="Qwen/Qwen-Image-Edit", + messages=[{ + "role": "user", + "content": [ + {"type": "text", "text": "Convert to watercolor style"}, + {"type": "image_url", "image_url": { + "url": f"data:image/png;base64,{img_b64}" + }}, + ], + }], + extra_body={ + "num_inference_steps": 50, + "guidance_scale": 1, + "seed": 42, + }, + ) + + img_url = response.choices[0].message.content[0].image_url.url + _, b64_data = img_url.split(",", 1) + with open("output.png", "wb") as f: + f.write(base64.b64decode(b64_data)) + ``` + +## Generation Parameters Reference + +The following parameters are accepted as extra fields on `/v1/chat/completions` for +diffusion models: + +| Parameter | Type | Description | +|-----------|------|-------------| +| `height` | int | Output image height in pixels | +| `width` | int | Output image width in pixels | +| `size` | str | Output size in "WxH" format (alternative to separate height/width) | +| `num_inference_steps` | int | Number of denoising steps | +| `guidance_scale` | float | Classifier-free guidance scale | +| 
`true_cfg_scale` | float | True CFG scale (Qwen-Image specific) | +| `seed` | int | Random seed for reproducibility | +| `negative_prompt` | str | Text describing what to avoid | +| `num_outputs_per_prompt` | int | Number of images to generate (default: 1) | +| `num_frames` | int | Number of frames (video models) | +| `guidance_scale_2` | float | Secondary guidance scale (Wan2.2 models) | +| `layers` | int | Number of layers to generate (Qwen-Image-Layered, default: 4) | +| `resolution` | int | Resolution for dimension calculation (Qwen-Image-Layered, 640 or 1024) | +| `lora` | object | Per-request LoRA adapter configuration | + +!!! info "Model-specific defaults" + When a parameter is not specified, the underlying diffusion pipeline applies its own + model-specific default. For example, `num_inference_steps` defaults to 50 for most models + but may differ for turbo/distilled variants. diff --git a/docs/user_guide/examples/online_serving/glm_image.md b/docs/user_guide/examples/online_serving/glm_image.md index c0d1764801a..d170a5511a4 100644 --- a/docs/user_guide/examples/online_serving/glm_image.md +++ b/docs/user_guide/examples/online_serving/glm_image.md @@ -104,6 +104,32 @@ curl -s http://localhost:8091/v1/chat/completions \ }' | jq -r '.choices[0].message.content[0].image_url.url' | cut -d',' -f2- | base64 -d > output.png ``` +**Using OpenAI SDK** + +```python +from openai import OpenAI +import base64 + +client = OpenAI(base_url="http://localhost:8091/v1", api_key="none") + +response = client.chat.completions.create( + model="zai-org/GLM-Image", + messages=[{"role": "user", "content": "A beautiful sunset over the ocean"}], + extra_body={ + "height": 1024, + "width": 1024, + "num_inference_steps": 50, + "guidance_scale": 1.5, + "seed": 42, + }, +) + +img_url = response.choices[0].message.content[0].image_url.url +_, b64_data = img_url.split(",", 1) +with open("output.png", "wb") as f: + f.write(base64.b64decode(b64_data)) +``` + Or use the script: ```bash @@ 
-156,7 +182,10 @@ Or use the script: bash run_curl_image_edit.sh input.png "Convert to watercolor style" ``` -## Generation Parameters (extra_body) +## Generation Parameters + +These can be passed as top-level fields in curl/requests, or via `extra_body` in the OpenAI SDK. +See the [Diffusion Chat API guide](../../../../serving/diffusion_chat_api.md) for details. | Parameter | Type | Default | Description | | --------------------- | ----- | ------- | ----------------------------------- | diff --git a/docs/user_guide/examples/online_serving/image_to_image.md b/docs/user_guide/examples/online_serving/image_to_image.md index 6be2a4a7e82..6b446749739 100644 --- a/docs/user_guide/examples/online_serving/image_to_image.md +++ b/docs/user_guide/examples/online_serving/image_to_image.md @@ -69,10 +69,49 @@ cat <<EOF > request.json } EOF -curl -s http://localhost:8092/v1/chat/completions -H "Content-Type: application/json" -d @request.json | jq -r '.choices[0].message.content[0].image_url.url' | cut -d',' -f2 | base64 -d > output.png +curl -s http://localhost:8092/v1/chat/completions \ + -H "Content-Type: application/json" \ + -d @request.json \ + | jq -r '.choices[0].message.content[0].image_url.url' \ + | cut -d',' -f2 | base64 -d > output.png ``` -### Method 2: Using Python Client +### Method 2: Using OpenAI Python SDK + +```python +import base64 +from openai import OpenAI + +client = OpenAI(base_url="http://localhost:8092/v1", api_key="none") + +with open("input.png", "rb") as f: + img_b64 = base64.b64encode(f.read()).decode() + +response = client.chat.completions.create( + model="Qwen/Qwen-Image-Edit", + messages=[{ + "role": "user", + "content": [ + {"type": "text", "text": "Convert to watercolor style"}, + {"type": "image_url", "image_url": { + "url": f"data:image/png;base64,{img_b64}" + }}, + ], + }], + extra_body={ + "num_inference_steps": 50, + "guidance_scale": 1, + "seed": 42, + }, +) + +img_url = response.choices[0].message.content[0].image_url.url +_, b64_data = 
img_url.split(",", 1) +with open("output.png", "wb") as f: + f.write(base64.b64decode(b64_data)) +``` + +### Method 3: Using Python Client Script ```bash python openai_chat_client.py --input input.png --prompt "Convert to oil painting style" --output output.png @@ -81,7 +120,7 @@ python openai_chat_client.py --input input.png --prompt "Convert to oil painting python openai_chat_client.py --input input1.png input2.png --prompt "Combine these images into a single scene" --output output.png ``` -### Method 3: Using Gradio Demo +### Method 4: Using Gradio Demo ```bash python gradio_demo.py @@ -124,7 +163,7 @@ python gradio_demo.py ### Image Editing with Parameters -Use `extra_body` to pass generation parameters: +Wrap generation parameters inside `extra_body` in the request JSON: ```json { @@ -147,6 +186,149 @@ Use `extra_body` to pass generation parameters: } ``` +!!! tip "Using the OpenAI SDK" + When using the OpenAI Python SDK, pass these parameters via the `extra_body` + keyword argument. The SDK merges them into the top-level request body automatically: + + ```python + client.chat.completions.create( + model="Qwen/Qwen-Image-Edit", + messages=[...], + extra_body={"num_inference_steps": 50, "guidance_scale": 7.5, "seed": 42}, + ) + ``` + + For details on how generation parameters are handled across different clients, see the + [Diffusion Chat API guide](../../../../serving/diffusion_chat_api.md). + +### Layered Image Generation (Qwen-Image-Layered) + +Qwen-Image-Layered generates multiple decomposed layers from a reference image and a text prompt. 
+Start the server with: + +```bash +vllm serve Qwen/Qwen-Image-Layered --omni --port 8093 +``` + +=== "curl" + + ```bash + IMG_B64=$(base64 -w0 input.png) + + curl -sS http://localhost:8093/v1/chat/completions \ + -H "Content-Type: application/json" \ + -d "$(jq -n --arg img "$IMG_B64" '{ + messages: [{ + role: "user", + content: [ + {type: "image_url", image_url: {url: ("data:image/png;base64," + $img)}}, + {type: "text", text: "a rabbit"} + ] + }], + extra_body: { + num_inference_steps: 50, + cfg_scale: 4.0, + seed: 0, + layers: 4, + resolution: 640 + } + }')" \ + | jq -r '.choices[0].message.content[] | .image_url.url | split(",")[1]' \ + | while IFS= read -r b64; do + ((i++)); echo "$b64" | base64 -d > "layer_${i}.png" + done + ``` + +=== "OpenAI SDK" + + ```python + import base64 + from openai import OpenAI + + client = OpenAI(base_url="http://localhost:8093/v1", api_key="none") + + with open("input.png", "rb") as f: + img_b64 = base64.b64encode(f.read()).decode() + + response = client.chat.completions.create( + model="Qwen/Qwen-Image-Layered", + messages=[{ + "role": "user", + "content": [ + {"type": "image_url", "image_url": { + "url": f"data:image/png;base64,{img_b64}" + }}, + {"type": "text", "text": "a rabbit"}, + ], + }], + extra_body={ + "num_inference_steps": 50, + "cfg_scale": 4.0, + "seed": 0, + "layers": 4, + "resolution": 640, + }, + ) + + for i, item in enumerate(response.choices[0].message.content): + _, b64_data = item.image_url.url.split(",", 1) + with open(f"layer_{i}.png", "wb") as f: + f.write(base64.b64decode(b64_data)) + ``` + +=== "Python requests" + + ```python + import base64 + import requests + + with open("input.png", "rb") as f: + img_b64 = base64.b64encode(f.read()).decode() + + payload = { + "messages": [{ + "role": "user", + "content": [ + {"type": "image_url", "image_url": { + "url": f"data:image/png;base64,{img_b64}" + }}, + {"type": "text", "text": "a rabbit"}, + ], + }], + "extra_body": { + "num_inference_steps": 50, + 
"cfg_scale": 4.0, + "seed": 0, + "layers": 4, + "resolution": 640, + }, + } + + resp = requests.post( + "http://localhost:8093/v1/chat/completions", + json=payload, + timeout=600, + ) + data = resp.json() + + for i, item in enumerate(data["choices"][0]["message"]["content"]): + _, b64_data = item["image_url"]["url"].split(",", 1) + with open(f"layer_{i}.png", "wb") as f: + f.write(base64.b64decode(b64_data)) + ``` + +The response contains multiple images in `choices[0].message.content` — one per generated layer. + +#### Qwen-Image-Layered Parameters + +| Parameter | Type | Default | Description | +|-----------|------|---------|-------------| +| `layers` | int | 4 | Number of layers to decompose | +| `resolution` | int | 640 | Resolution for dimension calculation (640 or 1024) | +| `cfg_scale` | float | 4.0 | Classifier-free guidance scale (alias for `true_cfg_scale`) | +| `num_inference_steps` | int | 50 | Number of denoising steps | +| `seed` | int | None | Random seed for reproducibility | + ### Multi-Image Editing (Qwen-Image-Edit-2509) Provide multiple images in `content` (order matters): @@ -166,7 +348,7 @@ Provide multiple images in `content` (order matters): } ``` -## Generation Parameters (extra_body) +## Generation Parameters | Parameter | Type | Default | Description | | ------------------------ | ----- | ------- | ------------------------------------- | @@ -178,6 +360,8 @@ Provide multiple images in `content` (order matters): | `seed` | int | None | Random seed (reproducible) | | `negative_prompt` | str | None | Negative prompt | | `num_outputs_per_prompt` | int | 1 | Number of images to generate | +| `layers` | int | 4 | Number of layers (Qwen-Image-Layered) | +| `resolution` | int | 640 | Resolution, 640 or 1024 (Qwen-Image-Layered) | ## Response Format diff --git a/docs/user_guide/examples/online_serving/text_to_image.md b/docs/user_guide/examples/online_serving/text_to_image.md index 7931294883e..5ea4ba51156 100644 --- 
a/docs/user_guide/examples/online_serving/text_to_image.md +++ b/docs/user_guide/examples/online_serving/text_to_image.md @@ -71,13 +71,39 @@ curl -s http://localhost:8091/v1/chat/completions \ }' | jq -r '.choices[0].message.content[0].image_url.url' | cut -d',' -f2- | base64 -d > output.png ``` -### Method 2: Using Python Client +### Method 2: Using OpenAI Python SDK + +```python +from openai import OpenAI +import base64 + +client = OpenAI(base_url="http://localhost:8091/v1", api_key="none") + +response = client.chat.completions.create( + model="Qwen/Qwen-Image", + messages=[{"role": "user", "content": "A beautiful landscape painting"}], + extra_body={ + "height": 1024, + "width": 1024, + "num_inference_steps": 50, + "true_cfg_scale": 4.0, + "seed": 42, + }, +) + +img_url = response.choices[0].message.content[0].image_url.url +_, b64_data = img_url.split(",", 1) +with open("output.png", "wb") as f: + f.write(base64.b64decode(b64_data)) +``` + +### Method 3: Using Python Client Script ```bash python openai_chat_client.py --prompt "A beautiful landscape painting" --output output.png ``` -### Method 3: Using Gradio Demo +### Method 4: Using Gradio Demo ```bash python gradio_demo.py @@ -151,7 +177,7 @@ lora_adapter/ ### Generation with Parameters -Use `extra_body` to pass generation parameters: +Wrap generation parameters inside `extra_body` in the request JSON: ```json { @@ -168,6 +194,21 @@ Use `extra_body` to pass generation parameters: } ``` +!!! tip "Using the OpenAI SDK" + When using the OpenAI Python SDK, pass these parameters via the `extra_body` + keyword argument. 
The SDK merges them into the top-level request body automatically: + + ```python + client.chat.completions.create( + model="Qwen/Qwen-Image", + messages=[...], + extra_body={"height": 1024, "width": 1024, "num_inference_steps": 50}, + ) + ``` + + For details on how generation parameters are handled across different clients, see the + [Diffusion Chat API guide](../../../../serving/diffusion_chat_api.md). + ### Multimodal Input (Text + Structured Content) ```json @@ -183,7 +224,7 @@ Use `extra_body` to pass generation parameters: } ``` -## Generation Parameters (extra_body) +## Generation Parameters | Parameter | Type | Default | Description | | ------------------------ | ----- | ------- | ------------------------------ | @@ -195,7 +236,7 @@ Use `extra_body` to pass generation parameters: | `seed` | int | None | Random seed (reproducible) | | `negative_prompt` | str | None | Negative prompt | | `num_outputs_per_prompt` | int | 1 | Number of images to generate | -| `--cfg-parallel-size`. | int | 1 | Number of GPUs for CFG parallelism | +| `--cfg-parallel-size` | int | 1 | Number of GPUs for CFG parallelism | ## Response Format diff --git a/examples/online_serving/glm_image/README.md b/examples/online_serving/glm_image/README.md index 5efeba8068c..80dfcb2926c 100644 --- a/examples/online_serving/glm_image/README.md +++ b/examples/online_serving/glm_image/README.md @@ -101,6 +101,36 @@ curl -s http://localhost:8091/v1/chat/completions \ }' | jq -r '.choices[0].message.content[0].image_url.url' | cut -d',' -f2- | base64 -d > output.png ``` +**Using OpenAI SDK** + +```python +from openai import OpenAI +import base64 + +client = OpenAI(base_url="http://localhost:8091/v1", api_key="none") + +response = client.chat.completions.create( + model="zai-org/GLM-Image", + messages=[{"role": "user", "content": "A beautiful sunset over the ocean"}], + extra_body={ + "height": 1024, + "width": 1024, + "num_inference_steps": 50, + "guidance_scale": 1.5, + "seed": 42, + }, +) + +img_url 
= response.choices[0].message.content[0].image_url.url +_, b64_data = img_url.split(",", 1) +with open("output.png", "wb") as f: + f.write(base64.b64decode(b64_data)) +``` + +> **Note:** The OpenAI SDK's `extra_body` keyword merges parameters into the top-level +> request body. This is different from placing a literal `"extra_body"` key in the JSON +> (as shown in the curl example), but both formats are supported by the server. + Or use the script: ```bash @@ -153,7 +183,10 @@ Or use the script: bash run_curl_image_edit.sh input.png "Convert to watercolor style" ``` -## Generation Parameters (extra_body) +## Generation Parameters + +These parameters can be passed inside `extra_body` in the curl JSON, or via the +`extra_body` keyword argument when using the OpenAI Python SDK. | Parameter | Type | Default | Description | | --------------------- | ----- | ------- | ----------------------------------- | diff --git a/examples/online_serving/image_to_image/README.md b/examples/online_serving/image_to_image/README.md index f69fa8b4286..c5a1cf9ea52 100644 --- a/examples/online_serving/image_to_image/README.md +++ b/examples/online_serving/image_to_image/README.md @@ -69,7 +69,46 @@ EOF curl -s http://localhost:8092/v1/chat/completions -H "Content-Type: application/json" -d @request.json | jq -r '.choices[0].message.content[0].image_url.url' | cut -d',' -f2 | base64 -d > output.png ``` -### Method 2: Using Python Client +### Method 2: Using OpenAI Python SDK + +```python +import base64 +from openai import OpenAI + +client = OpenAI(base_url="http://localhost:8092/v1", api_key="none") + +with open("input.png", "rb") as f: + img_b64 = base64.b64encode(f.read()).decode() + +response = client.chat.completions.create( + model="Qwen/Qwen-Image-Edit", + messages=[{ + "role": "user", + "content": [ + {"type": "text", "text": "Convert to watercolor style"}, + {"type": "image_url", "image_url": { + "url": f"data:image/png;base64,{img_b64}" + }}, + ], + }], + extra_body={ + 
"num_inference_steps": 50, + "guidance_scale": 1, + "seed": 42, + }, +) + +img_url = response.choices[0].message.content[0].image_url.url +_, b64_data = img_url.split(",", 1) +with open("output.png", "wb") as f: + f.write(base64.b64decode(b64_data)) +``` + +> **Note:** The OpenAI SDK's `extra_body` keyword merges parameters into the top-level +> request body. This is different from placing a literal `"extra_body"` key in the JSON +> (as shown in the curl example), but both formats are supported by the server. + +### Method 3: Using Python Client Script ```bash python openai_chat_client.py --input input.png --prompt "Convert to oil painting style" --output output.png @@ -78,7 +117,7 @@ python openai_chat_client.py --input input.png --prompt "Convert to oil painting python openai_chat_client.py --input input1.png input2.png --prompt "Combine these images into a single scene" --output output.png ``` -### Method 3: Using Gradio Demo +### Method 4: Using Gradio Demo ```bash python gradio_demo.py @@ -144,6 +183,97 @@ Use `extra_body` to pass generation parameters: } ``` +### Layered Image Generation (Qwen-Image-Layered) + +Qwen-Image-Layered generates multiple decomposed layers from a reference image and a text prompt. 
+Start the server with: + +```bash +vllm serve Qwen/Qwen-Image-Layered --omni --port 8093 +``` + +**Using curl** + +```bash +IMG_B64=$(base64 -w0 input.png) + +curl -sS http://localhost:8093/v1/chat/completions \ + -H "Content-Type: application/json" \ + -d "$(jq -n --arg img "$IMG_B64" '{ + messages: [{ + role: "user", + content: [ + {type: "image_url", image_url: {url: ("data:image/png;base64," + $img)}}, + {type: "text", text: "a rabbit"} + ] + }], + extra_body: { + num_inference_steps: 50, + cfg_scale: 4.0, + seed: 0, + layers: 4, + resolution: 640 + } + }')" \ + | jq -r '.choices[0].message.content[] | .image_url.url | split(",")[1]' \ + | while IFS= read -r b64; do + ((i++)); echo "$b64" | base64 -d > "layer_${i}.png" + done +``` + +**Using Python** + +```python +import base64 +import requests + +with open("input.png", "rb") as f: + img_b64 = base64.b64encode(f.read()).decode() + +payload = { + "messages": [{ + "role": "user", + "content": [ + {"type": "image_url", "image_url": { + "url": f"data:image/png;base64,{img_b64}" + }}, + {"type": "text", "text": "a rabbit"}, + ], + }], + "extra_body": { + "num_inference_steps": 50, + "cfg_scale": 4.0, + "seed": 0, + "layers": 4, + "resolution": 640, + }, +} + +resp = requests.post( + "http://localhost:8093/v1/chat/completions", + json=payload, + timeout=600, +) +data = resp.json() + +for i, item in enumerate(data["choices"][0]["message"]["content"]): + _, b64_data = item["image_url"]["url"].split(",", 1) + with open(f"layer_{i}.png", "wb") as f: + f.write(base64.b64decode(b64_data)) +``` + +The response contains multiple images in `choices[0].message.content` — one per generated layer. 
+ +#### Qwen-Image-Layered Parameters + +| Parameter | Type | Default | Description | +|-----------|------|---------|-------------| +| `layers` | int | 4 | Number of layers to decompose | +| `resolution` | int | 640 | Resolution for dimension calculation (640 or 1024) | +| `cfg_scale` | float | 4.0 | Classifier-free guidance scale (alias for `true_cfg_scale`) | +| `num_inference_steps` | int | 50 | Number of denoising steps | +| `seed` | int | None | Random seed for reproducibility | + ### Multi-Image Editing (Qwen-Image-Edit-2509) Provide multiple images in `content` (order matters): @@ -163,7 +293,10 @@ Provide multiple images in `content` (order matters): } ``` -## Generation Parameters (extra_body) +## Generation Parameters + +These parameters can be passed inside `extra_body` in the curl JSON, or via the +`extra_body` keyword argument when using the OpenAI Python SDK. | Parameter | Type | Default | Description | | ------------------------ | ----- | ------- | ------------------------------------- | @@ -175,6 +308,8 @@ Provide multiple images in `content` (order matters): | `seed` | int | None | Random seed (reproducible) | | `negative_prompt` | str | None | Negative prompt | | `num_outputs_per_prompt` | int | 1 | Number of images to generate | +| `layers` | int | 4 | Number of layers (Qwen-Image-Layered) | +| `resolution` | int | 640 | Resolution, 640 or 1024 (Qwen-Image-Layered) | ## Response Format diff --git a/examples/online_serving/text_to_image/README.md b/examples/online_serving/text_to_image/README.md index 140036d00c7..528c22cf9eb 100644 --- a/examples/online_serving/text_to_image/README.md +++ b/examples/online_serving/text_to_image/README.md @@ -45,13 +45,43 @@ curl -s http://localhost:8091/v1/chat/completions \ }' | jq -r '.choices[0].message.content[0].image_url.url' | cut -d',' -f2- | base64 -d > output.png ``` -### Method 2: Using Python Client +### Method 2: Using OpenAI Python SDK + +```python +from openai import OpenAI +import base64 + +client 
= OpenAI(base_url="http://localhost:8091/v1", api_key="none") + +response = client.chat.completions.create( + model="Qwen/Qwen-Image", + messages=[{"role": "user", "content": "A beautiful landscape painting"}], + extra_body={ + "height": 1024, + "width": 1024, + "num_inference_steps": 50, + "true_cfg_scale": 4.0, + "seed": 42, + }, +) + +img_url = response.choices[0].message.content[0].image_url.url +_, b64_data = img_url.split(",", 1) +with open("output.png", "wb") as f: + f.write(base64.b64decode(b64_data)) +``` + +> **Note:** The OpenAI SDK's `extra_body` keyword merges parameters into the top-level +> request body. This is different from placing a literal `"extra_body"` key in the JSON +> (as shown in the curl example), but both formats are supported by the server. + +### Method 3: Using Python Client Script ```bash python openai_chat_client.py --prompt "A beautiful landscape painting" --output output.png ``` -### Method 3: Using Gradio Demo +### Method 4: Using Gradio Demo ```bash python gradio_demo.py @@ -157,7 +187,10 @@ Use `extra_body` to pass generation parameters: } ``` -## Generation Parameters (extra_body) +## Generation Parameters + +These parameters can be passed inside `extra_body` in the curl JSON, or via the +`extra_body` keyword argument when using the OpenAI Python SDK. | Parameter | Type | Default | Description | | ------------------------ | ----- | ------- | ------------------------------ | @@ -169,7 +202,7 @@ Use `extra_body` to pass generation parameters: | `seed` | int | None | Random seed (reproducible) | | `negative_prompt` | str | None | Negative prompt | | `num_outputs_per_prompt` | int | 1 | Number of images to generate | -| `--cfg-parallel-size`. 
| int | 1 | Number of GPUs for CFG parallelism | +| `--cfg-parallel-size` | int | 1 | Number of GPUs for CFG parallelism | ## Response Format From 824420dd43abf90bc316bb52e9043c473c801736 Mon Sep 17 00:00:00 2001 From: Samit <285365963@qq.com> Date: Mon, 23 Mar 2026 23:07:10 +0800 Subject: [PATCH 2/7] Update examples/online_serving/text_to_image/README.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Zeyu Huang | 黃澤宇 <11222265+fhfuih@users.noreply.github.com> Signed-off-by: Samit <285365963@qq.com> --- examples/online_serving/text_to_image/README.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/examples/online_serving/text_to_image/README.md b/examples/online_serving/text_to_image/README.md index 528c22cf9eb..7c92b431c18 100644 --- a/examples/online_serving/text_to_image/README.md +++ b/examples/online_serving/text_to_image/README.md @@ -71,9 +71,10 @@ with open("output.png", "wb") as f: f.write(base64.b64decode(b64_data)) ``` -> **Note:** The OpenAI SDK's `extra_body` keyword merges parameters into the top-level -> request body. This is different from placing a literal `"extra_body"` key in the JSON -> (as shown in the curl example), but both formats are supported by the server. +!!! note + The OpenAI SDK's `extra_body` keyword merges parameters into the top-level + request body. This is different from placing a literal `"extra_body"` key in the JSON + (as shown in the curl example), but both formats are supported by the server. 
### Method 3: Using Python Client Script From 718be6584eef613b7f1b435a32c923f5a6864c78 Mon Sep 17 00:00:00 2001 From: samithuang <285365963@qq.com> Date: Mon, 23 Mar 2026 15:12:18 +0000 Subject: [PATCH 3/7] docs: address PR review feedback - Remove `--cfg-parallel-size` from param tables (server CLI flag, not request param) - Reframe diffusion_chat_api.md: nested `extra_body` is the primary format for curl/requests, remove "Legacy" label and endpoint overview table - Update curl/requests examples in the guide to use nested `extra_body` - Remove video-specific params from the chat API guide (out of scope) - Unify note wording across READMEs for SDK vs JSON `extra_body` - Fix glm_image.md parameter table intro for consistency Made-with: Cursor Signed-off-by: samithuang <285365963@qq.com> --- docs/serving/diffusion_chat_api.md | 78 +++++++------------ .../examples/online_serving/glm_image.md | 3 +- .../examples/online_serving/text_to_image.md | 1 - examples/online_serving/glm_image/README.md | 8 +- .../online_serving/image_to_image/README.md | 8 +- .../online_serving/text_to_image/README.md | 8 +- 6 files changed, 45 insertions(+), 61 deletions(-) diff --git a/docs/serving/diffusion_chat_api.md b/docs/serving/diffusion_chat_api.md index 41e9eada9f7..b0dfec2e9a2 100644 --- a/docs/serving/diffusion_chat_api.md +++ b/docs/serving/diffusion_chat_api.md @@ -9,29 +9,17 @@ to diffusion models through this endpoint across different client libraries. [`/v1/images/generations`](image_generation_api.md) endpoint accepts these parameters as top-level fields and may be simpler for your use case. -## API Endpoints Overview - -vLLM-Omni provides multiple endpoints for diffusion models. 
Each has its own parameter-passing -convention: - -| Endpoint | Use Case | Parameter Format | -|----------|----------|-----------------| -| `/v1/chat/completions` | Image gen/edit via chat | Generation params in `extra_body` (see below) | -| `/v1/images/generations` | Dedicated text-to-image | Top-level JSON fields | -| `/v1/images/edits` | Dedicated image editing | Multipart form fields | -| `/v1/videos` | Video generation | Multipart form fields | - -## Passing Generation Parameters via `/v1/chat/completions` +## Passing Generation Parameters The `/v1/chat/completions` endpoint follows the OpenAI Chat API schema, which does not natively include diffusion-specific fields like `num_inference_steps` or `height`. vLLM-Omni accepts these as **extra fields** on the request body. -There are two supported methods depending on your client: +How you pass these fields depends on your client: -### Method 1: Using curl or Python `requests` +### Using curl or Python `requests` -Put generation parameters as **top-level fields** in the JSON body alongside `messages`: +Wrap generation parameters inside an `"extra_body"` key in the JSON body: === "curl" @@ -42,11 +30,13 @@ Put generation parameters as **top-level fields** in the JSON body alongside `me "messages": [ {"role": "user", "content": "A beautiful landscape painting"} ], - "height": 1024, - "width": 1024, - "num_inference_steps": 50, - "true_cfg_scale": 4.0, - "seed": 42 + "extra_body": { + "height": 1024, + "width": 1024, + "num_inference_steps": 50, + "true_cfg_scale": 4.0, + "seed": 42 + } }' | jq -r '.choices[0].message.content[0].image_url.url' \ | cut -d',' -f2- | base64 -d > output.png ``` @@ -61,11 +51,13 @@ Put generation parameters as **top-level fields** in the JSON body alongside `me "messages": [ {"role": "user", "content": "A beautiful landscape painting"} ], - "height": 1024, - "width": 1024, - "num_inference_steps": 50, - "true_cfg_scale": 4.0, - "seed": 42, + "extra_body": { + "height": 1024, + "width": 
1024, + "num_inference_steps": 50, + "true_cfg_scale": 4.0, + "seed": 42, + }, } resp = requests.post( @@ -81,7 +73,7 @@ Put generation parameters as **top-level fields** in the JSON body alongside `me f.write(base64.b64decode(b64_data)) ``` -### Method 2: Using the OpenAI Python SDK +### Using the OpenAI Python SDK The OpenAI Python SDK uses the `extra_body` keyword argument to pass non-standard fields. The SDK automatically merges these into the top-level request body: @@ -112,23 +104,11 @@ with open("output.png", "wb") as f: f.write(base64.b64decode(b64_data)) ``` -### Legacy Format: Nested `extra_body` in JSON - -You may see examples that nest generation parameters inside an `"extra_body"` key in the -JSON body. This format is still supported for backward compatibility: - -```json -{ - "messages": [{"role": "user", "content": "A beautiful landscape painting"}], - "extra_body": { - "height": 1024, - "width": 1024, - "num_inference_steps": 50 - } -} -``` - -Both formats (top-level fields and nested `extra_body`) are accepted. +!!! note "SDK `extra_body` vs. JSON `extra_body`" + The OpenAI SDK's `extra_body` keyword argument and the literal `"extra_body"` key in + curl/requests JSON serve the same purpose but work differently under the hood. + The SDK flattens `extra_body` fields into the top-level request body, while the JSON + approach nests them. Both are handled correctly by the server. !!! 
note "About the `ignored fields` warning" When sending non-standard fields, you may see a log message like: @@ -160,9 +140,11 @@ For image editing, include both text and image in the message content: {"type": "image_url", "image_url": {"url": "data:image/png;base64,'"$IMG_B64"'"}} ] }], - "num_inference_steps": 50, - "guidance_scale": 1, - "seed": 42 + "extra_body": { + "num_inference_steps": 50, + "guidance_scale": 1, + "seed": 42 + } }' | jq -r '.choices[0].message.content[0].image_url.url' \ | cut -d',' -f2 | base64 -d > output.png ``` @@ -218,8 +200,6 @@ diffusion models: | `seed` | int | Random seed for reproducibility | | `negative_prompt` | str | Text describing what to avoid | | `num_outputs_per_prompt` | int | Number of images to generate (default: 1) | -| `num_frames` | int | Number of frames (video models) | -| `guidance_scale_2` | float | Secondary guidance scale (Wan2.2 models) | | `layers` | int | Number of layers to generate (Qwen-Image-Layered, default: 4) | | `resolution` | int | Resolution for dimension calculation (Qwen-Image-Layered, 640 or 1024) | | `lora` | object | Per-request LoRA adapter configuration | diff --git a/docs/user_guide/examples/online_serving/glm_image.md b/docs/user_guide/examples/online_serving/glm_image.md index d170a5511a4..45dbb53dbac 100644 --- a/docs/user_guide/examples/online_serving/glm_image.md +++ b/docs/user_guide/examples/online_serving/glm_image.md @@ -184,7 +184,8 @@ bash run_curl_image_edit.sh input.png "Convert to watercolor style" ## Generation Parameters -These can be passed as top-level fields in curl/requests, or via `extra_body` in the OpenAI SDK. +These can be passed inside `extra_body` in the curl JSON, or via the +`extra_body` keyword argument when using the OpenAI Python SDK. See the [Diffusion Chat API guide](../../../../serving/diffusion_chat_api.md) for details. 
| Parameter | Type | Default | Description | diff --git a/docs/user_guide/examples/online_serving/text_to_image.md b/docs/user_guide/examples/online_serving/text_to_image.md index 5ea4ba51156..73eb9613c9b 100644 --- a/docs/user_guide/examples/online_serving/text_to_image.md +++ b/docs/user_guide/examples/online_serving/text_to_image.md @@ -236,7 +236,6 @@ Wrap generation parameters inside `extra_body` in the request JSON: | `seed` | int | None | Random seed (reproducible) | | `negative_prompt` | str | None | Negative prompt | | `num_outputs_per_prompt` | int | 1 | Number of images to generate | -| `--cfg-parallel-size` | int | 1 | Number of GPUs for CFG parallelism | ## Response Format diff --git a/examples/online_serving/glm_image/README.md b/examples/online_serving/glm_image/README.md index 80dfcb2926c..16685ee5db7 100644 --- a/examples/online_serving/glm_image/README.md +++ b/examples/online_serving/glm_image/README.md @@ -127,9 +127,11 @@ with open("output.png", "wb") as f: f.write(base64.b64decode(b64_data)) ``` -> **Note:** The OpenAI SDK's `extra_body` keyword merges parameters into the top-level -> request body. This is different from placing a literal `"extra_body"` key in the JSON -> (as shown in the curl example), but both formats are supported by the server. +!!! note + The OpenAI SDK's `extra_body` keyword argument merges parameters into the + top-level request body automatically. When using curl or Python `requests`, + wrap generation parameters inside a literal `"extra_body"` key in the JSON + instead (as shown in the curl example above). 
Or use the script: diff --git a/examples/online_serving/image_to_image/README.md b/examples/online_serving/image_to_image/README.md index c5a1cf9ea52..d9cae7e27c4 100644 --- a/examples/online_serving/image_to_image/README.md +++ b/examples/online_serving/image_to_image/README.md @@ -104,9 +104,11 @@ with open("output.png", "wb") as f: f.write(base64.b64decode(b64_data)) ``` -> **Note:** The OpenAI SDK's `extra_body` keyword merges parameters into the top-level -> request body. This is different from placing a literal `"extra_body"` key in the JSON -> (as shown in the curl example), but both formats are supported by the server. +!!! note + The OpenAI SDK's `extra_body` keyword argument merges parameters into the + top-level request body automatically. When using curl or Python `requests`, + wrap generation parameters inside a literal `"extra_body"` key in the JSON + instead (as shown in the curl example above). ### Method 3: Using Python Client Script diff --git a/examples/online_serving/text_to_image/README.md b/examples/online_serving/text_to_image/README.md index 7c92b431c18..2f88e339a6c 100644 --- a/examples/online_serving/text_to_image/README.md +++ b/examples/online_serving/text_to_image/README.md @@ -72,9 +72,10 @@ with open("output.png", "wb") as f: ``` !!! note - The OpenAI SDK's `extra_body` keyword merges parameters into the top-level - request body. This is different from placing a literal `"extra_body"` key in the JSON - (as shown in the curl example), but both formats are supported by the server. + The OpenAI SDK's `extra_body` keyword argument merges parameters into the + top-level request body automatically. When using curl or Python `requests`, + wrap generation parameters inside a literal `"extra_body"` key in the JSON + instead (as shown in the curl example above). 
### Method 3: Using Python Client Script @@ -203,7 +204,6 @@ These parameters can be passed inside `extra_body` in the curl JSON, or via the | `seed` | int | None | Random seed (reproducible) | | `negative_prompt` | str | None | Negative prompt | | `num_outputs_per_prompt` | int | 1 | Number of images to generate | -| `--cfg-parallel-size` | int | 1 | Number of GPUs for CFG parallelism | ## Response Format From 77ecb0e6ea54a0ff8abcf56acf0664ceeefc553a Mon Sep 17 00:00:00 2001 From: samithuang <285365963@qq.com> Date: Mon, 23 Mar 2026 15:25:08 +0000 Subject: [PATCH 4/7] docs: slim down diffusion_chat_api.md to avoid content duplication Remove duplicated examples, parameter table, and image-editing section that overlap with model-specific docs. Keep only the unique content: the extra_body SDK-vs-JSON explanation and the "ignored fields" warning. Add links to model-specific guides for full examples. Addresses fhfuih's review feedback about single source of truth. Made-with: Cursor Signed-off-by: samithuang <285365963@qq.com> --- docs/serving/diffusion_chat_api.md | 218 ++++++----------------------- 1 file changed, 43 insertions(+), 175 deletions(-) diff --git a/docs/serving/diffusion_chat_api.md b/docs/serving/diffusion_chat_api.md index b0dfec2e9a2..d0e2990ad6c 100644 --- a/docs/serving/diffusion_chat_api.md +++ b/docs/serving/diffusion_chat_api.md @@ -1,210 +1,78 @@ # Diffusion Chat Completions API -vLLM-Omni supports generating images via the `/v1/chat/completions` endpoint using diffusion models. -This page explains how to pass generation parameters (such as `num_inference_steps`, `height`, `width`) -to diffusion models through this endpoint across different client libraries. +vLLM-Omni supports generating and editing images via the `/v1/chat/completions` +endpoint using diffusion models. This page explains how to pass generation +parameters (such as `num_inference_steps`, `height`, `width`) to diffusion +models through this endpoint. !!! 
tip - For text-to-image generation without chat context, the dedicated - [`/v1/images/generations`](image_generation_api.md) endpoint accepts these - parameters as top-level fields and may be simpler for your use case. + For dedicated endpoints that accept generation parameters as top-level + fields, see [Image Generation API](image_generation_api.md) and + [Image Edit API](image_edit_api.md). ## Passing Generation Parameters -The `/v1/chat/completions` endpoint follows the OpenAI Chat API schema, which does not natively -include diffusion-specific fields like `num_inference_steps` or `height`. vLLM-Omni accepts -these as **extra fields** on the request body. +The `/v1/chat/completions` endpoint follows the OpenAI Chat API schema, which +does not natively include diffusion-specific fields like `num_inference_steps` +or `height`. How you pass these extra fields depends on your client. -How you pass these fields depends on your client: - -### Using curl or Python `requests` +### curl / Python `requests` Wrap generation parameters inside an `"extra_body"` key in the JSON body: -=== "curl" - - ```bash - curl -s http://localhost:8091/v1/chat/completions \ - -H "Content-Type: application/json" \ - -d '{ - "messages": [ - {"role": "user", "content": "A beautiful landscape painting"} - ], - "extra_body": { - "height": 1024, - "width": 1024, - "num_inference_steps": 50, - "true_cfg_scale": 4.0, - "seed": 42 - } - }' | jq -r '.choices[0].message.content[0].image_url.url' \ - | cut -d',' -f2- | base64 -d > output.png - ``` - -=== "Python requests" - - ```python - import requests - import base64 - - payload = { - "messages": [ - {"role": "user", "content": "A beautiful landscape painting"} - ], - "extra_body": { - "height": 1024, - "width": 1024, - "num_inference_steps": 50, - "true_cfg_scale": 4.0, - "seed": 42, - }, +```bash +curl -s http://localhost:8091/v1/chat/completions \ + -H "Content-Type: application/json" \ + -d '{ + "messages": [ + {"role": "user", "content": "A 
beautiful landscape painting"} + ], + "extra_body": { + "num_inference_steps": 50, + "seed": 42 } + }' +``` - resp = requests.post( - "http://localhost:8091/v1/chat/completions", - json=payload, - timeout=300, - ) - data = resp.json() - - img_url = data["choices"][0]["message"]["content"][0]["image_url"]["url"] - _, b64_data = img_url.split(",", 1) - with open("output.png", "wb") as f: - f.write(base64.b64decode(b64_data)) - ``` - -### Using the OpenAI Python SDK +### OpenAI Python SDK -The OpenAI Python SDK uses the `extra_body` keyword argument to pass non-standard fields. -The SDK automatically merges these into the top-level request body: +Use the `extra_body` **keyword argument**. The SDK automatically merges these +fields into the top-level request body: ```python -from openai import OpenAI -import base64 - -client = OpenAI(base_url="http://localhost:8091/v1", api_key="none") - response = client.chat.completions.create( model="Qwen/Qwen-Image", - messages=[ - {"role": "user", "content": "A beautiful landscape painting"} - ], + messages=[{"role": "user", "content": "A beautiful landscape painting"}], extra_body={ - "height": 1024, - "width": 1024, "num_inference_steps": 50, - "true_cfg_scale": 4.0, "seed": 42, }, ) - -img_url = response.choices[0].message.content[0].image_url.url -_, b64_data = img_url.split(",", 1) -with open("output.png", "wb") as f: - f.write(base64.b64decode(b64_data)) ``` !!! note "SDK `extra_body` vs. JSON `extra_body`" - The OpenAI SDK's `extra_body` keyword argument and the literal `"extra_body"` key in - curl/requests JSON serve the same purpose but work differently under the hood. - The SDK flattens `extra_body` fields into the top-level request body, while the JSON - approach nests them. Both are handled correctly by the server. + These two `extra_body` usages look similar but work differently under the + hood. 
The SDK flattens the dict into the top-level request JSON, while the + curl/requests approach sends it as a nested `"extra_body"` key. Both are + handled correctly by the server. !!! note "About the `ignored fields` warning" - When sending non-standard fields, you may see a log message like: + You may see a log message like: ``` WARNING: The following fields were present in the request but ignored: {'height', 'width', ...} ``` - This warning is **harmless**. It is emitted by vLLM's request validation layer because - these fields are not part of the standard OpenAI `ChatCompletionRequest` schema. - The fields are still stored internally and correctly forwarded to the diffusion pipeline. - -## Image Editing (Image-to-Image) - -For image editing, include both text and image in the message content: - -=== "curl" - - ```bash - IMG_B64=$(base64 -w0 input.png) - - curl -s http://localhost:8092/v1/chat/completions \ - -H "Content-Type: application/json" \ - -d '{ - "messages": [{ - "role": "user", - "content": [ - {"type": "text", "text": "Convert to watercolor style"}, - {"type": "image_url", "image_url": {"url": "data:image/png;base64,'"$IMG_B64"'"}} - ] - }], - "extra_body": { - "num_inference_steps": 50, - "guidance_scale": 1, - "seed": 42 - } - }' | jq -r '.choices[0].message.content[0].image_url.url' \ - | cut -d',' -f2 | base64 -d > output.png - ``` + This is **harmless**. It is emitted by vLLM's request validation layer + because these fields are not part of the standard OpenAI + `ChatCompletionRequest` schema. The fields are still stored internally + and correctly forwarded to the diffusion pipeline. 
-=== "OpenAI SDK" - - ```python - import base64 - from openai import OpenAI - - client = OpenAI(base_url="http://localhost:8092/v1", api_key="none") - - with open("input.png", "rb") as f: - img_b64 = base64.b64encode(f.read()).decode() - - response = client.chat.completions.create( - model="Qwen/Qwen-Image-Edit", - messages=[{ - "role": "user", - "content": [ - {"type": "text", "text": "Convert to watercolor style"}, - {"type": "image_url", "image_url": { - "url": f"data:image/png;base64,{img_b64}" - }}, - ], - }], - extra_body={ - "num_inference_steps": 50, - "guidance_scale": 1, - "seed": 42, - }, - ) - - img_url = response.choices[0].message.content[0].image_url.url - _, b64_data = img_url.split(",", 1) - with open("output.png", "wb") as f: - f.write(base64.b64decode(b64_data)) - ``` +## Model-Specific Examples + +For complete examples with full request/response details, see the model-specific +guides: -## Generation Parameters Reference - -The following parameters are accepted as extra fields on `/v1/chat/completions` for -diffusion models: - -| Parameter | Type | Description | -|-----------|------|-------------| -| `height` | int | Output image height in pixels | -| `width` | int | Output image width in pixels | -| `size` | str | Output size in "WxH" format (alternative to separate height/width) | -| `num_inference_steps` | int | Number of denoising steps | -| `guidance_scale` | float | Classifier-free guidance scale | -| `true_cfg_scale` | float | True CFG scale (Qwen-Image specific) | -| `seed` | int | Random seed for reproducibility | -| `negative_prompt` | str | Text describing what to avoid | -| `num_outputs_per_prompt` | int | Number of images to generate (default: 1) | -| `layers` | int | Number of layers to generate (Qwen-Image-Layered, default: 4) | -| `resolution` | int | Resolution for dimension calculation (Qwen-Image-Layered, 640 or 1024) | -| `lora` | object | Per-request LoRA adapter configuration | - -!!! 
info "Model-specific defaults" - When a parameter is not specified, the underlying diffusion pipeline applies its own - model-specific default. For example, `num_inference_steps` defaults to 50 for most models - but may differ for turbo/distilled variants. +- [Text-to-Image (Qwen-Image)](../user_guide/examples/online_serving/text_to_image.md) +- [Image-to-Image (Qwen-Image-Edit, Qwen-Image-Layered)](../user_guide/examples/online_serving/image_to_image.md) +- [GLM-Image](../user_guide/examples/online_serving/glm_image.md) From da4b4fd48cce74544681463cbf74da295918cc55 Mon Sep 17 00:00:00 2001 From: samithuang <285365963@qq.com> Date: Mon, 23 Mar 2026 15:29:30 +0000 Subject: [PATCH 5/7] docs: simplify glm_image docs to avoid repeating generic request methods Remove inline curl and OpenAI SDK code blocks that duplicate the general text-to-image and image-to-image guides. Keep only the model-specific script examples (openai_chat_client.py, run_curl_*.sh) and link to the general guides for other request methods. Addresses fhfuih's review feedback. Made-with: Cursor Signed-off-by: samithuang <285365963@qq.com> --- .../examples/online_serving/glm_image.md | 94 ++-------------- examples/online_serving/glm_image/README.md | 100 ++---------------- 2 files changed, 12 insertions(+), 182 deletions(-) diff --git a/docs/user_guide/examples/online_serving/glm_image.md b/docs/user_guide/examples/online_serving/glm_image.md index 45dbb53dbac..bc151d6f84b 100644 --- a/docs/user_guide/examples/online_serving/glm_image.md +++ b/docs/user_guide/examples/online_serving/glm_image.md @@ -73,115 +73,33 @@ The default yaml configuration deploys AR on GPU 0 and DiT on GPU 1. 
You can use ### Text-to-Image -Generate images from text prompts: - -**Using Python client** - ```bash python openai_chat_client.py \ --prompt "A photorealistic mountain landscape at sunset" \ --height 1024 \ --width 1024 \ --output landscape.png -``` - -**Using curl** -```bash -curl -s http://localhost:8091/v1/chat/completions \ - -H "Content-Type: application/json" \ - -d '{ - "messages": [ - {"role": "user", "content": "A beautiful sunset over the ocean with sailing boats"} - ], - "extra_body": { - "height": 1024, - "width": 1024, - "num_inference_steps": 50, - "guidance_scale": 1.5, - "seed": 42 - } - }' | jq -r '.choices[0].message.content[0].image_url.url' | cut -d',' -f2- | base64 -d > output.png -``` - -**Using OpenAI SDK** - -```python -from openai import OpenAI -import base64 - -client = OpenAI(base_url="http://localhost:8091/v1", api_key="none") - -response = client.chat.completions.create( - model="zai-org/GLM-Image", - messages=[{"role": "user", "content": "A beautiful sunset over the ocean"}], - extra_body={ - "height": 1024, - "width": 1024, - "num_inference_steps": 50, - "guidance_scale": 1.5, - "seed": 42, - }, -) - -img_url = response.choices[0].message.content[0].image_url.url -_, b64_data = img_url.split(",", 1) -with open("output.png", "wb") as f: - f.write(base64.b64decode(b64_data)) -``` - -Or use the script: - -```bash +# Or use the curl script: bash run_curl_text_to_image.sh "A futuristic city skyline at night" ``` ### Image-to-Image (Image Editing) -Edit images with text instructions: - -**Using Python client** - ```bash python openai_chat_client.py \ --prompt "Convert this image to watercolor style" \ --image input.png \ --output watercolor.png -``` - -**Using curl** - -```bash -IMG_B64=$(base64 < input.png | tr -d '\n') - -curl -s http://localhost:8091/v1/chat/completions \ - -H "Content-Type: application/json" \ - -d @- < output.png -{ - "messages": [{ - "role": "user", - "content": [ - {"type": "text", "text": "Convert this image to 
watercolor style"}, - {"type": "image_url", "image_url": {"url": "data:image/png;base64,'$IMG_B64'"}} - ] - }], - "extra_body": { - "height": 1024, - "width": 1024, - "num_inference_steps": 50, - "guidance_scale": 1.5, - "seed": 42 - } -} -EOF -``` -Or use the script: - -```bash +# Or use the curl script: bash run_curl_image_edit.sh input.png "Convert to watercolor style" ``` +For general-purpose request methods (curl, OpenAI SDK, Python `requests`), see +the [Text-to-Image](text_to_image.md) and [Image-to-Image](image_to_image.md) +guides. + ## Generation Parameters These can be passed inside `extra_body` in the curl JSON, or via the diff --git a/examples/online_serving/glm_image/README.md b/examples/online_serving/glm_image/README.md index 16685ee5db7..2a7e301e70e 100644 --- a/examples/online_serving/glm_image/README.md +++ b/examples/online_serving/glm_image/README.md @@ -70,121 +70,33 @@ The default yaml configuration deploys AR on GPU 0 and DiT on GPU 1. You can use ### Text-to-Image -Generate images from text prompts: - -**Using Python client** - ```bash python openai_chat_client.py \ --prompt "A photorealistic mountain landscape at sunset" \ --height 1024 \ --width 1024 \ --output landscape.png -``` - -**Using curl** -```bash -curl -s http://localhost:8091/v1/chat/completions \ - -H "Content-Type: application/json" \ - -d '{ - "messages": [ - {"role": "user", "content": "A beautiful sunset over the ocean with sailing boats"} - ], - "extra_body": { - "height": 1024, - "width": 1024, - "num_inference_steps": 50, - "guidance_scale": 1.5, - "seed": 42 - } - }' | jq -r '.choices[0].message.content[0].image_url.url' | cut -d',' -f2- | base64 -d > output.png -``` - -**Using OpenAI SDK** - -```python -from openai import OpenAI -import base64 - -client = OpenAI(base_url="http://localhost:8091/v1", api_key="none") - -response = client.chat.completions.create( - model="zai-org/GLM-Image", - messages=[{"role": "user", "content": "A beautiful sunset over the ocean"}], - 
extra_body={ - "height": 1024, - "width": 1024, - "num_inference_steps": 50, - "guidance_scale": 1.5, - "seed": 42, - }, -) - -img_url = response.choices[0].message.content[0].image_url.url -_, b64_data = img_url.split(",", 1) -with open("output.png", "wb") as f: - f.write(base64.b64decode(b64_data)) -``` - -!!! note - The OpenAI SDK's `extra_body` keyword argument merges parameters into the - top-level request body automatically. When using curl or Python `requests`, - wrap generation parameters inside a literal `"extra_body"` key in the JSON - instead (as shown in the curl example above). - -Or use the script: - -```bash +# Or use the curl script: bash run_curl_text_to_image.sh "A futuristic city skyline at night" ``` ### Image-to-Image (Image Editing) -Edit images with text instructions: - -**Using Python client** - ```bash python openai_chat_client.py \ --prompt "Convert this image to watercolor style" \ --image input.png \ --output watercolor.png -``` - -**Using curl** -```bash -IMG_B64=$(base64 < input.png | tr -d '\n') - -curl -s http://localhost:8091/v1/chat/completions \ - -H "Content-Type: application/json" \ - -d @- < output.png -{ - "messages": [{ - "role": "user", - "content": [ - {"type": "text", "text": "Convert this image to watercolor style"}, - {"type": "image_url", "image_url": {"url": "data:image/png;base64,'$IMG_B64'"}} - ] - }], - "extra_body": { - "height": 1024, - "width": 1024, - "num_inference_steps": 50, - "guidance_scale": 1.5, - "seed": 42 - } -} -EOF -``` - -Or use the script: - -```bash +# Or use the curl script: bash run_curl_image_edit.sh input.png "Convert to watercolor style" ``` +For general-purpose request methods (curl, OpenAI SDK, Python `requests`), see +the [Text-to-Image](../text_to_image/README.md) and +[Image-to-Image](../image_to_image/README.md) guides. 
+ ## Generation Parameters These parameters can be passed inside `extra_body` in the curl JSON, or via the From 337287a23d6fdc1f6f2ed0c9b40b920dd13f57fc Mon Sep 17 00:00:00 2001 From: samithuang <285365963@qq.com> Date: Tue, 24 Mar 2026 07:49:18 +0000 Subject: [PATCH 6/7] docs: mention dedicated endpoints support top-level parameters Update the Generation Parameters sections in all model-specific docs to clarify that /v1/images/generations and /v1/images/edits accept parameters as top-level fields, while /v1/chat/completions requires them inside extra_body. Made-with: Cursor Signed-off-by: samithuang <285365963@qq.com> --- docs/user_guide/examples/online_serving/glm_image.md | 9 ++++++--- .../user_guide/examples/online_serving/image_to_image.md | 6 ++++++ docs/user_guide/examples/online_serving/text_to_image.md | 6 ++++++ examples/online_serving/glm_image/README.md | 6 ++++-- examples/online_serving/image_to_image/README.md | 6 ++++-- examples/online_serving/text_to_image/README.md | 6 ++++-- 6 files changed, 30 insertions(+), 9 deletions(-) diff --git a/docs/user_guide/examples/online_serving/glm_image.md b/docs/user_guide/examples/online_serving/glm_image.md index bc151d6f84b..37d7de6a64c 100644 --- a/docs/user_guide/examples/online_serving/glm_image.md +++ b/docs/user_guide/examples/online_serving/glm_image.md @@ -102,9 +102,12 @@ guides. ## Generation Parameters -These can be passed inside `extra_body` in the curl JSON, or via the -`extra_body` keyword argument when using the OpenAI Python SDK. -See the [Diffusion Chat API guide](../../../../serving/diffusion_chat_api.md) for details. +When using `/v1/chat/completions`, pass these inside `extra_body` in the curl +JSON, or via the `extra_body` keyword argument in the OpenAI Python SDK (see the +[Diffusion Chat API guide](../../../../serving/diffusion_chat_api.md)). 
+When using the dedicated [`/v1/images/generations`](../../../../serving/image_generation_api.md) +or [`/v1/images/edits`](../../../../serving/image_edit_api.md) endpoints, pass +them as top-level fields directly. | Parameter | Type | Default | Description | | --------------------- | ----- | ------- | ----------------------------------- | diff --git a/docs/user_guide/examples/online_serving/image_to_image.md b/docs/user_guide/examples/online_serving/image_to_image.md index 6b446749739..da6cbf220de 100644 --- a/docs/user_guide/examples/online_serving/image_to_image.md +++ b/docs/user_guide/examples/online_serving/image_to_image.md @@ -350,6 +350,12 @@ Provide multiple images in `content` (order matters): ## Generation Parameters +When using `/v1/chat/completions`, pass these inside `extra_body` in the curl +JSON, or via the `extra_body` keyword argument in the OpenAI Python SDK (see the +[Diffusion Chat API guide](../../../../serving/diffusion_chat_api.md)). +When using the dedicated [`/v1/images/edits`](../../../../serving/image_edit_api.md) +endpoint, pass them as top-level form fields directly. + | Parameter | Type | Default | Description | | ------------------------ | ----- | ------- | ------------------------------------- | | `height` | int | None | Output image height in pixels | diff --git a/docs/user_guide/examples/online_serving/text_to_image.md b/docs/user_guide/examples/online_serving/text_to_image.md index a1f2b8c9997..9d29cd5063c 100644 --- a/docs/user_guide/examples/online_serving/text_to_image.md +++ b/docs/user_guide/examples/online_serving/text_to_image.md @@ -226,6 +226,12 @@ Wrap generation parameters inside `extra_body` in the request JSON: ## Generation Parameters +When using `/v1/chat/completions`, pass these inside `extra_body` in the curl +JSON, or via the `extra_body` keyword argument in the OpenAI Python SDK (see the +[Diffusion Chat API guide](../../../../serving/diffusion_chat_api.md)). 
+When using the dedicated [`/v1/images/generations`](../../../../serving/image_generation_api.md) +endpoint, pass them as top-level JSON fields directly. + | Parameter | Type | Default | Description | | ------------------------ | ----- | ------- | ------------------------------ | | `height` | int | None | Image height in pixels | diff --git a/examples/online_serving/glm_image/README.md b/examples/online_serving/glm_image/README.md index 2a7e301e70e..54a4708a606 100644 --- a/examples/online_serving/glm_image/README.md +++ b/examples/online_serving/glm_image/README.md @@ -99,8 +99,10 @@ the [Text-to-Image](../text_to_image/README.md) and ## Generation Parameters -These parameters can be passed inside `extra_body` in the curl JSON, or via the -`extra_body` keyword argument when using the OpenAI Python SDK. +When using `/v1/chat/completions`, pass these inside `extra_body` in the curl +JSON, or via the `extra_body` keyword argument in the OpenAI Python SDK. +When using the dedicated `/v1/images/generations` or `/v1/images/edits` +endpoints, pass them as top-level fields directly. | Parameter | Type | Default | Description | | --------------------- | ----- | ------- | ----------------------------------- | diff --git a/examples/online_serving/image_to_image/README.md b/examples/online_serving/image_to_image/README.md index d9cae7e27c4..1d0a1d3961d 100644 --- a/examples/online_serving/image_to_image/README.md +++ b/examples/online_serving/image_to_image/README.md @@ -297,8 +297,10 @@ Provide multiple images in `content` (order matters): ## Generation Parameters -These parameters can be passed inside `extra_body` in the curl JSON, or via the -`extra_body` keyword argument when using the OpenAI Python SDK. +When using `/v1/chat/completions`, pass these inside `extra_body` in the curl +JSON, or via the `extra_body` keyword argument in the OpenAI Python SDK. +When using the dedicated `/v1/images/edits` endpoint, pass them as top-level +form fields directly. 
| Parameter | Type | Default | Description | | ------------------------ | ----- | ------- | ------------------------------------- | diff --git a/examples/online_serving/text_to_image/README.md b/examples/online_serving/text_to_image/README.md index af7e5857722..af27bc05602 100644 --- a/examples/online_serving/text_to_image/README.md +++ b/examples/online_serving/text_to_image/README.md @@ -214,8 +214,10 @@ Use `extra_body` to pass generation parameters: ## Generation Parameters -These parameters can be passed inside `extra_body` in the curl JSON, or via the -`extra_body` keyword argument when using the OpenAI Python SDK. +When using `/v1/chat/completions`, pass these inside `extra_body` in the curl +JSON, or via the `extra_body` keyword argument in the OpenAI Python SDK. +When using the dedicated `/v1/images/generations` endpoint, pass them as +top-level JSON fields directly. | Parameter | Type | Default | Description | | ------------------------ | ----- | ------- | ------------------------------ | From b07b9dca13b1b2e09cc2bf9f39f7f2e7965d5b5e Mon Sep 17 00:00:00 2001 From: gcanlin Date: Thu, 26 Mar 2026 08:40:32 +0000 Subject: [PATCH 7/7] docs: fix diffusion parameter defaults Signed-off-by: gcanlin --- docs/user_guide/examples/online_serving/glm_image.md | 5 +++-- docs/user_guide/examples/online_serving/image_to_image.md | 6 ++++-- docs/user_guide/examples/online_serving/text_to_image.md | 4 +++- examples/online_serving/glm_image/README.md | 6 ++++-- examples/online_serving/image_to_image/README.md | 8 +++++--- examples/online_serving/text_to_image/README.md | 6 ++++-- 6 files changed, 23 insertions(+), 12 deletions(-) diff --git a/docs/user_guide/examples/online_serving/glm_image.md b/docs/user_guide/examples/online_serving/glm_image.md index 37d7de6a64c..f7027b906db 100644 --- a/docs/user_guide/examples/online_serving/glm_image.md +++ b/docs/user_guide/examples/online_serving/glm_image.md @@ -107,7 +107,8 @@ JSON, or via the `extra_body` keyword argument in the 
 OpenAI Python SDK (see the [Diffusion Chat API guide](../../../../serving/diffusion_chat_api.md)).
 When using the dedicated [`/v1/images/generations`](../../../../serving/image_generation_api.md)
 or [`/v1/images/edits`](../../../../serving/image_edit_api.md) endpoints, pass
-them as top-level fields directly.
+the supported generation controls as top-level fields directly. For image
+dimensions and count, use `size` and `n` rather than `height` or `width`.
 
 | Parameter | Type | Default | Description |
 | --------------------- | ----- | ------- | ----------------------------------- |
@@ -115,7 +116,7 @@ them as top-level fields directly.
 | `width` | int | 1024 | Image width in pixels |
 | `num_inference_steps` | int | 50 | Number of diffusion denoising steps |
 | `guidance_scale` | float | 1.5 | Classifier-free guidance scale |
-| `seed` | int | 42 | Random seed for reproducibility |
+| `seed` | int | None | Optional random seed; `/v1/images/*` generates one server-side if omitted |
 | `negative_prompt` | str | None | Negative prompt |
 
 ## Response Format
diff --git a/docs/user_guide/examples/online_serving/image_to_image.md b/docs/user_guide/examples/online_serving/image_to_image.md
index da6cbf220de..b19e9462da0 100644
--- a/docs/user_guide/examples/online_serving/image_to_image.md
+++ b/docs/user_guide/examples/online_serving/image_to_image.md
@@ -354,7 +354,9 @@ When using `/v1/chat/completions`, pass these inside `extra_body` in the curl
 JSON, or via the `extra_body` keyword argument in the OpenAI Python SDK (see
 the [Diffusion Chat API guide](../../../../serving/diffusion_chat_api.md)).
 When using the dedicated [`/v1/images/edits`](../../../../serving/image_edit_api.md)
-endpoint, pass them as top-level form fields directly.
+endpoint, pass the supported generation controls as top-level form fields
+directly. For image dimensions and count, use `size` and `n` rather than
+`height`, `width`, or `num_outputs_per_prompt`.
 
 | Parameter | Type | Default | Description |
 | ------------------------ | ----- | ------- | ------------------------------------- |
@@ -362,7 +364,7 @@ endpoint, pass them as top-level form fields directly.
 | `width` | int | None | Output image width in pixels |
 | `size` | str | None | Output image size (e.g., "1024x1024") |
 | `num_inference_steps` | int | 50 | Number of denoising steps |
-| `guidance_scale` | float | 7.5 | CFG guidance scale |
+| `guidance_scale` | float | 1.0 | CFG guidance scale |
 | `seed` | int | None | Random seed (reproducible) |
 | `negative_prompt` | str | None | Negative prompt |
 | `num_outputs_per_prompt` | int | 1 | Number of images to generate |
diff --git a/docs/user_guide/examples/online_serving/text_to_image.md b/docs/user_guide/examples/online_serving/text_to_image.md
index 9d29cd5063c..2e79749b3b2 100644
--- a/docs/user_guide/examples/online_serving/text_to_image.md
+++ b/docs/user_guide/examples/online_serving/text_to_image.md
@@ -230,7 +230,9 @@ When using `/v1/chat/completions`, pass these inside `extra_body` in the curl
 JSON, or via the `extra_body` keyword argument in the OpenAI Python SDK (see
 the [Diffusion Chat API guide](../../../../serving/diffusion_chat_api.md)).
 When using the dedicated [`/v1/images/generations`](../../../../serving/image_generation_api.md)
-endpoint, pass them as top-level JSON fields directly.
+endpoint, pass the supported generation controls as top-level JSON fields
+directly. For image dimensions and count, use `size` and `n` rather than
+`height`, `width`, or `num_outputs_per_prompt`.
 
 | Parameter | Type | Default | Description |
 | ------------------------ | ----- | ------- | ------------------------------ |
diff --git a/examples/online_serving/glm_image/README.md b/examples/online_serving/glm_image/README.md
index 54a4708a606..13ed00861da 100644
--- a/examples/online_serving/glm_image/README.md
+++ b/examples/online_serving/glm_image/README.md
@@ -102,7 +102,9 @@ the [Text-to-Image](../text_to_image/README.md) and
 When using `/v1/chat/completions`, pass these inside `extra_body` in the curl
 JSON, or via the `extra_body` keyword argument in the OpenAI Python SDK.
 When using the dedicated `/v1/images/generations` or `/v1/images/edits`
-endpoints, pass them as top-level fields directly.
+endpoints, pass the supported generation controls as top-level fields directly.
+For image dimensions and count, use `size` and `n` rather than `height` or
+`width`.
 
 | Parameter | Type | Default | Description |
 | --------------------- | ----- | ------- | ----------------------------------- |
@@ -110,7 +112,7 @@ endpoints, pass them as top-level fields directly.
 | `width` | int | 1024 | Image width in pixels |
 | `num_inference_steps` | int | 50 | Number of diffusion denoising steps |
 | `guidance_scale` | float | 1.5 | Classifier-free guidance scale |
-| `seed` | int | 42 | Random seed for reproducibility |
+| `seed` | int | None | Optional random seed; `/v1/images/*` generates one server-side if omitted |
 | `negative_prompt` | str | None | Negative prompt |
 
 ## Response Format
diff --git a/examples/online_serving/image_to_image/README.md b/examples/online_serving/image_to_image/README.md
index 1d0a1d3961d..789258473fd 100644
--- a/examples/online_serving/image_to_image/README.md
+++ b/examples/online_serving/image_to_image/README.md
@@ -299,8 +299,10 @@ Provide multiple images in `content` (order matters):
 
 When using `/v1/chat/completions`, pass these inside `extra_body` in the curl
 JSON, or via the `extra_body` keyword argument in the OpenAI Python SDK.
-When using the dedicated `/v1/images/edits` endpoint, pass them as top-level
-form fields directly.
+When using the dedicated `/v1/images/edits` endpoint, pass the supported
+generation controls as top-level form fields directly. For image dimensions and
+count, use `size` and `n` rather than `height`, `width`, or
+`num_outputs_per_prompt`.
 
 | Parameter | Type | Default | Description |
 | ------------------------ | ----- | ------- | ------------------------------------- |
@@ -308,7 +310,7 @@ form fields directly.
 | `width` | int | None | Output image width in pixels |
 | `size` | str | None | Output image size (e.g., "1024x1024") |
 | `num_inference_steps` | int | 50 | Number of denoising steps |
-| `guidance_scale` | float | 7.5 | CFG guidance scale |
+| `guidance_scale` | float | 1.0 | CFG guidance scale |
 | `seed` | int | None | Random seed (reproducible) |
 | `negative_prompt` | str | None | Negative prompt |
 | `num_outputs_per_prompt` | int | 1 | Number of images to generate |
diff --git a/examples/online_serving/text_to_image/README.md b/examples/online_serving/text_to_image/README.md
index af27bc05602..87b6a56438e 100644
--- a/examples/online_serving/text_to_image/README.md
+++ b/examples/online_serving/text_to_image/README.md
@@ -216,8 +216,10 @@ Use `extra_body` to pass generation parameters:
 
 When using `/v1/chat/completions`, pass these inside `extra_body` in the curl
 JSON, or via the `extra_body` keyword argument in the OpenAI Python SDK.
-When using the dedicated `/v1/images/generations` endpoint, pass them as
-top-level JSON fields directly.
+When using the dedicated `/v1/images/generations` endpoint, pass the supported
+generation controls as top-level JSON fields directly. For image dimensions and
+count, use `size` and `n` rather than `height`, `width`, or
+`num_outputs_per_prompt`.
 
 | Parameter | Type | Default | Description |
 | ------------------------ | ----- | ------- | ------------------------------ |