vllm-project · david6666666 · Dec 31, 2025
@@ -0,0 +1,143 @@
+# Text-To-Video
+
+This example shows how to serve Wan2.2 video models for online text-to-video and
+image-to-video generation with vLLM-Omni. Requests go to the `/v1/chat/completions` endpoint.
+
+## Start Server
+
+### Text-to-Video (T2V)
+
+```bash
+vllm serve Wan-AI/Wan2.2-T2V-A14B-Diffusers --omni --port 8093 \
+  --boundary-ratio 0.875 \
+  --flow-shift 5.0
+```
+
+### Image-to-Video (I2V)
+
+```bash
+vllm serve Wan-AI/Wan2.2-I2V-A14B-Diffusers --omni --port 8094 \
+  --boundary-ratio 0.875 \
+  --flow-shift 5.0
+```
+
+Or use the startup script, which serves the T2V model on port 8093 unless `MODEL` and `PORT` are set:
+
+```bash
+bash run_server.sh
+```
+
+## API Calls
+
+### Method 1: Using curl (Text-to-Video)
+
+```bash
+bash run_curl_text_to_video.sh
+```
+
+### Method 2: Using curl (Image-to-Video)
+
+```bash
+bash run_curl_image_to_video.sh input.png "A cinematic slow zoom into the scene"
+```
+
+## Request Format
+
+### Text-to-Video
+
+```json
+{
+  "messages": [
+    {"role": "user", "content": "A serene lakeside sunrise with mist over the water."}
+  ],
+  "extra_body": {
+    "height": 720,
+    "width": 1280,
+    "num_frames": 81,
+    "num_inference_steps": 40,
+    "guidance_scale": 4.0,
+    "guidance_scale_2": 4.0,
+    "seed": 42,
+    "fps": 24
+  }
+}
+```
+
+### Image-to-Video
+
+```json
+{
+  "messages": [
+    {
+      "role": "user",
+      "content": [
+        {"type": "text", "text": "Make the scene come alive with gentle motion"},
+        {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}}
+      ]
+    }
+  ],
+  "extra_body": {
+    "height": 720,
+    "width": 1280,
+    "num_frames": 81,
+    "num_inference_steps": 40,
+    "guidance_scale": 4.0,
+    "seed": 42,
+    "fps": 24
+  }
+}
+```
+
+## Generation Parameters (extra_body)
+
+| Parameter                | Type  | Default | Description                                    |
+| ------------------------ | ----- | ------- | ---------------------------------------------- |
+| `height`                 | int   | None    | Video height in pixels                         |
+| `width`                  | int   | None    | Video width in pixels                          |
+| `num_frames`             | int   | None    | Number of frames to generate                   |
+| `num_inference_steps`    | int   | 50      | Number of denoising steps                      |
+| `guidance_scale`         | float | None    | CFG scale                                      |
+| `guidance_scale_2`       | float | None    | Optional high-noise CFG (Wan2.2)               |
+| `seed`                   | int   | None    | Random seed (reproducible)                     |
+| `negative_prompt`        | str   | None    | Negative prompt                                |
+| `num_outputs_per_prompt` | int   | 1       | Number of videos to generate                   |
+| `fps`                    | int   | 24      | Output video FPS (used for MP4 encoding only)  |
+
+## Response Format
+
+```json
+{
+  "id": "chatcmpl-xxx",
+  "created": 1234567890,
+  "model": "Wan-AI/Wan2.2-T2V-A14B-Diffusers",
+  "choices": [{
+    "index": 0,
+    "message": {
+      "role": "assistant",
+      "content": [{
+        "type": "video_url",
+        "video_url": {
+          "url": "data:video/mp4;base64,..."
+        }
+      }]
+    },
+    "finish_reason": "stop"
+  }],
+  "usage": {...}
+}
+```
+
+## Extract Video
+
+```bash
+jq -r '.choices[0].message.content[0].video_url.url' response.json \
+  | sed 's/^data:video[^,]*,[[:space:]]*//' | base64 -d > output.mp4
+```
+
+## File Description
+
+| File                         | Description                    |
+| ---------------------------- | ------------------------------ |
+| `run_server.sh`              | Server startup script          |
+| `run_curl_text_to_video.sh`  | Text-to-video curl example     |
+| `run_curl_image_to_video.sh` | Image-to-video curl example    |
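The extraction pipeline under "Extract Video" writes whatever bytes `base64 -d` produces, so a malformed response can still leave a file behind. A minimal sketch of a sanity check follows, assuming the server returns a standard MP4 container, which normally declares its first box type `ftyp` at byte offsets 4 to 7. `looks_like_mp4` is a hypothetical helper name, not part of vLLM-Omni.

```bash
#!/bin/bash
# Hypothetical helper: succeed if the file has "ftyp" at byte offsets 4-7,
# which is where an MP4 (ISO base media) file normally names its first box.
looks_like_mp4() {
  local file=$1
  # Reject missing or empty files up front.
  [ -s "$file" ] || return 1
  # The first four bytes (box size) may contain NULs, so capture only bytes 5-8.
  local tag
  tag=$(head -c 8 "$file" | tail -c +5)
  [ "$tag" = "ftyp" ]
}

# Usage, once the extraction command has written output.mp4:
#   looks_like_mp4 output.mp4 && echo "looks like an MP4" || echo "not an MP4"
```

This only inspects the container signature; it does not prove the video stream itself is playable.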
@@ -0,0 +1,60 @@
+#!/bin/bash
+# Wan2.2 image-to-video curl example
+
+set -o pipefail
+
+SERVER="${SERVER:-http://localhost:8094}"
+INPUT_IMAGE="${1:-input.png}"
+PROMPT="${2:-Make the scene come alive with gentle motion.}"
+CURRENT_TIME=$(date +%Y%m%d%H%M%S)
+OUTPUT="${OUTPUT:-wan22_i2v_${CURRENT_TIME}.mp4}"
+
+if [ ! -f "$INPUT_IMAGE" ]; then
+    echo "Input image not found: $INPUT_IMAGE"
+    exit 1
+fi
+
+echo "Generating video..."
+echo "Prompt: $PROMPT"
+echo "Input: $INPUT_IMAGE"
+echo "Output: $OUTPUT"
+
+# Build the request with jq so the prompt is JSON-escaped, and stream it to curl
+# on stdin so a large base64 image never has to fit on the command line.
+# The data URL is always labeled image/png, whatever the input file's real type.
+base64 < "$INPUT_IMAGE" | tr -d '\n' \
+  | jq -Rs --arg prompt "$PROMPT" '{
+    messages: [{
+      role: "user",
+      content: [
+        {type: "text", text: $prompt},
+        {type: "image_url", image_url: {url: ("data:image/png;base64," + .)}}
+      ]
+    }],
+    extra_body: {
+      height: 720,
+      width: 1280,
+      num_frames: 81,
+      num_inference_steps: 40,
+      guidance_scale: 4.0,
+      seed: 42,
+      fps: 24
+    }
+  }' \
+  | curl -sS "$SERVER/v1/chat/completions" \
+      -H "Content-Type: application/json" \
+      --data-binary @- \
+  | jq -er '.choices[0].message.content[0].video_url.url' \
+  | sed 's/^data:video[^,]*,[[:space:]]*//' | base64 -d > "$OUTPUT"
+STATUS=$?
+
+# The redirect creates $OUTPUT even on failure, so test the pipeline status and
+# a non-empty file instead of mere existence.
+if [ "$STATUS" -eq 0 ] && [ -s "$OUTPUT" ]; then
+    echo "Video saved to: $OUTPUT"
+    echo "Size: $(du -h "$OUTPUT" | cut -f1)"
+else
+    rm -f "$OUTPUT"
+    echo "Failed to generate video"
+    exit 1
+fi
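The script labels every input as `image/png` in the data URL, even when the file is a JPEG or WebP. A small sketch of choosing the MIME type from the file extension follows. It assumes the server also accepts JPEG and WebP data URLs, which this PR does not confirm (only PNG is shown). `image_mime_type` is a hypothetical helper.

```bash
#!/bin/bash
# Hypothetical helper: map an image file extension to a MIME type for the data URL.
# Whether the server accepts anything other than PNG is an assumption.
image_mime_type() {
  local ext=${1##*.}
  # Lowercase the extension so "PHOTO.JPG" and "photo.jpg" are treated alike.
  ext=$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    png)      echo "image/png" ;;
    jpg|jpeg) echo "image/jpeg" ;;
    webp)     echo "image/webp" ;;
    *)        return 1 ;;
  esac
}

# Usage inside the script:
#   MIME=$(image_mime_type "$INPUT_IMAGE") || { echo "Unsupported image type"; exit 1; }
#   and then build the URL as "data:${MIME};base64,<encoded image>"
```

Matching on the extension is a convenience only; a file whose name does not reflect its real content would still be mislabeled.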
@@ -0,0 +1,47 @@
+#!/bin/bash
+# Wan2.2 text-to-video curl example
+
+set -o pipefail
+
+SERVER="${SERVER:-http://localhost:8093}"
+PROMPT="${PROMPT:-A serene lakeside sunrise with mist over the water.}"
+CURRENT_TIME=$(date +%Y%m%d%H%M%S)
+OUTPUT="${OUTPUT:-wan22_t2v_${CURRENT_TIME}.mp4}"
+
+echo "Generating video..."
+echo "Prompt: $PROMPT"
+echo "Output: $OUTPUT"
+
+# Build the request with jq so quotes or backslashes in the prompt are JSON-escaped.
+jq -n --arg prompt "$PROMPT" '{
+    messages: [
+      {role: "user", content: $prompt}
+    ],
+    extra_body: {
+      height: 720,
+      width: 1280,
+      num_frames: 81,
+      num_inference_steps: 40,
+      guidance_scale: 4.0,
+      guidance_scale_2: 4.0,
+      seed: 42,
+      fps: 24
+    }
+  }' \
+  | curl -sS "$SERVER/v1/chat/completions" \
+      -H "Content-Type: application/json" \
+      --data-binary @- \
+  | jq -er '.choices[0].message.content[0].video_url.url' \
+  | sed 's/^data:video[^,]*,[[:space:]]*//' | base64 -d > "$OUTPUT"
+STATUS=$?
+
+# The redirect creates $OUTPUT even on failure, so test the pipeline status and
+# a non-empty file instead of mere existence.
+if [ "$STATUS" -eq 0 ] && [ -s "$OUTPUT" ]; then
+    echo "Video saved to: $OUTPUT"
+    echo "Size: $(du -h "$OUTPUT" | cut -f1)"
+else
+    rm -f "$OUTPUT"
+    echo "Failed to generate video"
+    exit 1
+fi
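With the values in this script, the clip length follows from the frame count and the encoding frame rate: 81 frames at 24 fps is 81 / 24 = 3.375 seconds. The sketch below computes that for other settings. It assumes `fps` only affects MP4 encoding, as the README's parameter table states. `clip_seconds` is a hypothetical helper.

```bash
#!/bin/bash
# Hypothetical helper: playback duration in seconds for a given frame count and fps.
clip_seconds() {
  local num_frames=$1 fps=$2
  # LC_ALL=C keeps the decimal separator a dot regardless of the user's locale.
  LC_ALL=C awk -v n="$num_frames" -v f="$fps" 'BEGIN {
    if (f <= 0) exit 1          # refuse a zero or negative frame rate
    printf "%.3f\n", n / f
  }'
}

clip_seconds 81 24   # prints 3.375
```

Raising `num_frames` lengthens the clip at a fixed `fps`, while changing `fps` alone only changes how fast the same frames are played back.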
@@ -0,0 +1,18 @@
+#!/bin/bash
+# Wan2.2 video generation online serving startup script
+
+MODEL="${MODEL:-Wan-AI/Wan2.2-T2V-A14B-Diffusers}"
+PORT="${PORT:-8093}"
+BOUNDARY_RATIO="${BOUNDARY_RATIO:-0.875}"
+FLOW_SHIFT="${FLOW_SHIFT:-5.0}"
+
+echo "Starting Wan2.2 server..."
+echo "Model: $MODEL"
+echo "Port: $PORT"
+echo "Boundary ratio: $BOUNDARY_RATIO"
+echo "Flow shift: $FLOW_SHIFT"
+
+vllm serve "$MODEL" --omni \
+    --port "$PORT" \
+    --boundary-ratio "$BOUNDARY_RATIO" \
+    --flow-shift "$FLOW_SHIFT"
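Model loading may take some time, so a client script may want to wait until the server answers before sending a generation request. A minimal sketch follows, assuming the `--omni` server exposes vLLM's usual `/health` route (not confirmed by this PR). `wait_for_server` is a hypothetical helper, and any cheap GET endpoint could be substituted for the URL.

```bash
#!/bin/bash
# Hypothetical helper: poll a URL until it returns a successful HTTP status.
# Arguments: URL, maximum attempts (default 60), seconds between attempts (default 5).
wait_for_server() {
  local url=$1 attempts=${2:-60} interval=${3:-5}
  local i
  for ((i = 1; i <= attempts; i++)); do
    # -f makes curl fail on HTTP errors; --max-time bounds each probe.
    if curl -sf -o /dev/null --max-time 5 "$url"; then
      return 0
    fi
    sleep "$interval"
  done
  return 1
}

# Usage (the /health path is an assumption about the server):
#   wait_for_server "http://localhost:8093/health" || { echo "Server not ready"; exit 1; }
#   bash run_curl_text_to_video.sh
```

With the defaults, the helper gives up after roughly five minutes plus probe time; both limits can be raised for slower startups.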