vllm-project · hsliuustc0106 · Mar 9, 2026 · Mar 4, 2026
@@ -125,6 +125,81 @@ Lists available voices for the loaded model.
     "voices": ["aiden", "dylan", "eric", "ono_anna", "ryan", "serena", "sohee", "uncle_fu", "vivian"]
 }
 ```
+```
+POST /v1/audio/voices
+Content-Type: multipart/form-data
+```
+
+Upload a new voice sample for voice cloning in Base task TTS requests.
+
+**Form Parameters:**
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `audio_sample` | file | Yes | Audio file (max 10MB, supported formats: wav, mp3, flac, ogg, aac, webm, mp4) |
+| `consent` | string | Yes | Consent recording ID |
+| `name` | string | Yes | Name for the new voice |
+
+**Response Example:**
+
+```json
+{
+  "success": true,
+  "voice": {
+    "name": "custom_voice_1",
+    "consent": "user_consent_id",
+    "created_at": 1738660000,
+    "mime_type": "audio/wav",
+    "file_size": 1024000
+  }
+}
+```
+
+**Usage Example:**
+
+```bash
+curl -X POST http://localhost:8091/v1/audio/voices \
+  -F "audio_sample=@/path/to/voice_sample.wav" \
+  -F "consent=user_consent_id" \
+  -F "name=custom_voice_1"
+```
+
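The size and format limits in the form-parameter table can be checked on the client before a request is sent. The following is a minimal sketch and not part of this change: `check_voice_sample`, `MAX_BYTES`, and `SUPPORTED_EXTENSIONS` are hypothetical names, "10MB" is assumed to mean 10 MiB, and the format check is assumed to follow the file extension. The server may enforce these limits differently.

```python
from pathlib import Path

# Limits copied from the form-parameter table (assumption: "10MB" means 10 MiB).
MAX_BYTES = 10 * 1024 * 1024
SUPPORTED_EXTENSIONS = {"wav", "mp3", "flac", "ogg", "aac", "webm", "mp4"}


def check_voice_sample(filename: str, size_bytes: int) -> list:
    """Return a list of problems; an empty list means the sample looks uploadable."""
    problems = []
    # Assumption: the format is judged by file extension, case-insensitively.
    extension = Path(filename).suffix.lower().lstrip(".")
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'none'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems
```

Calling `check_voice_sample("voice_sample.wav", 1024000)` returns an empty list, so the `curl` upload above would be worth attempting.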
+
+```
+DELETE /v1/audio/voices/{name}
+```
+
+Delete an uploaded voice sample.
+
+**Path Parameters:**
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `name` | string | Yes | Name of the voice to delete |
+
+**Response Example:**
+
+```json
+{
+  "success": true,
+  "message": "Voice 'custom_voice_1' deleted successfully"
+}
+```
+
+**Error Response (404 Not Found):**
+
+```json
+{
+  "success": false,
+  "error": "Voice 'unknown_voice' not found"
+}
+```
+
+**Usage Example:**
+
+```bash
+curl -X DELETE http://localhost:8091/v1/audio/voices/custom_voice_1
+```
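
A client can tell the two documented DELETE responses apart by combining the HTTP status with the `success` flag. The sketch below is illustrative only: `summarize_delete` is a hypothetical helper, and it assumes that a successful deletion returns HTTP 200, which the text above does not state.

```python
import json


def summarize_delete(status_code: int, body: str) -> str:
    """Turn a DELETE /v1/audio/voices/{name} response into a one-line summary."""
    payload = json.loads(body)
    # Assumption: success is signalled by HTTP 200 together with "success": true.
    if status_code == 200 and payload.get("success"):
        return payload.get("message", "deleted")
    # The documented 404 body carries the reason under "error".
    reason = payload.get("error", "unknown error")
    return f"delete failed ({status_code}): {reason}"
```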
 
 ## Examples
 
@@ -185,6 +260,25 @@ curl -X POST http://localhost:8091/v1/audio/speech \
     }' --output cloned.wav
 ```
 
+Upload a voice sample:
+```bash
+curl -X POST http://localhost:8091/v1/audio/voices \
+  -F "audio_sample=@/path/to/voice_sample.wav" \
+  -F "consent=user_consent_id" \
+  -F "name=custom_voice_1"
+```
+
+Use the uploaded voice:
+```bash
+curl -X POST http://localhost:8091/v1/audio/speech \
+    -H "Content-Type: application/json" \
+    -d '{
+        "input": "Hello, this is a cloned voice",
+        "task_type": "Base",
+        "voice": "custom_voice_1"
+    }' --output cloned.wav
+```
+
 ## Supported Models
 
 | Model | Task Type | Description |

@@ -184,29 +184,68 @@ sudo apt install ffmpeg
 
 ## API Reference
 
-### Endpoint
-
-```
-POST /v1/audio/speech
-Content-Type: application/json
-```
+### Voices Endpoint
 
-This endpoint follows the [OpenAI Audio Speech API](https://platform.openai.com/docs/api-reference/audio/createSpeech) format with additional Qwen3-TTS parameters.
+#### GET /v1/audio/voices
 
-### Voices Endpoint
+List all available voices/speakers from the loaded model, including both built-in model voices and uploaded custom voices.
 
+**Response Example:**
+```json
+{
+  "voices": ["vivian", "ryan", "custom_voice_1"],
+  "uploaded_voices": [
+    {
+      "name": "custom_voice_1",
+      "consent": "user_consent_id",
+      "created_at": 1738660000,
+      "file_size": 1024000,
+      "mime_type": "audio/wav"
+    }
+  ]
+}
 ```
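
Because `voices` mixes built-in and uploaded names, a client that needs to separate them can cross-reference `uploaded_voices`. This is a rough sketch with a hypothetical helper name, `split_voices`. It assumes every uploaded voice appears in both fields, as in the example response; the text does not guarantee that.

```python
import json


def split_voices(body: str) -> tuple:
    """Split a GET /v1/audio/voices response into (built_in, custom) name lists."""
    payload = json.loads(body)
    # Names of uploaded voices, taken from the detailed metadata entries.
    uploaded = {entry["name"] for entry in payload.get("uploaded_voices", [])}
    built_in = [name for name in payload["voices"] if name not in uploaded]
    custom = [name for name in payload["voices"] if name in uploaded]
    return built_in, custom
```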
-GET /v1/audio/voices
-```
 
-Lists available voices for the loaded model:
+#### POST /v1/audio/voices
+
+Upload a new voice sample for voice cloning in Base task TTS requests.
-Upload a new voice sample for voice cloning in Base task TTS requests.
+Upload a new voice sample that can be used for voice cloning in subsequent TTS requests with any supported task type.
 
+**Form Parameters:**
+- `audio_sample` (required): Audio file (max 10MB, supported formats: wav, mp3, flac, ogg, aac, webm, mp4)
+- `consent` (required): Consent recording ID
+- `name` (required): Name for the new voice
+
+**Response Example:**
 ```json
 {
-    "voices": ["aiden", "dylan", "eric", "one_anna", "ryan", "serena", "sohee", "uncle_fu", "vivian"]
+  "success": true,
+  "voice": {
+    "name": "custom_voice_1",
+    "consent": "user_consent_id",
+    "created_at": 1738660000,
+    "mime_type": "audio/wav",
+    "file_size": 1024000
+  }
 }
 ```
 
+**Usage Example:**
+```bash
+curl -X POST http://localhost:8000/v1/audio/voices \
+  -F "audio_sample=@/path/to/voice_sample.wav" \
+  -F "consent=user_consent_id" \
+  -F "name=custom_voice_1"
+```
+
+### Endpoint
+
+```
+POST /v1/audio/speech
+Content-Type: application/json
+```
+
+This endpoint follows the [OpenAI Audio Speech API](https://platform.openai.com/docs/api-reference/audio/createSpeech) format with additional Qwen3-TTS parameters.
+
 ### Request Body
 
 ```json