11 changes: 11 additions & 0 deletions packages/cli/README.md
@@ -12,6 +12,7 @@ This package is published to npm as **`@qvac/cli`** and lives in the QVAC monore
- [`bundle sdk`](#bundle-sdk)
- [`verify deps`](#verify-deps)
- [`verify bundle`](#verify-bundle)
- [`serve openai`](#serve-openai)
- [Configuration](#configuration)
- [System Requirements](#system-requirements)
- [Development](#development)
@@ -297,6 +298,16 @@ on-device runtime version from a mobile dependency tree. **Pass
strict ABI verification; otherwise mobile bundles will emit
`unknown-runtime-version` and skip the ABI check pass.

### `serve openai`

Run an **OpenAI-compatible HTTP server** backed by locally configured QVAC models (`serve.models` in `qvac.config.*`).

```bash
qvac serve openai [options]
```

See **[docs/serve-openai.md](./docs/serve-openai.md)** for supported `/v1/...` routes, multipart request shapes, and how to register models, including **`whispercpp-audio-translation`** for `POST /v1/audio/translations` (Whisper translate-to-English).

## Configuration

The CLI reads configuration from `qvac.config.{json,js,mjs,ts}` in your project root.
110 changes: 110 additions & 0 deletions packages/cli/docs/serve-openai.md
@@ -0,0 +1,110 @@
# `qvac serve openai`

The CLI exposes an **OpenAI-compatible HTTP API** (`qvac serve openai`) so tools and SDKs that target OpenAI can run against local QVAC models.

This document describes the supported routes and how to configure `serve.models` for each capability. For general CLI usage, see [README.md](../README.md).

## Implemented endpoints (today)

| Method | Path | Notes |
|--------|------|--------|
| `GET` | `/v1/models` | Lists **loaded** models |
| `GET` | `/v1/models/{id}` | Model metadata |
| `DELETE` | `/v1/models/{id}` | Unload |
| `POST` | `/v1/chat/completions` | Chat |
| `POST` | `/v1/embeddings` | Embeddings |
| `POST` | `/v1/audio/transcriptions` | Speech-to-text (source language) |
| `POST` | `/v1/audio/translations` | Speech-to-text **into English** (Whisper translate task) |

Other OpenAI routes may be added over time; this file is updated when they ship.

## `POST /v1/audio/translations`

OpenAI's **translations** endpoint always returns **English text**. It maps to Whisper's **translate** task (not "transcribe then run a text translator").

### Request

- **Content-Type:** `multipart/form-data`
- **Fields:**
  - `file` (required): audio file (same as transcriptions)
  - `model` (required): must name a `serve.models` alias whose **endpoint category** is `audio-translation` (see below)
  - `prompt` (optional): passed through to the SDK transcribe path (Whisper initial prompt where supported)
  - `response_format` (optional): `json` (default) or `text`. `srt`, `vtt`, and `verbose_json` are not implemented yet.
- **Not supported:** `language`. Per-request language selection is not part of OpenAI's translations API; output is always English. Use `/v1/audio/transcriptions` if you need non-English text.
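
As a rough illustration of the field list above, the following TypeScript sketch assembles the multipart body with the standard `FormData` and `Blob` globals (available in Node 18+ and in browsers). `TranslationRequest` and `buildTranslationForm` are hypothetical names made up for this example; they are not part of `@qvac/cli` or `@qvac/sdk`.

```typescript
// Hypothetical request shape for this sketch; field names mirror the list above.
interface TranslationRequest {
  audio: Blob
  fileName: string
  model: string
  prompt?: string
  responseFormat?: 'json' | 'text'
}

// Build the multipart form for POST /v1/audio/translations.
function buildTranslationForm (req: TranslationRequest): FormData {
  const form = new FormData()
  // The audio must be sent under the field name "file".
  form.append('file', req.audio, req.fileName)
  // "model" must name a serve.models alias registered for audio translation.
  form.append('model', req.model)
  if (req.prompt !== undefined) {
    form.append('prompt', req.prompt)
  }
  // Only "json" and "text" are implemented; default to "json" like the server does.
  form.append('response_format', req.responseFormat ?? 'json')
  // "language" is deliberately never set: the endpoint rejects it with a 400.
  return form
}
```

The resulting form can be sent with `fetch(url, { method: 'POST', body: form })`. When the body is a `FormData` instance, `fetch` generates the multipart `Content-Type` header and boundary itself, so the header should not be set by hand.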

### Registering a translation model (`whispercpp-audio-translation`)

Use the virtual SDK type **`whispercpp-audio-translation`** in `serve.models`. The CLI resolves it to the real engine **`whispercpp-transcription`** and **forces** `translate: true` on the **loadModel** `modelConfig` (Whisper translate-to-English). Nested `whisperConfig: { … }` in JSON is flattened into the top-level `modelConfig` for this alias so it matches what `@qvac/sdk` expects.

You may omit `translate`. If you set `translate: false` (top-level or under `whisperConfig`), it is **overridden to `true`** with a console warning.
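
The flattening and override rules described above can be sketched as follows. This is not the CLI's actual implementation: `normalizeTranslationConfig` is a hypothetical helper, and the choice that top-level keys win over keys nested under `whisperConfig` when both are present is an assumption the text does not confirm.

```typescript
type ModelConfig = Record<string, unknown>

// Hypothetical helper: flatten `whisperConfig` and force `translate: true`.
function normalizeTranslationConfig (
  config: ModelConfig,
  warn: (message: string) => void = console.warn
): ModelConfig {
  const { whisperConfig, ...topLevel } = config
  const nested: ModelConfig =
    typeof whisperConfig === 'object' && whisperConfig !== null && !Array.isArray(whisperConfig)
      ? (whisperConfig as ModelConfig)
      : {}

  // Assumption: on key conflicts, top-level values take precedence over nested ones.
  const merged: ModelConfig = { ...nested, ...topLevel }

  // An explicit `translate: false` at either level is overridden, with a warning.
  if (topLevel['translate'] === false || nested['translate'] === false) {
    warn('translate: false is overridden to true for whispercpp-audio-translation')
  }
  merged['translate'] = true
  return merged
}
```

Taking the warning sink as a parameter keeps the helper easy to test; a caller that wants the documented behaviour can leave it at the `console.warn` default.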

The recommended shape is the same `"model": "<SDK_CONSTANT>"` shorthand used elsewhere in `serve.models`, with `type` set to the virtual translation type. The constant resolves to its registry `src`; `type` switches the alias from the constant's natural addon (`whispercpp-transcription`) to `whispercpp-audio-translation`.

**Minimal JSON (same weights as a transcription alias, with a second alias for translate):**

```json
{
  "serve": {
    "models": {
      "whisper-transcribe": { "model": "WHISPER_EN_TINY_Q8_0", "preload": true },
      "whisper-translate": {
        "model": "WHISPER_EN_TINY_Q8_0",
        "type": "whispercpp-audio-translation",
        "preload": true
      }
    }
  }
}
```

**Optional full `config`** uses the same **flat** Whisper keys as other `serve.models` Whisper entries (see [changelog example](./changelog/0.2.2/api.md): `language`, `n_threads`, `strategy`, … alongside `contextParams` / `miscConfig` if needed). You may also nest tuning under `whisperConfig`; for **`whispercpp-audio-translation` only**, those keys are merged to the top level before load.

**Example with extra Whisper tuning (flat keys, same style as transcriptions):**

```yaml
serve:
  models:
    whisper-1:
      model: WHISPER_EN_TINY_Q8_0
      type: whispercpp-audio-translation
      preload: true
      config:
        language: auto
        n_threads: 4
        strategy: greedy
        contextParams:
          use_gpu: true
        miscConfig:
          caption_enabled: false
```

If you need to point at non-registry weights (a local path, `https://…`, `registry://…`, etc.), drop the `model` shorthand and use the explicit `{ "type": "whispercpp-audio-translation", "src": "<weights>" }` form. `src` is passed to `@qvac/sdk` as `modelSrc` verbatim, so it cannot be an SDK constant name in that form; use the `model` shorthand above when you want constant resolution.

### Example (`curl`)

```bash
curl -s http://127.0.0.1:11434/v1/audio/translations \
-F model=whisper-translate \
-F file=@./sample.wav \
-F response_format=json
```

Response (`json`): `{ "text": "..." }`
Response (`text`): body is plain UTF-8 text.
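
Because the two formats return different body shapes, a client has to decode the body according to the `response_format` it asked for. A minimal TypeScript sketch, assuming the body has already been read as a string (`readTranslationBody` is a hypothetical helper name):

```typescript
type TranslationFormat = 'json' | 'text'

// Decode a translation response body according to the requested response_format.
function readTranslationBody (format: TranslationFormat, body: string): string {
  if (format === 'text') {
    // Plain UTF-8 text: the body is the translation itself.
    return body
  }
  // JSON: expect an object of the form { "text": "..." }.
  const parsed: unknown = JSON.parse(body)
  if (typeof parsed === 'object' && parsed !== null) {
    const text = (parsed as { text?: unknown }).text
    if (typeof text === 'string') {
      return text
    }
  }
  throw new Error('Expected a JSON body of the form { "text": "..." }')
}
```

With `fetch`, the string would typically come from `await response.text()` after checking `response.ok`.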

### Same weights as transcriptions

You normally use the **same** underlying weights for both transcription and translation; register **two aliases** that share the same `"model": "WHISPER_…"` constant: one without `type` (defaults to transcription) and one with `type: "whispercpp-audio-translation"`.

### Errors

| HTTP | `error.code` | When |
|------|----------------|------|
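| 400 | `invalid_content_type` | Not `multipart/form-data` |
| 400 | `invalid_multipart` | Multipart body could not be parsed |
| 400 | `missing_file` / `missing_model` | Required fields missing |
| 400 | `unsupported_param` | e.g. `language` present |
| 400 | `unsupported_response_format` | `srt`, `vtt`, `verbose_json` |
| 400 | `invalid_response_format` | Any other unrecognized `response_format` value |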
| 400 | `invalid_model_type` | Alias is not an `audio-translation` model (use `type: whispercpp-audio-translation` in `serve.models`) |
| 404 | `model_not_found` | Unknown alias |
| 503 | `model_not_ready` | Model not loaded yet |
| 500 | `translation_error` | SDK / engine failure |
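
One way a client might act on these codes is sketched below in TypeScript. The table only confirms that the body carries `error.code`; the optional `message` field, the `Action` categories, and `classifyTranslationError` itself are assumptions made for illustration.

```typescript
// Assumed error envelope: only `error.code` is confirmed by the table above.
interface ApiErrorBody {
  error: { code: string, message?: string }
}

type Action = 'retry' | 'fix-request' | 'fix-config' | 'report'

// Map an HTTP status and error code to a coarse client-side action.
function classifyTranslationError (status: number, body: ApiErrorBody): Action {
  const code = body.error.code
  // The model exists but is still loading: retrying later may succeed.
  if (status === 503 && code === 'model_not_ready') return 'retry'
  // The alias is unknown or not registered as an audio-translation model.
  if (code === 'model_not_found' || code === 'invalid_model_type') return 'fix-config'
  // Every other 400 points at the request itself (fields, content type, format).
  if (status === 400) return 'fix-request'
  // Anything else (e.g. 500 translation_error) is an engine-side failure.
  return 'report'
}
```

Only `model_not_ready` is treated as retryable here, since every other row describes a condition that an identical request would hit again.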
1 change: 1 addition & 0 deletions packages/cli/package.json
@@ -13,6 +13,7 @@
  "files": [
    "dist/**/*",
    "README.md",
    "docs/**/*.md",
    "LICENSE",
    "NOTICE"
  ],
6 changes: 6 additions & 0 deletions packages/cli/src/serve/adapters/openai/index.ts
@@ -50,6 +50,12 @@ export function createOpenAIAdapter (): APIAdapter {
      return true
    }

    if (method === 'POST' && path === '/v1/audio/translations') {
      const { handleTranslations } = await import('./routes/translations.js')
      await handleTranslations(req, res, ctx)
      return true
    }

    if (method === 'POST' && path === '/v1/images/generations') {
      const { handleImagesGenerations } = await import('./routes/images.js')
      await handleImagesGenerations(req, res, ctx)
122 changes: 122 additions & 0 deletions packages/cli/src/serve/adapters/openai/routes/translations.ts
@@ -0,0 +1,122 @@
import type { IncomingMessage, ServerResponse } from 'node:http'
import { sendJson, sendText, sendError } from '../../../http.js'
import { readMultipart } from '../../../multipart.js'
import { resolveModelAlias } from '../../../config.js'
import { sdkTranscribe } from '../../../core/sdk.js'
import type { RouteContext } from '../../types.js'

const SUPPORTED_RESPONSE_FORMATS = new Set(['json', 'text'])
const UNSUPPORTED_RESPONSE_FORMATS = new Set(['srt', 'vtt', 'verbose_json'])

export async function handleTranslations (req: IncomingMessage, res: ServerResponse, ctx: RouteContext): Promise<void> {
  const contentType = req.headers['content-type'] ?? ''
  if (!contentType.includes('multipart/form-data')) {
    sendError(res, 400, 'invalid_content_type', 'Content-Type must be multipart/form-data.')
    return
  }

  let fields: Map<string, string>
  let file: { fieldName: string; fileName: string; contentType: string; data: Buffer } | null

  try {
    const result = await readMultipart(req)
    fields = result.fields
    file = result.file
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err)
    ctx.logger.error(`Multipart parse error: ${message}`)
    sendError(res, 400, 'invalid_multipart', 'Failed to parse multipart request.')
    return
  }

  if (!file || file.fieldName !== 'file') {
    sendError(res, 400, 'missing_file', '"file" field is required.')
    return
  }

  const modelName = fields.get('model')
  if (!modelName) {
    sendError(res, 400, 'missing_model', '"model" field is required.')
    return
  }

  if (fields.has('language')) {
    sendError(
      res,
      400,
      'unsupported_param',
      'The "language" field is not supported on /v1/audio/translations. Output is always English.'
    )
    return
  }

  const responseFormat = fields.get('response_format') ?? 'json'
  if (UNSUPPORTED_RESPONSE_FORMATS.has(responseFormat)) {
    sendError(res, 400, 'unsupported_response_format', `response_format "${responseFormat}" is not supported. Use "json" or "text".`)
    return
  }
  if (!SUPPORTED_RESPONSE_FORMATS.has(responseFormat)) {
    sendError(res, 400, 'invalid_response_format', `Unknown response_format "${responseFormat}". Use "json" or "text".`)
    return
  }

  const prompt = fields.get('prompt')
  const temperature = fields.get('temperature')

  const modelEntry = resolveModelAlias(ctx.serveConfig, modelName) ?? ctx.registry.getEntry(modelName)

  if (!modelEntry) {
    sendError(res, 404, 'model_not_found', `Model "${modelName}" is not available. Check serve.models config.`)
    return
  }

  const endpointCategory = 'endpointCategory' in modelEntry ? modelEntry.endpointCategory : undefined
  if (endpointCategory !== 'audio-translation') {
    sendError(
      res,
      400,
      'invalid_model_type',
      `Model "${modelName}" is not registered for audio translation. Register an alias with type "whispercpp-audio-translation" in serve.models.`
    )
    return
  }

  const alias = 'alias' in modelEntry ? (modelEntry.alias as string) : modelEntry.id
  const registryEntry = ctx.registry.getEntry(alias)
  if (!registryEntry || registryEntry.state !== ctx.registry.STATES.READY) {
    sendError(res, 503, 'model_not_ready', `Model "${modelName}" is not loaded yet.`)
    return
  }

  if (temperature) {
    ctx.logger.warn(`Ignoring unsupported param: temperature=${temperature}`)
  }

  const sdkModelId = registryEntry.sdkModelId ?? registryEntry.id
  const fileSizeKB = Math.round(file.data.length / 1024)

  ctx.logger.info(` translate model=${alias} file=${file.fileName} size=${fileSizeKB}KB format=${responseFormat}${prompt ? ' prompt=yes' : ''}`)

  const transcribe = ctx.transcribeOverride ?? sdkTranscribe

  try {
    const text = await transcribe({
      modelId: sdkModelId,
      audioChunk: file.data,
      fileName: file.fileName,
      prompt
    })

    ctx.logger.info(` translate done chars=${text.length}`)

    if (responseFormat === 'text') {
      sendText(res, 200, text)
    } else {
      sendJson(res, 200, { text })
    }
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err)
    ctx.logger.error(`Translation error for "${alias}": ${message}`)
    sendError(res, 500, 'translation_error', 'An internal error occurred during audio translation.')
  }
}
7 changes: 7 additions & 0 deletions packages/cli/src/serve/adapters/types.ts
@@ -6,6 +6,13 @@ export interface RouteContext {
  registry: ModelRegistry
  serveConfig: ServeConfig
  logger: Logger
  /** @internal Unit tests only: replaces sdkTranscribe when set */
  transcribeOverride?: (opts: {
    modelId: string
    audioChunk: Buffer
    fileName: string
    prompt?: string | undefined
  }) => Promise<string>
}

export type RouteHandler = (req: IncomingMessage, res: ServerResponse, ctx: RouteContext) => Promise<void> | void