QVAC-18735 feat[api]: add POST /v1/audio/translations to qvac serve OpenAI adapter #2031

Merged
lauripiisang merged 4 commits into main from feat/QVAC-18735-openai-audio-translations
May 14, 2026

@lauripiisang lauripiisang commented May 13, 2026

Contributor

🎯 What problem does this PR solve?

  • qvac serve openai had no OpenAI-compatible translations endpoint, so consumers that hit POST /v1/audio/translations (Whisper translate-to-English task) had to fall back to a separate transcribe + text-translate pipeline.
  • serve.models had no way to declare a Whisper alias whose endpoint category was audio-translation, since the only Whisper type was whispercpp-transcription.

📝 How does it solve it?

  • Adds POST /v1/audio/translations to the OpenAI HTTP adapter. Route gates on endpointCategory === 'audio-translation', rejects language (OpenAI translations are English-only), and supports json (default) / text response formats.
  • Introduces virtual serve.models type whispercpp-audio-translation. The CLI resolves it to the real engine whispercpp-transcription and forces translate: true at parse time (warns if the operator set translate: false). Nested whisperConfig: { ... } is flattened into the top-level modelConfig so it matches what @qvac/sdk loadModel expects.
  • Extends the constant-shorthand serve.models entry with an optional type override, so the recommended config keeps using the same "model": "<SDK_CONSTANT>" shape as every other entry:
"whisper-translate": {
  "model": "WHISPER_EN_TINY_Q8_0",
  "type": "whispercpp-audio-translation",
  "preload": true
}
  • Docs: new packages/cli/docs/serve-openai.md, README pointer, and package.json files now includes docs/**/*.md so the doc ships with the published package.
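
The virtual-type handling described above can be sketched roughly as follows. This is a minimal illustration, not the code in serve/config.ts: the `ExplicitEntry` and `ResolvedEntry` shapes, the `config` field name, and the function name `resolveVirtualTranslation` are assumptions, as is the choice to let top-level keys win over nested `whisperConfig` keys.

```typescript
// Hypothetical shapes; the real types in serve/config.ts may differ.
type ModelConfig = Record<string, unknown>;

interface ExplicitEntry {
  type: string;
  src: string;
  config?: ModelConfig; // assumed field name for the operator-supplied config
}

interface ResolvedEntry {
  engine: string; // real SDK engine type
  endpointCategory: string; // what the HTTP routes gate on
  src: string;
  modelConfig: ModelConfig;
}

const VIRTUAL_TRANSLATION_TYPE = "whispercpp-audio-translation";

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function resolveVirtualTranslation(
  entry: ExplicitEntry,
  warn: (msg: string) => void = console.warn,
): ResolvedEntry {
  if (entry.type !== VIRTUAL_TRANSLATION_TYPE) {
    throw new Error(`not a virtual translation entry: ${entry.type}`);
  }
  const { whisperConfig, ...topLevel } = entry.config ?? {};
  // Flatten nested whisperConfig into the top level.
  // Assumption: explicit top-level keys take precedence over nested ones.
  const flat: ModelConfig = {
    ...(isPlainObject(whisperConfig) ? whisperConfig : {}),
    ...topLevel,
  };
  if (flat["translate"] === false) {
    warn(`${VIRTUAL_TRANSLATION_TYPE}: translate=false is ignored, forcing translate=true`);
  }
  flat["translate"] = true; // translation aliases always translate
  return {
    engine: "whispercpp-transcription",
    endpointCategory: "audio-translation",
    src: entry.src,
    modelConfig: flat,
  };
}
```

Passing the warning sink as a parameter keeps the sketch testable without capturing console output.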

🧪 How was it tested?

  • Unit: npm test — adds config.test.ts (constant + type override path, virtual-type flattening, translate-true enforcement) and translations.test.ts (validation branches: missing fields, language rejection, non-translation alias gate, unsupported formats, json/text responses).
  • BATS smoke: npm run test:bats (cli.bats): smoke cases mirroring the transcriptions set.
  • BATS e2e: npm run test:e2e (e2e.bats): registers test-whisper-translate via model + type override, hits /v1/audio/translations for both json and text, asserts rejection of a transcription-only alias and of a chat alias, and verifies DELETE unloads both whisper aliases.
  • Verified locally end-to-end against WHISPER_EN_TINY_Q8_0.

🔌 API Changes

New endpoint: POST /v1/audio/translations

curl -s http://127.0.0.1:11434/v1/audio/translations \
  -F model=whisper-translate \
  -F file=@./sample.wav \
  -F response_format=json
# => { "text": "..." }   (always English)
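
For readers who want to see how the two response formats might differ on the wire, here is a small sketch. The function name `renderTranslation`, the 400 status, the error body shape, and the content types are assumptions for illustration; only the json default, the `{ "text": ... }` body, and the json/text allowlist come from the PR.

```typescript
interface RenderedResponse {
  status: number;
  contentType: string;
  body: string;
}

// Renders a finished translation for the requested response_format.
// The comparison is case-sensitive, so "JSON" is treated as unsupported.
function renderTranslation(text: string, responseFormat?: string): RenderedResponse {
  const format = responseFormat ?? "json"; // json is the default per the PR
  if (format !== "json" && format !== "text") {
    return {
      status: 400, // assumed status code
      contentType: "application/json",
      body: JSON.stringify({
        error: { message: `unsupported response_format: ${format}` },
      }),
    };
  }
  if (format === "text") {
    // Plain-text variant: the body is the translated text itself.
    return { status: 200, contentType: "text/plain; charset=utf-8", body: text };
  }
  return { status: 200, contentType: "application/json", body: JSON.stringify({ text }) };
}
```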

New serve.models shape — Whisper translation alias via constant + type override:

{
  "serve": {
    "models": {
      "whisper-transcribe": { "model": "WHISPER_EN_TINY_Q8_0", "preload": true },
      "whisper-translate": {
        "model": "WHISPER_EN_TINY_Q8_0",
        "type": "whispercpp-audio-translation",
        "preload": true
      }
    }
  }
}

The explicit { "type": "whispercpp-audio-translation", "src": "<weights>" } form is still accepted for non-registry weights (literal URL / path / registry://…); src is passed to the SDK verbatim and is not resolved against SDK model constants.
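
The difference between the two entry shapes can be sketched as below. This is an illustration under assumptions: `resolveEntry`, `FAKE_REGISTRY`, and the placeholder `src` value are hypothetical stand-ins, and the real resolver additionally runs the resulting type through the virtual-type mapping.

```typescript
// The two serve.models shapes described in the PR.
interface ConstantModelEntry {
  model: string; // SDK constant name, resolved via the registry
  type?: string; // optional override, e.g. "whispercpp-audio-translation"
  preload?: boolean;
}

interface ExplicitModelEntry {
  type: string;
  src: string; // literal URL / path / registry:// reference, passed verbatim
  preload?: boolean;
}

type ServeModelEntry = ConstantModelEntry | ExplicitModelEntry;

interface Resolved {
  type: string;
  src: string;
}

// Stand-in for the SDK constant registry; the src value is a placeholder.
const FAKE_REGISTRY = new Map<string, Resolved>([
  ["WHISPER_EN_TINY_Q8_0", { type: "whispercpp-transcription", src: "placeholder-weights-ref" }],
]);

function resolveEntry(entry: ServeModelEntry): Resolved {
  if ("model" in entry) {
    const known = FAKE_REGISTRY.get(entry.model);
    if (known === undefined) {
      throw new Error(`unknown model constant: ${entry.model}`);
    }
    // The constant supplies the weights; the override, if any, replaces the type.
    return { type: entry.type ?? known.type, src: known.src };
  }
  // Explicit form: src is not looked up, so a constant name here stays a literal.
  return { type: entry.type, src: entry.src };
}
```

Under these assumptions, `{ "type": "...", "src": "WHISPER_EN_TINY_Q8_0" }` keeps the constant name as a literal source, which matches the MODEL_NOT_FOUND failure the commit message describes for that shape.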

Ticket

QVAC-18735

@lauripiisang lauripiisang requested review from a team as code owners May 13, 2026 16:08
test[api]: add e2e + flatten whisper translate config

- e2e.bats: cover POST /v1/audio/translations with WHISPER_EN_TINY_Q8_0
  alias, assert it rejects transcription-only and chat aliases, and that
  DELETE unloads both whisper aliases.
- serve/config.ts: flatten whisperConfig into top-level modelConfig keys
  for whispercpp-audio-translation (whisper loadModel expects flat fields,
  not nested whisperConfig); force translate=true and warn otherwise.
- config.test.ts: assert flat translate/language/n_threads and no
  whisperConfig key; cover top-level translate=false override.
- docs/serve-openai.md: clarify src accepts SDK model constants and show
  the flat config shape.

fix[api]: allow type override on constant serve.models entries

The virtual `whispercpp-audio-translation` type previously required the
explicit `{ type, src }` shape, but `src` is passed to the SDK verbatim
so an SDK constant name like `WHISPER_EN_TINY_Q8_0` failed with
MODEL_NOT_FOUND. Allow constant entries to carry an optional `type`
override instead, so `{ "model": "WHISPER_EN_TINY_Q8_0", "type":
"whispercpp-audio-translation" }` resolves the constant via the
registry and then runs through the virtual-type mapping
(`whispercpp-transcription` + audio-translation + translate=true).

- serve/config.ts: ConstantModelEntry gains optional `type`;
  resolveModelConstant routes the override through
  resolveExplicitServeModel. Explicit `{ type, src }` branch is
  unchanged (src is still a literal modelSrc).
- config.test.ts: exports + covers natural-addon resolution, the
  whisper → audio-translation override, and unknown-constant errors.
- e2e.bats: test-whisper-translate now uses the model+type shape.
- docs/serve-openai.md: recommend the model+type shorthand; note that
  explicit src is for non-registry weights only.
@lauripiisang lauripiisang force-pushed the feat/QVAC-18735-openai-audio-translations branch from 1de73d4 to 3141840 May 13, 2026 19:11

@opaninakuffo opaninakuffo left a comment

Contributor

Adds /v1/audio/translations, the whispercpp-audio-translation virtual type (flatten + forced translate), startup/docs/package files, and solid unit + BATS coverage. Mirrors the transcriptions handler cleanly and gates on audio-translation nicely.

github-actions Bot commented May 14, 2026

Contributor

Tier-based Approval Status

**PR Tier:** TIER1

**Current Status:** ✅ APPROVED

**Requirements:**
- 1 Team Member approval ✅ (2/1)
- 1 Team Lead OR Management approval ✅ (1/1)



---
*This comment is automatically updated when reviews change.*

@simon-iribarren simon-iribarren left a comment

Contributor

Approve. Verified locally: build ✓, typecheck ✓, 243/243 tests pass. CI green on all technical gates.

Three things worth surfacing:

  1. routes/translations.ts is a ~95% duplicate of routes/transcriptions.ts. Only ~12 effective lines differ (endpoint category, language policy, log labels, error code), and validation order is already drifting between the two siblings. Worth factoring a shared handleWhisperAudio(req, res, ctx, { requiredCategory, opLabel, errorCode, languagePolicy }) helper before the third Whisper variant lands.
  2. transcribeOverride was added to the public RouteContext interface but only translations.ts consumes it. transcriptions.ts still calls sdkTranscribe directly. Wrong test seam — either flip both routes to read from ctx, or pass sdkTranscribe into a handler factory and drop the RouteContext field.
  3. whisperConfig nested→flat flattening is asymmetric. config.ts::resolveExplicitServeModel flattens only for the virtual whispercpp-audio-translation type. Plain whispercpp-transcription entries with nested whisperConfig pass through unchanged, even though the new serve-openai.md tells transcription users to use flat keys. Silent footgun — either flatten symmetrically or document the asymmetry.
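
To make point 1 concrete, the validation core of such a shared helper might look like the sketch below. Only the option names (`requiredCategory`, `opLabel`, `errorCode`, `languagePolicy`) come from the suggestion above; the `AudioFields` shape, the status codes, the error code strings, and the `audio-transcription` category name are assumptions.

```typescript
interface WhisperRouteOptions {
  requiredCategory: string;
  opLabel: string;
  errorCode: string;
  languagePolicy: "allow" | "reject";
}

// Hypothetical view of the parsed multipart fields.
interface AudioFields {
  model?: string;
  hasFile: boolean;
  language?: string;
}

type Validation =
  | { ok: true }
  | { ok: false; status: number; code: string; message: string };

// One validation order shared by both routes, so the siblings cannot drift.
function validateWhisperAudio(
  fields: AudioFields,
  aliasCategory: string | undefined,
  opts: WhisperRouteOptions,
): Validation {
  const fail = (status: number, message: string): Validation => ({
    ok: false,
    status,
    code: opts.errorCode,
    message,
  });
  if (!fields.model) return fail(400, "model is required");
  if (!fields.hasFile) return fail(400, "file is required");
  if (aliasCategory !== opts.requiredCategory) {
    return fail(400, `model "${fields.model}" does not support ${opts.opLabel}`);
  }
  if (opts.languagePolicy === "reject" && fields.language !== undefined) {
    return fail(400, `language is not accepted for ${opts.opLabel}`);
  }
  return { ok: true };
}

// Per-route differences collapse into data (values are illustrative).
const TRANSCRIPTIONS: WhisperRouteOptions = {
  requiredCategory: "audio-transcription",
  opLabel: "transcription",
  errorCode: "transcription_error",
  languagePolicy: "allow",
};

const TRANSLATIONS: WhisperRouteOptions = {
  requiredCategory: "audio-translation",
  opLabel: "translation",
  errorCode: "translation_error",
  languagePolicy: "reject",
};
```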

Smaller nits:

  • response_format allowlist is case-sensitive; OpenAI clients send lowercase, but the divergence will catch someone.
  • console.warn in the config parser — prefer the structured logger.
  • No client-disconnect propagation (same wart as transcriptions).
  • No changelog/0.3.0/api.md update in this PR — easy to add.

@lauripiisang

Contributor Author

/review

@lauripiisang

Contributor Author

/review

@lauripiisang lauripiisang merged commit 92c6076 into main May 14, 2026
21 checks passed
@lauripiisang lauripiisang deleted the feat/QVAC-18735-openai-audio-translations branch May 14, 2026 10:46
Proletter pushed a commit that referenced this pull request May 24, 2026
…penAI adapter (#2031)

* feat[api]: add POST /v1/audio/translations to qvac serve OpenAI adapter

* test[api]: add e2e + flatten whisper translate config

* fix[api]: allow type override on constant serve.models entries