QVAC-19255 tts-ggml: enable Supertonic GPU via tts-cpp 2026-06-05 #2473

Merged
freddy311082 merged 2 commits into main from QVAC-19255-tts-ggml-supertonic-gpu on Jun 5, 2026

@ogad-tether

Summary

Companion to tetherto/qvac-registry-vcpkg#184 (now merged at bb702251). That PR bumps the tts-cpp port to master HEAD 128dae42 (PR #31 supertonic_optimizations), which brings the QVAC-18605 Supertonic Vulkan/Metal optimisations (rounds 1-13, ~34× realtime on Apple M-series Metal) and the QVAC-19254 sched + cpu_backend refactor for Adreno OpenCL. This PR removes the now-stale "Supertonic is CPU-only today" rejection gates on the downstream tts-ggml addon and lets caller GPU intent flow through to the merged tts-cpp tier policy.

C++ addon (addon/src/model-interface/supertonic/)

  • SupertonicModel.cpp::validateConfig — removed the if (wantsGpu) { throw StatusError(... "CPU only today" ...) } block. The cross-field conflict check (useGPU=true + nGpuLayers=0, or useGPU=false + nGpuLayers!=0) is preserved so callers can't silently get the opposite backend they asked for.
  • SupertonicModel.cpp::loadLocked — removed the #ifdef __ANDROID__ force-off block. Android GPU routing is now delegated to tts-cpp's init_gpu_backend, which already allowlists Qualcomm Adreno (Adreno 700+ → OpenCL, otherwise tier fallback) and skips Mali / non-Adreno GPUs.
  • SupertonicConfig.hpp — updated the useGpu docstring.
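
The tier order that the addon now defers to (Adreno 700+ → OpenCL, otherwise Vulkan/Metal/CUDA via a registry walk, otherwise CPU) can be modelled roughly as below. This is an illustrative sketch only, not the tts-cpp `init_gpu_backend` code: the function name, the device fields, the way the Adreno generation is parsed from a GPU name, the order of the registry walk, and the handling of pre-700 Adreno parts are all assumptions.

```javascript
// Hypothetical model of the tier policy; NOT the tts-cpp implementation.
function pickBackendTier ({ wantsGpu, isAndroid, gpuName = '', registered = [] }) {
  if (!wantsGpu) return 'CPU'

  if (isAndroid) {
    // Assumption: the generation is readable from a name like "Adreno (TM) 740".
    const match = /Adreno\D*(\d+)/i.exec(gpuName)
    // Mali and other non-Adreno GPUs are skipped entirely on Android.
    if (!match) return 'CPU'
    if (Number(match[1]) >= 700 && registered.includes('OpenCL')) return 'OpenCL'
    // Assumption: older Adreno parts fall through to the generic walk below.
  }

  // Assumption: the registry walk tries the GPU backends in this order.
  for (const tier of ['Vulkan', 'Metal', 'CUDA']) {
    if (registered.includes(tier)) return tier
  }
  return 'CPU'
}
```

The point of the sketch is that the addon no longer makes this decision itself: it passes `useGPU` through and the engine picks the tier, falling back to CPU when nothing suitable is registered.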

JS addon (index.js)

  • Removed the parallel wantsGpu rejection for Supertonic.
  • The useGPU = false default is now applied only when nGpuLayers == null, so a caller passing nGpuLayers: 99 alone doesn't get a silent conflict with the JS-side default.
  • The engine-agnostic conflict check is preserved verbatim.
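
A minimal sketch of the behaviour described in this list, assuming a standalone helper: the name `resolveGpuIntent` and the error text are hypothetical and not copied from index.js. It applies the CPU default only when the caller set neither option, then rejects the two contradictory pairs.

```javascript
// Hypothetical helper; illustrates the described rules, not the actual index.js code.
function resolveGpuIntent ({ useGPU, nGpuLayers } = {}) {
  // Default to CPU only when the caller expressed no GPU intent at all.
  if (useGPU == null && nGpuLayers == null) useGPU = false

  // Engine-agnostic cross-field check: the two options must not contradict each other.
  const wantsGpuButZeroLayers = useGPU === true && nGpuLayers === 0
  const wantsCpuButSomeLayers = useGPU === false && nGpuLayers != null && nGpuLayers !== 0
  if (wantsGpuButZeroLayers || wantsCpuButSomeLayers) {
    throw new Error(`Conflicting GPU config: useGPU=${useGPU}, nGpuLayers=${nGpuLayers}`)
  }
  return { useGPU, nGpuLayers }
}
```

With these rules, `resolveGpuIntent({ nGpuLayers: 99 })` passes through without a conflict because the `false` default is never injected, while `resolveGpuIntent({ useGPU: false, nGpuLayers: 99 })` throws.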

Tests

  • addon/tests/test_supertonic_config.cpp: flipped UseGpuTrueRejectedWithExplanation → UseGpuTrueAcceptedAtConstruction, NGpuLayersGreaterThanZeroRejected → NGpuLayersGreaterThanZeroAccepted. Added UseGpuNGpuLayersConflictStillRejected to lock in the kept conflict check. Simplified NGpuLayersZeroAcceptedAndDeferredLoad.
  • test/integration/gpu-smoke.test.js — flipped the Supertonic entry from "useGPU=true is rejected at constructor" to "useGPU=true must engage the GPU backend on GPU-capable platforms", mirroring the existing Chatterbox GPU smoke contract (assertGpuBackend, NO_GPU skip, same loadSupertonicTTS / runSupertonicTTS plumbing).
  • test/unit/supertonic{,-mtl}.inference.test.js — updated stale assertion text.

Docs

  • README.md, index.d.ts — updated to describe the engine-agnostic tier policy; no more "Supertonic stays CPU-only" claims.
  • examples/supertonic*-tts.js (4 files) — NOTE blocks rewritten to reference the new GPU support and point at the opt-in pattern; the examples themselves keep useGPU=false so they run identically everywhere.

Version + dependency

  • vcpkg.json: tts-cpp version>= bumped from 2026-06-03#1 to 2026-06-05 (the registry baseline after #184).
  • package.json: 0.2.0 → 0.2.1.
  • CHANGELOG.md: new [0.2.1] entry under "Added" + "Changed".

Validation

  • Upstream tts-cpp (prior to bump): 38/38 supertonic ctests pass on master HEAD 128dae42.
  • Local bench (n=10 timed runs, "quick brown fox", on Apple M-series):
    • C++ Metal native: 92 ms / 0.029 RTF / 34.07× realtime
    • C++ Vulkan via MoltenVK: validates end-to-end (backend: Vulkan0 (device 0: Apple M2), all auto-policies engaged); 33.94× on sustained 11s-audio payloads
    • C++ CPU: 231 ms / 13.57×
    • ONNX CPU contrast: 139 ms / 22.40×
  • TTS GGML overlay CI on tetherto/qvac (actions/runs/27006424023): 26/27 jobs pass against the merged tts-cpp; the one failure is the structural merge-guard / validate-pr "needs 'verified' label" check that fires on every workflow_dispatch run regardless of code.
  • Local on this branch:
    • npm run test:unit → 61/61 brittle-bare tests pass / 204/204 assertions
    • npm run test:dts → clean
    • standard lint on edited files → clean
  • Adversarial subagent review on this diff: SAFE on all 13 invariants (rejection paths cleanly removed, conflict check preserved, Chatterbox untouched, tests correctly inverted, no dangling refs to the old "CPU only today" claim outside historical CHANGELOG entries).
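
The bench figures above can be cross-checked with a little arithmetic, assuming the conventional definitions (RTF = synthesis time / audio duration, realtime factor = 1 / RTF). The helper names below are illustrative only.

```javascript
// Assumed definitions: RTF = synthesis time / audio duration; realtime factor = 1 / RTF.
const rtfFromRealtime = (realtimeFactor) => 1 / realtimeFactor

// Audio duration implied by a measured latency and a realtime factor.
const impliedAudioMs = (latencyMs, realtimeFactor) => latencyMs * realtimeFactor

// Figures taken from the bench list above.
const runs = [
  { name: 'C++ Metal', latencyMs: 92, realtime: 34.07 },
  { name: 'C++ CPU', latencyMs: 231, realtime: 13.57 },
  { name: 'ONNX CPU', latencyMs: 139, realtime: 22.40 }
]

for (const run of runs) {
  const rtf = rtfFromRealtime(run.realtime).toFixed(3)
  const audioSec = (impliedAudioMs(run.latencyMs, run.realtime) / 1000).toFixed(2)
  console.log(`${run.name}: RTF ${rtf}, implied audio ${audioSec} s`)
}
```

Under those definitions the Metal row is self-consistent (1 / 34.07 rounds to 0.029), and all three rows imply roughly 3.1 s of synthesized audio for the test sentence.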

Test plan

  • JS unit tests pass locally (npm run test:unit)
  • TypeScript .d.ts compiles (npm run test:dts)
  • standard lint clean on edited files
  • No production-code references to "CPU only today" / "Supertonic is CPU-only" / "silently wrong" (only historical CHANGELOG entries retain the phrase, as is proper for an immutable changelog)
  • CI green on this PR (C++ unit tests + integration tests, including the GPU smoke contract on every CI runner that doesn't set NO_GPU=true)
  • Real Adreno 700+ device validation (open item from the upstream tts-cpp PR #31 test plan; not blocking this PR)

🤖 Generated with Claude Code

Companion to tetherto/qvac-registry-vcpkg#184 (the trio registry bump
that ships tts-cpp@2026-06-05#0). tts-cpp@2026-06-05 brings the
QVAC-18605 Supertonic Vulkan/Metal optimisations (rounds 1-13, ~34x
realtime on Apple M-series Metal) and the QVAC-19254 sched/cpu_backend
refactor for Adreno OpenCL, lifting the previous "Supertonic is
CPU-only today" engine-boundary limitation. This PR removes the now-
stale rejection gates on the downstream tts-ggml addon and lets caller
GPU intent flow through to the merged tts-cpp tier policy
(init_gpu_backend: Adreno 700+ -> OpenCL, otherwise Vulkan/Metal/CUDA
via registry walk, otherwise CPU).

C++ addon (addon/src/model-interface/supertonic/):
- SupertonicModel.cpp::validateConfig: removed the `if (wantsGpu) {
  throw StatusError(... "CPU only today" ...) }` block. The conflicting-
  pair check (useGPU=true + nGpuLayers=0 or vice versa) is preserved
  so callers can't silently get the opposite backend they asked for.
- SupertonicModel.cpp::loadLocked: removed the `#ifdef __ANDROID__`
  force-off block. Android GPU routing is now delegated to tts-cpp's
  init_gpu_backend, which already allowlists Qualcomm Adreno and skips
  Mali / non-Adreno GPUs that would abort ggml_backend_graph_compute.
- SupertonicConfig.hpp: updated the useGpu docstring.

JS addon (index.js):
- Removed the parallel `wantsGpu` rejection for Supertonic. The default
  precondition for `useGPU = false` now also requires `nGpuLayers ==
  null` so a caller passing `nGpuLayers: 99` alone doesn't get a
  silent conflict with the JS-side default.
- The cross-field conflict check (useGPU=true + nGpuLayers=0 or vice
  versa) lives outside the ENGINE_SUPERTONIC branch and is preserved.

Tests:
- addon/tests/test_supertonic_config.cpp: flipped
  UseGpuTrueRejectedWithExplanation -> UseGpuTrueAcceptedAtConstruction,
  NGpuLayersGreaterThanZeroRejected -> NGpuLayersGreaterThanZeroAccepted.
  Added UseGpuNGpuLayersConflictStillRejected to lock in the kept
  conflict check. Simplified NGpuLayersZeroAcceptedAndDeferredLoad.
- test/integration/gpu-smoke.test.js: flipped the Supertonic entry
  from "useGPU=true is rejected at constructor" to "useGPU=true must
  engage the GPU backend on GPU-capable platforms", mirroring the
  existing Chatterbox smoke contract (assertGpuBackend, NO_GPU skip,
  same loadSupertonicTTS / runSupertonicTTS plumbing).
- test/unit/supertonic{,-mtl}.inference.test.js: updated stale
  assertion text ("nGpuLayers=0 is the only allowed GPU value..." and
  "supertonic stays CPU-only on the JS side").

Docs:
- README.md: useGPU row now describes the engine-agnostic tier policy.
- index.d.ts: useGPU + nGpuLayers JSDoc no longer claim Supertonic
  rejects GPU intent.
- examples/supertonic{,-mtl,-mtl-sweep,-sentence-stream}-tts.js: NOTE
  blocks rewritten to point at the tts-cpp@2026-06-05 GPU support and
  the GPU opt-in pattern. The examples themselves keep useGPU=false
  so they run identically everywhere.

Version + dependency:
- vcpkg.json: `tts-cpp` version>= bumped from `2026-06-03#1` to
  `2026-06-05` (the registry baseline after #184).
- package.json: 0.2.0 -> 0.2.1.
- CHANGELOG.md: new [0.2.1] entry under "Added" + "Changed".

Validation (upstream tts-cpp, prior to bump):
- 38/38 supertonic ctests pass on the merged tts-cpp@128dae42.
- Local Metal n=10 bench: F1 92 ms / 0.029 RTF / 34.07x realtime, M1
  95 ms / 0.030 RTF / 33.63x. CPU 13.6x, ONNX-CPU 22.4x for contrast.
- TTS GGML overlay CI on tetherto/qvac (actions/runs/27006424023):
  26/27 jobs pass; the merge-guard / validate-pr failure is a
  structural "needs verified label" check that fires on every
  workflow_dispatch run regardless of code (same on pre-merge baseline
  actions/runs/26878322979).
- Adversarial subagent review on this diff: SAFE on all 13 invariants
  (rejection paths cleanly removed, conflict check preserved,
  Chatterbox untouched, tests correctly inverted, no dangling refs to
  the old "CPU only today" claim outside historical CHANGELOG entries).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@ogad-tether ogad-tether requested review from a team as code owners June 5, 2026 17:52
@ogad-tether ogad-tether self-assigned this Jun 5, 2026
@ogad-tether ogad-tether added the tier1 and verified (Authorize secrets / label-gate in PR workflows) labels Jun 5, 2026
@github-actions

github-actions Bot commented Jun 5, 2026

Tier-based Approval Status

**PR Tier:** TIER1

**Current Status:** ✅ APPROVED

**Requirements:**
- 1 Team Member approval ✅ (1/1)
- 1 Team Lead OR Management approval ✅ (1/1)



---
*This comment is automatically updated when reviews change.*

@freddy311082

/review

@freddy311082 freddy311082 merged commit 79378e3 into main Jun 5, 2026
37 checks passed
@freddy311082 freddy311082 deleted the QVAC-19255-tts-ggml-supertonic-gpu branch June 5, 2026 20:37
Zbig9000 added a commit that referenced this pull request Jun 10, 2026
…06-05 (#2473)" (#2502)

This reverts commit 79378e3.

@qvac/tts-ggml@0.2.1 crashes on Android at addon load: the Bare worklet
aborts with SIGABRT ~1s into bootstrap, which presents as a bootstrap
timeout and has failed every Android e2e run since Jun 9 (iOS and desktop
are unaffected). Device-farm logcat:

  AddonError: ADDON_NOT_FOUND: Cannot find addon '.' ...
    Candidates: - linked:libqvac__tts-ggml.0.2.1.so
    [cause]: Error: dlopen failed
  F libc: Fatal signal 6 (SIGABRT) in tid ... (mqt_v_js)

Root cause: 0.2.1 bumped tts-cpp 2026-06-03#1 -> 2026-06-05, which pins
upstream qvac-ext-lib-whisper.cpp@128dae42 (the QVAC-19254 sched +
cpu_backend refactor). That refactor makes direct ggml_backend_is_cpu /
ggml_get_type_traits_cpu calls inside the statically-linked tts-cpp lib.
On Android the shared ggml-speech vcpkg port builds the CPU backend as
runtime-dlopen'd per-microarch MODULE .so variants (GGML_CPU_ALL_VARIANTS
=ON + GGML_BACKEND_DL=ON; no static CPU archive), so those two symbols are
left UND in libqvac__tts-ggml.*.so with no DT_NEEDED able to resolve them
(the CPU variant libs are only dlopen'd lazily inside Engine construction,
long after Bare loads the addon). The addon therefore can't be linked and
the process aborts. On iOS/desktop the CPU backend is statically linked,
so the symbols resolve and there is no crash.

This reverts PR #2473 (the Supertonic GPU enablement) in full and pins
tts-cpp back to 2026-06-03#1 -- the last-known-good revision that 0.2.0
shipped and that the team verified green on Android (smoke suite). With
tts-cpp reverted, Supertonic is CPU-only again, so the validateConfig /
loadLocked useGPU rejection gates, the C++ unit tests, the gpu-smoke
integration test, and the README / index.d.ts / examples are all reverted
to keep the package internally consistent.

Released as 0.2.2 (not a rollback to 0.2.0): the broken 0.2.1 dev build is
already in the package registry the e2e installs from, and the SDK depends
on ^0.2.0, so the fix must carry a higher version to be selected.
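
To illustrate why a higher version is needed, here is a simplified sketch of caret-range resolution. It is not the npm resolver: it handles plain `major.minor.patch` strings only and ignores prerelease tags, and all function names are illustrative.

```javascript
// Simplified caret-range matching for plain x.y.z versions (no prerelease tags).
const parse = (v) => v.split('.').map(Number)

function compare (a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i]
  }
  return 0
}

function satisfiesCaret (version, base) {
  const v = parse(version)
  const b = parse(base)
  if (compare(v, b) < 0) return false
  // A caret range pins everything up to the leftmost non-zero component.
  const pinned = b[0] > 0 ? 1 : b[1] > 0 ? 2 : 3
  return v.slice(0, pinned).every((n, i) => n === b[i])
}

function maxSatisfying (versions, base) {
  const ok = versions.filter((v) => satisfiesCaret(v, base))
  ok.sort((x, y) => compare(parse(x), parse(y)))
  return ok.length ? ok[ok.length - 1] : null
}

console.log(maxSatisfying(['0.2.0', '0.2.1'], '0.2.0'))
console.log(maxSatisfying(['0.2.0', '0.2.1', '0.2.2'], '0.2.0'))
```

With only 0.2.0 and 0.2.1 published, `^0.2.0` selects the broken 0.2.1; once 0.2.2 exists it wins, which is why the fix ships as 0.2.2 rather than as a rollback to 0.2.0.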

The proper fix belongs upstream (QVAC-19254 follow-up against tts-cpp /
ggml-speech): make ggml_backend_is_cpu / ggml_get_type_traits_cpu
defined-and-internal on Android by statically linking ggml-cpu into the
addon the way desktop/iOS already do, keeping the GPU backends dynamic.
The Supertonic GPU work can re-land once that is in place.

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Ishan Vohra <ishanvohra2@gmail.com>
Co-authored-by: GustavoA1604 <54457676+GustavoA1604@users.noreply.github.com>