QVAC-19255 tts-ggml: enable Supertonic GPU via tts-cpp 2026-06-05 by ogad-tether · Pull Request #2473 · tetherto/qvac

ogad-tether · 2026-06-05T17:52:18Z

Summary

Companion to tetherto/qvac-registry-vcpkg#184 (now merged at bb702251). That PR bumps the tts-cpp port to master HEAD 128dae42 (PR #31 supertonic_optimizations), which brings the QVAC-18605 Supertonic Vulkan/Metal optimisations (rounds 1-13, ~34× realtime on Apple M-series Metal) and the QVAC-19254 sched + cpu_backend refactor for Adreno OpenCL. This PR removes the now-stale "Supertonic is CPU-only today" rejection gates on the downstream tts-ggml addon and lets caller GPU intent flow through to the merged tts-cpp tier policy.

C++ addon (`addon/src/model-interface/supertonic/`)

SupertonicModel.cpp::validateConfig — removed the `if (wantsGpu) { throw StatusError(... "CPU only today" ...) }` block. The cross-field conflict check (useGPU=true + nGpuLayers=0, or useGPU=false + nGpuLayers!=0) is preserved so callers can't silently get the opposite of the backend they asked for.
SupertonicModel.cpp::loadLocked — removed the `#ifdef __ANDROID__` force-off block. Android GPU routing is now delegated to tts-cpp's init_gpu_backend, which already allowlists Qualcomm Adreno (Adreno 700+ → OpenCL, otherwise tier fallback) and skips Mali / non-Adreno GPUs.
SupertonicConfig.hpp — updated the useGpu docstring.
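
For readers who don't have the tts-cpp source at hand, the tier policy described above can be pictured roughly as follows. This is only an illustrative sketch in JavaScript, not tts-cpp's `init_gpu_backend`: the function name `pickBackend`, its parameters, the GPU-name parsing, and the decision to send older Adreno parts to CPU are all assumptions made for the example.

```javascript
// Illustrative only: hypothetical names, NOT the real tts-cpp init_gpu_backend.
const GPU_BACKENDS = new Set(['Vulkan', 'Metal', 'CUDA'])

function pickBackend ({ wantsGpu, platform, gpuName = '', registry = [] }) {
  // No GPU intent from the caller: stay on CPU.
  if (!wantsGpu) return 'CPU'

  if (platform === 'android') {
    // Only Adreno 700+ is allowlisted for OpenCL; Mali and other GPUs are skipped.
    // Assumption: anything below the allowlist falls back to CPU.
    const match = /adreno\D*(\d+)/i.exec(gpuName)
    return match && Number(match[1]) >= 700 ? 'OpenCL' : 'CPU'
  }

  // Elsewhere: walk the registered backends in order and take the first GPU one.
  return registry.find((name) => GPU_BACKENDS.has(name)) ?? 'CPU'
}
```

For example, `pickBackend({ wantsGpu: true, platform: 'android', gpuName: 'Mali-G78' })` stays on CPU under these assumptions, while an `'Adreno (TM) 740'` device would be routed to OpenCL.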

JS addon (`index.js`)

Removed the parallel wantsGpu rejection for Supertonic.
The default precondition for useGPU = false now also requires nGpuLayers == null so a caller passing nGpuLayers: 99 alone doesn't get a silent conflict with the JS-side default.
The engine-agnostic conflict check is preserved verbatim.
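
The defaulting and conflict rules above can be sketched as a small pure function. This is a minimal sketch rather than the actual `index.js` code: `resolveGpuIntent` is a hypothetical name, and inferring `useGPU` from `nGpuLayers` when only the latter is passed is an assumption about how "no silent conflict" is achieved.

```javascript
// Hypothetical helper; a sketch of the rules described above, not index.js itself.
function resolveGpuIntent ({ useGPU, nGpuLayers } = {}) {
  // The useGPU = false default only applies when the caller gave no GPU hint at all.
  if (useGPU == null && nGpuLayers == null) useGPU = false

  // Assumption: with only nGpuLayers set, GPU intent follows it (0 means CPU).
  if (useGPU == null) useGPU = nGpuLayers !== 0

  // Engine-agnostic cross-field conflict check.
  if (nGpuLayers != null) {
    if (useGPU === true && nGpuLayers === 0) {
      throw new Error('Conflicting config: useGPU=true with nGpuLayers=0')
    }
    if (useGPU === false && nGpuLayers !== 0) {
      throw new Error('Conflicting config: useGPU=false with nGpuLayers!=0')
    }
  }

  return { useGPU, nGpuLayers: nGpuLayers ?? null }
}
```

With this shape, `resolveGpuIntent({ nGpuLayers: 99 })` resolves to GPU intent instead of colliding with the default, while `resolveGpuIntent({ useGPU: false, nGpuLayers: 99 })` still throws.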

Tests

addon/tests/test_supertonic_config.cpp — flipped UseGpuTrueRejectedWithExplanation → UseGpuTrueAcceptedAtConstruction, NGpuLayersGreaterThanZeroRejected → NGpuLayersGreaterThanZeroAccepted. Added UseGpuNGpuLayersConflictStillRejected to lock in the kept conflict check. Simplified NGpuLayersZeroAcceptedAndDeferredLoad.
test/integration/gpu-smoke.test.js — flipped the Supertonic entry from "useGPU=true is rejected at constructor" to "useGPU=true must engage the GPU backend on GPU-capable platforms", mirroring the existing Chatterbox GPU smoke contract (assertGpuBackend, NO_GPU skip, same loadSupertonicTTS / runSupertonicTTS plumbing).
test/unit/supertonic{,-mtl}.inference.test.js — updated stale assertion text.
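
The flipped smoke contract can be pictured as the check below. It is a sketch under stated assumptions and does not reproduce the repo's `assertGpuBackend` helper: `checkGpuSmoke` is a hypothetical name, and the assumption that a CPU backend reports a name starting with "CPU" (GPU backends looking like `Vulkan0`) is made only for illustration.

```javascript
// Hypothetical stand-in for the smoke contract; not the repo's assertGpuBackend.
function checkGpuSmoke (env, backendName) {
  // Runners that set NO_GPU=true skip the contract entirely.
  if (env.NO_GPU === 'true') return { skipped: true, ok: true }

  // Assumption: a CPU backend reports a name beginning with "CPU".
  const engaged = !/^cpu/i.test(backendName)
  return { skipped: false, ok: engaged }
}
```

Under the old contract the Supertonic entry expected a constructor rejection; under this one, a result with `ok: false` on a GPU-capable runner is a test failure.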

Docs

README.md, index.d.ts — updated to describe the engine-agnostic tier policy; no more "Supertonic stays CPU-only" claims.
examples/supertonic*-tts.js (4 files) — NOTE blocks rewritten to reference the new GPU support and point at the opt-in pattern; the examples themselves keep useGPU=false so they run identically everywhere.

Version + dependency

vcpkg.json — tts-cpp version>= bumped from 2026-06-03#1 to 2026-06-05 (the registry baseline after tetherto/qvac-registry-vcpkg#184).
package.json — 0.2.0 → 0.2.1.
CHANGELOG.md — new [0.2.1] - 2026-06-05 entry under "Added" + "Changed".

Validation

Upstream tts-cpp (prior to bump): 38/38 supertonic ctests pass on master HEAD 128dae42.
Local bench (n=10 timed runs, "quick brown fox", on Apple M-series):
- C++ Metal native: 92 ms / 0.029 RTF / 34.07× realtime
- C++ Vulkan via MoltenVK: validates end-to-end (backend: Vulkan0 (device 0: Apple M2), all auto-policies engaged); 33.94× on sustained 11s-audio payloads
- C++ CPU: 231 ms / 13.57×
- ONNX CPU contrast: 139 ms / 22.40×
TTS GGML overlay CI on tetherto/qvac (actions/runs/27006424023): 26/27 jobs pass against the merged tts-cpp; the one failure is the structural merge-guard / validate-pr "needs 'verified' label" check that fires on every workflow_dispatch run regardless of code.
Local on this branch:
- npm run test:unit → 61/61 brittle-bare tests pass / 204/204 assertions
- npm run test:dts → clean
- standard lint on edited files → clean
Adversarial subagent review on this diff: SAFE on all 13 invariants (rejection paths cleanly removed, conflict check preserved, Chatterbox untouched, tests correctly inverted, no dangling refs to the old "CPU only today" claim outside historical CHANGELOG entries).
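
For reading the bench figures: RTF (real-time factor) is synthesis time divided by audio duration, and the "× realtime" number is its inverse. The sketch below shows the arithmetic. The audio duration is not stated in this PR; a value of about 3.13 s is only inferred here from 92 ms × 34.07, and `realtimeStats` is a hypothetical helper name.

```javascript
// Hypothetical helper showing how RTF and the realtime multiple relate.
function realtimeStats (synthMs, audioMs) {
  const rtf = synthMs / audioMs // < 1 means faster than realtime
  return { rtf, realtimeFactor: audioMs / synthMs }
}

// Inferred, not measured: audio length implied by the Metal row (92 ms at 34.07x).
const impliedAudioMs = 92 * 34.07
const metal = realtimeStats(92, impliedAudioMs)
const cpu = realtimeStats(231, impliedAudioMs)
console.log(metal.rtf.toFixed(3), cpu.realtimeFactor.toFixed(2))
```

With that inferred duration, the CPU row's 231 ms comes out at roughly 13.57×, which matches the reported figure and suggests both rows used the same payload.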

Test plan

JS unit tests pass locally (npm run test:unit)
TypeScript .d.ts compiles (npm run test:dts)
standard lint clean on edited files
No production-code references to "CPU only today" / "Supertonic is CPU-only" / "silently wrong" (only historical CHANGELOG entries retain the phrase, as is proper for an immutable changelog)
CI green on this PR (C++ unit tests + integration tests, including the GPU smoke contract on every CI runner that doesn't set NO_GPU=true)
Real Adreno 700+ device validation (open from the upstream tts-cpp PR #31 test plan; not blocking this PR)

🤖 Generated with Claude Code

Companion to tetherto/qvac-registry-vcpkg#184 (the trio registry bump that ships tts-cpp@2026-06-05#0). tts-cpp@2026-06-05 brings the QVAC-18605 Supertonic Vulkan/Metal optimisations (rounds 1-13, ~34x realtime on Apple M-series Metal) and the QVAC-19254 sched/cpu_backend refactor for Adreno OpenCL, lifting the previous "Supertonic is CPU-only today" engine-boundary limitation.

This PR removes the now-stale rejection gates on the downstream tts-ggml addon and lets caller GPU intent flow through to the merged tts-cpp tier policy (init_gpu_backend: Adreno 700+ -> OpenCL, otherwise Vulkan/Metal/CUDA via registry walk, otherwise CPU).

C++ addon (addon/src/model-interface/supertonic/):
- SupertonicModel.cpp::validateConfig: removed the `if (wantsGpu) { throw StatusError(... "CPU only today" ...) }` block. The conflicting-pair check (useGPU=true + nGpuLayers=0 or vice versa) is preserved so callers can't silently get the opposite backend they asked for.
- SupertonicModel.cpp::loadLocked: removed the `#ifdef __ANDROID__` force-off block. Android GPU routing is now delegated to tts-cpp's init_gpu_backend, which already allowlists Qualcomm Adreno and skips Mali / non-Adreno GPUs that would abort ggml_backend_graph_compute.
- SupertonicConfig.hpp: updated the useGpu docstring.

JS addon (index.js):
- Removed the parallel `wantsGpu` rejection for Supertonic. The default precondition for `useGPU = false` now also requires `nGpuLayers == null` so a caller passing `nGpuLayers: 99` alone doesn't get a silent conflict with the JS-side default.
- The cross-field conflict check (useGPU=true + nGpuLayers=0 or vice versa) lives outside the ENGINE_SUPERTONIC branch and is preserved.

Tests:
- addon/tests/test_supertonic_config.cpp: flipped UseGpuTrueRejectedWithExplanation -> UseGpuTrueAcceptedAtConstruction, NGpuLayersGreaterThanZeroRejected -> NGpuLayersGreaterThanZeroAccepted. Added UseGpuNGpuLayersConflictStillRejected to lock in the kept conflict check. Simplified NGpuLayersZeroAcceptedAndDeferredLoad.
- test/integration/gpu-smoke.test.js: flipped the Supertonic entry from "useGPU=true is rejected at constructor" to "useGPU=true must engage the GPU backend on GPU-capable platforms", mirroring the existing Chatterbox smoke contract (assertGpuBackend, NO_GPU skip, same loadSupertonicTTS / runSupertonicTTS plumbing).
- test/unit/supertonic{,-mtl}.inference.test.js: updated stale assertion text ("nGpuLayers=0 is the only allowed GPU value..." and "supertonic stays CPU-only on the JS side").

Docs:
- README.md: useGPU row now describes the engine-agnostic tier policy.
- index.d.ts: useGPU + nGpuLayers JSDoc no longer claim Supertonic rejects GPU intent.
- examples/supertonic{,-mtl,-mtl-sweep,-sentence-stream}-tts.js: NOTE blocks rewritten to point at the tts-cpp@2026-06-05 GPU support and the GPU opt-in pattern. The examples themselves keep useGPU=false so they run identically everywhere.

Version + dependency:
- vcpkg.json: `tts-cpp` version>= bumped from `2026-06-03#1` to `2026-06-05` (the registry baseline after #184).
- package.json: 0.2.0 -> 0.2.1.
- CHANGELOG.md: new [0.2.1] entry under "Added" + "Changed".

Validation (upstream tts-cpp, prior to bump):
- 38/38 supertonic ctests pass on the merged tts-cpp@128dae42.
- Local Metal n=10 bench: F1 92 ms / 0.029 RTF / 34.07x realtime, M1 95 ms / 0.030 RTF / 33.63x. CPU 13.6x, ONNX-CPU 22.4x for contrast.
- TTS GGML overlay CI on tetherto/qvac (actions/runs/27006424023): 26/27 jobs pass; the merge-guard / validate-pr failure is a structural "needs verified label" check that fires on every workflow_dispatch run regardless of code (same on pre-merge baseline actions/runs/26878322979).
- Adversarial subagent review on this diff: SAFE on all 13 invariants (rejection paths cleanly removed, conflict check preserved, Chatterbox untouched, tests correctly inverted, no dangling refs to the old "CPU only today" claim outside historical CHANGELOG entries).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

github-actions · 2026-06-05T19:45:32Z

Tier-based Approval Status

**PR Tier:** TIER1

**Current Status:** ✅ APPROVED

**Requirements:**
- 1 Team Member approval ✅ (1/1)
- 1 Team Lead OR Management approval ✅ (1/1)



---
*This comment is automatically updated when reviews change.*

freddy311082 · 2026-06-05T19:48:31Z

/review

…06-05 (#2473)" (#2502)

This reverts commit 79378e3.

@qvac/tts-ggml@0.2.1 crashes on Android at addon load: the Bare worklet aborts with SIGABRT ~1s into bootstrap, which presents as a bootstrap timeout and has failed every Android e2e run since Jun 9 (iOS and desktop are unaffected). Device-farm logcat:

    AddonError: ADDON_NOT_FOUND: Cannot find addon '.' ...
    Candidates:
    - linked:libqvac__tts-ggml.0.2.1.so
    [cause]: Error: dlopen failed
    F libc: Fatal signal 6 (SIGABRT) in tid ... (mqt_v_js)

Root cause: 0.2.1 bumped tts-cpp 2026-06-03#1 -> 2026-06-05, which pins upstream qvac-ext-lib-whisper.cpp@128dae42 (the QVAC-19254 sched + cpu_backend refactor). That refactor makes direct ggml_backend_is_cpu / ggml_get_type_traits_cpu calls inside the statically-linked tts-cpp lib. On Android the shared ggml-speech vcpkg port builds the CPU backend as runtime-dlopen'd per-microarch MODULE .so variants (GGML_CPU_ALL_VARIANTS=ON + GGML_BACKEND_DL=ON; no static CPU archive), so those two symbols are left UND in libqvac__tts-ggml.*.so with no DT_NEEDED able to resolve them (the CPU variant libs are only dlopen'd lazily inside Engine construction, long after Bare loads the addon). The addon therefore can't be linked and the process aborts. On iOS/desktop the CPU backend is statically linked, so the symbols resolve and there is no crash.

This reverts PR #2473 (the Supertonic GPU enablement) in full and pins tts-cpp back to 2026-06-03#1 -- the last-known-good revision that 0.2.0 shipped and that the team verified green on Android (smoke suite). With tts-cpp reverted, Supertonic is CPU-only again, so the validateConfig / loadLocked useGPU rejection gates, the C++ unit tests, the gpu-smoke integration test, and the README / index.d.ts / examples are all reverted to keep the package internally consistent.

Released as 0.2.2 (not a rollback to 0.2.0): the broken 0.2.1 dev build is already in the package registry the e2e installs from, and the SDK depends on ^0.2.0, so the fix must carry a higher version to be selected.

The proper fix belongs upstream (QVAC-19254 follow-up against tts-cpp / ggml-speech): make ggml_backend_is_cpu / ggml_get_type_traits_cpu defined-and-internal on Android by statically linking ggml-cpu into the addon the way desktop/iOS already do, keeping the GPU backends dynamic. The Supertonic GPU work can re-land once that is in place.

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Ishan Vohra <ishanvohra2@gmail.com>
Co-authored-by: GustavoA1604 <54457676+GustavoA1604@users.noreply.github.com>
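
The versioning argument above follows from npm's caret-range rule: for a 0.y.z base with y > 0, `^0.y.z` means `>=0.y.z <0.(y+1).0`, and the resolver picks the highest published version in that range. A minimal sketch of that selection, ignoring pre-release tags and using hypothetical helper names (this is not the `semver` package's API), looks like this:

```javascript
// Minimal caret-range resolver for plain x.y.z versions (pre-release tags ignored).
const parse = (v) => v.split('.').map(Number)
const compare = (a, b) => a[0] - b[0] || a[1] - b[1] || a[2] - b[2]

// Exclusive upper bound of ^base: bump the left-most non-zero component.
function caretUpperBound ([major, minor, patch]) {
  if (major > 0) return [major + 1, 0, 0]
  if (minor > 0) return [0, minor + 1, 0]
  return [0, 0, patch + 1]
}

function maxSatisfyingCaret (versions, base) {
  const lo = parse(base)
  const hi = caretUpperBound(lo)
  return versions
    .filter((v) => compare(parse(v), lo) >= 0 && compare(parse(v), hi) < 0)
    .sort((a, b) => compare(parse(a), parse(b)))
    .pop() ?? null
}
```

With only 0.2.0 and 0.2.1 published, `maxSatisfyingCaret(['0.2.0', '0.2.1'], '0.2.0')` selects the broken 0.2.1, so re-publishing 0.2.0 content cannot win; adding 0.2.2 to the list makes it the selected version.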

ogad-tether requested review from a team as code owners June 5, 2026 17:52

ogad-tether temporarily deployed to release June 5, 2026 17:53 — with GitHub Actions Inactive

ogad-tether self-assigned this Jun 5, 2026

ogad-tether added the tier1 and verified (Authorize secrets / label-gate in PR workflows) labels Jun 5, 2026

ogad-tether temporarily deployed to release June 5, 2026 17:59 — with GitHub Actions Inactive

ogad-tether temporarily deployed to release June 5, 2026 18:02 — with GitHub Actions Inactive

gianni-cor approved these changes Jun 5, 2026

Merge branch 'main' into QVAC-19255-tts-ggml-supertonic-gpu

6826595

freddy311082 temporarily deployed to release June 5, 2026 19:48 — with GitHub Actions Inactive

freddy311082 temporarily deployed to release June 5, 2026 19:58 — with GitHub Actions Inactive

freddy311082 merged commit 79378e3 into main Jun 5, 2026
37 checks passed

freddy311082 deleted the QVAC-19255-tts-ggml-supertonic-gpu branch June 5, 2026 20:37

freddy311082 temporarily deployed to release June 5, 2026 20:37 — with GitHub Actions Inactive

Zbig9000 mentioned this pull request Jun 9, 2026

revert: "QVAC-19255 tts-ggml: enable Supertonic GPU via tts-cpp 2026-06-05 (#2473)" #2502

Merged

Zbig9000 mentioned this pull request Jun 11, 2026

QVAC-19255 feat[api]: reintroduce Supertonic GPU support (desktop/iOS; Android CPU-only) #2506

Merged

freddy311082 merged 2 commits into main from QVAC-19255-tts-ggml-supertonic-gpu
