tetherto · Zbig9000 · Jun 10, 2026 · Jun 9, 2026 · Jun 9, 2026 · Jun 9, 2026
@@ -5,7 +5,52 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
-## [0.2.1] - 2026-06-05
+## [0.2.2] - 2026-06-09
+
+### Fixed
+
+- **Android: revert the `tts-cpp` `2026-06-05` bump (introduced in 0.2.1)
+  that crashed the addon at `dlopen` during bootstrap, taking down every
+  Android e2e run.** `tts-cpp` `2026-06-05` pins upstream
+  `qvac-ext-lib-whisper.cpp@128dae42` (the QVAC-19254 "sched + cpu_backend
+  refactor"), which added direct `ggml_backend_is_cpu` /
+  `ggml_get_type_traits_cpu` calls inside the statically-linked `tts-cpp`
+  library. On Android the shared `ggml-speech` port builds the CPU backend
+  as runtime-`dlopen`'d per-microarch MODULE `.so` variants
+  (`GGML_CPU_ALL_VARIANTS=ON` + `GGML_BACKEND_DL=ON`; no static CPU
+  archive), so those two symbols are left `UND` in
+  `libqvac__tts-ggml.*.so`'s dynamic symbol table with no `DT_NEEDED` able
+  to resolve them — the CPU variant libraries are only `dlopen`'d lazily
+  inside Engine construction, long after Bare loads the addon. Bare's
+  resolver therefore fails to register the addon
+  (`ADDON_NOT_FOUND: linked:libqvac__tts-ggml.*.so` / `dlopen failed`) and
+  the unhandled rejection aborts the process (SIGABRT) ~1 s into
+  bootstrap. iOS and desktop (Linux/macOS/Windows) statically link the CPU
+  backend and were never affected. Pin `tts-cpp` back to `2026-06-03#1`
+  (the last-known-good revision, the one 0.2.0 shipped) so the Android
+  addon loads cleanly again.
+
+### Reverted
+
+- Reverts the 0.2.1 Supertonic GPU enablement (QVAC-19255, #2473) in full:
+  the `tts-cpp` pin, the `SupertonicModel.cpp` / `index.js` `useGPU` /
+  `nGpuLayers` gate removals, the flipped C++ unit tests and
+  `gpu-smoke.test.js` integration test, and the README / `index.d.ts` /
+  examples docs. With `tts-cpp` back at `2026-06-03#1` Supertonic is
+  CPU-only again, so the rejection gates and the CPU-only contract are
+  restored to keep the package internally consistent. The Supertonic GPU
+  work should re-land once the Android CPU-backend linkage is fixed
+  upstream (QVAC-19254 follow-up against `tts-cpp` / `ggml-speech`, e.g.
+  by statically linking `ggml-cpu` into the addon on Android the way
+  desktop/iOS already do).
+
+## [0.2.1] - 2026-06-05 — superseded by 0.2.2
+
+> **Broken on Android.** The `tts-cpp` `2026-06-05` dependency this release
+> introduced crashes the addon at load time (`dlopen` failure → SIGABRT)
+> on Android ARM64; iOS and desktop are unaffected. Reverted in 0.2.2 (see
+> above). The entry below describes what 0.2.1 attempted and is retained
+> for history.
 
 ### Added
 

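The 0.2.2 entry above attributes the crash to two symbols left `UND` in the addon's dynamic symbol table. A minimal sketch of how one might check for that condition, assuming the text comes from GNU `readelf --dyn-syms` and uses its usual column layout (`Num: Value Size Type Bind Vis Ndx Name`). The helper name `findUndefinedSymbols` and the sample rows are hypothetical, not taken from the project or from a real build. An `UND` entry is only a problem when no `DT_NEEDED` library supplies the symbol at load time.

```javascript
// Hypothetical helper: not part of @qvac/tts-ggml or tts-cpp.
// Parses text in the layout printed by `readelf --dyn-syms <lib.so>` and
// returns the requested symbol names whose section index (Ndx) is UND.
function findUndefinedSymbols (readelfOutput, wanted) {
  const wantedSet = new Set(wanted)
  const found = new Set()
  for (const line of readelfOutput.split('\n')) {
    // Assumed columns: Num: Value Size Type Bind Vis Ndx Name
    const cols = line.trim().split(/\s+/)
    if (cols.length < 8 || !/^\d+:$/.test(cols[0])) continue
    const ndx = cols[6]
    const name = cols[7].split('@')[0] // drop any symbol-version suffix
    if (ndx === 'UND' && wantedSet.has(name)) found.add(name)
  }
  return [...found]
}

// Illustrative sample rows only; not captured from a real build.
// `tts_engine_create` is a made-up defined symbol for contrast.
const sample = [
  "Symbol table '.dynsym' contains 3 entries:",
  '   Num:    Value          Size Type    Bind   Vis      Ndx Name',
  '     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND ',
  '     1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND ggml_backend_is_cpu',
  '     2: 0000000000012340    64 FUNC    GLOBAL DEFAULT   12 tts_engine_create'
].join('\n')

// Only the two symbols named in the changelog entry are checked.
const unresolved = findUndefinedSymbols(sample, [
  'ggml_backend_is_cpu',
  'ggml_get_type_traits_cpu'
])
```

Running the same check against a real `libqvac__tts-ggml.*.so` dump, together with its `DT_NEEDED` list, would show whether any library loaded at `dlopen` time can satisfy the symbols.
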
@@ -250,7 +250,7 @@ backend persist its compiled program cache across launches.
 | `backendsDir`             | string     | `path.join(__dirname, 'prebuilds')` | Root dir the addon scans for dynamically-loaded ggml backend `.so` files.  Required on Android (host should pass `path.join(__dirname, 'prebuilds')`); ignored on platforms that statically link the backend |
 | `openclCacheDir`          | string     | unset      | Android-only: directory where the OpenCL backend persists its compiled program-binary cache.  Setting it across runs avoids re-JITing the kernels on every fresh process |
 | `config.language`         | string     | `"en"`     | Chatterbox MTL accepts `es/fr/de/pt/it/zh/ja/ko/...`; turbo & Supertonic are English |
-| `config.useGPU`           | boolean    | `false`    | Set to `true` to route through Metal / Vulkan / OpenCL if available, on either Chatterbox or Supertonic.  Backend selection follows tts-cpp's `init_gpu_backend` tier policy (Adreno 700+ → OpenCL, otherwise Vulkan/Metal/CUDA via the registry walk, otherwise CPU) |
+| `config.useGPU`           | boolean    | `false`    | Set to `true` to route through Metal / Vulkan / OpenCL if available.  Ignored on Android (forced to CPU at the C++ engine boundary); rejected by Supertonic at construction time (engine is CPU-only today) |
 | `config.outputSampleRate` | number     | 24000      | Resample native 24 kHz output |
 | `opts.stats`              | boolean    | `false`    | Populate `response.stats` with RTF, `backendDevice` (0=CPU, 1=GPU), `backendId` (0=CPU, 1=Metal, 3=Vulkan, 4=OpenCL, 99=other) etc. |
 | `opts.exclusiveRun`       | boolean    | `false`    | Serialize overlapping streaming runs |
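
Given the `config.useGPU` row above, a caller constructing a Supertonic model has to stay on the CPU path. A minimal sketch of building constructor options that satisfy that contract. The helper `buildSupertonicOptions` is hypothetical; the option shape (`engine`, `files.supertonicModel`, `voice`, `config`) is copied from the integration test in this change set, and `'F1'` / `'en'` are just the values that test uses.

```javascript
// Hypothetical helper, not part of @qvac/tts-ggml.
// Returns constructor options for a CPU-only Supertonic model.
function buildSupertonicOptions (engineId, modelPath) {
  return {
    engine: engineId,
    files: { supertonicModel: modelPath },
    voice: 'F1',
    // useGPU is set explicitly; nGpuLayers is deliberately left unset,
    // since any non-zero value is rejected for Supertonic.
    config: { language: 'en', useGPU: false }
  }
}

// Intended use (requires the real package, so it is left as a comment):
//   const TTSGgml = require('@qvac/tts-ggml')
//   const model = new TTSGgml(
//     buildSupertonicOptions(TTSGgml.ENGINE_SUPERTONIC, 'models/supertonic.gguf'))
```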

@@ -19,13 +19,10 @@ struct SupertonicConfig {
    * Tri-state GPU intent (mirrors ChatterboxConfig::useGpu):
    *   - std::nullopt: unspecified, let the engine use its library default.
    *   - true:         if nGpuLayers unset, maps to nGpuLayers=99.
-   *                   Honoured as of tts-cpp@2026-06-05 (QVAC-18605
-   *                   Supertonic Vulkan/Metal optimisations + QVAC-19254
-   *                   sched/cpu_backend refactor for Adreno OpenCL).
-   *                   Backend selection follows tts-cpp's init_gpu_backend
-   *                   tier policy (Adreno 700+ -> OpenCL, otherwise
-   *                   Vulkan/Metal/CUDA via the registry walk, otherwise
-   *                   CPU).
+   *                   Note: SupertonicModel::validateConfig still rejects
+   *                   any GPU intent today because the Supertonic
+   *                   engine is CPU-only ("CPU only today" — see
+   *                   tts-cpp include/tts-cpp/supertonic/engine.h).
    *   - false:        if nGpuLayers unset, forces nGpuLayers=0 (CPU).
    *
    * Conflicts with nGpuLayers (true + 0, or false + !=0) are rejected

@@ -126,12 +126,19 @@ void SupertonicModel::validateConfig(const SupertonicConfig& cfg) {
               "(useGPU:true + nGpuLayers!=0, or useGPU:false + nGpuLayers=0).");
     }
   }
-  // GPU execution is supported as of tts-cpp@2026-06-05 (QVAC-18605
-  // Supertonic Vulkan/Metal optimisations + QVAC-19254 sched/cpu_backend
-  // refactor for Adreno OpenCL).  Backend selection follows tts-cpp's
-  // init_gpu_backend tier policy: Adreno 700+ -> OpenCL, otherwise
-  // Vulkan/Metal/CUDA via the registry walk, otherwise CPU.  Caller
-  // intent (useGPU / nGpuLayers) is honoured.
+  const bool wantsGpu =
+      cfg.useGpu.value_or(false) ||
+      (cfg.nGpuLayers.has_value() && *cfg.nGpuLayers != 0);
+  if (wantsGpu) {
+    throw StatusError(
+        general_error::InvalidArgument,
+        "SupertonicModel: GPU execution is not supported by the Supertonic "
+        "engine yet (see tts-cpp include/tts-cpp/supertonic/engine.h: \"CPU "
+        "only today\"). GPU output is currently silently wrong "
+        "(~4x quieter, slightly truncated) on the Vulkan vector-estimator "
+        "+ vocoder path. Pass useGPU: false (and leave nGpuLayers unset or "
+        "0) when constructing a Supertonic model.");
+  }
 }
 
 void SupertonicModel::load() {
@@ -153,13 +160,23 @@ void SupertonicModel::reload() {
 void SupertonicModel::loadLocked() {
   if (engine_) return;
 
-  // Android GPU policy is delegated to tts-cpp's init_gpu_backend tier
-  // policy as of QVAC-19254: it allowlists Qualcomm Adreno (OpenCL on
-  // Adreno 700+, falls through to Vulkan / CPU on other tiers) and
-  // skips Mali / non-Adreno GPUs that would abort ggml_backend_graph_
-  // compute.  No extra force-off at this boundary; consumers asking
-  // for useGPU=true on Android will get Adreno-OpenCL when available
-  // and CPU otherwise.
+  // Force useGPU to false on Android until Vulkan (Mali) and OpenCL (Adreno)
+  // stabilize for the Supertonic graph.
+#ifdef __ANDROID__
+  {
+    const bool wantsGpu =
+        cfg_.useGpu.value_or(false) ||
+        (cfg_.nGpuLayers.has_value() && *cfg_.nGpuLayers != 0);
+    if (wantsGpu) {
+      QLOG(logger::Priority::WARNING,
+           "Supertonic: GPU intent (useGPU / nGpuLayers) is ignored on Android "
+           "(GPU backends disabled at engine boundary pending Vulkan/Mali "
+           "and OpenCL/Adreno driver fixes); falling back to CPU.");
+    }
+    cfg_.useGpu     = false;
+    cfg_.nGpuLayers = 0;
+  }
+#endif
 
   try {
     engine_ = std::make_shared<tts_cpp::supertonic::Engine>(toEngineOptions(cfg_));

@@ -80,65 +80,50 @@ TEST(SupertonicValidate, NonexistentNoiseNpyRejected) {
   EXPECT_THROW(SupertonicModel{cfg}, StatusError);
 }
 
-TEST(SupertonicValidate, UseGpuTrueAcceptedAtConstruction) {
-  // QVAC-19255 (companion to PR-bump-to-tts-cpp-128dae42): Supertonic
-  // gained Vulkan/Metal GPU support in tts-cpp@2026-06-05 (QVAC-18605
-  // rounds 1-13 + QVAC-19254 sched). validateConfig must now ACCEPT
-  // useGPU=true at construction time. The stub GGUF file still fails
-  // parsing on load() — that's exercised below — but construction
-  // itself no longer rejects on GPU intent.
+TEST(SupertonicValidate, UseGpuTrueRejectedWithExplanation) {
   auto cfg = minimallyValidStubConfig();
   cfg.useGpu = true;
-  std::unique_ptr<SupertonicModel> m;
-  EXPECT_NO_THROW(m = std::make_unique<SupertonicModel>(cfg));
-  ASSERT_NE(m, nullptr);
-  EXPECT_FALSE(m->isLoaded());
-}
-
-TEST(SupertonicValidate, NGpuLayersGreaterThanZeroAccepted) {
-  // Companion to UseGpuTrueAcceptedAtConstruction: explicit
-  // nGpuLayers > 0 is no longer rejected at validation. Loading the
-  // stub will throw on GGUF parse, but the constructor must succeed.
-  auto cfg = minimallyValidStubConfig();
-  cfg.nGpuLayers = 99;
-  std::unique_ptr<SupertonicModel> m;
-  EXPECT_NO_THROW(m = std::make_unique<SupertonicModel>(cfg));
-  ASSERT_NE(m, nullptr);
-  EXPECT_FALSE(m->isLoaded());
-}
-
-TEST(SupertonicValidate, UseGpuNGpuLayersConflictStillRejected) {
-  // The cross-field conflict check (useGPU=true + nGpuLayers=0, or
-  // useGPU=false + nGpuLayers!=0) is still enforced after the GPU
-  // gate was lifted, so callers can't silently get the opposite
-  // backend they asked for.
-  auto cfg = minimallyValidStubConfig();
-  cfg.useGpu = true;
-  cfg.nGpuLayers = 0;
   bool threw = false;
   try {
     SupertonicModel m(cfg);
   } catch (const StatusError& e) {
     threw = true;
     const std::string what = e.what();
-    EXPECT_NE(what.find("conflicts with nGpuLayers"), std::string::npos)
-        << "error should explain the conflict; got: " << what;
+    EXPECT_NE(what.find("GPU"), std::string::npos)
+        << "error should mention GPU; got: " << what;
+    EXPECT_NE(what.find("Supertonic"), std::string::npos)
+        << "error should mention Supertonic engine; got: " << what;
   }
   EXPECT_TRUE(threw);
 }
 
+TEST(SupertonicValidate, NGpuLayersGreaterThanZeroRejected) {
+  auto cfg = minimallyValidStubConfig();
+  cfg.nGpuLayers = 99;
+  EXPECT_THROW(SupertonicModel{cfg}, StatusError);
+}
+
 TEST(SupertonicValidate, NGpuLayersZeroAcceptedAndDeferredLoad) {
   auto cfg = minimallyValidStubConfig();
   cfg.nGpuLayers = 0;
-  // Validation passes (CPU path); the stub file then fails GGUF
-  // parsing on load() (not at construction — load is deferred to
-  // waitForLoadInitialization). Locks the contract that construction
-  // succeeds for any internally-consistent CPU config.
+  // Validation passes (CPU-only path); the stub file then fails GGUF
+  // parsing on load() (not at construction — load is now deferred to
+  // waitForLoadInitialization).  The eventual throw must NOT be the
+  // GPU-rejection branch.
   std::unique_ptr<SupertonicModel> m;
   EXPECT_NO_THROW(m = std::make_unique<SupertonicModel>(cfg));
   ASSERT_NE(m, nullptr);
   EXPECT_FALSE(m->isLoaded());
-  EXPECT_THROW(m->load(), StatusError);
+  bool threw = false;
+  try {
+    m->load();
+  } catch (const StatusError& e) {
+    threw = true;
+    const std::string what = e.what();
+    EXPECT_EQ(what.find("GPU"), std::string::npos)
+        << "nGpuLayers=0 should not trigger the GPU-rejection path; got: " << what;
+  }
+  EXPECT_TRUE(threw);
   EXPECT_FALSE(m->isLoaded());
 }
 

@@ -31,8 +31,8 @@
  * `bash scripts/convert-models.sh -t supertonic-mtl`).  The
  * English-pinned single-sentence entry point lives in supertonic-tts.js.
  *
- * NOTE: Supertonic gained GPU support in tts-cpp@2026-06-05.  This
- * example keeps useGPU=false so it runs identically everywhere.
+ * NOTE: Supertonic is CPU-only in tts-cpp today.  This example sets
+ * useGPU=false explicitly to match.
  */
 
 const fs = require('bare-fs')

@@ -37,9 +37,8 @@
  * supertonic-mtl-sweep-tts.js; for the simpler English-pinned entry
  * point see supertonic-tts.js.
  *
- * NOTE: Supertonic gained GPU support in tts-cpp@2026-06-05.  This
- * example keeps useGPU=false so it runs identically everywhere; flip
- * to true on GPU-capable hosts to engage Metal / Vulkan / Adreno-OpenCL.
+ * NOTE: Supertonic is CPU-only in tts-cpp today.  This example sets
+ * useGPU=false explicitly to match.
  */
 
 const fs = require('bare-fs')

@@ -20,9 +20,9 @@
  * Expects the Supertonic GGUF at:
  *   models/supertonic.gguf
  *
- * NOTE: Supertonic gained GPU support in tts-cpp@2026-06-05; this
- * example keeps useGPU=false so it runs identically everywhere.  See
- * supertonic-tts.js for the GPU opt-in pattern.
+ * NOTE: Supertonic is CPU-only in tts-cpp today; this example sets
+ * useGPU=false explicitly.  See supertonic-tts.js for the full
+ * limitation context.
  */
 
 const fs = require('bare-fs')

@@ -29,11 +29,11 @@
  * ONNX bundle into a single .gguf via
  * scripts/convert-supertonic2-to-gguf.py --arch supertonic.
  *
- * NOTE: Supertonic gained GPU support in tts-cpp@2026-06-05 (QVAC-18605
- * Vulkan/Metal optimisations + QVAC-19254 Adreno OpenCL sched). Pass
- * useGPU=true on GPU-capable hosts to engage Metal / Vulkan / CUDA /
- * Adreno-OpenCL via the tts-cpp init_gpu_backend tier policy; this
- * example keeps useGPU=false so it runs identically everywhere.
+ * NOTE: Supertonic is CPU-only in tts-cpp today (engine docstring at
+ * include/tts-cpp/supertonic/engine.h: "CPU only today").  Passing
+ * useGPU=true throws at construction with a message pointing at the
+ * limitation; the example explicitly sets useGPU=false.  Chatterbox
+ * (turbo + MTL) still supports GPU as an opt-in (useGPU=true).
  */
 
 const fs = require('bare-fs')

@@ -49,7 +49,7 @@ declare interface TTSGgmlFiles {
 declare interface TTSGgmlRuntimeConfig {
   /** Language code; default "en". Chatterbox MTL accepts es/fr/de/pt/it/zh/ja/ko/... */
   language?: string
-  /** Route inference through a GPU backend (Metal / Vulkan / CUDA / OpenCL) if available, on either Chatterbox or Supertonic.  Defaults to `false` for both engines (opt-in via `useGPU: true` on GPU-capable hosts). */
+  /** Route inference through a GPU backend (Metal / Vulkan / CUDA / OpenCL) if available.  Defaults to `false` for both engines (opt-in via `useGPU: true` on GPU-capable hosts).  Supertonic still rejects `useGPU: true` at construction time (engine is CPU-only today). */
   useGPU?: boolean
   /** Resample the engine's native rate (24 kHz Chatterbox, 44.1 kHz Supertonic) to this rate before emitting (8000-192000 Hz). */
   outputSampleRate?: number
@@ -68,7 +68,7 @@ declare interface TTSGgmlOptions {
   voiceDir?: string
   /** RNG seed for CFM initial noise + SineGen excitation (Chatterbox) / vector-estimator latent (Supertonic). */
   seed?: number
-  /** Move N layers to the GPU backend.  Chatterbox + Supertonic: pass 99 to move everything. */
+  /** Move N layers to the GPU backend.  Chatterbox: pass 99 to move everything.  Supertonic: must be 0 / unset (engine is CPU-only today). */
   nGpuLayers?: number
   /** Override `std::thread::hardware_concurrency()`. */
   threads?: number

@@ -362,11 +362,22 @@ class TTSGgml {
           'agnostic runStream() / runStreaming() / run({ streamOutput: true }) APIs.'
         )
       }
-      // GPU is supported as of tts-cpp@2026-06-05 (QVAC-18605 Supertonic
-      // Vulkan/Metal optimisations + QVAC-19254 sched/cpu_backend for
-      // Adreno OpenCL). Default-off mirrors Chatterbox; callers opt in
-      // with config: { useGPU: true } on GPU-capable hosts.
-      if (this._config.useGPU === undefined && this._nGpuLayers == null) {
+      const wantsGpu =
+        this._config.useGPU === true ||
+        (this._nGpuLayers != null && this._nGpuLayers !== 0)
+      if (wantsGpu) {
+        throw new Error(
+          'tts-ggml: GPU execution is not supported by the Supertonic engine yet ' +
+          '(see tts-cpp include/tts-cpp/supertonic/engine.h: "CPU only today"). ' +
+          'GPU output is currently silently wrong (~4x quieter, slightly truncated) ' +
+          'because the Vulkan path of the supertonic vector-estimator + vocoder is ' +
+          'not yet validated.  Pass config: { useGPU: false } (and leave nGpuLayers ' +
+          'unset, or set it to 0) when constructing a Supertonic model. ' +
+          'Chatterbox also defaults to CPU now; opt in with ' +
+          'config: { useGPU: true } on GPU-capable hosts.'
+        )
+      }
+      if (this._config.useGPU === undefined) {
         this._config.useGPU = false
       }
     } else if (this._config.useGPU === undefined && this._nGpuLayers == null) {
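
The Supertonic gate in the hunk above treats either `useGPU === true` or a non-zero `nGpuLayers` as GPU intent. A standalone restatement of that predicate, as a sketch: the free function `wantsGpu` and the `cases` table are hypothetical and exist only to show which combinations the constructor would reject.

```javascript
// Standalone restatement of the gate shown in the diff (hypothetical name).
// Loose `!= null` treats both undefined and null as "unset".
function wantsGpu (useGPU, nGpuLayers) {
  return useGPU === true || (nGpuLayers != null && nGpuLayers !== 0)
}

// Each row records whether the Supertonic constructor would reject it.
const cases = [
  { useGPU: undefined, nGpuLayers: undefined }, // defaults: accepted
  { useGPU: false, nGpuLayers: 0 }, // explicit CPU: accepted
  { useGPU: true, nGpuLayers: undefined }, // rejected
  { useGPU: undefined, nGpuLayers: 99 } // rejected
].map(c => ({ ...c, rejected: wantsGpu(c.useGPU, c.nGpuLayers) }))
```

Note that `useGPU: undefined` with `nGpuLayers: 0` also passes the gate, after which the constructor defaults `useGPU` to `false`.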

@@ -1,6 +1,6 @@
 {
   "name": "@qvac/tts-ggml",
-  "version": "0.2.1",
+  "version": "0.2.2",
   "description": "Text to Speech (TTS) addon for qvac (ggml backend, wrapping the chatterbox + supertonic engines from tts-cpp)",
   "addon": true,
   "engines": {

@@ -164,43 +164,25 @@ test('Chatterbox GPU smoke - useGPU=true must engage the GPU backend on GPU-capa
   }
 })
 
-test('Supertonic GPU smoke - useGPU=true must engage the GPU backend on GPU-capable platforms', { timeout: 600000, skip: NO_GPU }, async (t) => {
-  // QVAC-19255: Supertonic gained Vulkan/Metal/Adreno-OpenCL support
-  // in tts-cpp@2026-06-05 (QVAC-18605 rounds 1-13 + QVAC-19254 sched).
-  // This test mirrors the Chatterbox GPU smoke above: useGPU=true on
-  // a GPU-capable platform must resolve to a real GPU backend, not
-  // silently fall back to CPU.
-  const baseDir = getBaseDir()
-  const modelsDir = path.join(baseDir, 'models')
-
-  const download = await ensureSupertonicModel({ targetDir: modelsDir })
-  if (!download || !download.success) {
-    t.fail('Supertonic GGUF not available - registry fetch failed. Run `npm run download-models:registry` or stage models locally.')
-    return
-  }
-
-  const supertonicPath = download.path ||
-    path.join(modelsDir, 'supertonic.gguf')
-
-  const model = await loadSupertonicTTS({
-    supertonicModelPath: supertonicPath,
-    language: 'en',
-    voice: 'F1',
-    useGPU: true
-  })
+test('Supertonic GPU smoke - useGPU=true is rejected at constructor (engine is CPU-only today)', { timeout: 60000 }, async (t) => {
+  const TTSGgml = require('@qvac/tts-ggml')
+  let threw = false
   try {
-    const result = await runSupertonicTTS(
-      model,
-      { text: 'GPU smoke check.' },
-      { minSamples: 5000 }
-    )
-    console.log(result.output)
-    t.ok(result.passed, 'Supertonic/GPU produced expected sample count')
-    t.ok(result.data.sampleCount > 0, 'Supertonic/GPU produced audio')
-    assertGpuBackend(t, 'Supertonic', result.data.stats)
-  } finally {
-    try { await model.unload() } catch (_e) {}
+    /* eslint no-new: 0 */
+    new TTSGgml({
+      engine: TTSGgml.ENGINE_SUPERTONIC,
+      files: { supertonicModel: '/dev/null' },
+      voice: 'F1',
+      config: { language: 'en', useGPU: true }
+    })
+  } catch (e) {
+    threw = true
+    t.ok(/CPU only today/.test(e.message),
+      'rejection message references the engine docstring')
+    t.ok(/Pass config:.*useGPU: false/.test(e.message),
+      'rejection message tells user how to fix')
   }
+  t.ok(threw, 'TTSGgml constructor should throw on Supertonic + useGPU:true')
 })
 
 // CPU smoke: useGPU:false must actually pin the engine to CPU on every