parakeet-cpp: Android dynamic backend loading + Adreno-tier GPU policy #23

Merged
GustavoA1604 merged 3 commits into master from android-gpu-dynamic-loading on May 18, 2026

@GustavoA1604 GustavoA1604 commented May 18, 2026

Summary

Brings parakeet-cpp's Android backend story up to parity with
qvac/packages/llm-llamacpp:

  • Vulkan and OpenCL ship as dlopen'd MODULE .so files (qvac-ext-ggml@speech's
    GGML_BACKEND_DL=ON), discovered at runtime via ggml_backend_load_all_from_path().
  • Zero static GPU backend init calls anywhere in libparakeet. Verified on host:
    nm libparakeet.dylib | grep -E 'ggml_backend_(vulkan|opencl|metal|cuda|blas)_init'
    returns empty.
  • Backend selection mirrors llm-llamacpp's
    BackendSelection.cpp
    tier policy: Adreno 700+ → OpenCL, every other GPU → Vulkan (or Metal / CUDA
    on the matching platform).
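
The tier policy in the last bullet can be sketched as below. This is a minimal illustration, not the code in init_gpu_backend: the GpuDevice struct, the pick_gpu name, the description strings, and the fallback order (Adreno 700+ OpenCL, then any non-OpenCL GPU, then other OpenCL) are assumptions, and the Adreno 6xx skip with its PARAKEET_ALLOW_ADRENO_6XX override is left out.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one GPU/IGPU entry from the backend registry.
struct GpuDevice {
    std::string backend;      // assumed names, e.g. "OpenCL", "Vulkan", "Metal"
    std::string description;  // assumed format, e.g. "Adreno (TM) 740"
};

// Returns the Adreno model number, 800 for the "Adreno X<n>" naming,
// or -1 when the description does not mention Adreno at all.
int parse_adreno_version(const std::string& desc) {
    const std::size_t pos = desc.find("Adreno");
    if (pos == std::string::npos) return -1;
    for (std::size_t i = pos + 6; i < desc.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(desc[i]);
        if (std::isdigit(c)) return std::stoi(desc.substr(i));
        if (c == 'X') return 800;  // synthetic version for Snapdragon X Elite
    }
    return -1;
}

// Buckets the devices and returns the index of the preferred one, or -1.
int pick_gpu(const std::vector<GpuDevice>& devs) {
    int opencl_adreno_700plus = -1, other_gpu = -1, opencl_other = -1;
    for (int i = 0; i < static_cast<int>(devs.size()); ++i) {
        const bool is_opencl = devs[i].backend == "OpenCL";
        const int ver = parse_adreno_version(devs[i].description);
        // Remember only the first device seen in each bucket.
        if (is_opencl && ver >= 700) {
            if (opencl_adreno_700plus < 0) opencl_adreno_700plus = i;
        } else if (is_opencl) {
            if (opencl_other < 0) opencl_other = i;
        } else {
            if (other_gpu < 0) other_gpu = i;
        }
    }
    if (opencl_adreno_700plus >= 0) return opencl_adreno_700plus;
    if (other_gpu >= 0) return other_gpu;
    return opencl_other;  // assumed last resort; -1 when no GPU was found
}
```

With both a Vulkan and an OpenCL entry for the same Adreno 7xx GPU, this sketch returns the OpenCL entry; for any other GPU it returns the first non-OpenCL entry.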

What's changed

src/parakeet_ctc.cpp (init_gpu_backend)

  • Rewrote the registry walk to bucket GPU/IGPU devices into
    {opencl_adreno_700plus, other_gpu, opencl_other} and pick per the tier policy,
    replacing the previous "first GPU/IGPU in registry order, skip Adreno 6xx" logic.
  • parse_adreno_version() handles the standard "Adreno 7xx/8xx" naming AND the
    Snapdragon X Elite "Adreno X<n>" naming (mapped to synthetic 800 so it takes the
    OpenCL branch). Existing PARAKEET_ALLOW_ADRENO_6XX env override preserved.
  • New public entry points set_backends_directory(dir) / set_opencl_cache_dir(dir)
    (declared in parakeet_ctc.h) so embedded host apps can point the ggml-backend
    registry at a custom per-module folder before the first Engine construction.
    Both honour a "first Engine wins" contract: the gate is a new g_backends_loaded
    atomic, flipped under the shared mutex before the load-all call inside
    ensure_backends_loaded releases it. A setter racing the first Engine
    construction therefore either lands its value (and has it picked up by the
    in-flight load) or atomically observes the flag and falls into the
    warn-once branch.
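
The "first Engine wins" contract can be illustrated with the sketch below. It reuses the names from the bullet above but is not the code in parakeet_ctc.cpp: the sketch namespace, the g_loaded_from and g_late_setter_warnings variables, the bool return values, and holding the mutex across the load are all assumptions made for illustration.

```cpp
#include <atomic>
#include <mutex>
#include <string>

namespace sketch {

std::mutex        g_mutex;                     // shared by setter and loader
std::string       g_backends_dir;              // value the first load consumes
std::atomic<bool> g_backends_loaded{false};    // flipped once, under g_mutex
std::string       g_loaded_from;               // hypothetical: records what the load scanned
int               g_late_setter_warnings = 0;  // hypothetical stand-in for the warn-once log

// Returns true if the value landed, false if the registry was already loaded.
bool set_backends_directory(const std::string& dir) {
    std::lock_guard<std::mutex> lock(g_mutex);
    if (g_backends_loaded.load()) {
        if (g_late_setter_warnings == 0) ++g_late_setter_warnings;  // warn once
        return false;
    }
    g_backends_dir = dir;
    return true;
}

// Returns true only for the call that actually performs the load.
bool ensure_backends_loaded() {
    std::lock_guard<std::mutex> lock(g_mutex);
    if (g_backends_loaded.load()) return false;
    g_backends_loaded.store(true);   // flip before the load and before unlocking
    g_loaded_from = g_backends_dir;  // real code would call the ggml load-all here
    return true;
}

}  // namespace sketch
```

Because the flag is only read and written while the mutex is held, a late setter cannot write a directory that no load will ever scan, which is the silent failure the old `!g_recorded_backends_dir.empty()` gate allowed.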

include/parakeet/engine.h (EngineOptions)

  • backends_dir — forwarded to ggml_backend_load_all_from_path() on first
    Engine construction. Empty → ggml's compile-time default search path.
  • opencl_cache_dir — Android-only, sets $GGML_OPENCL_CACHE_DIR for
    ggml-opencl's program-binary cache (qvac-ext-ggml@speech program-binary cache
    patch). Strongly recommended in production on Android to skip the cold
    clBuildProgram cost.

src/parakeet_engine.cpp

  • Engine ctor calls set_backends_directory / set_opencl_cache_dir before
    load_from_gguf when the respective EngineOptions fields are non-empty.
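
The "only when non-empty" forwarding described above might look roughly like this. The EngineOptionsSketch and AppliedSettings types and the apply_backend_options function are hypothetical stand-ins, not the real EngineOptions or Engine constructor; the only detail taken from the description is that opencl_cache_dir ends up in $GGML_OPENCL_CACHE_DIR.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical mirror of the two new EngineOptions fields.
struct EngineOptionsSketch {
    std::string backends_dir;      // empty means ggml's default search path
    std::string opencl_cache_dir;  // empty means leave the environment untouched
};

// Hypothetical result type so the effect is observable in a test.
struct AppliedSettings {
    bool backends_dir_set = false;
    bool opencl_cache_dir_set = false;
};

AppliedSettings apply_backend_options(const EngineOptionsSketch& opts) {
    AppliedSettings applied;
    if (!opts.backends_dir.empty()) {
        // Real code would call set_backends_directory(opts.backends_dir) here.
        applied.backends_dir_set = true;
    }
    if (!opts.opencl_cache_dir.empty()) {
        // setenv is POSIX (available on Android and Linux); 1 means overwrite.
        applied.opencl_cache_dir_set =
            ::setenv("GGML_OPENCL_CACHE_DIR", opts.opencl_cache_dir.c_str(), 1) == 0;
    }
    return applied;
}
```

Skipping empty fields keeps the defaults intact: an app that sets neither option gets ggml's compile-time search path and an unchanged environment.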

src/main.cpp

  • New --backends-dir DIR CLI flag with the same lifetime contract as
    --opencl-cache-dir (applied before any backend init).

CMakeLists.txt

  • On Android, defaults GGML_BACKEND_DL=ON + GGML_CPU_ALL_VARIANTS=ON +
    GGML_CPU_REPACK=ON + GGML_VULKAN=ON + GGML_OPENCL=ON +
    GGML_VULKAN_DISABLE_COOPMAT{,2}=ON, matching the qvac llm-llamacpp Android
    port (qvac-registry-vcpkg/ports/llama-cpp/portfile.cmake).
    Override at the cmake command line as usual.
  • Fixed the PARAKEET_GGML_LIB_PREFIX block: it now sets
    GGML_LIB_OUTPUT_PREFIX="speech-" as a cache variable before
    add_subdirectory(ggml), and the post-hoc rename loop is removed. The previous
    version would double-rename when consumed by qvac-ext-ggml@speech's default
    prefix qvac-speech-, producing libspeech-qvac-speech-ggml-vulkan.so style
    filenames that nothing on the runtime side discovered.
  • Dropped the dead GGML_USE_VULKAN / GGML_USE_OPENCL / GGML_USE_METAL /
    GGML_USE_CUDA / GGML_USE_BLAS defines from parakeet-backend-defs and the
    parakeet_apply_backend_defs() helper. No source in parakeet-cpp uses
    #ifdef GGML_USE_* anymore (everything goes through the registry); shipping
    these defines would falsely advertise a static backend dependency that the
    GGML_BACKEND_DL=ON Android/Linux builds explicitly do not have.

scripts/setup-ggml.sh

  • Bumped to point at qvac-ext-ggml@speech (which carries the speech-stack patch
    series + the qvac-speech- lib filename prefix this PR's prefix change relies
    on).

Companion change

This PR depends on a small ggml-backend loader patch landing on
qvac-ext-ggml@speech: an Android __ANDROID__ block in
ggml_backend_load_best that enumerates per-arch CPU variant names
(cpu-android_armv{8.0,8.2,8.6,9.0,9.2}_*) as candidates for the bare-name
dlopen fallback. Without it, GGML_CPU_ALL_VARIANTS=ON builds on Android
fail to register the CPU backend at runtime (under useLegacyPackaging=false
the .so files stay inside the APK and are never extracted, so
fs::directory_iterator has nothing to scan, and the existing fallback only
composed the base name libqvac-speech-ggml-cpu.so, which doesn't exist with
CPU_ALL_VARIANTS).
Mirrors the equivalent fallback already present downstream on
qvac-fabric-llm.cpp's ggml fork.

tetherto/qvac-ext-ggml#11
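
As a rough sketch of the bare-name fallback described above: compose one library filename per CPU variant, then keep the candidate with the highest score. The function names, the score callback (standing in for dlopen plus the backend's score query), and the convention that a score of 0 means "unsupported on this CPU" are assumptions; the variant list is passed in by the caller rather than hard-coded.

```cpp
#include <functional>
#include <string>
#include <vector>

// Builds "lib<prefix>ggml-cpu-<variant>.so" for each variant name.
std::vector<std::string> cpu_variant_candidates(
        const std::string& lib_prefix,
        const std::vector<std::string>& variants) {
    std::vector<std::string> names;
    for (const std::string& v : variants) {
        names.push_back("lib" + lib_prefix + "ggml-cpu-" + v + ".so");
    }
    return names;
}

// Returns the best-scoring candidate, or an empty string if none is usable.
// `score` is a hypothetical hook; a real loader would dlopen the bare name
// and ask the module how well it matches the running CPU.
std::string pick_best(const std::vector<std::string>& candidates,
                      const std::function<int(const std::string&)>& score) {
    std::string best;
    int best_score = 0;  // assumed: 0 means "cannot run on this CPU"
    for (const std::string& name : candidates) {
        const int s = score(name);
        if (s > best_score) {
            best_score = s;
            best = name;
        }
    }
    return best;
}
```

Calling cpu_variant_candidates("qvac-speech-", {"android_armv8.2_2"}) yields libqvac-speech-ggml-cpu-android_armv8.2_2.so, the file the on-device run reports picking.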

Testing

  • Host build: cmake -S . -B build -DPARAKEET_BUILD_EXECUTABLES=ON && cmake --build build
    configures cleanly and produces correctly prefixed
    libspeech-ggml-{base,vulkan,opencl,cpu,blas,metal}.dylib files alongside
    libparakeet.dylib. No static GPU backend symbols leaked into libparakeet
    (verified with nm).
  • On-device Android via tetherto/qvac-test-addon-mobile against the
    consumer integration suite, on Samsung S23 FE (Cortex-A78, ARMv8.2 + dotprod +
    i8mm):
    • Engine constructs successfully against a q4_0 GGUF (TDT, EOU,
      Sortformer).
    • The Android per-arch CPU fallback picks libqvac-speech-ggml-cpu-android_armv8.2_2.so
      via the score function — no rc=10 from init_cpu_backend anymore.
    • Tier policy correctly selects Vulkan when useGPU=true and the device
      isn't an Adreno 7xx+.
    • Integration tests (runAccuracyMultilangTest, runMultipleTranscriptionsTest,
      runColdStartTimingTest, runDuplexStreamingTest, runEouStreamingTest,
      runMobilePerf*Test, etc.) now actually exercise the engine instead of
      failing fast in loadGgufOrSkip.

GustavoA1604 and others added 3 commits May 18, 2026 13:48
Brings the parakeet-cpp Android backend story up to parity with
qvac/packages/llm-llamacpp:

  * Vulkan and OpenCL ship as separately-loaded MODULE .so files
    (qvac-ext-ggml@speech's GGML_BACKEND_DL=ON), discovered at
    runtime via `ggml_backend_load_all_from_path()`.
  * No static GPU backend init calls anywhere in libparakeet --
    `nm libparakeet.dylib | grep -E 'ggml_backend_(vulkan|opencl|metal|cuda|blas)_init'`
    returns empty (verified on host).
  * Backend selection mirrors llm-llamacpp's BackendSelection.cpp
    tier policy: Adreno 700+ -> OpenCL, every other GPU -> Vulkan
    (or Metal / CUDA on the matching platform).

Changes:

`src/parakeet_ctc.cpp` (`init_gpu_backend`)
  - Rewrote the registry walk to bucket GPU/IGPU devices into
    {opencl_adreno_700plus, other_gpu, opencl_other} and pick the
    bucket per the tier policy, instead of the previous "first
    GPU/IGPU in registry order, skip Adreno 6xx" logic.
  - `parse_adreno_version()` handles the standard "Adreno 7xx/8xx"
    naming AND the Snapdragon X Elite "Adreno X<n>" naming (mapped
    to synthetic 800 so it takes the OpenCL branch). Existing
    PARAKEET_ALLOW_ADRENO_6XX env override preserved.
  - Added `set_backends_directory(dir)` / `set_opencl_cache_dir(dir)`
    public entry points (also declared in `parakeet_ctc.h`) so
    embedded host apps can point the ggml-backend registry at a
    custom per-module folder before the first Engine construction.
    Both honour a "first Engine wins" contract: the gate is a new
    `g_backends_loaded` atomic flipped under the shared mutex
    *before* the load-all call inside `ensure_backends_loaded`
    releases it, so a setter racing a first-Engine construction
    either lands its value (and has it picked up by the in-flight
    load) or atomically observes the flag and falls into the
    warn-once branch. Previously the gate was
    `!g_recorded_backends_dir.empty()`, which conflated "registry
    loaded" with "registry loaded from a non-empty dir" -- a
    second-Engine setter after a first Engine that used the default
    search path would silently write to `g_backends_dir` without
    re-scanning, with zero diagnostic. Symmetric behaviour applied
    to set_opencl_cache_dir.

`include/parakeet/engine.h` (`EngineOptions`)
  - `backends_dir`: forwarded to `ggml_backend_load_all_from_path()`
    on first Engine construction. Empty -> ggml's compile-time
    default search path.
  - `opencl_cache_dir`: Android-only, sets $GGML_OPENCL_CACHE_DIR
    for ggml-opencl's program-binary cache (the qvac-ext-ggml@speech
    program-binary cache patch). Strongly recommended in production
    on Android to skip the cold clBuildProgram cost.

`src/parakeet_engine.cpp` (Engine ctor)
  - Calls `set_backends_directory` / `set_opencl_cache_dir` before
    `load_from_gguf` when the respective EngineOptions fields are
    non-empty.

`src/main.cpp` (CLI)
  - New `--backends-dir DIR` flag with the same lifetime contract as
    `--opencl-cache-dir` (applied before any backend init).

`CMakeLists.txt`
  - On Android, default GGML_BACKEND_DL=ON + GGML_CPU_ALL_VARIANTS=ON
    + GGML_CPU_REPACK=ON + GGML_VULKAN=ON + GGML_OPENCL=ON +
    GGML_VULKAN_DISABLE_COOPMAT{,2}=ON, matching the qvac llm-llamacpp
    Android port (qvac-registry-vcpkg/ports/llama-cpp/portfile.cmake).
    Override at the cmake command line as usual.
  - Fixed the PARAKEET_GGML_LIB_PREFIX block: it now sets
    GGML_LIB_OUTPUT_PREFIX="speech-" as a cache variable BEFORE
    add_subdirectory(ggml), and the post-hoc rename loop is removed.
    The previous version would double-rename when consumed by the
    qvac-ext-ggml@speech default prefix `qvac-speech-`, producing
    `libspeech-qvac-speech-ggml-vulkan.so` style filenames that
    nothing on the runtime side discovered.
  - Dropped the dead GGML_USE_VULKAN / GGML_USE_OPENCL / GGML_USE_METAL
    / GGML_USE_CUDA / GGML_USE_BLAS defines from `parakeet-backend-defs`
    and the `parakeet_apply_backend_defs()` helper. No source in
    parakeet-cpp uses `#ifdef GGML_USE_*` anymore (everything goes
    through the registry); shipping these defines would falsely
    advertise a static backend dependency that the GGML_BACKEND_DL=ON
    Android/Linux builds explicitly do not have.

Verified by:
  * Host build: `cmake -S . -B build -DPARAKEET_BUILD_EXECUTABLES=ON
    && cmake --build build` produces correctly-prefixed
    `libspeech-ggml-{base,vulkan,opencl,cpu,blas,metal}.dylib` files
    alongside libparakeet.dylib.
  * On-device Android (qvac-test-addon-mobile, Samsung S23 FE):
    Engine constructs successfully against a q4_0 GGUF, the tier
    policy selects the right backend (Vulkan when GPU is requested,
    CPU armv8.2_2 variant via the new ggml-backend Android per-arch
    fallback), and the addon's integration test suite runs without
    `rc=10` from init_cpu_backend.

Co-authored-by: Cursor <cursoragent@cursor.com>
@GustavoA1604 GustavoA1604 requested review from a team as code owners May 18, 2026 18:59
@GustavoA1604 GustavoA1604 changed the title from "Android gpu dynamic loading" to "parakeet-cpp: Android dynamic backend loading + Adreno-tier GPU policy" May 18, 2026
@GustavoA1604 GustavoA1604 merged commit 0f2b178 into master May 18, 2026
66 of 73 checks passed
GustavoA1604 added a commit to GustavoA1604/qvac-registry-vcpkg that referenced this pull request May 19, 2026
Repoints the port at the latest tetherto/qvac-ext-lib-whisper.cpp@master
tip (08df2e70b8b71f8225af6ae35d3576eccea5ae7f), which folds in two
PRs:

  * tetherto/qvac-ext-lib-whisper.cpp#23 -- parakeet-cpp: android
    dynamic backend loading + Adreno-tier GPU policy. The parakeet-cpp
    subtree now defaults Android builds to GGML_BACKEND_DL=ON +
    GGML_CPU_ALL_VARIANTS=ON + GGML_CPU_REPACK=ON + GGML_VULKAN=ON +
    GGML_OPENCL=ON, matching the qvac llm-llamacpp Android port. Vulkan
    and OpenCL ship as separately-loadable MODULE .so files; per-arch
    CPU variants ship as `libqvac-speech-ggml-cpu-android_armv*_*.so`.
    Backend selection is centralised in `init_gpu_backend()`: Adreno
    700+ -> OpenCL, every other GPU -> Vulkan (or Metal / CUDA on
    matching platforms). No static GPU backend entry points are linked
    anywhere in libparakeet; the ggml-backend registry walk handles
    every case in both GGML_BACKEND_DL=ON and GGML_BACKEND_DL=OFF
    modes. Also adds public `set_backends_directory()` /
    `set_opencl_cache_dir()` entry points plus the matching
    `EngineOptions::backends_dir` / `opencl_cache_dir` fields and the
    `--backends-dir` CLI flag so embedded host apps can pin the
    backends scan directory and the ggml-opencl program-binary cache
    per-process.

  * tetherto/qvac-ext-lib-whisper.cpp#24 -- parakeet-cpp: address PR
    #22 AOSC v2.1 review comments (Sortformer streaming fixes that
    landed shortly after PR #23 merged; safe to fold in).

Date-stamped rather than port-versioned because the upstream commits
land Android-specific backend-loading machinery that previous pv1
builds genuinely lacked (not just a bugfix on the same source set).
Consumers pinning to `2026-05-05#1` keep the StreamingSegment
.starts_word baseline; consumers tracking the date-stamped baseline
move forward to the dynamic-backend Android shape.

Dependency floor on ggml-speech tightened from `2026-04-09#1` to
`2026-04-09#2` -- the new Android CPU_ALL_VARIANTS path requires the
per-arch CPU variant dlopen fallback that landed in ggml-speech pv2
(previous commit). Without that floor a downstream registry override
could silently pull pv1 and fail to register any CPU backend at
runtime under AGP's `useLegacyPackaging=false` (the universal Android
default since 3.6).

No behaviour change on macOS / iOS (Metal still statically linked
into libggml-*) or desktop Linux / Windows (Vulkan / CUDA likewise
static). The Android-defaults block in parakeet-cpp's CMakeLists.txt
is gated on `CMAKE_SYSTEM_NAME STREQUAL "Android"` and only flips
the dynamic-loading switches there. Verified by host build:
`nm libparakeet.dylib | grep -E 'ggml_backend_(vulkan|opencl|metal|cuda|blas)_init'`
returns empty.

git-tree for ports/parakeet-cpp: 4f9b873.

Co-authored-by: Cursor <cursoragent@cursor.com>
GustavoA1604 added a commit to GustavoA1604/qvac-registry-vcpkg that referenced this pull request May 19, 2026
Repoints the port at the latest tetherto/qvac-ext-lib-whisper.cpp@master
tip (ef0f2ae637dc3be8bcd52b17374f9bb804beb06b), which folds in three
PRs:

  * tetherto/qvac-ext-lib-whisper.cpp#23 -- parakeet-cpp: android
    dynamic backend loading + Adreno-tier GPU policy. The parakeet-cpp
    subtree now defaults Android builds to GGML_BACKEND_DL=ON +
    GGML_CPU_ALL_VARIANTS=ON + GGML_CPU_REPACK=ON + GGML_VULKAN=ON +
    GGML_OPENCL=ON, matching the qvac llm-llamacpp Android port. Vulkan
    and OpenCL ship as separately-loadable MODULE .so files; per-arch
    CPU variants ship as `libqvac-speech-ggml-cpu-android_armv*_*.so`.
    Backend selection is centralised in `init_gpu_backend()`: Adreno
    700+ -> OpenCL, every other GPU -> Vulkan (or Metal / CUDA on
    matching platforms). No static GPU backend entry points are linked
    anywhere in libparakeet; the ggml-backend registry walk handles
    every case in both GGML_BACKEND_DL=ON and GGML_BACKEND_DL=OFF
    modes. Also adds public `set_backends_directory()` /
    `set_opencl_cache_dir()` entry points plus the matching
    `EngineOptions::backends_dir` / `opencl_cache_dir` fields and the
    `--backends-dir` CLI flag so embedded host apps can pin the
    backends scan directory and the ggml-opencl program-binary cache
    per-process.

  * tetherto/qvac-ext-lib-whisper.cpp#24 -- parakeet-cpp: address PR
    #22 AOSC v2.1 review comments (Sortformer streaming fixes that
    landed shortly after PR #23 merged; safe to fold in).

  * tetherto/qvac-ext-lib-whisper.cpp#25 -- Fix missing include for
    windows (compile-only follow-up to PR #23; needed for the Windows
    desktop dev path that exercises the new init_gpu_backend tier
    policy).

Date-stamped rather than port-versioned because the upstream commits
land Android-specific backend-loading machinery that previous pv1
builds genuinely lacked (not just a bugfix on the same source set).
Consumers pinning to `2026-05-05#1` keep the StreamingSegment
.starts_word baseline; consumers tracking the date-stamped baseline
move forward to the dynamic-backend Android shape.

Dependency floor on ggml-speech tightened from `2026-04-09#1` to
`2026-04-09#2` -- the new Android CPU_ALL_VARIANTS path requires the
per-arch CPU variant dlopen fallback that landed in ggml-speech pv2
(previous commit). Without that floor a downstream registry override
could silently pull pv1 and fail to register any CPU backend at
runtime under AGP's `useLegacyPackaging=false` (the universal Android
default since 3.6).

No behaviour change on macOS / iOS (Metal still statically linked
into libggml-*) or desktop Linux / Windows (Vulkan / CUDA likewise
static). The Android-defaults block in parakeet-cpp's CMakeLists.txt
is gated on `CMAKE_SYSTEM_NAME STREQUAL "Android"` and only flips
the dynamic-loading switches there. Verified by host build:
`nm libparakeet.dylib | grep -E 'ggml_backend_(vulkan|opencl|metal|cuda|blas)_init'`
returns empty.

git-tree for ports/parakeet-cpp: 2961794.

Co-authored-by: Cursor <cursoragent@cursor.com>
gianni-cor pushed a commit that referenced this pull request May 28, 2026
parakeet-cpp: Android dynamic backend loading + Adreno-tier GPU policy
@gianni-cor gianni-cor deleted the android-gpu-dynamic-loading branch May 28, 2026 13:44