feat: Android per-arch CPU variants for ggml-speech + parakeet-cpp dependency pin#157

Merged: GustavoA1604 merged 2 commits into main from update-ggml-speech on May 20, 2026

@GustavoA1604 GustavoA1604 commented May 20, 2026

Contributor

What problem does this PR solve?

  • On Android, ggml-speech was built with GGML_CPU_STATIC=ON, folding a single CPU backend archive into the consumer .bare module. That prevented per-architecture CPU variant selection (ARMv8.0 / 8.2 / 8.6 / 9.0 / 9.2, with repack variants) and did not match the dynamic-backend packaging model already used for Vulkan and OpenCL.
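  • With AGP's default useLegacyPackaging=false, native .so files are not extracted to the filesystem at install time; they stay packaged inside the APK (stored uncompressed and page-aligned) and are loaded from there. Directory-based backend discovery therefore finds nothing and fails silently, and the bare-name dlopen fallback only resolves lib<prefix>ggml-cpu.so, which does not exist once CPU is shipped as per-arch MODULE libraries.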
  • parakeet-cpp consumers had no explicit manifest constraint requiring the new ggml-speech build, making accidental downgrades to the pre-flip port-version possible.

How does it solve it?

ggml-speech → port-version 3

  • Bump upstream REF to 9562ed04 on tetherto/qvac-ext-ggml@speech, which includes the ggml-backend: android per-arch CPU variant dlopen fallback patch. When directory iteration finds nothing because the native libraries were never extracted from the APK, the loader iterates known variant names instead of composing a single bare libqvac-speech-ggml-cpu.so.
  • Flip Android CPU build mode from GGML_CPU_STATIC=ON to:
    • GGML_CPU_ALL_VARIANTS=ON — emit one MODULE .so per ARM feature tier (v8.0, 8.2, 8.6, 9.0, 9.2)
    • GGML_CPU_REPACK=ON — include int8mm / SME repack variants
  • Runtime selection unchanged: ggml_backend_load_best("cpu") scores each variant via ggml_backend_cpu_aarch64_score against the device's HWCAPs (e.g. ARMv9.2+SME on Pixel 9, ARMv8.0 on older devices).
  • Non-Android targets unchanged — the new CMake flags are gated behind if(VCPKG_TARGET_IS_ANDROID).
  • Hybrid GPU mode preserved — Vulkan and OpenCL remain MODULE .so files loaded via GGML_BACKEND_DL=ON; the existing portfile file(GLOB ...) install step picks them up from the buildtree bin/ directory.
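
For reviewers unfamiliar with the selection step, the scoring idea can be sketched roughly as below. This is a simplified illustration, not ggml's implementation: `Feature`, `Variant`, `score`, and `pick_best` are hypothetical names, and the feature bits stand in for the HWCAP checks that `ggml_backend_cpu_aarch64_score` performs. The assumed convention is that a variant scores 0 when the device lacks a feature it requires, and scores higher the more features it uses.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical feature bits; a real loader reads HWCAPs from the OS.
enum Feature : uint32_t {
    FEAT_DOTPROD = 1u << 0,
    FEAT_FP16    = 1u << 1,
    FEAT_I8MM    = 1u << 2,
    FEAT_SVE     = 1u << 3,
    FEAT_SME     = 1u << 4,
};

struct Variant {
    std::string name;      // illustrative label, not a real library name
    uint32_t    required;  // features this variant was compiled to use
};

// 0 means "cannot run on this device"; otherwise 1 + number of required features.
int score(const Variant & v, uint32_t device_features) {
    if ((v.required & ~device_features) != 0) {
        return 0;  // at least one required feature is missing
    }
    int s = 1;
    for (uint32_t bits = v.required; bits != 0; bits &= bits - 1) {
        ++s;  // one point per required feature bit
    }
    return s;
}

// Return the highest-scoring usable variant, or nullptr if none can run.
const Variant * pick_best(const std::vector<Variant> & variants, uint32_t device_features) {
    const Variant * best = nullptr;
    int best_score = 0;
    for (const Variant & v : variants) {
        const int s = score(v, device_features);
        if (s > best_score) {
            best_score = s;
            best = &v;
        }
    }
    return best;
}
```

Under these assumptions a baseline variant with no required features always scores 1, so a device with no optional features still gets a usable backend, while a device exposing every feature picks the variant that requires the most.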

parakeet-cpp → 2026-05-20#2

  • Tighten dependency from ggml-speech >= 2026-04-09#1 to >= 2026-04-09#3, making the per-arch CPU variant requirement visible in manifest-mode resolutions and preventing silent downgrades to pv2.
  • Includes upstream bump to 2026-05-20 (AOSC v2.1, from #156) with REF ef0f2ae.
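
To make the effect of the tightened minimum concrete, here is a small sketch of how a `date#port-version` string can be compared against a `>=` constraint. It is an illustration under stated assumptions, not vcpkg's resolver: `PortVersion`, `parse_port_version`, and `satisfies_minimum` are hypothetical names, the date is assumed to be a plain `YYYY-MM-DD` string (which orders correctly as text), and the port-version is assumed to be 0 when no `#N` suffix is present.

```cpp
#include <cstddef>
#include <string>
#include <tuple>

struct PortVersion {
    std::string date;          // ISO date; lexicographic order equals date order
    int         port_version;  // number after '#', 0 when absent
};

// Split "YYYY-MM-DD#N" into its date and port-version parts.
PortVersion parse_port_version(const std::string & text) {
    const std::size_t hash = text.find('#');
    if (hash == std::string::npos) {
        return {text, 0};
    }
    return {text.substr(0, hash), std::stoi(text.substr(hash + 1))};
}

// True when `candidate` is at least `minimum`: compare dates first,
// then port-versions as the tie-breaker.
bool satisfies_minimum(const std::string & candidate, const std::string & minimum) {
    const PortVersion c = parse_port_version(candidate);
    const PortVersion m = parse_port_version(minimum);
    return std::tie(c.date, c.port_version) >= std::tie(m.date, m.port_version);
}
```

With a minimum of `2026-04-09#3`, a candidate of `2026-04-09#2` is rejected while `2026-04-09#3` is accepted, which is the downgrade protection the pin is meant to provide.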

GustavoA1604 and others added 2 commits May 20, 2026 10:07
Flip the Android backend mode from `GGML_CPU_STATIC=ON` to
`GGML_CPU_ALL_VARIANTS=ON` + `GGML_CPU_REPACK=ON`, switching the CPU
backend from a single statically-linked archive folded into the
consumer .bare module to a fan of per-arch MODULE .so files (one each
for ARMv8.0, 8.2, 8.6, 9.0, 9.2; with and without the int8mm/sme/...
repack variants).

Pairs with the speech-branch dlopen-fallback patch already shipped in
this port's REF (commit 9562ed04 -- "ggml-backend: android per-arch
CPU variant dlopen fallback"), which iterates the known variant names
when native libs stay inside the consumer APK instead of being extracted
(AGP `useLegacyPackaging=false`, the default since AGP 3.6). Without
the dlopen fallback the variant lookup would silently fail because
`fs::directory_iterator` finds nothing inside the APK and the
existing bare-name `dlopen` fallback only composed
`lib<prefix>ggml-cpu.so` -- which doesn't exist in this build mode.
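
The fallback described above can be sketched as follows. This is a minimal illustration, not the patch itself: `Opener`, `candidate_names`, and `load_known_variants` are hypothetical names, the variant suffixes are supplied by the caller rather than taken from ggml, and the opener is injected so the sketch can be exercised without real libraries (a real loader would wrap `dlopen` there).

```cpp
#include <functional>
#include <string>
#include <vector>

// Returns true if the named library could be loaded.
// A real loader would call dlopen(name.c_str(), RTLD_NOW | RTLD_LOCAL) here.
using Opener = std::function<bool(const std::string &)>;

// Compose "lib<prefix>ggml-cpu-<variant>.so" for each known variant suffix.
std::vector<std::string> candidate_names(const std::string & prefix,
                                         const std::vector<std::string> & variants) {
    std::vector<std::string> names;
    names.reserve(variants.size());
    for (const std::string & v : variants) {
        names.push_back("lib" + prefix + "ggml-cpu-" + v + ".so");
    }
    return names;
}

// Fallback for when directory iteration yields no backend files:
// try every known variant by bare name and collect the ones that load.
std::vector<std::string> load_known_variants(const std::string & prefix,
                                             const std::vector<std::string> & variants,
                                             const Opener & try_open) {
    std::vector<std::string> loaded;
    for (const std::string & name : candidate_names(prefix, variants)) {
        if (try_open(name)) {
            loaded.push_back(name);
        }
    }
    return loaded;
}
```

Every variant that loads would then be handed to the scoring step, so the fallback only changes how candidates are discovered, not which one wins.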

Selection logic at runtime is unchanged: `ggml_backend_load_best("cpu")`
runs each variant through `ggml_backend_cpu_aarch64_score` so the
device's HWCAP picks the highest-tier variant it supports
(ARMv9.2+SME on a Pixel 9, ARMv8.0 on an older OnePlus 9, etc.). Matches
the equivalent shape on qvac-fabric-llm's ggml fork; the
`transcription-parakeet` addon's cmake install rules already pick up
all libqvac-speech-ggml-cpu-android_armv* .so files from the
buildtree's bin/ directory (no addon-side change needed).

Non-Android targets unchanged (`if(VCPKG_TARGET_IS_ANDROID)`).
REF stays at 9562ed04; this is portfile-only.

Co-authored-by: Cursor <cursoragent@cursor.com>
Tightens the `ggml-speech` constraint from `>= 2026-04-09#2` to
`>= 2026-04-09#3` so consumers pin against the ggml-speech build that
ships the Android per-arch CPU MODULE `.so` variants
(`GGML_CPU_ALL_VARIANTS=ON` + `GGML_CPU_REPACK=ON`). Without the
explicit pin the `>=` constraint already floats to pv3 via the
baseline, but the explicit pin makes the requirement visible in
manifest-mode resolutions and protects against accidental downgrades
to the pre-flip pv2 (which only emits a single statically-linked
ggml-cpu archive and would silently leave the per-arch variant
selection unused).

REF and source SHA512 stay at the upstream `2026-05-20#1` parakeet-cpp
tarball; this is portfile-only.

Co-authored-by: Cursor <cursoragent@cursor.com>
@GustavoA1604 GustavoA1604 changed the title from "Update ggml speech" to "feat: Android per-arch CPU variants for ggml-speech + parakeet-cpp dependency pin" May 20, 2026
@GustavoA1604 GustavoA1604 merged commit 183c726 into main May 20, 2026
2 checks passed
2 participants