llama.cpp GPU Support on Android Devices #16606
Siddhesh2377
started this conversation in
Show and tell
GPU Acceleration for Android llama.cpp via OpenCL - Working Implementation
I've successfully implemented GPU acceleration for llama.cpp on Android using OpenCL, specifically optimized for Qualcomm Adreno GPUs. This implementation uses the existing llama.cpp repository without any modifications or custom patches.
Repository: https://github.com/Siddhesh2377/Ai-Core
Performance Results:
The attached screenshots show a significant speedup in token generation with GPU offloading enabled via OpenCL on Snapdragon hardware.
Implementation Details:
Architecture:
The implementation is packaged as a single .aar library.
Build Configuration:
Key CMake flags for OpenCL:
-DGGML_OPENCL=ON
-DGGML_VULKAN=OFF
OpenCL headers and runtime linking are configured for the Android NDK build system. The implementation uses the Qualcomm-optimized OpenCL backend that was recently merged into llama.cpp.
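As a rough sketch, a cross-compile along these lines should produce the OpenCL-enabled libraries (the NDK path, ABI, and API level below are assumptions to adapt to your toolchain; upstream llama.cpp also expects the OpenCL headers and ICD loader to be available for Android):

```shell
# Cross-compile llama.cpp with the OpenCL backend for arm64 Android.
# ANDROID_NDK is assumed to point at an installed NDK.
cmake -B build-android \
  -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
  -DANDROID_ABI=arm64-v8a \
  -DANDROID_PLATFORM=android-28 \
  -DGGML_OPENCL=ON \
  -DGGML_VULKAN=OFF \
  -DBUILD_SHARED_LIBS=ON
cmake --build build-android --config Release -j
```

The resulting shared libraries can then be bundled into the .aar alongside the JNI glue.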
Testing:
Tested on a Qualcomm Adreno GPU with measurable performance improvements over CPU-only mode; token generation speed increases significantly with all layers offloaded.
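One way to reproduce the CPU-vs-GPU comparison on-device (a sketch only - the binary and model paths are assumptions, and these commands require a connected device) is to push the stock llama-bench tool from the same build and toggle layer offload with -ngl:

```shell
# Push a benchmark binary and model to the device (paths are examples).
adb push build-android/bin/llama-bench /data/local/tmp/
adb push model.gguf /data/local/tmp/
# CPU-only baseline: no layers offloaded.
adb shell "cd /data/local/tmp && LD_LIBRARY_PATH=. ./llama-bench -m model.gguf -ngl 0"
# Full offload to the GPU via the OpenCL backend.
adb shell "cd /data/local/tmp && LD_LIBRARY_PATH=. ./llama-bench -m model.gguf -ngl 99"
```

Comparing the reported tokens/s between the two runs gives the speedup figure directly.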
Note: This uses the standard llama.cpp repository. All OpenCL support is already present in the upstream codebase - this project is purely an Android integration demonstrating how to configure and build it correctly for mobile devices.