feat: sync llama.cpp to b9297 #347

Merged: jhen0409 merged 2 commits into main from auto/sync-llama.cpp on May 25, 2026

@github-actions (bot) commented on May 21, 2026

🤖 Automated llama.cpp sync

This PR was automatically created/updated by the daily sync workflow.

Changes:

  • Updated llama.cpp submodule from b9254 to b9297
  • Regenerated bindings and build files

Verification:

  • ✅ Bootstrap script completed successfully (including iOS Metal compilation)
  • ✅ iOS frameworks build completed successfully (macOS job)
  • ✅ Android libraries build completed successfully (Ubuntu job)
  • ✅ C++ unit tests passed
  • ✅ TypeScript build completed successfully

📋 llama.cpp changes (b9254 → b9297)

  • b0df4c0 model : add NVFP4 MTP scale tensors (#23563)
  • a497476 ggml : Check the right iface method before using the fallback 2d get (#23514)
  • 95405ac vulkan: fix windows find_package of SPIRV-Headers (#23215)
  • 0f3cb3f opencl: generalize Adreno MoE kernels on M (#23449)
  • 1acee6b server: only parse empty msg if continuing an assistant msg (#23506)
  • ef570f6 perplexity : fix integer overflow (#23496)
  • cc9e331 SYCL: improve MoE prefill throughput (#23142)
  • bcfd198 sycl : Level Zero detection in ggml_sycl_init (#23097)
  • 56f16f2 SYCL : gated_delta_net K>1 (#23174)
  • 8cc67ef SYCL: add BF16 to DMMV kernel path (~4x tg speedup on Intel Arc) (#21580)
  • 95feeab docs: Update documentation with Granite 4.0/4.1 (#23404)
  • 99d4026 ggml-zendnn : add Q8_0 quantization support (#23414)
  • 9c92e96 cmake : build router app only during standalone builds (#23521)
  • afcda09 vocab : fix HybridDNA tokenizer (#23466)
  • bbce619 cmake : add install() for impl libraries + fix apple builds (#23511)
  • 4f0e43d CUDA: fix PDL CC check for JIT compilation (#23471)
  • bb28c1f cmake : remove STATIC from impl libraries, enable LLAMA_BUILD_APP by default (#23462)
  • ee7c305 Update WebGPU support and add link to blog/demo (#23483)
  • 47c0eda vulkan: fuse snake activation (mul, sin, sqr, mul, add) (#22855)
  • 5306f4b fix(flash-attn): replace f32 with kv_type and q_type (#23372)
  • 40d5358 tests : move save-load-state from examples to tests (#23336)
  • b65bb4b server: expose prompt token counts in /slots endpoint (#23454)
  • a1a69f7 metal : optimize concat kernel and fix set kernel threads (#23411)
  • 52fb93a server : free draft/MTP resources on sleep to fix VRAM leak (#23461)
  • c902171 server: re-inject subcommand when router spawns children under unified binary (#23442)
  • 1d7ab2b app : add batched-bench, fit-params, quantize & perplexity (#23459)
  • 12e5d99 mtp: use inp_out_ids for skipping logit computation (#23433)
  • 7ea23dd vocab : add Carbon-3B (HybridDNATokenizer) support (#23410)
  • 2fc8d18 doc: fix spec mtp typo (#23435)
  • 5e932a1 ui: Improve Git Hooks for UI development (#23403)
  • 2754ce1 ggml : Check the right iface method before using the fallback 2d get (#23306)
  • eeeaf61 llama-graph: fix null-buffer crash in llm_graph_input_attn_kv_iswa for SWA-only models (#23131)
  • 0be8468 hexagon: ssm-conv fix for large prompts (#23307)
  • ce02093 app : show version (#23426)
  • 6a257d4 mtmd, model : merge HunyuanOCR into HunyuanVL and fix OCR vision precision (#23329)
  • 3a479c9 ui: Add max image size option (#22849)
  • ad27757 Move to backend sampling for MTP draft path (#23287)
  • 3a6db74 opencl: refactor backend initilization (#23318)
  • 510b5c2 common/speculative : fix nullptr crash in get_devices_str (#23386)
  • a8681a0 mtmd : DeepSeek-OCR image processing fixes, img_tool::resize padding refactor (#23345)
  • acd604f vulkan: optimize operations in the IM2COL shader (#22685)
  • 6ce9671 feat: Add WAV MIME type variants and improve audio format detection (#23396)
  • c9872a2 hexagon: HMX quantized matmul rework (#23368)

Please review and merge if all checks pass.

@github-actions (bot) force-pushed the auto/sync-llama.cpp branch from ae41b47 to ecd26bd on May 22, 2026, 04:01
@github-actions (bot) changed the title from "feat: sync llama.cpp to b9260" to "feat: sync llama.cpp to b9279" on May 22, 2026
@github-actions (bot) force-pushed the auto/sync-llama.cpp branch from ecd26bd to a7a22d0 on May 23, 2026, 04:01
@github-actions (bot) changed the title from "feat: sync llama.cpp to b9279" to "feat: sync llama.cpp to b9294" on May 23, 2026
@github-actions (bot) force-pushed the auto/sync-llama.cpp branch from a7a22d0 to df465ed on May 24, 2026, 04:00
@github-actions (bot) changed the title from "feat: sync llama.cpp to b9294" to "feat: sync llama.cpp to b9297" on May 24, 2026
@jhen0409 merged commit aa64d01 into main on May 25, 2026
@jhen0409 deleted the auto/sync-llama.cpp branch on May 25, 2026, 07:38