
server: (router) alloc tmp buffer on heap #23159

Merged
ngxson merged 2 commits into ggml-org:master from ngxson:xsn/server_nits_heap_alloc on May 16, 2026

@ngxson ngxson commented May 16, 2026

Overview

Follows up on a comment from #22683 that was missed.
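
For readers without the diff at hand: the title describes moving a temporary buffer from the stack to the heap. The change itself is not reproduced in this conversation, so the snippet below is only a minimal sketch of that general pattern. `read_all`, its parameters, and the 64 KiB chunk size are hypothetical names and values chosen for illustration, not taken from the llama.cpp server code.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the llama.cpp source): drains a stream
// through a temporary chunk buffer that is allocated on the heap.
std::string read_all(std::istream & in, std::size_t chunk_size = 64 * 1024) {
    assert(chunk_size > 0);

    // Heap-backed scratch buffer. The stack-based alternative would be
    // `char tmp[64 * 1024];`, which takes 64 KiB of the calling thread's stack.
    std::vector<char> tmp(chunk_size);

    std::string out;
    while (in) {
        in.read(tmp.data(), static_cast<std::streamsize>(tmp.size()));
        // gcount() reports how many bytes the last read actually produced,
        // including the short final chunk at end of stream.
        out.append(tmp.data(), static_cast<std::size_t>(in.gcount()));
    }
    return out;
}
```

A `std::vector<char>` keeps the buffer out of the stack frame and frees it automatically on every return path, which matters most on threads with small stacks.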

@ngxson ngxson requested a review from a team as a code owner May 16, 2026 18:47
@ngxson ngxson merged commit b64739e into ggml-org:master May 16, 2026
44 of 49 checks passed
kgrama pushed a commit to kgrama/llama.cpp that referenced this pull request May 19, 2026
xxmustafacooTR pushed a commit to xxPlayground/llama-cpp-turboquant that referenced this pull request May 19, 2026
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 19, 2026
ArberSephirotheca pushed a commit to ArberSephirotheca/llama.cpp that referenced this pull request May 19, 2026
baramofme pushed a commit to baramofme/llama-cpp-turboquant that referenced this pull request May 23, 2026
srossitto79 pushed a commit to srossitto79/llama.cpp that referenced this pull request May 23, 2026
carlosfundora pushed a commit to carlosfundora/llama.cpp-1-bit-turbo that referenced this pull request May 24, 2026
winstonma pushed a commit to winstonma/llama.cpp that referenced this pull request May 27, 2026
fewtarius pushed a commit to fewtarius/llama.cpp that referenced this pull request May 30, 2026
Jcfunk added a commit to Jcfunk/llama.cpp that referenced this pull request Jun 2, 2026
* turboquant/HEAD: (82 commits)
  docs(readme): credit Google's original TurboQuant + explain the '+'
  docs(readme): fix turbo ladder ordering + cite K-compression paper
  docs(readme): reorder KV configs as a ladder + 'start light' guidance
  docs(readme): add Chronara to deployments + AtomicChat link
  docs: restructure README — professional layout, deployments, paper links
  docs: tighten README — add turbo2, missing features, paper links
  docs: keep upstream README, prepend fork-specific summary
  docs: replace upstream README with fork-specific summary
  fix(xxd.cmake): handle missing input file (not just empty)
  fix(ci): 4 cross-vendor -Werror failures + defensive xxd.cmake
  cmake : fix LLAMA_BUILD_UI logic (ggml-org#23190)
  fix(ggml-cuda): HIP nodiscard + MUSA cudaMemcpyToSymbol alias
  fix(turbo-quant): add forward declaration for turbo_cpu_fwht_inverse
  fix(metal): set ne12/ne13/r2/r3 function constants in mul_mm_tq_rotated pipeline
  webui: support video files as input (ggml-org#22830)
  server: (router) alloc tmp buffer on heap (ggml-org#23159)
  server: skip device enumeration in router mode to avoid creating CUDA primary context (ggml-org#23137)
  vulkan: removed duplicate #include <memory> in headers (ggml-org#23144)
  ui: Add request timeout for MCP tool calls (ggml-org#23138)
  sync : ggml
  ...
3 participants