chore: ⬆️ Update ggml-org/llama.cpp to d6588daa800058dfa54f1d7ea695b1a810c8ae18 #10093

Merged
mudler merged 2 commits into mudler:master from ci-forks:update/LLAMA_VERSION
May 31, 2026

Conversation

@localai-bot
Collaborator

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@localai-bot localai-bot force-pushed the update/LLAMA_VERSION branch from 7d28b3f to 5397b5e Compare May 30, 2026 20:28
Upstream llama.cpp (ggml-org/llama.cpp#23884), pulled in by this bump,
now emits an initial "begin" partial whose to_json() returns null. It
exists only to signal the HTTP layer to flush 200 status headers before
any token is produced.

gRPC has no such concept, and PredictStream had no guard: the null result
was fed straight into build_reply_from_json, which threw an uncaught
exception. That surfaced as a generic "Unexpected error in RPC handling"
and the task was cancelled the instant it launched, breaking the
PredictStream e2e spec.

Skip null results in both the first-result handling and the streaming
loop, mirroring upstream's own `if (first_result_json == nullptr)` guard.
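
The guard described above can be sketched in isolation. This is a minimal illustration, not the code from this PR: `PartialJson`, `build_reply`, and `stream_replies` are hypothetical stand-ins, with `std::nullopt` modelling a result whose `to_json()` returns null and `build_reply` modelling a reply builder that throws on null input, as the commit message says `build_reply_from_json` did.

```cpp
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a task result: std::nullopt models a to_json()
// that returns null (the "begin" partial); a string models a JSON payload.
using PartialJson = std::optional<std::string>;

// Hypothetical stand-in for build_reply_from_json: like the unguarded path
// described above, it throws when handed a null result.
std::string build_reply(const PartialJson &json) {
    if (!json) {
        throw std::invalid_argument("null result has no reply payload");
    }
    return "reply:" + *json;
}

// Builds a reply for every result, skipping null ones instead of
// forwarding them to build_reply.
std::vector<std::string> stream_replies(const std::vector<PartialJson> &results) {
    std::vector<std::string> replies;
    for (const PartialJson &json : results) {
        if (!json) {
            continue; // header-flush signal only; nothing to send over gRPC
        }
        replies.push_back(build_reply(json));
    }
    return replies;
}
```

Under these assumptions, a sequence that starts with a null "begin" partial yields replies only for the results that carry a payload, and the throwing path in `build_reply` is never reached.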

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
@mudler mudler enabled auto-merge (squash) May 31, 2026 10:10
@mudler mudler merged commit aa80d46 into mudler:master May 31, 2026
64 checks passed
2 participants