Fix meta backend tensor reads for split tensors during state serialization by ssam18 · Pull Request #22063 · ggml-org/llama.cpp

ssam18 · 2026-04-17T19:29:30Z

This PR fixes a crash when saving recurrent state with tensor-split models using the meta backend. The previous code assumed that a tensor read would always map to a single segment, which is not always true when -sm tensor is enabled. The fix handles multi-segment tensor reads correctly instead of hitting the split_state.n_segments == 1 assertion. This should allow checkpoint/state serialization to work reliably with tensor-parallel CUDA setups. Fixes #22058
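
For readers unfamiliar with the failure mode, the idea of a multi-segment read can be sketched as below. This is a minimal illustration, not the code from this PR: `segment` and `read_split` are hypothetical names, each segment is assumed to hold one contiguous byte range of the logical tensor, and the data sits in host memory instead of on separate devices. The actual meta backend may lay out its splits differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for one device's share of a split tensor.
// Assumption: each segment covers one contiguous byte range of the logical tensor.
struct segment {
    size_t               offset; // byte offset of this segment within the logical tensor
    std::vector<uint8_t> data;   // bytes held by this segment (host memory for simplicity)
};

// Copy `size` bytes starting at logical byte `offset` into `dst`.
// Instead of requiring the range to fall inside a single segment,
// visit every segment and copy the part that overlaps the request.
// Returns the number of bytes actually copied.
static size_t read_split(const std::vector<segment> & segs, void * dst, size_t offset, size_t size) {
    assert(dst != nullptr || size == 0);

    const size_t end    = offset + size;
    size_t       copied = 0;

    for (const segment & s : segs) {
        const size_t seg_begin = s.offset;
        const size_t seg_end   = s.offset + s.data.size();

        // Intersection of [offset, end) with [seg_begin, seg_end).
        const size_t lo = std::max(offset, seg_begin);
        const size_t hi = std::min(end, seg_end);
        if (lo >= hi) {
            continue; // this segment does not overlap the requested range
        }

        std::memcpy(static_cast<uint8_t *>(dst) + (lo - offset),
                    s.data.data() + (lo - seg_begin),
                    hi - lo);
        copied += hi - lo;
    }
    return copied;
}
```

A caller can compare the return value against `size` to detect a request that is not fully covered by the segments, which is one possible alternative to asserting that exactly one segment matches.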

pwilkin · 2026-04-18T07:51:22Z

Can confirm it now works. Token generation (tg) is slightly slower at 93 t/s, versus 110 t/s with -sm layer.

pwilkin · 2026-04-18T08:06:24Z

Actually, never mind, was missing NCCL. With NCCL installed, performance is up at 123 t/s.

ggml-backend-meta: add multi-segment read support in get_tensor

6170c04

github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) on Apr 17, 2026

JohannesGaessler approved these changes Apr 17, 2026

JohannesGaessler requested a review from pwilkin April 17, 2026 22:15

pwilkin approved these changes Apr 18, 2026

JohannesGaessler merged commit 59accc8 into ggml-org:master Apr 18, 2026
50 of 51 checks passed

EmilPi mentioned this pull request Apr 18, 2026

Eval bug: Qwen 2.5 27B Crashes with ctxcp on sm tensor #21878

Open

samuraieng pushed a commit to samuraieng/llama.cpp that referenced this pull request Apr 19, 2026

ggml-backend-meta: add multi-segment read support in get_tensor (ggml…

0f493b7

…-org#22063)

mengqin pushed a commit to mengqin/llama.cpp that referenced this pull request Apr 20, 2026

ggml-backend-meta: add multi-segment read support in get_tensor (ggml…

853c84a

…-org#22063)

ArberSephirotheca pushed a commit to ArberSephirotheca/llama.cpp that referenced this pull request Apr 21, 2026

ggml-backend-meta: add multi-segment read support in get_tensor (ggml…

98c35df

…-org#22063)

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Apr 23, 2026

ggml-backend-meta: add multi-segment read support in get_tensor (ggml…

634c96a

…-org#22063)

xkong-anaconda mentioned this pull request May 1, 2026

llama.cpp: update from b8728 to b8994 AnacondaRecipes/llama.cpp-feedstock#40

Merged

rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026

ggml-backend-meta: add multi-segment read support in get_tensor (ggml…

41d4c92

…-org#22063)

jimbothigpen pushed a commit to jimbothigpen/frankenturbo2 that referenced this pull request May 2, 2026

ggml-backend-meta: add multi-segment read support in get_tensor (ggml…

6ecfdfe

…-org#22063)

ljubomirj pushed a commit to ljubomirj/llama.cpp that referenced this pull request May 6, 2026

ggml-backend-meta: add multi-segment read support in get_tensor (ggml…

05a6199

…-org#22063)

JohannesGaessler merged 1 commit into ggml-org:master from ssam18:fix/meta-buffer-get-tensor-multi-segment
