
llama: fit ctx size for CPU only#21568

Merged
JohannesGaessler merged 1 commit into ggml-org:master from JohannesGaessler:llama-fit-cpu-only
Apr 18, 2026
Conversation

@JohannesGaessler
Contributor

Alternative to #19711 (comment).

I think the correct way to reduce context size for CPU-only builds is to accumulate the host buffer types and to compare those vs. total system memory. This PR is currently only partially tested (and thus a draft) because I don't have a convenient combination of model and system memory sizes ready.

Requirements

@JohannesGaessler
Contributor Author

Fixes #19646 .

@taronaeo
Member

taronaeo commented Apr 8, 2026

I'll test it in a few days and get back once I have the results. Thanks for taking a look at it! :)

@taronaeo taronaeo left a comment
Member

Tested on 2 GB and 32 GB memory systems respectively; both work as intended.

@JohannesGaessler JohannesGaessler marked this pull request as ready for review April 13, 2026 14:33
@JohannesGaessler JohannesGaessler merged commit fd1c0ec into ggml-org:master Apr 18, 2026
49 of 51 checks passed
samuraieng pushed a commit to samuraieng/llama.cpp that referenced this pull request Apr 19, 2026
mengqin pushed a commit to mengqin/llama.cpp that referenced this pull request Apr 20, 2026
ArberSephirotheca pushed a commit to ArberSephirotheca/llama.cpp that referenced this pull request Apr 21, 2026
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Apr 23, 2026
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026
jimbothigpen pushed a commit to jimbothigpen/frankenturbo2 that referenced this pull request May 2, 2026
ljubomirj pushed a commit to ljubomirj/llama.cpp that referenced this pull request May 6, 2026
3 participants