docs: local LLMs context size tip #3454
Conversation
Also, the context size increases VRAM usage. There is an LLM-Model-VRAM-Calculator, mentioned in #1817, that estimates how much space a model and its context require.
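The VRAM cost of a larger context that this comment refers to can be roughed out with a back-of-the-envelope KV-cache estimate. This is only a sketch: the hyperparameters below are illustrative (Llama-3-8B-like), not taken from the PR, and the linked calculator accounts for much more.

```shell
# Rough KV-cache size: 2 (K and V) * layers * kv_heads * head_dim
#                      * context_length * bytes_per_value (fp16 = 2).
# Hyperparameters are illustrative assumptions, not from this PR.
layers=32; kv_heads=8; head_dim=128; ctx=32768; bytes=2
kv_bytes=$(( 2 * layers * kv_heads * head_dim * ctx * bytes ))
echo "KV cache at ${ctx} tokens: $(( kv_bytes / 1024 / 1024 )) MiB"
```

At these assumed settings, a 32k context alone costs about 4 GiB on top of the model weights, which is why an oversized context can silently push a model out of VRAM.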
Also, I think we need to edit "(e.g.
michaelneale left a comment:
Thanks for this. Yeah, there are so many little permutations and variables; this makes sense to me (I never could remember how to set the size).
deepseek-r1 works on OpenRouter, at least with tool calling (the ones I have tried), and I think it may on Ollama now, but I haven't tried it. So yeah.
I want to fix this up a bit before merging.
OK, which model is the example of a popular model that doesn't support tool calling? I suggest gemma3 for this one.
Thanks so much for this!
@angiejones Wait, I didn't change the example! Haha
@angiejones We need to change this part:
I need your review to decide what to do with this part. Just delete it?
How come delete?
Signed-off-by: jjjuk <[email protected]>
angiejones left a comment:
I want to split the warning into two.
Signed-off-by: jjjuk <[email protected]> Co-authored-by: angiejones <[email protected]> Signed-off-by: Adam Tarantino <[email protected]>
Hi everyone! I personally had issues with context size and Ollama, and saw several other cases, so I've extended the tool-calling warning with information about context size. I updated this for Ollama and Ramalama.
Before: [screenshot]

After, for Ollama: [screenshot]

After, for Ramalama: [screenshot]
There is also a small change in the Ramalama example.
If you want me to change something, I'm happy to help!
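For context, the kind of context-size configuration this docs tip covers looks roughly like the sketch below. The environment variable and `num_ctx` parameter are from my reading of the Ollama docs; the Ramalama flag name is an assumption to verify against `ramalama run --help` for your version.

```shell
# Ollama: raise the context window server-wide (env var in recent
# Ollama releases), or per model via a Modelfile PARAMETER.
OLLAMA_CONTEXT_LENGTH=32768 ollama serve
# Modelfile alternative (model name is illustrative):
#   FROM qwen2.5
#   PARAMETER num_ctx 32768

# Ramalama: pass the context size when running a model
# (flag name assumed here; it maps down to llama.cpp's -c).
ramalama run --ctx-size 32768 ollama://qwen2.5
```

Whichever tool is used, the trade-off from the discussion above applies: a larger context avoids truncated tool calls, but costs VRAM.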