feat: ollama tool shim #1448
Conversation
michaelneale left a comment
very cool - surprised to also see it with openrouter - I assume it works ok there for similar models?
@alicehau I think we should get this in when the ollama release is ready, as it looks promising.

yes, I've tested this out with at least deepseek r1 models on openrouter and it works the same!
```rust
use super::errors::ProviderError;
use crate::message::{Message, MessageContent};
use crate::model::ModelConfig;
use crate::providers::formats::openai::create_request;
```
we can also consider copying the formats/openai.rs file into a new formats/ollama.rs and having some of these changes there. either is fine
my thought here was to leave open the possibility of a toolshim backed by a non-ollama model
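That generality could look something like the sketch below; the trait name, signature, and import path are purely illustrative, not part of this PR.

```rust
use async_trait::async_trait;
use serde_json::Value;

use crate::providers::errors::ProviderError; // path assumed

/// Hypothetical abstraction: any model endpoint that can produce
/// schema-constrained JSON could back the shim, with ollama as one impl.
#[async_trait]
pub trait ToolInterpreter {
    async fn interpret_to_tool_calls(
        &self,
        content: &str,
        schema: &Value,
    ) -> Result<Value, ProviderError>;
}
```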
(force-pushed 123f057 to 65d041c)
do we also need to add to reference and summarize agents?
I thought those were just legacy at this point? I don't recall any way for folks to switch their agents?
salman1993 left a comment
left a minor comment on moving the toolshim check up in the agent
(force-pushed 1a31146 to fdd46c3)

(force-pushed 56b986f to 190a26c)
Made some big refactoring changes to pull all the shim logic out of the individual providers and put it into the agent instead.
baxen left a comment
LGTM! Excited to try this
I think we should eventually move this logic into the creation of a "Model", where the model always has tool calling supported, either by a shim like this or natively. That, I think, we can revisit with a config upgrade in the near future. This seems like a good place to start trying it out now.
crates/goose/src/agents/truncate.rs (outdated)
```rust
let config = capabilities.provider().get_model_config();
let mut system_prompt = capabilities.get_system_prompt().await;
let mut toolshim_tools = vec![];
if config.interpret_chat_tool_calls {
```
nit: it sounds like I missed a version where this was handled in the provider instead? I like that separation of concerns actually. But also this definitely works.
Yeah, the original version was implemented in each provider. Moved in this direction of putting the logic into the agent because a lot of code was just duplicated across providers: check whether the shim is on, then modify the tools passed and the system prompt, and interpret the response received back. It's also nicer to have in the agent since you don't have to modify each new provider to add the shim.
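A rough sketch of that agent-level flow; the helper and type names (`modify_system_prompt_for_tool_json`, `augment_message_with_tool_calls`, `OllamaInterpreter`) are assumptions for illustration, not necessarily the PR's exact API:

```rust
// Illustrative only: runs inside the agent's reply loop.
async fn generate_response(
    capabilities: &Capabilities,
    mut tools: Vec<Tool>,
    messages: &[Message],
    interpreter: &OllamaInterpreter, // hypothetical interpreter handle
) -> Result<Message, ProviderError> {
    let config = capabilities.provider().get_model_config();
    let mut system_prompt = capabilities.get_system_prompt().await;
    let mut toolshim_tools = vec![];

    if config.interpret_chat_tool_calls {
        // Shim on: describe the tools to the primary model via the system
        // prompt instead of passing them natively, and keep the real list
        // aside for the interpreter.
        system_prompt = modify_system_prompt_for_tool_json(&system_prompt, &tools);
        toolshim_tools = std::mem::take(&mut tools);
    }

    let mut response = capabilities
        .provider()
        .complete(&system_prompt, messages, &tools)
        .await?;

    if config.interpret_chat_tool_calls {
        // Shim on: have the interpreter model turn the free-text response
        // into structured JSON, then into real tool call requests.
        response = augment_message_with_tool_calls(interpreter, response, &toolshim_tools).await?;
    }

    Ok(response)
}
```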
crates/goose/src/model.rs (outdated)
```rust
/// Optional maximum tokens to generate
pub max_tokens: Option<i32>,
/// Whether to interpret tool calls
pub interpret_chat_tool_calls: bool,
```
nit: I'd maybe just have tool_call_interpreter_model be an Option, with Some/None replacing this field?
I currently have a default tool shim model, mistral-nemo, set so you don't have to pass in the model. But I'm open to requiring you to pass a toolshim model as well.
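For comparison, a minimal sketch of the `Option`-based variant suggested above, assuming the default interpreter stays mistral-nemo (the field name and env handling here are illustrative):

```rust
pub struct ModelConfig {
    // ...existing fields...
    /// Some(model) enables the shim with that interpreter; None disables it.
    pub toolshim_model: Option<String>,
}

fn toolshim_model_from_env() -> Option<String> {
    if std::env::var("GOOSE_TOOLSHIM").is_ok_and(|v| v == "1") {
        // Fall back to the default interpreter when no model is named.
        Some(std::env::var("GOOSE_TOOLSHIM_MODEL").unwrap_or_else(|_| "mistral-nemo".to_string()))
    } else {
        None
    }
}
```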
```rust
let content = response["message"]["content"].as_str().unwrap_or_default();

// Try to parse the content as JSON
if let Ok(content_json) = serde_json::from_str::<Value>(content) {
```
i wonder if there are any ways to incrementally improve error handling here:
- if we can't parse the JSON, should we retry the structured generation? it should very rarely or never happen
- same as above for empty tool calls
- same as above for a tool call with no name field
- if we have a tool call with a name but no arguments field, should we surface that as a tool call error?

not sure if any of this comes up in practice, or does the structured generation just always consistently solve that?
the structured outputs from ollama ensure the response conforms to the schema you pass in every time, as they mask tokens that don't conform at the sampling stage (https://blog.danielclayton.co.uk/posts/ollama-structured-outputs/).

empty tool calls are actually ok, as not every response should have a tool call (i.e., the agent asks for clarification or has just completed the task)

if there are tool calls, they will always have "name" and "arguments" fields since those are passed as required to ollama's structured outputs
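For illustration, the schema handed to ollama's structured outputs presumably looks something like the sketch below (the exact shape is an assumption; the point is that `name` and `arguments` sit under `required`, while an empty `tool_calls` array remains valid):

```rust
use serde_json::json;

// Assumed shape of the `format` schema sent to ollama. An empty `tool_calls`
// array still satisfies the schema, which is why "no tool call" responses
// parse fine.
let format_schema = json!({
    "type": "object",
    "properties": {
        "tool_calls": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": { "type": "string" },
                    "arguments": { "type": "object" }
                },
                "required": ["name", "arguments"]
            }
        }
    },
    "required": ["tool_calls"]
});
```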
Merged upstream/main:
- feat(google_drive): move credentials into keychain, add optional fallback (block#1603)
- feat: add session list command in cli (block#1586)
- feat: google sheets support (in google drive builtin MCP server) (block#1601)
- fix: deep link opening when window is closed (block#1633)
- docs: edits to docker guide (block#1639)
- feat: ollama tool shim (block#1448)
- feat: add write approve mode (block#1628)
- ui: auto update card upon config (block#1610)
- fix: fix tool output expansion checks (block#1634)
- fix: remove conditional that breaks output display for tool calls (block#1631)
- docs: Persistent Command History (block#1627)
- change to make build work on windows, macos, linux (block#1618)
- chore(release): release version 1.0.13 (block#1623)
- fix: handle mac screenshots with the image tool (block#1622)
- feat: write eval results to eval dir (block#1620)
- [fix] fix model config logging to remove api key (block#1619)
- fix: ensure repeating benches return to initial run-dir (block#1617)
Co-authored-by: Alice Hau <ahau@squareup.com>
Tool shim that uses ollama models to "interpret" LLM responses generated by other models, particularly those without native tool calling, e.g., deepseek. The shim is currently a feature in the openrouter and ollama providers.

The shim instructs the primary model to output JSON describing intended tool usage; the interpreter model uses ollama structured outputs to translate the primary model's message into valid JSON; and that JSON is then translated into valid tool calls to be invoked.
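As a concrete, invented example of that round trip (tool name and arguments are made up for illustration):

```
Primary model output (free text):
  I'll list the files first.
  {"name": "shell", "arguments": {"command": "ls"}}

Interpreter model (ollama structured outputs) extracts:
  {"tool_calls": [{"name": "shell", "arguments": {"command": "ls"}}]}

The agent then turns that JSON into an actual tool call request.
```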
To use, set the env var `GOOSE_TOOLSHIM=1`. The default interpreter model is mistral-nemo, but you can override this with the env var `GOOSE_TOOLSHIM_MODEL=[name of ollama model]`:

```
GOOSE_TOOLSHIM=1 GOOSE_TOOLSHIM_MODEL=llama3.2 cargo run --bin goose session
```

Recommend running the ollama server with the env var `OLLAMA_CONTEXT_LENGTH` set to something higher than the 2048 default limit, e.g. `OLLAMA_CONTEXT_LENGTH=50000 ollama serve`. This feature is only available from the ollama source build as it hasn't been released yet.

The most promising combination right now seems to be full deepseek-r1 via openrouter + toolshim.