Conversation

@alicehau alicehau commented Feb 28, 2025

Tool shim that uses Ollama models to "interpret" LLM responses generated by other models, particularly models without native tool calling, e.g., DeepSeek. The shim is currently a feature in the openrouter and ollama providers.

The shim instructs the primary model to output JSON describing its intended tool usage; the interpreter model uses Ollama structured outputs to translate the primary model's message into valid JSON; and that JSON is then translated into valid tool calls to be invoked.
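For concreteness, here's a minimal sketch of that interpretation step, assuming Ollama's /api/chat endpoint with a JSON-schema format; the function name, prompt wording, and schema shape here are illustrative, not the PR's exact code:

use serde_json::{json, Value};

// Send the primary model's free-form reply to a local Ollama model with a
// JSON-schema `format`, so sampling is constrained to a tool-call shape.
async fn interpret_tool_calls(
    assistant_text: &str,
    interpreter_model: &str, // e.g. "mistral"
) -> Result<Value, reqwest::Error> {
    // Schema the interpreter must conform to; "name" and "arguments" are required.
    let schema = json!({
        "type": "object",
        "properties": {
            "tool_calls": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "name": { "type": "string" },
                        "arguments": { "type": "object" }
                    },
                    "required": ["name", "arguments"]
                }
            }
        },
        "required": ["tool_calls"]
    });

    let body = json!({
        "model": interpreter_model,
        "messages": [{
            "role": "user",
            "content": format!(
                "Extract any intended tool calls from this message as JSON:\n{assistant_text}"
            )
        }],
        "format": schema, // Ollama structured outputs
        "stream": false
    });

    let resp = reqwest::Client::new()
        .post("http://localhost:11434/api/chat")
        .json(&body)
        .send()
        .await?;
    resp.json::<Value>().await
}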

To use, set the env var GOOSE_TOOLSHIM=1

The default interpreter model is mistral, but you can override this with the env var GOOSE_TOOLSHIM_MODEL=[name of ollama model].

GOOSE_TOOLSHIM=1 GOOSE_TOOLSHIM_MODEL=llama3.2 cargo run --bin goose session

We recommend running the Ollama server with the env var OLLAMA_CONTEXT_LENGTH set higher than the default 2048 limit, e.g., OLLAMA_CONTEXT_LENGTH=50000 ollama serve. Note that this env var is only available when building Ollama from source, as it hasn't been released yet.

The most promising combination right now seems to be full deepseek-r1 via openrouter + toolshim.

@michaelneale michaelneale left a comment

very cool - surprised to also see it with openrouter - I assume it works ok there for similar models?

@michaelneale

@alicehau I think we should get this in when the ollama release is ready, as it looks promising.

@ahau-square

> very cool - surprised to also see it with openrouter - I assume it works ok there for similar models?

yes, I've tested this out with at least deepseek r1 models on openrouter and it works the same!

@ahau-square ahau-square requested a review from laanak08 March 4, 2025 14:54
use super::errors::ProviderError;
use crate::message::{Message, MessageContent};
use crate::model::ModelConfig;
use crate::providers::formats::openai::create_request;

we can also consider copying the formats/openai.rs file into a new formats/ollama.rs and having some of these changes there. either is fine

my thought here was to leave open the possibility of a toolshim backed by a non-ollama model

@ahau-square ahau-square force-pushed the ahau/toolshim-refactor branch from 123f057 to 65d041c Compare March 7, 2025 23:26
@ahau-square ahau-square marked this pull request as ready for review March 10, 2025 15:43

do we also need to add this to the reference and summarize agents?

thought those were just legacy at this point? I don't recall any way for folks to switch their agents?

@salman1993 salman1993 left a comment

left a minor comment on moving the toolshim check up in the agent

@ahau-square ahau-square force-pushed the ahau/toolshim-refactor branch from 56b986f to 190a26c Compare March 10, 2025 22:00
@ahau-square

Made some big refactoring changes to pull all the shim logic out of the individual providers and put it into the agent instead.

@baxen baxen left a comment

LGTM! Excited to try this

I think we should eventually move this logic into the creation of a "Model" where the model always has tool calling supported, either by a shim like this or natively. That, I think, we can revisit with a config upgrade in the near future. This seems like a good place to start trying it out now.
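As a rough sketch of that direction (types and names here are hypothetical, not goose's actual API), the shim-or-native decision could be resolved once at Model construction so callers never branch on it again:

enum ToolSupport {
    Native,
    // Tool calls recovered by an interpreter model, as in this PR's shim
    Shimmed { interpreter_model: String },
}

struct Model {
    name: String,
    tool_support: ToolSupport,
}

impl Model {
    fn new(name: String, supports_tools: bool, shim_model: Option<String>) -> Self {
        let tool_support = if supports_tools {
            ToolSupport::Native
        } else {
            // Fall back to the toolshim, defaulting to the PR's default interpreter
            ToolSupport::Shimmed {
                interpreter_model: shim_model.unwrap_or_else(|| "mistral".into()),
            }
        };
        Model { name, tool_support }
    }
}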

let config = capabilities.provider().get_model_config();
let mut system_prompt = capabilities.get_system_prompt().await;
let mut toolshim_tools = vec![];
if config.interpret_chat_tool_calls {

nit: it sounds like I missed a version where this was handled in the provider instead? I like that separation of concerns, actually. But also, this definitely works.

Yeah, the original version was implemented in each provider. I moved in this direction of putting the logic into the agent because a lot of the code was just duplicated across providers: check whether the shim was on, modify the tools passed and the system prompt, and interpret the response received back. It's also nicer to have in the agent, since you then don't have to modify each new provider to add the shim.

/// Optional maximum tokens to generate
pub max_tokens: Option<i32>,
/// Whether to interpret tool calls
pub interpret_chat_tool_calls: bool,

nit: I'd maybe just have tool_call_interpreter_model be an Option, with it being Some/None replacing this field?

I currently have a default tool shim model, mistral-nemo, set so you don't have to pass in the model. But I'm open to requiring you to pass a toolshim model as well.
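For what it's worth, a sketch of the Option-based shape suggested above might look like this (field names hypothetical):

pub struct ModelConfig {
    /// Optional maximum tokens to generate
    pub max_tokens: Option<i32>,
    /// None = native tool calling; Some(model) = interpret tool calls via
    /// this Ollama model, replacing the separate interpret_chat_tool_calls bool
    pub toolshim_model: Option<String>,
}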

let content = response["message"]["content"].as_str().unwrap_or_default();

// Try to parse the content as JSON
if let Ok(content_json) = serde_json::from_str::<Value>(content) {

I wonder if there are any ways to incrementally improve error handling here:

  • if we can't parse the JSON, should we retry the structured generation? (it should very rarely or never happen)
  • same as above for empty tool calls
  • same as above for a tool call with no name field
  • if we have a tool call with a name but no arguments field, should we surface that as a tool call error?

not sure if any of this comes up in practice or does the structured generation just always consistently solve that?

The structured outputs from Ollama ensure the response conforms to the schema you pass in every time, since tokens that don't conform are masked at the sampling stage (https://blog.danielclayton.co.uk/posts/ollama-structured-outputs/).

Empty tool calls are actually ok, as not every response should have a tool call (e.g., the agent asks for clarification or has just completed the task).

If there are tool calls, they will always have "name" and "arguments" fields, since those are passed as required to Ollama's structured outputs.
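Given those guarantees, the extraction on the agent side can stay simple. A minimal sketch (not the PR's exact code) that treats an empty or missing tool_calls array as "no tool needed":

use serde_json::Value;

fn extract_tool_calls(content: &str) -> Vec<(String, Value)> {
    // Schema-constrained output should always parse; fall back to no tool calls.
    let Ok(parsed) = serde_json::from_str::<Value>(content) else {
        return vec![];
    };
    parsed["tool_calls"]
        .as_array()
        .map(|calls| {
            calls
                .iter()
                .filter_map(|call| {
                    // "name" and "arguments" are required by the schema
                    Some((call["name"].as_str()?.to_string(), call["arguments"].clone()))
                })
                .collect()
        })
        .unwrap_or_default()
}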

@alicehau alicehau merged commit 259ccd5 into main Mar 12, 2025
6 checks passed
@alicehau alicehau deleted the ahau/toolshim-refactor branch March 12, 2025 15:01
sheagcraig added a commit to sheagcraig/goose that referenced this pull request Mar 12, 2025
* upstream/main:
  feat(google_drive): move credentials into keychain, add optional fallback (block#1603)
  feat: add session list command in cli (block#1586)
  feat: google sheets support (in google drive builtin MCP server) (block#1601)
  fix: deep link opening when window is closed (block#1633)
  docs: edits to docker guide (block#1639)
  feat: ollama tool shim (block#1448)
  feat: add write approve mode (block#1628)
  ui: auto update card upon config (block#1610)
  fix: fix tool output expansion checks (block#1634)
  fix: remove conditional that breaks output display for tool calls (block#1631)
  docs: Persistent Command History (block#1627)
  change to make build work on windows, macos, linux (block#1618)
  chore(release): release version 1.0.13 (block#1623)
  fix: handle mac screenshots with the image tool (block#1622)
  feat: write eval results to eval dir (block#1620)
  [fix] fix model config logging to remove api key (block#1619)
  fix: ensure repeating benches return to initial run-dir (block#1617)
ahau-square pushed a commit that referenced this pull request May 2, 2025
Co-authored-by: Alice Hau <ahau@squareup.com>
cbruyndoncx pushed a commit to cbruyndoncx/goose that referenced this pull request Jul 20, 2025
Co-authored-by: Alice Hau <ahau@squareup.com>