@michaelneale
Collaborator

@michaelneale michaelneale commented Aug 12, 2025

This generalises the GOOSE_LEAD/WORKER variable structure into a more general multi-model approach (i.e. that old code path can be removed).
This is clearly marked with an x- experimental config prefix (as per convention) so the exact format can evolve as people use this.

This is really phase 1. Phase 2 will be both GUI/UX changes to support this and more opinionated defaults when providers are configured, but we need to get this out in the wild so we can see how things perform with all the permutations of providers we can't test by hand.

For example, in config:

x-advanced-models:
- provider: databricks
  model: goose-gpt-5
  role: reviewer
- provider: anthropic
  model: claude-opus-4-1-20250805
  role: deep-thinker
- provider: anthropic
  model: claude-opus-4-1-20250805
  role: lead

This maps to premade_roles.yaml, which defines rules for those models and when they activate:

roles:
  # Deep reasoning and analysis
  - role: "deep-thinker"
    rules:
      triggers:
        keywords: ["think", "reason", "analyze", "explain why", "how does", "what if"]
        match_type: "any"
        complexity_threshold: "high"
        source: "human"  # Only trigger on human messages
      active_turns: 3
      priority: 10

These can be used as is, or each trigger value/setting can be overridden in the personal config if none of the pre-made set fits. Normally you just specify role + provider + model and let it work things out, but you can customise.
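
As a rough illustration of the override behaviour, here is a minimal Python sketch. It assumes a premade role is a nested mapping shaped like the premade_roles.yaml excerpt, and that a personal override is merged into it key by key. The name `merge_role`, the shape of the override, and the merge policy are all hypothetical; this is not the actual goose implementation.

```python
from copy import deepcopy

# Premade role, mirroring the premade_roles.yaml excerpt.
PREMADE_DEEP_THINKER = {
    "role": "deep-thinker",
    "rules": {
        "triggers": {
            "keywords": ["think", "reason", "analyze", "explain why", "how does", "what if"],
            "match_type": "any",
            "complexity_threshold": "high",
            "source": "human",
        },
        "active_turns": 3,
        "priority": 10,
    },
}


def merge_role(premade: dict, override: dict) -> dict:
    """Return a copy of `premade` with values from `override` applied.

    Nested mappings are merged recursively; any other value (including a
    list such as `keywords`) is replaced wholesale. This merge policy is an
    assumption made for illustration only.
    """
    merged = deepcopy(premade)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_role(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical personal override: keep the premade triggers, stay active longer.
custom = merge_role(PREMADE_DEEP_THINKER, {"rules": {"active_turns": 5}})
```

Under this assumed policy, anything not mentioned in the override keeps its premade value, which matches the "just say role + provider + model" default described above.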

By default the main provider/model is used, and these will supplement it when the rules are activated/met, and run for a certain amount of time (this already helped me, as the gptoss:120B model was good at spotting short-circuit logic bugs that other models failed to see).
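
To make the "supplement, then fall back" behaviour concrete, here is a small Python sketch of one way the switching could work. It is an assumption-laden illustration, not the code in this PR: the `Role` and `Switcher` names are hypothetical, keywords are matched as case-insensitive substrings, a higher `priority` is assumed to win, the `"all"` match type is assumed to exist alongside `"any"`, and `complexity_threshold` is ignored entirely.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Role:
    name: str
    keywords: List[str]
    match_type: str   # "any" (from the example config) or "all" (assumed)
    source: str       # which message source may trigger the role, e.g. "human"
    active_turns: int
    priority: int

    def triggers_on(self, text: str, source: str) -> bool:
        """True if a message from `source` satisfies this role's keyword rule."""
        if source != self.source:
            return False
        lowered = text.lower()
        hits = [kw in lowered for kw in self.keywords]
        return all(hits) if self.match_type == "all" else any(hits)


class Switcher:
    """Tracks which role (if any) handles each turn; "main" is the default."""

    def __init__(self, roles: List[Role]):
        self.roles = roles
        self.active: Optional[Role] = None
        self.turns_left = 0

    def next_turn(self, text: str, source: str = "human") -> str:
        matches = [r for r in self.roles if r.triggers_on(text, source)]
        if matches:
            best = max(matches, key=lambda r: r.priority)
            # Assumed policy: an equal or higher priority role takes over.
            if self.active is None or best.priority >= self.active.priority:
                self.active, self.turns_left = best, best.active_turns
        if self.active is None:
            return "main"
        name = self.active.name
        self.turns_left -= 1
        if self.turns_left <= 0:
            self.active = None  # fall back to the main model next turn
        return name


deep_thinker = Role("deep-thinker", ["think", "what if"], "any", "human", 3, 10)
switcher = Switcher([deep_thinker])
```

Calling `switcher.next_turn(message)` once per turn returns either a role name or `"main"`; a triggered role stays active for `active_turns` turns and then hands control back to the main provider/model.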

This could also be used to do most of the work in a low-cost or zero-cost local model.

Run with --debug to see it log switching providers and models as it works.

discussion #3980
implements: #4036

* main: (67 commits)
  blog: Transforming AI Assistance with Goose Mentor Mode (#4151)
  upgraded all npm packages and fixed related issues (#4072)
  Docs: @-mentions in goosehints (#4171)
  fix: consistent font sizing in ToolCallWithResponse (#4167)
  Temporarily disable TODO Tool (#4158)
  docs: add integrated MCP server config to jetbrains tutorial  (#4120)
  docs: remove figma MCP from suggested servers (#4123)
  Blog: The AI Skeptic’s Guide to Context Windows (#4152)
  Docs: Auto-compact context (#4116)
  chore(deps): bump brace-expansion from 1.1.11 to 1.1.12 in /documentation (#4149)
  Recipe config to limit tool availability (#4020)
  docs: fix warning message (#4148)
  feat: adds cursor-agent as a cli provider (#4101)
  chore: remove vector search tool selection strategy (#3933)
  docs: add streamable_http install links (#4130)
  feat: iterating on summarize oneshot prompt (#4113)
  feat(mcp): Persist OAuth credentials to keyring (#4007)
  Sanitize Tags Unicode Block at prompt level (#4047)
  Fixing typos (#4114)
  chore(release): release version 1.4.0 (#4069)
  ...
…m:block/goose into micn/multi-model-multi-provider-autopilot

* 'micn/multi-model-multi-provider-autopilot' of github.com:block/goose:
  printing out debugging
  simplifying
* main: (42 commits)
  feat: Add message queue system with interruption handling (#4179)
  Start extensions concurrently  (#4234)
  Add X-Title and referer headers on exchange to tetrate (#4250)
  docs: update View/Edit Recipe menu item name (#4267)
  Remove unused game (#4226)
  fix issue where app redirects to home after initialization but user has already started a chat (#4260)
  Feat: Let providers configure a fast model for summarization (#4228)
  docs: update tool selection strategy (#4258)
  feat: upgrade `@mcp-ui/client` package and improve UI message handling (#4164)
  stop replacing chat window when changing working directory (#4200)
  Only fetch session tokens when chat state is idle to avoid resetting during streaming (#4104)
  bump timeouts for e2e tests (#4251)
  docs: custom context files improvements (#4096)
  chore: upgrade rmcp to 0.6.0 (#4243)
  doc: uvx not npx (#4240)
  Add PKCE support for Tetrate Agent Router Service (#4165)
  Read AGENTS.md by default (#4232)
  docs: configure provider and model (#4235)
  docs: add figma tutorial (#4231)
  Add Nix flake for reproducible builds (#4213)
  ...
@michaelneale michaelneale marked this pull request as ready for review August 26, 2025 07:50
@joeeuston-dev
Contributor

This sounds like an awesome feature. Currently I'm having to switch models when I do more multi modal work as compared to my 'everything' model.

@michaelneale
Collaborator Author

@joeeuston-dev yeah - I keep finding uses for it - still tuning what defaults I want, but lots of possibilities here and fairly simple in the end!

Collaborator

@jamadeo jamadeo left a comment

This is very cool and definitely something Goose needs IMO. I'm worried a bit about the number of knobs to turn with a setup like this though. Would users dive into things like triggering keywords, tool/turn counts and priority? If they would, how do you judge if you're tuning it well? Seems it would take quite a lot of trial and error.

What about an approach that uses a model to judge when to switch roles? If you're already configuring Goose to be multi-model, it could be a reasonable requirement to assign a model to have the role of "routing".

It could also be interesting to combine this with the TO-DO list strategy. Each TODO item gets a role pre-assigned.

@michaelneale
Collaborator Author

@jamadeo "This is very cool and definitely something Goose needs IMO. I'm worried a bit about the number of knobs to turn with a setup like this though. Would users dive into things like triggering keywords, tool/turn counts and priority?"

No - I would hope they don't need to. The only time would be if they really went bespoke. It is more about simplifying changes to goose itself as people discover or learn patterns, vs needing to change code (similar to providers, where we ideally make it as easy as possible to add another one). But absolutely no - they should only need to care about a few roles, we can even have some sensible default setup, and the cli/gui can also suggest others - make sense?

* main: (38 commits)
  feat: linux computer control for android (termux) (#3890)
  feat: Added scroll state support for chat-session-list navigation (#4360)
  docs: typo fix (#4376)
  blog: goose janitor (#4131)
  Fix eleven labs audio transcription and added more logging (#4358)
  feat: re-introduce session sharing (#4370)
  remove duplicate blog post (#4369)
  fix focus ring under form submits (#4332)
  Trigger docs deployment
  update tetrate blog date to today (#4368)
  tetrate signup: blog/launch post (#4313)
  Implement graceful recipe error handling with filename display (#4363)
  docs: airgapped operation by bypassing hermit for desktop app (#4063)
  remove Ollama card from welcome screen (#4348)
  feat: initial implementation of extension malware check (#4272)
  Add Tetrate Agent Router Service to Provider Registry (#4354)
  Goose Simple Compact UX (#4202)
  Refactor Extensions Install Modal (#4328)
  fix: url path trailing slash for custom-providers (#4345)
  docs: update available and onboarding providers list (#4356)
  ...
@michaelneale
Collaborator Author

@DOsinga thoughts on whether we should consolidate model config to:

models:

... like this (we can have that instead of a variable if we want)? This can then help the cli and GUI have a simpler structure to target when storing a model.

@michaelneale
Collaborator Author

@jamadeo "What about an approach that uses a model to judge when to switch roles?"

I like that, though I wouldn't want it on each turn. One option is that early on it could use that model to "turn the knobs and dials" (i.e. help select appropriate roles), which will then kick in so you don't have to. What I would like to get to is the minimal setup (i.e. if you set up N providers and M models, how can we best make use of them, how much do we do automatically, and how much do we let users direct it).

@michaelneale michaelneale requested a review from jamadeo August 28, 2025 23:48
* main:
  new recipe to lint-check my code (#4416)
  removing a leftover syntax error (#4415)
  Iand/updating recipe validation workflow (#4413)
  Iand/updating recipe validation workflow (#4410)
  Fix (Ollama provider): Unsupported operation: streaming not implemented (#4303)
  change databricks default to claude sonnet 4 (#4405)
  Iand/updating recipe validation workflow (#4406)
  Add metrics for recipe metadata in scheduler, UI, and CLI (#4399)
  Iand/updating recipe validation workflow (#4403)
  making small updates to recipe validation workflow (#4401)
  Automate OpenRouter API Key Distribution for External Recipe Contributors (#3198)
  Enhance `convert_path_with_tilde_expansion` to handle Windows (#4390)
  make sure all cookbook recipes have a title and version, but no id (#4395)
  Nest TODO State in session data (#4361)
  Fast model falls back to regular (#4375)
  Update windows instructions (#4333)
* main:
  chore: move list recipes and archive recipe to goose server (#4422)
  deleting a recipe and testing workflow (#4451)
  adding a new recipe (#4449)
  docs: autovisualiser extension (#4380)
  trying to restore functionality for api-key sending after merging a recipe (#4446)
  restoring a deleted recipe (#4445)
  testing recipe removal (#4443)
  updating our 3 workflows to only operate if the PR is adding/editing a recipe (#4441)
  [cookbook recipe] Update Wording  (#4438)
  feat: show enabled extensions at top of extensions page (#4423)
  test recipe (#4436)
  Extensions loading indicator on desktop launch (#4412)
  removing trailing slash (#4433)
  [recipe cookbook] test recipe (#4431)
  [recipe cookbook] switching to SHA (#4429)
  [recipe cookbook] Update url build (#4427)
  [Recipe Cookbook] test recipe flow (#4426)
  [Recipe cookbook] Addressing GitHub api format issue (#4424)
  feat: integrate tool call icons with status indicators and daisy chaining (#4279)
* main:
  Align Dynamic Task Interface with Recipe Interface (#4311)
  docs: copilot auth and mcp-ui links (#4497)
  docs: July and August 2025 Community All-Stars Update (#4501)
  remove clicking outside to close recipe warning (#4502)
  lower min width to 450 for small screens
  Convert recipe create and import forms to use tanstack form and zod schema validation (#4499)
  Repo CI: use a writable location for Goose home directory (#4500)
  feat: Add functionality to delete session in history list view (#4480)
  fix: recipe deeplink "+" characters and folder change (#4471)
  Add session to agents (#4216)
  fix: need to send errors to appropriate stream (#4491)
  Add Docker support for Goose in CI/CD pipelines (#4434)
  Add visual indicator while recipe loads (#4447)
  Disable chat input while extensions load (#4417)
  chore(release): release version 1.7.0 (#4391)
  fix double filtering (#4409)
  Rewrite the developer mcp using the rmcp sdk (#4297)
  docs: sessions reorg and conversation features (#4462)
Collaborator

@jamadeo jamadeo left a comment

Let's do it. I like it as an experiment and we definitely need the model-switching infrastructure. I think we'll change our minds on the switching logic and eventually have preferred models that fulfill each role.

@michaelneale michaelneale merged commit 439d293 into main Sep 4, 2025
10 checks passed
@michaelneale michaelneale deleted the micn/multi-model-multi-provider-autopilot branch September 4, 2025 01:24
katzdave added a commit that referenced this pull request Sep 4, 2025
* 'main' of github.com:block/goose:
  Fix databricks streaming errors  (#4506)
  docs: malware check for uvx and npx extensions (#4508)
  fix: auto-compact on context limit error (#3635)
  feat: multi model and multi provider config and auto switching (#4035)
This was referenced Sep 9, 2025
thebristolsound pushed a commit to thebristolsound/goose that referenced this pull request Sep 11, 2025
…#4035)

Signed-off-by: Matt Donovan <mattddonovan@protonmail.com>
HikaruEgashira pushed a commit to HikaruEgashira/goose that referenced this pull request Oct 3, 2025
…#4035)

Signed-off-by: HikaruEgashira <hikaru-egashira@c-fo.com>