diff --git a/documentation/docs/experimental/index.md b/documentation/docs/experimental/index.md index 451b2cda8ca2..57e189364361 100644 --- a/documentation/docs/experimental/index.md +++ b/documentation/docs/experimental/index.md @@ -39,6 +39,11 @@ The list of experimental features may change as Goose development progresses. So description="An experimental extension enabling Goose to work within VS Code." link="/docs/experimental/vs-code-extension" /> + diff --git a/documentation/docs/getting-started/providers.md b/documentation/docs/getting-started/providers.md index 5b160aba8706..eea499437dc8 100644 --- a/documentation/docs/getting-started/providers.md +++ b/documentation/docs/getting-started/providers.md @@ -747,6 +747,14 @@ To use the Azure Credential Chain: This method simplifies authentication and enhances security for enterprise environments. +## Multi-Model Configuration + +Beyond single-model setups, goose supports [multi-model configurations](/docs/guides/multi-model/) that can use different models and providers for specialized tasks: + +- **AutoPilot** - Intelligent, context-aware switching between specialized models based on conversation content and complexity +- **Lead/Worker Model** - Automatic switching between a lead model for initial turns and a worker model for execution tasks +- **Planning Mode** - Manual planning phase using a dedicated model to create detailed project breakdowns before execution + --- If you have any questions or need help with a specific provider, feel free to reach out to us on [Discord](https://discord.gg/block-opensource) or on the [Goose repo](https://github.com/block/goose). 
diff --git a/documentation/docs/guides/config-file.md b/documentation/docs/guides/config-file.md index b6a8110e571f..0361d386236a 100644 --- a/documentation/docs/guides/config-file.md +++ b/documentation/docs/guides/config-file.md @@ -26,7 +26,7 @@ The following settings can be configured at the root level of your config.yaml f | `GOOSE_MAX_TURNS` | [Maximum number of turns](/docs/guides/sessions/smart-context-management#maximum-turns) allowed without user input | Integer (e.g., 10, 50, 100) | 1000 | No | | `GOOSE_LEAD_PROVIDER` | Provider for lead model in [lead/worker mode](/docs/guides/environment-variables#leadworker-model-configuration) | Same as `GOOSE_PROVIDER` options | Falls back to `GOOSE_PROVIDER` | No | | `GOOSE_LEAD_MODEL` | Lead model for lead/worker mode | Model name | None | No | -| `GOOSE_PLANNER_PROVIDER` | Provider for [planning mode](/docs/guides/creating-plans) | Same as `GOOSE_PROVIDER` options | Falls back to `GOOSE_PROVIDER` | No | +| `GOOSE_PLANNER_PROVIDER` | Provider for [planning mode](/docs/guides/multi-model/creating-plans) | Same as `GOOSE_PROVIDER` options | Falls back to `GOOSE_PROVIDER` | No | | `GOOSE_PLANNER_MODEL` | Model for planning mode | Model name | Falls back to `GOOSE_MODEL` | No | | `GOOSE_TOOLSHIM` | Enable tool interpretation | true/false | false | No | | `GOOSE_TOOLSHIM_OLLAMA_MODEL` | Model for tool interpretation | Model name (e.g., "llama3.2") | System default | No | @@ -37,6 +37,10 @@ The following settings can be configured at the root level of your config.yaml f | `GOOSE_RECIPE_GITHUB_REPO` | GitHub repository for recipes | Format: "org/repo" | None | No | | `GOOSE_AUTO_COMPACT_THRESHOLD` | Set the percentage threshold at which Goose [automatically summarizes your session](/docs/guides/sessions/smart-context-management#automatic-compaction). 
| Float between 0.0 and 1.0 (disabled at 0.0)| 0.8 | No | +:::info Automatic Multi-Model Configuration +The experimental [AutoPilot](/docs/guides/multi-model/autopilot) feature provides intelligent, context-aware model switching. Configure models for different roles using the `x-advanced-models` setting. +::: + ## Experimental Features These settings enable experimental features that are in active development. These may change or be removed in future releases. @@ -137,6 +141,6 @@ This will show all active settings and their current values. ## See Also -- [Environment Variables](./environment-variables.md) - For environment variable configuration -- [Using Extensions](/docs/getting-started/using-extensions.md) - For more details on extension configuration -- [Creating Plans](./creating-plans.md) - For information about planning mode configuration \ No newline at end of file +- **[Multi-Model Configuration](/docs/guides/multi-model/)** - For multiple model-selection strategies +- **[Environment Variables](./environment-variables.md)** - For environment variable configuration +- **[Using Extensions](/docs/getting-started/using-extensions.md)** - For more details on extension configuration \ No newline at end of file diff --git a/documentation/docs/guides/environment-variables.md b/documentation/docs/guides/environment-variables.md index faf32b76e45c..2906b0fbc6bf 100644 --- a/documentation/docs/guides/environment-variables.md +++ b/documentation/docs/guides/environment-variables.md @@ -52,6 +52,10 @@ export GOOSE_PROVIDER__API_KEY="your-api-key-here" These variables configure a [lead/worker model pattern](/docs/tutorials/lead-worker) where a powerful lead model handles initial planning and complex reasoning, then switches to a faster/cheaper worker model for execution. The switch happens automatically based on your settings. 
+:::info Automatic Multi-Model Switching +The experimental [AutoPilot](/docs/guides/multi-model/autopilot) feature provides intelligent, context-aware model switching. Configure models for different roles using the `x-advanced-models` setting. +::: + | Variable | Purpose | Values | Default | |----------|---------|---------|---------| | `GOOSE_LEAD_MODEL` | **Required to enable lead mode.** Name of the lead model | Model name (e.g., "gpt-4o", "claude-sonnet-4-20250514") | None | @@ -84,7 +88,7 @@ export GOOSE_LEAD_FALLBACK_TURNS=2 ### Planning Mode Configuration -These variables control Goose's [planning functionality](/docs/guides/creating-plans). +These variables control Goose's [planning functionality](/docs/guides/multi-model/creating-plans). | Variable | Purpose | Values | Default | |----------|---------|---------|---------| @@ -205,7 +209,7 @@ These variables allow you to override the default context window size (token lim | `GOOSE_CONTEXT_LIMIT` | Override context limit for the main model | Integer (number of tokens) | Model-specific default or 128,000 | | `GOOSE_LEAD_CONTEXT_LIMIT` | Override context limit for the lead model in [lead/worker mode](/docs/tutorials/lead-worker) | Integer (number of tokens) | Falls back to `GOOSE_CONTEXT_LIMIT` or model default | | `GOOSE_WORKER_CONTEXT_LIMIT` | Override context limit for the worker model in lead/worker mode | Integer (number of tokens) | Falls back to `GOOSE_CONTEXT_LIMIT` or model default | -| `GOOSE_PLANNER_CONTEXT_LIMIT` | Override context limit for the [planner model](/docs/guides/creating-plans) | Integer (number of tokens) | Falls back to `GOOSE_CONTEXT_LIMIT` or model default | +| `GOOSE_PLANNER_CONTEXT_LIMIT` | Override context limit for the [planner model](/docs/guides/multi-model/creating-plans) | Integer (number of tokens) | Falls back to `GOOSE_CONTEXT_LIMIT` or model default | **Examples** diff --git a/documentation/docs/guides/multi-model/_category_.json 
b/documentation/docs/guides/multi-model/_category_.json
new file mode 100644
index 000000000000..85bc7259d187
--- /dev/null
+++ b/documentation/docs/guides/multi-model/_category_.json
@@ -0,0 +1,5 @@
+{
+  "label": "Multi-Model Config",
+  "position": 20,
+  "description": "Configure multiple models and providers for model-switching strategies."
+}
diff --git a/documentation/docs/guides/multi-model/autopilot.md b/documentation/docs/guides/multi-model/autopilot.md
new file mode 100644
index 000000000000..888e4645a005
--- /dev/null
+++ b/documentation/docs/guides/multi-model/autopilot.md
@@ -0,0 +1,127 @@
+---
+sidebar_position: 1
+title: Automatic Multi-Model Switching
+sidebar_label: Automatic Model Switching
+---
+
+The AutoPilot feature enables intelligent, context-aware switching between different models. You simply work naturally with goose, and AutoPilot chooses the right model based on conversation content, complexity, tool usage patterns, and other triggers.
+
+:::warning Experimental Feature
+AutoPilot is an experimental feature. Behavior and configuration may change in future releases.
+:::
+
+## How AutoPilot Works
+
+After you configure which models to use for different roles, AutoPilot handles the rest. During your sessions, it automatically switches to the most appropriate model for your current task, whether you need specialized coding help, complex reasoning, or just want a second opinion.
+
+**For example:**
+- When you ask to "debug this error," AutoPilot switches to a model optimized for debugging
+- When you request "analyze the performance implications," it switches to a model better suited for complex reasoning
+- When you're doing repetitive coding tasks, it uses a cost-effective model, but escalates to a more powerful one when it encounters failures
+
+Switching happens automatically based on:
+- The terminology used in your requests ("debug", "analyze", "implement")
+- How complex the task appears to be
+- Whether previous attempts have failed and need a different approach
+- How much autonomous work has been happening without your input
+
+When AutoPilot switches to a specialized model, it stays with that model for a configured number of turns before evaluating whether to switch back to the base model or to a different specialized model based on the new context.
+
+:::info
+You can use `goose session --debug` in goose CLI to see when AutoPilot switches models. Note that each switch applies the provider's rate limits and pricing.
+:::
+
+## Configuration
+
+Add the `x-advanced-models` section to your [`config.yaml`](/docs/guides/config-file) file and map your model preferences to [predefined](#predefined-roles) or custom roles.
+
+The `provider`, `model`, and `role` parameters are required.
+
+```yaml
+# Base provider and model (always available)
+GOOSE_PROVIDER: "anthropic"
+GOOSE_MODEL: "claude-sonnet-4-20250514"
+
+# AutoPilot models
+x-advanced-models:
+- provider: openai
+  model: o1-preview
+  role: deep-thinker
+- provider: openai
+  model: gpt-4o
+  role: debugger
+- provider: anthropic
+  model: claude-opus-4-20250805
+  role: reviewer
+```
+
+**Migrate From Lead/Worker Model**
+
+This example shows how you can reproduce [lead model](/docs/tutorials/lead-worker) behavior using `x-advanced-models`.
+
+```yaml
+# Before: Defined lead model using environment variables
+# GOOSE_LEAD_PROVIDER=openai
+# GOOSE_LEAD_MODEL=o1-preview
+
+# After: AutoPilot equivalent
+GOOSE_PROVIDER: "anthropic"
+GOOSE_MODEL: "claude-sonnet-4-20250514" # Base is used as the worker model
+
+x-advanced-models:
+- provider: openai
+  model: o1-preview
+  role: lead # Use the predefined lead role (or define a custom role)
+```
+
+### Predefined Roles
+
+AutoPilot includes a set of predefined roles defined in [`premade_roles.yaml`](https://github.com/block/goose/blob/main/crates/goose/src/agents/model_selector/premade_roles.yaml) that goose is aware of by default. Examples include:
+
+- **deep-thinker**: Activates for complex reasoning tasks
+- **debugger**: Switches in for error resolution
+- **reviewer**: Monitors after extensive tool usage
+- **coder**: Handles code implementation tasks
+- **mathematician**: Processes mathematical computations
+
+### Custom Roles
+
+You can create custom roles with specific triggers by defining them in your `config.yaml` file:
+
+```yaml
+x-advanced-models:
+- provider: openai
+  model: gpt-4o
+  role: custom-debugger
+  rules:
+    triggers:
+      keywords: ["bug", "broken", "failing", "crash"]
+      consecutive_failures: 1
+    active_turns: 5
+    priority: 15
+```
+
+Custom Role Configuration Fields
+
+**Rule Configuration:**
+| Parameter | Description | Values |
+|-----------|-------------|---------|
+| `triggers` | Conditions that activate the role | Object (see parameters below) |
+| `active_turns` | Number of turns the rule stays active once triggered | Integer (default: 5) |
+| `priority` | Selection priority when multiple roles match | Integer (higher wins, default: 0) |
+
+**Trigger Parameters:**
+
+| Parameter | Description | Values |
+|-----------|-------------|---------|
+| `keywords` | Words that activate the role | Array of strings |
+| `match_type` | How to match keywords | "any", "all" |
+| `complexity_threshold` | Minimum complexity level | "low", "medium", "high" |
+| `consecutive_failures` | Failures in sequence | Integer |
+| `first_turn` | Trigger on conversation start | Boolean |
+| `source` | Message source filter | "human", "machine", "any" |
+
+The previous table includes several common rule trigger parameters. For the complete list, see the `TriggerRules` struct in [`autopilot.rs`](https://github.com/block/goose/blob/main/crates/goose/src/agents/model_selector/autopilot.rs).
+
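To make the interaction between `keywords`, `match_type`, and `priority` in the custom-role tables above concrete, here is a minimal Python sketch of how a keyword-triggered role might be chosen when several roles match. It is an illustration under stated assumptions, not goose's implementation (that lives in `autopilot.rs`): the `Role` and `Triggers` classes, the `select_role` helper, and the case-insensitive substring matching are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Triggers:
    # Hypothetical mirror of the documented trigger fields (keywords, match_type only).
    keywords: list[str] = field(default_factory=list)
    match_type: str = "any"  # "any" or "all", as listed in the table


@dataclass
class Role:
    # Hypothetical mirror of a role entry under x-advanced-models.
    name: str
    triggers: Triggers
    priority: int = 0       # documented default: 0, higher wins
    active_turns: int = 5   # documented default: 5 (not used in this sketch)


def matches(role: Role, message: str) -> bool:
    """Assumption: keywords are matched as case-insensitive substrings."""
    text = message.lower()
    hits = [kw.lower() in text for kw in role.triggers.keywords]
    if not hits:
        return False  # a role without keywords never matches in this sketch
    return all(hits) if role.triggers.match_type == "all" else any(hits)


def select_role(roles: list[Role], message: str) -> Optional[Role]:
    """Return the matching role with the highest priority, or None."""
    candidates = [r for r in roles if matches(r, message)]
    return max(candidates, key=lambda r: r.priority, default=None)


roles = [
    Role("debugger", Triggers(["debug", "error"]), priority=10),
    Role("custom-debugger", Triggers(["bug", "broken", "failing", "crash"]), priority=15),
    Role("strict-analyst", Triggers(["analyze", "performance"], match_type="all"), priority=5),
]
```

With these hypothetical roles, a message that matches both `debugger` and `custom-debugger` resolves to `custom-debugger` because its priority is higher, and a message that matches nothing returns `None`, which corresponds to staying on the base model.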
diff --git a/documentation/docs/guides/creating-plans.md b/documentation/docs/guides/multi-model/creating-plans.md similarity index 97% rename from documentation/docs/guides/creating-plans.md rename to documentation/docs/guides/multi-model/creating-plans.md index acc095e04551..bd5f37acac80 100644 --- a/documentation/docs/guides/creating-plans.md +++ b/documentation/docs/guides/multi-model/creating-plans.md @@ -24,7 +24,7 @@ The Goose Desktop doesn't have a `plan` keyword. If you want Goose Desktop to cr Unless you ask Goose to "create a plan", it might just start into the project work. ::: -The Goose CLI's plan mode is interactive, asking clarifying questions to understand your project before creating a plan. If you can provide thoughtful and informative answers to those questions, Goose can generate a really useul and actionable plan. +The Goose CLI's plan mode is interactive, asking clarifying questions to understand your project before creating a plan. If you can provide thoughtful and informative answers to those questions, Goose can generate a really useful and actionable plan. ## Set your planner provider and model In some workflows, it can be helpful to use one LLM for planning and a different one for execution. For example, GPT-4.1 tends to excel at strategic planning and breaking down complex tasks into clear, logical steps. On the other hand, Claude Sonnet 3.5 is particularly strong at writing clean, efficient code and following instructions precisely. By using GPT-4.1 to plan and Claude to execute, you can play to the strengths of both models and get better results overall. @@ -34,8 +34,10 @@ The Goose CLI plan mode uses two configuration values: - `GOOSE_PLANNER_PROVIDER`: Which provider to use for planning - `GOOSE_PLANNER_MODEL`: Which model to use for planning -:::tip Use Lead/Worker Mode For Automatic Model Switching -[Lead/worker mode](/docs/guides/environment-variables#leadworker-model-configuration) is an alternative to plan mode. 
It allows you to configure a powerful lead model for initial planning and reasoning before automatically switching to a faster and/or cheaper worker model for execution. Both modes help you achieve optimal results by balancing model capabilities with cost and speed. +:::tip Multi-Model Alternatives to Plan Mode +goose also supports two options for automatic model switching that help balance model capabilities with cost and speed: +- **[Lead/Worker mode](/docs/guides/environment-variables#leadworker-model-configuration)**: Turn-based switching between two models +- **[AutoPilot](/docs/guides/multi-model/autopilot)**: Context-aware switching between multiple models ::: ### Set Goose planner environment variables diff --git a/documentation/docs/guides/multi-model/index.mdx b/documentation/docs/guides/multi-model/index.mdx new file mode 100644 index 000000000000..fe60d17c2486 --- /dev/null +++ b/documentation/docs/guides/multi-model/index.mdx @@ -0,0 +1,58 @@ +--- +title: Multi-Model Configuration +hide_title: true +description: Approaches for configuring model-switching behavior to optimize for cost, performance, and results. +--- + +import Card from '@site/src/components/Card'; +import styles from '@site/src/components/Card/styles.module.css'; +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +

+Multi-Model Configuration
+
+goose supports several approaches for using different models within a single session, allowing you to optimize for cost, performance, and task specialization. Strategies range from manual or turn-based model selection to dynamic, context-aware switching.
+
+📚 Documentation & Guides
+
+📝 Featured Blog Posts
+ diff --git a/documentation/docs/guides/sessions/smart-context-management.md b/documentation/docs/guides/sessions/smart-context-management.md index df8ddc44184d..e87f8232e7ad 100644 --- a/documentation/docs/guides/sessions/smart-context-management.md +++ b/documentation/docs/guides/sessions/smart-context-management.md @@ -287,7 +287,7 @@ Context limits are automatically detected based on your model name, but Goose pr | **Main** | Set context limit for the main model (also serves as fallback for other models) | LiteLLM proxies, custom models with non-standard names | `GOOSE_CONTEXT_LIMIT` | | **Lead** | Set larger context for planning in [lead/worker mode](/docs/tutorials/lead-worker) | Complex planning tasks requiring more context | `GOOSE_LEAD_CONTEXT_LIMIT` | | **Worker** | Set smaller context for execution in lead/worker mode | Cost optimization during execution phase | `GOOSE_WORKER_CONTEXT_LIMIT` | -| **Planner** | Set context for [planner models](/docs/guides/creating-plans) | Large planning tasks requiring extensive context | `GOOSE_PLANNER_CONTEXT_LIMIT` | +| **Planner** | Set context for [planner models](/docs/guides/multi-model/creating-plans) | Large planning tasks requiring extensive context | `GOOSE_PLANNER_CONTEXT_LIMIT` | :::info This setting only affects the displayed token usage and progress indicators. Actual context management is handled by your LLM, so you may experience more or less usage than the limit you set, regardless of what the display shows. diff --git a/documentation/docs/tutorials/lead-worker.md b/documentation/docs/tutorials/lead-worker.md index 0c89be1d7ae2..310084e8c1c3 100644 --- a/documentation/docs/tutorials/lead-worker.md +++ b/documentation/docs/tutorials/lead-worker.md @@ -13,6 +13,10 @@ The lead/worker model is a smart hand-off system. The "lead" model (think: GPT-4 If things go sideways (e.g. the worker model gets confused or keeps making mistakes), Goose notices and automatically pulls the lead model back in to recover. 
Once things are back on track, the worker takes over again. +:::tip Consider AutoPilot for Advanced Model Switching +[AutoPilot](/docs/guides/multi-model/autopilot) supports turn-based switching and also offers intelligent context-aware switching between multiple models. +::: + ## Turn-Based System A **turn** is one full interaction - your prompt and the model's response. Goose switches models based on turns: @@ -101,7 +105,7 @@ export RUST_LOG=goose::providers::lead_worker=info ``` ## Planning Mode Compatibility -The lead/worker model is an automatic alternative to the [Goose CLI's `/plan` command](/docs/guides/creating-plans.md). You can assign separate models to use as the lead/worker and planning models. For example: +The lead/worker model is an automatic alternative to the [Goose CLI's `/plan` command](/docs/guides/multi-model/creating-plans). You can assign separate models to use as the lead/worker and planning models. For example: ```bash export GOOSE_PROVIDER="openai" diff --git a/documentation/docs/tutorials/plan-feature-devcontainer-setup.md b/documentation/docs/tutorials/plan-feature-devcontainer-setup.md index 7ef8a2423c76..c3f59dbfb6ca 100644 --- a/documentation/docs/tutorials/plan-feature-devcontainer-setup.md +++ b/documentation/docs/tutorials/plan-feature-devcontainer-setup.md @@ -10,7 +10,7 @@ description: "Learn how to use Goose's Plan feature to break down complex tasks Using Goose for large, complex tasks can feel overwhelming, especially when you're unsure of exactly how you want to approach it in advance. I experienced this when I needed to set up a complex development environment for an [API course](https://github.com/LinkedInLearning/java-automated-api-testing-with-rest-assured-5989068) I published. Between Docker configurations, database initialization, devcontainer setup, and GitHub Codespaces integration, there are dozens of moving pieces that need to work together perfectly. 
One missing configuration or incorrect dependency can derail the entire process. -This tutorial shows you how to use Goose's [Plan feature](/docs/guides/creating-plans) to transform a complex devcontainer setup into a systematic, executable roadmap. You'll learn how to brainstorm with Goose, refine your requirements, and let Goose create both a detailed plan and implementation checklist. +This tutorial shows you how to use Goose's [Plan feature](/docs/guides/multi-model/creating-plans) to transform a complex devcontainer setup into a systematic, executable roadmap. You'll learn how to brainstorm with Goose, refine your requirements, and let Goose create both a detailed plan and implementation checklist. ## What You'll Learn diff --git a/documentation/docusaurus.config.ts b/documentation/docusaurus.config.ts index c5a739d47fe2..c5aea1d7be58 100644 --- a/documentation/docusaurus.config.ts +++ b/documentation/docusaurus.config.ts @@ -150,6 +150,10 @@ const config: Config = { from: '/docs/guides/goose-in-docker', to: '/docs/tutorials/goose-in-docker' }, + { + from: '/docs/guides/creating-plans', + to: '/docs/guides/multi-model/creating-plans' + }, // MCP tutorial redirects - moved from /docs/tutorials/ to /docs/mcp/ { from: '/docs/tutorials/agentql-mcp',