From f91d408fe7b9454732738646aa70c87a7943183a Mon Sep 17 00:00:00 2001
From: Diane Diaz
Date: Fri, 11 Jul 2025 15:17:42 -0700
Subject: [PATCH 1/3] model context limit overrides

---
 .../docs/guides/environment-variables.md      |  4 +-
 .../docs/guides/smart-context-management.md   | 58 +++++++++++++++++++
 2 files changed, 61 insertions(+), 1 deletion(-)

diff --git a/documentation/docs/guides/environment-variables.md b/documentation/docs/guides/environment-variables.md
index 486de5118399..e959b799793b 100644
--- a/documentation/docs/guides/environment-variables.md
+++ b/documentation/docs/guides/environment-variables.md
@@ -127,7 +127,7 @@ export GOOSE_MAX_TURNS=25
 export GOOSE_MAX_TURNS=100
 ```

-### Context Limit Configuration
+### Model Context Limit Overrides

 These variables allow you to override the default context window size (token limit) for your models. This is particularly useful when using [LiteLLM proxies](https://docs.litellm.ai/docs/providers/litellm_proxy) or custom models that don't match Goose's predefined model patterns.

@@ -152,6 +152,8 @@ export GOOSE_WORKER_CONTEXT_LIMIT=128000 # Smaller context for execution
 export GOOSE_PLANNER_CONTEXT_LIMIT=1000000
 ```

+For more details and examples, see [Model Context Limit Overrides](/docs/guides/smart-context-management#model-context-limit-overrides).
+
 ## Tool Configuration

 These variables control how Goose handles [tool permissions](/docs/guides/managing-tools/tool-permissions) and their execution.
diff --git a/documentation/docs/guides/smart-context-management.md b/documentation/docs/guides/smart-context-management.md
index 2f167cf6dbae..a753fbdfd503 100644
--- a/documentation/docs/guides/smart-context-management.md
+++ b/documentation/docs/guides/smart-context-management.md
@@ -137,6 +137,64 @@ Key information has been preserved while reducing context length.
+## Model Context Limit Overrides
+
+Context limits are automatically detected based on your model name, but Goose provides settings to override the default limits:
+
+| Model | Description | Best For | Setting |
+|-------|-------------|----------|---------|
+| **Main** | Set context limit for the main model (also serves as fallback for other models) | LiteLLM proxies, custom models with non-standard names | `GOOSE_CONTEXT_LIMIT` |
+| **Lead** | Set larger context for planning in [lead/worker mode](/docs/tutorials/lead-worker) | Complex planning tasks requiring more context | `GOOSE_LEAD_CONTEXT_LIMIT` |
+| **Worker** | Set smaller context for execution in lead/worker mode | Cost optimization during execution phase | `GOOSE_WORKER_CONTEXT_LIMIT` |
+| **Planner** | Set context for [planner models](/docs/guides/creating-plans) | Large planning tasks requiring extensive context | `GOOSE_PLANNER_CONTEXT_LIMIT` |
+
+This feature is particularly useful with:
+
+- **LiteLLM Proxy Models**: When using LiteLLM with custom model names that don't match Goose's patterns
+- **Enterprise Deployments**: Custom model deployments with non-standard naming
+- **Fine-tuned Models**: Custom models with different context limits than their base versions
+- **Development/Testing**: Temporarily adjusting context limits for testing purposes
+
+Goose resolves context limits with the following precedence (highest to lowest):
+
+1. Explicit context_limit in model configuration (if set programmatically)
+2. Specific environment variable (e.g., `GOOSE_LEAD_CONTEXT_LIMIT`)
+3. Global environment variable (`GOOSE_CONTEXT_LIMIT`)
+4. Model-specific default based on name pattern matching
+5. Global default (128,000 tokens)
+
+Session [environment variables](/docs/guides/environment-variables#model-context-limit-overrides) take precedence over the corresponding key in the [configuration file](/docs/guides/config-file).
+
+:::info
+These settings cannot be configured through the Desktop app or using `goose configure`.
+:::
+
+**Scenario 1: LiteLLM proxy with custom model name**
+
+```bash
+# LiteLLM proxy with custom model name
+export GOOSE_PROVIDER="openai"
+export GOOSE_MODEL="my-custom-gpt4-proxy"
+export GOOSE_CONTEXT_LIMIT=200000 # Override the 32k default
+```
+
+**Scenario 2: Lead/worker setup with different context limits**
+
+```bash
+# Different context limits for planning vs execution
+export GOOSE_LEAD_MODEL="claude-opus-custom"
+export GOOSE_LEAD_CONTEXT_LIMIT=500000 # Large context for planning
+export GOOSE_WORKER_CONTEXT_LIMIT=128000 # Smaller context for execution
+```
+
+**Scenario 3: Planner with large context**
+
+```bash
+# Large context for complex planning
+export GOOSE_PLANNER_MODEL="gpt-4-custom"
+export GOOSE_PLANNER_CONTEXT_LIMIT=1000000
+```
+
 ## Maximum Turns

 The `Max Turns` limit is the maximum number of consecutive turns that Goose can take without user input (default: 1000). When the limit is reached, Goose stops and prompts: "I've reached the maximum number of actions I can do without user input. Would you like me to continue?" If the user answers in the affirmative, Goose continues until the limit is reached and then prompts again.

From 7653c367b509092c3fa72744fc43827baff5de9e Mon Sep 17 00:00:00 2001
From: Diane Diaz
Date: Mon, 14 Jul 2025 09:08:18 -0700
Subject: [PATCH 2/3] env vars only

---
 .../docs/guides/smart-context-management.md   | 36 +++++++++++++++----
 1 file changed, 29 insertions(+), 7 deletions(-)

diff --git a/documentation/docs/guides/smart-context-management.md b/documentation/docs/guides/smart-context-management.md
index a753fbdfd503..aba2823cab81 100644
--- a/documentation/docs/guides/smart-context-management.md
+++ b/documentation/docs/guides/smart-context-management.md
@@ -148,6 +148,10 @@ Context limits are automatically detected based on your model name, but Goose pr
 | **Worker** | Set smaller context for execution in lead/worker mode | Cost optimization during execution phase | `GOOSE_WORKER_CONTEXT_LIMIT` |
 | **Planner** | Set context for [planner models](/docs/guides/creating-plans) | Large planning tasks requiring extensive context | `GOOSE_PLANNER_CONTEXT_LIMIT` |

+:::info
+This setting only affects the displayed token usage and progress indicators. Actual context management is handled by your LLM, so you may experience more or less usage than the limit you set, regardless of what the display shows.
+:::
+
 This feature is particularly useful with:

 - **LiteLLM Proxy Models**: When using LiteLLM with custom model names that don't match Goose's patterns
@@ -163,13 +167,31 @@ Goose resolves context limits with the following precedence (highest to lowest):
 4. Model-specific default based on name pattern matching
 5. Global default (128,000 tokens)

-Session [environment variables](/docs/guides/environment-variables#model-context-limit-overrides) take precedence over the corresponding key in the [configuration file](/docs/guides/config-file).
+**Configuration**

-:::info
-These settings cannot be configured through the Desktop app or using `goose configure`.
-:::

-**Scenario 1: LiteLLM proxy with custom model name**
+
+
+
+ Model context limit overrides are not yet available in the Goose Desktop app.
+
+
+
+
+ Context limit overrides only work as [environment variables](/docs/guides/environment-variables#model-context-limit-overrides), not in the config file.
+
+ ```bash
+ export GOOSE_CONTEXT_LIMIT=1000
+ goose session
+ ```
+
+
+
+
+
+**Scenarios**
+
+1. LiteLLM proxy with custom model name

 ```bash
 # LiteLLM proxy with custom model name
@@ -178,7 +200,7 @@ export GOOSE_MODEL="my-custom-gpt4-proxy"
 export GOOSE_CONTEXT_LIMIT=200000 # Override the 32k default
 ```

-**Scenario 2: Lead/worker setup with different context limits**
+2. Lead/worker setup with different context limits

 ```bash
 # Different context limits for planning vs execution
@@ -187,7 +209,7 @@ export GOOSE_LEAD_CONTEXT_LIMIT=500000 # Large context for planning
 export GOOSE_WORKER_CONTEXT_LIMIT=128000 # Smaller context for execution
 ```

-**Scenario 3: Planner with large context**
+3. Planner with large context

 ```bash
 # Large context for complex planning

From f65fbdd71c0c32fe0a5e60cf678ce557b12d23ec Mon Sep 17 00:00:00 2001
From: Diane Diaz
Date: Mon, 14 Jul 2025 09:44:32 -0700
Subject: [PATCH 3/3] relocate section

---
 .../docs/guides/smart-context-management.md   | 159 +++++++++---------
 1 file changed, 79 insertions(+), 80 deletions(-)

diff --git a/documentation/docs/guides/smart-context-management.md b/documentation/docs/guides/smart-context-management.md
index aba2823cab81..ae332f3b0b76 100644
--- a/documentation/docs/guides/smart-context-management.md
+++ b/documentation/docs/guides/smart-context-management.md
@@ -137,86 +137,6 @@ Key information has been preserved while reducing context length.
-## Model Context Limit Overrides
-
-Context limits are automatically detected based on your model name, but Goose provides settings to override the default limits:
-
-| Model | Description | Best For | Setting |
-|-------|-------------|----------|---------|
-| **Main** | Set context limit for the main model (also serves as fallback for other models) | LiteLLM proxies, custom models with non-standard names | `GOOSE_CONTEXT_LIMIT` |
-| **Lead** | Set larger context for planning in [lead/worker mode](/docs/tutorials/lead-worker) | Complex planning tasks requiring more context | `GOOSE_LEAD_CONTEXT_LIMIT` |
-| **Worker** | Set smaller context for execution in lead/worker mode | Cost optimization during execution phase | `GOOSE_WORKER_CONTEXT_LIMIT` |
-| **Planner** | Set context for [planner models](/docs/guides/creating-plans) | Large planning tasks requiring extensive context | `GOOSE_PLANNER_CONTEXT_LIMIT` |
-
-:::info
-This setting only affects the displayed token usage and progress indicators. Actual context management is handled by your LLM, so you may experience more or less usage than the limit you set, regardless of what the display shows.
-:::
-
-This feature is particularly useful with:
-
-- **LiteLLM Proxy Models**: When using LiteLLM with custom model names that don't match Goose's patterns
-- **Enterprise Deployments**: Custom model deployments with non-standard naming
-- **Fine-tuned Models**: Custom models with different context limits than their base versions
-- **Development/Testing**: Temporarily adjusting context limits for testing purposes
-
-Goose resolves context limits with the following precedence (highest to lowest):
-
-1. Explicit context_limit in model configuration (if set programmatically)
-2. Specific environment variable (e.g., `GOOSE_LEAD_CONTEXT_LIMIT`)
-3. Global environment variable (`GOOSE_CONTEXT_LIMIT`)
-4. Model-specific default based on name pattern matching
-5. Global default (128,000 tokens)
-
-**Configuration**
-
-
-
-
- Model context limit overrides are not yet available in the Goose Desktop app.
-
-
-
-
- Context limit overrides only work as [environment variables](/docs/guides/environment-variables#model-context-limit-overrides), not in the config file.
-
- ```bash
- export GOOSE_CONTEXT_LIMIT=1000
- goose session
- ```
-
-
-
-
-
-**Scenarios**
-
-1. LiteLLM proxy with custom model name
-
-```bash
-# LiteLLM proxy with custom model name
-export GOOSE_PROVIDER="openai"
-export GOOSE_MODEL="my-custom-gpt4-proxy"
-export GOOSE_CONTEXT_LIMIT=200000 # Override the 32k default
-```
-
-2. Lead/worker setup with different context limits
-
-```bash
-# Different context limits for planning vs execution
-export GOOSE_LEAD_MODEL="claude-opus-custom"
-export GOOSE_LEAD_CONTEXT_LIMIT=500000 # Large context for planning
-export GOOSE_WORKER_CONTEXT_LIMIT=128000 # Smaller context for execution
-```
-
-3. Planner with large context
-
-```bash
-# Large context for complex planning
-export GOOSE_PLANNER_MODEL="gpt-4-custom"
-export GOOSE_PLANNER_CONTEXT_LIMIT=1000000
-```
-
 ## Maximum Turns

 The `Max Turns` limit is the maximum number of consecutive turns that Goose can take without user input (default: 1000). When the limit is reached, Goose stops and prompts: "I've reached the maximum number of actions I can do without user input. Would you like me to continue?" If the user answers in the affirmative, Goose continues until the limit is reached and then prompts again.
@@ -344,6 +264,85 @@ After sending your first message, Goose Desktop and Goose CLI display token usag
+## Model Context Limit Overrides
+
+Context limits are automatically detected based on your model name, but Goose provides settings to override the default limits:
+
+| Model | Description | Best For | Setting |
+|-------|-------------|----------|---------|
+| **Main** | Set context limit for the main model (also serves as fallback for other models) | LiteLLM proxies, custom models with non-standard names | `GOOSE_CONTEXT_LIMIT` |
+| **Lead** | Set larger context for planning in [lead/worker mode](/docs/tutorials/lead-worker) | Complex planning tasks requiring more context | `GOOSE_LEAD_CONTEXT_LIMIT` |
+| **Worker** | Set smaller context for execution in lead/worker mode | Cost optimization during execution phase | `GOOSE_WORKER_CONTEXT_LIMIT` |
+| **Planner** | Set context for [planner models](/docs/guides/creating-plans) | Large planning tasks requiring extensive context | `GOOSE_PLANNER_CONTEXT_LIMIT` |
+
+:::info
+This setting only affects the displayed token usage and progress indicators. Actual context management is handled by your LLM, so you may experience more or less usage than the limit you set, regardless of what the display shows.
+:::
+
+This feature is particularly useful with:
+
+- **LiteLLM Proxy Models**: When using LiteLLM with custom model names that don't match Goose's patterns
+- **Enterprise Deployments**: Custom model deployments with non-standard naming
+- **Fine-tuned Models**: Custom models with different context limits than their base versions
+- **Development/Testing**: Temporarily adjusting context limits for testing purposes
+
+Goose resolves context limits with the following precedence (highest to lowest):
+
+1. Explicit context_limit in model configuration (if set programmatically)
+2. Specific environment variable (e.g., `GOOSE_LEAD_CONTEXT_LIMIT`)
+3. Global environment variable (`GOOSE_CONTEXT_LIMIT`)
+4. Model-specific default based on name pattern matching
+5. Global default (128,000 tokens)
+
+**Configuration**
+
+
+
+
+ Model context limit overrides are not yet available in the Goose Desktop app.
+
+
+
+
+ Context limit overrides only work as [environment variables](/docs/guides/environment-variables#model-context-limit-overrides), not in the config file.
+
+ ```bash
+ export GOOSE_CONTEXT_LIMIT=1000
+ goose session
+ ```
+
+
+
+
+
+**Scenarios**
+
+1. LiteLLM proxy with custom model name
+
+```bash
+# LiteLLM proxy with custom model name
+export GOOSE_PROVIDER="openai"
+export GOOSE_MODEL="my-custom-gpt4-proxy"
+export GOOSE_CONTEXT_LIMIT=200000 # Override the 32k default
+```
+
+2. Lead/worker setup with different context limits
+
+```bash
+# Different context limits for planning vs execution
+export GOOSE_LEAD_MODEL="claude-opus-custom"
+export GOOSE_LEAD_CONTEXT_LIMIT=500000 # Large context for planning
+export GOOSE_WORKER_CONTEXT_LIMIT=128000 # Smaller context for execution
+```
+
+3. Planner with large context
+
+```bash
+# Large context for complex planning
+export GOOSE_PLANNER_MODEL="gpt-4-custom"
+export GOOSE_PLANNER_CONTEXT_LIMIT=1000000
+```
+
 ## Cost Tracking

 Display estimated real-time costs of your session at the bottom of the Goose Desktop window.
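
Note on the precedence list in the relocated "Model Context Limit Overrides" section: the five-step order can be read as a simple lookup chain. The Python sketch below is illustrative only and is not Goose's implementation. The function name `resolve_context_limit`, the `ROLE_ENV_VARS` mapping, the `PATTERN_DEFAULTS` table with its single placeholder entry, and the use of substring matching are all assumptions made for this example. Only the environment variable names and the 128,000-token global default come from the documentation in this patch series.

```python
import os
from typing import Mapping, Optional

# Step 5 in the documented precedence list.
GLOBAL_DEFAULT_CONTEXT_LIMIT = 128_000

# Role-specific variables named in the docs; the main model has no
# dedicated variable and uses GOOSE_CONTEXT_LIMIT directly.
ROLE_ENV_VARS = {
    "lead": "GOOSE_LEAD_CONTEXT_LIMIT",
    "worker": "GOOSE_WORKER_CONTEXT_LIMIT",
    "planner": "GOOSE_PLANNER_CONTEXT_LIMIT",
}

# HYPOTHETICAL pattern table. The real name patterns and their limits are
# not listed in the docs, so this entry is a placeholder for illustration.
PATTERN_DEFAULTS = {"example-large": 1_000_000}


def resolve_context_limit(
    model_name: str,
    role: str = "main",
    explicit_limit: Optional[int] = None,
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Return a context limit following the documented precedence order."""
    env = os.environ if env is None else env

    # 1. An explicit limit set programmatically wins over everything else.
    if explicit_limit is not None:
        return explicit_limit

    # 2. Role-specific variable, then 3. the global variable as fallback.
    for var in (ROLE_ENV_VARS.get(role), "GOOSE_CONTEXT_LIMIT"):
        if var is not None and env.get(var):
            return int(env[var])

    # 4. Default inferred from the model name (substring match is assumed).
    for pattern, limit in PATTERN_DEFAULTS.items():
        if pattern in model_name:
            return limit

    # 5. Global default.
    return GLOBAL_DEFAULT_CONTEXT_LIMIT


# Mirrors scenarios 1 and 2 from the section, using an explicit mapping
# instead of the real process environment.
scenario_env = {
    "GOOSE_CONTEXT_LIMIT": "200000",
    "GOOSE_LEAD_CONTEXT_LIMIT": "500000",
}
print(resolve_context_limit("claude-opus-custom", role="lead", env=scenario_env))  # 500000
print(resolve_context_limit("my-custom-gpt4-proxy", env=scenario_env))             # 200000
```

Passing `env` as a plain mapping keeps the sketch testable without touching the process environment. A non-numeric value raises `ValueError` from `int()`; how Goose itself treats invalid values is not stated in the docs. Per the `:::info` note added in patch 2, the resolved value only affects displayed token usage, which this sketch does not model.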