Skip to content

Conversation

@katzdave
Copy link
Collaborator

@katzdave katzdave commented Nov 18, 2025

#5255

Track compaction attempts and give up if we get stuck in a doom loop (2 tries)

Copilot AI review requested due to automatic review settings November 18, 2025 20:26
@katzdave katzdave marked this pull request as draft November 18, 2025 20:26
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes an infinite compaction loop that can occur when the context limit is exceeded for small models or with large inputs. The fix introduces a counter to track consecutive compaction attempts and stops after one failed retry with a helpful error message.

  • Adds consecutive_compactions counter to prevent infinite loops when compaction doesn't resolve context limit errors
  • Improves error message clarity in the compaction module
  • Updates error handling to break instead of continuing when compaction fails

Reviewed Changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.

File Description
ui/desktop/package-lock.json Auto-generated npm dependency resolution changes (incidental, not related to fix)
crates/goose/src/context_mgmt/mod.rs Improves error message when compaction fails after removing all tool responses
crates/goose/src/agents/agent.rs Implements consecutive compaction counter and doom loop prevention logic
Files not reviewed (1)
  • ui/desktop/package-lock.json: Language not supported

conversation = compacted_conversation;
did_recovery_compact_this_iteration = true;
yield AgentEvent::HistoryReplaced(conversation.clone());
continue;
Copy link

Copilot AI Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The continue here continues the inner while let Some(next) = stream.next().await loop, but the stream was created before compaction (line 955). After successful compaction, you need to break from the inner loop so the outer loop (line 912) creates a new stream with the compacted conversation. Using continue here will just fetch the next item from the old stream, which won't use the compacted conversation.

Suggested change
continue;
break;

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

* 'main' of github.com:block/goose:
  blog: fixing img url (#5895)
  blog: MCPs for Developers (#5884)
  docs: Extension Manager MCP (#5883)
  Update cleanup marker logic for Fedora users. (#5868)
  Improve AWS credential loading and configuration handling in BedrockProvider  (#5699)
@katzdave katzdave marked this pull request as ready for review December 3, 2025 17:00
Copilot AI review requested due to automatic review settings December 3, 2025 17:00
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

format!("Ran into this error trying to compact: {e}.\n\nPlease retry if you think this is a transient or recoverable error.")
)
);
error!("Compaction failed: {}", e);
Copy link

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When compaction fails, the user no longer receives any feedback message. The old code yielded an AgentEvent::Message explaining the error and suggesting retry. Consider adding back a user-facing error message to inform them why the conversation stopped.

Suggested change
error!("Compaction failed: {}", e);
error!("Compaction failed: {}", e);
yield AgentEvent::Message(
Message::assistant().with_system_notification(
SystemNotificationType::InlineMessage,
"Unable to continue: Compaction failed due to an internal error. Try using a shorter message, a model with a larger context window, or start a new session."
)
);

Copilot uses AI. Check for mistakes.
Copy link
Collaborator

@alexhancock alexhancock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a great thing to catch and handle!

Err(ProviderError::ContextLengthExceeded(_error_msg)) => {
compaction_attempts += 1;

if compaction_attempts >= 2 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could just check if the message length is a certain value? Conversation only contains a single user message and compaction failed?

This would be durable to other failure modes on the compaction request not working (like the provider connection is bad, or something else goes wrong)

What do you think?

Copy link
Collaborator Author

@katzdave katzdave Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most common situation this occurs is when pasting something gigantic, so I don't think the single message case is all that general.

That triggers out of context error -> triggers compaction -> loop.

Agree on the sentiment of seeing if the message length is a certain size, but a bit messy in practice and would need to both estimate token counts + have a good understanding of context window size and system prompt size.

I think will go with as is for now, but might be something there using the data from canonical models.

Copilot AI review requested due to automatic review settings December 16, 2025 19:20
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

format!("Ran into this error trying to compact: {e}.\n\nPlease retry if you think this is a transient or recoverable error.")
)
);
error!("Compaction failed: {}", e);
Copy link

Copilot AI Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When compaction fails, no user-facing message is shown. The old code yielded an AgentEvent::Message explaining the error. Now users will see the conversation end without explanation, which is confusing. Consider adding back a user notification before the break.

Suggested change
error!("Compaction failed: {}", e);
error!("Compaction failed: {}", e);
yield AgentEvent::Message(
Message::assistant().with_system_notification(
SystemNotificationType::InlineMessage,
"Unable to continue: Compaction failed due to context limit. Try using a shorter message, a model with a larger context window, or start a new session."
)
);

Copilot uses AI. Check for mistakes.
@katzdave katzdave merged commit bd00ca2 into main Dec 16, 2025
24 checks passed
@katzdave katzdave deleted the dkatz/compation-doom branch December 16, 2025 19:51
zanesq added a commit that referenced this pull request Dec 16, 2025
…s-predefined-models

* 'main' of github.com:block/goose: (81 commits)
  fix: display shell output as static text instead of spinner (#6041)
  fix : Custom providers with empty API keys show as configured in desktop (#6105)
  Add .agents/skills and ~/.config/agent/skills to skills discovery paths (#6139)
  fix: use instructions for system prompt and prompt for user message in subagents (#6121)
  Fix compaction loop for small models or large input (#5803)
  feat: Centralize theme management with ThemeContext (#6137)
  OpenRouter & Xai streaming (#5873)
  fix: resolve mcp-hermit cleanup path expansion issue (#5953)
  feat: add goose PR reviewer workflow (#6124)
  perf: Avoid repeated MCP queries during streaming responses (#6138)
  Fix YAML serialization for recipes with special characters (#5796)
  Add more posthog analytics (privacy aware) (#6122)
  docs: add Sugar MCP server to extensions registry (#6077)
  Fix tokenState loading on new sessions (#6129)
  bump bedrock dep versions (#6090)
  Don't persist ephemeral extensions when resuming sessions (#5974)
  chore(deps): bump mdast-util-to-hast from 13.2.0 to 13.2.1 in /ui/desktop (#5939)
  chore(deps): bump node-forge from 1.3.1 to 1.3.2 in /documentation (#5898)
  Add Scorecard supply-chain security workflow (#5810)
  Don't show subagent tool when we're a subagent (#6125)
  ...

# Conflicts:
#	crates/goose/src/providers/formats/databricks.rs
aharvard added a commit that referenced this pull request Dec 17, 2025
* main:
  fix: we don't need to warn about tool count when in code mode (#6149)
  deps: upgrade agent-client-protocol to 0.9.0 (#6109)
  fix(providers): fix for gemini-cli on windows to work around cmd's multiline prompt limitations #5911 (#5966)
  More slash commands (#5858)
  fix: MCP UI not rendering due to CallToolResult structure change (#6143)
  fix: display shell output as static text instead of spinner (#6041)
  fix : Custom providers with empty API keys show as configured in desktop (#6105)
  Add .agents/skills and ~/.config/agent/skills to skills discovery paths (#6139)
  fix: use instructions for system prompt and prompt for user message in subagents (#6121)
  Fix compaction loop for small models or large input (#5803)
  feat: Centralize theme management with ThemeContext (#6137)
  OpenRouter & Xai streaming (#5873)
  fix: resolve mcp-hermit cleanup path expansion issue (#5953)
  feat: add goose PR reviewer workflow (#6124)
  perf: Avoid repeated MCP queries during streaming responses (#6138)
  Fix YAML serialization for recipes with special characters (#5796)
  Add more posthog analytics (privacy aware) (#6122)
  docs: add Sugar MCP server to extensions registry (#6077)
dianed-square added a commit that referenced this pull request Dec 17, 2025
* origin/main: (57 commits)
  docs: create/edit recipe button (#6145)
  fix(google): Fix 400 Bad Request error with Gemini 3 thought signatures (#6035)
  fix: we don't need to warn about tool count when in code mode (#6149)
  deps: upgrade agent-client-protocol to 0.9.0 (#6109)
  fix(providers): fix for gemini-cli on windows to work around cmd's multiline prompt limitations #5911 (#5966)
  More slash commands (#5858)
  fix: MCP UI not rendering due to CallToolResult structure change (#6143)
  fix: display shell output as static text instead of spinner (#6041)
  fix : Custom providers with empty API keys show as configured in desktop (#6105)
  Add .agents/skills and ~/.config/agent/skills to skills discovery paths (#6139)
  fix: use instructions for system prompt and prompt for user message in subagents (#6121)
  Fix compaction loop for small models or large input (#5803)
  feat: Centralize theme management with ThemeContext (#6137)
  OpenRouter & Xai streaming (#5873)
  fix: resolve mcp-hermit cleanup path expansion issue (#5953)
  feat: add goose PR reviewer workflow (#6124)
  perf: Avoid repeated MCP queries during streaming responses (#6138)
  Fix YAML serialization for recipes with special characters (#5796)
  Add more posthog analytics (privacy aware) (#6122)
  docs: add Sugar MCP server to extensions registry (#6077)
  ...
katzdave added a commit that referenced this pull request Dec 17, 2025
…icing

* 'main' of github.com:block/goose: (35 commits)
  docs: skills (#6062)
  fix: add conditional configuration for GOOSE_BIN_DIR in PATH (#5940)
  Update dependencies to help in Fedora packaging (#5835)
  fix: make goose reviewer less bad (#6154)
  docs: create/edit recipe button (#6145)
  fix(google): Fix 400 Bad Request error with Gemini 3 thought signatures (#6035)
  fix: we don't need to warn about tool count when in code mode (#6149)
  deps: upgrade agent-client-protocol to 0.9.0 (#6109)
  fix(providers): fix for gemini-cli on windows to work around cmd's multiline prompt limitations #5911 (#5966)
  More slash commands (#5858)
  fix: MCP UI not rendering due to CallToolResult structure change (#6143)
  fix: display shell output as static text instead of spinner (#6041)
  fix : Custom providers with empty API keys show as configured in desktop (#6105)
  Add .agents/skills and ~/.config/agent/skills to skills discovery paths (#6139)
  fix: use instructions for system prompt and prompt for user message in subagents (#6121)
  Fix compaction loop for small models or large input (#5803)
  feat: Centralize theme management with ThemeContext (#6137)
  OpenRouter & Xai streaming (#5873)
  fix: resolve mcp-hermit cleanup path expansion issue (#5953)
  feat: add goose PR reviewer workflow (#6124)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants