-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Manual compaction counting fix + cli cleanup #5480
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the conversation compaction feature by simplifying the return type and centralizing token tracking logic. The changes rename "summarize" to "compact" throughout the codebase for consistency, remove redundant token count tracking in the compaction flow, and integrate metrics updates directly into the agent response handling.
Key changes:
- Simplified
compact_messagesreturn type from(Conversation, Vec<usize>, Option<ProviderUsage>)to(Conversation, ProviderUsage) - Centralized token tracking logic in
update_session_metricswith special handling for compaction usage - Renamed "/summarize" command to "/compact" with backward compatibility for the deprecated command
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| crates/goose/src/context_mgmt/mod.rs | Simplified compact_messages return type and removed token count tracking logic |
| crates/goose/src/agents/reply_parts.rs | Added compaction-aware token tracking with SYSTEM_PROMPT_TOKEN_ESTIMATE and is_compaction_usage parameter |
| crates/goose/src/agents/agent.rs | Updated compact_messages call sites to use new signature and added session metrics updates |
| crates/goose/src/agents/mod.rs | Exported MANUAL_COMPACT_TRIGGER constant for CLI usage |
| crates/goose-cli/src/session/mod.rs | Refactored manual compaction to use agent response processing instead of direct API calls |
| crates/goose-cli/src/session/input.rs | Renamed InputResult::Summarize to InputResult::Compact with deprecation warning for /summarize |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| use crate::session::SessionManager; | ||
| use rmcp::model::Tool; | ||
|
|
||
| const SYSTEM_PROMPT_TOKEN_ESTIMATE: i32 = 5000; |
Copilot
AI
Nov 3, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This magic number lacks documentation explaining why 5000 tokens is used as an estimate for the system prompt. Add a comment explaining the rationale behind this value and consider making it configurable if system prompt size varies significantly across different use cases.
| const SYSTEM_PROMPT_TOKEN_ESTIMATE: i32 = 5000; | |
| /// Estimated token count for the system prompt. | |
| /// The default value of 5000 tokens is chosen based on typical prompt sizes and model limitations. | |
| /// If your use case requires a different estimate, you can override this value by setting the | |
| /// `SYSTEM_PROMPT_TOKEN_ESTIMATE` environment variable. | |
| fn system_prompt_token_estimate() -> i32 { | |
| std::env::var("SYSTEM_PROMPT_TOKEN_ESTIMATE") | |
| .ok() | |
| .and_then(|v| v.parse::<i32>().ok()) | |
| .unwrap_or(5000) | |
| } |
| accumulate(session.accumulated_output_tokens, usage.usage.output_tokens); | ||
|
|
||
| let (current_total, current_input, current_output) = if is_compaction_usage { | ||
| // After compaction: summary output becomes new input context |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after manual compaction, should this not really just go to zero? have we sent the manually compacted conversation to the model yet or only to the client? although I could see how that would be confusing to a user
it would be good to avoid a hard coded system prompt estimate, just because that could really be all over the place
I suppose we could do something like use the token counts from the first message you send, but that would be complicated to do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think this one is more of a question of what we want to show the user. Let's try dropping it to the summary size, there will be a bit of whiplash for the user when the system prompt re-enters on the next message, but I don't love that constant either.
| /// * A tuple containing: | ||
| /// - `Conversation`: The compacted messages | ||
| /// - `Vec<usize>`: Token counts for each message | ||
| /// - `Option<ProviderUsage>`: Provider usage from summarization |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, I saw we didn't use this. There's some code on the frontend you could clean up with this too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack will take another pass on the frontend too.
…se into dkatz/manual-compact-fix-usage * 'dkatz/manual-compact-fix-usage' of github.com:block/goose: (35 commits) Fix image processing (#5544) docs: AI attribution for PRs (#5547) chore(tests/mcp): testing for MCP sampling (#5456) docs: adding HOWTOAI.md (#5533) added configuration content, also added signoff, fix merging issue with another commit by creating a clean branch. removed and closed commits that caused signoff issues. (#5519) Fixes Gemini API parse issue by converting nullable type arrays to single types in tool schemas (#5530) Troubleshooting diagnostics doc (#5526) fix link to Ollama FAQ (#5531) docs: remove speech-mcp (#5514) fix: adds ProviderRetry to openai provider (#5518) docs: extensions directory minor updates (#5466) Docs/json recipe support (#5492) docs: recipe buttons (#5507) Improve system theme detection and fallback (#5427) [Autovisualiser] remove unnecessary content from mermaid HTML template (#5505) Improve subagents docs (#5484) FIX: prefer linux in WSL and add INSTALL_OS override for CLI (#5215) Propagate session ID in LLM and MCP requests (#5165) feat: YT Short for Canva MCP + goose (#5495) Change Recipes Test Script (#5457) ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| let (current_total, current_input, current_output) = if is_compaction_usage { | ||
| // After compaction: summary output becomes new input context | ||
| let new_input = usage.usage.output_tokens.map(|out| out); |
Copilot
AI
Nov 3, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The identity closure map(|out| out) is redundant. Consider simplifying to just usage.usage.output_tokens directly.
| let new_input = usage.usage.output_tokens.map(|out| out); | |
| let new_input = usage.usage.output_tokens; |
| s if s == CMD_SUMMARIZE => Some(InputResult::Summarize), | ||
| s if s == CMD_COMPACT => Some(InputResult::Compact), | ||
| s if s == CMD_SUMMARIZE_DEPRECATED => { | ||
| println!("{}", console::style("⚠️ Note: /summarize has been renamed to /compact and will be removed in a future release.").yellow()); |
Copilot
AI
Nov 3, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The deprecation message string is very long (over 100 characters). Consider breaking it into multiple lines or storing it as a constant for better readability and maintainability.
* 'main' of github.com:block/goose: (21 commits) Manual compaction counting fix + cli cleanup (#5480) chore(deps): bump prismjs and react-syntax-highlighter in /ui/desktop (#5549) fix: remove qwen3-coder from provider/mcp smoke tests (#5551) fix: do not build unsigned desktop app bundles on every PR in ci. add manual option. (#5550) fix: update Husky prepare script to v9 format (#5522) Fix 404 for responsible coding guide (#5543) fix hermit `text file busy` issues on linux (#5372) Fix image processing (#5544) docs: AI attribution for PRs (#5547) chore(tests/mcp): testing for MCP sampling (#5456) docs: adding HOWTOAI.md (#5533) added configuration content, also added signoff, fix merging issue with another commit by creating a clean branch. removed and closed commits that caused signoff issues. (#5519) Fixes Gemini API parse issue by converting nullable type arrays to single types in tool schemas (#5530) Troubleshooting diagnostics doc (#5526) fix link to Ollama FAQ (#5531) docs: remove speech-mcp (#5514) fix: adds ProviderRetry to openai provider (#5518) docs: extensions directory minor updates (#5466) Docs/json recipe support (#5492) docs: recipe buttons (#5507) ...
* main: (85 commits) improve linux tray icon support (#5425) feat: log rotation (#5561) use app.isPackaged instead of checking for node env development (#5465) disable RPM build-ID generation to prevent package conflicts (#5563) Add Diagnostics Info to Q&A and Bug Report Templates (#5565) fix: improve server error messages to include HTTP status code (#5532) improvement: add useful error message when attempting to use unauthenticated cursor-agent (#5300) fix: unblock acp via databricks (#5562) feat: add --output-format json flag to goose run command (#5525) Sessions required (#5548) feat: add grouped extension loading notification (#5529) we should run this on main and also test open models at least via ope… (#5556) info: print location of sessions.db via goose info (#5557) chore: remove yarn usage from documentation (#5555) cli: adjust default theme to address #1905 (#5552) Manual compaction counting fix + cli cleanup (#5480) chore(deps): bump prismjs and react-syntax-highlighter in /ui/desktop (#5549) fix: remove qwen3-coder from provider/mcp smoke tests (#5551) fix: do not build unsigned desktop app bundles on every PR in ci. add manual option. (#5550) fix: update Husky prepare script to v9 format (#5522) ...
* main: (54 commits) add clippy warning for string_slice (#5422) improve linux tray icon support (#5425) feat: log rotation (#5561) use app.isPackaged instead of checking for node env development (#5465) disable RPM build-ID generation to prevent package conflicts (#5563) Add Diagnostics Info to Q&A and Bug Report Templates (#5565) fix: improve server error messages to include HTTP status code (#5532) improvement: add useful error message when attempting to use unauthenticated cursor-agent (#5300) fix: unblock acp via databricks (#5562) feat: add --output-format json flag to goose run command (#5525) Sessions required (#5548) feat: add grouped extension loading notification (#5529) we should run this on main and also test open models at least via ope… (#5556) info: print location of sessions.db via goose info (#5557) chore: remove yarn usage from documentation (#5555) cli: adjust default theme to address #1905 (#5552) Manual compaction counting fix + cli cleanup (#5480) chore(deps): bump prismjs and react-syntax-highlighter in /ui/desktop (#5549) fix: remove qwen3-coder from provider/mcp smoke tests (#5551) fix: do not build unsigned desktop app bundles on every PR in ci. add manual option. (#5550) ...
Signed-off-by: fbalicchia <[email protected]>
Signed-off-by: Blair Allan <[email protected]>
Token counts updated in the stream immediately after compaction.
Remove lots of unused code from compaction; only return the provider usage.
Modify update_session_metrics to support using summarization usage to estimate new context window (with system prompt token estimate).
Remove lots of custom code in CLI around token counting + manual summarization trigger
Rename /summarize -> /compact.