Add out of context compaction test via error proxy #5805
* 'main' of github.com:block/goose: (49 commits) fixing video embed (#5171) chore: clean up random unused files (#5166) fix: adjust download_cli.sh to tolerate no OS variable (#5169) mcp tutorial page for firecrawl (#5152) Remove orphaned tool calls before compaction (#5059) feat: add copy as markdown button to documentation pages (#5158) chore: include vendored node executable (#5160) remove extra whitespace from message (#5159) Clear deeplinks after use (#5128) Revert "Fix gpt-5 input context limit (#4619)" (#5135) fix: missing cmake and protobuf for windows build, deduplicate sh/pws… (#5028) Fix bedrock tool input schema (#5064) Add self-test recipe for goose validation (#5111) fix: modifies openai request logic for reasoning models (#4221) (#4294) Fix race condition threat when set_param and set_secret of c… (#5109) Clean room implementation of the chat process (#5079) Bump rmcp (#5096) set version in an env variable for testing (#5100) fix : enhance fuzzy file search in goose desktop (#5071) Make async (#5126) ...
* 'main' of github.com:block/goose: Declarative providers (#5084) adding youtube link to firecrawl mcp tutorial, merge after 9am Eastern Oct 15 (#5173) Ollama integration: modified default model + added models (#5153) Fix codex subagent configuration in documentation (#5180) fix: include apple silicon build of the desktop app in build artifacts (#5174)
* 'main' of github.com:block/goose: (132 commits) Fix/icon ii (#5413) Enable runtime access to provider name (#5399) fix: ensure trailing newline in files created by `text_editor` tool (#5336) docs: September 2025 Community All-Stars (#5411) make supports_cache_control async to avoid block in place (#5362) Send all the logs we output (#5363) Recipe variables (#5365) Feat/add mermaid chart rendering (#5377) Set up Datadog metrics for prompt injection detection (#5385) fix: restore --resume functionality for most recent session (#5401) Gemini again (#5390) docs(prompt-library): add github-issue-labeler intermediate prompt (#5374) docs: add Linux and Windows paths to uninstall section (#5371) fix: --session-id shouldn't work without --resume, but --name should (#5360) Auto-compact Threshold UI improvements (#5354) Filter preserved user messages to be text only. (#5391) include sessionId in tool request (#5394) feat: add PR Impact Analyzer prompt (#5375) docs: add blog post on configuring goose for team environments (#5380) migrating back with new chatrecall non underscore name (#5223) ...
* 'main' of github.com:block/goose: (61 commits) [Autovisualiser] remove unnecessary content from mermaid HTML template (#5505) Improve subagents docs (#5484) FIX: prefer linux in WSL and add INSTALL_OS override for CLI (#5215) Propagate session ID in LLM and MCP requests (#5165) feat: YT Short for Canva MCP + goose (#5495) Change Recipes Test Script (#5457) Goose recover (#5450) don't start the default provider (#5351) keep the order of keys in config.yaml (#5468) Removed drafts and agentIsReady in ChatInput (#5366) nextcamp - fix session resume when navigating back to chat in sidebar (#5370) feat/fix: set optional config params, and don't overwrite unset secrets (#5325) Stringly typed config (#5463) Fix: Compaction client <-> server sync (#5481) docs: recipe activity parameter substitution (#5462) only run fork on branch PRs (#5461) docs: video on goose with apify mcp (#5472) Clear windows and fix build failure (#5452) Add menu option for setting window always on top (#5429) Delete environment variable (#5479) ...
* 'main' of github.com:block/goose: (21 commits) Manual compaction counting fix + cli cleanup (#5480) chore(deps): bump prismjs and react-syntax-highlighter in /ui/desktop (#5549) fix: remove qwen3-coder from provider/mcp smoke tests (#5551) fix: do not build unsigned desktop app bundles on every PR in ci. add manual option. (#5550) fix: update Husky prepare script to v9 format (#5522) Fix 404 for responsible coding guide (#5543) fix hermit `text file busy` issues on linux (#5372) Fix image processing (#5544) docs: AI attribution for PRs (#5547) chore(tests/mcp): testing for MCP sampling (#5456) docs: adding HOWTOAI.md (#5533) added configuration content, also added signoff, fix merging issue with another commit by creating a clean branch. removed and closed commits that caused signoff issues. (#5519) Fixes Gemini API parse issue by converting nullable type arrays to single types in tool schemas (#5530) Troubleshooting diagnostics doc (#5526) fix link to Ollama FAQ (#5531) docs: remove speech-mcp (#5514) fix: adds ProviderRetry to openai provider (#5518) docs: extensions directory minor updates (#5466) Docs/json recipe support (#5492) docs: recipe buttons (#5507) ...
* 'main' of github.com:block/goose: Sessions required (#5548) feat: add grouped extension loading notification (#5529) we should run this on main and also test open models at least via ope… (#5556) info: print location of sessions.db via goose info (#5557) chore: remove yarn usage from documentation (#5555) cli: adjust default theme to address #1905 (#5552)
* 'main' of github.com:block/goose: (125 commits) Document Mistral AI provider (#5799) docs: Add Community Stars recipe script and txt file (#5776) chore: incorporate LF feedback (#5787) docs: quick launcher (#5779) Bump auto scroll threshold (#5738) fix: add one-time cleanup for linux hermit locking issues (#5742) Don't show update tray icon if GOOSE_VERSION is set (#5750) fix: get win node path from registry (#5731) Handle spaces in extension names also (#5770) Remove empty settings card for Scheduling Engine (#5771) fix windows cli build (#5768) fix: Implement a CredentialStore for auth (#5741) blog post: How to Successfully Migrate Your App with an AI Agent (#5762) Simplify finding `goosed` (#5739) More time for goosed (#5746) Match lower case (#5763) scan recipe for security when saving recipe (#5747) feat: trying grok for live test (#5732) Platform Extension MOIM (Minus One Info Message) (#5027) docs: remove hackathon banner (#5756) ...
if skip_backoff {
    tracing::info!("Skipping backoff due to GOOSE_PROVIDER_SKIP_BACKOFF");
} else {
    tracing::info!("Backing off for {:?} before retry", delay);
This thing is a big pain when testing with the proxy; you need to be very precise and knowledgeable about how many times it will back off.
makes sense. If you want, you could turn it into a real config variable, but that's probably overkill
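For readers following the thread, here is a minimal sketch of the idea behind the flag, written in Python for brevity rather than in the Rust of `retry.rs`. Only the variable name `GOOSE_PROVIDER_SKIP_BACKOFF` comes from this PR. The accepted truthy values, the function names, and the delay parameters are assumptions made for illustration.

```python
import os
import time


def skip_backoff_enabled(env=os.environ):
    # Assumption: "1", "true" or "yes" enable the flag; the real parsing may differ.
    value = env.get("GOOSE_PROVIDER_SKIP_BACKOFF", "")
    return value.strip().lower() in {"1", "true", "yes"}


def retry_with_backoff(call, max_retries=3, initial_delay=1.0, multiplier=2.0,
                       env=os.environ, sleep=time.sleep):
    """Retry `call` on RuntimeError, sleeping between attempts unless skipped."""
    delay = initial_delay
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RuntimeError:
            if attempt == max_retries:
                raise  # retries exhausted, surface the last error
            if not skip_backoff_enabled(env):
                sleep(delay)  # normal path: wait before the next attempt
            # The delay still grows so behaviour is identical apart from the wait.
            delay *= multiplier
```

With the flag set, a test that injects N provider errors through the proxy only needs to know how many retries happen, not how long each one waits.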
…xt-test * 'main' of github.com:block/goose: chore: Add Adrian Cole to Maintainers (#5815) [MCP-UI] Proxy and Better Message Handling (#5487) Release 1.15.0 Document New Window menu in macOS dock (#5811) Catch cron errors (#5707) feat/fix Re-enabled WAL with commit transaction management (Linux Verification Requested) (#5793) chore: remove autopilot experimental feature (#5781) Read paths from an interactive & login shell (#5774) docs: acp clients (#5800)
Pull Request Overview
This PR adds a third smoke test to validate that Goose properly handles out-of-context errors from providers by triggering compaction. The test uses a new error proxy tool to simulate context-length errors.
Key changes:
- Added TEST 3 to `test_compaction.sh` that uses the provider error proxy to inject context-length errors and verify compaction is triggered
- Enhanced `proxy.py` with `--mode` and `--no-stdin` CLI arguments for automated testing scenarios
- Added `GOOSE_PROVIDER_SKIP_BACKOFF` environment variable support to speed up retry testing
- Updated CI workflow to install Python 3.12 and `uv` for running the error proxy
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| scripts/test_compaction.sh | Adds TEST 3 for out-of-context error compaction using error proxy, removes duplicate jq check |
| scripts/provider-error-proxy/proxy.py | Refactors command parsing into shared function, adds CLI args for automated mode |
| scripts/provider-error-proxy/README.md | Documents new CLI arguments and automated testing usage patterns |
| crates/goose/src/providers/retry.rs | Adds GOOSE_PROVIDER_SKIP_BACKOFF env var to skip backoff delays in tests |
| .github/workflows/pr-smoke-test.yml | Adds Python 3.12 and uv setup steps before compaction tests |
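As a rough illustration of the `proxy.py` change summarized above, the sketch below parses the two new arguments with `argparse`. The flag names `--mode` and `--no-stdin` come from the overview. The mode values (`passthrough`, `context_length`), the defaults, and the helper names are hypothetical and may not match the real script.

```python
import argparse

# Hypothetical mode names; the real proxy.py may accept different values.
MODES = ("passthrough", "context_length")


def build_parser():
    parser = argparse.ArgumentParser(
        description="Sketch of an error-injecting provider proxy CLI")
    parser.add_argument(
        "--mode", choices=MODES, default="passthrough",
        help="which error, if any, to inject into provider responses")
    parser.add_argument(
        "--no-stdin", action="store_true",
        help="do not read interactive commands from stdin (for CI runs)")
    return parser


def should_inject_error(mode):
    # Only the context-length mode replaces the upstream response in this sketch.
    return mode == "context_length"
```

An automated run would then start the proxy with something like `--mode context_length --no-stdin`, so the script never blocks waiting for interactive input.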
- name: Set up Python (for error proxy)
  uses: actions/setup-python@v5
  with:
    python-version: '3.12'
do you still need setup-python if you are using the uv action?
Yeah, I tried the uv action first and it didn't seem to come with Python.
scripts/test_compaction.sh
Outdated
# Pre-install proxy dependencies (so first run doesn't take forever)
echo "Installing proxy dependencies..."
# Force UV to use public PyPI (override any local/corporate mirrors)
hmmm
Is this OK/intended? I wasn't able to install the deps with the block artifactory link; I got something like a 404.
oh yeah I can see how you'd get there. It seems unnecessary if this is primarily going to be run in CI, but it's also mostly harmless
Was getting that issue in CI! Not sure how it was actually getting injected, but let's just go with this.
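One way to pin the index for a single command without touching global config is to override it in the child process environment. The sketch below assumes the override goes through uv's `UV_INDEX_URL` environment variable; the thread does not show which mechanism `test_compaction.sh` actually uses, and `public_pypi_env` is a hypothetical helper.

```python
import os

PUBLIC_PYPI = "https://pypi.org/simple"


def public_pypi_env(base_env=None):
    """Return a copy of the environment that points uv at public PyPI."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: UV_INDEX_URL is the override in play; any mirror value
    # inherited from the surrounding environment is replaced.
    env["UV_INDEX_URL"] = PUBLIC_PYPI
    return env
```

The returned mapping can be passed as `env=` to `subprocess.run` when invoking uv, which leaves the caller's own environment unchanged.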
echo "✗ FAILED: Could not install proxy dependencies"
echo "Setup log:"
cat "$PROXY_SETUP_LOG"
RESULTS+=("✗ Out-of-Context Error (dependency install failed)")
huh?
scripts/test_compaction.sh
Outdated
echo "Proxy log:"
cat "$PROXY_LOG"
kill $PROXY_PID 2>/dev/null || true
RESULTS+=("✗ Out-of-Context Error (proxy failed)")
why do these say Out-of-Context Error?
Added the words 'Out of context test Error' to all of these to clarify.
* main: (48 commits) [fix] generic check for gemini compat (#5842) Add scheduler to diagnostics (#5849) Cors and token (#5850) fix sessions coming back with empty messages (#5841) markdown export from URL (#5830) Next camp refactor live (#5706) Add out of context compaction test via error proxy (#5805) fix: Add backward compatibility for conversationCompacted message type (#5819) Add /agent/stop endpoint, make max active agents configurable (#5826) Handle 404s (#5791) Persist provider name and model config in the session (#5419) Comment out the flaky mcp callers (#5827) Slash commands (#5718) fix: remove setx calls to not permanently edit the windows shell PATH (#5821) fix: Parse maas models for gcp vertex provider (#5816) fix: support Gemini 3's thought signatures (#5806) chore: Add Adrian Cole to Maintainers (#5815) [MCP-UI] Proxy and Better Message Handling (#5487) Release 1.15.0 Document New Window menu in macOS dock (#5811) ...
Signed-off-by: Sai Karthik <[email protected]>
Signed-off-by: Blair Allan <[email protected]>
Adds an additional smoke test around compaction occurring after OutOfContext errors.
Also gets the proxy + python working in our CI env. Can potentially use the proxy for some kind of rate limit -> continue tests as well.
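The behaviour the new smoke test exercises can be sketched roughly as follows: when the provider reports an out-of-context error, the conversation is compacted and the request is retried. This is an illustrative Python model, not goose's Rust implementation. `ContextLengthExceeded`, `compact`, and `complete_with_compaction` are hypothetical names, and the compaction here is a simple placeholder summary.

```python
class ContextLengthExceeded(Exception):
    """Stand-in for the provider's out-of-context error."""


def compact(messages, keep_last=2):
    # Placeholder compaction: collapse older messages into one summary line.
    if len(messages) <= keep_last:
        return list(messages)
    dropped = len(messages) - keep_last
    return [f"[summary of {dropped} earlier messages]"] + list(messages[-keep_last:])


def complete_with_compaction(provider, messages, max_compactions=1):
    """Call `provider`; on a context-length error, compact and try again."""
    compactions = 0
    while True:
        try:
            return provider(messages), compactions
        except ContextLengthExceeded:
            if compactions >= max_compactions:
                raise  # compaction did not help, surface the error
            messages = compact(messages)
            compactions += 1
```

In the smoke test, the error proxy plays the role of the failing provider: it returns a context-length error once, and the test then checks that a compaction was recorded.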