@katzdave katzdave commented Nov 18, 2025

Adds an additional smoke test around compaction occurring after OutOfContext errors.

Also gets the error proxy and Python working in our CI environment. The proxy could potentially be used for rate limit -> continue tests as well.

* 'main' of github.com:block/goose: (49 commits)
  fixing video embed (#5171)
  chore: clean up random unused files (#5166)
  fix: adjust download_cli.sh to tolerate no OS variable (#5169)
  mcp tutorial page for firecrawl (#5152)
  Remove orphaned tool calls before compaction (#5059)
  feat: add copy as markdown button to documentation pages (#5158)
  chore: include vendored node executable (#5160)
  remove extra whitespace from message (#5159)
  Clear deeplinks after use (#5128)
  Revert "Fix gpt-5 input context limit (#4619)" (#5135)
  fix: missing cmake and protobuf for windows build, deduplicate sh/pws… (#5028)
  Fix bedrock tool input schema (#5064)
  Add self-test recipe for goose validation (#5111)
  fix: modifies openai request logic for reasoning models (#4221) (#4294)
  Fix race condition threat when set_param and set_secret of c… (#5109)
  Clean room implementation of the chat process (#5079)
  Bump rmcp (#5096)
  set version in an env variable for testing (#5100)
  fix : enhance fuzzy file search in goose desktop (#5071)
  Make async (#5126)
  ...
* 'main' of github.com:block/goose:
  Declarative providers (#5084)
  adding youtube link to firecrawl mcp tutorial, merge after 9am Eastern Oct 15 (#5173)
  Ollama integration: modified default model + added models  (#5153)
  Fix codex subagent configuration in documentation (#5180)
  fix: include apple silicon build of the desktop app in build artifacts (#5174)
* 'main' of github.com:block/goose: (132 commits)
  Fix/icon ii (#5413)
  Enable runtime access to provider name (#5399)
  fix: ensure trailing newline in files created by `text_editor` tool (#5336)
  docs: September 2025 Community All-Stars (#5411)
  make supports_cache_control async to avoid block in place (#5362)
  Send all the logs we output (#5363)
  Recipe variables (#5365)
  Feat/add mermaid chart rendering (#5377)
  Set up Datadog metrics for prompt injection detection (#5385)
  fix: restore --resume functionality for most recent session (#5401)
  Gemini again (#5390)
  docs(prompt-library): add github-issue-labeler intermediate prompt (#5374)
  docs: add Linux and Windows paths to uninstall section (#5371)
  fix: --session-id shouldn't work without --resume, but --name should (#5360)
  Auto-compact Threshold UI improvements (#5354)
  Filter preserved user messages to be text only. (#5391)
  include sessionId in tool request (#5394)
  feat: add PR Impact Analyzer prompt (#5375)
  docs: add blog post on configuring goose for team environments (#5380)
  migrating back with new chatrecall non underscore name (#5223)
  ...
* 'main' of github.com:block/goose: (61 commits)
  [Autovisualiser] remove unnecessary content from mermaid HTML template (#5505)
  Improve subagents docs (#5484)
  FIX: prefer linux in WSL and add INSTALL_OS override for CLI (#5215)
  Propagate session ID in LLM and MCP requests (#5165)
  feat: YT Short for Canva MCP + goose (#5495)
  Change Recipes Test Script (#5457)
  Goose recover (#5450)
  don't start the default provider (#5351)
  keep the order of keys in config.yaml (#5468)
  Removed drafts and agentIsReady in ChatInput (#5366)
  nextcamp - fix session resume when navigating back to chat in sidebar (#5370)
  feat/fix: set optional config params, and don't overwrite unset secrets (#5325)
  Stringly typed config (#5463)
  Fix: Compaction client <-> server sync  (#5481)
  docs: recipe activity parameter substitution (#5462)
  only run fork on branch PRs (#5461)
  docs: video on goose with apify mcp (#5472)
  Clear windows and fix build failure (#5452)
  Add menu option for setting window always on top (#5429)
  Delete environment variable (#5479)
  ...
* 'main' of github.com:block/goose: (21 commits)
  Manual compaction counting fix + cli cleanup (#5480)
  chore(deps): bump prismjs and react-syntax-highlighter in /ui/desktop (#5549)
  fix: remove qwen3-coder from provider/mcp smoke tests (#5551)
  fix: do not build unsigned desktop app bundles on every PR in ci. add manual option. (#5550)
  fix: update Husky prepare script to v9 format (#5522)
  Fix 404 for responsible coding guide (#5543)
  fix hermit `text file busy` issues on linux (#5372)
  Fix image processing (#5544)
  docs: AI attribution for PRs (#5547)
  chore(tests/mcp): testing for MCP sampling (#5456)
  docs: adding HOWTOAI.md (#5533)
  added configuration content, also added signoff, fix merging issue with another commit by creating a clean branch. removed and closed commits that caused signoff issues. (#5519)
  Fixes Gemini API parse issue by converting nullable type arrays to single types in tool schemas (#5530)
  Troubleshooting diagnostics doc (#5526)
  fix link to Ollama FAQ (#5531)
  docs: remove speech-mcp (#5514)
  fix: adds ProviderRetry to openai provider (#5518)
  docs: extensions directory minor updates (#5466)
  Docs/json recipe support (#5492)
  docs: recipe buttons (#5507)
  ...
* 'main' of github.com:block/goose:
  Sessions required (#5548)
  feat: add grouped extension loading notification (#5529)
  we should run this on main and also test open models at least via ope… (#5556)
  info: print location of sessions.db via goose info (#5557)
  chore: remove yarn usage from documentation (#5555)
  cli: adjust default theme to address #1905 (#5552)
* 'main' of github.com:block/goose: (125 commits)
  Document Mistral AI provider (#5799)
  docs: Add Community Stars recipe script and txt file (#5776)
  chore: incorporate LF feedback (#5787)
  docs: quick launcher (#5779)
  Bump auto scroll threshold (#5738)
  fix: add one-time cleanup for linux hermit locking issues (#5742)
  Don't show update tray icon if GOOSE_VERSION is set (#5750)
  fix: get win node path from registry (#5731)
  Handle spaces in extension names also (#5770)
  Remove empty settings card for Scheduling Engine (#5771)
  fix windows cli build (#5768)
  fix: Implement a CredentialStore for auth (#5741)
  blog post: How to Successfully Migrate Your App with an AI Agent (#5762)
  Simplify finding `goosed` (#5739)
  More time for goosed (#5746)
  Match lower case (#5763)
  scan recipe for security when saving recipe (#5747)
  feat: trying grok for live test (#5732)
  Platform Extension MOIM (Minus One Info Message) (#5027)
  docs: remove hackathon banner (#5756)
  ...
if skip_backoff {
    tracing::info!("Skipping backoff due to GOOSE_PROVIDER_SKIP_BACKOFF");
} else {
    tracing::info!("Backing off for {:?} before retry", delay);
Collaborator Author

The backoff is a big pain when testing with the proxy; you need to know precisely how many times it will back off.

Collaborator

makes sense. If you want, you could turn it into a real config variable, but that's probably overkill
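
For readers following this thread, here is a minimal sketch of how an environment variable could gate the backoff delay. The diff excerpt above does not show how `retry.rs` actually reads `GOOSE_PROVIDER_SKIP_BACKOFF`, so the helper names `skip_backoff_from` and `effective_delay`, and the set of accepted values, are assumptions made for illustration only.

```rust
use std::time::Duration;

/// Hypothetical helper (not from retry.rs): interpret the raw env value as a flag.
/// Assumption: "1", "true", or "yes" (case-insensitive) enables skipping.
fn skip_backoff_from(raw: Option<&str>) -> bool {
    matches!(
        raw.map(|v| v.trim().to_ascii_lowercase()).as_deref(),
        Some("1") | Some("true") | Some("yes")
    )
}

/// Hypothetical helper: the delay to actually wait before the next retry.
fn effective_delay(delay: Duration, skip_backoff: bool) -> Duration {
    if skip_backoff {
        Duration::ZERO
    } else {
        delay
    }
}

fn main() {
    // Read the variable once; an unset variable means "do not skip".
    let raw = std::env::var("GOOSE_PROVIDER_SKIP_BACKOFF").ok();
    let skip = skip_backoff_from(raw.as_deref());
    let delay = effective_delay(Duration::from_millis(500), skip);
    println!("skip_backoff={skip}, waiting {delay:?} before retry");
}
```

Keeping the parsing separate from the delay decision means a proxy-driven test only needs to count requests, without reasoning about how long each retry waits.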

…xt-test

* 'main' of github.com:block/goose:
  chore: Add Adrian Cole to Maintainers (#5815)
  [MCP-UI] Proxy and Better Message Handling (#5487)
  Release 1.15.0
  Document New Window menu in macOS dock (#5811)
  Catch cron errors (#5707)
  feat/fix Re-enabled WAL with commit transaction management (Linux Verification Requested) (#5793)
  chore: remove autopilot experimental feature (#5781)
  Read paths from an interactive & login shell (#5774)
  docs: acp clients (#5800)
@katzdave katzdave marked this pull request as ready for review November 19, 2025 19:39
Copilot AI review requested due to automatic review settings November 19, 2025 19:39
Copilot AI left a comment

Pull Request Overview

This PR adds a third smoke test to validate that Goose properly handles out-of-context errors from providers by triggering compaction. The test uses a new error proxy tool to simulate context-length errors.

Key changes:

  • Added TEST 3 to test_compaction.sh that uses the provider error proxy to inject context-length errors and verify compaction is triggered
  • Enhanced proxy.py with --mode and --no-stdin CLI arguments for automated testing scenarios
  • Added GOOSE_PROVIDER_SKIP_BACKOFF environment variable support to speed up retry testing
  • Updated CI workflow to install Python 3.12 and uv for running the error proxy
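
The behavior that TEST 3 exercises can be pictured roughly as follows. This is a sketch only: `ProviderError`, `compact`, and `complete_with_compaction` are hypothetical names invented for illustration, and the truncating "compaction" is a stand-in; none of it is taken from the goose codebase.

```rust
/// Hypothetical error type; the real provider error enum is not shown in this PR.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum ProviderError {
    ContextLengthExceeded,
    Other(String),
}

/// Stand-in compaction: keep only the most recent `keep` messages.
/// (Assumption: real compaction would summarize rather than drop history.)
fn compact(messages: &[String], keep: usize) -> Vec<String> {
    let start = messages.len().saturating_sub(keep);
    messages[start..].to_vec()
}

/// Call `send`; on a context-length error, compact once and retry.
/// Returns the reply plus a flag saying whether compaction happened.
fn complete_with_compaction<F>(
    messages: &[String],
    mut send: F,
) -> Result<(String, bool), ProviderError>
where
    F: FnMut(&[String]) -> Result<String, ProviderError>,
{
    match send(messages) {
        Err(ProviderError::ContextLengthExceeded) => {
            let compacted = compact(messages, 2);
            send(&compacted).map(|reply| (reply, true))
        }
        other => other.map(|reply| (reply, false)),
    }
}

fn main() {
    let history: Vec<String> = (1..=5).map(|i| format!("message {i}")).collect();
    // Stand-in for the error proxy: fail the first call, succeed afterwards.
    let mut calls = 0;
    let result = complete_with_compaction(&history, |msgs| {
        calls += 1;
        if calls == 1 {
            Err(ProviderError::ContextLengthExceeded)
        } else {
            Ok(format!("ok with {} messages", msgs.len()))
        }
    });
    println!("{result:?}");
}
```

The closure plays the role of the error proxy here: it injects exactly one context-length failure, which is what lets a test assert that compaction was triggered.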

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| scripts/test_compaction.sh | Adds TEST 3 for out-of-context error compaction using error proxy, removes duplicate jq check |
| scripts/provider-error-proxy/proxy.py | Refactors command parsing into shared function, adds CLI args for automated mode |
| scripts/provider-error-proxy/README.md | Documents new CLI arguments and automated testing usage patterns |
| crates/goose/src/providers/retry.rs | Adds GOOSE_PROVIDER_SKIP_BACKOFF env var to skip backoff delays in tests |
| .github/workflows/pr-smoke-test.yml | Adds Python 3.12 and uv setup steps before compaction tests |

- name: Set up Python (for error proxy)
  uses: actions/setup-python@v5
  with:
    python-version: '3.12'
Collaborator

do you still need setup-python if you are using the uv action?

Collaborator Author

Yeah, I tried the uv action first and it didn't seem to come with Python.

# Pre-install proxy dependencies (so first run doesn't take forever)
echo "Installing proxy dependencies..."
# Force UV to use public PyPI (override any local/corporate mirrors)
Collaborator

hmmm

Collaborator Author

Is this OK/intended? I wasn't able to install the deps with the Block Artifactory link; I got something like a 404.

Collaborator

oh yeah I can see how you'd get there. It seems unnecessary if this is primarily going to be run in CI, but it's also mostly harmless

Collaborator Author

I was getting that issue in CI! Not sure how it was actually getting injected, but let's just go with this.

echo "✗ FAILED: Could not install proxy dependencies"
echo "Setup log:"
cat "$PROXY_SETUP_LOG"
RESULTS+=("✗ Out-of-Context Error (dependency install failed)")
Collaborator

huh?

echo "Proxy log:"
cat "$PROXY_LOG"
kill $PROXY_PID 2>/dev/null || true
RESULTS+=("✗ Out-of-Context Error (proxy failed)")
Collaborator

why do these say Out-of-Context Error?

Collaborator Author

Added the wording 'Out of context test Error' to all of these to clarify.

@katzdave katzdave merged commit c1c772b into main Nov 21, 2025
16 checks passed
@katzdave katzdave deleted the dkatz/out-of-context-test branch November 21, 2025 19:51
michaelneale added a commit that referenced this pull request Nov 24, 2025
* main: (48 commits)
  [fix] generic check for gemini compat (#5842)
  Add scheduler to diagnostics (#5849)
  Cors and token (#5850)
  fix sessions coming back with empty messages (#5841)
  markdown export from URL (#5830)
  Next camp refactor live (#5706)
  Add out of context compaction test via error proxy (#5805)
  fix: Add backward compatibility for conversationCompacted message type (#5819)
  Add /agent/stop endpoint, make max active agents configurable (#5826)
  Handle 404s (#5791)
  Persist provider name and model config in the session (#5419)
  Comment out the flaky mcp callers (#5827)
  Slash commands (#5718)
  fix: remove setx calls to not permanently edit the windows shell PATH (#5821)
  fix: Parse maas models for gcp vertex provider (#5816)
  fix: support Gemini 3's thought signatures (#5806)
  chore: Add Adrian Cole to Maintainers (#5815)
  [MCP-UI] Proxy and Better Message Handling (#5487)
  Release 1.15.0
  Document New Window menu in macOS dock (#5811)
  ...
kskarthik pushed a commit to kskarthik/goose that referenced this pull request Nov 25, 2025
kskarthik pushed a commit to kskarthik/goose that referenced this pull request Nov 26, 2025
BlairAllan pushed a commit to BlairAllan/goose that referenced this pull request Nov 29, 2025
3 participants