
Conversation

@kwsantiago
Contributor

@kwsantiago kwsantiago commented Jul 24, 2025

Summary

Changes

  • Catch ProviderError::ContextLengthExceeded in the agent reply loop
  • Call auto_compact::perform_compaction to reduce conversation size without threshold checking
  • Refactor compaction logic to separate threshold checking from the actual compaction
  • Continue the conversation if compaction succeeds; fail gracefully with a clear error message if not
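The flow described above can be sketched roughly like this. The `ProviderError::ContextLengthExceeded` and `perform_compaction` names come from this PR; everything else, including the function bodies and the message-count "limit", is an illustrative stand-in rather than the real goose code:

```rust
// Hypothetical sketch of the recovery flow: catch the context-length
// error, compact, and retry; fail with a clear message if compaction
// cannot make progress. Not the actual goose implementation.

#[derive(Debug)]
enum ProviderError {
    ContextLengthExceeded(String),
    Other(String),
}

// Stand-in for auto_compact::perform_compaction: drop the oldest half.
fn perform_compaction(messages: &mut Vec<String>) -> Result<(), String> {
    if messages.len() <= 1 {
        return Err("nothing left to compact".to_string());
    }
    let drop_count = messages.len() / 2;
    messages.drain(..drop_count);
    Ok(())
}

// Stand-in provider call: pretend it rejects conversations > 4 messages.
fn reply_once(messages: &[String]) -> Result<String, ProviderError> {
    if messages.len() > 4 {
        Err(ProviderError::ContextLengthExceeded("context limit".into()))
    } else {
        Ok("ok".to_string())
    }
}

fn reply_with_compaction(mut messages: Vec<String>) -> Result<String, String> {
    loop {
        match reply_once(&messages) {
            Ok(text) => return Ok(text),
            Err(ProviderError::ContextLengthExceeded(_)) => {
                // Compact without a threshold check, then retry.
                perform_compaction(&mut messages).map_err(|e| {
                    format!("context limit exceeded and compaction failed: {e}")
                })?;
            }
            Err(e) => return Err(format!("{e:?}")),
        }
    }
}
```

The key property is that compaction always shrinks the history, so the retry loop terminates either with a successful reply or a clear error.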

Testing

  • Builds successfully
  • Existing tests pass
  • Clippy passes with no warnings
  • Code formatted with cargo fmt

@michaelneale
Collaborator

thanks @kwsantiago, this looks interesting. It seems a few things are combined here (I wasn't aware of that OpenRouter stuff - we have some other compaction work happening, so hopefully we won't hit that or need it?).

Can you talk a bit more about how this prevents the error - there is new state with the token counting, and I'm not sure if that helps, or whether it is sniffing out error messages. I wonder if the OpenRouter changes could be a separate PR to make this change clearer, but I like the direction.

@kwsantiago
Contributor Author

kwsantiago commented Jul 28, 2025

@michaelneale Thank you for the feedback!

The current state of this PR is that it parses error messages after hitting the limit and stores the discovered limit in a token tracker, but doesn't use this to prevent future errors. I had initially added the OpenRouter middle-out feature in here since it was mentioned in the issue comments #1303 (comment).

At your suggestion, what I think we can do here is split this out into 2 PRs:

  • This PR (core fix for context limit error handling):

    • Keep the error parsing and improved error messaging
    • Add retry logic that automatically enables middle-out for
      OpenRouter after detecting a context limit error (middle-out won't be implemented in this PR)
    • Remove the token tracker state (since it isn't being used for
      prevention)
  • Separate PR (OpenRouter enhancements):

    • Move OpenRouter-specific middle-out configuration to the
      provider's from_env method
    • Add OPENROUTER_ENABLE_MIDDLE_OUT env var support
    • Implement proactive token counting if needed
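A minimal sketch of what that proposed `from_env` wiring could look like. `OPENROUTER_ENABLE_MIDDLE_OUT` is the env var named above; the config struct, the helper function, and the set of accepted values are assumptions for illustration, not the provider's actual API:

```rust
// Illustrative sketch: read OPENROUTER_ENABLE_MIDDLE_OUT in from_env.
// Struct and helper names are hypothetical.
use std::env;

#[derive(Debug)]
struct OpenRouterConfig {
    enable_middle_out: bool,
}

// Pure parser so the env-var handling is easy to test in isolation.
// Treats "1" / "true" / "yes" (case-insensitive) as enabled, default off.
fn parse_middle_out_flag(raw: Option<&str>) -> bool {
    matches!(
        raw.map(str::to_ascii_lowercase).as_deref(),
        Some("1") | Some("true") | Some("yes")
    )
}

impl OpenRouterConfig {
    fn from_env() -> Self {
        let raw = env::var("OPENROUTER_ENABLE_MIDDLE_OUT").ok();
        OpenRouterConfig {
            enable_middle_out: parse_middle_out_flag(raw.as_deref()),
        }
    }
}
```

Keeping the parsing separate from the `env::var` call makes the default-off behavior testable without mutating process environment.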

I'll begin working on this, but if you have any other feedback, please let me know.

@kwsantiago kwsantiago changed the title fix: context limit error handling and OpenRouter middle-out support fix: context limit error handling Jul 28, 2025
@kwsantiago kwsantiago marked this pull request as draft July 28, 2025 15:20
@michaelneale michaelneale self-assigned this Jul 28, 2025
@kwsantiago kwsantiago marked this pull request as ready for review July 29, 2025 00:36
@kwsantiago
Contributor Author

@michaelneale I figured what I had was too complex for solving this specific issue and was better left as separate PRs (one for enhancing context limit error handling, another for the OpenRouter middle-out support).

Let me know if you approve of the simple change here to solve #1303 and I can work on the other PRs using this one as a base.

@kwsantiago kwsantiago requested a review from michaelneale July 29, 2025 00:39
Collaborator

@DOsinga DOsinga left a comment

Thanks for looking at this. Maybe we should have a little discussion on Discord (tag me as DOsinga?) on what should happen here. @katzdave has good improvements on compacting, but we still have this situation where we naturally run out of context - presumably often because a tool call did it.

I think we would need some more active management of the message history

@kwsantiago
Contributor Author

kwsantiago commented Jul 29, 2025

Sounds good @DOsinga I made a thread in Discord to discuss. https://discord.com/channels/1287729918100246654/1344318255266795641/1399727098574143508

I'm turning this PR into a draft until we get concrete direction on how we want to move forward with this fix.

@kwsantiago kwsantiago requested a review from DOsinga July 29, 2025 13:51
@kwsantiago kwsantiago marked this pull request as draft July 29, 2025 13:51
@kwsantiago kwsantiago changed the title fix: context limit error handling fix: context limit error handling & automatic rollback Jul 29, 2025
@kwsantiago
Contributor Author

@DOsinga let me know what you think. Based on your proposed solution from the Discord thread, I added an initial implementation for automatic recovery when tool responses are too big and hit context limits. In this way, Goose recovers automatically instead of dying silently when tools return huge responses.
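As a rough illustration of that recovery idea (the function name and the character-based budget are assumptions for this sketch, not the actual implementation), an oversized tool response could be clamped before it reaches the provider:

```rust
// Hypothetical sketch: clamp a huge tool response instead of letting
// the next provider call blow past the context limit.
fn clamp_tool_response(response: &str, max_chars: usize) -> String {
    if response.chars().count() <= max_chars {
        return response.to_string();
    }
    // Keep the head of the output and note the truncation explicitly,
    // so the model (and the user) can see that content was dropped.
    let truncated: String = response.chars().take(max_chars).collect();
    format!("{truncated}\n[tool output truncated to fit the context window]")
}
```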

@michaelneale michaelneale removed their assignment Jul 30, 2025
@kwsantiago kwsantiago marked this pull request as ready for review August 2, 2025 15:06
@DOsinga
Collaborator

DOsinga commented Aug 2, 2025

I think @katzdave should have a look how this fits in with his efforts

@DOsinga
Collaborator

DOsinga commented Aug 2, 2025

Ok, so a previous attempt to unify HTTP clients was this:
#3547

I'm currently working on this:
#3558

to unify some other stuff, but I was about to copy the client sharing from the first (closed) PR. I'm thinking we should get that in, which should make your life easier. I think having a fixed timeout would actually be fine for now. I imagine we'll replace all the environment variables with true settings soon anyway.

ceterum censeo variabiles ambientis esse delendas ("furthermore, I maintain that environment variables must be abolished")

@kwsantiago
Contributor Author

Sounds good to me @DOsinga - let me know when we're in a good spot and I can finish this up in a way that makes sense to all stakeholders. Thank you!

@DOsinga
Collaborator

DOsinga commented Aug 2, 2025

thank you for your efforts!

@michaelneale
Collaborator

michaelneale commented Aug 6, 2025

@kwsantiago very nice, this looks focused - some changes merged recently, so do you want to update this to main? (I could also push to it if you like.) Curious if this helps with the OpenRouter context exhaustion.

I think we are in a better spot now to work on this.

On OpenRouter - is there any downside to having middle-out on by default? (Ideally we wouldn't need it, but if it doesn't kick in until it absolutely has to, then why not?)

@michaelneale michaelneale self-assigned this Aug 6, 2025
@michaelneale
Collaborator

I think this looks good - it has tests, and if we can update it to main and check it is ok, let's get it in. For OpenRouter I am thinking of adding middle-out by default (why not) anyway.

@kwsantiago
Contributor Author

@michaelneale sounds good, if you don't get to it, I can do this tonight (EST).

Regarding OpenRouter middle-out support, should I resurrect that in a follow-up PR or within this PR? I know we had originally discussed taking it out, which is why I got rid of it in this branch, although it is in some of my earlier commits if we want to dig it back out.

@michaelneale
Collaborator

@kwsantiago yeah, if you want to add middle-out in another PR, go ahead (I have an open PR which enables some defaults for the startup experience, so we could always do it there) - if you have time, go ahead and tag me.

@kwsantiago
Contributor Author

kwsantiago commented Sep 2, 2025

@katzdave Thanks for the detailed feedback! I've addressed the issues:

  1. Reverted to using the true context limit instead of estimate_target_context_limit, as requested
  2. Added fallback detection for context errors in the message text to handle Databricks' nested error format
  3. The fix should now catch errors like "input length and max_tokens exceed context limit" and trigger auto-compaction
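The fallback detection in point 2 could look roughly like this. Only the quoted error text comes from the discussion; the function name and the marker list are illustrative assumptions:

```rust
// Hypothetical sketch of text-based fallback detection for context
// overflow errors that arrive as nested provider messages rather than
// a typed error. Marker phrases are illustrative.
fn looks_like_context_limit_error(message: &str) -> bool {
    let msg = message.to_ascii_lowercase();
    const MARKERS: [&str; 3] = [
        "exceed context limit",
        "context length",
        "maximum context",
    ];
    MARKERS.iter().any(|m| msg.contains(m))
}
```

Substring sniffing like this is inherently brittle (provider wording can change), which is why it is a fallback behind the typed `ContextLengthExceeded` error rather than the primary detection path.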

I don't have access to test with actual providers (databricks/openrouter), so I'd appreciate if someone could verify the fix works end-to-end. I won't have the bandwidth to iterate much further on this, but the changes should handle the reported issues. Feel free to edit the branch as needed.

@kwsantiago kwsantiago requested a review from katzdave September 2, 2025 23:35
@michaelneale
Collaborator

@katzdave both @zanesq and I see that Databricks error too, unrelated to this change, so it may be an unrelated regression?

@michaelneale
Collaborator

@kwsantiago can you run fmt again?

@kwsantiago
Contributor Author

kwsantiago commented Sep 3, 2025

@michaelneale I just committed cargo fmt.

@michaelneale michaelneale self-assigned this Sep 3, 2025
@michaelneale
Collaborator

@kwsantiago sorry again, but can you update to main? Hopefully this will be the last time. Will try this now; I think this may be good.

@kwsantiago kwsantiago force-pushed the kwsantiago/fix-context-limit-error branch from f76bb52 to 4634278 Compare September 4, 2025 00:24
Collaborator

@michaelneale michaelneale left a comment

nice @kwsantiago - took a while but got there in the end!

Collaborator

@katzdave katzdave left a comment

Yes, nice work! I think we can revert the changes in utils/errors.rs completely. Taking it for one last quick spin; will update shortly.

@katzdave
Collaborator

katzdave commented Sep 4, 2025

Yes, nice work! I think we can revert the changes in utils/errors.rs completely. Taking it for one last quick spin; will update shortly.

Confirming: let's completely revert utils/errors.rs, and then we're all set to merge.

Collaborator

@michaelneale michaelneale left a comment

yeah - what @katzdave says

Signed-off-by: Kyle 🐆 <kyle@privkey.io>
@kwsantiago kwsantiago force-pushed the kwsantiago/fix-context-limit-error branch from 8f2ce26 to e5c3982 Compare September 4, 2025 01:27
@kwsantiago
Contributor Author

@katzdave @michaelneale should be good to go, let me know if there is anything else here. Appreciate all the feedback!

@michaelneale
Collaborator

michaelneale commented Sep 4, 2025

@kwsantiago did you mean to remove more than just the errors.rs changes? It looks a lot smaller now, and there were utils.rs changes which I think were needed? (It was a force push, so I can't see the history.)

edit: yes you did, ignore me! nice!

Collaborator

@michaelneale michaelneale left a comment

nice, thanks! Took it for a quick spin.

@michaelneale michaelneale merged commit bce0466 into block:main Sep 4, 2025
10 checks passed
@kwsantiago kwsantiago deleted the kwsantiago/fix-context-limit-error branch September 4, 2025 02:01
dianed-square pushed a commit that referenced this pull request Sep 4, 2025
Signed-off-by: Kyle 🐆 <kyle@privkey.io>
katzdave added a commit that referenced this pull request Sep 4, 2025
* 'main' of github.com:block/goose:
  Fix databricks streaming errors  (#4506)
  docs: malware check for uvx and npx extensions (#4508)
  fix: auto-compact on context limit error (#3635)
  feat: multi model and multi provider config and auto switching (#4035)
This was referenced Sep 9, 2025
thebristolsound pushed a commit to thebristolsound/goose that referenced this pull request Sep 11, 2025
Signed-off-by: Kyle 🐆 <kyle@privkey.io>
Signed-off-by: Matt Donovan <mattddonovan@protonmail.com>
HikaruEgashira pushed a commit to HikaruEgashira/goose that referenced this pull request Oct 3, 2025
Signed-off-by: Kyle 🐆 <kyle@privkey.io>
Signed-off-by: HikaruEgashira <hikaru-egashira@c-fo.com>

Development

Successfully merging this pull request may close these issues.

Goose stops responding without any error message once conversation hits context limit on openrouter.

5 participants