-
Notifications
You must be signed in to change notification settings - Fork 2.6k
fix: auto-compact on context limit error #3635
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: auto-compact on context limit error #3635
Conversation
|
thanks @kwsantiago this looks interesting, seems a few things combined (wasn't aware of that openrouter stuf f- we have some other compaction things happening so hopefully wont' hit that or need it?). Can you talk a bit more about how this prevents - there is new state with the token counting, not sure if that helps with it, or is it sniffing out error messages. I wonder if the openrouter changex can be a separate PR to make this change clearer, but I like the direction |
|
@michaelneale Thank you for the feedback! The current state of this PR is that it parses error messages after hitting the limit and stores the discovered limit in a token tracker, but doesn't use this to prevent future errors. I had initially added the OpenRouter middle-out feature in here since it was mentioned in the issue comments #1303 (comment). At your suggestion, what I think we can do here is split this out into 2 PRs:
I'll begin working on this, but if you have any other feedback, please let me know. |
|
@michaelneale I figured what I had was too complex for solving this specific issue and better left as separate PRs (one PR for enhancing context limit error handling and another PR for the OpenRouter middle-out support). Let me know if you approve of the simple change here to solve #1303 and I can work on the other PRs using this one as a base. |
DOsinga
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking at this. Maybe we should have a little discussion on Discord (tag me as DOsinga?) on what should happen here. @katzdave has good improvements on compacting, but we still have this situation where we naturally run out of context - presumably often because a tool call did it.
I think we would need some more active management of the message history
|
Sounds good @DOsinga I made a thread in Discord to discuss. https://discord.com/channels/1287729918100246654/1344318255266795641/1399727098574143508 I'm turning this PR into a draft until we get concrete direction on how we want to move forward with this fix. |
|
@DOsinga let me know what you think. Based on your proposed solution from the Discord thread, I added an initial implementation for automatic recovery when tool responses are too big and hit context limits. In this way, Goose recovers automatically instead of dying silently when tools return huge responses. |
|
I think @katzdave should have a look how this fits in with his efforts |
|
Ok, so a previous attempt to unify http clients was this: I'm currently working on this: to unify some other stuff, but was about to copy the client sharing from the firrst (closed) PR. I'm thinking we should get that in and that should make your life easier. I think having a fixed timeout for now would be fine actually for now. I imagine we replace all the the environment variables with true settings soon anyway. ceterum censeo variabiles ambientis esse delendas |
|
Sounds good to me @DOsinga let me know when we're in a good spot and I can finish this up in a way that makes sense to all stakeholders. Thank you! |
|
thank you for your efforts! |
|
@kwsantiago very nice looking focussed - some changes merged recently do you want to update this to main? (I could also push to it if you like). Curious if this helps with the openrouter context exhaustion. I think we are in a better spot now to work on this. On openrouter - is there any downside to having middle-out on by default? (ideally we wouldn't need it, but if it doesn't kick in until it absolutely has to then why not). |
|
I think this looks good - has tests, and if we can update it to main and check it is ok, lets get it in. For open router I am thinking of adding in middle-out by default (why not) anyway. |
|
@michaelneale sounds good, if you don't get to it, I can do this tonight (EST). Regarding open router middle-out support, should I resurrect that in a follow up PR or within this PR? I know we had originally discussed taking it out which is why I got rid of it in this branch, although it is in some of my earlier commits if we want to dig it back out. |
|
@kwsantiago yeah if you want to add a middle-out in another PR - go ahead (I have an open PR which enables some defaults for startup experience so could always do it there) - if you have time, go ahead and tag me |
|
@katzdave Thanks for the detailed feedback! I've addressed the issues:
I don't have access to test with actual providers (databricks/openrouter), so I'd appreciate if someone could verify the fix works end-to-end. I won't have the bandwidth to iterate much further on this, but the changes should handle the reported issues. Feel free to edit the branch as needed. |
|
@kwsantiago can you run fmt again? |
|
@michaelneale I just committed cargo fmt. |
|
@kwsantiago sorry again - but can you update to main, hopefully will be last time. Will try this now, I think this may be good. |
f76bb52 to
4634278
Compare
michaelneale
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice @kwsantiago - took a while but got there in the end!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, nice work! I think we can revert the changes in utils/errors.rs completely; taking it for one last spin quickly will update shortly.
Confirming lets completely revert utils/errors.rs and then all set to merge. |
michaelneale
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah - what @katzdave says
Signed-off-by: Kyle 🐆 <kyle@privkey.io>
8f2ce26 to
e5c3982
Compare
|
@katzdave @michaelneale should be good to go, let me know if there is anything else here. Appreciate all the feedback! |
|
@kwsantiago did you mean to remove more than just error.rs changes? as looks a lot smaller now, as there were utils.rs changes which I think were needed? (a force push so can't see the history) edit: yes you did, ignore me! nice! |
|
@michaelneale that was due to addressing @katzdave requested changes here: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, thanks! took it for quick spin
Signed-off-by: Kyle 🐆 <kyle@privkey.io>
Signed-off-by: Kyle 🐆 <kyle@privkey.io> Signed-off-by: Matt Donovan <mattddonovan@protonmail.com>
Signed-off-by: Kyle 🐆 <kyle@privkey.io> Signed-off-by: HikaruEgashira <hikaru-egashira@c-fo.com>
Summary
Changes
ProviderError::ContextLengthExceededin agent reply loopauto_compact::perform_compactionto reduce conversation size without threshold checkingTesting