Automated Self Feedback #4220
I would have thought to log the number of warnings/errors to a Python dict (hash table) so that this data can be used at runtime. If we have that data available, we can use it for all sorts of purposes, including a reward/fitness function, but also to "score" certain combinations of command + parameters. That would make it pretty easy to bail out (even in continuous mode) if the LLM keeps making mistakes by coming up with commands or arguments that don't exist.

We could also provide a meaningful error message that the agent can forward to the LLM, e.g. something like "Command does not exist", "Command does not support the following argument", or "Argument X is not of the right type/format". The LLM would be able to make much better choices based on this sort of info.

Imagine running with -y -100 and the agent keeps hallucinating command/param combinations: we would support a configurable threshold for allowing these mistakes (say 2-3 times), and if the LLM is not adapting by then, we would trigger self-feedback to change the action/command. Some initial heuristics would include the error categories above (unknown command, unsupported argument, wrong argument type/format); a sketch follows below.

We would come up with good error messages for these so that the LLM can provide better-tailored actions/commands. The error message itself could include the name of the command and a copy of its description string, including a valid example. With this sort of system we could even tell the LLM right away which command + param combinations were previously executed successfully. All of this touches on #3668.
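A minimal sketch of how such a tracker could look (all names here, such as `CommandErrorTracker` and `ERROR_MESSAGES`, are hypothetical and not existing Auto-GPT APIs):

```python
from collections import defaultdict

# Hypothetical error categories matching the messages discussed above.
ERROR_MESSAGES = {
    "unknown_command": "Command '{command}' does not exist.",
    "unknown_argument": "Command '{command}' does not support the argument '{argument}'.",
    "bad_argument_type": "Argument '{argument}' of command '{command}' is not of the right type/format.",
}

class CommandErrorTracker:
    """Counts warnings/errors per (command, error kind) so the agent can
    score command + parameter combinations and trigger self-feedback
    once a configurable threshold is exceeded."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.counts: dict[tuple[str, str], int] = defaultdict(int)

    def record(self, command: str, kind: str, **context) -> str:
        """Log one failure and return a tailored error message for the LLM."""
        self.counts[(command, kind)] += 1
        return ERROR_MESSAGES[kind].format(command=command, **context)

    def should_self_correct(self, command: str) -> bool:
        """True once any error kind for this command crosses the threshold."""
        return any(
            count >= self.threshold
            for (cmd, _), count in self.counts.items()
            if cmd == command
        )
```

Usage would be along the lines of calling `tracker.record("browse_website", "unknown_argument", argument="url2")` when a command fails, and checking `tracker.should_self_correct("browse_website")` before each step in continuous mode.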
This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days.
Hi, I am new to open source and I want to contribute, can you please help?
We can, please consider joining discord to learn more.
Hey, @Boostrix! I would love to start contributing to the project. I do not have lots of experience, so I was looking at the issues with the "good first issue" label, and I think it would be a good idea for me to contribute to an issue in conjunction with other people. Do you know if this issue is being worked on? Maybe I can join and somehow help?
Howdy, best to get in touch via discord, would that work for ya?
@Boostrix, yeah! I am a member already, but not sure in what channel to post or reference this, or how to go about it. Should I just do it in general? I feel it's going to get lost there.
Just to note, this issue is active in discord here, so this could be a good place to continue the conversation, just a thought.
Duplicates
Summary 💡
In continuous mode, we should provide an option to trigger automated self-feedback based on some configurable threshold, such as the number of errors/warnings the system is seeing (for instance when executing commands that don't exist), so that the agent can "self-correct". Self-correction should include a list of valid commands/parameters, as sketched below.
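A rough sketch of composing that self-correction feedback, assuming a hypothetical `registry` object that exposes the agent's known commands, their parameter signatures, and descriptions (none of these attribute names are taken from the actual Auto-GPT codebase):

```python
def build_self_correction_prompt(registry) -> str:
    """Build the feedback message sent to the LLM once the configurable
    error threshold is hit, listing every valid command and its parameters."""
    lines = ["Recent commands failed repeatedly. Valid commands and parameters:"]
    for command in registry.commands.values():
        params = ", ".join(f"{p.name}: {p.type}" for p in command.parameters)
        lines.append(f"- {command.name}({params}): {command.description}")
    return "\n".join(lines)
```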
Related:
Examples 🌈
No response
Motivation 🔦
No response