Automated Self Feedback #4220
I would have thought to log the number of warnings/errors to a Python dict (hash table) so that this data can be used at runtime. If we have that data available, we can use it for all sorts of purposes, including a reward/fitness function, but also to "score" certain combinations of command + parameters. That would make it pretty easy to bail out (even in continuous mode) if the LLM keeps making mistakes by coming up with commands or arguments that don't exist.

We could also provide a meaningful error message that the agent can forward to the LLM, e.g. something like "Command does not exist", "Command does not support the following argument", or "Argument X is not of the right type/format". The LLM would be able to make much better choices based on this sort of info.

Imagine running with -y -100 and the agent keeps hallucinating command/param combinations: we would support a configurable threshold for allowing these mistakes (say 2-3 times), and if the LLM is not adapting by then, we would trigger self-feedback to change the action/command. Some initial heuristics would include the error categories above (unknown command, unsupported argument, wrong argument type/format); a sketch follows below.

We would come up with good error messages for these so that the LLM can provide better-tailored actions/commands. The error message itself could include the name of the command and a copy of its description string, including a valid example. With this sort of system we could even tell the LLM right away which command + param combinations were previously executed successfully. All of this touches on #3668.
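A minimal sketch of how such a tracker could look (all names here, such as `CommandErrorTracker` and `ERROR_MESSAGES`, are hypothetical and not existing Auto-GPT APIs):

```python
from collections import defaultdict

# Hypothetical error categories matching the messages discussed above.
ERROR_MESSAGES = {
    "unknown_command": "Command '{command}' does not exist.",
    "unknown_argument": "Command '{command}' does not support the argument '{argument}'.",
    "bad_argument_type": "Argument '{argument}' of command '{command}' is not of the right type/format.",
}

class CommandErrorTracker:
    """Counts warnings/errors per (command, error kind) so the agent can
    score command + parameter combinations and trigger self-feedback
    once a configurable threshold is exceeded."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.counts: dict[tuple[str, str], int] = defaultdict(int)

    def record(self, command: str, kind: str, **context) -> str:
        """Log one failure and return a tailored error message for the LLM."""
        self.counts[(command, kind)] += 1
        return ERROR_MESSAGES[kind].format(command=command, **context)

    def should_self_correct(self, command: str) -> bool:
        """True once any error kind for this command crosses the threshold."""
        return any(
            count >= self.threshold
            for (cmd, _), count in self.counts.items()
            if cmd == command
        )
```

Usage would be along the lines of calling `tracker.record("browse_website", "unknown_argument", argument="url2")` when a command fails, and checking `tracker.should_self_correct("browse_website")` before each step in continuous mode.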
This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days.
Hi, I am new to open source and I want to contribute, can you please help?
We can, please consider joining discord to learn more.
Hey, @Boostrix! I would love to start contributing to the project. I do not have lots of experience, so I was looking at the issues with the "good first issue" label, and I think it would be a good idea for me to contribute to an issue in conjunction with other people. Do you know if this issue is being worked on? Maybe I can join and somehow help?
Howdy, best to get in touch via discord, would that work for ya?
@Boostrix, yeah! I am a member already, but not sure in what channel to post or reference this, or how to go about it. Should I just do it in general? I feel it's going to get lost there.
Just to note, this issue is active in discord here, so this could be a good place to continue the conversation, just a thought.
Duplicates
Summary 💡
In continuous mode, we should provide an option to trigger automated self-feedback based on some configurable threshold, such as the number of errors/warnings the system is seeing (for instance when executing commands that don't exist), so that the agent can "self-correct". Self-correction should include a list of valid commands/parameters, as sketched below.
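A rough sketch of composing that self-correction feedback, assuming a hypothetical `registry` object that exposes the agent's known commands, their parameter signatures, and descriptions (none of these attribute names are taken from the actual Auto-GPT codebase):

```python
def build_self_correction_prompt(registry) -> str:
    """Build the feedback message sent to the LLM once the configurable
    error threshold is hit, listing every valid command and its parameters."""
    lines = ["Recent commands failed repeatedly. Valid commands and parameters:"]
    for command in registry.commands.values():
        params = ", ".join(f"{p.name}: {p.type}" for p in command.parameters)
        lines.append(f"- {command.name}({params}): {command.description}")
    return "\n".join(lines)
```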
Related:
Examples 🌈
No response
Motivation 🔦
No response