Summary 💡
Currently someone needs to intentionally modify the code to declare "I have beaten challenge A".
But it's possible that an improvement aimed at challenge A also improves challenge B.
We need to attempt all challenges any time there is a prompt change.
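A rough sketch of what that could look like as a test suite (the challenge names, the `challenges.runner` module, and the `result.success` attribute are hypothetical placeholders, not Auto-GPT's actual API):

```python
import pytest

from challenges.runner import run_challenge  # hypothetical entry point

# Every known challenge is attempted on each prompt change, so an
# improvement aimed at challenge A is also measured against challenge B.
CHALLENGES = ["memory_a", "memory_b", "information_retrieval_a"]

@pytest.mark.parametrize("challenge", CHALLENGES)
def test_challenge_still_beaten(challenge):
    result = run_challenge(challenge)
    assert result.success, f"prompt change regressed {challenge}"
```

Wired into CI on any change to the prompt files, this would replace manually flipping a "challenge beaten" flag in the code.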
Examples 🌈
No response
Motivation 🔦
No response
waynehamadi changed the title "How do we know if a prompt improved Auto-GPT ?" → "How do we know **AUTOMATICALLY** if a prompt improved Auto-GPT ?" on May 14, 2023
waynehamadi changed the title "How do we know **AUTOMATICALLY** if a prompt improved Auto-GPT ?" → "How do we know AUTOMATICALLY if a prompt improved Auto-GPT ?" on May 14, 2023
For starters, by keeping track of the costs spent to arrive at a solution?
In other words, at least steps/API tokens + time?
In the future, maybe by tracking CPU/RAM utilization as well.
But in general we should gather data for different prompts so that we can use gnuplot to plot performance for each version/commit.
And we should probably start by using GPT to come up with N mutations for a given task (that we know works) and then use those as a baseline for future benchmarking.
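A minimal sketch of that metric collection (the `run_challenge` callable and its `success`/`steps`/`api_tokens` attributes are hypothetical; the CSV layout is illustrative):

```python
import csv
import subprocess
import time
from pathlib import Path

RESULTS_FILE = Path("benchmark_results.csv")
FIELDS = ["commit", "challenge", "success", "steps", "api_tokens", "seconds"]

def current_commit() -> str:
    """Return the short hash of the checked-out commit."""
    return subprocess.check_output(
        ["git", "rev-parse", "--short", "HEAD"], text=True
    ).strip()

def record_run(challenge: str, run_challenge) -> None:
    """Run one challenge and append its cost metrics to the CSV.

    `run_challenge` is a hypothetical callable returning an object
    with `success`, `steps`, and `api_tokens` attributes.
    """
    start = time.monotonic()
    result = run_challenge(challenge)
    elapsed = time.monotonic() - start

    new_file = not RESULTS_FILE.exists()
    with RESULTS_FILE.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if new_file:
            writer.writeheader()
        writer.writerow({
            "commit": current_commit(),
            "challenge": challenge,
            "success": result.success,
            "steps": result.steps,
            "api_tokens": result.api_tokens,
            "seconds": round(elapsed, 2),
        })
```

One CSV row per (commit, challenge) run keeps the data trivially plottable with gnuplot (or matplotlib) as performance-over-commits curves.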
This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days.