Open
Description
Reason: to support continual learning with LLMs, for example:
- Use the historical interaction log to improve an agent's performance as interaction sessions accumulate.
- Summarize data that does not fit in the context window at once.
- Personalized chat.
- Long chat.
One possible solution is:
Implement a continual learning agent and a teaching agent such that:
- The teaching agent feeds the learning goal and learning data to the learning agent. The learning agent maintains learning results in a particular form.
- The learning agent can use an LLM to update the learning results after each batch of learning data.
- When the learning agent uses an LLM for learning, it only needs the current learning results and the new batch of learning data, so learning can be performed online and continually.
- The teaching agent allows human input, similar to the user proxy agent.
- Both the learning agent and the teaching agent can be serialized and deserialized to continue the learning process.
- Optionally, the learning agent tells the teaching agent the maximum number of tokens of learning data it can receive.
- Optionally, the teaching agent tells the learning agent the minimum number of tokens needed for the next batch of learning data.
- Optionally, the teaching agent can provide feedback to the learning agent, including a numeric value indicating how good the current learning result is, and textual feedback.
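
To make the proposal concrete, here is a minimal sketch of the teaching/learning loop described above. It is not an existing FLAML API: `LearningAgent`, `TeachingAgent`, `teach`, `dump`, `load` and the `llm` callable (prompt in, text out) are all hypothetical names, learning results are assumed to be plain text, and whitespace-separated words stand in for real token counts.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, List, Optional

# Assumption: any function mapping a prompt string to a reply string.
LLM = Callable[[str], str]


@dataclass
class LearningAgent:
    goal: str = ""
    results: str = ""        # current learning results (plain text here)
    max_tokens: int = 1000   # largest batch the learner accepts

    def learn(self, batch: str, llm: LLM, feedback: str = "") -> str:
        # Only the current results and the new batch are sent to the LLM,
        # so each update is independent of earlier raw data.
        prompt = (
            f"Goal: {self.goal}\n"
            f"Current results: {self.results}\n"
            f"New data:\n{batch}\n"
            f"Feedback: {feedback}\n"
            "Return the updated results."
        )
        self.results = llm(prompt)
        return self.results


@dataclass
class TeachingAgent:
    goal: str
    data: List[str]
    cursor: int = 0          # index of the next unseen data item

    def next_batch(self, max_tokens: int) -> Optional[str]:
        # Pack whole items until the budget is reached. A single item
        # larger than the budget is still sent on its own.
        batch: List[str] = []
        used = 0
        while self.cursor < len(self.data):
            item = self.data[self.cursor]
            size = len(item.split())  # crude token count
            if batch and used + size > max_tokens:
                break
            batch.append(item)
            used += size
            self.cursor += 1
        return "\n".join(batch) if batch else None


def teach(teacher: TeachingAgent, learner: LearningAgent, llm: LLM) -> str:
    """Feed batches to the learner until the teacher runs out of data."""
    learner.goal = teacher.goal
    while (batch := teacher.next_batch(learner.max_tokens)) is not None:
        learner.learn(batch, llm)
    return learner.results


def dump(agent) -> str:
    """Serialize either agent to JSON so learning can be resumed later."""
    return json.dumps(asdict(agent))


def load(cls, text: str):
    """Rebuild an agent of type `cls` from the output of `dump`."""
    return cls(**json.loads(text))
```

Because both agents are plain dataclasses, `dump(teacher)` and `dump(learner)` can be stored between sessions, and `teach(load(TeachingAgent, ...), load(LearningAgent, ...), llm)` continues from the saved cursor. Human input and numeric feedback are left out of the loop here; the `feedback` argument of `learn` marks where textual feedback could be passed in.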
### Tasks
- [x] Review PR https://github.com/microsoft/FLAML/pull/1056
- [ ] https://github.com/microsoft/FLAML/issues/1073
- [ ] https://github.com/microsoft/FLAML/issues/1072
weilinear and gagb
Labels
enhancement (New feature or request)