This repository is based on our paper *Improving Retrospective Language Agents via Joint Policy Gradient Optimization*. It contains the IL dataset we generated, as well as demo code for our fine-tuned planner and reflector.
- The code for the different datasets is in `hotpotqa/`, `alfworld/`, and `intercode/`.
- Start IL training with `sft/finetune.sh`.
- Start RL training with `rl/finetune.sh`.
- Start the agent tests with the scripts in `agent/script`.
You can use the following commands to install the required Python packages via pip:
- `git clone https://github.com/XueyangFeng/RetroAct.git`
- `cd RetroAct`
- `pip install -r requirements.txt`
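Optionally, you can install the dependencies into an isolated environment first. This is a generic sketch; the environment name `retroact-env` is arbitrary and not something the repository prescribes:

```bash
# Optional: create and activate a virtual environment before installing.
# "retroact-env" is just an arbitrary directory name.
python -m venv retroact-env
source retroact-env/bin/activate
pip install -r requirements.txt
```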
Start IL (SFT) training:

```bash
python sft/finetune.py \
    --learning_rate 1e-4 \
    --base_model <your_base_model_path> \
    --data_path <your_sft_data_path> \
    --micro_batch_size 1 \
    --num_epochs 5 \
    --output_path <your_sft_model_path>
```
To reduce the training cost, we use an off-policy approach to train the RL algorithm. You first need to compute the reference probability (ref_prob) of each token with `rl/ref_prob.py`.
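The invocation below is only a sketch: the argument names (`--base_model`, `--data_path`, `--output_path`) are assumptions borrowed from the fine-tuning scripts, not taken from `rl/ref_prob.py` itself; check that script for its actual interface.

```bash
# Hypothetical invocation -- all argument names below are assumptions
# borrowed from sft/finetune.py and rl/finetune.py; verify against rl/ref_prob.py.
python rl/ref_prob.py \
    --base_model <your_sft_model_path> \
    --data_path <your_rl_data_path> \
    --output_path <your_ref_prob_path>
```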
Then you can start RL training:
```bash
python rl/finetune.py \
    --learning_rate 1e-4 \
    --base_model <your_sft_model_path> \
    --micro_batch_size 1 \
    --num_epochs 3 \
    --output_path <your_rl_model_path> \
    --regular_coefficient 1.0 \
    --reflector_reward_coefficient 1.0 \
    --clip_advantage False \
    --clip_episode 0.3 \
    --rl_data_path <your_rl_data_path> \
    --regular_data_path <your_regular_data_path>
```
- Our agent framework code is based on noahshinn/reflexion
- Our IL training code is based on anchen1011/FireAct
- Our RL training code is based on RUCAIBox/RLMEC