paulzyzy/YOLO-MARL

YOLO-MARL: You Only LLM Once for Multi-Agent Reinforcement Learning

This is the official repository for the paper "YOLO-MARL: You Only LLM Once for Multi-agent Reinforcement Learning".

Framework Diagram

YOLO-MARL supports both ChatGPT and Claude. Put your API key in the YOLO-MARL/ directory, in a file named claude_KEY.txt or openai_KEY.txt.
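As a minimal sketch of how such a key file might be read: the helper name load_api_key and the provider labels "claude" and "openai" are hypothetical, and the repository's own loading code may differ. Only the two file names come from this README.

```python
from pathlib import Path

# File names follow this README; the provider labels are illustrative only.
KEY_FILES = {"claude": "claude_KEY.txt", "openai": "openai_KEY.txt"}


def load_api_key(provider: str, root: str = "YOLO-MARL") -> str:
    """Return the API key stored in <root>/claude_KEY.txt or <root>/openai_KEY.txt."""
    if provider not in KEY_FILES:
        raise ValueError(f"unknown provider: {provider!r}")
    key_path = Path(root) / KEY_FILES[provider]
    if not key_path.is_file():
        raise FileNotFoundError(f"expected an API key file at {key_path}")
    # Strip surrounding whitespace, e.g. the trailing newline editors add.
    return key_path.read_text(encoding="utf-8").strip()
```

Keeping the key in a separate text file keeps it out of the source code; consider adding both file names to .gitignore so they are never committed.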

In YOLO-MARL/src/config/default.yaml, set use_llm yourself: False to run the MARL baselines, True to run the YOLO-MARL method.

All the baselines used here are based on EPyMARL. For the dependencies, refer to EPyMARL, then install the OpenAI or Claude dependencies:

pip install openai     # if you want to use ChatGPT as the LLM API
pip install anthropic  # if you want to use Claude as the LLM API

Baseline Experiments

For our baseline experiments, all hyperparameters for the Level-Based Foraging (LBF), Multi-Robot Warehouse (RWARE), and MPE environments can be found in "Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks".

For the YOLO-MARL method, set the environment in LLM-copilot-RL/LBF/src/prompts/config/config.yaml before you generate the LLM planning function.

To generate the planning function for the test environment with YOLO-MARL:

python YOLO-MARL/src/prompts/codeGeneration.py

For YOLO-MARL training, set the hyperparameters llm_reward and penalty for mixed_constant in YOLO-MARL/src/envs/llm_wrapper.py, then launch a run, for example:

LBF:

python src/main.py --config=qmix --env-config=gymma with env_args.time_limit=50 env_args.key="lbforaging:Foraging-8x8-2p-3f-v3"
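The role of the two hyperparameters can be illustrated with a small sketch. It assumes, as one plausible reading of the method, that an agent whose action agrees with the task proposed by the LLM-generated planning function gets a bonus of llm_reward added to its environment reward, and otherwise has penalty subtracted. The function name shape_rewards and its arguments are hypothetical and are not taken from llm_wrapper.py.

```python
from typing import List, Sequence


def shape_rewards(
    env_rewards: Sequence[float],
    follows_plan: Sequence[bool],
    llm_reward: float,
    penalty: float,
) -> List[float]:
    """Add a constant bonus or penalty to each agent's environment reward.

    follows_plan[i] is True when agent i's action matches the task assigned
    by the planning function (an assumption of this sketch).
    """
    if len(env_rewards) != len(follows_plan):
        raise ValueError("need one follows_plan flag per agent reward")
    return [
        reward + (llm_reward if ok else -penalty)
        for reward, ok in zip(env_rewards, follows_plan)
    ]


# Two agents: the first follows the plan, the second does not.
shaped = shape_rewards([1.0, 0.0], [True, False], llm_reward=0.5, penalty=0.25)
print(shaped)  # [1.5, -0.25]
```

Under this reading, larger values of llm_reward and penalty make the learned policy track the planning function more closely, while smaller values leave more weight on the environment reward.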

You can run experiments in these environments as follows:

LBF:

python src/main.py --config=qmix --env-config=gymma with env_args.time_limit=50 env_args.key="lbforaging:Foraging-8x8-2p-3f-v3"

RWARE:

python src/main.py --config=mappo --env-config=gymma with env_args.time_limit=500 env_args.key="rware:rware-tiny-2ag-v2"

MPE:

python src/main.py --config=qmix --env-config=gymma with env_args.time_limit=25 env_args.key="pz-mpe-simple-spread-v3"

Note that for the MPE environments tag (predator-prey) and adversary, we provide pre-trained prey and adversary policies. These can be used to control the respective agents to make these tasks fully cooperative (used in the paper) by setting env_args.pretrained_wrapper="PretrainedTag" or env_args.pretrained_wrapper="PretrainedAdversary".

SMAC:

python src/main.py --config=qmix --env-config=sc2 with env_args.map_name="3s5z"

Below, we provide the base environment and key / map name for all the environments evaluated in "Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks":

  • LBF: all with --env-config=gymma with env_args.time_limit=50 env_args.key="..."
    • 8x8-2p-2f-coop: lbforaging:Foraging-8x8-2p-2f-coop-v3
    • 8x8-2p-2f-2s-coop: lbforaging:Foraging-2s-8x8-2p-2f-coop-v3
    • 10x10-3p-3f: lbforaging:Foraging-10x10-3p-3f-v3
    • 10x10-3p-3f-2s: lbforaging:Foraging-2s-10x10-3p-3f-v3
    • 15x15-3p-5f: lbforaging:Foraging-15x15-3p-5f-v3
    • 15x15-4p-3f: lbforaging:Foraging-15x15-4p-3f-v3
    • 15x15-4p-5f: lbforaging:Foraging-15x15-4p-5f-v3
  • RWARE: all with --env-config=gymma with env_args.time_limit=500 env_args.key="..."
    • tiny 2p: rware:rware-tiny-2ag-v2
    • tiny 4p: rware:rware-tiny-4ag-v2
    • small 4p: rware:rware-small-4ag-v2
  • MPE: all with --env-config=gymma with env_args.time_limit=25 env_args.key="..."
    • simple speaker listener: pz-mpe-simple-speaker-listener-v4
    • simple spread: pz-mpe-simple-spread-v3
    • simple adversary: pz-mpe-simple-adversary-v3 with additional env_args.pretrained_wrapper="PretrainedAdversary"
    • simple tag: pz-mpe-simple-tag-v3 with additional env_args.pretrained_wrapper="PretrainedTag"
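The list above maps directly onto command lines. The following sketch assembles them from an algorithm name, an environment family, and a key; the helper build_command and the TIME_LIMITS table are hypothetical conveniences, while the flags, time limits, and keys are the ones listed in this README.

```python
from typing import Optional

# Episode time limits per environment family, as listed above.
TIME_LIMITS = {"LBF": 50, "RWARE": 500, "MPE": 25}


def build_command(
    algo: str,
    family: str,
    env_key: str,
    pretrained_wrapper: Optional[str] = None,
) -> str:
    """Return the src/main.py command line for a gymma environment."""
    if family not in TIME_LIMITS:
        raise ValueError(f"unknown environment family: {family!r}")
    parts = [
        "python",
        "src/main.py",
        f"--config={algo}",
        "--env-config=gymma",
        "with",
        f"env_args.time_limit={TIME_LIMITS[family]}",
        f'env_args.key="{env_key}"',
    ]
    # Only the MPE adversary and tag tasks need a pre-trained opponent wrapper.
    if pretrained_wrapper is not None:
        parts.append(f'env_args.pretrained_wrapper="{pretrained_wrapper}"')
    return " ".join(parts)


print(build_command("qmix", "MPE", "pz-mpe-simple-tag-v3", "PretrainedTag"))
```

The printed string can be pasted into a shell or used to script a sweep over several keys.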

Citing YOLO-MARL

We are still working on SMAC support and hope to update the SMAC part soon.
