Performance Results #35

KornbergFresnel opened this issue Feb 16, 2022 · 0 comments
Labels: documentation (Improvements or additions to documentation)

KornbergFresnel commented Feb 16, 2022

Throughput Comparison

All experiment results listed were obtained on one of the following hardware settings: (1) System #1: a 32-core computing node with two GPUs; (2) System #2: a two-node cluster, each node with 128 cores and a single GPU. All GPUs mentioned are of the same model (NVIDIA RTX 3090).

Throughput comparison between MALib and existing RL frameworks. Due to resource limitations (32 cores, 256 GB RAM), RLlib fails under heavy loads (CPU case: #workers > 32; GPU case: #workers > 8). MALib outperforms the other frameworks in the CPU-only setting and, despite the higher level of abstraction it introduces, achieves performance comparable to the highly tailored Sample-Factory framework with GPU. To better illustrate MALib's scalability, we show MA-Atari and SC2 throughput on System #2 under different worker settings; the 512-worker group on SC2 fails due to resource limitations.

merged_throughput_report
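For reference, the throughput numbers in plots like the one above are conventionally reported as aggregate environment frames per second. A minimal sketch of that calculation (a hypothetical helper, not part of MALib's API):

```python
def throughput_fps(frames_per_worker, num_workers, wall_time_s):
    """Aggregate sampling throughput: total frames collected by all
    rollout workers divided by wall-clock time (frames per second)."""
    total_frames = frames_per_worker * num_workers
    return total_frames / wall_time_s

# e.g. 32 workers each collecting 10,000 frames over 8 seconds of wall time:
# throughput_fps(10_000, 32, 8.0) -> 40000.0
```

Scaling the worker count (the x-axis of the plots) ideally scales total frames linearly, which is why RLlib's failure beyond 32 CPU workers caps its curve.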

Additional comparisons between MALib and other distributed RL training frameworks. (Left): cluster throughput of MALib on System #3 in 2-player MA-Atari and 3-player SC2. (Middle): 4-player MA-Atari throughput comparison on System #1 without GPU. (Right): 4-player MA-Atari throughput comparison on System #1 with GPU.

merged_throughput_report_4p

Wall-time & Performance of PB-MARL Algorithm

Comparisons of PSRO between MALib and OpenSpiel. (a) shows that MALib matches OpenSpiel's performance on exploitability; (b) shows that MALib converges 3x faster than OpenSpiel; (c) shows that MALib achieves higher execution efficiency than OpenSpiel, requiring less time to iterate the same number of learning steps, which suggests MALib can scale to more complex tasks that require many more steps.

pb-marl_wall_time
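The exploitability metric compared in (a) measures how much each player could gain by deviating to a best response. A minimal sketch for a two-player zero-sum matrix game (rock-paper-scissors); this is illustrative only, not OpenSpiel's or MALib's implementation:

```python
import numpy as np

# Row player's payoff matrix for rock-paper-scissors (zero-sum).
RPS = np.array([[ 0., -1.,  1.],
                [ 1.,  0., -1.],
                [-1.,  1.,  0.]])

def exploitability(payoff, row_strategy, col_strategy):
    """Average best-response gain over both players (NashConv / 2).
    Zero iff the strategy profile is a Nash equilibrium."""
    br_row = np.max(payoff @ col_strategy)       # row player's best-response value
    br_col = np.max(-payoff.T @ row_strategy)    # column player's (negated payoffs)
    value_row = row_strategy @ payoff @ col_strategy
    value_col = -value_row                       # zero-sum
    return ((br_row - value_row) + (br_col - value_col)) / 2

uniform = np.ones(3) / 3
# Uniform play is the Nash equilibrium of RPS, so its exploitability is 0;
# both players always playing rock has exploitability 1.
```

PSRO drives this quantity toward zero as the population of policies grows, which is what panels (a) and (b) plot against iterations and wall time.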

Typical MARL Algorithms

Results on Multi-agent Particle Environments

Comparisons of MADDPG on simple adversary under different rollout worker settings. Figures in the top row depict each agent's episode reward w.r.t. the number of sampled episodes, indicating that MALib converges faster than RLlib given the same number of sampled episodes. Figures in the bottom row show the average time and average episode reward at the same number of sampled episodes, indicating that MALib achieves a 5x speedup over RLlib.

simple_adversary
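The 5x figure above is a wall-clock ratio at a fixed sample budget. A minimal sketch of that statistic (hypothetical helper, not part of MALib or RLlib):

```python
def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate framework reaches the same
    number of sampled episodes as the baseline."""
    return baseline_seconds / candidate_seconds

# e.g. if the baseline needs 500 s and the candidate 100 s for the same
# episode budget, speedup(500, 100) -> 5.0, i.e. a 5x speedup.
```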

Simple Crypto

simple_crypto

Simple Push

simple_push

Simple Reference

simple_reference

Simple Speaker Listener

simple_speaker_listener

Simple Tag

simple_tag

KornbergFresnel added the documentation label on Feb 19, 2022.