
Implement new reinforcement learning algorithms #41

Open
16 of 26 tasks
StepNeverStop opened this issue Jul 1, 2021 · 1 comment
Assignees: StepNeverStop
Labels: enhancement (New feature or request), optimization (Better performance or solution)

Comments

StepNeverStop (Owner) commented Jul 1, 2021

StepNeverStop added the enhancement (New feature or request) and optimization (Better performance or solution) labels on Jul 1, 2021
StepNeverStop self-assigned this on Jul 1, 2021
StepNeverStop added a commit that referenced this issue Jul 2, 2021
…training. (#41,#25,#31)

1. renamed the variable "is_lg_batch_size" to "can_sample" (see the sketch below)
2. optimized the Unity wrapper
3. optimized the multi-agent replay buffers
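A minimal sketch of what the renamed `can_sample` check typically looks like, assuming a simple list-backed buffer; the class and attribute names here are illustrative, not RLs' actual buffer API:

```python
# Hypothetical sketch: a buffer is ready to sample once it holds at
# least one full batch. `ReplayBuffer` and its fields are assumptions
# for illustration only.
class ReplayBuffer:
    def __init__(self, capacity: int, batch_size: int):
        self.capacity = capacity
        self.batch_size = batch_size
        self._storage = []

    @property
    def can_sample(self) -> bool:
        # reads more clearly than the old name `is_lg_batch_size`
        return len(self._storage) >= self.batch_size
```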
StepNeverStop added a commit that referenced this issue Jul 4, 2021
1. fixed the n-step replay buffer (an illustrative computation follows this list)
2. reconstructed the representation net
3. removed 'use_stack'
4. implemented multi-agent algorithms with shared parameters
5. optimized the agent network
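To make item 1 concrete, here is a generic n-step return computation of the kind such a buffer must produce; a sketch under common conventions, not RLs' actual implementation:

```python
# Generic n-step target:
#   G = r_t + γ r_{t+1} + ... + γ^{n-1} r_{t+n-1} + γ^n V(s_{t+n}),
# truncated at the first terminal transition. Illustrative only.
def n_step_return(rewards, dones, bootstrap_value, gamma=0.99):
    g = bootstrap_value
    for r, d in zip(reversed(rewards), reversed(dones)):
        # a terminal step zeroes out everything that comes after it
        g = r + gamma * g * (1.0 - d)
    return g

# e.g. n_step_return([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], 10.0, gamma=0.9)
# evaluates to 10.0 (each backward step computes 1 + 0.9 * 10).
```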
StepNeverStop added a commit that referenced this issue Jul 28, 2021
1. removed the SARL off-policy algorithm pd_ddpg, because it is not mainstream
2. updated README
3. removed `iql` and added the script `IndependentMA.py` instead to implement independent multi-agent algorithms
4. optimized summary writing
5. moved NamedDict from 'rls.common.config' to 'rls.common.specs'
6. updated the example config
7. updated `.gitignore`
8. added the property `is_multi` to identify whether a training task is SARL or MARL, for both Unity and Gym
9. reconstructed the inheritance relationships between algorithms and their superclasses
10. replaced `1.e+18` in the yaml files with a large integer literal, because YAML parses `1.e+18` as a float and an integer is wanted (see the sketch after this list)
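Item 10 comes down to how YAML resolves scalars: scientific notation like `1.e+18` resolves to a float, while a plain run of digits resolves to an integer. A two-line demonstration with PyYAML (the key name `limit` is just a placeholder):

```python
import yaml

# YAML scientific notation resolves to float; plain digits resolve to int.
print(type(yaml.safe_load("limit: 1.e+18")["limit"]))               # <class 'float'>
print(type(yaml.safe_load("limit: 1000000000000000000")["limit"]))  # <class 'int'>
```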
StepNeverStop added a commit that referenced this issue Jul 30, 2021
1. fixed bugs in maddpg and vdn
2. implemented `VDNMixer` (sketched below)
3. optimized the parameter-synchronizing function
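For reference, VDN's mixing step is just a sum of per-agent Q-values, so a `VDNMixer` can be as small as the sketch below; the tensor shapes are assumptions for illustration, not RLs' exact interface:

```python
import torch
import torch.nn as nn

class VDNMixer(nn.Module):
    """Value-Decomposition Network mixer: Q_tot = sum_i Q_i."""

    def forward(self, agent_qs: torch.Tensor) -> torch.Tensor:
        # agent_qs: [batch, n_agents], each entry the chosen-action
        # Q-value of one agent; returns [batch, 1].
        return agent_qs.sum(dim=-1, keepdim=True)
```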
StepNeverStop changed the title from "Implement multi-agent reinforcement learning algorithms" to "Implement new reinforcement learning algorithms" on Aug 25, 2021
StepNeverStop added a commit that referenced this issue Aug 27, 2021
StepNeverStop added a commit that referenced this issue Aug 28, 2021
1. optimized `vdn`
StepNeverStop added a commit that referenced this issue Aug 30, 2021
1. fixed a bug in the pettingzoo wrapper that failed to scale continuous actions from [-1, 1] to [low, high] (see the sketch after this list)
2. fixed bugs in `sac`, `sac_v`, `tac`, `maxsqn`
3. implemented `masac`
4. fixed bugs in `squash_action`
5. implemented PER in MARL
6. added several env configuration files for the pettingzoo platform
7. updated README
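The wrapper fix in item 1 is the standard affine rescaling from the policy's normalized output range to the environment's action bounds; a generic sketch (the function name is hypothetical):

```python
import numpy as np

def rescale_action(action: np.ndarray, low: np.ndarray, high: np.ndarray) -> np.ndarray:
    """Affine map from [-1, 1] to [low, high]: -1 -> low, +1 -> high."""
    return low + (action + 1.0) * 0.5 * (high - low)
```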
StepNeverStop (Owner, Author) commented:

  • Optimize the training code in MARL to avoid excessive key-value indexing

StepNeverStop added a commit that referenced this issue Aug 31, 2021
1. implemented `dreamer v1`
2. optimized algorithm registering (a generic sketch follows this list) and added some `__init__.py` files in the `algorithms/` directory
3. optimized the class `OPLR`
4. updated README
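"Algorithm registering" in item 2 is commonly done with a name-to-class registry populated by decorators; the sketch below shows the general pattern and is not RLs' actual registry:

```python
# Generic registry pattern; all names here are illustrative assumptions.
ALGO_REGISTRY = {}

def register(name):
    def wrap(cls):
        ALGO_REGISTRY[name] = cls
        return cls
    return wrap

@register("dreamerv1")
class DreamerV1:
    pass

# later, instantiate from a config string:
algo_cls = ALGO_REGISTRY["dreamerv1"]
```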
@StepNeverStop StepNeverStop pinned this issue Aug 31, 2021
StepNeverStop added a commit that referenced this issue Sep 1, 2021
1. optimized `dreamerv1`
2. updated README
StepNeverStop added a commit that referenced this issue Sep 29, 2021
1. fixed bugs in `planet` and `dreamerv1`
2. renamed `t` to `th`
3. fixed other bugs
4. optimized code structure
5. updated README