
Implement new reinforcement learning algorithms #41

Open
16 of 26 tasks
StepNeverStop opened this issue Jul 1, 2021 · 1 comment
Assignees: StepNeverStop
Labels: enhancement (New feature or request), optimization (Better performance or solution)

Comments

StepNeverStop (Owner) commented Jul 1, 2021

StepNeverStop added the enhancement (New feature or request) and optimization (Better performance or solution) labels on Jul 1, 2021
StepNeverStop self-assigned this on Jul 1, 2021
StepNeverStop added a commit that referenced this issue Jul 2, 2021
…training. (#41,#25,#31)

1. renamed the variable "is_lg_batch_size" to "can_sample" (see the sketch below)
2. optimized the Unity wrapper
3. optimized the multi-agent replay buffers
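A minimal sketch of what the renamed `can_sample` check typically looks like, assuming a simple list-backed buffer; the class and attribute names here are illustrative, not RLs' actual buffer API:

```python
# Hypothetical sketch: a buffer is ready to sample once it holds at
# least one full batch. `ReplayBuffer` and its fields are assumptions
# for illustration only.
class ReplayBuffer:
    def __init__(self, capacity: int, batch_size: int):
        self.capacity = capacity
        self.batch_size = batch_size
        self._storage = []

    @property
    def can_sample(self) -> bool:
        # reads more clearly than the old name `is_lg_batch_size`
        return len(self._storage) >= self.batch_size
```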
StepNeverStop added a commit that referenced this issue Jul 4, 2021
1. fixed the n-step replay buffer (an illustrative computation follows this list)
2. reconstructed the representation net
3. removed 'use_stack'
4. implemented multi-agent algorithms with shared parameters
5. optimized the agent network
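To make item 1 concrete, here is a generic n-step return computation of the kind such a buffer must produce; a sketch under common conventions, not RLs' actual implementation:

```python
# Generic n-step target:
#   G = r_t + γ r_{t+1} + ... + γ^{n-1} r_{t+n-1} + γ^n V(s_{t+n}),
# truncated at the first terminal transition. Illustrative only.
def n_step_return(rewards, dones, bootstrap_value, gamma=0.99):
    g = bootstrap_value
    for r, d in zip(reversed(rewards), reversed(dones)):
        # a terminal step zeroes out everything that comes after it
        g = r + gamma * g * (1.0 - d)
    return g

# e.g. n_step_return([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], 10.0, gamma=0.9)
# evaluates to 10.0 (each backward step computes 1 + 0.9 * 10).
```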
StepNeverStop added a commit that referenced this issue Jul 28, 2021
1. removed the SARL off-policy algorithm pd_ddpg, because it is not mainstream
2. updated README
3. removed `iql` and added the script `IndependentMA.py` instead to implement independent multi-agent algorithms
4. optimized summary writing
5. moved NamedDict from 'rls.common.config' to 'rls.common.specs'
6. updated the example config
7. updated `.gitignore`
8. added the property `is_multi` to identify whether a training task is SARL or MARL, for both Unity and Gym
9. reconstructed the inheritance relationships between algorithms and their superclasses
10. replaced `1.e+18` in the yaml files with a large integer literal, because YAML parses `1.e+18` as a float and an integer is wanted (see the sketch after this list)
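Item 10 comes down to how YAML resolves scalars: scientific notation like `1.e+18` resolves to a float, while a plain run of digits resolves to an integer. A two-line demonstration with PyYAML (the key name `limit` is just a placeholder):

```python
import yaml

# YAML scientific notation resolves to float; plain digits resolve to int.
print(type(yaml.safe_load("limit: 1.e+18")["limit"]))               # <class 'float'>
print(type(yaml.safe_load("limit: 1000000000000000000")["limit"]))  # <class 'int'>
```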
StepNeverStop added a commit that referenced this issue Jul 30, 2021
1. fixed bugs in maddpg and vdn
2. implemented `VDNMixer` (sketched below)
3. optimized the parameter-synchronizing function
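For reference, VDN's mixing step is just a sum of per-agent Q-values, so a `VDNMixer` can be as small as the sketch below; the tensor shapes are assumptions for illustration, not RLs' exact interface:

```python
import torch
import torch.nn as nn

class VDNMixer(nn.Module):
    """Value-Decomposition Network mixer: Q_tot = sum_i Q_i."""

    def forward(self, agent_qs: torch.Tensor) -> torch.Tensor:
        # agent_qs: [batch, n_agents], each entry the chosen-action
        # Q-value of one agent; returns [batch, 1].
        return agent_qs.sum(dim=-1, keepdim=True)
```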
StepNeverStop changed the title from "Implement multi-agent reinforcement learning algorithms" to "Implement new reinforcement learning algorithms" on Aug 25, 2021
StepNeverStop added a commit that referenced this issue Aug 27, 2021
StepNeverStop added a commit that referenced this issue Aug 28, 2021
1. optimized `vdn`
StepNeverStop added a commit that referenced this issue Aug 30, 2021
1. fixed a bug in the pettingzoo wrapper that failed to scale continuous actions from [-1, 1] to [low, high] (see the sketch after this list)
2. fixed bugs in `sac`, `sac_v`, `tac`, `maxsqn`
3. implemented `masac`
4. fixed bugs in `squash_action`
5. implemented PER in MARL
6. added several env configuration files for the pettingzoo platform
7. updated README
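The wrapper fix in item 1 is the standard affine rescaling from the policy's normalized output range to the environment's action bounds; a generic sketch (the function name is hypothetical):

```python
import numpy as np

def rescale_action(action: np.ndarray, low: np.ndarray, high: np.ndarray) -> np.ndarray:
    """Affine map from [-1, 1] to [low, high]: -1 -> low, +1 -> high."""
    return low + (action + 1.0) * 0.5 * (high - low)
```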
StepNeverStop (Owner, Author) commented:

  • Optimize the training code in MARL to avoid excessive key-value indexing

StepNeverStop added a commit that referenced this issue Aug 31, 2021
1. implemented `dreamer v1`
2. optimized algorithm registering (a generic sketch follows this list) and added some `__init__.py` files in the `algorithms/` directory
3. optimized the class `OPLR`
4. updated README
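"Algorithm registering" in item 2 is commonly done with a name-to-class registry populated by decorators; the sketch below shows the general pattern and is not RLs' actual registry:

```python
# Generic registry pattern; all names here are illustrative assumptions.
ALGO_REGISTRY = {}

def register(name):
    def wrap(cls):
        ALGO_REGISTRY[name] = cls
        return cls
    return wrap

@register("dreamerv1")
class DreamerV1:
    pass

# later, instantiate from a config string:
algo_cls = ALGO_REGISTRY["dreamerv1"]
```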
@StepNeverStop StepNeverStop pinned this issue Aug 31, 2021
StepNeverStop added a commit that referenced this issue Sep 1, 2021
1. optimized `dreamerv1`
2. updated README
StepNeverStop added a commit that referenced this issue Sep 29, 2021
1. fixed bugs in `planet` and `dreamerv1`
2. renamed `t` to `th`
3. fixed other bugs
4. optimized code structure
5. updated README