Check that the code implementation is accurate and reasonable #34
Labels: optimization (Better performance or solution)
StepNeverStop added a commit that referenced this issue on Jan 6, 2021
StepNeverStop added a commit that referenced this issue on Jan 6, 2021
StepNeverStop added a commit that referenced this issue on Jan 6, 2021
StepNeverStop added a commit that referenced this issue on Jan 6, 2021
StepNeverStop added a commit that referenced this issue on Jan 7, 2021
StepNeverStop added a commit that referenced this issue on Jan 7, 2021
StepNeverStop added a commit that referenced this issue on Jan 9, 2021
StepNeverStop added a commit that referenced this issue on Jan 12, 2021
StepNeverStop added a commit that referenced this issue on Jul 4, 2021
StepNeverStop added a commit that referenced this issue on Jul 11, 2021
StepNeverStop added a commit that referenced this issue on Jul 12, 2021
StepNeverStop added a commit that referenced this issue on Jul 26, 2021
StepNeverStop added a commit that referenced this issue on Jul 28, 2021
StepNeverStop added a commit that referenced this issue on Jul 28, 2021
1. Moved `logger2file` from the agent class to the main loop.
2. Updated the `gym_env_list` folder.
3. Fixed bugs in `*.yaml` files.
4. Added class property `n_copys` instead of using `env._n_copys` directly.
5. Updated the README.
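Item 4 above is the classic read-only-property pattern; a minimal sketch, assuming a vectorized-env class (the class name here is hypothetical, not from the repo):

```python
class VectorizedEnv:
    """Hypothetical sketch: expose the private `_n_copys` attribute
    through a read-only `n_copys` property, so callers write
    `env.n_copys` rather than reaching into `env._n_copys`."""

    def __init__(self, n_copys: int = 1):
        self._n_copys = n_copys  # number of parallel environment copies

    @property
    def n_copys(self) -> int:
        return self._n_copys
```

This keeps the attribute private while giving agents a stable public name to read.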
StepNeverStop added a commit that referenced this issue on Jul 29, 2021
1. Added `test.yaml` for quickly verifying RL algorithms.
2. Renamed the `algos` folder to `algorithms` for better readability.
3. Removed the single-agent recorder; all algorithms (SARL & MARL) now use `SimpleMovingAverageRecoder`.
4. Removed `GymVectorizedType` in `common/specs.py`.
5. Removed `common/train/*` and implemented a unified training interface in `rls/train`.
6. Reconstructed the `make_env` function in `rls/envs/make_env`.
7. Optimized the `load_config` function.
8. Moved `off_policy_buffer.yaml` to `rls/configs/buffer`.
9. Removed configurations such as `eval_while_train`, `add_noise2buffer`, etc.
10. Optimized the environments' configuration files.
11. Optimized environment wrappers and implemented a unified env interface for `gym` and `unity`; see `env_base.py`.
12. Updated the Dockerfiles.
13. Updated the README.
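A minimal sketch of what a unified env interface (item 11) might look like; the method names, signatures, and the `DummyEnv` backend are assumptions for illustration, not the actual contents of `env_base.py`:

```python
from abc import ABC, abstractmethod


class EnvBase(ABC):
    """Hypothetical unified interface that both gym-style and
    unity-style backends would implement."""

    @abstractmethod
    def reset(self):
        """Return the initial observation."""

    @abstractmethod
    def step(self, action):
        """Return (obs, reward, done) after applying `action`."""


class DummyEnv(EnvBase):
    """Trivial backend used only to demonstrate the interface."""

    def reset(self):
        return 0

    def step(self, action):
        return action, 1.0, False
```

With such a base class, training code depends only on `EnvBase`, and each platform supplies its own wrapper implementing `reset`/`step`.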
StepNeverStop added a commit that referenced this issue on Jul 29, 2021
1. Optimized `iTensor_oNumpy`.
2. Renamed `train_time_step` to `rnn_time_steps` and `burn_in_time_step` to `burn_in_time_steps`.
3. Optimized `on_policy_buffer.py`.
4. Optimized `EpisodeExperienceReplay`.
5. Fixed off-policy RNN training.
6. Optimized and fixed `to_numpy` and `to_tensor`.
7. Reimplemented `call` and invoked it from `__call__`.
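Item 7 describes a common delegation pattern; a sketch, assuming `__call__` centralizes conversion boilerplate and `call` carries the per-algorithm logic (the `DoublePolicy` subclass is purely illustrative):

```python
class Policy:
    """Sketch: `__call__` is the public entry point that would handle
    input conversion (e.g. numpy -> tensor, omitted here), then
    delegate to `call`, which each subclass overrides."""

    def __call__(self, x):
        # shared pre/post-processing would live here
        return self.call(x)

    def call(self, x):
        raise NotImplementedError


class DoublePolicy(Policy):
    """Illustrative subclass: its `call` just doubles the input."""

    def call(self, x):
        return 2 * x
```

Callers always invoke the object directly, so the conversion logic is written once instead of in every subclass.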
StepNeverStop added a commit that referenced this issue on Jul 30, 2021
StepNeverStop added a commit that referenced this issue on Aug 26, 2021
StepNeverStop added a commit that referenced this issue on Aug 27, 2021
StepNeverStop added a commit that referenced this issue on Aug 27, 2021
StepNeverStop added a commit that referenced this issue on Aug 28, 2021
StepNeverStop added a commit that referenced this issue on Aug 28, 2021
1. Added `_has_global_state` in the pettingzoo env wrapper and MARL policies.
StepNeverStop added a commit that referenced this issue on Aug 28, 2021
1. Removed a redundant function.
2. Optimized `q_target_func`.
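The changelog doesn't show `q_target_func` itself; as a sketch under the usual assumption, a shared Q-target helper computes the one-step Bellman target (signature and name here are hypothetical):

```python
def q_target(reward: float, gamma: float, done: float, q_next: float) -> float:
    """One-step TD target: r + gamma * (1 - done) * max_a' Q'(s', a').
    `q_next` is assumed to be the max target-network value at s';
    `done` is 1.0 at terminal transitions, masking the bootstrap."""
    return reward + gamma * (1.0 - done) * q_next
```

Sharing one such helper across Q-learning variants avoids each algorithm reimplementing the terminal-state masking.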
StepNeverStop added a commit that referenced this issue on Aug 28, 2021
StepNeverStop added a commit that referenced this issue on Aug 28, 2021
StepNeverStop added a commit that referenced this issue on Aug 29, 2021
StepNeverStop added a commit that referenced this issue on Aug 30, 2021
…ion `squash_action`. (#34) thanks to @BlueFisher
StepNeverStop added a commit that referenced this issue on Aug 30, 2021
1. Fixed a bug in the pettingzoo wrapper that did not scale continuous actions from [-1, 1] to [low, high].
2. Fixed bugs in `sac`, `sac_v`, `tac`, and `maxsqn`.
3. Implemented `masac`.
4. Fixed bugs in `squash_action`.
5. Implemented PER in MARL.
6. Added several env configuration files for the pettingzoo platform.
7. Updated the README.
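The scaling described in item 1 amounts to an affine map from the squashed action range to the environment's bounds; a sketch (function name is an assumption, not the wrapper's actual API):

```python
def scale_action(a: float, low: float, high: float) -> float:
    """Map a squashed action a in [-1, 1] to the environment's
    [low, high] range: low maps to -1, high maps to +1."""
    return low + (a + 1.0) * 0.5 * (high - low)
```

Without this step, a tanh-squashed policy would only ever explore the [-1, 1] slice of the true action space.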
StepNeverStop added a commit that referenced this issue on Aug 31, 2021
1. Fixed RNN hidden-state iteration.
2. Renamed `n_time_step` to `chunk_length`.
3. Added `train_interval` to both SARL and MARL off-policy algorithms to control training frequency relative to data collection.
4. Added `n_step_value` to calculate the n-step return.
5. Updated the README.
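The n-step return of item 4 can be sketched as a backward fold over the reward sequence (helper name and signature are hypothetical):

```python
def n_step_return(rewards, gamma, bootstrap):
    """Compute sum_{k=0}^{n-1} gamma^k * r_k + gamma^n * bootstrap
    by folding from the last reward backwards, so each step only
    needs one multiply-add."""
    g = bootstrap
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```

For example, rewards `[1, 1]` with `gamma=0.5` and a bootstrap value of `4` give `1 + 0.5 * (1 + 0.5 * 4) = 2.5`.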
StepNeverStop added a commit that referenced this issue on Sep 3, 2021
1. Renamed `iTensor_oNumpy` to `iton`.
2. Optimized `auto_format.py`.
3. Added general parameter `oplr_params` for initializing optimizers.
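A generic `oplr_params` dict (item 3) presumably flows into the optimizer constructor as keyword arguments; a sketch with a stand-in optimizer class (only the `oplr_params` name comes from the changelog, the rest is assumed):

```python
class DummyOptimizer:
    """Stand-in optimizer used only for illustration; a real setup
    would pass e.g. a torch optimizer class instead."""

    def __init__(self, params, lr=1e-3, eps=1e-8):
        self.params, self.lr, self.eps = params, lr, eps


def build_optimizer(opt_cls, params, oplr_params=None):
    """Forward the generic `oplr_params` dict into the optimizer
    constructor; missing keys fall back to constructor defaults."""
    return opt_cls(params, **(oplr_params or {}))
```

This lets config files override any optimizer hyperparameter without the builder hard-coding each one.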
StepNeverStop added a commit that referenced this issue on Sep 4, 2021
*. Redefined the version to v0.0.1.
1. Removed the `supersuit` package.
2. Implemented class `MPIEnv`.
3. Implemented class `VECEnv`.
4. Optimized env wrappers; implemented a `render` method for `gyms` environments.
5. Reconstructed some of the returns of `env.step` from `obs` to `obs_fa` and `obs_fs`:
   - `obs_fa` is used by the agent/policy to choose an action. At the crossing point of episodes i and i+1, `obs_fa` represents $observation_{i+1}^{0}$; otherwise it is the same as `obs_fs`, which represents $observation_{i}^{t}$.
   - `obs_fs` is what gets stored in the buffer. At the crossing point of episodes i and i+1, `obs_fs` represents $observation_{i}^{T}$; otherwise it is the same as `obs_fa`.
6. Optimized `rssm`-related code based on the `obs_fs` described above.
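The `obs_fa` / `obs_fs` split at an episode boundary can be sketched as follows; the helper name and signature are hypothetical, only the two output names come from the changelog:

```python
def split_obs(obs_t, done, obs_reset):
    """At an episode crossing point (`done` is True), the terminal
    observation (`obs_fs`) is stored in the buffer, while the fresh
    reset observation (`obs_fa`) is what the policy acts on next.
    Away from boundaries, the two coincide."""
    obs_fs = obs_t
    obs_fa = obs_reset if done else obs_t
    return obs_fa, obs_fs
```

Separating the two prevents the policy from acting on a stale terminal observation while still letting the buffer record it.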
StepNeverStop added a commit that referenced this issue on Sep 4, 2021
1. Optimized on-policy algorithms.
2. Renamed `cell_state` to `rnncs`.
3. Renamed `next_cell_state` to `rnncs_`.
4. Fixed bugs when storing the first experience into the replay buffer.
5. Optimized algorithm code formatting.
6. Fixed bugs in `c51` and `qrdqn`.
`discounted_sum`, `calculate_td_error`, `nan`, `vdn` and `qmix`