forked from openai/baselines
Prepare tf2 [WIP] #580
Closed
* Adding action scaling to and from tanh co-domain as a generic utility.
* Formatting
* Adding action squashing to tanh co-domain for DDPG, TD3 and SAC whenever sampled at random from action_space.
* Unifying other instances of action scaling within SAC, TD3 and DDPG. Adding a test.
* Adding info on fix to changelog.
* Flipping action scaling/unscaling due to confusion by parameter naming. Adding test checking involved algorithms.
* Changing names of local variables for actions, in order to follow naming of used action scaling methods.
* Adding check on scaling inferred actions as well
* Considering learning_starts parameter of SAC and TD3 when checking action scaling.
* Removing misclick addition
* Adding to changelog
* Adding nick to bugfix.
* Removing asserts enforcing symmetric action space (DDPG, TD3, SAC).
* Changelog: non-symmetric action spaces info.
* Test Action Scaling: remove unnecessary wrapping of environment, make action space asymmetric.
* Adding comments
* Missing line break
* Removing unused import.
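The action scaling described in the commit above can be illustrated with a minimal sketch. The function names `scale_action` and `unscale_action` and the bounds used here are assumptions made for illustration; they may not match the utility actually added in the repository. The idea is a linear map between an action space `[low, high]` (which need not be symmetric) and the tanh co-domain `[-1, 1]`.

```python
def scale_action(action, low, high):
    """Linearly map an action from [low, high] to [-1, 1] (the tanh co-domain)."""
    return 2.0 * (action - low) / (high - low) - 1.0


def unscale_action(scaled_action, low, high):
    """Linearly map a value from [-1, 1] back to [low, high]."""
    return low + 0.5 * (scaled_action + 1.0) * (high - low)


# Hypothetical asymmetric bounds, chosen only for this example.
low, high = -2.0, 6.0

# An action sampled from the environment's action space is squashed to [-1, 1]
# before being stored, and a policy output in [-1, 1] is unscaled before being
# sent to the environment.
squashed = scale_action(3.0, low, high)
restored = unscale_action(squashed, low, high)
```

Both functions use only arithmetic operators, so they should also work element-wise on NumPy arrays of per-dimension bounds.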
* Refactor and clarify doc for load_results (#582)
* Updates from PR feedback
  Added unit test for load_results and get_monitor_files
  Add @jbulow in changelog
* Update tests/test_monitor.py
  Co-Authored-By: Antonin RAFFIN <[email protected]>
* Update tests/test_monitor.py
  Co-Authored-By: Antonin RAFFIN <[email protected]>
* Updates from PR feedback
* Updates from PR feedback
* Updates from PR feedback
  Convert path object to string to pass pytype's type check.
* Adding PPO_CPP project description.
* Changing PPO_CPP project description.
* Changelog addition on new dependent project.
* Update changelog.rst
* Added section on C++ portability of Tensorflow models
…tribution (#588)
* Fix - sample type inconsistency in CategoricalProbabilityDistribution
* Adding info on fix to changelog.
* Fix - sample type inconsistency (change sample type of CategoricalProbabilityDistribution, MultiCategoricalProbabilityDistribution to tf.int64)
* Change dtype of actions to int64 of ACER
* Update changelog.rst
* Update to use new close API
* Update custom env documentation to reflect new gym close API
* Update changelog.rst
* Clarifies what reset returns
* Update changelog.rst
* PEP8 fixes
* Update changelog.rst
* Update notebooks links + start rl tips
* Update draft
* Add general advice
* Add limitations
* Add which algo to use
* Correct typos and change colab link
* Polish RL evaluation
* Minor edits
* Update changelog
* Update docs/guide/rl_tips.rst
  Co-Authored-By: Adam Gleave <[email protected]>
* Update docs/guide/rl_tips.rst
  Co-Authored-By: Adam Gleave <[email protected]>
* Update docs/guide/rl_tips.rst
  Co-Authored-By: Adam Gleave <[email protected]>
* Update docs/guide/rl_tips.rst
  Co-Authored-By: Adam Gleave <[email protected]>
* Update docs/guide/rl_tips.rst
  Co-Authored-By: Adam Gleave <[email protected]>
* Add DeepRL course
* Correct typos
* Add spell check when available
* Update changelog
* Fix space
* Fix HER link
* Add Gym Env checker
* Test common failures
* Declare param as unused
* Update tests/test_envs.py
  Co-Authored-By: Adam Gleave <[email protected]>
* Update docs/guide/rl_tips.rst
  Co-Authored-By: Adam Gleave <[email protected]>
* Update stable_baselines/common/env_checker.py
  Co-Authored-By: Adam Gleave <[email protected]>
* Update docs/guide/rl_tips.rst
  Co-Authored-By: Adam Gleave <[email protected]>
* Split checks
* Split tests
* Update stable_baselines/common/env_checker.py
  Co-Authored-By: Adam Gleave <[email protected]>
* Update stable_baselines/common/env_checker.py
  Co-Authored-By: Adam Gleave <[email protected]>
* Update stable_baselines/common/env_checker.py
  Co-Authored-By: Adam Gleave <[email protected]>
* Update stable_baselines/common/env_checker.py
  Co-Authored-By: Adam Gleave <[email protected]>
* Update stable_baselines/common/env_checker.py
  Co-Authored-By: Adam Gleave <[email protected]>
* Update stable_baselines/common/env_checker.py
  Co-Authored-By: Adam Gleave <[email protected]>
* Update stable_baselines/common/env_checker.py
  Co-Authored-By: Adam Gleave <[email protected]>
* Update stable_baselines/common/env_checker.py
  Co-Authored-By: Adam Gleave <[email protected]>
* Update stable_baselines/common/env_checker.py
  Co-Authored-By: Adam Gleave <[email protected]>
* Update stable_baselines/common/env_checker.py
  Co-Authored-By: Adam Gleave <[email protected]>
* Update stable_baselines/common/env_checker.py
  Co-Authored-By: Adam Gleave <[email protected]>
* Reformat files
* VecNormalize: Add public normalize_{obs..,rew} methods
* Update changelog
* VecNormalize: get_original_{obs,rews}
* VecNormalize: Update rewards in reset()
  Note that after the _update_rews() refactor, self.ret doesn't update anymore if `not self.training`.
* update changelog
* renames
* changelog: fix indent
* changelog: nested list needs blank lines
* Add tests
* Address review, fix tests
* update tests
* More annotations
* Update stable_baselines/common/vec_env/vec_normalize.py
  Co-Authored-By: Adam Gleave <[email protected]>
* Address review comments
* Defensive copy
Description
Prepare the migration to tf2 by deactivating all tests that would fail under tf2.
I also changed the Docker image.
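One way to deactivate tests conditionally on the installed TensorFlow version is sketched below. This is not taken from the PR: the helper `is_tf2`, the decorator name `skip_on_tf2`, and the example test are hypothetical, and the sketch uses the standard library's `unittest.skipIf` so that it stays self-contained.

```python
import unittest


def is_tf2(version):
    """Return True if a version string such as '2.0.0' has major version >= 2."""
    return int(version.split(".")[0]) >= 2


try:
    import tensorflow as tf
    TF_VERSION = tf.__version__
except ImportError:
    # Placeholder so the sketch still runs when TensorFlow is not installed.
    TF_VERSION = "0.0.0"

# Reusable decorator (hypothetical name): marks a test as skipped under tf2.
skip_on_tf2 = unittest.skipIf(is_tf2(TF_VERSION), "not yet ported to tf2")


@skip_on_tf2
def test_legacy_graph_code():
    # Body of a test that relies on TF1-only behaviour would go here.
    assert True
```

With pytest, the same condition could be passed to `pytest.mark.skipif(is_tf2(TF_VERSION), reason="...")` instead, which keeps the skip reason visible in the test report.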
Motivation and Context
Related to #366
Types of changes
Checklist:
pytest and pytype both pass.