Releases · Stable-Baselines-Team/stable-baselines3-contrib

25 Mar 14:08

araffin

v1.5.0

9d7e33d

sb3-contrib v1.5.0: Bug fixes and newer gym version

Breaking Changes:

Switched minimum Gym version to 0.21.0.
Upgraded to Stable-Baselines3 >= 1.5.0

New Features:

Allow PPO to turn off advantage normalization (see PR #61) @vwxyzjn
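
As a rough illustration of what the advantage normalization toggle controls, here is a minimal pure-Python sketch. The function name `maybe_normalize` and the flag name `normalize_advantage` are assumptions made for this example, and the sketch uses the population standard deviation for simplicity; the library's own estimator and signature may differ.

```python
from statistics import fmean, pstdev


def maybe_normalize(advantages, normalize_advantage=True, eps=1e-8):
    """Return the advantages, standardized to zero mean and unit std if requested."""
    # Hypothetical helper: with a single sample the std is zero, so skip normalization.
    if not normalize_advantage or len(advantages) < 2:
        return list(advantages)
    mean = fmean(advantages)
    std = pstdev(advantages)
    # eps guards against division by zero when all advantages are equal.
    return [(a - mean) / (std + eps) for a in advantages]


raw = [1.0, 2.0, 3.0]
normalized = maybe_normalize(raw)
untouched = maybe_normalize(raw, normalize_advantage=False)
```

Turning the normalization off can be useful when mini-batches are very small, since the mean and standard deviation estimated from a handful of samples are noisy.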

Bug Fixes:

Removed explicit calls to forward() method as per PyTorch guidelines
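
To show why calling a module is generally preferred over calling its `forward()` method directly, the toy class below mimics the call-then-hooks pattern in plain Python. `TinyModule` is a hypothetical stand-in written for this note, not the PyTorch implementation.

```python
class TinyModule:
    """Toy stand-in for a framework module (illustrative only)."""

    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        self._forward_hooks.append(hook)

    def forward(self, x):
        return 2 * x

    def __call__(self, x):
        # Calling the module runs forward() and then every registered hook.
        out = self.forward(x)
        for hook in self._forward_hooks:
            hook(self, x, out)
        return out


calls = []
module = TinyModule()
module.register_forward_hook(lambda mod, inp, out: calls.append(out))

module(3)          # goes through __call__, so the hook runs
module.forward(3)  # bypasses __call__, so the hook is silently skipped
```

In this sketch only the first call is recorded by the hook, which is the kind of silent difference the guideline is meant to avoid.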

Contributors

vwxyzjn

19 Jan 12:55

araffin

v1.4.0

a78891b

sb3-contrib v1.4.0: Trust Region Policy Optimization (TRPO) and Augmented Random Search (ARS) algorithms

Breaking Changes:

Dropped Python 3.6 support
Upgraded to Stable-Baselines3 >= 1.4.0
MaskablePPO was updated to match latest SB3 PPO version (timeout handling and new method for the policy object)

New Features:

Added TRPO (@cyprienc)
Added experimental support to train off-policy algorithms with multiple envs (note: HerReplayBuffer currently not supported)
Added Augmented Random Search (ARS) (@sgillen)

Others:

Improve test coverage for MaskablePPO

Contributors

cyprienc and sgillen

23 Oct 15:25

araffin

v1.3.0

b1397bb

sb3-contrib v1.3.0: PPO with invalid action masking

WARNING: This version will be the last one supporting Python 3.6 (end of life in Dec 2021).
We highly recommend that you upgrade to Python >= 3.7.

Breaking Changes:

Removed sde_net_arch
Upgraded to Stable-Baselines3 >= 1.3.0

New Features:

Added MaskablePPO algorithm (@kronion)
MaskablePPO Dictionary Observation support (@glmcdona)
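
The idea behind invalid action masking can be sketched in a few lines: logits of invalid actions are replaced by negative infinity before the softmax, so those actions receive zero probability. The function `masked_softmax` below is a hypothetical helper written for illustration and is not taken from the MaskablePPO code.

```python
import math


def masked_softmax(logits, action_mask):
    """Softmax over valid actions only; masked actions get probability 0."""
    if not any(action_mask):
        raise ValueError("at least one action must be valid")
    # Invalid actions are pushed to -inf so that exp() maps them to exactly 0.
    masked = [l if ok else -math.inf for l, ok in zip(logits, action_mask)]
    top = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(m - top) for m in masked]
    total = sum(exps)
    return [e / total for e in exps]


probs = masked_softmax([1.0, 2.0, 3.0], [True, False, True])
```

Because masked actions have zero probability, they are never sampled and do not contribute to the policy gradient.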

Contributors

kronion and glmcdona

08 Sep 10:57

araffin

v1.2.0

b2e7126

sb3-contrib v1.2.0: Train/Eval mode support

Breaking Changes:

Upgraded to Stable-Baselines3 >= 1.2.0

Bug Fixes:

QR-DQN and TQC updated so that their policies are switched between train and eval mode at the correct time (@ayeright)
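
Why the timing of the train/eval switch matters can be seen with a toy dropout layer: in training mode it randomly zeroes inputs, while in evaluation mode it passes them through unchanged. `ToyDropout` is a hypothetical pure-Python class for illustration only; the real policies rely on the framework's own layers.

```python
import random


class ToyDropout:
    """Toy layer whose behaviour depends on the train/eval flag (illustrative only)."""

    def __init__(self, p=0.5, seed=0):
        self.p = p
        self.training = True
        self._rng = random.Random(seed)

    def train(self, mode=True):
        self.training = mode
        return self

    def eval(self):
        return self.train(False)

    def __call__(self, values):
        if not self.training:
            return list(values)  # evaluation: deterministic pass-through
        keep = 1.0 - self.p
        # Training: drop each value with probability p, rescale the survivors.
        return [v / keep if self._rng.random() < keep else 0.0 for v in values]


layer = ToyDropout()
eval_out = layer.eval()([1.0, 2.0])
train_out = layer.train()([1.0] * 100)
```

If data collection or target computation ran while such a layer was still in training mode, the outputs would be noisy in a way that is easy to overlook.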

Others:

Fixed type annotation
Added Python 3.9 to CI

Contributors

ayeright

02 Jul 10:07

araffin

v1.1.0

ae39e00

SB3 v1.1.0: dictionary observation support and timeout handling

Breaking Changes

Added support for Dictionary observation spaces (cf. SB3 doc)
Upgraded to Stable-Baselines3 >= 1.1.0
Added proper handling of timeouts for off-policy algorithms (cf. SB3 doc)
Updated usage of logger (cf. SB3 doc)
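
The timeout handling mentioned above can be sketched as follows: when an episode ends only because a time limit was reached, the one-step target should still bootstrap from the next state's value. The function `td_target` and its parameter names are assumptions made for this example; how the timeout flag is exposed by the environment is described in the SB3 doc.

```python
def td_target(reward, next_value, done, timed_out, gamma=0.99):
    """One-step TD target that keeps bootstrapping when the episode merely timed out."""
    # A true terminal state has no future return; a timeout only cuts the rollout short.
    bootstrap = (not done) or timed_out
    return reward + (gamma * next_value if bootstrap else 0.0)


terminal = td_target(1.0, 10.0, done=True, timed_out=False)
truncated = td_target(1.0, 10.0, done=True, timed_out=True)
```

Treating a timeout as a true termination would teach the value function that states near the time limit are worth less than they really are.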

Bug Fixes

Removed unused code in TQC

Others

SB3 docs and tests dependencies are no longer required for installing SB3 contrib

Documentation

Fixed checkmark typo in QR-DQN docs (@minhlong94)

17 Mar 14:30

araffin

v1.0

81ef23d

Stable-Baselines3 v1.0

Blog post: https://araffin.github.io/post/sb3/

Breaking Changes

Upgraded to Stable-Baselines3 v1.0

Bug Fixes

Fixed a bug with QR-DQN predict method when using deterministic=False with image space

06 Mar 13:56

araffin

v1.0rc1

9824dac

v1.0rc1: Bug fix for QR-DQN (#21)

* Bug fix for QR-DQN

* Upgrade SB3

27 Feb 19:35

araffin

v0.11.1

7c2eb83

QR-DQN, SB3 upgrade and time feature wrapper

Pre-release

Breaking Changes:

Upgraded to Stable-Baselines3 >= 0.11.1

New Features:

Added TimeFeatureWrapper to the wrappers
Added QR-DQN algorithm (@ku2482)
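
A time feature wrapper typically appends a "remaining time" value to each observation so the agent can tell how close the episode is to its time limit. The sketch below assumes a flat observation and a feature that decreases linearly from 1 to 0; `add_time_feature` is a hypothetical helper, not the wrapper's actual code.

```python
def add_time_feature(observation, step, max_steps):
    """Append a remaining-time feature in [0, 1] to a flat observation."""
    # Assumption: the feature starts at 1.0 and reaches 0.0 at the time limit.
    remaining = 1.0 - min(step, max_steps) / max_steps
    return list(observation) + [remaining]


start = add_time_feature([0.5], step=0, max_steps=100)
halfway = add_time_feature([0.5], step=50, max_steps=100)
end = add_time_feature([0.5], step=100, max_steps=100)
```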

Bug Fixes:

Fixed bug in TQC when saving/loading the policy only with non-default number of quantiles
Fixed bug in QR-DQN when calculating the target quantiles (@ku2482, @guyk1971)

Others:

Updated TQC to match new SB3 version
Moved quantile_huber_loss to common/utils.py (@ku2482)
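
For readers unfamiliar with the quantile Huber loss, here is a minimal scalar sketch of the usual formulation for a single TD error and a single quantile level. The function `quantile_huber` is written for illustration only and is not the library's batched `quantile_huber_loss`; the threshold name `kappa` is an assumption.

```python
def quantile_huber(td_error, tau, kappa=1.0):
    """Quantile Huber loss for one TD error u = target - prediction at level tau."""
    abs_u = abs(td_error)
    # Huber part: quadratic near zero, linear beyond the threshold kappa.
    if abs_u <= kappa:
        huber = 0.5 * td_error ** 2
    else:
        huber = kappa * (abs_u - 0.5 * kappa)
    # Asymmetric weight: under- and over-estimation are penalized differently.
    indicator = 1.0 if td_error < 0 else 0.0
    return abs(tau - indicator) * huber / kappa


under = quantile_huber(1.0, tau=0.9)   # prediction below target
over = quantile_huber(-1.0, tau=0.9)   # prediction above target
```

For a high quantile level such as 0.9, underestimating the target costs more than overestimating it, which is what pushes each output toward its assigned quantile.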
