Release SB3-Contrib v1.7.0 : Bug fixes for PPO LSTM and quality of life improvements · Stable-Baselines-Team/stable-baselines3-contrib

Warning
Shared layers in MLP policy (mlp_extractor) are now deprecated for PPO, A2C and TRPO.
This feature will be removed in SB3 v1.8.0, after which net_arch=[64, 64]
will create separate actor and critic networks with the same architecture, to be consistent with the off-policy algorithms.
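
To make the change in meaning concrete, here is a minimal pure-Python sketch of how a net_arch value could be interpreted before and after shared layers are removed. The helper split_net_arch and its shared_layers_removed flag are hypothetical names used only for illustration; this is not the library's MlpExtractor code, and it assumes the documented list form [<shared layers>, dict(pi=[...], vf=[...])] and the dict form dict(pi=[...], vf=[...]).

```python
def split_net_arch(net_arch, shared_layers_removed):
    """Return (shared, pi, vf) layer sizes for a net_arch spec.

    Hypothetical helper: it only illustrates the semantics described in the
    warning above and does not build any network.
    """
    # A dict always describes two separate networks.
    if isinstance(net_arch, dict):
        return [], list(net_arch.get("pi", [])), list(net_arch.get("vf", []))

    if shared_layers_removed:
        # Behavior announced for SB3 v1.8.0: a flat list of ints is applied
        # to the actor and the critic separately (assumes ints only).
        return [], list(net_arch), list(net_arch)

    # Deprecated behavior: leading ints are shared layers, an optional
    # trailing dict gives the actor-only and critic-only layers.
    shared, pi, vf = [], [], []
    for layer in net_arch:
        if isinstance(layer, int):
            shared.append(layer)
        else:
            pi = list(layer.get("pi", []))
            vf = list(layer.get("vf", []))
            break
    return shared, pi, vf


old = split_net_arch([64, 64], shared_layers_removed=False)
new = split_net_arch([64, 64], shared_layers_removed=True)
print("deprecated:", old)
print("after removal:", new)
```

Under these assumptions, passing dict(pi=[64, 64], vf=[64, 64]) gives the same result in both modes, so it is the safer way to request separate networks across versions.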

Note
TRPO models saved with SB3 < 1.7.0 will show a warning about
missing keys in the state dict when loaded with SB3 >= 1.7.0.
To suppress the warning, simply save the model again.
You can find more info in issue #1233.

Breaking Changes:

Removed deprecated create_eval_env, eval_env, eval_log_path, n_eval_episodes and eval_freq parameters;
please use an EvalCallback instead
Removed deprecated sde_net_arch parameter
Upgraded to Stable-Baselines3 >= 1.7.0
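
As a rough migration sketch for the removed evaluation parameters, the snippet below passes the same settings to an EvalCallback. The helper names eval_callback_kwargs and train_with_eval are hypothetical, as are the CartPole-v1 environment and the numeric values. Imports are deferred into the function so the snippet can be loaded without the libraries installed. It assumes SB3 and SB3-Contrib 1.7.0 with gym.

```python
def eval_callback_kwargs(n_eval_episodes=5, eval_freq=1_000, eval_log_path=None):
    """Map the removed learn() arguments onto EvalCallback keyword arguments.

    Hypothetical helper; note that eval_log_path becomes log_path.
    """
    return {
        "n_eval_episodes": n_eval_episodes,
        "eval_freq": eval_freq,
        "log_path": eval_log_path,
    }


def train_with_eval(total_timesteps=5_000):
    """Train TRPO with periodic evaluation done through a callback."""
    # Deferred imports: only needed when training is actually started.
    import gym
    from stable_baselines3.common.callbacks import EvalCallback
    from stable_baselines3.common.monitor import Monitor
    from sb3_contrib import TRPO

    # The evaluation env is now created by the caller (replaces create_eval_env / eval_env).
    eval_env = Monitor(gym.make("CartPole-v1"))
    callback = EvalCallback(eval_env, **eval_callback_kwargs(eval_log_path="./logs/"))

    model = TRPO("MlpPolicy", "CartPole-v1", verbose=0)
    model.learn(total_timesteps=total_timesteps, callback=callback)
    return model
```

Calling train_with_eval() should then evaluate the agent every eval_freq steps and write the results under ./logs/.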

New Features:

Introduced mypy type checking
Added support for Python 3.10
Added with_bias parameter to ARSPolicy
Added option to have non-shared features extractor between actor and critic in on-policy algorithms (@AlexPasqua)
Features extractors now properly support unnormalized image-like observations (3D tensor)
when passing normalize_images=False
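
The two policy options above would typically be set through policy_kwargs. The sketch below is illustrative only: build_models is a hypothetical helper, Pendulum-v1 is an arbitrary environment, and the keyword share_features_extractor is assumed from SB3's ActorCriticPolicy rather than stated in these notes.

```python
# Keyword arguments forwarded to the policy constructors.
ARS_POLICY_KWARGS = {"with_bias": False}  # new ARSPolicy option
ON_POLICY_KWARGS = {"share_features_extractor": False}  # assumed parameter name


def build_models(env_id="Pendulum-v1"):
    """Create an ARS and a TRPO model using the new policy options."""
    # Deferred import: sb3_contrib is only needed when models are built.
    from sb3_contrib import ARS, TRPO

    ars = ARS("MlpPolicy", env_id, policy_kwargs=ARS_POLICY_KWARGS)
    trpo = TRPO("MlpPolicy", env_id, policy_kwargs=ON_POLICY_KWARGS)
    return ars, trpo
```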

Bug Fixes:

Fixed a bug in RecurrentPPO where the LSTM states were incorrectly reshaped for n_lstm_layers > 1 (thanks @kolbytn)
Fixed RuntimeError: rnn: hx is not contiguous while predicting terminal values for RecurrentPPO when n_lstm_layers > 1
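
To see why a reshape mistake of this kind only shows up with more than one LSTM layer, the pure-Python sketch below compares reshaping with swapping axes on a small table of labelled states. It is a generic illustration, not the actual patch: the layout (one row per sequence, one column per layer) and the helper names reshape_2d and swap_axes are assumptions.

```python
def reshape_2d(rows, n_rows, n_cols):
    """Row-major reshape: flatten, then cut into n_rows rows of n_cols items."""
    flat = [x for row in rows for x in row]
    return [flat[i * n_cols:(i + 1) * n_cols] for i in range(n_rows)]


def swap_axes(rows):
    """Transpose: element [i][j] moves to [j][i]."""
    return [list(col) for col in zip(*rows)]


def make_states(n_seq, n_layers):
    """States stored as [sequence][layer], labelled so mix-ups are visible."""
    return [[f"seq{s}/layer{l}" for l in range(n_layers)] for s in range(n_seq)]


n_seq = 3
for n_layers in (1, 2):
    stored = make_states(n_seq, n_layers)
    # Target layout is [layer][sequence].
    reshaped = reshape_2d(stored, n_layers, n_seq)
    swapped = swap_axes(stored)
    print(n_layers, "layer(s): reshape matches swap_axes ->", reshaped == swapped)
```

With a single layer both operations give the same result, which is why such a bug can stay hidden; with two layers the reshaped version mixes states from different layers into the same row.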

Deprecations:

You should now explicitly pass a features_extractor parameter when calling extract_features()
Deprecated shared layers in MlpExtractor (@AlexPasqua)

Others:

Fixed flake8 config
Fixed sb3_contrib/common/utils.py type hint
Fixed sb3_contrib/common/recurrent/type_aliases.py type hint
Fixed sb3_contrib/ars/policies.py type hint
Exposed modules in __init__.py with __all__ attribute (@ZikangXiong)
Removed ignores on Flake8 F401 (@ZikangXiong)
Upgraded GitHub CI/setup-python to v4 and checkout to v3
Set tensors construction directly on the device
Standardized the use of from gym import spaces
