Commit 40e0b9d

araffin, carlosluis, qgallouedec, tlpss authored
Add Gymnasium support (#1327)
* Fix failing set_env test
* Fix test failing due to deprecation of env.seed
* Adjust mean reward threshold in failing test
* Fix her test failing due to rng
* Change seed and revert reward threshold to 90
* Pin gym version
* Make VecEnv compatible with gym seeding change
* Revert change to VecEnv reset signature
* Change subprocenv seed cmd to call reset instead
* Fix type check
* Add backward compat
* Add `compat_gym_seed` helper
* Add goal env checks in env_checker
* Add docs on HER requirements for envs
* Capture user warning in test with inverted box space
* Update ale-py version
* Fix randint
* Allow noop_max to be zero
* Update changelog
* Update docker image
* Update doc conda env and dockerfile
* Custom envs should not have any warnings
* Fix test for numpy >= 1.21
* Add check for vectorized compute reward
* Bump to gym 0.24
* Fix gym default step docstring
* Test downgrading gym
* Revert "Test downgrading gym" This reverts commit 0072b77.
* Fix protobuf error
* Fix in dependencies
* Fix protobuf dep
* Use newest version of cartpole
* Update gym
* Fix warning
* Loosen required scipy version
* Scipy no longer needed
* Try gym 0.25
* Silence warnings from gym
* Filter warnings during tests
* Update doc
* Update requirements
* Add gym 26 compat in vec env
* Fixes in envs and tests for gym 0.26+
* Enforce gym 0.26 api
* format
* Fix formatting
* Fix dependencies
* Fix syntax
* Cleanup doc and warnings
* Faster tests
* Higher budget for HER perf test (revert prev change)
* Fixes and update doc
* Fix doc build
* Fix breaking change
* Fixes for rendering
* Rename variables in monitor
* update render method for gym 0.26 API backwards compatible (mode argument is allowed) while using the gym 0.26 API (render mode is determined at environment creation)
* update tests and docs to new gym render API
* undo removal of render modes metadata check
* set rgb_array as default render mode for gym.make
* undo changes & raise warning if not 'rgb_array'
* Fix type check
* Remove recursion and fix type checking
* Remove hacks for protobuf and gym 0.24
* Fix type annotations
* reuse existing render_mode attribute
* return tiled images for 'human' render mode
* Allow to use opencv for human render, fix typos
* Add warning when using non-zero start with Discrete (fixes #1197)
* Fix type checking
* Bug fixes and handle more cases
* Throw proper warnings
* Update test
* Fix new metadata name
* Ignore numpy warnings
* Fixes in vec recorder
* Global ignore
* Filter local warning too
* Monkey patch not needed for gym 26
* Add doc of VecEnv vs Gym API
* Add render test
* Fix return type
* Update VecEnv vs Gym API doc
* Fix for custom render mode
* Fix return type
* Fix type checking
* check test env test_buffer
* skip render check
* check env test_dict_env
* test_env test_gae
* check envs in remaining tests
* Update tests
* Add warning for Discrete action space with non-zero (#1295)
* Fix atari annotation
* ignore get_action_meanings [attr-defined]
* Fix mypy issues
* Add patch for gym/gymnasium transition
* Switch to gymnasium
* Rely on signature instead of version
* More patches
* Type ignore because of Farama-Foundation/Gymnasium#39
* Fix doc build
* Fix pytype errors
* Fix atari requirement
* Update env checker due to change in dtype for Discrete
* Fix type hint
* Convert spaces for saved models
* Ignore pytype
* Remove gitlab CI
* Disable pytype for convert space
* Fix undefined info
* Fix undefined info
* Upgrade shimmy
* Fix wrappers type annotation (need PR from Gymnasium)
* Fix gymnasium dependency
* Fix dependency declaration
* Cap pygame version for python 3.7
* Point to master branch (v0.28.0)
* Fix: use main not master branch
* Rename done to terminated
* Fix pygame dependency for python 3.7
* Rename gym to gymnasium
* Update Gymnasium
* Fix test
* Fix tests
* Forks don't have access to private variables
* Fix linter warnings
* Update read the doc env
* Fix env checker for GoalEnv
* Fix import
* Update env checker (more info) and fix dtype
* Use micromamba for Docker
* Update dependencies
* Clarify VecEnv doc
* Fix Gymnasium version
* Copy file only after mamba install
* [ci skip] Update docker doc
* Polish code
* Reformat
* Remove deprecated features
* Ignore warning
* Update doc
* Update examples and changelog
* Fix type annotation bundle (SAC, TD3, A2C, PPO, base class) (#1436)
* Fix SAC type hints, improve DQN ones
* Fix A2C and TD3 type hints
* Fix PPO type hints
* Fix on-policy type hints
* Fix base class type annotation, do not use defaults
* Update version
* Disable mypy for python 3.7
* Rename Gym26StepReturn
* Update continuous critic type annotation
* Fix pytype complain

---------

Co-authored-by: Carlos Luis <[email protected]>
Co-authored-by: Quentin Gallouédec <[email protected]>
Co-authored-by: Thomas Lips <[email protected]>
Co-authored-by: tlips <[email protected]>
Co-authored-by: tlpss <[email protected]>
Co-authored-by: Quentin GALLOUÉDEC <[email protected]>
1 parent 15c9daa commit 40e0b9d
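
The commit message mentions relying on a signature instead of a version number during the gym/gymnasium transition, but it does not show how. One way such a check could look is sketched below, using only the standard library. The helper name `reset_accepts_seed` and both toy classes are hypothetical and are not taken from the commit.

```python
import inspect


def reset_accepts_seed(env) -> bool:
    """Hypothetical helper: return True if env.reset() can take a `seed` keyword."""
    params = inspect.signature(env.reset).parameters
    if "seed" in params:
        return True
    # A **kwargs catch-all would also accept seed=...
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


class OldStyleEnv:
    # Pre-0.26 style: no seed argument, returns only the observation.
    def reset(self):
        return [0.0]


class NewStyleEnv:
    # Gymnasium style: seed/options arguments, returns (observation, info).
    def reset(self, seed=None, options=None):
        return [0.0], {}


print(reset_accepts_seed(OldStyleEnv()), reset_accepts_seed(NewStyleEnv()))
# prints: False True
```

Inspecting the callable avoids parsing version strings, which can be misleading when an environment is wrapped or comes from a third-party package.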

94 files changed: +1333 -733 lines changed

.github/ISSUE_TEMPLATE/custom_env.yml

+5 -4
@@ -49,15 +49,16 @@ body:
         self.observation_space = spaces.Box(low=-np.inf, high=np.inf, shape=(14,))
         self.action_space = spaces.Box(low=-1, high=1, shape=(6,))

-    def reset(self):
-        return self.observation_space.sample()
+    def reset(self, seed=None):
+        return self.observation_space.sample(), {}

     def step(self, action):
         obs = self.observation_space.sample()
         reward = 1.0
-        done = False
+        terminated = False
+        truncated = False
         info = {}
-        return obs, reward, done, info
+        return obs, reward, terminated, truncated, info

 env = CustomEnv()
 check_env(env)
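
The diff above changes `reset` to return an `(observation, info)` pair and `step` to return five values. A minimal sketch of how calling code might consume that shape follows. `CountdownEnv` is a hypothetical stand-in that only mimics the method signatures so the snippet runs without any dependency; a real environment would subclass `gymnasium.Env` and define observation and action spaces.

```python
import random
from typing import Any, Dict, Optional, Tuple


class CountdownEnv:
    """Hypothetical stand-in mimicking the new signatures (not a gymnasium.Env)."""

    def __init__(self, max_steps: int = 5):
        self.max_steps = max_steps
        self._steps = 0
        self._rng = random.Random()

    def reset(
        self, seed: Optional[int] = None, options: Optional[dict] = None
    ) -> Tuple[float, Dict[str, Any]]:
        # Seeding now happens through reset() instead of a separate env.seed().
        if seed is not None:
            self._rng = random.Random(seed)
        self._steps = 0
        return self._rng.random(), {}

    def step(self, action: int):
        self._steps += 1
        obs = self._rng.random()
        terminated = action == 1  # the task itself ended
        truncated = self._steps >= self.max_steps  # cut off by a step limit
        return obs, 1.0, terminated, truncated, {}


env = CountdownEnv(max_steps=5)
obs, info = env.reset(seed=0)
total_reward = 0.0
done = False
while not done:
    obs, reward, terminated, truncated, info = env.step(0)
    total_reward += reward
    # The old single `done` flag corresponds to either of the two new flags.
    done = terminated or truncated
print(total_reward)  # prints: 5.0
```

Keeping `terminated` and `truncated` separate lets a learning algorithm tell a true terminal state from an episode that was merely cut short by a time limit.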

.github/workflows/ci.yml

+2
@@ -55,6 +55,8 @@ jobs:
       - name: Type check
         run: |
           make type
+        # skip mypy type check for python3.7 (result is different to all other versions)
+        if: "!(matrix.python-version == '3.7')"
       - name: Test with pytest
         run: |
           make pytest

Dockerfile

+11 -27
@@ -1,41 +1,25 @@
 ARG PARENT_IMAGE
 FROM $PARENT_IMAGE
 ARG PYTORCH_DEPS=cpuonly
-ARG PYTHON_VERSION=3.7
+ARG PYTHON_VERSION=3.8
+ARG MAMBA_DOCKERFILE_ACTIVATE=1 # (otherwise python will not be found)

-RUN apt-get update && apt-get install -y --no-install-recommends \
-    build-essential \
-    cmake \
-    git \
-    curl \
-    ca-certificates \
-    libjpeg-dev \
-    libpng-dev \
-    libglib2.0-0 && \
-    rm -rf /var/lib/apt/lists/*
+# Install micromamba env and dependencies
+RUN micromamba install -n base -y python=$PYTHON_VERSION \
+    pytorch $PYTORCH_DEPS -c conda-forge -c pytorch -c nvidia && \
+    micromamba clean --all --yes

-# Install Anaconda and dependencies
-RUN curl -o ~/miniconda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
-    chmod +x ~/miniconda.sh && \
-    ~/miniconda.sh -b -p /opt/conda && \
-    rm ~/miniconda.sh && \
-    /opt/conda/bin/conda install -y python=$PYTHON_VERSION numpy pyyaml scipy ipython mkl mkl-include && \
-    /opt/conda/bin/conda install -y pytorch $PYTORCH_DEPS -c pytorch && \
-    /opt/conda/bin/conda clean -ya
-ENV PATH /opt/conda/bin:$PATH
-
-ENV CODE_DIR /root/code
+ENV CODE_DIR /home/$MAMBA_USER

 # Copy setup file only to install dependencies
-COPY ./setup.py ${CODE_DIR}/stable-baselines3/setup.py
-COPY ./stable_baselines3/version.txt ${CODE_DIR}/stable-baselines3/stable_baselines3/version.txt
+COPY --chown=$MAMBA_USER:$MAMBA_USER ./setup.py ${CODE_DIR}/stable-baselines3/setup.py
+COPY --chown=$MAMBA_USER:$MAMBA_USER ./stable_baselines3/version.txt ${CODE_DIR}/stable-baselines3/stable_baselines3/version.txt

-RUN \
-    cd ${CODE_DIR}/stable-baselines3 3&& \
+RUN cd ${CODE_DIR}/stable-baselines3 && \
     pip install -e .[extra,tests,docs] && \
     # Use headless version for docker
     pip uninstall -y opencv-python && \
     pip install opencv-python-headless && \
-    rm -rf $HOME/.cache/pip
+    pip cache purge

 CMD /bin/bash

Makefile

+6
@@ -10,6 +10,12 @@ pytype:
 mypy:
 	mypy ${LINT_PATHS}

+missing-annotations:
+	mypy --disallow-untyped-calls --disallow-untyped-defs --ignore-missing-imports stable_baselines3
+
+# missing docstrings
+# pylint -d R,C,W,E -e C0116 stable_baselines3 -j 4
+
 type: pytype mypy

 lint:

docs/conda_env.yml

+3 -3
@@ -4,11 +4,11 @@ channels:
   - defaults
 dependencies:
   - cpuonly=1.0=0
-  - pip=21.1
+  - pip=22.3.1
   - python=3.7
-  - pytorch=1.11=py3.7_cpu_0
+  - pytorch=1.11.0=py3.7_cpu_0
   - pip:
-    - gym==0.21
+    - gymnasium
     - cloudpickle
     - opencv-python-headless
     - pandas

docs/guide/callbacks.rst

+5 -5
@@ -210,7 +210,7 @@ It will save the best model if ``best_model_save_path`` folder is specified and

 .. code-block:: python

-  import gym
+  import gymnasium as gym

   from stable_baselines3 import SAC
   from stable_baselines3.common.callbacks import EvalCallback
@@ -260,7 +260,7 @@ Alternatively, you can pass directly a list of callbacks to the ``learn()`` meth

 .. code-block:: python

-  import gym
+  import gymnasium as gym

   from stable_baselines3 import SAC
   from stable_baselines3.common.callbacks import CallbackList, CheckpointCallback, EvalCallback
@@ -290,7 +290,7 @@ It must be used with the :ref:`EvalCallback` and use the event triggered by a ne

 .. code-block:: python

-  import gym
+  import gymnasium as gym

   from stable_baselines3 import SAC
   from stable_baselines3.common.callbacks import EvalCallback, StopTrainingOnRewardThreshold
@@ -322,7 +322,7 @@ An :ref:`EventCallback` that will trigger its child callback every ``n_steps`` t

 .. code-block:: python

-  import gym
+  import gymnasium as gym

   from stable_baselines3 import PPO
   from stable_baselines3.common.callbacks import CheckpointCallback, EveryNTimesteps
@@ -379,7 +379,7 @@ It must be used with the :ref:`EvalCallback` and use the event triggered after e

 .. code-block:: python

-  import gym
+  import gymnasium as gym

   from stable_baselines3 import SAC
   from stable_baselines3.common.callbacks import EvalCallback, StopTrainingOnNoModelImprovement

docs/guide/checking_nan.rst

+3 -3
@@ -100,8 +100,8 @@ It will monitor the actions, observations, and rewards, indicating what action o

 .. code-block:: python

-  import gym
-  from gym import spaces
+  import gymnasium as gym
+  from gymnasium import spaces
   import numpy as np

   from stable_baselines3 import PPO
@@ -129,7 +129,7 @@ It will monitor the actions, observations, and rewards, indicating what action o
       def reset(self):
           return [0.0]

-      def render(self, mode="human", close=False):
+      def render(self, close=False):
           pass

   # Create environment
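
The hunk above drops the `mode` argument from `render()`. According to the commit message, the render mode is determined at environment creation under the gym 0.26 API. A minimal sketch of that pattern follows. `RenderModeSketch` is a hypothetical stand-in, not code from this commit, and the returned frame is a placeholder rather than a real image.

```python
from typing import Optional


class RenderModeSketch:
    """Hypothetical stand-in; a real environment would subclass gymnasium.Env."""

    # Supported modes are advertised under the "render_modes" metadata key.
    metadata = {"render_modes": ["human", "rgb_array"]}

    def __init__(self, render_mode: Optional[str] = None):
        if render_mode is not None and render_mode not in self.metadata["render_modes"]:
            raise ValueError(f"Unsupported render mode: {render_mode!r}")
        # The mode is fixed once, when the environment is constructed.
        self.render_mode = render_mode

    def render(self):
        # No `mode` argument: behaviour depends only on self.render_mode.
        if self.render_mode == "rgb_array":
            return [[0, 0, 0]]  # placeholder standing in for an image array
        return None


env = RenderModeSketch(render_mode="rgb_array")
frame = env.render()
```

With a registered environment, the equivalent choice would be made by passing `render_mode` to `gym.make(...)` rather than to `render()`.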

docs/guide/custom_env.rst

+4 -4
@@ -26,9 +26,9 @@ That is to say, your environment must implement the following methods (and inher

 .. code-block:: python

-  import gym
+  import gymnasium as gym
   import numpy as np
-  from gym import spaces
+  from gymnasium import spaces


   class CustomEnv(gym.Env):
@@ -54,7 +54,7 @@ That is to say, your environment must implement the following methods (and inher
           ...
           return observation  # reward, done, info can't be included

-      def render(self, mode="human"):
+      def render(self):
           ...

       def close(self):
@@ -91,7 +91,7 @@ Optionally, you can also register the environment with gym, that will allow you

 .. code-block:: python

-  from gym.envs.registration import register
+  from gymnasium.envs.registration import register
   # Example for the CartPole environment
   register(
       # unique identifier for the env `name-version`

docs/guide/custom_policy.rst

+4 -4
@@ -101,7 +101,7 @@ using ``policy_kwargs`` parameter:

 .. code-block:: python

-  import gym
+  import gymnasium as gym
   import torch as th

   from stable_baselines3 import PPO
@@ -143,7 +143,7 @@ that derives from ``BaseFeaturesExtractor`` and then pass it to the model when t

   import torch as th
   import torch.nn as nn
-  from gym import spaces
+  from gymnasium import spaces

   from stable_baselines3 import PPO
   from stable_baselines3.common.torch_layers import BaseFeaturesExtractor
@@ -208,7 +208,7 @@ downsampling and "vector" with a single linear layer.

 .. code-block:: python

-  import gym
+  import gymnasium as gym
   import torch as th
   from torch import nn

@@ -308,7 +308,7 @@ If your task requires even more granular control over the policy/value architect

   from typing import Callable, Dict, List, Optional, Tuple, Type, Union

-  from gym import spaces
+  from gymnasium import spaces
   import torch as th
   from torch import nn

