[Bug Report] gym normalize wrappers are incompatible with envpool #3021

vwxyzjn · 2022-08-11T15:18:03Z

Describe the bug
gym.wrappers.NormalizeObservation and gym.wrappers.NormalizeReward are incompatible with envpool. See sail-sg/envpool#185

Code example

import numpy as np
import envpool
import gym

envs = envpool.make(
    "HalfCheetah-v4",
    env_type="gym",
    num_envs=4,
)
envs.num_envs = 4
envs.single_action_space = envs.action_space
envs.single_observation_space = envs.observation_space
envs.is_vector_env = True
envs = gym.wrappers.ClipAction(envs)
envs = gym.wrappers.NormalizeObservation(envs)
envs = gym.wrappers.TransformObservation(envs, lambda obs: np.clip(obs, -10, 10))
envs = gym.wrappers.NormalizeReward(envs)
envs = gym.wrappers.TransformReward(envs, lambda reward: np.clip(reward, -10, 10))
obs = envs.reset()
envs.step(np.array([envs.action_space.sample() for _ in range(envs.num_envs)]))

Traceback (most recent call last):
  File "/home/costa/Documents/go/src/github.com/vwxyzjn/envpool-cleanrl/bug.py", line 22, in <module>
    envs.step(np.array([envs.action_space.sample() for _ in range(envs.num_envs)]))
  File "/home/costa/.cache/pypoetry/virtualenvs/envpool-cleanrl-uAHoRI5J-py3.9/lib/python3.9/site-packages/gym/core.py", line 532, in step
    step_returns = self.env.step(action)
  File "/home/costa/.cache/pypoetry/virtualenvs/envpool-cleanrl-uAHoRI5J-py3.9/lib/python3.9/site-packages/gym/wrappers/normalize.py", line 149, in step
    self.env.step(action), True, self.is_vector_env
  File "/home/costa/.cache/pypoetry/virtualenvs/envpool-cleanrl-uAHoRI5J-py3.9/lib/python3.9/site-packages/gym/core.py", line 493, in step
    step_returns = self.env.step(action)
  File "/home/costa/.cache/pypoetry/virtualenvs/envpool-cleanrl-uAHoRI5J-py3.9/lib/python3.9/site-packages/gym/wrappers/normalize.py", line 77, in step
    obs, rews, terminateds, truncateds, infos = step_api_compatibility(
  File "/home/costa/.cache/pypoetry/virtualenvs/envpool-cleanrl-uAHoRI5J-py3.9/lib/python3.9/site-packages/gym/utils/step_api_compatibility.py", line 178, in step_api_compatibility
    return step_to_new_api(step_returns, is_vector_env)
  File "/home/costa/.cache/pypoetry/virtualenvs/envpool-cleanrl-uAHoRI5J-py3.9/lib/python3.9/site-packages/gym/utils/step_api_compatibility.py", line 59, in step_to_new_api
    and not infos["_TimeLimit.truncated"][i]
KeyError: '_TimeLimit.truncated'

System Info
How Gym was installed: pip
OS: Linux
Python version: 3.9

Checklist

I have checked that there is no similar issue in the repo (required)

arjun-kg · 2022-08-12T01:57:05Z

Thank you for the bug report.

In the step API compatibility code, I assumed that whenever a key X exists in infos, its mask key _X also exists, and that assumption is what causes the error. I made it because this is gym's new way of handling vector infos (see here).

But envpool does not seem to use mask keys, which raises a question. In the old step API, the key TimeLimit.truncated being absent from info means something different from it being present and set to False. Since envpool does not have the mask key, how does it differentiate between these two cases?

arjun-kg · 2022-08-12T02:02:35Z

I think I can patch this up regardless, but do we want to unify the way envpool handles vector infos vs how gym handles it? @pseudo-rnd-thoughts

pseudo-rnd-thoughts · 2022-08-12T10:28:23Z

Gym changed its approach to vector infos to support jax-based vectorisation; for brax in particular, the shape of each key needs to be constant. However, once missing entries are filled with default data, it is impossible to tell default data apart from no data.
An example may help, where missing entries are filled from np.zeros(num_envs, dtype=type(data)):

info_1 = {"a": 1, "b": 0, "c": False}
info_2 = {}

vector_info = {"a": [1, 0], "b": [0, 0], "c": [False, False]}

Therefore, we added an underscore version of each key to show whether the key actually exists for each sub-env, so that default data remains usable. I hope that makes sense.
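
To make the mask idea concrete, here is a minimal sketch of how per-env infos could be merged into a vector info with underscore mask keys. The helper name aggregate_infos is hypothetical and this is not gym's actual implementation; it only assumes scalar info values, as in the example above.

```python
import numpy as np


def aggregate_infos(infos):
    """Merge a list of per-env info dicts into one vector info dict.

    Hypothetical helper for illustration only, not gym's implementation.
    """
    num_envs = len(infos)
    vector_info = {}
    for i, info in enumerate(infos):
        for key, value in info.items():
            if key not in vector_info:
                # Default-filled data array, plus a boolean mask that
                # records which sub-envs actually reported this key.
                vector_info[key] = np.zeros(num_envs, dtype=type(value))
                vector_info["_" + key] = np.zeros(num_envs, dtype=bool)
            vector_info[key][i] = value
            vector_info["_" + key][i] = True
    return vector_info


vector_info = aggregate_infos([{"a": 1, "b": 0, "c": False}, {}])
```

With the mask, "b" being 0 for the first sub-env (reported) can be told apart from "b" being 0 for the second sub-env (filled in by default).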

For TimeLimit.truncated, as we know that we only care about the answer when it is True, we can ignore the underscore version.
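
As a rough sketch of that idea (not necessarily what the eventual fix does), old-API dones could be split into terminated and truncated flags while treating the mask key as optional. The function name split_dones is hypothetical; the only assumption is that infos is a dict of arrays, with or without "_TimeLimit.truncated".

```python
import numpy as np


def split_dones(dones, infos):
    """Split old-API `dones` into (terminateds, truncateds) for a vector env.

    Hypothetical helper for illustration; tolerates a missing mask key.
    """
    dones = np.asarray(dones, dtype=bool)
    n = len(dones)
    # A missing "TimeLimit.truncated" key means no sub-env was truncated.
    truncated = np.asarray(
        infos.get("TimeLimit.truncated", np.zeros(n, dtype=bool)), dtype=bool
    )
    # Honour the mask if it is present (gym-style vector infos); if it is
    # absent (envpool-style), treat every entry as valid.
    mask = np.asarray(
        infos.get("_TimeLimit.truncated", np.ones(n, dtype=bool)), dtype=bool
    )
    truncateds = dones & truncated & mask
    terminateds = dones & ~truncateds
    return terminateds, truncateds


# envpool-style infos: the data key exists, the mask key does not.
terminateds, truncateds = split_dones(
    [True, True, False],
    {"TimeLimit.truncated": np.array([True, False, False])},
)
```

Because only True values of TimeLimit.truncated matter, a default-filled False and a genuinely reported False lead to the same result here.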

vwxyzjn mentioned this issue Aug 12, 2022: [BUG] Incompatible with latest gym normalize wrappers sail-sg/envpool#185 (Closed)

arjun-kg mentioned this issue Aug 14, 2022: Fix envpool bug with missing mask key #3026 (Merged)

jkterry1 closed this as completed in #3026 Aug 15, 2022

pseudo-rnd-thoughts mentioned this issue Aug 16, 2022: Add testing for step api compatibility functions and wrapper #3028 (Merged)
