Update to newest Gym version #572

Closed · wants to merge 21 commits
Conversation

jkterry1
Contributor

Description

This should fix CI for Gym 0.20.0 with respect to the Atari ROMs.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)

Checklist:

  • I've read the CONTRIBUTION guide (required)
  • I have updated the changelog accordingly (required).
  • My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • I have updated the documentation accordingly.
  • I have reformatted the code using make format (required)
  • I have checked the codestyle using make check-codestyle and make lint (required)
  • I have ensured make pytest and make type both pass. (required)
  • I have checked that the documentation builds using make doc (required)

Note: You can run most of the checks using make commit-checks.

Note: we are using a maximum length of 127 characters per line

@Miffyli
Collaborator

Miffyli commented Sep 16, 2021

Looks like something is not working right. Possibly something from Gym 0.20?

    def _check_obs(obs: Union[tuple, dict, np.ndarray, int], observation_space: spaces.Space, method_name: str) -> None:
        """
        Check that the observation returned by the environment
        correspond to the declared one.
        """
        if not isinstance(observation_space, spaces.Tuple):
            assert not isinstance(
                obs, tuple
            ), f"The observation returned by the `{method_name}()` method should be a single value, not a tuple"
    
        # The check for a GoalEnv is done by the base class
        if isinstance(observation_space, spaces.Discrete):
            assert isinstance(obs, int), f"The observation returned by `{method_name}()` method must be an int"
        elif _is_numpy_array_space(observation_space):
            assert isinstance(obs, np.ndarray), f"The observation returned by `{method_name}()` method must be a numpy array"
    
>       assert observation_space.contains(
            obs
        ), f"The observation returned by the `{method_name}()` method does not match the given observation space"
E       AssertionError: The observation returned by the `reset()` method does not match the given observation space

@araffin
Member

araffin commented Sep 16, 2021

Looks like something is not working right. Possibly something from Gym 0.20?

Yep, it probably comes from a wrong dtype; I fixed several issues yesterday: cyprienc@e7a48d2
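
For reference, here is a minimal sketch (not taken from this PR, and assuming a gym version whose Box.contains() also checks the dtype) of how a wrong dtype can trigger the check_env assertion shown above:

    import numpy as np
    from gym import spaces

    # Hypothetical example: the declared space is float32, but reset() returns float64.
    observation_space = spaces.Box(low=-1.0, high=1.0, shape=(3,), dtype=np.float32)

    obs_wrong_dtype = np.zeros(3, dtype=np.float64)       # what a buggy env might return
    obs_right_dtype = obs_wrong_dtype.astype(np.float32)  # the fix: cast to the declared dtype

    # Depending on the gym version, contains() rejects observations whose dtype
    # cannot be safely cast to the space's dtype, which is what check_env reports.
    print(observation_space.contains(obs_wrong_dtype))  # may be False
    print(observation_space.contains(obs_right_dtype))  # True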

araffin mentioned this pull request on Sep 16, 2021
@araffin
Member

araffin commented Sep 17, 2021

We also need to fix that issue: #573 (comment)
Either upstream or we will have to monkey patch gym in SB3... (we did that in the past, I'm hoping we won't have to do it again)

jkterry1 changed the title from "Fix Atari in CI for newest Gym version" to "Update to newest Gym version" on Oct 21, 2021
@jkterry1
Contributor Author

So to fix the error with Breakout, the new command equivalent to the NoFrameskip variant is gym.make('ALE/Breakout-v5', frameskip=1). I'm not sure how you want to handle that in make_atari_env?

@araffin
Member

araffin commented Oct 24, 2021

the new command equivalent to the NoFrameskip variant is gym.make('ALE/Breakout-v5', frameskip=1).

wait, what is the new default behavior? and why? and frameskip=1 sounds wrong... (I would expect frameskip=0)

@JesseFarebro

@araffin there are no longer any environment suffixes like NoFrameskip or Deterministic. You'll now have to supply all additional options when constructing the environment.

Regarding the frameskip setting, it's always been a misnomer. It's really "how many frames is the action applied for" NOT "how many frames are skipped after the action was executed". A frameskip of 1 means the action is applied for one frame. See: https://github.com/mgbellemare/Arcade-Learning-Environment/blob/79ffb7d5c8d404bdde951d45bef263b1a1f84125/src/gym/envs/atari/environment.py#L222-L223 and https://github.com/mgbellemare/Arcade-Learning-Environment/blob/79ffb7d5c8d404bdde951d45bef263b1a1f84125/src/environment/stella_environment.cpp#L157

Hope that helps clear things up.
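
For concreteness, a small sketch of the old versus new way of building the environment (this assumes ale-py is installed and the ROMs are available, e.g. via AutoROM; the id and kwarg are the ones mentioned above):

    import gym

    # Old style: the settings were encoded in the environment id suffix.
    # env = gym.make("BreakoutNoFrameskip-v4")

    # New style: options are passed explicitly. frameskip=1 means each action
    # is applied for a single frame, i.e. no frame skipping at all.
    env = gym.make("ALE/Breakout-v5", frameskip=1)
    obs = env.reset()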

@araffin
Member

araffin commented Oct 25, 2021

Thanks for the answer =)

It's really "how many frames is the action applied for"

I see, yes, the name is a bit confusing; we actually do the same in SB3's max and skip wrapper. I've also seen the name "action repeat" used instead, which better illustrates what it does.
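
For illustration, a minimal sketch of such an action-repeat ("max and skip") wrapper, written against the old 4-tuple step API; this is just the idea, not SB3's exact implementation:

    import gym
    import numpy as np

    class ActionRepeatWrapper(gym.Wrapper):
        """Repeat an action for `skip` frames and max-pool the last two frames
        (the max removes Atari sprite flickering). Illustrative sketch only."""

        def __init__(self, env: gym.Env, skip: int = 4):
            super().__init__(env)
            self._skip = skip

        def step(self, action):
            total_reward, done, info = 0.0, False, {}
            frames = []
            for _ in range(self._skip):
                obs, reward, done, info = self.env.step(action)
                frames.append(obs)
                total_reward += reward
                if done:
                    break
            # Pixel-wise max over the last (up to) two frames
            max_frame = np.max(np.stack(frames[-2:]), axis=0)
            return max_frame, total_reward, done, info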

You'll now have to supply all additional options when constructing the environment.

but what is the default option then? and what was the default option before for the v4 envs? (with NoFrameskip name)

araffin mentioned this pull request on Nov 24, 2021
araffin mentioned this pull request on Dec 6, 2021
@araffin
Member

araffin commented Dec 14, 2021

but what is the default option then? and what was the default option before for the v4 envs? (with NoFrameskip name)

@JesseFarebro looking at https://brosa.ca/blog/ale-release-v0.7, I assume that frameskip=5, the action set is "full", and the sticky-action probability is 0.25.
But looking at the paper by Machado et al., it seems that the value frameskip=5 and the choice of the full action set are pretty arbitrary; do you know how they impact performance/reported results?
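
To make those assumed defaults explicit, here is a sketch using the kwargs exposed by ale-py's gym integration (the values are the assumptions above, not a verified spec):

    import gym

    # Constructing a v5 env with the assumed default settings spelled out.
    env = gym.make(
        "ALE/Breakout-v5",
        frameskip=5,                     # assumed default discussed above
        repeat_action_probability=0.25,  # "sticky actions"
        full_action_space=True,          # full 18-action set
    )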

@araffin
Member

araffin commented Dec 14, 2021

Side note: if we want to upgrade to gym 0.21, not only the code, tests and documentation must be updated, but also all our notebook tutorials...

EDIT: I forgot the RL Zoo too: https://github.com/DLR-RM/rl-baselines3-zoo

@JesseFarebro

@araffin there was some miscommunication on my part (and some false assumptions) which led to a frameskip of 5 being the default for v5 environments. Specifically, a frameskip of 5 was used in the Revisiting the Arcade Learning Environment paper due to their comparison with linear function approximation (LFA) with hand-crafted features. In the pre-DQN literature, a value of 5 was typically used with LFA. A frameskip of 4 has been the standard value for work post-DQN, and I will be updating the v5 default to 4. The deployment of v5 environments isn't ubiquitous, so I feel making this silent change will have minimal impact and won't require bumping to v6.

Sorry for the confusion, I hope this clears things up.

@araffin
Member

araffin commented Dec 14, 2021

thanks for the clarification, is it the same for the action set?

@JesseFarebro

@araffin sorry for the delay, I was still trying to sort this all out. It makes sense to use the minimal action set to stay as close to the post-DQN methodology as possible.

@araffin
Member

araffin commented Dec 21, 2021

@araffin sorry for the delay, I was still trying to sort this all out. It makes sense to use the minimal action set to stay as close to the post-DQN methodology as possible.

Ok, then at that point it would maybe make sense to either keep v4 (and not drop them in the future, as was intended) or provide a version with fixes (if I recall, v5 fixed some determinism issues, right? and made the max number of steps consistent?) that would allow replicating published results?

@JesseFarebro

JesseFarebro commented Dec 21, 2021

@araffin There are still differences which we'd like to fix, so the bump to v5 is warranted. Right now things will look like this (after v0.7.4 of the ALE):

| Version | Frame Skip | Sticky Action Prob. | Action Set | Max Frames |
|---------|------------|---------------------|------------|------------|
| v0      | [2, 4]     | 0.25                | Minimal    | ≤ 40,000   |
| v4      | [2, 4]     | 0.0                 | Minimal    | ≤ 400,000  |
| v5      | 4          | 0.25                | Minimal    | = 108,000  |

The v5 settings should have been the standardized methodology since the Revisiting paper. I don't have any intentions of dropping v0 or v4 for the foreseeable future, until there's a better way to ensure this won't break existing projects. As of right now, the next Gym release (i.e., 0.22) will actively complain if users aren't using the latest version of the environment. Hopefully, this will help with consensus.
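
Based on the table above, a sketch of how one might approximate the old v4/NoFrameskip behaviour with a v5 id by disabling sticky actions and frame skipping explicitly (an assumption drawn from the table, not an official mapping):

    import gym

    env = gym.make(
        "ALE/Breakout-v5",
        frameskip=1,                    # apply each action for one frame only
        repeat_action_probability=0.0,  # no sticky actions, as in v4
    )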

modanesh mentioned this pull request on Dec 28, 2021
jkterry1 closed this on Dec 30, 2021
@cash

cash commented Jan 13, 2022

I see that this PR has been closed. Is there any plan to support gym 0.20 or greater out of the box? (I saw the comment on #674 to install without dependencies to avoid the gym version conflict).

@araffin
Member

araffin commented Jan 13, 2022

I see that this PR has been closed. Is there any plan to support gym 0.20 or greater out of the box? (I saw the comment on #674 to install without dependencies to avoid the gym version conflict).

This PR has been closed in favor of #705; help is also welcome to update the documentation, see #705 (comment).

We may also have to wait for openai/gym#2531 to be fixed.

@cash

cash commented Jan 13, 2022

Thanks. Will take a look at #705. Looks like openai/gym#2531 has been fixed so maybe wait for a 0.22 release of gym?

@jkterry1
Contributor Author

@araffin openai/gym#2531 doesn't impact any current release and we aren't releasing the next version until it's fixed.
