Support only new step API (while retaining compatibility functions) #3019

Merged
merged 17 commits into from
Aug 30, 2022

Conversation

arjun-kg
Contributor

@arjun-kg arjun-kg commented Aug 5, 2022

Description

This removes backward compatibility for the done (old) step API (see #2752 for details on the terminated/truncated (new) vs. done (old) API).
Once merged, gym will officially support only the terminated/truncated (new) step API. For convenience, the compatibility functions and a compatibility wrapper at make are retained for code that still uses the done (old) API.

The changes made in this PR include:

  • Removing new_step_api arguments from vector and wrapper classes, and removing all compatibility code.
  • Switching new_step_api default to True for compatibility functions, and StepAPICompatibility wrapper
  • Updating all existing tests to support new step API
  • Updating play.py
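
For readers unfamiliar with the terminated/truncated (new) step API, the following is a minimal sketch of how a caller consumes it. `CountdownEnv` and `run_episode` are hypothetical stand-ins written for illustration, not gym classes or functions.

```python
class CountdownEnv:
    """Hypothetical stand-in for an environment using the terminated/truncated step API."""

    def __init__(self, start=3, max_steps=10):
        self.start = start
        self.max_steps = max_steps

    def reset(self):
        self.value = self.start
        self.steps = 0
        return self.value

    def step(self, action):
        self.value -= action
        self.steps += 1
        # terminated: the task itself reached an end state.
        terminated = self.value <= 0
        # truncated: the episode was cut off by the step limit instead.
        truncated = (not terminated) and self.steps >= self.max_steps
        reward = 1.0 if terminated else 0.0
        return self.value, reward, terminated, truncated, {}


def run_episode(env, action=1):
    """Step until the episode ends and report why it ended."""
    env.reset()
    total = 0.0
    while True:
        obs, reward, terminated, truncated, info = env.step(action)
        total += reward
        if terminated or truncated:
            return total, terminated, truncated


print(run_episode(CountdownEnv(start=3, max_steps=10)))   # ends by termination
print(run_episode(CountdownEnv(start=100, max_steps=5)))  # ends by truncation
```

Under the done (old) API both episodes above would simply end with done=True, so the caller could not tell the two cases apart without inspecting info.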

Type of change

Please delete options that are not relevant.

  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

  • I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@arjun-kg arjun-kg changed the title from "Support only new step API (with" to "Support only new step API (while retaining compatibility functions)" Aug 5, 2022

@pseudo-rnd-thoughts pseudo-rnd-thoughts left a comment

Overall, looks good. I only have one suggestion on the same change. I realise that this is not backward compatible but for the future, I think that it is better.
If there is a problem with v25 compatibility, then there may be enough changes to justify a v25.1 with this parameter change.

gym/envs/registration.py (outdated, resolved)
gym/vector/__init__.py (resolved)
@arjun-kg arjun-kg marked this pull request as ready for review August 11, 2022 02:26

pseudo-rnd-thoughts commented Aug 16, 2022

In #3028, I have removed the language changes so that v0.25.3 will be backward compatible.
Would you therefore be able to make the language changes in this PR?
These are my proposed changes, but you can change them if there is a better version:

  • OldStepType -> DoneStepType
  • NewStepType -> TerminationTruncationStepType
  • step_to_new_api() -> convert_to_terminated_truncated_step_api()
  • step_to_old_api() -> convert_to_done_step_api()
  • new_step_api -> convert_to_termination_truncation
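
To make the two step types behind these names concrete, here is a minimal sketch of converting between them for a single (non-vector) environment. The function names `to_terminated_truncated` and `to_done` are hypothetical, and using the "TimeLimit.truncated" info key to carry the truncation flag is an assumption of this sketch rather than a statement of what the PR's functions do.

```python
def to_terminated_truncated(step_returns):
    """Convert (obs, reward, done, info) to (obs, reward, terminated, truncated, info)."""
    if len(step_returns) == 5:
        return step_returns  # already in terminated/truncated form
    obs, reward, done, info = step_returns
    info = dict(info)  # copy so the caller's dict is not mutated
    # Assumption: a time-limit cutoff is signalled through this info key.
    truncated = bool(info.pop("TimeLimit.truncated", False))
    terminated = bool(done) and not truncated
    return obs, reward, terminated, truncated, info


def to_done(step_returns):
    """Convert (obs, reward, terminated, truncated, info) to (obs, reward, done, info)."""
    if len(step_returns) == 4:
        return step_returns  # already in done form
    obs, reward, terminated, truncated, info = step_returns
    info = dict(info)
    if truncated:
        # Keep the truncation information that a single done flag would lose.
        info["TimeLimit.truncated"] = not terminated
    return obs, reward, bool(terminated or truncated), info


print(to_terminated_truncated((0, 1.0, True, {"TimeLimit.truncated": True})))
print(to_done((0, 1.0, False, True, {})))
```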

@pseudo-rnd-thoughts pseudo-rnd-thoughts left a comment

Looks good. Searching for "done" (Ctrl+F), I found the following things that need changing:

  1. The README.md needs updating
  2. passive environment checker.py line 229 should be updated to "rewrite the environment with (new) terminated / truncated step API."
  3. play.py docstrings for play() and PlayPlot
  4. vector_env.py docstring on line 141
  5. docstring in autoreset.py lines 10, 13 and 20
  6. In normalize.py, replace
      if not self.is_vector_env:
          dones = terminateds or truncateds
      else:
          dones = np.bitwise_or(terminateds, truncateds)
     with dones = np.logical_or(terminateds, truncateds), which works with both bool and np.ndarray
  7. In wrappers step api compatibility.py, update "Old step API refers to" to "(Old) Done step API refers to", and likewise "New step API" to "(New) Terminated / Truncated step API"

tests/envs/test_envs.py (outdated, resolved)
gym/core.py (resolved)
gym/utils/play.py (resolved)
@pseudo-rnd-thoughts

@RedTachyon This looks good to me. I did a document search for uses of "done" and "new_step_api", and all are correct now.

@RedTachyon RedTachyon left a comment

Left a bunch of comments

"""Run one timestep of the environment's dynamics.

When end of episode is reached, you are responsible for calling :meth:`reset` to reset this environment's state.
- Accepts an action and returns either a tuple `(observation, reward, terminated, truncated, info)`, or a tuple
- (observation, reward, done, info). The latter is deprecated and will be removed in future versions.
+ Accepts an action and returns either a tuple `(observation, reward, terminated, truncated, info)`.

Remove "either"

@@ -557,7 +557,7 @@ def make(
id: Name of the environment. Optionally, a module to import can be included, eg. 'module:Env-v0'
max_episode_steps: Maximum length of an episode (TimeLimit wrapper).
autoreset: Whether to automatically reset the environment after each episode (AutoResetWrapper).
- new_step_api: Whether to use old or new step API (StepAPICompatibility wrapper). Will be removed at v1.0
+ apply_step_compatibility: Whether to use apply compatibility wrapper that converts step method to return two bools (StepAPICompatibility wrapper)

Aren't we removing this?

Or I guess it might be useful for automatically supporting legacy environments?

Yes, I think we should keep a parameter in make to easily apply the compatibility wrapper


# Add human rendering wrapper
if apply_human_rendering:
    env = HumanRendering(env)

# Add step API wrapper

Does it work if the compatibility wrapper is at the end? As far as I understand, the use case here is if someone has a legacy environment, then it would convert it to a new-style environment. But wouldn't one of the wrappers before this crash out if the compatibility is not handled in advance?

(checked now, it doesn't work, at least assuming my understanding is correct)

Yes, I think you are correct. In the wrapper order, this should be applied after the environment checker.
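
The ordering concern in this thread can be illustrated with a small sketch that does not use gym at all. `LegacyEnv`, `StepCompat` and `DoubleReward` are hypothetical classes invented for this example; the point is only that a wrapper expecting five step items fails if the conversion happens outside it rather than inside it.

```python
class LegacyEnv:
    """Hypothetical legacy environment whose step returns (obs, reward, done, info)."""

    def step(self, action):
        return 0, 1.0, True, {}


class StepCompat:
    """Hypothetical converter: turns a 4-item step result into a 5-item one."""

    def __init__(self, env):
        self.env = env

    def step(self, action):
        result = self.env.step(action)
        if len(result) == 5:
            return result
        obs, reward, done, info = result
        truncated = bool(info.get("TimeLimit.truncated", False))
        return obs, reward, done and not truncated, truncated, info


class DoubleReward:
    """Hypothetical new-style wrapper: assumes the wrapped step returns five items."""

    def __init__(self, env):
        self.env = env

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        return obs, 2 * reward, terminated, truncated, info


# Converter applied first (innermost): every later wrapper sees five items.
good = DoubleReward(StepCompat(LegacyEnv()))
print(good.step(0))  # (0, 2.0, True, False, {})

# Converter applied last (outermost): the inner wrapper unpacks four items and fails.
bad = StepCompat(DoubleReward(LegacyEnv()))
try:
    bad.step(0)
except ValueError as exc:
    print("unpacking failed:", exc)
```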

@@ -219,11 +220,6 @@ def play(
deprecation(
"`play.py` currently supports only the old step API which returns one boolean, however this will soon be updated to support only the new step api that returns two bools."
)
if env.render_mode not in {"rgb_array", "single_rgb_array"}:

Why was this removed? Seems irrelevant

- step_returns: Union[NewStepType, OldStepType],
- new_step_api: bool = False,
+ step_returns: Union[TerminatedTruncatedStepType, DoneStepType],
+ output_truncation_bool: bool = True,

I don't like this argument name, it's pretty unclear. I think there's a different name used earlier?

to_termination_truncation_api? I agree it is not a great name, but I am not sure of a better one.

@@ -11,27 +10,27 @@ class AutoResetWrapper(gym.Wrapper):
with new step API and ``(new_obs, final_reward, final_done, info)`` with the old step API.

I think the mention of the old style API should be removed?
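
For context on the docstring under discussion, the following is a rough sketch of auto-reset behaviour under the terminated/truncated step API only. `TwoStepEnv` and `AutoReset` are hypothetical, and storing the last observation under a "final_observation" info key is an assumption of this sketch, not a description of the wrapper in this PR.

```python
class TwoStepEnv:
    """Hypothetical environment that terminates on its second step."""

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 2, False, {}


class AutoReset:
    """Hypothetical wrapper that resets the environment as soon as an episode ends."""

    def __init__(self, env):
        self.env = env

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        if terminated or truncated:
            # Keep the last observation of the finished episode in info (assumed key),
            # then return the first observation of the next episode instead.
            info = dict(info, final_observation=obs)
            obs = self.env.reset()
        return obs, reward, terminated, truncated, info


env = AutoReset(TwoStepEnv())
env.reset()
first = env.step(0)
second = env.step(0)
print(first)
print(second)
```

The reward and the two flags returned on the final step still describe the finished episode, while the observation already belongs to the new one.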

-     dones = terminateds or truncateds
- else:
-     dones = np.bitwise_or(terminateds, truncateds)
+ dones = np.logical_or(terminateds, truncateds)

Why this change? It used to be either "or" or np.bitwise_or; now it's np.logical_or.

It simplifies the code. I checked, it works for both True and np.array([True, False])
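
A short check of that claim, assuming NumPy is imported as np. The variable names are chosen for this example only.

```python
import numpy as np

# Single environment: plain Python bools go in, one boolean value comes out.
single = np.logical_or(True, False)

# Vector environment: one flag per sub-environment, combined element-wise.
terminateds = np.array([True, False, False])
truncateds = np.array([False, False, True])
batched = np.logical_or(terminateds, truncateds)

print(bool(single))
print(batched.tolist())

# Python's `or` needs a single truth value, so it fails on multi-element arrays.
try:
    terminateds or truncateds
except ValueError as exc:
    print("`or` failed:", exc)
```

This is why the original code needed the is_vector_env branch: `or` only handles the scalar case, whereas np.logical_or covers both.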

  Args:
      env (gym.Env): the env to wrap. Can be in old or new API
-     new_step_api (bool): True to use env with new step API, False to use env with old step API. (False by default)
+     apply_step_compatibility (bool): Apply to convert environment to use new step API that returns two bools. (False by default)

Rephrase, the "Apply" sounds weird. Probably would be best as something like "Whether or not to [...]"

Also now I think it defaults to True

I think it should default to False. We shouldn't apply the compatibility wrapper by default

)

  def step(self, action):
-     """Steps through the environment, returning 5 or 4 items depending on `new_step_api`.
+     """Steps through the environment, returning 5 or 4 items depending on `apply_step_compatibility`.

What I said in an earlier comment, sometimes it's output_truncation_bool, sometimes it's apply_step_compatibility, and it seems to have been mixed up here?

output_truncation_bool is used in the step api compatibility wrapper while apply_step_compatibility is in gym.make

Thinking about it, it should be output_truncation_bool here



@pytest.mark.parametrize("VecEnv", [AsyncVectorEnv, SyncVectorEnv])
def test_vector_step_compatibility_new_env(VecEnv):

Do we not want to test this anymore?

@pseudo-rnd-thoughts pseudo-rnd-thoughts Aug 29, 2022

This was purely for vector environments. These tests should be covered in the step compatibility function testing

@pseudo-rnd-thoughts

As @arjun-kg is travelling, he is not able to make these changes. Therefore, we will merge this PR and make the relevant changes in a follow-up PR.

@jkterry1 jkterry1 merged commit 54b406b into openai:master Aug 30, 2022

wookayin commented Sep 4, 2022

Having StepAPICompatibility sounds good for backward compatibility purposes despite the new default, but what about env.reset()? I would suggest having a unified wrapper class for handling both step and reset API changes. (If needed, we can discuss in another issue/thread.)
