-
Notifications
You must be signed in to change notification settings - Fork 8.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support only new step API (while retaining compatibility functions) #3019
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, looks good. I only have one suggestion on the same change. I realise that this is not backward compatible but for the future, I think that it is better.
If there is a problem with v25 compatibility, then there may be enough changes to justify a v25.1 with this parameter change.
In #3028, I have removed the language changes such that v0.25.3 will be backward compatible.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, using control f "done" I found the following theses that need changing
- The README.md needs updating
- passive environment checker.py line 229 updated to "rewrite the environment with (new) terminated / truncated step API."
- play.py docstrings for
play()
andPlayPlot
- vector_env.py docstring on line 141
- docstring in autoreset.py line 10, 13 and 20
- replace normalize.py
if not self.is_vector_env:
dones = terminateds or truncateds
else:
dones = np.bitwise_or(terminateds, truncateds)
with dones = np.logical_or(terminateds, truncateds)
works with both bool and np.ndarray
7. Update wrappers step api compatbiility.py "Old step API refers to" to "(Old) Done step API refers to" and the same for "New step API" with "(New) Terminated / Truncated step API"
@RedTachyon This looks good to me, I did a document search for uses of "done" and "new_step_api", and all are correct now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a bunch of comments
"""Run one timestep of the environment's dynamics. | ||
|
||
When end of episode is reached, you are responsible for calling :meth:`reset` to reset this environment's state. | ||
Accepts an action and returns either a tuple `(observation, reward, terminated, truncated, info)`, or a tuple | ||
(observation, reward, done, info). The latter is deprecated and will be removed in future versions. | ||
Accepts an action and returns either a tuple `(observation, reward, terminated, truncated, info)`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove "either"
@@ -557,7 +557,7 @@ def make( | |||
id: Name of the environment. Optionally, a module to import can be included, eg. 'module:Env-v0' | |||
max_episode_steps: Maximum length of an episode (TimeLimit wrapper). | |||
autoreset: Whether to automatically reset the environment after each episode (AutoResetWrapper). | |||
new_step_api: Whether to use old or new step API (StepAPICompatibility wrapper). Will be removed at v1.0 | |||
apply_step_compatibility: Whether to use apply compatibility wrapper that converts step method to return two bools (StepAPICompatibility wrapper) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't we removing this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or I guess it might be useful for automatically supporting legacy environments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think we should keep a parameter in make to easily apply the compatibility wrapper
|
||
# Add human rendering wrapper | ||
if apply_human_rendering: | ||
env = HumanRendering(env) | ||
|
||
# Add step API wrapper |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it work if the compatibility wrapper is at the end? As far as I understand, the use case here is if someone has a legacy environment, then it would convert it to a new-style environment. But wouldn't one of the wrappers before this crash out if the compatibility is not handled in advance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(checked now, it doesn't work, at least assuming my understanding is correct)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think you are correct, this should occur after the environment checker in order of wrapper
@@ -219,11 +220,6 @@ def play( | |||
deprecation( | |||
"`play.py` currently supports only the old step API which returns one boolean, however this will soon be updated to support only the new step api that returns two bools." | |||
) | |||
if env.render_mode not in {"rgb_array", "single_rgb_array"}: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this removed? Seems irrelevant
step_returns: Union[NewStepType, OldStepType], | ||
new_step_api: bool = False, | ||
step_returns: Union[TerminatedTruncatedStepType, DoneStepType], | ||
output_truncation_bool: bool = True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like this argument name, it's pretty unclear. I think there's a different name used earlier?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to_termination_truncation_api
? I agree it is not a great name but uncertain of a better one
@@ -11,27 +10,27 @@ class AutoResetWrapper(gym.Wrapper): | |||
with new step API and ``(new_obs, final_reward, final_done, info)`` with the old step API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the mention of the old style API should be removed?
dones = terminateds or truncateds | ||
else: | ||
dones = np.bitwise_or(terminateds, truncateds) | ||
dones = np.logical_or(terminateds, truncateds) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change? It used to be either or
or np.bitwise_or
, now it's np.logical_or
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It simplifies the code. I checked, it works for both True
and np.array([True, False])
Args: | ||
env (gym.Env): the env to wrap. Can be in old or new API | ||
new_step_api (bool): True to use env with new step API, False to use env with old step API. (False by default) | ||
apply_step_compatibility (bool): Apply to convert environment to use new step API that returns two bools. (False by default) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rephrase, the "Apply" sounds weird. Probably would be best as something like "Whether or not to [...]"
Also now I think it defaults to True
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should default to False
. We shouldn't apply the compatibility wrapper by default
) | ||
|
||
def step(self, action): | ||
"""Steps through the environment, returning 5 or 4 items depending on `new_step_api`. | ||
"""Steps through the environment, returning 5 or 4 items depending on `apply_step_compatibility`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I said in an earlier comment, sometimes it's output_truncation_bool
, sometimes it's apply_step_compatibility
, and it seems to have been mixed up here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
output_truncation_bool
is used in the step api compatibility wrapper while apply_step_compatibility
is in gym.make
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about it, it should be output_truncation_bool here
|
||
|
||
@pytest.mark.parametrize("VecEnv", [AsyncVectorEnv, SyncVectorEnv]) | ||
def test_vector_step_compatibility_new_env(VecEnv): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we not want to test this anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was purely for vector environments. These tests should be covered in the step compatibility function testing
Due to @arjun-kg travelling, he is not able to fix these changes. Therefore, we will merge this PR with a follow up PR to make the relevant changes |
Having StepAPICompatibility sounds good for backward compatibility purposes despite the new default, but what about |
Description
This removes backward compatibility for done (old) step API (refer #2752 for details on terminated truncated (new) vs done (old) API).
Once merged, gym will officially support only terminated truncated (new) step API. To use done (old) API, for convenience, compatibility functions and a compatibility wrapper at make are still retained.
The changes made for this PR includes,
new_step_api
arguments from vector and wrapper classes, and removing all compatibility code.new_step_api
default toTrue
for compatibility functions, andStepAPICompatibility
wrapperplay.py
Type of change
Please delete options that are not relevant.
Checklist:
pre-commit
checks withpre-commit run --all-files
(seeCONTRIBUTING.md
instructions to set it up)