Update RescaleAction and RescaleObservation for np.inf bounds #1095
Conversation
(the tests failing are due to an unrelated issue, I'll rerun them when that's handled) |
@TimSchneider42 Could you update the PR with the project main |
Looking at the CI, this reminds me that I think we made the change a while ago intentionally from I'm not sure what to do, because the current bound is unreachable to my knowledge (therefore stupid) but |
Hi, From what I can see, all environments with unbounded observation spaces use This also means that the My specific use case is the following: I have a custom normalization wrapper that scales all bounded values to [-1, 1] and leaves unbounded values as is. The CartPole environment makes it seem as if the value is bounded when, in reality, it is not, which made my wrapper fail silently by scaling the value down to almost zero. Since CartPole is the only environment that seems to do it that way, I now have a specific exception for it. Best, |
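The failure mode described above can be reproduced with plain NumPy. This is a minimal sketch, not the commenter's actual wrapper: `scale_bounded_dims` is a hypothetical helper, and the "effectively unbounded" dimension is assumed to be declared with the largest finite float32 value as its limit. Dimensions with two finite bounds are scaled to [-1, 1]; all others pass through unchanged.

```python
import numpy as np


def scale_bounded_dims(obs, low, high):
    """Scale dimensions with finite bounds to [-1, 1]; leave the rest unchanged.

    Hypothetical helper for illustration only.
    """
    obs = np.asarray(obs, dtype=np.float64)
    low = np.asarray(low, dtype=np.float64)
    high = np.asarray(high, dtype=np.float64)

    # A dimension counts as bounded only if both limits are finite.
    bounded = np.isfinite(low) & np.isfinite(high)

    scaled = obs.copy()
    span = high[bounded] - low[bounded]
    scaled[bounded] = 2.0 * (obs[bounded] - low[bounded]) / span - 1.0
    return scaled


obs = np.array([2.4, 2.0])
fmax = float(np.finfo(np.float32).max)  # assumed "huge but finite" limit

# Second dimension declared with a huge finite bound: it looks bounded,
# so its value is scaled down to (almost) zero.
finite_result = scale_bounded_dims(obs, [-4.8, -fmax], [4.8, fmax])

# Same dimension declared with infinite bounds: the value is left as is.
inf_result = scale_bounded_dims(obs, [-4.8, -np.inf], [4.8, np.inf])

print(finite_result)
print(inf_result)
```

Declaring the limits as `np.inf` lets a wrapper like this detect the unbounded dimension with `np.isfinite` instead of needing a per-environment exception.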
Thanks for the rapid response @TimSchneider42, yes I think that is a very reasonable justification |
I just patched
I hope this is what you had in mind. |
@pseudo-rnd-thoughts, I get a warning from the env checker due to the limits being infinite, which makes the tests fail. How do I address that? |
I had a look at test_mujoco_custom_env.py and decided to fix the test by ignoring this specific warning. |
I'm doing some testing to understand the PR better

from gymnasium.spaces import Box
from gymnasium.wrappers import RescaleObservation
from gymnasium.wrappers import RescaleAction
import numpy as np
from tests.testing_env import GenericTestEnv


def func(self, action):
    return action, 0, False, False, {}


env = GenericTestEnv(
    observation_space=Box(
        low=np.array([-10, 0, 3, -np.inf, 0, -np.inf], dtype=np.float32),
        high=np.array([10, 1, 5, np.inf, np.inf, 1], dtype=np.float32),
    ),
    action_space=Box(
        low=np.array([-10, 0, 3, -np.inf, 0, -np.inf], dtype=np.float32),
        high=np.array([10, 1, 5, np.inf, np.inf, 1], dtype=np.float32),
    ),
    step_func=func,
)

print(f'{env.observation_space=}')
env = RescaleObservation(
    env,
    min_obs=np.array([0, 0, -1, -np.inf, -1, -np.inf], dtype=np.float32),
    max_obs=np.array([1, 1, 1, np.inf, np.inf, 2], dtype=np.float32),
)
print(f'{env.observation_space=}')

env.reset()
unscaled_obs = np.array([5, 0.3, 4.5, 2, 10, -10], dtype=np.float32)
scaled_obs, *_ = env.step(unscaled_obs)
print(f'{unscaled_obs=}')
print(f'{scaled_obs=}')
assert np.all(scaled_obs == np.array([0.75, 0.3, 0.5, 2, 9, -9], dtype=np.float32))

print(f'{env.action_space=}')
env = RescaleAction(
    env,
    min_action=np.array([0, 0, -1, -np.inf, -1, -np.inf], dtype=np.float32),
    max_action=np.array([1, 1, 1, np.inf, np.inf, 2], dtype=np.float32),
)
print(f'{env.action_space=}')

env.reset()
unscaled_action = np.array([5, 0.3, 4.5, 2, 10, -10], dtype=np.float32)
scaled_action, *_ = env.step(unscaled_action)
print(f'{unscaled_action=}')
print(f'{scaled_action=}')
assert np.all(scaled_action == np.array([0.75, 0.3, 0.5, 2, 9, -9], dtype=np.float32))

Does this make sense? We can possibly add this testing if it is correct |
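For reference, the expected observation values in the test above are consistent with a simple per-dimension rule: an affine map where both bounds are finite, a pure shift where only one bound is finite, and the identity where neither is. The sketch below is an assumption inferred from those expected values, not necessarily how the wrappers implement it; `rescale` is a hypothetical helper and uses only NumPy.

```python
import numpy as np


def rescale(x, low, high, new_low, new_high):
    """Map x from [low, high] to [new_low, new_high], dimension by dimension.

    Hypothetical helper: the rule is inferred from the expected values in
    the test, not taken from the wrapper source.
    """
    x, low, high, new_low, new_high = (
        np.asarray(a, dtype=np.float64) for a in (x, low, high, new_low, new_high)
    )
    both = np.isfinite(low) & np.isfinite(high)
    lower_only = np.isfinite(low) & ~np.isfinite(high)
    upper_only = ~np.isfinite(low) & np.isfinite(high)

    out = x.copy()  # dimensions with no finite bound stay unchanged
    # Both bounds finite: ordinary affine rescaling.
    gradient = (new_high[both] - new_low[both]) / (high[both] - low[both])
    out[both] = new_low[both] + (x[both] - low[both]) * gradient
    # Only one bound finite: shift so the finite bound lands on its new value.
    out[lower_only] = x[lower_only] - low[lower_only] + new_low[lower_only]
    out[upper_only] = x[upper_only] - high[upper_only] + new_high[upper_only]
    return out


result = rescale(
    x=[5, 0.3, 4.5, 2, 10, -10],
    low=[-10, 0, 3, -np.inf, 0, -np.inf],
    high=[10, 1, 5, np.inf, np.inf, 1],
    new_low=[0, 0, -1, -np.inf, -1, -np.inf],
    new_high=[1, 1, 1, np.inf, np.inf, 2],
)
print(result)
```

The masks keep the infinite entries out of the arithmetic entirely, so no `inf - inf` is ever evaluated.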
@TimSchneider42 I've edited the code above to make it work, would you be able to make similar changes to |
@pseudo-rnd-thoughts, I can look into
What do you mean by that? Also, some doctest checks keep failing. Is there a script I can run to update the docstrings, or do I have to do this by hand? |
For the doctests, you'll need to update them individually, and you need NumPy 2.0 for it. Ignore me on the unbounded / bounded question |
…failed due to the velocity observation limits of CartPole being infinite
@pseudo-rnd-thoughts, I fixed the docstrings and reorganized the commits. Is it fine to be merged now? |
Amazing, thanks for the PR and making all the changes
Description
Fixes #1092
Type of change
Please delete options that are not relevant.
Checklist:
pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)