Allow PPO to turn off advantage normalization #61
Please update the changelog too, otherwise LGTM ;) (and you need an additional issue if you already have one open in the SB3 repo)
tests/test_run.py (outdated)
@pytest.mark.parametrize("normalize_advantage", [False, True])
def test_advantage_normalization(model_class, normalize_advantage):
    model = MaskablePPO("MlpPolicy", "CartPole-v1", n_steps=64, normalize_advantage=normalize_advantage)
not sure if it works with CartPole, let's see...
Hey you may need to approve the workflow run ;)
I did, and there are some failures already ;)
But the best approach is to quickly run the tests locally using pytest's -k argument.
Had the following error, seems to relate to DLR-RM/stable-baselines3#782
Ok should be good now.
Description
Allow PPO to turn off advantage normalization. Follow-up to DLR-RM/stable-baselines3#763, #60
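For context, here is a minimal sketch of what an advantage normalization switch could look like. It is not the code from this PR: the helper names `normalize_advantages` and `prepare_advantages`, the `eps` value, and the use of the population standard deviation are all assumptions made for illustration. Only the `normalize_advantage` flag name comes from the test above.

```python
from statistics import fmean, pstdev
from typing import List


def normalize_advantages(advantages: List[float], eps: float = 1e-8) -> List[float]:
    """Rescale a batch of advantages to zero mean and roughly unit standard deviation."""
    mean = fmean(advantages)
    std = pstdev(advantages)  # assumption: population std; a library may use the sample std
    # eps (hypothetical value) guards against division by zero for constant batches
    return [(a - mean) / (std + eps) for a in advantages]


def prepare_advantages(advantages: List[float], normalize_advantage: bool = True) -> List[float]:
    """Hypothetical helper: apply normalization only when the flag is set."""
    if normalize_advantage and len(advantages) > 1:
        return normalize_advantages(advantages)
    # With the flag off, the raw advantages are passed through unchanged.
    return list(advantages)


raw = [1.0, 2.0, 3.0, 4.0]
normalized = prepare_advantages(raw, normalize_advantage=True)
unchanged = prepare_advantages(raw, normalize_advantage=False)
```

With the flag off, the policy loss would see the raw advantage scale, which may matter for very small batches where the standard deviation estimate is noisy.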
Types of changes
Checklist:
make format (required)
make check-codestyle and make lint (required)
make pytest and make type both pass. (required)
Note: we are using a maximum length of 127 characters per line