Reward shaping not removed in evaluation in CarRacing-From-Pixels-PPO #3
Comments
Thanks for your note,
Hi,
Hi,
Let's see if the following describes my points better.
The code snippet I used was for evaluation.
@lerrytang If the car leaves the field, a penalty of -100 is applied to the reward.
Hi,
The figure and log in the README show scores >1000, which, due to CarRacing's design, is not quite possible.
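To make the bound concrete, here is a rough sketch of how the unshaped CarRacing return is composed, as I understand the environment's documented reward: +1000/N for each of the N track tiles visited and -0.1 per frame. The function name and the tile count of 300 are illustrative assumptions, not values from this repo.

```python
def carracing_return(frames: int, tiles_visited: int, total_tiles: int) -> float:
    """Unshaped episode return: +1000/N per visited tile, -0.1 per frame."""
    tile_reward = 1000.0 * tiles_visited / total_tiles
    time_penalty = 0.1 * frames
    return tile_reward - time_penalty


# Hypothetical perfect lap: every tile visited in 732 frames.
# total_tiles=300 is an assumed value; real tracks vary in length.
best = carracing_return(frames=732, tiles_visited=300, total_tiles=300)
print(best)  # roughly 926.8, always strictly below 1000
```

Since every frame costs 0.1, even an episode that visits all tiles ends below 1000, so an average above 1000 suggests that extra reward terms are being counted.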
It turns out that the reward shaping in `Wrapper.step()` is not removed in evaluation, and that leads to incorrect results. Commenting out the relevant lines, I got an average score of 820 over 100 episodes.
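One possible way to avoid commenting lines in and out is to gate the shaping behind a flag, so training and evaluation share the same wrapper. This is only a minimal sketch under assumptions: `StubEnv`, `ShapedRewardWrapper`, the `shaping` flag, and the constant per-step `bonus` are all hypothetical and do not reproduce the actual shaping terms in this repo's `Wrapper.step()`.

```python
class StubEnv:
    """Hypothetical stand-in for CarRacing: replays fixed (reward, done) pairs."""

    def __init__(self, transitions):
        self._transitions = list(transitions)
        self._t = 0

    def reset(self):
        self._t = 0
        return None  # observation omitted in this sketch

    def step(self, action):
        reward, done = self._transitions[self._t]
        self._t += 1
        return None, reward, done, {}


class ShapedRewardWrapper:
    """Adds an illustrative shaping term only when `shaping` is True."""

    def __init__(self, env, shaping=True, bonus=0.5):
        self.env = env
        self.shaping = shaping
        self.bonus = bonus  # hypothetical shaping term, not the repo's

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        info = dict(info, raw_reward=reward)  # keep the true reward for logging
        if self.shaping:
            reward += self.bonus
        return obs, reward, done, info


def run_episode(env):
    """Sum the rewards returned by `env` until the episode ends."""
    env.reset()
    total, done = 0.0, False
    while not done:
        _, reward, done, _ = env.step(action=None)
        total += reward
    return total


transitions = [(1.0, False), (2.0, False), (-0.1, True)]
train_score = run_episode(ShapedRewardWrapper(StubEnv(transitions), shaping=True))
eval_score = run_episode(ShapedRewardWrapper(StubEnv(transitions), shaping=False))
```

With `shaping=False` the wrapper passes the environment reward through unchanged, so the evaluation score is comparable to scores reported elsewhere for the same environment.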