Hi,
I want to calculate ep_rew_mean on a custom Atari environment, but I ran into a problem while doing so. I used the following code.
I debugged this code and checked line 175 in "on_policy_algorithm.py":
I counted by hand using the done condition and got around 30, while ep_len_mean is around 600. There is also a difference between ep_rew_mean and the hand-counted reward.
What is the reason, and how can I implement "ep_rew_mean" in my custom algorithm while using your Atari wrapper?
Thank you.
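
For reference, here is a minimal sketch of how a rolling ep_rew_mean could be reproduced in a custom algorithm. It rests on assumptions, not on a confirmed answer: that a Monitor-style wrapper sits below the Atari wrapper and writes an `info["episode"]` dict with keys `"r"` (return) and `"l"` (length) when a full game ends, and that the logged means are averages over a window of the most recent such entries. If the Atari wrapper also ends episodes on life loss and clips rewards, counting on `done` would give shorter episodes and different rewards than the logged values. The `EpisodeStats` class and its `window_size` parameter are hypothetical names used only for illustration.

```python
from collections import deque
from statistics import mean


class EpisodeStats:
    """Rolling episode statistics built from info["episode"] entries (hypothetical helper)."""

    def __init__(self, window_size=100):
        # Keep only the most recent `window_size` finished episodes.
        self.buffer = deque(maxlen=window_size)

    def update(self, infos):
        # `infos` is the list of info dicts returned by one vectorized env step.
        # Assumption: a Monitor-style wrapper adds info["episode"] = {"r": ..., "l": ...}
        # only when a real episode ends, not on every `done` signal.
        for info in infos:
            episode = info.get("episode")
            if episode is not None:
                self.buffer.append(episode)

    def ep_rew_mean(self):
        if not self.buffer:
            return float("nan")
        return mean(ep["r"] for ep in self.buffer)

    def ep_len_mean(self):
        if not self.buffer:
            return float("nan")
        return mean(ep["l"] for ep in self.buffer)


# Usage with made-up info dicts: two steps carry no episode entry,
# two steps mark the end of a full game.
stats = EpisodeStats(window_size=100)
stats.update([{}, {"episode": {"r": 30.0, "l": 600}}])
stats.update([{"episode": {"r": 10.0, "l": 200}}, {}])
```

Calling `stats.update(infos)` after every environment step and reading `stats.ep_rew_mean()` at logging time would then track full-game statistics instead of per-`done` statistics, under the assumptions stated above.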