
"ep_len_mean" and "ep_rew_mean" are inconsistent for Atari environments #873

Closed

semihtasbas opened this issue Apr 19, 2022 · 1 comment

Labels: duplicate (This issue or pull request already exists), question (Further information is requested)

@semihtasbas commented Apr 19, 2022

Hi,

I want to compute ep_rew_mean for a custom Atari environment, but I ran into a problem while doing so. I used the following code:

from stable_baselines3 import A2C
from stable_baselines3.common.env_util import make_atari_env
from stable_baselines3.common.vec_env import VecFrameStack

env = make_atari_env('BreakoutNoFrameskip-v4', n_envs=16)
env = VecFrameStack(env, n_stack=4)
model = A2C("CnnPolicy", env, verbose=0)
model.learn(total_timesteps=int(5e6))

I debugged this code and checked line 175 of on_policy_algorithm.py:

new_obs, rewards, dones, infos = env.step(clipped_actions)

I counted the done flags by hand and got around 30, while ep_len_mean is around 600. There is also a discrepancy between ep_rew_mean and the reward I counted by hand.
What is the reason for this, and how can I compute "ep_rew_mean" in my custom algorithm while using your Atari wrapper?

Thank you.

@semihtasbas added the bug label on Apr 19, 2022
@araffin added the question and duplicate labels and removed the bug label on Apr 19, 2022
@araffin (Member) commented Apr 19, 2022

Duplicate of #181
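
For reference, the likely source of the mismatch: make_atari_env applies the Monitor wrapper before the Atari wrappers, so the dones and rewards returned at line 175 have already passed through EpisodicLifeEnv (which signals done on every life lost) and ClipRewardEnv (which clips rewards to -1/0/+1), while ep_rew_mean and ep_len_mean are averaged from the info["episode"] entries that Monitor emits only when a full game ends. Below is a minimal sketch of tracking the true episode reward yourself, assuming that standard wrapper order; the random actions are only for illustration:

import numpy as np

from stable_baselines3.common.env_util import make_atari_env
from stable_baselines3.common.vec_env import VecFrameStack

env = make_atari_env("BreakoutNoFrameskip-v4", n_envs=16)
env = VecFrameStack(env, n_stack=4)

obs = env.reset()
episode_rewards = []
for _ in range(10_000):
    # Random actions, just to drive the environment for this sketch.
    actions = np.array([env.action_space.sample() for _ in range(env.num_envs)])
    obs, rewards, dones, infos = env.step(actions)
    for info in infos:
        # Monitor sits below the Atari wrappers, so this key appears only
        # when a real episode (game over) ends; it is unaffected by
        # EpisodicLifeEnv's per-life done flags and ClipRewardEnv's clipping.
        ep = info.get("episode")
        if ep is not None:
            episode_rewards.append(ep["r"])

if episode_rewards:
    print("mean true episode reward:", np.mean(episode_rewards))

If you count info.get("episode") occurrences instead of done flags, the numbers should line up with the logged ep_rew_mean and ep_len_mean.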
