[Question] The actual training timesteps don't correspond with the hyper-parameters for Atari #367

cx441000319 · 2023-03-14T02:41:38Z

❓ Question

Hi,

As the title says, it seems the issue only occurs in Atari. Here are some commands and images for reference:

Experiment Command:
python train.py --algo ppo --env PongNoFrameskip-v4

Training Plotting Command:
python scripts/plot_train.py -a ppo -e PongNoFrameskip-v4 -f logs

Evaluation Plotting Command:
python scripts/all_plots.py -a ppo -e PongNoFrameskip-v4 -f logs --no-million -max 10000000

From the plots, the number of training timesteps is about 4e7 instead of 1e7 (n_timesteps in the hyper-parameters). Based on my experiment results, the issue only appears in Atari environments. To reproduce it, replace the hyper-parameter n_timesteps with a small number such as 1e4; the episode lengths in the logs then add up to far more than 1e4 samples.
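One way to check the sample count described above is to sum the episode lengths directly. The snippet below is a minimal sketch, assuming the logs are SB3 Monitor-style CSV files (a metadata line starting with `#`, then columns `r`, `l`, `t`). The helper name `total_logged_steps` and the sample rows are made up for illustration and are not real training output.

```python
import csv
import io


def total_logged_steps(monitor_text: str) -> int:
    """Sum the episode-length column `l` of a Monitor-style CSV (hypothetical helper)."""
    # Drop empty lines and the metadata line(s) that start with '#'.
    lines = [ln for ln in monitor_text.splitlines() if ln and not ln.startswith("#")]
    reader = csv.DictReader(io.StringIO("\n".join(lines)))
    return sum(int(row["l"]) for row in reader)


# Made-up sample data, only to show the expected file layout.
sample = """#{"t_start": 0.0, "env_id": "PongNoFrameskip-v4"}
r,l,t
-21.0,3056,10.5
-20.0,3400,21.7
"""

print(total_logged_steps(sample))  # 3056 + 3400 = 6456
```

Comparing this total with n_timesteps shows how large the gap is for a given run.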

Thank you so much in advance!

Checklist

I have checked that there is no similar issue in the repo
I have read the SB3 documentation
I have read the RL Zoo documentation
If code there is, it is minimal and working
If code there is, it is formatted using the markdown code blocks for both code and stack traces.

araffin · 2023-03-14T06:20:30Z

Hello,
this is expected because of preprocessing for Atari games (the action repeat, aka frameskip, is set to 4 by default).

Related: DLR-RM/stable-baselines3#181
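The arithmetic behind this can be sketched as follows, assuming each agent timestep repeats the chosen action for `frame_skip` emulator frames and that the logged episode lengths count emulator frames. The function name `emulator_frames` is hypothetical and only illustrates the ratio.

```python
def emulator_frames(agent_steps: int, frame_skip: int = 4) -> int:
    """Emulator frames consumed by `agent_steps` decisions with action repeat (illustrative)."""
    return agent_steps * frame_skip


n_timesteps = 10_000_000  # 1e7, the value from the hyper-parameters
print(emulator_frames(n_timesteps))  # 40000000, i.e. about 4e7
print(emulator_frames(10_000))       # 40000 for the 1e4 reproduction case
```

Under these assumptions, a budget of 1e7 agent timesteps corresponds to about 4e7 frames, which matches the numbers reported in the question.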

cx441000319 · 2023-03-14T16:09:35Z

Oh, that totally makes sense. I tried my best to check if there were any details I ignored, but I didn't realize it before. It's all clear now. Thank you so much for your quick reply!

cx441000319 added the question label Mar 14, 2023

cx441000319 closed this as completed Mar 14, 2023
