print_statistics() output of ContinuousA2CBase() might be wrong due to frame update implementation? #275

Open
yutaizhou opened this issue Mar 6, 2024 · 0 comments


Hello!

Thank you for the excellent library. I may have found a bug in how frame is tracked during training. It comes from where the frame = self.frame // self.num_agents update is placed, which differs between ContinuousA2CBase.train() and DiscreteA2CBase.train().

In ContinuousA2CBase.train(), the update comes before self.frame += curr_frames, which I believe is incorrect because the printed value then excludes the current epoch's frames. In DiscreteA2CBase.train(), the update comes after self.frame += curr_frames, which I believe is correct.
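To make the ordering difference concrete, here is a minimal sketch. It is not the library's code: TrainerState and the two helper functions are hypothetical stand-ins, and it assumes num_agents=1 and curr_frames = num_envs * horizon_length = 512 * 16, which matches the numbers reported in this issue.

```python
class TrainerState:
    """Hypothetical stand-in for the trainer's frame bookkeeping."""

    def __init__(self, num_agents=1):
        self.frame = 0
        self.num_agents = num_agents


def frames_read_before_increment(state, curr_frames):
    # Ordering described for ContinuousA2CBase.train():
    # the reported value is computed first, then the counter is advanced.
    frame = state.frame // state.num_agents
    state.frame += curr_frames
    return frame


def frames_read_after_increment(state, curr_frames):
    # Ordering described for DiscreteA2CBase.train():
    # the counter is advanced first, then the reported value is computed.
    state.frame += curr_frames
    frame = state.frame // state.num_agents
    return frame


curr_frames = 512 * 16  # assumed: num_envs * horizon_length
before = frames_read_before_increment(TrainerState(), curr_frames)
after = frames_read_after_increment(TrainerState(), curr_frames)
print(before, after)  # 0 8192
```

In both orderings self.frame itself ends up at the same value; only the number handed to the statistics printout differs.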

After one iteration of PPO training with num_envs=512 and horizon_length=16, ContinuousA2CBase.train() prints:

fps step: 6744 fps step and policy inference: 6571 fps total: 6360 epoch: 1/500 frames: 0

After modifying the update to match DiscreteA2CBase.train(), the output is:

fps step: 6744 fps step and policy inference: 6571 fps total: 6360 epoch: 1/500 frames: 8192
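
As a quick sanity check on these numbers, the following sketch simulates what each ordering would report over a few epochs. It assumes num_agents=1 and that each epoch adds num_envs * horizon_length frames; reported_frames is a hypothetical helper, not part of the library.

```python
num_envs, horizon_length, num_agents = 512, 16, 1
frames_per_epoch = num_envs * horizon_length  # assumed frames added per epoch


def reported_frames(epochs, read_after_increment):
    """Return the frame count that would be printed at each epoch."""
    total = 0
    reported = []
    for _ in range(epochs):
        if read_after_increment:
            total += frames_per_epoch
            reported.append(total // num_agents)
        else:
            reported.append(total // num_agents)
            total += frames_per_epoch
    return reported


lagging = reported_frames(3, read_after_increment=False)
current = reported_frames(3, read_after_increment=True)
print(lagging)  # [0, 8192, 16384]
print(current)  # [8192, 16384, 24576]
```

Under these assumptions the first ordering always reports one epoch's worth of frames fewer than the second, which is consistent with frames: 0 versus frames: 8192 after epoch 1.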

yutaizhou changed the title from "print_statistics() of ContinuousA2CBase() might be wrong?" to "print_statistics() output of ContinuousA2CBase() might be wrong due to frame update implementation?" on Mar 6, 2024