
GRAD_MOMENTUM not used by RMSProp in DQN #37

Merged
merged 2 commits into kenjyoung:master from grad-momentum on Dec 19, 2024

Conversation

brett-daley
Contributor

I noticed that GRAD_MOMENTUM on the following line is never used:

GRAD_MOMENTUM = 0.95

I think this is likely a bug, since the default momentum for PyTorch's RMSprop is 0, not 0.95 (see the docs).
This PR fixes that by passing the value into the optimizer constructor.
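For reference, a minimal sketch of the change as originally proposed; `network`, the learning rate, and the other values here are placeholders for illustration, not the repository's actual code:

```python
import torch.optim as optim

GRAD_MOMENTUM = 0.95

# The PR as first opened: pass GRAD_MOMENTUM through so it is no longer
# silently unused. torch.optim.RMSprop defaults `momentum` to 0.
optimizer = optim.RMSprop(
    network.parameters(),  # `network` is a placeholder for the Q-network
    lr=0.00025,            # illustrative step size, not necessarily the repo's value
    momentum=GRAD_MOMENTUM,
)
```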

@kenjyoung
Owner

kenjyoung commented Dec 9, 2024

I believe this is actually intentional, although very poorly documented. I'm a bit rusty on the context, but take a look at the original DQN source code here: https://github.com/google-deepmind/dqn/blob/master/dqn/NeuralQLearner.lua. They actually implement something like centred RMSProp without momentum. In the Nature version of the DQN paper, the parameters referred to as squared gradient momentum and gradient momentum are each set to 0.95. Cross-referencing this with the source code, these appear to be the decay factors on the gradient and the squared gradient, which are fixed to be the same in the centred version of the algorithm (see, for example, the PyTorch version here: https://pytorch.org/docs/stable/generated/torch.optim.RMSprop.html).
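To make that mapping concrete, here is a rough sketch (placeholder names and values, not the repository's actual code) of how both 0.95 hyperparameters correspond to the single decay factor `alpha` in PyTorch's centred RMSprop:

```python
import torch.optim as optim

SQUARED_GRAD_MOMENTUM = 0.95  # decay on the running averages of g and g^2
MIN_SQUARED_GRAD = 0.01       # placeholder for the eps term in the denominator

# With centered=True, PyTorch keeps running means of the gradient and of the
# squared gradient, both decayed by `alpha`, and divides the update by
# sqrt(E[g^2] - E[g]^2) + eps. So the paper's "gradient momentum" and
# "squared gradient momentum" are both this alpha; the separate `momentum`
# argument (heavy-ball momentum on the update) stays at its default of 0.
optimizer = optim.RMSprop(
    network.parameters(),     # `network` is a placeholder for the Q-network
    lr=0.00025,               # illustrative step size
    alpha=SQUARED_GRAD_MOMENTUM,
    centered=True,
    eps=MIN_SQUARED_GRAD,
)
```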

Nice catch in any case!

@brett-daley
Contributor Author

Ah, I see what you mean! The "momentum" terms refer to the first- and second-moment estimates in the denominator. I didn't check the original optimizer before opening the PR 😅

In light of this, I think we could do any of the following:

  • Delete the unused variable to avoid confusion,
  • Add a comment explaining why it's unused, or
  • Just close this PR.

@kenjyoung
Owner

Sorry for the less-than-expedient reply. Deleting the unused variable makes sense to me! Feel free to amend the PR to do that and I can accept it!

@brett-daley
Contributor Author

@kenjyoung PR updated!

@kenjyoung left a comment (Owner)

👍

@brett-daley
Contributor Author

@kenjyoung I don't have write access, would you be able to squash and merge for me?

@kenjyoung merged commit 4241f24 into kenjyoung:master on Dec 19, 2024
@kenjyoung
Owner

For sure, I thought I did already haha.

@brett-daley deleted the grad-momentum branch on December 19, 2024 at 23:35