GRAD_MOMENTUM not used by RMSProp in DQN #37
Conversation
I believe this is actually intentional, although very poorly documented. I'm a bit rusty on the context, but take a look at the original DQN source code here: https://github.com/google-deepmind/dqn/blob/master/dqn/NeuralQLearner.lua. They actually implement something like centred RMSProp without momentum. I believe in the Nature version of the DQN paper they have parameters referred to as "squared gradient momentum" and "gradient momentum", each set to 0.95. Cross-referencing this with the source code, it seems these refer to the decay factors on the gradient and the squared gradient, which are fixed to be the same in the centred version of the algorithm (see for example the PyTorch version here: https://pytorch.org/docs/stable/generated/torch.optim.RMSprop.html). Nice catch in any case!
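For concreteness, here's a minimal sketch of that reading in PyTorch (the network and the exact hyperparameter values are placeholders for illustration, not copied from the MinAtar code):

```python
import torch
import torch.nn as nn

# Placeholder network and hyperparameters, for illustration only.
net = nn.Linear(4, 2)
SQUARED_GRAD_MOMENTUM = 0.95  # "squared gradient momentum" in the Nature paper
MIN_SQUARED_GRAD = 0.01       # small constant added to the denominator for stability

# Under this reading, both "momentum" hyperparameters collapse into the single
# decay factor `alpha` of centred RMSProp, while PyTorch's `momentum` argument
# (a heavy-ball term on the parameter update) stays at its default of 0,
# matching the original Lua implementation.
optimizer = torch.optim.RMSprop(
    net.parameters(),
    lr=0.00025,
    alpha=SQUARED_GRAD_MOMENTUM,
    centered=True,
    eps=MIN_SQUARED_GRAD,
)
```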
Ah, I see what you mean! The "momentum" terms are referring to the first and second moment estimates in the denominator. I didn't check the original optimizer before opening the PR 😅 In light of this, I think we could either keep this PR's fix of passing GRAD_MOMENTUM into the optimizer, or simply delete the unused variable.
Sorry for the less-than-expedient reply. Deleting the unused variable makes sense to me! Feel free to amend the PR to do that and I can accept it.
@kenjyoung PR updated!
👍
@kenjyoung I don't have write access, would you be able to squash and merge for me?
For sure, I thought I did already haha.
I noticed that `GRAD_MOMENTUM` on the following line is never used: MinAtar/examples/dqn.py, line 45 at 1918a2f.
I think this is likely a bug, since the default momentum for PyTorch's RMSprop is 0, not 0.95 (see the docs).
This PR fixes it by passing the value into the optimizer constructor.
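Roughly, the change looks like the sketch below. The surrounding names (`policy_net`, the other hyperparameters, and the exact optimizer call) follow the MinAtar example but are assumptions here, not copied from the file:

```python
import torch
import torch.nn as nn

# Placeholder network and hyperparameter values for illustration.
policy_net = nn.Linear(4, 2)
STEP_SIZE = 0.00025
GRAD_MOMENTUM = 0.95
SQUARED_GRAD_MOMENTUM = 0.95
MIN_SQUARED_GRAD = 0.01

# Before: GRAD_MOMENTUM is defined but never passed, so PyTorch's default
# momentum of 0 is silently used.
# optimizer = torch.optim.RMSprop(policy_net.parameters(), lr=STEP_SIZE,
#                                 alpha=SQUARED_GRAD_MOMENTUM, centered=True,
#                                 eps=MIN_SQUARED_GRAD)

# After (the change proposed in this PR): pass the value explicitly.
optimizer = torch.optim.RMSprop(policy_net.parameters(), lr=STEP_SIZE,
                                alpha=SQUARED_GRAD_MOMENTUM, centered=True,
                                eps=MIN_SQUARED_GRAD, momentum=GRAD_MOMENTUM)
```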