Hi there, cool project! I'm trying to reproduce the A3C results with my own implementation and have two questions about the parameters confirmed by Dr. Mnih, as listed on the Wiki page: (1) The Wiki says there was no loss clipping. However, the A3C paper does mention gradient clipping, which I believe is very similar. (2) In the original DQN paper, rewards were normalized by `sign(R(s))` rather than `max(0, min(R(s), 1))` as listed in the Wiki. Could you please clarify these two points?
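To make question (2) concrete, here is a minimal sketch of how the two reward transforms differ. The function names `sign_reward` and `clip_reward_unit` are made up for illustration and come from neither paper nor this project's code.

```python
def sign_reward(r: float) -> float:
    """Map a raw reward to -1, 0, or +1, i.e. sign(R(s))."""
    if r > 0:
        return 1.0
    if r < 0:
        return -1.0
    return 0.0


def clip_reward_unit(r: float) -> float:
    """Clip a raw reward to the interval [0, 1], i.e. max(0, min(R(s), 1))."""
    return max(0.0, min(r, 1.0))


# Hypothetical raw rewards, chosen only to show where the two transforms diverge.
raw_rewards = [-5.0, -0.5, 0.0, 0.5, 5.0]
for r in raw_rewards:
    print(f"raw={r:5.1f}  sign={sign_reward(r):4.1f}  clip01={clip_reward_unit(r):4.1f}")
```

The two agree only for rewards of 0 and for rewards of at least 1. Negative rewards become -1 under the sign transform but 0 under the [0, 1] clip, and fractional positive rewards become 1 under the sign transform but pass through unchanged under the clip.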