Gradient clipping and reward normalization parameters #18

danijar · 2016-07-22T13:40:32Z

Hi there, cool project! I'm trying to reproduce the A3C results with my own implementation and have two questions about the parameters on the Wiki page that Dr. Mnih confirmed: (1) The Wiki says there was no loss clipping. However, the A3C paper does mention gradient clipping, which I believe is very similar. (2) The original DQN paper normalized rewards by sign(R(s)) rather than by max(0, min(R(s), 1)) as listed in the Wiki. Could you please clarify these two points?
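
To make the two points concrete, here is a minimal NumPy sketch of what I mean. The function names are my own and purely illustrative; they are not taken from either paper or from the Wiki. The gradient clipping shown is clipping by global norm, which is only one common variant and an assumption on my part, not necessarily what was used for A3C.

```python
import numpy as np

def reward_sign(r):
    """Normalize rewards by sign: every reward becomes -1, 0, or +1."""
    return np.sign(r)

def reward_clip_unit(r):
    """Normalize rewards as max(0, min(r, 1)): values land in [0, 1],
    so every negative reward becomes 0."""
    return np.maximum(0.0, np.minimum(r, 1.0))

def clip_by_global_norm(grads, max_norm):
    """Rescale a list of gradient arrays so their joint L2 norm is at most
    max_norm (assumed variant of gradient clipping, for illustration)."""
    global_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    scale = min(1.0, max_norm / (global_norm + 1e-12))
    return [g * scale for g in grads]

rewards = np.array([-5.0, -0.5, 0.0, 0.5, 5.0])
signed = reward_sign(rewards)        # negative rewards are kept as -1
clipped = reward_clip_unit(rewards)  # negative rewards are dropped to 0

# The direction of the gradient is preserved; only its magnitude shrinks.
grads = clip_by_global_norm([np.array([3.0, 4.0])], max_norm=1.0)
```

The two reward schemes differ in two ways: sign(R(s)) still penalizes negative rewards while the [0, 1] clip ignores them, and small positive rewards such as 0.5 are rounded up to 1 by sign but left unchanged by the clip.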
