You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Learning functions across many orders of magnitudes introduces Preserving Outputs Precisely, while Adaptively Rescaling Targets (Pop-Art). In summary it normalises outputs across orders of magnitudes and gets rid of the clipping (i.e. counting) rewards heuristic for Atari games. The normalisation is also better for non-stationary problems, i.e., any decent real world problem.
The below is a picture of extra notes from the authors, next to their poster at NIPS 2016:
The text was updated successfully, but these errors were encountered:
Learning functions across many orders of magnitudes introduces Preserving Outputs Precisely, while Adaptively Rescaling Targets (Pop-Art). In summary it normalises outputs across orders of magnitudes and gets rid of the clipping (i.e. counting) rewards heuristic for Atari games. The normalisation is also better for non-stationary problems, i.e., any decent real world problem.
The below is a picture of extra notes from the authors, next to their poster at NIPS 2016:
The text was updated successfully, but these errors were encountered: