Rank-based prioritised experience replay appears to be working, but technically needs some changes. Instead of storing terminal states with a priority of 0, they should not be stored at all. This requires more checks, as the elements in the experience replay memory and the elements in the priority queue will differ.
Secondly, proportional prioritised experience replay still needs to be implemented. See here and here for an implementation of the sum binary tree.
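As a rough illustration of the sum binary tree mentioned above, here is a minimal Python sketch. It is not taken from either linked implementation or from this repository; the class name `SumTree` and its methods `update`, `total`, and `find` are hypothetical names chosen for the example. Leaves hold transition priorities, each internal node holds the sum of its two children, and sampling walks down from the root with a value drawn uniformly from `[0, total)`.

```python
import random


class SumTree:
    """Array-backed binary sum tree (illustrative sketch, names are assumptions).

    Leaves store priorities; every internal node stores the sum of its children,
    so the root holds the total priority mass.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        # Nodes 0 .. capacity-2 are internal, capacity-1 .. 2*capacity-2 are leaves.
        self.tree = [0.0] * (2 * capacity - 1)

    def update(self, index, priority):
        """Set the priority of leaf `index` and propagate the change to the root."""
        pos = index + self.capacity - 1
        delta = priority - self.tree[pos]
        self.tree[pos] = priority
        while pos > 0:
            pos = (pos - 1) // 2
            self.tree[pos] += delta

    def total(self):
        """Sum of all stored priorities."""
        return self.tree[0]

    def find(self, value):
        """Return the leaf index whose cumulative priority range contains `value`."""
        pos = 0
        while pos < self.capacity - 1:  # stop once a leaf is reached
            left = 2 * pos + 1
            if value < self.tree[left]:
                pos = left
            else:
                value -= self.tree[left]
                pos = left + 1
        return pos - (self.capacity - 1)

    def sample(self):
        """Draw one leaf index with probability proportional to its priority."""
        return self.find(random.uniform(0.0, self.total()))
```

Both `update` and `find` take O(log n) steps, which is the reason for using a tree rather than recomputing cumulative sums over the whole memory on every sample.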
For reference, below are results from a working implementation of rank-based PER on Frostbite:
Maybe we could store each experience as a tuple (s_t, a, r, s_t_1, t), where t is a terminal flag. With this pattern a terminal state is never stored as an entry of its own in the experience replay memory. Usually t is 0; a transition that ends the episode has t == 1 and is stored as (s, a, r, TERMINAL_STATE, 1).
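To make the suggestion concrete, here is a minimal Python sketch of that tuple layout. The names `Transition`, `episode_to_transitions`, and `td_target` are made up for this example and are not part of the repository, and `TERMINAL_STATE` is just a placeholder value (`None`) here.

```python
from collections import namedtuple

# Placeholder for the terminal next-state; its contents are never used.
TERMINAL_STATE = None

# (s_t, a, r, s_t_1, t) as in the comment above; field names are assumptions.
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "terminal"]
)


def episode_to_transitions(states, actions, rewards):
    """Turn one episode into transitions.

    `states` holds only the non-terminal states that were visited, so the
    terminal state never becomes an entry of its own; it only appears as the
    `next_state` of the last transition, flagged with terminal == 1.
    """
    transitions = []
    last = len(states) - 1
    for i, (s, a, r) in enumerate(zip(states, actions, rewards)):
        if i == last:
            transitions.append(Transition(s, a, r, TERMINAL_STATE, 1))
        else:
            transitions.append(Transition(s, a, r, states[i + 1], 0))
    return transitions


def td_target(transition, next_state_value, discount=0.99):
    """One-step target: bootstrapping is skipped when the terminal flag is set."""
    if transition.terminal:
        return transition.reward
    return transition.reward + discount * next_state_value
```

With this layout every stored element can carry a real priority, so the replay memory and the priority queue would hold the same set of elements and no zero-priority entries are needed for terminal states.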