I've got a repo where I implemented importance sampling for DDPG with PER. I'm still unsure if it works, but lines 68 - 91 might be useful for you. It's pretty slow, and I still haven't figured out a faster way.
It would be interesting to see how not using importance sampling would affect your results.
I could be wrong, but it does not seem that you are annealing the bias with importance sampling as suggested in the PER paper (section 3.4).
w_i = (1/N * 1/P(i))^beta
I think you would have to multiply each sample's gradient by this w_i term.
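To make the suggestion concrete, here is a rough NumPy sketch of how the weights could be computed and applied. It is not taken from either repo in this thread: the function names, the `alpha`/`beta` defaults, the linear annealing schedule, and normalizing by the largest weight in the batch are all assumptions for illustration. Weighting the squared TD error by w_i has the same effect as scaling each sample's gradient by w_i.

```python
import numpy as np

def importance_weights(priorities, sampled_idx, alpha=0.6, beta=0.4):
    """w_i = (1/N * 1/P(i))^beta for the sampled transitions.

    P(i) is assumed to be proportional to priority_i ** alpha.
    Weights are divided by the batch maximum so they only scale
    updates downward (an assumed, commonly used normalization).
    """
    priorities = np.asarray(priorities, dtype=np.float64)
    scaled = priorities ** alpha
    probs = scaled / scaled.sum()          # P(i) over the whole buffer
    n = len(priorities)                    # N = buffer size
    weights = (1.0 / (n * probs[sampled_idx])) ** beta
    return weights / weights.max()

def anneal_beta(step, total_steps, beta_start=0.4, beta_end=1.0):
    """Linearly anneal beta toward 1 (hypothetical schedule)."""
    frac = min(1.0, step / total_steps)
    return beta_start + frac * (beta_end - beta_start)

def weighted_critic_loss(td_errors, weights):
    """Mean of w_i * delta_i^2 over the batch."""
    td_errors = np.asarray(td_errors, dtype=np.float64)
    weights = np.asarray(weights, dtype=np.float64)
    return float(np.mean(weights * np.square(td_errors)))
```

In a training loop you would call `anneal_beta` once per update, pass the result as `beta` to `importance_weights`, and use the returned weights in the critic loss for that batch.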