Fix PER breakage
Bug Fixes
PER
PR: #108
- fix PER breakage on negative `error = reward` by adding a bump `min_priority = abs(10 * SOLVED_MEAN_REWARD)`
- add a positive `min_priority` for all problems, since they may have negative rewards. We cannot use `error = abs(reward)` because the sign of the error matters for the priority calculation
- add an assert guard to ensure `priority` is not `nan`
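
The fix above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR. It assumes the common PER form `priority = (error + min_priority) ** alpha`. The helper names, the `ALPHA` value, and the `SOLVED_MEAN_REWARD` value are hypothetical. The explicit `nan` branch mimics how array libraries typically treat a negative base with a fractional exponent.

```python
import math

# Hypothetical values, for illustration only.
SOLVED_MEAN_REWARD = -50.0
ALPHA = 0.6  # assumed prioritization exponent


def get_min_priority(solved_mean_reward):
    """Positive bump, as described in the notes: abs(10 * SOLVED_MEAN_REWARD)."""
    return abs(10 * solved_mean_reward)


def get_priority(error, min_priority, alpha=ALPHA):
    """Assumed PER priority: (error + min_priority) ** alpha.

    A negative base with a fractional exponent has no real result; array
    libraries usually return nan there, which is mimicked explicitly here.
    """
    base = error + min_priority
    priority = math.nan if base < 0 else base ** alpha
    # Guard from the notes: fail loudly instead of storing a nan priority.
    assert not math.isnan(priority), f"priority is nan for error={error}"
    return priority


if __name__ == "__main__":
    min_priority = get_min_priority(SOLVED_MEAN_REWARD)
    low = get_priority(-200.0, min_priority)   # worse reward
    high = get_priority(-100.0, min_priority)  # better reward
    # The bump keeps the ordering of the signed errors intact.
    print(low < high)
    # abs() would flip that ordering, which is why it is not used.
    print(abs(-200.0) < abs(-100.0))
```

The bump only has to be large enough to keep `error + min_priority` non-negative for the rewards a problem can produce. The factor of 10 comes from the notes above, and whether it is sufficient for a given environment is not something this sketch verifies.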