Fix Boltzmann, refactor RENDER
Bug Fixes
BoltzmannPolicy
PR: #109
- fix state reshape with dimension > 1 using np.expand_dims
- guard underflow by doing np.clip before np.exp
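A minimal sketch of the two fixes above, not the code from the PR itself. The function names, the `tau` temperature parameter, and the clip bounds of ±500 are assumptions made for illustration only.

```python
import numpy as np


def add_batch_axis(state):
    """Prepend a batch axis so a state of any rank becomes a batch of one.

    Hypothetical helper: np.expand_dims works for states with more than
    one dimension, where a hard-coded reshape would not.
    """
    return np.expand_dims(np.asarray(state), axis=0)


def boltzmann_probs(q_values, tau=1.0, clip_range=(-500.0, 500.0)):
    """Boltzmann (softmax) action probabilities with a clipped exponent.

    Hypothetical helper: the clip bounds are an assumed choice that keeps
    np.exp within the float64 range in both directions.
    """
    q = np.asarray(q_values, dtype=np.float64)
    # Clip before exponentiating so np.exp neither underflows to 0
    # (which would make the normalizing sum 0) nor overflows to inf.
    scaled = np.clip(q / tau, *clip_range)
    exp_values = np.exp(scaled)
    return exp_values / np.sum(exp_values)


# Usage: a 2-D state gains a leading batch axis, and extreme Q-values
# still produce a valid probability distribution.
batched = add_batch_axis(np.zeros((4, 3)))
probs = boltzmann_probs([-1000.0, -1000.0])
```

Without the clip, `np.exp(-1000.0)` evaluates to `0.0`, so the normalization would divide zero by zero and return NaN probabilities.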
Misc
- rename class from DoubleDQNPolicy to DoubleDQNEpsilonGreedyPolicy for clarity
- refactor useless RENDER key from rl/spec/problems.json into rl/experiment.py