# Releases · kengz/openai_lab
## Fix Numerical Errors; Improve PER
### Improvements/Bug Fixes

**Misc** PR: #131
- fix overflow error in `np.exp` of `SoftmaxPolicy`, `BoltzmannPolicy` by casting to `float64` instead of `float32` (see the sketch after this list)
- improve overall `np.isfinite` asserts
- remove index after reset in `*analysis.csv`
- remove unused specs
- reorganize and expand test specs
- guard continuous action value range in continuous policies
- fix analytics param variable sourcing
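A minimal sketch of the overflow guard, assuming the policies compute softmax probabilities over Q values; `softmax_probs` and its stability shift are illustrative, not the lab's exact code:

```python
import numpy as np

def softmax_probs(q_values, tau=1.0):
    '''Cast to float64 before np.exp so large Q values do not overflow
    as readily as in float32, then shift by the max so the largest
    exponent is exp(0) = 1.'''
    q = np.asarray(q_values, dtype=np.float64) / tau
    q = q - np.max(q)  # numerical stability shift
    exps = np.exp(q)
    probs = exps / np.sum(exps)
    assert np.all(np.isfinite(probs)), 'softmax produced non-finite probs'
    return probs
```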
**DDPG** PR: #131
- add `EpsilonGreedyNoisePolicy`
**PER** PR: #131
- add `memory.update(errors)` throughout all agents (see the sketch after this list)
- add shape asserts for Q values and errors throughout
- auto-set `max_mem_len` as `max_timestep * max_epis / 3` if not specified
- add the missing `abs` for the initial reward
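A minimal sketch of how an agent can feed TD errors back into prioritized replay after each training step. Only `memory.update(errors)` comes from the note above; the sampling call, `compute_q_targets`, and the batch layout are hypothetical:

```python
import numpy as np

def train_step(agent, memory, batch_size=32):
    '''One PER step: sample, learn, then refresh the priorities of
    the sampled transitions with their fresh TD errors.'''
    batch = memory.sample(batch_size)            # hypothetical sampling API
    q_targets = agent.compute_q_targets(batch)   # hypothetical agent method
    q_preds = agent.model.predict(batch['states'])
    errors = np.abs(q_targets - q_preds).max(axis=-1)
    assert errors.shape == (batch_size,)         # shape guard from the fix
    agent.model.train_on_batch(batch['states'], q_targets)
    memory.update(errors)  # re-prioritize the sampled transitions
```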
## ActorCritic, DDPG
### New Algorithms

**ActorCritic** PR: #118
- add `ActorCritic` agent
- add its policies. Discrete: `ArgmaxPolicy`, `SoftmaxPolicy`; Continuous: `BoundedPolicy`, `GaussianPolicy` (see the sketch after this list)
- add basic specs; solves `CartPole-v0`, `CartPole-v1`, yet to solve the others
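A minimal sketch of a Gaussian policy for continuous actions, which also illustrates the action-range guard from the fixes above; the class shape and parameter names are assumptions, not the lab's implementation:

```python
import numpy as np

class GaussianPolicy:
    '''Sample actions from N(mu, sigma) and clip them into the
    environment's valid action range so out-of-bound values never
    reach the environment.'''
    def __init__(self, action_low, action_high, sigma=1.0):
        self.action_low = action_low
        self.action_high = action_high
        self.sigma = sigma

    def select_action(self, mu):
        action = np.random.normal(loc=mu, scale=self.sigma)
        return np.clip(action, self.action_low, self.action_high)
```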
**DDPG** PR: #118
- add `DDPG` agent with custom tensorflow ops
- add its policies (Continuous only for now): `NoNoisePolicy`, `LinearNoisePolicy`, `GaussianWhiteNoisePolicy`, `OUNoisePolicy` (see the sketch after this list)
- add basic specs; solves `Pendulum-v0`
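A minimal sketch of Ornstein-Uhlenbeck exploration noise, the process behind a policy like `OUNoisePolicy`; the default parameters here are the commonly used ones, not necessarily the lab's:

```python
import numpy as np

class OUNoise:
    '''Ornstein-Uhlenbeck process: temporally correlated noise,
    well suited to physical control tasks like Pendulum-v0.'''
    def __init__(self, dim, mu=0.0, theta=0.15, sigma=0.3):
        self.mu = mu * np.ones(dim)
        self.theta = theta  # mean-reversion rate
        self.sigma = sigma  # noise scale
        self.state = self.mu.copy()

    def sample(self):
        dx = self.theta * (self.mu - self.state) \
            + self.sigma * np.random.randn(*self.state.shape)
        self.state = self.state + dx
        return self.state
```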
### Improvements/Bug Fixes

PR: #118
## Component locks; virtualenv/conda installation support
### Component Locks

PR: #120

We have many components, and not all of them are compatible with one another. When scheduling experiments and designing specs, it is hard to keep them all in check. This release adds component locks, which automatically check all specs at import time against the locks specified in `rl/spec/component_locks.json` (a sketch of such a check follows below). The design uses the minimum description length principle. When adding new components, be sure to update this file.

- add double-network component lock
- add discrete-action component lock; assume a continuous agent can handle discrete action spaces as a generalization
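A minimal sketch of an import-time lock check. The JSON schema assumed here (a head component mapping to the components allowed with it) and the function name are illustrative only, not the actual shape of `rl/spec/component_locks.json`:

```python
import json

def check_component_locks(spec, lock_path='rl/spec/component_locks.json'):
    '''Fail fast if a spec pairs incompatible components.
    Assumed schema: {lock_name: {"head": key, "includes":
    {head_value: {component_key: [allowed values]}}}}.'''
    with open(lock_path) as f:
        locks = json.load(f)
    for name, lock in locks.items():
        head_value = spec.get(lock['head'])
        if head_value in lock['includes']:
            for key, allowed in lock['includes'][head_value].items():
                assert spec.get(key) in allowed, (
                    'component lock "{}" violated: {}={} not compatible '
                    'with {}={}'.format(name, key, spec.get(key),
                                        lock['head'], head_value))
```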
### Improved Installation

PR: #121. Solves: #113, #114, #115

- fix broken gym installation; see gym PR 558
- lay out installation steps in the doc; use binaries for server setup
- introduce version locks for dependencies with `requirements.txt`, `environment.yml`
- support installation via system `python`, `virtualenv`, or `conda`, integrated into Grunt
- add `quickstart_dqn` as an example quickstart in the doc
### Bug Fixes

**DoubleDQN** PR: #119
- restore the missing `recompile_model` call to the second model in DoubleDQN (see the sketch after this list)
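A minimal sketch of the fix's shape, assuming two Keras networks whose optimizer settings must stay in sync; everything except the idea of recompiling both models is an assumption:

```python
from keras.optimizers import Adam

def recompile_models(model, model_2, lr):
    '''When hyperparameters such as the learning rate change, both
    networks must be recompiled; the bug left model_2 with stale
    optimizer settings. (Illustrative, not the lab's exact code.)'''
    model.compile(loss='mse', optimizer=Adam(lr=lr))
    model_2.compile(loss='mse', optimizer=Adam(lr=lr))  # the restored call
    return model, model_2
```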
## Fix Boltzmann, refactor RENDER
### Bug Fixes

**BoltzmannPolicy** PR: #109
- fix state reshape for dimension `> 1` using `np.expand_dims`
- guard underflow by applying `np.clip` before `np.exp` (see the sketch after this list)
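A minimal sketch combining both fixes, assuming the policy exponentiates temperature-scaled Q values; the clip bounds and helper name are illustrative:

```python
import numpy as np

def boltzmann_probs(state, q_func, tau=1.0):
    '''Reshape a state of dimension > 1 into a batch of one, then
    clip the scaled Q values before np.exp to keep exponents in a
    safe range.'''
    batch_state = np.expand_dims(state, axis=0)  # (dims,) -> (1, dims)
    q_values = q_func(batch_state)[0]
    scaled = np.clip(q_values / tau, -50.0, 50.0)  # illustrative bounds
    exps = np.exp(scaled)
    return exps / np.sum(exps)
```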
### Misc

- rename class `DoubleDQNPolicy` to `DoubleDQNEpsilonGreedyPolicy` for clarity
- refactor the redundant `RENDER` key out of `rl/spec/problems.json` and into `rl/experiment.py`
## Fix PER breakage
### Bug Fixes

**PER** PR: #108
- fix PER breakage on negative `error = reward` by adding a bump `min_priority = abs(10 * SOLVED_MEAN_REWARD)`
- add a positive `min_priority` for all problems, since they may have negative rewards; we cannot use `error = abs(reward)` because the sign matters for priority calculation
- add an assert guard to ensure `priority` is not `nan` (see the sketch after this list)
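A minimal sketch of a priority computation with the `min_priority` bump; the exponent `alpha` follows the standard PER formula and is an assumption here:

```python
import numpy as np

def compute_priority(error, min_priority, alpha=0.6):
    '''Shift the raw (possibly negative) error by a positive
    min_priority so the base stays positive; a negative base under a
    fractional alpha would yield nan, hence the assert.'''
    priority = (error + min_priority) ** alpha
    assert not np.isnan(priority), 'priority must not be nan'
    return priority
```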
## First stable release

First stable release of OpenAI Lab. PR: #106
- stable and generalized RL components design
- implement discrete agents: `DQN`, `double-DQN`, `SARSA`, `PER`
- run dozens of experiments as Lab tests; solve numerous discrete environments on the Fitness Matrix, with PR submissions, mainly `CartPole-v0`, `CartPole-v1`, `Acrobot-v1`, `LunarLander-v2`
- complete documentation page
- complete analytics framework, with a generalized `fitness_score` as the evaluation metric
- stable system design after many iterations
- ready for more implementations and new research