ActorCritic and DDPG #118

lgraesser · 2017-04-08T21:24:28Z

New Algorithms

ActorCritic

add ActorCritic agent
add its policies: discrete (ArgmaxPolicy, SoftmaxPolicy) and continuous (BoundedPolicy, GaussianPolicy)
add basic specs; solves CartPole-v0 and CartPole-v1, the other environments are not yet solved
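To make the four policy names above concrete, here is a minimal NumPy sketch of what each kind of action selection usually means. It is not the code from this PR: the function names, signatures, and the idea that the actor outputs per-action scores (discrete) or a mean (continuous) are assumptions for illustration only.

```python
import numpy as np


def argmax_action(action_scores):
    """Greedy discrete choice: index of the highest-scoring action."""
    return int(np.argmax(action_scores))


def softmax_probs(action_scores):
    """Turn raw action scores into a probability distribution."""
    scores = np.asarray(action_scores, dtype=float)
    # Subtract the max before exponentiating for numerical stability
    exp_scores = np.exp(scores - scores.max())
    return exp_scores / exp_scores.sum()


def softmax_action(action_scores, rng):
    """Stochastic discrete choice: sample an index from the softmax."""
    probs = softmax_probs(action_scores)
    return int(rng.choice(len(probs), p=probs))


def bounded_action(raw_action, low, high):
    """Continuous choice clipped to the environment's action bounds
    (assumed here to be read from the env spec)."""
    return np.clip(raw_action, low, high)


def gaussian_action(mean, variance, rng):
    """Continuous choice sampled around the actor's output, with an
    assumed fixed variance hyperparameter."""
    return rng.normal(mean, np.sqrt(variance))
```

For example, `bounded_action(np.array([3.0]), -2.0, 2.0)` clips the action to Pendulum-v0's torque range of [-2, 2].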

DDPG

add DDPG agent with custom tensorflow ops
add its policies (continuous only for now): NoNoisePolicy, LinearNoisePolicy, GaussianWhiteNoisePolicy, OUNoisePolicy
add basic specs; solves Pendulum-v0
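The noise policies above add exploration noise to the actor's deterministic action. As a rough illustration of the Ornstein-Uhlenbeck variant, the sketch below uses a standard discretized OU process. The class name, default parameters (`theta=0.15`, `sigma=0.2`), and the clipping helper are assumptions, not the implementation in this PR.

```python
import numpy as np


class OUNoise:
    """Discretized Ornstein-Uhlenbeck process (illustrative sketch).

    Each sample drifts back toward `mu` at rate `theta` and is perturbed
    by Gaussian noise scaled by `sigma`, so consecutive samples are
    temporally correlated.
    """

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, seed=None):
        self.mu = mu * np.ones(size)
        self.theta = theta
        self.sigma = sigma
        self.dt = dt
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        # Restart the process at its long-run mean, e.g. at episode start
        self.state = self.mu.copy()

    def sample(self):
        drift = self.theta * (self.mu - self.state) * self.dt
        diffusion = (self.sigma * np.sqrt(self.dt)
                     * self.rng.standard_normal(self.state.shape))
        self.state = self.state + drift + diffusion
        return self.state


def noisy_action(action, noise, low, high):
    """Add exploration noise, then clip to the action bounds."""
    return np.clip(action + noise, low, high)
```

A white-noise policy would instead draw each sample independently, and a no-noise policy would return the actor's action unchanged.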

Improvements/Bug Fixes

use logger.warn instead of raising an error when component locks are violated
fix the matplotlib backend setting issue: fix matplotlib backend #114, add OS X matplotlib backend #115. A single trial now live-plots and renders
mute the DoubleDQN recompile of both models, as it breaks; revert to the single-model recompile from DQN until performance is fixed
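The component-lock change above (warn rather than raise) can be sketched as follows. The lock table layout, the `check_component_locks` helper, and the spec keys are hypothetical and only illustrate the behavior. The sketch calls `logger.warning`, the non-deprecated spelling of `logger.warn` in Python's standard `logging` module.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical lock table: the first key of each lock is its "head";
# when a spec uses a head component, the remaining keys restrict which
# other components are considered compatible with it.
COMPONENT_LOCKS = {
    'ddpg': {
        'Agent': ['DDPG'],
        'Policy': ['NoNoisePolicy', 'LinearNoisePolicy',
                   'GaussianWhiteNoisePolicy', 'OUNoisePolicy'],
    },
}


def check_component_locks(spec, locks=COMPONENT_LOCKS):
    """Log a warning for each violated lock and return the messages.

    Nothing is raised, so the experiment still runs with the
    questionable combination.
    """
    violations = []
    for lock_name, lock in locks.items():
        head_key = next(iter(lock))
        if spec.get(head_key) not in lock[head_key]:
            continue  # this lock does not apply to the spec
        for key, allowed in lock.items():
            if spec.get(key) not in allowed:
                msg = 'Component lock "{}" violated: {}={!r} not in {}'.format(
                    lock_name, key, spec.get(key), allowed)
                logger.warning(msg)
                violations.append(msg)
    return violations
```

Returning the messages as well as logging them keeps the check easy to test while leaving the run uninterrupted.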

…into policy-gradient

lgraesser and others added 30 commits April 8, 2017 17:17

Working discrete actor critic model

a665bf3

Uncommenting analyze data

c42d991

style fix, schedule ac experiment

6f76d42

stylefix

2f5c642

Adding AC specs

0c4bf3a

Merge remote-tracking branch 'origin/master' into policy-gradient

6041209

add component locks for ActorCritic and DDPG

1be0cb3

schedule ac on cartpole and pendulum

e8aab7c

reorder component locks

28d3cc8

add ac discrete component lock, fix and check all ac specs

1565a7a

add variance to pendulum gaussian search, narrow search space

e9ad662

add action bounds to env_spec

c3c4ffb

fix boundedpolicy to auto-bound from env-spec

7eaed26

Adding Acrobot specs

34b1b58

Merge branch 'policy-gradient' of https://github.com/kengz/openai_lab …

9b415c3

…into policy-gradient

schedule other ac experiments

3bd227f

Merge branch 'policy-gradient' of https://github.com/kengz/openai_lab …

d11abfd

…into policy-gradient

Fixing mem len param

e8e6877

Merge branch 'policy-gradient' of https://github.com/kengz/openai_lab …

3d553ee

…into policy-gradient

add ddpg fix attempt

75198e4

Merge remote-tracking branch 'origin/master' into policy-gradient

edd2276

ddpg with bounded actions

47a5f8c

Merge remote-tracking branch 'origin/master' into policy-gradient

ee011b1

permami broken with reshape to len manually

56e7f95

fixing permami shape one at a time; absolutely disgusting code

6215058

disgusting ddpg hack running

434de8b

comment out print

4fa1dbc

fucking got it, culprit was predicted_q_val shape

a691401

fix permami typo

5a827c3

runnable ddpg2 from permami, still not working yet

fd17088

kengz added 13 commits April 17, 2017 21:38

DDPG2 WORKING AT LAST

bc0e3f9

refactor ddpg and rename methods, variables properly

ea48d79

use tf losses; return critic_loss from run

366f229

restore critic_loss

1e79a98

source ddpg main class from dqn; propagate some param settings properly

c4d21b8

add compatible spec

0be8381

externalize select action to policy

7e4c28e

refactor noise policies for ddpg

402fb71

separate critic_lr for Critic

22c331f

rename base to NoNoisePolicy as proper

9ab35e9

remove DDPGBoundedPolicy, already built in to DDPG

fb6059d

remove useless ddpg examples

a1524d1

rename ddpg2 to ddpg

da54f53

kengz mentioned this pull request Apr 18, 2017

DDPG fixes #75

Closed

5 tasks

kengz changed the title ~~Working discrete actor critic model~~ ActorCritic and DDPG Apr 18, 2017

kengz added 6 commits April 18, 2017 08:57

stylefix

d30fe83

warn instead of break for component lock

e469f72

mute double dqn recompile both models till performance is fixed

3b766a9

fix graph rendering on single trial by mpl backend switching

c650b04

rename noise policies properly

be6c2a9

add ac, ddpg tests

0b50a69

kengz merged commit 29bd213 into master Apr 19, 2017
