DDPG fixes #75

kengz · 2017-03-20T03:36:39Z

make DDPG work for CartPole first, as a sanity check
try this: https://github.com/songrotek/DDPG/blob/master/actor_network.py
and try this: https://github.com/pemami4911/deep-rl/blob/master/ddpg/ddpg.py
batch normalization
make DDPG work for Pendulum
schedule DDPG for all the experiments
check random seeds

lgraesser · 2017-03-25T05:26:11Z

rl/policy/noise.py

@@ -44,7 +44,7 @@ def select_action(self, state):
            Q_state = agent.actor.predict(state)[0]
            assert Q_state.ndim == 1
            action = np.argmax(Q_state)
-            logger.info(str(Q_state)+' '+str(action))
+            # logger.info(str(Q_state)+' '+str(action))
        return action

    def update(self, sys_vars):


AnnealedGaussian calls self.sample(), but I don't think sample has been defined.
Also, how should we explore with discrete actions? Add an epsilon-greedy component to the policy?
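
To make the two points above concrete, here is a minimal sketch of what a `sample()` method and an epsilon-greedy fallback for discrete actions could look like. It is not the code in rl/policy/noise.py: the class name `AnnealedGaussianSketch`, its parameters (`sigma_min`, `n_steps_annealing`), the linear annealing schedule, and the helper `epsilon_greedy` are all assumptions made for illustration, and it uses only the standard library rather than numpy.

```python
import random


class AnnealedGaussianSketch:
    """Hypothetical stand-in for AnnealedGaussian; names and defaults are assumed."""

    def __init__(self, mu=0.0, sigma=0.3, sigma_min=0.05,
                 n_steps_annealing=1000, seed=None):
        self.mu = mu
        self.sigma = sigma
        self.sigma_min = sigma_min
        self.n_steps_annealing = n_steps_annealing
        self.n_steps = 0
        self.rng = random.Random(seed)

    @property
    def current_sigma(self):
        # Linearly decay sigma toward sigma_min, then hold it there.
        frac = min(self.n_steps / self.n_steps_annealing, 1.0)
        return self.sigma + frac * (self.sigma_min - self.sigma)

    def sample(self):
        # Draw one noise value at the current scale and advance the schedule.
        noise = self.rng.gauss(self.mu, self.current_sigma)
        self.n_steps += 1
        return noise


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a random action, else the argmax."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])
```

For continuous actions the sampled noise would be added to the actor output; for discrete actions the argmax in select_action could be wrapped with something like `epsilon_greedy(Q_state, epsilon, rng)`.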

lgraesser · 2017-03-28T08:06:39Z

I think it's really close to working, but something is off with the gradient update. The DDPG implementation from pemami works, and the ddpg_tf implementation produces exactly the same actions up to the first gradient update. I also did some testing around the weight initialization: the algorithm is not very sensitive to it and works with different initializations.
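
One way to narrow down a suspect gradient update is to compare the analytic actor gradient against a finite-difference estimate on a toy problem. The sketch below is only an illustration of that check, not the ddpg_tf code: the scalar linear actor, the quadratic critic, and every function name are assumptions. It uses the DDPG chain rule (dQ/da times da/dw, averaged over the batch) and applies it as gradient ascent on Q.

```python
def actor(w, s):
    # Toy deterministic policy: a = w * s (assumed for illustration).
    return w * s


def critic(s, a):
    # Toy critic whose best action is a = 2 * s (assumed for illustration).
    return -(a - 2.0 * s) ** 2


def dq_da(s, a):
    # Analytic derivative of the toy critic with respect to the action.
    return -2.0 * (a - 2.0 * s)


def actor_gradient(w, states):
    # Chain rule: mean over the batch of dQ/da * da/dw, where da/dw = s.
    return sum(dq_da(s, actor(w, s)) * s for s in states) / len(states)


def numeric_gradient(w, states, eps=1e-6):
    # Central finite difference of the mean Q under the current policy.
    def mean_q(w_):
        return sum(critic(s, actor(w_, s)) for s in states) / len(states)
    return (mean_q(w + eps) - mean_q(w - eps)) / (2 * eps)


def fit_actor(states, w=0.0, lr=0.05, n_steps=200):
    # Gradient ASCENT on Q: a flipped sign here would drive w away from 2.
    for _ in range(n_steps):
        w += lr * actor_gradient(w, states)
    return w
```

If the analytic and numeric gradients agree and `fit_actor` moves the weight toward the critic's optimum, the chain rule and the update sign are at least consistent in the toy case; the same comparison could then be repeated against the real networks.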

kengz · 2017-04-18T12:42:43Z

Working DDPG implemented in #118.

kengz added 4 commits March 19, 2017 23:35

remove unused actor compile

bb900bd

Merge remote-tracking branch 'origin/master' into ddpg

48f4d6b

Merge remote-tracking branch 'origin/master' into ddpg

9f57568

add bias, change param a bit, get some peaking behavior before breakdown

2a55eb1

lgraesser reviewed Mar 25, 2017

lgraesser added 7 commits March 25, 2017 02:11

DDPG updates plus debugging

6979256

DDPG cleanup

9cd2f27

Custom activation

c98ea3e

DDPG updates for pendulum

94546aa

Refactoring weight init

b5e819d

Changing dev pendulum memory

70ab72c

DDPG updates and debugging

3afcab2

kengz closed this Apr 18, 2017

kengz deleted the ddpg branch April 18, 2017 12:42
