Catastrophic collapse in episode score on cartpole_a3c #58

erlendaxpo · 2017-07-05T07:33:19Z

Hi,
First of all I just want to say awesome work on the library overall, really love the concept 👍

I have an issue where cartpole_a3c converges relatively quickly (around episodes 300-400) and keeps doing well for a while, but then the score suddenly collapses and never recovers. Has anyone else experienced this?

dnddnjs · 2017-07-05T08:08:58Z

Thank you for your compliment.

Currently, the agent receives a reward of -100 when an episode ends before 500 timesteps.

I think this penalty is too large, so it may be hurting the network's stability. I am curious what happens if you change the reward from -100 to -10 or -1.
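To get a feel for how much the terminal penalty dominates the learning signal, here is a minimal sketch in plain Python. The helper names `shaped_reward` and `discounted_returns`, the discount factor of 0.99, and the 10-step episode are assumptions made for illustration, not the code in cartpole_a3c.

```python
def shaped_reward(reward, done, steps, max_steps=500, penalty=-100.0):
    """Return `penalty` if the episode ended before `max_steps`, else `reward`."""
    if done and steps < max_steps:
        return penalty
    return reward


def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the next step.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    # Hypothetical short episode: +1 per step, the pole falls on step 10.
    for penalty in (-100.0, -10.0, -1.0):
        rewards = [
            shaped_reward(1.0, done=(t == 9), steps=t + 1, penalty=penalty)
            for t in range(10)
        ]
        returns = discounted_returns(rewards)
        print(penalty, [round(g, 2) for g in returns])
```

With a penalty of -100, every return in this short episode is strongly negative, while with -1 the early steps keep a positive return. That difference in scale is what the suggestion above would test.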

mynameisvinn · 2018-01-19T16:29:57Z

there could be many reasons behind catastrophic collapse: the learning rate, the discount factor gamma (which controls how strongly future rewards are discounted), etc.

one common solution is gradient clipping. by clipping gradient vectors, you limit the impact of high-variance situations (e.g. a -100 reward after a series of +1 rewards).
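As a rough illustration of clipping by global norm, here is a framework-free sketch. The function name `clip_by_global_norm` and the threshold values are assumptions for this example. In practice you would more likely use what your framework provides, such as the `clipnorm` argument on Keras optimizers or `tf.clip_by_global_norm` in TensorFlow.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale a list of gradient vectors so their joint L2 norm is at most max_norm.

    `grads` is a list of lists of floats, one inner list per parameter tensor.
    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    global_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    if global_norm <= max_norm:
        # Small gradients pass through unchanged.
        return [list(vec) for vec in grads], global_norm
    scale = max_norm / global_norm
    # All vectors share one scale factor, so the update direction is preserved.
    return [[g * scale for g in vec] for vec in grads], global_norm


if __name__ == "__main__":
    # Hypothetical gradients: a typical step and a spike after a large penalty.
    typical = [[0.3, -0.4], [0.0]]
    spike = [[30.0, -40.0], [0.0]]
    for grads in (typical, spike):
        clipped, norm = clip_by_global_norm(grads, max_norm=1.0)
        print(norm, clipped)
```

Clipping bounds the size of any single update but does not change the reward scale itself, so it can be combined with a smaller terminal penalty.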
