Catastrophic collapse in episode score on cartpole_a3c #58

Open
erlendaxpo opened this issue Jul 5, 2017 · 2 comments

Comments

@erlendaxpo

Hi,
First of all I just want to say awesome work on the library overall, really love the concept 👍

I have an issue where cartpole_a3c converges relatively quickly (around episode 300-400), keeps doing well for a while, and then suddenly collapses and never recovers. Has anyone else experienced this?

@dnddnjs
Contributor

dnddnjs commented Jul 5, 2017

Thank you for your compliment.

Currently, the agent is given a reward of -100 when the episode ends before timestep 500.

I think this penalty is too large, so it may be hurting the network's stability. I am curious what the result is when the reward is changed from -100 to -10 or -1.
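To make that experiment easy to run, here is a minimal sketch of the reward rule described above with the penalty pulled out as a parameter. The function name `shaped_reward` and its arguments are hypothetical and not taken from the repository; the only things assumed from this thread are the 500-timestep limit and the -100 penalty.

```python
def shaped_reward(env_reward, done, steps_survived,
                  max_steps=500, fail_penalty=-100.0):
    """Return the reward used for training on a single step.

    Hypothetical helper: if the episode ended before `max_steps`,
    the environment reward is replaced by `fail_penalty`;
    otherwise the environment reward is passed through unchanged.
    """
    failed_early = done and steps_survived < max_steps
    return fail_penalty if failed_early else env_reward


# Compare the three penalties suggested above for a pole that fell at step 120.
for penalty in (-100.0, -10.0, -1.0):
    print(penalty, shaped_reward(1.0, True, 120, fail_penalty=penalty))
```

Running the same training script once per penalty value, with everything else fixed, should show whether the size of the terminal penalty is what triggers the collapse.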

@mynameisvinn

there could be many reasons behind catastrophic collapse: the learning rate; gamma, which is the discount factor applied to future rewards; etc.
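to see how gamma interacts with a large terminal penalty, here is a small sketch that computes discounted returns for a short episode. `discounted_returns` is an illustrative helper, not code from the repository, and the reward sequence is made up for the example.

```python
def discounted_returns(rewards, gamma):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    running = 0.0
    # Walk the episode backwards so each return reuses the next one.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


# Two ordinary +1 steps followed by a -100 terminal penalty.
episode = [1.0, 1.0, -100.0]
print(discounted_returns(episode, gamma=1.0))  # [-98.0, -99.0, -100.0]
print(discounted_returns(episode, gamma=0.5))  # [-23.5, -49.0, -100.0]
```

the closer gamma is to 1, the further back the terminal penalty propagates, so earlier steps also end up with large negative targets.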

one common solution is gradient clipping. by capping the size of the gradient vectors, you limit the impact of high-variance situations (e.g. a -100 reward after a series of +1 rewards).
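as a rough illustration of what clipping does, here is a NumPy sketch of clipping by global norm. `clip_by_global_norm` is a name chosen for this example and is not the repository's code; it assumes the gradients are available as a list of arrays.

```python
import numpy as np


def clip_by_global_norm(grads, max_norm):
    """Rescale a list of gradient arrays so their joint L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    global_norm = float(np.sqrt(sum(np.sum(g ** 2) for g in grads)))
    if global_norm <= max_norm:
        # Small gradients pass through unchanged.
        return [g.copy() for g in grads], global_norm
    scale = max_norm / global_norm
    # Every array is scaled by the same factor, so the direction is preserved.
    return [g * scale for g in grads], global_norm


grads = [np.array([3.0, 4.0])]           # L2 norm is 5
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
print(norm_before, clipped[0])           # norm 5.0, clipped to roughly [0.6, 0.8]
```

if the example is built on Keras (an assumption, the thread does not say), the optimizers accept `clipnorm` and `clipvalue` arguments, so clipping can be enabled without writing it by hand.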

3 participants