Continuous control #3
Comments
I'm working on an LSTM implementation (neon based) for the continuous case; sadly, I failed to get any response from the authors. It is the variance and the entropy that puzzle me. Any thoughts on how they are implemented code-wise?
Thanks for the information. I haven't tried it yet, but the paper provides some information as below. Did you find it insufficient?
It is a bit vague to me, so I will try to summarize in order to be corrected: we need a fully connected layer outputting 2 values; apply a softplus to the second value (so that the variance is > 0, I suppose); sample from this Gaussian in each dimension of the action space (use numpy.random.randn() * sigma + mu?); and finally send −1/2 (log(2πσ²) + 1) as the logprob instead of log(softmax)?
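To make the summary above concrete, here is a minimal sketch of a single action dimension. It is not the authors' code. The function names, the `eps` floor on the variance, and the use of the standard-library `random` module instead of numpy are all assumptions made for illustration. One point worth checking against the paper: the expression −1/2 (log(2πσ²) + 1) is the negative differential entropy of a Gaussian, so it would normally be an entropy regularization term, while the log-probability of the sampled action is a separate quantity.

```python
import math
import random


def softplus(x):
    """Numerically stable log(1 + exp(x)); the result is never negative."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def gaussian_policy(mu, raw_var, rng=random, eps=1e-6):
    """Sample one action dimension; return (action, log_prob, entropy).

    mu      -- mean produced by a (hypothetical) policy head
    raw_var -- unconstrained second output of that head
    eps     -- small floor on the variance (an assumption, avoids log(0))
    """
    var = softplus(raw_var) + eps          # sigma^2 > 0
    sigma = math.sqrt(var)
    action = rng.gauss(mu, sigma)          # equivalent to randn() * sigma + mu
    # log N(action | mu, sigma^2): multiplies the advantage in the policy loss
    log_prob = (-0.5 * math.log(2.0 * math.pi * var)
                - (action - mu) ** 2 / (2.0 * var))
    # Differential entropy of N(mu, sigma^2): used as the exploration bonus
    entropy = 0.5 * (math.log(2.0 * math.pi * var) + 1.0)
    return action, log_prob, entropy
```

If the action dimensions are treated as independent Gaussians, the per-dimension `log_prob` and `entropy` values would simply be summed over dimensions.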
Hi @muupan, do you have any plans to implement continuous control? :)
Here is an example:
I haven't tested it yet, so feel free to test/correct.
Thanks, @etienne87!