
It seems your code still has an issue (I don't quite understand your solution) #1

Open
machanic opened this issue Apr 27, 2017 · 1 comment


@machanic

This is the loss function:

J = tf.concat(values=[tf.log(p_y + SMALL_NUM) * onehot_labels_placeholder, tf.log(p_loc + SMALL_NUM) * (R - no_grad_b)], axis=1)  # axis=1 actually concatenates columns

and p_loc is built by:

p_loc = gaussian_pdf(mean_locs, sampled_locs)  # mean_locs is NOT stop_gradient'ed, but sampled_locs is

Your code seems to stop_gradient only sampled_locs, and not mean_locs? (A sketch of how I read that pattern follows the list below.) Also, I cannot see from your code where the two parts are separated:

1. Location network, baseline network: learn from the reinforcement-learning gradients only.

2. Glimpse network, core network: learn from the supervised-learning gradients only.
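To make my question concrete, here is a minimal TF 1.x sketch of the sampling/stop_gradient pattern as I understand it. The tensor names follow your snippet above, but the placeholder definitions and loc_sd are my own stand-ins, not taken from your code:

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-ins for the tensors discussed above.
mean_locs = tf.placeholder(tf.float32, [None, 2])  # location-net output (differentiable)
R = tf.placeholder(tf.float32, [None, 1])          # reward (e.g. 1 if classified correctly)
no_grad_b = tf.placeholder(tf.float32, [None, 1])  # baseline, already cut off from backprop

loc_sd = 0.22  # assumed fixed std-dev of the location policy

# Sample an action from the Gaussian policy, then freeze the sample:
# the drawn location is a constant w.r.t. backprop.
sampled_locs = mean_locs + tf.random_normal(tf.shape(mean_locs), stddev=loc_sd)
sampled_locs = tf.stop_gradient(sampled_locs)

# Log-density of the frozen sample under the still-differentiable mean.
# Its gradient w.r.t. mean_locs is the REINFORCE score function, which is
# exactly why mean_locs must NOT be wrapped in stop_gradient.
log_p_loc = (-tf.square(sampled_locs - mean_locs) / (2.0 * loc_sd ** 2)
             - np.log(loc_sd * np.sqrt(2.0 * np.pi)))

# Policy-gradient term: log p(l | mean) weighted by the centered reward.
reinforce_term = tf.reduce_sum(log_p_loc, axis=1, keep_dims=True) * (R - no_grad_b)
```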

@jtkim-kaist
Owner

Sorry for the late answer; I've been very busy these days.

The separation means 'not sharing gradients'. In this code, you can see that the gradients of the REINFORCE part flow only through the location and baseline networks.
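One quick way to see this from the graph itself (a sketch with assumed scope names and hypothetical loss tensors, not the literal code in this repo): tf.gradients returns None for any variable that has no path to the given loss.

```python
# Assumed scope names "loc_net" / "core_net" and hypothetical loss tensors.
loc_vars = [v for v in tf.trainable_variables() if v.name.startswith('loc_net')]
core_vars = [v for v in tf.trainable_variables() if v.name.startswith('core_net')]

# If the graph is separated correctly, both calls print lists of None:
print(tf.gradients(cross_entropy_loss, loc_vars))  # supervised loss never reaches the location net
print(tf.gradients(reinforce_loss, core_vars))     # REINFORCE loss never reaches the core net
```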

  1. "Your code seems to stop_gradient only sampled_locs, not mean_locs?": No, that is not a bug. It is exactly what the original RAM implementation does, and I think it is correct.

  2. "The separation into two parts is not visible in your code?": Please read the code carefully. You will find that the gradient flow is separated between the (location, baseline) networks and the (glimpse, core) networks; the sketch below makes the split explicit.
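For completeness: if you wanted that split to be explicit rather than implied by stop_gradient, one common pattern (an illustration with hypothetical losses and variable lists, not necessarily what this repo does) is to give each loss its own var_list, so each optimizer can only ever touch its own subnetworks:

```python
# Hypothetical losses and variable lists; lr is an assumed learning rate.
lr = 1e-3
supervised_op = tf.train.AdamOptimizer(lr).minimize(
    cross_entropy_loss, var_list=glimpse_vars + core_vars)
reinforce_op = tf.train.AdamOptimizer(lr).minimize(
    reinforce_loss + baseline_mse, var_list=loc_vars + baseline_vars)
train_op = tf.group(supervised_op, reinforce_op)
```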

I'll give you a more detailed answer as soon as possible...
