A question in function simple_context #9

Open
wai7niu8 opened this issue Jul 25, 2016 · 1 comment
Comments

@wai7niu8

Hi,
In train.ipynb, I'm confused about this line:
activation_energies = activation_energies + -1e20*K.expand_dims(1.-K.cast(mask[:, :maxlend],'float32'),1)
I think this line is unnecessary (I may be wrong); could you explain it in detail?
Also, when computing the attention weights, I think we should only use the current word's ht (the hidden state at decoding time step t), but in the function simple_context it uses all of the headline words' ht at every time step?
Finally, could you point me to the paper or other references on how to implement the attention layer? I'm not particularly familiar with it. Thank you.

@udibr
Owner

udibr commented Jul 25, 2016

The first line in the README file gives a link to the paper on which the code is based.
Please read it several times from start to finish until you feel you understand it.
Also read the references it gives.

The line you asked about reduces the energy by a huge value at every position where the mask is zero in the part of the input (0:maxlend) that came from the article.
Later I take a softmax of the energy, so positions where the mask was zero end up with almost zero weight.
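
Below is a minimal numpy sketch (not the repo's Keras code) of the effect described above: adding a huge negative number to the energies wherever the mask is 0 means the following softmax assigns those positions essentially zero weight, so padded article tokens cannot receive attention.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# energies over 5 article positions; the last two are padding (mask == 0)
activation_energies = np.array([1.0, 2.0, 0.5, 3.0, 1.5])
mask = np.array([1.0, 1.0, 1.0, 0.0, 0.0])

# same trick as the line from train.ipynb: push masked positions to -1e20
masked = activation_energies + -1e20 * (1.0 - mask)
weights = softmax(masked)
print(weights)  # the last two weights are ~0; only real tokens share the attention
```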

simple_context works on all the time steps at once, rather than on one decoder step at a time.
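
A hedged sketch, again in plain numpy, of what "all the time steps at once" means: instead of looping over decoder positions, the energies are computed as one matrix of shape (decode steps, article length) with a single matrix product, and the softmax and weighted sum are applied row-wise. The variable names (head, desc) and the plain dot-product scoring are illustrative assumptions, not the repo's exact implementation.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

maxlend, maxlenh, n_hidden = 4, 3, 8
rng = np.random.default_rng(0)
desc = rng.standard_normal((maxlend, n_hidden))  # hidden states over the article (input) part
head = rng.standard_normal((maxlenh, n_hidden))  # hidden states over the headline (output) part
mask = np.array([1.0, 1.0, 1.0, 0.0])            # last article position is padding

# one matrix of energies: row t scores decoder step t against every article word
activation_energies = head @ desc.T               # shape (maxlenh, maxlend)
activation_energies += -1e20 * (1.0 - mask)       # mask broadcast over every decoder step
weights = softmax(activation_energies)            # row-wise attention weights
context = weights @ desc                          # one context vector per decoder step
print(context.shape)                              # (3, 8)
```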
