SeqLSTM #207
Conversation
@nicholas-leonard Looks great! I'm not sure about the best way to handle (N x T x D) vs (T x N x D) layouts (where N = minibatch, T = sequence length, D = input size). TND fits better with the rest of rnn, and is more memory friendly, so it will probably be a bit faster; however, NTD seems like a more natural fit with the rest of nn, which is why I chose to use that layout. Thoughts?
@jcjohnson Yeah, my thoughts exactly. I want to make SeqLSTM default to TND to make the transition more seamless for rnn users. The reason I chose this for rnn was, like you said, the memory-friendliness. But then, like you said, there is the rest of the torch users and the torch-rnn package... I was thinking that we could add something like So it is up to you really :) What do you choose?
Really awesome to see this in rnn, nice job guys. @nicholas-leonard I basically created a version of BiSequencer for the torch-rnn package which may be of use; I opened a PR here. I would be more than happy to create a PR after you implement this!
@SeanNaren Awesome! Would really appreciate that!
@nicholas-leonard TND default sounds good to me; that way the default behavior is also the fastest. I like the idea of a
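To make the layout trade-off in the discussion above concrete, here is a small sketch (in NumPy rather than Torch, purely as an illustration; the shapes T=20, N=32, D=10 are arbitrary) of why a time-first TND layout is more memory friendly for step-by-step recurrence than batch-first NTD:

```python
import numpy as np

# Arbitrary example sizes: T = sequence length, N = minibatch, D = input size
T, N, D = 20, 32, 10

x_ntd = np.zeros((N, T, D))                              # NTD: batch-first
x_tnd = np.ascontiguousarray(x_ntd.transpose(1, 0, 2))  # TND: time-first

# In TND, one timestep across the whole batch is a contiguous (N, D) slab,
# so the per-step matrix multiply can read it without striding:
assert x_tnd[0].flags['C_CONTIGUOUS']

# The same timestep slice in NTD strides across the batch dimension,
# so it is not contiguous and would need a copy (or strided access):
assert not x_ntd[:, 0, :].flags['C_CONTIGUOUS']
```

This is why a batch-first option would typically be implemented as a transpose at the module boundary, keeping the fast TND layout internally.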
Fixed remember comparisons (changed = to == )
Update SeqLSTM.lua (changed = to == in comparison)
A couple of questions after trying out the code. Second, when the batch size changes (i.e. during test time):

```lua
function SeqLSTM:forget()
   parent:forget()
   self:resetStates()
end
```
@northanapon I fixed your first bug. Wasn't able to reproduce your second bug (see unit test). Maybe it got fixed by the fix to the first one. Can you test your second use case again with the newest commit?
I tried the new SeqLSTM (from master). The batch size problem still exists when:

```lua
lstm = nn.SeqLSTM(10, 10)
lstm.batchfirst = true
lstm:remember('both')
lstm:training()
lstm:forward(torch.Tensor(32, 20, 10))
lstm:evaluate()
lstm:forget()
lstm:forward(torch.Tensor(1, 1, 10))
```

I have to set
@northanapon I see what you mean now. Fixed in 7116a3d. Thanks for pointing this out!
Super fast LSTM code from Justin's torch-rnn