nn.BiSequencer, cudnn and non-contiguous input #178
Comments
I found a similar problem with `nn.FastLSTM`:

```lua
lstm = nn.FastLSTM(2, 10)
lstm:cuda()
cudnn.convert(lstm, cudnn)

-- Non-batch mode works fine
lstm:forget()
lstm:forward(torch.rand(1, 2):cuda())

-- Batch mode does not work
lstm:forget()
lstm:forward(torch.rand(4, 2):cuda())
```

Here is the error message:
So it is the `cudnn.Sigmoid` (and possibly the other 3 cudnn activations) failing when computing the LSTM gates. Is there a plan to adopt cuDNN R5's LSTM and GRU API? I heard they are much faster, from NVIDIA's recent talk at my university.
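For context, cudnn's pointwise modules raise this error whenever their input is a non-contiguous view, e.g. a `narrow` over the batched gate buffer inside the LSTM. A minimal sketch of the kind of workaround that unblocks this (assuming the standard `nn`/`cudnn` modules; `nn.Contiguous` copies a non-contiguous input into contiguous storage and is a no-op otherwise):

```lua
require 'nn'
require 'cunn'
require 'cudnn'

-- Wrap the cudnn pointwise activation so its input is made
-- contiguous first; the copy is only paid when actually needed.
local act = nn.Sequential()
   :add(nn.Contiguous())
   :add(cudnn.Sigmoid())

-- A narrowed (hence non-contiguous) slice of a batch now passes
-- through safely instead of triggering the Pointwise.lua check:
local batch = torch.rand(4, 20):cuda()
local gates = batch:narrow(2, 1, 10)  -- non-contiguous view
local out = act:forward(gates)
```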
@northanapon Yes, there is a plan: borisfom/cudnn.torch#3. In the meantime, you can use Justin's super fast SeqLSTM: #207. Not sure if it will solve your bug though.
Thanks @nicholas-leonard, I tried SeqLSTM. It is faster than FastLSTM on GPU.
@northanapon cudnn is not expected to give a speed-up over basic SeqLSTM: most of the work is in the Linear layers, which are mapped to cublas, not cudnn. The only thing that gets mapped to cudnn is the activations, which are a small fraction of the computation, and the nn implementation of those is reasonable. The same would be true for FastLSTM: even if you could convert it to cudnn, you wouldn't see a speedup. But stay tuned for torch bindings for the cudnn LSTM implementation.
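For anyone landing here, a minimal `SeqLSTM` usage sketch (assuming the `rnn` package; unlike `Sequencer(FastLSTM)`, `SeqLSTM` consumes a whole `seqlen x batchsize x inputsize` tensor in a single `forward`, which is where most of its speed comes from):

```lua
require 'rnn'
require 'cunn'

local inputsize, hiddensize = 2, 10
local seqlen, batchsize = 5, 4

local lstm = nn.SeqLSTM(inputsize, hiddensize)
lstm:cuda()

-- input is seqlen x batchsize x inputsize
local input = torch.rand(seqlen, batchsize, inputsize):cuda()
-- output is seqlen x batchsize x hiddensize
local output = lstm:forward(input)
```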
I'm having a problem using `nn.BiSequencer()` with `cudnn`. Here's a simple example:
Here's the error I get:
```
cudnn/Pointwise.lua:11: Non-contiguous inputs not supported yet
```
I read through some closed issues and tried these variations for the network. None of them work, unfortunately.
Any help would be appreciated.
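In case it helps others hitting this, one workaround that sidesteps the contiguity check is to leave the pointwise activations as plain `nn` modules during conversion. This is a sketch, assuming the exclusion-callback form of `cudnn.convert` available in recent versions of cudnn.torch (the callback returns true for modules that should be skipped):

```lua
require 'rnn'
require 'cunn'
require 'cudnn'

local fwd = nn.FastLSTM(2, 10)
local bwd = nn.FastLSTM(2, 10)
local brnn = nn.BiSequencer(fwd, bwd)
brnn:cuda()

-- Convert to cudnn, but skip the pointwise activations that choke
-- on the non-contiguous gate views inside the LSTM:
cudnn.convert(brnn, cudnn, function(m)
   local t = torch.type(m)
   return t:find('Sigmoid') or t:find('Tanh')
end)

-- A BiSequencer takes a table of batchsize x inputsize tensors:
local seq = {}
for t = 1, 5 do seq[t] = torch.rand(4, 2):cuda() end
local out = brnn:forward(seq)
```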