
Compatibility with CuDNN R5 #239

Open
JoostvDoorn opened this issue May 4, 2016 · 10 comments

@JoostvDoorn
Contributor

@nicholas-leonard Do you have any thoughts on compatibility with CuDNN R5, and a possible future roadmap for this package towards maturity? The modules that have been implemented in soumith/cudnn.torch could also be helpful here for performance reasons.

@nicholas-leonard
Member

@JoostvDoorn So torch supports this via the R5 branch of the cudnn repository: https://github.com/soumith/cudnn.torch/tree/R5. The rnn package can integrate with these modules, but they will not be converted automatically. Like SeqLSTM, they expect tensors of size seqlen x batchsize x inputsize as input. I have already made changes that ease integration by making Sequencer support Tensors as input as well: 8ac2707.
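
For illustration, here is a minimal sketch of that shared input layout (the sizes are made up, and the constructor arguments assume SeqLSTM(inputsize, outputsize)):

require 'rnn'

local seqlen, batchsize, inputsize, outputsize = 5, 8, 10, 20

-- SeqLSTM consumes a single seqlen x batchsize x inputsize tensor,
-- the same layout the cudnn R5 RNN modules expect
local seqlstm = nn.SeqLSTM(inputsize, outputsize)
local input = torch.randn(seqlen, batchsize, inputsize)
local output = seqlstm:forward(input)  -- seqlen x batchsize x outputsize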

What kind of compatibility did you have in mind?

@JoostvDoorn
Contributor Author

I was thinking about automatic conversion. In general I think there will be demand for a module that is compatible with the cudnn RNN/LSTM/GRU modules and that can then be converted using cudnn.convert. I'm not sure this package is the right place to discuss this, but since it is probably the most actively maintained RNN package for torch, it seems logical that it would eventually support this. As you state, SeqLSTM having the same input structure should make this theoretically possible.

@nicholas-leonard
Member

@JoostvDoorn Yes, we should definitely at least make SeqLSTM support conversion via cudnn.convert. Let's keep this ticket open until it gets done.

@JoostvDoorn
Contributor Author

JoostvDoorn commented Sep 13, 2016

@nicholas-leonard I have some basic conversion code ready for SeqLSTM, here:
https://gist.github.com/JoostvDoorn/bd43ce85a101c0aeee03f734ead688fb

I am mainly wondering how we could include this within cudnn.convert (See also soumith/cudnn.torch#256)
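
For reference, this is roughly how cudnn.convert is invoked today; it only swaps modules it already knows about (ReLU, Tanh, etc.), so a SeqLSTM currently passes through unchanged, which is what the gist above addresses:

require 'cudnn'
require 'rnn'

local net = nn.Sequential()
   :add(nn.SeqLSTM(10, 20))
   :add(nn.ReLU(true))
   :cuda()

-- converts recognized modules to their cudnn counterparts;
-- SeqLSTM is currently left as-is
cudnn.convert(net, cudnn)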

@nicholas-leonard
Member

You can always send a PR to rnn so that we can do something like this:

require 'cudnn'
require 'rnn'

-- define model
lstm = nn.Sequential()
   :add(...)
   :add(nn.SeqLSTM())
   :add(...)

-- convert to cudnn where possible
culstm = lstm:cuconvert()

I'm not sure if this use case is simple enough to use and to implement. What do you think?

@hashbangCoder

hashbangCoder commented Sep 16, 2016

Sorry to butt in, but is it currently possible to cudnn.convert any module? Say a Sequencer(FastLSTM)?

I have a model which wraps a Sequencer around LookupTableMaskZero, FastLSTM, Linear (and a few others like Dropout), but cudnn.convert only converts the ReLU, and it doesn't throw any errors.

It looks like this:

local fastlstm = nn.FastLSTM(embedLen, 1024)
fastlstm:maskZero(1)

local rnn_cnn = nn.ParallelTable()
local lookup = nn.LookupTableMaskZero(vocabSize, embedLen)
local rnn = nn.Sequential()
   :add(lookup)
   :add(nn.Dropout(0.5))
   :add(fastlstm)
   :add(nn.Linear(1024, 2048))
   :add(nn.ReLU(true))

rnn_cnn:add(rnn)
-- Use fc6 layer
rnn_cnn:add(nn.ReLU(true)):add(nn.Linear(4096, 2048))
local shared_lin = nn.Linear(embedLen, vocabSize)
local model = nn.Sequential()
   :add(rnn_cnn)
   :add(nn.CAddTable())
   :add(nn.Dropout(0.25))
   :add(shared_lin)

collectgarbage()
collectgarbage()
return nn.Sequencer(model)

And I call cudnn.convert(model, cudnn).

UPDATE: I checked #178; they have problems with batched inputs for cudnn.convert(FastLSTM), and you recommended using SeqLSTM. In my case I cannot use it since I need a Sequencer wrapped around an LSTM and other modules, not just an LSTM. So what would be the solution in this case?

Let me know if I should move to a new issue thread for this.

@JoostvDoorn
Contributor Author

JoostvDoorn commented Sep 16, 2016

Note that there is no cudnn implementation of nn.Linear. Other than that you are right, and nn.Sequencer(nn.Tanh) is not converted to nn.Sequencer(cudnn.Tanh). cudnn.convert uses nn.Module.replace, which recursively converts the children based on self.modules; I'm not sure where this goes wrong (see the sketch below). @hashbangCoder Can you open a new issue to discuss using SeqLSTM? You don't need to use a Sequencer for the code above.
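
For context, a rough sketch of that replace-based traversal, assuming nn.Module.replace walks self.modules recursively and applies a callback to each child (which is what cudnn.convert builds on); the open question is whether decorators like nn.Sequencer expose their wrapped module to that traversal:

-- swap recognized modules via a callback applied to every child module
net:replace(function(module)
   if torch.typename(module) == 'nn.ReLU' then
      return cudnn.ReLU(true)
   end
   return module
end)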

@nicholas-leonard Yes, I guess that would be the easiest approach. I can work on this, but it's not a priority right now.

@hashbangCoder

hashbangCoder commented Sep 16, 2016

@JoostvDoorn Yes, I realized that it's implemented in cuBLAS soon after I posted the comment, and updated my comment :D

Can you open a new issue to discuss using SeqLSTM?

I'm actually interested in the FastLSTM. But if you want me to move this to another thread, sure.

You don't need to use a Sequencer for the code above.

Could you please elaborate a bit? My understanding is that SeqLSTM is the equivalent of (but faster than) Sequencer(FastLSTM). However, I want to use something more like Sequencer(Sequential():add(Lookup):add(FastLSTM):add(Linear):add(Dropout)), for which I cannot use SeqLSTM. I want the parameters to be shared across all layers for all time steps, hence the Sequencer.

The README says:

Note that a SeqLSTM cannot replace FastLSTM in code that decorates it with a AbstractSequencer or Recursor as this would be equivalent to Sequencer(Sequencer(FastLSTM)). You have been warned.

@hashbangCoder

Never mind. I interpreted the paper I was reading completely wrong. Got a test case working now. Thanks anyway!

@suryabhupa
Contributor

To follow up on this -- if I'm using rnn's LSTM implementation, will cudnn.convert() work to convert it?

Are there any workarounds other than using cudnn.LSTM?
