I was trying to use a DataParallelTable with an RNN from https://github.com/Element-Research/rnn to train on 4 GPUs. This didn't work out of the box, because rnn adds nn.Module:forget, which recursively calls forget on all submodules.
Without DataParallelTable, to make a memory module forget its memory parameters, you would call model:forget(). With DataParallelTable, this calls :forget only on self.modules[1], so the call does not get transmitted to the GPUs. I worked around this by calling model.impl:exec(function(m) m:forget() end).
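For anyone hitting the same problem, the workaround can be wrapped in a small helper. This is only a sketch: `forgetReplicas` is a made-up name, not part of cunn or rnn, and it assumes a cunn version whose DataParallelTable exposes the internal `impl` field with an `exec` method, as in the call above.

```lua
-- Sketch of the workaround as a reusable helper.
-- `forgetReplicas` is a hypothetical name, not part of cunn or rnn.
local function forgetReplicas(dpt)
   -- `impl` is an internal field of DataParallelTable (an assumption
   -- about the installed cunn version). exec runs the closure on each
   -- replica, so every replica receives the forget call.
   dpt.impl:exec(function(m)
      m:forget()
   end)
end
```

You would call `forgetReplicas(model)` wherever you would otherwise call `model:forget()`. Because `impl` is not a documented API, this may break if DataParallelTable's internals change.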
This might be useful to someone else. Is this something you want to either include in the documentation or make explicit?
Since :forget is not a standard nn.Module API, this is exactly the behavior we would expect. @nicholas-leonard can you figure this out and see if things can be made smoother?
@mxh Could you make a Pull Request to rnn to add this functionality to the rnn library? A unit test would also be nice as I have little experience with DataParallelTable. I can help you with the details.