RNN Example not supported for MXNet Scala #11571
Comments
Note that we're no longer using ptb. Instead, we're using the Sherlock Holmes dataset.
@szha, do you know of any operator changes or removals related to RNN over the past year? The code we wrote a year ago no longer works. I don't believe there have been any changes on the Scala side.
Not that I'm aware of. Other than the fused RNN kernels on CPU, there hasn't been much change since September 2016. https://github.com/apache/incubator-mxnet/commits/master/src/operator/rnn-inl.h
I can reproduce the same issue on
[Update] The change in the Scala package was to garbage collection; see more here.
Can we confirm whether it is caused by Scala frontend changes? For example, use the Scala frontend at v1.0.0 with the backend at v1.1.0 (or the reverse) and see if it also crashes.
@yzhliu tested on Files add dispose method:
The root cause of this issue is this line. The
@lanking520 The code you pointed to is correct: it disposes the NDArrays in the DataBatch that were created with the Slice operator. The Slice operator does not create a new NDArray; it returns a reference to the original NDArray with an offset. The pointers to the NDArrays (both sliced and original) are shared_ptr, so disposing one only decrements the reference count and does not free the original NDArray. Correspondingly, if you don't dispose the sliced NDArray, the underlying memory is not freed even if the original NDArray is freed. I suspect the corruption is elsewhere.
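For anyone following along, here is a minimal C++ sketch of the ownership behaviour described above. `ToyArray`, `make`, `slice`, `dispose`, and `at` are hypothetical stand-ins invented for illustration, not MXNet's actual classes or API; the sketch only assumes that a slice shares a reference-counted buffer with the array it was taken from.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy stand-in for a backend array: a shared buffer plus an offset and length.
// This is NOT MXNet's NDArray, only a model of the shared_ptr ownership
// described in the comment above.
struct ToyArray {
    std::shared_ptr<std::vector<float>> storage;  // reference-counted buffer
    std::size_t offset;
    std::size_t length;

    // Build an array that owns a fresh buffer of n zeros.
    static ToyArray make(std::size_t n) {
        return ToyArray{std::make_shared<std::vector<float>>(n, 0.0f), 0, n};
    }

    // Return a view into the same buffer; no data is copied.
    ToyArray slice(std::size_t begin, std::size_t end) const {
        return ToyArray{storage, offset + begin, end - begin};
    }

    // Drop this handle's reference. The buffer itself is freed only when
    // the last handle pointing at it is gone.
    void dispose() { storage.reset(); }

    // Element access relative to this handle's offset.
    float& at(std::size_t i) {
        assert(storage && i < length);
        return (*storage)[offset + i];
    }
};

// True if a slice keeps the buffer alive after the original is disposed.
bool slice_outlives_original() {
    ToyArray original = ToyArray::make(8);
    ToyArray view = original.slice(2, 6);
    view.at(0) = 42.0f;   // writes through to index 2 of the shared buffer
    original.dispose();   // reference count drops from 2 to 1
    return view.storage.use_count() == 1 && view.at(0) == 42.0f;
}

// True if disposing the slice leaves the original intact.
bool original_outlives_slice() {
    ToyArray original = ToyArray::make(8);
    ToyArray view = original.slice(2, 6);
    view.at(0) = 7.0f;
    view.dispose();       // reference count drops from 2 to 1
    return original.storage.use_count() == 1 && original.at(2) == 7.0f;
}
```

Both helper functions should return true under this model: disposing either handle only decrements the count. If the real backend behaves the same way, disposing the sliced NDArrays in the DataBatch would not by itself free memory that the original still references.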
@nswamy and @yzhliu, thanks for your recommendations on the changes. There should be no problem when we dispose a
Solution: pass copies to the BucketIter, which should solve the problem. I will send a PR for this and see how well it performs.
Thanks to all of you for your help. I have added a PR for the RNN example here: #11753
Finally, the DataDesc problem is resolved...
This model has not been updated since 2017/07, and the RNN example for LstmBucketing is no longer runnable.
Please contribute to RNN support on Scala for the current version. If you would like to run this example for now, please revert your MXNet version to v0.12.
In summary, I suspect that some operator changes caused this issue.
Please contact @lanking520 or @nswamy if you would like to dig in and fix this problem. The following is a way to reproduce the issues I found:
LSTM Bucketing
Setup
Download the required ptb file from here:
You can use this script to run the model
or manually identify the file path by following this
Additional step to get it to work
Please add the following to IO.scala
Problems that I found
Segmentation fault caused by C++ backend
NDArray size mismatch
Data name not the same
TestCharRNN
Please follow this tutorial
The problem I have