An implementation of the Tensor-Train Layer and Tensor-Train Recurrent Neural Networks for Keras, based on the existing implementation for Lasagne and Matlab.
"Tensorizing Neural Networks"
Alexander Novikov, Dmitry Podoprikhin, Anton Osokin, Dmitry Vetrov
In Advances in Neural Information Processing Systems 28 (NIPS-2015).
"Tensor-Train Recurrent Neural Networks for Video Classification"
Yinchong Yang, Denis Krompass, Volker Tresp
In International Conference on Machine Learning 34 (ICML-2017).
A Tensor-Train Layer (TTL) provides a replacement of a fully-connected feed-forward layer in Neural Networks. It factorizes the otherwise dense weight matrix into a train of smaller, 4D tensors and can therefore reduce the number of weights in the model. It is expected to speed up the training and inference to a rather large and tunable proportion, while not sacrificing much of the modeling power. This is especially the case, when the input contains redundant information, which could be e.g. pixels in a image or feature map after convolution. Also for feature selection / dimension reduction tasks that are too expensive to run lasso- and ridge-regularization, TT layer could also be an option so long as one is more interested in the mere prediction quality than identifying specific explaining features.
A Tensor-Train Recurrent Neural Network (TT-RNN), including Simple RNN, LSTM and GRU, replaces the weight matrix mapping from input to hidden layer (the gates in case of LSTM and GRU) with a TTL. This turns out to enable RNNs to handle very high-dimensional sequential data, such as video streams, where each frame in a video forms an extremely high-dimensional input vector.
Our TTLayer.py is a reimplementation of the terrific work by https://github.com/Bihaqo/TensorNet with specific improvement and adjustments for Keras. Applying only keras grammar i.e. the Backend functions, this implementation is expected to utilize both Theano and TensorFlow backends and compatible with other keras components. We further integrate these codes into the keras RNN implementations of TTRNN.py.
Once you've decided the Tensor-Train setting, including the factorization of the input and output vector as well as the tensor ranks, you can define the TT layer just in place of a usual dense one:
h = TT_Layer(tt_input_shape=[4, 8, 8, 4], tt_output_shape=[4, 8, 8, 4], tt_ranks=[1, 3, 3, 3, 1],
use_bias=True, activation='relu', W_regularizer=l2(5e-4), debug=False, ortho_init=True)
and use the TTRNNs as if they were usual RNNs:
rnn_layer = TT_GRU(tt_input_shape=[16, 32], tt_output_shape=[8, 8], tt_ranks=[1, 3, 1],
return_sequences=False, debug=True,
dropout=.25, recurrent_dropout=.25, activation='tanh')
including 4 out of 5 Datasets at the NIPS2003 workshop on feature selection
1.2. Gene Expression Data
including 4 Datasets that are from the R-package spls
1.3. MNIST Data
as in [Novikov et al. 2015] with slight modification
1.4. CIFAR10 Data
as in [Novikov et al. 2015] with slight modification
2.1. UCF11
video data as in [Yang et al. 2017]
2.2. Hollywood2
video data as in [Yang et al. 2017]
video data as in [Yang et al. 2017]
2.4. HMDB51
additional experiments
2.5. FiboClass
for POC and debugging purposes
I would like to thank Alexander Novikov and his colleagues for sharing their codes in lasagne and as well as a lot of first-hand experience in applying the model, without which this implementation would not have been possible. Also many thanks go to Siemens AG, Corporate Technology for supporting this work.
This implementation is still at a very early stage. There might be some warnings regarding the keras 2.x, because the code was originally developed in keras 1.x. We therefore highly appreciate insights, recommendations and critics of all kinds. I'm also going to upload some small subsets of the video data soon enough.
Tested with Keras 1.0.8 and Theano 0.8.2.dev., partly Keras 2.0.4 and Theano 0.9.0..