- PyTorch Implementation of the GRU4REC model.
- Original paper: Session-based Recommendations with Recurrent Neural Networks (ICLR 2016)
- This code is mostly a PyTorch re-implementation of the original Theano code written by the authors of the GRU4REC paper. My changes are as follows:
- Replaced the Theano components with PyTorch
- Simplified and cleaned up the session-parallel mini-batch generation code (a sketch of the idea follows this list)
- PyTorch 0.4.0 support
- The code is now compatible with PyTorch >= 0.4.0
- Code cleanup
- Removed redundant pieces of code thanks to the simpler API of PyTorch 0.4.0
- Improved the readability of the confusing RNN update routine
- Improved the readability of the training/testing routine
- Optimization
- Testing code is now much faster than before
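For context, session-parallel mini-batching (from the GRU4REC paper) walks over many sessions at once, one event per step, and refills a batch slot with a fresh session whenever its current session ends. The sketch below only illustrates the idea and is not this repository's actual code; the function name, the `SessionId`/`ItemId` column names, and the assumptions (data sorted by session and time, every session has at least two events, `batch_size` no larger than the number of sessions) are all mine.

```python
import numpy as np

def session_parallel_batches(df, batch_size):
    """Yield (input_items, target_items, reset_slots) for session-parallel training.

    `df` is assumed to be sorted by session (and by time within each session),
    with integer 'SessionId' and 'ItemId' columns; every session is assumed to
    have at least two events, and batch_size at most the number of sessions.
    `reset_slots` lists the batch positions whose hidden state should be zeroed
    because a new session just started there.
    """
    # offsets[k] = index of the first event of the k-th session in the flat arrays
    session_sizes = df.groupby('SessionId', sort=False).size().values
    offsets = np.zeros(len(session_sizes) + 1, dtype=np.int64)
    offsets[1:] = np.cumsum(session_sizes)
    items = df['ItemId'].values

    start = offsets[:batch_size].copy()        # current position of each batch slot
    end = offsets[1:batch_size + 1].copy()     # end of the session each slot is reading
    next_session = batch_size                  # index of the next unused session
    reset_slots = np.arange(batch_size)        # initially every slot starts a fresh session

    while True:
        # Advance all slots in lockstep until the shortest active session runs out.
        min_len = int((end - start).min())
        for i in range(min_len - 1):
            yield items[start + i], items[start + i + 1], reset_slots
            reset_slots = np.array([], dtype=np.int64)
        start = start + (min_len - 1)

        # Refill every exhausted slot with the next unused session, and remember
        # those slots so the caller can reset the corresponding hidden states.
        finished = np.where((end - start) <= 1)[0]
        if next_session + len(finished) > len(session_sizes):
            return
        for slot in finished:
            start[slot] = offsets[next_session]
            end[slot] = offsets[next_session + 1]
            next_session += 1
        reset_slots = finished
```

A training loop would feed each (input, target) pair through the GRU one step at a time and zero the hidden-state columns listed in `reset_slots` before that step.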
- PyTorch 0.4.0
- Python 3.6.4
- pandas 0.22.0
- numpy 1.14.0
- Filenames
  - The training set should be named `train.tsv`
  - The test set should be named `test.tsv`
- File Paths
  - `train.tsv` and `test.tsv` should be located under the `data` directory, i.e. `data/train.tsv` and `data/test.tsv`
- Contents
  - `train.tsv` and `test.tsv` should be tsv files storing pandas dataframes that satisfy the following requirements (without headers); a loading sketch follows this list:
    - The 1st column should contain the integer Session IDs
    - The 2nd column should contain the integer Item IDs
    - The 3rd column should contain the Timestamps
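For reference, here is a minimal sketch of loading and saving such a file with pandas; the column names are purely illustrative, since the files themselves carry no header:

```python
import pandas as pd

# Headerless, tab-separated file; the column names below are illustrative only.
df = pd.read_csv('data/train.tsv', sep='\t', header=None,
                 names=['SessionId', 'ItemId', 'Timestamp'])

# 1st column: integer session IDs, 2nd: integer item IDs, 3rd: timestamps.
df['SessionId'] = df['SessionId'].astype('int64')
df['ItemId'] = df['ItemId'].astype('int64')

# Writing a dataframe back in the expected format: tab-separated, no header, no index.
df.to_csv('data/train.tsv', sep='\t', header=False, index=False)
```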
See example.ipynb for the full Jupyter notebook that
- Loads the data
- Trains & tests a GRU4REC model
- Loads & tests a pretrained GRU4REC model
- Before using `run_train.py`, I highly recommend that you take a look at `example.ipynb` to see how the implementation works in general.
- Default parameters are the same as the TOP1 loss case in the GRU4REC paper.
- Intermediate models created at each training epoch will be stored under `models/`, unless specified otherwise.
- The log file will be written to `logs/train.out`.
$ python run_train.py > logs/train.out
Args:
--loss_type: Loss function type. Should be one of 'TOP1', 'BPR', or 'CrossEntropy'. (Default: 'TOP1')
--model_name: The prefix for the intermediate models that will be stored during training. (Default: 'GRU4REC')
--hidden_size: The dimension of the hidden layer of the GRU. (Default: 100)
--num_layers: The number of layers for the GRU. (Default: 1)
--batch_size: Training batch size. (Default: 50)
--dropout_input: Dropout probability of the input layer of the GRU. (Default: 0)
--dropout_hidden: Dropout probability of the hidden layer of the GRU. (Default: .5)
--optimizer_type: Optimizer type. Should be one of 'Adagrad', 'RMSProp', 'Adadelta', 'Adam', or 'SGD'. (Default: 'Adagrad')
--lr: Learning rate for the optimizer. (Default: 0.01)
--weight_decay: Weight decay for the optimizer. (Default: 0)
--momentum: Momentum for the optimizer. (Default: 0)
--eps: eps parameter for the optimizer. (Default: 1e-6)
--n_epochs: The number of training epochs to run. (Default: 10)
--time_sort: Whether to sort the sessions in the dataset in time order. (Default: False)
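For example, several of the flags above can be combined in one run; the values here are purely illustrative and simply override the documented defaults:

$ python run_train.py --loss_type BPR --batch_size 32 --lr 0.05 --n_epochs 10 > logs/train.out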
- This PyTorch implementation gives slightly better results than the original Theano code. I guess this comes from differences between Theano and PyTorch, and from the fact that dropout has no effect in my single-layer PyTorch GRU implementation (see the snippet after these notes).
- The results were reproducible within only 2 or 3 epochs, unlike the original Theano implementation, which runs for 10 epochs by default.
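For reference, the dropout point is a general PyTorch behavior rather than anything specific to this repository: the `dropout` argument of `nn.GRU` is applied only between stacked recurrent layers, so with a single layer it is a no-op. The sizes below are just example values:

```python
import torch.nn as nn

# nn.GRU applies `dropout` only between stacked layers; with num_layers=1 it has
# no effect, and recent PyTorch versions emit a warning about exactly this case.
gru = nn.GRU(input_size=100, hidden_size=100, num_layers=1, dropout=0.5)
```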
$ bash run_train.sh