
Overview

DaisyRec-v2.0 is a Python toolkit developed for benchmarking the top-N recommendation task. The name DAISY stands for multi-Dimension fAirly comparIson for recommender SYstem. Since its release, DaisyRec has undergone continuous upgrades and updates. The table below lists each code version and its corresponding research paper; DaisyRec-v2.0 (dev branch) is the latest version. (Note that DaisyRec-v2.0 is still under testing; if you run into any issue, please feel free to let us know.)

Version | Papers
DaisyRec | Are We Evaluating Rigorously? Benchmarking Recommendation for Reproducible Evaluation and Fair Comparison
DaisyRec2.0-main | DaisyRec 2.0: Benchmarking Recommendation for Rigorous Evaluation
DaisyRec2.0-dev | Under upgrade and optimization

The figure below shows the overall framework of DaisyRec-v2.0.

Tutorial - How to use DaisyRec-v2.0

Prerequisites

Make sure you have a CUDA environment available for acceleration, since the deep learning models can be trained on GPU.
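
If you are unsure whether your setup is ready, a quick check with PyTorch (a minimal sketch, assuming the deep learning baselines run on PyTorch) is:

import torch

# True means a CUDA-capable GPU is visible to PyTorch; otherwise the
# deep learning models will fall back to the much slower CPU.
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))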

How to Run

python test.py
python tune.py
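
Both scripts read the target algorithm, dataset and other settings from command-line options; the full option list is given in the documentation, and the GUI command generators below can assemble the command for you. A purely illustrative invocation might look like the following (the flag names --algo_name and --dataset are assumptions here; copy the exact command from the generator or the documentation):

python test.py --algo_name=mf --dataset=ml-1m
python tune.py --algo_name=mf --dataset=ml-1m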

Earlier Version of GUI Command Generator and Tutorial

  • The GUI Command Generator is available here.

  • Please refer to DaisyRec-v2.0-Tutorial.ipynb, which demonstrates step by step how to use DaisyRec-v2.0 to tune hyper-parameters and test the algorithms.

Updated Version of GUI Command Generator and Tutorial

  • The updated GUI Command Generator is available here.

  • Please refer to DaisyRec-v2.0-Tutorial-New.ipynb, which demonstrates step by step how to use DaisyRec-v2.0 to tune hyper-parameters and test the algorithms.

Documentation

The documentation of DaisyRec-v2.0 is available here; it provides detailed explanations of all commands.

Implemented Algorithms

Below are the algorithms implemented in DaisyRec-v2.0. More baselines will be added later.

  • Memory-based Methods
    • MostPop, ItemKNN, EASE
  • Latent Factor Methods
    • PureSVD, SLIM, MF, FM
  • Deep Learning Methods
    • NeuMF, NFM, NGCF, Multi-VAE
  • Representation Methods
    • Item2Vec

Datasets

You can download the experiment datasets and put them into the data folder. All datasets are available at the links below:

Ranking Results

  • Please refer to ranking_results for the ranking performance of different baselines across six datasets (i.e., ML-1M, LastFM, Book-Crossing, Epinions, Yelp and AMZ-Electronic).

    • Regarding Time-aware Split-by-Ratio (TSBR)
      • We adopt Bayesian HyperOpt to perform hyper-parameter optimization w.r.t. NDCG@10 for each baseline under three views (i.e., origin, 5-filter and 10-filter) on each dataset, with 30 trials per setting.
      • We keep the original objective function for each baseline (BPR loss for MF, FM, NFM and NGCF; squared error loss for SLIM; cross-entropy loss for NeuMF and Multi-VAE), employ the uniform sampler, and adopt time-aware split-by-ratio (TSBR) at the global level (rho=80%) as the data splitting method. Besides, the latest 10% of the training set is held out as the validation set to tune the hyper-parameters. Once the optimal hyper-parameters are decided, we feed the whole training set to train the final model and report the performance on the test set (a minimal splitting sketch is given after this list).
      • Note that for SLIM we only report the 10-filter results, due to its extremely high computational complexity on large-scale datasets, which makes it unable to complete in a reasonable amount of time; NGCF on Yelp and AMZ-Electronic under the origin view is omitted for the same reason.
    • Regarding Time-aware Leave-One-Out (TLOO)
      • We adopt Bayesian HyperOpt to perform hyper-parameter optimization w.r.t. NDCG@10 for each baseline under three views (i.e., origin, 5-filter and 10-filter) on each dataset, with 30 trials per setting.
      • We keep the original objective function for each baseline (BPR loss for MF, FM, NFM and NGCF; squared error loss for SLIM; cross-entropy loss for NeuMF and Multi-VAE), employ the uniform sampler, and adopt time-aware leave-one-out (TLOO) as the data splitting method. In particular, for each user, the latest interaction is kept as the test set, the second latest interaction is used as the validation set, and the remaining interactions are treated as the training set.
      • Note that we only provide the 10-filter results for all the methods across the six datasets.
  • Please refer to the appendix.pdf file for the optimal hyper-parameter settings and other information.

    • Tables 16-18 show the best hyper-parameter settings for TSBR
    • Table 19 shows the best hyper-parameter settings for TLOO
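
For readers who want to reproduce the two splitting schemes outside the toolkit, the sketch below implements them on a pandas DataFrame. It is a minimal illustration under assumed column names (user, item, timestamp), not the toolkit's own splitter.

import pandas as pd

def time_aware_split_by_ratio(df, rho=0.8):
    # TSBR at the global level: the earliest rho fraction of interactions
    # (ordered by timestamp) form the training set, the rest the test set.
    df = df.sort_values('timestamp')
    cut = int(len(df) * rho)
    return df.iloc[:cut], df.iloc[cut:]

def hold_out_latest_validation(train, ratio=0.1):
    # Hold out the latest `ratio` of the training interactions as the
    # validation set used for hyper-parameter tuning.
    train = train.sort_values('timestamp')
    cut = int(len(train) * (1 - ratio))
    return train.iloc[:cut], train.iloc[cut:]

def time_aware_leave_one_out(df):
    # TLOO: per user, the latest interaction is the test instance, the
    # second latest is the validation instance, and the rest are training.
    df = df.sort_values('timestamp')
    pos_from_end = df.groupby('user').cumcount(ascending=False)
    return df[pos_from_end >= 2], df[pos_from_end == 1], df[pos_from_end == 0]

# Example usage (the file path and column names are assumptions):
# df = pd.read_csv('data/ml-1m.csv')  # columns: user, item, rating, timestamp
# train, test = time_aware_split_by_ratio(df, rho=0.8)
# train, valid = hold_out_latest_validation(train, ratio=0.1)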

Team Members

Version | Leaders | Senior members | Developers | Contributors
DaisyRec | Zhu Sun | Hui Fang, Jie Yang, Xinghua Qu, Jie Zhang | Di Yu | Cong Geng
DaisyRec-v2.0 | Zhu Sun | Hui Fang, Jie Yang, Xinghua Qu, Jie Zhang, Yew-Soon Ong | Di Yu, Hongyang Liu | Cong Geng, Yanmu Ding, Syed M Zaheen

Cite

Please cite both of the following papers if you use DaisyRec-v2.0 in a research paper in any way (e.g., using the code or ranking results):

@inproceedings{sun2020are,
  title={Are We Evaluating Rigorously? Benchmarking Recommendation for Reproducible Evaluation and Fair Comparison},
  author={Sun, Zhu and Yu, Di and Fang, Hui and Yang, Jie and Qu, Xinghua and Zhang, Jie and Geng, Cong},
  booktitle={Proceedings of the 14th ACM Conference on Recommender Systems},
  year={2020}
}

@article{sun2022daisyrec,
  title={DaisyRec 2.0: Benchmarking Recommendation for Rigorous Evaluation},
  author={Sun, Zhu and Fang, Hui and Yang, Jie and Qu, Xinghua and Liu, Hongyang and Yu, Di and Ong, Yew-Soon and Zhang, Jie},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
  year={2022},
  publisher={IEEE}
}

TODO List

  • A more user-friendly GUI command generator
  • Two ways of negative sampling
  • Add data source links for the well-split datasets in the TPAMI paper
  • Add a tutorial on how to integrate new algorithms into DaisyRec-v2.0

Acknowledgements

We refer to the following repositories to improve our code: