Hybrid Content Boosted Neural Collaborative Filtering for Automatic Playlist Generation

This is the main repository of team spotif.ai's solution for the Spotify RecSys Challenge 2018. The proposed system is a hybrid recommender that combines the content of the playlist with Collaborative Filtering (CF). More specifically, we use two different recommenders for the two recommendation scenarios.

Matrix Factorization (MF) based CF model

If a playlist already has seed tracks, any well-developed MF based CF system can be deployed for the recommendation.
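As a rough illustration of this scenario, the sketch below ranks unseen tracks for a playlist from its seed tracks. It assumes track factors are already available from some MF model, and it builds the playlist vector by simply averaging the seed track factors, which is a simplification chosen for this example and not necessarily what this repository does. The function names and the toy factors are hypothetical.

```python
# Hypothetical sketch: rank tracks for a playlist that has seed tracks,
# given track factors produced by any MF based CF model.

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def recommend_from_seeds(track_factors, seed_tracks, top_k=10):
    """Return the top_k non-seed tracks closest to the mean seed factor."""
    n_dims = len(next(iter(track_factors.values())))
    # Assumption: the playlist vector is the mean of its seed track vectors.
    playlist_vec = [0.0] * n_dims
    for track in seed_tracks:
        for d, value in enumerate(track_factors[track]):
            playlist_vec[d] += value / len(seed_tracks)
    seeds = set(seed_tracks)
    scores = {
        track: dot(playlist_vec, factor)
        for track, factor in track_factors.items()
        if track not in seeds  # never recommend a seed track back
    }
    # Sort by descending score, breaking ties by track id for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]

# Toy factors (made up for the example).
track_factors = {
    "track:a": [1.0, 0.0],
    "track:b": [0.9, 0.1],
    "track:c": [0.0, 1.0],
    "track:d": [0.8, 0.3],
}
ranked = recommend_from_seeds(track_factors, ["track:a"], top_k=2)
print(ranked)
```

With the toy factors above, the two tracks most similar to the seed are returned and the seed itself is excluded.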

Content Boosted Neural Collaborative Filtering (CBNCF)

For playlists that do not have any seed tracks, we apply a Recurrent Neural Network (RNN) based Content Boosted Neural Collaborative Filtering (CBNCF) model.
For each playlist, this model outputs a preference score for every candidate track in the dataset, based on the track title text.
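The idea of scoring tracks from text can be sketched as follows. This is a minimal, untrained toy, not the model in `model.py` or `train_rnn.py`: it assumes a character-level vanilla tanh RNN whose last hidden state is compared to track factors by inner product. The vocabulary, dimensions, random weights, and all function names are assumptions made for illustration only.

```python
import math
import random

def make_matrix(rows, cols, rng, scale=0.1):
    """Random weight matrix with entries in [-scale, scale]."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def encode_title(title, char_emb, w_in, w_rec):
    """Run a vanilla tanh RNN over the characters; return the last hidden state."""
    hidden = [0.0] * len(w_rec)
    for ch in title.lower():
        x = char_emb.get(ch)
        if x is None:  # skip characters outside the toy vocabulary
            continue
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))
                + sum(w * hi for w, hi in zip(w_rec[j], hidden))
            )
            for j in range(len(hidden))
        ]
    return hidden

def score_tracks(title_vec, track_factors):
    """Preference score per track: inner product with the encoded title."""
    return {
        track: sum(a * b for a, b in zip(title_vec, factor))
        for track, factor in track_factors.items()
    }

# Toy setup with untrained random weights (hypothetical sizes).
rng = random.Random(0)
hidden_dim, emb_dim = 4, 3
vocab = "abcdefghijklmnopqrstuvwxyz "
char_emb = {ch: [rng.uniform(-1, 1) for _ in range(emb_dim)] for ch in vocab}
w_in = make_matrix(hidden_dim, emb_dim, rng)
w_rec = make_matrix(hidden_dim, hidden_dim, rng)
track_factors = {
    "track:%d" % i: [rng.uniform(-1, 1) for _ in range(hidden_dim)] for i in range(5)
}

title_vec = encode_title("Chill Vibes", char_emb, w_in, w_rec)
scores = score_tracks(title_vec, track_factors)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In a trained system the RNN weights would be learned so that the encoded text lands near the factors of relevant tracks; here the weights are random, so the ranking is meaningless and only the data flow is shown.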

Set up the virtual env

To reproduce the entire result, or just to try out individual sub-steps of this solution, you first need to set up the virtual environment. We recommend installing pipenv, which we used for this project. If your system already has python2.7 and pip, you can simply run the command below.

$(sudo) pip install pipenv

Then clone this repo:

$git clone https://github.com/eldrin/recsys18-spotify-spotif-ai.git

After cloning, move into the directory and install the environment.

$cd recsys18-spotify-spotif-ai
$pipenv install

If your main python version is python3.X, make sure to install the environment with the python2.X option.

$pipenv --python 2.7 install

pipenv will then automatically create a python2.7 virtualenv on your system. Note that if your system does not have python2.X, you might need to install it manually or use pyenv.

To enter the virtual environment, run the command below from the top-level directory of the repo.

$pipenv shell

Now you're good to go!

Reproduce the result

With all the dependencies installed correctly, you can now reproduce the result by running the command below.

$python reproduce.py /where/you/decompress/mpd/dataset/ /path/to/challenge_set.json /path/to/dump/outputs/ --n-factors=1000

If you want to use a GPU for the RNN training, simply add the corresponding flag.

$python reproduce.py /where/you/decompress/mpd/dataset/ /path/to/challenge_set.json /path/to/dump/outputs/ --n-factors=1000 --use-gpu=True

When the script finishes, it dumps the submission file (.csv) in ./data/.
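For repeated runs, the two invocations above can be assembled programmatically. The helper below is a hypothetical convenience wrapper, not part of this repository. It only uses the positional arguments and the `--n-factors` and `--use-gpu` flags shown above, and it assumes it is run from the repo top-directory inside the pipenv shell.

```python
import subprocess

def build_reproduce_command(mpd_dir, challenge_json, out_dir,
                            n_factors=1000, use_gpu=False):
    """Build the argument list for reproduce.py as shown in this README."""
    cmd = [
        "python", "reproduce.py",
        mpd_dir, challenge_json, out_dir,
        "--n-factors=%d" % n_factors,
    ]
    if use_gpu:
        # Same flag spelling as in the GPU example above.
        cmd.append("--use-gpu=True")
    return cmd

def run_reproduce(*args, **kwargs):
    """Launch reproduce.py; raises CalledProcessError on a non-zero exit."""
    subprocess.check_call(build_reproduce_command(*args, **kwargs))

# Print the command instead of running it, using the placeholder paths above.
cmd = build_reproduce_command(
    "/where/you/decompress/mpd/dataset/",
    "/path/to/challenge_set.json",
    "/path/to/dump/outputs/",
    use_gpu=True,
)
print(" ".join(cmd))
```

Calling `run_reproduce(...)` with real paths would start the full pipeline, so the example only prints the command.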

TODOs

~~generate readme~~
provide a script (or notebook) for quick run of the program
~~finish the README.md (with proper explanation for everything)~~
