Movie Pepper Backend

This repo contains all the backend code for the Movie Pepper open-source recommendation engine.

This includes the REST API and the IMDb crawler.

Setup

Python 3, pip and virtualenv must be installed.

Create and activate a virtualenv

python3 -m venv venv

source venv/bin/activate
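
To confirm that the activation worked before installing anything, a small check can be run with the interpreter on your path. This is a minimal sketch that is not part of the repo; the function name `in_virtualenv` is made up for illustration. It relies on the fact that, inside an environment created with `python3 -m venv`, `sys.prefix` differs from `sys.base_prefix`.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # In a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"virtualenv active: {sys.prefix}")
    else:
        print("no virtualenv active; run 'source venv/bin/activate' first")
```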

Install dependencies

pip install -r requirements.txt
python -m textblob.download_corpora
python -m nltk.downloader stopwords
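
If one of the download commands fails with an import error, it can help to check which packages are actually importable from the active environment. The sketch below is an illustration and not part of the repo; `missing_modules` is a hypothetical helper, and the only package names it assumes are `textblob` and `nltk`, taken from the commands above.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # textblob and nltk are the packages used by the download commands.
    missing = missing_modules(["textblob", "nltk"])
    if missing:
        print("missing packages:", ", ".join(missing))
    else:
        print("textblob and nltk are importable")
```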

Crawler

A Bash script is provided to simplify executing the Spidy crawler.

cd movie_scrape
START_URL="http://www.imdb.com/search/title?groups=top_1000&sort=user_rating,desc&page=1&ref" ./scrap.sh
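
The same invocation can be driven from Python, for example from a scheduled job. This is a sketch under assumptions, not something the repo provides: `crawler_env` and `run_crawler` are hypothetical names, and the only behavior assumed about `scrap.sh` is what the command above shows, namely that it reads `START_URL` from the environment and is run from inside `movie_scrape`.

```python
import os
import subprocess

# URL copied verbatim from the shell command above.
START_URL = (
    "http://www.imdb.com/search/title"
    "?groups=top_1000&sort=user_rating,desc&page=1&ref"
)


def crawler_env(start_url, base_env=None):
    """Return a copy of the environment with START_URL set for scrap.sh."""
    env = dict(os.environ if base_env is None else base_env)
    env["START_URL"] = start_url
    return env


def run_crawler(start_url=START_URL, workdir="movie_scrape"):
    """Run ./scrap.sh inside workdir; raises CalledProcessError on failure."""
    subprocess.run(
        ["./scrap.sh"],
        cwd=workdir,  # equivalent to the 'cd movie_scrape' step
        env=crawler_env(start_url),
        check=True,
    )
```

Calling `run_crawler()` from the repository root starts the crawl. Passing the URL through the environment rather than as an argument mirrors the shell form shown above.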

After the crawl is complete, calculate the TF-IDF values and train the Doc2Vec models.

python tfidf_lsa.py
python doc2vec.py

This step must be completed before the server can be started.
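
For readers unfamiliar with TF-IDF, the idea can be sketched in a few lines of plain Python. This is a textbook variant for illustration only and makes no claim about how `tfidf_lsa.py` computes its values: it assumes term frequency is the count divided by document length and inverse document frequency is `log(N / df)`, and the sample sentences are invented.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Uses tf = count / document length and idf = log(N / document frequency).
    """
    n_docs = len(documents)
    # Document frequency: the number of documents each term appears in.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


if __name__ == "__main__":
    plots = [
        "a hobbit carries a ring".split(),
        "a detective hunts a killer".split(),
    ]
    for doc_weights in tf_idf(plots):
        # Show the three highest-weighted terms of each document.
        print(sorted(doc_weights.items(), key=lambda kv: -kv[1])[:3])
```

Terms that occur in every document, such as "a" in the sample, get a weight of zero, which is why TF-IDF highlights the words that distinguish one plot from another.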

Server

Start the server

gunicorn --bind 0.0.0.0:5000 server:app

You will probably want to use a reverse proxy such as NGINX and secure it with HTTPS.
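
The gunicorn options can also be kept in a Python configuration file instead of on the command line. The file below is a sketch and does not ship with the repo: the file name `gunicorn.conf.py` and the `workers` and `timeout` values are assumptions, and only the bind address comes from the command above.

```python
# gunicorn.conf.py (hypothetical file, not part of the repo)
import multiprocessing

# Same address as the --bind flag in the command above.
bind = "0.0.0.0:5000"

# Starting point suggested by the Gunicorn documentation: (2 x cores) + 1.
workers = multiprocessing.cpu_count() * 2 + 1

# Seconds a worker may stay silent before it is restarted (Gunicorn's default).
timeout = 30
```

It would then be used as `gunicorn -c gunicorn.conf.py server:app`. When a reverse proxy runs on the same host, binding to `127.0.0.1:5000` instead keeps the application port off the public interface.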

For development you can use

python server.py
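
A quick way to check that the development server is accepting connections is to attempt a TCP connection to it. This is a minimal sketch with a hypothetical helper, `port_is_open`; it assumes the development server listens on port 5000 like the gunicorn example, which the source does not state, so adjust the port if `server.py` reports a different one.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Port 5000 is an assumption taken from the gunicorn example.
    state = "listening" if port_is_open("127.0.0.1", 5000) else "not reachable"
    print(f"server on 127.0.0.1:5000 is {state}")
```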
