Movie Recommender System

README IS A WORK IN PROGRESS
CONTENTS SUBJECT TO CHANGE

Description

This project is a movie recommendation system that predicts the ratings users would give to movies in the training set that they have not yet watched, then recommends to each user the 10 movies with the highest predicted ratings. The system explains why each recommendation was made and shows how latent features are weighted, so the way recommendations are determined is transparent. It also reports evaluation metrics for assessing recommendation quality.
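The selection step described above can be sketched as follows. This is a minimal illustration, not the code in this repository: the function names `top_n_unseen` and `with_titles`, the dictionary-based inputs, and the sample ratings are all hypothetical.

```python
def top_n_unseen(predicted, seen, n=10):
    """Return the n unseen movie ids with the highest predicted rating.

    predicted: dict mapping movie_id -> predicted rating for one user
    seen: set of movie ids the user has already rated
    """
    candidates = ((mid, r) for mid, r in predicted.items() if mid not in seen)
    # Sort by rating, highest first; break ties by movie id for determinism.
    ranked = sorted(candidates, key=lambda pair: (-pair[1], pair[0]))
    return [mid for mid, _ in ranked[:n]]


def with_titles(movie_ids, titles):
    """Map movie ids to names, with a placeholder for ids missing a title."""
    return [titles.get(mid, f"<unknown {mid}>") for mid in movie_ids]


# Hypothetical data for a single user (not taken from the project's datasets).
predicted = {1: 4.2, 2: 3.1, 3: 4.8, 4: 2.5, 5: 4.8}
seen = {3}
titles = {1: "Movie A", 2: "Movie B", 4: "Movie D", 5: "Movie E"}

top = top_n_unseen(predicted, seen, n=3)
print(top)                       # [5, 1, 2]
print(with_titles(top, titles))  # ['Movie E', 'Movie A', 'Movie B']
```

Movie 3 has the highest predicted rating but is excluded because the user has already rated it.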

Output includes:

  • top 10 recommendations list with movie IDs
  • top 10 recommendations list with movie names
  • the ratio in which similar-user embeddings and tag and genre latent features are weighted
  • explanations of why the top 3 movies were recommended, to make the recommendations transparent
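One way such a weighting ratio could be computed is sketched below. It assumes the final prediction is a linear combination of features and measures each feature group's share of the total absolute contribution. That definition, the function name `group_contributions`, and the sample numbers are assumptions for illustration and may differ from what this project actually reports.

```python
def group_contributions(weights, features, groups):
    """Share of a linear prediction's magnitude attributable to each group.

    weights, features: equal-length lists of floats (bias term excluded)
    groups: one group label per feature, e.g. "collaborative" or "content"
    """
    totals = {}
    for w, x, g in zip(weights, features, groups):
        # Use the absolute contribution so opposite signs do not cancel out.
        totals[g] = totals.get(g, 0.0) + abs(w * x)
    overall = sum(totals.values())
    if overall == 0:
        return {g: 0.0 for g in totals}
    return {g: t / overall for g, t in totals.items()}


# Hypothetical regression weights and feature values for one (user, movie) pair.
weights = [0.5, -0.25, 1.0, 0.5]
features = [2.0, 4.0, 1.0, 2.0]
groups = ["collaborative", "collaborative", "content", "content"]

shares = group_contributions(weights, features, groups)
print(shares)  # {'collaborative': 0.5, 'content': 0.5}
```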

Features

  • Hybrid filtering that combines similar users' ratings with each user's individual movie preferences
  • Collaborative filtering through SVD
  • Content-based filtering through a Word2Vec neural network, which applies natural language processing to derive semantic relationships from user-submitted review tags and movie genres
  • Regression-based rating prediction
  • Recommendation explainability and transparency
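To illustrate the collaborative filtering idea, here is a minimal pure-Python sketch of latent-factor learning by stochastic gradient descent, the style of matrix factorization often called "SVD" in recommender systems. It is an assumption that this resembles the project's custom SVD; the function names, hyperparameters, and toy ratings below are hypothetical.

```python
import random


def factorize(ratings, n_factors=2, epochs=300, lr=0.02, reg=0.02, seed=0):
    """Learn user and item latent factors from (user, item, rating) triples."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    # Small random initial factors; a global mean serves as the baseline.
    P = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    Q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}
    mean = sum(r for _, _, r in ratings) / len(ratings)
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - (mean + sum(pf * qf for pf, qf in zip(P[u], Q[i])))
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return mean, P, Q


def predict(model, user, item):
    """Predicted rating: global mean plus the user-item factor dot product."""
    mean, P, Q = model
    return mean + sum(pf * qf for pf, qf in zip(P[user], Q[item]))


def rmse(model, ratings):
    """Root mean squared error of the model on the given triples."""
    se = sum((r - predict(model, u, i)) ** 2 for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5


# Toy ratings (hypothetical, not from the project's datasets).
ratings = [
    ("u1", "m1", 5), ("u1", "m2", 4), ("u1", "m3", 1),
    ("u2", "m1", 5), ("u2", "m3", 1),
    ("u3", "m2", 1), ("u3", "m3", 5),
    ("u4", "m1", 1), ("u4", "m3", 4),
]
model = factorize(ratings)
print("training RMSE:", round(rmse(model, ratings), 3))
print("u2 on unseen m2:", round(predict(model, "u2", "m2"), 2))
```

In a hybrid setup like the one listed above, learned user factors such as these could be combined with tag and genre embeddings as inputs to a regression model. How this project combines them is not shown here.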

Installation

  1. Clone the repository:

    git clone https://github.com/tylerho5/movie-recommender-system.git
  2. Create a virtual environment and install dependencies:

    python -m venv .venv
    source .venv/bin/activate
    pip install -r requirements.txt

Usage

  1. Place your datasets in the raw-datasets/ folder.
  2. Run the main script:
    python recSys_final.py
  3. Outputs will be saved in the output/ directory.

Folder Structure

├── output/             # Results from program
├── raw-datasets/       # Raw datasets in system
├── report/             # Technical report in .tex format
├── scripts/            # Python scripts for recommender system
├── .gitattributes      # Defining attributes
├── .gitignore          # Ignored environment files
├── LICENSE             # Licensing for open-source library inspiration
├── README.md           # Project documentation
└── requirements.txt    # List of dependencies

Acknowledgments

This project was built with inspiration and code references from the following open-source repositories:

We are grateful to the open-source community for these resources, which have significantly contributed to the inspiration and development of the project.

Contributors

This project was a collaborative effort between:

  • Tyler Ho: Project lead, implementation of custom SVD, custom Word2Vec, custom scaler, and custom regression, as well as documentation and codebase organization.
  • Quynh Nguyen: Contributions in research, testing, technical report, and presentation.

License

This project is licensed under the BSD 3-Clause License. See the LICENSE file in this repository for details.