-
Notifications
You must be signed in to change notification settings - Fork 3
A regression system was designed that predicts the IMDb rating of a movie. A movie viewer would otherwise have to rely on a critic's review or self-instincts. A dataset, obtained from Kaggle, contains certain attributes (such as genre, duration, names of actor, director, number of voters for the rating, plot and keywords, language, etc.) and per…
amanvora/Movie-Rating-Prediction
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Libraries required: 1. Scikit-learn 2. Matplotlib 3. numpy Instructions to run: 1. Please run preproc.py first. It will generate 'preprocessed.txt' and 'imdbScores.txt' to be used by the other two scripts. 2. modelSelCV.py and preproc.py each have commented sections in order to view plot of covariance matrix and categorical features respectively. 3. Since model selection performs an exhaustive evaluation of all the models across all the hyperparamters, it takes a long time to execute (approx. 30 minutes depending on the system) Code referred: 1. For one-hot enconding of catgorical features: http://fastml.com/converting-categorical-data-into-numbers-with-pandas-and-scikit-learn/ 2. For splitting the data to (train,test) and loading the database https://www.kaggle.com/nsnilay/d/deepmatrix/imdb-5000-movie-dataset/predicting-imdb-ratings-using-numerical-attributes/notebook
About
A regression system was designed that predicts the IMDb rating of a movie. A movie viewer would otherwise have to rely on a critic's review or self-instincts. A dataset, obtained from Kaggle, contains certain attributes (such as genre, duration, names of actor, director, number of voters for the rating, plot and keywords, language, etc.) and per…
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published