Portfolio Project: Date-A-Scientist

Main goal: Create machine learning models that make predictions about OKCupid users. In the real world, the OKCupid app uses predictions like these to estimate compatibility among users and then suggest matches. The specific goal here is to compare the performance of supervised machine learning models at predicting users' religious preference from other aspects of their profiles. A supplemental learning exercise tests whether a natural language processing model can predict which user wrote a very brief essay.

Preprocessing: The data were preprocessed to select variables, remove null values, clean and sort records, add dummy variables, and address label imbalance. The data were then split into training and test sets.
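
As a rough illustration of these steps, the sketch below drops rows with nulls, one-hot encodes categorical columns, and makes a stratified train-test split. The toy DataFrame, its column names, and the `preprocess` helper are hypothetical and not taken from the notebook. Stratifying the split only keeps label proportions equal across the two sets; it stands in for, and is not necessarily the same as, whatever imbalance handling the notebook applies.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy profiles; column names and values are illustrative only.
profiles = pd.DataFrame({
    "age": [25, 31, 42, 29, 37, 23, 51, 34, None, 28],
    "drinks": ["socially", "often", "rarely", "socially", None,
               "socially", "rarely", "often", "socially", "rarely"],
    "religion": ["atheism", "christianity", "atheism", "christianity", "atheism",
                 "christianity", "atheism", "christianity", "atheism", "christianity"],
})


def preprocess(df, label_col="religion"):
    """Drop rows with nulls, then one-hot encode the categorical features."""
    clean = df.dropna().reset_index(drop=True)
    features = pd.get_dummies(clean.drop(columns=[label_col]))
    labels = clean[label_col]
    return features, labels


X, y = preprocess(profiles)

# Stratify on the label so both splits keep similar class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
print(X.columns.tolist())
print(len(X_train), len(X_test))
```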

Models: scikit-learn's Logistic Regression, K Neighbors Classifier, Decision Tree Classifier, and Random Forest Classifier, plus a supplemental natural language model using Bag of Words features and a Naive Bayes Classifier.
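
One plausible way to compare the four classifiers is to fit each on the same training split and score each on the same test split. The sketch below assumes synthetic data from `make_classification` in place of the OKCupid features, and the hyperparameters shown are illustrative defaults rather than the notebook's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the profile features and religion labels.
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Illustrative settings only; the notebook's hyperparameters may differ.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "k_neighbors": KNeighborsClassifier(n_neighbors=5),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit every model on the same split and record test accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

for name, accuracy in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {accuracy:.3f}")
```

Scoring all models on one shared split keeps the comparison like-for-like; cross-validation would give a less noisy ranking at the cost of more compute.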

Conclusion: The Random Forest Classifier performed best, with a baseline score of 52%. This is somewhat successful and clearly better than guessing, where about 11% accuracy would be expected. The model could possibly be refined for better accuracy, but there is a real risk of overfitting the data.
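
For context on the guessing baseline, the sketch below computes two common reference points from a list of labels: the expected accuracy of uniform random guessing (1 divided by the number of classes) and the accuracy of always predicting the most frequent class. The `chance_baselines` helper and the nine placeholder labels are hypothetical. An 11% figure would be consistent with uniform guessing over nine classes, but the class count is an assumption here, not something the text states.

```python
from collections import Counter


def chance_baselines(labels):
    """Return (uniform-guess accuracy, majority-class accuracy) for the labels."""
    counts = Counter(labels)
    uniform = 1 / len(counts)                      # guess each class equally often
    majority = max(counts.values()) / len(labels)  # always guess the top class
    return uniform, majority


# Hypothetical balanced label set with nine classes (assumed count).
labels = [f"class_{i}" for i in range(9)] * 10
uniform, majority = chance_baselines(labels)
print(f"uniform guess: {uniform:.1%}, majority class: {majority:.1%}")
```

When the classes are balanced the two baselines coincide; with imbalanced labels the majority-class baseline is higher and is usually the stricter comparison.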

The Natural Language Processing learning exercise produced a model that made the correct prediction an average of 1.8 times out of 10, only slightly better than guessing (where 1 in 10 is expected). Note again that this was a learning exercise using available but not entirely adequate data, and in that respect it was successful.
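
A minimal sketch of a bag-of-words author classifier follows, assuming scikit-learn's `CountVectorizer` for the word counts and `MultinomialNB` as the Naive Bayes variant (the text does not say which variant the notebook uses). The essays and author labels are made up for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Made-up essays from two hypothetical authors.
essays = [
    "i love hiking and mountains",
    "hiking trails every weekend",
    "mountains and camping trips",
    "i write python code daily",
    "software and code all day",
    "python software projects",
]
authors = ["user_a", "user_a", "user_a", "user_b", "user_b", "user_b"]

# Bag of words: each essay becomes a vector of word counts, which the
# Naive Bayes classifier uses to estimate per-author word likelihoods.
author_model = make_pipeline(CountVectorizer(), MultinomialNB())
author_model.fit(essays, authors)

predictions = author_model.predict(["camping on mountains trails", "python code"])
print(list(predictions))
```

With very short essays most words never appear in the training vocabulary, which is one plausible reason a model like this can end up only slightly better than guessing.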

Code: Python, Jupyter Notebook
Packages: pandas, numpy, matplotlib, seaborn, sklearn
Data Source: Raw data for this project was provided by the OKCupid app via Codecademy.

Repository: t-will-gillis/portfolio-ok_cupid_date-a-scientist

Folders and files

Latest commit

History

Repository files navigation

Portfolio Project: Date-A-Scientist

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages