Predicting Poverty in Costa Rican Households through Proxy Means Testing

Team Members

Lee-Or Bentovim, Katherine Dumais, Andrew Dunn, Kathryn Link-Oberstar

Project Summary

Using a Kaggle dataset from the Inter-American Development Bank, we design a machine learning model to classify household-level poverty using a Proxy Means Test methodology. After cleaning the data and collapsing it to the household level, we use several oversampling techniques and cross-validation to improve model performance given the imbalance across poverty categories. After testing random forest, logistic regression, naive Bayes, and k-nearest neighbors models with different combinations of hyperparameters, we select logistic regression as our best-performing model. We also test ensemble methods and explore a binary poverty categorization. Finally, we note the limitations of our approach and offer recommendations for further exploration.
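As a rough illustration of two of the steps above (collapsing individual records to the household level and oversampling under-represented poverty categories), here is a minimal sketch using only the Python standard library. It is not the project's actual code: the field names (`household_id`, `age`, `poverty_level`), the aggregated features, the toy data, and the helper functions are all hypothetical, and simple random oversampling stands in for whichever oversampling techniques the project used.

```python
import random
from collections import Counter, defaultdict


def collapse_to_households(rows, id_key="household_id"):
    """Aggregate individual-level rows into one record per household."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[id_key]].append(row)
    households = []
    for hid, members in groups.items():
        households.append({
            id_key: hid,
            "size": len(members),
            "mean_age": sum(m["age"] for m in members) / len(members),
            # Assumption: every member shares the household's label,
            # so the first member's label is used.
            "poverty_level": members[0]["poverty_level"],
        })
    return households


def random_oversample(records, label_key="poverty_level", seed=0):
    """Duplicate minority-class records at random until every class
    is as large as the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for rec in records:
        by_class[rec[label_key]].append(rec)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap (k may be 0).
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Hypothetical individual-level records; the labels are made up.
people = [
    {"household_id": "a", "age": 40, "poverty_level": 4},
    {"household_id": "a", "age": 10, "poverty_level": 4},
    {"household_id": "b", "age": 65, "poverty_level": 4},
    {"household_id": "c", "age": 30, "poverty_level": 4},
    {"household_id": "c", "age": 32, "poverty_level": 4},
    {"household_id": "c", "age": 4, "poverty_level": 4},
    {"household_id": "d", "age": 50, "poverty_level": 1},
    {"household_id": "d", "age": 20, "poverty_level": 1},
    {"household_id": "e", "age": 28, "poverty_level": 2},
]

households = collapse_to_households(people)
balanced = random_oversample(households)
before = Counter(h["poverty_level"] for h in households)
after = Counter(h["poverty_level"] for h in balanced)
print("before oversampling:", dict(before))
print("after oversampling:", dict(after))
```

When oversampling is combined with cross-validation, it is usually applied to the training portion of each fold only, so that duplicated households never appear in both the training and validation data.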

Project Report

We describe our complete approach and results in a full report (see Summary Report.pdf in this repository).

Acknowledgments

Professor: Chenhao Tan

Teaching Assistant: Zander Meitus

Data Source: Inter-American Development Bank data publicly hosted on Kaggle.

Repository: andrewjtdunn/Costa-Rican-Household-Poverty-Level-Prediction
