By Gabe Epstein
In my Introduction to Data Science course, we are "generating a tutorial that will walk users through the entire data science pipeline: data curation, parsing, and management; exploratory data analysis; model building as either hypothesis testing and/or machine learning; and then the curation of a message or messages covering insights learned during the tutorial." This is the repository for my final portfolio! More information about the project and its motivations can be found here: Data Science Final Tutorial.
For my project, I will be analyzing datasets containing various NBA statistics from many seasons. I hope that, through ETL, EDA, and Model Building, I will be able to predict who will be the NBA's Most Improved Player, and perhaps determine what feature or set of features has the most weight in determining the MIP. I will be doing my coding in Python on Google Colaboratory and uploading it here on GitHub. Some of the libraries that will be used include Pandas, NumPy, SQL, Seaborn, and more.
All of the files for my project can be found in this repository, including the datasets used, the Jupyter Notebook in .ipynb format, and any other relevant files I utilize throughout this project. I hope you enjoy!
The project website, hosted on GitHub Pages, can be found here: Final Portfolio Website