ankitpt/Data_Science_Blog_Post

Data Science Blog Post

Table of Contents

  1. Installation
  2. Project Motivation
  3. File Descriptions
  4. Results
  5. Licensing, Authors, and Acknowledgements

Installation

This code runs on Python 3.* with the libraries and versions listed in requirements.txt. Install them with the following command:
pip install -r requirements.txt

Project Motivation

This is a Udacity Nanodegree project. The dataset I chose contains statistics for players in the top 5 soccer leagues across Europe from the 2014-15 season to the 2019-20 season. Apart from attributes like goals, assists, and yellow and red cards, it includes other interesting statistics such as expected goals, expected assists, and key passes.

With the data at hand, I tried to answer the following questions:

  • Which league has the most attacking defenders across Europe's top 5 leagues in the last 6 seasons?
  • Which teams outperform their expected goals measure and which underperform (and in which season)? How does this relate to their league performance in that year? Further, which teams consistently outperformed their expected goals measure over the last 5 seasons?
  • In a particular season and league, which teams were most dependent on a single player for scoring goals? This is a Gini coefficient analysis of the expected goals chain statistic.
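
The second and third questions come down to two small calculations, sketched below in plain Python. This is an illustrative sketch, not the code used in this repository: the field names (`team`, `season`, `goals`, `xG`) and the sample numbers are hypothetical, and the Gini formula is the standard one for sorted non-negative values, which may differ in detail from the analysis in the blog post.

```python
from collections import defaultdict


def xg_difference_by_team(rows):
    """Sum (goals - expected goals) per (team, season).

    Positive totals mean the team scored more than its expected goals.
    `rows` is an iterable of dicts with hypothetical keys:
    "team", "season", "goals", "xG".
    """
    diff = defaultdict(float)
    for row in rows:
        diff[(row["team"], row["season"])] += row["goals"] - row["xG"]
    return dict(diff)


def gini(values):
    """Gini coefficient of non-negative values.

    0 means every player contributes equally; values close to 1 mean
    the contribution is concentrated in a single player.
    """
    xs = sorted(float(v) for v in values)
    if any(x < 0 for x in xs):
        raise ValueError("Gini coefficient requires non-negative values")
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula on ascending data:
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i starting at 1
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Made-up player rows for one hypothetical team and season.
sample_rows = [
    {"team": "Team A", "season": "2019-20", "goals": 10, "xG": 8.5},
    {"team": "Team A", "season": "2019-20", "goals": 3, "xG": 4.0},
]
team_diff = xg_difference_by_team(sample_rows)

# Made-up per-player expected goals chain values for two hypothetical teams.
balanced_team = [5.0, 5.0, 5.0, 5.0]
star_dependent_team = [1.0, 1.0, 1.0, 17.0]
print(team_diff)
print(gini(balanced_team), gini(star_dependent_team))
```

A team whose expected goals chain is spread evenly across its players gets a Gini coefficient near 0, while a team that relies on one player gets a noticeably higher value, so ranking teams by this number within a season and league gives one way to measure dependence on a single player.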

File Descriptions

The data folder contains CSV files with statistics for each player in teams from Europe's top 5 soccer leagues, from 2014-15 to 2019-20.

Results

The main findings of this work can be found on my Medium blog.

Licensing, Authors, and Acknowledgements

The data was obtained from Kaggle. The Gini coefficient analysis is based on ideas presented in this article. I would also like to thank the Udacity Data Scientist Nanodegree instructors for the opportunity to write a blog post on a data science problem.
