This repository hosts the code and files for the tutorial "Data Cleaning: How the pros do the dirty work".
In this tutorial, you'll dive straight into the trenches of transforming messy, real-world data into something pristine and usable.
The easiest way to get started with the tutorial is by clicking on this Binder link:

To run the tutorial locally instead:
- Install Python 3.x and pip
- Create a virtual environment
python3 -m venv data-cleaning-tutorial-env
- Activate the virtual environment
source data-cleaning-tutorial-env/bin/activate
- Install dependencies
pip install -r requirements.txt
- Run JupyterLab from the project root
jupyter-lab
- Open index.ipynb, follow the instructions, and run the code!
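
If the dependencies fail to import inside the notebook, a common cause is that the virtual environment from the steps above was not active when Jupyter was launched. The following is a minimal sketch of a check for that, not part of the tutorial itself. The file name `check_env.py` and the helper `in_virtualenv` are hypothetical, and the check assumes an environment created with `python3 -m venv` as shown above.

```python
# check_env.py (hypothetical helper, not part of the tutorial repository)
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    print("Virtual environment active:", in_virtualenv())
```

Run it with `python check_env.py` after activating the environment. If it reports that no virtual environment is active, repeat the activation step before installing dependencies or starting JupyterLab.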