This repository houses a scraping engine for the UCPD's Incident Report webpage. The data is stored in Google Cloud Platform's Datastore, and the scraper is run using GitHub Actions.
Every weekday morning, the scraper pulls all incidents from the latest reported incident date in the Google Datastore to the current day.
- Ethical Issues of Crime Mapping: Link
- The Maroon Launches UChicago Police Department Incident Reporter: Link
I'd like to thank @kdumais111 and @FedericoDM for their incredible help in getting the scraping architecture in place, as well as @ehabich for adding a bit of testing validation to the project. Thanks, y'all! <3
- Python version: `^3.11.4`
- `uv` version: `0.5.7`
  - Download at: link.
- Census API Key stored in the environment variable: `CENSUS_API_KEY`
- Google Cloud Platform service account, with the location of the `service_account.json` file stored in the environment variable: `GOOGLE_APPLICATION_CREDENTIALS`
- Google Cloud Platform project ID stored in the environment variable: `GOOGLE_CLOUD_PROJECT`
- Google Maps API key stored in the environment variable: `GOOGLE_MAPS_API_KEY`
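For a local run, these variables need to be available in your shell. Below is a minimal sketch of that setup; every value is a placeholder, and the project may instead expect them via a `.env` file or GitHub Actions secrets.

```bash
# Placeholder values: replace each with your own key, path, or project ID.
export CENSUS_API_KEY="your-census-api-key"
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service_account.json"
export GOOGLE_CLOUD_PROJECT="your-gcp-project-id"
export GOOGLE_MAPS_API_KEY="your-google-maps-api-key"
```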
- Any modules should be added via the `uv add [module]` command.
  - Example: `uv add pre-commit`
- `make build-model`: Build a predictive XGBoost model based off of locally saved incident data and save it in the `data` folder.
- `make categorize`: Categorize stored, 'Information' labeled incidents using the locally saved predictive model.
- `make download`: Download all incidents into a locally stored file titled `incident_dump.csv`.
- `make env`: Creates or activates a `uv` virtual environment.
- `make lint`: Runs `pre-commit` on the codebase.
- `make seed`: Save incidents starting from January 1st of 2011 and continuing until today.
- `make update`: Save incidents starting from the most recently saved incident until today.
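The Makefile doesn't prescribe an order for these targets, but a plausible local modeling workflow, assuming `make download` produces the locally saved incident data that `make build-model` reads, might look like this:

```bash
# Hypothetical end-to-end local run using the targets above.
make env          # create or activate the uv virtual environment
make download     # dump all incidents to incident_dump.csv
make build-model  # train the XGBoost model on the locally saved incidents
make categorize   # label stored 'Information' incidents with the saved model
```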