A Twitter bot that tweets about NSW COVID-19 exposure venues, using the Data NSW COVID-19 dataset.
- The logic for scraping from the Data NSW feed is in `covid.py` (see the spider sketch below).
- The logic for saving venues to a database and sending them to Twitter lives in Scrapy pipelines in `pipelines.py`.
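As a rough illustration, here is a minimal sketch of what the spider in `covid.py` could look like. The feed URL, JSON layout, and field names are placeholders, not taken from the real project:

```python
import scrapy

# Placeholder URL; the real Data NSW venue feed lives at a different address.
VENUE_FEED_URL = "https://example.com/nsw-covid-venues.json"


class CovidSpider(scrapy.Spider):
    """Sketch of a spider for the Data NSW COVID-19 venue feed."""

    name = "covid"
    start_urls = [VENUE_FEED_URL]

    def parse(self, response):
        # Assumes the feed is JSON with a list of venue records; the real
        # structure and field names will differ.
        data = response.json()
        for venue in data.get("venues", []):
            yield {
                "venue": venue.get("Venue"),
                "address": venue.get("Address"),
                "suburb": venue.get("Suburb"),
                "date": venue.get("Date"),
                "time": venue.get("Time"),
            }
```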
- Scrape the data from Data NSW's COVID-19 venue API
- Pass each individual venue down the Scrapy pipelines (SQLPipeline & TwitterPipeline)
- Check the database to see if the venue has been seen before. If not, save it to the database and let the item continue down the pipeline (see the SQLPipeline sketch after this list).
- If the venue has already been seen, drop it from the pipeline so it goes no further.
- Once we've finished scraping and all new venues have been saved, activate the TwitterPipeline (see the sketch after this list):
- Check the database for any venues that don't have an associated tweet.
- Collate the new venues by venue name (one tweet per venue, listing all of its times).
- Post an aggregate tweet, then reply to it with one tweet per venue.
- Save the tweets against the venue records in the database.
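A minimal sketch of how the SQLPipeline step described above could work. The SQLite file name, table schema, and item fields are assumptions for illustration:

```python
import sqlite3

from scrapy.exceptions import DropItem


class SQLPipeline:
    """Save unseen venues to the database; drop venues already seen."""

    def open_spider(self, spider):
        # Database path and schema are assumptions for illustration.
        self.conn = sqlite3.connect("venues.db")
        self.conn.execute(
            """CREATE TABLE IF NOT EXISTS venues (
                   venue TEXT, address TEXT, date TEXT, time TEXT, tweet_id TEXT,
                   UNIQUE(venue, date, time)
               )"""
        )

    def close_spider(self, spider):
        self.conn.commit()
        self.conn.close()

    def process_item(self, item, spider):
        seen = self.conn.execute(
            "SELECT 1 FROM venues WHERE venue = ? AND date = ? AND time = ?",
            (item["venue"], item["date"], item["time"]),
        ).fetchone()
        if seen:
            # Already in the database: stop it going further down the pipeline.
            raise DropItem(f"Venue already seen: {item['venue']}")
        self.conn.execute(
            "INSERT INTO venues (venue, address, date, time) VALUES (?, ?, ?, ?)",
            (item["venue"], item["address"], item["date"], item["time"]),
        )
        return item
```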
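And a similarly hedged sketch of the TwitterPipeline step, assuming tweepy for the Twitter API and credentials exposed through Scrapy settings. The setting names, database layout, and tweet wording are assumptions, not the project's actual values:

```python
import sqlite3
from collections import defaultdict

import tweepy


class TwitterPipeline:
    """Tweet new venues once scraping has finished."""

    def close_spider(self, spider):
        conn = sqlite3.connect("venues.db")  # same assumed database as above
        # Venues saved this run that don't yet have an associated tweet.
        rows = conn.execute(
            "SELECT rowid, venue, date, time FROM venues WHERE tweet_id IS NULL"
        ).fetchall()
        if not rows:
            return

        # Collate by venue name so each venue gets one tweet listing its times.
        by_venue = defaultdict(list)
        for rowid, venue, date, time in rows:
            by_venue[venue].append((rowid, date, time))

        # Credential setting names here are assumptions.
        auth = tweepy.OAuthHandler(
            spider.settings["TWITTER_API_KEY"],
            spider.settings["TWITTER_API_SECRET"],
        )
        auth.set_access_token(
            spider.settings["TWITTER_ACCESS_TOKEN"],
            spider.settings["TWITTER_ACCESS_SECRET"],
        )
        api = tweepy.API(auth)

        # One aggregate tweet, then a reply per venue, saving tweet ids as we go.
        summary = api.update_status(f"{len(by_venue)} new NSW exposure venues listed")
        for venue, visits in by_venue.items():
            times = "; ".join(f"{date} {time}" for _, date, time in visits)
            reply = api.update_status(
                status=f"{venue}: {times}",
                in_reply_to_status_id=summary.id,
            )
            for rowid, _, _ in visits:
                conn.execute(
                    "UPDATE venues SET tweet_id = ? WHERE rowid = ?",
                    (str(reply.id), rowid),
                )
        conn.commit()
        conn.close()
```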
- Ensure Python 3.8 is installed.
- Install pipenv: `pip install pipenv`
- Install the dependencies from the root of the repo: `pipenv install`
- Activate the virtualenv: `pipenv shell`
- Set the required environment variables in `settings.py` (see the example fragment below).
- Run the spider: `scrapy crawl covid`
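For reference, a possible `settings.py` fragment wiring up the pipelines and reading the credentials from environment variables; the module paths and setting names are illustrative, not the project's actual values:

```python
# settings.py (fragment) -- module paths and setting names are illustrative.
import os

ITEM_PIPELINES = {
    "covid_venues.pipelines.SQLPipeline": 300,
    "covid_venues.pipelines.TwitterPipeline": 400,
}

# Twitter credentials pulled from environment variables, which must be set
# before running the spider.
TWITTER_API_KEY = os.environ["TWITTER_API_KEY"]
TWITTER_API_SECRET = os.environ["TWITTER_API_SECRET"]
TWITTER_ACCESS_TOKEN = os.environ["TWITTER_ACCESS_TOKEN"]
TWITTER_ACCESS_SECRET = os.environ["TWITTER_ACCESS_SECRET"]
```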