An extremely customizable web scraper with a modular notification system and persistent storage via SQLite.
### Features
- Modular notification system
  - Currently supports webhooks (e.g. Discord, Slack, etc.) and ntfy.sh
- Web scraping via Selenium
- Simple configuration of multiple scrapers with conditional notifications
### Prerequisites
- A browser
  - Most Chromium-based and Firefox-based browsers should work
    - Edge is not recommended
  - Selenium should also be able to download and cache the appropriate browser if necessary
### Configuration
- Configuration for the web scraper is handled through a TOML file
  - For an example configuration, see `config.example.toml`
    - This can be copied to `config.toml` and edited to suit your needs; a sketch of what an entry might look like follows this list
  - To get the CSS selector for an element, you can use your browser's developer tools (`F12`, `Ctrl+Shift+I`, right-click -> Inspect Element, etc.)
    - If you're not already in inspect element mode, you can press `Ctrl+Shift+C` to enter it (or just click the inspect button in the developer tools)
    - Click on the element you want to select
    - Right-click on the element in the HTML pane
    - Click "Copy" -> "Copy selector"
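As a rough illustration, a scraper entry in `config.toml` might look something like the sketch below. The table and key names here are illustrative guesses, not the project's actual schema; `config.example.toml` is the authoritative reference.

```toml
# Hypothetical sketch: table and key names are illustrative, not the real
# schema. See config.example.toml in the repository for the actual format.
[[scrapers]]
url = "https://example.com/product"   # page to scrape
css_selector = "#price > .value"      # selector copied via "Copy selector"
interval = 300                        # assumed: seconds between scrapes

# The README says webhooks (Discord, Slack, etc.) and ntfy.sh are supported;
# how a notifier is declared is likewise a guess here.
[[notifiers]]
type = "ntfy"
topic = "price-alerts"
```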
- Some other configuration is handled through environment variables and/or command-line arguments (run with `--help` for more information)
  - For example, to set the path to the configuration file, you can set the `PATH_TO_TOML` environment variable or use the `--path-to-toml` command-line argument, as shown below
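For instance, both of the following invocations point the scraper at a config file in a non-default location (the path is only an example):

```sh
# Via the environment variable
PATH_TO_TOML=/path/to/config.toml scrape-and-ntfy

# Or via the equivalent command-line argument
scrape-and-ntfy --path-to-toml /path/to/config.toml
```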
### Docker
#### Prerequisites
- Docker
  - Docker is a platform for developing, shipping, and running applications in containers
- Docker Compose

#### Usage
- Clone the repository: `git clone https://github.com/slashtechno/scrape-and-ntfy`
- Change directory into the repository: `cd scrape-and-ntfy`
- Configure via `config.toml`
  - Optionally, you can configure some other options via environment variables or command-line arguments in the `docker-compose.yml` file (see the sketch after these steps)
- Run `docker compose up -d`
  - The `-d` flag runs the containers in the background
  - If you want, you can run `sqlite-web` by uncommenting the appropriate lines in `docker-compose.yml` to view the database in a browser on `localhost:5050`
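As a sketch of what setting an environment variable in `docker-compose.yml` might look like (the service name, paths, and other keys are assumptions; only `PATH_TO_TOML` comes from this README, so check the compose file in the repository for the real layout):

```yaml
# Hypothetical excerpt of docker-compose.yml; names and paths are illustrative.
services:
  scrape-and-ntfy:
    environment:
      # Point the scraper at the mounted config file
      PATH_TO_TOML: /app/config.toml
    volumes:
      - ./config.toml:/app/config.toml
```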
### Install with pip
#### Prerequisites
- Python (3.11+)

#### Usage
- Install with `pip`: `pip install scrape-and-ntfy`
  - Depending on your system, you may need to use `pip3` instead of `pip`, or `python3 -m pip`/`python -m pip`
- Configure via `config.toml`
- Run `scrape-and-ntfy`
  - This assumes `pip`-installed scripts are in your `PATH`
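Putting those steps together in a shell:

```sh
# Install (pip3 / python3 -m pip may be needed instead, depending on your system)
pip install scrape-and-ntfy

# Run; this assumes pip-installed scripts are on your PATH
scrape-and-ntfy
```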
### Run from source
#### Prerequisites
- Python (3.11+)
- PDM

#### Usage
- Clone the repository: `git clone https://github.com/slashtechno/scrape-and-ntfy`
- Change directory into the repository: `cd scrape-and-ntfy`
- Run `pdm install`
  - This will install the dependencies in a virtual environment
  - You may need to specify an interpreter with `pdm use`
- Configure via `config.toml`
- Run `pdm run python -m scrape_and_ntfy`
  - This will run the bot with the configuration in `config.toml`
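The full sequence, ready to copy:

```sh
git clone https://github.com/slashtechno/scrape-and-ntfy
cd scrape-and-ntfy
pdm install    # installs dependencies into a virtual environment
# Edit config.toml to suit your needs, then run the bot:
pdm run python -m scrape_and_ntfy
```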