DETIKNewsScraper
=====================
A web scraper for detik.com that lets users search for news articles by keyword, choose how many result pages to scrape, and export the results.
You can try the live demo at: https://detiknewsscraper.streamlit.app/
- Search news articles by keywords
- Specify the total number of pages to scrape
- Export results to CSV, JSON, or XLSX files
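As a rough illustration of the export feature, the sketch below serializes a list of scraped articles to CSV and JSON text using only the standard library. The record fields (`title`, `url`) and the function names `to_csv` and `to_json` are assumptions made for this example, not names taken from DETIKScraper.py, and XLSX is left out because it needs a third-party writer.

```python
import csv
import io
import json


def to_csv(articles):
    """Serialize a list of article dicts to CSV text (hypothetical helper)."""
    if not articles:
        return ""
    buffer = io.StringIO()
    # Use the keys of the first record as the column order.
    writer = csv.DictWriter(buffer, fieldnames=list(articles[0].keys()))
    writer.writeheader()
    writer.writerows(articles)
    return buffer.getvalue()


def to_json(articles):
    """Serialize a list of article dicts to JSON text (hypothetical helper)."""
    # ensure_ascii=False keeps non-ASCII characters in headlines readable.
    return json.dumps(articles, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    # Illustrative records only; real field names may differ.
    sample = [{"title": "Contoh berita", "url": "https://example.com/berita"}]
    print(to_csv(sample))
    print(to_json(sample))
```

Returning strings rather than writing files directly makes it straightforward to hand the result to a download widget or to write it to disk later.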
- Python 3.x
- HTTPX
- selectolax
- Streamlit
- Install dependencies:
python -m pip install httpx selectolax streamlit
- Clone the repository:
git clone https://github.com/karvanpy/DETIKNewsScraper
- Run the scraper from inside the cloned directory:
streamlit run DETIKScraper.py
- On Termux (Android), install dependencies and build tools:
pkg install python build-essential cmake ninja libopenblas libandroid-execinfo patchelf binutils-is-llvm
- Install tools for building Python projects:
pip3 install setuptools wheel packaging pyproject_metadata cython meson-python versioneer
- Install pyarrow and pillow:
pkg install python-pyarrow python-pillow
- Install httpx, selectolax, and streamlit:
pip3 install httpx selectolax streamlit
- Clone the repository:
git clone https://github.com/karvanpy/DETIKNewsScraper
- Run the scraper from inside the cloned directory:
streamlit run DETIKScraper.py
- Run the scraper using the command:
streamlit run DETIKScraper.py
- Enter your search keyword and select the total number of pages to scrape
- Choose the export format (CSV, JSON, or XLSX)
- Click the "Scrape" button to start the scraping process
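To show how a keyword and a page count could translate into requests, here is a minimal sketch that builds one search URL per page. The endpoint `SEARCH_URL` and the `query` and `page` parameter names are assumptions for illustration; the values actually used by DETIKScraper.py may differ, so check the source before relying on them.

```python
from urllib.parse import urlencode

# Assumed search endpoint; not confirmed against DETIKScraper.py.
SEARCH_URL = "https://www.detik.com/search/searchall"


def build_page_urls(keyword, total_pages):
    """Return one search URL per result page (hypothetical helper)."""
    if total_pages < 1:
        raise ValueError("total_pages must be at least 1")
    urls = []
    for page in range(1, total_pages + 1):
        # urlencode escapes spaces and special characters in the keyword.
        params = urlencode({"query": keyword, "page": page})
        urls.append(f"{SEARCH_URL}?{params}")
    return urls


if __name__ == "__main__":
    for url in build_page_urls("pemilu 2024", 3):
        print(url)
```

Each URL could then be fetched with HTTPX and parsed with selectolax, as listed in the requirements.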