GitHub - liakdimi1/yellow-pages-scraper

Yellow Pages Scraper This Scrapy spider allows you to scrape business information from Yellow Pages website based on different cities and categories.

Installation Clone the repository: bash Copy code git clone https://github.com/liakdimi1/yellow-pages-scraper.git Install the required dependencies: Copy code pip install scrapy pandas Usage Update the cities and categories variables in the CrawlingSpider class of yellow_spider.py file to specify the desired cities and categories for scraping.

Run the spider using the following command: scrapy crawl yellow_spider Copy code scrapy crawl yellow_spider -o output.csv This will start the scraping process and save the extracted data to the output.csv file.

Customization If you want to modify the data fields being scraped or the structure of the output file, you can edit the parse_item method in the CrawlingSpider class.

Additional customization options, such as handling pagination and handling different types of data, can be implemented in the spider according to your specific requirements.

Contributing Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.

License This project is licensed under the MIT License.

Feel free to update the sections according to your specific project details and requirements.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

About

Releases

Packages

License

liakdimi1/yellow-pages-scraper

Folders and files

Latest commit

History

Repository files navigation

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Packages