Skip to content

liakdimi1/yellow-pages-scraper

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 

Repository files navigation

Yellow Pages Scraper This Scrapy spider allows you to scrape business information from Yellow Pages website based on different cities and categories.

Installation Clone the repository: bash Copy code git clone https://github.com/liakdimi1/yellow-pages-scraper.git Install the required dependencies: Copy code pip install scrapy pandas Usage Update the cities and categories variables in the CrawlingSpider class of yellow_spider.py file to specify the desired cities and categories for scraping.

Run the spider using the following command: scrapy crawl yellow_spider Copy code scrapy crawl yellow_spider -o output.csv This will start the scraping process and save the extracted data to the output.csv file.

Customization If you want to modify the data fields being scraped or the structure of the output file, you can edit the parse_item method in the CrawlingSpider class.

Additional customization options, such as handling pagination and handling different types of data, can be implemented in the spider according to your specific requirements.

Contributing Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.

License This project is licensed under the MIT License.

Feel free to update the sections according to your specific project details and requirements.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published