Web Scraper to Collect Google Patents Images Using Python and Selenium

narayan123411/WebScrapping

Google Patents Image Scraper

Introduction

The Google Patents Image Scraper is a Python tool that automates downloading patent images from Google Patents. It supports both GUI and non-GUI (command-line) operation within a single application, so it suits users who prefer a visual interface as well as those who want the efficiency of command-line execution.

Features

  • Unified GUI and Non-GUI Functionality: Seamlessly switch between a Graphical User Interface and a Command-Line Interface within the same application.

  • Robust Error Handling: The tool catches exceptions raised during scraping and retries the failed operation, so a single error does not end the session.

  • Progress Summary Generation: After completion, the tool generates a summary of the scraping session, detailing the number of images successfully downloaded, any skipped or failed attempts, and other relevant metrics.
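
The retry and summary features listed above could look roughly like the following. This is a minimal sketch, not the code from Automation.ipynb: the names `SessionSummary`, `with_retries`, and `download_all` are hypothetical, and the convention that a task returns `True` when it downloads an image and `False` when it skips one is an assumption made for illustration.

```python
import time
from dataclasses import dataclass, field


@dataclass
class SessionSummary:
    """Counters for one scraping session (hypothetical structure)."""
    downloaded: int = 0
    skipped: int = 0
    failed: int = 0
    errors: list = field(default_factory=list)

    def report(self) -> str:
        total = self.downloaded + self.skipped + self.failed
        return (f"Processed {total} images: {self.downloaded} downloaded, "
                f"{self.skipped} skipped, {self.failed} failed")


def with_retries(action, attempts=3, delay=0.0):
    """Call action() up to `attempts` times and re-raise the last error."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as exc:  # deliberately broad: this is only a sketch
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)  # wait before the next attempt
    raise last_error


def download_all(tasks, summary, attempts=3):
    """Run each (name, action) pair with retries and record the outcome.

    Assumed convention: action() returns True if an image was downloaded
    and False if it was skipped (for example, because it already exists).
    """
    for name, action in tasks:
        try:
            if with_retries(action, attempts=attempts):
                summary.downloaded += 1
            else:
                summary.skipped += 1
        except Exception as exc:
            summary.failed += 1
            summary.errors.append(f"{name}: {exc}")
    return summary
```

Calling `summary.report()` after `download_all(...)` gives a one-line session summary, and `summary.errors` keeps the messages for any tasks that still failed after all retries.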

How It Works

The tool works in the following way:

  1. Upon launch, users can choose to operate in GUI mode or proceed with non-GUI mode.
  2. The scraper navigates to Google Patents, searches for specified patents, and retrieves image URLs.
  3. Images are then downloaded and saved to a specified directory.
  4. Throughout the scraping process, progress updates are displayed, either within the GUI or in the command line.
  5. A summary report is provided at the end, summarizing the session's outcomes.
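
As an illustration of steps 1, 3, and 4, the sketch below shows one way to select the mode from the command line and to save a list of image URLs while printing progress. It rests on several assumptions: the `--gui` and `--out` flags, the `parse_args` and `save_images` names, and the file-naming scheme are all hypothetical, and the Selenium step that collects the image URLs is left out because the source does not describe it. The `fetch` parameter is any callable that takes a URL and returns the image bytes.

```python
import argparse
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def parse_args(argv=None):
    """Parse hypothetical command-line options for the scraper."""
    parser = argparse.ArgumentParser(description="Patent image scraper sketch")
    parser.add_argument("patents", nargs="*", help="patent numbers to scrape")
    parser.add_argument("--gui", action="store_true",
                        help="start in GUI mode (hypothetical flag)")
    parser.add_argument("--out", default="images", help="output directory")
    return parser.parse_args(argv)


def save_images(patent, urls, out_dir, fetch, report=print):
    """Fetch each URL and write it to out_dir/<patent>/, reporting progress.

    fetch:  callable taking a URL and returning bytes (injected by the caller).
    report: callable receiving progress messages, e.g. print or a GUI callback.
    """
    target = Path(out_dir) / patent
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, url in enumerate(urls, start=1):
        # Reuse the extension from the URL path, falling back to .png.
        suffix = PurePosixPath(urlparse(url).path).suffix or ".png"
        path = target / f"{patent}_{index:03d}{suffix}"
        path.write_bytes(fetch(url))
        saved.append(path)
        report(f"[{patent}] saved {index}/{len(urls)}: {path.name}")
    return saved
```

Passing `report` as a parameter lets the same function feed either the command line (`print`) or a GUI widget. For `fetch`, something like `lambda url: urllib.request.urlopen(url).read()` from the standard library would work in a quick experiment.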

Getting Started

Follow these steps to use the Google Patents Image Scraper:

  1. Clone this repository to your local machine.
  2. Install the required dependencies listed in requirements.txt (for example, with pip install -r requirements.txt).
  3. Run the scraper notebook, Automation.ipynb.
  4. Provide the patent numbers as input and start the scraping process.
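
For the patent-number input in the last step, a small helper can tidy free-form input before scraping. The sketch below is an assumption about how input might be prepared, not the notebook's actual logic: `parse_patent_numbers` and `patent_url` are hypothetical names, each number is expected to be a single token with no spaces or commas inside it, and the URL pattern is the one used by public Google Patents pages (the notebook may use the search box instead).

```python
import re

# Assumption: public Google Patents pages follow this URL pattern.
PATENT_URL = "https://patents.google.com/patent/{number}"


def parse_patent_numbers(text):
    """Split input on commas or whitespace, normalise, and drop duplicates.

    Each token is upper-cased and stripped of punctuation such as hyphens,
    and the original order of first appearance is preserved.
    """
    seen = set()
    numbers = []
    for token in re.split(r"[,\s]+", text.strip()):
        number = re.sub(r"[^A-Za-z0-9]", "", token).upper()
        if number and number not in seen:
            seen.add(number)
            numbers.append(number)
    return numbers


def patent_url(number):
    """Build the page URL for one normalised patent number."""
    return PATENT_URL.format(number=number)
```

For example, `parse_patent_numbers("US7654321B2, us7654321b2")` yields a single entry, since the two tokens differ only in case.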

Contribution

Feedback and contributions are highly appreciated. If you'd like to contribute or suggest improvements, please fork the repository, push your changes, and create a pull request. For larger changes or feature suggestions, please open an issue first to discuss what you would like to change.

Please star the project if you find it helpful!
