detectAi: Image Text Extraction with Tesseract OCR

Description

This Python script processes a folder of images, extracts text using Tesseract OCR, and matches the extracted text against specified regex patterns. It is designed to handle batch processing of images and identifies images that contain text matching the given patterns.
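The flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of detecAi.py: the function names (`list_images`, `ocr_image`, `batches`, `find_matches`), the set of image suffixes, and the use of the `pytesseract` wrapper are all assumptions made for the example.

```python
import re
from pathlib import Path
from typing import Callable, Iterable, Iterator, List

# Assumed set of image types; the real script may accept others.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tiff"}


def list_images(folder: Path) -> List[Path]:
    """Return the image files directly inside `folder`, sorted by name."""
    return sorted(p for p in folder.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)


def ocr_image(path: Path) -> str:
    """Extract text from one image (assumes pytesseract and Pillow are installed)."""
    # Imported lazily so the rest of the sketch runs without these packages.
    import pytesseract
    from PIL import Image

    with Image.open(path) as img:
        return pytesseract.image_to_string(img)


def batches(items: list, size: int) -> Iterator[list]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def find_matches(
    paths: Iterable[Path],
    patterns: Iterable[str],
    batch_size: int = 25,
    ocr: Callable[[Path], str] = ocr_image,
) -> List[str]:
    """Return the names of images whose OCR text matches any pattern."""
    compiled = [re.compile(p) for p in patterns]
    matched = []
    for batch in batches(list(paths), batch_size):
        for path in batch:
            text = ocr(path)
            if any(rx.search(text) for rx in compiled):
                matched.append(path.name)
    return matched
```

Passing the OCR step in as a parameter is a design choice of this sketch; it lets the matching logic be tried with a stand-in function when Tesseract is not available.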

Installation

Prerequisites

Python 3.x
Tesseract OCR installed on your system
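One way to confirm both prerequisites before running the script is a small check like the one below. The helper name `check_prerequisites` is hypothetical, and the check assumes the Tesseract executable is named `tesseract` and is expected on the PATH.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 0)):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted} or newer is required")
    # Assumption: the OCR engine is reachable as `tesseract` on the PATH.
    if shutil.which("tesseract") is None:
        problems.append("tesseract executable not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing prerequisite:", problem)
```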

Dependencies

Install the required Python libraries using:

pip install -r requirements.txt

Tesseract OCR

Ensure Tesseract OCR is installed on your system. Installation instructions can be found at Tesseract's GitHub repository.

Usage

Run the script with the following command:

python detecAi.py -f [folder_path] -mr [regex_patterns] -bs [batch_size] -o [output_file]

-f/--folder: Path to the folder containing images.
-mr/--regex: List of regex patterns to search in the text.
-bs/--batch-size: Number of images to process in each batch (default: 25).
-o/--output-file: Output file to save the names of matched images (default: matched_images.txt).
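For reference, the options above could be declared with `argparse` along these lines. This is a sketch under assumptions: the function name `build_parser` is hypothetical, and marking `--folder` and `--regex` as required is a guess, since the README does not say how the script treats missing values.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented command-line options."""
    parser = argparse.ArgumentParser(
        description="Find images whose OCR text matches regex patterns."
    )
    # Assumption: folder and patterns are mandatory.
    parser.add_argument("-f", "--folder", required=True,
                        help="Path to the folder containing images.")
    parser.add_argument("-mr", "--regex", nargs="+", required=True,
                        help="One or more regex patterns to search in the text.")
    parser.add_argument("-bs", "--batch-size", type=int, default=25,
                        help="Number of images to process in each batch.")
    parser.add_argument("-o", "--output-file", default="matched_images.txt",
                        help="File to save the names of matched images.")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.folder, args.regex, args.batch_size, args.output_file)
```

With `nargs="+"`, several patterns can follow a single `-mr` flag, separated by spaces.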

Example

python detecAi.py -f ./images -mr "\\d{3}-\\d{2}-\\d{4}" -bs 10 -o results.txt
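In a POSIX shell such as bash, each `\\` inside double quotes reaches the script as a single backslash, so the pattern above arrives as `\d{3}-\d{2}-\d{4}`: three digits, a hyphen, two digits, a hyphen, and four digits. The snippet below illustrates what that pattern does and does not match; the helper `matching_lines` and the sample strings are made up for the example.

```python
import re

# The pattern from the example command, as Python would receive it from bash.
PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")


def matching_lines(lines):
    """Return the lines that contain at least one match for PATTERN."""
    return [line for line in lines if PATTERN.search(line)]


if __name__ == "__main__":
    samples = ["Ref 123-45-6789", "Phone 555-1234", "no digits here"]
    print(matching_lines(samples))
```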

Tutorial

https://youtu.be/W-riZ-_lO0Q?si=2AHpVmdljpTsm4Tr

Contributing

Contributions to this project are welcome. Please fork the repository and open a pull request with your changes or suggestions.

Acknowledgments

Tesseract OCR, for the OCR engine.
Pillow, for image processing capabilities.
