Here are 2,008 public repositories matching this topic...
Awesome multilingual OCR toolkits based on PaddlePaddle (a practical, ultra-lightweight OCR system that supports recognition of 80+ languages, provides data annotation and synthesis tools, and supports training and deployment across server, mobile, embedded and IoT devices)
Updated Jul 17, 2024 · Python
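OCR software, free and offline. Open-source, free offline OCR software. Supports screenshots and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning and generating QR codes. Includes built-in multi-language libraries.
Updated Jul 13, 2024 · Python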
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, and Cyrillic.
Updated Jul 16, 2024 · Python
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
Updated Jul 17, 2024 · Python
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Updated Jul 16, 2024 · Python
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Updated Jul 17, 2024 · Python
pix2tex: Using a ViT to convert images of equations into LaTeX code.
Updated Jul 5, 2024 · Python
Updated Jul 5, 2024 · Python
Scan, index, and archive all of your paper documents
Updated Apr 6, 2021 · Python
Updated Aug 29, 2022 · Python
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
Updated Jul 11, 2024 · Python
A supercharged version of paperless: scan, index and archive all your physical documents
Updated Feb 14, 2023 · Python
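Extracts hard-coded subtitles from videos and generates srt files. No third-party API is required; text recognition runs locally. A deep-learning-based video subtitle extraction framework that includes subtitle region detection and subtitle content extraction. A GUI tool for extracting hard-coded subtitles (hardsub) from videos and generating srt files.
Updated Feb 21, 2024 · Python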
Updated Jul 15, 2024 · Python
A Unified Toolkit for Deep Learning Based Document Image Analysis
Updated Mar 7, 2024 · Python
Low-code development tool based on PaddlePaddle
Updated Jul 16, 2024 · Python
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Updated Jul 15, 2024 · Python
OpenMMLab Text Detection, Recognition and Understanding Toolbox
Updated Jul 15, 2024 · Python
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
Updated Jul 16, 2024 · Python
Text detection based mainly on the CTPN (Connectionist Text Proposal Network) model in TensorFlow, including ID card detection
Updated Oct 3, 2023 · Python