SamKenX applications and Document AI, the end-to-end document processing platform on Cloudstorage warehouse.
Updated Mar 25, 2023 - Python
Extracting Data from Document PDF and Converting to EDI211 Files Using GCP and Google Document AI
An OCR-free Visual Document Embedding Model Based on MiniCPM-V-2.0
[Paper] Code for the EMNLP 2023 (Findings) paper "Global Structure Knowledge-Guided Relation Extraction Method for Visually-Rich Document"
Document AI Toolbox is an SDK for Python that provides utility functions for managing, manipulating, and extracting information from the document response. It creates a "wrapped" document object from JSON files in Cloud Storage, local JSON files, or output directly from the Document AI API.
An unofficial PyTorch implementation of "Lin et al. ViBERTgrid: A Jointly Trained Multi-Modal 2D Document Representation for Key Information Extraction from Documents. ICDAR, 2021"
SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images (AAAI 2023)
Official Implementation of Web-based Visual Corpus Builder (Webvicob), ICDAR 2023
Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)
A Repo For Document AI
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities