TableDigitalizer

This project improves on the algorithms released with the paper "Multi-Type-TD-TSR: Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition" and its accompanying source code.

The original algorithm cannot handle cells that span multiple columns or rows. The algorithm proposed here has been optimised for column spanning, as the following examples show.

Example 1

Example 2

Example 3
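To make the idea of column spanning concrete, here is a minimal sketch of how a cell's span could be derived once the column separators are known. It is an illustration only, not the algorithm implemented in self.py: the function name column_span, its parameters, and the pixel tolerance tol are all assumptions introduced for this example.

```python
from bisect import bisect_left, bisect_right


def column_span(cell_left, cell_right, col_edges, tol=2):
    """Return (first_col, n_cols) for a cell (hypothetical helper).

    col_edges: sorted x-coordinates of the vertical separators, including
    the table's left and right borders, so there are len(col_edges) - 1
    columns.
    tol: pixel slack (an assumed value) so a cell that ends slightly past
    a separator is not counted as entering the neighbouring column.
    """
    # Column containing the point just inside the cell's left side.
    first = bisect_right(col_edges, cell_left + tol) - 1
    # Column containing the point just inside the cell's right side.
    last = bisect_left(col_edges, cell_right - tol) - 1
    # Clamp to valid column indices and keep last >= first.
    first = max(first, 0)
    last = min(max(last, first), len(col_edges) - 2)
    return first, last - first + 1


edges = [0, 100, 200, 300]            # three columns
single = column_span(0, 100, edges)   # fits in column 0
merged = column_span(0, 201, edges)   # covers columns 0 and 1
full = column_span(0, 300, edges)     # covers the whole row
```

A cell whose span is greater than 1 would then be emitted as a merged cell rather than split at every separator. Row spanning could be handled the same way using the horizontal separators.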

The code also proposes an algorithm for color invariance. During experimentation, it was found to be more robust than the one proposed in the original paper.
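As a rough illustration of what color invariance means here, the sketch below normalises an image so that text is always dark on a light background, whatever the original colors. It is not the algorithm from self.py or from the paper. The function name to_dark_on_light, the midpoint threshold, and the assumption that the background covers more pixels than the text are all choices made for this example. It uses plain Python lists and expects a non-empty image.

```python
def to_dark_on_light(rgb_rows):
    """Binarise an RGB image so text is 0 and background is 255.

    rgb_rows: non-empty list of rows, each a list of (r, g, b) tuples
    with values in 0..255. Hypothetical helper for illustration only.
    """
    # ITU-R BT.601 luma weights, a common RGB-to-grey conversion.
    grey = [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row]
            for row in rgb_rows]
    flat = [v for row in grey for v in row]
    # Simple threshold: midpoint between darkest and brightest pixel.
    threshold = (min(flat) + max(flat)) / 2
    bright = [[v > threshold for v in row] for row in grey]
    # Assumption: the background covers at least half of the pixels,
    # so the majority class decides which side is the background.
    bright_count = sum(v for row in bright for v in row)
    background_is_bright = bright_count * 2 >= len(flat)
    # Background pixels become 255, everything else (text) becomes 0.
    return [[255 if is_bright == background_is_bright else 0
             for is_bright in row]
            for row in bright]


blue, white = (0, 0, 128), (255, 255, 255)
yellow, black = (255, 255, 0), (0, 0, 0)
light_on_dark = [[blue, white, blue], [blue, blue, blue]]
dark_on_light = [[yellow, black, yellow], [yellow, yellow, yellow]]
out_a = to_dark_on_light(light_on_dark)
out_b = to_dark_on_light(dark_on_light)
```

Both inputs produce the same output, so later stages such as line detection only need to handle one polarity.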

Web Application

A Flask-based web application is also included. After cloning the repository, start it by running python tsr.py

The original algorithms reside in author.py; the new, tailored ones are in self.py.

