SegmEdit lets you browse and edit XML files (in TrueViz format) that describe the structure of PDF documents (words, lines, zones) and the classification of zones (title, author, abstract, etc.). One component of the solution is a server responsible for distributing the documents to be processed.
SegmEdit was created to build a test suite for page segmentation and zone classification algorithms, which are part of a metadata extraction framework developed at CeON.
It is open source software, written in Python using the wxWidgets library and the ImageMagick software suite.
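To give a rough idea of the zone, line and word hierarchy mentioned above, here is a minimal sketch that reads such a structure with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`Page`, `Zone`, `Line`, `Word`, `label`, `text`) and the `zones_with_text` helper are assumptions made for illustration only; they are a simplified stand-in and not the actual TrueViz schema or SegmEdit's own code.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified sample: element and attribute names are
# assumptions for illustration, not the exact TrueViz schema.
SAMPLE = """\
<Page>
  <Zone label="title">
    <Line>
      <Word text="Page"/>
      <Word text="Segmentation"/>
    </Line>
  </Zone>
  <Zone label="author">
    <Line>
      <Word text="Jane"/>
      <Word text="Doe"/>
    </Line>
  </Zone>
</Page>
"""


def zones_with_text(xml_text):
    """Return a list of (label, text) pairs, one per zone."""
    root = ET.fromstring(xml_text)
    result = []
    for zone in root.iter("Zone"):
        lines = []
        for line in zone.iter("Line"):
            # Join the words of each line with single spaces.
            words = [w.get("text", "") for w in line.iter("Word")]
            lines.append(" ".join(words))
        # Lines within a zone are separated by newlines.
        result.append((zone.get("label", "unknown"), "\n".join(lines)))
    return result


if __name__ == "__main__":
    for label, text in zones_with_text(SAMPLE):
        print(f"{label}: {text}")
```

A real TrueViz file carries more detail per element (for example bounding boxes), so the sketch would need to be adapted to the real schema before use.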
- Krzysztof Rusek
- Artur Czeczko
See files: