Releases: knaw-huc/loghi

v2.1.4

17 Sep 19:27

Preserve custom attribute for TextLine

v2.1.3

17 Sep 14:58

loghi-htr:

make it a bit easier to implement custom models that exceed what VGSL can express (e.g. use a Docker mapping to override custom_model.py)

improve plots of loss/CER

Loghi tooling

added MinionFixPageXML. This Minion takes a directory of PageXML files as input, reads them, and writes them back while correcting some known errors.

improved cutting of TextLines in MinionCutFromImageBasedOnPageXMLNew:

  • use textline coords when baseline is missing

MinionGeneratePageImages:

  • random thickness lines

MinionConvertPageToTxt:

  • add no-overwrite

overall:

  • improved memory usage
  • opencv to 4.10.0
  • some refactoring

laypa

to be added

loghi general

  • Make Gradio demo compatible with loghi-htr 2.1.x. After loghi-htr switched to ASGI (loghi-htr commit d0e9a87), the Gradio demo stopped working with Docker. This updates the compose file to follow the changes made to the loghi-htr scripts.
  • add pipefail to scripts
  • easier changing of outputdir during training

v2.0.4

03 May 13:34

Release Notes for Loghi Version 2.0.4

Date: 2024-04-29

Overview

A small bug-fix update that resolves a problem with Laypa's training not working.

Submodule-Specific Updates

Laypa Version 2.0.4


Full Changelog for Loghi Repository: 2.0.3...2.0.4

v2.0.3

26 Apr 07:35

Release Notes for Loghi Version 2.0.3

Date: 2024-04-26

Overview

This release introduces enhancements across two submodules: Loghi-HTR and Laypa. Key updates include improved error handling in Loghi-HTR and new features and fixes in Laypa. Additionally, this version marks the introduction of a new Gradio demo in the Loghi repository, facilitating interactive demonstrations of the software.

Submodule-Specific Updates

Loghi-HTR Version 2.0.3

  • Improved Error Handling: Added specific error reporting with a ValueError raised when text partitions contain no valid lines, improving clarity for users working with large text datasets.
  • Docker image: docker pull loghi/docker.htr:2.0.3
  • Full Release Notes for Loghi-HTR
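
The error handling described in the first bullet above can be illustrated with a minimal sketch. It assumes a partition is a list of (image path, transcription) pairs. The function name `load_partition`, the data layout, and the error message are hypothetical and are not taken from the Loghi-HTR code base.

```python
from typing import List, Tuple


def load_partition(name: str, lines: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep only lines with a non-empty transcription; fail loudly if none remain."""
    # Drop lines whose transcription is empty after stripping whitespace.
    valid = [(path, text.strip()) for path, text in lines if text.strip()]
    if not valid:
        # Hypothetical message; the actual wording in Loghi-HTR may differ.
        raise ValueError(f"Partition '{name}' contains no valid lines")
    return valid
```

Raising early with the partition name in the message points the user at the offending input file instead of failing later, deep inside training.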

Laypa Version 2.0.3

  • New Augmentations: Introduces Invert, JPEG compression, Noise, and Hue transformations.
  • Enhancements in Data Processing: Includes fixes for image preprocessing and optimization in augmentations using np.uint8.
  • Docker image: docker pull loghi/docker.laypa:2.0.3
  • Full Release Notes for Laypa

New Feature in Main Repository

  • Gradio Demo: A new interactive demo is now available, demonstrating the capabilities of Loghi software. For details and usage, visit the gradio directory in the main repository.

Full Changelog for Loghi Repository: 2.0.2...2.0.3

v2.0.2

26 Apr 07:24

Release Notes for Loghi Version 2.0.2

Date: 2024-04-18

Overview

This release includes updates for two of the three Loghi submodules: Loghi-HTR and Laypa. It brings improvements in system performance, API reliability, memory management, and security for Loghi-HTR, along with bug fixes and usability enhancements in Laypa. There are no updates for Loghi Tooling in this version.

Submodule-Specific Updates

Loghi-HTR Version 2.0.2

  • Enhanced TensorFlow strategy automation and API error handling.
  • Significant improvements in memory management and security.
  • Docker image: docker pull loghi/docker.htr:2.0.2
  • Full Release Notes for Loghi-HTR

Laypa Version 2.0.2

  • Configuration path issues fixed and improvements in augmentation visualization.
  • Addressed issues with image preprocessing settings.
  • Docker image: docker pull loghi/docker.laypa:2.0.2
  • Full Release Notes for Laypa

Full Changelog for Loghi Repository: 2.0.1...2.0.2

v2.0.1

18 Apr 10:25

Release Notes for Loghi Version 2.0.1

Date: 2024-04-10

Overview: Loghi-HTR

Version 2.0.1 of Loghi-HTR fixes a bug in the CTC loss calculation and updates the README to guide users on essential model configurations.

Major Updates

  • CTC Loss Calculation Bug Fix: Addressed an issue affecting the accuracy of the CTC loss calculation under specific dataset and batch size conditions, ensuring more reliable model training outcomes.

For details, see:
https://github.com/knaw-huc/loghi-htr/releases/tag/2.0.1

v2.0.0

08 Apr 15:19

Release Notes for Loghi Version 2.0.0

Date: 2024-04-04

Overview

This release marks a significant milestone for the Loghi repository, with major updates across all submodules: Loghi-HTR, Laypa, and Loghi-tooling. Version 2.0.0 focuses on performance, user experience, and system efficiency, with optimized data processing, improved visualization tools, and a more intuitive user interface for handwritten text recognition and related tasks.

Highlights of the Release

  • Enhanced User Experience: The repository has been reorganized with a more intuitive folder structure, ensuring easier navigation and access to resources.
  • Script Improvements: Introduction of more intuitive script names within a new scripts directory, facilitating easier and more understandable interactions with the software.
  • Web Service Enhancements: Improved web service scripts offer better reliability and performance when using Loghi's services.
  • Comprehensive Documentation: Updated README.md files across most directories provide detailed guidance, making it easier for new users to get started and for existing users to explore new features.

Submodule-Specific Updates

Loghi-HTR Version 2.0.0

  • Modular Code Structure for improved maintainability.
  • API v2 with enhanced performance and efficiency.
  • Custom Learning Rate Schedule and GPU Handling Refinements.
  • Significant enhancements in visualization, data loading, and augmentation processes.

Full release notes for Loghi-HTR can be found here.

Laypa Version 2.0.0

  • DPI adjustment for automatic image size modification based on DPI.
  • CPU fallback mechanism to ensure reliability under resource constraints.
  • AugInput refactor for streamlined data augmentation processes.

Full release notes for Laypa can be found here.

Loghi-tooling Version 2.0.0

  • New MinionConvertPageToTxt for efficient conversion from PageXML to plain text.
  • Improved tag support for Unicode and HTML style text inputs.
  • Fixes and refactoring for better data handling and performance.

Full release notes for Loghi Tooling can be found here.

Docker Images

The new Docker images can be obtained with the following commands:

docker pull loghi/docker.laypa:2.0.0
docker pull loghi/docker.htr:2.0.0
docker pull loghi/docker.loghi-tooling:2.0.0

Full Changelog: 1.3.14...2.0.0

v1.3.14

29 Mar 12:27

1.3.14 contains fixes for the calculation of the textline polygon in command line mode.

If you use the server API: this version introduced some incompatibilities. Please use 1.3.12, or v2, which will become available next week.

v1.3.12

22 Mar 09:07

1.3.12 Loghi-tooling
when saving PageXML, remove TranskribusMetadata as it is not valid PageXML
fix ExtractBaselinesResource to recalculate textline contours
refactor/clean
add image file for testing extractbaselines via api
add image so textline polygons can be calculated
disable broken test

BREAKING: pagexml contours are now calculated in MinionExtractBaselines instead of MinionCutFromImageBasedOnPageXMLNew

Loghi-HTR:
The DataLoader now skips lines that are empty after stripping, ensuring cleaner data processing.
The --random_jpeg augmentation has been adjusted to be less extreme, providing more realistic augmentations.
When using the --existing_model option, the channels are now always reset to ensure consistent model behavior.
Fixed a bug where confidence scores could exceed 1 due to precision errors. All confidence scores are now clamped to the range [0, 1]. A warning is logged whenever this clamping occurs.
The SavedModel format is now converted and saved to the new .keras format in the output/model.name directory. Starting from May 2024, the legacy format will only be usable for inference.
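
The confidence clamping described above can be sketched as follows. This is an illustrative sketch only: the function name `clamp_confidence` and the log message are hypothetical, and the actual Loghi-HTR implementation may differ.

```python
import logging

logger = logging.getLogger(__name__)


def clamp_confidence(score: float) -> float:
    """Clamp a confidence score to [0, 1], logging a warning when clamping occurs."""
    clamped = min(1.0, max(0.0, score))
    if clamped != score:
        # Hypothetical message; small overshoots typically come from float precision.
        logger.warning("Confidence %r outside [0, 1]; clamped to %r", score, clamped)
    return clamped
```

Logging only when the value actually changes keeps the log quiet for valid scores while still surfacing precision problems.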

1.3.11
allow empty points to be ignored as they might get fixed later
bump PostgreSQL JDBC driver to 42.7.2
bump opencv version to 4.9.0
PDF converter: better support for JPEGs
fix minionshrinktextlines to use adaptive thresholding and avoid creating single pixel baselines.
pdf support
WIP: read v2 style format loghi-htr output
fix bug in setting correct namespace

1.3.10
fix bug in setting correct namespace
update log4j
fix pdfconverter (WIP)
add pdf converter (WIP)

1.3.9
update jackson
if it's 2013 use 2013
update libraries
add vulnerability scanner

v1.3.7

22 Dec 09:38

Loghi-tooling:

includes changes from 1.3.6 as well

  • fix nullpointer exception when using older models without config
  • Add optional security to loghiwebservice
  • Fix recalculate reading order test
  • improvements in generic reading order detection
  • don't include textstyle for strings that are empty
  • avoid nullpointer exception when reading htr config