0.16.11

@scanny scanny released this 10 Dec 00:51
· 9 commits to main since this release
b981d71

Enhancements

  • Enhance quote standardization tests with additional Unicode scenarios
  • Relax table segregation rule in chunking. Previously a Table element was always segregated into its own pre-chunk such that the Table appeared alone in a chunk or was split into multiple TableChunk elements, but never combined with Text-subtype elements. Allow table elements to be combined with other elements in the same chunk when space allows.
  • Compute chunk length based solely on element.text. Previously .metadata.text_as_html was also considered, and since it is always longer than the text (due to HTML tag overhead), it was the effective length criterion. Remove text-as-html from the length calculation so that text length is the sole criterion for sizing a chunk.
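The two chunking changes above can be illustrated with a minimal sketch. This is not the library's implementation: the `Element` dataclass, `chunk_length`, `pack_elements`, and the two-character separator are all hypothetical names and assumptions made for illustration. It only shows the idea that a table may share a chunk with neighboring text when space allows, and that only the text length counts toward the limit.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Element:
    """Hypothetical stand-in for a document element (not the library's class)."""
    kind: str                           # e.g. "Text" or "Table"
    text: str
    text_as_html: Optional[str] = None  # carried along, ignored when sizing


def chunk_length(element: Element) -> int:
    # Only the plain text counts; text_as_html is deliberately not measured.
    return len(element.text)


def pack_elements(elements: List[Element], max_characters: int) -> List[List[Element]]:
    """Greedily pack elements into chunks, treating tables like any other element.

    Assumes a 2-character separator between elements in a chunk. Oversized
    elements are not split here; they simply occupy a chunk of their own.
    """
    chunks: List[List[Element]] = []
    current: List[Element] = []
    current_len = 0
    for el in elements:
        sep = 2 if current else 0
        if current and current_len + sep + chunk_length(el) > max_characters:
            chunks.append(current)
            current, current_len, sep = [], 0, 0
        current.append(el)
        current_len += sep + chunk_length(el)
    if current:
        chunks.append(current)
    return chunks
```

With a generous limit, a short table lands in the same chunk as the text around it even when its HTML rendering alone would exceed the limit, because the HTML is never measured.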

Features

Fixes

  • Fix IPv4 regex to correctly match octets of up to three digits.
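As a rough illustration of the fix above, here is a sketch of a pattern that accepts one to three digits per octet. The name `IPV4_PATTERN` and the pattern itself are assumptions for illustration and are not copied from the library; the sketch also does not check that each octet is in the range 0 to 255.

```python
import re

# Hypothetical pattern: four dot-separated octets of one to three digits each.
# It checks digit counts only, not that each octet is <= 255.
IPV4_PATTERN = re.compile(r"\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b")


def find_ipv4(text: str) -> list:
    """Return all substrings of `text` that look like IPv4 addresses."""
    return IPV4_PATTERN.findall(text)
```

For example, `find_ipv4("connect to 192.168.100.254 or 10.0.0.1")` picks up both addresses, including the one whose octets have three digits.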