0.14.4

@christinestraub christinestraub released this 03 Jun 21:16
· 273 commits to main since this release
1dede50

Enhancements

  • Move logger error to debug level when PDFminer fails to extract text. This includes the error message for "Invalid dictionary construct".
  • Add support for Pinecone serverless. Adds Pinecone serverless to the connector tests. Pinecone
    serverless works with versions >=0.14.2, but hadn't been tested until now.

Features

  • Allow configuration of the Google Vision API endpoint. Adds an environment variable to select the Google Vision API in the US or the EU.
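
As a rough illustration of the endpoint selection described above, the sketch below reads an endpoint hostname from an environment variable and turns it into client options. The variable name `GOOGLEVISION_API_ENDPOINT` and both helper functions are assumptions made for this example, not names confirmed by these notes. The hostnames `us-vision.googleapis.com` and `eu-vision.googleapis.com` are Google Cloud Vision's regional endpoints.

```python
import os

# Assumed variable name, for illustration only; check the library docs for the real one.
ENDPOINT_VAR = "GOOGLEVISION_API_ENDPOINT"


def resolve_vision_endpoint(env=None, var_name=ENDPOINT_VAR):
    """Return the configured endpoint hostname, or None to use the client default."""
    env = os.environ if env is None else env
    value = env.get(var_name, "").strip()
    return value or None


def build_client_options(env=None):
    """Build a client_options-style dict (hypothetical helper), or None if unset."""
    endpoint = resolve_vision_endpoint(env)
    return {"api_endpoint": endpoint} if endpoint else None


if __name__ == "__main__":
    # Example: select the EU regional endpoint.
    print(build_client_options({ENDPOINT_VAR: "eu-vision.googleapis.com"}))
```

Returning `None` when the variable is unset leaves the choice of endpoint to the client's default, so existing deployments would behave as before.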

Fixes

  • Address the issue of unrecognized tables in UnstructuredTableTransformerModel. When a table is not recognized, the element.metadata.text_as_html attribute is set to an empty string.
  • Remove root handlers in ingest logger. Removes root handlers in ingest loggers to ensure secrets aren't accidentally exposed in Colab notebooks.
  • Fix V2 S3 Destination Connector authentication. Fixes bugs with the S3 Destination Connector where the connection config was neither registered nor properly deserialized.
  • Clarify dependence on a particular version of python-docx. Pins the python-docx version to ensure that a particular method unstructured uses is included.
  • Ingest preserves original file extension. Ingest V2 introduced a change that dropped the original extension for upgraded connectors. This reverts that change.
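
Regarding the unrecognized-table fix listed above: since element.metadata.text_as_html can be an empty string, downstream code may want to skip such tables. The following is a minimal sketch, not the library's own code. `Element` and `Metadata` are hypothetical stand-ins defined here so the example runs without the library installed, and `tables_with_html` is an illustrative helper name.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Metadata:
    # Stand-in for element.metadata; an empty string means the table was not recognized.
    text_as_html: Optional[str] = None


@dataclass
class Element:
    # Stand-in for a partitioned element (hypothetical, for illustration only).
    category: str
    text: str
    metadata: Metadata = field(default_factory=Metadata)


def tables_with_html(elements: List[Element]) -> List[Element]:
    """Keep only Table elements whose text_as_html is a non-empty string."""
    return [
        el
        for el in elements
        if el.category == "Table" and el.metadata.text_as_html
    ]


if __name__ == "__main__":
    elements = [
        Element("Table", "recognized", Metadata("<table><tr><td>1</td></tr></table>")),
        Element("Table", "unrecognized", Metadata("")),
        Element("NarrativeText", "plain paragraph"),
    ]
    for el in tables_with_html(elements):
        print(el.text)
```

The truthiness check treats both `None` and the empty string as "no HTML available", so callers do not need to distinguish the two cases.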