
Jina Executor used for extracting images and text as chunks from PDF data


luquitared/executor-pdfsegmenter

 
 


✨ PDFSegmenter

PDFSegmenter is a Jina Executor that extracts images and text from PDF data as chunks. For each page, it stores the images and the text as separate chunks, each tagged with its respective MIME type. It uses the pdfplumber library.
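The per-page chunking described above can be illustrated with a rough sketch. This is not the Executor's actual code: the `Chunk` record, the `build_chunks` and `segment_pdf` helpers, and the default image MIME type are all hypothetical names and values chosen for illustration. Only `pdfplumber.open`, `pdf.pages`, `page.extract_text()` and `page.images` come from pdfplumber itself.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Chunk:
    """Hypothetical chunk record; not the Executor's real data type."""
    page: int        # 1-based page number
    mime_type: str   # MIME type attached to the chunk
    content: Any     # page text, or an image metadata dict from pdfplumber


def build_chunks(
    page_number: int,
    text: Optional[str],
    images: List[dict],
    image_mime_type: str = "image/png",  # assumption: the real type depends on the embedded image
) -> List[Chunk]:
    """Turn one page's text and images into separate chunks."""
    chunks: List[Chunk] = []
    # Skip pages with no extractable text (None or whitespace only).
    if text and text.strip():
        chunks.append(Chunk(page_number, "text/plain", text))
    # One chunk per image found on the page.
    for image in images:
        chunks.append(Chunk(page_number, image_mime_type, image))
    return chunks


def segment_pdf(path: str) -> List[Chunk]:
    """Walk every page of a PDF and collect its text and image chunks."""
    import pdfplumber  # third-party; imported here so build_chunks works without it

    chunks: List[Chunk] = []
    with pdfplumber.open(path) as pdf:
        for number, page in enumerate(pdf.pages, start=1):
            chunks.extend(build_chunks(number, page.extract_text(), page.images))
    return chunks
```

Calling `segment_pdf("example.pdf")` (a placeholder path) would return a flat list in which each page contributes at most one text chunk followed by one chunk per image. Keeping the chunk-building step separate from the pdfplumber call makes it easy to check the chunking logic without a PDF on hand.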
