PDFSegmenter is an Executor that extracts images and text from PDF data as chunks. It stores the images and the text of each page as separate chunks, each with its own MIME type. It uses the pdfplumber library.
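The per-page chunking described above can be sketched roughly as follows. This is an illustrative sketch, not the Executor's actual code: the `Chunk` record and the `segment_pages` and `segment_pdf` helpers are hypothetical names, and the MIME type strings are assumed placeholders. The only pdfplumber features relied on are `pdfplumber.open`, `pdf.pages`, `page.extract_text()`, and `page.images`.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List


@dataclass
class Chunk:
    """Hypothetical chunk record (the real Executor works with Jina Documents)."""
    page_number: int
    mime_type: str
    content: Any


def segment_pages(pages: Iterable[Any]) -> List[Chunk]:
    """Turn page objects into separate text and image chunks.

    Each page only needs an ``extract_text()`` method and an ``images``
    attribute, which is what pdfplumber's page objects provide.
    """
    chunks: List[Chunk] = []
    for number, page in enumerate(pages, start=1):
        text = page.extract_text()
        if text:  # skip pages with no extractable text
            chunks.append(Chunk(number, "text/plain", text))
        for image in page.images:
            # pdfplumber describes each image as a dict that includes its
            # bounding box; this sketch keeps only the box, not the pixels.
            bbox = (image["x0"], image["top"], image["x1"], image["bottom"])
            # "image/*" is an assumed placeholder MIME type.
            chunks.append(Chunk(number, "image/*", bbox))
    return chunks


def segment_pdf(path: str) -> List[Chunk]:
    """Open a PDF with pdfplumber and segment all of its pages."""
    import pdfplumber  # third-party dependency, imported lazily

    with pdfplumber.open(path) as pdf:
        return segment_pages(pdf.pages)
```

Calling `segment_pdf("example.pdf")` (a hypothetical file name) would return one text chunk per page that has extractable text, plus one chunk per embedded image. Keeping `segment_pages` separate from file handling makes the chunking logic easy to exercise without a real PDF.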
forked from jina-ai/executor-pdfsegmenter
Jina Executor used for extracting images and text as chunks from PDF data
luquitared/executor-pdfsegmenter
Languages
- Python 100.0%