Bug Report: Text Truncation in EPUB Files Larger Than 500KB #39
Comments
Here is the example code online.
I changed extract_string_max_length to 100000000, but it still does not work:
# Create a new extractor
extractor = Extractor()
# extractor.set_extract_string_max_length(1000)
extractor.set_extract_string_max_length(100000000)
Thanks for reporting the issue. I was going to suggest setting set_extract_string_max_length to a higher value, but it seems that you already tried that. I'll have a look further into this.
I am getting the same problem: the output is capped at 500K no matter what the extract length is set to.
Bug Report: Text Truncation in EPUB Files Larger Than 500KB
Project: Extractous
Version: [extractous==0.2.0]
Environment: [x86, Linux, Python 3.10]
Description:
When processing EPUB documents whose extracted text exceeds 500KB, the extracted text is consistently truncated around the 500KB mark, so the complete content is never output.
Expected Behavior:
The extracted text should contain the complete content of the EPUB file, regardless of its text length.
Actual Behavior:
The extracted text is truncated around the 500KB mark, leaving out the remaining content of the EPUB file.
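One way to confirm the symptom is to check whether the size of the extracted text sits right at the suspected cap. The sketch below is only a heuristic and is not part of Extractous: the helper names `utf8_size` and `looks_truncated`, the cap of 500,000, and the tolerance of 1,024 are all assumptions chosen for illustration. It takes the already-extracted text as a plain string, so it makes no assumptions about the extractor API.

```python
def utf8_size(text: str) -> int:
    """Return the size of the text in bytes when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def looks_truncated(text: str, cap: int = 500_000, tolerance: int = 1_024) -> bool:
    """Heuristic: flag text whose character or byte count lands near the cap.

    `cap` and `tolerance` are assumed values for illustration, not limits
    documented by the library.
    """
    sizes = (len(text), utf8_size(text))
    return any(abs(size - cap) <= tolerance for size in sizes)


if __name__ == "__main__":
    # Stand-in for the string returned by the extractor.
    sample = "a" * 500_000
    print(len(sample), utf8_size(sample), looks_truncated(sample))
```

If several large EPUB files all yield text that this check flags, that points to a fixed limit being applied rather than to a problem in any one file.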
Example Code:
Additional Information:
I would greatly appreciate it if you could look into and address this issue at your earliest convenience.
Thank you very much for your assistance!