Branch name
main
Commit ID
453c291
Other environment information
OS type: CentOS 7.6
Method: Docker Compose
Actual behavior
Parsing PDFs with the law method: most succeeded, but one failed.
Check the logs:
Traceback (most recent call last):
  File "/ragflow/rag/svr/task_executor.py", line 130, in build
    cks = chunker.chunk(row["name"], binary=binary, from_page=row["from_page"],
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ragflow/rag/app/laws.py", line 106, in chunk
    for txt, poss in pdf_parser(filename if not binary else binary,
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ragflow/rag/app/laws.py", line 80, in __call__
    return [(b["text"], self._line_tag(b, zoomin))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ragflow/rag/app/laws.py", line 80, in <listcomp>
    return [(b["text"], self._line_tag(b, zoomin))
                        ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ragflow/deepdoc/parser/pdf_parser.py", line 831, in _line_tag
    while bott * ZM > self.page_images[pn[-1] - 1].size[1]:
                      ~~~~~~~~~~~~~~~~^^^^^^^^^^^^
IndexError: list index out of range
Expected behavior
No response
Steps to reproduce
Parse PDFs using the law method.
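The last frame of the traceback indexes `self.page_images` with `pn[-1] - 1`, so the error suggests the page number ran past the pages that were actually loaded. Below is a minimal sketch of a bounds-checked version of that kind of loop. It is not the project's fix: `locate_on_page`, `page_heights`, and the default `zoom` value are hypothetical, and the loop body (carrying the overflow onto the next page) is an assumption about what the original loop does.

```python
from typing import Sequence, Tuple


def locate_on_page(
    page_heights: Sequence[float],
    page_no: int,
    bottom: float,
    zoom: float = 3.0,  # hypothetical default, stands in for ZM
) -> Tuple[int, float]:
    """Return (1-based page number, bottom offset) without indexing past the last page.

    page_heights holds the pixel height of each rendered page image.
    """
    if not page_heights:
        raise ValueError("no page images loaded")

    # Clamp the starting page into the valid range instead of trusting the tag.
    idx = min(max(page_no - 1, 0), len(page_heights) - 1)

    # Advance only while a next page exists; this is the guard the
    # failing loop in the traceback appears to lack.
    while idx + 1 < len(page_heights) and bottom * zoom > page_heights[idx]:
        bottom -= page_heights[idx] / zoom
        idx += 1

    return idx + 1, bottom
```

With two pages of height 300, `locate_on_page([300, 300], 1, 150)` moves onto page 2, and an out-of-range start such as page 5 is clamped to the last page rather than raising `IndexError`. Clamping is one design choice; raising a descriptive error would be another.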
rm page number exception for pdf parser (#424)
0499a3f
### What problem does this PR solve?

#423

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
fixed.