Segmentation fault when using pipes #889
Comments
Output with fewer options, same file:
for i in $(seq 0 100); do podman run --rm -i ocrmypdf --verbose -l deu - - <tmp.pdf >out.pdf; done
dmesg:
|
...and here with just --verbose 1
|
Looks like you're using ocrmypdf 12.6. Can you try again with a more recent version? v13 introduced some improvements to concurrency that reduce hanging. I won't be able to do much with the segfault. Perhaps you can identify the responsible process?
Hi @jbarlow83, I am using the latest docker container, but I also explicitly tried jbarlow83/ocrmypdf:v13.2.0, jbarlow83/ocrmypdf:v13.1.1, and jbarlow83/ocrmypdf:v13.1.0. Just tried with an additional |
The message The latest should read:
|
Could you give me a pointer on how to do that with docker/podman?
Oh, sorry - my bad. It seems I mixed up versions when filing this bug report... Here is the result with 13.2.0:
for i in $(seq 0 100); do podman run --rm -i jbarlow83/ocrmypdf:v13.2.0 --verbose -l deu - - <tmp.pdf >out.pdf; done
dmesg
|
Taking that last segfault:
Address within libc: 0x00007f025e26d9b5 - 0x7f025e11e000 = 0x14F9B5
$ addr2line -e /usr/lib/libc.so.6 -fCi 0x14F9B5
__GI_netname2host
:?
Does this help?
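The subtraction above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: `module_offset` is a hypothetical helper name, and the two addresses are the ones quoted in this comment. The resulting offset is only meaningful when passed to addr2line together with the same libc binary that was mapped into the crashing process, which may differ between the host and a container.

```python
def module_offset(fault_ip: int, module_base: int) -> int:
    """Return the offset of a faulting instruction pointer inside a mapped module.

    fault_ip    -- instruction pointer reported for the segfault
    module_base -- load address of the module (here: libc) in that process
    """
    if fault_ip < module_base:
        raise ValueError("fault address lies below the module base")
    return fault_ip - module_base


# Addresses taken from the comment above.
ip = 0x00007F025E26D9B5
base = 0x7F025E11E000
print(hex(module_offset(ip, base)))  # prints 0x14f9b5
```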
Re addr2line: That's just showing that the container manager crashed but doesn't give any insight into the container process responsible (if any). You could also try 1) |
for i in $(seq 0 100); do podman run --rm -i jbarlow83/ocrmypdf:v13.2.0 --use-threads --verbose -l deu - - <tmp.pdf >out.pdf; done
|
for i in $(seq 0 100); do podman run --rm -i jbarlow83/ocrmypdf:v13.2.0 --jobs 1 --verbose -l deu - - <tmp.pdf >out.pdf; done
|
So, I assume this has nothing to do with concurrency :(
podman logs (stderr) for the last crashes yield (both approximately the same):
Full logs here:
The stdout logs contain a corrupt PDF that includes some debug messages (see the gs options to the far right of the snippet below):
Full log (stdout):
OK, removing
BTW: the error regarding xref 18 also shows up when I get correct output with valid (and correctly OCR'ed) PDFs.
Can you try to reproduce this issue using docker rather than podman as the container host?
It seems that I'll set up docker and test it there tomorrow. However there are a few oddities that may require individual tickets/bug reports and investigation:
|
ocrmypdf does not log to stdout. The test suite covers checking that stdout is clean. I believe that podman is responsible for the behavior you're seeing. When I run ocrmypdf in native Linux or in Docker on your file, a valid PDF/A is produced (confirmed by checking PDF/A compliance with verapdf and a PDF viewer). The error regarding xref 18 is just indicating that an unusual image (specifically 2-bit grayscale) could not be optimized.
After a lot of testing I can confirm that this is working with docker and the stand-alone version. Even if I start the container, replace the entry point, and then exec ocrmypdf in a loop I cannot reproduce the issue. It certainly looks like a podman issue. For now, I will live with the |
Cross-posted in containers/conmon#315
ocrmypdf does not behave much differently. Frankly if the error shown here is actually the case, |
Describe the bug
When running ocrmypdf through podman/docker I sometimes (#864) experience segmentation faults and the container hangs indefinitely. The output file is empty.
To Reproduce
The following command reproduces the failure. Due to the non-deterministic behavior of ocrmypdf, it might take a while or even multiple loops to reproduce.
The issue is still reproducible when all of the options are omitted. The resulting log is:
dmesg yields:
(Always the same location in libc)
Exchanging >out.pdf with tee out.pdf, I could at some point see strange characters being emitted after %%EOF (?); however, most of the time it hangs before that.
Example file
The example file is attached in encrypted form. tmp.pdf.gpg.zip
Expected behavior
The output file should be correct and the tool should not hang.
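The expected behavior above can be checked mechanically for each run. The following is a minimal sketch, not part of ocrmypdf or podman: `classify_run` is a hypothetical helper, the 300-second default timeout is an arbitrary assumption, and the "valid PDF" test is only a rough heuristic (output starts with `%PDF-` and nothing but whitespace follows the last `%%EOF`), not a real PDF/A validation.

```python
import subprocess


def classify_run(cmd, pdf_bytes, timeout_s=300.0):
    """Feed pdf_bytes to cmd on stdin and classify what arrives on stdout.

    Returns one of: "hang", "failed", "empty", "corrupt", "ok".
    """
    try:
        proc = subprocess.run(
            cmd,
            input=pdf_bytes,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # The child did not finish in time; subprocess.run has killed it.
        return "hang"

    if proc.returncode != 0:
        return "failed"

    out = proc.stdout
    if not out:
        return "empty"

    # Heuristic only: a PDF normally begins with "%PDF-" and ends with "%%EOF".
    if not out.startswith(b"%PDF-"):
        return "corrupt"
    eof = out.rfind(b"%%EOF")
    if eof == -1 or out[eof + len(b"%%EOF"):].strip():
        # Missing end marker, or stray bytes after it.
        return "corrupt"
    return "ok"
```

One iteration of the loop from this report would then look like `classify_run(["podman", "run", "--rm", "-i", "jbarlow83/ocrmypdf:v13.2.0", "--verbose", "-l", "deu", "-", "-"], open("tmp.pdf", "rb").read())`. Note that the timeout only kills the podman client process; the container itself may need to be cleaned up separately.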
System