Always prefer the PDF.js JPEG decoder for very large images, in order to reduce peak memory usage (issue 11694) #11707
Conversation
Force-pushed from a923379 to d647cc0
/botio test
From: Bot.io (Windows)
Received
Command cmd_test from @Snuffleupagus received. Current queue size: 0
Live output at: http://54.215.176.217:8877/74eab0a8f5f79c3/output.txt

From: Bot.io (Linux m4)
Received
Command cmd_test from @Snuffleupagus received. Current queue size: 0
Live output at: http://54.67.70.0:8877/30747c9310fd381/output.txt

From: Bot.io (Linux m4)
Failed
Full output at http://54.67.70.0:8877/30747c9310fd381/output.txt
Total script time: 19.92 mins
Image differences available at: http://54.67.70.0:8877/30747c9310fd381/reftest-analyzer.html#web=eq.log

From: Bot.io (Windows)
Failed
Full output at http://54.215.176.217:8877/74eab0a8f5f79c3/output.txt
Total script time: 25.14 mins
Image differences available at: http://54.215.176.217:8877/74eab0a8f5f79c3/reftest-analyzer.html#web=eq.log
Force-pushed from d647cc0 to 74d3666
Always prefer the PDF.js JPEG decoder for very large images, in order to reduce peak memory usage (issue 11694)

When JPEG images are decoded by the browser, on the main thread, there's a handful of short-lived copies of the image data; see https://github.com/mozilla/pdf.js/blob/c3f4690bde8137d80c74203b1ad91476fc2ca160/src/display/api.js#L2364-L2408

That code thus becomes quite problematic for very large JPEG images, since it significantly increases peak memory usage during decoding. In the referenced issue there are a couple of JPEG images whose dimensions are `10006 x 7088` (i.e. ~68 megapixels), which causes the *peak* memory usage to increase by close to `1 GB` (i.e. one gigabyte) in my testing.

By letting the PDF.js JPEG decoder, rather than the browser, handle very large images, the *peak* memory usage is considerably reduced and the allocated memory also seems to be reclaimed faster.

*Please note:* This will lead to movement in some existing `eq` tests. Refer to #11523 (comment) for an explanation of the different test "failures".

Fixes #11694 (to the extent that doing so is possible, given the size of the JPEG images).
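To make the arithmetic concrete: one decoded RGBA copy of a `10006 x 7088` image occupies `10006 * 7088 * 4` bytes, roughly 284 MB, so a handful of simultaneous short-lived copies readily approaches 1 GB. The following is a minimal sketch of the kind of size-based decoder selection described above; the threshold value and all names in it are hypothetical illustrations, not the actual PDF.js internals:

```js
// Hypothetical sketch of the decision described above -- the constant and
// function names are illustrative, not the real PDF.js API.

// Assumed threshold: images with more pixels than this skip the native
// (browser) decoder and are decoded by the JavaScript JPEG decoder instead.
const MAX_NATIVE_JPEG_PIXELS = 4096 * 4096;

function shouldUseJsJpegDecoder(width, height) {
  // Each decoded RGBA copy costs width * height * 4 bytes, and the native
  // main-thread path keeps several such copies alive at once, so the cost
  // grows with the pixel count rather than the compressed file size.
  return width * height > MAX_NATIVE_JPEG_PIXELS;
}

// The 10006 x 7088 images from issue 11694 exceed the threshold:
console.log(shouldUseJsJpegDecoder(10006, 7088)); // true
console.log(shouldUseJsJpegDecoder(1024, 768));   // false
```

The trade-off is decoding speed: the JavaScript decoder is slower than the native one, which is why only very large images, where peak memory is the dominant concern, would be diverted to it.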
Force-pushed from 74d3666 to 62a9c26
/botio-linux preview
From: Bot.io (Linux m4)
Received
Command cmd_preview from @timvandermeij received. Current queue size: 0
Live output at: http://54.67.70.0:8877/57505d60df0d083/output.txt

From: Bot.io (Linux m4)
Success
Full output at http://54.67.70.0:8877/57505d60df0d083/output.txt
Total script time: 2.44 mins
Published
I can confirm that the tab doesn't crash anymore. In general I also think that this is a better approach for large images. Thanks!