LibGfx/JBIG2+jbig2-from-json+Tests: Write intermediate halftone, text regions by nico · Pull Request #26411 · SerenityOS/serenity

nico · 2025-11-18T02:23:35Z

Similar to #26410, the challenge with refining halftone and text regions
is that the writer needs to know the decoded halftone and text region
bitmap. That data is easily available in the loader, but not in the
writer.

Similar to #26410, the approach is to call into the loader with a
list of segments needed to decode the intermediate region's data.

...and then some minor plumbing to hook up the intermediate region
types in jbig2-from-json.

With this, we can write all region segment types :^)
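
The "list of segments needed to decode the intermediate region" can be pictured as a walk over referred-to segments. The following is only a sketch under assumptions, not the code in JBIG2Writer: the `Segment` struct, its fields, and the function signature are hypothetical. Only the name `collect_related_segments` appears in this PR's commit list.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Hypothetical, simplified segment record; not LibGfx's actual type.
struct Segment {
    uint32_t number;
    std::vector<uint32_t> referred_to; // numbers of segments this one depends on
};

// Returns the numbers of `target` and of every segment it transitively
// refers to, in ascending segment-number order.
std::vector<uint32_t> collect_related_segments(std::map<uint32_t, Segment> const& segments, uint32_t target)
{
    std::set<uint32_t> seen;
    std::vector<uint32_t> stack { target };
    while (!stack.empty()) {
        uint32_t current = stack.back();
        stack.pop_back();

        auto it = segments.find(current);
        if (it == segments.end())
            continue; // unknown segment; a real writer would report an error here
        if (!seen.insert(current).second)
            continue; // already visited

        for (uint32_t referred : it->second.referred_to)
            stack.push_back(referred);
    }
    // std::set iterates in ascending order.
    return { seen.begin(), seen.end() };
}
```

The data of the collected segments would then be handed to the loader, which decodes them and returns the intermediate region's bitmap.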

The only refinement-related piece that's still missing: we can't yet write symbols that refine refinement symbols which themselves use an embedded text region for refinement. Here, too, we need the decoded form of a text region, but the text region is embedded in a symbol dictionary, and that dictionary might be the one currently being written. In other words, the technique of calling the loader doesn't directly work, since the thing to load isn't complete yet. I have to think more about how to handle that case.

…but for intermediate halftone and text regions, this approach lets us implement this with very little code :^)


This adds writer support for refinement of a symbol dictionary entry that is itself a strip refinement.

This case is a bit awkward for the writer: To write the refinement, we need the bitmap that's being refined. But since that's a strip refine, we don't have that bitmap, we only have the parts it consists of.

As SerenityOS#26411 explains, the approach of calling into the loader to do the strip decoding doesn't work here: We need the decoded strip refinement while writing the strip itself. So we'd have to send a partially encoded strip to the loader, which isn't something we can currently do.

But since strip refinements only use a limited subset of full text strips (always a background color of 0, always a reference corner of topleft, always a composition operator of "or"), reimplementing the loader's logic for this case, while conceptually not nice, is not a lot of code. So let's just do that.

Using this, also add a test for this feature. I believe this tests the last missing refinement case :^)

The test uses regular symbols for the eyes (symbols 0 and 1) and the mouth (symbol 6). The nose, 100x100 pixels, is conceptually split into 60x60, 40x60, 60x40, 40x40 tiles. Since symbols are sorted by increasing height, the latter two are symbols 2 and 3, the former are symbols 4 and 5.

Symbol 3 isn't actually the bottom right part of the nose, but contains the pixel data of an eye. This is for testing refinement below. To test strip background color filling, the 60x60 tile is actually 59x60 pixels. And to test refinement, the top left tile is actually solid white instead of the top left part of the nose.

The first refinement symbol (id 7) refines the top left tile to a 59x60 tile that's the top left corner of the nose, but shifted several pixels to the right.

The second refinement symbol (id 8) is a strip refinement. It puts symbols 7 and 5 next to each other, to produce the top half of the nose (with the left part being shifted to the side).

The third refinement symbol (id 9) is a simple refinement of symbol 8 that refines it to the actual top of the nose (to fix the shift, and to test refinement of strips in a symbol dictionary).

The fourth refinement symbol (id 10) is a strip refinement that puts symbol 9 in the first strip, with an offset of 1, and symbols 2 and 3 in a second strip. This means symbol 10 is almost the nose, but its bottom right contains the pixel data of an eye, since that's what symbol 3 contained.

Finally, the text region refines symbol 10 to the actual nose pixels, to test refinement of strips from a text region. Whew!

Since the text region refines to the nose, it's useful to locally remove that refinement to test that the reference bitmap looks as expected. Similarly, it's useful to locally make the final strip refinement put symbol 8 instead of symbol 9 in the first strip, to make sure the input to symbol 9 looks as expected.

The test file decodes fine in all viewers I've tried (PDFium, Preview.app, pdf.js, mutool). (This case is much more awkward for the writer than for the loader, after all.)
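
To illustrate why the limited subset (background color 0, top-left reference corner, "or" composition) takes little code, here is a rough sketch of composing already-decoded symbol bitmaps into one reference bitmap. It is an assumption-laden illustration, not the writer's actual implementation: `Bitmap`, `PlacedSymbol`, and `compose_strip_symbols` are hypothetical names, and real code would use LibGfx's own bilevel image type and the strip's symbol placement rules.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal bilevel bitmap; one byte per pixel, value 0 or 1.
struct Bitmap {
    size_t width;
    size_t height;
    std::vector<uint8_t> pixels; // row-major

    Bitmap(size_t w, size_t h) : width(w), height(h), pixels(w * h, 0) {}
    uint8_t get(size_t x, size_t y) const { return pixels[y * width + x]; }
    void set(size_t x, size_t y, uint8_t value) { pixels[y * width + x] = value; }
};

// A symbol bitmap together with the position of its top-left corner.
struct PlacedSymbol {
    Bitmap const* bitmap;
    size_t x;
    size_t y;
};

// Composes symbols onto a bitmap that starts out all 0 (background color 0),
// using each symbol's top-left corner as its reference corner and "or" as the
// composition operator. Pixels falling outside the target are dropped.
Bitmap compose_strip_symbols(size_t width, size_t height, std::vector<PlacedSymbol> const& symbols)
{
    Bitmap result(width, height);
    for (auto const& placed : symbols) {
        for (size_t y = 0; y < placed.bitmap->height; ++y) {
            for (size_t x = 0; x < placed.bitmap->width; ++x) {
                size_t target_x = placed.x + x;
                size_t target_y = placed.y + y;
                if (target_x >= width || target_y >= height)
                    continue;
                result.set(target_x, target_y, result.get(target_x, target_y) | placed.bitmap->get(x, y));
            }
        }
    }
    return result;
}
```

Any column not covered by a symbol keeps the background value 0, which is the kind of background filling the 59x60 tile in the test is meant to exercise.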


nico added 5 commits November 17, 2025 21:13

LibGfx/JBIG2Loader: Add method that returns intermediate region data

832a5e4

JBIG2ImageDecoderPlugin::create_embedded_jbig2_decoder() takes a list of segment data spans and the number of an intermediate segment, and returns the image data of that intermediate segment.

LibGfx/JBIG2Writer: Make a lambda return NonnullRefPtr

af8bd4f

No behavior change.

LibGfx/JBIG2Writer: Extract collect_related_segments lambda

819136e

This will also be used to get the contents of intermediate halftone and text regions. No behavior change.

Tests/LibGfx: Add JBIG2 tests for intermediate halftone and text regions

70a23f9

github-actions bot added the 👀 pr-needs-review PR needs review from a maintainer or community member label Nov 18, 2025

nico merged commit fc2e3fd into SerenityOS:master Nov 18, 2025
13 checks passed

nico deleted the jbig2-refine-page-halftone-text branch November 18, 2025 12:51

github-actions bot removed the 👀 pr-needs-review PR needs review from a maintainer or community member label Nov 18, 2025

nico mentioned this pull request Dec 29, 2025

LibGfx/JBIG2Writer+Tests/LibGfx: Refinement of symbol strip refines #26510

Merged
