overlapping regions #7
The pixel classification is turned into polygonal regions, which then go through operations that dilate and erode them, and this can lead to overlaps. A high-level overview is at http://ceur-ws.org/Vol-2723/long20.pdf and the set of polygonal operations that finally leads to overlaps is defined here: origami/origami/custom/layouts/bbz.py Line 59 in 544485b
Implementation of the different operations starts here: origami/origami/batch/detect/layout.py Line 310 in 544485b
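To illustrate how dilation alone can make two previously separate regions overlap, here is a minimal sketch. It uses axis-aligned boxes instead of real polygons, and `Box`, `dilate` and `overlap_area` are hypothetical names chosen for this example; none of this is origami's actual code.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned stand-in for a polygonal region (an assumption of this sketch)."""
    x0: float
    y0: float
    x1: float
    y1: float


def dilate(box: Box, d: float) -> Box:
    """Grow the box by d on every side; a negative d erodes it."""
    return Box(box.x0 - d, box.y0 - d, box.x1 + d, box.y1 + d)


def overlap_area(a: Box, b: Box) -> float:
    """Area of the intersection of a and b, or 0 if they do not intersect."""
    w = min(a.x1, b.x1) - max(a.x0, b.x0)
    h = min(a.y1, b.y1) - max(a.y0, b.y0)
    return max(w, 0) * max(h, 0)


# Two columns separated by a gap of 2 units.
left = Box(0, 0, 10, 10)
right = Box(12, 0, 22, 10)

print(overlap_area(left, right))                        # 0
print(overlap_area(dilate(left, 2), dilate(right, 2)))  # 28
```

Any dilation larger than half the gap between two regions produces an overlap unless a later step erodes or merges the result.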
Are we talking about the second example page? It looks to me like the pixel classifier already gets this wrong, i.e. classifies this as text. This would mean that our BBZ table training data did not generalize well for this case.
The PAGE-XML is indeed output for the warped page; coordinates are transformed from dewarped into warped space for the export. The dewarping transformation is basically a grid of dewarped points that models dewarping through linear interpolation and works both ways, i.e. warped -> dewarped and dewarped -> warped. So, for each regular grid point, there is one dewarped grid point, and this mapping of quadrilaterals defines the dewarping. The mapping is available at all post-dewarping stages of the pipeline (it's saved into a separate file). The implementation is at https://github.com/poke1024/origami/blob/master/origami/core/dewarp.py where the Transformer class implements the actual interpolation, see origami/origami/core/dewarp.py Line 143 in 544485b
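As a rough illustration of the grid idea (not the actual Transformer class), the following sketch maps a point through one direction of such a grid using bilinear interpolation inside the enclosing quadrilateral. The grid layout, the `cell` spacing and the function names are assumptions made for this example, and the inverse direction is not shown.

```python
def lerp(a, b, t):
    """Linear interpolation between 2D points a and b at parameter t in [0, 1]."""
    return (a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t)


def map_point(grid, cell, x, y):
    """Map (x, y) from the regular grid into the mapped space.

    grid[r][c] holds the mapped position of the regular grid point
    (c * cell, r * cell). Hypothetical layout, chosen for this sketch.
    """
    rows, cols = len(grid), len(grid[0])
    # Find the enclosing cell, clamped so that border points stay inside the grid.
    c = max(0, min(int(x // cell), cols - 2))
    r = max(0, min(int(y // cell), rows - 2))
    # Fractional position inside the cell.
    u = x / cell - c
    v = y / cell - r
    # Interpolate along the top and bottom edges, then between them.
    top = lerp(grid[r][c], grid[r][c + 1], u)
    bottom = lerp(grid[r + 1][c], grid[r + 1][c + 1], u)
    return lerp(top, bottom, v)


# A single cell of size 10 whose right edge is shifted and stretched downwards.
grid = [
    [(0, 0), (10, 2)],
    [(0, 10), (10, 14)],
]
print(map_point(grid, 10, 5, 5))    # (5.0, 6.5)
print(map_point(grid, 10, 10, 10))  # (10.0, 14.0)
```

Grid points map exactly onto their stored counterparts, and everything in between is interpolated, which is why polygon coordinates can be carried from one space to the other for export.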
This is probably related to fine tuning of polygonal operations (also see questions below). Either the constituent text line polygons do not get merged in the first place (you might want to look into origami/origami/batch/detect/layout.py Line 928 in 544485b
origami/origami/custom/layouts/bbz.py Line 59 in 544485b
FixSpillOverH (or changing its parameters) from the Transformer in origami/origami/custom/layouts/bbz.py Line 59 in 544485b
FixSpillOverH is that sometimes, in the pixel classifier, blocks do get merged which should not, and this tries to fix it - but sometimes it fixes too much.
This is a good question. My best ad hoc guess is the origami/origami/batch/detect/lines.py Line 158 in 544485b
Yes. The code location to experiment is at origami/origami/custom/layouts/bbz.py Line 59 in 544485b
Instead of the current implementation you could use a
which means merging all overlapping regions (starting at any overlap > 0) and not doing any dilations or erosions. This might be worth experimenting with. The current default set of operations is fine-tuned towards some border cases encountered in the BBZ layout.
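The "merge everything that overlaps, no dilation or erosion" strategy can be sketched as follows. This is only an illustration under simplifying assumptions: regions are axis-aligned boxes `(x0, y0, x1, y1)` rather than polygons, a merged region is the bounding box of its parts, and `overlaps` and `merge_overlapping` are hypothetical helpers, not origami operations.

```python
def overlaps(a, b):
    """True if boxes a and b share an area greater than 0 (touching edges do not count)."""
    return (min(a[2], b[2]) > max(a[0], b[0])
            and min(a[3], b[3]) > max(a[1], b[1]))


def merge_overlapping(boxes):
    """Repeatedly merge any two overlapping boxes until no overlaps remain."""
    merged = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if overlaps(merged[i], merged[j]):
                    a, b = merged[i], merged[j]
                    # Replace the pair by their common bounding box.
                    merged[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                 max(a[2], b[2]), max(a[3], b[3]))
                    del merged[j]
                    changed = True
                    break
            if changed:
                break  # restart the scan, the merged box may now hit others
    return merged


regions = [(0, 0, 10, 10), (8, 2, 20, 12), (30, 0, 40, 10)]
print(merge_overlapping(regions))  # [(0, 0, 20, 12), (30, 0, 40, 10)]
```

The scan restarts after each merge because a grown region can start overlapping neighbours it did not touch before. With real polygons the merged shape would be the polygon union rather than a bounding box.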
Not in the API or exports at this point, but after running the "contours" stage, you can unzip In terms of PageXML export, there is some simple support for exporting TableRegions (see origami/origami/batch/detect/compose.py Line 145 in 544485b
I would need to look into this in more detail to give a better answer.
Not sure if this is a bug at all. I've used your pretrained BBZ model to segment pages in similar data: Börsenblatt des Deutschen Buchhandels. These also have 2-column layouts besides the 3- and 4-column layouts of Berliner Börsenzeitung, and the advertisement parts look very different. But I assumed the domains are close enough for pages like this.
The bbz-segment results (via the full Origami pipeline and compose --page-xml) do look very good in general. This is truly amazing work!
But some errors leave me puzzled:
(Sorry, cannot get these to render with equal width in GFM...)
Here what I don't understand is:
TableRegions instead of recursive TextRegions?