-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ocrd_utils.coordinates_for_segment: clip to parent? #489
Comments
(For example, this problem has surfaced in |
See here for an idea of what could/should really be done in core. Working with many different libraries (Shapely, OpenCV, Scikit) it's still surprisingly hard to calculate contour polygons in a way that avoids self-intersections. Maybe we should also encapsulate that demand in a utility function. |
That's a reasonable solution, you have my 👍 |
This could also be part of an OCR-D adaptor to PAGE-XML pixel-center coordinate convention – if that does turn out to be the binding interpretation. We could make Let's discuss this on Tuesday! |
No we cannot! The problem is that at the moment most processors scramble the details of the new segment to be created (identifier, coordinates, content) that segment does not exist yet. The only way this can be done is via an extra kwarg for the parent (as shown in #492), which even increases the asymmetry and idiosyncrasy. But there's another option: Instead of adding some function or argument to explicitly post-process the coordinates, we could also handle all cases implicitly by augmenting the PAGE-XML DOM (via #479's
Pros:
|
We have the utility function
coordinates_for_segment
for conversion of polygons from some derived image and its coordinate transform to absolute coordinates of the original image.This is all the function does, but of course the expectation is more: Whereever that polygon was in the derived image, give me coordinates which I can annotate as final result of segmentation! (And unfortunately, that is also what it is mostly used for.)
However, that would entail also doing the inverse of
image_from_polygon
, i.e. clipping (or rather masking) the polygon to its parent element. This usually means ensuring the child element's polygon stays within the parent element's polygon. Except for the top-level parent (Page
when there is noBorder
), where this means clipping coordinates to a range between zero and image size.Thus, not having that functionality in core means segmenters will tend to produce inconsistent coordinates (failing the PAGE validator), and sometimes (e.g. when the page was rotated) even negative coordinates (failing XML validators and other PAGE libraries).
Unfortunately, we already made the first argument the segment's polygon instead of its element (as in
coordinates_of_segment
– an ugly asymmetry which itself originated from PAGE'sBorder
idiosyncrasy). So we cannot silently enrichcoordinates_for_segment
by looking into the segment'sparent_object_
and clipping the polygon respectively.Or can we? How about making it polymorphic and checking the type of the first argument, as an opt-in for implicit conversion consistent with parent? To me that seems much better than providing a new function and trying to find processors which need it. We could even transform the function to be exclusively for the new usage by starting to deprecate the old usage at a later point and hunting it down.
@kba @wrznr if you give me your thumbs up I will kick this off with a PR.
The text was updated successfully, but these errors were encountered: