support dewarping #180
Comments
Thanks for summarizing the problem and opening this discussion. I will have to think more about this and ideally, we should also discuss this with @chris1010010. But as to the potential problems you raise:
I'm not so sure about that. It comes with an obligatory On the other hand, using
So far we rely on that mechanism only to indicate which coordinate transforms described in PcGts actually apply to an AlternativeImage, so we can track its coordinate system w.r.t.

That point was more about the workspace/METS than the processor/PAGE side: There should be a fast and reliable way of identifying any changes of the original image across the workflow chain, without the need to search through all pages and PAGEs. I'm not a METS expert, there are so many ways to represent that. We just need something that does not break any existing use-cases, is not too contrived and efficiently implementable. (And we should still allow for the possibility of not being able to track the coordinate system but nevertheless mark the change as such, so implementations like anybaseocr-dewarp can at least fit in.)

There's of course an alternative to replacing the original image and using DwGts: We could also facilitate PcGts-only dewarping with some representation in
This is somewhat already part of #116, but I would like to see a discussion of the specific problem that dewarping poses to the coordinate reproducibility principle.
Now that we have promising tools that we could wrap for page-level dewarping, like blitzDrt for perspective correction and Origami's dewarper for parametric grid morphing, we should provide a solution for integrating them into OCR-D.
To represent the coordinate system after dewarping the page, we could rely on PAGE-XML's dewarping schema (DwGts for short). It references the original image under `/DwGts/DocumentImage/@filename` and describes the morphing grid under `/DwGts/Grid` (with `Row[*]/@points` against `Row[*]/@index` with `Row[*]/@refLinePos` and `Column[*]/@index` with `Column[*]/@refLinePos`). (Unfortunately, it comes with very little documentation and no examples.)

But this is a separate XML file not referenced by the PAGE-XML content schema (PcGts). So for dewarping steps, the output fileGrp would need to comprise 3 files per page:
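1. the dewarped image,
2. a new PcGts, referencing the dewarped image under `/PcGts/Page/@imageFilename` instead of the original/input image, and transforming all existing coordinates of the input PcGts,
3. a DwGts, referencing the original image under `/DwGts/DocumentImage/@filename`.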
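The PcGts rewrite described here (swap the image reference, map every coordinate) can be sketched roughly as follows. This is only an illustration under stated assumptions: `rewrite_pcgts`, `toy_dewarp` and the sample document are hypothetical, the coordinate mapping is passed in as a plain function instead of being derived from a DwGts grid, and only `Coords/@points` is handled (other point-bearing attributes such as baselines are ignored).

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip an XML namespace from a tag name, e.g. '{ns}Page' -> 'Page'."""
    return tag.rsplit('}', 1)[-1]


def parse_points(points):
    """Parse a PAGE-style points string 'x1,y1 x2,y2 ...' into float pairs."""
    return [tuple(float(v) for v in pair.split(',')) for pair in points.split()]


def format_points(pairs):
    """Serialize coordinate pairs back to 'x1,y1 x2,y2 ...', rounded to integers."""
    return ' '.join(f'{round(x)},{round(y)}' for x, y in pairs)


def rewrite_pcgts(root, image_filename, transform):
    """Modify `root` in place: point the Page at `image_filename` and
    map every Coords/@points through `transform(x, y) -> (x, y)`."""
    for elem in root.iter():
        name = local(elem.tag)
        if name == 'Page':
            elem.set('imageFilename', image_filename)
        elif name == 'Coords' and elem.get('points'):
            pairs = [transform(x, y) for x, y in parse_points(elem.get('points'))]
            elem.set('points', format_points(pairs))
    return root


# Hypothetical minimal input document, for illustration only.
SAMPLE = '''<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="original.png">
    <TextRegion id="r1"><Coords points="10,10 110,10 110,60 10,60"/></TextRegion>
  </Page>
</PcGts>'''


def toy_dewarp(x, y):
    """Hypothetical stand-in for a real dewarping map: a pure shift."""
    return x + 5, y - 2


root = rewrite_pcgts(ET.fromstring(SAMPLE), 'dewarped.png', toy_dewarp)
```

Calling the same function with the inverse mapping and the original image filename would produce the backward direction, provided the mapping is actually invertible and rounding errors are acceptable.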
So any later processing step will only "see" the dewarped image and use its coordinate system. Whenever we want to transform back, we'll have to take the current PcGts, look up the earlier DwGts, and create a new PcGts by replacing the `/PcGts/Page/@imageFilename` with `/DwGts/DocumentImage/@filename` and inverse transforming all coordinates according to `/DwGts/Grid`. This could be at the final ingest, or some intermediate step.

Potential problems:
- `application/vnd.prima.page+xml` does not look very discriminative. Or is there any other facility that could distinguish 2. and 3. in the dewarping fileGrp?
- `dewarped` in `PcGts/Page/AlternativeImage[*]/@comments`, or via some general mechanism in METS (like `mets:file/mets:groupid` or `mets:file/mets:transformFile` or generally representing all workflow dependencies via `mets:digiprovMD`)?
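As a fallback for telling files 2. and 3. apart when both carry the same MIME type, a consumer could peek at the root element of each XML file. The sketch below is an assumption for illustration, not a mechanism proposed in this thread: `root_name` is a hypothetical helper, and it relies only on the root elements being named `PcGts` and `DwGts` as in the paths above.

```python
import io
import xml.etree.ElementTree as ET


def root_name(source):
    """Return the local name of an XML document's root element.

    `source` may be a filename or a binary file object. Parsing stops at
    the first start tag, so the rest of the document is not processed.
    """
    for _event, elem in ET.iterparse(source, events=('start',)):
        return elem.tag.rsplit('}', 1)[-1]


# Hypothetical in-memory documents standing in for files 2. and 3.
pcgts = io.BytesIO(b'<PcGts xmlns="urn:example:a"><Page/></PcGts>')
dwgts = io.BytesIO(b'<DwGts xmlns="urn:example:b"><DocumentImage filename="a.png"/></DwGts>')

kinds = [root_name(pcgts), root_name(dwgts)]
```

This still requires opening every file, so it does not replace a discriminating attribute in METS; it only shows that the two documents remain distinguishable by content.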