Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Region cropping problem #27

Merged
merged 29 commits into from
Jan 14, 2020
Merged

Region cropping problem #27

merged 29 commits into from
Jan 14, 2020

Conversation

khurramHashmi
Copy link
Collaborator

This branch is ready to merge with master branch. Please test first and then merge.

ocrd_anybaseocr/cli/ocrd_anybaseocr_block_segmentation.py Outdated Show resolved Hide resolved
requirements.txt Outdated Show resolved Hide resolved
@@ -76,7 +76,7 @@
"output_file_grp": ["OCR-D-IMG-DEWARP"],
"parameters": {
"imgresize": { "type": "string", "default": "resize_and_crop", "description": "run on original size image"},
"pix2pixHD": { "type": "string", "default":"/home/ahmed/project/pix2pixHD", "description": "Path to pix2pixHD library"},
"pix2pixHD": { "type": "string", "default":"/home/jenckel/pix2pixHD/pix2pixHD", "description": "Path to pix2pixHD library"},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a good default. Please provide models via setup package_data or subrepo and a makefile rule.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it really good practice, to link multiple 100 MB models in the setup?
I'd prefer to provide a link to the sources in the README and output a warning/error if they are missing at runtime

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it really good practice, to link multiple 100 MB models in the setup?

No, of course not. But a subrepo for that (which you can link to with relative paths in your README and tool json default etc) is good practise.

I'd prefer to provide a link to the sources in the README and output a warning/error if they are missing at runtime

Yes, that's fine. (This comment was about the absolute pathname which appeared in the original version)

requirements.txt Outdated Show resolved Hide resolved
@@ -136,10 +143,6 @@ def _process_segment(self,page_image, page, page_xywh, page_id, input_file, n):
file_grp=self.image_grp
)
page.add_AlternativeImage(AlternativeImageType(filename=file_path, comments=page_xywh['features']))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this processor do? Provide a derived image of the page with colors/frames for segments? That's not what segmentation in a PAGE-XML annotation workflow looks like. You need to add regions (and remove existing regions).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have opened #31 to address this

ocrd_anybaseocr/cli/ocrd_anybaseocr_textline.py Outdated Show resolved Hide resolved
Comment on lines 99 to 112
points = region.Coords.get_points()
points = points.split(" ")

x_min = min(int(points[0].split(",")[0]), int(points[1].split(",")[0]), int(points[2].split(",")[0]), int(points[3].split(",")[0]))
x_max = max(int(points[0].split(",")[0]), int(points[1].split(",")[0]), int(points[2].split(",")[0]), int(points[3].split(",")[0]))
y_min = min(int(points[0].split(",")[1]), int(points[1].split(",")[1]), int(points[2].split(",")[1]), int(points[3].split(",")[1]))
y_max = max(int(points[0].split(",")[1]), int(points[1].split(",")[1]), int(points[2].split(",")[1]), int(points[3].split(",")[1]))

if x_max>page_image.size[0]:
x_max = page_image.size[0]-1
if y_max>page_image.size[1]:
y_max = page_image.size[1]-1

img__ = page_image.crop((x_min,y_min,x_max,y_max))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is totally inadequate. What if the region already has alternative images? What if it has @orientation (from deskewing)?

The region_xywh you are passing to _process_segment along with img__ does not match. It will give wrong coordinates.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What made you bypass image_from_segment's region_image?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the problem was exactly that it already had an alternative image
we tried to "merge" the following two pipelines:
raw image --> binarization --> deskew --> cropping --> tiseg ---> textline with
raw image --> block segmentation --(region+alternative image)--> textline
per default block segmentation adds an alternative image for each extracted region which means textline will take the alternative image from the raw image over applying the extracted regions to the binarized/deskewed etc. image (both of which textline needs)
We are aware that the pipeline was not supposed to have multiple branches, but with the dependencies of our methods there is no way for a single straight forward pipeline, so we tried to use the API as best as possible.
we ended up with three choices:
either uncouple the pipelines and perform: raw image --> block segmentation --> binarization --> deskew --> textline independently, apply the regions manually (which we tried to do) or not add an alternative image during block segmentation (which may be better?)
What do you think would be the best option? (We tried to explain the problem before, and I hope its understandable now, but communicating the problem turned out to be difficult.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

per default block segmentation adds an alternative image for each extracted region which means textline will take the alternative image from the raw image over applying the extracted regions to the binarized/deskewed etc. image (both of which textline needs)

This is not a valid sentence in English, and I don't understand it.

We are aware that the pipeline was not supposed to have multiple branches, but with the dependencies of our methods there is no way for a single straight forward pipeline, so we tried to use the API as best as possible.

What the API can give you are filters and selector on operations already performed earlier in the workflow. It cannot provide the workflow itself for a processor! (That's up to the user to decide. But of course, you can document what a good setup is for your tools.)

However, the processor can constrain its input images, so it does not pick up the wrong annotations when there is a freedom of choice, and stops execution when put in the wrong (place of a) pipeline.

So I don't see a problem with the above workflow. Just that your textline segmentation needs 2 input file groups then (to merge their regions I guess).

You could even be very restrictive to prevent other workflows:

  • binarization: filter binarized,deskewed,cropped page images
  • deskewing: select binarized, filter deskewed,cropped page images
  • cropping: select binarized,deskewed, filter cropped page images
  • tiseg: select binarized,deskewed,cropped page images
  • block segmentation: filter binarized,deskewed,cropped page images
  • textline: select binarized,deskewed,cropped region images on input file group 1, filter binarized,deskewed,cropped region images on input file group 2

But you don't need to do this (and I would recommend against it, except where an algorithm just cannot handle the input otherwise).

We tried to explain the problem before, and I hope its understandable now, but communicating the problem turned out to be difficult.

You can always turn to the Gitter Lobby for general questions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The region_xywh you are passing to _process_segment along with img__ does not match. It will give wrong coordinates.

What made you bypass image_from_segment's region_image?

This has been resolved by 13d855e (although the wrong method still remains as commented code).

But the rest of the discussion was about filters/selectors. You took my advice to the extreme in 0ac8567 – hence my comment below

ocrd_anybaseocr/cli/ocrd_anybaseocr_block_segmentation.py Outdated Show resolved Hide resolved
ocrd_anybaseocr/cli/ocrd_anybaseocr_block_segmentation.py Outdated Show resolved Hide resolved
ocrd_anybaseocr/cli/ocrd_anybaseocr_block_segmentation.py Outdated Show resolved Hide resolved
ocrd_anybaseocr/cli/ocrd_anybaseocr_block_segmentation.py Outdated Show resolved Hide resolved
@bertsky
Copy link
Contributor

bertsky commented Dec 16, 2019

Regarding the CI failures, I believe you should try to update/renew your CircleCI build cache. Also adding some pip install -U pip would be advisable:

You are using pip version 9.0.1, however version 19.3.1 is available.

Copy link
Contributor

@bertsky bertsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from the problem when no border is annotated, I still recommend against making your processors that strict (maximally filtering/selecting image features). This should mostly be up to workflow configuration (for which you could give hints and recommendations in the README and tool json descriptions).

@mahmed1995 mahmed1995 merged commit 8a39f3a into master Jan 14, 2020
@wrznr
Copy link

wrznr commented Jan 14, 2020

Is it save to merge a PR for which the continuous integration tests failed? Is it save for users to update their working installation?

@mjenckel
Copy link
Collaborator

mjenckel commented Jan 14, 2020

problem was related to circleci not finding tensorflow 2.0 (as already mentioned by @bertsky)
should be fixed

@bertsky
Copy link
Contributor

bertsky commented Jan 15, 2020

There are so many unresolved comments above – it's not clear what additional commits address which problems. But I believe quite a few errors remain. @khurramHashmi You make it extremely hard to collaborate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants