Refactoring of ocrd_tesserocr common functionality into core #268

kba · 2019-08-01T17:47:19Z

Start implementing OCR-D/ocrd_tesserocr#49

codecov-io · 2019-08-01T17:53:23Z

Codecov Report

Merging #268 into master will decrease coverage by 5.07%.
The diff coverage is 42.64%.

@@            Coverage Diff             @@
##           master     #268      +/-   ##
==========================================
- Coverage   98.07%   92.99%   -5.08%     
==========================================
  Files          30       30              
  Lines        1350     1485     +135     
  Branches      268      287      +19     
==========================================
+ Hits         1324     1381      +57     
- Misses         15       92      +77     
- Partials       11       12       +1
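As a quick sanity check, the percentages in the report can be reproduced from the raw counts. This is a minimal sketch that assumes coverage is simply hits divided by all tracked lines (hits + misses + partials); `coverage_percent` is a hypothetical helper, not part of Codecov.

```python
def coverage_percent(hits, misses, partials):
    """Return line coverage in percent, counting partial lines as not covered."""
    total = hits + misses + partials
    return 100.0 * hits / total


# Counts taken from the "Hits", "Misses" and "Partials" rows of the report.
master = coverage_percent(hits=1324, misses=15, partials=11)
pull_request = coverage_percent(hits=1381, misses=92, partials=12)
delta = pull_request - master

print(f"master: {master:.4f}%")
print(f"#268:   {pull_request:.4f}%")
print(f"delta:  {delta:+.4f} percentage points")
```

The unrounded drop comes out at roughly 5.077 points, which may explain why the report shows 5.07% in one place and 5.08% in another, depending on whether the value is truncated or rounded.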

Impacted Files	Coverage Δ
ocrd/ocrd/workspace.py	`66.86% <16.66%> (-24.29%)`	⬇️
ocrd_utils/ocrd_utils/__init__.py	`76.07% <59.75%> (-16.52%)`	⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e678c0d...5332936.

bertsky

Looks good. I see a few possible improvements...

ocrd_utils/ocrd_utils/__init__.py

ocrd_utils/requirements.txt

bertsky · 2019-08-04T08:31:41Z

Wait a minute! Where is coordinates_for_segment?

When you add this, please be sure to apply OCR-D/ocrd_tesserocr#68 as well!
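The title of OCR-D/ocrd_tesserocr#68 is "no conversion to unsigned", and a minimal sketch can show why unsigned coordinates are risky. This is not the code from either repository: `shift_polygon` and `as_uint32` are hypothetical helpers, the polygon and offset are made up, and `ctypes.c_uint32` only emulates what a cast to an unsigned 32-bit type would do.

```python
import ctypes


def shift_polygon(polygon, dx, dy):
    """Translate a polygon given as [(x, y), ...] by (dx, dy), keeping signed ints."""
    return [(x + dx, y + dy) for x, y in polygon]


def as_uint32(value):
    """Emulate a cast to an unsigned 32-bit integer; negative values wrap around."""
    return ctypes.c_uint32(value).value


# Hypothetical segment polygon and a negative offset (assumption for illustration).
relative = [(2, 3), (10, 3), (10, 8), (2, 8)]
signed = shift_polygon(relative, dx=-5, dy=0)
unsigned = [(as_uint32(x), as_uint32(y)) for x, y in signed]

print(signed[0])    # x is slightly negative, which is still meaningful
print(unsigned[0])  # x has wrapped around to a huge positive number
```

With signed integers a point just outside the image stays close to it; after an unsigned cast the same point lands billions of pixels away, which is hard to spot downstream.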

kba · 2019-08-07T15:24:57Z

> Wait a minute! Where is coordinates_for_segment?

It's there now. The rest of common.py has been turned into workspace methods.

More tests would be wise but the ocrd_tesserocr test suite passes.

bertsky

Looks very good already. I am glad you adopted the Workspace method option. I will start a PR afterwards with test cases for the 3 new methods as well, if you like. (Based on assets, if that is not asking too much.)

(Please also see unresolved comments from last time.)

CHANGELOG.md

ocrd/ocrd/workspace.py

bertsky · 2019-08-12T13:53:44Z

> More tests would be wise but the ocrd_tesserocr test suite passes.

Yes, ocrd_tesserocr is very much in need of more test coverage now, and here image_from_page, image_from_segment and save_image_file should be covered by tests. As for the latter, I believe we would need some real-life workspaces (as in assets) to cover this.

bertsky

Becoming less and less certain of this. @kba what do you think, is this too much?

ocrd/ocrd/workspace.py

wrznr · 2019-08-21T08:07:57Z

With regard to cropping vs. cutting (vs. segmenting?): Using the term cropping for localizing a page's border was a bad choice right from the start, because it mixes the intellectual process of finding the borders with the physical process of separating the OCR-relevant parts of the actual image from the irrelevant ones. Using cutting does not improve things, IMHO. The more I think about it, the more meaningful the term (page-level) segmentation seems to me, because this is what cropping currently does: it localizes the segment "page" on an image file. We could then use cropping in its intended sense.
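The distinction drawn here, finding the border versus physically separating the pixels, can be sketched as two separate steps. This is a toy illustration only: `find_border` and `crop` are hypothetical names, the "image" is a small nested list rather than a PIL image, and nothing here reflects the actual Workspace methods.

```python
def find_border(image):
    """Segmentation step: locate the box (x0, y0, x1, y1) around non-zero pixels.

    Only coordinates are returned; the image itself is left untouched.
    """
    rows = [y for y, row in enumerate(image) if any(row)]
    cols = [x for x in range(len(image[0])) if any(row[x] for row in image)]
    return cols[0], rows[0], cols[-1] + 1, rows[-1] + 1


def crop(image, box):
    """Cropping step: cut the pixels inside the box out of the image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


# Toy 4x4 image: zeros are irrelevant margin, ones are page content.
image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

box = find_border(image)  # annotation only, could be stored as coordinates
page = crop(image, box)   # derived image, produced only when needed
print(box, page)
```

Keeping the two steps apart means the border can be recorded as an annotation without producing a new image file, and the cut can be applied later by whoever needs the pixels.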

wrznr

We should not let the terminological discussion slow us down.

kba · 2019-08-21T09:35:08Z

> We should not let the terminological discussion slow us down.

I second that, let's discuss in #289.

bertsky

Sorry @kba, I forgot to finalize my last review! Please fix in the next PR...

CHANGELOG.md

ocrd_utils/ocrd_utils/__init__.py

kba · 2019-08-21T12:31:41Z

> @kba Do you want me to make the repairs myself and commit here?

I'll fix it right away, thanks for spotting.

Fixed in 5fd2875

kba added 2 commits August 1, 2019 19:40

ocrd_utils: require PIL, numpy

d33f4ec

utils: import the utility functions from ocrd_tesserocr

4192ba5

kba requested review from wrznr and bertsky August 1, 2019 17:47

relax numpy version requirement

1cdc3b6

kba mentioned this pull request Aug 1, 2019

Adapt to utils moved to core, #49 OCR-D/ocrd_tesserocr#66

Merged

utils: more tests

cb21189

bertsky requested changes Aug 3, 2019

bertsky mentioned this pull request Aug 4, 2019

common: no conversion to unsigned in coordinates_for_segment OCR-D/ocrd_tesserocr#68

Merged

kba and others added 7 commits August 6, 2019 18:24

Merge branch 'master' into tesserocr-common

8123ead

Apply suggestions by @bertsky

e6ec6a4

Co-Authored-By: Robert Sachunsky

coordinates_for_segment, OCR-D/ocrd_tesserocr#68

5032086

utils: tests

5adc512

📝 changelog

16562f2

export PAGE namespace in to_xml, regression from OCR-D#271

22a4624

move remaining tesserocr common fns to workspace

8413ada

kba added a commit to kba/ocrd_tesserocr that referenced this pull request Aug 7, 2019

🔥 remove commons, adapt to OCR-D/core#268

5d4a3ca

kba changed the title from "[WIP] Refactoring of ocrd_tesserocr common functionality into core" to "Refactoring of ocrd_tesserocr common functionality into core" Aug 7, 2019

Merge branch 'master' into tesserocr-common

a962808

Conflicts: CHANGELOG.md ocrd/requirements.txt ocrd_modelfactory/requirements.txt tests/test_utils.py

bertsky mentioned this pull request Aug 12, 2019

generated namespace prefix is invalid #277

Closed

bertsky requested changes Aug 12, 2019

bertsky mentioned this pull request Aug 16, 2019

image_from_page / image_from_segment: Need for workspace? OCR-D/ocrd_tesserocr#65

Closed

kba added 2 commits August 20, 2019 19:07

📝 format utils pydoc

ac95d38

📝 format utils pydoc

677890b

kba added 4 commits August 20, 2019 19:39

Merge branch 'master' into tesserocr-common

c51e961

Conflicts: CHANGELOG.md ocrd/ocrd/workspace.py ocrd_utils/ocrd_utils/__init__.py tests/test_utils.py

📝 changelog

ef21476

🎨 in Workspace docstrings: S/crop/cut/

f1772ce

🔥 deprecate resolve_image_as_pil in favor of image_from_page and image_from_segment

50f7621
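A deprecation like the one in commit 50f7621 is commonly done by keeping the old method as a thin wrapper that warns and delegates. The sketch below shows that general pattern only: `DemoWorkspace` is a hypothetical stand-in, the method signatures and return values are assumptions, and the actual change in core may look different.

```python
import warnings


class DemoWorkspace:
    """Hypothetical stand-in; this is not the real ocrd Workspace class."""

    def image_from_page(self, page_id):
        # Placeholder for the new entry point; returns a label instead of an image.
        return f"image-for-{page_id}"

    def resolve_image_as_pil(self, page_id):
        # Old entry point: emit a warning, then delegate to the replacement.
        warnings.warn(
            "resolve_image_as_pil is deprecated, use image_from_page instead",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return self.image_from_page(page_id)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    result = DemoWorkspace().resolve_image_as_pil("page-1")

print(result, [str(w.message) for w in caught])
```

Existing callers keep working while the warning points them to the replacement, so the old method can be removed in a later release.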

bertsky requested changes Aug 20, 2019

wrznr approved these changes Aug 21, 2019

wrznr requested a review from bertsky August 21, 2019 08:09

Revert ":art: in Workspace docstrings: S/crop/cut/"

5332936

This reverts commit f1772ce.

kba mentioned this pull request Aug 21, 2019

cropping vs. cutting vs. segmenting #289

Closed

kba merged commit 27fd169 into OCR-D:master Aug 21, 2019

kba deleted the tesserocr-common branch August 21, 2019 09:45

bertsky reviewed Aug 21, 2019

kba mentioned this pull request Aug 23, 2019

coordinate transform utils have misleading mnemonics #157

Closed
