bashlib: better tooling required to handle PAGE/images in same fileGrp #571

kba · 2020-08-21T16:15:58Z

Cf. https://github.com/OCR-D/ocrd_olena/pull/68/files: this should be easier.

bertsky · 2020-08-21T17:22:06Z

…and core can also be more efficient!

This is probably not much more than a different mode for the find command: add an option --by-page that yields at most one file per pageId, preferring mimetype PAGE and falling back to "image/*", but failing if there is more than one PAGE file or more than one image file for that page. (Failure could simply mean skipping that pageId, or exiting.)
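The selection rule proposed above could be sketched roughly as follows. This is only an illustration, not the implementation in core: the `FileEntry` type, the `select_by_page` function and its `on_error` parameter are hypothetical names, and the sketch assumes that several images next to a single PAGE file are acceptable (the PAGE file wins), so that only duplicate PAGE files, or duplicate images without any PAGE file, count as a failure.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, List

# Mimetype used for PAGE-XML files in OCR-D METS (assumed constant for this sketch).
PAGE_MIMETYPE = "application/vnd.prima.page+xml"


@dataclass(frozen=True)
class FileEntry:
    """Hypothetical stand-in for one file of a fileGrp."""
    file_id: str
    page_id: str
    mimetype: str


def select_by_page(files: Iterable[FileEntry], on_error: str = "skip") -> List[FileEntry]:
    """Return at most one file per pageId, preferring PAGE over image/*.

    on_error="skip" drops ambiguous pages; on_error="raise" aborts instead.
    """
    # Group all files by their pageId (dicts keep first-seen order).
    by_page = defaultdict(list)
    for entry in files:
        by_page[entry.page_id].append(entry)

    selected = []
    for page_id, candidates in by_page.items():
        page_files = [f for f in candidates if f.mimetype == PAGE_MIMETYPE]
        image_files = [f for f in candidates if f.mimetype.startswith("image/")]
        # Prefer PAGE; only look at images when there is no PAGE file at all.
        preferred = page_files if page_files else image_files
        if len(preferred) > 1:
            if on_error == "raise":
                raise ValueError(f"ambiguous input for pageId {page_id}")
            continue  # skip this pageId
        if preferred:
            selected.append(preferred[0])
    return selected


if __name__ == "__main__":
    demo = [
        FileEntry("P1_xml", "P1", PAGE_MIMETYPE),
        FileEntry("P1_img", "P1", "image/png"),
        FileEntry("P2_img", "P2", "image/tiff"),
    ]
    for entry in select_by_page(demo):
        print(entry.page_id, entry.file_id)
```

Whether an ambiguous page should be skipped or should abort the run is left open in the proposal, which is why the sketch exposes both behaviours through a single parameter.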

IMO the best option would be to have a shared, fast implementation for both this CLI option and Processor.input_files.

bertsky · 2021-12-02T07:45:58Z

Or expose Processor.input_files directly to bashlib.

kba · 2021-12-08T10:46:47Z

Since we now have ocrd bashlib input-files, is this solved?

bertsky · 2021-12-08T10:48:57Z

Yes (just forgot to link).

bertsky mentioned this issue Aug 21, 2020

AlternativeImage should be in the fileGroup they originated from #505

Closed

EEngl52 assigned kba Sep 3, 2020

bertsky mentioned this issue Dec 2, 2021

bashlib: list-resources, show-resource, and input-files #753

Merged

bertsky closed this as completed Dec 8, 2021
