Google Cloud Vision to PAGE-XML #125

kba · 2020-04-29T12:51:08Z

It was mentioned before but @cneud just reminded me of https://github.com/PRImA-Research-Lab/cloud-vision-ocr-to-page . Should not be too hard to integrate and would allow using GCV results in OCR-D/Transkribus/OCR4all.

BTW: Has anyone experience with the Azure Computer Vision API in the context of OCR? As a sign of goodwill in times of Covid-19, they are currently offering a generous free tier including access to the vision API. Would be interesting to compare.

bertsky · 2022-11-17T15:39:49Z

BTW, the existing integration of GCV as part of the PRImA converter (`transform gcv page`, which links to the `alto__page` script) is broken: it delegates to `java -jar PageConverter.jar -source-xml $INFILE` instead of `java -jar PageConverter.jar -source-json $INFILE`:

ocr-fileformat/script/transform/alto__page

Line 19 in 8878b8a

    
           java -jar "$JAR" -neg-coords toZero -source-xml "$INFILE" -target-xml "$OUTFILE" -convert-to LATEST 2>&1
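The difference between the two invocations can be sketched as follows. This is only an illustration in Python, not the actual shell script in ocr-fileformat: the function name `build_converter_command` and the `from_format` parameter are hypothetical, while the command-line flags are the ones quoted in this thread.

```python
def build_converter_command(jar, infile, outfile, from_format):
    """Build the PageConverter argument list for one input file.

    Hypothetical helper: `from_format` is assumed to be a format name
    such as "gcv", "alto" or "page".
    """
    # GCV results are JSON, so they need -source-json; the XML formats
    # keep using -source-xml as in the existing script.
    source_flag = "-source-json" if from_format == "gcv" else "-source-xml"
    return [
        "java", "-jar", str(jar),
        "-neg-coords", "toZero",
        source_flag, str(infile),
        "-target-xml", str(outfile),
        "-convert-to", "LATEST",
    ]


if __name__ == "__main__":
    cmd = build_converter_command("PageConverter.jar", "in.json", "out.xml", "gcv")
    print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` on a machine where Java and the JAR are available.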

stweil · 2022-11-17T16:14:49Z

Thanks. So it was broken right from the beginning (commit 7332869).

bertsky · 2022-11-17T16:30:07Z

> So it was broken right from the beginning (commit 7332869).

I'm not sure. Perhaps the PRImA converter was capable of detecting the format automatically before. But it does not look like it.

Anyway, here is a fix: #156

stweil · 2022-11-17T16:30:16Z

I tried it with fixed arguments, and it fails:

    java -jar vendor/JPageConverter/PageConverter.jar -neg-coords toZero -source-json 1850-Baptis-EMU-0204.txt -target-xml 1850-Baptis-EMU-0204.xml -convert-to LATEST
    null
    Exception in thread "main" java.lang.NullPointerException: Cannot invoke "org.primaresearch.dla.page.Page.getLayout()" because "page" is null
        at org.primaresearch.dla.page.converter.PageConverter.handleNegativeCoordinates(PageConverter.java:449)
        at org.primaresearch.dla.page.converter.PageConverter.run(PageConverter.java:266)
        at org.primaresearch.dla.page.converter.PageConverter.main(PageConverter.java:161)

bertsky · 2022-11-17T16:31:09Z

> I tried it with fixed arguments, and it fails:

I know. That's because in this example, the input data is incomplete. See here
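A pre-flight check along these lines could flag incomplete input before the converter fails with a NullPointerException. It is only a sketch: the thread does not say which fields PageConverter actually requires, so the assumption here is that it needs the `fullTextAnnotation` part of a GCV response with at least one page carrying `width`, `height` and `blocks`. The function name `find_missing_gcv_fields` is hypothetical.

```python
import json


def find_missing_gcv_fields(response):
    """Return dotted names of fields this sketch expects but cannot find."""
    # A saved API reply may wrap results in a "responses" list; in that
    # case only the first entry is inspected (assumption of this sketch).
    responses = response.get("responses")
    if isinstance(responses, list) and responses:
        response = responses[0]
    annotation = response.get("fullTextAnnotation")
    if not isinstance(annotation, dict):
        return ["fullTextAnnotation"]
    pages = annotation.get("pages")
    if not pages:
        return ["fullTextAnnotation.pages"]
    missing = []
    for i, page in enumerate(pages):
        for key in ("width", "height", "blocks"):
            if key not in page:
                missing.append(f"fullTextAnnotation.pages[{i}].{key}")
    return missing


if __name__ == "__main__":
    incomplete = json.loads('{"textAnnotations": []}')
    print(find_missing_gcv_fields(incomplete))
```

An empty result only means that the fields assumed above are present, not that the conversion is guaranteed to succeed.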

bertsky · 2023-06-06T22:31:03Z

Since #156 we do have a working GCV converter here based on https://github.com/PRImA-Research-Lab/prima-page-converter, so there is no actual need for https://github.com/PRImA-Research-Lab/cloud-vision-ocr-to-page.

Comparing both implementations, IIUC we have:


| implementation | cloud-vision-ocr-to-page | prima-page-converter with json input |
| --- | --- | --- |
| external dependencies | GCV (Java API) | none (standalone) |
| usage | online (network API) | offline (JSON) |
| can also output ALTO | no | yes |
| yields `@imageFilename` | yes | no |
| yields width and height | yes | yes |
| coordinates | bbox | bbox |
| paragraphs | recursive TextRegion | recursive TextRegion |
| other region types | Image+Separator+Graphic+Table | Image+Separator+Graphic+Table |
| aggregate words to lines | yes | yes |
| confidence | yes | no |

kba · 2023-06-09T15:31:01Z

Thanks for the comparison, very helpful.


> | implementation | cloud-vision-ocr-to-page | prima-page-converter with json input |
> | --- | --- | --- |
> | external dependencies | GCV (Java API) | none (standalone) |
> | usage | online (network API) | offline (JSON) |

IMHO these are the strongest reasons against the cloud-vision-ocr-to-page approach.

It's unfortunate, though, that the confidences aren't serialized the way gcv2hocr does with x_wconf for hOCR. But with development largely stalled, there is not much we can do except rewrite it ourselves.
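For reference, reading the word confidences out of the GCV JSON could look roughly like this. It is a sketch that assumes the usual `fullTextAnnotation` nesting of pages, blocks, paragraphs, words and symbols, with a `confidence` value per word. The function name `iter_word_confidences` is hypothetical, and nothing here reflects how PageConverter or gcv2hocr are actually implemented.

```python
def iter_word_confidences(annotation):
    """Yield (word_text, confidence) pairs from a fullTextAnnotation dict."""
    for page in annotation.get("pages", []):
        for block in page.get("blocks", []):
            for paragraph in block.get("paragraphs", []):
                for word in paragraph.get("words", []):
                    # A word's text is the concatenation of its symbols.
                    text = "".join(
                        symbol.get("text", "") for symbol in word.get("symbols", [])
                    )
                    # The confidence may be absent; None is yielded then.
                    yield text, word.get("confidence")


if __name__ == "__main__":
    sample = {
        "pages": [{"blocks": [{"paragraphs": [{"words": [
            {"confidence": 0.98,
             "symbols": [{"text": "O"}, {"text": "C"}, {"text": "R"}]},
        ]}]}]}]
    }
    for text, confidence in iter_word_confidences(sample):
        print(text, confidence)
```

A converter could then write each value to the `conf` attribute of the corresponding PAGE `TextEquiv`, which would be the PAGE counterpart to x_wconf in hOCR.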

bertsky · 2023-06-09T16:21:27Z

> It's unfortunate, though, that the confidences aren't serialized the way gcv2hocr does with x_wconf for hOCR. But with development largely stalled, there is not much we can do except rewrite it ourselves.

We can (fix ourselves and) ship our own builds. I have successfully set up Eclipse and can compile most of the modules (e.g. libs, PageViewer, PageConverter).

(I have done that with PageViewer including validator error messages.)

stweil added this to the v1.0.0 milestone Jun 25, 2020

stweil mentioned this issue Aug 31, 2020

Release version 0.3.0 and 1.0.0 #120

Open
