Google Cloud Vision to PAGE-XML #125
Comments
BTW the existing integration of GCV as part of the PRImA converter (transform
Thanks. So it was broken right from the beginning (commit 7332869).
I'm not sure. Perhaps the PRImA converter was able to detect the input format automatically before, but it does not look like it. Anyway, here is a fix: #156
I tried it with fixed arguments, and it fails:
I know. That's because in this example, the input data is incomplete. See here
Since #156 we do have a working GCV converter here based on https://github.com/PRImA-Research-Lab/prima-page-converter, so there is no actual need for https://github.com/PRImA-Research-Lab/cloud-vision-ocr-to-page. Comparing both implementations, IIUC we have:
Thanks for the comparison, very helpful.
IMHO these are the strongest reasons against the It's unfortunate that the confidences aren't serialized, like gcv2hocr does with
We can (fix ourselves and) ship our own builds. I have successfully set up Eclipse and can compile most of the modules (e.g. libs, PageViewer, PageConverter). (I have done that with PageViewer including validator error messages.)
It was mentioned before, but @cneud just reminded me of https://github.com/PRImA-Research-Lab/cloud-vision-ocr-to-page. It should not be too hard to integrate and would allow using GCV results in OCR-D/Transkribus/OCR4all.
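For orientation, here is a rough sketch of what such a conversion involves. It is not the code of either PRImA tool. It assumes a single GCV `AnnotateImageResponse` already parsed from JSON (with `fullTextAnnotation`), maps each GCV paragraph to one `TextRegion` holding one pseudo `TextLine` (GCV reports no line level, so this is a simplification), and writes word confidences to `TextEquiv/@conf`. The function name `gcv_to_page` and the generated IDs are made up for illustration, and the output omits `Metadata`, so it is not schema-valid as is.

```python
import xml.etree.ElementTree as ET

# PAGE namespace (2019-07-15 schema version); adjust if your tools expect another.
PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"


def _q(tag):
    """Qualify a tag name with the PAGE namespace."""
    return f"{{{PAGE_NS}}}{tag}"


def _add_coords(parent, bounding_box):
    # GCV JSON drops x or y when the value is 0, so default to 0.
    pts = " ".join(
        f"{v.get('x', 0)},{v.get('y', 0)}"
        for v in bounding_box.get("vertices", [])
    )
    ET.SubElement(parent, _q("Coords"), points=pts)


def gcv_to_page(response, image_filename):
    """Hypothetical helper: one GCV response dict -> PAGE-XML string."""
    ET.register_namespace("", PAGE_NS)
    gcv_page = response["fullTextAnnotation"]["pages"][0]
    root = ET.Element(_q("PcGts"))
    page = ET.SubElement(root, _q("Page"), {
        "imageFilename": image_filename,
        "imageWidth": str(gcv_page["width"]),
        "imageHeight": str(gcv_page["height"]),
    })
    n = 0
    for block in gcv_page.get("blocks", []):
        for para in block.get("paragraphs", []):
            n += 1
            region = ET.SubElement(page, _q("TextRegion"), id=f"r{n}")
            _add_coords(region, para["boundingBox"])
            # Assumption: the whole paragraph becomes a single line.
            line = ET.SubElement(region, _q("TextLine"), id=f"r{n}_l1")
            _add_coords(line, para["boundingBox"])
            for w, word in enumerate(para.get("words", []), start=1):
                el = ET.SubElement(line, _q("Word"), id=f"r{n}_l1_w{w}")
                _add_coords(el, word["boundingBox"])
                attrs = {}
                if "confidence" in word:
                    # Both GCV confidence and PAGE @conf range from 0 to 1.
                    attrs["conf"] = str(word["confidence"])
                equiv = ET.SubElement(el, _q("TextEquiv"), attrs)
                text = "".join(s["text"] for s in word.get("symbols", []))
                ET.SubElement(equiv, _q("Unicode")).text = text
    return ET.tostring(root, encoding="unicode")
```

A real converter would also need to reconstruct lines (for example from the detected break information on symbols), add `Metadata`, and validate against the PAGE schema.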
BTW: Does anyone have experience with the Azure Computer Vision API in the context of OCR? As a sign of goodwill in times of Covid-19, they are currently offering a generous free tier including access to the vision API. It would be interesting to compare.