Feature Request: Add Paddle OCR recognition to Bookworm #129

Open
cary-rowen opened this issue Mar 13, 2022 · 4 comments
Labels
enhancement New feature or request

Comments

@cary-rowen
Collaborator

Is your feature request related to a problem? Please describe.

Neither Windows 10 OCR nor Tesseract OCR gives satisfactory recognition results for Simplified Chinese text.

Describe the solution you'd like

There is an open-source project called Paddle OCR that supports multiple languages. In the Chinese and English scenarios I tested, its recognition rate exceeded that of both Windows OCR and Tesseract OCR.
It would be nice if Bookworm could add a Paddle OCR recognition engine.

Describe alternatives you've considered

None

Additional context

Several screen readers in China already distribute Paddle OCR as a bundled component, and so far everything seems to be working fine.
Paddle OCR repo: https://github.com/PaddlePaddle/PaddleOCR/

@mush42
Collaborator

mush42 commented Mar 13, 2022

Hello @cary-rowen
I investigated adding this OCR engine to Bookworm.
The main roadblock here is that adding it will increase the bundle size significantly.
What do those screen readers do to embed this engine?
How much did their bundle size increase after adding it?
Perhaps there is another, simpler way to embed this engine that I'm not aware of.

Best
Musharraf

@cary-rowen
Collaborator Author

Hi @mush42

As far as I know, bundling it adds less than 20 MB to those screen readers. Paddle OCR offers different recognition models, so we might consider adding a lightweight built-in model to Bookworm.
I will look for more useful information for you.
Thanks

@mush42
Collaborator

mush42 commented Mar 22, 2022

Hello @cary-rowen

I've been studying Paddle OCR and the ways it can be added to Bookworm without bringing in a huge number of additional dependencies.

The major issue is that most of the development documentation is written in Chinese, but through Google Translate, I was able to understand the basics of the process.

Paddle OCR can be embedded in one of the following two ways:

  1. Use Paddle OCR's C++ interface to create Python bindings:
    The benefit of this is speed, because all of the processing will happen in C++. Also, the Python bindings will be reusable; for instance, we could create an NVDA add-on based on them.
    The major downside is that it requires a lot of time for initial development and testing.

  2. Use ONNX Runtime:
    Paddle has official support for ONNX Runtime, but I couldn't find any official confirmation from the Paddle developers as to whether all of the models support it.
    ONNX Runtime is fairly fast, but the major downside is that the majority of the OCR processing happens in Python, which is of course slower.
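
To make the "processing happens in Python" point of option 2 concrete, here is a minimal sketch of one such step: turning a text-recognition model's raw output into a string with greedy CTC decoding. It is an illustration only, not Bookworm or Paddle OCR code. It assumes the model emits one row of class scores per timestep and that class 0 is the CTC blank; the function name `ctc_greedy_decode`, the two-character `charset`, and the score values are all hypothetical. In a real integration the scores would presumably come from an ONNX Runtime inference call rather than a hard-coded list.

```python
from typing import List, Sequence


def ctc_greedy_decode(
    scores: Sequence[Sequence[float]],
    charset: Sequence[str],
    blank_index: int = 0,
) -> str:
    """Greedy CTC decoding: pick the best class per timestep,
    collapse consecutive repeats, and drop blanks.

    scores      -- one row per timestep, one column per class
                   (assumed layout, not confirmed for every Paddle model)
    charset     -- characters for the non-blank classes, in class order
    blank_index -- class index reserved for the CTC blank (assumed 0)
    """
    decoded: List[str] = []
    previous = blank_index
    for row in scores:
        # Index of the highest-scoring class at this timestep.
        best = max(range(len(row)), key=row.__getitem__)
        if best != blank_index and best != previous:
            # Skip over the blank slot when mapping class index to character.
            char_pos = best if best < blank_index else best - 1
            decoded.append(charset[char_pos])
        previous = best
    return "".join(decoded)


# Hypothetical scores for 4 timesteps over the classes [blank, "a", "b"].
example_scores = [
    [0.1, 0.8, 0.1],    # "a"
    [0.1, 0.7, 0.2],    # "a" again, collapsed as a repeat
    [0.9, 0.05, 0.05],  # blank
    [0.2, 0.1, 0.7],    # "b"
]
text = ctc_greedy_decode(example_scores, ["a", "b"])
print(text)  # prints "ab"
```

Loops like this, together with image resizing and text-box cropping, are the kind of work that would stay in Python under option 2, whereas option 1 would keep it inside the C++ library.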

The above technical details are notes to my future self, and for other interested parties.

Best
Musharraf

@cary-rowen
Collaborator Author

Hi @mush42, great to hear you're working on this.
I will try to ask the authors of Paddle OCR whether all models support ONNX Runtime.
By the way, I really like the advantages of the first option: it means more projects could use Paddle OCR, especially an NVDA add-on.
All in all, good luck to you.

thanks

@DraganRatkovich DraganRatkovich added the enhancement New feature or request label Jun 14, 2022
3 participants