Hi,
Maybe it's already a feature, but when I use ocrmypdf I need invoice2data to pass the "deskew" parameter to it, and also to set the "redo_ocr" and "force_ocr" parameters.
Cheers,
Adrián
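For illustration, here is a minimal sketch of how such options could be turned into an ocrmypdf command line. The helper name `build_ocrmypdf_cmd` and its parameters are hypothetical and not part of invoice2data; only the `--deskew`, `--redo-ocr` and `--force-ocr` flags come from the ocrmypdf CLI. As far as I know, ocrmypdf accepts only one of `--redo-ocr` and `--force-ocr` per run, so the sketch rejects that combination up front.

```python
from typing import List


def build_ocrmypdf_cmd(
    input_pdf: str,
    output_pdf: str,
    deskew: bool = False,
    redo_ocr: bool = False,
    force_ocr: bool = False,
) -> List[str]:
    """Hypothetical helper: build an ocrmypdf argument list from options."""
    # Assumption: ocrmypdf treats --redo-ocr and --force-ocr as mutually
    # exclusive, so fail early instead of letting the subprocess fail.
    if redo_ocr and force_ocr:
        raise ValueError("choose only one of redo_ocr and force_ocr")

    cmd = ["ocrmypdf"]
    if deskew:
        cmd.append("--deskew")
    if redo_ocr:
        cmd.append("--redo-ocr")
    if force_ocr:
        cmd.append("--force-ocr")
    # Positional arguments: input file first, then output file.
    cmd += [input_pdf, output_pdf]
    return cmd
```

The resulting list could be handed to `subprocess.run(cmd, check=True)`; how invoice2data would expose these options to the user is left open here.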
I have noticed that, in some cases, Tesseract may recognize more text with --psm 1 instead of --psm 6 (the default for invoice2data defined in src/invoice2data/input/tesseract.py).
One possibility would be to call invoice2data with '--psm 1', which would override the default defined in src/invoice2data/input/tesseract.py for that parameter only:
# invoice2data --input-reader tesseract --psm 1...
Another possibility would be to have default values for parameters defined in a .conf file. IMO, this would be far less practical, as --psm 3 works in most cases and, in my experience, needs to be overridden only occasionally.
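The first option could look roughly like the sketch below. It is not the actual invoice2data code: `DEFAULT_PSM`, `build_tesseract_cmd` and `parse_args` are hypothetical names, the `--psm` option on the invoice2data side is the proposal from this comment rather than an existing flag, and the default of 6 is taken from the description above. Only `-l` and `--psm`, plus the `stdin`/`stdout` placeholders, are real Tesseract CLI arguments.

```python
import argparse
from typing import List, Optional

# Assumption: 6 is the built-in default, as described in the comment above.
DEFAULT_PSM = 6


def build_tesseract_cmd(psm: int = DEFAULT_PSM, lang: str = "eng") -> List[str]:
    """Hypothetical helper: Tesseract command reading stdin, writing stdout."""
    return ["tesseract", "stdin", "stdout", "-l", lang, "--psm", str(psm)]


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Hypothetical CLI: --psm overrides the default for this run only."""
    parser = argparse.ArgumentParser(prog="invoice2data-sketch")
    parser.add_argument("--input-reader", default="tesseract")
    parser.add_argument("--psm", type=int, default=DEFAULT_PSM)
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(build_tesseract_cmd(psm=args.psm))
```

With this shape, the default stays in one place in the code and a single run can still use `--psm 1` without touching any configuration file.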
I would be happy to provide help in testing but I have no experience in Python.