Options --psm 10 and digits are not taken into account. #2159

Open
ruimo opened this issue Jan 15, 2019 · 4 comments

ruimo commented Jan 15, 2019

Environment

  • Tesseract 4.0.0
  • macOS 10.13.3

Current Behavior:

$ tesseract 000.png stdout --psm 10 digits
Warning: Invalid resolution 0 dpi. Using 70 instead.
iL

000.png: 000

Expected Behavior:

Since --psm 10 is specified, the output 'iL' is invalid. A single character should be recognized.
As 'digits' is specified as a config file, 'iL' is invalid. A digit character (0123456789-.) should be recognized.

Suggested Fix:

Collaborator

amitdo commented Jan 15, 2019

The content of the digits config file:
tessedit_char_whitelist 0123456789-.

See #751.
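
To make the two expectations in this report concrete, here is a minimal Python sketch that checks an OCR result against them. The helper name `check_output` is hypothetical and not part of Tesseract; the whitelist string is copied from the digits config quoted above. It only validates output after the fact and does not change what Tesseract recognizes.

```python
# Whitelist copied from the digits config file quoted above.
DIGITS_WHITELIST = set("0123456789-.")


def check_output(text, whitelist=DIGITS_WHITELIST):
    """Return a list of reasons why `text` violates the expectations.

    Hypothetical helper for illustration only; not a Tesseract API.
    """
    problems = []
    stripped = text.strip()
    # --psm 10 treats the image as a single character.
    if len(stripped) != 1:
        problems.append("expected 1 character, got %d" % len(stripped))
    # Every recognized character should come from the whitelist.
    bad = sorted(set(stripped) - whitelist)
    if bad:
        problems.append("characters outside whitelist: " + "".join(bad))
    return problems


print(check_output("iL"))  # both expectations are violated
print(check_output("0"))   # no violations
```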

Author

ruimo commented Jan 15, 2019

Thanks, I updated the expected behaviour.

Contributor

bertsky commented Mar 23, 2019

I have looked into this via debugging (adding DebugBeams at the end of RecodeBeamSearch::Decode). For the whitelisting, I used branch #2294.

Since --psm 10 is specified, the output 'iL' is invalid. A single character should be recognized.

True. From what I can tell, PSM_SINGLE_CHAR does not work for LSTMs. Getting two characters out of one is a known error; I coined the name "diplopia" problem for it when presenting my analysis for #2339.

As 'digits' is specified as a config file, 'iL' is invalid. A digit character (0123456789-.) should be recognized.

True again. As @amitdo and @Shreeshrii pointed out, whitelisting will be reinstated for LSTMs once #2294 is merged. But even with that branch, nothing can be decoded for this example (the result is empty), because the beam is too narrow.

The beam can be made wider, e.g. by changing ComputeTopN so that it keeps looking beyond the first and second best input alternatives (which are null and "." in this case). With that modification I do get "1" as the result, but the implications should be discussed in #2294.

@Shreeshrii
Collaborator

@ruimo

Please try my fine-tuned traineddata, which is limited to digits and some punctuation (instead of the digits config).

digits_layer.traineddata
digitsall_layer.traineddata

from https://github.com/Shreeshrii/tessdata_shreetest
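
As a rough sketch of how one of these files could be tried, the Python snippet below builds a tesseract command line that selects the model with `-l` and points `--tessdata-dir` at the folder containing it. The helper names `build_command` and `recognize` are hypothetical, and `./tessdata_shreetest` is only a placeholder for wherever the files were downloaded.

```python
import subprocess


def build_command(image, lang, tessdata_dir, psm=10):
    """Build a tesseract command line that uses a specific traineddata file.

    `lang` is the traineddata file name without the .traineddata extension.
    """
    return [
        "tesseract", image, "stdout",
        "--tessdata-dir", tessdata_dir,
        "-l", lang,
        "--psm", str(psm),
    ]


def recognize(image, lang, tessdata_dir, psm=10):
    """Run tesseract and return its standard output (needs tesseract on PATH)."""
    cmd = build_command(image, lang, tessdata_dir, psm)
    result = subprocess.run(cmd, stdout=subprocess.PIPE, check=True,
                            universal_newlines=True)
    return result.stdout


# Print the command instead of running it, so the sketch works without tesseract.
# "./tessdata_shreetest" is a placeholder path, not a location stated in this thread.
print(" ".join(build_command("000.png", "digits_layer", "./tessdata_shreetest")))
```

Calling `recognize("000.png", "digits_layer", "./tessdata_shreetest")` would run the same command and return the recognized text.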
