Arabic trained-data produce 20% accuracy #250

ibrahimAlii · 2018-09-11T16:44:27Z

Summary:

With the English trained data it works very well, but with Arabic I am required to copy all of the cube data files, and the output quality is still poor.

Steps to reproduce the issue:

Input any Arabic digits/words.
Get the Utf8Text().

Expected result:
I should get correct data.

Actual result:
I got a weird result.

Tess-two version:
8.0.0

Android version:
28

Phone/device model:
Pixel

Link to training data used:
https://github.com/tesseract-ocr/tessdata/blob/3.04.00/ara.traineddata

Link to image used as input:

http://3.bp.blogspot.com/-CZRdjlj2ybU/TkAbU6C4RWI/AAAAAAAAAAw/n4Hej0ct3rw/s1600/ind.jpg


rmtheis · 2018-09-13T00:57:38Z

Thanks for the bug report. It's not entirely clear to me what the problem is because you just said you get a "weird result." Maybe try different page segmentation modes and try using different portions of the input image.

Most likely your issue is not a bug and this is working as intended.
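
As a rough illustration of trying several page segmentation modes, here is a minimal Java sketch. The `PsmSweep` class and its `Recognizer` interface are hypothetical names introduced only for this example. On Android the lambda would be expected to wrap tess-two's `TessBaseAPI.setPageSegMode(...)` followed by `getUTF8Text()`, which is an assumption about how your app is wired. The numeric values follow Tesseract's `PageSegMode` enum.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PsmSweep {
    /** Hypothetical hook: run OCR once with the given page segmentation mode. */
    public interface Recognizer {
        String recognize(int pageSegMode);
    }

    // Values taken from Tesseract's PageSegMode enum.
    public static final int PSM_SINGLE_BLOCK = 6;
    public static final int PSM_SINGLE_LINE = 7;
    public static final int PSM_SINGLE_WORD = 8;
    public static final int PSM_SINGLE_CHAR = 10;

    /** Runs the recognizer once per mode and returns mode -> trimmed text, in call order. */
    public static Map<Integer, String> sweep(Recognizer ocr, int... modes) {
        Map<Integer, String> results = new LinkedHashMap<>();
        for (int mode : modes) {
            String text = ocr.recognize(mode);
            results.put(mode, text == null ? "" : text.trim());
        }
        return results;
    }

    public static void main(String[] args) {
        // Stand-in recognizer so the sketch runs without Android.
        // With tess-two this would be roughly (assumption):
        //   mode -> { baseApi.setPageSegMode(mode); return baseApi.getUTF8Text(); }
        Recognizer fake = mode -> "text for mode " + mode;
        sweep(fake, PSM_SINGLE_BLOCK, PSM_SINGLE_LINE, PSM_SINGLE_WORD, PSM_SINGLE_CHAR)
            .forEach((mode, text) -> System.out.println(mode + ": " + text));
    }
}
```

Comparing the per-mode output side by side, on both the full image and a cropped portion of it, should make it easier to tell whether the problem is layout analysis or the trained data itself.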

ibrahimAlii · 2018-09-22T15:11:56Z

@rmtheis Please check the image below.

The result should look like the picture. Instead, I got some Arabic characters in place of digits, and some digits were misread: I got nine instead of one, "ها" instead of three, and "هلا" instead of six.
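
To put a number on how far the output is from the picture, one option is a character-level accuracy based on edit distance. This is only a sketch: `OcrAccuracy` and its methods are names made up for this example, not part of tess-two, and the strings in `main` are placeholders rather than the actual OCR output.

```java
public class OcrAccuracy {
    /** Levenshtein distance: minimum insertions, deletions, and substitutions to turn a into b. */
    public static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of allocating a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Returns 1 - (edit distance / expected length), clamped to the range [0, 1]. */
    public static double accuracy(String expected, String actual) {
        if (expected.isEmpty()) {
            return actual.isEmpty() ? 1.0 : 0.0;
        }
        double errorRate = (double) editDistance(expected, actual) / expected.length();
        return Math.max(0.0, 1.0 - errorRate);
    }

    public static void main(String[] args) {
        // Placeholder strings: one wrong character out of five.
        System.out.println(accuracy("12345", "92345"));
    }
}
```

Passing the text you expect and the string returned by the OCR call to `accuracy` would give a figure that can be compared across images and settings.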

rmtheis closed this as completed Sep 13, 2018
