
An option to output scores and alternate chars. #25

Closed

Conversation

danvk (Contributor) commented Jan 14, 2015

When requested, this data is written into a .alts.json file alongside the line image.

For example:

$ ./ocropus-rpred --alternates -n -m models/en-default.pyrnn.gz book/0001/010004.bin.png
$ cat book/0001/010004.alts.json
[
  [
    [
      "a",
      0.78196890563642629
    ],
    [
      "s",
      0.33257442535877763
    ]
  ],
  ...
]

Fixes #16
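A downstream script could consume such a file like this (a minimal sketch; the function name and the score cutoff are illustrative, not part of this PR):

```python
import json

def load_alternates(path, min_score=0.1):
    """Read a .alts.json file and keep only alternates above a score cutoff.

    The file holds one list per character position; each entry is a
    (char, score) pair, best match first.
    """
    with open(path) as f:
        positions = json.load(f)
    return [[(c, s) for c, s in pos if s >= min_score] for pos in positions]
```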

tmbdev (Collaborator) commented Jan 29, 2015

Thanks for giving this a try. I don't think this works very well, though. Alternative characters frequently don't occur in the same place, so your code will fail to pick up important recognition alternatives. For example, the alternative to an "m" is "rn", but neither the "r" nor the "n" is going to be where the "m" is. At the very least, you'd have to output all candidates as hypotheses with potentially overlapping bounding boxes. The old recognizer had a .boxes format for that, and if one wanted to go that route, it would probably be best to stick with the same format.

A better solution is to output a recognition lattice. The best format for that is probably OpenFST binary or text format.

A good way to output a recognition lattice is to take the posterior probability at each output x coordinate and build a WFST out of that (strictly linear, with c transitions between state x and state x+1, where c is the number of classes). That can then be matched against a simple model of the form ε+([^ε]+ε+)*. The result can be thresholded and then output as a recognition lattice. Instead of the unconstrained model, you can also use a simple bigraph or trigraph model to pre-select reasonable interpretations of the input.
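The linear-lattice construction described above can be sketched in plain Python data structures (this is not the pyopenfst API; the arc format and pruning threshold are illustrative):

```python
import math

def linear_lattice(posteriors, threshold=1e-3):
    """Build a strictly linear lattice from per-column class posteriors.

    posteriors: one dict per output x coordinate, mapping class label to
    probability. Returns arcs (src, dst, label, cost), with cost the
    negative log probability; arcs below `threshold` are pruned.
    """
    arcs = []
    for x, column in enumerate(posteriors):
        for label, p in column.items():
            if p >= threshold:
                arcs.append((x, x + 1, label, -math.log(p)))
    return arcs
```

The resulting arc list is exactly the "c transitions between state x and state x+1" structure described above, ready to be loaded into a real WFST library for composition with a language model.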

That can be done pretty easily using pyopenfst. In fact, I used to have code like that for LSTM decoding, but I can't find it anymore. The resulting .fst files can also be used with the existing OCRopus language modeling.

danvk (Contributor, Author) commented Jan 29, 2015

I'll close this out then -- thanks for the details. It would be nice to get information on alternatives.

Would the scores output by this make sense if you only looked at the top match? In the example above, would it be fair to say that there's a 78% chance that there's an a?

One thought I had was to use these probabilities to gauge whether a line was upside-down. I'd run the same model on both the original line image and a flipped copy, and look for high probabilities of asymmetric letters like "A" and "v".
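That heuristic could be prototyped like this (a sketch; the score lists would come from the top match at each position of the two recognition runs, and the margin is an arbitrary illustrative value):

```python
def orientation_score(top_scores):
    """Mean top-match probability for a line; higher means the model is
    more confident, which suggests the line is right side up."""
    return sum(top_scores) / len(top_scores) if top_scores else 0.0

def looks_upside_down(scores_original, scores_flipped, margin=0.05):
    """Flag a line as upside-down when the flipped image scores clearly better."""
    return orientation_score(scores_flipped) > orientation_score(scores_original) + margin
```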

danvk closed this on Jan 29, 2015
tmbdev (Collaborator) commented Jan 29, 2015

We've tried this and it works very well: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5413722&tag=1

The best way is still to do what I suggested: generate the FST corresponding to each text line, match it against a language model, and look at the cost of the match.

It's not that hard to do, but if you aren't familiar with pyopenfst, it may be some overhead getting started.

These two notebooks explain it a little:

http://nbviewer.ipython.org/github/tmbdev/teaching-nlpa/blob/master/nlpa-openfst.ipynb

http://nbviewer.ipython.org/github/tmbdev/teaching-nlpa/blob/master/nlpa-openfst2.ipynb

A good way to enable experimentation might simply be to save the undecoded LSTM output as a PNG image. That way, people can write lots of post-processing scripts.
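Saving the undecoded output as an image could look roughly like this (a sketch using only the standard library to keep it self-contained; a real implementation would more likely hand a numpy array to an imaging library):

```python
import struct
import zlib

def posteriors_to_png(matrix, path):
    """Write a 2D list of probabilities (rows x columns, values in [0, 1])
    as an 8-bit grayscale PNG, using only the standard library."""
    height, width = len(matrix), len(matrix[0])

    def chunk(tag, data):
        # A PNG chunk: big-endian length, 4-byte type, payload, CRC-32.
        return (struct.pack(">I", len(data)) + tag + data
                + struct.pack(">I", zlib.crc32(tag + data) & 0xFFFFFFFF))

    # IHDR: width, height, bit depth 8, color type 0 (grayscale),
    # default compression/filter, no interlacing.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is prefixed with filter byte 0 (no filtering);
    # probabilities are quantized to 0..255.
    raw = b"".join(
        b"\x00" + bytes(min(255, max(0, int(v * 255))) for v in row)
        for row in matrix)
    with open(path, "wb") as f:
        f.write(b"\x89PNG\r\n\x1a\n")
        f.write(chunk(b"IHDR", ihdr))
        f.write(chunk(b"IDAT", zlib.compress(raw)))
        f.write(chunk(b"IEND", b""))
```

With one row per class and one column per output x coordinate, the image is exactly the posterior matrix, and any post-processing script can read it back with an ordinary image loader.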

Linked issue: Metadata about detected characters: quality scores + alternatives