
Stroke Width Transform

Paul Capestany edited this page May 21, 2015 · 16 revisions

The latest version of OpenOCR can preprocess an image with the Stroke Width Transform, which removes non-text pixels from the image.

Here is an example of Stroke Width Transform In Action.

Start an additional worker

This preprocessor was added recently, so the launcher.sh script has not been updated to start it yet. In the meantime, the following should get a worker running:

$ export AMQP_URI=amqp://admin:${RABBITMQ_PASS}@${RABBITMQ_HOST}/
$ docker run -d tleyden5iwx/open-ocr-preprocessor open-ocr-preprocessor -amqp_uri "${AMQP_URI}" -preprocessor "stroke-width-transform"
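If either RABBITMQ_PASS or RABBITMQ_HOST is unset, the export above silently produces a malformed URI and the worker fails to connect. A minimal sketch of a guard, assuming a POSIX shell; the function name amqp_uri is hypothetical and not part of OpenOCR:

```shell
# Hypothetical helper (not part of OpenOCR): print the AMQP URI built from
# the same two variables used in the export above.
# ${VAR:?message} aborts with an error if VAR is unset or empty.
amqp_uri() {
  printf 'amqp://admin:%s@%s/\n' \
    "${RABBITMQ_PASS:?RABBITMQ_PASS is not set}" \
    "${RABBITMQ_HOST:?RABBITMQ_HOST is not set}"
}
```

It could be used as `export AMQP_URI="$(amqp_uri)"` before the docker run command. The sketch does not percent-encode anything, so a password containing characters such as `@` or `/` would need encoding first.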

Verifying it works

$ curl -X POST -H "Content-Type: application/json" -d '{"img_url":"http://bit.ly/ocrimage-swt","engine":"tesseract", "preprocessors":["stroke-width-transform"]}' http://${RABBITMQ_HOST}:${HTTP_PORT}/ocr

Expected result:

YH XMCDMTDC

Compare to default Tesseract output

$ curl -X POST -H "Content-Type: application/json" -d '{"img_url":"http://bit.ly/ocrimage-swt","engine":"tesseract"}' http://${RABBITMQ_HOST}:${HTTP_PORT}/ocr

Expected result:

E' ,‘YHwacpMTDCH ;
 3?". ‘  V‘L"~m> I shah-r}. I’VMU' i 5: 1“”. A"

As you can see, in this particular case the Stroke Width Transform makes a huge positive difference.
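The two requests above differ only in the "preprocessors" field. For repeated comparisons, a small pair of shell functions may be convenient. This is a sketch under assumptions: the names ocr_payload and ocr_post are hypothetical, only a single preprocessor is supported, and no JSON escaping is done, so it suits simple URLs like the one above.

```shell
# Hypothetical helper: print the JSON body for an /ocr request.
#   $1: image URL
#   $2: (optional) name of a single preprocessor
# No JSON escaping is performed on the arguments.
ocr_payload() {
  if [ -n "${2:-}" ]; then
    printf '{"img_url":"%s","engine":"tesseract","preprocessors":["%s"]}\n' "$1" "$2"
  else
    printf '{"img_url":"%s","engine":"tesseract"}\n' "$1"
  fi
}

# Hypothetical helper: POST a JSON body to the same endpoint as the
# curl commands above.
ocr_post() {
  curl -X POST -H "Content-Type: application/json" -d "$1" \
    "http://${RABBITMQ_HOST}:${HTTP_PORT}/ocr"
}
```

For example, `ocr_post "$(ocr_payload http://bit.ly/ocrimage-swt stroke-width-transform)"` sends the preprocessed request, and dropping the second argument sends the default Tesseract request.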

Black-on-white text

By default, the Stroke Width Transform preprocessor expects black text on a white background. If your image has white text on a black background, pass an additional parameter as follows:

$ curl -X POST -H "Content-Type: application/json" -d '{"img_url":"http://bit.ly/ocrimage-swt","engine":"tesseract", "preprocessors":["stroke-width-transform"], "preprocessor-args":{"stroke-width-transform":"0"}}' http://${RABBITMQ_HOST}:${HTTP_PORT}/ocr

Legal values for preprocessor-args/stroke-width-transform:

  • "0" -- white text on a black background
  • "1" -- black text on a white background (default)

In the case of this test image, since it's black text on a white background, passing in "0" completely breaks the OCR and it returns no output.
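Because "0" and "1" are easy to mix up, and the wrong value returns no output at all, it may help to name the two cases in a script. The sketch below maps a readable label to the argument value and builds the request body. The labels dark-on-light and light-on-dark and the function names are hypothetical, chosen for illustration; only the "0" and "1" values come from the list above.

```shell
# Hypothetical helper: map a readable polarity label to the
# "stroke-width-transform" argument value listed above.
swt_arg() {
  case "$1" in
    dark-on-light) printf '1\n' ;;  # black text on a white background (default)
    light-on-dark) printf '0\n' ;;  # white text on a black background
    *) echo "unknown polarity: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: print the JSON body for an /ocr request with an
# explicit polarity.
#   $1: image URL (not JSON-escaped)
#   $2: polarity label accepted by swt_arg
swt_payload() {
  arg=$(swt_arg "$2") || return 1
  printf '{"img_url":"%s","engine":"tesseract","preprocessors":["stroke-width-transform"],"preprocessor-args":{"stroke-width-transform":"%s"}}\n' "$1" "$arg"
}
```

The output of `swt_payload <image-url> light-on-dark` can be passed to curl with `-d`, as in the command above.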

Docker images

Code

References