
Crash in -[G8Tesseract recognizedBlocksByIteratorLevel:] --> tesseract::ResultIterator::AppendUTF8WordText #155

Closed
DAndGClark opened this issue Mar 18, 2015 · 13 comments


@DAndGClark

I kick off a recognition operation with a completion block. In the completion block I call recognizedBlocksByIteratorLevel:, and it asserts in tesseract::ResultIterator::AppendUTF8WordText with:

it_->word()->best_choice != NULL:Error:Assert failed:in file resultiterator.cpp, line 594

If I don't try to get the blocks, the text recognition works fine. I've tried the "Block", "Paragraph", and "TextLine" iterator levels.

@kevincon
Collaborator

Can you provide the code you're using when this error occurs?

What kinds of images are you trying to recognize (size, resolution, content, etc.)?

It'd also be great if you could provide a sample image I can use to try to reproduce your issue.

@DAndGClark
Author

Here is the code. I figured out that if I call recognizedBlocksByIteratorLevel: earlier (as the code below is currently configured), it works. The commented-out version in the completion block is what was causing the trouble and leading to the assert.
I was taking pictures with an iPhone 6 camera of small blocks of black English text on white paper; the text in the images is easily recognized by Tesseract. I have several test blocks of text, and all of them triggered the assert when recognizedBlocksByIteratorLevel: was called from the completion block.

-(void)recognizeImageWithTesseract:(UIImage *)image
{
    // Preprocess the image so Tesseract's recognition will be more accurate
    UIImage *bwImage = [image g8_blackAndWhite];

    // Create a new `G8RecognitionOperation` to perform the OCR asynchronously
    G8RecognitionOperation *operation = [[G8RecognitionOperation alloc] init];
    operation.tesseract.language = @"eng";
    operation.tesseract.engineMode = G8OCREngineModeTesseractOnly;
    operation.delegate = self;
    operation.tesseract.image = bwImage;                                        // Set the image on which to perform recognition

    // Some debug info:
    // First try just to get the segments from the image (no OCR):
    //operation.tesseract.pageSegmentationMode = G8PageSegmentationModeAutoOnly;// Auto segmentation, NO OCR
    operation.tesseract.pageSegmentationMode = G8PageSegmentationModeAutoOSD;   // Auto orientation, script detection, segmentation, then OCR
    DLogTM(kDebugOCR, @"Attempting to find the text segments in the image.");
    BOOL result = [operation.tesseract recognize];
    DLogTM(kDebugOCR, @"Attempt to find the text segments %s", result?"succeeded":"failed");
    if (result)
    {
        NSArray *blocks = [operation.tesseract recognizedBlocksByIteratorLevel:G8PageIteratorLevelTextline];
        DLogM(kDebugOCR, @"Block location                       Confidence   Text");
        for (G8RecognizedBlock *b in blocks)
        {
            DLogM(kDebugOCR, @"%@             %5.2f      %@", NSStringFromCGRect([b boundingBoxAtImageOfSize:image.size]), b.confidence, b.text);
        }

        UIImage *processedImage = [operation.tesseract imageWithBlocks:blocks drawText:YES thresholded:NO];
        NSArray *paths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);       // Create path
        NSString *filePath = [[paths objectAtIndex:0] stringByAppendingPathComponent:@"OCRImage.png"];
        [UIImagePNGRepresentation(processedImage) writeToFile:filePath atomically:YES];                         // Save image
    }


    operation.tesseract.pageSegmentationMode = G8PageSegmentationModeAuto;      // Auto segmentation, followed by OCR

    operation.recognitionCompleteBlock = ^(G8Tesseract *tesseract) {
        // Fetch the recognized text
        DLogTM(kDebugOCR, @"Tesseract OCR text recognition finished.");
        NSString *recognizedText = tesseract.recognizedText;


        // Some debug info:
        DLogM(kDebugOCR, @"Recognized text:\n%@", recognizedText);

//      NSArray *blocks = [tesseract recognizedBlocksByIteratorLevel:G8PageIteratorLevelTextline];
//      DLogM(kDebugOCR, @"Block location        Confidence      Text");
//      for (G8RecognizedBlock *b in blocks)
//      {
//          DLogM(kDebugOCR, @"%@     %5.2f      %@", NSStringFromCGRect([b boundingBoxAtImageOfSize:image.size]), b.confidence, b.text);
//      }

//      UIImage *processedImage = [tesseract imageWithBlocks:blocks drawText:YES thresholded:NO];
//      NSArray *paths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);       // Create path
//      NSString *filePath = [[paths objectAtIndex:0] stringByAppendingPathComponent:@"OCRImage.png"];
//      [UIImagePNGRepresentation(processedImage) writeToFile:filePath atomically:YES];                         // Save image


        [self performSelector:@selector(putOCRTextIntoField:)
                     onThread:[NSThread engineThread]
                   withObject:recognizedText
                waitUntilDone:NO];
    };

    [self.operationQueue addOperation:operation];                       // Add the recognition operation to the queue
}

@kevincon
Collaborator

It looks like you're using G8RecognitionOperation improperly.

You're calling operation.tesseract.recognize yourself, but you're not supposed to do that. The whole point of using G8RecognitionOperation is that recognize is called on the internal tesseract member automatically and asynchronously. All you have to do is:

1. Set up the tesseract member of the operation as you need (which language to use, etc.).
2. Provide a recognitionCompleteBlock for the operation; it is called after recognition is performed, so in it you can access recognizedText on the block's tesseract argument to get the results, or call recognizedBlocksByIteratorLevel:.
3. Add the operation to an operation queue, which automatically calls recognize on the operation's tesseract member.

Does that make sense? You should review the "Using NSOperationQueue" section of our Wiki, which provides an example for how you should be using G8RecognitionOperation: https://github.com/gali8/Tesseract-OCR-iOS/wiki/Using-Tesseract-OCR-iOS#using-nsoperationqueue
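For reference, a minimal sketch of that flow might look like the following (untested; it assumes ARC, that `self.operationQueue` is an existing NSOperationQueue, and that the "eng" traineddata is bundled with the app):

```objc
// Minimal sketch of the intended G8RecognitionOperation flow (untested).
// Assumes `self.operationQueue` is an existing NSOperationQueue.
- (void)recognizeImage:(UIImage *)image
{
    G8RecognitionOperation *operation = [[G8RecognitionOperation alloc] init];

    // 1) Configure the operation's internal tesseract member.
    operation.tesseract.language = @"eng";
    operation.tesseract.image = image;

    // 2) Provide a completion block; read the results here.
    operation.recognitionCompleteBlock = ^(G8Tesseract *tesseract) {
        NSLog(@"Recognized text:\n%@", tesseract.recognizedText);
    };

    // 3) Enqueue the operation; recognize is called for you, asynchronously.
    [self.operationQueue addOperation:operation];
}
```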

Try adjusting your code accordingly, and if you're still having problems, provide the new version of your code and I'll try to help you debug it further.

@DAndGClark
Author

Sorry for the confusion, but the only reason I was trying the direct call to recognize was that when I let it happen properly via the operation queue, it asserted when recognizedBlocksByIteratorLevel: was called from the completion block. So I was intentionally using it improperly, for debugging purposes only, because of the assert mentioned above. I do understand that doing the recognition on the main thread is not a realistic option.

@kevincon
Collaborator

Oh okay, then can you provide the original/proper code that you believe should be working but instead crashes? I won't be able to help you unless I can reproduce everything you're experiencing; it'll help if I don't have to make any assumptions about the code you're running.

@DAndGClark
Author

Thank you for looking at this problem.
I took the picture of text with an iPhone 6 camera, a close-up of three small lines of text (you can see in the console output below that the text was recognized properly). The same failure happens on every picture of text I've tried, including on different devices, so it does not seem to be influenced by the content of the picture.
Here is the exact code that doesn't work:

-(void)recognizeImageWithTesseract:(UIImage *)image
{
    G8RecognitionOperation *operation = [[G8RecognitionOperation alloc] init];
    operation.tesseract.image = [image g8_blackAndWhite];   // Preprocess the image so Tesseract's recognition will be more accurate
    operation.tesseract.language = @"eng";                                      //  Configure Tesseract
    operation.tesseract.engineMode = G8OCREngineModeTesseractOnly;
    operation.tesseract.pageSegmentationMode = G8PageSegmentationModeAuto;      // Auto segmentation, followed by OCR
    operation.delegate = self;

    operation.recognitionCompleteBlock = ^(G8Tesseract *tesseract) {
        // Fetch the recognized text
        NSLog(@"Tesseract OCR text recognition finished.");
        NSString *recognizedText = tesseract.recognizedText;



        // Some debug info:  --------------------------------------------------------------------------------
        G8Orientation orientation = tesseract.orientation;

        NSLog(@"Recognized text:\n%@", recognizedText);
        NSLog(@"Orientation: %s", orientation==G8OrientationPageUp?"PageUp" : orientation==G8OrientationPageDown?"PageDown" : orientation==G8OrientationPageLeft?"PageLeft" : "PageRight");
        NSLog(@"DeSkewAngle: %4.2f", tesseract.deskewAngle);

        NSArray *blocks = [tesseract recognizedBlocksByIteratorLevel:G8PageIteratorLevelTextline];
        NSLog(@"Block location                         Confidence   Text");
        for (G8RecognizedBlock *b in blocks)
        {
            NSLog(@"%@                  %5.2f      %@", NSStringFromCGRect([b boundingBoxAtImageOfSize:image.size]), b.confidence, b.text);
        }

        UIImage *processedImage = [tesseract imageWithBlocks:blocks drawText:YES thresholded:NO];
        NSArray *paths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);       // Create path
        NSString *filePath = [[paths objectAtIndex:0] stringByAppendingPathComponent:@"OCRImage.png"];
        [UIImagePNGRepresentation(processedImage) writeToFile:filePath atomically:YES];                         // Save image
        // --------------------------------------------------------------------------------------------------


        //[self performSelector:@selector(putOCRTextIntoField:)
        //           onThread:[NSThread engineThread]
        //         withObject:recognizedText
        //      waitUntilDone:NO];

    };

    // Finally, add the recognition operation to the queue
    NSLog(@"Kicking off Tesseract OCR operation to recognize the text.");
    [self.operationQueue addOperation:operation];
}

***** The console:

2015-03-24 10:16:40.903 FMGo[616:583649] Kicking off Tesseract OCR operation to recognize the text.
2015-03-24 10:16:42.939 FMGo[616:583582] Tesseract OCR text recognition finished.
2015-03-24 10:16:44.819 FMGo[616:583582] Recognized text:
BobWasHere Inc.
5303 Patrick Henry Dr.
San Jose, CA 95044

2015-03-24 10:16:44.819 FMGo[616:583582] Orientation: PageUp
2015-03-24 10:16:44.820 FMGo[616:583582] DeSkewAngle: 0.00
it_->word()->best_choice != NULL:Error:Assert failed:in file resultiterator.cpp, line 594

***** and the stack:

* thread #1: tid = 0x8e79e, 0x0000000194e93270 libsystem_kernel.dylib`__pthread_kill + 8, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
    frame #0: 0x0000000194e93270 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x0000000194f31170 libsystem_pthread.dylib`pthread_kill + 112
    frame #2: 0x0000000194e0ab18 libsystem_c.dylib`abort + 112
    frame #3: 0x000000010070da18 MyApp`ERRCODE::error(char const*, TessErrorLogCode, char const*, ...) const + 292
    frame #4: 0x0000000100776bb4 MyApp`tesseract::ResultIterator::AppendUTF8WordText(STRING*) const + 108
    frame #5: 0x0000000100776910 MyApp`tesseract::ResultIterator::IterateAndAppendUTF8TextlineText(STRING*) + 848
    frame #6: 0x0000000100776330 MyApp`tesseract::ResultIterator::GetUTF8Text(tesseract::PageIteratorLevel) const + 352
  * frame #7: 0x000000010069ada8 MyApp`-[G8Tesseract blockFromIterator:iteratorLevel:](self=0x00000001703b1100, _cmd=0x000000010202a81e, iterator=0x00000001702cdba0, iteratorLevel=G8PageIteratorLevelTextline) + 80 at G8Tesseract.mm:535
    frame #8: 0x000000010069b540 MyApp`-[G8Tesseract recognizedBlocksByIteratorLevel:](self=0x00000001703b1100, _cmd=0x0000000102018b38, pageIteratorLevel=G8PageIteratorLevelTextline) + 152 at G8Tesseract.mm:606
    frame #9: 0x0000000100167e9c MyApp`__52-[ActiveObjectDelegate recognizeImageWithTesseract:]_block_invoke(.block_descriptor=<unavailable>, tesseract=0x00000001703b1100) + 420 at ActiveObjectDelegate.mm:1019
    frame #10: 0x00000001006963d0 MyApp`__30-[G8RecognitionOperation init]_block_invoke_2(.block_descriptor=<unavailable>) + 68 at G8RecognitionOperation.m:40
    frame #11: 0x0000000184603be8 Foundation`__NSBLOCKOPERATION_IS_CALLING_OUT_TO_A_BLOCK__ + 16
    frame #12: 0x0000000184555374 Foundation`-[NSBlockOperation main] + 96
    frame #13: 0x0000000184544ecc Foundation`-[__NSOperationInternal _start:] + 636
    frame #14: 0x000000018460694c Foundation`__NSOQSchedule_f + 228
    frame #15: 0x0000000103c18f94 libdispatch.dylib`_dispatch_client_callout + 16
    frame #16: 0x0000000103c1dc28 libdispatch.dylib`_dispatch_main_queue_callback_4CF + 1864
    frame #17: 0x00000001836de2ec CoreFoundation`__CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__ + 12
    frame #18: 0x00000001836dc394 CoreFoundation`__CFRunLoopRun + 1492
    frame #19: 0x00000001836091f4 CoreFoundation`CFRunLoopRunSpecific + 396
    frame #20: 0x000000018ca2b6fc GraphicsServices`GSEventRunModal + 168
    frame #21: 0x0000000187f9a10c UIKit`UIApplicationMain + 1488
    frame #22: 0x0000000100039f54 MyApp`main(argc=1, argv=0x000000016fdcfa48) + 300 at main.mm:169
    frame #23: 0x0000000194d7aa08 libdyld.dylib`start + 4

As I mentioned above, if I run the same code directly on the main thread after a direct call to recognize, I get all the right answers and no assert, so the problem seems to be influenced by the operation-queue process, maybe the thread switching.

@kevincon
Collaborator

Thanks for providing all of this info. I'm on vacation this week but I'll take a look at it early next week and try to implement a fix for you.

@kevincon
Collaborator

Okay I've got a solution for you. First, an explanation:

When you access the orientation, deskewAngle, direction, etc., we call AnalyseLayout on the internal Tesseract object. The documentation for this function in the source code says:

/**
 * Runs page layout analysis in the mode set by SetPageSegMode.
 * May optionally be called prior to Recognize to get access to just
 * the page layout results. Returns an iterator to the results.
 * Returns NULL on error or an empty page.
 * The returned iterator must be deleted after use.
 * WARNING! This class points to data held within the TessBaseAPI class, and
 * therefore can only be used while the TessBaseAPI class still exists and
 * has not been subjected to a call of Init, SetImage, Recognize, Clear, End
 * DetectOS, or anything else that changes the internal PAGE_RES.
 */

This documentation is a little tricky for me to follow, but I think the gist is that AnalyseLayout() is only supposed to be called before calling Recognize on the Tesseract object. Even that requirement doesn't seem to be strict, though, because I got your code to run without errors just by moving the page-layout accesses before the call to recognizedText on the tesseract argument of the G8RecognitionOperation's recognitionCompleteBlock.

So try modifying your code to be as follows and let me know if it works for you (note that I removed the file writing and text box part from the code you provided in your last message since it wasn't relevant, so you'll have to add that back in):

-(void)recognizeImageWithTesseract:(UIImage *)image
{
    // We recently updated the init for G8RecognitionOperation to be initWithLanguage, so make sure you redownload the source code or reinstall the CocoaPod so you have the latest version
    G8RecognitionOperation *operation = [[G8RecognitionOperation alloc] initWithLanguage:@"eng"];
    // Preprocess the image so Tesseract's recognition will be more accurate
    operation.tesseract.image = [image g8_blackAndWhite];
    operation.tesseract.engineMode = G8OCREngineModeTesseractOnly;
    operation.tesseract.pageSegmentationMode = G8PageSegmentationModeAuto;
    operation.delegate = self;

    operation.recognitionCompleteBlock = ^(G8Tesseract *tesseract) {
        NSLog(@"Tesseract OCR text recognition finished.");

        // Access layout info BEFORE fetching recognized text
        G8Orientation orientation = tesseract.orientation;
        NSLog(@"Orientation: %s", orientation == G8OrientationPageUp ? "PageUp" : orientation == G8OrientationPageDown ? "PageDown" : orientation == G8OrientationPageLeft ? "PageLeft" : "PageRight");
        NSLog(@"DeSkewAngle: %4.2f", tesseract.deskewAngle);

        // Fetch the recognized text
        NSString *recognizedText = tesseract.recognizedText;
        NSLog(@"Recognized text:\n%@", recognizedText);

        // Access info about recognized blocks
        NSArray *blocks = [tesseract recognizedBlocksByIteratorLevel:G8PageIteratorLevelTextline];
        NSLog(@"Block location                         Confidence   Text");
        for (G8RecognizedBlock *b in blocks)
        {
            NSLog(@"%@                  %5.2f      %@", NSStringFromCGRect([b boundingBoxAtImageOfSize:image.size]), b.confidence, b.text);
        }
    };

    // Finally, add the recognition operation to the queue
    NSLog(@"Kicking off Tesseract OCR operation to recognize the text.");
    [self.operationQueue addOperation:operation];
}

If this works for you, I propose modifying G8RecognitionOperation so that, in addition to calling recognize before it calls the recognitionCompleteBlock, it also performs layout analysis (if applicable, based on the pageSegmentationMode of the operation's tesseract member), so that the orientation, deskewAngle, etc. are available to users in the recognitionCompleteBlock. Do you think that sounds okay? Any other suggestions?

@DAndGClark
Author

Your suggestion works. Thank you.
I still have trouble getting the layout-based info prior to doing the recognition (which in my case means prior to adding the operation to the operation queue). Do you have any advice about getting the text rectangles before doing the recognition? Is this something that should be quick, or does it end up doing all or most of the recognition just to get the rectangles?
Regarding your proposal: would people generally use this in two steps, first doing the layout analysis and, based on it, performing some rotation and/or skew correction, with the actual recognition as step 2? If that is the case, then doing the layout analysis as an add-on to the recognition doesn't seem useful. If the caller has set G8PageSegmentationModeAuto, does the code attempt to correct for page orientation and skew before doing the recognition?

@kevincon
Collaborator

kevincon commented Apr 1, 2015

My understanding is that Tesseract does not make the text rectangles available until you call recognize and recognition is performed. Intuitively, it makes some sense that it couldn't tell you where the characters/symbols in an image are until it has "recognized" what they are.

I agree that if the user wants to use the page layout analysis values to rotate/adjust the image before performing recognition then my proposal doesn't make much sense because recognition is automatically performed when the operation is added to the queue, so the user doesn't have a chance to adjust the image.

My proposal is more for the case where the user wants to know the page-layout-analysis values (just for informational purposes, I guess) in the completion callback, which in the case of G8RecognitionOperation means the operation should call AnalyseLayout() automatically before it automatically calls recognize(). Since the app actually crashes if the user tries to perform layout analysis themselves in the completion block after recognition has already been performed, I think this change is necessary to fix this bug and provide a better user experience. I'll work on it before we release version 4.0.0 this week.

For your purpose, if you want to be able to adjust the image using the page layout analysis values before recognition is performed, I recommend doing it in 2 steps:

  1. Initialize a regular G8Tesseract object, set its image to be the image you want to work with, access the orientation/skew values here without performing recognition, manually adjust the image using these values and then...
  2. Initialize a G8RecognitionOperation and set its image to be the adjusted image from (1) and add it to the queue so it can perform recognition on the adjusted image.
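Put together, the two steps above might be sketched roughly like this (untested; `imageByRotating:byRadians:` is a hypothetical helper you'd implement yourself, e.g. with Core Graphics, and whether layout-only analysis needs a language configured is worth verifying):

```objc
// Untested sketch of the two-step layout-then-recognize approach.
// `imageByRotating:byRadians:` is a hypothetical helper, not part of the library.
- (void)deskewAndRecognizeImage:(UIImage *)image
{
    // Step 1: layout analysis only, on a plain G8Tesseract (no recognition).
    G8Tesseract *layoutTesseract = [[G8Tesseract alloc] init];
    layoutTesseract.image = image;
    layoutTesseract.pageSegmentationMode = G8PageSegmentationModeAutoOnly;
    CGFloat deskewAngle = layoutTesseract.deskewAngle;

    // Manually adjust the image using the layout values.
    UIImage *adjusted = [self imageByRotating:image byRadians:-deskewAngle];

    // Step 2: perform recognition on the adjusted image via the queue.
    G8RecognitionOperation *operation = [[G8RecognitionOperation alloc] init];
    operation.tesseract.language = @"eng";
    operation.tesseract.image = adjusted;
    operation.recognitionCompleteBlock = ^(G8Tesseract *tesseract) {
        NSLog(@"Recognized text:\n%@", tesseract.recognizedText);
    };
    [self.operationQueue addOperation:operation];
}
```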

I haven't tried it myself, but my hope is that step 1 is fast relative to actually performing recognition, so this method doesn't slow down your app. I guess if it does, you could always kick off step 1 asynchronously.

I don't believe Tesseract does any correction for page orientation/skew (regardless of the PSM mode set), but I could be wrong about that. My basis for this understanding comes from the following links:
http://stackoverflow.com/questions/18487398/ocr-tesseract-intelligent-rotation-for-image
https://code.google.com/p/tesseract-ocr/wiki/ImproveQuality

@sauravkedia

Hi guys,
I'm having a problem fetching the orientation, deskew angle, writing direction, and text line order of the image: the orientation and deskew angle always return 0. Can anyone help me out or suggest how I can use these parameters to find the character orientation?
Tesseract configuration:
"engineMode = G8OCREngineModeTesseractOnly"
"pageSegmentationMode = G8PageSegmentationModeOSDOnly"

I'm using the Tesseract library in my iOS project. I can scan an image in portrait mode perfectly, but whenever the image orientation changes, it doesn't give me the correct result.

Thanks in advance.

@ws233
Collaborator

ws233 commented Apr 26, 2016

Hi @sauravkedia.
As I remember, Tesseract can only handle skew angles between -15 and +15 degrees.
For more info, please take a look at the upstream Tesseract wiki.

@sauravkedia

Thanks Cyril.
