Fixes Extracted Text not indexing in solr #767
Conversation
@dannylamb I noticed this issue last night too, so it's great to have a patch to try. I applied the patch via Composer and it works for a fresh new Repository Item, but when I tried a media update (e.g. removing the PDF and adding a different PDF) I got a new Thumbnail Image while the extracted_text was still the text from the old file...
@kayakr Thanks so much for looking at this. I've pushed up a tidier version of what I had before, but hold off on testing it. I'll walk through those steps and see why the extracted text isn't updating when replacing a file. If I can fix it here, I will.
So I followed your steps and uploaded a new file to the "Original File" media, and got a new thumbnail but no extracted text or technical metadata. Extracted text had nothing in the logs, but I did get this for the FITS data:
Drupal logs are showing
After debugging this pretty hard and running into everything from file permission issues (www-data couldn't write to sites/default/files) to transaction issues, I'm at the point where I'm going to rebuild my environment and try again to see if I can isolate the issues. But one thing's for sure: you upload a new media and things go haywire. It would be nice if another of the @Islandora/8-x-committers could try to recreate this as well, just for basic sanity...
I'll give it a spin.
Ok, so I pulled in the PR and it works! Granted, I hit the FITS WSOD issue, and the logs are full of errors from the JsonldTypeAlterReaction.... but this PR works as advertised and isn't the source of those errors.
JIRA Ticket: Resolves Islandora/documentation#1476
What does this Pull Request do?
Manually triggers search re-indexing for nodes when their extracted text media are inserted or updated.
What's new?
Brute forcing re-indexing in media insert and update hooks.
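As a rough illustration of the idea only: the sketch below is a toy Python model, not the Drupal or Search API code in this PR, and every name in it (`Repository`, `save_media`, `on_media_saved`, `index_node`) is hypothetical. It assumes the node's search document includes text held on a child "extracted_text" media, so the node's indexed copy goes stale unless saving the media also forces the parent node to be re-indexed.

```python
from dataclasses import dataclass, field


@dataclass
class Media:
    media_id: int
    parent_node_id: int
    bundle: str          # e.g. "extracted_text"
    text: str = ""


@dataclass
class Repository:
    nodes: dict = field(default_factory=dict)   # node_id -> title
    media: dict = field(default_factory=dict)   # media_id -> Media
    index: dict = field(default_factory=dict)   # node_id -> search document (stand-in for Solr)

    def index_node(self, node_id):
        """Rebuild the node's search document, pulling in extracted text."""
        texts = [
            m.text
            for m in self.media.values()
            if m.parent_node_id == node_id and m.bundle == "extracted_text"
        ]
        self.index[node_id] = {
            "title": self.nodes[node_id],
            "extracted_text": " ".join(texts),
        }

    def save_media(self, media):
        """Insert or update a media, then fire the (hypothetical) save hook."""
        self.media[media.media_id] = media
        self.on_media_saved(media)

    def on_media_saved(self, media):
        """Hook: brute-force a re-index of the parent node."""
        if media.bundle == "extracted_text" and media.parent_node_id in self.nodes:
            self.index_node(media.parent_node_id)


repo = Repository()
repo.nodes[1] = "Repository Item"
repo.index_node(1)
repo.save_media(Media(10, 1, "extracted_text", "text of old PDF"))
repo.save_media(Media(10, 1, "extracted_text", "text of new PDF"))
print(repo.index[1]["extracted_text"])
```

Without the `on_media_saved` step, the node's document would only change the next time the node itself was saved, which is one way the stale or missing extracted text described in this thread could arise.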
How should this be tested?
To confirm
Now apply this PR
Interested parties
@Islandora/8-x-committers