Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix indexer logic by refreshing doc headers before indexing #89

Merged
merged 1 commit into from
Mar 10, 2023

Conversation

jfreda
Copy link
Contributor

@jfreda jfreda commented Mar 10, 2023

This PR fixes improves indexer logic by switching the order of operations where we now refresh document(+drafts) headers before indexing the documents for search. Before this, the indexer could get into a state where it was constantly indexing and refreshing the doc headers each run because the header refreshes would cause the documents to have a new modified time.

It's not exactly a hack, but note that we now track the timestamps of when we refresh headers of all documents in a folder by putting a refreshHeaders: prefix in the google_drive_id column for these records in the indexer_folders table.

@jfreda jfreda requested a review from a team as a code owner March 10, 2023 19:49
log.Error("error getting drafts headers folder indexer data",
"error", err,
)
os.Exit(1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to exit here? Could we retry instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good thought, but we're already exiting for basically all errors in the indexer (there's a TODO above to improve error handling). I'd like to address this more holistically in a future PR.

Copy link
Contributor

@anubhavmishra anubhavmishra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, just one minor comment!

@jfreda jfreda merged commit c97ec29 into main Mar 10, 2023
@jfreda jfreda deleted the jfreda/fix-indexer-logic-order branch March 10, 2023 23:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants