@ikreymer
Member

Instead of a regex search on URL and title:

  • If search string starts with https:// or http://, do a prefix search on URL
  • Otherwise, do a $text search on title, sort by match score.
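A rough sketch of the dispatch described above, for illustration only. The function name `build_page_search`, the field name `url`, and the use of an anchored, escaped regex for the prefix match are assumptions, not the code in this PR. The `$text` / `$search` operators and the `textScore` metadata sort are standard MongoDB query syntax.

```python
import re
from typing import Any, Dict, List, Tuple


def build_page_search(search: str) -> Tuple[Dict[str, Any], List[Tuple[str, Any]]]:
    """Return a (filter, sort) pair for a page search (hypothetical helper).

    URL-like input becomes a prefix match on the assumed `url` field;
    anything else becomes a $text search sorted by relevance score.
    """
    if search.startswith(("https://", "http://")):
        # Assumption: the prefix match is expressed as an anchored regex with
        # the user input escaped, so an ordinary index on `url` can serve it.
        query: Dict[str, Any] = {"url": {"$regex": "^" + re.escape(search)}}
        return query, []

    # Full-text search; relies on a text index covering the title field.
    query = {"$text": {"$search": search}}
    # Sort by the relevance score MongoDB computes for $text matches.
    sort: List[Tuple[str, Any]] = [("score", {"$meta": "textScore"})]
    return query, sort
```

With a driver such as pymongo, the returned filter and sort would be passed to `find(...)` and `.sort(...)`. Depending on the server version, a matching `{"score": {"$meta": "textScore"}}` projection may also be needed for the text-score sort.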

@ikreymer ikreymer requested a review from tw4l February 20, 2025 08:01
@tw4l
Member

tw4l commented Feb 20, 2025

Overall this makes sense. See the inline question about RAM usage of text indices. It also looks like the corresponding test needs to be updated to match the new behavior. I'm guessing we don't quite get the result we expected in that test because of how $text indices tokenize, perhaps due to punctuation in titles?

@tw4l tw4l merged commit 931e351 into issue-2406-crawl-migration-rework Feb 20, 2025
22 checks passed
@tw4l tw4l deleted the try-text-search branch February 20, 2025 18:55