Search is case-sensitive in non-English languages #3116
Comments
Quite a few articles suggest creating the PostgreSQL database with LC_COLLATE and LC_CTYPE set to C. Like here. UPDATE event_search SET vector = array_to_tsvector(lower(tsvector_to_array(vector)::text)::text[]); I'm not sure whether this was a proper and/or safe way to do it.
@areisp thanks for the workaround description. Can it also fix issue element-hq/element-web#7247 with user search? In which table must we execute the proposed UPDATE for that?
The issue is that the
Maybe you need to rebuild the indexes after changing the locale?
Sure, I've issued
Full log:
I have found a PR here https://github.com/matrix-org/synapse/pull/6268/files that forces strings to lowercase to solve a similar problem with case-sensitive search.
Sidenote: as far as I can see, synapse will reject workarounds that try to use a non-C db locale, so this will come up as a problem again. The current state of searching is a hack; it should be possible to use a proper full-text search backend. I'm not sure it would be very hard to develop one independently (using the db and redirecting the API calls), but right now I'm busy with many other things. :-(
Description
It's hard to find the text you need in languages other than English, because search there is case-sensitive.
Steps to reproduce
Also, here the FTS language is hardcoded as English, so no stemming is supported for other languages. I propose lowercasing the event text before inserting it and doing the same when querying.
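To illustrate the proposal, here is a minimal sketch in Python. It is not synapse's actual code: `normalize` and `ToyEventSearch` are hypothetical names, and the in-memory index only stands in for the `event_search` table. The point is that lowercasing happens in application code with Python's Unicode-aware `str.lower()`, so the result should not depend on the database's LC_CTYPE.

```python
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase in Python (Unicode-aware), independent of the database locale."""
    return text.lower()


class ToyEventSearch:
    """Hypothetical in-memory stand-in for the event_search table."""

    def __init__(self) -> None:
        # Maps a lowercased word to the set of event IDs containing it.
        self._index = defaultdict(set)

    def insert(self, event_id: str, body: str) -> None:
        # Normalize the event text before it is indexed.
        for word in normalize(body).split():
            self._index[word].add(event_id)

    def search(self, term: str) -> set:
        # Apply the same normalization to the query.
        words = normalize(term).split()
        if not words:
            return set()
        matches = [self._index.get(word, set()) for word in words]
        return set.intersection(*matches)


if __name__ == "__main__":
    index = ToyEventSearch()
    index.insert("$1", "Привет Мир")
    print(index.search("привет"))  # prints {'$1'}
```

Because both sides go through the same `normalize` function, a query matches regardless of the case used in the original message or in the search term.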
Version information