Ignore diacritics in searches #778
Comments
Will need to patch Epub.js for this. Probably just need to change this bit: https://github.com/futurepress/epub.js/blob/5c7f21d648d9d20d44c6c365d164b16871847023/src/section.js#L197. It currently makes all text lowercase before matching, so we'll have to add an option to remove all diacritics before matching.
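For illustration, here is a minimal sketch of what "remove all diacritics before matching" could look like. It is not the actual Epub.js code; `foldDiacritics` and `containsIgnoringDiacritics` are hypothetical names. It assumes that decomposing to NFD and dropping combining marks (`\p{M}`) is an acceptable definition of "removing diacritics".

```javascript
// Hypothetical helper: lowercase the text and drop combining marks.
function foldDiacritics(text) {
  return text
    .normalize('NFD')        // split "á" into "a" + U+0301
    .replace(/\p{M}/gu, '')  // remove the combining marks
    .toLowerCase();
}

// Hypothetical helper: containment test that ignores case and diacritics.
function containsIgnoringDiacritics(text, query) {
  return foldDiacritics(text).includes(foldDiacritics(query));
}
```

Note that this only answers "does the text contain the query"; it does not say where the match is in the original text.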
Would be happy to send a patch if desired.
I looked a bit more into this. It seems that it's not simple to implement at all. First, it's not enough to test whether the text contains the query; you'd have to get the offset as well (so that it can be highlighted and navigated to). But if one simply removes diacritics, it can alter the length of the text, depending on whether the diacritic is a separate code point. Another problem is that the behavior can vary depending on the language, so the best way to see if two characters are equal is to use a [...]. These problems mean that a simple [...]
Edit: found a locale-aware implementation of indexOf: https://github.com/arty-name/locale-index-of
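To make the offset problem concrete, here is a rough sketch of one way to keep track of original positions while folding. It is not locale-aware and is not what locale-index-of does; `foldWithOffsets` and `findAllIgnoringDiacritics` are hypothetical names, and the folding rule (NFD, drop `\p{M}`, lowercase) is an assumption.

```javascript
// Fold text code point by code point and remember, for every UTF-16 unit
// of the folded string, where its source character starts in the original.
function foldWithOffsets(text) {
  let folded = '';
  const offsets = [];
  let index = 0;
  for (const char of text) {
    const base = char.normalize('NFD').replace(/\p{M}/gu, '').toLowerCase();
    for (let i = 0; i < base.length; i++) offsets.push(index);
    folded += base;
    index += char.length;
  }
  offsets.push(text.length); // sentinel so a match can end at the very end
  return { folded, offsets };
}

// Return { start, end } ranges in the ORIGINAL text for every match.
function findAllIgnoringDiacritics(text, query) {
  const { folded, offsets } = foldWithOffsets(text);
  const needle = foldWithOffsets(query).folded;
  const matches = [];
  if (needle.length === 0) return matches;
  let from = 0;
  for (;;) {
    const at = folded.indexOf(needle, from);
    if (at === -1) break;
    matches.push({ start: offsets[at], end: offsets[at + needle.length] });
    from = at + 1;
  }
  return matches;
}
```

With a text containing both a precomposed "í" and a decomposed "i" + U+0301, the returned ranges differ in length (4 vs 5 UTF-16 units) even though both match the query "elio", which is why searching the folded string alone is not enough for highlighting.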
Another thing I just realized: apart from diacritics, one also needs to remove or ignore other kinds of characters, such as the various zero-width characters. For example, if you use Calibre to insert soft hyphens into your book (which is necessary because Kindle doesn't support auto hyphens for KF7 and KF8), suddenly you can't find anything in Foliate anymore. One can observe this bug by opening any of the .azw files from Standard Ebooks.
Edit: it seems this is already handled by [...]
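A small sketch of why soft hyphens break a plain substring search, and what ignoring them could look like. `stripIgnorable` is a hypothetical helper, and the character list (soft hyphen, zero-width space, ZWNJ, ZWJ, word joiner, BOM) is an assumption rather than a complete set; dropping ZWNJ/ZWJ may not be appropriate for every script.

```javascript
// Assumed set of invisible characters to ignore while searching:
// U+00AD soft hyphen, U+200B..U+200D zero-width space/non-joiner/joiner,
// U+2060 word joiner, U+FEFF zero-width no-break space (BOM).
const IGNORABLE = /[\u00AD\u200B-\u200D\u2060\uFEFF]/g;

// Hypothetical helper: remove the invisible characters listed above.
function stripIgnorable(text) {
  return text.replace(IGNORABLE, '');
}

const hyphenated = 'hy\u00ADphen\u00ADation'; // as Calibre might emit it
console.log(hyphenated.includes('hyphenation'));                 // false
console.log(stripIgnorable(hyphenated).includes('hyphenation')); // true
```

As with diacritics, stripping changes the length of the text, so offsets would still have to be mapped back to the original for highlighting.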
Fixed in the [...]
Is your feature request related to a problem? Please describe.
When searching for words in Spanish with accents (más vs mas) or names (Elío vs Elio), Foliate will only match the words with the exact same accents.
Describe the solution you'd like
Similarly to how it ignores the casing of the search query, Foliate should also ignore the diacritics of the query.
Foliate could also provide an option like Firefox's "Match Diacritics" for forcing Foliate to consider the diacritics in the query.
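As a sketch of how such a toggle might behave, the standard `Intl.Collator` API can compare two words while either ignoring or respecting diacritics via its `sensitivity` option. `wordsMatch` and its `matchDiacritics` and `locale` parameters are hypothetical names for illustration, not an existing Foliate or Epub.js option; case is ignored in both modes.

```javascript
// Hypothetical helper: compare two words, ignoring case, and optionally
// ignoring diacritics as well.
function wordsMatch(a, b, { matchDiacritics = false, locale = 'en' } = {}) {
  const collator = new Intl.Collator(locale, {
    // 'base'   : a = á = A  (ignore diacritics and case)
    // 'accent' : a = A, a ≠ á (respect diacritics, ignore case)
    sensitivity: matchDiacritics ? 'accent' : 'base',
  });
  return collator.compare(a, b) === 0;
}

console.log(wordsMatch('más', 'mas'));                            // true
console.log(wordsMatch('más', 'mas', { matchDiacritics: true })); // false
```

This compares whole strings only; finding a query inside a longer text, with offsets, would need something on top of it.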