user searched for $250 but found €250. #1079
Replies: 5 comments
-
You have $ in 'blend_chars', so words get indexed both with and without the character, which is why both match. As for the Euro symbol, check your encoding. I'm not entirely sure, but I think the Euro symbol pasted here is byte 128 (0x80) from Windows CP-1252, rather than U+20AC from Unicode. (That could be an encoding issue on this forum, though; this page does say it's UTF-8.)
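One way to check which euro sign is actually in the source data is to look at its code point and bytes. The following is a minimal Python sketch, not anything Manticore-specific; the helper name `describe` is made up for illustration.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"


euro = "\u20ac"  # the real Unicode euro sign

print(describe(euro))           # code point and name of the euro sign
print(euro.encode("utf-8"))     # three bytes in UTF-8
print(euro.encode("cp1252"))    # a single byte, 0x80, in Windows CP-1252

# If a CP-1252 byte 0x80 is wrongly decoded as Latin-1, the result is the
# control character U+0080, not the euro sign.
misdecoded = b"\x80".decode("latin-1")
print(describe(misdecoded))
```

If the character in the indexed text turns out to be U+0080 rather than U+20AC, the data was most likely decoded with the wrong encoding before indexing.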
-
Well, it is kind of 'tricky' with keyword searching, as it is designed for words, not numbers. In concept, if you want to match with the currency symbol, you would need it in charset_table (and would have to deal with different symbols, including different encodings). That is, do you also want to handle text that actually contains something like '250 USD'? Or $2,000? You might need to do some normalization, e.g. with regexp_filter, to get 'prices' into a consistent format. Also, if you want to deal with, say, $25.43, you might need . in charset_table, which adds extra complexity as it is also used as a period! Frankly, if possible, it might be best to deal with the 'price' outside of 'keywords', e.g. using an actual numeric attribute; you can then store the currency in an attribute too.
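The "handle the price outside of keywords" idea could be sketched as a preprocessing step that pulls an amount and a currency out of the text before indexing. This is only an illustration in Python: `SYMBOL_TO_CODE`, `PRICE_RE` and `extract_price` are hypothetical names, the set of currencies is arbitrary, and it assumes a comma is a thousands separator and a dot is the decimal point. How the two values are then loaded into attributes is left out.

```python
import re
from decimal import Decimal
from typing import Optional, Tuple

# Hypothetical mapping from currency symbol to currency code; extend as needed.
SYMBOL_TO_CODE = {"$": "USD", "\u20ac": "EUR", "\u00a3": "GBP"}

# Matches either "<symbol><amount>" (e.g. $2,000) or "<amount> <code>" (e.g. 250 USD).
PRICE_RE = re.compile(
    r"(?P<sym>[$\u20ac\u00a3])\s*(?P<num1>\d[\d,]*(?:\.\d+)?)"
    r"|(?P<num2>\d[\d,]*(?:\.\d+)?)\s*(?P<code>USD|EUR|GBP)\b"
)


def extract_price(text: str) -> Optional[Tuple[Decimal, str]]:
    """Return (amount, currency code) for the first price in text, or None."""
    m = PRICE_RE.search(text)
    if m is None:
        return None
    if m.group("sym"):
        amount, currency = m.group("num1"), SYMBOL_TO_CODE[m.group("sym")]
    else:
        amount, currency = m.group("num2"), m.group("code")
    # Assumption: commas are thousands separators, so they can be dropped.
    return Decimal(amount.replace(",", "")), currency


for sample in ["$250", "\u20ac250", "250 USD", "$2,000", "$25.43"]:
    print(ascii(sample), extract_price(sample))
```

With the amount and currency stored separately, a search for $250 becomes a filter on the numeric value and the currency code, so it no longer depends on how the tokenizer treats the symbol.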
-
Did you reindex your data after the change? Could you provide a fully reproducible example (config, source data, query with output)?
-
Moving this issue to Discussions.
-
Many thanks for the quick response. I have checked the € symbol; it's U+20AC. I have added currency symbols to charset_table (and reindexed all data from scratch). IDX_TYPE = """type = plain
About reproducing -
-
Describe the bug
Weird search results, user searched for $250 but found €250.
To Reproduce
Expected behavior
Describe the environment:
Manticore 6.0.2 89c7a51@230210 (columnar 2.0.0 a7c703d@230130) (secondary 2.0.0 a7c703d@230130)
The same bug occurs in Manticore 4 without columnar.
Debian GNU/Linux 11 (bullseye)
Messages from log files:
No errors in the logs.
My config -