Closed
Labels: :Search Relevance/Analysis, >bug, Team:Search Relevance, v7.2.0
Description
On 7.2.0, a synonym filter placed after another token filter (e.g. lowercase) in the analysis chain no longer behaves as expected.
PUT /test_index
{
  "settings": {
    "index" : {
      "analysis" : {
        "analyzer" : {
          "synonym" : {
            "tokenizer" : "whitespace",
            "filter" : [ "lowercase", "synonym" ]
          }
        },
        "filter" : {
          "synonym" : {
            "type" : "synonym",
            "synonyms" : [ "Eins, Uno, One" ]
          }
        }
      }
    }
  }
}

GET /test_index/_analyze
{
  "analyzer": "synonym",
  "text" : "Uno"
}
returns
{
  "tokens" : [
    {
      "token" : "uno",
      "start_offset" : 0,
      "end_offset" : 3,
      "type" : "word",
      "position" : 0
    }
  ]
}
I would have expected the lowercasing to also be applied to the synonym filter's rules, so that the term expands to all three variations.
On 7.1.1, the same example still returns:
{
  "tokens" : [
    {
      "token" : "uno",
      "start_offset" : 0,
      "end_offset" : 3,
      "type" : "word",
      "position" : 0
    },
    {
      "token" : "eins",
      "start_offset" : 0,
      "end_offset" : 3,
      "type" : "SYNONYM",
      "position" : 0
    },
    {
      "token" : "one",
      "start_offset" : 0,
      "end_offset" : 3,
      "type" : "SYNONYM",
      "position" : 0
    }
  ]
}
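To make the difference concrete, here is a toy Python model of the analysis chain. It is not Elasticsearch code, and the function names (`parse_rule`, `analyze`) are made up for illustration. It assumes that the 7.1.1 output corresponds to the synonym rules being passed through the preceding lowercase filter, and that the 7.2.0 output corresponds to the rules being parsed without it. That second assumption is only a guess that happens to match the outputs shown in this issue; it has not been confirmed against the source.

```python
# Toy model of the analyzer from this issue: whitespace tokenizer,
# then lowercase, then a synonym filter. Not Elasticsearch code.

def whitespace_tokenize(text):
    return text.split()

def lowercase(tokens):
    return [t.lower() for t in tokens]

def parse_rule(rule, rule_filters):
    """Split 'a, b, c' into equivalent terms, running each through rule_filters."""
    terms = []
    for entry in rule.split(","):
        tokens = whitespace_tokenize(entry)
        for token_filter in rule_filters:
            tokens = token_filter(tokens)
        terms.append(" ".join(tokens))
    return terms

def analyze(text, rule, rule_filters):
    """Analyze text; rule_filters are the filters applied while parsing the rule."""
    tokens = lowercase(whitespace_tokenize(text))
    group = parse_rule(rule, rule_filters)
    out = []
    for tok in tokens:
        out.append(tok)
        if tok in group:
            # Emit the other members of the synonym group at the same position.
            out.extend(t for t in group if t != tok)
    return out

RULE = "Eins, Uno, One"

# Assumed 7.1.1-like behaviour: the rule is lowercased before matching.
expected = analyze("Uno", RULE, rule_filters=[lowercase])
# Assumed 7.2.0-like behaviour: the rule keeps its original casing.
observed = analyze("Uno", RULE, rule_filters=[])

print(expected)
print(observed)
```

In this model the lowercased token `uno` only matches the rule when the rule itself has been lowercased, which is consistent with the two `_analyze` outputs above.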
The problem does not seem to be limited to the _analyze endpoint; searching no longer works as before either.
On 7.2.0:
PUT /test_index/_mapping
{
  "properties": {
    "title" : {
      "type": "text",
      "analyzer": "synonym"
    }
  }
}

POST /test_index/_doc
{
  "title" : "Eins"
}

POST /test_index/_search
{
  "query": {
    "match": {
      "title": "Uno"
    }
  }
}
returns no hits.
On 7.1.1 this is the output:
"hits" : [
  {
    "_index" : "test_index",
    "_type" : "_doc",
    "_id" : "pqt3BWwB6PA4qSE33nZE",
    "_score" : 0.5274171,
    "_source" : {
      "title" : "Eins"
    }
  }
]