-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Fix case sensitivity rules for wildcard queries on text fields #71751
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Wildcard queries on text fields should not apply the fields analyzer to the search query. However, we accidentally enabled this in elastic#53127 by moving the query normalization to the StringFieldType super type. This change fixes this by separating the notion of normalization and case insensitivity (as implemented in the `case_insensitive` flag). This is done because we still need to maintain normalization of the query sting when the wildcard query method on the field type is requested from the `query_string` query parser. Wildcard queries on keyword fields should also continue to apply the fields normalizer, regardless of whether the `case_insensitive` is set, because normalization could involve something else than lowercasing (e.g. substituting umlauts like in the GermanNormalizationFilter). Closes elastic#71403
|
Pinging @elastic/es-search (Team:Search) |
markharwood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a couple of comments and it also looks like WildcardFieldMapper is missing a normalizedWildcardQuery override but otherwise LGTM
| assertHitCount(searchResponse, 0L); | ||
|
|
||
| wildCardQuery = wildcardQuery("field1", "bb*"); | ||
| searchResponse = client().prepareSearch().setQuery(wildCardQuery).get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a test where the search string is mixed case but set wildCardQuery.caseInsensitive(true)
| boolean caseInsensitive, | ||
| SearchExecutionContext context | ||
| ) { | ||
| return super.wildcardQuery(value, method, caseInsensitive, true, context); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe calling normalizedWildcardQuery here would make the intent more obvious plus help any tracing back of where we make use of normalized wildcard queries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, but I left the caseInsensitive parameter out of the normalizedWildcardQuery signature on purpose, so I need to call the protected method that takes both arguments here. My thinking was that we use normalizedWildcardQuery only from QueryStringQueryParser where we don't have the caseInsensitive option. Maybe you see a different solution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok. That makes sense.
…ic#71751) Wildcard queries on text fields should not apply the fields analyzer to the search query. However, we accidentally enabled this in elastic#53127 by moving the query normalization to the StringFieldType super type. This change fixes this by separating the notion of normalization and case insensitivity (as implemented in the `case_insensitive` flag). This is done because we still need to maintain normalization of the query sting when the wildcard query method on the field type is requested from the `query_string` query parser. Wildcard queries on keyword fields should also continue to apply the fields normalizer, regardless of whether the `case_insensitive` is set, because normalization could involve something else than lowercasing (e.g. substituting umlauts like in the GermanNormalizationFilter). Closes elastic#71403
…ic#71751) Wildcard queries on text fields should not apply the fields analyzer to the search query. However, we accidentally enabled this in elastic#53127 by moving the query normalization to the StringFieldType super type. This change fixes this by separating the notion of normalization and case insensitivity (as implemented in the `case_insensitive` flag). This is done because we still need to maintain normalization of the query sting when the wildcard query method on the field type is requested from the `query_string` query parser. Wildcard queries on keyword fields should also continue to apply the fields normalizer, regardless of whether the `case_insensitive` is set, because normalization could involve something else than lowercasing (e.g. substituting umlauts like in the GermanNormalizationFilter). Closes elastic#71403
… (#72214) Wildcard queries on text fields should not apply the fields analyzer to the search query. However, we accidentally enabled this in #53127 by moving the query normalization to the StringFieldType super type. This change fixes this by separating the notion of normalization and case insensitivity (as implemented in the `case_insensitive` flag). This is done because we still need to maintain normalization of the query sting when the wildcard query method on the field type is requested from the `query_string` query parser. Wildcard queries on keyword fields should also continue to apply the fields normalizer, regardless of whether the `case_insensitive` is set, because normalization could involve something else than lowercasing (e.g. substituting umlauts like in the GermanNormalizationFilter). Closes #71403
… (#72216) Wildcard queries on text fields should not apply the fields analyzer to the search query. However, we accidentally enabled this in #53127 by moving the query normalization to the StringFieldType super type. This change fixes this by separating the notion of normalization and case insensitivity (as implemented in the `case_insensitive` flag). This is done because we still need to maintain normalization of the query sting when the wildcard query method on the field type is requested from the `query_string` query parser. Wildcard queries on keyword fields should also continue to apply the fields normalizer, regardless of whether the `case_insensitive` is set, because normalization could involve something else than lowercasing (e.g. substituting umlauts like in the GermanNormalizationFilter). Closes #71403
…ic#71751) (elastic#72214) Wildcard queries on text fields should not apply the fields analyzer to the search query. However, we accidentally enabled this in elastic#53127 by moving the query normalization to the StringFieldType super type. This change fixes this by separating the notion of normalization and case insensitivity (as implemented in the `case_insensitive` flag). This is done because we still need to maintain normalization of the query sting when the wildcard query method on the field type is requested from the `query_string` query parser. Wildcard queries on keyword fields should also continue to apply the fields normalizer, regardless of whether the `case_insensitive` is set, because normalization could involve something else than lowercasing (e.g. substituting umlauts like in the GermanNormalizationFilter). Closes elastic#71403
… (#72224) Wildcard queries on text fields should not apply the fields analyzer to the search query. However, we accidentally enabled this in #53127 by moving the query normalization to the StringFieldType super type. This change fixes this by separating the notion of normalization and case insensitivity (as implemented in the `case_insensitive` flag). This is done because we still need to maintain normalization of the query sting when the wildcard query method on the field type is requested from the `query_string` query parser. Wildcard queries on keyword fields should also continue to apply the fields normalizer, regardless of whether the `case_insensitive` is set, because normalization could involve something else than lowercasing (e.g. substituting umlauts like in the GermanNormalizationFilter). Closes #71403
Wildcard queries on text fields should not apply the fields analyzer to the
search query. However, we accidentally enabled this in #53127 by moving the
query normalization to the StringFieldType super type. This change fixes this by
separating the notion of normalization and case insensitivity (as implemented in
the
case_insensitiveflag). This is done because we still need to maintainnormalization of the query sting when the wildcard query method on the field type is
requested from the
query_stringquery parser. Wildcard queries on keywordfields should also continue to apply the fields normalizer, regardless of
whether the
case_insensitiveis set, because normalization could involvesomething else than lowercasing (e.g. substituting umlauts like in the
GermanNormalizationFilter).
Closes #71403