-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Mask wildcard query special characters on keyword queries #53127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Wildcard queries on keyword fields get normalized, however this normalization step should exclude the two special characters * and ? in order to keep the wildcard query itself intact. Closes elastic#46300
|
Pinging @elastic/es-search (:Search/Analysis) |
server/src/main/java/org/elasticsearch/index/mapper/KeywordFieldMapper.java
Outdated
Show resolved
Hide resolved
|
@jimczi pushed some changes moving the code like you suggested |
|
@elasticmachine update branch |
|
merge conflict between base and head |
jimczi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The StringFieldType looks good to me.
I am less sure about the modifications in the _type field. I left a comment on how we can address it but that can be done in a follow up if you prefer.
server/src/main/java/org/elasticsearch/index/mapper/TypeFieldMapper.java
Outdated
Show resolved
Hide resolved
|
@jimczi @romseygeek thanks for the review, I added a commit that changes TypeFieldType to extend ConstantFieldType like you suggested and adapted tests where necessary. Hope this looks okay now. |
We can have only one type in 7x, but it may be something other than |
server/src/main/java/org/elasticsearch/index/mapper/TypeFieldMapper.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/ConstantFieldType.java
Outdated
Show resolved
Hide resolved
jimczi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change looks good to me. I left one comment regarding the handling of prefix and wildcard queries for _type field. Feel free to address it without further review.
| result = new MatchNoDocsQuery("[_type] was lexicographically greater than upper bound of range"); | ||
| } | ||
| } | ||
| protected boolean matches(String pattern, QueryShardContext context) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to handle wildcard and prefixes here ? We don't support prefix and wildcard queries on the _type field today and since _type are now a thing from the past I don't think we should add this ability. Just checking that the pattern exactly matches the internal type should be enough.
|
@elasticmachine run elasticsearch-ci/2 |
|
@elasticmachine update branch |
) Wildcard queries on keyword fields get normalized, however this normalization step should exclude the two special characters * and ? in order to keep the wildcard query itself intact. Closes elastic#46300
|
@cbuescher I removed the |
Wildcard queries on text fields should not apply the fields analyzer to the search query. However, we accidentally enabled this in elastic#53127 by moving the query normalization to the StringFieldType super type. This change fixes this by separating the notion of normalization and case insensitivity (as implemented in the `case_insensitive` flag). This is done because we still need to maintain normalization of the query sting when the wildcard query method on the field type is requested from the `query_string` query parser. Wildcard queries on keyword fields should also continue to apply the fields normalizer, regardless of whether the `case_insensitive` is set, because normalization could involve something else than lowercasing (e.g. substituting umlauts like in the GermanNormalizationFilter). Closes elastic#71403
Wildcard queries on text fields should not apply the fields analyzer to the search query. However, we accidentally enabled this in #53127 by moving the query normalization to the StringFieldType super type. This change fixes this by separating the notion of normalization and case insensitivity (as implemented in the `case_insensitive` flag). This is done because we still need to maintain normalization of the query sting when the wildcard query method on the field type is requested from the `query_string` query parser. Wildcard queries on keyword fields should also continue to apply the fields normalizer, regardless of whether the `case_insensitive` is set, because normalization could involve something else than lowercasing (e.g. substituting umlauts like in the GermanNormalizationFilter). Closes #71403
…ic#71751) Wildcard queries on text fields should not apply the fields analyzer to the search query. However, we accidentally enabled this in elastic#53127 by moving the query normalization to the StringFieldType super type. This change fixes this by separating the notion of normalization and case insensitivity (as implemented in the `case_insensitive` flag). This is done because we still need to maintain normalization of the query sting when the wildcard query method on the field type is requested from the `query_string` query parser. Wildcard queries on keyword fields should also continue to apply the fields normalizer, regardless of whether the `case_insensitive` is set, because normalization could involve something else than lowercasing (e.g. substituting umlauts like in the GermanNormalizationFilter). Closes elastic#71403
…ic#71751) Wildcard queries on text fields should not apply the fields analyzer to the search query. However, we accidentally enabled this in elastic#53127 by moving the query normalization to the StringFieldType super type. This change fixes this by separating the notion of normalization and case insensitivity (as implemented in the `case_insensitive` flag). This is done because we still need to maintain normalization of the query sting when the wildcard query method on the field type is requested from the `query_string` query parser. Wildcard queries on keyword fields should also continue to apply the fields normalizer, regardless of whether the `case_insensitive` is set, because normalization could involve something else than lowercasing (e.g. substituting umlauts like in the GermanNormalizationFilter). Closes elastic#71403
… (#72214) Wildcard queries on text fields should not apply the fields analyzer to the search query. However, we accidentally enabled this in #53127 by moving the query normalization to the StringFieldType super type. This change fixes this by separating the notion of normalization and case insensitivity (as implemented in the `case_insensitive` flag). This is done because we still need to maintain normalization of the query sting when the wildcard query method on the field type is requested from the `query_string` query parser. Wildcard queries on keyword fields should also continue to apply the fields normalizer, regardless of whether the `case_insensitive` is set, because normalization could involve something else than lowercasing (e.g. substituting umlauts like in the GermanNormalizationFilter). Closes #71403
… (#72216) Wildcard queries on text fields should not apply the fields analyzer to the search query. However, we accidentally enabled this in #53127 by moving the query normalization to the StringFieldType super type. This change fixes this by separating the notion of normalization and case insensitivity (as implemented in the `case_insensitive` flag). This is done because we still need to maintain normalization of the query sting when the wildcard query method on the field type is requested from the `query_string` query parser. Wildcard queries on keyword fields should also continue to apply the fields normalizer, regardless of whether the `case_insensitive` is set, because normalization could involve something else than lowercasing (e.g. substituting umlauts like in the GermanNormalizationFilter). Closes #71403
…ic#71751) (elastic#72214) Wildcard queries on text fields should not apply the fields analyzer to the search query. However, we accidentally enabled this in elastic#53127 by moving the query normalization to the StringFieldType super type. This change fixes this by separating the notion of normalization and case insensitivity (as implemented in the `case_insensitive` flag). This is done because we still need to maintain normalization of the query sting when the wildcard query method on the field type is requested from the `query_string` query parser. Wildcard queries on keyword fields should also continue to apply the fields normalizer, regardless of whether the `case_insensitive` is set, because normalization could involve something else than lowercasing (e.g. substituting umlauts like in the GermanNormalizationFilter). Closes elastic#71403
… (#72224) Wildcard queries on text fields should not apply the fields analyzer to the search query. However, we accidentally enabled this in #53127 by moving the query normalization to the StringFieldType super type. This change fixes this by separating the notion of normalization and case insensitivity (as implemented in the `case_insensitive` flag). This is done because we still need to maintain normalization of the query sting when the wildcard query method on the field type is requested from the `query_string` query parser. Wildcard queries on keyword fields should also continue to apply the fields normalizer, regardless of whether the `case_insensitive` is set, because normalization could involve something else than lowercasing (e.g. substituting umlauts like in the GermanNormalizationFilter). Closes #71403
Wildcard queries on keyword fields should get normalized, however this normalization
should exclude the two special characters * and ? in order to keep the wildcard query
itself intact.
Closes #46300