@cbuescher
Member

Wildcard queries on text fields should not apply the field's analyzer to the
search query. However, we accidentally enabled this in #53127 by moving the
query normalization to the StringFieldType super type. This change fixes that by
separating the notions of normalization and case insensitivity (as implemented by
the case_insensitive flag). The separation is needed because we still have to
normalize the query string when the wildcard query method on the field type is
called from the query_string query parser. Wildcard queries on keyword
fields should also continue to apply the field's normalizer, regardless of
whether case_insensitive is set, because normalization can involve
more than lowercasing (e.g. substituting umlauts, as the
GermanNormalizationFilter does).

Closes #71403
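
To illustrate the separation described above, here is a minimal, self-contained sketch. It is not the actual Elasticsearch code: the class names `TextLikeField`, `KeywordLikeField`, and `Built`, the `shouldNormalize` parameter name, and the toy normalizer are all assumptions made for illustration. Only the method names `wildcardQuery` and `normalizedWildcardQuery` are taken from the discussion in this PR. The sketch also normalizes the whole pattern with a plain string function, which happens to leave `*` and `?` alone; real normalization has to take care of the wildcard characters explicitly.

```java
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical, simplified model of the behaviour described in this PR.
// None of these classes are the real Elasticsearch field types.
class WildcardNormalizationSketch {

    // Toy normalizer (assumption): lowercases and folds u-umlaut to "u",
    // loosely imitating a normalizer built on GermanNormalizationFilter.
    static final UnaryOperator<String> TOY_NORMALIZER =
        s -> s.toLowerCase(Locale.ROOT).replace("\u00fc", "u");

    // The pattern that would be handed to the underlying wildcard query,
    // plus the case-insensitivity flag, which is carried separately.
    static final class Built {
        final String pattern;
        final boolean caseInsensitive;

        Built(String pattern, boolean caseInsensitive) {
            this.pattern = pattern;
            this.caseInsensitive = caseInsensitive;
        }
    }

    // Stand-in for a text-like string field.
    static class TextLikeField {
        private final UnaryOperator<String> normalizer;

        TextLikeField(UnaryOperator<String> normalizer) {
            this.normalizer = normalizer;
        }

        // Wildcard query entry point: the pattern is left untouched.
        Built wildcardQuery(String value, boolean caseInsensitive) {
            return wildcardQuery(value, caseInsensitive, false);
        }

        // query_string entry point: always normalizes, has no case option.
        Built normalizedWildcardQuery(String value) {
            return wildcardQuery(value, false, true);
        }

        // Shared implementation that takes both flags independently.
        protected Built wildcardQuery(String value, boolean caseInsensitive, boolean shouldNormalize) {
            String pattern = shouldNormalize ? normalizer.apply(value) : value;
            return new Built(pattern, caseInsensitive);
        }
    }

    // Stand-in for a keyword-like field: normalizes whatever the case flag says.
    static class KeywordLikeField extends TextLikeField {
        KeywordLikeField(UnaryOperator<String> normalizer) {
            super(normalizer);
        }

        @Override
        Built wildcardQuery(String value, boolean caseInsensitive) {
            return wildcardQuery(value, caseInsensitive, true);
        }
    }

    public static void main(String[] args) {
        TextLikeField text = new TextLikeField(TOY_NORMALIZER);
        KeywordLikeField keyword = new KeywordLikeField(TOY_NORMALIZER);

        // Same input pattern, three different entry points.
        System.out.println(text.wildcardQuery("M\u00fc*", true).pattern);
        System.out.println(keyword.wildcardQuery("M\u00fc*", false).pattern);
        System.out.println(text.normalizedWildcardQuery("M\u00fc*").pattern);
    }
}
```

In this sketch the text-like field passes the pattern through exactly as written and only records the case flag, while the keyword-like field and the `query_string` path both produce `mu*`. The case flag never changes the pattern itself, which is the point of keeping the two notions apart.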

@cbuescher added the >bug, :Search/Search, v8.0.0, v7.13.0, and v7.12.2 labels on Apr 15, 2021
@cbuescher requested a review from markharwood on Apr 15, 2021, 14:09
@elasticmachine added the Team:Search label on Apr 15, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

Contributor

@markharwood left a comment

Left a couple of comments. It also looks like WildcardFieldMapper is missing a normalizedWildcardQuery override, but otherwise LGTM.

assertHitCount(searchResponse, 0L);

wildCardQuery = wildcardQuery("field1", "bb*");
searchResponse = client().prepareSearch().setQuery(wildCardQuery).get();
Contributor

Maybe add a test where the search string is mixed case but wildCardQuery.caseInsensitive(true) is set.

boolean caseInsensitive,
SearchExecutionContext context
) {
return super.wildcardQuery(value, method, caseInsensitive, true, context);
Contributor

Maybe calling normalizedWildcardQuery here would make the intent more obvious, and it would also help when tracing where we use normalized wildcard queries.

Member Author

Good idea, but I left the caseInsensitive parameter out of the normalizedWildcardQuery signature on purpose, so I need to call the protected method that takes both arguments here. My thinking was that we use normalizedWildcardQuery only from QueryStringQueryParser where we don't have the caseInsensitive option. Maybe you see a different solution?

Contributor

Ah ok. That makes sense.

@cbuescher merged commit 0519e37 into elastic:master on Apr 26, 2021
cbuescher pushed a commit to cbuescher/elasticsearch that referenced this pull request Apr 26, 2021
…ic#71751)

cbuescher pushed a commit to cbuescher/elasticsearch that referenced this pull request Apr 26, 2021
…ic#71751)

cbuescher pushed a commit that referenced this pull request Apr 26, 2021
… (#72214)

cbuescher pushed a commit that referenced this pull request Apr 26, 2021
… (#72216)

cbuescher pushed a commit to cbuescher/elasticsearch that referenced this pull request Apr 26, 2021
…ic#71751) (elastic#72214)

cbuescher pushed a commit that referenced this pull request Apr 26, 2021
… (#72224)

Labels

>bug, :Search/Search, Team:Search, v7.12.2, v7.13.0, v7.14.0, v8.0.0-alpha1

Development

Successfully merging this pull request may close these issues.

Wildcard queries on text fields don't obey rules for case sensitivity
