@peteralfonsi
Contributor

@peteralfonsi peteralfonsi commented Sep 23, 2025

Description

Some dynamically updatable parameters can change query results, for example "use_similarity" and "split_queries_on_whitespace" on keyword fields. After such a parameter is changed, the request cache can still serve entries computed under the old value, so users can get stale or incorrect results. This PR prevents that by not caching queries on fields that have non-default values for these parameters. This should be an acceptable tradeoff, since users opting into the non-default values should be fairly rare.
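
To illustrate the check described above, here is a minimal, self-contained sketch. It is not the code in this PR: `KeywordFieldParams`, `CachePolicySketch`, and `isCacheable` are hypothetical names, the mapping is a plain `Map` rather than OpenSearch's mapper classes, and it assumes both parameters default to false.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the two keyword-field parameters discussed above. */
final class KeywordFieldParams {
    final boolean useSimilarity;
    final boolean splitQueriesOnWhitespace;

    KeywordFieldParams(boolean useSimilarity, boolean splitQueriesOnWhitespace) {
        this.useSimilarity = useSimilarity;
        this.splitQueriesOnWhitespace = splitQueriesOnWhitespace;
    }

    /** Assumption: both parameters default to false. */
    boolean isDefault() {
        return !useSimilarity && !splitQueriesOnWhitespace;
    }
}

final class CachePolicySketch {
    /** A query is cacheable only if every field it touches still has default parameter values. */
    static boolean isCacheable(List<String> queriedFields, Map<String, KeywordFieldParams> mapping) {
        for (String field : queriedFields) {
            KeywordFieldParams params = mapping.get(field);
            // Fields missing from this toy mapping are treated as having defaults.
            if (params != null && !params.isDefault()) {
                return false;
            }
        }
        return true;
    }
}
```

In this sketch a query touching only default-valued fields stays cacheable, while a query with even one component on a non-default field skips the cache.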

Related Issues

#19279

Check List

  • Functionality includes testing.
  • [N/A] API changes companion pull request created, if applicable.
  • [N/A] Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Peter Alfonsi added 3 commits September 23, 2025 12:42
Signed-off-by: Peter Alfonsi <[email protected]>
Signed-off-by: Peter Alfonsi <[email protected]>
@peteralfonsi peteralfonsi requested a review from a team as a code owner September 23, 2025 20:07
Signed-off-by: Peter Alfonsi <[email protected]>
@github-actions
Contributor

❌ Gradle check result for 5c5f4d4: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Contributor

❌ Gradle check result for 6157c8a: FAILURE

@sgup432
Contributor

sgup432 commented Sep 24, 2025

@peteralfonsi Instead of trying to invalidate the cache by listening for mapper events, should we simply not cache queries for indices that set use_similarity in their mapping? I think that would simplify things, since we wouldn't have to write extra logic to handle specific scenarios. I don't think it's worth caching for such indices.

IndicesService.canCache has access to indexShard/indexSettings, and I think we can use that to check whether the index has this mapping and, if so, skip caching for it.
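
A rough sketch of what such an index-level check could look like, under stated assumptions: `IndexLevelCacheCheck` and `canCacheIndex` are hypothetical names, and the mapping is modeled as a nested `Map` of field name to parameters rather than whatever `IndicesService.canCache` can actually reach through indexShard/indexSettings.

```java
import java.util.Map;

final class IndexLevelCacheCheck {
    /**
     * Hypothetical index-level gate: refuse to cache anything for an index
     * if any field in its (toy) mapping sets use_similarity to true.
     */
    static boolean canCacheIndex(Map<String, Map<String, Object>> fieldMappings) {
        for (Map<String, Object> fieldParams : fieldMappings.values()) {
            if (Boolean.TRUE.equals(fieldParams.get("use_similarity"))) {
                return false;
            }
        }
        return true;
    }
}
```

Note that this gate is per index, so one field with the parameter set would disable caching for every query on that index, including queries that never touch the field.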

@peteralfonsi
Contributor Author

@sgup432 I think it makes more sense to keep these queries cacheable, for a couple of reasons:

  • At least 2 parameters affect query results when changed. The other one is "split_queries_on_whitespace" for keyword fields, and there may be more (for example, in field mappers from other plugins), so we don't want the solution to be specific to "use_similarity".
  • For "use_similarity", every keyword field technically has this defined; the default is just false. The problem is really in switching between the 2 states, not in the value being true or false. We can imagine some complex query where one component is on a field with "use_similarity", and it'd be unfortunate if the whole thing skipped the cache unnecessarily.
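
The "switching between the 2 states" problem can be shown with a toy cache. This is only an illustration under assumptions: `StaleCacheSketch` is a hypothetical class, the cache is keyed by the request string alone, and `execute` is a stand-in whose result depends on the current parameter value.

```java
import java.util.HashMap;
import java.util.Map;

final class StaleCacheSketch {
    // Toy request cache keyed only by the request, not by field parameters.
    private final Map<String, String> cache = new HashMap<>();
    // Hypothetical dynamically updatable field parameter.
    private boolean useSimilarity = false;

    void setUseSimilarity(boolean value) {
        useSimilarity = value;
    }

    /** Stand-in for running the query; the result depends on the parameter. */
    String execute(String request) {
        return request + "[use_similarity=" + useSimilarity + "]";
    }

    /** Serves from the cache when possible, otherwise executes and stores the result. */
    String search(String request) {
        return cache.computeIfAbsent(request, this::execute);
    }
}
```

After a first `search("q")` with the parameter at false, flipping it to true and calling `search("q")` again returns the entry computed under the old value, while `execute("q")` would give a different result. The staleness comes from the change itself, regardless of which value the parameter ends up with.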

@sgup432
Contributor

sgup432 commented Sep 24, 2025

Okay, I was hoping that not caching such indices at all would be a simpler solution, as I don't think we gain much overall by caching such queries.

But it seems that even with my approach we might have to tackle this case by case, which is what I was trying to avoid in the first place.

@peteralfonsi peteralfonsi changed the title from "Wipe request cache on mapper updates for index" to "Disable request cache for queries on fields with non-default use_similarity or split_queries_on_whitespace" Sep 25, 2025
Peter Alfonsi added 4 commits September 24, 2025 17:08
This reverts commit 57da039.

Signed-off-by: Peter Alfonsi <[email protected]>
This reverts commit 87fbc14.

Signed-off-by: Peter Alfonsi <[email protected]>
This reverts commit edc7b38.

Signed-off-by: Peter Alfonsi <[email protected]>
@github-actions
Contributor

❌ Gradle check result for 624d2b8: FAILURE

Signed-off-by: Peter Alfonsi <[email protected]>
@github-actions
Contributor

❌ Gradle check result for dc2c277: FAILURE

@peteralfonsi
Contributor Author

Flaky test: org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=search/310_match_bool_prefix/multi_match single field complete term}, which is mentioned specifically in the issue: #14294

Signed-off-by: Peter Alfonsi <[email protected]>
@github-actions
Contributor

❌ Gradle check result for 27f5b42: FAILURE

@github-actions
Contributor

❕ Gradle check result for 274c90f: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@codecov

codecov bot commented Sep 25, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.00%. Comparing base (e0ee3b4) to head (274c90f).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #19385      +/-   ##
============================================
+ Coverage     72.81%   73.00%   +0.18%     
- Complexity    69854    69930      +76     
============================================
  Files          5676     5676              
  Lines        321048   321067      +19     
  Branches      46420    46421       +1     
============================================
+ Hits         233774   234383     +609     
+ Misses        68376    67677     -699     
- Partials      18898    19007     +109     


@jainankitk jainankitk merged commit a19434c into opensearch-project:main Sep 25, 2025
33 checks passed
vinaykpud pushed a commit to vinaykpud/OpenSearch that referenced this pull request Sep 26, 2025
…ilarity` or `split_queries_on_whitespace` (opensearch-project#19385)

---------

Signed-off-by: Peter Alfonsi <[email protected]>
Signed-off-by: Peter Alfonsi <[email protected]>
Co-authored-by: Peter Alfonsi <[email protected]>
karenyrx pushed a commit to karenyrx/OpenSearch that referenced this pull request Sep 29, 2025
…ilarity` or `split_queries_on_whitespace` (opensearch-project#19385)

---------

Signed-off-by: Peter Alfonsi <[email protected]>
Signed-off-by: Peter Alfonsi <[email protected]>
Co-authored-by: Peter Alfonsi <[email protected]>
peteralfonsi added a commit to peteralfonsi/OpenSearch that referenced this pull request Oct 15, 2025
…ilarity` or `split_queries_on_whitespace` (opensearch-project#19385)

---------

Signed-off-by: Peter Alfonsi <[email protected]>
Signed-off-by: Peter Alfonsi <[email protected]>
Co-authored-by: Peter Alfonsi <[email protected]>

3 participants