Fix flaky MMR diversification YAML tests #143706

Merged
mayya-sharipova merged 2 commits into elastic:main from mayya-sharipova:fix-mrr-test on Mar 6, 2026

@mayya-sharipova
Contributor

The default int8_hnsw index type quantizes float32 vectors to
int8, introducing enough scoring error to non-deterministically
reorder documents with close cosine similarities. With only 4
dimensions the quantization is particularly coarse.
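
To illustrate the effect described above, here is a minimal sketch of how rounding to int8 can flip the order of two documents whose cosine similarities are close. The quantizer is a deliberately simplified assumption (a fixed [-1, 1] range and plain rounding), not the scheme Elasticsearch or Lucene actually uses, and the vectors and document names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def quantize_int8(vec, max_abs=1.0):
    """Simplified scalar quantizer (assumption): scale to [-127, 127] and round."""
    scale = 127 / max_abs
    return [max(-127, min(127, round(x * scale))) for x in vec]


def rank(query, docs, transform=lambda v: v):
    """Return document names sorted by descending cosine similarity."""
    q = transform(query)
    scores = {name: cosine(q, transform(vec)) for name, vec in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical 4-dimensional vectors with nearly identical similarity to the query.
query = [1.0, 0.0, 0.0, 0.0]
docs = {
    "doc_a": [0.6022, 0.8000, 0.0, 0.0],
    "doc_b": [0.6026, 0.8060, 0.0, 0.0],
}

exact_order = rank(query, docs)
quantized_order = rank(query, docs, quantize_int8)

print("exact float order:", exact_order)
print("int8 order:       ", quantized_order)
```

With these inputs the exact float scores rank doc_a first, while the rounded vectors rank doc_b first, because the two first components fall on opposite sides of a rounding boundary.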

Use an explicit hnsw index type on the test dense_vector mappings to
get exact float scoring and deterministic KNN result ordering.
Update the expected results to reflect the exact cosine ordering.
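
As a rough sketch of the mapping change, the snippet below builds an index-creation body that sets index_options.type explicitly on a dense_vector field. The changed YAML test files are not shown in this conversation, so the field name my_vector and the helper function are hypothetical; only the 4 dimensions, cosine similarity, and the hnsw and int8_hnsw type names come from the description.

```python
import json


def dense_vector_mapping(field, dims, index_type):
    """Build an index-creation body with an explicit dense_vector index type.

    The field name passed in is a placeholder; the real test mappings may differ.
    """
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,
                    "similarity": "cosine",
                    # "hnsw" keeps float32 vectors; "int8_hnsw" would quantize them.
                    "index_options": {"type": index_type},
                }
            }
        }
    }


body = dense_vector_mapping("my_vector", dims=4, index_type="hnsw")
print(json.dumps(body, indent=2))
```

Pinning the type in the mapping means the test no longer depends on whatever the default index type happens to be.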

Closes #143430
Closes #143609

@mayya-sharipova added the >test, :Search Relevance/Vectors, and v9.4.0 labels on Mar 5, 2026
@elasticsearchmachine added the Team:Search Relevance label on Mar 5, 2026
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@mayya-sharipova requested a review from pmpailis on March 6, 2026 13:39
Contributor

@pmpailis pmpailis left a comment

LGTM

@mayya-sharipova merged commit 2058c4c into elastic:main on Mar 6, 2026
36 checks passed
@mayya-sharipova deleted the fix-mrr-test branch on March 6, 2026 14:12
szybia added a commit to szybia/elasticsearch that referenced this pull request Mar 6, 2026
…locations

sidosera pushed a commit to sidosera/elasticsearch that referenced this pull request Mar 6, 2026