[kbn-evals] Fix missing datasets in report table due to refresh race#265549

Open
patrykkopycinski wants to merge 8 commits into elastic:main from patrykkopycinski:pk/evals-fix-refresh-race

@patrykkopycinski
Contributor

Summary

Fixes a race condition where the last dataset(s) in an eval run would intermittently be missing from the results table.

Root cause: indexSingleScore() writes individual score documents with refresh: false for performance. When exportEvaluations() later attempts a bulk upsert of the same documents, it gets 409 Conflict responses for already-written docs. The refresh: 'wait_for' on the bulk request only applies to newly created documents — not the conflicted ones. This leaves the last scenario's score documents invisible when reportModelScore() queries the kibana-evaluations index for aggregated stats.
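
The sequence described in the root cause can be reproduced with a toy, single-shard model of refresh visibility. This is only a sketch under stated assumptions: `ToyIndex`, `writeIfAbsent`, and the document id are hypothetical names, and the model encodes the behaviour as this PR describes it rather than Elasticsearch internals.

```typescript
// Toy single-shard model of refresh visibility. This is NOT Elasticsearch;
// ToyIndex and its methods are hypothetical names used only for illustration.
type RefreshPolicy = false | 'wait_for';

class ToyIndex {
  private pending = new Map<string, number>();    // written, not yet searchable
  private searchable = new Map<string, number>(); // visible to search

  // Create-only write. Returns 201 on success, 409 if the id already exists.
  writeIfAbsent(id: string, score: number, refresh: RefreshPolicy): 201 | 409 {
    if (this.pending.has(id) || this.searchable.has(id)) {
      // Conflict: nothing is written, so 'wait_for' has nothing to wait on.
      return 409;
    }
    this.pending.set(id, score);
    if (refresh === 'wait_for') {
      this.refresh();
    }
    return 201;
  }

  // Explicit refresh: everything pending becomes searchable.
  refresh(): void {
    this.pending.forEach((score, id) => {
      this.searchable.set(id, score);
    });
    this.pending.clear();
  }

  searchableCount(): number {
    return this.searchable.size;
  }
}

const toyIndex = new ToyIndex();

// Step 1: the per-score write with refresh disabled (like indexSingleScore).
toyIndex.writeIfAbsent('last-dataset:score', 0.9, false);

// Step 2: the later write of the same document conflicts (like the bulk request).
const conflictStatus = toyIndex.writeIfAbsent('last-dataset:score', 0.9, 'wait_for');

// Step 3: a stats query at this point would not see the document.
const visibleBeforeRefresh = toyIndex.searchableCount();

// Step 4: an explicit refresh makes it searchable.
toyIndex.refresh();
const visibleAfterRefresh = toyIndex.searchableCount();
```

In this model the second write returns 409, zero documents are searchable before the explicit refresh, and one is searchable after it.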

Fix: Add an explicit indices.refresh({ index: 'kibana-evaluations' }) call after exportEvaluations() and before reportModelScore() so that all score documents are searchable before the stats query runs. The trailing .catch(() => {}) swallows any refresh error, which covers the case where the index doesn't exist yet.
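
A minimal sketch of the refresh-and-ignore-errors step, written against a small structural type instead of the real client so that it runs on its own. `RefreshCapableClient`, `refreshBeforeStats`, and the two stand-in clients are hypothetical; only the `indices.refresh({ index })` call shape and the `.catch(() => {})` come from this PR.

```typescript
// Minimal structural type so the sketch does not depend on a client package;
// only the indices.refresh({ index }) call shape is assumed.
interface RefreshCapableClient {
  indices: {
    refresh(params: { index: string }): Promise<unknown>;
  };
}

// Hypothetical helper: force a refresh and ignore any failure,
// mirroring the `.catch(() => {})` in the fix.
async function refreshBeforeStats(
  client: RefreshCapableClient,
  index: string
): Promise<void> {
  await client.indices.refresh({ index }).catch(() => {});
}

// Stand-in clients that record which index was refreshed.
const refreshCalls: string[] = [];

const okClient: RefreshCapableClient = {
  indices: {
    refresh: (params) => {
      refreshCalls.push(params.index);
      return Promise.resolve({});
    },
  },
};

// Rejects the way a refresh against a missing index might.
const failingClient: RefreshCapableClient = {
  indices: {
    refresh: (params) => {
      refreshCalls.push(params.index);
      return Promise.reject(new Error('index_not_found_exception'));
    },
  },
};

const okResult = refreshBeforeStats(okClient, 'kibana-evaluations');
const failingResult = refreshBeforeStats(failingClient, 'kibana-evaluations');
```

Both calls return a promise that settles without throwing, so a missing index does not abort the report. The trade-off is that every other refresh failure is hidden as well.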

Test plan

  • Run a multi-dataset eval suite (e.g., pci-compliance with 8 datasets) — all datasets should appear in the final results table
  • Verified the fix resolves the missing "no matching data" dataset that was previously intermittently absent
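
The first check in the test plan could be automated along these lines. This is a hedged sketch: `findMissingDatasets` is a hypothetical helper, and apart from "no matching data" the dataset names are placeholders rather than the real suite's names.

```typescript
// Hypothetical helper: return the expected dataset names that are absent
// from the rows of the final results table.
function findMissingDatasets(expected: string[], reported: string[]): string[] {
  const seen = new Set(reported);
  return expected.filter((datasetName) => !seen.has(datasetName));
}

// Placeholder dataset names, except "no matching data" from the test plan.
const expectedDatasets = ['dataset-a', 'dataset-b', 'no matching data'];

// Simulates a run where the last dataset did not make it into the table.
const reportedDatasets = ['dataset-a', 'dataset-b'];

const missingDatasets = findMissingDatasets(expectedDatasets, reportedDatasets);
```

An empty result means every expected dataset appeared in the table. Because the failure was intermittent, a check like this would need to pass over repeated runs to be meaningful.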

Made with Cursor

indexSingleScore writes documents with refresh:false for performance.
The subsequent exportEvaluations bulk upsert gets 409 conflicts on
already-written docs, and its refresh:'wait_for' only applies to newly
created documents. This leaves the last scenario's scores invisible
when reportModelScore queries the index for aggregated stats.

Add an explicit indices.refresh() after exportEvaluations and before
reportModelScore to ensure all score documents are searchable.
@patrykkopycinski force-pushed the pk/evals-fix-refresh-race branch from c608bf9 to 4eb5f87 on April 24, 2026 13:52
@elasticmachine
Contributor

⏳ Build in-progress, with failures

Model.id is string | undefined in @kbn/inference-common, so the
interface must accept undefined too.

// on those docs so its refresh:'wait_for' won't cover them. Force a refresh
// to make every score visible before the stats query.
await evaluationsEsClient.indices
  .refresh({ index: 'kibana-evaluations' })
Contributor

🟢 Low src/evaluate.ts:391

The refresh call at line 391 uses the hardcoded string 'kibana-evaluations' instead of the EVALUATIONS_DATA_STREAM_ALIAS constant used throughout EvaluationScoreRepository. If the constant value changes, the refresh silently targets the wrong index (error swallowed by .catch(() => {})), causing reportModelScore to potentially not see all documents. Consider using EVALUATIONS_DATA_STREAM_ALIAS instead.

Evidence trail:
x-pack/platform/packages/shared/kbn-evals/src/evaluate.ts lines 389-393 (REVIEWED_COMMIT) - shows hardcoded 'kibana-evaluations' and .catch(() => {}); x-pack/platform/packages/shared/kbn-evals/src/utils/score_repository.ts line 187 (REVIEWED_COMMIT) - defines EVALUATIONS_DATA_STREAM_ALIAS = 'kibana-evaluations'; git_grep results show EVALUATIONS_DATA_STREAM_ALIAS used at lines 411, 416, 418, 485, 501, 541, 634, 658, 670, 727 in score_repository.ts; git_grep for 'export.*EVALUATIONS_DATA_STREAM_ALIAS' returned no results confirming constant is not exported.
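
One way to act on this suggestion, sketched under assumptions: the evidence trail notes that the constant is not currently exported, so the declaration in score_repository.ts would first need an `export`. `buildRefreshParams` is a hypothetical helper added only to show the constant replacing the string literal.

```typescript
// Value taken from the evidence trail above. In score_repository.ts the
// declaration would additionally need the `export` keyword to be importable
// from evaluate.ts; it is declared locally here to keep the sketch self-contained.
const EVALUATIONS_DATA_STREAM_ALIAS = 'kibana-evaluations';

// Hypothetical helper: build the refresh parameters from the shared constant
// rather than repeating the string literal at the call site.
function buildRefreshParams(): { index: string } {
  return { index: EVALUATIONS_DATA_STREAM_ALIAS };
}

const refreshParams = buildRefreshParams();
```

With this shape, changing the alias in one place would also change the refresh target, so the call site could not silently drift to a stale index name.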

@patrykkopycinski self-assigned this on Apr 27, 2026
@patrykkopycinski added the release_note:skip (Skip the PR/issue when compiling release notes), v9.4.0, and v9.5.0 labels on Apr 27, 2026
@kibanamachine
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

cc @patrykkopycinski

@patrykkopycinski added the backport:version (Backport to applied version labels) label on Apr 27, 2026