
[Security][Detection Engine] ESQL Rule Execution Logic Integration Test#252936

Merged
hannahbrooks merged 11 commits into main from 235895-failing-test-detection-engine-esql-rule
Feb 19, 2026

Conversation

@hannahbrooks
Contributor

@hannahbrooks hannahbrooks commented Feb 12, 2026

Summary

Resolves #235895

The test that was failing was:

should generate alerts over multiple pages from different indices but same event id for mv_expand when number alerts exceeds max signal

This means that when the same document exists in multiple indices (same id), and a rule uses mv_expand, the system should create alerts from all indices. This can take multiple runs because the number of alerts created may exceed max_signals (maximum number of alerts that can be created per run).
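For context, a minimal sketch of the kind of query involved (index and field names here are illustrative, not the test's actual query): MV_EXPAND fans one source row out into one row per value of the expanded field, so a single document duplicated across two indices can yield far more candidate alerts than max_signals allows in one run.

```esql
// Illustrative only: MV_EXPAND turns one row into one row per value of the
// expanded field, so a single document can produce many candidate alerts.
FROM ecs_compliant, ecs_compliant_synthetic_source METADATA _id, _index
| MV_EXPAND host.ip
| SORT @timestamp ASC
```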

Bug

The test inserts the same document into both of its indices with an identical timestamp. When Elasticsearch fetched these documents, it pulled inconsistently from ecs_compliant and ecs_compliant_synthetic_source, leading to unpredictable ordering. As a result, updateExcludedDocuments would not consistently receive the same documents to exclude across runs.

Fix

I initially added an index-based increment to the milliseconds of the timestamp, so that Elasticsearch could always order documents that share an id. That approach was judged to be a workaround, so instead I added a tiebreaker to the sort, _index, to make the ordering deterministic.
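As a sketch (field names illustrative; the real query lives in the test file), the tiebreaker amounts to extending the sort so ties between the two indices always resolve the same way:

```esql
// Before: ties were unordered when _id and @timestamp matched across indices.
// After: _index breaks the tie, so paging and exclusion see a stable order.
FROM ecs_compliant, ecs_compliant_synthetic_source METADATA _id, _index
| MV_EXPAND agent.name
| SORT @timestamp ASC, _index ASC
```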

Testing

I ran a loop to ensure that the test passed at least 10 times in a row.

for i in {1..10}; do
  echo "=== Run $i ==="
  npm run rule_execution_logic:esql:runner:ess > /tmp/test-run-$i.log 2>&1
  if [ $? -ne 0 ]; then
    echo "=== FAILED on run $i — see /tmp/test-run-$i.log ==="
    break
  fi
  echo "=== PASSED run $i ==="
  rm /tmp/test-run-$i.log
done
[Screenshot: terminal output of the 10-run loop, all runs passing]

Checklist

Check that the PR satisfies the following conditions.

Reviewers should verify this PR satisfies this list as well.

@hannahbrooks hannahbrooks self-assigned this Feb 12, 2026
@hannahbrooks hannahbrooks added backport:all-open Backport to all branches that could still receive a release Team:Detection Engine Security Solution Detection Engine Area labels Feb 12, 2026
@hannahbrooks hannahbrooks marked this pull request as ready for review February 12, 2026 18:39
@hannahbrooks hannahbrooks requested a review from a team as a code owner February 12, 2026 18:39
@hannahbrooks hannahbrooks requested a review from rylnd February 12, 2026 18:39
@elasticmachine
Contributor

Pinging @elastic/security-detection-engine (Team:Detection Engine)

@kibanamachine
Contributor

Flaky Test Runner Stats

🎉 All tests passed! - kibana-flaky-test-suite-runner#10741

[✅] x-pack/solutions/security/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/esql/trial_license_complete_tier/configs/ess.config.ts: 100/100 tests passed.
[✅] x-pack/solutions/security/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/esql/trial_license_complete_tier/configs/serverless.config.ts: 100/100 tests passed.

see run history

Contributor

@rylnd rylnd left a comment


Thank you for the investigation and writeup, here! Nice work.

However: TL;DR I don't think we should pretend this situation can't happen.

Since we are not fixing the fact of these circumstances, and instead are modifying our test to avoid these circumstances:

How does the nondeterministic ordering exhibited in these test failures affect rule execution?

  • If it affects rule execution in any way, I think we need to either fix that behavior, or, if we can't (e.g. due to an ES|QL limitation) or it's determined not to be a bug, then we should at least document how it can manifest/affect things.
  • If it can't affect rule execution, then I think we should try instead to make the test robust to that situation (by relaxing constraints within the test, somehow), rather than avoiding the situation entirely.

@hannahbrooks
Contributor Author

hannahbrooks commented Feb 17, 2026

Replying to @rylnd:

How does the nondeterministic ordering exhibited in these test failures affect rule execution?
Though rare, the nondeterministic ordering CAN affect rule execution if the following conditions are present (as in the test):

  1. Results hit maxSignals (usually due to using mv_expand)
  2. Documents across indices have the same _id and @timestamp (creating no tiebreaker)

Can we fix the behaviour?
What we need to add is some sort of tiebreaker. However, the sort order is part of the user's query, and if we append a second sort we override the original one. Modifying the existing sort would require parsing the query, which doesn't currently happen. For the test, I can add a sort on _index along with a comment explaining why.
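To illustrate the override concern (hypothetical query): in ES|QL, a later SORT re-sorts the entire result rather than refining the previous ordering, so naively appending a sort discards the user's own ordering:

```esql
FROM logs-* METADATA _id, _index
| MV_EXPAND host.ip
| SORT @timestamp DESC   // the user's intended ordering
| SORT _index ASC        // appended sort: re-sorts everything by _index alone;
                         // it does NOT act as a tiebreaker on the line above
```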

Follow-up
I can open an issue to investigate adding _index automatically. There would be some edge cases though (e.g. no sort, no _index, already sorting by _index). Maybe that can help us determine whether this is a bug or whether it should simply be documented.

@kibanamachine
Contributor

Flaky Test Runner Stats

🎉 All tests passed! - kibana-flaky-test-suite-runner#10776

[✅] x-pack/solutions/security/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/esql/trial_license_complete_tier/configs/ess.config.ts: 100/100 tests passed.
[✅] x-pack/solutions/security/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/esql/trial_license_complete_tier/configs/serverless.config.ts: 100/100 tests passed.

see run history

@rylnd
Contributor

rylnd commented Feb 18, 2026

@vitaliidm since you wrote this test initially: do you have any thoughts as to how to best address this? I'm still not quite sure if this is a bug with the rule executor, or simply "user misconfiguration" (i.e. if you only sort on @timestamp, and you're mv_expanding to duplicate both _id and @timestamp, it's possible you'll either miss or duplicate alerts).

@vitaliidm
Contributor

I agree with @rylnd

We should
a) try to fix it
b) document the potentially inconsistent behaviour

Documenting may be the easiest way to proceed, considering why this happens and its implications:

  1. We exclude documents from search because we lack ES|QL results pagination, so this is already a workaround that we would like to get rid of as soon as the pagination feature is added.
  2. The problem is exclusively related to MV_EXPAND usage and multiple alerts being created from a single document.
  3. It can also only happen when we hit max signals. This narrows down the possible occurrences.

What are the implications when we hit this edge case?

Per design, we do not generate more than max_signals alerts from a single document.
Worst case: up to max_signals - 1 alerts from one index's copy of the document won't be created across rule re-executions, because the boundary document flips non-deterministically between indices. The document itself still generates alerts from at least one index.

Possible fixes:

Adding _index as a default sort in the query could prevent this from happening. At the same time, it could prevent detection of alerts from other indices, since that sort would prioritise results from one index over the rest. This could happen if the user does not sort by @timestamp (or anything else), and it could even lead to excluding alerts from events: if we create max_signals alerts from index_1, an event from index_2 may fall out of the rule interval by the next execution.

So, I would suggest:

  1. Adding a bullet point to https://www.elastic.co/docs/solutions/security/detect-and-alert/create-detection-rule#esql-query-design, where we already suggest using @timestamp, recommending _index as a tiebreaker whenever MV_EXPAND is used.
  2. Creating an issue where we can dig deeper into what can be done. Btw, we have "Automatically inject metadata _id into ES|QL Detection Rules" #248194 on adding _id automatically, but it might involve some compromises, since any field can be transformed in an ES|QL query.

@hannahbrooks
Contributor Author

hannahbrooks commented Feb 18, 2026

Thank you for this input. I agree with your suggestions, and I will proceed with:

Contributor

@rylnd rylnd left a comment


The test fix as written looks good, as long as the followup issue is created as discussed here.

P.S. Once this is merged, don't forget to close the associated failed-test issue!

@elasticmachine
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

cc @hannahbrooks

@hannahbrooks hannahbrooks merged commit 8cb144e into main Feb 19, 2026
16 checks passed
@hannahbrooks hannahbrooks deleted the 235895-failing-test-detection-engine-esql-rule branch February 19, 2026 15:49
@kibanamachine
Contributor

Starting backport for target branches: 8.19, 9.2, 9.3

https://github.com/elastic/kibana/actions/runs/22188978052

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Feb 19, 2026
…st (elastic#252936)

## Summary

Resolves [elastic#235895](elastic#235895)
When mv_expand is used, all documents added to the indices share the same _id and @timestamp. This leads to nondeterministic ordering when Elasticsearch pulls documents: there is no tiebreaker, so results are unpredictable. This PR fixes a test that encounters this issue.

(cherry picked from commit 8cb144e)
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Feb 19, 2026
…st (elastic#252936)

## Summary

Resolves [elastic#235895](elastic#235895)
When mv_expand is used, all documents added to the indices share the same _id and @timestamp. This leads to nondeterministic ordering when Elasticsearch pulls documents: there is no tiebreaker, so results are unpredictable. This PR fixes a test that encounters this issue.

(cherry picked from commit 8cb144e)
@kibanamachine
Contributor

💔 Some backports could not be created

Status Branch Result
8.19 Backport failed because of merge conflicts

You might need to backport the following PRs to 8.19:
- [ska] relocation security_solution_* FTR tests (#231416)
9.2
9.3

Note: Successful backport PRs will be merged automatically after passing CI.

Manual backport

To create the backport manually run:

node scripts/backport --pr 252936

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Feb 19, 2026
…ion Test (#252936) (#254034)

# Backport

This will backport the following commits from `main` to `9.3`:
- [[Security][Detection Engine] ESQL Rule Execution Logic Integration
Test (#252936)](#252936)


### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)


Co-authored-by: Hannah Brooks <hannah.brooks@elastic.co>
kibanamachine added a commit that referenced this pull request Feb 19, 2026
…ion Test (#252936) (#254033)

# Backport

This will backport the following commits from `main` to `9.2`:
- [[Security][Detection Engine] ESQL Rule Execution Logic Integration
Test (#252936)](#252936)


### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)


Co-authored-by: Hannah Brooks <hannah.brooks@elastic.co>
ersin-erdal pushed a commit to ersin-erdal/kibana that referenced this pull request Feb 19, 2026
…st (elastic#252936)

## Summary

Resolves [elastic#235895](elastic#235895)
When mv_expand is used, all documents added to the indices share the same _id and @timestamp. This leads to nondeterministic ordering when Elasticsearch pulls documents: there is no tiebreaker, so results are unpredictable. This PR fixes a test that encounters this issue.
@hannahbrooks
Contributor Author

💚 All backports created successfully

Status Branch Result
8.19

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

hannahbrooks added a commit to hannahbrooks/kibana that referenced this pull request Feb 19, 2026
…st (elastic#252936)

## Summary

Resolves [elastic#235895](elastic#235895)
When mv_expand is used, all documents added to the indices share the same _id and @timestamp. This leads to nondeterministic ordering when Elasticsearch pulls documents: there is no tiebreaker, so results are unpredictable. This PR fixes a test that encounters this issue.

(cherry picked from commit 8cb144e)

# Conflicts:
#	x-pack/solutions/security/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/esql/trial_license_complete_tier/esql.ts
@hannahbrooks hannahbrooks added (do not use) backport:9.2 This doesn't do backports! use `backport:version` `v9.2.0` instead (do not use) backport:9.3 This doesn't do backports! use `backport:version` `v9.3.0` instead and removed backport:all-open Backport to all branches that could still receive a release labels Feb 19, 2026
hannahbrooks added a commit to elastic/docs-content that referenced this pull request Mar 10, 2026

## Summary

Requested following the discussion in
[elastic/kibana#252936](elastic/kibana#252936).

When `mv_expand` is used, all documents added to the indices share the same
`_id` and `@timestamp`. This leads to nondeterministic ordering when
Elasticsearch pulls documents: there is no tiebreaker, so results are
unpredictable. Here we add a note to the docs to encourage users to add a
tiebreaker and avoid this behaviour.