ES|QL support for partial results #223198

Merged
nkhristinin merged 17 commits into elastic:main from nkhristinin:esql-partial
Jun 20, 2025

@nkhristinin nkhristinin commented Jun 10, 2025

ES|QL support for partial results

Issue: #211622

We have 2 use cases:

  • For aggregating queries, we set allow_partial_results to false, so a shard failure fails the rule run.
  • For non-aggregating queries, we allow partial results and set a warning status if there are cluster failures.
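
The two cases above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the types `EsqlParams` and `EsqlResponseLike` and the functions `buildEsqlParams` and `classifyRun` are hypothetical names; only `drop_null_columns`, `allow_partial_results`, and `is_partial` appear in the PR itself.

```typescript
// Hypothetical, simplified shapes used only for this sketch.
interface EsqlParams {
  drop_null_columns: boolean;
  allow_partial_results: boolean;
}

interface EsqlResponseLike {
  is_partial?: boolean;
}

type RunStatus = 'succeeded' | 'partial-with-warning';

// Aggregating queries disallow partial results, so a shard failure
// makes the whole request fail instead of returning incomplete data.
function buildEsqlParams(isRuleAggregating: boolean): EsqlParams {
  return {
    drop_null_columns: true,
    allow_partial_results: !isRuleAggregating,
  };
}

// For non-aggregating queries a partial response is accepted,
// but the run is flagged so a warning can be surfaced.
function classifyRun(response: EsqlResponseLike): RunStatus {
  return response.is_partial === true ? 'partial-with-warning' : 'succeeded';
}
```

With this shape, `buildEsqlParams(true)` disables partial results for an aggregating rule, while a non-aggregating rule keeps them enabled and relies on `classifyRun` to decide whether a warning is needed.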

How to test

  1. Create a data stream
PUT _index_template/my_datastream_template
{
  "index_patterns": ["my_datastream*"],
  "data_stream": {},        
  "template": {
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "host": {
          "properties": {
            "name": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}


PUT /_data_stream/my_datastream
  2. Set a broken runtime-field mapping on one specific backing index

GET my_datastream

PUT .ds-my_datastream-2025.06.11-000001/_mapping
{
  "runtime": {
    "broken": {
      "type": "keyword",
      "script": {
        "source": "emit(doc['nonexistent_field'].value)"
      }
    }
  }
}
  3. Ingest a document
POST my_datastream/_doc
{
  "@timestamp": "2025-06-05T09:04:11.493Z"
}
  4. Check that the query returns is_partial: true:
POST _query/async?drop_null_columns=true&allow_partial_results=true
{
  "query": "from my_datastream* METADATA _id | limit 101",
  "keep_alive": "60s"
}

response:

{
  "is_running": false,
  "took": 5,
  "is_partial": true,
...
  5. Create an ES|QL rule with the same query and a look-back that covers the document you created in step 3.

Observe the warning:

Screenshot 2025-06-11 at 08 52 07

@nkhristinin

/ci


@nkhristinin nkhristinin changed the title from "Handle cluster failures for ES|QL" to "ES|QL support for partial results" Jun 11, 2025
@nkhristinin nkhristinin marked this pull request as ready for review June 11, 2025 08:41
@nkhristinin nkhristinin requested a review from a team as a code owner June 11, 2025 08:41
@nkhristinin nkhristinin requested a review from dhurley14 June 11, 2025 08:41
@nkhristinin nkhristinin added the backport:version, v8.19.0, and release_note:skip labels Jun 11, 2025
@nkhristinin

@elasticmachine merge upstream

@yctercero

I can't remember if we do this for EQL partial, but should we mention in the message that alerts were generated for available shards (when we do allow partial results).

cc @approksiu

@nkhristinin

> I can't remember if we do this for EQL partial, but should we mention in the message that alerts were generated for available shards (when we do allow partial results).
>
> cc @approksiu

Updated the message to be similar to the EQL one. New message:

The ES|QL event query was only executed on the available shards. The query failed to run successfully on the following shards: ${error}
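
As a rough illustration of how such a message could be assembled, here is a minimal sketch. Only the message template comes from the comment above; the `ShardFailureLike` shape, the `buildPartialResultsWarning` name, and the choice to serialize the failures with `JSON.stringify` are assumptions made for this example and may differ from what the PR does.

```typescript
// Hypothetical failure shape; the real type comes from the Elasticsearch client.
interface ShardFailureLike {
  shard: number;
  index: string | null;
  reason: { type: string; reason?: string };
}

function buildPartialResultsWarning(failures: ShardFailureLike[]): string {
  // Assumption: failures are serialized as JSON for display in the warning.
  const error = JSON.stringify(failures);
  return `The ES|QL event query was only executed on the available shards. The query failed to run successfully on the following shards: ${error}`;
}
```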

@nkhristinin

@elasticmachine merge upstream

@nkhristinin nkhristinin removed the release_note:skip, backport:version, and v8.19.0 labels Jun 16, 2025
await esArchiver.unload(packetBeatPath);
});

it('should handle shard failures and include warning in logs', async () => {
Contributor

can we add a test case for aggregating queries, where rule should fail and does not generate any alerts?

Contributor Author

Added

rule,
});

expect(logs).toEqual(
Contributor

can we add assertion that rule still creates alerts from available shards?

Contributor Author

Added

@dhurley14 dhurley14 left a comment

LGTM! I have two small comments but this looks great! Thanks for the easy testing instructions.

Screenshot 2025-06-18 at 4 43 54 PM

const clusters = response?._clusters?.details ?? {};
const shardFailures = Object.keys(clusters).reduce<EsqlEsqlShardFailure[]>((acc, cluster) => {
  const failures = clusters[cluster]?.failures ?? [];
  return [...acc, ...failures];
Contributor

If there are no cluster failures for a given cluster we are still spreading out the accumulated failures. I imagine if there are many clusters failing, that accumulator could grow large. I'd suggest adding logic to update the accumulator directly or only spread the accumulator if there are new failures to be added.
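
One way to follow this suggestion is to append to a single array instead of re-spreading the accumulator on every iteration. The sketch below is an illustration under assumptions, not the change that was merged: `ShardFailureLike`, `ClusterDetailsLike`, `EsqlResponseLike`, and `collectShardFailures` are hypothetical stand-ins for the client types and helper used in the PR.

```typescript
// Hypothetical minimal shapes standing in for the client's response types.
interface ShardFailureLike {
  shard: number;
  reason: { type: string; reason?: string };
}

interface ClusterDetailsLike {
  failures?: ShardFailureLike[];
}

interface EsqlResponseLike {
  _clusters?: { details?: Record<string, ClusterDetailsLike> };
}

function collectShardFailures(response?: EsqlResponseLike): ShardFailureLike[] {
  const clusters = response?._clusters?.details ?? {};
  const shardFailures: ShardFailureLike[] = [];
  for (const name of Object.keys(clusters)) {
    const failures = clusters[name]?.failures ?? [];
    // Append in place: clusters without failures add nothing,
    // and the accumulated array is never copied.
    shardFailures.push(...failures);
  }
  return shardFailures;
}
```

The result is the same flattened list as the `reduce` version, but each failure is copied once rather than once per remaining cluster.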


const esqlQueryString = {
  drop_null_columns: true,
  allow_partial_results: !isRuleAggregating,
Contributor

Can we add a comment here just stating allow_partial_results defaults to true? I like to have it documented in-line.

Ref: elastic/elasticsearch#122802

@nkhristinin nkhristinin requested a review from vitaliidm June 19, 2025 07:08
it('should handle shard failures and include errors in logs for query that is aggregating', async () => {
  const rule: EsqlRuleCreateProps = {
    ...getCreateEsqlRulesSchemaMock(),
    query: `from packetbeat-* | stats _count=count(non_existing) by @timestamp`,
Contributor

The query in the test should be valid; otherwise the test validates the error for an invalid query, not for an unavailable shard.

@vitaliidm vitaliidm left a comment

tests look good

@elasticmachine

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #56 / Actions and Triggers app Maintenance Windows Maintenance windows table paginates maintenance windows correctly

Metrics [docs]

✅ unchanged

History

@nkhristinin nkhristinin enabled auto-merge (squash) June 20, 2025 07:03
@nkhristinin nkhristinin merged commit 8bd7f0e into elastic:main Jun 20, 2025
10 checks passed
@kibanamachine

Starting backport for target branches: 8.19

https://github.com/elastic/kibana/actions/runs/15774826641

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Jun 20, 2025
## ES|QL support for partial results

[Issue](elastic#211622)


---------

Co-authored-by: Vitalii Dmyterko <92328789+vitaliidm@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
(cherry picked from commit 8bd7f0e)
@kibanamachine

💚 All backports created successfully

Status Branch Result
8.19

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

@kibanamachine kibanamachine added the backport missing label Jun 23, 2025
@kibanamachine

Looks like this PR has a backport PR but it still hasn't been merged. Please merge it ASAP to keep the branches relatively in sync.
cc: @nkhristinin

kibanamachine added a commit that referenced this pull request Jun 23, 2025
# Backport

This will backport the following commits from `main` to `8.19`:
- [ES|QL support for partial results
(#223198)](#223198)

<!--- Backport version: 9.6.6 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)


---------

Co-authored-by: Khristinin Nikita <nikita.khristinin@elastic.co>
Co-authored-by: Vitalii Dmyterko <92328789+vitaliidm@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Nikita Khristinin <nkhristinin@gmail.com>
@kibanamachine kibanamachine removed the backport missing label Jun 23, 2025
akowalska622 pushed a commit to akowalska622/kibana that referenced this pull request Jun 25, 2025
## ES|QL support for partial results

[Issue](elastic#211622) 



---------

Co-authored-by: Vitalii Dmyterko <92328789+vitaliidm@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
@florent-leborgne florent-leborgne added the Feature:ES|QL label Jun 30, 2025

Labels

backport:version, Feature:ES|QL, release_note:enhancement, v8.19.0, v9.1.0
