[ES|QL] Enable CCS tests for subqueries by fang-xing-esql · Pull Request #137776 · elastic/elasticsearch

fang-xing-esql · 2025-11-08T03:50:25Z

This feature is only available in snapshot builds and has not been formally released yet.

This PR supports #136035 and https://github.com/elastic/esql-planning/issues/88

This PR addresses the following items:

Enabled CCS tests with subqueries in the following tests:
- Added CrossClusterSubqueryIT
- Added CrossClusterSubqueryUnavailableRemotesIT
- Updated MultiClusterSpecIT to convert queries with subqueries to use remote index patterns, by recognizing multiple FROM commands in the query
- Updated subquery.csv-spec to allow more queries to run with MultiClusterSpecIT: removed the fork tags, and removed metadata from some of the queries.
Prune subqueries for which no valid index pattern is found, so that the query can continue as long as there is a valid IndexResolution for a subset of the subqueries.
- Updated EsqlSession to recognize subquery index patterns that have no valid IndexResolution, and mark them as EMPTY_SUBQUERY
- Added a new rule, PruneEmptyUnionAllBranch, to Analyzer to prune empty subqueries.
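
The pruning behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual PruneEmptyUnionAllBranch rule: the `Branch` record, the `Resolution` enum, and the `prune` method are hypothetical names, and the real rule operates on logical plan nodes rather than a flat list. It only shows the idea that branches marked EMPTY_SUBQUERY are dropped and that the query fails when no branch remains.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical model for illustration only; not the Analyzer rule from this PR.
final class PruneEmptyBranchesSketch {

    enum Resolution { VALID, EMPTY_SUBQUERY }

    // One branch of the union, identified here only by its index pattern.
    record Branch(String indexPattern, Resolution resolution) {}

    // Keeps the branches with a valid resolution; fails if every branch is empty.
    static List<Branch> prune(List<Branch> branches) {
        List<Branch> kept = branches.stream()
            .filter(b -> b.resolution() == Resolution.VALID)
            .collect(Collectors.toList());
        if (kept.isEmpty()) {
            throw new IllegalStateException("no subquery has a valid index resolution");
        }
        return kept;
    }
}
```

With one valid and one empty branch, `prune` returns only the valid branch, so the rest of the query can proceed on a subset of the subqueries.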

Items that will be addressed in follow-up PRs:

Subqueries with remote index patterns still have some restrictions when combined with lookup join; this is marked as a TODO in CrossClusterSubqueryIT and will be addressed in a separate PR.
The status and metrics collected in EsqlExecutionInfo may not correctly reflect the state of fork or subquery. This will be addressed in a separate PR, because we first need to clarify the execution-time behavior: when something goes wrong with one subquery, how should the other subqueries be processed?
Further changes are likely needed to support CPS with subqueries.

**Prerequisite**:

This PR ensures consistent results for subquery and fork with CCS.

elasticsearchmachine · 2025-11-12T00:36:52Z

Hi @fang-xing-esql, I've created a changelog YAML for you.

elasticsearchmachine · 2025-11-12T00:37:50Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

smalyshev

I reviewed part of it, will finish up tomorrow.

smalyshev · 2025-11-12T22:46:01Z

...ulti-clusters/src/javaRestTest/java/org/elasticsearch/xpack/esql/ccq/MultiClusterSpecIT.java

+        for (int i = 0; i <= input.length() - delimiterLength; i++) {
+            char c = input.charAt(i);
+
+            if (c == '(') {


I hope we don't have any ( characters inside strings anywhere; otherwise I think this would break.

We don't have fancy queries like this in CsvTests yet, and we don't check for tokens like from, metadata, |, ), or ( either. This is a very lightweight parsing utility targeting the existing subqueries in CsvTests. If fancier subqueries are added to CsvTests in the future, it will need to be enhanced.
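
If that enhancement is ever needed, one possible direction is sketched below. This is not the utility in MultiClusterSpecIT: `TopLevelSplitter` and `splitTopLevel` are hypothetical names, and the sketch assumes string literals are double-quoted with backslash escapes (triple-quoted strings are not handled). It splits a FROM list on top-level commas while ignoring parentheses and commas that appear inside string literals.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not the parsing utility added in this PR.
final class TopLevelSplitter {

    // Splits on commas that are outside parentheses and outside double-quoted strings.
    static List<String> splitTopLevel(String input) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;            // current parenthesis nesting level
        boolean inString = false; // true while inside a double-quoted literal
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (inString) {
                current.append(c);
                if (c == '\\' && i + 1 < input.length()) {
                    current.append(input.charAt(++i)); // keep the escaped character verbatim
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (c == '"') {
                inString = true;
            } else if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            } else if (c == ',' && depth == 0) {
                parts.add(current.toString().strip());
                current.setLength(0);
                continue; // do not keep the delimiter itself
            }
            current.append(c);
        }
        parts.add(current.toString().strip());
        return parts;
    }
}
```

The string state is checked before the parenthesis counter, so a ( inside a literal never changes the nesting depth.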

smalyshev · 2025-11-12T23:04:02Z

...ulti-clusters/src/javaRestTest/java/org/elasticsearch/xpack/esql/ccq/MultiClusterSpecIT.java

+    /**
+     * Convert index patterns and subqueries in FROM commands to use remote indices.
+     */
+    private static String convertSubqueryToRemoteIndices(String testQuery) {


What about SET commands that can precede FROM? Are those supported?

SET is not tested, and special characters that the query parser could treat as tokens are not tested here either. We are not implementing a full query parser here, just making sure that all the queries in CsvTests with the subquery capability work in this PR. If fancier queries are added to CsvTests, we will need to enhance it.

smalyshev · 2025-11-12T23:05:32Z

...ulti-clusters/src/javaRestTest/java/org/elasticsearch/xpack/esql/ccq/MultiClusterSpecIT.java

+        for (String indexPatternOrSubquery : indexPatternsAndSubqueries) {
+            // remove the from keyword if it's there
+            indexPatternOrSubquery = indexPatternOrSubquery.strip();
+            if (indexPatternOrSubquery.toLowerCase(Locale.ROOT).startsWith("from ")) {


This seems to imply every element of indexPatternsAndSubqueries can start with FROM, but there could be only one FROM and then a comma-separated list of expressions... Not sure if it's important for the particular queries in the tests, but it seems incorrect.

Each indexPatternOrSubquery could be one of the following:

- FROM followed by index patterns: from index1
- index patterns: index2
- subqueries: (from index3 | where a > 0); subqueries are processed recursively.
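
The three shapes above can be handled with a small recursive rewrite, sketched below. This is an illustration only, not the convertSubqueryToRemoteIndices method from the PR: the class name, the `remote_cluster` alias, and the output format are assumptions, and METADATA clauses, SET commands, and string literals are deliberately not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; names and the "remote_cluster" alias are assumptions.
final class RemoteIndexRewriter {

    static final String CLUSTER_ALIAS = "remote_cluster";

    // Rewrites "FROM a, (FROM b | ...) | ..." so that every index pattern is prefixed with the alias.
    static String rewrite(String query) {
        List<String> commands = splitTopLevel(query, '|');
        String from = commands.get(0);
        if (from.toLowerCase(Locale.ROOT).startsWith("from ") == false) {
            throw new IllegalArgumentException("expected a FROM command: " + from);
        }
        List<String> rewritten = new ArrayList<>();
        for (String element : splitTopLevel(from.substring("from ".length()), ',')) {
            if (element.startsWith("(") && element.endsWith(")")) {
                // A subquery: strip the parentheses and rewrite its body recursively.
                rewritten.add("(" + rewrite(element.substring(1, element.length() - 1)) + ")");
            } else {
                rewritten.add(CLUSTER_ALIAS + ":" + element);
            }
        }
        commands.set(0, "FROM " + String.join(", ", rewritten));
        return String.join(" | ", commands);
    }

    // Splits on the delimiter only where the parenthesis depth is zero.
    static List<String> splitTopLevel(String input, char delimiter) {
        List<String> parts = new ArrayList<>();
        int depth = 0;
        int start = 0;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            } else if (c == delimiter && depth == 0) {
                parts.add(input.substring(start, i).strip());
                start = i + 1;
            }
        }
        parts.add(input.substring(start).strip());
        return parts;
    }
}
```

In this sketch only the first element of the list carries the FROM keyword; the recursion sees a FROM again only when it enters a parenthesized subquery.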

Wait, are you saying you can do FROM index1, FROM index2, FROM index3?

I don't think this is a valid CsvTests query. This method might be able to deal with it, but it will not see this pattern in queries from CsvTests...

...ulti-clusters/src/javaRestTest/java/org/elasticsearch/xpack/esql/ccq/MultiClusterSpecIT.java

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/EsqlSession.java

smalyshev · 2025-11-12T23:33:14Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/EsqlSession.java

+                // take care of removing the subquery during analysis.
+                // If all subqueries have invalid index resolution, we should fail in Analyzer's verifier.
+                if (r.indexResolution.isEmpty() == false // it is not a row
+                    && r.indexResolution.size() > 1 // there is a subquery


Can a subquery be the only clause in FROM? I understand it's kinda weird to do it, but is it possible?

If a subquery is the only clause in a FROM, the parser merges it into the main query; there is a test here.

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/EsqlSession.java

smalyshev · 2025-11-12T23:39:00Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/EsqlSession.java

+                            String clusterAlias = entry.getKey();
+                            String indexExpression = entry.getValue();
+                            EsqlExecutionInfo.Cluster cluster = executionInfo.getCluster(clusterAlias);
+                            if (indexExpression.equals(cluster.getIndexExpression())) {


Not sure I understand this check - if it's the entry under this cluster's alias, why do we need to check again?

clustersWithInvalidIndexResolutions is a map: its key is the cluster alias and its value is all of the index patterns that are marked as invalid because they are missing on that cluster.

If all of the indices are missing on a cluster, the cluster is marked as SKIPPED here in EsqlExecutionInfo. If some of the indices are available and some are missing, the cluster is not marked as SKIPPED. We do this check mainly for situations where a remote cluster is referenced by multiple subqueries.
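
As a rough illustration of that check (not the code in EsqlSession): the sketch below uses two plain maps instead of EsqlExecutionInfo, and `SkippedClusterCheck`, `clustersToSkip`, `requested`, and `invalid` are hypothetical names. A cluster is reported as skippable only when its invalid index expression equals everything the query requested from it, mirroring the equals comparison in the quoted snippet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; plain maps stand in for EsqlExecutionInfo.
final class SkippedClusterCheck {

    // requested: cluster alias -> full index expression the query asks for on that cluster
    // invalid:   cluster alias -> index expression that could not be resolved on that cluster
    static List<String> clustersToSkip(Map<String, String> requested, Map<String, String> invalid) {
        List<String> skipped = new ArrayList<>();
        // TreeMap only makes the iteration order (and thus the result) deterministic.
        for (Map.Entry<String, String> entry : new TreeMap<>(invalid).entrySet()) {
            // Skip only if everything requested from this cluster is invalid.
            if (entry.getValue().equals(requested.get(entry.getKey()))) {
                skipped.add(entry.getKey());
            }
        }
        return skipped;
    }
}
```

A cluster referenced by two subqueries, where only one pattern is missing, has a shorter invalid expression than its requested expression and is therefore not skipped.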

smalyshev · 2025-11-12T23:49:46Z

x-pack/plugin/esql/qa/testFixtures/src/main/resources/subquery.csv-spec

 required_capability: subquery_in_from_command

-FROM employees, (FROM sample_data | EVAL x = client_ip::keyword ) metadata _index
+FROM employees, (FROM sample_data metadata _index | EVAL x = client_ip::keyword ) metadata _index


What about this check:

assumeFalse("can't test with _index metadata", (remoteMetadata == false) && hasIndexMetadata(testCase.query));

wouldn't it interfere with this test?

It does. I'd like to keep some metadata tests for subqueries here. So I removed metadata option from a few queries in this file and hope to cover more subquery tests in MultiClusterSpecIT.

smalyshev · 2025-11-13T19:05:13Z

x-pack/plugin/esql/qa/testFixtures/src/main/resources/subquery.csv-spec

@@ -3,7 +3,6 @@
 //

 subqueryInFrom


Question for my education - if we have two sets of metadata fields, like this: FROM index1, (FROM index2 METADATA _index, _id) METADATA _index, _version - what is the resulting field set here? Is it a union, an intersection or something else?

It is a UnionAll of the results of all subqueries; if there are duplicate records among the subqueries, the duplicates are not removed.

I am not asking about rows, but about column names - if there's _index in the subquery and in the main expression, how does it work? Does the main _index only fill the column for actual indices, not subqueries?

The test has metadata for both the main query and the subquery; the indices referenced by both the main query and the subquery are returned.

...src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/CrossClusterSubqueryIT.java

idegtiarenko · 2025-11-17T07:25:07Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/EsqlCCSUtils.java

+    ) {
+        // check if all clusters involved have skip_unavailable = true
+        boolean allClustersSkipUnavailable = true;
+        var groupedIndices = indicesGrouper.groupIndices(


We can no longer rely on indicesGrouper, as it is not going to be available for CPS queries.

I am going to open a PR with an alternative for it this week. I will keep you updated.

I have merged #138396
You should be able to access grouped original or concrete indices from EsRelation or EsIndex now.
Please let me know if you need any help with it.

The reference to IndicesExpressionGrouper when identifying empty subqueries has been removed.

idegtiarenko · 2025-11-17T07:31:06Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/EsqlSession.java

                }
+                // Check if a subquery need to be pruned. If some but not all the subqueries has invalid index resolution,
+                // and all the clusters referenced by the subquery that has invalid index resolution have skipUnavailable=true,
+                // try to prune it by setting IndexResolution to EMPTY_SUBQUERY. Analyzer.PruneEmptyUnionAllBranch will


I believe this should also add a warning with remote and failure.

Empty subqueries for which no valid index pattern is found after field caps are logged in the debug log in this PR. EsqlExecutionInfo seems to be a good place to add some details (for both planning and execution) at the subquery level. This has been added as a subtask here.

idegtiarenko

We should not introduce more usages of indicesGrouper since it is not going to be available in CPS.

fang-xing-esql · 2025-11-17T17:20:50Z

Thank you so much for reviewing, @smalyshev @idegtiarenko! Stas's comments have been addressed (let me know if I missed anything), and I'll wait for Ievgen's PR and then adjust the usage of indicesGrouper.

fang-xing-esql · 2025-11-17T17:22:01Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/ComputeService.java

+                    // miss the results from a branch and return wrong results. Both RUNNING and SUCCESSFUL mean we should continue
+                    // processing the next compute on remote cluster in the queue. If it is partial, it is fine to skip the subsequent
+                    // branches as something went wrong already.
+                    if (clusterStatus != EsqlExecutionInfo.Cluster.Status.RUNNING


@smalyshev I'd like to discuss a bit more with you about the change here.

This change would redefine how statuses work, I am not sure that's the right thing to do. SUCCESSFUL means "we are done with this cluster and all went well", PARTIAL means "we are done with this cluster and we had to skip some data". I don't think we should put cluster into SUCCESSFUL state before the main query is finished. Maybe we need more statuses here?

Also, I am not sure this:

If it is partial, it is fine to skip the subsequent branches as something went wrong already.

is correct. If one subquery partially failed, but we did not set the cluster to SKIPPED, I am not convinced it's ok to skip the other subqueries. In fact, we may have to revisit when we set clusters to SKIPPED - e.g. if there's a missing index in one of the subqueries, and we have partial results on, it shouldn't skip this cluster in all other subqueries I think. Also, if one index is corrupt, it doesn't mean other subqueries will fail. Again, we may need to consider maybe we need more fine-grained checks here now that we have subqueries.

Discussed with @smalyshev offline. Today, the way the cluster status is updated in EsqlExecutionInfo does not quite work with the branch model of fork/subquery. The SUCCESSFUL status in EsqlExecutionInfo was set at the branch level (the same goes for the metrics saved in EsqlExecutionInfo), not at the whole-query level, so we might lose data from branches if branches overlap on a remote cluster.

This needs more discussion and should be addressed in a separate PR. We will need to find a better way to set cluster status and metrics in EsqlExecutionInfo. I'm going to remove this change from this PR, and limit this PR to adding stable tests for subqueries with CCS (avoiding overlapping remote clusters between subqueries (branches) for now).

This is the PR that ensures subquery/fork return deterministic results.

fang-xing-esql · 2025-11-26T13:55:35Z

We should not introduce more usages of indicesGrouper since it is not going to be available in CPS.

@idegtiarenko The usage of indicesGrouper added by this PR has been removed. Could you take another look to see if there is anything left that might have a negative impact on CPS?

idegtiarenko · 2025-12-02T12:19:40Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/EsqlSession.java

+                            .anyMatch(ir -> ir.isValid() && ir.get().originalIndices().get(clusterAlias) != null);
+                        if (clusterHasValidIndex == false) {
+                            String errorMsg = "no valid indices found in any subquery " + EsqlCCSUtils.inClusterName(clusterAlias);
+                            LOGGER.debug(errorMsg);


Do we need this log statement here? If so, could we convert it to logger formatting instead of unconditional string concatenation?

Done. This message is also attached to the failure/warning message for the skipped cluster.

idegtiarenko · 2025-12-04T13:16:21Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/EsqlSession.java

+                            String errorMsg = "no valid indices found in any subquery {}";
+                            LOGGER.debug(errorMsg, EsqlCCSUtils.inClusterName(clusterAlias));


Suggested change

-                            String errorMsg = "no valid indices found in any subquery {}";
-                            LOGGER.debug(errorMsg, EsqlCCSUtils.inClusterName(clusterAlias));
+                            LOGGER.debug("no valid indices found in any subquery {}", EsqlCCSUtils.inClusterName(clusterAlias));

idegtiarenko

Looks good from an index resolution point of view 👍

fang-xing-esql · 2025-12-05T16:17:05Z

Thank you for reviewing, @idegtiarenko @smalyshev!

enable CCS tests for subqueries

2a05b89

elasticsearchmachine added the v9.3.0 label Nov 8, 2025

fang-xing-esql added the >enhancement label Nov 8, 2025

fang-xing-esql and others added 2 commits November 8, 2025 08:57

Merge branch 'main' into enable-subquery-ccs-tests

e1d0b40

fix tests

32751c8

fang-xing-esql added the test-release Trigger CI checks against release build label Nov 8, 2025

fang-xing-esql and others added 4 commits November 10, 2025 19:47

prune empty subquery when skipUnavailable=true

849060f

Merge branch 'main' into enable-subquery-ccs-tests

4555201

update cluster as skipped only if all index patterns associated with …

8bc48bf

…the cluster cannot be found

Merge branch 'main' into enable-subquery-ccs-tests

b22ce97

fang-xing-esql added the :Analytics/ES|QL AKA ESQL label Nov 12, 2025

Update docs/changelog/137776.yaml

76214e6

fang-xing-esql requested a review from smalyshev November 12, 2025 00:37

Merge branch 'main' into enable-subquery-ccs-tests

67c21e5

fang-xing-esql marked this pull request as ready for review November 12, 2025 00:37

elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Nov 12, 2025

smalyshev reviewed Nov 12, 2025

smalyshev reviewed Nov 13, 2025

fang-xing-esql and others added 2 commits November 16, 2025 12:04

update according to review comments

6b48eab

Merge branch 'main' into enable-subquery-ccs-tests

f46c2c4

idegtiarenko reviewed Nov 17, 2025

idegtiarenko requested changes Nov 17, 2025

fang-xing-esql commented Nov 17, 2025

parkertimmins mentioned this pull request Nov 19, 2025

Add binary doc value compression with variable doc count blocks #137139

Merged

Merge branch 'main' into enable-subquery-ccs-tests

85d728e

fang-xing-esql mentioned this pull request Nov 25, 2025

[ES|QL] Do not skip a remote cluster base on the query's execution time status #138332

Merged

fang-xing-esql and others added 2 commits November 25, 2025 08:56

pick up EsqlQueryRequest change in main

cc966bb

Merge branch 'main' into enable-subquery-ccs-tests

2b46613

fang-xing-esql mentioned this pull request Nov 25, 2025

[ES|QL] Support remote cluster in subqueries in from command #136035

Open

5 tasks

fang-xing-esql and others added 3 commits November 25, 2025 23:01

remove the reference to IndicesExpressionGrouper when identify empty …

490184f

…subquery

add debug log for subqueries that do not have valid index patterns fo…

60cd5ef

…und by fieldcaps

Merge branch 'main' into enable-subquery-ccs-tests

e1c210f

fang-xing-esql added 2 commits December 1, 2025 16:25

update according to review comments

5a8db0f

Merge branch 'main' into enable-subquery-ccs-tests

3c17204

idegtiarenko self-requested a review December 2, 2025 12:00

idegtiarenko reviewed Dec 2, 2025

fang-xing-esql and others added 4 commits December 2, 2025 08:40

Merge branch 'main' into enable-subquery-ccs-tests

e96a3f3

refactor according to review comments

feacb24

Merge branch 'main' into enable-subquery-ccs-tests

786ac8f

Merge branch 'main' into enable-subquery-ccs-tests

8feab3e

idegtiarenko self-requested a review December 3, 2025 14:10

idegtiarenko reviewed Dec 4, 2025

idegtiarenko approved these changes Dec 4, 2025

fang-xing-esql and others added 3 commits December 4, 2025 09:08

Merge branch 'main' into enable-subquery-ccs-tests

2ecf4bf

Merge branch 'main' into enable-subquery-ccs-tests

1ff8380

remove an ignored subquery CsvTests that causes trouble to mixed clus…

ede6f41

…ter tests

fang-xing-esql merged commit 733eee4 into elastic:main Dec 5, 2025
35 checks passed

ioanatia mentioned this pull request Dec 11, 2025

ES|QL: Enable CCS tests for FORK #139302

Merged

fang-xing-esql deleted the enable-subquery-ccs-tests branch January 30, 2026 14:39
