[ES|QL] Non-Correlated Subquery in FROM command #135744
fang-xing-esql merged 37 commits into elastic:main
Hi @fang-xing-esql, I've created a changelog YAML for you.
Pull Request Overview
This PR introduces support for non-correlated subqueries within the FROM command in ES|QL, allowing queries to reference multiple data sources including both index patterns and subqueries. The implementation enables subqueries to be processed similarly to Fork operations, with key distinctions in index resolution and predicate pushdown capabilities.
- Adds grammar and parser support for subquery syntax in FROM commands
- Implements UnionAll logical plan to handle mixed index patterns and subqueries
- Enables predicate pushdown optimization specifically for UnionAll operations
Reviewed Changes
Copilot reviewed 36 out of 39 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| EsqlBaseParser.g4 | Updates grammar to support subquery syntax in FROM_MODE |
| LogicalPlanBuilder.java | Creates UnionAll plans and handles subquery/index pattern combinations |
| UnionAll.java | New logical plan extending Fork with union-typed field support |
| Subquery.java | New logical plan node representing subquery placeholders |
| Analyzer.java | Resolves subquery indices and handles union-typed fields |
| PushDownAndCombineFilters.java | Adds predicate pushdown optimization for UnionAll |
| EsqlSession.java | Implements subquery index resolution during pre-analysis |
| Various test files | Adds comprehensive test coverage for subquery functionality |
Comments suppressed due to low confidence (1)
x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/parser/SubqueryTests.java:1
- There's a typo in "nested fork/subquery is not supported, it passes Analyzer" - should be "nested fork/subquery is not supported; it passes Analyzer" (semicolon instead of comma for better grammar).
| } | ||
| return parent; | ||
| } else { // We should not reach here as the grammar does not allow it | ||
| throw new ParsingException("FROM is required in a subquery"); |
The error message "FROM is required in a subquery" is misleading since the grammar already enforces this requirement. Consider a more descriptive message like "Invalid subquery structure" or remove the comment and exception if this code path is truly unreachable.
| throw new ParsingException("FROM is required in a subquery"); | |
| throw new ParsingException("Invalid subquery structure"); |
| LogicalPlan newChild = switch (child) { | ||
| case Project project -> maybePushDownFilterPastProjectForUnionAllChild(pushable, project); | ||
| case Limit limit -> maybePushDownFilterPastLimitForUnionAllChild(pushable, limit); | ||
| default -> null; // TODO add a general push down for unexpected pattern |
The TODO comment indicates incomplete functionality. Consider implementing the general push down logic or at least provide a more specific plan for when this will be addressed, as returning null could lead to silent failures in optimization.
| default -> null; // TODO add a general push down for unexpected pattern | |
| default -> { | |
| // Fallback: unknown child type, do not push down filter for this child. | |
| // Consider implementing general push down logic here in the future. | |
| yield child; | |
| } |
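The fallback suggested above can be illustrated with a minimal, self-contained sketch. The records below (`Relation`, `Filter`, `Project`, `Limit`) are hypothetical stand-ins, not the real ES|QL plan classes, and `instanceof` patterns are used in place of the pattern-matching switch. The sketch also assumes the limit is the implicit per-branch one, so placing the filter below it is acceptable.

```java
import java.util.List;

final class UnionAllChildPushDownSketch {
    // Hypothetical stand-ins for the real plan nodes; only their shapes matter here.
    interface Plan {}
    record Relation(String index) implements Plan {}
    record Filter(String condition, Plan child) implements Plan {}
    record Project(List<String> fields, Plan child) implements Plan {}
    record Limit(int n, Plan child) implements Plan {}

    // Pushes the filter below a recognized child node; any other child is
    // returned unchanged instead of null, so nothing is silently dropped.
    static Plan pushIntoChild(String condition, Plan child) {
        if (child instanceof Project p) {
            return new Project(p.fields(), new Filter(condition, p.child()));
        }
        if (child instanceof Limit l) {
            // Assumption: this is the implicit per-branch limit.
            return new Limit(l.n(), new Filter(condition, l.child()));
        }
        return child; // unknown shape: leave the branch as it is
    }

    public static void main(String[] args) {
        Plan relation = new Relation("index2");
        System.out.println(pushIntoChild("x > 1", new Project(List.of("x"), relation)));
        System.out.println(pushIntoChild("x > 1", relation));
    }
}
```

In this sketch a caller can detect the "not pushed" case by checking whether the returned plan is the same instance as the input, and keep the filter above the union in that case.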
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/EsqlSession.java
| boolean supportsAggregateMetricDouble, | ||
| boolean supportsDenseVector | ||
| boolean supportsDenseVector, | ||
| Set<IndexPattern> subqueryIndices |
Merging subqueryIndices into mainIndices is another option, but it would require changes to EsqlCCSUtils.initCrossClusterState and EsqlCCSUtils.createIndexExpressionFromAvailableClusters, as they associate the ExecutionInfo with only one index pattern today.
| hasCapabilities(adminClient(), List.of(ENABLE_FORK_FOR_REMOTE_INDICES.capabilityName())) | ||
| ); | ||
| } | ||
| // Subqueries in FROM are not fully tested in CCS yet |
When a subquery exists in the query, convertToRemoteIndices doesn't generate a correct remote index pattern yet, so the query becomes invalid. Subqueries are not fully tested in CCS yet; I'm working on it as a follow-up.
x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/parser/SubqueryTests.java
Pinging @elastic/es-analytical-engine (Team:Analytics)
Pinging @elastic/kibana-esql (ES|QL-ui)
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
...ain/java/org/elasticsearch/xpack/esql/optimizer/rules/logical/PushDownAndCombineFilters.java
Thank you so much for reviewing this PR in such depth, I really appreciate the time and thought you put into it, @astefan! I'll create a follow-up issue for improving I made a couple of changes and added additional tests related to subqueries combined with fork and full-text functions, both in the main query and within subqueries. Initially, I planned to address these in a separate PR since this one is already quite large, but I think these scenarios are important enough to include here. First, Second, the validation of commands before a full-text function, as well as the verification of the field referenced by that function, is now deferred to
Thank you so much for your thoughtful reviews, @luigidellaquila! They really helped make this PR better. I think I addressed all of them; please let me know if there is anything I missed.
luigidellaquila left a comment
Thanks @fang-xing-esql
There are still a couple of things I'd like to check in detail (the pushdown logic in particular), but from what I could see it looks good, so I'm approving to unblock the merge.
| return DATE_NANOS; | ||
| } | ||
|
|
||
| if (t1.isCounter()) { |
nit: I'd expect this logic to be in EsqlDataTypeConverter.commonType() (that you are using below btw).
The only logical difference is noCounter().
Maybe commonTypes() could just do
for (List<Attribute> out : outputs) {
type = EsqlDataTypeConverter.commonType(type, out.get(i).dataType()).noCounter();
}
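A minimal sketch of the fold suggested above, under stated assumptions: `Type` is a hypothetical, tiny stand-in for the real `DataType`, and the widening rule inside `commonType` is invented for illustration, not taken from `EsqlDataTypeConverter`. The sketch strips the counter from each input before combining, which is one possible reading of the suggestion.

```java
import java.util.List;

final class CommonTypeSketch {
    // Hypothetical, tiny type set; the real DataType has many more members.
    enum Type {
        INTEGER, LONG, COUNTER_LONG, KEYWORD, UNSUPPORTED;

        // Maps a counter type to its plain numeric counterpart.
        Type noCounter() {
            return this == COUNTER_LONG ? LONG : this;
        }
    }

    // Stand-in for EsqlDataTypeConverter.commonType(); the rules are assumptions.
    static Type commonType(Type a, Type b) {
        if (a == null) {
            return b;
        }
        if (a == b) {
            return a;
        }
        boolean integerAndLong = (a == Type.INTEGER && b == Type.LONG) || (a == Type.LONG && b == Type.INTEGER);
        return integerAndLong ? Type.LONG : Type.UNSUPPORTED;
    }

    // Folds column i of every branch output into a single common type.
    static Type commonTypeOfColumn(List<List<Type>> outputs, int i) {
        Type type = null;
        for (List<Type> out : outputs) {
            type = commonType(type, out.get(i).noCounter());
        }
        return type;
    }

    public static void main(String[] args) {
        List<List<Type>> outputs = List.of(
            List.of(Type.INTEGER, Type.KEYWORD),
            List.of(Type.COUNTER_LONG, Type.KEYWORD)
        );
        System.out.println(commonTypeOfColumn(outputs, 0));
        System.out.println(commonTypeOfColumn(outputs, 1));
    }
}
```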
bpintea left a comment
Started my journey on this PR. Minor/optional notes only.
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/parser/LogicalPlanBuilder.java
...org/elasticsearch/xpack/esql/optimizer/rules/logical/PushDownFilterAndLimitIntoUnionAll.java
...lasticsearch/xpack/esql/optimizer/rules/logical/PushDownFilterAndLimitIntoUnionAllTests.java
It'd be great to have some javadoc for this class.
(It would seem it's mostly useful in PushDownFilterAndLimitIntoUnionAll, where we first push "below" UnionAll and then, once there, below Subquery?
It seems that LogicalPlanBuilder#visitRelation always places the index pattern first, followed by the subqueries.
I'm wondering, then, whether this marker node is really needed. Wouldn't these push downs be doable on each branch by existing rules if Subquery wasn't there?
But I still need to look into it and haven't grasped it all yet. Javadoc would be great in any case, though. :) )
There are a couple of reasons why the Subquery node exists in the plan.
First, I originally added this node because in the future we may want to support qualifiers with subqueries (like FROM idx1, (FROM idx2, idx3) as idx23), and this is a place where we can store the name (qualifier) of a subquery so that it can be referenced by the parent query.
Second, when I started working on pushing down filters into subqueries, I realized that PushDownAndCombineFilters intentionally does not push filters below a limit (refer to here). However, UnionAll/Fork adds an implicit limit to each branch, so this rule doesn't help predicate pushdown for subqueries. The Subquery node is also used as a pattern for predicate pushdown. I'll add more comments in the code.
bpintea left a comment
One issue I see that we could align is the treatment of the "union"/conflicting types: in case these emerge out of an index pattern, we return the type as unsupported, along with the original_types and a suggested_cast. User is in the know, all good.
With UnionAll, we return it as type keyword with null values, even if no index in the union has the conflicting field of that type, and with no other indication whatsoever that there's a conflict. The user won't know of the conflict, which I think is problematic, as this could occur frequently.
Don't know if we have a decision here or track this somewhere?
(Some more comments to follow, sorry :) )
| ThreadPool.Names.SEARCH_COORDINATION, | ||
| ThreadPool.Names.SYSTEM_READ | ||
| ); | ||
| if (subqueryIndexPattern != null) { |
Can this now ever happen? Shouldn't it be guaranteed by the grammar that it's never null?
The piece of code that processes subquery index patterns in EsqlSession will be in a separate PR from @craigtaverner, and it will be shared by subqueries and views, stay tuned :).
| ); | ||
|
|
||
| } else { | ||
| // occurs when dealing with local relations (row a = 1) |
This isn't currently supported by the grammar, no?
| && indexResolution.matches(indexPattern) == false | ||
| && context.subqueryResolution().isEmpty() == false) { | ||
| // index pattern does not match main index | ||
| indexResolution = context.subqueryResolution().getOrDefault(indexPattern, indexResolution); |
| indexResolution = context.subqueryResolution().getOrDefault(indexPattern, indexResolution); | |
| indexResolution = context.subqueryResolution().get(indexPattern); | |
| if (indexResolution == null) {... |
Wouldn't it be a bug if the simple get() returned null? And wouldn't we introduce a new one if an "unknown" pattern then resolves to the "main" index?
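The concern can be made concrete with a small sketch that fails fast on an unknown pattern instead of falling back to the main resolution. `Resolution` is a hypothetical stand-in for `IndexResolution`, and the method name and exception type are illustrative only, not what the PR uses.

```java
import java.util.Map;

final class SubqueryResolutionLookupSketch {
    // Hypothetical stand-in for IndexResolution: it only remembers its pattern.
    record Resolution(String pattern) {}

    // Returns the main resolution when the pattern matches it; otherwise the
    // pattern must be a known subquery pattern, and anything else is an error.
    static Resolution resolve(String indexPattern, Resolution main, Map<String, Resolution> subqueryResolutions) {
        if (main.pattern().equals(indexPattern)) {
            return main;
        }
        Resolution resolution = subqueryResolutions.get(indexPattern);
        if (resolution == null) {
            // With getOrDefault(indexPattern, main) this case would quietly
            // resolve an unknown pattern against the main index.
            throw new IllegalStateException("no resolution for index pattern [" + indexPattern + "]");
        }
        return resolution;
    }

    public static void main(String[] args) {
        Resolution main = new Resolution("index1");
        Map<String, Resolution> subqueries = Map.of("index2", new Resolution("index2"));
        System.out.println(resolve("index1", main, subqueries));
        System.out.println(resolve("index2", main, subqueries));
    }
}
```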
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
...rc/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/FullTextFunction.java
I think there are some fixes needed on conflicting/union types from different branches. I've left a few notes, out of which one is, I think, a bug.
Potentially related to that: when running the query in that comment - with index1 only having "x": 1 and index2 having "x": "2" (i.e. both text and keyword fields) - I get a CCE:
[2025-10-17T12:49:31,742][ERROR][o.e.r.ChunkedRestResponseBodyPart] [runTask-0] failure encoding chunk java.lang.ClassCastException: class org.elasticsearch.compute.data.IntVectorBlock cannot be cast to class org.elasticsearch.compute.data.LongBlock (org.elasticsearch.compute.data.IntVectorBlock and org.elasticsearch.compute.data.LongBlock are in unnamed module of loader java.net.FactoryURLClassLoader @11d2dd2d)
at org.elasticsearch.xpack.esql.action.PositionToXContent$1.valueToXContent(PositionToXContent.java:72)
at org.elasticsearch.xpack.esql.action.PositionToXContent.positionToXContent(PositionToXContent.java:53)
at org.elasticsearch.xpack.esql.action.ResponseXContentUtils.lambda$rowValues$7(ResponseXContentUtils.java:114)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.rest.ChunkedRestResponseBodyPart$1.encodeChunk(ChunkedRestResponseBodyPart.java:161)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.rest.RestController$EncodedLengthTrackingChunkedRestResponseBodyPart.encodeChunk(RestController.java:1012)
at org.elasticsearch.transport.netty4@9.3.0-SNAPSHOT/org.elasticsearch.http.netty4.Netty4HttpPipeliningHandler.writeChunk(Netty4HttpPipeliningHandler.java:436)
at org.elasticsearch.transport.netty4@9.3.0-SNAPSHOT/org.elasticsearch.http.netty4.Netty4HttpPipeliningHandler.doWriteChunkedResponse(Netty4HttpPipeliningHandler.java:263)
at org.elasticsearch.transport.netty4@9.3.0-SNAPSHOT/org.elasticsearch.http.netty4.Netty4HttpPipeliningHandler.doWrite(Netty4HttpPipeliningHandler.java:231)
at org.elasticsearch.transport.netty4@9.3.0-SNAPSHOT/org.elasticsearch.http.netty4.Netty4HttpPipeliningHandler.write(Netty4HttpPipeliningHandler.java:184)
at io.netty.transport@4.1.126.Final/io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:891)
at io.netty.transport@4.1.126.Final/io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:956)
at io.netty.transport@4.1.126.Final/io.netty.channel.AbstractChannelHandlerContext$WriteTask.run(AbstractChannelHandlerContext.java:1263)
at io.netty.common@4.1.126.Final/io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
at io.netty.common@4.1.126.Final/io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
at io.netty.common@4.1.126.Final/io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
at io.netty.transport@4.1.126.Final/io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
at io.netty.common@4.1.126.Final/io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:998)
at io.netty.common@4.1.126.Final/io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at java.base/java.lang.Thread.run(Thread.java:1575)
Also, something like:
FROM index1, (FROM index2) | EVAL l = x::LONG | EVAL i = x::INTEGER | KEEP x, l, i
fails with a verification exception:
"org.elasticsearch.xpack.esql.VerificationException: Found 1 problem\nline 130:47: Output has changed from [[x{r}#204, l{r}#191, i{r}#194]] to [[x{r}#204, l{r}#191, i{r}#194]].
at org.elasticsearch.xpack.esql.optimizer.LogicalPlanOptimizer.optimize(LogicalPlanOptimizer.java:121)
at org.elasticsearch.xpack.esql.session.EsqlSession.optimizedPlan(EsqlSession.java:947)
at org.elasticsearch.xpack.esql.session.EsqlSession$1.lambda$onResponse$1(EsqlSession.java:228)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:261)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.support.SubscribableListener$SuccessResult.complete(SubscribableListener.java:406)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.support.SubscribableListener.tryComplete(SubscribableListener.java:326)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.support.SubscribableListener.addListener(SubscribableListener.java:222)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.support.SubscribableListener.lambda$andThen$1(SubscribableListener.java:534)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.ActionListener.run(ActionListener.java:465)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.support.SubscribableListener.newForked(SubscribableListener.java:138)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.support.SubscribableListener.andThen(SubscribableListener.java:534)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.support.SubscribableListener.andThen(SubscribableListener.java:489)
at org.elasticsearch.xpack.esql.session.EsqlSession$1.onResponse(EsqlSession.java:228)
at org.elasticsearch.xpack.esql.session.EsqlSession$1.onResponse(EsqlSession.java:219)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.support.SubscribableListener$SuccessResult.complete(SubscribableListener.java:406)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.support.SubscribableListener.tryComplete(SubscribableListener.java:326)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.support.SubscribableListener.setResult(SubscribableListener.java:355)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.support.SubscribableListener.onResponse(SubscribableListener.java:262)
at org.elasticsearch.xpack.esql.session.EsqlSession.analyzeWithRetry(EsqlSession.java:885)
at org.elasticsearch.xpack.esql.session.EsqlSession.lambda$resolveIndices$14(EsqlSession.java:538)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:261)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.support.SubscribableListener$SuccessResult.complete(SubscribableListener.java:406)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.support.SubscribableListener.tryComplete(SubscribableListener.java:326)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.support.SubscribableListener.setResult(SubscribableListener.java:355)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.support.SubscribableListener.onResponse(SubscribableListener.java:262)
at org.elasticsearch.xpack.esql.session.EsqlSession.preAnalyzeSubqueryIndices(EsqlSession.java:554)
at org.elasticsearch.xpack.esql.session.EsqlSession.lambda$preAnalyzeSubqueryIndices$15(EsqlSession.java:551)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:261)
at org.elasticsearch.xpack.esql.session.EsqlSession.lambda$preAnalyzeSubqueryIndex$16(EsqlSession.java:589)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.ActionListenerImplementations$DelegatingFailureActionListener.onResponse(ActionListenerImplementations.java:233)
at org.elasticsearch.xpack.esql.session.IndexResolver.lambda$resolveAsMergedMapping$0(IndexResolver.java:100)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:261)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.ActionListener$3.onResponse(ActionListener.java:413)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.tasks.TaskManager$1.onResponse(TaskManager.java:228)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.tasks.TaskManager$1.onResponse(TaskManager.java:222)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.ActionListenerImplementations$RunBeforeActionListener.onResponse(ActionListenerImplementations.java:350)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.ActionListener$3.onResponse(ActionListener.java:413)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:33)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.ActionListenerImplementations$MappedActionListener.onResponse(ActionListenerImplementations.java:111)
at org.elasticsearch.xpack.esql.action.EsqlResolveFieldsAction.finishHim(EsqlResolveFieldsAction.java:338)
at org.elasticsearch.xpack.esql.action.EsqlResolveFieldsAction.lambda$doExecuteForked$7(EsqlResolveFieldsAction.java:232)
at org.elasticsearch.base@9.3.0-SNAPSHOT/org.elasticsearch.core.AbstractRefCounted$1.closeInternal(AbstractRefCounted.java:125)
at org.elasticsearch.base@9.3.0-SNAPSHOT/org.elasticsearch.core.AbstractRefCounted.decRef(AbstractRefCounted.java:77)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.support.RefCountingRunnable.close(RefCountingRunnable.java:113)
at org.elasticsearch.base@9.3.0-SNAPSHOT/org.elasticsearch.core.Releasables$4.close(Releasables.java:178)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.RunOnce.run(RunOnce.java:41)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.fieldcaps.RequestDispatcher.innerExecute(RequestDispatcher.java:177)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.fieldcaps.RequestDispatcher$1.doRun(RequestDispatcher.java:146)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.fieldcaps.TransportFieldCapabilitiesAction$1.onResponse(TransportFieldCapabilitiesAction.java:328)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.action.fieldcaps.TransportFieldCapabilitiesAction$1.onResponse(TransportFieldCapabilitiesAction.java:324)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.AbstractThrottledTaskRunner$1.doRun(AbstractThrottledTaskRunner.java:136)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:35)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1067)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1575)\n"
| Map<String, AbstractConvertFunction> convertFunctions = new HashMap<>(); | ||
| plan.forEachDown(p -> p.forEachExpression(AbstractConvertFunction.class, f -> { | ||
| if (f.field() instanceof Attribute attr && unionAll.output().contains(attr)) { | ||
| convertFunctions.putIfAbsent(attr.name(), f); |
I think this is problematic: we convert an emerging union output attribute to the first[*] type showing up in a conversion function, irrespective of this conversion function applying to the union output attribute or not.
That is: if x is a field in two indices (of different types), part of the output of something like:
FROM index1, (FROM index2) | EVAL some_field = x::LONG | EVAL some_other_field = x::INTEGER
x's type will be either LONG or INTEGER, depending on the order of those EVALs. Which I think is wrong.
[*] first in order of walking the tree.
I guess we probably want to restrict the map to only those attributes that apply a conversion function to another one with the same name? In which case, I guess we'll want to simply push down the conversion function to the UnionAll branches -- so not just copy it, but push it away downwards: right now the conversions stay, even if done on the same name attribute.
It is a good idea; I'll see how to deal with it. I'd like to keep the explicit casting push down in this PR. I'm going to remove the implicit casting among subqueries and main index patterns from this PR and do it as a follow-up, as suggested by @alex-spies in the design review. If you haven't reviewed the implicit casting part, you can skip it for now.
Thank you so much for catching this! I think we can handle multiple conversion functions on the same UnionAll output better. The collectConvertFunctions method has been updated to collect all explicit conversion functions from the main query that reference the outputs of UnionAll. These explicit conversion functions are now pushed down into each UnionAll branch and returned as new output attributes. And the subsequent commands after UnionAll can reference the outputs without confusion.
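The difference between keeping only the first conversion per attribute and collecting all of them can be sketched as follows. This is an illustration only: `Conversion` is a hypothetical record standing in for an explicit cast such as `x::LONG`, and neither method reproduces the actual `collectConvertFunctions` implementation.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class CollectConvertFunctionsSketch {
    // Hypothetical stand-in for an explicit cast on a UnionAll output attribute.
    record Conversion(String field, String targetType) {}

    // Original behaviour under discussion: only the first conversion seen per
    // field survives, so the result depends on traversal order.
    static Map<String, String> firstOnly(List<Conversion> conversions) {
        Map<String, String> byField = new LinkedHashMap<>();
        for (Conversion c : conversions) {
            byField.putIfAbsent(c.field(), c.targetType());
        }
        return byField;
    }

    // Alternative: keep every distinct target type per field, in encounter order,
    // so each one can later be pushed into the branches as its own output.
    static Map<String, Set<String>> all(List<Conversion> conversions) {
        Map<String, Set<String>> byField = new LinkedHashMap<>();
        for (Conversion c : conversions) {
            byField.computeIfAbsent(c.field(), k -> new LinkedHashSet<>()).add(c.targetType());
        }
        return byField;
    }

    public static void main(String[] args) {
        List<Conversion> conversions = List.of(new Conversion("x", "LONG"), new Conversion("x", "INTEGER"));
        System.out.println(firstOnly(conversions));
        System.out.println(all(conversions));
    }
}
```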
|
|
||
| private LogicalPlan maybePushDownConvertFunctionsToChild(LogicalPlan child, List<Alias> aliases, List<Attribute> output) { | ||
| // Fork/UnionAll adds an EsqlProject on top of each child plan during resolveFork, check this pattern before pushing down | ||
| if (aliases.isEmpty() == false && child instanceof EsqlProject esqlProject) { |
Why do we need to check for the EsqlProject presence / pattern?
Up to here, the branches of UnionAll follow a common pattern like below:
UnionAll/Fork
  Project
    Eval (optional)
      Subquery (main index patterns do not have this node)
        ...
In both ResolveUnionTypesInUnionAll and PushDownFilterAndLimitIntoUnionAll, the pattern checks are intentionally strict: these rules only apply their transformations when a branch of UnionAll exactly matches the expected patterns. The main goal is to keep tight control over their behavior and ensure they perform only the intended transformations. If the structure of subqueries changes in the future and no longer fits the expected pattern, these rules will simply stop applying, allowing us to detect such changes early in the process.
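A strict pattern check of this kind might look like the sketch below. The node records are hypothetical stand-ins for the real plan classes, and `subqueryOf` is an illustrative name; the sketch only shows that a branch is accepted when it is exactly Project, then an optional Eval, then Subquery, and is otherwise left alone.

```java
import java.util.Optional;

final class StrictBranchPatternSketch {
    // Hypothetical stand-ins for plan nodes; each wraps a single child.
    interface Plan {}
    record Relation(String index) implements Plan {}
    record Subquery(Plan child) implements Plan {}
    record Eval(Plan child) implements Plan {}
    record Project(Plan child) implements Plan {}
    record Limit(Plan child) implements Plan {}

    // Returns the Subquery node only when the branch is exactly
    // Project -> (optional Eval) -> Subquery; any other shape yields empty,
    // so a rule built on this check simply does not fire.
    static Optional<Subquery> subqueryOf(Plan branch) {
        if (branch instanceof Project project) {
            Plan below = project.child();
            if (below instanceof Eval eval) {
                below = eval.child();
            }
            if (below instanceof Subquery subquery) {
                return Optional.of(subquery);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Plan matching = new Project(new Eval(new Subquery(new Relation("index2"))));
        Plan mainIndexBranch = new Project(new Relation("index1"));
        System.out.println(subqueryOf(matching).isPresent());
        System.out.println(subqueryOf(mainIndexBranch).isPresent());
    }
}
```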
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
| DataType targetType = convertFunctions.containsKey(oldAttr.name()) | ||
| ? convertFunctions.get(oldAttr.name()).dataType() | ||
| : oldAttr.dataType(); | ||
| if (oldAttr.dataType() != targetType) { |
I think this condition can only be true if convertFunctions.containsKey(oldAttr.name()). Maybe we can simplify a bit the code here for better legibility.
x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/tree/EsqlNodeSubclassTests.java
* [Docs] Improve semantic_text updates documentation organization (#137340) * Move script update restrictions section for consistency * Update docs/reference/elasticsearch/mapping-reference/semantic-text.md Co-authored-by: Liam Thompson <leemthompo@gmail.com> * Update docs/reference/elasticsearch/mapping-reference/semantic-text.md Co-authored-by: Liam Thompson <leemthompo@gmail.com> * Update docs/reference/elasticsearch/mapping-reference/semantic-text.md Co-authored-by: Liam Thompson <leemthompo@gmail.com> --------- Co-authored-by: Liam Thompson <leemthompo@gmail.com> * Remove unused field from IndexModule (#137342) * Improving random sampling performance by lazily calling getSamplingConfiguration() (#137223) * [ML] Disable CrossProject for Datafeeds (#136897) Initially, Datafeeds will not support cross-project source indices. We will verify that the IndicesOptions is not trying to resolve a cross-project index expression and throw a cross-project specific error message. * [ES-12998] Invoking gradle continue flag on periodic runs to allow for more information on test failures (#136900) * Add --continue flag to invoke maven when a task failure has been hit so that we can see the outcome of more testing over time. * ES|QL: Improve value loading for match_only_text mapping (#137026) * [Transform] Remove extra reset calls (#137346) ESRestTestCase already calls the `_reset` API, we do not need to do it twice. 
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co> * Add chunk_rescorer usage to output of explain and profile for text_similarity_rank_retriever (#137249) * Add chunk_rescorer usage to output of explain for text_similarity_rank_retriever * Update docs/changelog/137249.yaml * Update toString * Add support for profile * Update 137249.yaml * That's what you get for resolving conflicts in the UI, fixed compile failure * [CI] Auto commit changes from spotless * Fix test compilation --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co> * Remove `first` and `last` functions from documentation (#137341) * Remove `first` and `last` functions from documentation * Apply suggestions from code review Co-authored-by: Liam Thompson <leemthompo@gmail.com> * update --------- Co-authored-by: Liam Thompson <leemthompo@gmail.com> * ESQL: Work around concurrent serialization bug (#137350) The bug still exists in the underlying code, but it isn't released so I'm working around it by doing an eager call to `hashCode`. We'll fix the real bug real soon. 
Closes #137338 * [ES|QL] Add CHUNK function (#134320) * Add new function to chunk strings * Refactor CHUNK function to support multiple values * Default to returning all chunks * [CI] Auto commit changes from spotless * Handle warnings * Loosen export restrictions to try to get compile error working * Remove inference dependencies * Fix compilation errors * Remove more inference deps * Fix compile errors from merge * Fix existing tests * Exclude from CSV tests * Add more tests * Cleanup * [CI] Auto commit changes from spotless * Cleanup * Update docs/changelog/134320.yaml * PR feedback * Remove null field constraint * [CI] Auto commit changes from spotless * PR feedback: Refactor to use an options map * Cleanup * Regenerate docs * Add test on a concatenated field * Add multivalued field test * Don't hardcode strings * [CI] Auto commit changes from spotless * PR feedback --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co> * Add default sort for message.template_id field in logsdb indices (#136571) This PR adds the index setting `index.logsdb.default_sort_on_message_template`. When set, LogsDB indices will add `message.template_id` to the default sort fields (assuming the `message` field is present and of type `pattern_text`). 
* Transport version * [CI] Update transport version definitions * Specifically allow to_date_range and to_string * Tansport version * Update docs/changelog/133309.yaml * Fixed forbidden toString * Revert enrichment hack * transport version * Fixed accidentally deleted files * [CI] Update transport version definitions * Generated files for to_string / is_null tests * Transport version * [CI] Update transport version definitions * Transport version * Small merge accident fix * Added missing case for BlockTestUtils * Small fix for AllSupportedFieldTest * Transport version * [CI] Update transport version definitions * [CI] Update transport version definitions * Transport version * merge fixes * change date_range to under_construction * Some error tests fixes * More error tests fixes * More merge fixes * [CI] Auto commit changes from spotless * Smale checstyle fix * error tests fixes * More fixes * One more error fix * Fix the date range capability * Update docs/changelog/133309.yaml * bring back accidentally deleted files * changelog type * AllSupportedFieldTest fix for no snapshot build * Transport version * Fix for CSV test to avoid order false errors * Another fix after changing to under construction * merge fixes * [CI] Auto commit changes from spotless * [CI] Update transport version definitions * [CI] Update transport version definitions * [CI] Update transport version definitions * [CI] Update transport version definitions * Fix some date->long text replacements we missed * Removed duplicated test data `date_ranges.csv` * Update transport version after merging main * Nicer string literal * Update transport version after merging main * TO_DATERANGE should be snapshot-only * Fix failing tests after moving TO_DATERANGE to snapshot-only * Remove `useDateRangeWhenNotSupported` since that seems only relevant to partial support across multiple versions * Disabled TO_DATE_RANGE on release builds of testInlineCast * [CI] Auto commit changes from spotless * Update 
docs/changelog/133309.yaml * transport version * merge fixes * another transport version * Another try in fixing the multi cluster tests * [CI] Update transport version definitions * Fixed compilation error * Fix thread leak in ContextIndexSearcherTests.testMaxClause Add missing executor cleanup in finally block to prevent thread leaks when the test creates a ThreadPoolExecutor. * Add queryContainsIndices utility method to EsqlTestUtils This new method checks if a given ESQL query contains any specified indices, facilitating special handling for queries using indices loaded into multiple clusters. The method is integrated into MultiClusterSpecIT to streamline the logic for determining when to use remote indices, improving code clarity and maintainability. * Fix for the last review fix * [CI] Auto commit changes from spotless --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co> Co-authored-by: Craig Taverner <craig@amanzi.com> Co-authored-by: Fabrizio Ferri-Benedetti <algernon@fastmail.com> Co-authored-by: David Turner <david.turner@elastic.co> Co-authored-by: elasticsearchmachine <58790826+elasticsearchmachine@users.noreply.github.com> Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co> Co-authored-by: Moritz Mack <mmack@apache.org> Co-authored-by: Jack Shirazi <jack.shirazi@elastic.co> Co-authored-by: Matt <matthew.alp@elastic.co> Co-authored-by: Fang Xing <155562079+fang-xing-esql@users.noreply.github.com> Co-authored-by: Liam Thompson <leemthompo@gmail.com> Co-authored-by: Tim Grein <tim.grein@elastic.co> Co-authored-by: Alexander Spies <alexander.spies@elastic.co> Co-authored-by: kosabogi <105062005+kosabogi@users.noreply.github.com> Co-authored-by: Ievgen Degtiarenko <ievgen.degtiarenko@elastic.co> Co-authored-by: Quentin Pradet <quentin.pradet@elastic.co> Co-authored-by: Matteo Mazzola <matteo.mazzola@elastic.co> Co-authored-by: Nik Everett <nik9000@gmail.com> Co-authored-by: Jonathan Buttner 
<56361221+jonathan-buttner@users.noreply.github.com> Co-authored-by: Mridula <mridula.s@elastic.co> Co-authored-by: Dmitry Kubikov <Kubik42@users.noreply.github.com> Co-authored-by: Pat Whelan <pat.whelan@elastic.co> Co-authored-by: Alan Woodward <romseygeek@apache.org> Co-authored-by: Keith Massey <keith.massey@elastic.co> Co-authored-by: Neil Bhavsar <neil.bhavsar@elastic.co> Co-authored-by: Ioana Tagirta <ioanatia@users.noreply.github.com> Co-authored-by: Kathleen DeRusso <kathleen.derusso@elastic.co> Co-authored-by: Kostas Krikellas <131142368+kkrik-es@users.noreply.github.com> Co-authored-by: Jordan Powers <jordan.powers@elastic.co>
|
Hi @fang-xing-esql, just checking: is this feature still available only in snapshot builds, and NOT yet GA? |
This feature is behind snapshot, and not formally released yet.
This PR enables support for `non-correlated subqueries` within the `FROM` command. Related to https://github.com/elastic/esql-planning/issues/89

A `non-correlated subquery` in this context is one that is fully self-contained and does not reference attributes from the outer query. Enabling support for these subqueries in the `FROM` command provides an additional way to define a data source, beyond directly specifying index patterns in an `ES|QL` query.

Example
This feature is built on top of `Fork`. Subqueries are processed in a manner similar to how `Fork` operates today, with modifications made to the following components to support this functionality:

- `FROM_MODE` is updated to support subquery syntax.
- `LogicalPlanBuilder` creates a `UnionAll` logical plan on top of multiple data sources. Each data source can be either index patterns or subqueries.
- `UnionAll` extends `Fork`, but unlike `Fork`, each `UnionAll` leg may fetch data from different indices. This is one of the key differences between `UnionAll` and `Fork`.
- `fieldcaps` calls to build an `IndexResolution` for each subquery.
- `UnionAll` leg, `InvalidMappedField` are not created across them. If conversion functions are required for common fields between the main index and subquery indices, those conversion functions must be pushed down into each `UnionAll` leg.
- `UnionAll` and `Fork`, as predicate pushdown applies only to `UnionAll`, while `Fork` remains unchanged.

Restrictions and follow-ups to be addressed in the next PRs:

- `LogicalPlanOptimizer` will error out if the subquery has commands besides the `FROM` command. This is tracked in [ES|QL] Allow nested non-correlated subqueries in from command #136034.
- `FieldNameUtils.resolveFieldNames` to identify subquery field names for field caps call, instead of using all fields `*`. [ES|QL] Improve FieldNameUtils.resolveFieldNames to identify subquery field names for field caps call, instead of using all fields `*` #137283
- `LocalRelation` created by `PruneFilters` includes all of the output of the subquery, and the output is the superset of the outputs from each subquery, which looks confusing; the outputs that are not directly from the current subquery can be excluded from the `LocalRelation`.
- `UnionAll` output with explicit casting. [ES|QL] Push down Filters/Predicates on mixed data type `UnionAll` output with explicit casting. #137284
- `UnionAll` output
- `LIMIT` appended to each subquery (`UnionAll` branch). [ES|QL] Remove the implicit `LIMIT` command added to each subquery (`UnionAll` branch) #138106
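
To make the predicate pushdown point above more concrete, here is a toy Java sketch, not code from this PR. It models each `UnionAll` leg as a plain list of integers and compares filtering on top of the union with filtering inside each leg. The class name `UnionAllPushdownSketch` and both method names are hypothetical and exist only for illustration; the real optimizer rule (`PushDownAndCombineFilters`) operates on logical plan nodes, not on lists.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only: legs are modeled as lists of rows (integers).
public class UnionAllPushdownSketch {

    // Filter kept above the union: every row of every leg is collected first.
    static List<Integer> filterAboveUnion(List<List<Integer>> legs, Predicate<Integer> predicate) {
        List<Integer> union = new ArrayList<>();
        for (List<Integer> leg : legs) {
            union.addAll(leg);
        }
        List<Integer> result = new ArrayList<>();
        for (Integer row : union) {
            if (predicate.test(row)) {
                result.add(row);
            }
        }
        return result;
    }

    // Filter pushed into each leg: non-matching rows never reach the union.
    static List<Integer> filterPushedIntoLegs(List<List<Integer>> legs, Predicate<Integer> predicate) {
        List<Integer> result = new ArrayList<>();
        for (List<Integer> leg : legs) {
            for (Integer row : leg) {
                if (predicate.test(row)) {
                    result.add(row);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> legs = List.of(List.of(1, 12, 7), List.of(30, 4));
        Predicate<Integer> greaterThanTen = value -> value > 10;
        System.out.println(filterAboveUnion(legs, greaterThanTen));
        System.out.println(filterPushedIntoLegs(legs, greaterThanTen));
    }
}
```

Both methods return the same rows; the difference is only where the rows are discarded. In this simplified model, pushing the filter into each leg means less data flows into the union, which is the motivation usually given for this kind of optimization.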