ESQL: Logical Planning on the Lookup Node#144241
ESQL: Logical Planning on the Lookup Node#144241julian-elastic merged 7 commits intoelastic:mainfrom
Conversation
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
|
Buildkite benchmark this with esql-joins please |
💚 Build Succeeded
This build ran two esql-joins benchmarks to evaluate performance impact of this PR. History |
...sql/compute/src/main/java/org/elasticsearch/compute/operator/lookup/LookupQueryOperator.java
Show resolved
Hide resolved
...ck/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/enrich/LookupExecutionPlanner.java
Show resolved
Hide resolved
...ck/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/enrich/LookupFromIndexService.java
Show resolved
Hide resolved
...g/elasticsearch/xpack/esql/optimizer/rules/logical/local/ReplaceFieldWithConstantOrNull.java
Show resolved
Hide resolved
cimequinox
left a comment
There was a problem hiding this comment.
This is the first time I've looked at optimizer code and the logic using SearchStats. Of course this is an important part so it would be good if someone familiar with the optimizer would review and look for gaps there. The other parts of the code make sense to me.
alex-spies
left a comment
There was a problem hiding this comment.
FullTextFunction validation
Added ParameterizedQuery as a valid terminal node for QSTR/KQL validation, so full-text filters pushed down to lookup plans pass verification.
Is that an enhancement, or just something we enable for logical planning on the lookup node specifically? Full-text functions are already supported on lookup fields, right?
| Project project = as(plan, Project.class); | ||
| Filter filter = as(project.child(), Filter.class); | ||
| assertThat(filter.child(), instanceOf(ParameterizedQuery.class)); |
There was a problem hiding this comment.
These assertions are hard to grok for a human. By now, we have golden tests that span all optimizer stages. Could you please hook the lookup planning/optimization stages into the golden tests in a follow-up and refactor the tests for more readability? (Also the tests added in #143707)
There was a problem hiding this comment.
OK, will address in a follow up PR
alex-spies
left a comment
There was a problem hiding this comment.
Very superficial review, but the general approach looks A-OK to me and I love the enhancement that we immediately get from leveraging SearchStats. Thanks @julian-elastic !
...l/src/test/java/org/elasticsearch/xpack/esql/optimizer/LookupPhysicalPlanOptimizerTests.java
Show resolved
Hide resolved
...plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LookupLogicalOptimizer.java
Outdated
Show resolved
Hide resolved
...plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LookupLogicalOptimizer.java
Outdated
Show resolved
Hide resolved
...plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LookupLogicalOptimizer.java
Outdated
Show resolved
Hide resolved
Not an enhancement. After adding Lookup Logical Planning some of the validation logic started failing for csv tests that used to run without the change. It uses a whitelist and we are using ParameterizedQuery instead of EsRelation, so added ParameterizedQuery to the no alarm whitelist. |
* Lookup Logical Planning * Address code review comments, UTs Assisted by Cursor/Claude
Summary
The change only affects Streaming Lookup Join which is behind Snapshot flag. No changes expected for release build.
Introduces logical plan optimization on the lookup node for LOOKUP JOIN. Previously, only physical optimization ran on the lookup-node plan. Now, a new
LookupLogicalOptimizerruns beforeLocalMapper, enabling shard-level field statistics (missing fields, constant fields) to fold and prune filters on the lookup side — the same optimizations the data node already benefits from viaLocalLogicalPlanOptimizer.When a pushed-down filter folds to
false(e.g. filtering on a missing field, or aconstant_keywordthat doesn't match), the lookup node marks the plan asemptyResultand theLookupQueryOperatorshort-circuits by discarding input pages without executing any Lucene queries.Changes
New optimizer:
LookupLogicalOptimizerLocalLogicalPlanOptimizerwith a reduced rule set appropriate for the narrow lookup plan shape (Project -> optional Filter -> ParameterizedQuery).ReplaceFieldWithConstantOrNull,InferIsNotNull, and the standard operator-optimization rules (fold nulls, simplify booleans, prune filters).LookupFromIndexService.createLookupPhysicalPlanbeforeLocalMapper.map.New rule:
LookupPruneFiltersPruneFiltersthat overrides always-false filter handling. Instead of collapsing the plan toLocalRelation(whichLookupExecutionPlannercannot handle), it setsemptyResult=trueon theParameterizedQuery, preserving the plan structure.ReplaceFieldWithConstantOrNullextended forParameterizedQueryParameterizedQueryoutput (in addition toEsRelation).ParameterizedQueryfor missing fields, same pattern as forEsRelation.emptyResultflag threaded through the planParameterizedQuery(logical),ParameterizedQueryExec(physical),LookupQueryOperatorFactory, andLookupQueryOperator.LookupQueryOperatorshort-circuitemptyResult=true,addInputreleases page blocks immediately and returns without setting up query processing.LookupExecutionPlannersupportsEvalExecplanEvalExecto handleEvalnodes inserted byReplaceFieldWithConstantOrNull.FullTextFunctionvalidationParameterizedQueryas a valid terminal node for QSTR/KQL validation, so full-text filters pushed down to lookup plans pass verification.Test plan
LookupLogicalOptimizerTests: simple lookup, filter on existing field, filter on missing field (folds to empty and to true), constant field match/mismatch.LookupPhysicalPlanOptimizerTests: missing field fold to empty/true at physical level, drop missing field prunes eval.testEmptyResultDiscardsInputinLookupQueryOperatorTests: operator-level test for theemptyResult=trueshort-circuit path.LookupFromIndexITto pass field attributes through toEsRelationoutput for logical optimizer compatibility.190_lookup_join.yml): end-to-end constant_keyword filter match and mismatch against a lookup index.emptyResultparameter andEsRelationoutput changes.