ESQL: Logical Planning on the Lookup Node#144241

Merged
julian-elastic merged 7 commits into elastic:main from julian-elastic:lookupLogicalPlanning_v4
Mar 21, 2026

@julian-elastic julian-elastic commented Mar 13, 2026

Summary

This change only affects Streaming Lookup Join, which is behind the snapshot flag. No changes are expected for release builds.

Introduces logical plan optimization on the lookup node for LOOKUP JOIN. Previously, only physical optimization ran on the lookup-node plan. Now, a new LookupLogicalOptimizer runs before LocalMapper, enabling shard-level field statistics (missing fields, constant fields) to fold and prune filters on the lookup side — the same optimizations the data node already benefits from via LocalLogicalPlanOptimizer.

When a pushed-down filter folds to false (e.g. filtering on a missing field, or a constant_keyword that doesn't match), the lookup node marks the plan as emptyResult and the LookupQueryOperator short-circuits by discarding input pages without executing any Lucene queries.

Changes

New optimizer: LookupLogicalOptimizer

  • Mirrors LocalLogicalPlanOptimizer with a reduced rule set appropriate for the narrow lookup plan shape (Project -> optional Filter -> ParameterizedQuery).
  • Runs ReplaceFieldWithConstantOrNull, InferIsNotNull, and the standard operator-optimization rules (fold nulls, simplify booleans, prune filters).
  • Inserted into the pipeline in LookupFromIndexService.createLookupPhysicalPlan before LocalMapper.map.

New rule: LookupPruneFilters

  • Subclass of PruneFilters that overrides always-false filter handling. Instead of collapsing the plan to LocalRelation (which LookupExecutionPlanner cannot handle), it sets emptyResult=true on the ParameterizedQuery, preserving the plan structure.
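
To illustrate the idea (not the actual rule), here is a minimal sketch on a toy plan tree. All `Toy*` types are hypothetical stand-ins for the real ES|QL classes, and the sketch assumes the always-false filter node itself is dropped once the leaf is flagged; the PR description does not say whether the real rule keeps it.

```java
import java.util.List;

// Toy plan nodes. These are hypothetical stand-ins, NOT the real ES|QL classes.
interface ToyPlan {}

// Leaf of the lookup plan; emptyResult mirrors the flag described in this PR.
record ToyParameterizedQuery(boolean emptyResult) implements ToyPlan {}

// Filter whose condition was folded to TRUE/FALSE, or null if it could not be folded.
record ToyFilter(Boolean folded, ToyPlan child) implements ToyPlan {}

record ToyProject(List<String> fields, ToyPlan child) implements ToyPlan {}

final class ToyLookupPruneFilters {
    // Rewrites the plan top-down, pruning filters that folded to a literal.
    static ToyPlan apply(ToyPlan plan) {
        if (plan instanceof ToyProject p) {
            return new ToyProject(p.fields(), apply(p.child()));
        }
        if (plan instanceof ToyFilter f) {
            ToyPlan child = apply(f.child());
            if (Boolean.TRUE.equals(f.folded())) {
                return child; // always-true filter: simply drop it
            }
            if (Boolean.FALSE.equals(f.folded())) {
                // Always-false filter: keep the leaf, but flag it as producing no rows
                // (assumption: the filter node itself is dropped in this sketch).
                return markEmpty(child);
            }
            return new ToyFilter(f.folded(), child);
        }
        return plan;
    }

    // Sets emptyResult=true on the leaf below the given node.
    private static ToyPlan markEmpty(ToyPlan plan) {
        if (plan instanceof ToyParameterizedQuery) {
            return new ToyParameterizedQuery(true);
        }
        if (plan instanceof ToyFilter f) {
            return new ToyFilter(f.folded(), markEmpty(f.child()));
        }
        if (plan instanceof ToyProject p) {
            return new ToyProject(p.fields(), markEmpty(p.child()));
        }
        return plan;
    }
}
```

The point of the design is that the result is still a Project over a ParameterizedQuery, so a planner that only understands that shape keeps working; only the flag changes.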

ReplaceFieldWithConstantOrNull extended for ParameterizedQuery

  • Now collects constant field values from ParameterizedQuery output (in addition to EsRelation).
  • Inserts null-Eval nodes after ParameterizedQuery for missing fields, same pattern as for EsRelation.
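
As a rough sketch of why field statistics let a filter fold, the following toy model folds two kinds of conditions. `ToyStats` and `ToyFieldFolding` are hypothetical names, not part of the PR, and values are modeled as plain strings for brevity.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical shard-level statistics: which fields exist, and which have one constant value.
record ToyStats(Set<String> existingFields, Map<String, String> constantValues) {}

final class ToyFieldFolding {
    // Folds `field == literal`. An empty Optional means "cannot fold, keep the filter".
    static Optional<Boolean> foldEquals(ToyStats stats, String field, String literal) {
        if (!stats.existingFields().contains(field)) {
            // A missing field is replaced by null; null == literal is null,
            // and a filter treats a null condition as false.
            return Optional.of(false);
        }
        String constant = stats.constantValues().get(field);
        if (constant != null) {
            // A constant field is replaced by its value, so the comparison is decidable.
            return Optional.of(constant.equals(literal));
        }
        return Optional.empty();
    }

    // Folds `field IS NULL` for the missing-field case only.
    static Optional<Boolean> foldIsNull(ToyStats stats, String field) {
        if (!stats.existingFields().contains(field)) {
            return Optional.of(true);
        }
        return Optional.empty();
    }
}
```

This corresponds to the cases listed in the test plan: a filter on a missing field can fold to empty (equality) or to true (`IS NULL`), and a constant field folds to a match or a mismatch.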

emptyResult flag threaded through the plan

  • Added to ParameterizedQuery (logical), ParameterizedQueryExec (physical), LookupQueryOperatorFactory, and LookupQueryOperator.
  • Runtime-only flag, not serialized — computed locally on the lookup node after deserialization.

LookupQueryOperator short-circuit

  • When emptyResult=true, addInput releases page blocks immediately and returns without setting up query processing.
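
The short-circuit can be sketched as follows. `ToyPage` and `ToyLookupOperator` are hypothetical simplifications, and the counters exist only to make the behavior observable; the real operator's internals are not shown in this description.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a page that owns releasable blocks.
final class ToyPage {
    private boolean released;

    void releaseBlocks() {
        released = true;
    }

    boolean isReleased() {
        return released;
    }
}

final class ToyLookupOperator {
    private final boolean emptyResult;
    private final Deque<ToyPage> pending = new ArrayDeque<>();
    private int queriesStarted;

    ToyLookupOperator(boolean emptyResult) {
        this.emptyResult = emptyResult;
    }

    void addInput(ToyPage page) {
        if (emptyResult) {
            // The plan is known to match nothing: free the page and skip query setup.
            page.releaseBlocks();
            return;
        }
        // Normal path: keep the page and (in this toy model) count one query setup.
        pending.add(page);
        queriesStarted++;
    }

    int queriesStarted() {
        return queriesStarted;
    }

    int pendingPages() {
        return pending.size();
    }
}
```

Releasing the blocks before returning matters because the operator takes ownership of input pages; dropping a page without releasing it would leak memory.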

LookupExecutionPlanner supports EvalExec

  • Added planEvalExec to handle Eval nodes inserted by ReplaceFieldWithConstantOrNull.

FullTextFunction validation

  • Added ParameterizedQuery as a valid terminal node for QSTR/KQL validation, so full-text filters pushed down to lookup plans pass verification.
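
A minimal sketch of an allowlist-style check, assuming validation boils down to testing the type of the plan's source node. `ToyFullTextVerifier`, the node names as strings, and the error message are all hypothetical; the real verifier works on plan node classes and is more involved.

```java
import java.util.Optional;
import java.util.Set;

final class ToyFullTextVerifier {
    // Hypothetical allowlist of node names accepted as the source below a full-text function.
    private static final Set<String> ALLOWED_SOURCES = Set.of("EsRelation", "ParameterizedQuery");

    // Returns an error message if the source node is not allowed, or empty if validation passes.
    static Optional<String> validate(String functionName, String sourceNodeName) {
        if (ALLOWED_SOURCES.contains(sourceNodeName)) {
            return Optional.empty();
        }
        // Illustrative message only; not the wording used by Elasticsearch.
        return Optional.of("[" + functionName + "] cannot be used after " + sourceNodeName);
    }
}
```

With only `EsRelation` in the set, a lookup plan rooted at `ParameterizedQuery` would be rejected even though nothing about the query changed, which is the kind of failure this change avoids.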

Test plan

  • New LookupLogicalOptimizerTests: simple lookup, filter on existing field, filter on missing field (folds to empty and to true), constant field match/mismatch.
  • New tests in LookupPhysicalPlanOptimizerTests: missing field fold to empty/true at physical level, drop missing field prunes eval.
  • New testEmptyResultDiscardsInput in LookupQueryOperatorTests: operator-level test for the emptyResult=true short-circuit path.
  • Updated LookupFromIndexIT to pass field attributes through to EsRelation output for logical optimizer compatibility.
  • New YAML REST tests (190_lookup_join.yml): end-to-end constant_keyword filter match and mismatch against a lookup index.
  • Existing tests updated for new emptyResult parameter and EsRelation output changes.

@julian-elastic julian-elastic self-assigned this Mar 13, 2026
@julian-elastic julian-elastic added the :Analytics/ES|QL, >enhancement, and Team:Analytics labels Mar 13, 2026
@julian-elastic julian-elastic marked this pull request as ready for review March 13, 2026 18:24
@elasticsearchmachine

Pinging @elastic/es-analytical-engine (Team:Analytics)

julian-elastic commented Mar 13, 2026

Logged follow-up enhancements:
ESQL: Enhance Index and Shard pruning to take advantage of Min/Max Stats values - #144219
ESQL: Skip data transfer for pruned LOOKUP JOIN shards - #144222

@julian-elastic

Buildkite benchmark this with esql-joins please

elasticmachine commented Mar 14, 2026

💚 Build Succeeded

This build ran two esql-joins benchmarks to evaluate performance impact of this PR.

cc @julian-elastic

@cimequinox cimequinox left a comment

This is the first time I've looked at optimizer code and the logic using SearchStats. Of course this is an important part so it would be good if someone familiar with the optimizer would review and look for gaps there. The other parts of the code make sense to me.

@alex-spies alex-spies left a comment

FullTextFunction validation

Added ParameterizedQuery as a valid terminal node for QSTR/KQL validation, so full-text filters pushed down to lookup plans pass verification.

Is that an enhancement, or just something we enable for logical planning on the lookup node specifically? Full-text functions are already supported on lookup fields, right?

Comment on lines +110 to +112
Project project = as(plan, Project.class);
Filter filter = as(project.child(), Filter.class);
assertThat(filter.child(), instanceOf(ParameterizedQuery.class));

These assertions are hard to grok for a human. By now, we have golden tests that span all optimizer stages. Could you please hook the lookup planning/optimization stages into the golden tests in a follow-up and refactor the tests for more readability? (Also the tests added in #143707)

Contributor Author

OK, will address in a follow-up PR.

@alex-spies alex-spies left a comment

Very superficial review, but the general approach looks A-OK to me and I love the enhancement that we immediately get from leveraging SearchStats. Thanks @julian-elastic !

@julian-elastic

FullTextFunction validation

Added ParameterizedQuery as a valid terminal node for QSTR/KQL validation, so full-text filters pushed down to lookup plans pass verification.

Is that an enhancement, or just something we enable for logical planning on the lookup node specifically? Full-text functions are already supported on lookup fields, right?

Not an enhancement. After adding lookup logical planning, some of the validation logic started failing for CSV tests that passed before the change. The validation uses a whitelist of plan nodes, and lookup plans use ParameterizedQuery instead of EsRelation, so I added ParameterizedQuery to that whitelist.

@julian-elastic julian-elastic merged commit 3874506 into elastic:main Mar 21, 2026
36 checks passed
michalborek pushed a commit to michalborek/elasticsearch that referenced this pull request Mar 23, 2026
* Lookup Logical Planning

* Address code review comments, UTs

Assisted by Cursor/Claude
Labels

:Analytics/ES|QL, >enhancement, Team:Analytics, v9.4.0

5 participants