ESQL: evaluate ReferenceAttributes to potentially FieldAttributes for full-text functions restriction#143893
Conversation
complex cases (RENAMEd fields) for full-text functions.
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
|
Hi @astefan, I've created a changelog YAML for you. |
carlosdelest
left a comment
There was a problem hiding this comment.
LGTM, thanks @astefan ! Very thorough testing 💯
I have a question regarding field resolution on query translation
docs/changelog/143893.yaml
Outdated
| issues: | ||
| - 143859 | ||
| pr: 143893 | ||
| summary: Evaluate `ReferenceAttributes` to potentially `FieldAttributes` for full-text |
There was a problem hiding this comment.
| summary: Evaluate `ReferenceAttributes` to potentially `FieldAttributes` for full-text | |
| summary: Identify correctly a renamed attribute in full text functions as a field attribute |
| * Resolves the given field expression to a {@link FieldAttribute} by following rename alias chains | ||
| * through {@link Project} nodes in the plan. | ||
| */ | ||
| private FieldAttribute resolveToFieldAttribute(LogicalPlan plan, Expression field) { |
There was a problem hiding this comment.
Should we replace fieldAsFieldAttribute for this method instead? We're still using fieldAsFieldAttribute for the query generation - I wonder why we don't need the same field resolution code there?
There was a problem hiding this comment.
The query generation path doesn't need this because PushDownAndCombineFilters.pushDownPastProject (via resolveRenamesFromProject) already resolves the ReferenceAttribute back to the original FieldAttribute when pushing the filter below the Project.
The same applies to postOptimizationPlanVerification: by that point renames are already resolved. The resolveToFieldAttribute traversal is only truly needed in postAnalysisPlanVerification, where the filter is still above the Project and the field is still a ReferenceAttribute.
But I chose to keep it like this (checked in both places - post analysis and post optimization) because the optimizer runs unnecessarily on queries that should be rejected immediately. For example, FROM idx | EVAL x = 1 | WHERE match(x, "foo") would go through all optimization rules before the user sees the error. Also, I hope that resolveToFieldAttribute is cost-effective enough (single traversal with early exit) to not impact the execution flow too much.
There was a problem hiding this comment.
Thanks for explaining!
From my PoV, I'd prefer a unified way of identifying this even though we run through the optimization before we fail - we would avoid having separate code paths for having essentially the same outcome. From what you say, we could use the same code we had on a PostOptimizationPlanVerificationAware.
I trust your judgement on this of course 👍
…locations * upstream/main: (126 commits) Update KnnIndexTester to use more settings from datasets (elastic#143869) fix: dynamic template vector array is overridden by automatic dense_vector mapping (elastic#143733) ES|QL: Don't reuse the same alias for _fork column (elastic#143909) Close and initialize clients after each node upgrade in logsdb rolling upgrade tests. (elastic#143823) ESQL: Added GroupedTopNOperator for LIMIT BY, compute only (elastic#143476) Handle views in ResolveIndexAction (elastic#143561) Improve reindex rethrottle API in stateless (elastic#143771) Use a copy of the SearchExecutionContext for each Percolator execution (elastic#142765) Log the stacktrace when we encounter a deprecation warning for `default_metric` (elastic#143929) ESQL: evaluate ReferenceAttributes to potentially FieldAttributes for full-text functions restriction (elastic#143893) Add ClusterStateSerializationStats Serializatation Tests (elastic#142703) Adds Coordination Diagnostics Tests (elastic#142709) Upgrade Elasticsearch to Apache Lucene 10.4 (elastic#141882) ESQL: Add configurable bracket-based multi-value support for CSV reader (elastic#143890) time series es819 binary dv use up to a 1mb block size (elastic#143049) Dynamically enable / disable plugins in correspondence to stateless mode. (elastic#142147) ES|QL: Implement first/last_over_time for tdigest (elastic#143832) Document CHANGE_POINT limitation (elastic#143877) Fix OperationsOnSeqNoDisabledIndicesIT (elastic#143892) [Test] Test that sequence numbers are not pruned with retention lease (elastic#143825) ...
Fixes #143859