Skip to content

ESQL: allow sorting by expressions and not only regular fields#107158

Merged
astefan merged 9 commits intoelastic:mainfrom
astefan:sort_expressions
Apr 9, 2024
Merged

ESQL: allow sorting by expressions and not only regular fields#107158
astefan merged 9 commits intoelastic:mainfrom
astefan:sort_expressions

Conversation

@astefan
Copy link
Copy Markdown
Contributor

@astefan astefan commented Apr 5, 2024

This allows ESQL to support sort commands over complex expressions like sort salary + languages or sort concat(left(last_name, 1), left(first_name, 1)), salary desc.

@astefan astefan requested review from alex-spies and costin April 5, 2024 14:18
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Apr 5, 2024
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Hi @astefan, I've created a changelog YAML for you.

* \_Eval[[emp_no{f}#36 + salary{f}#41 * 13[INTEGER] AS $$order_by$temp_name$0, NEG(salary{f}#41) AS $$order_by$temp_name$1]]
* \_EsRelation[test][_meta_field{f}#42, emp_no{f}#36, first_name{f}#37, ..]
*/
public void testPushdownWithOverwrittenName() {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, we already supported expressions in sort but not in all cases, thanks to the work @alex-spies has done in #105650. I have kept his original test, but changed it so that the sort is on individual fields (notice below the sort command that I changed to SORT emp_no ASC nulls first, salary DESC nulls last, emp_no).

* \_Eval[[emp_no{f}#36 + salary{f}#41 * 13[INTEGER] AS $$order_by$temp_name$0, NEG(salary{f}#41) AS $$order_by$temp_name$1]]
* \_EsRelation[test][_meta_field{f}#42, emp_no{f}#36, first_name{f}#37, ..]
*/
public void testReplaceSortByExpressions() {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original queries from testPushdownWithOverwrittenName in #105650 are now used to prove the functionality added by this PR is sound.

Copy link
Copy Markdown
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


for (Order order : orderBy.order()) {
if (order.child() instanceof Attribute == false) {
var name = rawTemporaryName("order_by", "temp_name", String.valueOf(counter++));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe find a better name than "temp_name" - such as the ordinal of the order it tries to replace?

Copy link
Copy Markdown
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; only left minor comments, but this would be good-to-go as-is IMHO.

new SkipQueryOnEmptyMappings(),
new SubstituteSpatialSurrogates()
new SubstituteSpatialSurrogates(),
new ReplaceOrderByExpressionWithEval()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought: There's 4 places now that take expressions out of a plan node and turn them into evals:

  • ReplaceStatsNestedExpressionWithEval and this create an Eval before the node.
  • ReplaceStatsAggExpressionWithEval and SubstituteSurrogates create an Eval after the node.

I wonder if this should be consolidated into 2 opt. rules, or at least should use common helper methods.

It's probably fine for now, though.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you in that we should look into the possibility of refactoring this. But this should probably be done separately under the umbrella of "refactoring" or maybe "tech debt".

Comment on lines +339 to +343
new org.elasticsearch.xpack.esql.expression.Order(
order.source(),
eval.toAttribute(),
order.direction(),
order.nullsPosition()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't this just be order.replaceChildren(List.of(eval.toAttribute()))?

from employees
| sort concat(left(last_name, 1), left(first_name, 1)), salary desc
| keep first_name, last_name, salary
| eval c = concat(left(last_name, 1), left(first_name, 1))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's pretty paranoid, but maybe it's better if we just have

Suggested change
| eval c = concat(left(last_name, 1), left(first_name, 1))
| eval l = left(last_name, 1, r = left(first_name, 1)

Reason: in the future, c might be eliminated by common subexpression elimination and we end up testing a different opt. rule.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's pretty far fetched, but I don't mind changing this. The spirit of the test remains the same with this suggestion.


@Override
protected LogicalPlan rule(OrderBy orderBy) {
int counter = 0;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this counter maybe should be a (static?) class member instead of a local var?

I could see multiple subsequent sorts accidentally shadowing each other's temporary evals in some edge cases/weird interactions with other opt. rules - by making the name+counter essentially unique, we'd be avoiding this.

Comment on lines +3948 to +3951
assertThat(firstOrderExpr.direction(), equalTo(org.elasticsearch.xpack.ql.expression.Order.OrderDirection.ASC));
assertThat(firstOrderExpr.nullsPosition(), equalTo(org.elasticsearch.xpack.ql.expression.Order.NullsPosition.LAST));
var renamedEmpNoSalaryExpression = as(firstOrderExpr.child(), ReferenceAttribute.class);
assertThat(renamedEmpNoSalaryExpression.toString(), startsWith("$$order_by$temp_name$0"));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't assert the expression that defines $$order_by$temp_name$0 anymore; I think we should add these asserts, otherwise we only test the correctness of the sort but not of the eval after the opt. rule has run. Same in testReplaceSortByExpressions

Mokhtar |Bernatsky |38992 |BM
Parto |Bamford |61805 |BP
Premal |Baek |52833 |BP
;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a test with two subesequent sorts with expressions for their groupbys can't hurt, esp. if they use the same expressions:

...
| sort a + b, c
| eval d = c
| sort a + b, d

This could have interesting interactions with var. shadowing and some pruning opt. rules.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing spectacular since both sorts are merged and they are identified as the same, but I've added a test to LogicalPlanOptimizerTests.

@astefan astefan merged commit 31c05e9 into elastic:main Apr 9, 2024
@astefan astefan deleted the sort_expressions branch April 9, 2024 15:57
@drewdaemon
Copy link
Copy Markdown
Contributor

Great improvement! However, I don't see this in the docs. Is an update there planned?

drewdaemon added a commit to elastic/kibana that referenced this pull request May 1, 2024
## Summary

`SORT` accepts expressions as of
elastic/elasticsearch#107158.

This PR updates the validator and autocompletion engine to account for
this improvement.

### Checklist

- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added — looks like docs haven't been added for this feature yet. I
opened the discussion with ES team
([ref](elastic/elasticsearch#107158 (comment)))
- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

---------

Co-authored-by: Stratoula Kalafateli <efstratia.kalafateli@elastic.co>
Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
kibanamachine added a commit to kibanamachine/kibana that referenced this pull request May 1, 2024
## Summary

`SORT` accepts expressions as of
elastic/elasticsearch#107158.

This PR updates the validator and autocompletion engine to account for
this improvement.

### Checklist

- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added — looks like docs haven't been added for this feature yet. I
opened the discussion with ES team
([ref](elastic/elasticsearch#107158 (comment)))
- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

---------

Co-authored-by: Stratoula Kalafateli <efstratia.kalafateli@elastic.co>
Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
(cherry picked from commit 5ba6a39)
kibanamachine added a commit to elastic/kibana that referenced this pull request May 1, 2024
# Backport

This will backport the following commits from `main` to `8.14`:
- [[ES|QL] Sorting accepts expressions
(#181916)](#181916)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Drew
Tate","email":"drew.tate@elastic.co"},"sourceCommit":{"committedDate":"2024-05-01T17:32:37Z","message":"[ES|QL]
Sorting accepts expressions (#181916)\n\n## Summary\r\n\r\n`SORT`
accepts expressions as
of\r\nhttps://github.com/elastic/elasticsearch/pull/107158.\r\n\r\nThis
PR updates the validator and autocompletion engine to account
for\r\nthis improvement.\r\n\r\n### Checklist\r\n\r\n- [
]\r\n[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)\r\nwas
added — looks like docs haven't been added for this feature yet.
I\r\nopened the discussion with ES
team\r\n([ref](https://github.com/elastic/elasticsearch/pull/107158#issuecomment-2080063533))\r\n-
[x] [Unit or
functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere
updated or added to match the most common
scenarios\r\n\r\n---------\r\n\r\nCo-authored-by: Stratoula Kalafateli
<efstratia.kalafateli@elastic.co>\r\nCo-authored-by: Kibana Machine
<42973632+kibanamachine@users.noreply.github.com>","sha":"5ba6a399f23282603011332b997044fc485fb270","branchLabelMapping":{"^v8.15.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","backport:prev-minor","Feature:ES|QL","v8.14.0","Team:ESQL","v8.15.0"],"title":"[ES|QL]
Sorting accepts
expressions","number":181916,"url":"https://github.com/elastic/kibana/pull/181916","mergeCommit":{"message":"[ES|QL]
Sorting accepts expressions (#181916)\n\n## Summary\r\n\r\n`SORT`
accepts expressions as
of\r\nhttps://github.com/elastic/elasticsearch/pull/107158.\r\n\r\nThis
PR updates the validator and autocompletion engine to account
for\r\nthis improvement.\r\n\r\n### Checklist\r\n\r\n- [
]\r\n[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)\r\nwas
added — looks like docs haven't been added for this feature yet.
I\r\nopened the discussion with ES
team\r\n([ref](https://github.com/elastic/elasticsearch/pull/107158#issuecomment-2080063533))\r\n-
[x] [Unit or
functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere
updated or added to match the most common
scenarios\r\n\r\n---------\r\n\r\nCo-authored-by: Stratoula Kalafateli
<efstratia.kalafateli@elastic.co>\r\nCo-authored-by: Kibana Machine
<42973632+kibanamachine@users.noreply.github.com>","sha":"5ba6a399f23282603011332b997044fc485fb270"}},"sourceBranch":"main","suggestedTargetBranches":["8.14"],"targetPullRequestStates":[{"branch":"8.14","label":"v8.14.0","branchLabelMappingKey":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"main","label":"v8.15.0","branchLabelMappingKey":"^v8.15.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/181916","number":181916,"mergeCommit":{"message":"[ES|QL]
Sorting accepts expressions (#181916)\n\n## Summary\r\n\r\n`SORT`
accepts expressions as
of\r\nhttps://github.com/elastic/elasticsearch/pull/107158.\r\n\r\nThis
PR updates the validator and autocompletion engine to account
for\r\nthis improvement.\r\n\r\n### Checklist\r\n\r\n- [
]\r\n[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)\r\nwas
added — looks like docs haven't been added for this feature yet.
I\r\nopened the discussion with ES
team\r\n([ref](https://github.com/elastic/elasticsearch/pull/107158#issuecomment-2080063533))\r\n-
[x] [Unit or
functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere
updated or added to match the most common
scenarios\r\n\r\n---------\r\n\r\nCo-authored-by: Stratoula Kalafateli
<efstratia.kalafateli@elastic.co>\r\nCo-authored-by: Kibana Machine
<42973632+kibanamachine@users.noreply.github.com>","sha":"5ba6a399f23282603011332b997044fc485fb270"}}]}]
BACKPORT-->

Co-authored-by: Drew Tate <drew.tate@elastic.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL ES|QL-ui Impacts ES|QL UI >feature Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.14.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants