Add support for `GROUP BY AUTO` aggregation #18390

ebyhr · 2023-07-24T21:22:27Z

Description

This syntax allows omitting column positions or names after GROUP BY.
For instance, `SELECT name, count(1) FROM test GROUP BY AUTO` will be translated to `SELECT name, count(1) FROM test GROUP BY name`.
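
The translation described above can be modeled roughly as follows. This is a toy sketch, not the analyzer code from this PR: `SelectItem` and `expand_group_by_auto` are hypothetical names, and the `has_aggregate` flag is supplied by hand instead of being derived from a parsed expression. The sketch also assumes that a select list containing only aggregates becomes a global aggregation with no GROUP BY clause.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelectItem:
    expression: str      # SQL text of the select expression
    has_aggregate: bool  # True if the expression contains an aggregate call


def expand_group_by_auto(items, table):
    """Build the explicit query that GROUP BY AUTO stands for in this toy model."""
    # Grouping keys are the select expressions that contain no aggregation.
    keys = [item.expression for item in items if not item.has_aggregate]
    select_list = ", ".join(item.expression for item in items)
    query = f"SELECT {select_list} FROM {table}"
    if keys:
        query += " GROUP BY " + ", ".join(keys)
    # Assumption: with no keys, the query is left as a global aggregation.
    return query


items = [SelectItem("name", False), SelectItem("count(1)", True)]
rewritten = expand_group_by_auto(items, "test")
print(rewritten)  # SELECT name, count(1) FROM test GROUP BY name
```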

References in other database/query engines

Release notes

(x) Release notes are required, with the following suggested text:

# General
* Add support for `GROUP BY AUTO` aggregation that implicitly groups all non-aggregated columns. ({issue}`18390`)

electrum · 2023-07-25T16:30:05Z

This is great. I fully support adding this syntax.

martint

This conflicts with the standard syntax for GROUP BY:

 GROUP BY [ <set quantifier> ] <grouping element list>

where <set quantifier> can be ALL or DISTINCT and defaults to ALL if omitted. The quantifier affects the semantics for queries involving grouping sets. Overloading its meaning to indicate how the keys are selected, instead of how the rows in the result are de-duplicated, is confusing and error-prone.

Before we could consider such syntax, we'd need to define the precise semantics and how it interacts and relates to the broader GROUP BY feature.
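
For readers less familiar with the set quantifier, the sketch below illustrates what ALL and DISTINCT control when several grouping elements are combined. It is a simplified model, not Trino code: `rollup` and `combine` are hypothetical helpers, and grouping sets are represented as plain tuples of column names.

```python
from itertools import product


def rollup(*columns):
    """ROLLUP(a, b) expands to the grouping sets (a, b), (a), ()."""
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]


def combine(elements, quantifier="ALL"):
    """Cross-combine grouping elements; DISTINCT drops duplicate grouping sets."""
    combined = []
    for parts in product(*elements):
        keys = []
        for part in parts:
            for column in part:
                if column not in keys:
                    keys.append(column)
        combined.append(tuple(keys))
    if quantifier == "DISTINCT":
        seen = set()
        unique = []
        for keys in combined:
            # Two grouping sets are duplicates if they contain the same keys.
            if frozenset(keys) not in seen:
                seen.add(frozenset(keys))
                unique.append(keys)
        combined = unique
    return combined


# Models GROUP BY [ALL | DISTINCT] ROLLUP(a, b), ROLLUP(a, c)
elements = [rollup("a", "b"), rollup("a", "c")]
all_sets = combine(elements, "ALL")
distinct_sets = combine(elements, "DISTINCT")
print(len(all_sets), len(distinct_sets))  # 9 5
```

With ALL, each of the nine combinations produces its own output rows, including the repeated `(a, b)`, `(a, c)`, and `(a)` sets. With DISTINCT, only the five unique sets remain.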

martint · 2023-07-31T20:31:10Z

In particular, here are some inconsistencies:

SELECT k, count(*) 
FROM (VALUES 1,2,3) t(k) 
GROUP BY ALL

produces a result, but

SELECT k, count(*) 
FROM (VALUES 1,2,3) t(k) 
GROUP BY

fails with 'k' must be an aggregate expression or appear in GROUP BY clause, even though ALL is implied if missing, per the SQL standard.

SELECT k, count(*) 
FROM (VALUES 1,2,3) t(k) 
GROUP BY DISTINCT

fails with groupingElements must not be empty when type is DISTINCT. Presumably, it should also work. But what would be the meaning of such a query?

Without precise semantics, it's hard to tell what the behavior of these queries is after this change:

SELECT k, count(*) 
FROM (VALUES 1,2,3) t(k) 
GROUP BY ALL k

SELECT k, v, count(*) 
FROM (VALUES (1,1),(2,2),(3,3)) t(k, v) 
GROUP BY ALL

SELECT k, v, count(*) 
FROM (VALUES (1,1),(2,2),(3,3)) t(k, v) 
GROUP BY ALL k

SELECT k, v, count(*) 
FROM (VALUES (1,1),(2,2),(3,3)) t(k, v) 
GROUP BY ALL k, v

SELECT count(*) 
FROM (VALUES 1,2,3) t(k) 
GROUP BY ALL ()

ebyhr · 2023-08-01T01:13:51Z

Since GROUP BY ALL without a grouping element list isn't defined in the standard, we don't need to default to ALL in such a case, and the failure is expected to me.
I don't think it should work. GROUP BY DISTINCT without a grouping element list was disallowed even before this change, and in my understanding the standard disallows it as well. The unclear semantics you mentioned imply that the current behavior of throwing an exception is better.
Only the second example should be affected by this PR, because it's the only one that uses GROUP BY ALL without a grouping element list. The other examples should keep their original behavior. This rule is clear to me.

martint · 2023-08-18T00:32:27Z

To be clear, the ALL and DISTINCT qualifiers control whether grouping sets that have the same combination of keys are deduped.

Therefore, the only way this syntax makes sense is if we give meaning to omitting the grouping set specification. Specifically, it would be equivalent to having a single grouping set composed of all the expressions in the group by clause that don’t contain aggregations.

In that case, the qualifier is orthogonal to such a feature. Allowing one but not the other looks arbitrary and introduces cognitive load for a user who has to understand that they are somehow connected even though intuitively they should not be.

Another aspect that complicates issues conceptually is that the GROUP BY operation occurs before the SELECT clause is computed, so it’s a chicken-and-egg problem to determine which columns are grouping keys and which ones are derived. Also, the GROUP BY clause operates on input columns (those coming from the FROM clause) not on those from the SELECT clause. The implication arrow goes the other way: an expression in the SELECT clause is valid if it’s functionally dependent on the input columns used for computing the grouping sets.
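
The direction of the check described above can be sketched as follows: grouping keys are fixed first, and each select expression is then validated against them. This is a simplified model, not the actual analyzer. `check_select` is a hypothetical helper, and functional dependence is approximated by "every referenced column is a grouping key". The error text is the one quoted earlier in this thread.

```python
def check_select(select_items, grouping_keys):
    """Validate select expressions against grouping keys that are already known.

    select_items: list of (expression text, referenced columns, is_aggregate).
    """
    keys = set(grouping_keys)
    for text, columns, is_aggregate in select_items:
        if is_aggregate:
            continue  # aggregates are computed per group and are always valid
        # Approximation: a non-aggregated expression may only use grouping keys.
        if not set(columns) <= keys:
            raise ValueError(
                f"'{text}' must be an aggregate expression or appear in GROUP BY clause"
            )


select_items = [("k", ["k"], False), ("count(*)", [], True)]
check_select(select_items, ["k"])  # valid: k is a grouping key

try:
    check_select(select_items, [])  # no grouping keys: k is rejected
except ValueError as error:
    print(error)
```

Deriving the keys from the select list, as the proposed syntax does, reverses this flow, which is the conceptual difficulty raised in the comment.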

ebyhr · 2024-03-01T01:51:32Z

@martint Thank you for your detailed explanation. Can you suggest an alternative syntax? Or do we not want to add this feature?

ebyhr · 2025-03-24T01:41:05Z

Updated the syntax to GROUP BY IMPLICIT.

martint · 2025-03-26T00:19:59Z

I just had another idea -- instead of IMPLICIT, we could use the term AUTO. It's shorter, easy to type, and conveys the intention clearly.

ebyhr · 2025-03-26T04:50:33Z

I prefer AUTO to IMPLICIT. Updated.

core/trino-grammar/src/main/antlr4/io/trino/grammar/sql/SqlBase.g4

core/trino-main/src/main/java/io/trino/sql/analyzer/StatementAnalyzer.java

core/trino-parser/src/main/java/io/trino/sql/tree/GroupingSets.java

ebyhr · 2025-04-15T01:14:13Z

@martint Gentle reminder.

core/trino-parser/src/test/java/io/trino/sql/parser/TestSqlParser.java

core/trino-main/src/test/java/io/trino/sql/query/TestGroupBy.java

ebyhr · 2025-04-16T23:47:43Z

Addressed comments.

core/trino-main/src/test/java/io/trino/sql/query/TestGroupBy.java

martint

Don't forget to squash the fixup commits.

ebyhr added the syntax-needs-review label Jul 24, 2023

cla-bot bot added the cla-signed label Jul 24, 2023

github-actions bot added the docs label Jul 24, 2023

ebyhr force-pushed the ebi/core-group-by-all branch from fc01df4 to e809697 Compare July 25, 2023 00:47

github-actions bot added the tests:hive and hive labels Jul 25, 2023

ebyhr self-assigned this Jul 25, 2023

ilfrin requested a review from martint July 25, 2023 16:26

ebyhr force-pushed the ebi/core-group-by-all branch 2 times, most recently from 48bb65e to 36bff45 Compare July 31, 2023 02:34

martint requested changes Jul 31, 2023

ebyhr force-pushed the ebi/core-group-by-all branch from 36bff45 to 9153730 Compare August 1, 2023 06:17

ebyhr force-pushed the ebi/core-group-by-all branch from 9153730 to 9882169 Compare August 17, 2023 08:46

ebyhr force-pushed the ebi/core-group-by-all branch from 9882169 to 2d40f4a Compare November 1, 2023 01:53

ebyhr force-pushed the ebi/core-group-by-all branch from 2d40f4a to 7dc90bb Compare March 14, 2024 01:52

ebyhr force-pushed the ebi/core-group-by-all branch from 7dc90bb to e2b781e Compare March 24, 2024 23:22

ebyhr marked this pull request as ready for review April 17, 2024 22:18

findepi removed the tests:hive label Apr 18, 2024

github-actions bot added the stale label May 10, 2024

ebyhr added the stale-ignore label May 10, 2024

trinodb deleted a comment from github-actions bot May 15, 2024

ebyhr force-pushed the ebi/core-group-by-all branch 2 times, most recently from 8890f18 to 0b7ccc1 Compare November 8, 2024 12:02

ebyhr changed the title ~~Add support for GROUP BY ALL aggregation~~ Add support for GROUP BY * aggregation Nov 8, 2024

ebyhr force-pushed the ebi/core-group-by-all branch from 0b7ccc1 to 98f0ff2 Compare November 12, 2024 01:54

ebyhr force-pushed the ebi/core-group-by-all branch 2 times, most recently from db27ca8 to b128dae Compare March 24, 2025 00:44

ebyhr removed the stale-ignore label Mar 24, 2025

github-actions bot removed the stale label Mar 24, 2025

ebyhr force-pushed the ebi/core-group-by-all branch from b128dae to adc8993 Compare March 26, 2025 04:48

ebyhr changed the title ~~Add support for GROUP BY IMPLICIT aggregation~~ Add support for GROUP BY AUTO aggregation Mar 26, 2025

martint reviewed Mar 26, 2025

core/trino-grammar/src/main/antlr4/io/trino/grammar/sql/SqlBase.g4 (outdated)

ebyhr force-pushed the ebi/core-group-by-all branch 3 times, most recently from 8d9a394 to adf8bd3 Compare March 27, 2025 00:40

martint reviewed Mar 27, 2025

core/trino-main/src/main/java/io/trino/sql/analyzer/StatementAnalyzer.java (outdated)

core/trino-parser/src/main/java/io/trino/sql/tree/GroupingSets.java (outdated)

ebyhr force-pushed the ebi/core-group-by-all branch from adf8bd3 to 681b500 Compare March 27, 2025 02:06

ebyhr requested a review from martint March 27, 2025 21:58

martint reviewed Apr 16, 2025

core/trino-parser/src/test/java/io/trino/sql/parser/TestSqlParser.java (outdated)

core/trino-main/src/test/java/io/trino/sql/query/TestGroupBy.java (outdated)

martint reviewed Apr 17, 2025

core/trino-main/src/test/java/io/trino/sql/query/TestGroupBy.java (outdated)

ebyhr force-pushed the ebi/core-group-by-all branch from 015ac49 to 634a000 Compare April 22, 2025 23:30

ebyhr requested a review from martint April 23, 2025 00:03

martint approved these changes Apr 23, 2025

Add support for GROUP BY AUTO aggregation

826f3ac

ebyhr force-pushed the ebi/core-group-by-all branch from 634a000 to 826f3ac Compare April 23, 2025 04:20

ebyhr merged commit 598211e into trinodb:master Apr 23, 2025
5 of 15 checks passed

ebyhr deleted the ebi/core-group-by-all branch April 23, 2025 04:20

ebyhr removed the syntax-needs-review label Apr 23, 2025

github-actions bot added this to the 475 milestone Apr 23, 2025

ebyhr mentioned this pull request Sep 16, 2025

Document GROUP BY AUTO syntax #26646

Merged

4 participants
