Skip to content

Fix wrong results bug with count over mixed aggregation#23013

Merged
rschlussel merged 1 commit intoprestodb:masterfrom
rschlussel:fix-count-wrong-results
Jun 18, 2024
Merged

Fix wrong results bug with count over mixed aggregation#23013
rschlussel merged 1 commit intoprestodb:masterfrom
rschlussel:fix-count-wrong-results

Conversation

@rschlussel
Copy link
Contributor

@rschlussel rschlussel commented Jun 14, 2024

Description

Fix a wrong results bug with count over an aggregation that had a mix of global and non-global grouping sets.

We were not checking for single global aggregations correctly in QueryCardinalityUtil, so we would return that the plan was scalar if there were any empty grouping sets rather than if the empty grouping set was the only grouping set. we have now fixed this to return that if all of the grouping sets are global, then the cardinality will be the number of grouping sets, and otherwise it is at least the number of global grouping sets.

This fixes queries like the following:

SELECT COUNT(*) FROM (SELECT count(*) FROM tpch.sf1.nation GROUP BY GROUPING SETS (nationkey, ()));

previously we would incorrectly return 1. And now we return 26.

This change may also fix bugs with correlated subqueries, as those also use the isScalar() utility function.

Motivation and Context

Fixes #22977

Impact

Fixes wrong results for queries with counts over mixed global and grouped aggregations .e.g. SELECT COUNT(*) FROM (SELECT count(*) FROM tpch.sf1.nation GROUP BY GROUPING SETS (nationkey, ()));

Test Plan

added unit tests

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* Fix a bug where count queries on top of aggregations with mixed global and grouped grouping sets could return wrong results.

Fix a wrong results bug with count over an aggregation that had a mix of
global and non-global grouping sets.

We were not checking for single global aggregations correctly in
QueryCardinalityUtil, so we would return that the plan was scalar if
there were *any* empty grouping sets rather than if the empty grouping
set was the only grouping set.  we have now fixed this to return that
if all of the grouping sets are global, then the cardinality will be the
number of grouping sets, and otherwise it is at least the number of
global grouping sets.

This fixes queries like the following:
SELECT COUNT(*) FROM (SELECT count(*) FROM tpch.sf1.nation GROUP BY GROUPING SETS (nationkey, ()));

previously we would incorrectly return 1. And now we return 26.

This change may also fix bugs with correlated subqueries, as those also use
the isScalar() utility function.
@rschlussel rschlussel force-pushed the fix-count-wrong-results branch from 6370057 to 7c925ce Compare June 14, 2024 18:03
@rschlussel rschlussel marked this pull request as ready for review June 17, 2024 17:58
@rschlussel rschlussel requested review from a team, feilong-liu and jaystarshot as code owners June 17, 2024 17:58
@rschlussel rschlussel requested a review from presto-oss June 17, 2024 17:58
@rschlussel rschlussel merged commit 7727beb into prestodb:master Jun 18, 2024
@tdcmeehan tdcmeehan mentioned this pull request Aug 23, 2024
34 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Wrong results for count (and probably correlated subqueries) over empty grouping sets

2 participants