Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect SQL generated for Count over Group By (EFCore 2.1) #12351

Closed
mpetito opened this issue Jun 13, 2018 · 5 comments
Closed

Incorrect SQL generated for Count over Group By (EFCore 2.1) #12351

mpetito opened this issue Jun 13, 2018 · 5 comments
Labels
closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported regression type-bug
Milestone

Comments

@mpetito
Copy link

mpetito commented Jun 13, 2018

EFCore 2.1 generates incorrect sql if you perform a Count operation on a queryable which uses the new GroupBy translation. Given a query of the form queryable.GroupBy(...).Count(), EF performs a count within the groups instead of a count of the groups.

Produced SQL which is incorrect:

SELECT COUNT(*)
FROM [Payments] AS [p]
GROUP BY [p].[AccountId]

Expected SQL similar to:

SELECT COUNT(*)
FROM
(SELECT 1 x
 FROM [Payments] AS [p]
 GROUP BY [p].[AccountId]) t

It may seem unusual to construct an EF query in this way. However, this issue surfaced when using GroupBy in a query for a paged list and I believe this to be a relatively common case. In my project the paging algorithm has no knowledge of the underlying query. The query is executed first with Count to get the total count of records, and again with Take/Skip to get the current page of records.

Steps to reproduce

I've created a small project to reproduce the issue: https://github.com/mpetito/efcore-2.1-groupby

Given the following seed data:

db.Payments.AddRange(
    new Payment {AccountId = 1, Value = 10},
    new Payment {AccountId = 2, Value = 11},
    new Payment {AccountId = 2, Value = 12},
    new Payment {AccountId = 3, Value = 13},
    new Payment {AccountId = 3, Value = 14},
    new Payment {AccountId = 3, Value = 15});

This query gives the incorrect count of groups:

var queryableCount = await db.Payments.GroupBy(p => p.AccountId, (accountId, group) => new {Account = accountId, Total = group.Sum(g => g.Value)}).CountAsync();
Console.WriteLine($"Count of queryable payments grouped by AccountId: {queryableCount}");

Count of queryable payments grouped by AccountId: 1

The correct count is 3 when grouping by AccountId. The count of 1 is arrived at because the first group itself has a count of 1, and EF must be taking the first result row for the count (discarding any additional result rows).

Further technical details

EF Core version: 2.1.0
Database Provider: Microsoft.EntityFrameworkCore.SqlServer
Operating system: Windows 10 / SQL Server 2016

@smitpatel
Copy link
Contributor

smitpatel commented Jun 14, 2018

GroupBy should cause pushdown when applying some result operators. Need to update method PrepareSelectExpressionForAggregate

Expected SQL

SELECT COUNT(*)
FROM (
    SELECT [p].[AccountId] AS [Account], Sum([p].[Value] AS [Total]
    FROM [Payments] AS [p]
    GROUP BY [p].[AccountId]) AS [t]

@smitpatel
Copy link
Contributor

@mpetito - As a work-around you can just use distinct instead of group by like this
var query = db.Payments.Select(p => p.AccountId).Distinct().Count()
It is not really good solution if you are reusing the queryable to do counting and paging/grouping but at least gives correct results.

@ajcvickers ajcvickers added this to the 2.1.3 milestone Jun 15, 2018
maumar added a commit that referenced this issue Jun 20, 2018
…re 2.1)

There was two problems here:
- we were not lifting a select expression that had a GroupBy-Aggregate pattern, and another result operator was composed on top,
- we were not propagating RequiresStreamingGroupBy flag to a parent QM, when we lifted a client group by subquery, which could result in the client methods being wiped out if Count/LongCount was composed on top.
maumar added a commit that referenced this issue Jun 20, 2018
…re 2.1)

There was two problems here:
- we were not lifting a select expression that had a GroupBy-Aggregate pattern, and another result operator was composed on top,
- we were not propagating RequiresStreamingGroupBy flag to a parent QM, when we lifted a client group by subquery, which could result in the client methods being wiped out if Count/LongCount was composed on top.
@ajcvickers ajcvickers added Servicing-consider closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. patch-approved and removed Servicing-consider labels Jun 21, 2018
@ajcvickers
Copy link
Member

@maumar This issue is approved for patch and the release\2.1 branch is now open for merging. Please ensure:

  • The appropriate changes get into both the release and dev branches
  • Quirking is included for the release branch only

maumar added a commit that referenced this issue Jun 30, 2018
…re 2.1)

There was two problems here:
- we were not lifting a select expression that had a GroupBy-Aggregate pattern, and another result operator was composed on top,
- we were not propagating RequiresStreamingGroupBy flag to a parent QM, when we lifted a client group by subquery, which could result in the client methods being wiped out if Count/LongCount was composed on top.
maumar added a commit that referenced this issue Jun 30, 2018
…re 2.1)

There was two problems here:
- we were not lifting a select expression that had a GroupBy-Aggregate pattern, and another result operator was composed on top,
- we were not propagating RequiresStreamingGroupBy flag to a parent QM, when we lifted a client group by subquery, which could result in the client methods being wiped out if Count/LongCount was composed on top.
maumar added a commit that referenced this issue Jun 30, 2018
…re 2.1)

There was two problems here:
- we were not lifting a select expression that had a GroupBy-Aggregate pattern, and another result operator was composed on top,
- we were not propagating RequiresStreamingGroupBy flag to a parent QM, when we lifted a client group by subquery, which could result in the client methods being wiped out if Count/LongCount was composed on top.
maumar added a commit that referenced this issue Jun 30, 2018
…re 2.1)

There was two problems here:
- we were not lifting a select expression that had a GroupBy-Aggregate pattern, and another result operator was composed on top,
- we were not propagating RequiresStreamingGroupBy flag to a parent QM, when we lifted a client group by subquery, which could result in the client methods being wiped out if Count/LongCount was composed on top.
maumar added a commit that referenced this issue Jun 30, 2018
…re 2.1)

There was two problems here:
- we were not lifting a select expression that had a GroupBy-Aggregate pattern, and another result operator was composed on top,
- we were not propagating RequiresStreamingGroupBy flag to a parent QM, when we lifted a client group by subquery, which could result in the client methods being wiped out if Count/LongCount was composed on top.
maumar added a commit that referenced this issue Jun 30, 2018
…re 2.1)

There was two problems here:
- we were not lifting a select expression that had a GroupBy-Aggregate pattern, and another result operator was composed on top,
- we were not propagating RequiresStreamingGroupBy flag to a parent QM, when we lifted a client group by subquery, which could result in the client methods being wiped out if Count/LongCount was composed on top.
maumar added a commit that referenced this issue Jun 30, 2018
…re 2.1)

There was two problems here:
- we were not lifting a select expression that had a GroupBy-Aggregate pattern, and another result operator was composed on top,
- we were not propagating RequiresStreamingGroupBy flag to a parent QM, when we lifted a client group by subquery, which could result in the client methods being wiped out if Count/LongCount was composed on top.
maumar added a commit that referenced this issue Jul 1, 2018
…re 2.1)

There was two problems here:
- we were not lifting a select expression that had a GroupBy-Aggregate pattern, and another result operator was composed on top,
- we were not propagating RequiresStreamingGroupBy flag to a parent QM, when we lifted a client group by subquery, which could result in the client methods being wiped out if Count/LongCount was composed on top.
maumar added a commit that referenced this issue Jul 1, 2018
…re 2.1)

There was two problems here:
- we were not lifting a select expression that had a GroupBy-Aggregate pattern, and another result operator was composed on top,
- we were not propagating RequiresStreamingGroupBy flag to a parent QM, when we lifted a client group by subquery, which could result in the client methods being wiped out if Count/LongCount was composed on top.
@HarutyunI
Copy link

HarutyunI commented Dec 13, 2018

This problem still exists in 2.2, but there is another workaround, maybe helpful for others
var query = db.Payments.GroupBy(p => p.AccountId).Select(p => p.Key).Count()

@marcobarbero
Copy link

@HarutyunI thanks for your workaround, it works correctly on 2.2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported regression type-bug
Projects
None yet
Development

No branches or pull requests

6 participants