Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query: GroupBy scalar subquery generates invalid SQL #19027

Closed
smitpatel opened this issue Nov 22, 2019 · 3 comments · Fixed by #24676
Closed

Query: GroupBy scalar subquery generates invalid SQL #19027

smitpatel opened this issue Nov 22, 2019 · 3 comments · Fixed by #24676
Assignees
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. punted-for-5.0 type-bug
Milestone

Comments

@smitpatel
Copy link
Contributor

        [ConditionalTheory]
        [MemberData(nameof(IsAsyncData))]
        public virtual Task GroupBy_Subquery(bool async)
        {
            return AssertQuery(
                async,
                ss => ss.Set<Order>()
                    .GroupBy(o => ss.Set<Customer>()
                        .Where(c => c.CustomerID == o.CustomerID)
                        .Select(c => c.ContactName)
                        .FirstOrDefault())
                    .Select(
                        g => new
                        {
                            g.Key,
                            Count = g.Count()
                        }),
                elementSorter: e => e.Key);
        }

Generates SQL

SELECT (
    SELECT TOP(1) [c].[ContactName]
    FROM [Customers] AS [c]
    WHERE [c].[CustomerID] = [o].[CustomerID]) AS [Key], COUNT(*) AS [Count]
FROM [Orders] AS [o]
GROUP BY (
    SELECT TOP(1) [c].[ContactName]
    FROM [Customers] AS [c]
    WHERE [c].[CustomerID] = [o].[CustomerID])

Throws

Message: 
    Microsoft.Data.SqlClient.SqlException : Cannot use an aggregate or a subquery in an expression used for the group by list of a GROUP BY clause.
  Stack Trace: 
    SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
    SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
    TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
    TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
    SqlDataReader.TryConsumeMetaData()
    SqlDataReader.get_MetaData()
    SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption, Boolean shouldCacheForAlwaysEncrypted)
    SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean isAsync, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest)
    SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry, String method)
    SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method)
    SqlCommand.ExecuteReader(CommandBehavior behavior)
    SqlCommand.ExecuteDbDataReader(CommandBehavior behavior)
    DbCommand.ExecuteReader()
    RelationalCommand.ExecuteReader(RelationalCommandParameterObject parameterObject) line 404
    Enumerator.InitializeReader(DbContext _, Boolean result) line 176
    ExecutionStrategy.ExecuteImplementation[TState,TResult](Func`3 operation, Func`3 verifySucceeded, TState state) line 173
    ExecutionStrategy.Execute[TState,TResult](TState state, Func`3 operation, Func`3 verifySucceeded) line 159
    Enumerator.MoveNext() line 125
    LargeArrayBuilder`1.AddRange(IEnumerable`1 items)
    EnumerableHelpers.ToArray[T](IEnumerable`1 source)
    Enumerable.ToArray[TSource](IEnumerable`1 source)
    QueryAsserter`1.AssertQuery[TResult](Func`2 actualQuery, Func`2 expectedQuery, Func`2 elementSorter, Action`2 elementAsserter, Boolean assertOrder, Int32 entryCount, Boolean async, String testMethodName) line 87
    --- End of stack trace from previous location ---

Passes for InMemory/Sqlite

@smitpatel
Copy link
Contributor Author

smitpatel commented Nov 23, 2019

Note to implementer: Do processing similar to how we deal with constants/parameters.
We ignore GroupBy in scope, instead just work with it on client side. We need to pushdown.

@itan-mcp
Copy link

itan-mcp commented Dec 17, 2019

In a meanwhile, until we got a fix, there is a temporary solution. I built extension method ApplySubQuery to include subquery with APPLY operators, and it also solve case GROUP BY + SubQuery.

Using that approach, I'm finally able to use the columns from subquery in GroupBy clause, which is one of the "fundamental" issue of EF Core.

EF Core dupplicates the subquery 'statement' into another clauses as WHERE, which can also have negative affects on performance or doesn't work at all (in GroupBy case).

And it's happening constantly with EF Core. Thats why I said that is a fundamental issue.

Standard Linq, non-working

await _dbContext
		.Patients
		.Select(x => new
		{
			x.ClinicCreatedId,
			ConfirmationDate = _dbContext.Appointments
										.Where(y => y.PatientId == x.Id)
										.OrderByDescending(y => y.ConfirmationDate)
										.Select(x => x.ConfirmationDate)
										.FirstOrDefault()
		})
		.GroupBy(x => new { x.ClinicCreatedId, x.ConfirmationDate })
		.Select(x => new
		{
			x.Key.ClinicCreatedId,
			x.Key.ConfirmationDate,
			PatientCount = x.Count()
		})
		.ToListAsync();

it's produced invalid SQL

SELECT [p].[ClinicCreatedId], (
    SELECT TOP(1) [a].[ConfirmationDate]
    FROM [dbo].[Appointments] AS [a]
    WHERE [a].[PatientId] = [p].[Id]
    ORDER BY [a].[ConfirmationDate] DESC) AS [ConfirmationDate], COUNT(*) AS [PatientCount]
FROM [dbo].[Patients] AS [p]
GROUP BY [p].[ClinicCreatedId], (
    SELECT TOP(1) [a].[ConfirmationDate]
    FROM [dbo].[Appointments] AS [a]
    WHERE [a].[PatientId] = [p].[Id]
    ORDER BY [a].[ConfirmationDate] DESC)

Here the working example:

 await _dbContext
                                    .Patients
                                    .ApplySubQuery(x => _dbContext.Appointments.Where(y => y.PatientId == x.Id).OrderByDescending(y => y.ConfirmationDate).Take(1))
                                    .Select(x => new
                                    {
                                        x.Outer.ClinicCreatedId,
                                        x.Inner.ConfirmationDate
                                    })
                                    .GroupBy(x => new { x.ClinicCreatedId, x.ConfirmationDate })
                                    .Select(x => new
                                    {
                                        x.Key.ClinicCreatedId,
                                        x.Key.ConfirmationDate,
                                        PatientCount = x.Count()
                                    })
                                    .ToListAsync();
SELECT [p].[ClinicCreatedId], [t0].[ConfirmationDate], COUNT(*) AS [PatientCount]
FROM [dbo].[Patients] AS [p]
CROSS APPLY (
    SELECT [t].*
    FROM (
        SELECT TOP(1) [a].*
        FROM [dbo].[Appointments] AS [a]
        WHERE [a].[PatientId] = [p].[Id]
        ORDER BY [a].[ConfirmationDate] DESC
    ) AS [t]
    WHERE CAST(1 AS bit) = CAST(1 AS bit)
) AS [t0]
GROUP BY [p].[ClinicCreatedId], [t0].[ConfirmationDate]

@lauxjpn
Copy link
Contributor

lauxjpn commented Oct 15, 2020

Also generates the following error in MySQL with SQL_MODE containing ONLY_FULL_GROUP_BY:

MySqlConnector.MySqlException (0x80004005): Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'northwind.o.CustomerID' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
SELECT (
    SELECT `c`.`ContactName`
    FROM `Customers` AS `c`
    WHERE `c`.`CustomerID` = `o`.`CustomerID`
    LIMIT 1) AS `Key`, COUNT(*) AS `Count`
FROM `Orders` AS `o`
GROUP BY (
    SELECT `c`.`ContactName`
    FROM `Customers` AS `c`
    WHERE `c`.`CustomerID` = `o`.`CustomerID`
    LIMIT 1)

Works fine however, if ONLY_FULL_GROUP_BY is not set.

@ajcvickers ajcvickers modified the milestones: Backlog, 6.0.0 Nov 5, 2020
smitpatel added a commit that referenced this issue Apr 17, 2021
So that we can reference them using column and generate valid SQL

Resolves #19027
Resolves #23432
@smitpatel smitpatel added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Apr 17, 2021
smitpatel added a commit that referenced this issue Apr 20, 2021
So that we can reference them using column and generate valid SQL

Resolves #19027
Resolves #23432
@ghost ghost closed this as completed in #24676 Apr 20, 2021
ghost pushed a commit that referenced this issue Apr 20, 2021
)

So that we can reference them using column and generate valid SQL

Resolves #19027
Resolves #23432
@ajcvickers ajcvickers removed this from the 6.0.0 milestone Apr 26, 2021
@ajcvickers ajcvickers added this to the 6.0.0-preview5 milestone Apr 26, 2021
@ajcvickers ajcvickers modified the milestones: 6.0.0-preview5, 6.0.0 Nov 8, 2021
lauxjpn added a commit to lauxjpn/Pomelo.EntityFrameworkCore.MySql that referenced this issue Nov 9, 2021
lauxjpn added a commit to lauxjpn/Pomelo.EntityFrameworkCore.MySql that referenced this issue Nov 9, 2021
lauxjpn added a commit to PomeloFoundation/Pomelo.EntityFrameworkCore.MySql that referenced this issue Nov 9, 2021
* Remove remnants of workaround for dotnet/efcore#25127.

* Remove workaround for dotnet/efcore#24819.

* Remove workaround for dotnet/efcore#24806.

* Remove workaround for dotnet/efcore#19027.
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. punted-for-5.0 type-bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants