Fixes generation of broken SQL on multiple skip/take calls #23670

bart-degreed · 2020-12-12T22:26:32Z

I've spent some time investigating how to fix #21026, because our users keep hitting this bug over and over again.
To make analysis easier, I've simplified the repoduction scenario a bit.

The next query:

var query = dbContext.Blogs
    .OrderBy(blog => blog.Id)
    .Skip(2).Take(3) // <--- causes crash when BOTH lines are uncommented
    .Select(blog => new Blog
    {
        Posts = blog.Posts
            .OrderBy(post => post.Id)
            .Skip(1).Take(5) // <--- causes crash when BOTH lines are uncommented
            .Select(post => new Post
            {
                Author = post.Author
            })
            .ToList()
    });

produces the following when using SQL Server:

DECLARE @__p_0 int = 2;
DECLARE @__p_1 int = 3;

SELECT [t].[Id], [t1].[Id], [t1].[LastName], [t1].[Id0]
FROM (
    SELECT [b].[Id]
    FROM [Blogs] AS [b]
    ORDER BY [b].[Id]
    OFFSET @__p_0 ROWS FETCH NEXT @__p_1 ROWS ONLY
) AS [t]
OUTER APPLY (
    SELECT [a].[Id], [a].[LastName], [t].[Id] AS [Id0]    -- [t].[Id] should be: [t0].[Id]
    FROM (
        SELECT [p].[Id], [p].[AuthorId], [p].[BlogId], [p].[Title]
        FROM [Post] AS [p]
        WHERE [t].[Id] = [p].[BlogId]
        ORDER BY [p].[Id]
        OFFSET 1 ROWS FETCH NEXT 5 ROWS ONLY
    ) AS [t0]
    LEFT JOIN [Author] AS [a] ON [t].[AuthorId] = [a].[Id]    -- [t].[AuthorId] should be: [t0].[AuthorId]
) AS [t1]
ORDER BY [t].[Id], [t1].[Id0], [t1].[Id]

This fails with the below error when sent to SQL Server LocalDb:

Invalid column name 'AuthorId'.

Looking at the EF Core source code, I found that SelectExpression.PushdownIntoSubquery always uses the hardcoded alias 't' (here), which is called multiple times. This makes it impossible for the LEFT JOIN generator to distinguish between [t] and what is later renamed to [t0] by TableAliasUniquifyingExpressionVisitor.

I believe I've fixed this bug by assigning a unique alias each time SelectExpression.PushdownIntoSubquery executes.
The downside is that this changes the generated names in SQL, breaking lots of other tests (queries are built from the inside out).

When applying the changes from this PR, the next SQL is generated, which looks correct to me:

@__p_0='2'
@__p_1='3'

SELECT [t1].[Id], [t00].[Id], [t00].[LastName], [t00].[Id0]
FROM (
    SELECT [b].[Id]
    FROM [Blogs] AS [b]
    ORDER BY [b].[Id]
    OFFSET @__p_0 ROWS FETCH NEXT @__p_1 ROWS ONLY
) AS [t1]
OUTER APPLY (
    SELECT [a].[Id], [a].[LastName], [t0].[Id] AS [Id0]
    FROM (
        SELECT [p].[Id], [p].[AuthorId]
        FROM [Post21026] AS [p]
        WHERE [t1].[Id] = [p].[Blog21026Id]
        ORDER BY [p].[Id]
        OFFSET 1 ROWS FETCH NEXT 5 ROWS ONLY
    ) AS [t0]
    LEFT JOIN [Author21026] AS [a] ON [t0].[AuthorId] = [a].[Id]
) AS [t00]
ORDER BY [t1].[Id], [t00].[Id0], [t00].[Id]

An interesting side effect of my change is that the deepest SELECT no longer selects BlogId and Title, which were not needed anyway.

I also noticed my test succeeds when run in isolation, but fails when running all tests in the file. The alias numbers change. I'm probably generating them at the wrong scope.

As I'm fairly new to this codebase, I'm sure there must a better way to solve this. Any pointers are highly appreciated! I'll add doc-comments etc. once I know I'm on the right track.

Closes #21026.

dnfadmin · 2020-12-12T22:26:49Z

All CLA requirements met.

bart-degreed · 2020-12-12T22:47:42Z

@ajcvickers Can someone check if my approach is on the right track?

smitpatel · 2020-12-14T17:04:54Z

We don't intend to fix aliasing issue this way. See #17337

ajcvickers · 2020-12-15T20:00:56Z

@bart-degreed A couple of things about this. First, the scope of the fix here is too great to consider patching for 5.0.x. So any PR for this issue should be made against main for 6.0.

Second, there are some design notes on the issue: #17337 (comment). Please refer to these for guidance on how this issue should be addressed. However, it's worth noting that this is not a trivial change to make and likely requires significant experience in the query pipeline to fix, so it may not be the best choice for a community contribution.

bart-degreed · 2020-12-16T09:21:37Z

Thanks for linking, I now better understand what is going on. After reading the design, it makes sense to me that this is not going to be a trivial change that can be delivered as a patch. I'm glad though, to see there's activity on getting this addressed, and that we're not the only ones suffering from this bug.

I believe I don't have sufficient understanding of the internals at the moment to be productive on this, so I'm closing this PR. Any links to documentation on internals would be welcome. Once a fix is implemented, I'd be happy to verify it against our project (we have about 1300 tests).

Fixes #21026.

9e4c7be

bart-degreed changed the title ~~Fixes #21026.~~ Fixes generation of broken SQL on multiple skip/take calls Dec 12, 2020

bart-degreed closed this Dec 16, 2020

bart-degreed deleted the fix-21026 branch December 16, 2020 09:21

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fixes generation of broken SQL on multiple skip/take calls #23670

Fixes generation of broken SQL on multiple skip/take calls #23670

bart-degreed commented Dec 12, 2020 •

edited

Loading

dnfadmin commented Dec 12, 2020 •

edited

Loading

bart-degreed commented Dec 12, 2020

smitpatel commented Dec 14, 2020

ajcvickers commented Dec 15, 2020

bart-degreed commented Dec 16, 2020

Fixes generation of broken SQL on multiple skip/take calls #23670

Fixes generation of broken SQL on multiple skip/take calls #23670

Conversation

bart-degreed commented Dec 12, 2020 • edited Loading

dnfadmin commented Dec 12, 2020 • edited Loading

bart-degreed commented Dec 12, 2020

smitpatel commented Dec 14, 2020

ajcvickers commented Dec 15, 2020

bart-degreed commented Dec 16, 2020

bart-degreed commented Dec 12, 2020 •

edited

Loading

dnfadmin commented Dec 12, 2020 •

edited

Loading