Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes generation of broken SQL on multiple skip/take calls #23670

Closed
wants to merge 1 commit into from
Closed

Fixes generation of broken SQL on multiple skip/take calls #23670

wants to merge 1 commit into from

Conversation

bart-degreed
Copy link

@bart-degreed bart-degreed commented Dec 12, 2020

I've spent some time investigating how to fix #21026, because our users keep hitting this bug over and over again.
To make analysis easier, I've simplified the repoduction scenario a bit.

The next query:

var query = dbContext.Blogs
    .OrderBy(blog => blog.Id)
    .Skip(2).Take(3) // <--- causes crash when BOTH lines are uncommented
    .Select(blog => new Blog
    {
        Posts = blog.Posts
            .OrderBy(post => post.Id)
            .Skip(1).Take(5) // <--- causes crash when BOTH lines are uncommented
            .Select(post => new Post
            {
                Author = post.Author
            })
            .ToList()
    });

produces the following when using SQL Server:

DECLARE @__p_0 int = 2;
DECLARE @__p_1 int = 3;

SELECT [t].[Id], [t1].[Id], [t1].[LastName], [t1].[Id0]
FROM (
    SELECT [b].[Id]
    FROM [Blogs] AS [b]
    ORDER BY [b].[Id]
    OFFSET @__p_0 ROWS FETCH NEXT @__p_1 ROWS ONLY
) AS [t]
OUTER APPLY (
    SELECT [a].[Id], [a].[LastName], [t].[Id] AS [Id0]    -- [t].[Id] should be: [t0].[Id]
    FROM (
        SELECT [p].[Id], [p].[AuthorId], [p].[BlogId], [p].[Title]
        FROM [Post] AS [p]
        WHERE [t].[Id] = [p].[BlogId]
        ORDER BY [p].[Id]
        OFFSET 1 ROWS FETCH NEXT 5 ROWS ONLY
    ) AS [t0]
    LEFT JOIN [Author] AS [a] ON [t].[AuthorId] = [a].[Id]    -- [t].[AuthorId] should be: [t0].[AuthorId]
) AS [t1]
ORDER BY [t].[Id], [t1].[Id0], [t1].[Id]

This fails with the below error when sent to SQL Server LocalDb:

Invalid column name 'AuthorId'.

Looking at the EF Core source code, I found that SelectExpression.PushdownIntoSubquery always uses the hardcoded alias 't' (here), which is called multiple times. This makes it impossible for the LEFT JOIN generator to distinguish between [t] and what is later renamed to [t0] by TableAliasUniquifyingExpressionVisitor.

I believe I've fixed this bug by assigning a unique alias each time SelectExpression.PushdownIntoSubquery executes.
The downside is that this changes the generated names in SQL, breaking lots of other tests (queries are built from the inside out).

When applying the changes from this PR, the next SQL is generated, which looks correct to me:

@__p_0='2'
@__p_1='3'

SELECT [t1].[Id], [t00].[Id], [t00].[LastName], [t00].[Id0]
FROM (
    SELECT [b].[Id]
    FROM [Blogs] AS [b]
    ORDER BY [b].[Id]
    OFFSET @__p_0 ROWS FETCH NEXT @__p_1 ROWS ONLY
) AS [t1]
OUTER APPLY (
    SELECT [a].[Id], [a].[LastName], [t0].[Id] AS [Id0]
    FROM (
        SELECT [p].[Id], [p].[AuthorId]
        FROM [Post21026] AS [p]
        WHERE [t1].[Id] = [p].[Blog21026Id]
        ORDER BY [p].[Id]
        OFFSET 1 ROWS FETCH NEXT 5 ROWS ONLY
    ) AS [t0]
    LEFT JOIN [Author21026] AS [a] ON [t0].[AuthorId] = [a].[Id]
) AS [t00]
ORDER BY [t1].[Id], [t00].[Id0], [t00].[Id]

An interesting side effect of my change is that the deepest SELECT no longer selects BlogId and Title, which were not needed anyway.

I also noticed my test succeeds when run in isolation, but fails when running all tests in the file. The alias numbers change. I'm probably generating them at the wrong scope.

As I'm fairly new to this codebase, I'm sure there must a better way to solve this. Any pointers are highly appreciated! I'll add doc-comments etc. once I know I'm on the right track.

Closes #21026.

@dnfadmin
Copy link

dnfadmin commented Dec 12, 2020

CLA assistant check
All CLA requirements met.

@bart-degreed bart-degreed changed the title Fixes #21026. Fixes generation of broken SQL on multiple skip/take calls Dec 12, 2020
@bart-degreed
Copy link
Author

@ajcvickers Can someone check if my approach is on the right track?

@smitpatel
Copy link
Contributor

We don't intend to fix aliasing issue this way. See #17337

@ajcvickers
Copy link
Contributor

@bart-degreed A couple of things about this. First, the scope of the fix here is too great to consider patching for 5.0.x. So any PR for this issue should be made against main for 6.0.

Second, there are some design notes on the issue: #17337 (comment). Please refer to these for guidance on how this issue should be addressed. However, it's worth noting that this is not a trivial change to make and likely requires significant experience in the query pipeline to fix, so it may not be the best choice for a community contribution.

@bart-degreed
Copy link
Author

Thanks for linking, I now better understand what is going on. After reading the design, it makes sense to me that this is not going to be a trivial change that can be delivered as a patch. I'm glad though, to see there's activity on getting this addressed, and that we're not the only ones suffering from this bug.

I believe I don't have sufficient understanding of the internals at the moment to be productive on this, so I'm closing this PR. Any links to documentation on internals would be welcome. Once a fix is implemented, I'd be happy to verify it against our project (we have about 1300 tests).

@bart-degreed bart-degreed deleted the fix-21026 branch December 16, 2020 09:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants