Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concatenation in projection throws "System.InvalidOperationException: Unable to translate a collection subquery in a projection" #33410

Closed
PavelFischerCoupa opened this issue Mar 27, 2024 · 8 comments

Comments

@PavelFischerCoupa
Copy link

Hello,

after updating EfCore from 7.016 to 8.0.3 we receive now an exception if we use a static collection in projection like

    DbContext.Select(x => new
    {
         x.Id,
         Names = new[] { x.ReferencedEntity.SomeValue }.Concat(x.Children.Select(c => c.SomeValue)).ToList()
    })

Exception:

Unhandled exception. System.InvalidOperationException: Unable to translate a collection subquery in a projection since either parent or the subquery doesn't project necessary information required to uniquely identify it and correctly generate results on the client side. This can happen when trying to correlate 
on keyless entity type. This can also happen for some cases of projection before 'Distinct' or some shapes of grouping key in case of 'GroupBy'. These should either contain all key properties of the entity that the operation is applied on, or only contain simple property access expressions.
   at Microsoft.EntityFrameworkCore.Query.SqlExpressions.SelectExpression.ApplyProjection(Expression shaperExpression, ResultCardinality resultCardinality, QuerySplittingBehavior querySplittingBehavior)
   at Microsoft.EntityFrameworkCore.Query.Internal.SelectExpressionProjectionApplyingExpressionVisitor.VisitExtension(Expression extensionExpression)
   at Microsoft.EntityFrameworkCore.Query.RelationalQueryTranslationPostprocessor.Process(Expression query)

Include provider and version information

EF Core version: 8.0.3
Database provider:icrosoft.EntityFrameworkCore.SqlServer
Target framework: .NET 8.0
Operating system: Win 10
IDE: not related to IDE

EfCoreIssue.zip

@PavelFischerCoupa
Copy link
Author

maybe it is related to #29718

@roji
Copy link
Member

roji commented Mar 27, 2024

Confirmed regression from 7.0 to 8.0.

Minimal repro
await using var context = new BlogContext();
await context.Database.EnsureDeletedAsync();
await context.Database.EnsureCreatedAsync();

await context.Blogs.Select(b => new
{
    b.Id,
    Names = new[] { b.Name }.Concat(b.Posts.Select(c => c.Title)).ToList()
}).ToListAsync();

public class BlogContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            .UseSqlServer("Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0;Encrypt=false")
            .LogTo(Console.WriteLine, LogLevel.Information)
            .EnableSensitiveDataLogging();
}

public class Blog
{
    public int Id { get; set; }
    public string Name { get; set; }

    public List<Post> Posts { get; set; }
}

public class Post
{
    public int Id { get; set; }
    public string Title { get; set; }

    public Blog Blog { get; set; }
}

@maumar
Copy link
Contributor

maumar commented Mar 30, 2024

problem here is that now for inline array we create a proper query root, and the resulting expression is now subject to all the extra calculations we do for regular select expressions.

First issue is that inline query root is missing identifier (_ord column can be used for that).

Second issue is that in SelectExpression.ApplySetOperation when we generate outer projection based on two sources, type mapping of the resulting column is null. We intentionally generate InlineQueryRoot column with null type mapping, so that it can be inferred later (in ProcessTypeMappings after translation has been completed). However, that process only applies type mappings to the columns of the query roots, and not propagate it further up. We can either make the visitor "smarter" or during ApplySetOperation (and other similar places?), generate the outer column with the proper TypeMapping right away.

Third problem is that for Concat operation we can't properly reason about identifiers. Note that something like that:

Names = b.Posts.Select(c => c.Title).Concat(b.Posts2.Select(c => c.Title)).ToList()

fails currently for the same reason - both sources have different identifiers, so we don't know how to calculate the resulting identifier. However, if we fix issues 1-3, we can make this work for Union. Union makes the results distinct, so the entire projection (i.e. Title column) becomes the identifier. To fix this for Concat we would have to redesign the whole logic around identifiers, or relax the warning and then deal with occasional incorrect bucketing of the results (data corruption)

@roji

@roji
Copy link
Member

roji commented Mar 30, 2024

First issue is that inline query root is missing identifier (_ord column can be used for that).

You're right... Opened #33436 to track that separately, will submit a PR shortly.

Second issue is that in SelectExpression.ApplySetOperation when we generate outer projection based on two sources, type mapping of the resulting column is null.

Specifically looking at the query above:

new[] { b.Name }.Concat(b.Posts.Select(c => c.Title)).ToList()

I realized that the type mapping for the left side (VALUES) shouldn't be null; we should be bubbling it up from b.Name. Opened #33438 to track that - will submit a PR shortly. However, if the inline collection really contains no columns (i.e. no type mappings to infer), then we indeed still have a problem. We maybe need to think about this a bit - I'm not sure why we'd insist on having a non-null type mapping at the level of the set operation (I might have the wrong picture) - let's discuss.

BTW just noting that InlineQueryRootExpression is the pre-translation node - the translation for that is ValuesExpression (so it's the SelectExpression over this that's missing the identifier). And indeed, when do that translation we currently don't set an identifier.

Third problem is that for Concat operation we can't properly reason about identifiers. [...]

Maybe we should have a good think about the identifier question... For example, specifically for the query you quote above:

Names = b.Posts.Select(c => c.Title).Concat(b.Posts2.Select(c => c.Title)).ToList()

Intuitively, it doesn't seem to me like there should be a problem here with Concat any more than there is without that Concat (i.e. just with b.Posts.Select(c => c.Title) - maybe let's discuss this too.

@roji roji changed the title Concatination in projection throws "System.InvalidOperationException: Unable to translate a collection subquery in a projection" Concatenation in projection throws "System.InvalidOperationException: Unable to translate a collection subquery in a projection" Mar 30, 2024
@maumar
Copy link
Contributor

maumar commented Apr 6, 2024

with #33436 and #33438, this scenario get unblocked somewhat - should now work for Union case. Since Union applies distinct, the resulting identifier can be the output of the Union. In case of concat we can't do the same trick.

In principle, we shouldn't need identifiers here, because it's the only collection in the query, so we can try to improve that. Also, maybe there is a way to create an identifier for Concat? (like a composite of identifers from both sides? they should be unique. But that could be a significant rework of identifier logic

@roji
Copy link
Member

roji commented Apr 6, 2024

In principle, we shouldn't need identifiers here [...]

Yeah, I think that's important - there's indeed no related dependent collection, only a single, flat collection - that's seems to be a sort of bug. Maybe open an issue to track that?

Also, maybe there is a way to create an identifier for Concat?

Yeah, identifiers is something we probably want to think about a bit more globally, including cases like non-distinct set operations (do we have an issue for that too?). We also probably want to think about identifiers in the context of #29171, i.e. removing the orderings when joining (I'm still not sure of the exact interaction with identifiers).

@maumar
Copy link
Contributor

maumar commented Apr 6, 2024

filed #33485

@maumar
Copy link
Contributor

maumar commented May 24, 2024

closing this in favor of #33485 (and the two other issues that already have been fixed).

@maumar maumar closed this as completed May 24, 2024
@maumar maumar closed this as not planned Won't fix, can't repro, duplicate, stale May 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants