Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Condition's order will cause different result when using .Contains() to search string cross varchar and nvarchar columns #29646

Closed
ian90911 opened this issue Nov 22, 2022 · 5 comments · Fixed by #29650
Assignees
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported Servicing-approved type-bug
Milestone

Comments

@ian90911
Copy link

When I search a string cross varchar and nvarchar columns, my condition's appearing order will generate different SQL parameter type, and get different result.

Here is my reproduce REPO:
https://github.com/ian90911/TestEfContainsBug/blob/master/TestEfContains/Program.cs

EF Core version:
Database provider: Microsoft.EntityFrameworkCore.SqlServer 6.0.9
Target framework: .NET 6.0
Operating system:
IDE: Visual Studio 2022 17.2.5

Consider a table have varchar and nvarchar column:

CREATE TABLE [dbo].[test](
	[stockId] [varchar](6) NOT NULL,
	[stockName] [nvarchar](48) NOT NULL,
	CONSTRAINT [pk_test] PRIMARY KEY CLUSTERED ([stockId] ASC)
)

Case 1 : When query's condition in Where use varchar column first, it will get nothing:

var query1 = await db.Tests.Where(x => x.StockId == queryName || x.StockName.Contains(queryName)).FirstOrDefaultAsync();
//result is null
      Generated query execution expression: 
      'queryContext => ShapedQueryCompilingExpressionVisitor.SingleOrDefaultAsync<Test>(
          asyncEnumerable: new SingleQueryingEnumerable<Test>(
              (RelationalQueryContext)queryContext, 
              RelationalCommandCache.SelectExpression(
                  Projection Mapping:
                      EmptyProjectionMember -> Dictionary<IProperty, int> { [Property: Test.StockId (string) Required PK AfterSave:Throw MaxLength(6) Ansi, 0], [Property: Test.StockName (string) Required MaxLength(48), 1] }
                  SELECT TOP(1) t.stockId, t.stockName
                  FROM test AS t
                  WHERE (t.stockId == @__queryName_0) || ((@__queryName_0 LIKE N'') || (CHARINDEX(@__queryName_0, t.stockName) > 0))), 
              Func<QueryContext, DbDataReader, ResultContext, SingleQueryResultCoordinator, Test>, 
              TestEfContains.MyDbContext, 
              False, 
              False, 
              True
          ), 
          cancellationToken: queryContext.CancellationToken)'

Case 2 : When use nvarchar column first, it will get expected result :

var query2 = await db.Tests.Where(x => x.StockName.Contains(queryName) || x.StockId == queryName).FirstOrDefaultAsync();
//print data
      Generated query execution expression: 
      'queryContext => ShapedQueryCompilingExpressionVisitor.SingleOrDefaultAsync<Test>(
          asyncEnumerable: new SingleQueryingEnumerable<Test>(
              (RelationalQueryContext)queryContext, 
              RelationalCommandCache.SelectExpression(
                  Projection Mapping:
                      EmptyProjectionMember -> Dictionary<IProperty, int> { [Property: Test.StockId (string) Required PK AfterSave:Throw MaxLength(6) Ansi, 0], [Property: Test.StockName (string) Required MaxLength(48), 1] }
                  SELECT TOP(1) t.stockId, t.stockName
                  FROM test AS t
                  WHERE ((@__queryName_0 LIKE N'') || (CHARINDEX(@__queryName_0, t.stockName) > 0)) || (t.stockId == @__queryName_0)), 
              Func<QueryContext, DbDataReader, ResultContext, SingleQueryResultCoordinator, Test>, 
              TestEfContains.MyDbContext, 
              False, 
              False, 
              True
          ), 
          cancellationToken: queryContext.CancellationToken)'

Case 3 : use EF.Functions.Like()

var query3 = await db.Tests.Where(x => x.StockId == queryName || EF.Functions.Like(x.StockName, $"%{queryName}%")).FirstOrDefaultAsync();
      Generated query execution expression: 
      'queryContext => ShapedQueryCompilingExpressionVisitor.SingleOrDefaultAsync<Test>(
          asyncEnumerable: new SingleQueryingEnumerable<Test>(
              (RelationalQueryContext)queryContext, 
              RelationalCommandCache.SelectExpression(
                  Projection Mapping:
                      EmptyProjectionMember -> Dictionary<IProperty, int> { [Property: Test.StockId (string) Required PK AfterSave:Throw MaxLength(6) Ansi, 0], [Property: Test.StockName (string) Required MaxLength(48), 1] }
                  SELECT TOP(1) t.stockId, t.stockName
                  FROM test AS t
                  WHERE (t.stockId == @__queryName_0) || (t.stockName LIKE @__Format_2)), 
              Func<QueryContext, DbDataReader, ResultContext, SingleQueryResultCoordinator, Test>, 
              TestEfContains.MyDbContext, 
              False, 
              False, 
              True
          ), 
          cancellationToken: queryContext.CancellationToken)'

It seems in case 1, ef core use Ansi string to declare parameter, which cause the empty result.
I can use EF.Functions.Like() to avoid condition order problem, but I think the result of case 1 is not as expected.

@roji
Copy link
Member

roji commented Nov 22, 2022

Duplicate of #19503

@roji roji marked this as a duplicate of #19503 Nov 22, 2022
@roji
Copy link
Member

roji commented Nov 22, 2022

This has already been fixed for the recently-released EF Core 7.0, please give that a try.

@ian90911
Copy link
Author

This has already been fixed for the recently-released EF Core 7.0, please give that a try.

@roji
I try to update my reproduce repo's package to EF core 7.0, and it shows same result on my case 1. ( Is there something I need to update more? )
ian90911/TestEfContainsBug@7016a01

Another new information from my partner, he found out cause my value in nvarchar column is Chinese base, if I change db's collation to Chinese_Taiwan_Stroke_CI_AS , it will fix, but that's not look like same reason as duplicate's one.
ian90911/TestEfContainsBug@57bc8b9

@roji
Copy link
Member

roji commented Nov 22, 2022

Confirmed that there's a bug here. Minimal repro:

await using var ctx = new BlogContext();
await ctx.Database.EnsureDeletedAsync();
await ctx.Database.EnsureCreatedAsync();

var p = "foo";
_ = await ctx.Blogs.Where(x => x.Name1 == p || x.Name2.Contains(p)).FirstOrDefaultAsync();

public class BlogContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            .UseSqlServer(@"Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0;Encrypt=false")
            .LogTo(Console.WriteLine, LogLevel.Information)
            .EnableSensitiveDataLogging();
}

public class Blog
{
    public int Id { get; set; }
    [Column(TypeName = "varchar(6)")]
    public string? Name1 { get; set; }
    [Column(TypeName = "nvarchar(48)")]
    public string? Name2 { get; set; }
}

7.0 SQL:

Executed DbCommand (45ms) [Parameters=[@__p_0='foo' (Size = 6) (DbType = AnsiString), @__p_0_1='foo' (Size = 48)], CommandType='Text', CommandTimeout='30']
SELECT TOP(1) [b].[Id], [b].[Name1], [b].[Name2]
FROM [Blogs] AS [b]
WHERE [b].[Name1] = @__p_0 OR (@__p_0_1 LIKE N'') OR CHARINDEX(@__p_0, [b].[Name2]) > 0

Note that CHARINDEX searches for @__p_0 in Name2 (mixing AnsiString and Unicode), although it should be searching for @__p_0_1.

For comparison, if we remove the first condition and leave only the Contains, we get:

Executed DbCommand (35ms) [Parameters=[@__p_0='foo' (Size = 48)], CommandType='Text', CommandTimeout='30']
SELECT TOP(1) [b].[Id], [b].[Name1], [b].[Name2]
FROM [Blogs] AS [b]
WHERE (@__p_0 LIKE N'') OR CHARINDEX(@__p_0, [b].[Name2]) > 0

Finally, in 6.0 we generated only one parameter for everything, which was also incorrect (#19503 was the fix for that):

6.0 SQL (just one parameter):

Executed DbCommand (35ms) [Parameters=[@__p_0='foo' (Size = 6) (DbType = AnsiString)], CommandType='Text', CommandTimeout='30']
SELECT TOP(1) [b].[Id], [b].[Name1], [b].[Name2]
FROM [Blogs] AS [b]
WHERE ([b].[Name1] = @__p_0) OR ((@__p_0 LIKE N'') OR (CHARINDEX(@__p_0, [b].[Name2]) > 0))

@roji roji marked this as not a duplicate of #19503 Nov 22, 2022
@roji roji self-assigned this Nov 22, 2022
roji added a commit to roji/efcore that referenced this issue Nov 22, 2022
@roji roji added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Nov 22, 2022
roji added a commit to roji/efcore that referenced this issue Nov 23, 2022
@ghost ghost closed this as completed in #29650 Nov 23, 2022
ghost pushed a commit that referenced this issue Nov 23, 2022
@roji
Copy link
Member

roji commented Nov 23, 2022

Reopening to consider for servicing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported Servicing-approved type-bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants