Skip to content

Conversation

ranma42
Copy link
Contributor

@ranma42 ranma42 commented Jul 5, 2024

COALESCEis a syntactic shortcut for the CASE expression. As such, the input values are evaluated multiple times.

ISNULL does not have this shortcoming.

Fixes #32519.

@ranma42 ranma42 marked this pull request as ready for review July 5, 2024 21:34
@ranma42
Copy link
Contributor Author

ranma42 commented Jul 6, 2024

At least a test for the type propagation is needed

@ranma42 ranma42 marked this pull request as draft July 6, 2024 09:51
@ranma42
Copy link
Contributor Author

ranma42 commented Jul 6, 2024

It looks like the type mapping computed in Coalesce does not match the actual expression type :(
At least the test was helpful in identifying this issue 😅
This problem could be related to #14719 or #15586

@raymens
Copy link

raymens commented Sep 4, 2024

Another application for this is that COALESCE cannot be used for Indexed Views in SQL Server, however ISNULL is supported in case of aggregations.

This makes it difficult/impossible to make EF Core generated queries use the indexed views as I suppose the query processor thinks they're different.

For example:

CREATE VIEW [dbo].[vAggregation]
   WITH SCHEMABINDING
   AS
      SELECT a, SUM(ISNULL(b, 0.0)) AS bSum, COUNT_BIG(*) AS Count 
      FROM dbo.c
      GROUP BY a
GO

CREATE UNIQUE CLUSTERED INDEX IDX_vAggregation
   ON vAggregation(a)
GO

SELECT SUM(b) FROM c -- SLOW: not using the indexed view
SELECT SUM(COALESCE(b, 0)) FROM c -- SLOW: not using the indexed view
SELECT SUM(ISNULL(b, 0)) FROM c -- FAST: uses the indexed view

@cincuranet
Copy link
Contributor

@ranma42, gentle nudge on this. We like the ISNULL idea and we'd like to see this done.

@ranma42
Copy link
Contributor Author

ranma42 commented Dec 19, 2024

@cincuranet I will rebase the PR and check the new test results ASAP (worst case: next weekend)

Last time I kind of got stuck because of some typing issues as mentioned in #34171 (comment)

I will try to point out explicitly the issues that must be solved (ideally as questions) to unblock this

public virtual Task Coalesce_Correct_Type(bool async)
=> AssertQuery(
async,
ss => ss.Set<Customer>().Select(c => c.Region ?? "no region specified"));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is designed to ensure that EFCore takes care of the difference in type propagation between ISNULL and COALESCE.

From the docs:

Data type determination of the resulting expression is different. ISNULL uses the data type of the first parameter, COALESCE follows the CASE expression rules and returns the data type of value with the highest precedence.

Specifically, in this case c.Region is an nvarchar(15), while "no region specified" cannot be represented as a value of that type (it's longer than 15).

The implementation I did in this PR would take care of this, under the assumption that the typeMapping for CONVERT can be used to determine the type (and/or whether any casting/conversion is needed), but unfortunately EFCore currently computes it as nvarchar(15) even though (I believe) its result is actually an nvarchar(18).

See https://dbfiddle.uk/ITGS6ZVJ

DECLARE @what sql_variant;
DECLARE @foo varchar(15) = NULL;

SELECT @what = ISNULL(@foo, 'no region selected');
SELECT
    @what,
    SQL_VARIANT_PROPERTY(@what, 'BaseType'),
    SQL_VARIANT_PROPERTY(@what, 'MaxLength');

SELECT @what = COALESCE(@foo, 'no region selected');
SELECT
    @what,
    SQL_VARIANT_PROPERTY(@what, 'BaseType'),
    SQL_VARIANT_PROPERTY(@what, 'MaxLength');

@ranma42
Copy link
Contributor Author

ranma42 commented Dec 19, 2024

What is the best way forward for the type issue?
Also, what should be the scope? Just the SqlServer provider or EFCore in general?

I see that the type inference already handles string concatenation in a special way, but it looks like nothing at all happens for other values/operations (CASE and COALESCE included), which is kind of suspicious (what should the typeMapping of c.IntColumn + c.DoubleColumn be?)

NB: I am not sure I actually grasp what typeMappings are supposed to represent (hence how they should be computed/used), so maybe I am simply misusing it in this PR 😇

@cincuranet
Copy link
Contributor

What is the best way forward for the type issue?

Are you talking about the nvarchar(15) and nvarchar(18)?

Also, what should be the scope? Just the SqlServer provider or EFCore in general?

Likely just SQL Server, given the ISNULL is SQL Server specific.

(what should the typeMapping of c.IntColumn + c.DoubleColumn be?)

It should result in double, I believe.

@ranma42
Copy link
Contributor Author

ranma42 commented Jan 22, 2025

What is the best way forward for the type issue?

Are you talking about the nvarchar(15) and nvarchar(18)?

Yes. More in general, I am referring to the mismatch between the typeMapping computed by EFCore and the type of the SQL (sub-)expressions.
I delved a little more in type mappings and I am still missing some pieces of the puzzle 😅
IIUC they are used for few main purposes:

  • during insert/update, to convert the C# values into the SQL literal representation
  • during materialization, to convert the SQL values into corresponding C# values
  • sometimes, in expressions (AFAICT only against constants/parameters, which for example makes this inconsistent when 2 columns are compared 😕 )

If this is correct, for use case 1 & 2 typeMappings could simply be attached to columns, with no complex machinery to propagate them across all expressions.
EDIT: this is obviously wrong: expressions can be part of projections, hence need to be materialized

I am unsure whether the expression/comparison behavior is intentional (it is not mentioned as a limitation of value conversion); I'll prepare an issue to showcase the behavior I am seeing and understand if I am simply misreading the docs or if the query conversion has some problems in this corner case.

In this case I am not looking for a representation of the C#/SQL conversion, but rather the (SQL) type of the COALESCE expression; I would like to use in order to preserve the same type when converting the expression to ISNULL, but AFAICT this is something that can't be done simply using typeMappings, as they don't really compute the type of the expression. As an example of a place that does this, see

case ExpressionType.Add:
case ExpressionType.Subtract:
case ExpressionType.Multiply:
case ExpressionType.Divide:
case ExpressionType.Modulo:
case ExpressionType.And:
case ExpressionType.Or:
case ExpressionType.ExclusiveOr:
{
inferredTypeMapping = typeMapping ?? ExpressionExtensions.InferTypeMapping(left, right);
resultType = inferredTypeMapping?.ClrType ?? (left.Type != typeof(object) ? left.Type : right.Type);
resultTypeMapping = inferredTypeMapping;
break;
}

If a has a type mapping, the type mapping of a + b, a - b, ... is simply the type mapping of a, even if a is an int and b is a double.

Also, what should be the scope? Just the SqlServer provider or EFCore in general?

Likely just SQL Server, given the ISNULL is SQL Server specific.

👍 other providers might want to use a similar type propagation mechanism, but as long as it is only used in the SqlServer provider, we can look for a reasonable API/implementation in there; I'll move along this direction

(what should the typeMapping of c.IntColumn + c.DoubleColumn be?)

It should result in double, I believe.

This would be my expectation as well; I'll double check (actually, I'll add a test), but IIRC it currently results in int 😞

@ranma42
Copy link
Contributor Author

ranma42 commented Jan 22, 2025

If a has a type mapping, the type mapping of a + b, a - b, ... is simply the type mapping of a, even if a is an int and b is a double.

In most cases this is not actually a problem, as the compiler inserts casts to ensure that the type of a and b is matching, but apparently it does not take care of the ?? operator :\

@ranma42
Copy link
Contributor Author

ranma42 commented Jan 22, 2025

I fixed the conflicts and force-pushed the branch. I added the test for int/double coalescing, but I found out that it also fails regardless of ISNULL, so I opened #35516 to track it.

I also opened #35515 regarding another part of type mapping (value conversion).

@ranma42
Copy link
Contributor Author

ranma42 commented Feb 7, 2025

@cincuranet I rebased the PR and limited the scope to a special case that I believe might still be interesting, i.e.:

  • all sub-expressions have the same type as the COALESCE expression
  • all sub-expressions have the same type mapping as the COALESCE expression
  • the expression is using the default type mapping (combined with the two above, this implies that all of the expressions are using the default type mapping of the type)

This ensures that the conversion avoids some issues related to #15586
I would like to eventually fix them, but in the meantime it might make sense to proceed with ISNULL limited to cases where it is safer to use.

@ranma42
Copy link
Contributor Author

ranma42 commented Feb 8, 2025

I might need some help for Cosmos 😇

@ranma42 ranma42 marked this pull request as ready for review March 24, 2025 08:32
@ranma42 ranma42 requested a review from a team as a code owner March 24, 2025 08:32
@cincuranet
Copy link
Contributor

@ranma42 Here's what's failing. The first two seem to be easy renames. Third one is new test, below is what was produced, validate that and change AssertSql as needed.

    Microsoft.EntityFrameworkCore.Query.NorthwindMiscellaneousQueryCosmosTest.SelectMany_primitive_select_subquery(async: False) [FAIL]
      Assert.Equal() Failure: Strings differ
                                        ↓ (pos 189)
      Expected: ···"+<>c__DisplayClass172_0[Microsoft.EntityF"···
      Actual:   ···"+<>c__DisplayClass175_0[Microsoft.EntityF"···
                                        ↑ (pos 189)
      Stack Trace:
        D:\a\efcore\efcore\test\EFCore.Cosmos.FunctionalTests\Query\NorthwindMiscellaneousQueryCosmosTest.cs(3380,0): at Microsoft.EntityFrameworkCore.Query.NorthwindMiscellaneousQueryCosmosTest.SelectMany_primitive_select_subquery(Boolean async)
        --- End of stack trace from previous location ---
      Output:
        Compiling query expression: 
        'DbSet<Employee>()
            .Any()'
        
        Generated query execution expression: 
        'queryContext => new QueryingEnumerable<bool>(
            (CosmosQueryContext)queryContext, 
            SqlExpressionFactory, 
            QuerySqlGeneratorFactory, 
            Projection Mapping:
                EmptyProjectionMember -> 0
            SELECT VALUE EXISTS (
                SELECT 1
                FROM root AS c) AS c, 
            Func<QueryContext, JToken, bool>, 
            TestModels.Northwind.NorthwindContext, 
            EntityType: Employee, 
            List<Expression> {  }, 
            False, 
            True
        )
            .Single()'
        
        An exception occurred while iterating over the results of a query for context type 'Microsoft.EntityFrameworkCore.TestModels.Northwind.NorthwindContext'.
        System.InvalidOperationException: An error was generated for warning 'Microsoft.EntityFrameworkCore.Database.SyncNotSupported': Azure Cosmos DB does not support synchronous I/O. Make sure to use and correctly await only async methods when using Entity Framework Core to access Azure Cosmos DB. See https://aka.ms/ef-cosmos-nosync for more information. This exception can be suppressed or logged by passing event ID 'CosmosEventId.SyncNotSupported' to the 'ConfigureWarnings' method in 'DbContext.OnConfiguring' or 'AddDbContext'.
           at Microsoft.EntityFrameworkCore.Diagnostics.EventDefinition.Log[TLoggerCategory](IDiagnosticsLogger`1 logger, Exception exception) in D:\a\efcore\efcore\src\EFCore\Diagnostics\EventDefinition.cs:line 66
           at Microsoft.EntityFrameworkCore.Cosmos.Diagnostics.Internal.CosmosLoggerExtensions.SyncNotSupported(IDiagnosticsLogger`1 diagnostics) in D:\a\efcore\efcore\src\EFCore.Cosmos\Diagnostics\Internal\CosmosLoggerExtensions.cs:line 35
           at Microsoft.EntityFrameworkCore.Cosmos.Storage.Internal.CosmosClientWrapper.ExecuteSqlQuery(String containerId, PartitionKey partitionKeyValue, CosmosSqlQuery query) in D:\a\efcore\efcore\src\EFCore.Cosmos\Storage\Internal\CosmosClientWrapper.cs:line 607
           at Microsoft.EntityFrameworkCore.Cosmos.Query.Internal.CosmosShapedQueryCompilingExpressionVisitor.QueryingEnumerable`1.Enumerator.MoveNext() in D:\a\efcore\efcore\src\EFCore.Cosmos\Query\Internal\CosmosShapedQueryCompilingExpressionVisitor.QueryingEnumerable.cs:line 151
        
    Microsoft.EntityFrameworkCore.Query.NorthwindMiscellaneousQueryCosmosTest.SelectMany_primitive_select_subquery(async: True) [FAIL]
      Assert.Equal() Failure: Strings differ
                                        ↓ (pos 189)
      Expected: ···"+<>c__DisplayClass172_0[Microsoft.EntityF"···
      Actual:   ···"+<>c__DisplayClass175_0[Microsoft.EntityF"···
                                        ↑ (pos 189)
      Stack Trace:
        D:\a\efcore\efcore\test\EFCore.Cosmos.FunctionalTests\Query\NorthwindMiscellaneousQueryCosmosTest.cs(3380,0): at Microsoft.EntityFrameworkCore.Query.NorthwindMiscellaneousQueryCosmosTest.SelectMany_primitive_select_subquery(Boolean async)
        --- End of stack trace from previous location ---
      Output:
        An exception occurred while iterating over the results of a query for context type 'Microsoft.EntityFrameworkCore.TestModels.Northwind.NorthwindContext'.
        System.InvalidOperationException: An error was generated for warning 'Microsoft.EntityFrameworkCore.Database.SyncNotSupported': Azure Cosmos DB does not support synchronous I/O. Make sure to use and correctly await only async methods when using Entity Framework Core to access Azure Cosmos DB. See https://aka.ms/ef-cosmos-nosync for more information. This exception can be suppressed or logged by passing event ID 'CosmosEventId.SyncNotSupported' to the 'ConfigureWarnings' method in 'DbContext.OnConfiguring' or 'AddDbContext'.
           at Microsoft.EntityFrameworkCore.Diagnostics.EventDefinition.Log[TLoggerCategory](IDiagnosticsLogger`1 logger, Exception exception) in D:\a\efcore\efcore\src\EFCore\Diagnostics\EventDefinition.cs:line 66
           at Microsoft.EntityFrameworkCore.Cosmos.Diagnostics.Internal.CosmosLoggerExtensions.SyncNotSupported(IDiagnosticsLogger`1 diagnostics) in D:\a\efcore\efcore\src\EFCore.Cosmos\Diagnostics\Internal\CosmosLoggerExtensions.cs:line 35
           at Microsoft.EntityFrameworkCore.Cosmos.Storage.Internal.CosmosClientWrapper.ExecuteSqlQuery(String containerId, PartitionKey partitionKeyValue, CosmosSqlQuery query) in D:\a\efcore\efcore\src\EFCore.Cosmos\Storage\Internal\CosmosClientWrapper.cs:line 607
           at Microsoft.EntityFrameworkCore.Cosmos.Query.Internal.CosmosShapedQueryCompilingExpressionVisitor.QueryingEnumerable`1.Enumerator.MoveNext() in D:\a\efcore\efcore\src\EFCore.Cosmos\Query\Internal\CosmosShapedQueryCompilingExpressionVisitor.QueryingEnumerable.cs:line 151
        
    Microsoft.EntityFrameworkCore.Query.NorthwindMiscellaneousQueryCosmosTest.Coalesce_Correct_Multiple_Same_TypeMapping(async: True) [FAIL]
      Assert.Empty() Failure: Collection was not empty
      Collection: ["SELECT VALUE\r\n{\r\n    \"ReportsTo\" : c[\"Repor"···]
      Stack Trace:
        D:\a\efcore\efcore\test\EFCore.Cosmos.FunctionalTests\TestUtilities\TestSqlLoggerFactory.cs(53,0): at Microsoft.EntityFrameworkCore.TestUtilities.TestSqlLoggerFactory.AssertBaseline(String[] expected, Boolean assertOrder)
        D:\a\efcore\efcore\test\EFCore.Cosmos.FunctionalTests\Query\NorthwindMiscellaneousQueryCosmosTest.cs(5028,0): at Microsoft.EntityFrameworkCore.Query.NorthwindMiscellaneousQueryCosmosTest.AssertSql(String[] expected)
        D:\a\efcore\efcore\test\EFCore.Cosmos.FunctionalTests\Query\NorthwindMiscellaneousQueryCosmosTest.cs(3238,0): at Microsoft.EntityFrameworkCore.Query.NorthwindMiscellaneousQueryCosmosTest.<>c__DisplayClass312_0.<<Coalesce_Correct_Multiple_Same_TypeMapping>b__0>d.MoveNext()
        --- End of stack trace from previous location ---
        D:\a\efcore\efcore\test\EFCore.Cosmos.FunctionalTests\TestUtilities\CosmosTestHelpers.cs(47,0): at Microsoft.EntityFrameworkCore.TestUtilities.CosmosTestHelpers.NoSyncTest(Boolean async, Func`2 testCode)
        --- End of stack trace from previous location ---
      Output:
        ---- New Baseline -------------------------------------------------------------------
                AssertSql(
        """
        SELECT VALUE
        {
            "ReportsTo" : c["ReportsTo"],
            "c" : 1,
            "c0" : 2,
            "c1" : 3
        }
        FROM root c
        ORDER BY c["EmployeeID"]
        """);

ranma42 added 3 commits April 15, 2025 23:31
`COALESCE`is a syntactic shortcut for the CASE expression. As such, the input
values are evaluated multiple times.

`ISNULL` does not have this shortcoming.

Fixes dotnet#32519.
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 21 out of 21 changed files in this pull request and generated no comments.

@cincuranet cincuranet merged commit 6f33e42 into dotnet:main Apr 16, 2025
7 checks passed
Copy link
Member

@roji roji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the post-merge review... It's really nice to see this merged - should provide a nice perf improvement when coalescing heavy expressions, and provide sane behavior when coalescing non-deterministic ones.

See below for some minor thoughts that came up (nothing crucial).

// with the two above, this implies that all of the expressions
// are using the default type mapping of the type)
if (defaultTypeMapping == typeMapping
&& sqlFunctionExpression.Arguments.All(a => a.Type == type && a.TypeMapping == typeMapping)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to the above comment on the CLR type, I think we could compare the store type instead of referencing-comparing the type mapping instances... There's no guarantee in EF that there's only one type mapping instance for a given type mapping, and unless I'm mistaken the only thing that matters here for ISNULL is the actual store type. So comparing the store type should be safe and possibly allow for some more scenarios.

Suggested change
&& sqlFunctionExpression.Arguments.All(a => a.Type == type && a.TypeMapping == typeMapping)) {
&& sqlFunctionExpression.Arguments.All(a => a.Type == type && a.TypeMapping?.StoreType == typeMapping?.StoreType)) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC this would not recognize cases in which the mismatch is in one of the parameters (such as varchar vs varchar(15)).
Maybe the right approach here would be to make typemappings IEquatable and compare them

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC this would not recognize cases in which the mismatch is in one of the parameters (such as varchar vs varchar(15)).

I think StoreType must always be the full store type - including any facets. If that's not the case, we most probably have bugs somewhere else as well. Do you have a specific case in mind where this doesn't work?

Maybe the right approach here would be to make typemappings IEquatable and compare them

I'm not sure... The point here is that in this specific context, the only thing we care about is the actual type in the database; in other words, various other details that distinguish type mappings from one another are irrelevant (value comparers, value converters...), since they affect EF client-side processing only. Assuming we have a 100% unambiguous representation of the database type - which is what StoreType is supposed to be - comparing those should be sufficient and allow for maximum coverage of this optimization, I think.

A general IEquatable would need to also compare e.g. value comparers (since it's possible that in other contexts that's relevant), and so would needlessly exclude some valid cases...

.Aggregate(head, (l, r) => new SqlFunctionExpression(
"ISNULL",
arguments: [l, r],
nullable: true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least for the top-most ISNULL, shouldn't we be taking the original coalesce expression's nullability? For example, in the very common case of b.Foo ?? "<unknown>", the entire expression should be non-nullable since the last sub-expression (the constant string) is non-nullable.

As this is done extremely late in the query pipeline (SQL generator), there's not going to be anything afterwards that cares about this (this node is immediately processed, generating SQL string data, and then discarded). But for correctness's sake (and in case this code is copy-pasted somewhere else), maybe we should do it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nullable: true is always safer than nullable: false, so no risk of incorrectness here ;)
We could do this for the top-most ISNULL invocation (which, if the COALESCE has been optimized properly, is the only one that can be non-nullable), but it would make the code a little more complicated for no apparent advantage.

Note that in this context we do not have access to the results of the nullability processor (the nullable value in the expression does not correspond to the results of the processing).

Copy link
Member

@roji roji Apr 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no risk of incorrectness here

Right, not incorrect in the sense of incorrect results - just potentially SQL that's more convoluted (and less efficient) than it needs to be.

Note that in this context we do not have access to the results of the nullability processor (the nullable value in the expression does not correspond to the results of the processing).

That's true, but the expression nullable does represent what it represents: it seems odd to basically lose that setting just because we're transforming one SQL function into another here.

But anyway, given where this code is, I agree it's theoretical. If we do any extra work on this, I'd add a comment making that clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no risk of incorrectness here

Right, not incorrect in the sense of incorrect results - just potentially SQL that's more convoluted (and less efficient) than it needs to be.

Note that in this context we do not have access to the results of the nullability processor (the nullable value in the expression does not correspond to the results of the processing).

That's true, but the expression nullable does represent what it represents: it seems odd to basically lose that setting just because we're transforming one SQL function into another here.

I believe we are not losing any information here: COALESCE expressions are always constructed with nullable: true from this code.
They can be re-written in a few places but the IsNullable property is never modified (neither for COALESCE nor for other function expressions).

But anyway, given where this code is, I agree it's theoretical. If we do any extra work on this, I'd add a comment making that clear.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I wasn't aware of that... That's presumably because we have special handling of COALESCE in any case in SqlNullabilityProcessor, so IsNullable is maybe effectively unused?

I'd prefer it IsNullable were accurate here (across the board, regardless of COALESCE vs. ISNULL) - you never know when someone might look at it - but I guess it's not super important at this point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I wasn't aware of that... That's presumably because we have special handling of COALESCE in any case in SqlNullabilityProcessor, so IsNullable is maybe effectively unused?

The nullability processor relies on IsNullable as input to decide whether to assume that the function is guaranteed to return a non-null value or whether its result should be considered (non)nullable depending on (some of) its arguments.
Currently it is accurate in the perspective of function definition and not from the point of view of the expression (even less of the expression in a given context).
IIRC the same holds true for other expressions as well: during the nullability computation for myNullableColumn is null ? 1 : myNullableColumn + 2 the second myNullableColumn expression is evaluated as non-nullable, but it is not rewritten to have IsNullable = false.

I'd prefer it IsNullable were accurate here (across the board, regardless of COALESCE vs. ISNULL) - you never know when someone might look at it - but I guess it's not super important at this point.

Your suggestion goes in the direction of tracking nullability in the expression, which might indeed prove very useful, for example when lowering SqlServer boolean expressions/predicates 😫, but I think is not specifically related to the change COALESCE->ISNULL in this PR

// stay on the safe side we only use ISNULL if:
// - all sub-expressions have the same type as the expression
// - all sub-expressions have the same type mapping as the expression
// - the expression is using the default type mapping (combined
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one also seems a bit strict - if all sub-expressions have the same store type, shouldn't everything be fine and we can assume the overall expression has the same type? What's the scenario that could be problematic given a non-default store type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EFCore computes accurate store types only in some cases (column expressions, simple additions), but in many other cases it approximates them, for example a literal string is modeled as varchar instead of varchar(length-of-the-literal).

The idea was that if all of the the expressions are using the default type mapping, the C#-oriented type compatibility matches the SQL one as closely as possible... but now I wonder if some cases are actually unsafe even under this assumption 🤔 .
I think an example that would break with only same type and same type-mapping is:

(col1VarChar10 != 'foo' ? col1VarChar10 : 'a long literal') ??
(col2VarChar10 != 'foo' ? col2VarChar10 : 'a very long literal')

IIRC EFCore computes the type mapping of the LHS (and of the whole expression) as the same type mapping as col1VarChar10.

I think an example that would break with the current set of constraints is :

(col1VarChar10 == 'foo' ? 'a long literal' : col1VarChar10) ??
(col2VarChar10 == 'foo' ? 'a very long literal' : col2VarChar10)

I will try to check this ASAP.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the general logic is that type mappings are configured on columns, and then inferred from them to other, non-column expression types (e.g. in b.MyCol == "some constant"), as well as bubbling up in the tree as necessary.

But I'm still a bit confused: if we know that all sub-expressions have the same store type (and therefore so should the SqlFunctionExpression for the COALESCE call on top), how could we possibly have an issue, regardless of whether that type mapping happens to be a default or not? Is this because you think that with a default type mapping there's less of chance that EF got the type mapping wrong (or similar)? Would be good to see a concrete case for that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coalesce_Correct_TypeMapping_Double and Coalesce_Correct_TypeMapping_String are a couple of examples in which EFCore computes the wrong (/inaccurate) type mapping for some of the expressions.
Assuming that the implementation in this PR is wrong, I will add another test case that shows that the check is not strict enough.

Conversely, you are right that checking that the type is the default one is not required; its main purpose was to trust C# for the type computation, under the (presumably wrong) assumption that when the C#-computed type and the EFCore-computed type matched, that also meant a match for the SQL type.

@ranma42
Copy link
Contributor Author

ranma42 commented Apr 18, 2025

This test passes with the current code but would fail if the check for the default type(mapping) was removed:

    [ConditionalTheory]
    [MemberData(nameof(IsAsyncData))]
    public virtual Task Coalesce_Correct_TypeMapping_String_Expression(bool async)
        => AssertQuery(
            async,
            ss => ss.Set<Customer>().OrderBy(c => c.CustomerID).Select(c => (c.Region == null ? null : "R" + c.Region) ?? "no region specified"),
            assertOrder: true);

I am now pretty confident I can make the current translation fail with a different test (still working on it, though).

(also, the StoreType comparison is probably good; while experimenting I recalled I was trying to avoid the cases which involve type conversion, but iiuc the current state of EFCore is that those are known to behave in weird ways when used in expressions)

EDIT: The Coalesce_Correct_TypeMapping_String also fails if the default type(mapping) check is removed.

@roji
Copy link
Member

roji commented Apr 19, 2025

iiuc the current state of EFCore is that those are known to behave in weird ways when used in expressions

Yes, that's true.. I don't think we need to take special account of this for this particular case.

Thanks for doing the extra checks around the default type mapping...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SQL Server: consider switching from COALESCE to ISNULL

6 participants