
Query/Perf: don't compile liftable constant resolvers in interpretation mode when the resolver itself contains a lambda #35208

Closed
maumar opened this issue Nov 26, 2024 · 3 comments · Fixed by #35209
Labels: area-perf, area-query, closed-fixed, customer-reported, regression

Comments

maumar commented Nov 26, 2024

This is part of a larger perf regression between EF8 and EF9: #35053

When processing LiftableConstantExpressions in regular (non-AOT) mode, we compile the resolver lambda and then evaluate it to get back the actual constant we plan to use. We do that compilation in interpretation mode, which causes problems if the resulting object is itself (or contains) a delegate: compiling into a delegate in interpreted mode hurts both allocations and execution speed.
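
To make the mechanism concrete, here is a minimal, standalone sketch (not EF's actual resolver code): when a lambda compiled with `Compile(preferInterpretation: true)` returns another delegate, that inner delegate is also backed by the expression interpreter, so every invocation of it pays the interpreter cost.

    using System;
    using System.Linq.Expressions;

    internal static class InterpretedDelegateSketch
    {
        public static void Run()
        {
            // A "resolver"-like expression whose result is itself a delegate.
            Expression<Func<int, Func<int, int>>> resolver = seed => x => x + seed;

            // Regular compilation: both the outer and the nested lambda are JIT-compiled.
            Func<int, int> compiled = resolver.Compile()(10);

            // Interpreted compilation: the nested delegate also runs through the
            // expression interpreter, so each call to it is slower and allocates more.
            Func<int, int> interpreted = resolver.Compile(preferInterpretation: true)(10);

            Console.WriteLine(compiled(5));    // 15
            Console.WriteLine(interpreted(5)); // 15, but every call goes through the interpreter
        }
    }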

Before the fix (this already includes the invoke fix #35206):

| Method      | Async | Mean     | Error   | StdDev  | Op/s  | Gen0       | Gen1       | Allocated |
|------------ |------ |--------- |-------- |-------- |------ |----------- |----------- |---------- |
| MultiInclue | False | 487.1 ms | 0.99 ms | 0.88 ms | 2.053 | 17000.0000 | 11000.0000 | 103.29 MB |
| MultiInclue | True  | 487.4 ms | 3.63 ms | 3.22 ms | 2.052 | 17000.0000 | 11000.0000 | 103.29 MB |

After the fix:

| Method      | Async | Mean     | Error   | StdDev   | Op/s  | Gen0       | Gen1      | Allocated |
|------------ |------ |--------- |-------- |--------- |------ |----------- |---------- |---------- |
| MultiInclue | False | 455.1 ms | 8.94 ms | 10.29 ms | 2.197 | 11000.0000 | 6000.0000 | 67.92 MB  |
| MultiInclue | True  | 435.4 ms | 1.77 ms | 1.66 ms  | 2.297 | 11000.0000 | 6000.0000 | 67.92 MB  |
maumar commented Nov 26, 2024

smoking gun benchmark:

    using System;
    using System.Linq.Expressions;
    using System.Reflection;
    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Columns;
    using BenchmarkDotNet.Configs;
    using BenchmarkDotNet.Diagnosers;
    using BenchmarkDotNet.Running;

    internal class Program
    {
        static void Main(string[] args)
        {
            BenchmarkRunner.Run(args, typeof(Program).Assembly);
        }
    }

    public static class BenchmarkRunner
    {
        public static void Run(string[] args, Assembly assembly, IConfig config = null)
        {
            config ??= DefaultConfig.Instance;

            config = config
                .AddDiagnoser(MemoryDiagnoser.Default)
                .AddColumn(StatisticColumn.OperationsPerSecond);

            BenchmarkSwitcher.FromAssembly(assembly).Run(args, config);
        }
    }

    public class InterpretationBenchmark
    {
        private Expression<Func<object, Func<object?, object?, bool>[]>> _expression;
        private Func<object, object, bool>[] _compiled;
        private Func<object, object, bool>[] _compiledInterpreted;

        [GlobalSetup]
        public virtual void Initialize()
        {
            var left = Expression.Parameter(typeof(object), "left");
            var right = Expression.Parameter(typeof(object), "right");

            var prm = Expression.Parameter(typeof(object));
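            // Builds: prm => new[] { (left, right) => left == null ? right == null : right != null && (int)left == (int)right }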
            _expression = Expression.Lambda<Func<object, Func<object?, object?, bool>[]>>(
                Expression.NewArrayInit(
                    typeof(Func<object, object, bool>),
                    Expression.Lambda<Func<object?, object?, bool>>(
                        Expression.Condition(
                            Expression.Equal(left, Expression.Constant(null)),
                            Expression.Equal(right, Expression.Constant(null)),
                            Expression.AndAlso(
                                Expression.NotEqual(right, Expression.Constant(null)),
                                Expression.Equal(
                                    Expression.Convert(left, typeof(int)),
                                    Expression.Convert(right, typeof(int))))),
                      left,
                      right)), prm);

            _compiled = _expression.Compile()("_");
            _compiledInterpreted = _expression.Compile(preferInterpretation: true)("_");
        }

        [Benchmark]
        public virtual void Compiled()
        {
            var equal = 0;
            for (var i = 0; i < 100; i++)
            {
                for (var j = 0; j < 100; j++)
                {
                    if (_compiled[0](i, j))
                    {
                        equal++;
                    }
                }
            }
        }

        [Benchmark]
        public virtual void CompiledInterpreted()
        {
            var equal = 0;
            for (var i = 0; i < 100; i++)
            {
                for (var j = 0; j < 100; j++)
                {
                    if (_compiledInterpreted[0](i, j))
                    {
                        equal++;
                    }
                }
            }
        }
    }

results:

| Method              | Mean      | Error    | StdDev   | Op/s     | Gen0     | Allocated  |
|-------------------- |---------- |--------- |--------- |--------- |--------- |----------- |
| Compiled            | 65.69 us  | 0.706 us | 0.660 us | 15,223.4 | 76.4160  | 468.75 KB  |
| CompiledInterpreted | 930.20 us | 7.046 us | 6.591 us | 1,075.0  | 433.5938 | 2656.25 KB |

maumar added a commit that referenced this issue Nov 26, 2024
… in interpretation mode when the resolver itself contains a lambda

In EF9 we changed the way we generate shapers in preparation for AOT scenarios. We can no longer embed arbitrary objects into the shaper; instead we need to provide a way to construct that object in code (using the LiftableConstant mechanism), or simulate the functionality it used to provide.
At the end of our processing we find all liftable constants and, for the non-AOT case, compile their resolver lambdas and invoke the result with the liftable context object to produce the constant object we initially wanted. (In the AOT case we generate code from the resolver lambda.)
The problem is that we compile the resolver lambda in interpretation mode: if the final product is itself a delegate, that delegate will also run in the interpreter and therefore be less efficient.

The fix is to scan the resolver expression for nested lambdas; if we find any, compile the resolver in regular mode instead.

Fixes #35208

This is part of a fix for a larger perf issue: #35053
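
The commit above describes scanning the resolver for nested lambdas and falling back to regular compilation when one is found. A rough sketch of what such a check could look like (hypothetical NestedLambdaDetector/ResolverCompiler helpers, not the actual EF Core implementation):

    using System;
    using System.Linq.Expressions;

    // Detects whether an expression tree contains any nested lambda.
    internal sealed class NestedLambdaDetector : ExpressionVisitor
    {
        public bool FoundLambda { get; private set; }

        protected override Expression VisitLambda<T>(Expression<T> node)
        {
            // Record that a nested lambda exists; no need to descend into it.
            FoundLambda = true;
            return node;
        }
    }

    internal static class ResolverCompiler
    {
        public static Delegate Compile(LambdaExpression resolver)
        {
            var detector = new NestedLambdaDetector();
            detector.Visit(resolver.Body);

            // If the resolved constant would itself contain a delegate, compile the
            // resolver fully so that delegate is JIT-compiled rather than interpreted.
            return detector.FoundLambda
                ? resolver.Compile()
                : resolver.Compile(preferInterpretation: true);
        }
    }
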
maumar added a commit that referenced this issue Nov 26, 2024
… in interpretation mode when the resolver itself contains a lambda

In EF9 we changed the way we generate shapers in preparation for AOT scenarios. We can no longer embed arbitrary objects into the shaper; instead we need to provide a way to construct that object in code (using the LiftableConstant mechanism), or simulate the functionality it used to provide.
At the end of our processing we find all liftable constants and, for the non-AOT case, compile their resolver lambdas and invoke the result with the liftable context object to produce the constant object we initially wanted. (In the AOT case we generate code from the resolver lambda.)
The problem is that we compile the resolver lambda in interpretation mode: if the final product is itself a delegate, that delegate will also run in the interpreter and therefore be less efficient.

The fix is to use regular compilation rather than interpretation.

Fixes #35208

This is part of a fix for a larger perf issue: #35053
maumar added a commit that referenced this issue Nov 26, 2024
Port of #35209

**Description**
In EF9 we changed the way we generate shapers in preparation for AOT scenarios. We can no longer embed arbitrary objects into the shaper; instead we need to provide a way to construct that object in code (using the LiftableConstant mechanism), or simulate the functionality it used to provide.
At the end of our processing we find all liftable constants and, for the non-AOT case, compile their resolver lambdas and invoke the result with the liftable context object to produce the constant object we initially wanted. (In the AOT case we generate code from the resolver lambda.)
The problem is that we compile the resolver lambda in interpretation mode: if the final product is itself a delegate, that delegate will also run in the interpreter and therefore be less efficient when invoked many times as the query runs.
The fix is to use regular compilation rather than interpretation.

**Customer impact**
Queries using collection navigations over a significant amount of data suffer a large performance degradation compared with EF8. There is no good workaround.

**How found**
Multiple customer reports on 9.0.0

**Regression**
Yes, from 8.0.

**Testing**
Ad-hoc perf testing with BenchmarkDotNet. Functional change already covered by numerous tests.

**Risk**
Low, quirk added.
maumar closed this as completed in 1319ed4 on Nov 26, 2024
maumar commented Nov 26, 2024

reopening for potential patch

maumar reopened this on Nov 26, 2024
maumar added the closed-fixed label on Nov 26, 2024
maumar added a commit that referenced this issue Nov 27, 2024
Port of #35209

**Description**
In EF9 we changed the way we generate shapers in preparation for AOT scenarios. We can no longer embed arbitrary objects into the shaper; instead we need to provide a way to construct that object in code (using the LiftableConstant mechanism), or simulate the functionality it used to provide.
At the end of our processing we find all liftable constants and, for the non-AOT case, compile their resolver lambdas and invoke the result with the liftable context object to produce the constant object we initially wanted. (In the AOT case we generate code from the resolver lambda.)
The problem is that we compile the resolver lambda in interpretation mode: if the final product is itself a delegate, that delegate will also run in the interpreter and therefore be less efficient when invoked many times as the query runs.
The fix is to use regular compilation rather than interpretation.

**Customer impact**
Queries using collection navigations over a significant amount of data suffer a large performance degradation compared with EF8. There is no good workaround.

**How found**
Multiple customer reports on 9.0.0

**Regression**
Yes, from 8.0.

**Testing**
Ad-hoc perf testing with BenchmarkDotNet. Functional change already covered by numerous tests.

**Risk**
Low, quirk added.
maumar added this to the 9.0.x milestone on Nov 27, 2024
maumar commented Nov 30, 2024

fixed for 9.0.1 - b50581a

maumar closed this as completed on Nov 30, 2024
maumar modified the milestones: 9.0.x → 9.0.1 on Nov 30, 2024