Speed Optimization for short living applications #26

Closed
Fruchuxs opened this issue Sep 27, 2018 · 16 comments

@Fruchuxs

We found that we can speed up the execution of a short-lived application (such as a console application) with Expression<TDelegate>.Compile(preferInterpretation: true), because it avoids some JIT overhead.

Could you add a rule to DryIoc so that we can switch between preferInterpretation: true and preferInterpretation: false when compiling the resolve trees?
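For readers unfamiliar with the overload in question, here is a minimal sketch of the two compilation modes on a plain expression tree. It only illustrates the .NET API named above; it is not DryIoc code, and the class and method names (CompileModes, BuildExpression, Compile) are made up for this example. It assumes a platform where the Compile(bool) overload is available, such as .NET Core 2.1.

```csharp
using System;
using System.Linq.Expressions;

public static class CompileModes
{
    // Builds the expression tree for x => x * 2 + 1.
    public static Expression<Func<int, int>> BuildExpression()
    {
        var x = Expression.Parameter(typeof(int), "x");
        var body = Expression.Add(
            Expression.Multiply(x, Expression.Constant(2)),
            Expression.Constant(1));
        return Expression.Lambda<Func<int, int>>(body, x);
    }

    // Hypothetical helper: one flag selects between the interpreter
    // (true) and regular compilation to IL (false).
    public static Func<int, int> Compile(bool preferInterpretation) =>
        BuildExpression().Compile(preferInterpretation);

    public static void Main()
    {
        var interpreted = Compile(preferInterpretation: true);
        var compiled = Compile(preferInterpretation: false);

        // Both delegates compute the same result; only the cost of
        // creating and invoking them differs.
        Console.WriteLine(interpreted(20)); // prints 41
        Console.WriteLine(compiled(20));    // prints 41
    }
}
```

The flag is a preference rather than a guarantee, so whether interpretation actually wins depends on the platform and on how often each delegate is invoked during the lifetime of the process.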

@dadhi
Owner

dadhi commented Sep 27, 2018

@Fruchuxs Hi, thanks for bringing this up.

First, what platform did you target?

Next, there may be more work involved, considering that on some platforms DryIoc uses a custom expression compiler.

But anything is possible. In FEC (FastExpressionCompiler) we would probably fall back to Expression.Compile immediately when preferInterpretation: true is passed.

@Fruchuxs
Author

We also tested FEC, but in our cases the speedup was negligible.
Thanks for considering this.

@dadhi
Owner

dadhi commented Sep 27, 2018

@Fruchuxs, which platform did you test on?

@Fruchuxs
Author

Oh sorry, I forgot to mention: .NET Core 2.1

@EamonNerbonne

EamonNerbonne commented Oct 8, 2018

It's conceivable that on .NET Core 2.1 you might get a similar boost by enabling tiered compilation.

It's certainly worth trying out; if that really works well, it might render the interpretation of expression trees themselves moot (because the JIT would be interpreting IL, and the conversion to IL is fairly fast with FastExpressionCompiler).

@dadhi
Owner

dadhi commented Oct 8, 2018

Hi @EamonNerbonne, are you suggesting adding <TieredCompilation>true</TieredCompilation> and checking the result?

@EamonNerbonne

Yeah, that would be a start ;-).

But I have no experience with that (new) feature, so can't tell you about any gotchas - I mean I hope there aren't any, but it's conceivable it doesn't work with dynamically generated IL or some such thing. Only one way to find out! ;-).

@Fruchuxs
Author

Fruchuxs commented Oct 9, 2018

Will check, but their measurements don't look promising.

@Fruchuxs
Author

Fruchuxs commented Oct 9, 2018

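COMPlus_TieredCompilation |      true |       false |
------------------------- |----------:|------------:|
                    Run 1 | 1.2543458 |   1.1927924 |
                    Run 2 | 1.1643181 |   1.7478098 |
                    Run 3 | 1.1927924 |   1.3724727 |
                    Run 4 | 1.1572949 |    1.413623 |
                  Average | 1.1921878 | 1.431674475 |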

Samples are in seconds.
In my case it seems mostly faster with tiered compilation enabled. Thanks for the info.

@EamonNerbonne

Just a little worried about the high variance in those numbers - it might just be chance.
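To put a number on that concern, the spread of each column can be computed next to its mean. The sketch below does this for the samples reported in the table above; the SampleStats class and its methods are illustrative names, and the sample standard deviation (n - 1 in the denominator) is one reasonable choice for four runs, not something prescribed in the thread.

```csharp
using System;
using System.Linq;

public static class SampleStats
{
    // Arithmetic mean of the samples.
    public static double Mean(double[] samples) => samples.Average();

    // Sample standard deviation (n - 1 in the denominator).
    public static double StdDev(double[] samples)
    {
        double mean = samples.Average();
        double sumOfSquares = samples.Sum(s => (s - mean) * (s - mean));
        return Math.Sqrt(sumOfSquares / (samples.Length - 1));
    }

    public static void Main()
    {
        // Timings in seconds, copied from the measurements above.
        double[] tieredOn = { 1.2543458, 1.1643181, 1.1927924, 1.1572949 };
        double[] tieredOff = { 1.1927924, 1.7478098, 1.3724727, 1.413623 };

        Console.WriteLine($"true:  mean={Mean(tieredOn):F4} s, stddev={StdDev(tieredOn):F4} s");
        Console.WriteLine($"false: mean={Mean(tieredOff):F4} s, stddev={StdDev(tieredOff):F4} s");
    }
}
```

With only four samples per setting, a difference in means that is not clearly larger than the standard deviations should be treated as a hint rather than a result.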

@Fruchuxs
Author

Fruchuxs commented Oct 9, 2018

Yes, this is a general problem on my work machine (it's a laptop), as suggested here. I assume it's a CPU boost / throttling problem. But my measurements suggest that COMPlus_TieredCompilation=true is faster at cold start.
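One way to reduce the effect of boost and throttling drift is to alternate the two settings run by run instead of measuring all runs of one setting first. The following is a rough sketch of such a harness, not the method actually used for the numbers above. It assumes the application is started through the dotnet host, that the variable accepts the values 1 and 0, and that the path to the application is passed as the first argument; ColdStartTimer and TimeOnce are hypothetical names.

```csharp
using System;
using System.Diagnostics;

public static class ColdStartTimer
{
    // Starts the command once with COMPlus_TieredCompilation set to the
    // given value and returns the elapsed wall-clock time in seconds.
    public static double TimeOnce(string fileName, string arguments, string tieredCompilation)
    {
        var startInfo = new ProcessStartInfo(fileName, arguments)
        {
            // Environment variables can only be set for the child
            // process when the shell is not used to start it.
            UseShellExecute = false,
        };
        startInfo.Environment["COMPlus_TieredCompilation"] = tieredCompilation;

        var stopwatch = Stopwatch.StartNew();
        using (var process = Process.Start(startInfo))
        {
            process.WaitForExit();
        }
        stopwatch.Stop();
        return stopwatch.Elapsed.TotalSeconds;
    }

    public static void Main(string[] args)
    {
        if (args.Length < 1)
        {
            Console.WriteLine("usage: ColdStartTimer <path-to-app.dll> [runs]");
            return;
        }

        // Quote the path so that spaces in it survive argument parsing.
        string appArguments = "\"" + args[0] + "\"";
        int runs = args.Length > 1 ? int.Parse(args[1]) : 4;

        // Alternate the settings so that slow drift in CPU frequency
        // affects both columns roughly equally.
        for (int i = 1; i <= runs; i++)
        {
            double on = TimeOnce("dotnet", appArguments, "1");
            double off = TimeOnce("dotnet", appArguments, "0");
            Console.WriteLine($"run {i}: tiered=1 {on:F4} s, tiered=0 {off:F4} s");
        }
    }
}
```

Each measurement includes the startup of the dotnet host itself, which is appropriate here because the thread is about total cold-start time of a short-lived process.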

@dadhi
Owner

dadhi commented Oct 9, 2018

@Fruchuxs ,

Cool actually. I was sceptical, but it is a visible difference. Moreover, it affects dynamic methods as well.

Thanks for trying!

@dadhi
Owner

dadhi commented Feb 10, 2019

#45 should help.

@dadhi dadhi closed this as completed Feb 10, 2019
@dadhi dadhi self-assigned this Feb 10, 2019
@dadhi dadhi added enhancement New feature or request performance labels Feb 10, 2019
@dadhi dadhi added this to the 4.0.0 milestone Feb 10, 2019
@dadhi
Owner

dadhi commented Feb 22, 2019

Here is the proof: benchmark, sut and the related epic #44

DryIoc v3:

CreateContainerAndRegisterServices_Then_FirstTimeOpenScopeAndResolve

                            Method |        Mean |      Error |     StdDev |  Ratio | RatioSD | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
---------------------------------- |------------:|-----------:|-----------:|-------:|--------:|------------:|------------:|------------:|--------------------:|
 BmarkMicrosoftDependencyInjection |    128.3 us |   1.591 us |   1.489 us |   1.00 |    0.00 |     13.7939 |           - |           - |            58.66 KB |
                      BmarkAutofac |    425.8 us |   2.081 us |   1.947 us |   3.32 |    0.05 |     76.6602 |      2.4414 |           - |           354.91 KB |
                  BmarkAutofacMsDi |    434.9 us |   1.827 us |   1.709 us |   3.39 |    0.03 |     80.0781 |      0.4883 |           - |           371.22 KB |
                        BmarkGrace | 17,839.6 us |  99.023 us |  87.781 us | 138.99 |    1.72 |    156.2500 |     62.5000 |           - |           781.49 KB |
                    BmarkGraceMsDi | 20,568.8 us | 116.222 us | 108.714 us | 160.39 |    2.00 |    187.5000 |     93.7500 |     31.2500 |           954.13 KB |
                       BmarkDryIoc | 47,340.9 us | 175.076 us | 146.196 us | 368.47 |    3.79 |     90.9091 |           - |           - |            759.4 KB |
                   BmarkDryIocMsDi | 46,669.8 us | 302.005 us | 282.496 us | 363.92 |    4.44 |    181.8182 |     90.9091 |           - |           855.37 KB |

OpenScopeAndResolve

                            Method |      Mean |     Error |    StdDev | Ratio | RatioSD | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
---------------------------------- |----------:|----------:|----------:|------:|--------:|------------:|------------:|------------:|--------------------:|
 BmarkMicrosoftDependencyInjection |  3.407 us | 0.0131 us | 0.0116 us |  1.00 |    0.00 |      0.8354 |           - |           - |             3.87 KB |
                        BmarkGrace |  3.954 us | 0.0102 us | 0.0095 us |  1.16 |    0.01 |      1.8921 |           - |           - |             8.73 KB |
                    BmarkGraceMsDi |  5.253 us | 0.0126 us | 0.0111 us |  1.54 |    0.01 |      2.2736 |           - |           - |            10.49 KB |
                       BmarkDryIoc | 25.944 us | 0.0288 us | 0.0270 us |  7.61 |    0.03 |      3.8757 |           - |           - |            17.88 KB |
                   BmarkDryIocMsDi | 29.552 us | 0.0841 us | 0.0746 us |  8.67 |    0.04 |      4.7302 |           - |           - |            21.94 KB |
                      BmarkAutofac | 40.997 us | 0.1914 us | 0.1790 us | 12.04 |    0.08 |     10.8643 |           - |           - |            50.23 KB |
                  BmarkAutofacMsDi | 51.114 us | 0.2980 us | 0.2788 us | 15.00 |    0.08 |     14.1602 |           - |           - |            65.39 KB |

DryIoc v4 (preview-02)

CreateContainerAndRegisterServices_Then_FirstTimeOpenScopeAndResolve

                            Method |        Mean |       Error |      StdDev |  Ratio | RatioSD | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
---------------------------------- |------------:|------------:|------------:|-------:|--------:|------------:|------------:|------------:|--------------------:|
 BmarkMicrosoftDependencyInjection |    134.0 us |   2.1675 us |   2.0275 us |   1.00 |    0.00 |     13.7939 |      0.1221 |           - |            58.65 KB |
                       BmarkDryIoc |    145.3 us |   0.8318 us |   0.7374 us |   1.08 |    0.02 |     30.2734 |           - |           - |           140.35 KB |
                   BmarkDryIocMsDi |    161.6 us |   0.9626 us |   0.9004 us |   1.21 |    0.02 |     32.2266 |           - |           - |           149.19 KB |
                  BmarkAutofacMsDi |    665.2 us |   5.4804 us |   5.1264 us |   4.96 |    0.08 |    105.4688 |      3.9063 |           - |           487.64 KB |
                      BmarkAutofac |    673.2 us |   5.5615 us |   5.2022 us |   5.02 |    0.08 |    101.5625 |     18.5547 |           - |           470.39 KB |
                        BmarkGrace | 18,480.8 us | 100.5977 us |  89.1773 us | 137.83 |    2.35 |    156.2500 |     62.5000 |           - |           755.18 KB |
                    BmarkGraceMsDi | 21,640.4 us | 119.7309 us | 106.1383 us | 161.39 |    2.81 |    187.5000 |     93.7500 |     31.2500 |           926.88 KB |

OpenScopeAndResolve

                            Method |      Mean |     Error |    StdDev | Ratio | RatioSD | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
---------------------------------- |----------:|----------:|----------:|------:|--------:|------------:|------------:|------------:|--------------------:|
 BmarkMicrosoftDependencyInjection |  3.222 us | 0.0138 us | 0.0129 us |  1.00 |    0.00 |      0.8354 |           - |           - |             3.87 KB |
                       BmarkDryIoc |  4.274 us | 0.0276 us | 0.0259 us |  1.33 |    0.01 |      1.9531 |           - |           - |             9.02 KB |
                   BmarkDryIocMsDi |  4.498 us | 0.0237 us | 0.0222 us |  1.40 |    0.01 |      1.9608 |           - |           - |             9.06 KB |
                        BmarkGrace |  4.604 us | 0.0271 us | 0.0254 us |  1.43 |    0.01 |      2.3499 |           - |           - |            10.85 KB |
                    BmarkGraceMsDi |  5.280 us | 0.0267 us | 0.0236 us |  1.64 |    0.01 |      2.2202 |           - |           - |            10.24 KB |
                      BmarkAutofac | 37.600 us | 0.5655 us | 0.5289 us | 11.67 |    0.16 |      9.3994 |      0.0610 |           - |            43.47 KB |
                  BmarkAutofacMsDi | 49.487 us | 0.4901 us | 0.4585 us | 15.36 |    0.13 |     13.3667 |      0.1221 |           - |            61.75 KB |

@Fruchuxs
Author

Fruchuxs commented Feb 24, 2019

Looks nice, now we need this version out. :P
Any release plans?

EDIT: Btw. really good work!

@enif-lee

enif-lee commented Mar 3, 2019

Wow... It's coooool
