Speeding up shortstr/longstr (de)serialization. #985

Merged 2 commits on Dec 10, 2020

@stebet (Contributor) commented Dec 10, 2020

Proposed Changes

Code cleanups and optimizations for shortstr/longstr (de)serialization.

This provides a pretty hefty improvement to almost all messages sent (for example, BasicDeliverWrite drops from 36.468 ns to 13.130 ns on .NET Core 3.1) and smaller improvements to most messages read as well.

Types of Changes

  • Bug fix (non-breaking change which fixes issue #NNNN)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause an observable behavior change in existing systems)
  • Documentation improvements (corrections, new content, etc)
  • Cosmetic change (whitespace, formatting, etc)
  • Optimizations

Checklist

  • I have read the CONTRIBUTING.md document
  • I have signed the CA (see https://cla.pivotal.io/sign/rabbitmq)
  • All tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in related repositories

Benchmark results

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.20270
Intel Core i7-10700 CPU 2.90GHz, 1 CPU, 16 logical and 8 physical cores
.NET Core SDK=5.0.101
  [Host]        : .NET Core 3.1.10 (CoreCLR 4.700.20.51601, CoreFX 4.700.20.51901), X64 RyuJIT
  .NET 4.8      : .NET Framework 4.8 (4.8.4261.0), X64 RyuJIT
  .NET Core 3.1 : .NET Core 3.1.10 (CoreCLR 4.700.20.51601, CoreFX 4.700.20.51901), X64 RyuJIT
| Method | Pre/Post | Runtime | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated | Code Size |
|---|---|---|---|---|---|---|---|---|---|---|
| ArrayReadEmpty | Pre | .NET 4.8 | 22.820 ns | 0.4441 ns | 0.4361 ns | 0.0063 | - | - | 40 B | 672 B |
| ArrayReadEmpty | Post | .NET 4.8 | 21.905 ns | 0.0683 ns | 0.0639 ns | 0.0063 | - | - | 40 B | 672 B |
| ArrayReadPopulated | Pre | .NET 4.8 | 137.263 ns | 0.2047 ns | 0.1710 ns | 0.0343 | - | - | 217 B | 672 B |
| ArrayReadPopulated | Post | .NET 4.8 | 131.391 ns | 0.3584 ns | 0.3353 ns | 0.0343 | - | - | 217 B | 672 B |
| ArrayWriteEmpty | Pre | .NET 4.8 | 20.649 ns | 0.0291 ns | 0.0227 ns | - | - | - | - | 795 B |
| ArrayWriteEmpty | Post | .NET 4.8 | 21.355 ns | 0.0494 ns | 0.0438 ns | - | - | - | - | 816 B |
| ArrayWritePopulated | Pre | .NET 4.8 | 142.924 ns | 0.8935 ns | 0.7461 ns | 0.0062 | - | - | 40 B | 795 B |
| ArrayWritePopulated | Post | .NET 4.8 | 140.215 ns | 0.2400 ns | 0.2128 ns | 0.0062 | - | - | 40 B | 816 B |
| TableReadEmpty | Pre | .NET 4.8 | 19.544 ns | 0.0570 ns | 0.0505 ns | - | - | - | - | 999 B |
| TableReadEmpty | Post | .NET 4.8 | 19.186 ns | 0.0727 ns | 0.0607 ns | - | - | - | - | 1008 B |
| TableReadPopulated | Pre | .NET 4.8 | 863.653 ns | 2.5798 ns | 2.2870 ns | 0.2222 | 0.0010 | - | 1404 B | 1002 B |
| TableReadPopulated | Post | .NET 4.8 | 797.831 ns | 2.0380 ns | 1.9064 ns | 0.2222 | 0.0010 | - | 1404 B | 1008 B |
| TableWriteEmpty | Pre | .NET 4.8 | 21.676 ns | 0.2943 ns | 0.2458 ns | - | - | - | - | 1505 B |
| TableWriteEmpty | Post | .NET 4.8 | 21.620 ns | 0.0779 ns | 0.0651 ns | - | - | - | - | 1546 B |
| TableWritePopulated | Pre | .NET 4.8 | 666.625 ns | 3.1778 ns | 2.6536 ns | 0.0257 | - | - | 168 B | 1505 B |
| TableWritePopulated | Post | .NET 4.8 | 653.420 ns | 1.8345 ns | 1.6262 ns | 0.0257 | - | - | 168 B | 1546 B |
| LongstrReadEmpty | Pre | .NET 4.8 | 18.582 ns | 0.1331 ns | 0.1180 ns | - | - | - | - | 668 B |
| LongstrReadEmpty | Post | .NET 4.8 | 18.510 ns | 0.0741 ns | 0.0693 ns | - | - | - | - | 665 B |
| LongstrReadPopulated | Pre | .NET 4.8 | 319.274 ns | 2.2690 ns | 2.0114 ns | 0.6561 | - | - | 4134 B | 668 B |
| LongstrReadPopulated | Post | .NET 4.8 | 274.551 ns | 1.9830 ns | 1.8549 ns | 0.6561 | - | - | 4134 B | 665 B |
| LongstrWriteEmpty | Pre | .NET 4.8 | 26.209 ns | 0.0735 ns | 0.0651 ns | - | - | - | - | 846 B |
| LongstrWriteEmpty | Post | .NET 4.8 | 12.155 ns | 0.0385 ns | 0.0341 ns | - | - | - | - | 733 B |
| LongstrWritePopulated | Pre | .NET 4.8 | 1,730.893 ns | 6.9724 ns | 6.5220 ns | - | - | - | - | 840 B |
| LongstrWritePopulated | Post | .NET 4.8 | 1,696.238 ns | 9.0216 ns | 8.4388 ns | - | - | - | - | 727 B |
| ShortstrReadEmpty | Pre | .NET 4.8 | 9.773 ns | 0.0826 ns | 0.0732 ns | - | - | - | - | 830 B |
| ShortstrReadEmpty | Post | .NET 4.8 | 9.929 ns | 0.0419 ns | 0.0350 ns | - | - | - | - | 714 B |
| ShortstrReadPopulated | Pre | .NET 4.8 | 178.406 ns | 0.6001 ns | 0.5613 ns | 0.0854 | - | - | 538 B | 866 B |
| ShortstrReadPopulated | Post | .NET 4.8 | 168.054 ns | 0.5726 ns | 0.5356 ns | 0.0854 | - | - | 538 B | 750 B |
| ShortstrWriteEmpty | Pre | .NET 4.8 | 27.042 ns | 0.1473 ns | 0.1306 ns | - | - | - | - | 719 B |
| ShortstrWriteEmpty | Post | .NET 4.8 | 18.302 ns | 0.0738 ns | 0.0616 ns | - | - | - | - | 834 B |
| ShortstrWritePopulated | Pre | .NET 4.8 | 144.082 ns | 0.8153 ns | 0.7626 ns | - | - | - | - | 715 B |
| ShortstrWritePopulated | Post | .NET 4.8 | 143.466 ns | 0.6751 ns | 0.6315 ns | - | - | - | - | 830 B |
| ArrayReadEmpty | Pre | .NET Core 3.1 | 12.401 ns | 0.0976 ns | 0.0865 ns | 0.0038 | - | - | 32 B | 408 B |
| ArrayReadEmpty | Post | .NET Core 3.1 | 11.759 ns | 0.0510 ns | 0.0477 ns | 0.0038 | - | - | 32 B | 408 B |
| ArrayReadPopulated | Pre | .NET Core 3.1 | 96.808 ns | 0.3429 ns | 0.3208 ns | 0.0248 | - | - | 208 B | 408 B |
| ArrayReadPopulated | Post | .NET Core 3.1 | 87.208 ns | 0.3289 ns | 0.3077 ns | 0.0248 | - | - | 208 B | 408 B |
| ArrayWriteEmpty | Pre | .NET Core 3.1 | 4.587 ns | 0.0215 ns | 0.0201 ns | - | - | - | - | 413 B |
| ArrayWriteEmpty | Post | .NET Core 3.1 | 4.958 ns | 0.1162 ns | 0.1193 ns | - | - | - | - | 413 B |
| ArrayWritePopulated | Pre | .NET Core 3.1 | 82.959 ns | 0.4525 ns | 0.4233 ns | 0.0048 | - | - | 40 B | 413 B |
| ArrayWritePopulated | Post | .NET Core 3.1 | 79.865 ns | 0.2570 ns | 0.2404 ns | 0.0048 | - | - | 40 B | 413 B |
| TableReadEmpty | Pre | .NET Core 3.1 | 9.277 ns | 0.0173 ns | 0.0135 ns | - | - | - | - | 645 B |
| TableReadEmpty | Post | .NET Core 3.1 | 9.221 ns | 0.0464 ns | 0.0434 ns | - | - | - | - | 634 B |
| TableReadPopulated | Pre | .NET Core 3.1 | 752.980 ns | 3.8807 ns | 3.2406 ns | 0.1612 | - | - | 1352 B | 648 B |
| TableReadPopulated | Post | .NET Core 3.1 | 688.693 ns | 3.0056 ns | 2.8114 ns | 0.1612 | - | - | 1352 B | 634 B |
| TableWriteEmpty | Pre | .NET Core 3.1 | 11.404 ns | 0.0356 ns | 0.0333 ns | - | - | - | - | 1041 B |
| TableWriteEmpty | Post | .NET Core 3.1 | 11.040 ns | 0.2385 ns | 0.2114 ns | - | - | - | - | 1041 B |
| TableWritePopulated | Pre | .NET Core 3.1 | 422.034 ns | 1.4816 ns | 1.1567 ns | 0.0200 | - | - | 168 B | 1041 B |
| TableWritePopulated | Post | .NET Core 3.1 | 415.224 ns | 1.2110 ns | 1.1327 ns | 0.0200 | - | - | 168 B | 1041 B |
| LongstrReadEmpty | Pre | .NET Core 3.1 | 3.173 ns | 0.0597 ns | 0.0529 ns | - | - | - | - | 315 B |
| LongstrReadEmpty | Post | .NET Core 3.1 | 3.119 ns | 0.0249 ns | 0.0208 ns | - | - | - | - | 312 B |
| LongstrReadPopulated | Pre | .NET Core 3.1 | 222.098 ns | 4.3371 ns | 4.9946 ns | 0.4923 | - | - | 4120 B | 315 B |
| LongstrReadPopulated | Post | .NET Core 3.1 | 181.609 ns | 3.6231 ns | 4.1724 ns | 0.4923 | - | - | 4120 B | 312 B |
| LongstrWriteEmpty | Pre | .NET Core 3.1 | 9.726 ns | 0.0399 ns | 0.0354 ns | - | - | - | - | 516 B |
| LongstrWriteEmpty | Post | .NET Core 3.1 | 8.018 ns | 0.0635 ns | 0.0563 ns | - | - | - | - | 281 B |
| LongstrWritePopulated | Pre | .NET Core 3.1 | 219.733 ns | 0.7856 ns | 0.6964 ns | - | - | - | - | 512 B |
| LongstrWritePopulated | Post | .NET Core 3.1 | 223.276 ns | 0.3555 ns | 0.3326 ns | - | - | - | - | 277 B |
| ShortstrReadEmpty | Pre | .NET Core 3.1 | 1.524 ns | 0.0073 ns | 0.0061 ns | - | - | - | - | 248 B |
| ShortstrReadEmpty | Post | .NET Core 3.1 | 1.501 ns | 0.0098 ns | 0.0082 ns | - | - | - | - | 244 B |
| ShortstrReadPopulated | Pre | .NET Core 3.1 | 57.677 ns | 1.1032 ns | 1.1329 ns | 0.0641 | - | - | 536 B | 508 B |
| ShortstrReadPopulated | Post | .NET Core 3.1 | 52.808 ns | 0.7226 ns | 0.6759 ns | 0.0641 | - | - | 536 B | 299 B |
| ShortstrWriteEmpty | Pre | .NET Core 3.1 | 11.495 ns | 0.0496 ns | 0.0414 ns | - | - | - | - | 392 B |
| ShortstrWriteEmpty | Post | .NET Core 3.1 | 3.718 ns | 0.0194 ns | 0.0151 ns | - | - | - | - | 468 B |
| ShortstrWritePopulated | Pre | .NET Core 3.1 | 25.018 ns | 0.4949 ns | 0.5083 ns | - | - | - | - | 388 B |
| ShortstrWritePopulated | Post | .NET Core 3.1 | 24.920 ns | 0.0683 ns | 0.0605 ns | - | - | - | - | 464 B |

| Method | Pre/Post | Runtime | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated | Code Size |
|---|---|---|---|---|---|---|---|---|---|---|
| BasicAckRead | Pre | .NET 4.8 | 22.512 ns | 0.1782 ns | 0.1488 ns | 0.0051 | - | - | 32 B | 676 B |
| BasicAckRead | Post | .NET 4.8 | 22.403 ns | 0.0297 ns | 0.0264 ns | 0.0051 | - | - | 32 B | 676 B |
| BasicAckWrite | Pre | .NET 4.8 | 21.315 ns | 0.0615 ns | 0.0545 ns | - | - | - | - | 689 B |
| BasicAckWrite | Post | .NET 4.8 | 21.317 ns | 0.0742 ns | 0.0694 ns | - | - | - | - | 689 B |
| BasicDeliverRead | Pre | .NET 4.8 | 32.278 ns | 0.4230 ns | 0.3532 ns | 0.0089 | - | - | 56 B | 1475 B |
| BasicDeliverRead | Post | .NET 4.8 | 30.830 ns | 0.4390 ns | 0.3890 ns | 0.0089 | - | - | 56 B | 1505 B |
| BasicDeliverWrite | Pre | .NET 4.8 | 79.532 ns | 0.2814 ns | 0.2495 ns | - | - | - | - | 873 B |
| BasicDeliverWrite | Post | .NET 4.8 | 48.970 ns | 0.1490 ns | 0.1390 ns | - | - | - | - | 903 B |
| BasicPropertiesRead | Pre | .NET 4.8 | 108.924 ns | 0.4782 ns | 0.3994 ns | 0.0318 | - | - | 201 B | 6603 B |
| BasicPropertiesRead | Post | .NET 4.8 | 101.670 ns | 0.2830 ns | 0.2650 ns | 0.0318 | - | - | 201 B | 6750 B |
| BasicPropertiesWrite | Pre | .NET 4.8 | 76.225 ns | 1.1911 ns | 1.1698 ns | - | - | - | - | 1991 B |
| BasicPropertiesWrite | Post | .NET 4.8 | 76.380 ns | 0.2660 ns | 0.2480 ns | - | - | - | - | 2013 B |
| ChannelCloseRead | Pre | .NET 4.8 | 27.916 ns | 0.2079 ns | 0.1843 ns | 0.0051 | - | - | 32 B | 1049 B |
| ChannelCloseRead | Post | .NET 4.8 | 27.363 ns | 0.2029 ns | 0.1694 ns | 0.0051 | - | - | 32 B | 1058 B |
| ChannelCloseWrite | Pre | .NET 4.8 | 40.392 ns | 0.3391 ns | 0.3172 ns | - | - | - | - | 816 B |
| ChannelCloseWrite | Post | .NET 4.8 | 32.318 ns | 0.0920 ns | 0.0861 ns | - | - | - | - | 858 B |
| BasicAckRead | Pre | .NET Core 3.1 | 4.796 ns | 0.1006 ns | 0.0941 ns | 0.0038 | - | - | 32 B | 260 B |
| BasicAckRead | Post | .NET Core 3.1 | 4.022 ns | 0.0231 ns | 0.0216 ns | 0.0038 | - | - | 32 B | 260 B |
| BasicAckWrite | Pre | .NET Core 3.1 | 3.324 ns | 0.0173 ns | 0.0144 ns | - | - | - | - | 256 B |
| BasicAckWrite | Post | .NET Core 3.1 | 3.269 ns | 0.0310 ns | 0.0290 ns | - | - | - | - | 256 B |
| BasicDeliverRead | Pre | .NET Core 3.1 | 13.006 ns | 0.1830 ns | 0.1712 ns | 0.0067 | - | - | 56 B | 935 B |
| BasicDeliverRead | Post | .NET Core 3.1 | 11.510 ns | 0.0540 ns | 0.0500 ns | 0.0067 | - | - | 56 B | 960 B |
| BasicDeliverWrite | Pre | .NET Core 3.1 | 36.468 ns | 0.2432 ns | 0.2156 ns | - | - | - | - | 438 B |
| BasicDeliverWrite | Post | .NET Core 3.1 | 13.130 ns | 0.0350 ns | 0.0330 ns | - | - | - | - | 438 B |
| BasicPropertiesRead | Pre | .NET Core 3.1 | 70.037 ns | 1.2143 ns | 1.0140 ns | 0.0229 | - | - | 192 B | 3461 B |
| BasicPropertiesRead | Post | .NET Core 3.1 | 55.300 ns | 0.1370 ns | 0.1290 ns | 0.0229 | - | - | 192 B | 3085 B |
| BasicPropertiesWrite | Pre | .NET Core 3.1 | 34.849 ns | 0.3061 ns | 0.2713 ns | - | - | - | - | 1433 B |
| BasicPropertiesWrite | Post | .NET Core 3.1 | 36.190 ns | 0.1010 ns | 0.0940 ns | - | - | - | - | 1433 B |
| ChannelCloseRead | Pre | .NET Core 3.1 | 8.378 ns | 0.0411 ns | 0.0364 ns | 0.0038 | - | - | 32 B | 582 B |
| ChannelCloseRead | Post | .NET Core 3.1 | 7.723 ns | 0.0335 ns | 0.0313 ns | 0.0038 | - | - | 32 B | 608 B |
| ChannelCloseWrite | Pre | .NET Core 3.1 | 15.743 ns | 0.2899 ns | 0.4249 ns | - | - | - | - | 392 B |
| ChannelCloseWrite | Post | .NET Core 3.1 | 6.979 ns | 0.0272 ns | 0.0254 ns | - | - | - | - | 392 B |

@michaelklishin (Member) left a comment:

Wow, this looks like a pretty meaningful improvement indeed.

    }
    catch (ArgumentException)
    {
        return ThrowArgumentOutOfRangeException(val, maxLength);
Review comment (Collaborator):

Would we lose perf if we preserved the original exception as an inner exception for more transparency?

Reply (Contributor Author):

Don't think so. We'd have to replace the maxLength parameter, though, since ArgumentOutOfRangeException has no overload that accepts both an inner exception and the actual value.
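
To illustrate the trade-off in that reply, here is a minimal sketch of the two `ArgumentOutOfRangeException` constructors in question. The class and method names (`ThrowHelperSketch`, `WithActualValue`, `WithInner`) and the message text are hypothetical and are not taken from the PR; only the two constructor overloads are real .NET API.

```csharp
using System;

// Hypothetical helpers for illustration only; not the PR's code.
static class ThrowHelperSketch
{
    // Overload (paramName, actualValue, message): records the offending value,
    // but has no slot for an inner exception.
    public static ArgumentOutOfRangeException WithActualValue(string val, int maxLength) =>
        new ArgumentOutOfRangeException(
            nameof(val), val, $"Value exceeds the maximum allowed length of {maxLength} bytes.");

    // Overload (message, innerException): preserves the original exception,
    // but the value can only be carried inside the message text.
    public static ArgumentOutOfRangeException WithInner(string val, int maxLength, Exception inner) =>
        new ArgumentOutOfRangeException(
            $"Value '{val}' exceeds the maximum allowed length of {maxLength} bytes.", inner);
}
```

With the first helper, `ActualValue` is set and `InnerException` is null; with the second it is the other way around, which is the gap the reply describes.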

@danielmarbach (Collaborator) commented:

Out of curiosity, why are you not using the Benchmark Baseline for the comparison?

@stebet (Contributor Author) commented Dec 10, 2020

> Out of curiosity why are you not using the Benchmark Baseline for the comparison?

I didn't want to copy the old logic over to the new one. I'm manually comparing runs between different branches, so there's no straightforward way to add the baseline parameter. There is a tool that @adamsitnik has mentioned that can compare BenchmarkDotNet runs, but I haven't tried it yet. See here: dotnet/BenchmarkDotNet#973

Hopefully it'll become a global tool soon to make this easier :)

@adamsitnik commented:

> Hopefully it'll become a global tool soon to make this easier

We don't have such plans for the near future, but it's available here if someone wants to use it.

@michaelklishin michaelklishin merged commit a418ee4 into rabbitmq:master Dec 10, 2020
@michaelklishin (Member) commented:

@stebet any reason not to backport this all the way to 6.x?

@stebet (Contributor Author) commented Dec 10, 2020

> @stebet any reason not to backport this all the way to 6.x?

Don't think so :) You'll just need to remove the #if NETCOREAPP blocks.
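
For readers unfamiliar with the pattern, the sketch below shows what an `#if NETCOREAPP` block of this kind might look like. The class and method names (`ConditionalEncodingSketch`, `Encode`) are hypothetical and this is not the PR's code. It only assumes that the span-based `Encoding.GetBytes` overload is used on .NET Core and the array-based overload everywhere else.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; not the PR's code.
static class ConditionalEncodingSketch
{
    private static readonly UTF8Encoding UTF8 = new UTF8Encoding();

    // Encodes val into destination starting at offset and returns the number of bytes written.
    public static int Encode(string val, byte[] destination, int offset)
    {
#if NETCOREAPP
        // Span-based overload, available on .NET Core 2.1 and later.
        return UTF8.GetBytes(val.AsSpan(), destination.AsSpan(offset));
#else
        // Array-based overload, available on every target framework.
        return UTF8.GetBytes(val, 0, val.Length, destination, offset);
#endif
    }
}
```

Removing the block for a backport would mean keeping only the `#else` branch, since both branches produce the same bytes.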

using RabbitMQ.Client.Exceptions;
using RabbitMQ.Util;

namespace RabbitMQ.Client.Impl
{
    internal static class WireFormatting
    {
        private static UTF8Encoding UTF8 = new UTF8Encoding();
Review comment (Contributor):

What made you add this change? I've been wondering for some time now whether it's better to call Encoding.UTF8 vs a static instance. (Btw, should be readonly!)

From source I see they're not even the same instance 🤷‍♂️
=> https://source.dot.net/#System.Private.CoreLib/Encoding.cs,a10eb90a3d884500
=> https://source.dot.net/#System.Private.CoreLib/UTF8Encoding.Sealed.cs,98c4a50ea9fbaa13,references

Reply (Contributor Author):

Encoding.UTF8 is typed as the base class Encoding rather than as UTF8Encoding. Maybe there isn't any difference, but this way we're at least pretty certain to avoid any callvirt calls. I saw this approach used in other high-perf libraries as well.

Reply (Contributor):

Quick test: (Code at the end)

GetBytes

| Method | value | Mean | Error | StdDev | Ratio | Code Size |
|---|---|---|---|---|---|---|
| Original | MyString | 21.67 ns | 0.887 ns | 0.049 ns | 1.00 | 230 B |
| Static | MyString | 21.40 ns | 0.710 ns | 0.039 ns | 0.99 | 230 B |
| StaticSealed | MyString | 20.85 ns | 0.800 ns | 0.044 ns | 0.96 | 230 B |

GetString

| Method | value | Mean | Error | StdDev | Ratio | Code Size |
|---|---|---|---|---|---|---|
| Original | Byte[8] | 22.11 ns | 2.246 ns | 0.123 ns | 1.00 | 166 B |
| Static | Byte[8] | 22.25 ns | 1.799 ns | 0.099 ns | 1.01 | 166 B |
| StaticSealed | Byte[8] | 22.11 ns | 1.126 ns | 0.062 ns | 1.00 | 166 B |

There's not much difference. The only thing I see is a slight benefit for static sealed on GetBytes (ratio 0.96), which is within the reported error.

Benchmark code

using System.Collections.Generic;
using System.Text;
using BenchmarkDotNet.Attributes;

namespace Benchmarks.WireFormatting
{
    [ShortRunJob]
    [DisassemblyDiagnoser]
    public class EncodingBenchmarks
    {
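        // A fresh instance, statically typed as UTF8Encoding.
        private static readonly UTF8Encoding UTF8 = new UTF8Encoding();
        // The shared Encoding.UTF8 instance, cast to UTF8Encoding.
        private static readonly UTF8Encoding UTF8Sealed = (UTF8Encoding)Encoding.UTF8;

        public static IEnumerable<byte[]> ByteSource => new[] {Encoding.UTF8.GetBytes("MyString")};
        // GetBytes benchmarks. Commented out because they reuse the method names of the GetString benchmarks below.
        /*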
        [Benchmark(Baseline = true)]
        [Arguments("MyString")]
        public byte[] Original(string value)
        {
            return Encoding.UTF8.GetBytes(value);
        }

        [Benchmark]
        [Arguments("MyString")]
        public byte[] Static(string value)
        {
            return UTF8.GetBytes(value);
        }

        [Benchmark]
        [Arguments("MyString")]
        public byte[] StaticSealed(string value)
        {
            return UTF8Sealed.GetBytes(value);
        }
        */
        [Benchmark(Baseline = true)]
        [ArgumentsSource(nameof(ByteSource))]
        public string Original(byte[] value)
        {
            return Encoding.UTF8.GetString(value);
        }

        [Benchmark]
        [ArgumentsSource(nameof(ByteSource))]
        public string Static(byte[] value)
        {
            return UTF8.GetString(value);
        }

        [Benchmark]
        [ArgumentsSource(nameof(ByteSource))]
        public string StaticSealed(byte[] value)
        {
            return UTF8Sealed.GetString(value);
        }
    }
}

#if NETCOREAPP
        public static int WriteLongstr(Span<byte> span, ReadOnlySpan<char> val)
        {
            int bytesWritten = val.IsEmpty ? 0 : UTF8.GetBytes(val, span.Slice(4));
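
The `span.Slice(4)` above leaves room for a 4-byte length prefix in front of the UTF-8 payload. As a rough, self-contained sketch of that layout (not the PR's implementation), the following assumes the prefix is an unsigned 32-bit big-endian length, as in the AMQP 0-9-1 long string encoding. The class `LongstrSketch` and its `Read` method are hypothetical, and `Read` returns a string only for illustration. The code needs .NET Core 2.1 or later for the span overloads.

```csharp
using System;
using System.Buffers.Binary;
using System.Text;

// Hypothetical sketch for illustration only; not the PR's code.
static class LongstrSketch
{
    private static readonly UTF8Encoding UTF8 = new UTF8Encoding();

    // Writes [4-byte big-endian length][UTF-8 bytes] and returns the total bytes written.
    public static int Write(Span<byte> span, ReadOnlySpan<char> val)
    {
        // Encode the payload after the 4 bytes reserved for the length prefix.
        int bytesWritten = val.IsEmpty ? 0 : UTF8.GetBytes(val, span.Slice(4));
        BinaryPrimitives.WriteUInt32BigEndian(span, (uint)bytesWritten);
        return bytesWritten + 4;
    }

    // Reads the length prefix, then decodes that many payload bytes as UTF-8.
    public static string Read(ReadOnlySpan<byte> span, out int bytesRead)
    {
        int length = (int)BinaryPrimitives.ReadUInt32BigEndian(span);
        bytesRead = length + 4;
        return UTF8.GetString(span.Slice(4, length));
    }
}
```

Skipping the encoder call for empty input, as the original line does, leaves only the length prefix to write in that case.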