This repository has been archived by the owner on Mar 2, 2022. It is now read-only.

Optimize dictionary lookups in C# #32

Merged: 4 commits into benhoyt:master on Mar 16, 2021

Conversation

@yuriyostapenko (Contributor) commented Mar 15, 2021:

Please note that a better API to avoid double key lookups in a dictionary will hopefully arrive some time later.

Until then, this PR removes the second lookup for existing keys by wrapping the counter in a reference cell, shaving off 15-20% on my machine (on .NET 5).
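(For illustration, a minimal sketch of the reference-cell technique described above. It is not the exact code from this PR: the word splitting, the Ref<T> member names, and the output format are simplified assumptions; the point is the single TryGetValue lookup per word.)

using System;
using System.Collections.Generic;

class Program
{
    // Mutable reference cell: because it is a class, the instance stored in
    // the dictionary can be incremented in place, with no second hash lookup.
    public sealed class Ref<T>
    {
        public T Value;
        public Ref(T value) => Value = value;
    }

    static void Main()
    {
        var counts = new Dictionary<string, Ref<int>>();
        string line;
        while ((line = Console.ReadLine()) != null)
        {
            foreach (var word in line.ToLower().Split(' ', StringSplitOptions.RemoveEmptyEntries))
            {
                if (counts.TryGetValue(word, out var cell))
                    cell.Value++;                    // hit: one lookup, mutate in place
                else
                    counts[word] = new Ref<int>(1);  // miss: insert a new cell
            }
        }
        foreach (var kv in counts)
            Console.WriteLine($"{kv.Key} {kv.Value.Value}");
    }
}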

@akraines commented:

You should probably call this optimize.cs, since it isn't the simple case anymore (see the comment on the soon-to-be-closed pull request #24).

@benhoyt (Owner) commented Mar 15, 2021:

Yep, this is 15% faster for me too, thanks. However, I don't want to optimize the simple.cs simple version. I'd be happy to add this as a first cut at the C# optimized version, though. If you want to do that, please add a new optimized.cs instead, and add to test.sh, benchmark.py, and the credits in README.md.

@benhoyt mentioned this pull request on Mar 15, 2021.
@yuriyostapenko (Author) commented:

Sure, to be honest I totally expected this, so I'll do almost that :)
But given it's dotnet, I think we need to introduce a structure similar to Rust's, since a single .cs file is very unidiomatic.

@yuriyostapenko (Author) commented:

I've made the changes mentioned above.
It will only work with the .NET 5 SDK, which is how it should be, IMHO. Measuring dotnet/C# performance using Mono is, respectfully, like using an alternative distribution of Python 2.

@benhoyt (Owner) commented Mar 15, 2021:

Thanks! I'll see if I can get .NET going on Linux.


@erikbozic (Contributor) commented:

Hi,
I didn't notice you were doing the same, so I also sent a PR setting up a structure similar to Rust's for C#: #35 😅

I guess we need to consolidate. I was thinking we could do something like the TechEmpower benchmarks, but simpler.

@yuriyostapenko (Author) commented:

@erikbozic, I'm glad to see whatever structure emerges.
I would not bother with Mono performance, though, as it's going away soon anyway.

@erikbozic (Contributor) commented:

By "going away" you're referring to .NET 6? I guess you're right.
If we don't want to have both Mono and dotnet benchmarks, we can just use the structure you set up in this PR. I'll close mine :)

@yuriyostapenko (Author) commented:

Yes, I did mean that .NET 6 will consume it. And Mono is a weak "showcase" of modern .NET's potential anyway.

@benhoyt (Owner) commented Mar 16, 2021:

Thank you -- just downloaded .NET for Linux (it was a "snap" :-).

@benhoyt merged commit a823d73 into benhoyt:master on Mar 16, 2021.

Review thread on:

class Program
{
    public sealed class Ref<T> {

Small note: The BCL supplies StrongBox<T> (under System.Runtime.CompilerServices) that fulfils the exact same purpose as this type.
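(For reference, a minimal sketch of the same single-lookup pattern with the BCL StrongBox<int> in place of a hand-rolled Ref<T>; assumes .NET 5+ top-level statements, and the counting loop itself is elided.)

using System.Collections.Generic;
using System.Runtime.CompilerServices;

var counts = new Dictionary<string, StrongBox<int>>();

void Add(string word)
{
    // StrongBox<int> is just a class with a public Value field, so it plays the
    // same role as Ref<int>: mutate the stored counter without a second lookup.
    if (counts.TryGetValue(word, out var box))
        box.Value++;
    else
        counts[word] = new StrongBox<int>(1);
}

Add("example");
Add("example");   // counts["example"].Value is now 2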

@yuriyostapenko (Author) replied:

TIL, thanks!

@calledude commented Mar 17, 2021:

What is the purpose of having wrapped the integer anyway? My initial thought was that this would actually make the program less performant, because a reference type lives on the heap, whereas an int lives on the stack. Clearly there must be something I'm missing here?

@erikbozic (Contributor) commented Mar 17, 2021:

@calledude it's a reference type, so we don't copy out a new int every time we get it from the dictionary; instead we increment the existing one. If you try it without the wrapper, you will see all counts are just 1.
In a more general sense: we're storing a pointer to an integer instead of only the integer itself.

Otherwise we would need to get it out of the dictionary, increment it, and then set the incremented value back.
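(A rough sketch of the get/increment/set-back alternative described above; this is not the actual simple.cs code and the class name is made up, but it shows where the second hash lookup comes from.)

using System.Collections.Generic;

static class PlainIntCounting
{
    public static void Add(Dictionary<string, int> counts, string word)
    {
        // The int is copied out of the dictionary, so the incremented value has
        // to be written back through the indexer: two lookups per existing key.
        if (counts.TryGetValue(word, out var count))
            counts[word] = count + 1;   // second hash lookup
        else
            counts[word] = 1;           // insert on a miss
    }
}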

@calledude commented Mar 17, 2021:

What about a ref struct then? That would force it to stay on the stack, and we would still have a pointer to it; perhaps that would be even more performant? I assume there's a trade-off in the current solution where having it on the stack does not outweigh the cost of copying the value all the time?

Edit: Never mind, ref structs can't be used as generic type arguments.

@erikbozic (Contributor) commented Mar 17, 2021:

To be honest, I haven't looked into that part too much.
A trace I did earlier shows that the actual lookup takes the most time, followed by string creation (not sure if it's from the latest code version, but it should be close):

[trace screenshot]

So I'm focusing on that first.

@calledude commented:

[BenchmarkDotNet results screenshot]

FWIW, according to BenchmarkDotNet, int is still faster.

@yuriyostapenko (Author) commented Mar 17, 2021:

> FWIW, according to BenchmarkDotNet, int is still faster.

@calledude, I'm not sure what exactly you are measuring there, but if you simply removed the wrapping object and still used the rest of the optimized code as-is (that is, using TryGetValue), your faster code does not work correctly. As @erikbozic already pointed out, all your counters will never increment and will always stay at 1.

Yes, the extra object indirection adds overhead, but this code is faster than the code in "simple" exactly because, without the wrapper, the code has to do the hash lookup twice: first to read the value, and then to store it back again. That is because the current Dictionary API does not support atomic increment or in-place value modification.
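(As a side note, a later API addresses exactly this: .NET 6 added CollectionsMarshal.GetValueRefOrAddDefault, which hands back a ref to the value slot so a plain int counter can be bumped in place with a single lookup and no wrapper object. A sketch, assuming .NET 6 or newer; the class name is made up.)

using System.Collections.Generic;
using System.Runtime.InteropServices;

static class RefCounting
{
    public static void Add(Dictionary<string, int> counts, string word)
    {
        // Gets a ref to the existing value slot, or adds a default(int) entry
        // on a miss - either way a single hash lookup, then increment in place.
        ref int count = ref CollectionsMarshal.GetValueRefOrAddDefault(counts, word, out _);
        count++;
    }
}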

@yuriyostapenko (Author) added:

As for your question on ref struct: the whole Dictionary lives on the heap, and anything you store in it ends up on the heap as well, which a ref struct is not allowed to do.

@calledude replied:

Just double-checked, and I could've sworn I had made sure I was actually incrementing the value correctly, but I guess not.

Not my proudest moment... :) Sorry about that!

Review thread on:

string line;
while ((line = Console.ReadLine()) != null)
{
    line = line.ToLower();

You could save this potential string allocation by instantiating the dictionary with one of the *IgnoreCase StringComparers. Additionally, that would make explicit whether the algorithm is culture-aware or not.
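(A sketch of that suggestion, assuming ordinal case-insensitivity is acceptable for the counts; note the stored keys then keep the casing of their first occurrence, so output may still need lowercasing when printed. The counting here is deliberately simplified and is not the PR's optimized path.)

using System;
using System.Collections.Generic;

// Ordinal, case-insensitive comparer: "Foo" and "foo" land in the same entry,
// so the per-line ToLower allocation is no longer needed just for counting.
var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);

string line;
while ((line = Console.ReadLine()) != null)
{
    foreach (var word in line.Split(' ', StringSplitOptions.RemoveEmptyEntries))
        counts[word] = counts.GetValueOrDefault(word) + 1;
}

foreach (var kv in counts)
    Console.WriteLine($"{kv.Key.ToLowerInvariant()} {kv.Value}");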

@yuriyostapenko (Author) replied:

To be honest, I intentionally did not include any of the optimizations that were plentiful in the many parallel PRs :)
