Eliminate tuple allocations in seekReadCustomAttributeRow via a custom reader for attribute rows #9218
Conversation
Force-pushed from df057a1 to da6df7f
Simply changing this to a struct may be ok, but to make a bigger impact, you could do what I did here:
#4716
I think we should fix this despite not pushing the SRM PR. I don't think we can switch to an SRM (System.Reflection.Metadata) implementation because of the risk involved. We need a huge suite of tests that directly test our IL metadata reader. |
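For readers following the thread: in F#, a plain tuple return compiles to a heap-allocated System.Tuple, while a struct tuple compiles to a ValueTuple that is returned by value. A minimal sketch of the shape of the change under discussion (the function names here are illustrative, not the actual ilread.fs code):

```fsharp
// Reference tuple: each call allocates a System.Tuple<int, int, int> on the heap.
let readRowRef (idx: int) : int * int * int =
    (idx, idx * 2, idx * 3)

// Struct tuple: compiles to a ValueTuple, returned by value with no GC allocation.
let readRowStruct (idx: int) : struct (int * int * int) =
    struct (idx, idx * 2, idx * 3)

// Callers destructure the struct form with the 'struct' pattern:
let struct (parentIdx, typeIdx, valueIdx) = readRowStruct 1
```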
Yeah, #4716 would be good. I just scoped this (and some other PRs) to address only one specific thing to keep them as small as possible rather than go through and convert a lot of the obvious ones to use structs (like this little chungus: https://github.com/dotnet/fsharp/blob/master/src/absil/ilread.fs#L256) |
I took a trace from using VS after editing the compiler codebase for an hour, and this function is basically near the top in terms of allocations. While #4716 would also improve things, this is a bit more targeted. @TIHan is there a good set of benchmarks we could potentially use here? I'd like to see if switching to a struct tuple does help, since it's shown up as a top allocator in two separate traces. |
@cartermp You could try using this benchmark. https://github.com/dotnet/fsharp/blob/master/benchmarks/CompilerServiceBenchmarks/Program.fs#L194 Maybe tweak it to read all the attributes in an assembly? |
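A hedged sketch of what such a tweak could look like, using plain reflection over a stand-in assembly rather than the compiler's own IL reader (the linked CompilerServiceBenchmarks program is structured differently):

```fsharp
open System.Reflection
open BenchmarkDotNet.Attributes
open BenchmarkDotNet.Running

[<MemoryDiagnoser>]
type AttributeReadBenchmark() =
    // Stand-in assembly; the real benchmark would exercise the F# IL reader.
    let asm = typeof<int>.Assembly

    [<Benchmark>]
    member _.ReadAllAttributes() =
        // Force every type's custom attribute rows to be read.
        let mutable count = 0
        for ty in asm.GetTypes() do
            count <- count + ty.GetCustomAttributesData().Count
        count

[<EntryPoint>]
let main _ =
    BenchmarkRunner.Run<AttributeReadBenchmark>() |> ignore
    0
```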
This one was a little iffy for me:
BenchmarkDotNet=v0.11.3, OS=Windows 10.0.19041
Intel Core i7-6700K CPU 4.00GHz (Skylake), 1 CPU, 8 logical and 4 physical cores
[Host] : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit LegacyJIT-v4.8.4180.0 DEBUG
Job-FEFVFA : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.8.4180.0
InvocationCount=1 UnrollFactor=1
Before:
After:
So if we consider allocations and mean CPU time, taking the worst case to be the upper end of the error range, this appears to be a tradeoff of ~75MB fewer allocations per op vs. ~0.2ms more CPU time per op. |
@TIHan what are your thoughts here? I think 75MB fewer allocations per operation outweighs the slight increase in CPU time. It's also worth noting that the measured CPU time now has less variance than before. |
Okay, I tried this with @TIHan's proposed approach from #4716, and these were the results of the compiler service benchmarks:
BenchmarkDotNet=v0.11.3, OS=Windows 10.0.19041
Intel Core i7-6700K CPU 4.00GHz (Skylake), 1 CPU, 8 logical and 4 physical cores
[Host] : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit LegacyJIT-v4.8.4180.0 DEBUG
Job-FEFVFA : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.8.4180.0
InvocationCount=1 UnrollFactor=1
CPU time is ~2% better than both the current implementation and my struct-tuple change. Memory usage per op is 19% better than the current implementation and 13% better than the change I made, so that's what I'll go with. |
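For anyone skimming, the "specialized reader" direction referenced here (and in the final commit message below) can be sketched roughly as follows. All names, and the flat int[] standing in for a metadata table, are hypothetical; the real ilread.fs code decodes binary metadata:

```fsharp
// A mutable struct standing in for one custom-attribute metadata row.
type CustomAttributeRow =
    struct
        val mutable parentIndex: int
        val mutable typeIndex: int
        val mutable valueIndex: int
    end

// A specialized reader fills a caller-owned struct by reference, so a
// scan over many rows performs zero per-row GC allocations.
let readCustomAttributeRow (table: int[]) (rowNum: int) (row: byref<CustomAttributeRow>) =
    let i = rowNum * 3
    row.parentIndex <- table.[i]
    row.typeIndex   <- table.[i + 1]
    row.valueIndex  <- table.[i + 2]

// Usage: one struct reused across the whole scan.
let countRowsFor (table: int[]) (numRows: int) (parent: int) =
    let mutable row = CustomAttributeRow()
    let mutable count = 0
    for r in 0 .. numRows - 1 do
        readCustomAttributeRow table r &row
        if row.parentIndex = parent then count <- count + 1
    count
```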
Force-pushed from da6df7f to 70c2c20
Might be worth trying to unify on this? https://github.com/dotnet/fsharp/blob/master/src/absil/ilread.fs#L746 |
I won't bother trying to consolidate with https://github.com/dotnet/fsharp/blob/master/src/absil/ilread.fs#L746. It's a bit of additional churn, and I haven't seen evidence from trace data that the tuples in the functions that call it have an impact. |
@TIHan @dsyme @KevinRansom this one is ready for review; it should be a pretty easy review considering the improvement, IMO |
Force-pushed from 70c2c20 to 206607a
The numbers look intriguing, but with a single invocation I'm not sure they prove anything (the real numbers may be better, or worse, than what you're seeing). Maybe you ran into a bug in BDN that I reported (and which isn't fixed yet): it chooses 1 invocation when there's a lot of secondary jitting going on. Can you share the BDN code, or force a higher invocation count? |
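For reference, one way to pin the invocation count in BenchmarkDotNet rather than letting its heuristic pick 1 is sketched below (note that BDN requires the invocation count to be a multiple of the unroll factor, which defaults to 16):

```fsharp
open BenchmarkDotNet.Attributes

// Pin the invocation count so each measured iteration runs the workload
// 1024 times instead of once (1024 is a multiple of the default unroll
// factor of 16).
[<SimpleJob(invocationCount = 1024)>]
type PinnedBenchmark() =
    [<Benchmark>]
    member _.Work() = System.String.Concat("a", "b", "c")
```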
@abelbraaksma I'm not sure if those values refer to the actual number of times something is executed. As you can see in the log, each benchmark was executed many times for the report:
The benchmark is here: https://github.com/dotnet/fsharp/tree/master/benchmarks/CompilerServiceBenchmarks |
@cartermp Actually, the log shows the opposite: each run has only 1 invocation. It's a known bug, quite recently introduced in BDN: dotnet/BenchmarkDotNet#1338. I'll have a look at the code. |
These kinds of timings, plus the fact that it reaches the max of 100 runs, are also indicative of a problem with the benchmark:
|
Gotcha, I see. Well, I don't think this should stop progress here, though. The chief concern is memory usage, and both switching to a struct tuple and this different approach are an obvious win. |
Oh yes, absolutely. I didn't mean to block progress in any way; it's certainly an improvement :). |
Eliminate tuple allocations in seekReadCustomAttributeRow via a custom reader for attribute rows (dotnet#9218)
* seekReadCustomAttributeRow as a struct 3-tuple
* Read custom attributes via a specialized reader
This PR is more for folks like @dsyme and @TIHan to look at and see what they think.
From this issue: https://developercommunity.visualstudio.com/content/problem/1035124/trace-of-editing-experience-with-in-memory-cross-p-1.html
The submitted trace showed 120MB of allocations coming from this function, most of which are 3-tuples (~5.7% of all allocations in the sample):
It's also 0.7% of total CPU time:
Since the data being generated is all small value types, maybe a struct tuple would be better here.
Looking through what calls this and how frequently, it looks like it's just part of normal compiler routines, so if we think this would speed things up a bit, that'd be nice.