Reduce allocations in TagHelperBinder.ProcessDescriptors#12237
Conversation
Will add more information and elevate out of draft status if a test insertion looks good
| using var pooledSet = HashSetPool<TagHelperDescriptor>.GetPooledObject(out var distinctSet); | ||
|
|
||
| // mapBuilder maps from tag name to either a single TagHelperDescriptor or a List<TagHelperDescriptor>. | ||
| using var pooledMap = StringDictionaryPool<object>.OrdinalIgnoreCase.GetPooledObject(out var mapBuilder); |
There was a problem hiding this comment.
Would it negate the benefits to use something a bit nicer than object? Perhaps something like https://github.com/dotnet/roslyn/blob/main/src/Dependencies/Collections/OneOrMany.cs ? Even just a tuple would be nicer
There was a problem hiding this comment.
I had thoughts along that line yesterday. There is also a MultiDictionary in Roslyn that might be nice for this scenario. I'll evaluate those too.
There was a problem hiding this comment.
It's early and I haven't had any caffeine yet, but I don't think I superficially really like either of those.
MultiDictionary:
It's ValueSet uses a ImmutableHashSet for it's value when there are multiple values. Those adds for the same key end up creating a new ImmutableHashSet each time. Ewww...
OneOrMany:
For the case where it represents "many", it uses an ImmutableArray. Again, each add in that case will allocate a new ImmutableArray. Ewww...
There was a problem hiding this comment.
@DustinCampbell -- Obviously we can write one, but do you know of a relevant data structure in Razor or Roslyn a bit better than the two options above?
There was a problem hiding this comment.
I've been looking at this and imagining something a bit more custom. My strawman idea is to do another through the descriptors. In the first pass, count up the size of the arrays we'll need. In the second pass, construct the arrays and fill them.
To avoid the mess that this will undoubtably create, we can create a custom "builder" struct to track the state. (I'll call the struct FinalArrayBuilder for now, but feel free to come up with something better. 😄)
Here's a strawman definition of the struct (which should be private to this class):
private struct FinalArrayBuilder
{
private int _size;
private TagHelperDescriptor[]? _array;
private int _index;
private TagHelperDescriptor[] GetArray() => _array ??= new[_size];
public void IncreaseSize()
{
Debug.Assert(_array is null, "Can't increase size once the array has been created");
_size++;
}
public void Add(TagHelperDescriptor value)
{
Debug.Assert(_index < _size, "Can't add past the end of the array");
var array ??= _array ??= new[_size];
array[_index++] = value;
}
public ImmutableArray<TagHelperDescriptor> ToImmutable()
{
if (_size == 0)
{
return [];
}
Debug.Assert(_assert is not null);
Debug.Assert(_index == _size, "Can't produce the final array until it's filled.");
return ImmutableColllectionsMarshal.AsImmutableArray(_array);
}
}Then, we'd make mapBuilder a Dicitonary<string, FinalArrayBuilder>.
In the first pass, we'd add catch-all descriptors but call IncreaseSize() for non-catch all descriptors. We could use a PooledArrayBuilder<(string, TagHelperDescriptor)> to track the tag name and descriptor pairs we need to visit in the second pass, rather than performing the whole descriptor walk again.
using var toVisit = new PooledArrayBuilder<(string, TagHelperDescriptor)>();
foreach (var descriptor in descriptors)
{
if (!distinctSet.Add(descriptor))
{
// We're already seen this descriptor, skip it.
continue;
}
foreach (var rule in descriptor.TagMatchingRules)
{
if (rule.TagName == TagHelperMatchingConventions.ElementCatchAllName)
{
// This is a catch-all descriptor, we can keep track of it separately.
catchAllBuilder.Add(descriptor);
}
else
{
// This is a specific tag name, we need to add it to the map.
var tagName = tagNamePrefix + rule.TagName;
if (!mapBuilder.TryGetValue(tagName, out var finalArrayBuilder))
{
finalArrayBuilder = default;
}
finalArrayBuilder.IncreaseSize();
// Copy back to the dictionary.
mapBuilder[tagName] = finalArrayBuilder;
// Ensure we visit tagName and descriptor in the next pass.
toVisit.Add((tagName, descriptor));
}
}
}In the second pass, add the descriptors to the FinalArrayBuilders.
string? currentTagName = null;
FinalArrayBuilder finalArrayBuilder = default;
foreach (var (tagName, descriptor) in toVisit)
{
if (currentTagName != tagName)
{
if (currentTagName != null)
{
// Copy the previous FinalArrayBuilder into the dictionary.
mapBuilder[currentTagName] = finalArrayBuilder;
}
currentTagHelper = tagName;
finalArrayBuilder = mapBuilder[currentTagName];
}
finalArrayBuilder.Add(descriptor);
}
// Update the map with the last FinalArrayBuilder
if (currentTagName != null)
{
mapBuilder[currentTagName] = finalArrayBuilder;
}At this point, mapBuilder should contain everything we need to produce the final result:
// Build the final dictionary with immutable arrays.
var map = new Dictionary<string, ImmutableArray<TagHelperDescriptor>>(capacity: mapBuilder.Count, StringComparer.OrdinalIgnoreCase);
foreach (var (tagName, finalArrayBuilder) in mapBuilder)
{
map.Add(tagName, finalArrayBuilder.ToImmutable());
}Anyway, that's my idea modulo typos. What do you think?
There was a problem hiding this comment.
(Much easier for me to read if you copy-and-paste that markdown table out of the console and into GitHub)
There was a problem hiding this comment.
Good changes! Another change I'd like to try is to take the Commit 1 approach where we stick with a single pass but use a struct builder like in Commit 3. Instead of building an array, make the struct builder produce a OneOrMany to store the final tag helpers. When we looked at it what was in the final map, we saw that there were a lot of arrays of 1 item. I think we can use a OneOrMany to avoid ever allocating an array for 1 item. What do you think?
There was a problem hiding this comment.
After some "shower thinking time", I strongly suspect that the biggest increase in commit 3's CPU time is the extra dictionary lookups to accommodate using a struct. Because we always have to copy the struct back into the dictionary, we end up with a lot of extra lookups. That seems to me to be the most expensive operation that's being increased by commit 3.
There was a problem hiding this comment.
OK. I've written up my initial ideas and here are the numbers I'm getting.
| Method | TagHelpers | Mean | Error | StdDev | Median | Min | Max | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 'Construct TagHelperBinders (Original)' | BlazorServerApp | 168.1 ms | 3.31 ms | 4.30 ms | 167.8 ms | 162.4 ms | 178.2 ms | 41 B | - | - | 127.98 MB |
| 'Construct TagHelperBinders (Original, with prefix)' | BlazorServerApp | 177.8 ms | 3.51 ms | 6.05 ms | 178.0 ms | 164.8 ms | 192.4 ms | 41 B | 1000.0000 | - | 167.27 MB |
| 'Construct TagHelperBinders (V1 - Commit 1)' | BlazorServerApp | 139.2 ms | 2.11 ms | 1.76 ms | 138.8 ms | 136.7 ms | 143.2 ms | 41 B | - | - | 69.2 MB |
| 'Construct TagHelperBinders (V1 - Commit 1, with prefix)' | BlazorServerApp | 199.4 ms | 4.10 ms | 12.10 ms | 200.5 ms | 182.8 ms | 231.5 ms | 5,196 B | - | - | 148.32 MB |
| 'Construct TagHelperBinders (V2 - Commit 3)' | BlazorServerApp | 207.9 ms | 2.45 ms | 2.05 ms | 208.3 ms | 205.2 ms | 210.7 ms | 41 B | - | - | 53.18 MB |
| 'Construct TagHelperBinders (V2 - Commit 3, with prefix)' | BlazorServerApp | 290.0 ms | 5.77 ms | 7.09 ms | 290.6 ms | 279.6 ms | 308.7 ms | 41 B | - | - | 132.29 MB |
| 'Construct TagHelperBinders (V3)' | BlazorServerApp | 142.7 ms | 2.46 ms | 2.18 ms | 142.6 ms | 138.9 ms | 146.5 ms | 41 B | - | - | 41.12 MB |
| 'Construct TagHelperBinders (V3, with prefix)' | BlazorServerApp | 155.9 ms | 2.98 ms | 3.65 ms | 155.6 ms | 150.1 ms | 165.3 ms | 41 B | - | - | 80.41 MB |
| 'Construct TagHelperBinders (Original)' | TelerikMvc | 1,401.1 ms | 28.17 ms | 83.05 ms | 1,367.5 ms | 1,295.0 ms | 1,602.9 ms | 5,710 B | 1000.0000 | - | 1369.17 MB |
| 'Construct TagHelperBinders (Original, with prefix)' | TelerikMvc | 1,457.5 ms | 29.00 ms | 66.63 ms | 1,489.3 ms | 1,368.9 ms | 1,578.3 ms | 5,749 B | 2000.0000 | 1000.0000 | 1597.21 MB |
| 'Construct TagHelperBinders (V1 - Commit 1)' | TelerikMvc | 1,314.3 ms | 8.81 ms | 7.81 ms | 1,314.2 ms | 1,302.5 ms | 1,330.0 ms | 41 B | 1000.0000 | - | 1048.32 MB |
| 'Construct TagHelperBinders (V1 - Commit 1, with prefix)' | TelerikMvc | 1,674.6 ms | 33.80 ms | 99.66 ms | 1,711.7 ms | 1,549.5 ms | 1,805.1 ms | 5,337 B | 1000.0000 | - | 1520.65 MB |
| 'Construct TagHelperBinders (V2 - Commit 3)' | TelerikMvc | 1,854.3 ms | 21.90 ms | 19.41 ms | 1,853.9 ms | 1,822.5 ms | 1,879.0 ms | 41 B | 1000.0000 | - | 1209.76 MB |
| 'Construct TagHelperBinders (V2 - Commit 3, with prefix)' | TelerikMvc | 2,341.2 ms | 46.23 ms | 43.25 ms | 2,353.4 ms | 2,287.0 ms | 2,415.5 ms | 41 B | 1000.0000 | - | 1682.09 MB |
| 'Construct TagHelperBinders (V3)' | TelerikMvc | 1,287.1 ms | 9.92 ms | 8.29 ms | 1,285.7 ms | 1,271.5 ms | 1,301.8 ms | 41 B | 1000.0000 | - | 788.88 MB |
| 'Construct TagHelperBinders (V3, with prefix)' | TelerikMvc | 1,351.0 ms | 8.08 ms | 7.56 ms | 1,350.5 ms | 1,340.2 ms | 1,364.3 ms | 41 B | 1000.0000 | - | 1016.92 MB |
There was a problem hiding this comment.
OK, ran on my machine and come to a similar conclusion, commit 5 is the most performant:
| Method | TagHelpers | Mean | Error | StdDev | Median | Min | Max | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 'Construct TagHelperBinders (Original)' | BlazorServerApp | 127.8 ms | 0.99 ms | 0.88 ms | 127.6 ms | 126.6 ms | 129.0 ms | 41 B | - | - | 127.75 MB |
| 'Construct TagHelperBinders (Original, with prefix)' | BlazorServerApp | 154.5 ms | 1.40 ms | 1.31 ms | 154.6 ms | 152.5 ms | 156.9 ms | 41 B | - | - | 206.87 MB |
| 'Construct TagHelperBinders (Commit 1)' | BlazorServerApp | 120.4 ms | 1.10 ms | 0.98 ms | 120.4 ms | 119.1 ms | 122.3 ms | 41 B | - | - | 68.97 MB |
| 'Construct TagHelperBinders (Commit 1, with prefix)' | BlazorServerApp | 128.8 ms | 1.62 ms | 1.51 ms | 129.2 ms | 126.4 ms | 131.3 ms | 41 B | - | - | 108.26 MB |
| 'Construct TagHelperBinders (Commit 3)' | BlazorServerApp | 168.3 ms | 1.97 ms | 1.84 ms | 168.3 ms | 165.8 ms | 172.3 ms | 41 B | - | - | 52.95 MB |
| 'Construct TagHelperBinders (Commit 3, with prefix)' | BlazorServerApp | 188.6 ms | 1.45 ms | 1.36 ms | 188.5 ms | 186.3 ms | 190.6 ms | 41 B | - | - | 92.24 MB |
| 'Construct TagHelperBinders (Commit 5)' | BlazorServerApp | 123.8 ms | 0.84 ms | 0.79 ms | 124.0 ms | 122.2 ms | 124.7 ms | 41 B | - | - | 40.89 MB |
| 'Construct TagHelperBinders (Commit 5, with prefix)' | BlazorServerApp | 137.2 ms | 2.31 ms | 2.27 ms | 136.9 ms | 134.4 ms | 142.7 ms | 41 B | - | - | 80.18 MB |
| 'Construct TagHelperBinders (Original)' | TelerikMvc | 1,171.0 ms | 23.00 ms | 37.78 ms | 1,156.9 ms | 1,116.4 ms | 1,243.2 ms | 41 B | 5000.0000 | - | 1368.94 MB |
| 'Construct TagHelperBinders (Original, with prefix)' | TelerikMvc | 1,338.0 ms | 26.68 ms | 46.73 ms | 1,318.6 ms | 1,284.8 ms | 1,457.0 ms | 41 B | 7000.0000 | 1000.0000 | 1841.28 MB |
| 'Construct TagHelperBinders (Commit 1)' | TelerikMvc | 1,189.7 ms | 19.33 ms | 29.52 ms | 1,175.6 ms | 1,152.3 ms | 1,266.6 ms | 41 B | 1000.0000 | - | 1048.09 MB |
| 'Construct TagHelperBinders (Commit 1, with prefix)' | TelerikMvc | 1,260.3 ms | 23.66 ms | 40.82 ms | 1,257.3 ms | 1,198.7 ms | 1,377.0 ms | 41 B | 5000.0000 | - | 1276.13 MB |
| 'Construct TagHelperBinders (Commit 3)' | TelerikMvc | 1,619.3 ms | 21.67 ms | 19.21 ms | 1,618.0 ms | 1,588.9 ms | 1,664.2 ms | 41 B | 5000.0000 | - | 1209.53 MB |
| 'Construct TagHelperBinders (Commit 3, with prefix)' | TelerikMvc | 1,714.1 ms | 27.59 ms | 23.04 ms | 1,711.8 ms | 1,688.2 ms | 1,770.7 ms | 41 B | 5000.0000 | 1000.0000 | 1437.57 MB |
| 'Construct TagHelperBinders (Commit 5)' | TelerikMvc | 1,078.2 ms | 15.52 ms | 12.96 ms | 1,075.8 ms | 1,064.8 ms | 1,104.6 ms | 41 B | 2000.0000 | - | 593.22 MB |
| 'Construct TagHelperBinders (Commit 5, with prefix)' | TelerikMvc | 1,141.2 ms | 22.35 ms | 26.61 ms | 1,131.5 ms | 1,109.7 ms | 1,203.6 ms | 41 B | 3000.0000 | - | 821.27 MB |
src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/Language/TagHelperBinder.cs
Outdated
Show resolved
Hide resolved
src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/Language/TagHelperBinder.cs
Outdated
Show resolved
Hide resolved
src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/Language/TagHelperBinder.cs
Outdated
Show resolved
Hide resolved
src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/Language/TagHelperBinder.cs
Outdated
Show resolved
Hide resolved
src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/Language/TagHelperBinder.cs
Outdated
Show resolved
Hide resolved
src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/Language/TagHelperBinder.cs
Outdated
Show resolved
Hide resolved
src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/Language/TagHelperBinder.TagHelperSet.cs
Outdated
Show resolved
Hide resolved
src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/Language/TagHelperBinder.cs
Outdated
Show resolved
Hide resolved
Rename member
|
@davidwengier -- I think it would be best if you gave this another lookover, as it's changed pretty significantly since your last look, and I probably shouldn't count Dustin's review as he authored a good chunk of this. |
davidwengier
left a comment
There was a problem hiding this comment.
Glad you two worked out what you were doing in the end 😛
I think we are going with commit 5 as our final version as although it shows a CPU increase from the original code in this test, the allocations are significantly lower and we've validated that the CPU usage scales better for larger scenarios.
*** CPU (ms) ***
*** Allocations (MB) ***
*** Numbers exclude bad runs where we see the 4x source generator costs
*** Commit 1 test insertion: https://dev.azure.com/devdiv/DevDiv/_git/VS/pullrequest/670902
*** Commit 3 test insertion: https://dev.azure.com/devdiv/DevDiv/_git/VS/pullrequest/671321
*** Commit 5 test insertion: https://dev.azure.com/devdiv/DevDiv/_git/VS/pullrequest/672200