Contributor

@Erarndt Erarndt commented Aug 4, 2025

Fixes #

Context

Part of the cost of a GC is the number of objects that survive the collection. During solution load, this stack showed up as currently rooted items:

image

You can see that roughly 30 MB worth of `object` instances are being held onto by the tables collection inside of `ConcurrentDictionary`.

Additionally, a `ConcurrentDictionary` instance is significantly larger than a `Dictionary`. For instance:

10k of them when empty:

image

10k of them with one item:

image

That is a size difference of between 5x and 15x.
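For anyone who wants a rough version of this comparison without a profiler, here is a minimal sketch. It is not what produced the screenshots above: it counts bytes allocated on the current thread with `GC.GetAllocatedBytesForCurrentThread()` rather than retained size, and `DictionarySizeProbe`, `Measure`, `PrintComparison`, and the `string`/`object` type arguments are names chosen purely for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical helper, not part of MSBuild.
public static class DictionarySizeProbe
{
    // Returns the bytes allocated on the current thread while creating
    // `count` instances with `factory`. This is allocation volume, which
    // only approximates the retained size a heap snapshot would report.
    public static long Measure(int count, Func<object> factory)
    {
        var keep = new object[count]; // allocated before the first reading
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < count; i++)
        {
            keep[i] = factory();
        }
        long after = GC.GetAllocatedBytesForCurrentThread();
        GC.KeepAlive(keep);
        return after - before;
    }

    // Prints the four measurements; the numbers depend on the runtime,
    // the bitness, and the processor count (ConcurrentDictionary sizes its
    // lock array from the default concurrency level).
    public static void PrintComparison()
    {
        const int Count = 10000;
        Console.WriteLine("Dictionary, empty:              " +
            Measure(Count, () => new Dictionary<string, object>()));
        Console.WriteLine("ConcurrentDictionary, empty:    " +
            Measure(Count, () => new ConcurrentDictionary<string, object>()));
        Console.WriteLine("Dictionary, one item:           " +
            Measure(Count, () => new Dictionary<string, object> { ["a"] = "b" }));
        Console.WriteLine("ConcurrentDictionary, one item: " +
            Measure(Count, () =>
            {
                var d = new ConcurrentDictionary<string, object>();
                d["a"] = "b";
                return d;
            }));
    }
}
```

Calling `DictionarySizeProbe.PrintComparison()` from a console app prints the four totals. The exact ratio will vary by machine, so treat the output as a sanity check rather than a replacement for the profiler data.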

The existing implementation can be swapped for a plain `Dictionary` guarded by a lock.
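A minimal sketch of that pattern follows. `LockedCache` and its members are hypothetical and only illustrate the idea; the actual change is in `RegisteredTaskRecord` (src/Build/Instance/TaskRegistry.cs) and may differ in detail.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a per-record cache; not the MSBuild type.
public sealed class LockedCache<TKey, TValue>
{
    // Private lock object so no outside code can take the same lock.
    private readonly object _gate = new object();
    private readonly Dictionary<TKey, TValue> _items = new Dictionary<TKey, TValue>();

    // Reads take the lock too: Dictionary is not safe to read while
    // another thread may be writing to it.
    public bool TryGetValue(TKey key, out TValue value)
    {
        lock (_gate)
        {
            return _items.TryGetValue(key, out value);
        }
    }

    // Adds the entry or overwrites an existing one.
    public void Set(TKey key, TValue value)
    {
        lock (_gate)
        {
            _items[key] = value;
        }
    }

    public int Count
    {
        get
        {
            lock (_gate)
            {
                return _items.Count;
            }
        }
    }
}
```

A single lock serializes all readers and writers of one instance. That is cheap when each instance sees little contention, and it is the trade-off to weigh if many threads end up hitting the same instance.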

Changes Made

Testing

Notes

Copilot AI review requested due to automatic review settings August 4, 2025 22:47
Contributor

Copilot AI left a comment

Pull Request Overview

This PR optimizes memory usage by replacing ConcurrentDictionary with a regular Dictionary protected by locks in the RegisteredTaskRecord class. The change addresses heap pressure during solution load where ConcurrentDictionary objects were consuming significant memory (30MB+ of surviving objects).

  • Replaces ConcurrentDictionary<RegisteredTaskIdentity, object> with Dictionary<RegisteredTaskIdentity, object>
  • Implements manual locking around dictionary access to maintain thread safety
  • Reduces memory overhead by 5x-15x based on profiling data
Comments suppressed due to low confidence (1)

src/Build/Instance/TaskRegistry.cs:1441

  • The finally block is missing a closing brace. There should be two closing braces - one for the lock block and one for the finally block.
                        _taskNamesCreatableByFactory[taskIdentity] = creatableByFactory;

@YuliiaKovalova
Member

@surayya-MS heads-up, it's related to your recent changes

Member

@surayya-MS surayya-MS left a comment

Thanks for the PR! There are definitely memory savings here.

However, I'm still leaning towards using `ConcurrentDictionary` instead of a normal `Dictionary` + lock for the new Multi-threaded MSBuild feature, because it is optimized for concurrent reads and writes.

@rainersigwald what do you think?

Contributor Author

Erarndt commented Aug 22, 2025

> Thanks for the PR! There are definitely memory savings here.
>
> However, I'm still leaning towards using `ConcurrentDictionary` instead of a normal `Dictionary` + lock for the new Multi-threaded MSBuild feature, because it is optimized for concurrent reads and writes.
>
> @rainersigwald what do you think?

I'd be curious how much contention you're actually seeing in these paths. The heap dump shows 41,000 `RegisteredTaskRecord` objects, so I don't expect significant contention on any one of them.

The other thing to weigh for the Multi-threaded feature is that garbage collections are going to have an even bigger impact on performance. These `ConcurrentDictionary` instances alone represent 6.3% of all objects alive in the heap in the snapshot. There is a cost for the initial allocations (which cause GCs to trigger more frequently) and a cost for the memory being kept alive (promoting more memory makes each GC take longer). Each GC pauses all managed threads, so any savings here can have a significant impact.

Member

@rainersigwald rainersigwald left a comment

I think my big question here is "why are there enough of these RegisteredTaskRecords to matter?"

The set of tasks is pretty uniform between projects--should we tackle the size here by refactoring to share a single task registration object per resolved task, so e.g. in normal cases there's only one Csc? Is it per-project now?

Member

@rainersigwald rainersigwald left a comment

This change looks fine to me for now/if we don't pursue the better refactoring.

@surayya-MS surayya-MS enabled auto-merge (squash) August 24, 2025 06:16
@surayya-MS surayya-MS merged commit d8bb7ac into dotnet:main Aug 24, 2025
9 checks passed
JanProvaznik pushed a commit that referenced this pull request Aug 26, 2025
@Erarndt Erarndt deleted the dev/erarndt/concurrentDict branch September 22, 2025 18:09