Experimental: Symbol-based usage detection (opt-in) #135
Replace the GetUsedAssemblyReferences approach with a Roslyn analyzer that tracks symbol usage at finer granularity, behind the ReferenceTrimmerUseSymbolAnalysis MSBuild property (opt-in, defaults to false).

The new approach uses RegisterSymbolAction and RegisterOperationAction to track which assemblies contain symbols that the code actually references, rather than relying on the compiler's broader 'used assembly' heuristic, which over-reports usage by treating transitive assembly dependencies as used.

Key design decisions:
- RT0001 (bare Reference): always uses conservative transitive closure to avoid breaking runtime dependencies that lack automatic transitive resolution
- RT0002 (ProjectReference): uses transitive closure only when DisableTransitiveProjectReferences is set; otherwise uses precise detection
- RT0003 (PackageReference): always uses precise symbol-based detection since NuGet handles transitive package deps automatically
- Attribute constructor/named arguments (including typeof) are tracked
- Early exit optimization when all reference assemblies are already tracked

The legacy GetUsedAssemblyReferences code path is preserved as the default. All E2E tests run in both modes via DataRow parameterization (91 pass). Version bumped from 3.4 to 3.5.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
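For readers unfamiliar with the two registration calls named in the commit message, the following is a minimal sketch of how an analyzer might combine them. It is not the code from this PR: the class name, the diagnostic ID SKETCH001, and the choice of which symbols and operations to inspect are all assumptions made for illustration, and it covers far fewer cases than the PR describes. It assumes a reference to the Microsoft.CodeAnalysis.CSharp package.

```csharp
using System.Collections.Concurrent;
using System.Collections.Immutable;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Diagnostics;
using Microsoft.CodeAnalysis.Operations;

// Hypothetical analyzer: records which assemblies define the symbols that source touches.
[DiagnosticAnalyzer(LanguageNames.CSharp)]
public sealed class UsedAssemblySketchAnalyzer : DiagnosticAnalyzer
{
    // Illustrative descriptor; the ID and wording are invented for this sketch.
    private static readonly DiagnosticDescriptor Rule = new DiagnosticDescriptor(
        "SKETCH001",
        "Assemblies referenced by source",
        "Assemblies with directly referenced symbols: {0}",
        "Sketch",
        DiagnosticSeverity.Info,
        isEnabledByDefault: true);

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics =>
        ImmutableArray.Create(Rule);

    public override void Initialize(AnalysisContext context)
    {
        context.EnableConcurrentExecution();
        context.ConfigureGeneratedCodeAnalysis(GeneratedCodeAnalysisFlags.None);

        context.RegisterCompilationStartAction(start =>
        {
            // Per-compilation state; callbacks may run concurrently.
            var used = new ConcurrentDictionary<string, byte>();

            void Track(ISymbol symbol)
            {
                IAssemblySymbol assembly = symbol?.ContainingAssembly;
                // Skip symbols defined by the project being compiled.
                if (assembly == null || ReferenceEquals(assembly, start.Compilation.Assembly))
                {
                    return;
                }
                used.TryAdd(assembly.Name, 0);
            }

            // Declarations: base types and implemented interfaces.
            start.RegisterSymbolAction(ctx =>
            {
                var type = (INamedTypeSymbol)ctx.Symbol;
                Track(type.BaseType);
                foreach (INamedTypeSymbol iface in type.Interfaces)
                {
                    Track(iface);
                }
            }, SymbolKind.NamedType);

            // Executable code: calls, constructions, and field/property reads.
            start.RegisterOperationAction(ctx =>
            {
                switch (ctx.Operation)
                {
                    case IInvocationOperation invocation:
                        Track(invocation.TargetMethod);
                        break;
                    case IObjectCreationOperation creation:
                        Track(creation.Constructor);
                        break;
                    case IMemberReferenceOperation member:
                        Track(member.Member);
                        break;
                }
            },
            OperationKind.Invocation,
            OperationKind.ObjectCreation,
            OperationKind.FieldReference,
            OperationKind.PropertyReference);

            // Report once all callbacks for the compilation have run.
            start.RegisterCompilationEndAction(end =>
                end.ReportDiagnostic(Diagnostic.Create(
                    Rule,
                    Location.None,
                    string.Join(", ", used.Keys.OrderBy(name => name)))));
        });
    }
}
```

A real implementation would compare the collected set against the project's declared references rather than just listing it, and would need many more operation kinds than the four shown here.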
Linking to dotnet/roslyn#625 as the original issue in the Roslyn repo. Tagging @AlekseyTs (the author of
If I interpret the approach correctly, I think it is likely to underreport references needed for a successful build. There are situations when the compiler needs information from types or assemblies that aren't explicitly mentioned in code.
@AlekseyTs mind sharing an example? Are you saying there are things that are unavailable to analyze during compilation, or that they're just missing from the current implementation?
I didn't review the implementation in this PR. Based on the description of the approach, I assumed that the implementation records information based on symbols referenced explicitly in source. If that is the case, this approach is likely to underreport references needed for a successful build. There are situations when the compiler needs information from types or assemblies that aren't explicitly mentioned in code. For example, assemblies with type forwarders aren't referenced explicitly in source, but they are necessary to locate forwarded types. This is just one example; I am pretty sure there are other scenarios.
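To make the type-forwarder case concrete, here is a small sketch of what a forwarding assembly looks like. The forwarder is declared in the source of the assembly that used to contain the type, not in the consuming project, so a consumer's symbols resolve straight to the type's new home and the forwarding assembly never appears as a containing assembly. The choice of StringBuilder as the forwarded type is purely illustrative, and the reflection call assumes .NET Core 3.0 or later.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Text;

// Tells loaders and compilers that a request for StringBuilder addressed to
// THIS assembly should be redirected to the assembly that really defines it.
[assembly: TypeForwardedTo(typeof(StringBuilder))]

public static class Program
{
    public static void Main()
    {
        // Lists each type this assembly forwards and the assembly it resolves to.
        foreach (Type forwarded in typeof(Program).Assembly.GetForwardedTypes())
        {
            Console.WriteLine($"{forwarded.FullName} -> {forwarded.Assembly.GetName().Name}");
        }
    }
}
```

A project that consumes a library compiled against the old location of the type needs this forwarding assembly at build time, even though nothing in the project's own source names it.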
@AlekseyTs I tried to cover some of the gaps in #138. I think there are two possibilities - either some of the functionality of
Also, isn't type forwarding defined in the code, as an assembly attribute? That's available during compilation.
* type forwarding
* Fill out the gaps
* Add tests
- Fill operation coverage gaps: switch patterns (statement + expression), default expressions, user-defined operators (binary/unary/compound/increment-decrement), function pointer types, and conversion operand types
- Performance: replace ConcurrentDictionary.Count (acquires all stripe locks on .NET Framework) with a monotonic int flag on the hot path; skip tracking for the compilation's own assembly via ReferenceEquals
- Platform-aware path comparers: case-insensitive on Windows/macOS, case-sensitive on Linux (addresses PR review feedback)
- Rename legacy -> default for the GetUsedAssemblyReferences analysis path
- Add analyzer unit tests (19 tests, ~6s) using CompilationWithAnalyzers for fast, targeted coverage of symbol tracking logic
- Replace E2E edge-case tests with unit tests to reduce test data bloat
- Add Microsoft.CodeAnalysis.CSharp.Workspaces package reference for tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
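The platform-aware path comparison mentioned in this commit could look roughly like the sketch below. The class and method names are hypothetical, and the sketch simplifies by treating every non-Linux platform as case-insensitive, which matches the Windows/macOS versus Linux split described above but is an assumption about everything else.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical helper; not taken from the PR.
public static class PathComparers
{
    // Linux file systems are normally case-sensitive; Windows and macOS
    // defaults are case-insensitive, so paths there compare ignoring case.
    public static StringComparer ForCurrentPlatform() =>
        RuntimeInformation.IsOSPlatform(OSPlatform.Linux)
            ? StringComparer.Ordinal
            : StringComparer.OrdinalIgnoreCase;
}

public static class Program
{
    public static void Main()
    {
        var referencePaths = new HashSet<string>(PathComparers.ForCurrentPlatform())
        {
            "/refs/Foo.dll",
        };

        // Whether this differently cased path matches depends on the platform.
        Console.WriteLine(referencePaths.Contains("/refs/foo.dll"));
    }
}
```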
- Replace ConcurrentDictionary.Count with Interlocked.Increment counter to avoid acquiring all stripe locks on .NET Framework
- Simplify allTracked flag to just trackedCount >= totalReferenceCount
- Track local variable types via IVariableDeclaratorOperation
- Add 17 new analyzer tests (19 -> 36 total):
  - nameof, XML doc cref (with doc mode on/off), type forwarding
  - local variables, lambdas, local functions, arrays, attributes
  - recursive patterns, events
  - PackageReference (RT0003): unused, used, multi-assembly aggregation
  - Bare Reference (RT0001): unused with transitive closure
- Fix test isolation: lambda and local function tests now use the specific handler as the sole tracking path

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
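The counter change in this commit can be illustrated with a short sketch. The ReferenceTracker type and its members are hypothetical names chosen for the example, not identifiers from the PR; only the trackedCount >= totalReferenceCount comparison comes from the commit message.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper showing the counter pattern; not the PR's actual type.
public sealed class ReferenceTracker
{
    private readonly ConcurrentDictionary<string, byte> _tracked =
        new ConcurrentDictionary<string, byte>(StringComparer.Ordinal);
    private readonly int _totalReferenceCount;
    private int _trackedCount;

    public ReferenceTracker(int totalReferenceCount)
    {
        _totalReferenceCount = totalReferenceCount;
    }

    // Reads a single int instead of calling ConcurrentDictionary.Count.
    public bool AllTracked => Volatile.Read(ref _trackedCount) >= _totalReferenceCount;

    public void Track(string assemblyName)
    {
        // Early exit: nothing left to learn once every reference is seen.
        if (AllTracked)
        {
            return;
        }

        // TryAdd succeeds only for the first insertion of a key, so the
        // counter grows exactly once per distinct assembly.
        if (_tracked.TryAdd(assemblyName, 0))
        {
            Interlocked.Increment(ref _trackedCount);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var tracker = new ReferenceTracker(totalReferenceCount: 2);
        tracker.Track("A");
        tracker.Track("A"); // duplicate, counter unchanged
        Console.WriteLine(tracker.AllTracked); // False
        tracker.Track("B");
        Console.WriteLine(tracker.AllTracked); // True
    }
}
```

Because the counter only ever increases, a stale read can at worst delay the early exit by one callback; it cannot report completion too early.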
2a27871 to 55a91b2
Summary
Adds an experimental symbol-based analysis mode behind ReferenceTrimmerUseSymbolAnalysis (opt-in, defaults to false). The legacy GetUsedAssemblyReferences code path is preserved as the default.

Motivation
GetUsedAssemblyReferences over-reports usage by treating transitive assembly dependencies as "used" even when the project's code doesn't reference them directly.

Approach
Uses RegisterSymbolAction + RegisterOperationAction to track which assemblies contain symbols that code actually references.

Safety measures for runtime deps: RT0001 uses conservative transitive closure for bare References; RT0002 respects DisableTransitiveProjectReferences; RT0003 uses precise detection (NuGet handles transitive deps).

Additional coverage beyond IOperation/ISymbol:
- nameof() and XML doc <cref> via syntax node actions
- IVariableDeclaratorOperation

Performance: Interlocked.Increment counter replaces ConcurrentDictionary.Count (which acquires all stripe locks on .NET Framework). Short-circuits all callbacks once every reference is accounted for.

Opt-in
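Since the mode is gated on the ReferenceTrimmerUseSymbolAnalysis MSBuild property and defaults to false, enabling it would presumably look like the fragment below. Where the property is set (a project file or a shared Directory.Build.props) is an assumption of this sketch rather than something stated in the PR.

```xml
<PropertyGroup>
  <!-- Opt in to the experimental symbol-based analysis (defaults to false). -->
  <ReferenceTrimmerUseSymbolAnalysis>true</ReferenceTrimmerUseSymbolAnalysis>
</PropertyGroup>
```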
Testing
CompilationWithAnalyzers (~2s runtime), covering: method calls, object creation, member references, base types, interfaces, generics, type constraints, attributes, typeof, catch, default, operators, conversions, patterns (switch/is/recursive), nameof, XML doc cref, type forwarding, local variables, lambdas, local functions, arrays, events, and PackageReference/Reference aggregation (RT0001/RT0003)

Rollout plan