Experimental: Symbol-based usage detection (opt-in) #135

Merged
dfederm merged 4 commits into main from dfederm/symbol-based-usage-detection
Apr 24, 2026

Conversation

@dfederm
Owner

@dfederm dfederm commented Apr 10, 2026

Summary

Adds an experimental symbol-based analysis mode behind ReferenceTrimmerUseSymbolAnalysis (opt-in, defaults to false). The legacy GetUsedAssemblyReferences code path is preserved as the default.

Motivation

GetUsedAssemblyReferences over-reports usage by treating transitive assembly dependencies as "used" even when the project's code doesn't reference them directly.

Approach

Uses RegisterSymbolAction + RegisterOperationAction to track which assemblies contain symbols that code actually references. Safety measures for runtime deps: RT0001 uses conservative transitive closure for bare References; RT0002 respects DisableTransitiveProjectReferences; RT0003 uses precise detection (NuGet handles transitive deps).
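The per-rule behavior above can be modeled outside Roslyn. The following is a minimal sketch, not the analyzer's actual code: `ReferenceKind`, `UsageSketch`, and the assembly names are hypothetical, and it assumes "conservative transitive closure" means that a reference counts as used when it is reachable through the dependencies of an assembly the code uses directly.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical graph: the project uses Lib directly, and Lib depends on Helper.
var dependencies = new Dictionary<string, string[]>
{
    ["Lib"] = new[] { "Helper" },
};
var directlyUsed = new HashSet<string> { "Lib" };
var closure = UsageSketch.TransitiveClosure(directlyUsed, dependencies);

// Helper is never named in source, so only the conservative mode keeps it.
Console.WriteLine(UsageSketch.IsUsed("Helper", ReferenceKind.Reference, false, directlyUsed, closure));        // True
Console.WriteLine(UsageSketch.IsUsed("Helper", ReferenceKind.PackageReference, false, directlyUsed, closure)); // False

// Hypothetical stand-ins for the three item types behind RT0001, RT0002, and RT0003.
enum ReferenceKind { Reference, ProjectReference, PackageReference }

static class UsageSketch
{
    // Returns the seeds plus everything reachable from them along dependency edges.
    public static HashSet<string> TransitiveClosure(
        IEnumerable<string> seeds,
        IReadOnlyDictionary<string, string[]> dependencies)
    {
        var seen = new HashSet<string>();
        var pending = new Stack<string>(seeds);
        while (pending.Count > 0)
        {
            string current = pending.Pop();
            if (!seen.Add(current))
            {
                continue; // already visited
            }

            if (dependencies.TryGetValue(current, out var next))
            {
                foreach (string dependency in next)
                {
                    pending.Push(dependency);
                }
            }
        }

        return seen;
    }

    // Bare References always use the closure; ProjectReferences use it only when
    // transitive project references are disabled; PackageReferences never do.
    public static bool IsUsed(
        string assembly,
        ReferenceKind kind,
        bool disableTransitiveProjectReferences,
        HashSet<string> directlyUsed,
        HashSet<string> closure)
    {
        bool conservative = kind == ReferenceKind.Reference
            || (kind == ReferenceKind.ProjectReference && disableTransitiveProjectReferences);
        return conservative ? closure.Contains(assembly) : directlyUsed.Contains(assembly);
    }
}
```

In this model the precise mode flags `Helper` as removable when it is a PackageReference, which is only safe because NuGet restores transitive packages on its own.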

Additional coverage beyond IOperation/ISymbol:

  • C# nameof() and XML doc <cref> via syntax node actions
  • Type forwarding assemblies marked as used when their destination assembly is used (PR #138, "Cover type forwarding and other gaps")
  • Local variable type declarations via IVariableDeclaratorOperation
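For illustration, here is the kind of source where a type shows up only through `nameof`, an XML doc `cref`, or a local variable declaration. This is a hedged sketch: `SyntaxOnlyUsage` is a hypothetical class, and `StringBuilder` merely stands in for a type that would live in a separately referenced assembly.

```csharp
using System;
using System.Text;

Console.WriteLine(SyntaxOnlyUsage.TypeName); // StringBuilder

/// <summary>
/// Mentions <see cref="StringBuilder"/> in documentation only.
/// </summary>
static class SyntaxOnlyUsage
{
    // nameof(...) names the type without calling any of its members.
    public static string TypeName => nameof(StringBuilder);

    public static void DeclareOnly()
    {
        // The type appears only as the declared type of a local variable.
        StringBuilder? unused = null;
        _ = unused;
    }
}
```

None of these three usages invokes a member of the type, yet removing the reference that defines it would break the build (or, for `cref`, produce a documentation warning when doc comments are processed).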

Performance: Interlocked.Increment counter replaces ConcurrentDictionary.Count (which acquires all stripe locks on .NET Framework). Short-circuits all callbacks once every reference is accounted for.
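A minimal sketch of that counter pattern follows, under the assumption that tracked assemblies are keyed by path. `ReferenceTracker` and its members are hypothetical names, not the analyzer's own.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

var tracker = new ReferenceTracker(totalReferenceCount: 3);
Parallel.ForEach(
    new[] { "A.dll", "B.dll", "A.dll", "C.dll", "B.dll" },
    path => tracker.Track(path));
Console.WriteLine(tracker.AllTracked); // True

sealed class ReferenceTracker
{
    private readonly ConcurrentDictionary<string, byte> _tracked =
        new ConcurrentDictionary<string, byte>(StringComparer.OrdinalIgnoreCase);
    private readonly int _totalReferenceCount;
    private int _trackedCount;

    public ReferenceTracker(int totalReferenceCount) => _totalReferenceCount = totalReferenceCount;

    // Reads a plain int instead of calling ConcurrentDictionary.Count on the hot path.
    public bool AllTracked => Volatile.Read(ref _trackedCount) >= _totalReferenceCount;

    public void Track(string assemblyPath)
    {
        if (AllTracked)
        {
            return; // every reference is already accounted for, so skip the work
        }

        // TryAdd succeeds exactly once per distinct key, so the counter
        // is incremented once per newly tracked assembly.
        if (_tracked.TryAdd(assemblyPath, 0))
        {
            Interlocked.Increment(ref _trackedCount);
        }
    }
}
```

Because the counter only ever grows, once it reaches the total every later callback can return immediately without touching the dictionary.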

Opt-in

<PropertyGroup>
  <ReferenceTrimmerUseSymbolAnalysis>true</ReferenceTrimmerUseSymbolAnalysis>
</PropertyGroup>

Testing

  • 36 analyzer unit tests using CompilationWithAnalyzers (~2s runtime), covering: method calls, object creation, member references, base types, interfaces, generics, type constraints, attributes, typeof, catch, default, operators, conversions, patterns (switch/is/recursive), nameof, XML doc cref, type forwarding, local variables, lambdas, local functions, arrays, events, and PackageReference/Reference aggregation (RT0001/RT0003)
  • All E2E tests run in both modes via DataRow parameterization

Rollout plan

  1. Ship as opt-in in 3.5
  2. Enable on key repos, fix bugs
  3. If successful, make default in a future major version

Replace the GetUsedAssemblyReferences approach with a Roslyn analyzer that tracks
symbol usage at finer granularity, behind the ReferenceTrimmerUseSymbolAnalysis
MSBuild property (opt-in, defaults to false).

The new approach uses RegisterSymbolAction and RegisterOperationAction to track
which assemblies contain symbols that the code actually references, rather than
relying on the compiler's broader 'used assembly' heuristic which over-reports
usage by treating transitive assembly dependencies as used.

Key design decisions:
- RT0001 (bare Reference): always uses conservative transitive closure to avoid
  breaking runtime dependencies that lack automatic transitive resolution
- RT0002 (ProjectReference): uses transitive closure only when
  DisableTransitiveProjectReferences is set; otherwise uses precise detection
- RT0003 (PackageReference): always uses precise symbol-based detection since
  NuGet handles transitive package deps automatically
- Attribute constructor/named arguments (including typeof) are tracked
- Early exit optimization when all reference assemblies are already tracked

The legacy GetUsedAssemblyReferences code path is preserved as the default.
All E2E tests run in both modes via DataRow parameterization (91 pass).
Version bumped from 3.4 to 3.5.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread src/Analyzer/ReferenceTrimmerAnalyzer.cs Outdated
@stan-sz
Contributor

stan-sz commented Apr 14, 2026

Linking to dotnet/roslyn#625 as the original issue in the Roslyn repo. Tagging @AlekseyTs (the author of GetUsedAssemblyReferences) for comments on this new approach.

@AlekseyTs

Approach

Uses RegisterSymbolAction + RegisterOperationAction to track which assemblies contain symbols that code actually references.

If I interpret the approach correctly, I think it is likely to underreport references needed for a successful build. There are situations where the compiler needs information from types or assemblies that aren't explicitly mentioned in code.

@olstakh
Contributor

olstakh commented Apr 17, 2026

@AlekseyTs mind sharing an example? Are you saying there are things that are unavailable to analyze during compilation, or that they're just missing from the current implementation?

@AlekseyTs

Are you saying there are things that are unavailable to analyze during compilation, or that they're just missing from the current implementation?

I didn't review the implementation in this PR. Based on the description of the approach, I assumed that the implementation records information based on symbols referenced explicitly in source. If that is the case, this approach is likely to underreport references needed for a successful build. There are situations where the compiler needs information from types or assemblies that aren't explicitly mentioned in code. For example, assemblies with type forwarders aren't referenced explicitly in source, but they are necessary to locate forwarded types. This is just one example; I am pretty sure there are other scenarios.

@olstakh
Contributor

olstakh commented Apr 19, 2026

@AlekseyTs I tried to cover some of the gaps in #138. I think there are two possibilities: either some of the functionality of the GetUsedAssemblyReferences() API is not available to regular syntax registrations, or we just need to cover all the cases. I'm still looking for evidence of the first option, in which case this endeavor becomes questionable, or permanently opt-in at the user's risk. Otherwise, we just need to cover all the cases.

Also, isn't type forwarding defined in the code, as an assembly attribute? That's available during compilation

olstakh and others added 2 commits April 19, 2026 21:43
* type forwarding

* Fill out the gaps

* Add tests
- Fill operation coverage gaps: switch patterns (statement + expression),
  default expressions, user-defined operators (binary/unary/compound/
  increment-decrement), function pointer types, and conversion operand types
- Performance: replace ConcurrentDictionary.Count (acquires all stripe locks
  on .NET Framework) with a monotonic int flag on the hot path; skip tracking
  for the compilation's own assembly via ReferenceEquals
- Platform-aware path comparers: case-insensitive on Windows/macOS,
  case-sensitive on Linux (addresses PR review feedback)
- Rename legacy -> default for the GetUsedAssemblyReferences analysis path
- Add analyzer unit tests (19 tests, ~6s) using CompilationWithAnalyzers
  for fast, targeted coverage of symbol tracking logic
- Replace E2E edge-case tests with unit tests to reduce test data bloat
- Add Microsoft.CodeAnalysis.CSharp.Workspaces package reference for tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread src/Analyzer/ReferenceTrimmerAnalyzer.cs Outdated
- Replace ConcurrentDictionary.Count with Interlocked.Increment counter
  to avoid acquiring all stripe locks on .NET Framework
- Simplify allTracked flag to just trackedCount >= totalReferenceCount
- Track local variable types via IVariableDeclaratorOperation
- Add 17 new analyzer tests (19 -> 36 total):
  - nameof, XML doc cref (with doc mode on/off), type forwarding
  - local variables, lambdas, local functions, arrays, attributes
  - recursive patterns, events
  - PackageReference (RT0003): unused, used, multi-assembly aggregation
  - Bare Reference (RT0001): unused with transitive closure
- Fix test isolation: lambda and local function tests now use the
  specific handler as the sole tracking path

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@dfederm dfederm force-pushed the dfederm/symbol-based-usage-detection branch from 2a27871 to 55a91b2 Compare April 24, 2026 15:51
@dfederm dfederm marked this pull request as ready for review April 24, 2026 16:30
@dfederm dfederm merged commit cc3ff62 into main Apr 24, 2026
2 checks passed
@dfederm dfederm deleted the dfederm/symbol-based-usage-detection branch April 24, 2026 16:56
4 participants