Skip to content

[OpenTelemetry] Improve HistogramExplicitBounds#7165

Merged
martincostello merged 13 commits into
open-telemetry:mainfrom
martincostello:gh-3428
Jun 9, 2026
Merged

[OpenTelemetry] Improve HistogramExplicitBounds#7165
martincostello merged 13 commits into
open-telemetry:mainfrom
martincostello:gh-3428

Conversation

@martincostello

@martincostello martincostello commented Apr 25, 2026

Copy link
Copy Markdown
Member

Fixes #3428

Changes

Apply recommendations from #3428:

  • Use SIMD vectorization to find buckets.
  • Use radix sort.
  • Apply cap to histogram explicit bounds (10M).

Written primarily by Copilot through an iterative approach.

Final benchmark results are shown below: #7165 (comment)

I haven't benchmarked the ARM64 SIMD branch as I don't have the hardware for that locally.

Merge requirement checklist

  • CONTRIBUTING guidelines followed (license requirements, nullable enabled, static analysis, etc.)
  • Unit tests added/updated
  • Appropriate CHANGELOG.md files updated for non-trivial changes
  • Changes in public API reviewed (if applicable)

Apply recommendations from open-telemetry#3428:

- Use SIMD vectorization to find buckets.
- Use radix sort.
- Apply cap to histogram explicit bounds (10M).

Written primarily by Copilot through an iterative approach.
@github-actions github-actions Bot added pkg:OpenTelemetry Issues related to OpenTelemetry NuGet package perf Performance related labels Apr 25, 2026
@codecov

codecov Bot commented Apr 25, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 95.76271% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.75%. Comparing base (e4762a0) to head (6e87e47).

Files with missing lines Patch % Lines
...c/OpenTelemetry/Metrics/HistogramExplicitBounds.cs 95.28% 5 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #7165      +/-   ##
==========================================
- Coverage   89.82%   89.75%   -0.08%     
==========================================
  Files         276      276              
  Lines       14738    14822      +84     
==========================================
+ Hits        13238    13303      +65     
- Misses       1500     1519      +19     
Flag Coverage Δ
unittests-Project-Experimental 89.63% <95.76%> (-0.10%) ⬇️
unittests-Project-Stable 89.66% <95.76%> (-0.07%) ⬇️
unittests-Solution 89.46% <95.76%> (-0.15%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
src/OpenTelemetry/Metrics/MetricStreamIdentity.cs 87.56% <100.00%> (+0.19%) ⬆️
...trics/View/ExplicitBucketHistogramConfiguration.cs 89.47% <100.00%> (+6.14%) ⬆️
...c/OpenTelemetry/Metrics/HistogramExplicitBounds.cs 95.68% <95.28%> (-4.32%) ⬇️

... and 6 files with indirect coverage changes

Add missing patch coverage.
Fix CS8604 warning.
@martincostello

This comment was marked as outdated.

Add note for histogram boundary upper limit.
@martincostello martincostello marked this pull request as ready for review April 25, 2026 14:55
@martincostello martincostello requested a review from a team as a code owner April 25, 2026 14:55
Copilot AI review requested due to automatic review settings April 25, 2026 14:55

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Improves histogram explicit-bound bucket lookup performance in the metrics pipeline, and adds a hard cap on the number of explicit boundaries to prevent pathological allocations/processing.

Changes:

  • Replaces the previous large-bound lookup strategy with a radix-partitioned search + SIMD-accelerated linear scan fallback.
  • Adds a 10,000,000 max explicit boundary count check for both view configuration and instrument advice.
  • Adds unit tests, fuzz/property tests, and benchmarks; updates changelog and solution/project wiring.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
test/OpenTelemetry.Tests/Metrics/MetricViewTests.cs Adds unit test validating boundary-count limit exception details.
test/OpenTelemetry.Tests/Metrics/AggregatorTests.cs Adds coverage for large-bound lookup edge cases (mixed signs, infinities, NaN, degenerate NaN bounds) and display-bounds cleanup.
test/OpenTelemetry.FuzzTests/OpenTelemetry.FuzzTests.csproj New fuzz test project for OpenTelemetry core metrics.
test/OpenTelemetry.FuzzTests/Metrics/HistogramExplicitBoundsFuzzTests.cs Adds FsCheck property tests comparing optimized path vs scalar baseline.
test/Benchmarks/Metrics/HistogramExplicitBoundsBenchmarks.cs Adds BenchmarkDotNet benchmarks for bucket lookup across sizes/layouts.
src/OpenTelemetry/OpenTelemetry.csproj Grants InternalsVisibleTo for the new OpenTelemetry.FuzzTests project.
src/OpenTelemetry/Metrics/View/ExplicitBucketHistogramConfiguration.cs Introduces MaxBoundaryCount and enforces it when setting Boundaries.
src/OpenTelemetry/Metrics/MetricStreamIdentity.cs Enforces MaxBoundaryCount for instrument advice-provided boundaries.
src/OpenTelemetry/Metrics/HistogramExplicitBounds.cs Implements radix-partitioned + SIMD-assisted bucket index lookup and infinity cleanup.
src/OpenTelemetry/CHANGELOG.md Documents the explicit-boundary cap as a breaking change.
OpenTelemetry.slnx Adds the new fuzz test project to the solution.
Comments suppressed due to low confidence (1)

src/OpenTelemetry/Metrics/View/ExplicitBucketHistogramConfiguration.cs:74

  • IsSortedAndDistinct currently treats double.NaN as valid because all comparisons with NaN return false, so arrays containing NaN can pass validation even though they are not meaningfully “ascending order with distinct values”. Consider explicitly rejecting NaN (and possibly non-finite values) in this check so Boundaries validation matches the documented requirements and downstream bucketing assumptions.
        for (var i = 1; i < values.Length; i++)
        {
            if (values[i] <= values[i - 1])
            {
                return false;

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/OpenTelemetry/Metrics/MetricStreamIdentity.cs Outdated
Comment thread src/OpenTelemetry/Metrics/MetricStreamIdentity.cs Outdated
@martincostello martincostello added the keep-open Prevents issues and pull requests being closed as stale label May 7, 2026
@Kielek

Kielek commented May 19, 2026

Copy link
Copy Markdown
Member

@cijothomas, could you please check this PR? Especially adding boundaries?

@Kielek Kielek requested a review from cijothomas May 19, 2026 11:10

@Kielek Kielek left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some direct feedback while reviewing by Codex, in general LGTM.

It will be great to see Cijo final review, but I will not block this.

@martincostello, I think that you need to solve problems with MemoryMarshal, your preious changes are working ;)

@martincostello

Copy link
Copy Markdown
Member Author

Need to re-run the benchmarks with the latest changes.

@martincostello

Copy link
Copy Markdown
Member Author

Benchmark comparison: main (baseline) vs gh-3428 (candidate) (generated by Claude 🐙)

Benchmarks.Metrics.HistogramExplicitBoundsBenchmarks — BenchmarkDotNet v0.15.8, .NET 10.0.8 (X64 RyuJIT), 13th Gen Intel Core i7-13700H.

TL;DR

  • Allocations: unchanged. Every benchmark allocates 0 B on both branches — this change is allocation-neutral.
  • Special-case lookups (±Infinity, NaN) are dramatically faster. These now short-circuit instead of scanning the bounds, dropping from ~4–21 ns to well under 1 ns (often ~0).
  • General finite-value lookups are ~50% faster at larger bound counts. At BoundCount=49, ExactBoundary / InRangeValue / WithInfiniteBounds all roughly halve (~10.5 ns → ~5.1 ns).
  • At small bound counts (10) the general lookups are roughly neutral (within noise; one case ~5% slower).
  • ⚠️ BoundCount=50 and BoundCount=1000 comparisons for the finite-value lookups are unreliable — several baseline cells report sub-nanosecond means (e.g. 0.0007 ns, 0.0308 ns), which indicate the JIT eliminated the operation as dead code. The candidate measures real work in those cells, so the apparent "regression" there is a measurement artifact, not a true slowdown. I've excluded those from the conclusions above.

Representative results (.NET 10.0, Mean)

Cases below are where both branches measure real work, so the comparison is sound:

Scenario (BoundCount / Layout) Method Baseline Candidate Δ
49 / MixedSigned LookupExactBoundary 10.70 ns 5.14 ns −52%
49 / MixedSigned LookupInRangeValue 10.89 ns 5.03 ns −54%
49 / MixedSigned LookupWithInfiniteBounds 10.45 ns 5.12 ns −51%
49 / MixedSigned LookupPositiveInfinity 20.06 ns 0.54 ns −97%
49 / MixedSigned LookupNaN 15.67 ns 0.05 ns −100%
49 / PositiveOnly LookupExactBoundary 10.41 ns 4.98 ns −52%
49 / PositiveOnly LookupInRangeValue 10.34 ns 4.85 ns −53%
49 / PositiveOnly LookupPositiveInfinity 21.02 ns 0.46 ns −98%
49 / PositiveOnly LookupNaN 15.45 ns ~0 ns −100%
10 / MixedSigned LookupPositiveInfinity 7.84 ns 0.64 ns −92%
10 / MixedSigned LookupNaN 2.87 ns 0.02 ns −99%
10 / MixedSigned LookupExactBoundary 2.47 ns 2.18 ns −12%
10 / MixedSigned LookupWithInfiniteBounds 2.50 ns 2.26 ns −10%
10 / PositiveOnly LookupPositiveInfinity 4.48 ns 0.50 ns −89%
10 / PositiveOnly LookupNaN 3.04 ns ~0 ns −100%
10 / PositiveOnly LookupExactBoundary 2.48 ns 2.61 ns +5% (noise)
10 / PositiveOnly LookupInRangeValue 2.57 ns 2.41 ns −6%

The ±Infinity improvement also holds for the 50 and 1000 bound counts (e.g. LookupNegativeInfinity at 1000/MixedSigned: 3.29 ns → 0.18 ns; LookupPositiveInfinity: 4.13 ns → 0.48 ns) — these baseline cells were not dead-code-eliminated, so they're trustworthy.

Conclusion

A clear performance win with no allocation cost: infinity/NaN edge-case lookups are now effectively free, and the common finite-value lookups are ~50% faster once the bound array is large enough to matter. No genuine regressions were found — the only "slower" numbers come from baseline cells the JIT had optimized away, where the candidate is simply being measured doing real work.

Note: the .NET Framework 4.6.2 rows show the same trend and the same dead-code-elimination artifacts at BoundCount 50/1000.

intellitect-bot pushed a commit to IntelliTect/EssentialCSharp.Web that referenced this pull request Jun 12, 2026
….0 (#1201)

Updated
[OpenTelemetry.Exporter.OpenTelemetryProtocol](https://github.com/open-telemetry/opentelemetry-dotnet)
from 1.15.3 to 1.16.0.

<details>
<summary>Release notes</summary>

_Sourced from [OpenTelemetry.Exporter.OpenTelemetryProtocol's
releases](https://github.com/open-telemetry/opentelemetry-dotnet/releases)._

## 1.16.0

For highlights and announcements pertaining to this release see:
[Release Notes >
1.16.0](https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/RELEASENOTES.md#​1160).

The following changes are from the previous release
[1.16.0-rc.1](https://github.com/open-telemetry/opentelemetry-dotnet/releases/tag/core-1.16.0-rc.1).

* NuGet: [OpenTelemetry
v1.16.0](https://www.nuget.org/packages/OpenTelemetry/1.16.0)

  No notable changes.

See
[CHANGELOG](https://github.com/open-telemetry/opentelemetry-dotnet/blob/core-1.16.0/src/OpenTelemetry/CHANGELOG.md)
for details.

* NuGet: [OpenTelemetry.Api
v1.16.0](https://www.nuget.org/packages/OpenTelemetry.Api/1.16.0)

  No notable changes.

See
[CHANGELOG](https://github.com/open-telemetry/opentelemetry-dotnet/blob/core-1.16.0/src/OpenTelemetry.Api/CHANGELOG.md)
for details.

* NuGet: [OpenTelemetry.Api.ProviderBuilderExtensions
v1.16.0](https://www.nuget.org/packages/OpenTelemetry.Api.ProviderBuilderExtensions/1.16.0)

  No notable changes.

See
[CHANGELOG](https://github.com/open-telemetry/opentelemetry-dotnet/blob/core-1.16.0/src/OpenTelemetry.Api.ProviderBuilderExtensions/CHANGELOG.md)
for details.

* NuGet: [OpenTelemetry.Exporter.Console
v1.16.0](https://www.nuget.org/packages/OpenTelemetry.Exporter.Console/1.16.0)

  No notable changes.

See
[CHANGELOG](https://github.com/open-telemetry/opentelemetry-dotnet/blob/core-1.16.0/src/OpenTelemetry.Exporter.Console/CHANGELOG.md)
for details.

* NuGet: [OpenTelemetry.Exporter.InMemory
v1.16.0](https://www.nuget.org/packages/OpenTelemetry.Exporter.InMemory/1.16.0)

  No notable changes.

See
[CHANGELOG](https://github.com/open-telemetry/opentelemetry-dotnet/blob/core-1.16.0/src/OpenTelemetry.Exporter.InMemory/CHANGELOG.md)
for details.

* NuGet: [OpenTelemetry.Exporter.OpenTelemetryProtocol
v1.16.0](https://www.nuget.org/packages/OpenTelemetry.Exporter.OpenTelemetryProtocol/1.16.0)

  No notable changes.

See
[CHANGELOG](https://github.com/open-telemetry/opentelemetry-dotnet/blob/core-1.16.0/src/OpenTelemetry.Exporter.OpenTelemetryProtocol/CHANGELOG.md)
for details.

* NuGet: [OpenTelemetry.Exporter.Zipkin
v1.16.0](https://www.nuget.org/packages/OpenTelemetry.Exporter.Zipkin/1.16.0)

  No notable changes.

See
[CHANGELOG](https://github.com/open-telemetry/opentelemetry-dotnet/blob/core-1.16.0/src/OpenTelemetry.Exporter.Zipkin/CHANGELOG.md)
for details.

* NuGet: [OpenTelemetry.Extensions.Hosting
v1.16.0](https://www.nuget.org/packages/OpenTelemetry.Extensions.Hosting/1.16.0)

  No notable changes.

 ... (truncated)

## 1.16.0-rc.1

The following changes are from the previous release
[1.15.3](https://github.com/open-telemetry/opentelemetry-dotnet/releases/tag/core-1.15.3).

* NuGet: [OpenTelemetry
v1.16.0-rc.1](https://www.nuget.org/packages/OpenTelemetry/1.16.0-rc.1)

* Stop validating View-provided metric stream `Name` against the
instrument
    name syntax, per
[spec
clarification](open-telemetry/opentelemetry-specification#5094).

([#​7300](open-telemetry/opentelemetry-dotnet#7300))
  
* Fix incorrect validation of `OTEL_BSP_*` and `OTEL_BLRP_*` environment
    variables.

([#​7187](open-telemetry/opentelemetry-dotnet#7187))
  
* Fix observable instrument callbacks running once per reader instead of
    once per collection cycle.

([#​7188](open-telemetry/opentelemetry-dotnet#7188))
  
* Added exception safety for user-supplied `ExemplarReservoir`
implementations.
Exceptions thrown from `Offer` are now caught and logged rather than
propagating
    out of `Counter.Add`/`Histogram.Record`.

([#​7277](open-telemetry/opentelemetry-dotnet#7277))
  
* Update `OpenTelemetrySdkEventSource` to support the W3C randomness
flag.

([#​7301](open-telemetry/opentelemetry-dotnet#7301))
  
  * Added `ObservedTimestamp` property to `LogRecord`.

([#​6979](open-telemetry/opentelemetry-dotnet#6979))
  
* **Breaking Change** Explicit histogram boundaries no longer allow more
than
    10 million values.

([#​7165](open-telemetry/opentelemetry-dotnet#7165))
  
* Fixed a circular reference which could cause a `LoggerProvider` to
fail to
resolve when one of its dependencies depends on `ILogger` or
`ILoggerFactory`.
As part of this fix the `LoggerProvider` resolved from dependency
injection
is now created lazily when the first logger is created rather than when
`ILoggerProvider` or `ILoggerFactory` is resolved. A consequence is that
any
invalid configuration now surfaces when the first log record is written
instead
    of when the logging services are resolved.

([#​7308](open-telemetry/opentelemetry-dotnet#7308))

See
[CHANGELOG](https://github.com/open-telemetry/opentelemetry-dotnet/blob/core-1.16.0-rc.1/src/OpenTelemetry/CHANGELOG.md)
for details.

* NuGet: [OpenTelemetry.Api
v1.16.0-rc.1](https://www.nuget.org/packages/OpenTelemetry.Api/1.16.0-rc.1)

  * **Experimental (pre-release builds only):**
Add support for using environment variables as context propagation
carriers.

([#​7174](open-telemetry/opentelemetry-dotnet#7174))
  
* Fix `BaggagePropagator` to correctly follow Key and Value Encoding
rules as per
 ... (truncated)

## 1.16.0-beta.1

The following changes are from the previous release
[1.15.3-beta.1](https://github.com/open-telemetry/opentelemetry-dotnet/releases/tag/coreunstable-1.15.3-beta.1).

* NuGet: [OpenTelemetry.Exporter.Prometheus.AspNetCore
v1.16.0-beta.1](https://www.nuget.org/packages/OpenTelemetry.Exporter.Prometheus.AspNetCore/1.16.0-beta.1)

* Fixed scrape response cache freshness using monotonic time so it is
not
    affected by NTP system clock adjustments.

([#​7253](open-telemetry/opentelemetry-dotnet#7253))
  
  * **Breaking Change** Removed `DisableTimestamp` property from
    `PrometheusAspNetCoreOptions`.

([#​7176](open-telemetry/opentelemetry-dotnet#7176))
  
* Fixed the serialization of `NaN`, `PositiveInfinity`, and
`NegativeInfinity`
    values in Prometheus metrics to be compliant with the specification.

([#​7179](open-telemetry/opentelemetry-dotnet#7179))
  
* Fixed loss of precision when serializing `double` and `float` values
in
Prometheus metrics to be compliant with the specification by using 17
    significant digits to represent such values.

([#​7179](open-telemetry/opentelemetry-dotnet#7179))
  
* Fix non-ASCII characters in metric names and unit strings not being
sanitized
    correctly during Prometheus serialization.

([#​7184](open-telemetry/opentelemetry-dotnet#7184))
  
* Fix case where reader tracking could be reset while readers were still
active.

([#​7190](open-telemetry/opentelemetry-dotnet#7190))
  
* Improve `Accept` header handling for format negotiation so OpenMetrics
is
    selected correctly by considering whitespace and `q` weights.

([#​7208](open-telemetry/opentelemetry-dotnet#7208))
  
  * Emit OpenMetrics exemplars for counters and histogram buckets.

([#​7222](open-telemetry/opentelemetry-dotnet#7222))
  
* Fix incorrect handling of untyped metrics when using OpenMetrics
format.

([#​7219](open-telemetry/opentelemetry-dotnet#7219))
  
* Fix Prometheus/OpenMetrics serialization to emit metric and label
names
containing `_` instead of dropping them and prefixing leading digits.
    Invalid characters are replaced with `_` instead of being dropped.

([#​7209](open-telemetry/opentelemetry-dotnet#7209))
  
* Add `escaping=underscores` to the `Accept` header handling for content
    negotiation so OpenMetrics are handled correctly.

([#​7209](open-telemetry/opentelemetry-dotnet#7209))
  
* Omit histogram `_sum` and `_count` in OpenMetrics when negative bucket
    thresholds are present.

([#​7221](open-telemetry/opentelemetry-dotnet#7221))
 ... (truncated)

Commits viewable in [compare
view](open-telemetry/opentelemetry-dotnet@core-1.15.3...core-1.16.0).
</details>

[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=OpenTelemetry.Exporter.OpenTelemetryProtocol&package-manager=nuget&previous-version=1.15.3&new-version=1.16.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

keep-open Prevents issues and pull requests being closed as stale perf Performance related pkg:OpenTelemetry Issues related to OpenTelemetry NuGet package

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Investigate possible histogram performance improvements

4 participants