
Optimize histogram reservoir#7443

Merged
dashpole merged 8 commits into open-telemetry:main from dashpole:optimize_histogram_reservoir on Dec 15, 2025

Contributor

@dashpole dashpole commented Oct 2, 2025

This improves the concurrent performance of the histogram reservoir's Offer function by 4x (i.e. 75% reduction).

This is accomplished by locking each measurement individually, rather than locking around the entire storage. It also defers extracting the trace context from context.Context until collection time, which improves the performance of Offer, the measure hot path. Exemplars are often overwritten, so deferring the extraction until Collect reduces the overall work.

                           │   main.txt   │              hist.txt              │
                           │    sec/op    │   sec/op     vs base               │
FixedSizeReservoirOffer-24    211.4n ± 3%   177.5n ± 3%  -16.04% (p=0.002 n=6)
HistogramReservoirOffer-24   200.85n ± 2%   47.41n ± 2%  -76.40% (p=0.002 n=6)
geomean                       206.1n        91.73n       -55.48%

Benchmarks for Measure:

                                                                                 │  main.txt   │             histres.txt             │
                                                                                 │   sec/op    │    sec/op     vs base               │
SyncMeasure/NoView/ExemplarsEnabled/Int64Histogram/Attributes/0-24                 436.7n ± 4%   114.8n ±  5%  -73.72% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64Histogram/Attributes/10-24                472.2n ± 2%   169.7n ±  8%  -64.08% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Histogram/Attributes/0-24               431.0n ± 2%   116.3n ±  2%  -73.01% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Histogram/Attributes/10-24              470.9n ± 1%   171.0n ±  5%  -63.70% (p=0.002 n=6)

I explored using a []atomic.Pointer[measurement], but this had similar performance while being much more complex (needing a sync.Pool to eliminate allocations). The single-threaded performance was also much worse for that solution. See main...dashpole:optimize_histogram_reservoir_old.
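
As a rough illustration of the alternative mentioned above, a slice of `atomic.Pointer` slots could look like the sketch below. This is an assumption about the general shape, not the code on the linked branch; `pointerStorage` and its methods are hypothetical names. Each store allocates a new value, which is the cost a `sync.Pool` would be meant to remove, and recycling values safely is harder because a reader may still hold the old pointer.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// measurement is an immutable value once published through a pointer.
type measurement struct {
	value float64
	attrs string
}

// pointerStorage publishes each slot with an atomic pointer swap, no mutex.
type pointerStorage struct {
	slots []atomic.Pointer[measurement]
}

func newPointerStorage(n int) *pointerStorage {
	return &pointerStorage{slots: make([]atomic.Pointer[measurement], n)}
}

// store allocates a fresh measurement on every call and publishes it.
func (s *pointerStorage) store(idx int, v float64, attrs string) {
	s.slots[idx].Store(&measurement{value: v, attrs: attrs})
}

// collect returns a copy of every slot that has been written.
func (s *pointerStorage) collect() []measurement {
	var out []measurement
	for i := range s.slots {
		if m := s.slots[i].Load(); m != nil {
			out = append(out, *m)
		}
	}
	return out
}

func main() {
	s := newPointerStorage(3)
	s.store(1, 1.0, "a")
	s.store(1, 2.0, "b") // replaces the pointer in slot 1
	fmt.Println(s.collect())
}
```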


codecov bot commented Oct 2, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.2%. Comparing base (c15644d) to head (95f44c1).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files


@@           Coverage Diff           @@
##            main   #7443     +/-   ##
=======================================
- Coverage   86.2%   86.2%   -0.1%     
=======================================
  Files        302     302             
  Lines      21973   21971      -2     
=======================================
- Hits       18949   18947      -2     
  Misses      2643    2643             
  Partials     381     381             
Files with missing lines Coverage Δ
sdk/metric/exemplar/fixed_size_reservoir.go 94.7% <100.0%> (ø)
sdk/metric/exemplar/histogram_reservoir.go 90.9% <100.0%> (-1.7%) ⬇️
sdk/metric/exemplar/storage.go 100.0% <100.0%> (ø)

Comment thread sdk/metric/exemplar/storage.go Outdated
Comment thread sdk/metric/exemplar/storage.go Outdated
@dashpole dashpole force-pushed the optimize_histogram_reservoir branch 2 times, most recently from 7457c73 to 7c1476f Compare October 4, 2025 03:47
@dashpole dashpole force-pushed the optimize_histogram_reservoir branch from 7c1476f to 7b79e43 Compare October 7, 2025 14:40
dashpole added a commit that referenced this pull request Oct 7, 2025
Forked from this discussion here:
#7443 (comment)

It seems like a good idea for us as a group to align on and document
what we are comfortable with in terms of how ordered measurements are
reflected in collected metric data.

---------

Co-authored-by: Tyler Yahn <MrAlias@users.noreply.github.com>
Comment thread sdk/metric/exemplar/storage.go Outdated
@pellared pellared mentioned this pull request Oct 10, 2025
Comment thread sdk/metric/exemplar/storage.go Outdated
Comment thread sdk/metric/exemplar/storage.go Outdated
@bboreham
Contributor

On further reflection, I fixed the copying issue before running the benchmark, so it is perhaps reasonable that less racy code runs slower.

Would be good if the tests and/or linter detected the issue. I note that NoCopy was removed from atomic.Value here: golang/go#21504.
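
One way a linter can catch this class of bug is the `noCopy` convention that the Go standard library uses internally. The sketch below assumes the copying issue was a struct holding an atomic value being copied by value; `slot` is a hypothetical type. `go vet`'s copylocks check treats any type with pointer-receiver `Lock` and `Unlock` methods as a lock and reports copies of structs that contain it.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// noCopy is a zero-size marker. Its Lock and Unlock methods exist only so
// that `go vet` (copylocks) flags by-value copies of any struct embedding it.
type noCopy struct{}

func (*noCopy) Lock()   {}
func (*noCopy) Unlock() {}

// slot must not be copied after first use because it holds an atomic.Value.
type slot struct {
	_ noCopy
	v atomic.Value
}

func main() {
	slots := make([]slot, 2)

	// Correct: take a pointer into the slice and operate on it in place.
	s := &slots[0]
	s.v.Store(42)
	fmt.Println(slots[0].v.Load())

	// Incorrect (reported by go vet if uncommented): copying the slot.
	// c := slots[0]
	// c.v.Store(7) // would write to the copy, not to slots[0]
}
```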

@dashpole dashpole force-pushed the optimize_histogram_reservoir branch from 433ff16 to e4dfbac Compare October 15, 2025 01:01
@dashpole
Contributor Author

I also see slightly worse results, but agree it is definitely better to be correct. I'll work on a test.

@dashpole dashpole force-pushed the optimize_histogram_reservoir branch from e4dfbac to 67df837 Compare October 15, 2025 15:38
@dashpole
Contributor Author

I added a ConcurrentSafe test, and verified that it fails (quite spectacularly) with the previous atomic.Value implementation.
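
The general shape of such a check can be sketched as follows. This is not the ConcurrentSafe test added in the PR; `slot` and `tornReads` are hypothetical. Writers always store a value together with a derived check field, and a concurrent reader counts any read where the two disagree. Running it with the race detector enabled (`go run -race` or `go test -race`) would additionally report unsynchronized access.

```go
package main

import (
	"fmt"
	"sync"
)

// slot stores a value and a check field that is always written as -value,
// so a reader can detect a torn (partially applied) write.
type slot struct {
	mu    sync.Mutex
	value int64
	check int64
}

func (s *slot) store(v int64) {
	s.mu.Lock()
	s.value, s.check = v, -v
	s.mu.Unlock()
}

func (s *slot) load() (value, check int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.value, s.check
}

// tornReads runs the given number of writers against slots while reading
// concurrently, and returns how many inconsistent reads were observed.
func tornReads(slots []slot, writers, perWriter int) int {
	var wg sync.WaitGroup
	done := make(chan struct{})
	for w := 0; w < writers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			for i := 1; i <= perWriter; i++ {
				slots[(w+i)%len(slots)].store(int64(w*perWriter + i))
			}
		}(w)
	}
	go func() { wg.Wait(); close(done) }()

	torn := 0
	for {
		for i := range slots {
			if v, c := slots[i].load(); v != -c {
				torn++
			}
		}
		select {
		case <-done:
			return torn
		default:
		}
	}
}

func main() {
	slots := make([]slot, 8)
	fmt.Println("torn reads:", tornReads(slots, 4, 10000))
}
```

With the mutex in place the count stays at zero; removing the locking from `store` and `load` is a quick way to confirm that the check can actually fail.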

Contributor

@bboreham bboreham left a comment


lgtm

Comment thread sdk/metric/exemplar/reservoir_test.go Outdated
@dashpole
Contributor Author

The concurrent safe test found another race condition around my usage of sync.Pool, which I'm looking into.

@dashpole dashpole force-pushed the optimize_histogram_reservoir branch from 597d23c to 81231b8 Compare October 15, 2025 20:00
@dashpole dashpole force-pushed the optimize_histogram_reservoir branch from 2c82611 to 5e17e43 Compare October 15, 2025 20:12
Contributor

@bboreham bboreham left a comment


Much simpler now.

Comment thread sdk/metric/exemplar/fixed_size_reservoir.go
Comment thread sdk/metric/exemplar/storage.go
@MrAlias MrAlias added this to the v1.39.0 milestone Oct 16, 2025
Contributor

@MrAlias MrAlias left a comment


Overall, looks good to me. Just testing cleanup.

Comment thread sdk/metric/exemplar/reservoir_test.go Outdated
Comment thread sdk/metric/exemplar/reservoir_test.go Outdated
Comment thread sdk/metric/exemplar/reservoir_test.go Outdated
Comment thread sdk/metric/exemplar/reservoir_test.go
Comment thread sdk/metric/exemplar/reservoir_test.go
Comment thread sdk/metric/exemplar/reservoir_test.go Outdated
Comment thread sdk/metric/exemplar/histogram_reservoir_test.go
Comment thread sdk/metric/exemplar/reservoir_test.go
@dashpole dashpole force-pushed the optimize_histogram_reservoir branch from 03e0957 to e3936fe Compare November 20, 2025 16:10
@MrAlias MrAlias self-requested a review December 4, 2025 18:10
@MrAlias MrAlias modified the milestones: v1.39.0, v1.40.0 Dec 4, 2025
@dashpole dashpole force-pushed the optimize_histogram_reservoir branch from e3936fe to 4415931 Compare December 11, 2025 17:22
@dashpole
Contributor Author

I put the Measure benchmarks in the description. Not as good an improvement as I had expected, so I'll have to dig into that more...

@dashpole
Contributor Author

@dmathieu if you have the chance to review, this is related to other optimization PRs.

@dashpole
Contributor Author

I figured out why the benchmark results were so poor: The benchmark was recording all observations in a single bucket, and each bucket has its own lock, so there were effectively no parallelism gains. I switched the benchmark to record observations in different buckets, and that shows that this is a ~70% performance improvement when exemplars are being recorded.
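
To illustrate the benchmark pitfall described above, here is a small standalone sketch. It is not the SDK benchmark; `reservoir`, `offer` and `benchOffer` are hypothetical, and the bucket boundaries are arbitrary. With one lock per bucket, parallel goroutines that all record the same value serialize on a single lock, while values spread across boundaries exercise different locks.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"sync/atomic"
	"testing"
)

// bucket is a single exemplar slot with its own lock.
type bucket struct {
	mu   sync.Mutex
	last float64
}

// reservoir keeps one bucket per histogram bucket (len(bounds)+1 in total).
type reservoir struct {
	bounds  []float64
	buckets []bucket
}

func newReservoir(bounds []float64) *reservoir {
	return &reservoir{bounds: bounds, buckets: make([]bucket, len(bounds)+1)}
}

// offer records v in the bucket whose upper bound is the first one >= v.
func (r *reservoir) offer(v float64) {
	b := &r.buckets[sort.SearchFloat64s(r.bounds, v)]
	b.mu.Lock()
	b.last = v
	b.mu.Unlock()
}

// benchOffer measures parallel offers. When spread is false every goroutine
// hits the same bucket; when true, values rotate through all boundaries.
func benchOffer(spread bool) testing.BenchmarkResult {
	r := newReservoir([]float64{0, 5, 10, 25, 50, 75, 100, 250, 500, 1000})
	return testing.Benchmark(func(b *testing.B) {
		var next atomic.Int64
		b.RunParallel(func(pb *testing.PB) {
			i := next.Add(1) // give each goroutine a different starting point
			for pb.Next() {
				v := 1.0
				if spread {
					v = r.bounds[int(i)%len(r.bounds)]
					i++
				}
				r.offer(v)
			}
		})
	})
}

func main() {
	fmt.Println("single bucket: ", benchOffer(false))
	fmt.Println("spread buckets:", benchOffer(true))
}
```

The absolute numbers depend on the machine and core count, so no expected output is given here.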

@dashpole dashpole merged commit e8542ae into open-telemetry:main Dec 15, 2025
33 checks passed
@dashpole dashpole deleted the optimize_histogram_reservoir branch December 15, 2025 14:59
@MrAlias MrAlias mentioned this pull request Jan 16, 2026
39 tasks
dashpole added a commit that referenced this pull request Jan 23, 2026
~Depends on #7441, #7443~

This improves the concurrent performance of the fixed size reservoir's
Offer function by 4x (i.e. 75% reduction). This improves the performance
of Measure() for fixed-size reservoirs by 60% overall.

Accomplish this by:

* using a single atomic for count and next. This assumes that both can
fit in a uint32.
* only using a lock to guard changing `w` and `next` together.
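
A minimal sketch of the first bullet, assuming that "a single atomic for count and next" means packing two uint32 values into one 64-bit word. The type `countAndNext` and its methods are hypothetical and are not taken from the commit.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// countAndNext packs count (high 32 bits) and next (low 32 bits) into one
// atomic word so both can be read or updated in a single operation.
type countAndNext struct {
	v atomic.Uint64
}

func pack(count, next uint32) uint64 { return uint64(count)<<32 | uint64(next) }

func unpack(v uint64) (count, next uint32) { return uint32(v >> 32), uint32(v) }

// incCount atomically increments count and returns the count before the
// increment together with the value of next seen at the same instant.
func (c *countAndNext) incCount() (count, next uint32) {
	count, next = unpack(c.v.Add(1 << 32))
	return count - 1, next
}

// setNext replaces next while preserving whatever count is current.
func (c *countAndNext) setNext(next uint32) {
	for {
		old := c.v.Load()
		count, _ := unpack(old)
		if c.v.CompareAndSwap(old, pack(count, next)) {
			return
		}
	}
}

func main() {
	var c countAndNext
	c.setNext(7)
	c.incCount()
	count, next := c.incCount()
	fmt.Println(count, next) // prints "1 7"
}
```

Because the increment only touches the high half of the word, a concurrent reader never observes a count that belongs to a different `next`.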

Offer benchmarks:
```
                           │   main.txt   │           fixedsize.txt            │
                           │    sec/op    │   sec/op     vs base               │
FixedSizeReservoirOffer-24   185.25n ± 4%   45.58n ± 1%  -75.40% (p=0.002 n=6)
```

Measure benchmarks:
```
                                                                          │   main.txt   │            fixedsize.txt            │
                                                                          │    sec/op    │    sec/op     vs base               │
SyncMeasure/NoView/ExemplarsEnabled/Int64Counter/Attributes/0-24            175.45n ± 6%   67.01n ±  9%  -61.81% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64Counter/Attributes/1-24            170.25n ± 1%   69.82n ±  6%  -58.99% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64Counter/Attributes/10-24           167.40n ± 2%   64.52n ± 10%  -61.46% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Counter/Attributes/0-24          173.55n ± 0%   69.17n ± 12%  -60.14% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Counter/Attributes/1-24          169.50n ± 1%   68.55n ±  5%  -59.56% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Counter/Attributes/10-24         166.95n ± 1%   65.82n ±  6%  -60.58% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64UpDownCounter/Attributes/0-24      168.85n ± 1%   67.99n ± 11%  -59.73% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64UpDownCounter/Attributes/1-24      173.50n ± 1%   66.69n ±  2%  -61.56% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64UpDownCounter/Attributes/10-24     171.30n ± 5%   67.73n ±  8%  -60.46% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64UpDownCounter/Attributes/0-24    168.90n ± 2%   67.69n ±  9%  -59.92% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64UpDownCounter/Attributes/1-24    173.35n ± 2%   68.25n ±  9%  -60.63% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64UpDownCounter/Attributes/10-24   172.95n ± 2%   70.90n ±  7%  -59.01% (p=0.002 n=6)
geomean                                                                      171.0n        67.83n        -60.33%
```

---------

Co-authored-by: Tyler Yahn <MrAlias@users.noreply.github.com>
Co-authored-by: Robert Pająk <pellared@hotmail.com>
@MrAlias MrAlias mentioned this pull request Feb 2, 2026
MrAlias added a commit that referenced this pull request Feb 2, 2026
### Added

- Add `Enabled` method to all synchronous instrument interfaces
(`Float64Counter`, `Float64UpDownCounter`, `Float64Histogram`,
`Float64Gauge`, `Int64Counter`, `Int64UpDownCounter`, `Int64Histogram`,
`Int64Gauge`) in `go.opentelemetry.io/otel/metric`. This stabilizes the
synchronous instrument enabled feature, allowing users to check if an
instrument will process measurements before performing computationally
expensive operations. (#7763)
- Add `AlwaysRecord` sampler in `go.opentelemetry.io/otel/sdk/trace`.
(#7724)
- Add `go.opentelemetry.io/otel/semconv/v1.39.0` package. The package
contains semantic conventions from the `v1.39.0` version of the
OpenTelemetry Semantic Conventions. See the [migration
documentation](https://github.com/open-telemetry/opentelemetry-go/blob/298cbedf256b7a9ab3c21e41fc5e3e6d6e4e94aa/semconv/v1.39.0/MIGRATION.md)
for information on how to upgrade from
`go.opentelemetry.io/otel/semconv/v1.38.0`. (#7783, #7789)

### Changed

- `Exporter` in `go.opentelemetry.io/otel/exporters/prometheus` ignores
metrics with the scope `go.opentelemetry.io/contrib/bridges/prometheus`.
This prevents scrape failures when the Prometheus exporter is
misconfigured to get data from the Prometheus bridge. (#7688)
- Improve performance of concurrent histogram measurements in
`go.opentelemetry.io/otel/sdk/metric`. (#7474)
- Add experimental observability metrics in
`go.opentelemetry.io/otel/exporters/stdout/stdoutmetric`. (#7492)
- Improve the concurrent performance of `HistogramReservoir` in
`go.opentelemetry.io/otel/sdk/metric/exemplar` by 4x. (#7443)
- Improve performance of concurrent synchronous gauge measurements in
`go.opentelemetry.io/otel/sdk/metric`. (#7478)
- Improve performance of concurrent exponential histogram measurements
in `go.opentelemetry.io/otel/sdk/metric`. (#7702)
- Improve the concurrent performance of `FixedSizeReservoir` in
`go.opentelemetry.io/otel/sdk/metric/exemplar`. (#7447)
- The `rpc.grpc.status_code` attribute in the experimental metrics
emitted from
`go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc` is
replaced with the `rpc.response.status_code` attribute to align with the
semantic conventions. (#7854)
- The `rpc.grpc.status_code` attribute in the experimental metrics
emitted from
`go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc` is
replaced with the `rpc.response.status_code` attribute to align with the
semantic conventions. (#7854)

### Fixed

- Fix bad log message when key-value pairs are dropped because of key
duplication in `go.opentelemetry.io/otel/sdk/log`. (#7662)
- Fix `DroppedAttributes` on `Record` in
`go.opentelemetry.io/otel/sdk/log` to not count the non-attribute
key-value pairs dropped because of key duplication. (#7662)
- Fix `SetAttributes` on `Record` in `go.opentelemetry.io/otel/sdk/log`
to not log that attributes are dropped when they are actually not
dropped. (#7662)
- Fix `WithHostID` detector in `go.opentelemetry.io/otel/sdk/resource` to
use full path for `ioreg` command on Darwin (macOS). (#7818)
- Fix missing `request.GetBody` in
`go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp` to
correctly handle HTTP2 GOAWAY frame. (#7794)

### Deprecated

- Deprecate `go.opentelemetry.io/otel/exporters/zipkin`. For more
information, see the [OTel blog post deprecating the Zipkin
exporter](https://opentelemetry.io/blog/2025/deprecating-zipkin-exporters/).
(#7670)

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

4 participants