PoC for batching in PeriodicReader #7930
Closed

dashpole wants to merge 1 commit into open-telemetry:main

Conversation
Codecov Report

```
@@ Coverage Diff @@
##            main   #7930     +/-   ##
=======================================
- Coverage   81.7%   81.5%    -0.2%
=======================================
  Files        304     305       +1
  Lines      23283   23430     +147
=======================================
+ Hits       19032   19116      +84
- Misses      3864    3926      +62
- Partials     387     388       +1
```
github-merge-queue bot pushed a commit to open-telemetry/opentelemetry-specification that referenced this pull request on Mar 17, 2026:
…ader (#4895)

Fixes #4852

## Prior Art

The Trace SDK and Logging SDK both support a `maxExportBatchSize` parameter to limit the number of spans/logs exported in a batch. The collector's exporter helper and batch processor support a `send_batch_max_size` configuration option, which (by default) applies to the number of spans, logs, or metric data points. In all cases, the configured timeout applies to a single request.

## Requirements

* Apply a limit to the number of metric data points exported in a single OTLP batch.
* Maintain the existing ordering of metric data points. Batching must not result in metric data from a subsequent Collect call being exported before data from an earlier Collect call.
* Apply the timeout to individual requests, not to multiple requests.
* The batch size must apply to a single exporter; if multiple exporters are used, each must be able to have its own batch size.

## Non-goals

* Introduce any parallelism into the metric export path.
* Limit by bytes, or anything else.

## Proposal

Add `maxExportBatchSize` to the periodic exporting MetricReader. The periodic exporting MetricReader splits the batch of metric data points received from Collect, if necessary, and then serially invokes `Export` on each split batch with the configured timeout.

## Alternatives considered

### maxExportBatchSize for all MetricReaders

Instead of applying only to periodic readers, the batch size could apply to all readers. This alternative is not chosen because:

* Splitting batches is only required for push exporters.
* It makes more sense to group the batching configuration with the timeout configuration (which is on the periodic exporting MetricReader).

### maxExportBatchSize on OTLP exporters

Instead of being on the periodic exporting MetricReader, we could add this configuration to the OTLP http and grpc exporters. This alternative is not chosen because:

* The timeout should apply to individual batches, not to many split batches, in order to match the behavior of other SDKs and the collector. This is only possible if batches are split _before_ the exporter, since the periodic MetricReader applies the timeout.
* It is more helpful to provide this functionality for all exporters, so it doesn't need to be re-implemented or copied.

Prototypes:

* Go: open-telemetry/opentelemetry-go#7930

* [x] Links to the prototypes (when adding or changing features)
* [x] [`CHANGELOG.md`](https://github.com/open-telemetry/opentelemetry-specification/blob/main/CHANGELOG.md) file updated for non-trivial changes
* [x] [Spec compliance matrix](https://github.com/open-telemetry/opentelemetry-specification/blob/main/spec-compliance-matrix/template.yaml) updated if necessary
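The splitting behavior the proposal describes can be sketched in Go. The helper below is purely illustrative (the name `splitBatches` and the use of a plain slice stand in for the SDK's internal metric data types): it divides the collected items into consecutive batches of at most `maxExportBatchSize` elements while preserving their order, so a later Collect's data can never be exported ahead of an earlier one's.

```go
package main

import "fmt"

// splitBatches splits items into consecutive slices of at most size
// elements, preserving order. A size <= 0 means "no limit", in which
// case the input is returned as a single batch.
func splitBatches[T any](items []T, size int) [][]T {
	if size <= 0 || len(items) <= size {
		return [][]T{items}
	}
	var batches [][]T
	for len(items) > size {
		batches = append(batches, items[:size])
		items = items[size:]
	}
	return append(batches, items)
}

func main() {
	points := []int{1, 2, 3, 4, 5, 6, 7}
	for _, b := range splitBatches(points, 3) {
		fmt.Println(b)
	}
	// Prints [1 2 3], [4 5 6], [7]: order is preserved and only the
	// final batch may be smaller than the configured size.
}
```

The reader would then invoke `Export` on each returned batch serially, applying the configured timeout to each call.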
dashpole added a commit that referenced this pull request on Apr 9, 2026:
Adds experimental support for maxExportBatchSize using the `OTEL_GO_X_METRIC_EXPORT_BATCH_SIZE=<size>` environment variable. Previous prototype: #7930 This preserves existing behavior for timeouts when batching is not used, but individually applies the timeout to export calls when batching is used.
pellared pushed a commit to pellared/opentelemetry-go that referenced this pull request on Apr 23, 2026:
…try#8071) Adds experimental support for maxExportBatchSize using the `OTEL_GO_X_METRIC_EXPORT_BATCH_SIZE=<size>` environment variable. Previous prototype: open-telemetry#7930 This preserves existing behavior for timeouts when batching is not used, but individually applies the timeout to export calls when batching is used.
Batching logic is based on the collector's batchprocessor: https://github.com/open-telemetry/opentelemetry-collector/blob/587b90b9ecc1db959ee9104d5bf993591f80ca43/processor/batchprocessor/splitmetrics.go
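A much-simplified version of that splitting approach is sketched below. The real collector code walks `pmetric.Metrics` through resource and scope levels; here a toy `metric` struct with a point slice stands in, and all names are illustrative. The key behavior matches `splitmetrics.go`: batches are limited by total data-point count, and an individual metric's points are split across batches when the metric does not fit in the remaining room.

```go
package main

import "fmt"

// metric is a toy stand-in for a metric carrying data points.
type metric struct {
	name   string
	points []int
}

// splitByPoints greedily packs metrics into batches holding at most
// maxPoints data points each, splitting a metric's points across
// batches when it does not fit, while preserving order.
func splitByPoints(ms []metric, maxPoints int) [][]metric {
	if maxPoints <= 0 {
		return [][]metric{ms}
	}
	var batches [][]metric
	var cur []metric
	room := maxPoints
	for _, m := range ms {
		// Carve off full batches while the metric overflows the room left.
		for len(m.points) > room {
			cur = append(cur, metric{m.name, m.points[:room]})
			m.points = m.points[room:]
			batches = append(batches, cur)
			cur, room = nil, maxPoints
		}
		if len(m.points) > 0 {
			cur = append(cur, m)
			room -= len(m.points)
			if room == 0 { // batch exactly full: flush it
				batches = append(batches, cur)
				cur, room = nil, maxPoints
			}
		}
	}
	if len(cur) > 0 {
		batches = append(batches, cur)
	}
	return batches
}

func main() {
	ms := []metric{
		{"a", []int{1, 2, 3}},
		{"b", []int{4, 5, 6, 7}},
	}
	// With maxPoints=4, "b" is split: one point joins "a" in the first
	// batch, the remaining three form the second batch.
	for _, b := range splitByPoints(ms, 4) {
		fmt.Println(b)
	}
}
```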
PoC for open-telemetry/opentelemetry-specification#4895
This PR adds `metric.WithMaxExportBatchSize` to `go.opentelemetry.io/otel/sdk/metric`, and causes the SDK to split batches before passing them to the exporter.

One potential issue: I was hoping the export timeout could be applied individually to each batch, rather than to multiple serial export calls. Currently, however, we apply the timeout to collect + export. I've changed it to apply the timeout individually to collect and to each export, but I'm curious how acceptable other maintainers think this kind of change would be. In practice, I suspect Collect() is not a notable source of latency or timeouts.
The spec for the timeout says:

It would strike me as a bit odd, if we do split metrics into batches, to still apply the timeout to the entire collect + multiple exports based on the spec.