Skip to content

[exporter/exporterhelper] Add DroppedItemsErr to report intentionally dropped items#15028

Open
Krishnachaitanyakc wants to merge 1 commit into
open-telemetry:mainfrom
Krishnachaitanyakc:fix/issue-13643
Open

[exporter/exporterhelper] Add DroppedItemsErr to report intentionally dropped items#15028
Krishnachaitanyakc wants to merge 1 commit into
open-telemetry:mainfrom
Krishnachaitanyakc:fix/issue-13643

Conversation

@Krishnachaitanyakc
Copy link
Copy Markdown
Contributor

Description

Exporters that intentionally discard items (e.g. a Prometheus exporter dropping non-monotonic DELTA sums) previously had no way to signal this to the exporterhelper observation layer. Dropped items were silently counted as successfully sent via otelcol_exporter_sent_metric_points, making internal telemetry misleading and difficult to use for diagnosing data loss.

This PR adds a lightweight mechanism for exporters to report intentionally dropped items:

  • experr.DroppedItemsErr / NewDroppedItemsErr(count, reason): a non-failure sentinel that an exporter's push function can return to report how many items were intentionally dropped and why.
  • Four new telemetry counters (otelcol_exporter_dropped_{spans,metric_points,log_records,profile_samples}): emitted by obsReportSender when a DroppedItemsErr is received.
  • obsReportSender adjustments: dropped items are subtracted from the "sent" count, and DroppedItemsErr is not propagated as a pipeline error (the pipeline sees nil).

A follow-up to the prometheusexporter in opentelemetry-collector-contrib will be needed to actually return DroppedItemsErr for unsupported metric types (e.g. non-monotonic DELTA sums).

Resolves open-telemetry/opentelemetry-collector-contrib#48172

@Krishnachaitanyakc Krishnachaitanyakc marked this pull request as ready for review April 2, 2026 15:28
@mattclay
Copy link
Copy Markdown

@Krishnachaitanyakc Why did I get pinged on this? I'm not involved with this project.

@Krishnachaitanyakc
Copy link
Copy Markdown
Contributor Author

@Krishnachaitanyakc Why did I get pinged on this? I'm not involved with this project.

my bad, i was tagging you on a different PR and got it mixed up, apologies

@Krishnachaitanyakc
Copy link
Copy Markdown
Contributor Author

@bogdandrutu @dmitryax could you take a look at this change when you have a chance?

@codecov
Copy link
Copy Markdown

codecov Bot commented May 1, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.27%. Comparing base (e7f2274) to head (ca51457).

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #15028      +/-   ##
==========================================
+ Coverage   91.24%   91.27%   +0.02%     
==========================================
  Files         705      706       +1     
  Lines       46084    46217     +133     
==========================================
+ Hits        42050    42183     +133     
  Misses       2826     2826              
  Partials     1208     1208              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented May 2, 2026

Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

✅ 7 untouched benchmarks
⏩ 76 skipped benchmarks1


Comparing Krishnachaitanyakc:fix/issue-13643 (ca51457) with main (e7f2274)

Open in CodSpeed

Footnotes

  1. 76 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@dmitryax
Copy link
Copy Markdown
Member

We need #15280 to land first

Comment thread exporter/exporterhelper/internal/obs_report_sender.go
Comment thread .chloggen/exporterhelper-dropped-items.yaml Outdated
Comment thread exporter/exporterhelper/internal/experr/err.go
Comment thread exporter/exporterhelper/internal/obs_report_sender.go Outdated
… dropped items

Exporters that intentionally discard items (e.g. the Prometheus exporter
dropping non-monotonic DELTA sums) previously had no way to signal this to
the exporterhelper observation layer. Those items were silently counted as
successfully sent, making internal telemetry misleading.

This change adds:

- exporterhelper.DroppedItemsErr / NewDroppedItemsErr: a public, non-failure
  sentinel that an exporter's push function can return to report a number of
  intentionally dropped items and an optional reason. The type is re-exported
  from the internal experr package so external exporters can construct it.
- Four new telemetry counters
  (otelcol_exporter_dropped_{spans,metric_points,log_records,profile_samples})
  emitted by the obsReportSender whenever a DroppedItemsErr is received.
- At detailed telemetry level, an exporter.dropped.reason metric attribute is
  attached to the four counters. A metric view in
  service/internal/metricviews filters the attribute out at non-detailed
  levels, mirroring the existing handling of error.type / error.permanent on
  exporter_send_failed_*, to bound cardinality.
- An items.dropped span attribute is added alongside items.sent / items.failed
  on the exporter span.
- A warning is logged when an exporter reports more dropped items than the
  request actually contained; the sent counter is clamped to zero in that
  case.
- obsReportSender adjusts the sent count so that dropped items are excluded
  from "sent" and the DroppedItemsErr is not propagated as a pipeline error.

An end-to-end test exercises the full public API (exporterhelper.NewMetrics +
NewDroppedItemsErr) and asserts the emitted telemetry contract.

Resolves #13643

Signed-off-by: Kc Balusu <kcbalusu@meta.com>
Copy link
Copy Markdown
Contributor

@bogdan-st bogdan-st left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Id also say that it s a bit confusing for someone reading the code to see numSent as the total number of items instead of the ones actually sent. But I think that s out of scope for now.

// successfully sent. Clamp to zero in case the exporter reported more
// dropped items than the total item count, and warn — that condition
// indicates an exporter bug.
if numDropped > 0 && numSent <= numDropped {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is true even when all items have been dropped correctly and that should not produce a warning. The else if block handles that correctly without the warning.

Suggested change
if numDropped > 0 && numSent <= numDropped {
if numDropped > 0 && numSent < numDropped {

@dmitryax
Copy link
Copy Markdown
Member

Let's have #15280 merged first so we're aligned on the long term plan

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Internal telemetry of Prometheus exporter incorrectly reports dropped metrics as "Sent"

4 participants