Skip to content

[processor/tailsampling] Allow faster decisions after the root span is received#44878

Merged
atoulme merged 9 commits into
open-telemetry:mainfrom
csmarchbanks:tsp-decision-wait-after-root-received
Jan 8, 2026
Merged

[processor/tailsampling] Allow faster decisions after the root span is received#44878
atoulme merged 9 commits into
open-telemetry:mainfrom
csmarchbanks:tsp-decision-wait-after-root-received

Conversation

@csmarchbanks
Copy link
Copy Markdown
Contributor

@csmarchbanks csmarchbanks commented Dec 10, 2025

Description

When testing the work done to provide early decisions (#44456) the impact was very limited in some scenarios, specifically when policies existed that look for long traces or if an error is ever present. It is not possible to know if an error span will come along in the future so almost all traces end up waiting until the decision wait causing the savings to only be a couple of percent in some environments.

What I found myself wishing for was a way to run all decisions more quickly for most cases, and what worked fairly well is to base that decision on if a root span has been received or not. This change implements a second decision wait to collect any straggler spans that might be present for a trace (e.g. a second service with a different batch timer before sending), but still allow it to be much faster than the base decision wait. A good way of thinking about the two options is that decision_wait is the maximum amount of time we will wait for a span to arrive, and decision_wait_after_root_received is the minimum amount of time we will wait for additional spans to come in.

The downside of this approach is that heavily asynchronous traces may not be sampled as expected, however we find that those do not work very well as it is since they commonly last longer than the decision wait anyway. The behavior is opt in so no changes are needed for any users.

I wanted to keep the changes in this PR relatively small, but in the future we could re-implement id batcher to support moving traces between batches, or possibly a priority queue where we pop values until some threshold, rather than having two batchers.

Link to tracking issue

Part of #43876

Testing

Added smoke test for the new functionality

Documentation

Added documentation explaining the new configuration variable.

@csmarchbanks csmarchbanks changed the title Allow faster decisions after the root span is received [processor/tailsampling] Allow faster decisions after the root span is received Dec 10, 2025
When testing the work done to provide early decisions the impact was
very limited in some scenarios, specifically when policies existed that
look for long traces or if an error is ever present. It is not possible
to know if an error span will come along in the future so almost all
traces end up waiting until the decision wait causing the savings to
only be a couple of percent in some environments.

What I found myself wishing for was a way to run all decisions more
quickly for most cases, and what worked fairly well is to base that
decision on if a root span has been received or not. This change
implements a second decision wait to collect any straggler spans that
might be present for a trace (e.g. a second service with a different
batch timer before sending), but still allow it to be much faster than
the base decision wait. A good way of thinking about the two options is
that decision_wait is the maximum amount of time we will wait for a span
to arrive, and decision_wait_after_root_received is the minimum amount
of time we will wait for additional spans to come in.
@csmarchbanks csmarchbanks force-pushed the tsp-decision-wait-after-root-received branch from 8dcdb13 to f39e550 Compare December 10, 2025 18:57
We only want to increment the trace dropped to early metric when
encounter a trace that did not already have a decision made for it. We
can use the cache to determine if a decision was appropriately made or
if the trace was simply dropped.

If running with block on overflow enabled there is no need to check as
the code blocks instead of dropping traces that have not yet been
processed by one of the two batchers.
@csmarchbanks csmarchbanks force-pushed the tsp-decision-wait-after-root-received branch from 9f6a321 to dbecfe5 Compare December 11, 2025 16:31
@csmarchbanks csmarchbanks marked this pull request as ready for review December 12, 2025 16:25
@csmarchbanks csmarchbanks requested a review from a team as a code owner December 12, 2025 16:25
@github-actions github-actions Bot added the processor/tailsampling Tail sampling processor label Dec 12, 2025
@crobert-1
Copy link
Copy Markdown
Member

Pinging @Logiraptor for review as code owner

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a new configuration option decision_wait_after_root_received that enables faster sampling decisions after receiving a trace's root span, while still collecting straggler spans. This optimizes tail sampling by allowing most traces to be decided more quickly when their root span arrives, rather than waiting for the full decision_wait period.

Key Changes:

  • Added decision_wait_after_root_received configuration parameter for faster decisions after root span arrival
  • Refactored Batch type from slice to map for efficient batch merging
  • Enhanced trace processing to detect and track root spans

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
processor/tailsamplingprocessor/config.go Adds new DecisionWaitAfterRootReceived configuration field
processor/tailsamplingprocessor/README.md Documents the new configuration option
processor/tailsamplingprocessor/processor.go Implements root span detection and dual-batcher logic for faster decisions
processor/tailsamplingprocessor/processor_test.go Adds test coverage for root-received batcher and updates test utilities
processor/tailsamplingprocessor/internal/idbatcher/id_batcher.go Changes Batch from slice to map for efficient merging
processor/tailsamplingprocessor/internal/idbatcher/id_batcher_test.go Updates tests to work with map-based batches
.chloggen/tsp-decision-wait-after-root-received.yaml Changelog entry for the enhancement

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +906 to 908
if sp.ParentSpanID().IsEmpty() {
rootSpan = &sp
}
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking the address of a loop variable sp is problematic. In Go, loop variables are reused across iterations, so &sp will always point to the last iteration's value. This means rootSpan may reference the wrong span or become invalid. Instead, create a copy of the span or use a different approach to store the root span reference.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

@csmarchbanks csmarchbanks Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a valid comment as:

  1. sp is not a loop variable, but instead created in each iteration of the loop.
  2. I believe this behavior was fixed in Go 1.22 anyway.

Leaving up for someone to double check though.

Comment on lines +1187 to +1188
assert.Less(t, len(allSampledTraces), len(traceIDs)*6/10)
assert.Greater(t, len(allSampledTraces), len(traceIDs)*4/10)
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The magic numbers 6/10 and 4/10 represent the expected sampling range (40%-60%) around the 50% configured sampling percentage. Consider extracting these as named constants to make the test's intent clearer, such as expectedSamplingPercentage = 0.5, toleranceLower = 0.4, and toleranceUpper = 0.6.

Copilot uses AI. Check for mistakes.
Comment on lines +562 to +571
if tsp.rootReceivedBatcher != nil {
rootBatch, _ := tsp.rootReceivedBatcher.CloseCurrentAndTakeFirstBatch()
if batch == nil {
batch = rootBatch
} else {
for id := range rootBatch {
batch[id] = struct{}{}
}
}
}
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If rootBatch is nil, the code at line 567-569 will panic when attempting to range over it. Add a nil check for rootBatch before attempting to merge it with batch, similar to the check for batch at line 564.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not how looping over a nil map works in Go...

Copy link
Copy Markdown
Contributor

@Logiraptor Logiraptor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall seems reasonable. A few nits. I would be interested in a refactor of the idBatcher so it can support re-prioritization, but I don't mind merging this first and leaving that for later.

Comment thread processor/tailsamplingprocessor/processor.go Outdated
}
// Wait long enough that we pass the decision wait after a root is received,
// but no where near the base decision wait.
time.Sleep(2 * time.Second)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to use the controller here and waitForTick instead?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not easily, mainly because the test controller uses a syncIDBatcher so any wait for tick is going to pull from the normal batch. I could refactor the test controller to allow multiple batches but didn't seem worthwhile as the test just takes a couple seconds. Plus it is kind of nice to have a test using the real batcher.

Let me know if you want me to look into the refactor though!

@csmarchbanks
Copy link
Copy Markdown
Contributor Author

Agreed on wanting to refactor the id batcher to either a priority queue or an implementation that allows rebatching. I opened an issue for that work: #45054.

@atoulme atoulme marked this pull request as draft December 31, 2025 22:30
@atoulme
Copy link
Copy Markdown
Contributor

atoulme commented Dec 31, 2025

You have approvals from codeowners but CI doesn't pass. Please check CI, push changes and mark ready to review again.

@csmarchbanks csmarchbanks marked this pull request as ready for review January 5, 2026 16:36
@csmarchbanks
Copy link
Copy Markdown
Contributor Author

Updated! Just merging main fixed the CI issues.

@atoulme atoulme merged commit 6ef0986 into open-telemetry:main Jan 8, 2026
194 checks passed
@otelbot
Copy link
Copy Markdown
Contributor

otelbot Bot commented Jan 8, 2026

Thank you for your contribution @csmarchbanks! 🎉 We would like to hear from you about your experience contributing to OpenTelemetry by taking a few minutes to fill out this survey. If you are getting started contributing, you can also join the CNCF Slack channel #opentelemetry-new-contributors to ask for guidance and get help.

@github-actions github-actions Bot added this to the next release milestone Jan 8, 2026
mbdeveci pushed a commit to mbdeveci/opentelemetry-collector-contrib that referenced this pull request Jan 12, 2026
…s received (open-telemetry#44878)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
When testing the work done to provide early decisions
(open-telemetry#44456)
the impact was very limited in some scenarios, specifically when
policies existed that look for long traces or if an error is ever
present. It is not possible to know if an error span will come along in
the future so almost all traces end up waiting until the decision wait
causing the savings to only be a couple of percent in some environments.

What I found myself wishing for was a way to run all decisions more
quickly for most cases, and what worked fairly well is to base that
decision on if a root span has been received or not. This change
implements a second decision wait to collect any straggler spans that
might be present for a trace (e.g. a second service with a different
batch timer before sending), but still allow it to be much faster than
the base decision wait. A good way of thinking about the two options is
that decision_wait is the maximum amount of time we will wait for a span
to arrive, and decision_wait_after_root_received is the minimum amount
of time we will wait for additional spans to come in.

The downside of this approach is that heavily asynchronous traces may
not be sampled as expected, however we find that those do not work very
well as it is since they commonly last longer than the decision wait
anyway. The behavior is opt in so no changes are needed for any users.

I wanted to keep the changes in this PR relatively small, but in the
future we could re-implement id batcher to support moving traces between
batches, or possibly a priority queue where we pop values until some
threshold, rather than having two batchers.

<!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. -->
#### Link to tracking issue
Part of open-telemetry#43876

<!--Describe what testing was performed and which tests were added.-->
#### Testing
Added smoke test for the new functionality

<!--Describe the documentation added.-->
#### Documentation
Added documentation explaining the new configuration variable.

<!--Please delete paragraphs that you did not use before submitting.-->
atoulme pushed a commit that referenced this pull request Apr 23, 2026
…47535)

#### Description

Replaces @Logiraptor with @csmarchbanks as a codeowner of the tail
sampling processor as I have been more active recently and @Logiraptor
is working on other efforts. In addition, adds @carsonip as a new code
owner.

Some of my TSP efforts:
*
#42573
*
#44878
*
#45286
*
#46161

TSP work that @carsonip has done:
*
#43561
*
#46762
*
#42326
AndrewCharlesHay pushed a commit to AndrewCharlesHay/opentelemetry-collector-contrib that referenced this pull request Apr 23, 2026
…pen-telemetry#47535)

#### Description

Replaces @Logiraptor with @csmarchbanks as a codeowner of the tail
sampling processor as I have been more active recently and @Logiraptor
is working on other efforts. In addition, adds @carsonip as a new code
owner.

Some of my TSP efforts:
*
open-telemetry#42573
*
open-telemetry#44878
*
open-telemetry#45286
*
open-telemetry#46161

TSP work that @carsonip has done:
*
open-telemetry#43561
*
open-telemetry#46762
*
open-telemetry#42326
gracewehner pushed a commit to gracewehner/opentelemetry-collector-contrib that referenced this pull request Apr 29, 2026
…pen-telemetry#47535)

#### Description

Replaces @Logiraptor with @csmarchbanks as a codeowner of the tail
sampling processor as I have been more active recently and @Logiraptor
is working on other efforts. In addition, adds @carsonip as a new code
owner.

Some of my TSP efforts:
*
open-telemetry#42573
*
open-telemetry#44878
*
open-telemetry#45286
*
open-telemetry#46161

TSP work that @carsonip has done:
*
open-telemetry#43561
*
open-telemetry#46762
*
open-telemetry#42326
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants