fix(ddtrace/opentelemetry): fix ignored sampling decision from otel#4238
dd-mergequeue[bot] merged 2 commits into main
Benchmarks
Benchmark execution time: 2025-12-23 10:10:22
Comparing candidate commit ff7dd2a in PR branch
Found 0 performance improvements and 0 performance regressions! Performance is the same for 15 metrics, 0 unstable metrics.
71af6bc to ff7dd2a
/merge
This pull request is not mergeable according to GitHub. Common reasons include pending required checks, missing approvals, or merge conflicts — but it could also be blocked by other repository rules or settings.
…led spans (#4631)

### What does this PR do?

Fixes the OTel bridge's handling of unsampled spans in `FromGenericCtx`.

Previously, when an OTel parent context had `IsSampled() == false`, the bridge set `samplingDecision = decisionDrop` directly on the trace, bypassing the atomic Compare-And-Swap (CAS) semantics that `keep()` and `drop()` rely on. This meant:

- Error spans could never rescue the trace, because `keep()` only performs its CAS from `decisionNone`, not from `decisionDrop`
- The behavior diverged from the native DD tracer, where P0 traces are not hard-dropped client-side

The fix leaves `samplingDecision` as `decisionNone` for drop decisions while still setting the P0 priority and locking the trace against resampling. This preserves the OTel sampling intent while restoring the native DD keep/drop CAS flow.

The bug was introduced in `v2.6.0` by #4238.

### Motivation

Fixes #4624

Discovered during investigation of [APMS-19054](https://datadoghq.atlassian.net/browse/APMS-19054): a customer upgrading to dd-trace-go v2.6.0 + OTel observed `trace.*` metrics dropping to near-zero under low sampling rates when client-side stats were disabled.

### Reviewer's Checklist

- [ ] Changed code has unit tests for its functionality at or near 100% coverage.
- [ ] [System-Tests](https://github.com/DataDog/system-tests/) covering this feature have been added and enabled with the va.b.c-dev version tag.
- [ ] There is a benchmark for any new code, or changes to existing code.
- [ ] If this interacts with the agent in a new way, a system test has been added.
- [ ] New code is free of linting errors. You can check this by running `make lint` locally.
- [ ] New code doesn't break existing tests. You can check this by running `make test` locally.
- [ ] Add an appropriate team label so this PR gets put in the right place for the release notes.
- [ ] All generated files are up to date. You can check this by running `make generate` locally.
- [ ] Non-trivial go.mod changes, e.g. adding new modules, are reviewed by @DataDog/dd-trace-go-guild. Make sure all nested modules are up to date by running `make fix-modules` locally.

[APMS-19054]: https://datadoghq.atlassian.net/browse/APMS-19054?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ

Co-authored-by: kakkoyun <kakkoyun@users.noreply.github.com>
Co-authored-by: benjamin.debernardi <benjamin.debernardi@datadoghq.com>
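The CAS interaction described in the commit message above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: the `trace` struct, its fields, and the two `inheritUnsampled*` helpers are hypothetical stand-ins, not the tracer's actual internals. Only the names `keep()`, `drop()`, `decisionNone`, and `decisionDrop` come from the text.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Hypothetical stand-ins for the tracer's internal decision values.
const (
	decisionNone uint32 = iota
	decisionDrop
	decisionKeep
)

// trace is a simplified, hypothetical model of a trace's sampling state.
type trace struct {
	samplingDecision uint32 // only read/written through sync/atomic
	priority         int    // 0 models the P0 (reject) priority
	locked           bool   // true once the priority must not be resampled
}

// keep marks the trace as kept, but only if no decision was made yet.
func (t *trace) keep() bool {
	return atomic.CompareAndSwapUint32(&t.samplingDecision, decisionNone, decisionKeep)
}

// drop marks the trace as dropped, but only if no decision was made yet.
func (t *trace) drop() bool {
	return atomic.CompareAndSwapUint32(&t.samplingDecision, decisionNone, decisionDrop)
}

// inheritUnsampledBuggy models the pre-fix behavior: the decision is
// written directly, so a later keep() can never succeed.
func inheritUnsampledBuggy() *trace {
	return &trace{samplingDecision: decisionDrop, priority: 0, locked: true}
}

// inheritUnsampledFixed models the post-fix behavior: the priority is set
// and locked, but the decision stays at decisionNone.
func inheritUnsampledFixed() *trace {
	return &trace{samplingDecision: decisionNone, priority: 0, locked: true}
}

func main() {
	buggy := inheritUnsampledBuggy()
	fixed := inheritUnsampledFixed()
	fmt.Println("buggy trace rescued by error span:", buggy.keep()) // false
	fmt.Println("fixed trace rescued by error span:", fixed.keep()) // true
}
```

In this model the first decision wins: once `keep()` has succeeded on the fixed trace, a later `drop()` is a no-op, which is the property the direct write broke.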
What does this PR do?
This PR fixes the tracer's faulty behavior of not respecting the parent's sampling decision when the parent span is an OTel span rather than a Datadog span.
Fixes #3639.
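As a rough illustration of what respecting the parent's decision means, the sketch below derives a child's starting priority from a parent context. It assumes a hypothetical `sampledContext` interface (modeled on the `IsSampled()` method of an OTel span context), a hypothetical `otelParent` type, and a hypothetical `childPriority` helper; it is not the bridge's actual code.

```go
package main

import "fmt"

// Priority values used in this sketch: 0 is auto-reject (P0), 1 is auto-keep.
const (
	priorityAutoReject = 0
	priorityAutoKeep   = 1
)

// sampledContext is a hypothetical minimal view of a parent span context.
type sampledContext interface {
	IsSampled() bool
}

// otelParent is a hypothetical stand-in for an OTel parent span context.
type otelParent struct{ sampled bool }

func (p otelParent) IsSampled() bool { return p.sampled }

// childPriority returns the priority a child span starts with.
// When a parent exists, its decision takes precedence over the local
// sampler's choice; without a parent, the local choice is used.
func childPriority(parent sampledContext, localChoice int) int {
	if parent == nil {
		return localChoice
	}
	if parent.IsSampled() {
		return priorityAutoKeep
	}
	return priorityAutoReject
}

func main() {
	// An unsampled parent wins over a local "keep" choice.
	fmt.Println(childPriority(otelParent{sampled: false}, priorityAutoKeep)) // 0
	// A sampled parent wins over a local "reject" choice.
	fmt.Println(childPriority(otelParent{sampled: true}, priorityAutoReject)) // 1
	// No parent: the local sampler decides.
	fmt.Println(childPriority(nil, priorityAutoKeep)) // 1
}
```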
Motivation
Old ER ticket: APMS-15887
Takes over #3718.
Reviewer's Checklist
./scripts/lint.sh locally.
Unsure? Have a question? Request a review!