Skip to content

[processor/tailsampling] Add way-in sampling strategies as alternative to #46600#46734

Closed
carsonip wants to merge 17 commits into
open-telemetry:mainfrom
carsonip:tailsampling-full-trace-way-in
Closed

[processor/tailsampling] Add way-in sampling strategies as alternative to #46600#46734
carsonip wants to merge 17 commits into
open-telemetry:mainfrom
carsonip:tailsampling-full-trace-way-in

Conversation

@carsonip
Copy link
Copy Markdown
Contributor

@carsonip carsonip commented Mar 6, 2026

Summary

  • Introduces sampling_strategy modes to explicitly trade policy flexibility for performance in tailsamplingprocessor.
  • Adds two ingest-time modes in addition to the existing default behavior:
    • root-span-only-way-in: decide on root-span arrival using only root-span data.
    • full-trace-way-in: evaluate on ingest per incoming batch; terminal decisions finalize immediately, unresolved traces are implicitly not sampled at cleanup.
  • Keeps full-trace-way-out as the default behavior for backward compatibility.

Why

Issue #46600 proposes constraining evaluation timing/data to enable performance optimizations.

This draft PR implements that direction as a concrete alternative design to #46600, centered on an explicit strategy enum and strict policy compatibility checks (stateful policies rejected for way-in modes).

What is included

  • Config migration from sample_on_root_span_only to sampling_strategy with default full-trace-way-out.
  • Stateful-policy enforcement for way-in modes.
  • Ingest-time decision path for full-trace-way-in without re-evaluating prior spans.
  • Cleanup flow for pending traces in full-trace-way-in without policy re-evaluation.
  • Telemetry updates so way-in decisions emit existing sampling metrics consistently.
  • Unit tests for ingest-time behavior, including root-first/child-first ordering scenarios.
  • Docs/schema updates describing the new strategy semantics.

Notes for reviewers

  • This PR intentionally keeps existing metric families and reuses current decision-latency instrumentation.
  • Follow-up perf comparisons can be attached after functional review.

Test plan

  • go test ./... -count=1 in processor/tailsamplingprocessor
  • Added/updated unit tests for way-in strategy behavior and ordering cases

Made with Cursor

carsonip added 17 commits March 3, 2026 21:32
When sample_on_root_span_only is enabled and a root span arrives, evaluate immediately and avoid appending that root-triggering batch to tail storage. For sampled traces, merge the current batch in memory with any previously buffered spans and release once; add a regression test to verify append/take behavior.

Assisted-by: ChatGPT
Made-with: Cursor
Resolve tailsamplingprocessor conflict by keeping root-only immediate decision path and avoiding appending the root-triggering batch before release.

Made-with: Cursor
…ng_strategy

Introduce a string enum sampling_strategy to describe decision timing semantics explicitly, with full-trace-way-out as default and root-span-only-way-in for immediate root-based decisions. Update processor logic, schema, docs, and tests to use the new strategy and validate unsupported values.

Assisted-by: ChatGPT 5.3
Made-with: Cursor
Add IsStateful() to the samplingpolicy.Evaluator interface and implement it across all tail sampling evaluators while keeping the previous stateful semantics based on evaluator-held state across evaluations.

Made-with: Cursor
…fulness

Enforce root-span-only-way-in compatibility at policy construction time by rejecting evaluators that report IsStateful(), so statefulness stays defined by policy implementations and extensions without duplicating policy rules in config validation.

Made-with: Cursor
Remove newTracesProcessor fallback for empty sampling_strategy and update direct Config literals in tests/benchmarks/fuzz to set full-trace-way-out explicitly, keeping validation strict while relying on default config initialization in production paths.

Made-with: Cursor
Keep sampling_strategy as a typed enum but make the type and constants package-private so they are not exported outside tailsamplingprocessor, while preserving strict validation and existing behavior.

Made-with: Cursor
Document sampling_strategy in the existing flat option-list style and describe both values inline with clearer behavior semantics for decision timing and root-only evaluation scope.

Made-with: Cursor
Introduce full-trace-way-in sampling strategy with ingest-time terminal decisions, cleanup-time implicit not-sampled finalization, stateless-policy enforcement, and tests covering ingest behavior and no re-evaluation of prior spans.

Made-with: Cursor
Record global and per-policy sampling metrics for full-trace-way-in and root-span-only-way-in ingest-time decisions, and add telemetry coverage to ensure way-in modes emit decision counters consistently.

Made-with: Cursor
Add decision tests for root-first/child-later and child-first/root-later scenarios in full-trace-way-in mode, including pending cleanup behavior without policy re-evaluation.

Made-with: Cursor
Rename the policy metrics accumulator to policyEvaluationMetrics so its name matches how it is used in both tick-driven and ingest-time decision paths.

Assisted-by: ChatGPT 5.3
Made-with: Cursor
…pan helper

Add explicit dropped-decision coverage for both way-in strategies and refactor repeated single-span test construction into a shared helper for cleaner, easier-to-maintain tests.

Assisted-by: ChatGPT 5.3
Made-with: Cursor
Expand the README with dedicated sampling strategy sections that compare decision timing, tradeoffs, and caveats across full-trace-way-out and both way-in modes.

Assisted-by: ChatGPT 5.3
Made-with: Cursor
Apply formatter-driven cleanup to tests after running make fmt lint, including struct literal and parameter list formatting updates.

Assisted-by: ChatGPT 5.3
Made-with: Cursor
@carsonip
Copy link
Copy Markdown
Contributor Author

carsonip commented Mar 9, 2026

Superseded by #46762

@carsonip carsonip closed this Mar 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant