Randomness requirements following W3C Trace Context level 2 (#4162)
jmacd authored Feb 13, 2025
1 parent a8ded1a commit 8155988
Showing 6 changed files with 160 additions and 15 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -11,6 +11,8 @@ release.

### Traces

- Define randomness value requirements for W3C Trace Context Level 2.
([#4162](https://github.com/open-telemetry/opentelemetry-specification/pull/4162))
- Deprecate `exception.escaped` attribute, add link to in-development semantic-conventions
on how to record errors across signals.
([#4368](https://github.com/open-telemetry/opentelemetry-specification/pull/4368))
1 change: 1 addition & 0 deletions spec-compliance-matrix.md
@@ -87,6 +87,7 @@ formats is required. Implementing more than one format is optional.
| [Built-in `SpanProcessor`s implement `ForceFlush` spec](specification/trace/sdk.md#forceflush-1) | | | + | | + | + | + | + | + | + | + | |
| [Attribute Limits](specification/common/README.md#attribute-limits) | X | | + | | + | + | + | + | | | | |
| Fetch InstrumentationScope from ReadableSpan | | | + | | + | | | + | | | | |
| [Support W3C Trace Context Level 2 randomness](specification/trace/sdk.md#traceid-randomness) | X | | | | | | | | | | | |

## Baggage

12 changes: 12 additions & 0 deletions specification/context/api-propagators.md
@@ -32,6 +32,7 @@
* [Get Global Propagator](#get-global-propagator)
* [Set Global Propagator](#set-global-propagator)
- [Propagators Distribution](#propagators-distribution)
* [W3C Trace Context Requirements](#w3c-trace-context-requirements)
* [B3 Requirements](#b3-requirements)
+ [B3 Extract](#b3-extract)
+ [B3 Inject](#b3-inject)
@@ -377,6 +378,17 @@ Additional `Propagator`s implementing vendor-specific protocols such as AWS
X-Ray trace header protocol MUST NOT be maintained or distributed as part of
the Core OpenTelemetry repositories.

### W3C Trace Context Requirements

A W3C Trace Context propagator MUST parse and validate the `traceparent` and `tracestate` HTTP headers as specified in [W3C Trace Context Level 2](https://www.w3.org/TR/trace-context-2/). A W3C Trace Context propagator MUST propagate a valid `traceparent` value using the same header. A W3C Trace Context propagator MUST propagate a valid `tracestate` unless the value is empty, in which case the `tracestate` header may be omitted.

When injecting trace context into a carrier or extracting it from one, the following fields of the `SpanContext` are propagated:

- TraceID (16 bytes)
- SpanID (8 bytes)
- TraceFlags (8 bits)
- TraceState (string, unless empty)
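
As a rough illustration of how these four fields map onto the two headers, the following Go sketch injects and extracts a version `00` `traceparent`. The `SpanContextFields` type, the `Inject` and `Extract` functions, and the map-based carrier are hypothetical names chosen for this example, not OpenTelemetry APIs. The sketch passes `tracestate` through without validating it, which a conforming propagator would also need to do.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// SpanContextFields holds the four propagated fields.
// Hypothetical type for illustration; not an OpenTelemetry API.
type SpanContextFields struct {
	TraceID    [16]byte
	SpanID     [8]byte
	TraceFlags byte
	TraceState string // empty means the tracestate header is omitted
}

var errInvalid = errors.New("invalid traceparent")

// Inject writes the fields into a carrier, modeled here as a plain map.
func Inject(sc SpanContextFields, carrier map[string]string) {
	carrier["traceparent"] = "00-" + hex.EncodeToString(sc.TraceID[:]) +
		"-" + hex.EncodeToString(sc.SpanID[:]) +
		"-" + hex.EncodeToString([]byte{sc.TraceFlags})
	if sc.TraceState != "" {
		carrier["tracestate"] = sc.TraceState
	}
}

// Extract parses a version-00 traceparent. It rejects upper-case hex and
// all-zero IDs, and copies tracestate through without validating it.
func Extract(carrier map[string]string) (SpanContextFields, error) {
	var sc SpanContextFields
	tp := carrier["traceparent"]
	parts := strings.Split(tp, "-")
	if len(parts) != 4 || parts[0] != "00" || tp != strings.ToLower(tp) {
		return sc, errInvalid
	}
	traceID, err1 := hex.DecodeString(parts[1])
	spanID, err2 := hex.DecodeString(parts[2])
	flags, err3 := hex.DecodeString(parts[3])
	if err1 != nil || err2 != nil || err3 != nil ||
		len(traceID) != 16 || len(spanID) != 8 || len(flags) != 1 {
		return sc, errInvalid
	}
	copy(sc.TraceID[:], traceID)
	copy(sc.SpanID[:], spanID)
	if sc.TraceID == ([16]byte{}) || sc.SpanID == ([8]byte{}) {
		return SpanContextFields{}, errInvalid
	}
	sc.TraceFlags = flags[0]
	sc.TraceState = carrier["tracestate"]
	return sc, nil
}

func main() {
	carrier := map[string]string{}
	sc := SpanContextFields{TraceFlags: 0x03}
	sc.TraceID[15], sc.SpanID[7] = 0x01, 0x02
	Inject(sc, carrier)
	fmt.Println(carrier["traceparent"])
	out, err := Extract(carrier)
	fmt.Println(out == sc, err)
}
```

A real HTTP carrier treats header names case-insensitively; the map lookup here is a simplification.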

### B3 Requirements

B3 has both single and multi-header encodings. It also has semantics that do not
21 changes: 11 additions & 10 deletions specification/trace/api.md
@@ -233,16 +233,17 @@ non-zero byte.
`SpanId` A valid span identifier is an 8-byte array with at least one non-zero
byte.

`TraceFlags` contain details about the trace. Unlike TraceState values,
TraceFlags are present in all traces. The current version of the specification
only supports a single flag called [sampled](https://www.w3.org/TR/trace-context/#sampled-flag).

`TraceState` carries vendor-specific trace identification data, represented as a list
of key-value pairs. TraceState allows multiple tracing
systems to participate in the same trace. It is fully described in the [W3C Trace Context
specification](https://www.w3.org/TR/trace-context/#tracestate-header). For
specific OTel values in `TraceState`, see the [TraceState Handling](tracestate-handling.md)
document.
`TraceFlags` contain details about the trace.
Unlike TraceState values, TraceFlags are present in all traces.
The current version of the specification supports two flags:

- [Sampled](https://www.w3.org/TR/trace-context-2/#sampled-flag)
- [Random](https://www.w3.org/TR/trace-context-2/#random-trace-id-flag)
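
For orientation, the two flags can be modeled as bit masks over the 8-bit `TraceFlags` value. In W3C Trace Context Level 2, Sampled is bit 0 (`0x01`) and Random is bit 1 (`0x02`). The Go type, constant, and method names below are illustrative assumptions, not the API of any particular SDK.

```go
package main

import "fmt"

// TraceFlags is the 8-bit flags field of a SpanContext.
// Type and constant names are illustrative only.
type TraceFlags byte

const (
	FlagSampled TraceFlags = 0x01 // bit 0: W3C "sampled" flag
	FlagRandom  TraceFlags = 0x02 // bit 1: W3C Level 2 "random-trace-id" flag
)

// IsSampled reports whether the Sampled bit is set.
func (f TraceFlags) IsSampled() bool { return f&FlagSampled != 0 }

// IsRandom reports whether the Random bit is set.
func (f TraceFlags) IsRandom() bool { return f&FlagRandom != 0 }

func main() {
	f := FlagSampled | FlagRandom
	fmt.Printf("%02x sampled=%t random=%t\n", byte(f), f.IsSampled(), f.IsRandom())
	// prints: 03 sampled=true random=true
}
```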

`TraceState` carries tracing-system-specific trace identification data, represented as a list of key-value pairs.
TraceState allows multiple tracing systems to participate in the same trace.
It is fully described in the [W3C Trace Context specification](https://www.w3.org/TR/trace-context-2/#tracestate-header).
For specific OpenTelemetry values in `TraceState`, see the [TraceState Handling](tracestate-handling.md) document.

`IsRemote`, a boolean indicating whether the SpanContext was received from somewhere
else or locally generated, see [IsRemote](#isremote).
111 changes: 106 additions & 5 deletions specification/trace/sdk.md
@@ -23,6 +23,7 @@ linkTitle: SDK
- [Sampling](#sampling)
* [Recording Sampled reaction table](#recording-sampled-reaction-table)
* [SDK Span creation](#sdk-span-creation)
+ [Span flags](#span-flags)
* [Sampler](#sampler)
+ [ShouldSample](#shouldsample)
+ [GetDescription](#getdescription)
@@ -33,8 +34,17 @@ linkTitle: SDK
- [Requirements for `TraceIdRatioBased` sampler algorithm](#requirements-for-traceidratiobased-sampler-algorithm)
+ [ParentBased](#parentbased)
+ [JaegerRemoteSampler](#jaegerremotesampler)
* [Sampling Requirements](#sampling-requirements)
+ [TraceID randomness](#traceid-randomness)
+ [Random trace flag](#random-trace-flag)
+ [Explicit trace randomness](#explicit-trace-randomness)
- [Do not overwrite explicit trace randomness](#do-not-overwrite-explicit-trace-randomness)
- [Root samplers set explicit trace randomness for non-random TraceIDs](#root-samplers-set-explicit-trace-randomness-for-non-random-traceids)
+ [Presumption of TraceID randomness](#presumption-of-traceid-randomness)
+ [IdGenerator randomness](#idgenerator-randomness)
- [Span Limits](#span-limits)
- [Id Generators](#id-generators)
* [IdGenerator randomness](#idgenerator-randomness-1)
- [Span processor](#span-processor)
* [Interface definition](#interface-definition)
+ [OnStart](#onstart)
@@ -263,7 +273,7 @@ The OpenTelemetry API has two properties responsible for the data collection:
receive them unless the `Sampled` flag was also set.
* `Sampled` flag in `TraceFlags` on `SpanContext`. This flag is propagated via
the `SpanContext` to child Spans. For more details see the [W3C Trace Context
specification](https://www.w3.org/TR/trace-context/#sampled-flag). This flag indicates that the `Span` has been
specification][W3CCONTEXTSAMPLEDFLAG]. This flag indicates that the `Span` has been
`sampled` and will be exported. [Span Exporters](#span-exporter) MUST
receive those spans which have `Sampled` flag set to true and they SHOULD NOT receive the ones
that do not.
@@ -298,10 +308,7 @@ When asked to create a Span, the SDK MUST act as if doing the following in order
1. If there is a valid parent trace ID, use it. Otherwise generate a new trace ID
(note: this must be done before calling `ShouldSample`, because it expects
a valid trace ID as input).
2. Query the `Sampler`'s [`ShouldSample`](#shouldsample) method
(Note that the [built-in `ParentBasedSampler`](#parentbased) can be used to
use the sampling decision of the parent,
translating a set SampledFlag to RECORD and an unset one to DROP).
2. Query the `Sampler`'s [`ShouldSample`](#shouldsample) method.
3. Generate a new span ID for the `Span`, independently of the sampling decision.
This is done so other components (such as logs or exception handling) can rely on
a unique span ID, even if the `Span` is a non-recording instance.
@@ -314,6 +321,14 @@ When asked to create a Span, the SDK MUST act as if doing the following in order
`Span` is created without an SDK installed or as described in
[wrapping a SpanContext in a Span](api.md#wrapping-a-spancontext-in-a-span).

#### Span flags

The OTLP representation for Span and Span Link includes a 32-bit field declared as Span Flags.

Bits 0-7 (8 least significant bits) of the Span Flags field are reserved for the 8 bits of Trace Context flags,
specified in the [W3C Trace Context Level 2][W3CCONTEXTMAIN] Candidate Recommendation.
[See the list of recognized flags](./api.md#spancontext).
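
A minimal Go sketch of this bit layout follows. It assumes only what is stated above (the low 8 bits of the 32-bit field carry the trace flags); the function names are hypothetical, and the meaning of bits 8-31 is left untouched.

```go
package main

import "fmt"

// traceFlagsMask selects bits 0-7 of the 32-bit OTLP Span Flags field.
const traceFlagsMask uint32 = 0xFF

// TraceFlagsFromSpanFlags extracts the 8 W3C trace flags.
func TraceFlagsFromSpanFlags(spanFlags uint32) byte {
	return byte(spanFlags & traceFlagsMask)
}

// WithTraceFlags replaces the low 8 bits and preserves bits 8-31.
func WithTraceFlags(spanFlags uint32, traceFlags byte) uint32 {
	return spanFlags&^traceFlagsMask | uint32(traceFlags)
}

func main() {
	spanFlags := WithTraceFlags(0x00000100, 0x03)
	fmt.Printf("%#08x -> trace flags %#02x\n", spanFlags, TraceFlagsFromSpanFlags(spanFlags))
}
```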

### Sampler

`Sampler` interface allows users to create custom samplers which will return a
@@ -464,6 +479,80 @@ The following configuration properties should be available when creating the sam
[jaeger-remote-sampling-api]: https://www.jaegertracing.io/docs/1.41/apis/#remote-sampling-configuration-stable
[jaeger-adaptive-sampling]: https://www.jaegertracing.io/docs/1.41/sampling/#adaptive-sampling

### Sampling Requirements

**Status**: [Development](../document-status.md)

The [W3C Trace Context Level 2][W3CCONTEXTMAIN] Candidate Recommendation includes [a Random trace flag][W3CCONTEXTRANDOMFLAG] for indicating that the TraceID contains 56 random bits, specified for statistical purposes.
This flag indicates that [the least-significant ("rightmost") 7 bytes or 56 bits of the TraceID are random][W3CCONTEXTTRACEID].

Note that the Random flag does not propagate through [Trace Context Level 1][W3CCONTEXTLEVEL1] implementations, which do not recognize the flag.
A flag value of 1 is meaningful: it asserts that the TraceID is random. A flag value of 0 is ambiguous: the TraceID may be non-random, or a Trace Context Level 1 propagator may have been used.
To enable sampling in this and other situations where TraceIDs lack sufficient randomness,
OpenTelemetry defines an optional [explicit randomness value][OTELRVALUE] encoded in the [W3C TraceState field][W3CCONTEXTTRACESTATE].

This specification recommends the use of either TraceID randomness or explicit trace randomness,
which ensures that samplers always have sufficient randomness when using W3C Trace Context propagation.

[W3CCONTEXTMAIN]: https://www.w3.org/TR/trace-context-2
[W3CCONTEXTLEVEL1]: https://www.w3.org/TR/trace-context
[W3CCONTEXTTRACEID]: https://www.w3.org/TR/trace-context-2/#randomness-of-trace-id
[W3CCONTEXTTRACESTATE]: https://www.w3.org/TR/trace-context-2/#tracestate-header
[W3CCONTEXTSAMPLEDFLAG]: https://www.w3.org/TR/trace-context-2/#sampled-flag
[W3CCONTEXTRANDOMFLAG]: https://www.w3.org/TR/trace-context-2/#random-trace-id-flag
[OTELRVALUE]: ./tracestate-handling.md#explicit-randomness-value-rv

#### TraceID randomness

For root span contexts, the SDK SHOULD implement the TraceID randomness requirements of the [W3C Trace Context Level 2][W3CCONTEXTTRACEID] Candidate Recommendation when generating TraceID values.

#### Random trace flag

For root span contexts, the SDK SHOULD set the `Random` flag in the trace flags when it generates TraceIDs that meet the [W3C Trace Context Level 2 randomness requirements][W3CCONTEXTTRACEID].
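
One way to satisfy both recommendations for a root span, sketched in Go under the assumption that all 16 bytes are drawn from `crypto/rand` (which makes the rightmost 7 bytes random as well). The function name `NewRootTraceID` and the `flagRandom` constant are hypothetical. W3C Trace Context Level 2 does not require a cryptographic source, so this is one possible choice rather than the required one.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// flagRandom is bit 1 of the trace flags (W3C Trace Context Level 2).
const flagRandom byte = 0x02

// NewRootTraceID returns a fully random TraceID together with trace flags
// that have the Random bit set. An all-zero TraceID is invalid, so the
// (astronomically unlikely) all-zero draw is retried.
func NewRootTraceID() (traceID [16]byte, flags byte, err error) {
	for {
		if _, err = rand.Read(traceID[:]); err != nil {
			return traceID, 0, err
		}
		if traceID != ([16]byte{}) {
			return traceID, flagRandom, nil
		}
	}
}

func main() {
	id, flags, err := NewRootTraceID()
	if err != nil {
		panic(err)
	}
	fmt.Printf("trace-id=%s flags=%02x\n", hex.EncodeToString(id[:]), flags)
}
```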

#### Explicit trace randomness

Explicit trace randomness is a mechanism that enables API users and
SDK authors to control trace randomness. The following recommendation
applies to Trace SDKs that have disregarded the recommendation on
TraceID randomness, above. It has two parts.

##### Do not overwrite explicit trace randomness

API users control the initial TraceState of a root span, so they can
provide explicit trace randomness for a trace by defining the [`rv`
sub-key of the OpenTelemetry TraceState][OTELRVALUE]. SDKs and Samplers
MUST NOT overwrite explicit trace randomness in an OpenTelemetry TraceState
value.

##### Root samplers set explicit trace randomness for non-random TraceIDs

When the SDK has generated a TraceID that does not meet the [W3C Trace
Context Level 2 randomness requirements][W3CCONTEXTTRACEID], indicated
by an unset trace random flag, and when the [`rv` sub-key of the
OpenTelemetry TraceState][OTELRVALUE] is not already set, the Root
sampler has the opportunity to insert explicit trace randomness.

Root Samplers MAY insert an explicit trace randomness value into the
OpenTelemetry TraceState value in cases where an explicit trace
randomness value is not already set.

For example, here's a W3C Trace Context with non-random identifiers and an
explicit randomness value:

```
traceparent: 00-ffffffffffffffffffffffffffffffff-ffffffffffffffff-00
tracestate: ot=rv:7479cfb506891d
```
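
The two rules above (never overwrite an existing `rv`, optionally insert one for a non-random TraceID) might be sketched in Go as follows. This is a hedged illustration: the function names are hypothetical, and the code assumes the value of the `ot` TraceState entry is a `;`-separated list of `key:value` sub-keys, as described in the TraceState Handling document.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"strings"
)

// flagRandom is bit 1 of the trace flags (W3C Trace Context Level 2).
const flagRandom byte = 0x02

// hasRV reports whether the value of the "ot" entry already has an rv sub-key.
func hasRV(otValue string) bool {
	for _, kv := range strings.Split(otValue, ";") {
		if strings.HasPrefix(kv, "rv:") {
			return true
		}
	}
	return false
}

// MaybeInsertRV returns otValue unchanged when the Random flag is set or an
// rv sub-key exists. Otherwise it appends 56 fresh random bits as 14
// lower-case hexadecimal digits.
func MaybeInsertRV(otValue string, traceFlags byte) (string, error) {
	if traceFlags&flagRandom != 0 || hasRV(otValue) {
		return otValue, nil
	}
	var r [7]byte
	if _, err := rand.Read(r[:]); err != nil {
		return otValue, err
	}
	rv := "rv:" + hex.EncodeToString(r[:])
	if otValue == "" {
		return rv, nil
	}
	return otValue + ";" + rv, nil
}

func main() {
	v, err := MaybeInsertRV("", 0x00)
	fmt.Println("tracestate: ot="+v, err)
}
```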

#### Presumption of TraceID randomness

For all span contexts, OpenTelemetry samplers SHOULD presume that TraceIDs meet the W3C Trace Context Level 2 randomness requirements, unless an explicit randomness value is present in the [`rv` sub-key of the OpenTelemetry TraceState][OTELRVALUE].
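
In code, this presumption amounts to a simple precedence rule: use `rv` when it is present, otherwise take the rightmost 56 bits of the TraceID. The Go sketch below uses a hypothetical `Randomness` function and assumes the `ot` entry value is a `;`-separated list of `key:value` sub-keys.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// Randomness returns the 56-bit randomness value a sampler would use:
// the rv sub-key if present, else the rightmost 7 bytes of the TraceID.
func Randomness(traceID [16]byte, otValue string) (uint64, error) {
	for _, kv := range strings.Split(otValue, ";") {
		if !strings.HasPrefix(kv, "rv:") {
			continue
		}
		rv := kv[len("rv:"):]
		if len(rv) != 14 || rv != strings.ToLower(rv) {
			return 0, errors.New("rv must be 14 lower-case hex digits")
		}
		return strconv.ParseUint(rv, 16, 64)
	}
	// Bytes 8..15 as a big-endian integer, masked to the low 56 bits.
	return binary.BigEndian.Uint64(traceID[8:]) & (1<<56 - 1), nil
}

func main() {
	var id [16]byte
	copy(id[9:], []byte{0x6e, 0x6d, 0x1a, 0x75, 0x83, 0x2a, 0x2f})
	r, _ := Randomness(id, "")
	fmt.Printf("%014x\n", r) // prints 6e6d1a75832a2f
	r, _ = Randomness(id, "rv:7479cfb506891d")
	fmt.Printf("%014x\n", r) // prints 7479cfb506891d
}
```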

#### IdGenerator randomness

If the SDK uses an `IdGenerator` extension point, the SDK SHOULD allow the extension to determine whether the Random flag is set when new IDs are generated.

## Span Limits

Span attributes MUST adhere to the [common rules of attribute limits](../common/README.md#attribute-limits).
@@ -532,6 +621,18 @@ Additional `IdGenerator` implementing vendor-specific protocols such as AWS
X-Ray trace id generator MUST NOT be maintained or distributed as part of the
Core OpenTelemetry repositories.

### IdGenerator randomness

**Status**: [Development](../document-status.md)

Custom implementations of the `IdGenerator` SHOULD identify themselves
appropriately when all generated TraceID values meet the [W3C Trace
Context Level 2 randomness requirements][W3CCONTEXTTRACEID], so that
the Trace `random` flag will be set in the associated Trace contexts.
This is presumed to be a static property of the `IdGenerator`
implementation which can be inferred using language features, for
example by extending a marker interface.
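
A marker interface of this kind could look like the following Go sketch. All names here (`IDGenerator`, `RandomIDGenerator`, `rootTraceFlags`, and the two generators) are hypothetical and simplified; an actual SDK's `IdGenerator` extension point will have a different shape.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"fmt"
)

// IDGenerator is a simplified stand-in for an SDK's IdGenerator extension point.
type IDGenerator interface {
	NewTraceID() [16]byte
}

// RandomIDGenerator is a hypothetical marker interface. Implementing it
// declares that every generated TraceID meets the Level 2 randomness
// requirements.
type RandomIDGenerator interface {
	IDGenerator
	GeneratesRandomTraceIDs() // no behavior; its presence is the signal
}

// rootTraceFlags sets the Random bit (0x02) only for marked generators.
func rootTraceFlags(g IDGenerator) byte {
	if _, ok := g.(RandomIDGenerator); ok {
		return 0x02
	}
	return 0
}

// counterGen produces sequential, non-random IDs and is not marked.
type counterGen struct{ n uint64 }

func (c *counterGen) NewTraceID() [16]byte {
	c.n++
	var id [16]byte
	binary.BigEndian.PutUint64(id[8:], c.n)
	return id
}

// randomGen draws all 16 bytes from crypto/rand and carries the marker.
type randomGen struct{}

func (randomGen) NewTraceID() [16]byte {
	var id [16]byte
	if _, err := rand.Read(id[:]); err != nil {
		panic(err)
	}
	return id
}

func (randomGen) GeneratesRandomTraceIDs() {}

func main() {
	fmt.Printf("counter: %02x, random: %02x\n",
		rootTraceFlags(&counterGen{}), rootTraceFlags(randomGen{}))
	// prints: counter: 00, random: 02
}
```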

## Span processor

Span processor is an interface which allows hooks for span start and end method
28 changes: 28 additions & 0 deletions specification/trace/tracestate-handling.md
@@ -83,3 +83,31 @@ if ok {
// traceState was not updated.
}
```

## Pre-defined OpenTelemetry sub-keys

The following values have been defined by OpenTelemetry.

### Explicit randomness value `rv`

The OpenTelemetry TraceState `rv` sub-key defines an alternative source of randomness called the _explicit randomness value_.
Values of `rv` MUST be exactly 14 lower-case hexadecimal digits:

```
rv       = 14hexdigit
hexdigit = DIGIT / %x61-66 ; "0"-"9" or lower-case "a"-"f"
```

The explicit randomness value is meant to be used instead of extracting randomness from TraceIDs, therefore it contains the same number of bits as W3C Trace Context Level 2 recommends for TraceIDs.

Lowercase hexadecimal digits are specified to enable direct lexicographical comparison between a sampling threshold and either the TraceID (as it appears in the `traceparent` header) or the explicit randomness value (as it appears in the `tracestate` header).
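
The comparison works because fixed-width, lower-case hexadecimal strings sort lexicographically in the same order as the integers they encode. A small Go sketch, assuming a rejection threshold that is also written as 14 lower-case hexadecimal digits and a decision rule of "sample when randomness is greater than or equal to the threshold"; the function names are hypothetical.

```go
package main

import "fmt"

// IsValidRV reports whether s is exactly 14 lower-case hexadecimal digits.
func IsValidRV(s string) bool {
	if len(s) != 14 {
		return false
	}
	for i := 0; i < len(s); i++ {
		c := s[i]
		if !(c >= '0' && c <= '9' || c >= 'a' && c <= 'f') {
			return false
		}
	}
	return true
}

// ShouldSample compares the two 14-digit strings directly, with no parsing.
// Both arguments are assumed to have passed IsValidRV.
func ShouldSample(randomness, threshold string) bool {
	return randomness >= threshold
}

func main() {
	traceID := "ffffffffffffffffff6e6d1a75832a2f" // 32 hex digits from traceparent
	randomness := traceID[len(traceID)-14:]      // rightmost 56 bits as text
	fmt.Println(ShouldSample(randomness, "6e6d1a75832a2f")) // true
	fmt.Println(ShouldSample(randomness, "80000000000000")) // false
}
```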

Explicit randomness values are meant to propagate through [span contexts](../context/README.md) unmodified.
Explicit randomness values SHOULD NOT be erased from the OpenTelemetry TraceState or modified once associated with a new TraceID, so that sampling decisions made using the explicit randomness value are consistent across signals.

For example, here is a W3C TraceState value including an OpenTelemetry explicit randomness value:

```
tracestate: ot=rv:6e6d1a75832a2f
```

This corresponds to an explicit randomness value, interpreted as an unsigned integer, of 0x6e6d1a75832a2f. This randomness value is meant to be used instead of the least-significant 56 bits of the TraceID. In this example, the 56-bit fraction (i.e., 0x6e6d1a75832a2f / 0x100000000000000 = 43.1%) supports making a consistent positive sampling decision at probabilities ranging from 56.9% through 100% (i.e., rejection threshold values 0x6e6d1a75832a2f through 0), the same as for a hexadecimal TraceID ending in 6e6d1a75832a2f without an explicit randomness value.
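
The arithmetic in this example can be reproduced with a few lines of Go. The helper name `MinSamplingProbability` is hypothetical; it computes 1 - rv / 2^56, the lowest sampling probability at which this randomness value still yields a positive decision.

```go
package main

import "fmt"

// MinSamplingProbability returns 1 - rv/2^56 for a 56-bit randomness value.
func MinSamplingProbability(rv uint64) float64 {
	return 1 - float64(rv)/float64(uint64(1)<<56)
}

func main() {
	const rv uint64 = 0x6e6d1a75832a2f
	fraction := float64(rv) / float64(uint64(1)<<56)
	fmt.Printf("fraction: %.1f%%\n", 100*fraction)                               // fraction: 43.1%
	fmt.Printf("minimum probability: %.1f%%\n", 100*MinSamplingProbability(rv)) // minimum probability: 56.9%
}
```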
