
Proposal for sampling.priority #107

Closed
wants to merge 7 commits into from

Conversation

Kanwaldeep

No description provided.

@Kanwaldeep
Author

I signed it

Contributor

@jmacd left a comment


I don't understand what "priority" means in the context of this document, nor why it should be in the range 1-1000. Can you explain?

I am familiar with sampling "probability". How will a fixed-probability sampler encode its probability? How will an unequal-probability sampler encode its probability?

@Kanwaldeep
Author

We could consider changing the name to make it clearer. Also, it was pointed out by @reyang that it is not following the ABNF format for the key. As the trace passes service boundaries between components that use different sampling algorithms, how do we ensure that a trace sampled in by the upstream component is honored? So as the trace is initiated, the sampling priority is calculated by whatever algorithm is in use, and then the other components in the call path use this value to decide whether the trace should be recorded based on their sampling rate.

@Oberon00
Member

To try clearing up some confusion: Apart from the 0-1000 instead of 0..1 range, when you talk about "priority", did you mean to use the word "probability" instead, @Kanwaldeep?


### Probability Sampler default behavior

The default behavior of probability sampler is to calculate a stateless hash
Member

@Oberon00 May 19, 2020


Note that this is not yet agreed upon, although there are some PRs open, namely open-telemetry/opentelemetry-specification#331 which is several months old and just yesterday we got open-telemetry/opentelemetry-specification#611

Member


Is this behavior even relevant to this OTEP? I think whichever algorithm is used by the sampler, as long as it can express the result via a "probability" value (not every sampler can, e.g. rate limiting), the actual calculation is not relevant.

function of `trace-id` to make a sampling decision.

It is suggested to extend the sampler behavior to expose a `tracestate` field
called `sampling.priority` with the integer value `[0-1000]` that will indicate
Member


indicate how? What does a value of 408 mean? 40.8% probability?

Member


yes. The max value needs to be agreed on. 0-1000 would work for most services (sampling down to 1 out of 1000). Extremely loaded services may need more, but they would likely opt out of propagating this flag to optimize performance.

Alternative approach is to go 1..100 and allow non-integers.

Member


Large scale systems are especially in need of having one trace stand for many traces; thus I worry about artificially capping the sample rate at 1000.

sampling priority of the current span.

This will allow sampling algorithms to be aligned between various components,
especially for transition scenarios from different SDKs.
Member


Which "transition" scenarios? Switching from an old to a new version? Please clarify.

Member


yes. If different hashing algos are used for historical reasons, you can make components respect sampling.priority first and then change the algo underneath.

Member


it may be expanded in a real spec. @Oberon00, does this note clarify things, or do you think more descriptive text is needed here? If it's very unclear, @Kanwaldeep, it's worth expanding a bit.

Member


Agreed: the discussion in open-telemetry/opentelemetry-specification#611 has people split between using the "random bits" of the UUID from trace-id generation vs. hashing the whole trace id again, in case the trace id was not randomly generated.

@Oberon00
Member

Oberon00 commented May 19, 2020

Please write about this in the prior art section. I can't find the connection of the word "priority" as defined in e.g. https://www.merriam-webster.com/dictionary/priority to the meaning I assume it has in this OTEP. But following the link behind your link to https://github.com/opentracing/specification/blob/11dd7f8/semantic_conventions.md#span-tags-table the definition of priority seems to be effectively a simple boolean hasPriority:

If greater than 0, a hint to the Tracer to do its best to capture the trace. If 0, a hint to the trace to not-capture the trace. If absent, the Tracer should use its default sampling mechanism.

(OK, actually a tri-bool, with the three states absent or less than zero: default; zero: don't sample; greater than zero: please sample).

@Kanwaldeep
Author

(OK, actually a tri-bool, with the three states absent or less than zero: default; zero: don't sample; greater than zero: please sample).

I can add the link to prior art as mentioned. Also, I don't think the tri-bool would work, as the proposal is to use sampling.priority to make the sampling decision if it is passed: sample in if the sampling priority is within the adjusted sampling rate configured for the component. @SergeyKanzhelev let me know if this is what you are thinking as well. If so, I can work on updating the proposal.

@SergeyKanzhelev
Member

@Oberon00 most implementations just set a value >0 to indicate a "sampled" decision. This proposal extends it to also propagate the probability of sampling. In other words, even though the decision was to DO sample, since a component really wants to sample, it can use this probability instead of a hash of the trace ID.

Do you know of other prior art and names that can be used? Or are you concerned that some might have used an attribute with the same name differently before, without setting a probabilistic value?

@Oberon00
Member

I'm mostly concerned that you never define what your "priority" is in this PR. If you clearly define what it is, maybe we can then find a better name as a next step, or maybe we find that priority does fit the definition.

@Kanwaldeep
Author

I'm mostly concerned that you never define what your "priority" is in this PR. If you clearly define what it is, maybe we can then find a better name as a next step, or maybe we find that priority does fit the definition.

I've updated the proposal with the priority calculation.

@jmacd
Contributor

jmacd commented May 28, 2020

I still do not understand why the word "priority" is being used. It seems misleading, and I also do not see why we would create a standard that encourages loss of information due to rounding.

I can see two useful and non-lossy conventions we might adopt:

  1. sampling.probability with a floating point value in the range 0-1
  2. sampling.adjusted_count with a floating point value in the range 0-infinity

One is the inverse of the other. The reason why "adjusted_count" (i.e., the inverse probability) is my preference is that it has a natural interpretation: the adjusted count of a sampled item is the number of items it is representative of in the whole population. Items generated with a 40.8% probability sample have adjusted counts of 1/0.408 = 2.45098039216. When you see one of these items, you would count it as 2.45098039216 items.

It is important not to lose precision and accuracy here because we would like to transform sampled span data into metric data. When we generate a metric count from a 40.8% sampled span, we should count precisely 2.45098039216 items.
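The adjusted-count arithmetic above can be sketched in a few lines of Python; `adjusted_count` is a hypothetical helper name for illustration, not part of any SDK:

```python
# Hypothetical helper: convert a sampling probability into an "adjusted
# count", i.e. the number of population items one sampled item represents.

def adjusted_count(sampling_probability: float) -> float:
    """Inverse of the sampling probability; requires 0 < p <= 1."""
    if not 0.0 < sampling_probability <= 1.0:
        raise ValueError("sampling probability must be in (0, 1]")
    return 1.0 / sampling_probability

# A span sampled with 40.8% probability stands for ~2.45 spans in the full
# population, so a metrics pipeline would add that weight instead of 1.
weight = adjusted_count(0.408)
assert abs(weight - 2.45098039216) < 1e-9
```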

@MikeGoldsmith
Member

I like probability over adjusted_count, given its bounded range between 0 and 1. Either way, I agree both terms are more accurate to the intention than priority.

@Kanwaldeep
Author

I agree with sampling.probability instead of priority with a floating point value in the range 0-1. My only concern is if the name would be confusing with this open PR - open-telemetry/opentelemetry-specification#570

@jmacd
Contributor

jmacd commented May 29, 2020

It looks like open-telemetry/opentelemetry-specification#570 and this proposal are the same, other than "sampling" vs "sample" as the label key prefix. Maybe we don't need this OTEP, just need to merge 570.

@Oberon00
Member

Except for the in-band propagation via tracestate, which seems pretty important.

@jmacd
Contributor

jmacd commented May 29, 2020

OK. I read through this and feel a bit befuddled by the discussion about coordinating sampling. Can we agree that the in-band propagation of sampling probability is independent and orthogonal to the in-band propagation of the sampling policy that was used? I am interested in knowing that my parent was or was-not selected for sampling and the (upper bound on) its probability. I am not interested in head sampling using fixed probability.

The reason I wrote "upper bound on" is that it is easy to imagine a sampling scheme combining head and tail sampling policies. We might use a 10% head sample to lower the overhead of tracing, and then decide which among those 10% of spans should be recorded in a tail sampler. The tail sampler will correctly compute the outcome probability and apply it as a sampling.probability attribute, but the probability is expected to be <= 10%. Therefore the value we propagate in-band is an upper-bound on the probability that the parent was recorded.
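A minimal sketch of that head/tail composition, with illustrative numbers (the function name and values are assumptions for illustration, not spec'd API):

```python
# Independent sampling stages multiply:
#   P(recorded) = P(head sampled) * P(tail sampled | head sampled).
# The in-band value can only carry the head probability, which is
# therefore an upper bound on the final recording probability.

def effective_probability(head_p: float, tail_p: float) -> float:
    return head_p * tail_p

head_p = 0.10   # 10% head sample, propagated in-band
tail_p = 0.50   # tail sampler keeps half of the surviving spans
final_p = effective_probability(head_p, tail_p)

assert abs(final_p - 0.05) < 1e-12
assert final_p <= head_p   # the propagated value is an upper bound
```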

@Kanwaldeep
Author

Yes, agreed that the sampling probability is different from the upper bound of the sampling policy.

This proposal is about having the first app/component in the distributed trace calculating the probability and propagating the value so the next app/component can honor it. This is to align apps using different algorithms for Probability Sampler.

@jmacd
Contributor

jmacd commented May 30, 2020

Could we make clear why a tracing system would prefer to propagate probability and then use it downstream, as opposed to propagating a decision (true/false) for the downstream system to honor? (I'm not sure I know the answer.)

@Kanwaldeep
Author

The reason we need probability is that the components downstream are using different algorithms for sampling. So let's assume head component A started the trace and sampled in 60% of the traces, and then the next component B has a sampling policy of 30% but is using a different sampling algorithm. In this scenario, how can we make sure that the 30% of the traces sampled in by component B are a subset of the 60% of traces sampled in by component A?

@lizthegrey
Member

The reason we need probability is that the components downstream are using different algorithms for sampling. So let's assume head component A started the trace and sampled in 60% of the traces, and then the next component B has a sampling policy of 30% but is using a different sampling algorithm. In this scenario, how can we make sure that the 30% of the traces sampled in by component B are a subset of the 60% of traces sampled in by component A?

This, and also systems that depend upon dynamically sampling traces to generate metrics rather than a separate metrics system, need to know how many spans one span represents when counting up spans, computing percentiles of latency, etc.

@jmacd
Contributor

jmacd commented Jun 9, 2020

I do not see any disagreement with open-telemetry/opentelemetry-specification#570.

The calculated probability of the span equals the configured probability of the sampler. If you have another interpretation, please clarify?

@lizthegrey
Member

I do not see any disagreement with open-telemetry/opentelemetry-specification#570.

The calculated probability of the span equals the configured probability of the sampler. If you have another interpretation, please clarify?

Yup, we're in vigorous agreement.

@jmacd
Contributor

jmacd commented Jun 9, 2020

Note that we are discussing sampling in the metrics API here: #113

In that proposal, I've encouraged us to use the term "sample_count" to describe inverse probability. I'm willing to accept a "sample_probability" field instead, containing the inverse.

@Kanwaldeep
Author

Kanwaldeep commented Jun 9, 2020

I do not see any disagreement with open-telemetry/opentelemetry-specification#570.

The calculated probability of the span equals the configured probability of the sampler. If you have another interpretation, please clarify?

Definitely seems like I am missing something :( . The probability sampler calculates the hash function and checks whether the result is within the configured probability of the sampler. For example: Sampled-in = hash(trace-id) / max_long < sampling-ratio. So the output of hash(trace-id) / max_long is what we want to propagate. My understanding is that the sampling-ratio is the configured probability of the sampler.
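That decision rule can be sketched as follows. The hash here (sha256) is just an illustrative stand-in, since the actual stateless hash function is implementation-defined, and the function names are hypothetical:

```python
import hashlib

def trace_score(trace_id: str) -> float:
    # Map the trace id to a deterministic score in [0, 1); this plays the
    # role of hash(trace-id) / max_long in the formula above.
    digest = hashlib.sha256(trace_id.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def sampled_in(trace_id: str, sampling_ratio: float) -> bool:
    # Sampled-in = hash(trace-id) / max_long < sampling-ratio
    return trace_score(trace_id) < sampling_ratio

trace_id = "4bf92f3577b34da6a3ce929d0e0e4736"  # example W3C-style trace id
assert sampled_in(trace_id, 1.0)       # ratio 1.0 samples everything
assert not sampled_in(trace_id, 0.0)   # ratio 0.0 samples nothing
```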

@jmacd
Contributor

jmacd commented Jun 9, 2020

The value of hash(trace-id) / max_long is presumably a uniform random number in the range [0,1] (assuming a perfect hash function). The equation hash(trace-id) / max_long < sampling_probability will be true with probability sampling_probability. If you consider one span, the expected number of spans that will be recorded equals sampling_probability, so with 100% sampling you expect 1 output sample span per arriving span. With 50% sampling you expect 0.5 sample spans per arriving span.

If you consider 1 / sampling_probability arriving spans, the expected number of sample spans equals sampling_probability * 1 / sampling_probability == 1, and that is why when we see one sample span we want to know its sampling_probability (or the inverse).

After the sampling decision is made, the value of hash(trace-id) / max_long is not relevant; sampling_probability (or the inverse) is what matters to the downstream system.

@Kanwaldeep
Author

So if you only pass a sampling_probability of, say, 0.6, then it's not clear to me how the following scenario can be handled.

So let's assume head component A started the trace and sampled in 60% of the traces and then next component B has sampling policy of 30% but is using different sampling algorithm. In this scenario how can we make sure that the 30% of the traces sampled in by component B are subset of the 60% traces sampled in by component A.

The idea of the proposal is that Component B checks the passed probability and makes the decision. So if the passed value is 0.6, then Component B would not sample it in. But if the output of the hash function in component A was 0.25 and that value was passed to Component B, then it would sample this trace in, as it is using the probability calculated by A to make the decision.
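The coordination described above can be sketched like this; `decide` and the variable names are illustrative, not proposed API:

```python
# Component A (the root) computes a score once and propagates it, e.g. as
# the proposed sampling.priority tracestate entry. Every downstream
# component compares that same score against its own configured rate, so
# B's 30% is guaranteed to be a subset of A's 60%.

def decide(upstream_score: float, my_sampling_rate: float) -> bool:
    return upstream_score < my_sampling_rate

score_from_a = 0.25                # hash output computed once by component A
assert decide(score_from_a, 0.60)  # A (60%) samples the trace in
assert decide(score_from_a, 0.30)  # B (30%) keeps it: subset of A's set
assert not decide(0.45, 0.30)      # B drops scores >= its 30% rate
```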

@jmacd
Contributor

jmacd commented Jun 9, 2020

Now I understand the meaning of "priority", I apologize for taking so long to get to this point.

There are a lot of ways to organize a sampling scheme. The ones I'm most familiar with allow the root in a trace to make the sampling decision, and all the children will respect it. In this case, the children don't need to know the probability.

Your example describes a coordinated sampling scheme, where it is important to know how the decision was made, for the purpose of coordination. Sending the proposed "priority" (hash(trace_id) / max_long) is one way to achieve that, but then all uses of this value have to follow this one scheme, right?

I believe elsewhere @yurishkuro has described how Jaeger conveys this information as two attributes: (1) the sampling probability, (2) the sampling technique. In your example, we could transmit the sample.probability as a value in the range [0,1] along with a description of the coordination scheme you've used, which in this case means a descriptor for "consistent hashing of trace_id". For example:

sample.probability=0.5
sample.scheme=cityhash(traceid)

The recipient must know the scheme, and then can apply the same hash to compute the priority value that you're interested in.

@Kanwaldeep
Author

Your example describes a coordinated sampling scheme, where it is important to know how the decision was made, for the purpose of coordination. Sending the proposed "priority" (hash(trace_id) / max_long) is one way to achieve that, but then all uses of this value have to follow this one scheme, right?

They could be configured not to use the passed-in sampling.priority. But yes, all the components use the output of the algorithm as calculated by the root span.

@Kanwaldeep
Author

The recipient must know the scheme, and then can apply the same hash to compute the priority value that you're interested in.

I will wait for @yurishkuro for more details on this. But passing sample.scheme would require all the downstream components to implement support for the various schemes in order to calculate the probability. My 2 cents: passing sampling.priority seems much more efficient for this use case.

@jmacd
Contributor

jmacd commented Jun 11, 2020

It looks like Jaeger uses another meaning for the word "priority", see here: #115 (comment)

@yurishkuro there is a question for you above.

@yurishkuro
Member

Not sure what the question is. Jaeger does not pass sampling parameters downstream. The decision is made once, represented by a single bit in the flags, and respected by all children. The sampler.type and sampler.param tags are reported with the span out of band.

@jmacd
Contributor

jmacd commented Jun 12, 2020

Thanks @yurishkuro, sorry it wasn't a proper question. I couldn't find information about sampler.type and sampler.param.

@Kanwaldeep I wonder if this use of a tracestate attribute even needs to be standardized. The system you are describing will have to be aware of this on either end, right? At the very least, I believe the word "priority" has caused considerable confusion and I wonder if there might be a better choice.

@Kanwaldeep
Author

@Kanwaldeep I wonder if this use of a tracestate attribute even needs to be standardized. The system you are describing will have to be aware of this on either end, right? At the very least, I believe the word "priority" has caused considerable confusion and I wonder if there might be a better choice.

I was thinking sampling.probability.output, as this would make it clear. By making this a standard, we make it easier for teams to adopt the OTel SDK if they are already using other SDKs.

@Kanwaldeep
Author

Based on discussion in Sampling WG meeting we are working on PR for:

  • Sampler should be able to add/modify keys on tracestate.
  • Plugins to have access to tracestate. This way, new samplers are not blocked by having to agree on the meanings of common terms like “priority.”

@Oberon00
Member

Oberon00 commented Jul 6, 2020

What are "plugins" here? You mean tracestate should be publicly accessible? I think it is.

@ericmustin

👋 Hope all is well. Just popping in from otel-js to echo what @lizthegrey offered here #107 (comment). This PR's approach, or something like otep-specification #570 would be tremendously helpful for vendor specific exporters/systems which rely on the sampled traces/spans themselves to generate metrics (like hits/latency/etc) rather than a separate metrics system. I won't pretend to have enough knowledge about the Spec to say whether this info should live on span.attribute or span.context.traceState, but just that having enough information to infer a span's sample.count/weight/etc at the exporter level would be quite useful for many practical use cases.

@Oberon00
Member

I won't pretend to have enough knowledge about the Spec to say whether this info should live on span.attribute or span.context.traceState

This is the crucial difference though. Samplers can already set normal attributes today. Is this enough for you? The key difference is that normal attributes are not propagated in-band, while traceState is (e.g. in the W3C tracestate HTTP header).

@ericmustin

This is the crucial difference though. Samplers can already set normal attributes today. Is this enough for you? The key difference is that normal attributes are not propagated in-band, while traceState is (e.g. in the W3C tracestate HTTP header).

@Oberon00 In-band means within the same process, right? Basically, would that attribute be propagated from the parent to a non-root span? It seems like at the moment (at least in practice) that is not the case, and that is what PR #570 is trying to address, no?

In-band propagation (if I understand the term correctly) is the main issue for the use case I have in mind, since currently the default ProbabilitySampler leaves exporters blind to the sampling decision on any downstream spans they receive.

@Oberon00
Member

Oberon00 commented Jul 17, 2020

In-band means, within the same process

No, it means in the same communication "band" that the process would also use without instrumentation to communicate with other processes. E.g. if a process normally makes an HTTP request to some other service, with in-band propagation, headers would be injected into that HTTP request.

But span attributes (as opposed to trace state) are not propagated at all (not in-process, not cross-process, not in-band; they are sent out-of-band to some backend with the exporter typically).

@ericmustin

@Oberon00 ah ok, gotcha, thank you for the clarification. So, at the most basic level, what matters most for me is in-process propagation. This seems more reasonably accomplished by PR #570. Does that help clarify things? TL;DR I should've commented on PR #570 instead, but I felt the use case described in the comments here was worth highlighting.

@lmolkova
Contributor

Closing in favor of #135.
I've tried to address all comments there; sorry for the lost history of comments...

@Kanwaldeep
Author

Kanwaldeep commented Aug 21, 2020

Closing as @lmolkova has started a new pull request for this proposal.

9 participants