SDK tracing: clarify multi-processors scenarios #338

lmolkova · 2019-10-26T02:57:50Z

Fixes #316

Clarifies that multiple processors registered on TracerFactory allow for multiple exporters (i.e. they are not chained), but that chaining can additionally be done on processors.

The spec currently recommends chaining on exporters; see #316 for the reasons why processors are better suited for it.

Oberon00 · 2019-10-28T14:01:12Z

Could you add a (pseudo-)code example (as a comment or in the OTEP) showing what that could look like?

E.g. I'm not sure I understand

SDK MUST allow to end each pipeline with individual exporter

Does this mean that an AnnotateSpanWithThreadNameSpanProcessor, which modifies the span to set a thread_name property during the "start" event, would somehow have the ability to forward spans to a dedicated list of "chained" exporters? I hope not.

lmolkova · 2019-10-29T07:06:56Z

@Oberon00 I've added an example in the spec.
Your AnnotateSpanWithThreadNameSpanProcessor can be followed by something like FilterOutHEADRequestsProcessor that sends to Jaeger.

Another pipeline sends all spans (unfiltered) to ZPages or stdout implemented as processors or exporters.

You may want to send all spans to different exporters and potentially process (filter and batch) them differently depending on exporter capabilities, storage cost and other factors. Tagging cannot be done independently as instances of spans are shared among processors/exporters.

Please let me know if this answers your question.

Oberon00 · 2019-10-29T09:28:04Z

Well, the AnnotateSpanWithThreadNameSpanProcessor I have in mind would look like this:

class AnnotateSpanWithThreadNameSpanProcessor: ISpanProcessor {
  public override void OnStartSpan(Span span) {
    span.setAttribute("thread_name", Thread.CurrentThread.Name);
  }
}

It would not seem useful to add an exporter behind that span processor, as (1) it modifies the Span in-line and (2) it does not even do anything upon ending the Span.

To me it seems that chainability in the base SpanProcessor interface would require that base interface to be split into SpanStartProcessor and SpanEndProcessor interfaces, and chainability would only be incorporated in the latter.

But actually I think that the base interface can stay as it is now, and we can still have e.g.:

class FilteringSpanProcessor: ISpanProcessor {
  public FilteringSpanProcessor(Predicate<ReadableSpan> filter, ISpanProcessor next) {  /* ... */  }
}

class MultiSpanProcessor: ISpanProcessor {
  public MultiSpanProcessor(params ISpanProcessor[] next) { /* ... */ }
}

// ...

tracerFactory.AddSpanProcessor(new FilteringSpanProcessor(s => s.HasAttribute("foo"), new MultiSpanProcessor(
  new BatchExportProcessor(new ZipkinExporter(...)),
  new FooSpanProcessor(...)
)));
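
For readers who want to try the composition idea above, here is a minimal Java sketch of a filtering processor that forwards to a fan-out ("multi") processor. It is not the SDK's API: `Span`, `SpanProcessor`, and `RecordingSpanProcessor` are hypothetical stand-ins invented for illustration, and the two recording processors merely play the role of the exporter-backed processors in the configuration above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical minimal span model; a real SDK span carries far more state.
final class Span {
    final String name;
    final Map<String, String> attributes = new HashMap<>();
    Span(String name) { this.name = name; }
}

// Hypothetical stand-in for the SDK's SpanProcessor interface.
interface SpanProcessor {
    void onStart(Span span);
    void onEnd(Span span);
}

// Forwards a span to `next` only when the predicate accepts it.
final class FilteringSpanProcessor implements SpanProcessor {
    private final Predicate<Span> filter;
    private final SpanProcessor next;
    FilteringSpanProcessor(Predicate<Span> filter, SpanProcessor next) {
        this.filter = filter;
        this.next = next;
    }
    @Override public void onStart(Span span) { if (filter.test(span)) next.onStart(span); }
    @Override public void onEnd(Span span) { if (filter.test(span)) next.onEnd(span); }
}

// Fans every callback out to all children: this is the "fork".
final class MultiSpanProcessor implements SpanProcessor {
    private final List<SpanProcessor> children;
    MultiSpanProcessor(SpanProcessor... children) {
        this.children = new ArrayList<>(Arrays.asList(children));
    }
    @Override public void onStart(Span span) { for (SpanProcessor p : children) p.onStart(span); }
    @Override public void onEnd(Span span) { for (SpanProcessor p : children) p.onEnd(span); }
}

// Terminal processor that only remembers the names of ended spans.
final class RecordingSpanProcessor implements SpanProcessor {
    final List<String> ended = new ArrayList<>();
    @Override public void onStart(Span span) { }
    @Override public void onEnd(Span span) { ended.add(span.name); }
}

final class CompositionDemo {
    public static void main(String[] args) {
        RecordingSpanProcessor first = new RecordingSpanProcessor();
        RecordingSpanProcessor second = new RecordingSpanProcessor();
        SpanProcessor pipeline = new FilteringSpanProcessor(
            s -> s.attributes.containsKey("foo"),
            new MultiSpanProcessor(first, second));

        Span kept = new Span("kept");
        kept.attributes.put("foo", "1");
        Span dropped = new Span("dropped");
        for (Span s : new Span[] {kept, dropped}) {
            pipeline.onStart(s);
            pipeline.onEnd(s);
        }
        System.out.println(first.ended + " " + second.ended); // prints "[kept] [kept]"
    }
}
```

Because both decorators implement the same interface they wrap, the base interface needs no notion of "next"; chaining lives entirely in the constructors.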

lmolkova · 2019-10-30T03:20:07Z

According to the spec, we have this:

Span.Start() {
   ... 
   for (SpanProcessor p : processors) {
       p.OnStart(this);
   }
}

Span.End() {
   ...
   for (SpanProcessor p : processors) {
       p.OnEnd(this);
   }
}

What I propose (with no change to the SpanProcessor interface), in addition to what the spec says, is to encourage chaining on processors:

class ThreadIdProcessor : SpanProcessor
{
     private final SpanProcessor next;
     public ThreadIdProcessor(SpanProcessor next) { this.next= next; }
     public void OnStart(Span s) {  
         // tag
         //...
         
        next.OnStart(s);
     }

     public void OnEnd(Span s) {  
         // tag
         //...
         
        next.OnEnd(s);
     }
}

class FilterOutHEADProcessor : SpanProcessor
{
     private final SpanProcessor next;
     public FilterOutHEADProcessor(SpanProcessor next) { this.next= next; }
     public void OnStart(Span s) {  
         if (s.getAttributes().get("http.method") != "HEAD") {
          next.OnStart(s);
        }
     }

     public void OnEnd(Span s) {  
         if (s.getAttributes().get("http.method") != "HEAD") {
          next.OnEnd(s);
        }
     }
}

It might be useful to split them further into Start and End, but it still seems useful without splitting.

Configuration then looks like this:

   var pipeline1 = new ThreadIdProcessor(
                                    new FilterOutHEADProcessor(
                                           new BatchingProcessor(
                                                    new JaegerExporter())));

   var pipeline2 = new ZPagesProcessor();

   var factory = new TracerFactory();
   factory.addProcessor(pipeline1);
   factory.addProcessor(pipeline2);

This can be done in a much friendlier way, as we did in .NET: open-telemetry/opentelemetry-dotnet#286, samples
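
As a rough, self-contained Java sketch of the configuration above: two pipelines are registered on a factory, one chained (tag, then filter out HEAD) and one unfiltered. All types here (`TracerFactory`, `Span`, `SpanProcessor`, `CollectingProcessor`) are simplified, hypothetical stand-ins rather than the SDK's classes, and `CollectingProcessor` stands in for a batching processor plus exporter.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the SDK's SpanProcessor interface.
interface SpanProcessor {
    void onStart(Span span);
    void onEnd(Span span);
}

// Hypothetical factory: every registered processor is the head of one pipeline.
final class TracerFactory {
    final List<SpanProcessor> processors = new ArrayList<>();
    void addProcessor(SpanProcessor p) { processors.add(p); }
}

// Hypothetical minimal span that notifies every registered pipeline.
final class Span {
    final Map<String, String> attributes = new HashMap<>();
    private final TracerFactory factory;
    Span(TracerFactory factory) { this.factory = factory; }
    void start() { for (SpanProcessor p : factory.processors) p.onStart(this); }
    void end() { for (SpanProcessor p : factory.processors) p.onEnd(this); }
}

// Tags the span on start, then passes it down the chain.
final class ThreadNameProcessor implements SpanProcessor {
    private final SpanProcessor next;
    ThreadNameProcessor(SpanProcessor next) { this.next = next; }
    @Override public void onStart(Span s) {
        s.attributes.put("thread_name", Thread.currentThread().getName());
        next.onStart(s);
    }
    @Override public void onEnd(Span s) { next.onEnd(s); }
}

// Drops spans for HEAD requests; everything else continues down the chain.
final class FilterOutHeadProcessor implements SpanProcessor {
    private final SpanProcessor next;
    FilterOutHeadProcessor(SpanProcessor next) { this.next = next; }
    private static boolean isHead(Span s) {
        return "HEAD".equals(s.attributes.get("http.method"));
    }
    @Override public void onStart(Span s) { if (!isHead(s)) next.onStart(s); }
    @Override public void onEnd(Span s) { if (!isHead(s)) next.onEnd(s); }
}

// Terminal processor standing in for a batching processor plus exporter.
final class CollectingProcessor implements SpanProcessor {
    final List<Span> ended = new ArrayList<>();
    @Override public void onStart(Span s) { }
    @Override public void onEnd(Span s) { ended.add(s); }
}

final class PipelineDemo {
    public static void main(String[] args) {
        CollectingProcessor filtered = new CollectingProcessor();
        CollectingProcessor unfiltered = new CollectingProcessor();
        TracerFactory factory = new TracerFactory();
        factory.addProcessor(new ThreadNameProcessor(new FilterOutHeadProcessor(filtered)));
        factory.addProcessor(unfiltered);

        for (String method : new String[] {"GET", "HEAD"}) {
            Span span = new Span(factory);
            span.attributes.put("http.method", method);
            span.start();
            span.end();
        }
        // The chained pipeline sees only the GET span; the other sees both.
        System.out.println(filtered.ended.size() + " " + unfiltered.ended.size()); // prints "1 2"
    }
}
```

Note that the span instance is shared: in this sketch the `thread_name` tag set by the first pipeline is also visible to the second one, which is why tagging cannot be done independently per pipeline.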

lmolkova · 2019-10-30T03:33:16Z

So basically, @Oberon00, what you propose and what I propose are very similar. You may fork the processing pipeline in multiple places (a multi-processor is a fork). Your example forks before filtering (like you've done it). I fork first by adding two processors (to filter spans differently for different destinations). All I'm asking is for the spec to clearly allow and encourage such behavior.

yurishkuro · 2019-10-30T22:43:14Z

I agree with @lmolkova's example here: #338 (comment)

What I don't understand is why the spec needs to be explicit about it and even allow for multiple Processors. Both chaining and forked processing can be achieved through composition. Standard processors (or rather factory methods) implementing chained or forked/parallel pipelines can be provided.

SergeyKanzhelev · 2019-10-30T22:50:38Z

@yurishkuro I think it is beneficial to document a concept of chained processors. So when you need to implement one of these features, this type of extensibility will be used and not any other.

yurishkuro · 2019-10-30T22:56:39Z

specification/sdk-tracing.md

+  |     |                |   |                     |   |                   |
+  |     |                |   | BatchProcessor      |   |    SpanExporter   |
+  |     | Span.start() leProcessor     +--->  (JaegerExporter) |
+  | SDK | Span.end()   |   |                     |   |                   |


is the formatting broken here? I don't know how to read this.

yurishkuro · 2019-10-30T23:00:11Z

I think it is beneficial to document a concept of chained processors.

@SergeyKanzhelev https://en.wikipedia.org/wiki/Decorator_pattern. What other explanation is needed?

yurishkuro · 2019-10-30T23:01:30Z

I think there is a different aspect to it - if there are some standard features that each SDK must provide, that's worth describing (assuming those qualify as MUST).

tigrannajaryan · 2019-10-31T00:36:06Z

specification/sdk-tracing.md

+or batching independently on each pipeline.
+
+SDK MUST allow implementing helpers as composable components that use the same
+chainable `SpanProcessor` interface.


I wonder if we really want to have this functionality in a language library? This largely duplicates functionality and is very similar to what OpenTelemetry Collector implements. Batching, tagging, filtering and all sorts of processing already exists in the Collector.

I am worried that we are complicating client libraries with features that may be best implemented elsewhere (and are already implemented). In my opinion client libraries must stay as simple as possible and delegate all complicated functionality to a nearby running Agent or Collector.

This will result in considerable savings of the time and effort we are spending on the OpenTelemetry project (we have a bunch of language libraries but only one Collector) and ensure that all that complicated behavior is implemented once and consistently. If this is done in each individual library, there is a chance they will behave slightly differently (which may be frustrating in mixed-language environments), and we are overburdening language library authors to write all this code and risk introducing bugs.

In my opinion a language library should do only minimal processing that is important from a performance perspective and just perform batching.

Even multiple pipelines seem like overkill to me.

I would like to know how others feel about this. I may be wrong here, so thoughts are welcome.

SergeyKanzhelev · 2019-10-31

Certain tagging and filtering will be better off implemented in-proc. And to enable this we need to describe what extensibility points must be used.

tigrannajaryan · 2019-10-31

@SergeyKanzhelev yes, I understand that certain functionality is better done in-proc, particularly because otherwise it is significantly less efficient from an overall CPU consumption perspective. However, I am arguing that we should clearly articulate what that functionality is and, instead of broadly recommending a rich set of processing capabilities in language libraries, do exactly the opposite and recommend not implementing anything other than that minimal set of necessary functionality.

What exactly the necessary functionality is can be discussed separately. My argument is that this PR recommends and encourages library authors to do more than is necessary.

My initial take on this would be that we need only batching and head sampling (not even tail sampling). Everything else belongs in an out-of-process agent. I may be wrong on this, so some experimentation and benchmarking is probably necessary to find the right minimal set of processors that we want.

Again, it may well be that we need all the listed processors (tagging, filtering, etc.), but the recommendation should be expressed in the opposite form and encourage minimalism in implementation.

SergeyKanzhelev · 2019-10-31

I don't see this PR recommending to do more than ideal. Where it talks about the example, there is a note that it is a "complicated configuration". Can you suggest language that will help discourage people from doing it? Or perhaps explicitly mention that the Collector can do all the processing?

tigrannajaryan · 2019-10-31

I would remove this:

SDK authors are encouraged to implement common functionality such as queuing,
batching, tagging, etc. as helpers. This functionality will be applicable
regardless of what protocol exporter is used.

and instead say this:

SDK authors are encouraged to implement minimal functionality that must be done in-process for efficient operation (e.g. batching and head sampling) and delegate complicated processing functionality to out-of-process Agents or Collectors. This minimizes the burden of language library implementation and results in consistent behavior of such processing in multi-language deployment scenarios.

I would completely remove the following 2 sentences:

Processors (or exporters) may
implement tagging, batching, filtering and other advanced scenarios.

SDK MUST allow to end each pipeline with individual exporter and do filtering
or batching independently on each pipeline.

I would also remove the complicated example with the diagram.
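
To make the "minimal in-process functionality" mentioned above a little more concrete, here is one possible Java sketch of size-based batching in front of an exporter. It is an illustration under stated assumptions, not the SDK's batching processor: `SpanExporter` and `BatchingProcessor` are hypothetical names, spans are reduced to their names, and a production version would also flush on a timer and on shutdown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical exporter interface: receives ended spans in batches.
interface SpanExporter {
    void export(List<String> spanNames);
}

// Buffers ended spans and hands them to the exporter once maxBatchSize is reached.
final class BatchingProcessor {
    private final SpanExporter exporter;
    private final int maxBatchSize;
    private final List<String> buffer = new ArrayList<>();

    BatchingProcessor(SpanExporter exporter, int maxBatchSize) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be positive");
        }
        this.exporter = exporter;
        this.maxBatchSize = maxBatchSize;
    }

    // Called when a span ends; exports a batch as soon as the buffer is full.
    synchronized void onEnd(String spanName) {
        buffer.add(spanName);
        if (buffer.size() >= maxBatchSize) {
            flush();
        }
    }

    // Exports whatever is buffered, if anything, and empties the buffer.
    synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        exporter.export(new ArrayList<>(buffer)); // copy so the exporter owns its batch
        buffer.clear();
    }
}

final class BatchingDemo {
    public static void main(String[] args) {
        List<List<String>> batches = new ArrayList<>();
        BatchingProcessor processor = new BatchingProcessor(batches::add, 2);
        processor.onEnd("a");
        processor.onEnd("b"); // buffer is full, so the first batch is exported
        processor.onEnd("c");
        processor.flush();    // exports the remaining span
        System.out.println(batches); // prints "[[a, b], [c]]"
    }
}
```

Everything beyond this kind of buffering (tagging, attribute-based filtering, and so on) is what the suggested wording would delegate to an out-of-process Agent or Collector.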

SergeyKanzhelev · 2019-10-31T05:09:39Z

I think it is beneficial to document a concept of chained processors.

@SergeyKanzhelev https://en.wikipedia.org/wiki/Decorator_pattern. What other explanation is needed?

@yurishkuro the problem here is that the decorator pattern explains how to do chaining, but doesn't explain what kind of functionality will be chained. This also relates to your other comment. Filtering that can be explained via configuration is in many cases not descriptive enough, so it may be easier to tell people how to code filters and where to plug them in.

lmolkova · 2019-11-08T23:37:18Z

Sorry, I was on vacation and did not update this for a while.

I'm all for not requiring the SDK to do any extra work. While batching and head sampling are needed everywhere (or almost everywhere), tagging and filtering are needed frequently and cannot/should not be done by the Collector.

Examples:

tagging based on custom implicit context such as thread-id or more advanced things.
filtering based on implicit context.
filtering that also helps to save on Collector resources

As an active contributor to OpenTelemetry .NET, I wanted to understand how to design SpanProcessor and SpanExporter and their registration APIs on TracerFactory.

I found that spec is too vague on what multiple processors mean and how to achieve the above scenarios in process. I'm attempting to close this gap.

This PR and advanced example intend to clarify how it is going to be used: multi processors are for multiple destinations, chains of processors are advanced configuration users may do (or not).

Based on the feedback I will:

Remove suggestion for common processors for filtering/tagging: there is not much value in them and users can still do them.
Add that SDK should allow users to decorate processors.
Add that anything that is possible out-of-process should be done by Agents or Collectors rather than SpanProcessor.
Remove the advanced example, but keep the explanation that multiple processors on a tracer are for multiple destinations and that each of them represents a pipeline that could be enriched with a chain of processors/exporters.

tigrannajaryan · 2019-11-09T16:47:28Z

I found that spec is too vague on what multiple processors mean and how to achieve the above scenarios in process. I'm attempting to close this gap.

@lmolkova I think what you do is very important, thank you.

I am not against a particular functionality in client libraries. If we feel that it is a must have then let's have it.

My objection is primarily conceptual: let's reverse our recommendation and instead of encouraging client libraries to add rich functionality, let's discourage them from adding functionality that is not absolutely required.

The reason I want to do that is what I already wrote above:

Save development time
Reduce chances of inconsistent implementations across different client libraries
Reduce chances of introducing bugs in client libraries due to complicated features

If we feel that a particular functionality is important but can be implemented well outside the process let's recommend it to be implemented in OpenTelemetry Collector.

lmolkova · 2019-11-11T18:51:07Z

@tigrannajaryan thanks, now I see your point and agree with it.

I updated the doc.

tigrannajaryan · 2019-11-12T13:57:55Z

specification/sdk-tracing.md

@@ -161,22 +162,29 @@ Manipulation of the span processors collection must only happen on `TracerFactor
 instances. This means methods like `addSpanProcessor` must be implemented on
 `TracerFactory`.

+Each processor registered on `TracerFactory` is a start of pipeline that consist


@lmolkova I understand that you are clarifying the document and not suggesting new functionality, but if you don't mind I would like to bring this issue to the attention of the approvers while we are working on this part of the spec.

@open-telemetry/specs-approvers what are the arguments for having the multi-pipeline feature in-process rather than using the already existing equivalent functionality in the OpenTelemetry Collector? In my opinion this could be removed from the SDK and we could encourage using the Collector for this, but perhaps there are use cases that I am not aware of that require this to be in the SDK.

SergeyKanzhelev · 2019-11-22

I don't think it can be removed from the SDK. There are definitely cases where the Collector cannot be installed. Also, some spans can be filtered out right in the process to save on cross-process communication with the Collector.

bogdandrutu · 2019-11-22T22:32:35Z

@tigrannajaryan for the z-pages functionality that we want to support, it is inefficient to stream everything to the collector. So at least one example that we want to support is:

Send all spans that record events to the z-pages span processor

Send all sampled spans to the configured exporter (collector probably)

mtwo · 2019-11-22

+1 to Sergey and Bogdan's comments

I don't think that having equivalent functionality is bad in this case (and I say this as someone who expects most of their customers to use the collector). As Sergey mentioned, there will always be cases where people can't use the collector or where it doesn't add much value.

mtwo · 2019-11-22T23:01:59Z

Echoing from my comment further up:

I don't think that having equivalent functionality in the libraries and the collector is bad in this case (and I say this as someone who expects most of their customers to use the collector). As Sergey mentioned, there will always be cases where people can't use the collector or where it doesn't add much value, and they should still be able to export to multiple backends simultaneously.

SergeyKanzhelev · 2019-12-04T08:57:20Z

This has enough approvals and has been active long enough. Merging.

* Recommend chaining on processors Fixes open-telemetry#316
* fix lint
* fix lint2
* example for span chaining
* fix formatting
* Update sdk-tracing.md
* Update sdk-tracing.md
* Update sdk-tracing.md
* remove advanced example, do not encourage implementing helpers in the SDK
* recommend to move implement new common processing scearios out-of-proc
* fix lint
* BatchProcessor -> BatchEporterProcessor

lmolkova requested review from AloisReitbauer, bogdandrutu, c24t, carlosalberto, iredelmeier, reyang, SergeyKanzhelev, songy23, tedsuo, tigrannajaryan and yurishkuro as code owners October 26, 2019 02:57

SergeyKanzhelev approved these changes Oct 26, 2019

yurishkuro reviewed Oct 30, 2019

tigrannajaryan reviewed Oct 31, 2019

tigrannajaryan reviewed Nov 12, 2019

lmolkova changed the title ~~SDK tracing: Allow and recommend chaining on processors~~ SDK tracing: clarify multi-processors scenarios Nov 12, 2019

SergeyKanzhelev requested a review from jmacd as a code owner November 22, 2019 22:28

bogdandrutu approved these changes Nov 22, 2019

Liudmila Molkova added 12 commits November 22, 2019 16:14

b357525 Recommend chaining on processors (Fixes open-telemetry#316)
9724a8c fix lint
dac6f49 fix lint2
9cf2f48 example for span chaining
0634a69 fix formatting
3cea141 Update sdk-tracing.md
4e469e7 Update sdk-tracing.md
ab69437 Update sdk-tracing.md
0a9ef03 remove advanced example, do not encourage implementing helpers in the SDK
238945f recommend to move implement new common processing scearios out-of-proc
f92579f fix lint
7761327 BatchProcessor -> BatchEporterProcessor

lmolkova force-pushed the patch-3 branch from 02b890b to 7761327 Compare November 23, 2019 00:14

jmacd approved these changes Nov 25, 2019

Merge branch 'master' into patch-3

6e6bda3

SergeyKanzhelev merged commit f9d22be into open-telemetry:master Dec 4, 2019

Oberon00 mentioned this pull request Aug 19, 2020

Implement thread.id and thread.name semantic attributes open-telemetry/opentelemetry-java#1554

Merged

TuckTuckFloof pushed a commit to TuckTuckFloof/opentelemetry-specification that referenced this pull request Oct 15, 2020

Removed unnecessary reference to StartSpanOptions (open-telemetry#338)

813816c

lmolkova deleted the patch-3 branch April 19, 2022 22:52
