Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify that the Start Span operation does not set active state #238

Closed
carlosalberto opened this issue Aug 29, 2019 · 20 comments
Closed

Specify that the Start Span operation does not set active state #238

carlosalberto opened this issue Aug 29, 2019 · 20 comments
Assignees
Milestone

Comments

@carlosalberto
Copy link
Contributor

Currently, different implementations decide, on Span creation, whether to set it as the active/current instance or not (in Java startSpan() does not, but in Python start_span() does, for example).

I feel like there should be a clarification or recommendation for every language, mentioning the default Start Span operation does not set the active/current instance, and optionally offer an Start Span + activate operation*. This would give us some homogeneity.

If this makes sense, I'll open up a RFC. Opinions?

(In OT, we had two operations for this actually, namely Start Span and Start Active Span).

[1] In Java we cannot offer an Start + Activate method as it becomes hard to properly handle/report errors. See opentracing/opentracing-java#291

@Oberon00
Copy link
Member

So to summarize, there are three options when creating a span:

  • Just create
  • Create & start
  • Create & start & activate

@carlosalberto
Copy link
Contributor Author

Hey @Oberon00 !

there are three options when creating a span

Actually I think most languages only offer two: 1) start, 2) start & activate - this is because most (all with the exception of Python?) do not offer created, non-started Spans.

@Oberon00
Copy link
Member

Right, I thought there was resistance against removing that anomaly (open-telemetry/opentelemetry-python#97 (comment)), but reading @reyang's comment again, I think this is ripe for a Python PR removing it.

@Oberon00
Copy link
Member

Oberon00 commented Sep 18, 2019

We should also specify or at least give guidelines on:

  • whether some sope/AutoCloseable/context manager/Disposable-like object is returned from start or activate operations
  • is so, whether leaving/closing/exiting/disposing of that deactivates the span, ends it or both.

@carlosalberto
Copy link
Contributor Author

#287 should fix this, by adding a small note about this (without many details). I feel like what @Oberon00 is mentioning should be done separately, as it touches the entire currently Span active handling.

Let me know ;)

@yurishkuro
Copy link
Member

What is the motivation for now allowing auto-activation?

@carlosalberto
Copy link
Contributor Author

Hey @yurishkuro

The motivation is to keep uniformity among different languages/implementations, and reduce confusion.

On point: currently, OTel Python has its Tracer.start_span() method automatically set the newly created Span as the currently active instance, while the equivalent in other languages does not.

I have suggested that we have an additional operation/method that handles this, such as start_active_span() or set_current_span(), in case this is a common operation, though.

@yurishkuro
Copy link
Member

I agree, similarly names methods should behave similarly across languages.

Would this also apply to Go? I thought there the API was explicitly tied to Context.

@krnowak
Copy link
Member

krnowak commented Oct 7, 2019

In Go, Tracer.Start creates, starts and activates the span, just like python's Tracer.start_span(). Currently, we do not provide any means to separate these steps.

Separation of creation of the span and starting it may be awkward, because how can we distinguish the non-started span from the started one? With a separate type? Would mean another interface for SDK to implement? But I don't know what's the use case would be for separating those two steps in Go.

Similarly, not sure about splitting starting and activation in Go - when creating an active span in Go, you still have access to the "old" context with the previous active span if you need it. Again, not sure what would be the use case here.

Would this also apply to Go? I thought there the API was explicitly tied to Context.

Go API could have theoretically have Start(ctx context.Context, name string, opts…) Span and StartActive(ctx context.Context, name string, opts…) (context.Context, Span) functions, so the passed context could be used by tracer to figure out the current active span and make it a parent of the new span. Could also be used for whatever purposes SDK sees fit. But I still have my doubts about it.

@Oberon00
Copy link
Member

Oberon00 commented Oct 7, 2019

An example where you may not want to activate the Span is when you start multiple asynchronous I/O operations at once. They should all become children of the same parent Span but with the Go/Python behavior, the second request would become a child of the first, etc. unless you manually specify a parent Span.

Also, we do not currently assign a semantic meaning of a Span being "active" but it is well imaginable that the activation/deactivation could be used to measure CPU time, in which case you may also want to have more control over activation.

@krnowak
Copy link
Member

krnowak commented Oct 7, 2019

An example where you may not want to activate the Span is when you start multiple asynchronous I/O operations at once. They should all become children of the same parent Span but with the Go/Python behavior, the second request would become a child of the first, etc. unless you manually specify a parent Span.

With Go, I imagine it would look like this:

parentCtx, parentSpan := tracer.Start(grandParentCtx, "parent", opts…)
defer parentSpan.Finish()
for i := 0; i < jobs; i++ {
    go func() {
        childCtx, childSpan := tracer.Start(parentCtx, fmt.Sprintf("child %d", i), someChildOpts…)
        defer childSpan.Finish()
        // do the I/O request and stuff
    }()
}
// wait for the requests to finish

Every childSpan now has parentSpan as a parent, because that parent is stored as an active span in parentCtx.

Also, we do not currently assign a semantic meaning of a Span being "active" but it is well imaginable that the activation/deactivation could be used to measure CPU time, in which case you may also want to have more control over activation.

With that semantics, I have a feeling that starting a span should be equivalent to making it active. If the thread gets preempted or goes to sleep then CPU time is stopped being accounted. Unless I misunderstood that or I used the wrong abstraction. Or at least it doesn't make much sense to me with my Golang goggles on.

@Oberon00
Copy link
Member

Oberon00 commented Oct 7, 2019

That's because you only start the span in the goroutine. If, however, your async I/O is handled entirely by the OS and you only get a completion callback, there is no new context for the I/O operation, so you can't do that then. Also, what if you want to track the delay before the goroutine starts?

@jmacd
Copy link
Contributor

jmacd commented Oct 7, 2019

I believe that this is a non-issue in Go. Each span is active in the context that is returned from Start. If you want the span to be active, use that context. If you want it not active, use a different context. If you want to set the span as active in a different context, there can be a helper for that. OTel-Go doesn't have that helper today, but I'd consider that part of the propagation API--a way to switch and clear contexts w/o starting new spans for example.

@tsloughter
Copy link
Member

If the user is wanting to create a span that is not the current active span then why would this be through the tracer api?

An interface for the span data structure would be appropriate for creating spans that are not started.

@jmacd
Copy link
Contributor

jmacd commented Oct 7, 2019

I don't think "spans that are not started" is meaningful. If you are reporting a span, surely it started. You're trying to start and end a span that never enters the current scope.

The issue of using a SpanData structure to enter these spans was already settled. It requires unnecessary API surface area to achieve what is still just a span starting and ending.

Personally I wouldn't have lobbied for any distinction at all. I think Start() should return an active span and if you're reporting out-of-band data you are free to make the next statement End() that span and deactivate it. There's logically no harm done in activating and immediately deactivating a span.

@Oberon00
Copy link
Member

Oberon00 commented Oct 8, 2019

@tsloughter @jmacd it sounds to me like you are mixing up the concepts of "active" and "started" (maybe because there is no distinction in Go?).

I agree that there should be no un-started Spans. But I think that we need inactive Spans in pracice. Consider for example a language that only has thread locals for automatic Span management. So in these languages, an active Span is defined as one that is bound to the current thread. There can only be one such span at a time. If the operation traced by the Span becomes e.g. suspended because it waits for I/O, the thread can be used for another operation. So you deactivate the current span and activate a new one on the thread. Once the I/O completes, the Span that waited for it can be activated again on some thread and finally, after processing the I/O results, it can be ended and deactivated.

You could argue that there should be a 1:1 mapping between context and active Span and instead of changing the Span that is active on the context, I should rather change the context. I agree that that would be ideal but the reality is that in many languages, the available context implementations are imperfect. I guess an alternative resolution to this issue would be to force these languages to use manual Context management (possibly supported by what automatic management is available) instead of manual active span management.

EDIT: See also https://github.com/open-telemetry/oteps/pull/42/files#r331999299

@tsloughter
Copy link
Member

I am in the sense that I think starting a span should, with the default tracer, make it the active span for the current context. Anything else will be very confusing to users.

@tsloughter
Copy link
Member

The way the Erlang lib works currently is that starting a span sets it to the active in current context span. The default context implementation uses the process dictionary, which can be thought of like ThreadLocal for Java folks, and it keeps track of the parent as well as the current span context:

start_span(Name, Opts) ->
    case ?ctx:get(?SPAN_CTX) of
        {SpanCtx, _}=Ctx ->
            SpanCtx1 = ?span:start_span(Name, Opts#{parent => SpanCtx}),
            ?ctx:with_value(?SPAN_CTX, {SpanCtx1, Ctx}),
            SpanCtx1;
        _ ->
            SpanCtx = ?span:start_span(Name, Opts#{parent => undefined}),
            ?ctx:with_value(?SPAN_CTX, {SpanCtx, undefined}),
            SpanCtx
    end.

So when the current span is ended it is able to take the parent (second element in the tuple that is stored in the process dictionary) which itself is also a 2-tuple and make it the current active context again. This is done as spans are finished, up until the second element of the 2-tuple is undefined, which you can see is done in the above code when there is no parent (meaning when start_span was called there was no active span), ?ctx:with_value(?SPAN_CTX, {SpanCtx, undefined}),.

@tsloughter
Copy link
Member

If the operation traced by the Span becomes e.g. suspended because it waits for I/O, the thread can be used for another operation. So you deactivate the current span and activate a new one on the thread.

Oh... maybe this is why there is confusion. The process dictionary I mentioned may not be as similar to ThreadLocal as I thought if this is accurate. I didn't realize it was such that it was shared by multiple unrelated spans.

@carlosalberto
Copy link
Contributor Author

Closing it. @tsloughter please re-open it (or open a new issue) if you think this needs further addressing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants