-
Notifications
You must be signed in to change notification settings - Fork 893
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
distinguish starting an active span and creating an inactive span #485
Conversation
76a8524
to
0f0e9ad
Compare
Are you suggesting to introduce the possibility of creating unstarted Spans? I'm strongly against that tbh, I think it makes many things more complicated. |
specification/api-tracing.md
Outdated
@@ -122,24 +122,29 @@ mechanism, for instance the `ServiceLoader` class in Java. | |||
|
|||
The `Tracer` MUST provide functions to: | |||
|
|||
- Create a new `Span` | |||
- Start a new active `Span` | |||
- Create a new inactive `Span` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"inactive" reads very confusing. I would separate "starting a span" and "making a span active" (even though both can be done by a single function).
specification/api-tracing.md
Outdated
|
||
The `Tracer` SHOULD provide methods to: | ||
|
||
- Get the currently active `Span` | ||
- Make a given `Span` as active | ||
|
||
The `Tracer` MUST internally leverage the `Context` in order to get and set the | ||
current `Span` state and how `Span`s are passed across process boundaries. | ||
currently active `Span` and how `Span`s are passed across process boundaries. A | ||
`Span` that is created, as opposed to started, is not tracked in the `Context` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
created, as opposed to started
-1, this sounds like a whole new dimension of behavior. If you want to discuss this, please create an OTEP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't. The created span still has a start timestamp.
I would recommend to do away with a duality of "creating" vs. "starting" a span. They are the same thing in all prior art, and I would prefer we use a single verb "starting a span" to avoid confusion. |
@yurishkuro currently we have started spans and active spans (which are started). I'm attempting to make a clear distinction between functions to create inactive spans (but have start timestamps) and starting active spans. Right now the API only says you can start an inactive span. |
@Oberon00 no, created spans still have start timestamps, they simply aren't made active. This is the behaviour currently defined in the API. The change is to specify that |
To maybe make it more clear why I'm using The existing APIs and SDKs use The API specifies it should not be set active in the context. I wanted to keep the naming the same as what is done today ( |
I don't think it's a good idea to be using 3 words (start, create, active) to describe 2 orthogonal behaviors (starting a span and making it active). In OT-Java we had to roll-back "making span active" from start_span function, because it was causing issues with try/catch. Have we considered NOT having tracer make the span active at all? It's merely a convenience, which I personally don't believe is worth it, and it works on shielding the developer from understanding that there is such a thing as context propagation, which has nothing to do with managing spans. For example, in Go: ctx := tracer.StartSpan(ctx, ...) is just a convenience over more clear separation of: span := tracer.StartSpan(...)
ctx := otel.ActiveSpan(ctx, span) One important benefit of the latter is that starting a span is an implementation-specific action, thus it requires a tracer. Activating the span is NOT implementation specific, all tracers will share the propagation mechanism. The one-liner can always be implemented as a helper function: ctx := otel.StartSpan(ctx, tracer, ...) |
I think it would be very confusing to force devs to start having to manage the context explicitly. Also, the API does specify that it is not made active, so what you are suggesting is the API spec currently. If the Tracer is responsible for tracking the active span in the context then I think it needs to handle it when the user starts a span unless they specify otherwise, and based on existing implementations of both OpenTelemetry and OpenTracing/Census this is clearly what the user expects. If the Tracer were not responsible for this then I could see an argument that some other module handles the context, like in your example where the Tracer returns the span and then But that is a much larger change. Today the API spec is confusing to me because it both has the Tracer as responsible for the span being activated and doesn't offer a way to start+activate -- although the implementations do just that for I think we need both and would suggest, since the use of
|
I didn't want to touch it in this PR to keep it as small and focused as possible but a related confusion I have is that the spec specifies:
I agree with this API but it doesn't seem to match that If the |
otep-66 states:
It calls this "not final", so maybe there was other conversation on why this shouldn't be the case? otep-66 also uses |
specification/api-tracing.md
Outdated
|
||
The `Tracer` SHOULD provide methods to: | ||
|
||
- Get the currently active `Span` | ||
- Make a given `Span` as active | ||
- Make a given `Span` active |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If providing a method to start an inactive span is a MUST, this should also be one, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, good question. I suppose so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Me too! So please add it to the MUST section then.
A fundamental question: After OTEP 66, is there even such a thing as an inactive span now? See https://github.com/open-telemetry/oteps/blob/49316bc20167a0a6e2214bbf5806e0e7d763b2d0/text/0066-separate-context-propagation.md#observability-api
It seems the description of StartSpan in the spec is simply wrong now. |
Even with otep66 there has to be a concept of an inactive span, it is just to useful and in use. But I do think that the default action of a tracer's |
This might be better discussed tomorrow in the morning meeting. Then I can rework the PR based on the outcome there and discussion can pick up again on the PR. |
@open-telemetry/specs-approvers please review this. We need to make progress. |
913aee4
to
5b96bda
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update the PR title so that it matches the content (starting an active vs. inactive span) since you removed the separate create without starting option in the meantime.
specification/api-tracing.md
Outdated
@@ -120,31 +120,45 @@ mechanism, for instance the `ServiceLoader` class in Java. | |||
|
|||
### Tracer operations | |||
|
|||
The currently active `Span` is the one that is tracked in the current `Context` | |||
by the `Tracer`. An inactive `Span` is not currently tracked in any `Context`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An inactive span can still be tracked in a non-current context, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, it would make it active (unless you track via some private mechanism, which would be out of scope here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually in Java (and most likely in Python) you can have a Span
in a Context
that is not the active one:
// ctx contains span1 but ctx itself is not active.
Context ctx = TracingContextUtils.withSpan(span1, Context.current());
// *Now* it is.
try (Scope scope = ContextUtils.withScopedContext(ctx)) {
}
Because of this I suggest changing the second sentence to: "A Span is considered inactive when it is not tracked in the currently active Context." or something along that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd consider that span to be active for that context, but the context itself is not active.
We could have a different term for it though, "current span" to replace how I've been using "active span". So a span can be "current" to a context without being active because the context is not active.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest, for the sake of moving on with this PR, to not change the wording between active/current, and use whatever is used in the given section (we could do a follow up this in another PR). Likewise, let's use active/current only when the Span is associated with the current Context
(else, you'd say its associated with a given Context
).
specification/api-tracing.md
Outdated
@@ -120,31 +120,45 @@ mechanism, for instance the `ServiceLoader` class in Java. | |||
|
|||
### Tracer operations | |||
|
|||
The currently active `Span` is the one that is tracked in the current `Context` | |||
by the `Tracer`. An inactive `Span` is not currently tracked in any `Context`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, it would make it active (unless you track via some private mechanism, which would be out of scope here)
specification/api-tracing.md
Outdated
The `Tracer` MUST provide functions to: | ||
|
||
- Create a new `Span` | ||
- Start a new active `Span` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds like a requirement on the API, but, for example, in OpenTracing we explicitly removed such ability, in favor of starting inactive span and them manually activating it via Scope. What are the implications of this change?
NB: this kind of change sounds to me like it should go through OTEP with some code examples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC there was agreement to clearly separate these two operations ("start active Span" and "start Span"), but "start active Span" MUST be optional (in Java, as you mentioned, we can't implement this one correctly).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's go with separate operations as decided in the SIG call, but I'd suggest changing this to:
- Start a new Span
- Start a new Span as the current instance (optional operation).
2 is optional as, for example, Go already handles this explicitly in 1) depending on any specified Context
, and also because in Java we won't implement it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@carlosalberto is this what you are thinking:
The `Tracer` MUST provide functions to:
- Start a new active `Span`
The `Tracer` SHOULD provide methods to:
- Get the currently active `Span`
- Start a new inactive `Span`
- Make a given `Span` active
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or wait, you wanted the start inactive span to be the MUST
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, switched it in the PR.
specification/api-tracing.md
Outdated
|
||
The `Tracer` SHOULD provide methods to: | ||
|
||
- Get the currently active `Span` | ||
- Make a given `Span` as active |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it looks odd that making span active is in MUST section, but getting active span is in SHOULD. I would think they should go in the same category.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think #527 might actually hit this one, so I'd suggest we fix that based on that one ;)
specification/api-tracing.md
Outdated
current `Span` state and how `Span`s are passed across process boundaries. | ||
currently active `Span` and how `Span`s are passed across process boundaries. A | ||
`Span` that is started but inactive is not tracked in the `Context` by the | ||
`Tracer`, but it still MUST have a start timestamp set at the time of creation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is the point about the timestamp bundled into the paragraph about context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same feeling here, I think we don't need to mention timestamp. Having two separated operations for start-span and start-span-as-current should make this clear enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I was thinking that "inactive" may give the impression to someone that it could not have a timestamp. But since it is still called start
and this is defined elsewhere it can be removed.
specification/api-tracing.md
Outdated
selected, or the current active `Span` is invalid. | ||
SHOULD create each new `Span` as a child of its active `Span` unless an explicit | ||
parent is provided or the option to create a span without a parent is selected, | ||
or the current active `Span` is invalid. Last the `Tracer` would check if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"or the current active Span
is invalid" doesn't seem to belong here (can be in #determining-the-parent-span-from-a-context
).
specification/api-tracing.md
Outdated
SHOULD create each new `Span` as a child of its active `Span` unless an explicit | ||
parent is provided or the option to create a span without a parent is selected, | ||
or the current active `Span` is invalid. Last the `Tracer` would check if | ||
`Context` has an extracted `SpanContext`. See [Determining the Parent Span from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last the
Tracer
would check ifContext
has an extractedSpanContext
I don't follow why this is here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The paragraph is going through the ways a span's parent is set when a span is created. If that part is removed I think the whole thing should be removed and simply link to "Determining the Parent Span".
specification/api-tracing.md
Outdated
as a separate operation. | ||
|
||
The API MUST accept the following parameters: | ||
The API functions for starting a `Span` MUST accept the following parameters: | ||
|
||
- The span name. This is a required parameter. | ||
- The parent `Span` or a `Context` containing a parent `Span` or `SpanContext`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this bullet is confusing. I think it's mutually exclusive to accept parent Span OR an indicator to create a new root.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is outside the changes of this PR. Though I agree there doesn't appear to be a reason to have to explicitly state if a span is a root span.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will also be covered/clarified/improved by #510 so I'd suggest we work on this part there, so we can move on with this PR.
Else, track this clarification in its own issue ;)
specification/api-tracing.md
Outdated
SHOULD create each new `Span` as a child of its active `Span` unless an | ||
explicit parent is provided or the option to create a span without a parent is | ||
selected, or the current active `Span` is invalid. | ||
SHOULD create each new `Span` as a child of its active `Span` unless an explicit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"a child of THE active" Span? Otherwise, it sounds like each Tracer
can have its own active Span
, which just adds excessive (and not needed) complexity.
As stated in #516, I see "starting an inactive span" as the Tracer's responsibility, and "making a span active" the context library's responsibility. |
I guess I go the other way since I think Span should go away and the Tracer take the context to do everything, like in open-telemetry/oteps#68 :) |
Includes a note about async callbacks which are a common use case for creating an inactive span.
Co-Authored-By: Armin Ruech <[email protected]>
Co-Authored-By: Yuri Shkuro <[email protected]>
21d8da6
to
82cef8c
Compare
- Get the currently active `Span` | ||
- Make a given `Span` as active | ||
- Make a given `Span` active |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is still some vagueness here. I have users at Uber who are absolutely opposed to thread-local based context propagation, and instead insist that in their services all context propagation is done explicitly, similar to Go. That means that while they may still use the same Context implementation, the three SHOULD operations here are against the explicitly passed Context object, rather than building an assumption that a thread-local mechanism is being used and understood by the tracer. In other words, the "Get the currently active Span
" is equivalent to this function in OpenTracing Go:
func SpanFromContext(ctx context.Context) Span {
val := ctx.Value(activeSpanKey)
if sp, ok := val.(Span); ok {
return sp
}
return nil
}
Which opens up another trail of questions: why does this need to be a functionality of the Tracer? Tracer is an API that can be implemented differently. Accessing spans in the Context is not related to specific tracer implementation, it's a common behavior at the API level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I consider it part of the Tracer functionality because the Tracer is what knows where its data is stored in the context.
In your example from OpenTracing that would be activeSpanKey
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In OpenTracing it's not part of Tracer interface, it's a shared static function. If it was part of Tracer, then every tracer implementation would have to implement it identically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming Tracers from multiple Tracer Providers in a single application are meant to be able to read traces from the same contexts, yes they'd have to use the same key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, wouldn't we want tracers from different providers to be able to act separate and not clash?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Observe this will probably go away with #527 - Maybe hold on till that one is resolved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't really change this PR much since it is mainly about starting the spans and #527 isn't. Having re-read it I'm not opposed to #527 anymore either, I had thought it also moved starting a span to "utils".
The change in this PR is adding an additional function for starting a span, the "make span active" already exists in the spec and is not changed by this PR. So I think this should be considered separate.
And don't think the function to make a span active should hold up this PR since it already exists in the spec and is not being changed here.
I saw that Python has gone with |
So what is going on with this PR? Right now the spec says the API call must only start inactive spans and I don't think that is what implementations are doing -- instead they are providing both like in this PR. I'm going to update it to resolve the conflicts in a bit, hopefully it can get merged after that? |
Nevermind. It now says:
I think it should be the other way around (as in |
Includes a note about async callbacks which are a common use case
for creating an inactive span.
Closes #469