Skip to content
This repository has been archived by the owner on Dec 6, 2024. It is now read-only.

Unseal SpanContext #58

Closed

Conversation

Oberon00
Copy link
Member

@Oberon00 Oberon00 commented Oct 7, 2019

Remove the proposition that "SpanContext MUST be a final (sealed) class.".

Currently the final SpanContext is sort of a bottleneck for flexibility of API-implementations.

@Oberon00 Oberon00 changed the title Unseal spancontext Unseal SpanContext Oct 7, 2019
Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

* Generate Span IDs lazily (e.g., if a Span only has internal parents and children, a mechanism that is more efficient than a combined 128 + 64 bit identifier could be used).
* Count the number of times a SpanContext was injected to better estimate on the back end how long to wait for (additional) child Spans.

## Explanation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proposal

Since the implementation of propagators is on the SDK level, all public APIs could be removed from the SpanContext
so that it becomes an object that is completely opaque (to everyone but the propagator implementations).
This would enable even more flexibility or at least spare vendors from adding boilerplate code that, e.g., calculates TraceIds
if there is no such concept in their own system.
Copy link
Member

@yurishkuro yurishkuro Oct 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't push for this argument, because that was the approach in OpenTracing and there was a lot of grief from people who simply wanted to log the damned trace id.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you just want something to log, then then you could have string properties for Trace and Span ID without forcing them to have a particular format or be the canonical representation. Even a ToString override may be sufficient for that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to have guarantees of the length and format in some case, even for logging. I think this was discussed as a requirement for OpenTelemetry from the beginning and cannot be changed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These guarantees (such as ASCII-only, no spaces) could be documented for API-implementations to follow instead of enforced by fixing certain types for them.

But I fully agree that this cannot really be changed at this point (the cost-benefit ratio would be very bad here).

The decision to have SpanContext as a non-abstract class in the API results from the goal to keep compatibility.
The cleaner alternative would be to have SpanContext be an interface like all other API classes.

Since the implementation of propagators is on the SDK level, all public APIs could be removed from the SpanContext
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary, there is an argument to have default propagation in the no-op implementation of the API.

Copy link
Member Author

@Oberon00 Oberon00 Oct 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, the API could use it's own private implementation on top of the "interface"-level implementation. But I do not really think we should pursue this approach as it would be too huge of a change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't there a requirement to have propagation of w3c tracecontext be the default behavior of the API? That would imply that some level of propagation works with the API exclusively.

Since the implementation of propagators is on the SDK level, all public APIs could be removed from the SpanContext
so that it becomes an object that is completely opaque (to everyone but the propagator implementations).
This would enable even more flexibility or at least spare vendors from adding boilerplate code that, e.g., calculates TraceIds
if there is no such concept in their own system.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to have guarantees of the length and format in some case, even for logging. I think this was discussed as a requirement for OpenTelemetry from the beginning and cannot be changed.

Comment on lines 56 to 60
In a (currently hypothetical) C++ implementation of OpenTelemetry (and possibly some other languages),
having the SpanContext be a "polymorphic", potentially derived class is a big difference to
the old SpanContext that might be implemented as a simple data structure with a few data fields.
These languages could opt to make SpanContext effectively a "smart pointer".
That is, the SpanContext could be implemented as a final class containing nothing but a reference to the actual SpanContextData (which is not final) and non-virtual functions that delegate to the corresponding virtual functions of the SpancontextData class.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a huge performance hit for languages like C++, because you require refcounting for the SpanContext now. I think this is a big mistake, and some companies may refuse to use this. We need to propose a solution that does not force heap allocation + refcounting for C++ (maybe golang as well where you need this to be an interface now).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That huge performance hit is already incurred for Spans though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, which is a problem, but you are doubling that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think making SpanContext into an interface == doubling the allocations. The Span type can easily implement SpanContext interface.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I misread your first comment a bit. You were complaining about the reference counting! This is not necessarily required, if you don't need the smart pointer to be copyable, you can use a unique_ptr-like zero-cost abstraction. Also, I wrote "could opt to". But thinking about it now, it seems in my head we already had a C++-implemenation were we need to care about minimizing breaking changes. In a new implementation you would probably just return a unique_ptr and have users use -> instead of . I guess I should just remove that paragraph.

If Span implements SpanContext, that would be a bit of a problem because then who is responsible for deleting the span is different between SpanContexts extracted from remote requests and SpanContexts that refer to inproc Spans. That could be solved with a custom smart pointer (nearly zero-cost, I'm thinking about a bool shouldDelete member) though.

Anyway, I feel this conversation is drifting too much into C++-specifics. I guess I should really remove that paragraph.

@SergeyKanzhelev
Copy link
Member

my biggest question would be whether you plan to use this extensibility right away or this discussion can be postponed to after 1.0. Basically, what feature is blocked on the type being sealed.

I see good arguments both ways and this change may need more thinking that we will not have any bandwidth to do now.

full transparency, I might need a similar extensibility for the scenario that may block some internal MS customers. But I didn't find time to think thoroughly about yet. So not sure what would be the right solution - span or spancontext extensibility

@Oberon00
Copy link
Member Author

Oberon00 commented Oct 21, 2019

@SergeyKanzhelev To be honest, I think I'm in the same position as you here. I know that I would need the feature for one approach (where I would need to store information on the Span about each time its context was propagated and also propagate some information that will be different each time the Span is propagated) but that approach is almost certainly going to be abandoned very soon for a new approach, and I don't know yet if I need something like that for the new approach.

@Oberon00
Copy link
Member Author

Actually, something that I will almost definitely need is propagating Dynatrace-specific sampling information. While this could probably be done with just TraceState strings, since there should be only one entry per vendor, it may add considerable parsing overhead.

@SergeyKanzhelev
Copy link
Member

@Oberon00 I believe the answer for sampling overhead is to use Tracestate. And solution may be a better data type for the Tracestate to minimize this overhead.

I'm closing this PR to indicate that we need to have a motivational scenario to move this forward.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants