Add resource to Recordable, make room for InstrumentationLibrary #580
- Tracer now pulls its processor from TracerProvider (as per spec)
- Remove `GetProcessor()` and friends from Tracer
- Update all tests to reflect the pipeline is now on TracerProvider
- Remove passing of Processor/Resource to span
- Spans now look up the processor from the associated Tracer

This should allow us to add InstrumentationLibrary to `Tracer` and expose lookups for `Resource`/`InstrumentationLibrary` from `Span`.
…instrumentation library.
Codecov Report
@@ Coverage Diff @@
## main #580 +/- ##
==========================================
+ Coverage 94.39% 94.49% +0.09%
==========================================
Files 200 200
Lines 9068 9066 -2
==========================================
+ Hits 8560 8567 +7
+ Misses 508 499 -9
- OTLP exporter now correctly constructs a `Resource` for spans.
- Expand OTLP tests to do a modicum of content validation.
Thanks for implementing the approach. LGTM in general, apart from the comment about using a reference instead of a pointer for resources.
{
// *resource_span->mutable_resource() = std::move(rec->resource());
*resource_span->mutable_resource() = rec->resource();
has_resource = true;
nit: we can `break` from the loop once the resource is found.
Or we could assert the later found resource equals the first one.
We can't. I'm using the loop that adds the spans to add the resource. If we `break`, then we'll be dropping spans.
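To make the point about `break` concrete, here is a minimal sketch of such a loop. The types `Resource`, `SpanData`, `Recordable`, and `ResourceSpans` and the function `PopulateRequest` are simplified, hypothetical stand-ins, not the actual OTLP exporter code. It assumes, as the reviewed snippet does, that all spans in a batch share one resource.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the OTLP proto messages and the recordable.
struct Resource { std::string service_name; };
struct SpanData { std::string name; };
struct Recordable
{
  Resource resource;
  SpanData span;
};
struct ResourceSpans
{
  Resource resource;
  std::vector<SpanData> spans;
};

// Copies every span into the request. The resource is copied only from the
// first recordable, under the assumption that all spans share one resource.
ResourceSpans PopulateRequest(const std::vector<Recordable> &recordables)
{
  ResourceSpans out;
  bool has_resource = false;
  for (const auto &rec : recordables)
  {
    out.spans.push_back(rec.span);
    if (!has_resource)
    {
      out.resource = rec.resource;
      has_resource = true;
      // No `break` here: the same loop still has to add the remaining spans.
    }
  }
  return out;
}
```

A `break` inside the `if` would leave the request with only the first span, which is the dropped-spans problem described above.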
* @param resource pointer to the Resource for this span.
*/
virtual void SetResourceRef(
const opentelemetry::sdk::resource::Resource *const resource) noexcept = 0;
No strong preference, but can we pass a const reference to the resource here?
So I can do this, it's a bit "odd". TL;DR: I didn't want to convert from a reference to a pointer as that causes me to cringe in the notion of "likely bug". I found a way to not do that, but it's still a bit cringey, PTAL.
Thanks, I generally like this approach. I'm just trying to think through its implications (some first thoughts in inline comments).
opentelemetry::sdk::AtomicSharedPtr<SpanProcessor> processor_;
const std::shared_ptr<Sampler> sampler_;
const opentelemetry::sdk::resource::Resource &resource_;
TracerProvider *provider_;
Why do we need this? We have the processor, the sampler, and the resource on the tracer itself.
I vaguely remember a use case for using a `Tracer` without a `TracerProvider` (I think it was related to Envoy). I can try to dig that up. I think this design effectively makes that impossible.
(a) We can't enforce the lifetime of `Processor` at all if we share it across `Tracer`s. One goal here is to give each component of the SDK an explicit owner who controls when its memory is released.
(b) `TracerProvider` owns the resource AND the processor/sampler. So the idea is that Tracers are tied to the lifetime of the TracerProvider and can leverage both of those freely without sacrificing the "known lifetime" issue as you stated.
I'd love to know the Envoy use case. AFAIK you shouldn't be able to get a `Tracer` without a `TracerProvider`, but I could be wrong. I think `Tracer` needs some work for `InstrumentationLibrary`, and likely that may push things in this direction.
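The ownership model described in (a) and (b) could be sketched roughly as follows. This is a simplified illustration with hypothetical stand-in types (`SpanProcessor`, `Resource`, and the method names are assumptions, not the SDK's real signatures): the provider is the single owner, and the tracer only holds a non-owning back-pointer, so it must not outlive the provider.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified stand-ins for the SDK types.
struct SpanProcessor
{
  int spans_ended = 0;
  void OnEnd() { ++spans_ended; }
};
struct Resource { std::string service_name; };

class TracerProvider;

class Tracer
{
public:
  explicit Tracer(TracerProvider *provider) noexcept : provider_(provider) {}
  void EndSpan();
  const Resource &GetResource() const;

private:
  TracerProvider *provider_;  // non-owning: the provider must outlive the tracer
};

class TracerProvider
{
public:
  TracerProvider(std::unique_ptr<SpanProcessor> processor, Resource resource)
      : processor_(std::move(processor)), resource_(std::move(resource)), tracer_(new Tracer(this))
  {}
  // Copying or moving would invalidate the tracer's back-pointer.
  TracerProvider(const TracerProvider &) = delete;
  TracerProvider &operator=(const TracerProvider &) = delete;

  Tracer *GetTracer() { return tracer_.get(); }
  SpanProcessor &GetProcessor() { return *processor_; }
  const Resource &GetResource() const { return resource_; }

private:
  std::unique_ptr<SpanProcessor> processor_;  // explicit single owner
  Resource resource_;
  std::unique_ptr<Tracer> tracer_;
};

// The tracer looks everything up through the provider instead of keeping copies.
inline void Tracer::EndSpan() { provider_->GetProcessor().OnEnd(); }
inline const Resource &Tracer::GetResource() const { return provider_->GetResource(); }
```

The trade-off raised in this thread is visible here: a `Tracer` cannot be constructed or used without a live `TracerProvider`.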
One advantage of not having a dependency between the two is that the `TracerProvider` and `Tracer` can be implemented independently by developers, without worrying about the undocumented interfaces provided/to-be-provided by `TracerProvider`.
That was one of the discussions mentioning the Envoy use case: #35 (comment)
Currently we can use tracers without tracer providers, I think that's not violating the spec. I don't know how many use cases are out there for that (unfortunately I'm myself not that familiar with Envoy). However, if we remove that ability we'd need to be deliberate and careful about it.
Configuration (i.e., SpanProcessors, IdGenerator, SpanLimits and Sampler) MUST be managed solely by the TracerProvider and it MUST provide some way to configure all of them that are implemented in the SDK, at least when creating or initializing it.
So the envoy concern in question is the following:
I could imagine users wanting to use a different ownership model than the TracerProvider that creates one tracer per named component and ties lifetime to a global.
I think having `Tracer` tie its lifetime shorter than a TracerProvider would still be possible here. `Resource` would be global, but the tracer could have a shorter lifetime. That's actually a significant API change we'd have to do though, and has tons of implications around the `InstrumentationLibrary` implementation.
/**
* Sets the resource associated with a span.
*
* <p> The resource is guaranteed to have a lifetime longer than this recordable. It is
The resource is guaranteed to have a lifetime longer than this recordable.
I'm not sure if we can make this statement as is. I'm thinking about a scenario where the user frees the `TracerProvider`, but the exporter is still working on exporting spans (recordables), maybe in another thread.
If we go with plain pointers for resources (instead of `shared_ptr`), we'd need to be very careful, and either explicitly document all implicit assumptions we make, or take additional precautions to avoid problematic scenarios (e.g. making sure that in the `TracerProvider` destructor `Shutdown` is called for all exporters).
I'm planning to change TracerProvider to (a) ensure shutdown and (b) require resource cleanup after shutdown.
Would you be amenable to that change in correlation with this? I think it's needed to make this safe, and I'm happy to go update the docs.
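One way to read "(a) ensure shutdown and (b) require resource cleanup after shutdown" is sketched below. It is only an illustration under assumed, simplified types (this `SpanProcessor`, its log, and the member layout are hypothetical): the provider's destructor shuts the pipeline down first, and the resource is declared before the processor so that it is destroyed last.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Resource { std::string service_name; };

// Hypothetical processor that records lifecycle events into a caller-owned log.
class SpanProcessor
{
public:
  explicit SpanProcessor(std::vector<std::string> *log) : log_(log) {}
  void Shutdown()
  {
    if (!shut_down_)
    {
      shut_down_ = true;
      log_->push_back("shutdown");
    }
  }
  ~SpanProcessor() { log_->push_back("processor destroyed"); }

private:
  std::vector<std::string> *log_;
  bool shut_down_ = false;
};

class TracerProvider
{
public:
  TracerProvider(std::unique_ptr<SpanProcessor> processor, Resource resource)
      : resource_(std::move(resource)), processor_(std::move(processor))
  {}
  // (a) Shutdown always runs before any member is destroyed.
  ~TracerProvider() { processor_->Shutdown(); }

private:
  // (b) Members are destroyed in reverse declaration order, so the resource
  // is released only after the processor is gone.
  Resource resource_;
  std::unique_ptr<SpanProcessor> processor_;
};
```

This only covers the provider's own members; recordables already handed to an exporter on another thread are the harder case discussed in the comments that follow.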
I'm definitely amenable to you making those changes.
Another question: did you decide for performance reasons not to use a managed pointer (`std::shared_ptr`) for holding on to the resource in the recordable, and rather to have implicit ownership via tracer provider/recordable lifetimes?
So far we tried to make all ownership explicit to be on the (memory) safe side.
Ok, thinking about the lifetime of resources, and scenarios where the tracer provider is shut down while the export is still happening, references may not be the right approach. So please ignore that suggestion.
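For comparison, the explicit-ownership alternative mentioned above (a `std::shared_ptr` held by the recordable) might look like the sketch below. The class shapes and the `SetResource`/`MakeRecordable` names are assumptions made for illustration, not the interface proposed in this PR.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Resource { std::string service_name; };

// Hypothetical recordable that shares ownership of its resource.
class Recordable
{
public:
  void SetResource(std::shared_ptr<const Resource> resource) { resource_ = std::move(resource); }
  const Resource &GetResource() const { return *resource_; }

private:
  std::shared_ptr<const Resource> resource_;
};

class TracerProvider
{
public:
  explicit TracerProvider(Resource resource)
      : resource_(std::make_shared<Resource>(std::move(resource)))
  {}
  std::unique_ptr<Recordable> MakeRecordable() const
  {
    std::unique_ptr<Recordable> rec(new Recordable());
    rec->SetResource(resource_);  // bumps the reference count
    return rec;
  }

private:
  std::shared_ptr<const Resource> resource_;
};
```

With this shape a recordable that is still queued in an exporter keeps its resource alive even after the provider has been destroyed, at the cost of an atomic reference-count update per recordable.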
// We assume all the spans are for the same resource.
if (!has_resource)
{
// *resource_span->mutable_resource() = std::move(rec->resource());
nit: replace this comment with TODO support for moving?
Actually, theoretically it is moved, I should have removed it before :)
@lalitb I did implement references prior to seeing this..... |
@lalitb I can remove that commit. However two questions for the group:
My main concern about the approach (which I generally like):
After that, the recordable might be batched by the exporter or another component it is forwarded to, at the same time the global tracer provider could be replaced (there's that possibility at least). To take care of that, we would need to come up with lots of implicit lifetime dependencies (like "shutting down an exporter invalidates all recordables created from it"), and we would need to make it a hard requirement for every exporter to implement Maybe I'm too paranoid. However, what are your thoughts about this approach:
I may be biased by the existing design, and would like to be corrected if so, but how about keeping the current approach of keeping sampler, processor (and now also including resource) as |
@pyohannes / @lalitb Two points:
We're not actually using a. We could make sure however we store it leads to efficient sorting/ordering in Exporters for the OTLP case (i.e. an efficient hash on |
Regarding the concerns around using shared pointer, I think we have to change the existing implementation no matter what. See the specification line:
: tracer_{std::move(tracer)},
processor_{processor},
recordable_{processor_->MakeRecordable()},
recordable_{nullptr},
This initialization is unnecessary?
*/
std::shared_ptr<Sampler> GetSampler() const noexcept;
explicit Tracer(TracerProvider *provider) noexcept;
Could we make the ctor of `Tracer` private and accessible only in `TracerProvider`?
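A private constructor plus a `friend` declaration is one common way to do that. The sketch below uses pared-down, hypothetical versions of `Tracer` and `TracerProvider` (the `provider()` accessor and the single cached tracer are assumptions for illustration).

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

class TracerProvider;

class Tracer
{
public:
  const TracerProvider *provider() const noexcept { return provider_; }

private:
  friend class TracerProvider;  // only the provider may construct tracers
  explicit Tracer(TracerProvider *provider) noexcept : provider_(provider) {}

  TracerProvider *provider_;
};

class TracerProvider
{
public:
  std::shared_ptr<Tracer> GetTracer()
  {
    if (!tracer_)
    {
      // std::make_shared cannot reach the private constructor, so use `new`.
      tracer_ = std::shared_ptr<Tracer>(new Tracer(this));
    }
    return tracer_;
  }

private:
  std::shared_ptr<Tracer> tracer_;
};

// Code outside TracerProvider cannot construct a Tracer directly.
static_assert(!std::is_constructible<Tracer, TracerProvider *>::value,
              "Tracer should only be constructible by TracerProvider");
```

The cost is the extra allocation from not using `std::make_shared`, and it rules out standalone tracers, which is the Envoy concern raised earlier in the conversation.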
Just another idea I had. What about putting all
struct TracerContext {
std::unique_ptr<SpanProcessor> processor;
std::unique_ptr<Sampler> sampler;
std::shared_ptr<Resource> resource;
};
The The In this way, I think it cleans up component dependencies in the SDK, however it doesn't give us an efficient way to pass |
Thanks @pyohannes for the proposal - Just to be clear to me, the proposal is to have cleaner approach for removing the dependency between |
@jsuereth - I like this approach of using sdk::span to store the reference to Resource to have consistent implementation with other languages, just wondering how it would be different and efficient from current approach of storing reference to Resource in otlp |
@pyohannes With the caveat that we do multithreading primitives correctly around updates, I think that could work for decoupling Tracer from TracerProvider, while still meeting the spec requirement around the processor. Should I take a crack at this now as part of this CL? |
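A rough sketch of how the `TracerContext` idea might decouple `Tracer` from `TracerProvider` follows. Only the three `TracerContext` members come from the proposal above; how the context is shared (a `std::shared_ptr<TracerContext>` held by both sides) and every other type and method here are assumptions, and the multithreading primitives mentioned in the reply are deliberately left out.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified pipeline components.
struct SpanProcessor
{
  int ended = 0;
  void OnEnd() { ++ended; }
};
struct Sampler
{
  bool ShouldSample() const { return true; }
};
struct Resource { std::string service_name; };

struct TracerContext
{
  std::unique_ptr<SpanProcessor> processor;
  std::unique_ptr<Sampler> sampler;
  std::shared_ptr<Resource> resource;
};

class Tracer
{
public:
  explicit Tracer(std::shared_ptr<TracerContext> context) : context_(std::move(context)) {}
  // Returns true if the span was sampled and handed to the processor.
  bool EndSpan()
  {
    if (!context_->sampler->ShouldSample())
    {
      return false;
    }
    context_->processor->OnEnd();
    return true;
  }
  int EndedSpans() const { return context_->processor->ended; }
  const Resource &GetResource() const { return *context_->resource; }

private:
  std::shared_ptr<TracerContext> context_;  // shared with the provider
};

class TracerProvider
{
public:
  explicit TracerProvider(std::shared_ptr<TracerContext> context) : context_(std::move(context)) {}
  std::shared_ptr<Tracer> GetTracer() const { return std::make_shared<Tracer>(context_); }

private:
  std::shared_ptr<TracerContext> context_;
};
```

In this variant a tracer depends only on the context, so it can be built without a provider and can outlive one, while the configuration itself still lives in a single object.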
Yeah, OTLP requires grouping by Resource/InstrumentationLibrary by its design. Alternatively we can store a map from IL => ILSpans and Resource => ResourceSpans to do that lookup/grouping as we iterate, depending on which ends up being faster. I won't be able to get to showing this code until Thursday, but I'll try to have something then for you to see what I'm suggesting. We can discuss in the SIG :) |
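The map-based grouping mentioned here could be sketched as below. This is an assumption-laden illustration, not the promised code: resources are keyed by pointer identity, and `Recordable`, `GroupByResource`, and the use of span names in place of full span messages are all hypothetical simplifications.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Resource { std::string service_name; };

// Hypothetical recordable holding a non-owning pointer to its resource.
struct Recordable
{
  const Resource *resource;
  std::string span_name;
};

// Groups span names by the identity (address) of their resource in one pass,
// mirroring a Resource => ResourceSpans lookup.
std::map<const Resource *, std::vector<std::string>> GroupByResource(
    const std::vector<Recordable> &recordables)
{
  std::map<const Resource *, std::vector<std::string>> grouped;
  for (const auto &rec : recordables)
  {
    grouped[rec.resource].push_back(rec.span_name);
  }
  return grouped;
}
```

Keying on the pointer avoids hashing or comparing resource attributes, but it treats two equal resources at different addresses as distinct groups.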
It ensures that the configuration is owned by a single |
Maybe let's get other people's opinion first in tomorrow's SIG. |
This PR is the result of discussion/design from #575.
- `SetResourceRef` to the `Recordable` interface
- `Tracer` methods not in the specification.
- `Tracer` to get `Processor`/`Exporter` solely from `TracerProvider`
- `Span` to keep a reference to generating `Tracer` and pull information from it rather than duplicating
- `SetResourceRef` is called on `Recordable` when creating spans
- `Resource`

For a separate PR?:
- `SpanProcessor` to be `unique_ptr` on `TracerProvider`.
- `Exporter` to be `unique_ptr` on `SpanProcessor`s.
- `InstrumentationLibrary` class
- `TracerProvider` to construct multiple `Tracer`s, each remembering their `InstrumentationLibrary`
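As a rough idea of what the last follow-up item might involve, the sketch below has a provider hand out one tracer per (name, version) pair, with each tracer remembering its `InstrumentationLibrary`. The class layouts, the `GetTracer` signature, and the map-based cache are assumptions for illustration, and the lookup is not thread-safe as written.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical minimal InstrumentationLibrary: just a name and a version.
struct InstrumentationLibrary
{
  std::string name;
  std::string version;
};

class Tracer
{
public:
  explicit Tracer(InstrumentationLibrary library) : library_(std::move(library)) {}
  const InstrumentationLibrary &GetInstrumentationLibrary() const noexcept { return library_; }

private:
  InstrumentationLibrary library_;
};

class TracerProvider
{
public:
  // Returns the tracer for (name, version), creating it on first use.
  std::shared_ptr<Tracer> GetTracer(const std::string &name, const std::string &version = "")
  {
    auto key = std::make_pair(name, version);
    auto it = tracers_.find(key);
    if (it == tracers_.end())
    {
      it = tracers_
               .emplace(key, std::make_shared<Tracer>(InstrumentationLibrary{name, version}))
               .first;
    }
    return it->second;
  }

private:
  std::map<std::pair<std::string, std::string>, std::shared_ptr<Tracer>> tracers_;
};
```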