Allow reactor instrumentation to pick up spans from reactor context #4159
Our current reactive framework instrumentations (usually) try to execute all callbacks in the context that was active during subscribe. Do you propose to replace that or augment that with the ability to specify span context explicitly? |
I'm proposing to support both - full change is in this pr and adds explicit context support. |
I see this problem not as changing when context is captured (I think it should still be on subscribe), but rather allowing downstream publishers to influence the otel context of upstream publishers. E.g. here's a more complex example that I would love to see create the trace
So as a summary:
While I'd like to solve p2 as well, and will look for the solution - p1 is still very valuable for the reactor instrumentation. I hope we can start with it and see how p2 can be fixed for the implicit case. In our libraries, I'll still use explicit context with reactor context whenever possible. |
So, can we discuss it at one of the instrumentation meetings? Can someone more familiar with the reactor help to find a more elegant approach? And just to clarify: this PR, even though it does not solve |
definitely, I just added it to the agenda for Thu, and will try to understand things a bit better before then
cc: @HaloFour 👀 |
I've been thinking about a general API for tracing Reactor that didn't entirely rely on the instrumentation/operator being installed. It's something I'd be interested in adding to the instrumentation library, but it would work a bit differently than what is proposed here. Beyond that I'm not sure I understand the purpose of this PR. The TracingOperator is supposed to capture the current Context at the point of assembly, so the following should work fine and not leak:
private Mono<?> traceMono(Mono<?> publisher) {
  Span span = tracer.spanBuilder("traceMono").startSpan();
  // makeCurrent() is what returns the closeable Scope; Context.with(span) alone returns a Context
  try (Scope scope = Context.current().with(span).makeCurrent()) {
    return publisher
        .doOnError(t -> span.recordException(t))
        .doFinally(t -> span.end());
  }
}
Apart from that I'm concerned that the proposed PR captures the Span separately from the Context and then tries to tie them back together. If we were to put anything into the Reactor context it feels like it should be the full OTel Context. As for creating the span on subscribe, which is certainly necessary for cold and reusable publishers, that is something probably better served with a separate tracing operator that can be applied with
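The assembly-time capture described in this comment can be modeled without Reactor or OpenTelemetry. The following is a minimal JDK-only sketch, not the instrumentation's code: `CURRENT`, `makeCurrent`, and `captureAtAssembly` are hypothetical stand-ins for `Context.current()`, `Scope`, and the operator hook. It shows that a deferred callback only sees the span if the context was captured while the scope was still open.

```java
import java.util.function.Supplier;

// JDK-only model; none of these names come from Reactor or OpenTelemetry.
class AssemblyCaptureSketch {
  // Stand-in for the thread-local "current" context.
  static final ThreadLocal<String> CURRENT = ThreadLocal.withInitial(() -> "root");

  // Stand-in for Scope: closing it restores the previous value.
  interface Scope extends AutoCloseable {
    @Override
    void close();
  }

  static Scope makeCurrent(String span) {
    String previous = CURRENT.get();
    CURRENT.set(span);
    return () -> CURRENT.set(previous);
  }

  // Wraps a deferred callback so it later runs with the context
  // that was current when this method was called ("assembly").
  static <T> Supplier<T> captureAtAssembly(Supplier<T> callback) {
    String captured = CURRENT.get();
    return () -> {
      try (Scope ignored = makeCurrent(captured)) {
        return callback.get();
      }
    };
  }

  // Assembles a deferred callback inside a scope, without capturing.
  static Supplier<String> assembleWithoutCapture() {
    try (Scope ignored = makeCurrent("traceMono")) {
      return () -> CURRENT.get();
    }
  }

  // Same, but the callback is wrapped while the scope is still open.
  static Supplier<String> assembleWithCapture() {
    try (Scope ignored = makeCurrent("traceMono")) {
      return captureAtAssembly(() -> CURRENT.get());
    }
  }

  public static void main(String[] args) {
    System.out.println(assembleWithoutCapture().get()); // scope already closed
    System.out.println(assembleWithCapture().get());    // captured value restored
    System.out.println(CURRENT.get());                  // nothing leaked
  }
}
```

In this model the scope is closed before either callback runs, so only the wrapped callback observes `traceMono`, and the thread-local is back to its initial value afterwards.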
Unfortunately I don't think there is a way for code to know if a Mono is going to be subscribed in the future or not, so we couldn't ever use an approach in auto instrumentation that doesn't trace on subscription. While manual code could probably do it, it feels hairy to have both approaches interacting with each other. One issue we often have is when a framework does not provide a filter-like API for interception. For example, tracing seems best implemented in The javadoc describes perfectly what we want to do, add behavior when the Mono is subscribed. But without something like Reactive frameworks could be nice to us and provide such a callback which I think could help here. But I suspect their general stance would be, use the reactive context. So in this sense allowing reading from either Also, agree with @HaloFour that we want to expect the OTel Context object in the reactor context, not just a Span. And rather than expose a String, we should expose static methods for 1) create new reactor context with an OTel context and 2) read OTel context from reactor context.
I'd propose an API utilize Mono<T> mono = ...;
Mono<T> traced = mono.transform(withSpan("MySpan")); I started to work on a PoC of this when I was toying with multiple subscriptions via
This is definitely my biggest concern here as well. It's quite likely that the Context captured on assembly is the better option but that could be difficult to discern. I've not fully thought it through but perhaps the Reactor operator could put a cookie into the OTel context that is captured in the Reactor context. The operator could compare the cookies within the current/captured OTel context with that in the Reactor context to determine if the former is based on the latter and would be a better context to use. But that probably doesn't solve every case. |
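One possible reading of the cookie idea, as a JDK-only sketch: contexts are modeled as immutable maps, and `COOKIE_KEY`, `with`, `stamp`, and `choose` are hypothetical names invented for illustration. The comment itself notes that the idea is not fully thought through, so this only illustrates the comparison step and is not a proposed implementation.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// JDK-only model of the "cookie" comparison; all names are hypothetical.
class CookieSketch {
  static final String COOKIE_KEY = "reactor-cookie";

  // Immutable-style update: returns a copy with one extra entry.
  static Map<String, Object> with(Map<String, Object> context, String key, Object value) {
    Map<String, Object> copy = new HashMap<>(context);
    copy.put(key, value);
    return Collections.unmodifiableMap(copy);
  }

  // Stamps a context with a fresh cookie before it is stored in the Reactor context.
  static Map<String, Object> stamp(Map<String, Object> context) {
    return with(context, COOKIE_KEY, new Object());
  }

  // If the current context carries the same cookie instance as the stored one,
  // it was derived from the stored context and is assumed to be the better choice.
  // Otherwise fall back to the context stored in the Reactor context.
  static Map<String, Object> choose(Map<String, Object> current, Map<String, Object> stored) {
    Object cookie = stored.get(COOKIE_KEY);
    boolean derived = cookie != null && cookie == current.get(COOKIE_KEY);
    return derived ? current : stored;
  }

  public static void main(String[] args) {
    Map<String, Object> empty = Collections.emptyMap();
    Map<String, Object> stored = stamp(with(empty, "span", "parent"));
    Map<String, Object> derived = with(stored, "span", "child");
    Map<String, Object> unrelated = with(empty, "span", "other");

    System.out.println(choose(derived, stored).get("span"));
    System.out.println(choose(unrelated, stored).get("span"));
  }
}
```

As the comment says, a check like this probably does not cover every case, for example a current context that is unrelated to the stored one but still the intended parent.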
Here's a real quick&dirty implementation of the TracingOperator API I suggested that can do this: https://gist.github.com/HaloFour/6cac6653120da96fc1e8e1effca7b5b5 The method |
Another discussion point on exposing APIs on
I think the main (and huge!) value comes if it's an example and guidance on how to instrument reactor - maybe a blog post or md file in this repo? I promise to steal from it a lot.
I understand the sentiment but I do think that copied code will not be trivial. It also seemingly doesn't make sense to have a proposal here for behavior in the Reactor instrumentation if it is a goal to not use the Reactor instrumentation. Something needs to extract that explicit Reactor context into the current Context for the rest of the SDK and instrumentation to work with it. |
Ah, I get you now. It's not that you're avoiding the instrumentation, it's that you don't want a dependency on the library APIs. That does make more sense. I assume that would also preclude any helper methods that would avoid leaking a String key to the consumer. I am still quite concerned about how this feature would interact with other instrumentation around Reactor or libraries that utilize Reactor. It seems like it would be really easy for the OTel Context stored in the Reactor Context to not be the correct context. |
Do you have any examples of it? I'd like to play and find some reasonable solutions. My point here is that we know for sure current trace context does not work (see my example above). Or try running the same example with It sounds like reactor context works at least better than So I hope to find some reasonable solution here.
It involves some manual steps, but people who create spans are probably willing to put a bit extra effort to make their instrumentation work (and current() doesn't). I'm new to the reactor world (which I'm working on), so I'm looking for some examples here, based on the community expertise, where this solution does not work (or works worse than current()) and am happy to adjust it. |
I've not tried to put together a proof of concept, but if your traced Reactor chain used
I disagree that the Reactor context would be better.
Ultimately I think the problem is not one of explicitness, it's that the |
…n addition to current tracing context
...ctor-3.1/library/src/main/java/io/opentelemetry/instrumentation/reactor/TracingOperator.java
...src/main/java/io/opentelemetry/instrumentation/reactor/ReactorAsyncOperationEndStrategy.java
...ctor-3.1/library/src/main/java/io/opentelemetry/instrumentation/reactor/TracingOperator.java
public static reactor.util.context.Context storeInContext(
    reactor.util.context.Context context, Context traceContext) {
  return context.put(TRACE_CONTEXT_KEY, traceContext);
}

/**
 * Gets Trace {@link io.opentelemetry.context.Context} from Reactor {@link
 * reactor.util.context.Context}.
 *
 * @param context Reactor's context to get trace context from.
 * @param defaultTraceContext Default value to be returned if no trace context is found on Reactor
 *     context.
 * @return Trace context or default value.
 */
public static Context fromContextOrDefault(
    reactor.util.context.Context context, Context defaultTraceContext) {
  return context.getOrDefault(TRACE_CONTEXT_KEY, defaultTraceContext);
}
what do you think of:
storeInContext() -> storeOpenTelemetryContext()
fromContextOrDefault() -> getOpenTelemetryContext()
also, I think (future) renaming of the class might help:
TracingOperator -> ReactorTracing
TracingOperatorBuilder -> ReactorTracingBuilder
I was thinking to rename all of TracingOperator to ReactorTracing, it sort of matches the naming we have started applying to other instrumentation (e.g. GrpcTracing/GrpcTracingBuilder), and it's probably nice to have a single entry point for the instrumentation instead of two.
other than that looks great to me.
Yeah, I can do it. I thought that as we'll add more tracing stuff (start/end span convenience) having configuration (register/reset hooks) on the same class may get noisy, so I thought about separating tracing. |
...actor-3.1/library/src/main/java/io/opentelemetry/instrumentation/reactor/ReactorTracing.java
 * @param context Reactor's context to store trace context in.
 * @param traceContext Trace context to be stored.
 */
public static reactor.util.context.Context storeOpenTelemetryContext(
Instead of exposing store/get, can we expose runInScope (runWithContext)? It seems to be the better API
We can, but it implies using a hack with dummy mono/flux even when it's not needed (not a big deal). But I think we ultimately don't need runWithContext - having something like tracePublisher(Publisher, SpanBuilder, AttributeExtractors?, whatnot) will eliminate the need for it and provide convenience.
For those who want low-level control (e.g. if I want to start/end a span conditionally or in some funky way), get/set context is the minimal needed functionality
also, if I need to inject headers or enrich span, I'd prefer explicit context from reactor (i.e. get method) over Context.current(). Since current can change in any user callback, I never know what it refers to. There are no guarantees with explicit reactor context either, so it's rather a preference.
For existing customers of the javaagent who monitor their Reactor-Netty calls, will anything change after this PR? I mean, will the produced telemetry change?
@iNikem As far as I know, the only existing instrumentation affected by this change is @lmolkova Do you think it's worth having a test including |
@HaloFour if you have time, would be great to get your review on this also |
Confirming what @anuraaga said - |
agreed, trying to add one and stuck with something unrelated |
@anuraaga You sent me down the rabbit hole :) Now that I'm back, I added ReactorNettyWithSpanTest to overcome some test instrumentation issues (thanks @trask for sharing it), and since the test HTTP server is fully synchronous, it needs a bit of hacking.
import static io.opentelemetry.api.trace.SpanKind.INTERNAL
import static io.opentelemetry.api.trace.SpanKind.SERVER

class ReactorNettyWithSpanTest extends InstrumentationSpecification implements AgentTestTrait {
Thanks!
Looks good to me! The approach with the "dummy" publisher and |
…pen-telemetry#4159)
* Allow reactor instrumentation to pick up spans from reactor context in addition to current tracing context
* And fix nested WithSpan with defer
* up
* Review comments
* move ReactorTracing to Operation and rename to ContextPropagationOperator
* fix build
* Add reactor-netty client test nested under WithSpan
* Add link to the issue in comments
* clean up
I'm trying to instrument my async library APIs that return e.g. Mono and never subscribe/block. Internally they do HTTP requests and logs. I'm looking for a way for HTTP auto-instrumentation and loggers MDC to pick up context from my library instrumentation.
Let's say that's the library call I'm trying to instrument:
And let's say here's how I trace it
If I do this with agent enabled, I will have my HTTP spans unrelated to traceMono span, since I never made traceMono current. But there is no good way (?) to make traceMono span current without leaking context.
Proposal
Let reactor instrumentation capture trace context from current or from explicit reactor context. Libraries/user code can explicitly pass context to instrumentation and it will propagate to upstream calls.
This way I don't ever need to make traceMono span current myself and instrumentation will make it current in all callbacks.
Here are more examples to play with https://gist.github.com/lmolkova/156a9e6453f5d3f8242689cfe6a6c07a.
*This change only adds context APIs, but no convenience tracing methods yet
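The lookup order in the proposal can be illustrated with a small JDK-only model. This is a sketch under stated assumptions, not the PR's code: the Reactor context is modeled as an immutable `Map`, the trace context as a `String`, the key value is made up, and `resolveOnSubscribe` is a hypothetical name for what the instrumentation would do at subscribe time. `storeInContext` and `fromContextOrDefault` mirror the method names discussed in the review.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// JDK-only model of "explicit reactor context first, current context otherwise".
class ContextResolutionSketch {
  // Hypothetical key value; the real key is internal to the instrumentation.
  static final String TRACE_CONTEXT_KEY = "otel-trace-context";

  // Stand-in for the thread-local current trace context.
  static final ThreadLocal<String> CURRENT = ThreadLocal.withInitial(() -> "root");

  // Returns a new context with the trace context stored; the input is not mutated,
  // mirroring how a Reactor Context put returns a new instance.
  static Map<String, String> storeInContext(Map<String, String> reactorContext, String traceContext) {
    Map<String, String> copy = new HashMap<>(reactorContext);
    copy.put(TRACE_CONTEXT_KEY, traceContext);
    return Collections.unmodifiableMap(copy);
  }

  // Reads the stored trace context, or the given default if none was stored.
  static String fromContextOrDefault(Map<String, String> reactorContext, String defaultTraceContext) {
    return reactorContext.getOrDefault(TRACE_CONTEXT_KEY, defaultTraceContext);
  }

  // What the instrumentation would pick on subscribe in this model:
  // the explicit context if present, the current one otherwise.
  static String resolveOnSubscribe(Map<String, String> reactorContext) {
    return fromContextOrDefault(reactorContext, CURRENT.get());
  }

  public static void main(String[] args) {
    Map<String, String> empty = Collections.emptyMap();
    Map<String, String> explicit = storeInContext(empty, "traceMono");

    System.out.println(resolveOnSubscribe(empty));    // falls back to current
    System.out.println(resolveOnSubscribe(explicit)); // explicit context wins
  }
}
```

In this model the library never has to make its span current: storing it in the subscriber's context is enough for the lookup to find it.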