Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds support for "current span" formerly supported with thread binders #369

Merged
merged 1 commit into from
Apr 3, 2017

Conversation

codefromthecrypt
Copy link
Member

@codefromthecrypt codefromthecrypt commented Mar 21, 2017

This implements a "current span" concept which represents the in-flight
operation. Tracer.currentSpan() can be used to add custom tags to a
span and Tracer.nextSpan() can be used to create a child of whatever
is in-flight.

The backend of this is CurrentTraceContext which can be extended to
support SLF4J or Finagle synchronization. It also exposes concurrent
wrappers for types like ExecutorService.

Particularly, this includes the ability to get the "current span" and assign is to a scope. In most cases the implementation will be backed by thread locals, whether that's directly (as brave's former thread state was) or indirectly via a GRPC or Finagle Context.

The (power) user apis are located in the Tracer api:

  • Span currentSpan - retrieves any span in scope or null
  • Span nextSpan - creates a new child or trace if there's no current span
  • SpanInScope withSpanInScope(span) - makes a span current and available to calls within the scope

The SPI for this is called CurrentTraceContext, and it can be implemented to support MDC integration or alternate span storage

  • TraceContext get() - returns the current trace context or null
  • Scope newScope(context) - creates a scope where calls to get will return the input

By default, CurrentTraceContext is implemented by a thread local, though there are tests to show it can be implemented by Finagle (or gRPC or otherwise).

Fixes #166

@devinsba
Copy link
Member

I'm not sold on the name SpanScoper any more than I like Scheduler from the OT PR that is like this one. Possibly SpanManager or SpanExecutionManager or SpanExecutionContext

Really though, naming things is hard so I hope others have some input

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Mar 22, 2017 via email

@codefromthecrypt
Copy link
Member Author

@eirslett @kristofa what are your thoughts on this.. I think you have used different names in the past. what are your thoughts today? (also if you have any thoughts about the impl please shout)

@llinder
Copy link
Member

llinder commented Mar 22, 2017

I like the direction so far. SpanScoper does seem a bit unusual. If better names pop into my head I will share them :)

@codefromthecrypt
Copy link
Member Author

PS I updated my text above to include 5 examples of similar code, hopefully to help hone in on best vocab

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Mar 22, 2017

the words: "scope context current" seem the more popular words in almost all the existing types that do stuff like this. The subject in the Tracer methods is usually a what we call here either a span or a trace context.

The design in this PR and the thing currently called SpanScoper actually acts on a TraceContext, which is more specific than Span. How about this?

interface CurrentTraceContext {
  @Nullable TraceContext get();
  TraceContextScope newScope(TraceContext traceContext);
}

@codefromthecrypt
Copy link
Member Author

added a trace context interop example between this api and finagle cc @eirslett @klingerf @kristofa

@codefromthecrypt
Copy link
Member Author

Added a commit that reuses code from https://github.com/DistributedTracing/continuation-local-storage-jvm/pulls to make a better base impl of thread storage. thanks to @eirslett cc @schlosna

@codefromthecrypt
Copy link
Member Author

progress so far:

  • started with a name which seems unpopular
  • proved this works with log4j2 (or slf4j)
  • proved this can do finagle interop

next steps:

  • get feedback on esp on naming .. I personally like CurrentTraceContext
  • integrate with real instrumentation

@llinder
Copy link
Member

llinder commented Mar 22, 2017

+1 for CurrentTraceContext

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Mar 23, 2017 via email

@tramchamploo
Copy link

+1 for CurrentTraceContext

@codefromthecrypt
Copy link
Member Author

actually I'm going to revert the continuation-local based default impl as it is over-engineering for what we are using it for. For example, we don't yet search or take action on multiple contexts (such as clearing them). In that case, the static array is a bit distracting as it doesn't do anything. We can add it later or re-use a full implementation in a different PR.

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Mar 23, 2017

updated the code and PR description with the following.. please let me know if this is not good :)


The (power) user apis are located in the Tracer api:

  • Span currentSpan - retrieves any span in scope or null
  • Span nextSpan - creates a new child or trace if there's no current span
  • SpanInScope withSpanInScope(span) - makes a span current and available to calls within the scope

The SPI for this is called CurrentTraceContext, and it can be implemented to support MDC integration or alternate span storage

  • TraceContext get() - returns the current trace context or null
  • Scope newScope(context) - creates a scope where calls to get will return the input

By default, CurrentTraceContext is implemented by a thread local, though there are tests to show it can be implemented by Finagle (or gRPC or otherwise).

@codefromthecrypt
Copy link
Member Author

Added Tracer.nextSpan based on feedback from @reta #356 (comment)

@codefromthecrypt
Copy link
Member Author

starting to revisit the PR to actually use this code. writing a test for currentSpan access via user code is interesting as it smoked out a bug in our approach of instrumenting apache hc https://github.com/openzipkin/brave/pull/350/files#r107617533

@codefromthecrypt
Copy link
Member Author

in the latest work on #350, I had to place a span in scope in one callback and close it in another.

TL;DR; this is one of those scenarios where it would be nice to use try-with-resources, but the
actual library has no closure over user code. You have to set it in one callback and close it in another.

details..

Apache HttpClient includes a way to customize its HttpClientBuilder, which allows for some decoration. If you set a span in scope in the decorateProtocolExec callback, any interceptor
can use the "current span" without any special knowledge. For example, it can grab the trace ID
and add it as a different header (we have a test for this). The decorateMainExec callback gets
the span in scope attribute and closes it when finished tracing (as main exec is the real http call).

in simplified pseudocode

Result protocolExec(Request request) {
  Span span = tracer.nextSpan();
  // it is as if we started the try-with-resources here
  context.setAttribute(SpanInScope.class, tracer.withSpanInScope(next));
  return delegate.protocolExec(request);
}

Result mainExec(Request request) {
  Span span = tracer.currentSpan(); // works because it is set in scope!
  try {
    // do tracing work
    return delegate.protocolExec(request);
  } finally {
    // it is as if we ended the try-with-resources here
    context.getAttribute(SpanInScope.class).close();
  }
}

@kmaci3k
Copy link

kmaci3k commented Mar 27, 2017

I'm wondering how brave is related to opentracing. In opentracing website there is information that brave is an experimental bridge. How (and if) should it be used with opentracing ?

Nevermind, I found brave-opentracing project...

Is there any plan when those changes will be available in master ?

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Mar 27, 2017 via email

@bogdandrutu
Copy link

Question:

Why not withSpan? Golang defined this model withX for their context API and I think having consistency between languages is a win for developers.

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Mar 28, 2017 via email

@codefromthecrypt
Copy link
Member Author

@kmaci3k I think you asked when this will land.. probably this weekend. I wanted to go back and implement all of the instrumentation on top before cementing in the api. I'm almost done with that.

So far, at least we'll need something to support executors.. maybe that's the only change to this PR.

@codefromthecrypt codefromthecrypt mentioned this pull request Mar 28, 2017
11 tasks
@codefromthecrypt codefromthecrypt force-pushed the spanscoper branch 2 times, most recently from 1af45d9 to dde3b76 Compare April 1, 2017 12:34

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note implementations can completely replace any of these features below as they are non-final

*/
public ExecutorService executorService(ExecutorService delegate) {
class CurrentTraceContextExecutorService extends brave.internal.WrappingExecutorService {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doing like this allows extenders to not rely on any brave internal classes if they have their own executorservice

@codefromthecrypt
Copy link
Member Author

Added tests. will update README and do polishing. will merge and release in the next 24hrs

@codefromthecrypt codefromthecrypt changed the title WIP in process propagation api Adds support for "current span" formerly supported with thread binders Apr 2, 2017
@codefromthecrypt codefromthecrypt force-pushed the spanscoper branch 2 times, most recently from 06e2fe9 to b8db361 Compare April 3, 2017 04:40
This implements a "current span" concept which represents the in-flight
operation. `Tracer.currentSpan()` can be used to add custom tags to a
span and `Tracer.nextSpan()` can be used to create a child of whatever
is in-flight.

The backend of this is `CurrentTraceContext` which can be extended to
support SLF4J or Finagle synchronization. It also exposes concurrent
wrappers for types like `ExecutorService`.

Particularly, this includes the ability to get the "current span" and assign is to a scope. In most cases the implementation will be backed by thread locals, whether that's directly (as brave's former thread state was) or indirectly via a GRPC or Finagle Context.

The (power) user apis are located in the Tracer api:
* `Span currentSpan` - retrieves any span in scope or null
* `Span nextSpan` - creates a new child or trace if there's no current span
* `SpanInScope withSpanInScope(span)` - makes a span current and available to calls within the scope

The SPI for this is called `CurrentTraceContext`, and it can be implemented to support MDC integration or alternate span storage
* `TraceContext get()` - returns the current trace context or null
*  `Scope newScope(context)` - creates a scope where calls to get will return the input

By default, `CurrentTraceContext` is implemented by a thread local, though there are tests to show it can be implemented by Finagle (or gRPC or otherwise).
@codefromthecrypt codefromthecrypt merged commit c1cec35 into master Apr 3, 2017
@codefromthecrypt codefromthecrypt deleted the spanscoper branch April 3, 2017 06:47
@kmaci3k
Copy link

kmaci3k commented Apr 3, 2017

If I understand correctly a developer who wants to trace asynchronous call should implement on their own a wrapper (similar to this one: https://github.com/spring-cloud/spring-cloud-sleuth/blob/master/spring-cloud-sleuth-core/src/main/java/org/springframework/cloud/sleuth/TraceCallable.java) using Brave API ?

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Apr 3, 2017 via email

codefromthecrypt pushed a commit that referenced this pull request May 2, 2017
This adds trace and span IDs to the SLF4J Mapped Diagnostic Context (MDC)
so that you can search or aggregate logs accordingly.

To enable this, configure `brave.Tracing` with `MDCCurrentTraceContext`
like so:

```java
tracing = Tracing.newBuilder()
    .currentTraceContext(MDCCurrentTraceContext.create())
    ...
    .build();
```

Then, in your log configuration, you can use `traceId` and or `spanId`.

Here's an example logback configuration:

```xml
<pattern>%d [%X{traceId}/%X{spanId}] [%thread] %-5level %logger{36} - %msg%n</pattern>
```

When a trace is in progress, it would log statements like this:
```
2017-05-02 23:36:04,789 [fcd015bf6f8b05ba/fcd015bf6f8b05ba] [main] INFO  c.a.FooController - I got here!
```

Users could then copy/paste the trace ID into the zipkin UI, or use log
correlation to further debug a problem.

Fixes #150
See #369
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants