Skip to content
This repository has been archived by the owner on Oct 3, 2023. It is now read-only.

Add specification of log correlation for tracing (issue #123). #181

Merged
merged 12 commits into from
Oct 9, 2018
Merged
58 changes: 58 additions & 0 deletions trace/LogCorrelation.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
# Log Correlation (draft)

This specification is not done until the key names below are finalized (issue
[#195](https://github.com/census-instrumentation/opencensus-specs/issues/195)).

Log correlation is a feature that inserts information about the current span into log entries
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be more specific than "current span". I would say "the trace span in the context when the logged event occurred".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that "current span" needs more explanation. I added a new section and linked to the existing section on the current span.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for @Ramonza.

created by existing logging frameworks. The feature can be used to add more context to log entries,
filter log entries by trace ID, or find log entries associated with a specific trace or span.

The design of a log correlation implementation depends heavily on the details of the particular
logging framework that it supports. Therefore, this document only covers the aspects of log
correlation that could be shared across log correlation implementations for multiple languages and
logging frameworks. It doesn't cover how to hook into the logging framework.

## Identifying the span to associate with a log entry

A log correlation implementation should look up tracing data from the span that is current at the
point of the log statement. See
[Span.md#how-span-interacts-with-context](Span.md#how-span-interacts-with-context) for the
definition of the current span.

## Tracing data to include in log entries

A log correlation implementation should make the following pieces of tracing data from the current
span context available in each log entry:

### Trace ID

The trace ID of the current span. See [Span.md#traceid](Span.md#traceid).

### Span ID

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note it is common for some tooling to attempt to reverse engineer spans from log entries. when this is the case the parent id is also put in the logging context

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that the parent span ID wasn't accessible through the API, except after export as SpanData.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a todo to consider this. Until we have a clear request it is not worth doing it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


The span ID of the current span. See [Span.md#spanid](Span.md#spanid).

### Sampling Decision
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's put a note on samplingScore storing into the logs. It may be useful if one collects logs only from sampled traces and needs to estimate the actual count assuming probability sampling.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SergeyKanzhelev do you want to write a one pager explaining the samplingScore, how can be used, why is important to be propagated, etc.

For the moment I would not mention it here until we define it. We can leave a TODO to say when decide on sampling score then consider to add it here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a TODO.


The sampling bit of the current span, as a boolean. See
[Span.md#supported-bits](Span.md#supported-bits).

TODO(sebright): Include "samplingScore" once that field is added to the SpanContext.

TODO(sebright): Add a section on fields from the Tracestate. Users should be able to add
vendor-specific fields from the Tracestate to logs, using a callback mechanism.

TODO(sebright): Consider adding parent span ID, to allow recreating the trace structure from logs.

## String format for tracing data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to add a section for "Other fields from tracestate" or something like this. So vendors-specific key/value pairs can be added to the logs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it worth mentioning the Tracestate here. Probably every integration should offer a callback mechanism to extract and log data from Tracestate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about I add a TODO and open an issue for this part? I think it will require more discussion, and this PR already has a lot of comments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a TODO for this feature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened #196.


The logging framework may require the pieces of tracing data to be converted to strings. In that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tracing data or identifiers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean it should say "The logging framework may require the tracing data or identifiers to be converted to strings"?

case, the log correlation implementation should format the trace ID and span ID as lowercase base 16
and format the sampling decision as "true" or "false".

## Key names for tracing data

Some logging frameworks allow the insertion of arbitrary key-value pairs into log entries. When
a log correlation implementation inserts tracing data by that method, the key names should be
"opencensusTraceId", "opencensusSpanId", and "opencensusTraceSampled" by default. The log
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are very mouthful key names.

OTOH, underscore is often preferred over camel case in log labels. We might want to reconsider these keys.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are very mouthful key names.

There was some discussion about short vs long key names in census-instrumentation/opencensus-java#1371 (comment).

I thought that it would be better to use long key names by default, to avoid conflicting with other tracing libraries' key names. Users could then override the key names if they wanted to purposefully use the same key names for, say, Brave and OpenCensus.

OTOH, underscore is often preferred over camel case in log labels. We might want to reconsider these keys.

I'm happy to change the key names if underscores are more common. Where have you seen underscores used?

I saw camel case logging context key names in Brave, Wingtips, and Spring Cloud Sleuth and camel case tracing-related log entry fields in Stackdriver (https://cloud.google.com/logging/docs/agent/configuration#special_fields_in_structured_payloads). The three libraries are all Java, though, so it could easily be Java specific.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be a language-community preference. I saw it on Stackdriver Logging, zap and logrus.

https://github.com/uber-go/zap
https://github.com/sirupsen/logrus

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for shorter names and underscores

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the unusual case of conflict, users can always override the defaults.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should defer to language defaults. it would be weird to not use lower camel in most java logging formats (though not all). so we can make examples in a case format and caveat that appropriate use will occur when applied to a logging framework

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should defer to language defaults. it would be weird to not use lower camel in most java logging formats (though not all). so we can make examples in a case format and caveat that appropriate use will occur when applied to a logging framework

I think that makes sense. I initially thought that the language wouldn't matter once the logs were output by the application and that it would be better to keep the log format consistent. However, the logging framework already can have a large effect on the format of the log entries, so differences in key name format between log correlation implementations are unlikely to make log formats less consistent.

I mentioned that the case and format of the key names should be consistent with the supported logging framework.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think supporting per language name is not the best option:

  1. Usually the log backends are not per language. So the backend has to understand all of these combinations.
  2. I agree that size matters here so probably shorter is better (less data written).
  3. I understand that for Java a key like "oc_trace_id" is probably not a good fit. For the moment we have only Java and Go here. Can we get some input from other languages?

Even if we have multiple keys we need to document them here for each language.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened an issue for continuing the discussion: #195

correlation implementation may allow the user to override the tracing data key names.
1 change: 1 addition & 0 deletions trace/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,3 +10,4 @@ describes the key types and the overall behavior.
* [gRPC integration](gRPC.md): document about how to instrument gRPC framework.
* [HTTP integration](HTTP.md): document about how to instrument http frameworks.
* [Sampling logic](Sampling.md): document about how sampling works.
* [Log correlation](LogCorrelation.md): specification for a feature for inserting tracing data into log entries.