-
Notifications
You must be signed in to change notification settings - Fork 50
Add specification of log correlation for tracing (issue #123). #181
Changes from all commits
617f706
5b42bdf
96d6c2a
56fb4e0
3e11867
2780b69
328e0f5
6263fce
f03fe83
a55edf7
29f8253
bb44f13
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,58 @@ | ||
# Log Correlation (draft) | ||
|
||
This specification is not done until the key names below are finalized (issue | ||
[#195](https://github.com/census-instrumentation/opencensus-specs/issues/195)). | ||
|
||
Log correlation is a feature that inserts information about the current span into log entries | ||
created by existing logging frameworks. The feature can be used to add more context to log entries, | ||
filter log entries by trace ID, or find log entries associated with a specific trace or span. | ||
|
||
The design of a log correlation implementation depends heavily on the details of the particular | ||
logging framework that it supports. Therefore, this document only covers the aspects of log | ||
correlation that could be shared across log correlation implementations for multiple languages and | ||
logging frameworks. It doesn't cover how to hook into the logging framework. | ||
|
||
## Identifying the span to associate with a log entry | ||
|
||
A log correlation implementation should look up tracing data from the span that is current at the | ||
point of the log statement. See | ||
[Span.md#how-span-interacts-with-context](Span.md#how-span-interacts-with-context) for the | ||
definition of the current span. | ||
|
||
## Tracing data to include in log entries | ||
|
||
A log correlation implementation should make the following pieces of tracing data from the current | ||
span context available in each log entry: | ||
|
||
### Trace ID | ||
|
||
The trace ID of the current span. See [Span.md#traceid](Span.md#traceid). | ||
|
||
### Span ID | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. note it is common for some tooling to attempt to reverse engineer spans from log entries. when this is the case the parent id is also put in the logging context There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I thought that the parent span ID wasn't accessible through the API, except after export as SpanData. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe add a todo to consider this. Until we have a clear request it is not worth doing it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||
|
||
The span ID of the current span. See [Span.md#spanid](Span.md#spanid). | ||
|
||
### Sampling Decision | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let's put a note on There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @SergeyKanzhelev do you want to write a one pager explaining the samplingScore, how can be used, why is important to be propagated, etc. For the moment I would not mention it here until we define it. We can leave a TODO to say when decide on sampling score then consider to add it here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I added a TODO. |
||
|
||
The sampling bit of the current span, as a boolean. See | ||
[Span.md#supported-bits](Span.md#supported-bits). | ||
|
||
TODO(sebright): Include "samplingScore" once that field is added to the SpanContext. | ||
|
||
TODO(sebright): Add a section on fields from the Tracestate. Users should be able to add | ||
vendor-specific fields from the Tracestate to logs, using a callback mechanism. | ||
|
||
TODO(sebright): Consider adding parent span ID, to allow recreating the trace structure from logs. | ||
|
||
## String format for tracing data | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I suggest to add a section for "Other fields from tracestate" or something like this. So vendors-specific key/value pairs can be added to the logs. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it worth mentioning the Tracestate here. Probably every integration should offer a callback mechanism to extract and log data from Tracestate. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How about I add a TODO and open an issue for this part? I think it will require more discussion, and this PR already has a lot of comments. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I added a TODO for this feature. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I opened #196. |
||
|
||
The logging framework may require the pieces of tracing data to be converted to strings. In that | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Tracing data or identifiers? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do you mean it should say "The logging framework may require the tracing data or identifiers to be converted to strings"? |
||
case, the log correlation implementation should format the trace ID and span ID as lowercase base 16 | ||
and format the sampling decision as "true" or "false". | ||
|
||
## Key names for tracing data | ||
|
||
Some logging frameworks allow the insertion of arbitrary key-value pairs into log entries. When | ||
a log correlation implementation inserts tracing data by that method, the key names should be | ||
"opencensusTraceId", "opencensusSpanId", and "opencensusTraceSampled" by default. The log | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. These are very mouthful key names. OTOH, underscore is often preferred over camel case in log labels. We might want to reconsider these keys. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was some discussion about short vs long key names in census-instrumentation/opencensus-java#1371 (comment). I thought that it would be better to use long key names by default, to avoid conflicting with other tracing libraries' key names. Users could then override the key names if they wanted to purposefully use the same key names for, say, Brave and OpenCensus.
I'm happy to change the key names if underscores are more common. Where have you seen underscores used? I saw camel case logging context key names in Brave, Wingtips, and Spring Cloud Sleuth and camel case tracing-related log entry fields in Stackdriver (https://cloud.google.com/logging/docs/agent/configuration#special_fields_in_structured_payloads). The three libraries are all Java, though, so it could easily be Java specific. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It could be a language-community preference. I saw it on Stackdriver Logging, zap and logrus. https://github.com/uber-go/zap There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. +1 for shorter names and underscores There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In the unusual case of conflict, users can always override the defaults. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we should defer to language defaults. it would be weird to not use lower camel in most java logging formats (though not all). so we can make examples in a case format and caveat that appropriate use will occur when applied to a logging framework There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I think that makes sense. I initially thought that the language wouldn't matter once the logs were output by the application and that it would be better to keep the log format consistent. However, the logging framework already can have a large effect on the format of the log entries, so differences in key name format between log correlation implementations are unlikely to make log formats less consistent. I mentioned that the case and format of the key names should be consistent with the supported logging framework. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think supporting per language name is not the best option:
Even if we have multiple keys we need to document them here for each language. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I opened an issue for continuing the discussion: #195 |
||
correlation implementation may allow the user to override the tracing data key names. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should be more specific than "current span". I would say "the trace span in the context when the logged event occurred".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that "current span" needs more explanation. I added a new section and linked to the existing section on the current span.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for @Ramonza.