Add specification of log correlation for tracing (issue #123). #181

sebright · 2018-09-20T01:02:58Z

The specification only covers aspects of log correlation that are likely to be
shared by log correlation implementations for multiple languages and logging
frameworks. It is based on the experimental log correlation libraries in
opencensus-java:

https://github.com/census-instrumentation/opencensus-java/tree/master/contrib/log_correlation/stackdriver
https://github.com/census-instrumentation/opencensus-java/tree/master/contrib/log_correlation/log4j2

/cc @Ramonza @odeke-em @rghetia

sebright · 2018-09-20T01:07:07Z

/cc @savaki

rakyll · 2018-09-20T19:52:10Z

trace/LogCorrelation.md

+
+Log correlation is a feature that inserts information about the current span into log entries
+created by existing logging frameworks.  The feature can be used to add more context to log entries,
+filter log entries by trace ID, or show log entries as annotations on a trace.


"log entries as annotations on a trace"

This is a possibility but do we want to encourage this usage? I think we should keep it out.

"log entries as annotations on a trace"

> This is a possibility but do we want to encourage this usage?

I wanted to provide some background on log correlation to help explain why someone might want to implement it, so I listed all uses that I could think of. I didn't want to encourage any specific way of using the feature over the others. Do you think I should reword it so that it sounds less like a recommendation?

> I think we should keep it out.

Why should we avoid mentioning that specific use of log correlation?

I think the assumption that this would be used to decorate traces might be faulty. It is more likely to be used for correlated queries, especially since tracing UIs are not as good at filtering logs (which can be extremely voluminous) as logging UIs are.

So I agree we should not anchor people on the idea that this will result in log entries as span annotations: e.g., leave it out, or reword it in language that is correlational in nature rather than using the word "annotation", which is historically the name of a part of a span.

I reworded it. Do you think this is clearer, or is this still an uncommon use case?

"find log entries associated with a specific trace or span"

rakyll · 2018-09-20T19:54:25Z

trace/LogCorrelation.md

+
+## String format for tracing data
+
+The logging framework may require the pieces of tracing data to be converted to strings.  In that


Tracing data or identifiers?

Do you mean it should say "The logging framework may require the tracing data or identifiers to be converted to strings"?
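
For illustration, here is a minimal Python sketch of such a string conversion. The quoted spec text is cut off before it states the exact format, so the choices below (lowercase hexadecimal for the IDs, "true"/"false" for the sampling decision) are assumptions, and the function names are hypothetical rather than part of any OpenCensus API.

```python
def trace_id_to_string(trace_id: bytes) -> str:
    """Render a 16-byte trace ID as lowercase hex (assumed format)."""
    if len(trace_id) != 16:
        raise ValueError("trace ID must be 16 bytes")
    return trace_id.hex()


def span_id_to_string(span_id: bytes) -> str:
    """Render an 8-byte span ID as lowercase hex (assumed format)."""
    if len(span_id) != 8:
        raise ValueError("span ID must be 8 bytes")
    return span_id.hex()


def sampled_to_string(sampled: bool) -> str:
    """Render the sampling decision as "true" or "false" (assumed format)."""
    return "true" if sampled else "false"
```

A fixed-width, lowercase rendering keeps the logged value directly comparable to the ID shown by a tracing backend, which is what makes a text search by trace ID practical.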

rakyll · 2018-09-20T19:58:55Z

trace/LogCorrelation.md

+
+Some logging frameworks allow the insertion of arbitrary key-value pairs into log entries.  When
+a log correlation implementation inserts tracing data by that method, the key names should be
+"opencensusTraceId", "opencensusSpanId", and "opencensusTraceSampled" by default.  The log


These are very mouthful key names.

OTOH, underscore is often preferred over camel case in log labels. We might want to reconsider these keys.

> These are very mouthful key names.

There was some discussion about short vs long key names in census-instrumentation/opencensus-java#1371 (comment).

I thought that it would be better to use long key names by default, to avoid conflicting with other tracing libraries' key names. Users could then override the key names if they wanted to purposefully use the same key names for, say, Brave and OpenCensus.

> OTOH, underscore is often preferred over camel case in log labels. We might want to reconsider these keys.

I'm happy to change the key names if underscores are more common. Where have you seen underscores used?

I saw camel case logging context key names in Brave, Wingtips, and Spring Cloud Sleuth and camel case tracing-related log entry fields in Stackdriver (https://cloud.google.com/logging/docs/agent/configuration#special_fields_in_structured_payloads). The three libraries are all Java, though, so it could easily be Java specific.

It could be a language-community preference. I saw it on Stackdriver Logging, zap and logrus.

https://github.com/uber-go/zap
https://github.com/sirupsen/logrus

+1 for shorter names and underscores

In the unusual case of conflict, users can always override the defaults.

I think we should defer to language defaults. It would be weird not to use lower camel case in most Java logging formats (though not all). So we can write the examples in one case format and add the caveat that the appropriate format will be used when applied to a given logging framework.

> I think we should defer to language defaults. It would be weird not to use lower camel case in most Java logging formats (though not all). So we can write the examples in one case format and add the caveat that the appropriate format will be used when applied to a given logging framework.

I think that makes sense. I initially thought that the language wouldn't matter once the logs were output by the application and that it would be better to keep the log format consistent. However, the logging framework already can have a large effect on the format of the log entries, so differences in key name format between log correlation implementations are unlikely to make log formats less consistent.

I mentioned that the case and format of the key names should be consistent with the supported logging framework.

I think supporting per-language key names is not the best option:

Usually the log backends are not per language. So the backend has to understand all of these combinations.

I agree that size matters here so probably shorter is better (less data written).

I understand that for Java a key like "oc_trace_id" is probably not a good fit. For the moment we have only Java and Go here. Can we get some input from other languages?

Even if we have multiple keys we need to document them here for each language.

I opened an issue for continuing the discussion: #195
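
To make the key-name discussion above concrete, here is a minimal Python sketch of inserting tracing data as key-value pairs with overridable key names. It uses only the standard `logging` and `contextvars` modules; `SpanContext`, `current_span`, `TraceContextFilter`, and `build_logger` are hypothetical stand-ins and not part of any OpenCensus library. The default key names are the ones proposed in the quoted draft, and logging empty strings when there is no current span is an assumption of this sketch.

```python
import contextvars
import logging
from dataclasses import dataclass


@dataclass(frozen=True)
class SpanContext:
    """Hypothetical stand-in for a tracing library's span context."""
    trace_id: str
    span_id: str
    sampled: bool


# Hypothetical holder for "the current span"; a real tracer provides its own.
current_span = contextvars.ContextVar("current_span", default=None)


class TraceContextFilter(logging.Filter):
    """Attaches trace ID, span ID, and sampling decision to each log record."""

    def __init__(self,
                 trace_id_key="opencensusTraceId",
                 span_id_key="opencensusSpanId",
                 sampled_key="opencensusTraceSampled"):
        super().__init__()
        self.trace_id_key = trace_id_key
        self.span_id_key = span_id_key
        self.sampled_key = sampled_key

    def filter(self, record):
        span = current_span.get()
        # Assumption: with no current span, log empty IDs and "false".
        setattr(record, self.trace_id_key, span.trace_id if span else "")
        setattr(record, self.span_id_key, span.span_id if span else "")
        setattr(record, self.sampled_key,
                "true" if span and span.sampled else "false")
        return True  # decorate the record, never drop it


def build_logger(name, stream, **key_overrides):
    """Create a logger whose output format includes the tracing keys."""
    trace_filter = TraceContextFilter(**key_overrides)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(
        f"%({trace_filter.trace_id_key})s %({trace_filter.span_id_key})s "
        f"%({trace_filter.sampled_key})s %(message)s"))
    handler.addFilter(trace_filter)
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

Calling `build_logger("app", sys.stderr)` uses the default keys, while `build_logger("app", sys.stderr, trace_id_key="trace_id", span_id_key="span_id", sampled_key="sampled")` overrides them, for example to match keys already used by another tracing library or a logging backend.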

semistrict · 2018-09-20T21:56:21Z

trace/LogCorrelation.md

@@ -0,0 +1,63 @@
+# Log Correlation
+
+Log correlation is a feature that inserts information about the current span into log entries


I think we should be more specific than "current span". I would say "the trace span in the context when the logged event occurred".

I agree that "current span" needs more explanation. I added a new section and linked to the existing section on the current span.

+1 for @Ramonza.

semistrict · 2018-09-20T21:58:40Z

trace/LogCorrelation.md

+
+Some logging frameworks allow the insertion of arbitrary key-value pairs into log entries.  When
+a log correlation implementation inserts tracing data by that method, the key names should be
+"opencensusTraceId", "opencensusSpanId", and "opencensusTraceSampled" by default.  The log


+1 for shorter names and underscores

semistrict · 2018-09-20T22:01:07Z

trace/LogCorrelation.md

+
+Some logging frameworks allow the insertion of arbitrary key-value pairs into log entries.  When
+a log correlation implementation inserts tracing data by that method, the key names should be
+"opencensusTraceId", "opencensusSpanId", and "opencensusTraceSampled" by default.  The log


In the unusual case of conflict, users can always override the defaults.

codefromthecrypt

Glad to see work here. This is probably a generic comment that applies to all of our specs: let's not overstep and try to create conventions which won't apply to frameworks. We won't be adding value by dictating variable naming conventions. It will be perceived as arbitrary and unnecessary by others, so let's try to focus on dictating only the things where we add value.

codefromthecrypt · 2018-09-23T14:35:22Z

trace/LogCorrelation.md

+## Tracing data to include in log entries
+
+A log correlation implementation should insert the following pieces of tracing data from the current
+span context into each log entry:


Suggest "make available in the logging entry" instead of "insert": e.g., in some systems it might not literally be a copy, but rather sharing via the context.

codefromthecrypt · 2018-09-23T14:36:56Z

trace/LogCorrelation.md

+
+The trace ID of the current span.  See [Span.md#traceid](Span.md#traceid).
+
+### Span ID


Note that it is common for some tooling to attempt to reverse engineer spans from log entries. When this is the case, the parent ID is also put in the logging context.

I thought that the parent span ID wasn't accessible through the API, except after export as SpanData.

Maybe add a TODO to consider this. Until we have a clear request, it is not worth doing.

codefromthecrypt · 2018-09-23T14:39:42Z

trace/LogCorrelation.md

+
+Some logging frameworks allow the insertion of arbitrary key-value pairs into log entries.  When
+a log correlation implementation inserts tracing data by that method, the key names should be
+"opencensusTraceId", "opencensusSpanId", and "opencensusTraceSampled" by default.  The log


I think we should defer to language defaults. It would be weird not to use lower camel case in most Java logging formats (though not all). So we can write the examples in one case format and add the caveat that the appropriate format will be used when applied to a given logging framework.

codefromthecrypt · 2018-09-23T14:40:40Z

trace/LogCorrelation.md

+"opencensusTraceId", "opencensusSpanId", and "opencensusTraceSampled" by default.  The log
+correlation implementation may allow the user to override the tracing data key names.
+
+## Deciding when to add tracing data to a log entry


I think this feature is a bit speculative. I would leave it out, as I have only seen this practice here. Usually it isn't selective.

I removed it for now.

SergeyKanzhelev · 2018-09-24T17:23:29Z

trace/LogCorrelation.md

+
+The span ID of the current span.  See [Span.md#spanid](Span.md#spanid).
+
+### Sampling Decision


Let's add a note on storing samplingScore in the logs. It may be useful if one collects logs only from sampled traces and needs to estimate the actual count, assuming probability sampling.

@SergeyKanzhelev do you want to write a one-pager explaining the samplingScore: how it can be used, why it is important to propagate it, etc.?

For the moment I would not mention it here until we define it. We can leave a TODO saying that once we decide on the sampling score, we should consider adding it here.

I added a TODO.
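
As a rough illustration of the count estimation mentioned above, here is a minimal Python sketch. It assumes each logged entry carries the probability with which its trace was sampled; that is only one possible reading, since samplingScore is not yet defined in this thread, and `estimate_total` is a hypothetical name.

```python
def estimate_total(sampling_probabilities):
    """Estimate how many events occurred, given only the sampled ones.

    Each sampled entry recorded with probability p stands in for roughly
    1/p real events, so the estimate is the sum of 1/p over all entries.
    """
    total = 0.0
    for p in sampling_probabilities:
        if not 0.0 < p <= 1.0:
            raise ValueError("sampling probability must be in (0, 1]")
        total += 1.0 / p
    return total
```

For example, three entries sampled at 1% and two sampled at 50% give an estimate of about 3 × 100 + 2 × 2 = 304 underlying events.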

SergeyKanzhelev · 2018-09-24T17:28:28Z

trace/LogCorrelation.md

+The sampling bit of the current span, as a boolean.  See
+[Span.md#supported-bits](Span.md#supported-bits).
+
+## String format for tracing data


I suggest adding a section for "Other fields from tracestate" or something like this, so vendor-specific key/value pairs can be added to the logs.

I think it is worth mentioning the Tracestate here. Probably every integration should offer a callback mechanism to extract and log data from Tracestate.

How about I add a TODO and open an issue for this part? I think it will require more discussion, and this PR already has a lot of comments.

I added a TODO for this feature.

I opened #196.

sebright · 2018-10-08T23:51:41Z

I reverted the changes to the key names for now, and I'm planning to open an issue for deciding on the key names separately.

bogdandrutu

We have a few issues still to discuss, but @sebright is no longer in charge of these. Please add a comment at the top of the doc saying that it is still in progress (draft) and that we have to address issues X, Y, Z.

sebright · 2018-10-09T03:19:50Z

I labeled the specification as a draft and linked to the issue about deciding on key names. I'll merge it now.

sebright · 2018-10-09T03:46:25Z

I think I addressed all of the comments by updating the PR, adding TODOs, or opening issues, but please let me know if I missed anything. Thanks for the reviews!

sebright added the trace label Sep 20, 2018

sebright assigned codefromthecrypt, bogdandrutu and g-easy Sep 20, 2018

sebright requested review from codefromthecrypt, bogdandrutu and g-easy September 20, 2018 01:02

g-easy approved these changes Sep 20, 2018

rakyll reviewed Sep 20, 2018

sebright assigned rakyll Sep 20, 2018

semistrict reviewed Sep 20, 2018

sebright assigned sebright and unassigned g-easy Sep 21, 2018

codefromthecrypt reviewed Sep 23, 2018

SergeyKanzhelev reviewed Sep 24, 2018

SergeyKanzhelev suggested changes Sep 24, 2018

sebright force-pushed the log-correlation-spec branch 2 times, most recently from e3a77e2 to 8b8146b Compare September 28, 2018 18:11

sebright assigned bogdandrutu, semistrict and SergeyKanzhelev and unassigned bogdandrutu and sebright Sep 28, 2018

sebright added 6 commits October 8, 2018 16:39

Link to log correlation spec from main tracing page.

5b42bdf

Add a section on identifying the current span.

96d6c2a

Remove section on "span selection".

56fb4e0

Reword section on modifying log entries.

3e11867

Reword part about showing log entries associated with a trace.

2780b69

sebright added 4 commits October 8, 2018 16:39

Allow any case or format for context key names.

328e0f5

Add TODO for adding parent span ID.

6263fce

Add TODO for "samplingScore".

f03fe83

Add TODO for adding fields from the Tracestate.

a55edf7

sebright force-pushed the log-correlation-spec branch from 6cab6de to a55edf7 Compare October 8, 2018 23:40

Revert "Allow any case or format for context key names."

29f8253

This reverts commit 328e0f5.

sebright mentioned this pull request Oct 8, 2018

Decide on the specification for key names for log correlation #195

Closed

bogdandrutu approved these changes Oct 9, 2018

Label the specification as a draft, since issue census-instrumentatio…

bb44f13

…n#195 needs to be addressed.

sebright merged commit 598cbb8 into census-instrumentation:master Oct 9, 2018

sebright deleted the log-correlation-spec branch October 9, 2018 03:21

sebright mentioned this pull request Oct 9, 2018

Add support for Tracestate fields to the log correlation spec #196

Open

sebright mentioned this pull request Oct 9, 2018

Add support for correlating traces and logs. #123

Closed

sebright mentioned this pull request Nov 13, 2018

Provide a way to override the traceId/spanId/sampled keys in opencensus-contrib-log-correlation-log4j census-instrumentation/opencensus-java#1389

Open

