Skip to content
This repository has been archived by the owner on Oct 3, 2023. It is now read-only.

Add specification of log correlation for tracing (issue #123). #181

Merged
merged 12 commits into from
Oct 9, 2018

Conversation

sebright
Copy link
Contributor

The specification only covers aspects of log correlation that are likely to be
shared by log correlation implementations for multiple languages and logging
frameworks. It is based on the experimental log correlation libraries in
opencensus-java:

https://github.com/census-instrumentation/opencensus-java/tree/master/contrib/log_correlation/stackdriver
https://github.com/census-instrumentation/opencensus-java/tree/master/contrib/log_correlation/log4j2

/cc @Ramonza @odeke-em @rghetia

@sebright
Copy link
Contributor Author

/cc @savaki


Log correlation is a feature that inserts information about the current span into log entries
created by existing logging frameworks. The feature can be used to add more context to log entries,
filter log entries by trace ID, or show log entries as annotations on a trace.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"log entries as annotations on a trace"

This is a possibility but do we want to encourage this usage? I think we should keep it out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"log entries as annotations on a trace"

This is a possibility but do we want to encourage this usage?

I wanted to provide some background on log correlation to help explain why someone might want to implement it, so I listed all uses that I could think of. I didn't want to encourage any specific way of using the feature over the others. Do you think I should reword it so that it sounds less like a recommendation?

I think we should keep it out.

Why should we avoid mentioning that specific use of log correlation?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the assumption that this would be used to decorate traces might be faulty. it would be more likely to have correlated queries especially as tracing uis are not as good at filtering logs (which can be extremely volumnous) than logging uis.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I agree we should not anchor people with the idea that this will result in log entries as span annotations. eg leave it out or rename to language that is correlation in nature vs the word annotation which is historically the name of a part of a span

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reworded it. Do you think this is clearer, or is this still an uncommon use case?

"find log entries associated with a specific trace or span"


## String format for tracing data

The logging framework may require the pieces of tracing data to be converted to strings. In that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tracing data or identifiers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean it should say "The logging framework may require the tracing data or identifiers to be converted to strings"?


Some logging frameworks allow the insertion of arbitrary key-value pairs into log entries. When
a log correlation implementation inserts tracing data by that method, the key names should be
"opencensusTraceId", "opencensusSpanId", and "opencensusTraceSampled" by default. The log
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are very mouthful key names.

OTOH, underscore is often preferred over camel case in log labels. We might want to reconsider these keys.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are very mouthful key names.

There was some discussion about short vs long key names in census-instrumentation/opencensus-java#1371 (comment).

I thought that it would be better to use long key names by default, to avoid conflicting with other tracing libraries' key names. Users could then override the key names if they wanted to purposefully use the same key names for, say, Brave and OpenCensus.

OTOH, underscore is often preferred over camel case in log labels. We might want to reconsider these keys.

I'm happy to change the key names if underscores are more common. Where have you seen underscores used?

I saw camel case logging context key names in Brave, Wingtips, and Spring Cloud Sleuth and camel case tracing-related log entry fields in Stackdriver (https://cloud.google.com/logging/docs/agent/configuration#special_fields_in_structured_payloads). The three libraries are all Java, though, so it could easily be Java specific.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be a language-community preference. I saw it on Stackdriver Logging, zap and logrus.

https://github.com/uber-go/zap
https://github.com/sirupsen/logrus

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for shorter names and underscores

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the unusual case of conflict, users can always override the defaults.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should defer to language defaults. it would be weird to not use lower camel in most java logging formats (though not all). so we can make examples in a case format and caveat that appropriate use will occur when applied to a logging framework

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should defer to language defaults. it would be weird to not use lower camel in most java logging formats (though not all). so we can make examples in a case format and caveat that appropriate use will occur when applied to a logging framework

I think that makes sense. I initially thought that the language wouldn't matter once the logs were output by the application and that it would be better to keep the log format consistent. However, the logging framework already can have a large effect on the format of the log entries, so differences in key name format between log correlation implementations are unlikely to make log formats less consistent.

I mentioned that the case and format of the key names should be consistent with the supported logging framework.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think supporting per language name is not the best option:

  1. Usually the log backends are not per language. So the backend has to understand all of these combinations.
  2. I agree that size matters here so probably shorter is better (less data written).
  3. I understand that for Java a key like "oc_trace_id" is probably not a good fit. For the moment we have only Java and Go here. Can we get some input from other languages?

Even if we have multiple keys we need to document them here for each language.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened an issue for continuing the discussion: #195

@@ -0,0 +1,63 @@
# Log Correlation

Log correlation is a feature that inserts information about the current span into log entries
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be more specific than "current span". I would say "the trace span in the context when the logged event occurred".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that "current span" needs more explanation. I added a new section and linked to the existing section on the current span.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for @Ramonza.


Some logging frameworks allow the insertion of arbitrary key-value pairs into log entries. When
a log correlation implementation inserts tracing data by that method, the key names should be
"opencensusTraceId", "opencensusSpanId", and "opencensusTraceSampled" by default. The log
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for shorter names and underscores


Some logging frameworks allow the insertion of arbitrary key-value pairs into log entries. When
a log correlation implementation inserts tracing data by that method, the key names should be
"opencensusTraceId", "opencensusSpanId", and "opencensusTraceSampled" by default. The log
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the unusual case of conflict, users can always override the defaults.

@sebright sebright assigned sebright and unassigned g-easy Sep 21, 2018
Copy link

@codefromthecrypt codefromthecrypt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

glad to see work here. probably a generic comment to apply to all of our specs. let's not overstep and try to create conventions which wont apply to frameworks. we wont be adding value to dictate variable naming conventions. It will be perceived as arbitrary and unnecessary to others, so lets try to focus on dictating only things that we add value doing.

## Tracing data to include in log entries

A log correlation implementation should insert the following pieces of tracing data from the current
span context into each log entry:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make available in the logging entry. eg in some systems it might not literally be a copy rather sharing in context.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


The trace ID of the current span. See [Span.md#traceid](Span.md#traceid).

### Span ID

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note it is common for some tooling to attempt to reverse engineer spans from log entries. when this is the case the parent id is also put in the logging context

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that the parent span ID wasn't accessible through the API, except after export as SpanData.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a todo to consider this. Until we have a clear request it is not worth doing it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


Some logging frameworks allow the insertion of arbitrary key-value pairs into log entries. When
a log correlation implementation inserts tracing data by that method, the key names should be
"opencensusTraceId", "opencensusSpanId", and "opencensusTraceSampled" by default. The log

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should defer to language defaults. it would be weird to not use lower camel in most java logging formats (though not all). so we can make examples in a case format and caveat that appropriate use will occur when applied to a logging framework

"opencensusTraceId", "opencensusSpanId", and "opencensusTraceSampled" by default. The log
correlation implementation may allow the user to override the tracing data key names.

## Deciding when to add tracing data to a log entry

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this feature is a bit speculative. I would leave it out as I have only seen this practice here. usually it isnt selective.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed it for now.


The span ID of the current span. See [Span.md#spanid](Span.md#spanid).

### Sampling Decision
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's put a note on samplingScore storing into the logs. It may be useful if one collects logs only from sampled traces and needs to estimate the actual count assuming probability sampling.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SergeyKanzhelev do you want to write a one pager explaining the samplingScore, how can be used, why is important to be propagated, etc.

For the moment I would not mention it here until we define it. We can leave a TODO to say when decide on sampling score then consider to add it here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a TODO.

The sampling bit of the current span, as a boolean. See
[Span.md#supported-bits](Span.md#supported-bits).

## String format for tracing data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to add a section for "Other fields from tracestate" or something like this. So vendors-specific key/value pairs can be added to the logs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it worth mentioning the Tracestate here. Probably every integration should offer a callback mechanism to extract and log data from Tracestate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about I add a TODO and open an issue for this part? I think it will require more discussion, and this PR already has a lot of comments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a TODO for this feature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened #196.

@sebright sebright force-pushed the log-correlation-spec branch 2 times, most recently from e3a77e2 to 8b8146b Compare September 28, 2018 18:11
@sebright
Copy link
Contributor Author

sebright commented Oct 8, 2018

I reverted the changes to the key names for now, and I'm planning to open an issue for deciding on the key names separately.

Copy link
Contributor

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have few issues to continue to discuss, but @sebright is no longer in charge for these. Please add a comment at the top of the doc that it is still in progress (draft) and we have to address issues X,Y,Z.

@sebright
Copy link
Contributor Author

sebright commented Oct 9, 2018

I labeled the specification as a draft and linked to the issue about deciding on key names. I'll merge it now.

@sebright sebright merged commit 598cbb8 into census-instrumentation:master Oct 9, 2018
@sebright sebright deleted the log-correlation-spec branch October 9, 2018 03:21
@sebright
Copy link
Contributor Author

sebright commented Oct 9, 2018

I think I addressed all of the comments by updating the PR, adding TODOs, or opening issues, but please let me know if I missed anything. Thanks for the reviews!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants