Skip to content

common: introduce stable logger /w examples in DNS #17772

Merged
snowp merged 9 commits intoenvoyproxy:mainfrom
snowp:stable-log
Aug 24, 2021
Merged

common: introduce stable logger /w examples in DNS #17772
snowp merged 9 commits intoenvoyproxy:mainfrom
snowp:stable-log

Conversation

@snowp
Copy link
Contributor

@snowp snowp commented Aug 19, 2021

Adds a new logging macro that allows for logging with a stable identifier. This is intended to allow
a log sink to be able to key off the identifier, simplifying the process of having a log sink react to specific
log messages (e.g. to emit specific events when certain log lines are hit).

The main alternative to this is using some kind of log parsing to understand when a certain log is hit, which ends up being brittle as code is refactored. Using a stable identifier we can ensure via code review that the stable id remains as log lines are moved around, refactored, etc.

This PR includes applying the new log macro to the c-ares DNS resolver to demonstrate it's usage.

Signed-off-by: Snow Pettersen snowp@lyft.com

Risk Level: Medium
Testing: Manual testing, will add coverage
Docs Changes: n/a
Release Notes: n/a (will add)
Platform Specific Features: n/a

Signed-off-by: Snow Pettersen <snowp@lyft.com>
@repokitteh-read-only
Copy link

As a reminder, PRs marked as draft will not be automatically assigned reviewers,
or be handled by maintainer-oncall triage.

Please mark your PR as ready when you want it to be reviewed!

🐱

Caused by: #17772 was opened by snowp.

see: more, trace.

@snowp
Copy link
Contributor Author

snowp commented Aug 19, 2021

Note that this does not support log levels as it bypasses spdlog. spdlog does not support any form of structured logging, so going through spdlog would mean either 1) extracting out the stable name from the formatted log string (not much better than having a log sink itself try to parse out information) or 2) by having different spdlog loggers set up for each call site (which seemed very heavy handed).

It is also always on as written, even when no log sink is consuming this data. The consequence of this is mostly that the fmt::format is evaluated. I imagine we want to avoid this, but figured I'd put up the PR first before investing the time into figuring out a good solution there.

@snowp snowp mentioned this pull request Aug 19, 2021
@mattklein123 mattklein123 self-assigned this Aug 19, 2021
fix
Signed-off-by: Snow Pettersen <snowp@lyft.com>
@snowp
Copy link
Contributor Author

snowp commented Aug 19, 2021

An alternative API might be something like

ENVOY_LOG_EVENT(name, level, msg) which delegates to ENVOY_LOG(level, msg) while also emitting an ENVOY_STABLE_LOG(name, msg)

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks this looks like it's on the right track. Flushing out some initial comments.

/wait

virtual ~SinkDelegate();

virtual void log(absl::string_view msg) PURE;
virtual void logStable(absl::string_view, absl::string_view) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would probably call this something like logWithStableId or something like that.
Also, I think I would pass the level and component to this function in addition to the stable ID (also can you add parameter names)

*/
#define ENVOY_LOG(LEVEL, ...) ENVOY_LOG_TO_LOGGER(ENVOY_LOGGER(), LEVEL, ##__VA_ARGS__)

#define ENVOY_LOG_STABLE(STABLE_NAME, ...) \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per above I still think we want component, level, etc. Also I think for these logs we want them forked to both functions with the stable ID removed for the normal one potentially?

Snow Pettersen added 2 commits August 20, 2021 15:03
Signed-off-by: Snow Pettersen <snowp@lyft.com>
Signed-off-by: Snow Pettersen <snowp@lyft.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks this looks right to me. @envoyproxy/maintainers thoughts on this? This feels like something we can ship and iterate on.

/wait


ActiveDnsQuery* DnsResolverImpl::resolve(const std::string& dns_name,
DnsLookupFamily dns_lookup_family, ResolveCb callback) {
ENVOY_LOG_EVENT(debug, "cares_dns_resolution_started", "dns resolution for {} started", dns_name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inconsistent use of "cares" in this event name vs the other names.

@ggreenway
Copy link
Member

What's an example output from one of these logs? I'm not sure I 100% understand the goal, but it seems like not having log-levels will quickly become troublesome.

@snowp
Copy link
Contributor Author

snowp commented Aug 23, 2021

@ggreenway As written there is no new sink handling built in for these logs - the logs are currently sent both to ENVOY_LOG(level, ...) as well as the new logWithStableName function on the sink (which is a noop by default).

What we're trying to do in EM is allow hooking up a custom sink that can intercept these log events and emit analytic events, allowing us to incorporate Envoy events into debugging. This would allow us to correlate sequences of events where we see requests fail with Envoy events, e.g. failing connection setup, DNS timing out etc.

This could also be built up into a more sophisticated event tracing library where we can associated events with each other, so that more sophisticated tooling can be built to analyze or render entire stream lifetimes, but for now this simple "log the event with a stable name" would be a good start for the EM use case.

@ggreenway
Copy link
Member

Got it; that makes sense. SGTM, but please put a big doc comment above the new top-level macro so people know what it is for and when to use it.

Snow Pettersen added 3 commits August 23, 2021 16:50
Signed-off-by: Snow Pettersen <snowp@lyft.com>
Signed-off-by: Snow Pettersen <snowp@lyft.com>
Signed-off-by: Snow Pettersen <snowp@lyft.com>
@snowp snowp marked this pull request as ready for review August 23, 2021 18:18
@snowp
Copy link
Contributor Author

snowp commented Aug 23, 2021

I updated this to check the log level for these new events as well, which ended up being pretty simple since can just check the log level of the associated component.

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks

Copy link
Member

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Snow Pettersen added 2 commits August 23, 2021 23:51
Signed-off-by: Snow Pettersen <snowp@lyft.com>
Signed-off-by: Snow Pettersen <snowp@lyft.com>
@snowp snowp merged commit df9cee3 into envoyproxy:main Aug 24, 2021
mpuncel added a commit to mpuncel/envoy that referenced this pull request Aug 25, 2021
* main: (32 commits)
  Stop processing pending H/2 frames if connection transitioned to the closed state
  http2: limit use of deferred resets in the http2 codec to server-side connections
  Abort filter chain iteration on local reply
  Reject or strip fragment from request URI
  ext-authz: merge duplicate headers from client request in check request
  common: introduce stable logger /w examples in DNS  (envoyproxy#17772)
  route: fast return when route matches failed (envoyproxy#17769)
  owners: add owners for dubbo proxy network filter (envoyproxy#17820)
  config/router/tcp_proxy/options: v2 API, boosting and --bootstrap-version CLI removal. (envoyproxy#17724)
  coverage: revert the limit http/cache to 92.6. (envoyproxy#17817)
  network: rename SocketAddressProvider as ConnectionInfoProvider (envoyproxy#17717)
  test: bumping coverage (envoyproxy#17757)
  conn_pool: Minor cleanups to ConnPoolBaseImpl (envoyproxy#17710)
  Split VaryHeader into VaryAllowList and VaryUtils to organize vary-related logic (envoyproxy#17728)
  ext_proc: Make tests more resilient to IPv6 support (envoyproxy#17784)
  Remove invlaid backquote from doc (envoyproxy#17797)
  rocketmq: move to contrib (envoyproxy#17796)
  kafka: upstream kafka facade in mesh-filter (envoyproxy#17783)
  ecds: create shared base class for DynamicFilterConfigProviderImpl (envoyproxy#17735)
  Change log level from debug to trace (envoyproxy#17774)
  ...

Signed-off-by: Michael Puncel <mpuncel@squareup.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants