Add hooks for distributed tracing #201

Flarna · 2019-11-25T14:33:46Z

Not sure if it is the correct time to start this topic as there seem to be quite a lot other todos open which are more important. But I think it can't hurt to start a discussion early regarding this but feel free to defer this or move it to a better place.

Is your feature request related to a problem? Please describe.
Support for distributed tracing (e.g. OpenTelemetry, APMs) requires monkey patching at this time. I as thinking about adding hooks in core intended for tools like OpenTelemetry and APMs to get data they need and to inject required data.

Describe the solution you'd like
Monitoring tools like OpenTelemetry, APMs,... use monkey patching since a long time to hook into various libraries to gather the data they need and to inject their tags to link transactions of distributed applications.
I think core could include hooks to provide the data needed by such tools to avoid (or at least reduce) the need of monkey patching.
I think having hooks on HTTP/3 is most likely more important as hooks on Quic.

A fast guess of which hooks would be helpful:

A static hook to notify about creation of new sockets
life cycle events on sessions (including topology information like remote address/port/udp4/6,...)
hooks to record and modify outgoing headers (e.g. to inject a unique tag identifying the transaction)
hooks to record incoming headers, maybe even modify them to hide tags it from application
a more advance feature to inject e.g. a JS tag would require to intercept/monitor the data stream returned by a server. For sure such a hook should be installed per stream only (if supported at all)

It's also important to allow these tools to associate their context to the async transactions involved (from high level HTTP/3 protocol point of view). As a http request is mapped to a stream this will most likely work out of the box by using async hooks.

Describe alternatives you've considered
Keep using monkey patching. Ensure that exposed public API allows to capture the needed information if monkey patched.

addaleax · 2019-11-26T13:56:31Z

I’m a fan of having hooks for these kinds of things … is there any chance you could put together a proposal for how an API for this should look like?

jasnell · 2019-11-26T14:17:53Z

I'm also happy to have these built in. A couple comments tho...

I'm not sure the "static hook to notify about creation of new sockets" makes sense in this case. The "socket" in QUIC is just a local UDP binding that user code has to create -- therefore, user code is presumably already aware when they are created.
The lifecycle events on sessions should be fairly straight forward as we already have most of that information available in the existing ready and secure events. Those can evolve a bit and we can make sure they are providing enough information.
The hooks to record and modify outgoing headers is interesting for sure but would also have a performance impact. What I'm thinking is an event that can be triggered just immediately before the outgoing headers are serialized for internal handling that allows them to be augmented... e.g.

// headers is the actual block of headers, type is an indication of
// whether they are informational, initial, or trailing headers
stream.on('outgoingHeaders', (headers, type) => {
  headers['NEW_HEADER_TO_ADD'] = 'trackingID'
});

Keep in mind, however, that with QUIC, header support is only enabled per Application protocol. Not all QUIC protocols will support headers so this will be best effort at best. It would always work when using HTTP/3, however.

This approach is something we can easily backport to the http/2 and http/1 implementations as well, although with http/1 it would be a bit more difficult to support consistently due to how headers are handled there.

I'm not sure about the need for an inbound header hook in the low level API, but if it is needed, it can be handled as stream.on('inboundHeaders', (headers, type) => {}).

I really don't think we should provide any special API or hook for intercepting the data flow. That's precisely why we have Transforms in the streams API already.

Flarna · 2019-11-26T15:28:56Z

Thanks for your answers. Sure, I can draft a more concrete API (but will take a while...)

Regarding static hook and use of transform in stream. I agree that these are not needed if monitoring would be implemented as part of the application. But the typical use case is that you have an existing application and an additional, independent module from some other vendor is preloaded to add monitoring (e.g. via -r command line argument). This monitoring module needs to get the information about new sockets to install the followup listeners it uses to record transactions and to transform the data.

And sure, monitoring like this has an overhead. The overhead should be mostly the same if hooks are used compared to monkey patching.

Flarna · 2019-12-06T13:39:41Z

The QUIC API looks quite similar as the HTTP/2 API but has still some ToDos inside therefore I think it's a good starting point to describe how we capture data from HTTP/2.

Server:

monkey patch createSecureServer and createSecureServer to get an Http2Server/Http2SecureServer (as they are not accessible via public API).
wrap "stream" listeners on Http2Server/Http2SecureServer
monkey patch ServerHttp2Stream.prototype.respond(), .respondWithFile() and .respondWithFD() once we see a ServerHttp2Stream.
install "error"/"close" listeners on each stream
for POST and content-type application/x-www-form-urlencoded we wrap/path incoming data stream

Captured data:

a transaction is started once a "stream" event is emitted.
capture incoming headers, local bound socket info, remote socket info
its vital to get incoming headers at the beginning to allow reasonable smart sampling decisions/filtering
via monkey patched respondXXX() APIs we capture response headers
if applicable capture request parameters from x-www-form-urlencoded body
close/error events are used to end transaction and capture error.stack if applicable

Advance features (not yet in use for HTTP/2 but in HTTP):

modify response headers
modify response body

I think we could get the same functionality with following hooks:

incoming stream is created. Arguments: the stream object, headers, async id/resource
response is started. Arguments: the stream object, response headers
stream is closed. Arguments: the stream object, reason/error

Client:

monkey patch http2.connect to get access to ClientHttp2Session
monkey patch ClientHttp2Session.request to get ClientHttp2Stream
wrap/install "response", "close" and "error" events on ClientHttp2Stream

Captured data:

transaction is started with ClientHttp2Session.request
outgoing headers (tricky part are protocol and authority as they are stored private in session)
outgoing headers are also modified to add a tag for transaction linking
response headers in "response" listener
close/error events are used to end transaction and capture error.stack if applicable

I think we could get the same functionality with following hooks:

client stream is created. Arguments: the stream object, headers, async id/resource
response is received. Arguments: the stream object, response headers
stream is closed. Arguments: the stream object, reason/error

Context propagation
There are also some details relevant regarding async hooks.
Currently we wrap listeners like "stream" which allows us to propagate our current context to any async operation created. With hooks as indicated above it's not that easy. Think about following sample:

function onStream(stream) {   // incoming stream is a new transaction (1)
  http3.request(firstUrl, cb1);   // outgoing request is a new child transaction (A) linked to (1)
  process.nextTick(someWork); // no new transaction, propagate context (1)
  http3.request(secondUrl, cb2); // outgoing request is a new child transaction (B) linked to (1)
}

I think this can work with above hooks if following conditions are met:

a HTTP/3 request/stream is associated with one async_id which is used in all callbacks/listeners
the hooks which indicates that a new transaction starts provides this async_id to allow to propagate the new transaction context instead that one from executing async_id

Flarna mentioned this issue Mar 1, 2020

async_hooks: add sync enterWith to async storage nodejs/node#31945

Closed

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add hooks for distributed tracing #201

Add hooks for distributed tracing #201

Flarna commented Nov 25, 2019

addaleax commented Nov 26, 2019

jasnell commented Nov 26, 2019

Flarna commented Nov 26, 2019

Flarna commented Dec 6, 2019

Add hooks for distributed tracing #201

Add hooks for distributed tracing #201

Comments

Flarna commented Nov 25, 2019

addaleax commented Nov 26, 2019

jasnell commented Nov 26, 2019

Flarna commented Nov 26, 2019

Flarna commented Dec 6, 2019