OpenTelemetry support #104
I would suggest also asking @david-luna and @trentm for a review.
Add test for OpenTelemetry error tracking
The PR looks good. One minor issue I think we should discuss: there is a package that holds attribute names and values (where they are enums), @opentelemetry/semantic-conventions.
I'm hesitant to request changes, since that package is in the middle of a major update and may break this instrumentation if we adopt it now and later try to update. Another option is to use hardcoded strings for now and switch to the package when it is ready.
In both cases I think we need a tracking issue to act on when the new version of the semantic conventions package is ready.
Thoughts?
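To make the "hardcoded strings for now" option concrete, here is a minimal sketch that keeps the attribute names in one place, so that a later move to @opentelemetry/semantic-conventions would only touch these constants. The constant names and the buildAttributes helper are hypothetical; only the attribute strings themselves come from this PR's description.

```typescript
// Hardcoded attribute names, centralized so they can later be replaced by
// imports from @opentelemetry/semantic-conventions in a single edit.
// All identifiers below are hypothetical, not part of @elastic/transport.
const ATTR_DB_OPERATION_NAME = 'db.operation.name'
const ATTR_DB_ELASTICSEARCH_PATH_PARTS_PREFIX = 'db.elasticsearch.path_parts.'

type Attributes = Record<string, string>

// Build span attributes from an endpoint name and its dynamic path values.
function buildAttributes (
  operationName: string,
  pathParts: Record<string, string | number> = {}
): Attributes {
  const attributes: Attributes = {}
  attributes[ATTR_DB_OPERATION_NAME] = operationName
  for (const key of Object.keys(pathParts)) {
    attributes[ATTR_DB_ELASTICSEARCH_PATH_PARTS_PREFIX + key] = String(pathParts[key])
  }
  return attributes
}
```

Calling buildAttributes('search', { index: 'my-index' }) would yield one db.operation.name entry and one db.elasticsearch.path_parts.index entry.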
This is a first pass. I will look again a bit more this afternoon.
package.json
@@ -65,10 +66,15 @@
    "tslib": "^2.4.0",
    "undici": "^6.12.0"
  },
  "peerDependencies": {
    "@opentelemetry/api": "1.x",
Shouldn't there be an entry in "dependencies" for @opentelemetry/api, as it is imported/required by the using code?
I suppose using peerDeps typically works for modern npm, as peerDeps are installed by default (https://docs.npmjs.com/cli/v10/configuring-npm/package-json#peerdependencies). I say "typically" because some projects unfortunately use the legacy-peer-deps npm config option (https://docs.npmjs.com/cli/v10/using-npm/config#legacy-peer-deps), which results in peerDeps being ignored. The result, in this case, is an @elastic/transport that breaks. Here is a smallish repro of that breakage:
% cat package.json
{
  "name": "asdf.20240624t094529",
  "version": "1.0.0",
  "dependencies": {
    "@elastic/transport": "git+https://github.com/elastic/elastic-transport-js.git#otel"
  }
}
% npm install --legacy-peer-deps
...
% npm ls -a
npm ERR! code ELSPROBLEMS
npm ERR! invalid: @elastic/[email protected] /Users/trentm/tmp/asdf.20240624T094529/node_modules/@elastic/transport
npm ERR! missing: @opentelemetry/[email protected], required by @elastic/[email protected]
npm ERR! missing: @opentelemetry/[email protected], required by @elastic/[email protected]
[email protected] /Users/trentm/tmp/asdf.20240624T094529
└─┬ @elastic/[email protected] invalid: "git+https://github.com/elastic/elastic-transport-js.git#otel" from the root project
├── UNMET DEPENDENCY @opentelemetry/[email protected]
├── UNMET DEPENDENCY @opentelemetry/[email protected]
├─┬ [email protected]
│ └── [email protected]
├── [email protected]
├── [email protected]
├── [email protected]
├── [email protected]
└── [email protected]
...
Here I manually copy in the built JS code from a git working copy in a separate dir:
% rsync -av ~/el/elastic-transport-js2/lib/ node_modules/@elastic/transport/lib/
Then importing the package fails:
% node
Welcome to Node.js v18.18.2.
Type ".help" for more information.
> require('@elastic/transport')
Uncaught Error: Cannot find module '@opentelemetry/api'
Require stack:
- /Users/trentm/tmp/asdf.20240624T094529/node_modules/@elastic/transport/lib/Transport.js
- /Users/trentm/tmp/asdf.20240624T094529/node_modules/@elastic/transport/index.js
- <repl>
at Module._resolveFilename (node:internal/modules/cjs/loader:1077:15)
at Module._load (node:internal/modules/cjs/loader:922:27)
at Module.require (node:internal/modules/cjs/loader:1143:19)
at require (node:internal/modules/cjs/helpers:119:18) {
code: 'MODULE_NOT_FOUND',
requireStack: [
'/Users/trentm/tmp/asdf.20240624T094529/node_modules/@elastic/transport/lib/Transport.js',
'/Users/trentm/tmp/asdf.20240624T094529/node_modules/@elastic/transport/index.js',
'<repl>'
]
}
I don't think there is a significant downside in taking the direct dependency on @opentelemetry/api, is there?
Good call. I'd initially intended to put the SDK in as a peer dependency, but forgot to clean that up. Fixed in ec782d6.
Co-authored-by: Trent Mick <[email protected]>
I've opened #108 and assigned it to myself to remind me to circle back to this later. Feel free to add any useful context to that issue if I missed anything.
@trentm thanks so much for all the useful feedback! I'd never used the OTel API before so your expertise is much appreciated. Please take another look and let me know what you think.
LGTM. Thanks!
Something to consider for later: configurability of the OTel instrumentation.
Similar to how the older diagnostics events can be configured via TransportOptions.diagnostic, there are a couple things that might be nice to be configurable:
- A way to disable this OTel instrumentation. Users of an OTel SDK will be somewhat used to the ability to disable particular instrumentations. Usually this config var is called enabled, in the context of an instrumentation.
- A way to suppress child HTTP spans that will be created under the Elasticsearch spans. There are a few existing OTel instrumentations that have this same situation, e.g.: instrumentation-aws-sdk, instrumentation-mongoose. Typically this config var is a boolean called suppressInternalInstrumentation. E.g. https://github.com/open-telemetry/opentelemetry-js-contrib/blob/main/plugins/node/opentelemetry-instrumentation-aws-sdk/README.md#aws-sdk-instrumentation-options
AFAIK there aren't any other Node.js libraries that have native OTel instrumentation like is being added here, so there isn't prior art on exactly what to name OTel-related config. Perhaps having:
export interface TransportOptions {
  opentelemetry?: {
    enabled?: boolean;
    suppressInternalInstrumentation?: boolean
  }
}
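As a rough illustration of how the proposed options might be consumed, the sketch below resolves defaults and skips span creation when enabled is false. Everything here is hypothetical: the default values (instrumentation on, suppression off), the resolveOtelOptions and maybeTraced helpers, and the withSpan callback that stands in for whatever actually starts a span. Suppressing child HTTP spans would need SDK-specific context handling that is not shown.

```typescript
// Hypothetical shape, mirroring the options proposed in the comment above.
interface OpenTelemetryOptions {
  enabled?: boolean
  suppressInternalInstrumentation?: boolean
}

// Fill in defaults. The chosen defaults are an assumption for this sketch.
function resolveOtelOptions (options: OpenTelemetryOptions = {}): Required<OpenTelemetryOptions> {
  return {
    enabled: options.enabled ?? true,
    suppressInternalInstrumentation: options.suppressInternalInstrumentation ?? false
  }
}

// Run `fn` inside a span only when instrumentation is enabled.
// `withSpan` is a stand-in for the real span-creating wrapper.
function maybeTraced<T> (
  options: OpenTelemetryOptions | undefined,
  withSpan: (fn: () => T) => T,
  fn: () => T
): T {
  const { enabled } = resolveOtelOptions(options)
  return enabled ? withSpan(fn) : fn()
}
```

Because T is generic, fn can return a Promise, so the same helper would cover an async request path.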
Should I create a separate issue for this?
That would be fantastic, thank you! As much context and prior art as you can provide would be super helpful for me. This PR was definitely meant as a "bare minimum" OTel implementation to have something ready in time for 8.15, so there is plenty of opportunity to make enhancements down the road.
    }
  }

  return await this[kOtelTracer].startActiveSpan(params.meta.name, { attributes, kind: SpanKind.CLIENT }, async (otelSpan: Span) => {
@JoshMock Hi there!
Why is Elastic creating an OTel span by default when OTel has not been activated or used? 🤔
I have not used --require '@opentelemetry/auto-instrumentations-node/register' but I can see while debugging that the OTel spans are getting created.
Could you please explain why Elastic is taking this approach? Thanks so much!
According to the OpenTelemetry docs, the code will create no-op traces if the OpenTelemetry SDK has not been initialized. They'll still show up in the stack trace, but they won't go anywhere.
If you need to disable OTel collection for this process, you can set the environment variable OTEL_SDK_DISABLED=true.
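The "no-op until an SDK is initialized" behavior described above can be modeled in a few lines. This is a simplified, self-contained sketch of the pattern, not the real @opentelemetry/api implementation: SpanLike, TracerLike, registerTracer, getTracer, and createRecordingTracer are all local stand-ins invented for illustration.

```typescript
// Local stand-in types; these are NOT the real @opentelemetry/api types.
interface SpanLike {
  isRecording: () => boolean
  end: () => void
}
interface TracerLike {
  startSpan: (name: string) => SpanLike
}

// Default behavior: spans exist as objects but record and export nothing.
const noopSpan: SpanLike = { isRecording: () => false, end: () => {} }
const noopTracer: TracerLike = { startSpan: () => noopSpan }

// Global registration slot; stays empty unless an SDK fills it.
let registeredTracer: TracerLike | undefined

function registerTracer (tracer: TracerLike): void {
  registeredTracer = tracer
}

// Library code always asks for a tracer and gets the no-op one by default.
function getTracer (): TracerLike {
  return registeredTracer ?? noopTracer
}

// A recording tracer of the kind an SDK might register (hypothetical).
function createRecordingTracer (finished: string[]): TracerLike {
  return {
    startSpan: (name) => ({
      isRecording: () => true,
      end: () => { finished.push(name) }
    })
  }
}
```

In this model, instrumented code calls getTracer().startSpan(...) unconditionally, which is why span objects are visible in a debugger even when nothing is collected.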
I hadn't followed up on creating issues for the things I mentioned above in #104 (review). @kirrg001 Having a disable option in the Elasticsearch client config would provide a way to disable Elasticsearch OTel instrumentation without having to fully disable the SDK, or add a SpanProcessor or something to drop those spans if they are undesired.
I take that back. I had followed up: the feature request issue is here: elastic/elasticsearch-js#2299
@kirrg001 The argument for having OTel instrumentation directly in a given library (so-called "native" instrumentation) is in the top section here: https://opentelemetry.io/docs/concepts/instrumentation/libraries/
Adds automatic instrumentation for OpenTelemetry, tracking the lifecycle of each Elasticsearch request. Follows all documented semantic conventions for Elasticsearch, excluding db.query.text, which we do not have a simple way to sanitize.
If someone wants to start shipping request spans to an OpenTelemetry endpoint without making any code changes, they must:
- Add @opentelemetry/api and @opentelemetry/auto-instrumentations-node as Node.js dependencies
- Add appropriate environment variable values for OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_EXPORTER_OTLP_HEADERS, OTEL_RESOURCE_ATTRIBUTES and OTEL_SERVICE_NAME
- require the auto instrumentation registration package at run time
Full documentation will be included in a separate PR to elastic/elasticsearch-js soon.
This change will depend on an improvement to elastic/elasticsearch-js to include an optional meta object when calling transport.request(...), which will include both the endpoint name (db.operation.name) and any dynamic values in the path (db.elasticsearch.path_parts.<key>).
See elastic/elasticsearch-js#2267
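For the environment-variable step in the description above, a small preflight check can catch a missing value before the process starts exporting. This is only a sketch: missingOtelEnvVars is a hypothetical helper, and whether all four variables are actually needed depends on the endpoint being targeted.

```typescript
// The four variables named in the PR description.
const OTEL_ENV_VARS = [
  'OTEL_EXPORTER_OTLP_ENDPOINT',
  'OTEL_EXPORTER_OTLP_HEADERS',
  'OTEL_RESOURCE_ATTRIBUTES',
  'OTEL_SERVICE_NAME'
]

// Return the names that are unset or blank in the given environment.
// Hypothetical helper for illustration; not part of any library.
function missingOtelEnvVars (env: Record<string, string | undefined>): string[] {
  return OTEL_ENV_VARS.filter((name) => {
    const value = env[name]
    return value === undefined || value.trim() === ''
  })
}
```

In a Node.js application this could be called as missingOtelEnvVars(process.env) at startup, logging a warning if the returned list is not empty.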