-
Notifications
You must be signed in to change notification settings - Fork 230
fix: mongodb async context fix #3657
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
lib/instrumentation/index.js
Outdated
@@ -115,6 +116,8 @@ const IITM_MODULES = { | |||
https: { instrumentImportMod: true }, | |||
ioredis: { instrumentImportMod: false }, | |||
knex: { instrumentImportMod: false }, | |||
// XXX: check this | |||
// mongodb: { instrumentImportMod: false }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for reviewer: this is a work in progress though I do not mind to have it in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding ESM in a separate PR would be helpful. Keeping this PR to a small change for using AsyncResource.bind might be helpful to compare with an equivalent OTel change.
lib/instrumentation/index.js
Outdated
this._patches.keys, | ||
// XXX: found this was not enabled so our redis instrumentations | ||
// may not be working as for now | ||
{ internals: true }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for reviewer: check the comment and give feedback please. Was okay to pass the new option? what about other instrumentations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We weren't using internals: true
before the changes adding ESM support:
- this._hook = new Hook(this._patches.keys, function (exports, name, basedir) {
+ this._ritmHook = new RitmHook(this._patches.keys, function (exports, name, basedir) {
so I don't think this is what internals is about. I'm not positive, however.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
internals: true
isn't needed if the sub-module path, e.g. @redis/client/dist/lib/client
, is an entry in the first argument to RitmHook
-- i.e. if it is on the explicit list of modules to hook into. If I add these debug-prints to RITM:
--- /Users/trentm/el/require-in-the-middle/index.js 2023-07-13 15:32:26
+++ ./node_modules/require-in-the-middle/index.js 2023-10-03 09:26:14
@@ -126,6 +126,7 @@
const self = this
const patching = new Set()
const internals = options ? options.internals === true : false
+ console.log('XXX ritm: internals=%s', internals);
const hasWhitelist = Array.isArray(modules)
debug('registering require hook')
@@ -204,6 +205,7 @@
}
moduleName = filename
} else if (hasWhitelist === true && modules.includes(filename)) {
+ console.log('XXX ritm abspath in allowlist: hasWhitelist=%s modules.includes(filename)=%s filename=%s', hasWhitelist, modules.includes(filename), filename);
// whitelist includes the absolute path to the file including extension
const parsedPath = path.parse(filename)
moduleName = parsedPath.name
@@ -227,6 +229,7 @@
if (hasWhitelist === true && modules.includes(moduleName) === false) {
if (modules.includes(fullModuleName) === false) return exports // abort if module name isn't on whitelist
+ console.log('XXX ritm allowlisted sub-module: moduleName=%s fullModuleName=%s filename=%s', moduleName, fullModuleName, filename)
// if we get to this point, it means that we're requiring a whitelisted sub-module
moduleName = fullModuleName
} else {
diff --git a/lib/instrumentation/index.js b/lib/instrumentation/index.js
index b90152f7..29e4bbb7 100644
--- a/lib/instrumentation/index.js
+++ b/lib/instrumentation/index.js
@@ -363,6 +363,7 @@ Instrumentation.prototype._startHook = function () {
name,
basedir,
) {
+ console.log('XXX RitmHook called:', name, basedir);
const enabled = self._isModuleEnabled(name);
var pkg, version;
and run examples/trace-redis.js:
% node examples/trace-redis.js
{"log.level":"info","@timestamp":"2023-10-03T16:30:11.754Z","log":{"logger":"elastic-apm-node"},"agentVersion":"4.0.0","env":{"pid":19696,"proctitle":"node","os":"darwin 22.6.0","arch":"x64","host":"pink.local","timezone":"UTC-0700","runtime":"Node.js v16.20.2"},"config":{"serviceName":{"source":"start","value":"example-trace-redis4","commonName":"service_name"},"serviceVersion":{"source":"default","value":"4.0.0","commonName":"service_version"},"serverUrl":{"source":"default","value":"http://127.0.0.1:8200/","commonName":"server_url"},"logLevel":{"source":"default","value":"info","commonName":"log_level"},"spanCompressionEnabled":{"source":"start","value":false}},"activationMethod":"require","ecs":{"version":"1.6.0"},"message":"Elastic APM Node.js Agent v4.0.0"}
XXX ritm: internals=false
XXX ritm allowlisted sub-module: moduleName=@redis/client fullModuleName=@redis/client/dist/lib/client/commands-queue filename=/Users/trentm/el/apm-agent-nodejs/node_modules/@redis/client/dist/lib/client/commands-queue.js
XXX RitmHook called: @redis/client/dist/lib/client/commands-queue /Users/trentm/el/apm-agent-nodejs/node_modules/@redis/client
XXX RitmHook called: generic-pool /Users/trentm/el/apm-agent-nodejs/node_modules/generic-pool
XXX ritm allowlisted sub-module: moduleName=@redis/client fullModuleName=@redis/client/dist/lib/client filename=/Users/trentm/el/apm-agent-nodejs/node_modules/@redis/client/dist/lib/client/index.js
XXX RitmHook called: @redis/client/dist/lib/client /Users/trentm/el/apm-agent-nodejs/node_modules/@redis/client
XXX RitmHook called: redis /Users/trentm/el/apm-agent-nodejs/node_modules/redis
PING res: PONG
SET res: OK
GET res: bar
MULTI res: [ 'OK', 'eggs' ]
The RITM opts.internals
value is only used later for internal module paths that are not on the explicitly set of modules to hook.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not get those logs for mongodb
and I think I know why. The last log inside ./node_modules/require-in-the-middle/index.js
is inside a conditional
if (hasWhitelist === true && modules.includes(moduleName) === false) {
if (modules.includes(fullModuleName) === false) return exports // abort if module name isn't on whitelist
console.log('XXX ritm allowlisted sub-module: moduleName=%s fullModuleName=%s filename=%s', moduleName, fullModuleName, filename)
// more code here ...
}
In my case
- moduleName === 'mongodb'
- fullModuleName === 'mongodb/lib/cmap/connection_pool'
The expression allows to pass sub-modules only if the root module is not in the whitelist. I this case the agent is instrumenting the main module and one sub-module. I think it works on @redis
because its instrumention only sub-modules.
I see two options here. Let me know what you think
- change the instrumentation and go for sub-modules (need to check if feasible)
- make use of the
internals
option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- make use of the
internals
option.
I'd prefer to avoid this. It will mean that we will be calling our onRequire
hook function for every require'd file under each of the modules we instrument (every lib file in ioredis, in express, in fastify, etc.). There may be a perf impact of this.
Option 3: Remove that limitation in RITM.
I think this is a bug in RITM that you cannot do this. Having 'mongodb' on the list of modules to patch shouldn't mean that you cannot patch mongodb/lib/cmap/connection_pool
. I'll have to stare at this longer... the if/else's in RITM make my eyes swim a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I still think this is a RITM bug. We should open an issue about it. However, I don't want to fix that right now.
- I've changed my mind. I think we should use the
internals: true
and name our patches of sub-module files with the trailing.js
-- i.e.mongodb/lib/cmap/connection_pool.js
instead of without the.js
. I have started implementing this (along with implementing Allow Logical Grouping of Instrumentation by Name #2992 so thatdisableInstrumentations: 'mongodb'
would suffice to disable both instrumentation ofmongodb
andmongodb/lib/cmap/connection_pool.js
.
I haven't finished this yet. There is also a potential hiccup in maintaining full support for https://www.elastic.co/guide/en/apm/agent/nodejs/current/agent-api.html#apm-add-patch I think it'll be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- That's okay. There is no need to rush in a fix there and we still have to check what's the impact since no one hasn't noticed.
- I've done the changes to have sub-modules with the
.js
extension. I've noticed our Hook is reading thepackage.json
file each time is called so enabling internals will do many reads. To avoid this I made use of aMap
to cache versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR solves our use case for instrumenting module and sub-module at the same time. which is what we need for mongodb
.
We still have to look at RITM but since the issue is only reported by us I think we have no rush.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The async context fix is a little lost in the 2000 lines of change due to the ESM work, the mongodb.js move and the test file refactoring. It would be helpful to have these in 3 separate PRs. (However, I hate that that is a pain because of our review policies.)
lib/instrumentation/index.js
Outdated
@@ -115,6 +116,8 @@ const IITM_MODULES = { | |||
https: { instrumentImportMod: true }, | |||
ioredis: { instrumentImportMod: false }, | |||
knex: { instrumentImportMod: false }, | |||
// XXX: check this | |||
// mongodb: { instrumentImportMod: false }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding ESM in a separate PR would be helpful. Keeping this PR to a small change for using AsyncResource.bind might be helpful to compare with an equivalent OTel change.
lib/instrumentation/index.js
Outdated
this._patches.keys, | ||
// XXX: found this was not enabled so our redis instrumentations | ||
// may not be working as for now | ||
{ internals: true }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
internals: true
isn't needed if the sub-module path, e.g. @redis/client/dist/lib/client
, is an entry in the first argument to RitmHook
-- i.e. if it is on the explicit list of modules to hook into. If I add these debug-prints to RITM:
--- /Users/trentm/el/require-in-the-middle/index.js 2023-07-13 15:32:26
+++ ./node_modules/require-in-the-middle/index.js 2023-10-03 09:26:14
@@ -126,6 +126,7 @@
const self = this
const patching = new Set()
const internals = options ? options.internals === true : false
+ console.log('XXX ritm: internals=%s', internals);
const hasWhitelist = Array.isArray(modules)
debug('registering require hook')
@@ -204,6 +205,7 @@
}
moduleName = filename
} else if (hasWhitelist === true && modules.includes(filename)) {
+ console.log('XXX ritm abspath in allowlist: hasWhitelist=%s modules.includes(filename)=%s filename=%s', hasWhitelist, modules.includes(filename), filename);
// whitelist includes the absolute path to the file including extension
const parsedPath = path.parse(filename)
moduleName = parsedPath.name
@@ -227,6 +229,7 @@
if (hasWhitelist === true && modules.includes(moduleName) === false) {
if (modules.includes(fullModuleName) === false) return exports // abort if module name isn't on whitelist
+ console.log('XXX ritm allowlisted sub-module: moduleName=%s fullModuleName=%s filename=%s', moduleName, fullModuleName, filename)
// if we get to this point, it means that we're requiring a whitelisted sub-module
moduleName = fullModuleName
} else {
diff --git a/lib/instrumentation/index.js b/lib/instrumentation/index.js
index b90152f7..29e4bbb7 100644
--- a/lib/instrumentation/index.js
+++ b/lib/instrumentation/index.js
@@ -363,6 +363,7 @@ Instrumentation.prototype._startHook = function () {
name,
basedir,
) {
+ console.log('XXX RitmHook called:', name, basedir);
const enabled = self._isModuleEnabled(name);
var pkg, version;
and run examples/trace-redis.js:
% node examples/trace-redis.js
{"log.level":"info","@timestamp":"2023-10-03T16:30:11.754Z","log":{"logger":"elastic-apm-node"},"agentVersion":"4.0.0","env":{"pid":19696,"proctitle":"node","os":"darwin 22.6.0","arch":"x64","host":"pink.local","timezone":"UTC-0700","runtime":"Node.js v16.20.2"},"config":{"serviceName":{"source":"start","value":"example-trace-redis4","commonName":"service_name"},"serviceVersion":{"source":"default","value":"4.0.0","commonName":"service_version"},"serverUrl":{"source":"default","value":"http://127.0.0.1:8200/","commonName":"server_url"},"logLevel":{"source":"default","value":"info","commonName":"log_level"},"spanCompressionEnabled":{"source":"start","value":false}},"activationMethod":"require","ecs":{"version":"1.6.0"},"message":"Elastic APM Node.js Agent v4.0.0"}
XXX ritm: internals=false
XXX ritm allowlisted sub-module: moduleName=@redis/client fullModuleName=@redis/client/dist/lib/client/commands-queue filename=/Users/trentm/el/apm-agent-nodejs/node_modules/@redis/client/dist/lib/client/commands-queue.js
XXX RitmHook called: @redis/client/dist/lib/client/commands-queue /Users/trentm/el/apm-agent-nodejs/node_modules/@redis/client
XXX RitmHook called: generic-pool /Users/trentm/el/apm-agent-nodejs/node_modules/generic-pool
XXX ritm allowlisted sub-module: moduleName=@redis/client fullModuleName=@redis/client/dist/lib/client filename=/Users/trentm/el/apm-agent-nodejs/node_modules/@redis/client/dist/lib/client/index.js
XXX RitmHook called: @redis/client/dist/lib/client /Users/trentm/el/apm-agent-nodejs/node_modules/@redis/client
XXX RitmHook called: redis /Users/trentm/el/apm-agent-nodejs/node_modules/redis
PING res: PONG
SET res: OK
GET res: bar
MULTI res: [ 'OK', 'eggs' ]
The RITM opts.internals
value is only used later for internal module paths that are not on the explicitly set of modules to hook.
I'm okay with leaving the implementation where it was. I'll put it back in a new commit |
lib/instrumentation/index.js
Outdated
@@ -412,22 +419,26 @@ Instrumentation.prototype._startHook = function () { | |||
}; | |||
|
|||
Instrumentation.prototype._getPackageVersion = function (modName, modBaseDir) { | |||
const modPath = path.join(modBaseDir, 'package.json'); | |||
if (this._versionsCache.has(modPath)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for reviewer: both RITM and IITM hooks use this function and will get cached result if they request many times the version for the same pacakge.json
file
IMO this PR should wait until #2992 is done, so I'll leave it as a draft |
This refactors how module patch handlers are loaded and the Instrumentation class file. There is now a PatcherRegistry that is a little smarter than the `_patches` object before it. The main change is that we now use RITM's internals:true option so that our `onRequire` hook is called with module sub-paths. This will allow #3657 to hook both "mongodb" and "mongodb/lib/cmap/connection_pool.js", and avoid a RITM bug where "mongodb" & "mongodb/lib/cmap/connection_pool" (no extension, indicating a "sub-module") don't work together. Benefits and other changes: - With the smarter "PatcherRegistry" the module name can be used in `disableInstrumentations` to disable any patchers for that module. E.g. `disableInstrumentations=next` will disable all four `next`-related patchers. - A custom "lambda" name is now usable to disable the Lambda instr, FWIW. - (Note: we *could* also add custom disableInstrumentations keys for other modules, e.g. perhaps "aws-sdk" for all those.) - Some edge cases with the (rare, and undocumented) `config.addPatch` config var have been cleaned up. - Loading of the lambda-handler has been refactored a bit to no longer need to know internal details of the instrumentation module. Refs: #3657 Closes: #2992
This refactors how module patch handlers are loaded and the Instrumentation class file. There is now a PatcherRegistry that is a little smarter than the `_patches` object before it. The main change is that we now use RITM's internals:true option so that our `onRequire` hook is called with module sub-paths. This will allow #3657 to hook both "mongodb" and "mongodb/lib/cmap/connection_pool.js", and avoid a RITM bug where "mongodb" & "mongodb/lib/cmap/connection_pool" (no extension, indicating a "sub-module") don't work together. Benefits and other changes: - With the smarter "PatcherRegistry" the module name can be used in `disableInstrumentations` to disable any patchers for that module. E.g. `disableInstrumentations=next` will disable all four `next`-related patchers. - A custom "lambda" name is now usable to disable the Lambda instr, FWIW. - (Note: we *could* also add custom disableInstrumentations keys for other modules, e.g. perhaps "aws-sdk" for all those.) - Some edge cases with the (rare, and undocumented) `config.addPatch` config var have been cleaned up. - Loading of the lambda-handler has been refactored a bit to no longer need to know internal details of the instrumentation module. Refs: #3657 Closes: #2992
This PR is closed in favor of #3665 |
…ic#3658) This refactors how module patch handlers are loaded and the Instrumentation class file. There is now a PatcherRegistry that is a little smarter than the `_patches` object before it. The main change is that we now use RITM's internals:true option so that our `onRequire` hook is called with module sub-paths. This will allow elastic#3657 to hook both "mongodb" and "mongodb/lib/cmap/connection_pool.js", and avoid a RITM bug where "mongodb" & "mongodb/lib/cmap/connection_pool" (no extension, indicating a "sub-module") don't work together. Benefits and other changes: - With the smarter "PatcherRegistry" the module name can be used in `disableInstrumentations` to disable any patchers for that module. E.g. `disableInstrumentations=next` will disable all four `next`-related patchers. - A custom "lambda" name is now usable to disable the Lambda instr, FWIW. - (Note: we *could* also add custom disableInstrumentations keys for other modules, e.g. perhaps "aws-sdk" for all those.) - Some edge cases with the (rare, and undocumented) `config.addPatch` config var have been cleaned up. - Loading of the lambda-handler has been refactored a bit to no longer need to know internal details of the instrumentation module. Refs: elastic#3657 Closes: elastic#2992
This PR add extra instrumentation to Mongodb's
ConectionPool
class to ensure the context is not lost when the pool class is queueing the commands. The agent makes use of theAsyncLocalStiorage.bind
static function to inject the context into the queued callbacks.The PR also migrates to the new test implementations making use of
runTestFixtures
util.Fixes #3161
Checklist