-
Notifications
You must be signed in to change notification settings - Fork 30.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clear the internal require statCache
on an unsuccessful module load?
#31803
Comments
That approach sounds fine to me. It would hurt the loading percormance of all following calls though. I'll check if it is possible to isolate the failed modules stats to only clear those. |
How would this be similar to ESM behavior for a similar pattern? |
Perhaps there should be an explicit API to do this instead of always doing it, so that you opt-in to any potential performance issues. |
Rather than clearing the cache on failure (which means no longer caching failed lookups which makes repeated failed lookups very expensive), why not clear the cache on invocation of a
So for an esm equivalent: async function main() {
let Bluebird;
try {
Bluebird = await import('bluebird');
} catch (e) {
execSync('npm install bluebird');
Bluebird = await import('bluebird');
}
}
main(); since each dynamic |
Just to be explicit, import() is required by spec to cache if it succeeds (on a per-module basis, and node actually has a stricter behavior because it caches across all modules). If it fails, it is not required to cache the failure, and node doesn't cache the failure. |
-1 That's way too magical |
I’m -1 of clearing the cache. A lot of modules use a “conditional loading pattern” and this change is going to slow their startup time significantly. |
Apart from technical considerations, the example introduces a kind of non-determinism (the install can fail - or not) that shouldn't be encouraged. The time for installing packages is at install time, not run time. |
Does require.resolve have the same restriction? If not, it can be used to check if the module is there before trying to require it |
It does, it goes through Module._findPath and uses the stat cache.
Package installation is just one use case - the only reason Node works this way at the moment is because we have caching for performance reasons and that is observable. Moreover we have already worked around it in the REPL. Hmm, if it only invalidated the entries it created for the problematically
I tend to agree though I believe the current behavior is only because our performance optimization leaks a behavioral change. |
Your OP only has a use case for runtime install, or a repl; are there others you have in mind? |
If you know that the specifier and package name match, as does in OP, you could use an alternative method (not resolve) of checking if the package exists. // index.js
const {accessSync} = require('fs');
const {execSync} = require('child_process');
const node_modules = './node_modules/';
const specifier = 'bluebird';
let Bluebird;
try {
// does package exist?
accessSync(node_modules + specifier);
// yes, so require it
Bluebird = require(specifier);
} catch (e) {
// no, so install it...
execSync('npm install ' + specifier);
// ...and require it
Bluebird = require(specifier);
}
console.log(Bluebird); |
It would not cause significant performance issues in that case. |
This problem also affects the case where the entire node_modules directory is created within node. In this case, there doesn't seem to be any workaround. If you do something like what OP is doing:
but you do it when there is no |
What’s the use case for “npm install hasn’t been ran first”? Can you elaborate? |
I think in a really general sense, these caches shouldn't be updated for failure cases (obviously easier to say than implement, but I think that's what we should aim for) |
That pessimizes a pattern that is fairly common in larger applications: try {
var debug = require('debug'); // just an example, of course
} catch (e) {
var debug = () => {};
} The bigger the program, the more occurrences of that pattern pop up, and the more it is punished if the results aren't in the stat cache. |
Is that pattern often done for the same package a great many times in the graph, and also not at module level/require time? |
Yes and yes (I think - I'm not completely sure what you mean by "also not at module level/require time".) |
Like, is that done inside function bodies, as opposed to at the top level of modules (where it'd happen once and be cached after that, and thus the perf hit wouldn't be significant) |
Ah, like that. I think it only matters at the top level because the stat cache is cleared when control returns to the event loop. I'm currently looking at a midsize application that has two dozen instances of the above pattern (that I can find) on a total of 1,200 require() calls, so about 2% of the total. An inexact science benchmark suggests not caching the result makes such require() calls 2.8x slower. That's not as bad as I feared. There's a caveat though: require() searches upwards, therefore deeper directory structures are penalized more heavily. |
In that case, someone needing to do a conditional require in this case can already do it asynchronously, just not synchronously? |
What if we only retry failed attempts without the statCache? That way we would only pay the penalty of not going through the cache failed attempts. |
Fixes: nodejs#31803 PR-URL: nodejs#31921
I am going to go ahead and close this since it staled, if anyone feels strongly about supporting this - please feel free to reopen! |
Currently, it is impossible to write code like:
Because the cjs loader caches the results of
stat
calls internally so it would not look for bluebird the second time werequire
it.It would be pretty easy to deal with this use case from a technical point of view by setting statCache to a
new Map()
on a MODULE_NOT_FOUND.I can see several use cases like:
Would people be in favor of this change and enabling this?
cc @nodejs/modules
cc @BridgeAR because of #26928 that enables this specifically for the REPL.
The text was updated successfully, but these errors were encountered: