-
Notifications
You must be signed in to change notification settings - Fork 285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GC is not able to instantly free up memory of request after Response finish ==> possible OOM on sequential requests #2836
Comments
This has to do with how V8 allocates closures. Specifically, the way you monkey-patch If you rewrite your example, it'll keep running: // ...
const patch = f => function(s) { return f.call(this, s) };
const server = createServer((req, res) => {
// simulate compression middleware
res.write = patch(res.write); // ok, no longer shares a closure
// ... This is longstanding behavior (as in: I think V8 has always worked this way) and not something I expect to change because 99% of the time it's the desired behavior: it drastically reduces the number of allocated closures. |
FWIW this is an open bug: v8:7610 |
@viyaha I'm going to close this as a wontfix on our side but I can move it to nodejs/help if you have more questions. |
@bnoordhuis This is interesting info, thank you. However, I still fail to understand why the GC cannot reclaim the memory. Client requests are serial and after one client request is finished surely all work scheduled by Also please note that this came out of a OOM crash we encountered in production and we spent a lot of time breaking down into a small self-contained example in order to report here. But it is a real problem for us and we would like to prevent it in the future (although we mitigated through other means in the meantime). In reality the problem is introduced by the express compression middleware: https://github.com/expressjs/compression/blob/master/index.js Do you see a glaring problem with the way it monkey-patches the response? We'd be willing to work on and test a fix, but still don't really understand what exactly to change, i.e. what exactly is keeping the request alive... Or are you saying that nothing can be done here? |
@mika-fischer Just so we're on the same page, the basic issue is this: function t() {
var x = small()
var y = big()
function f() { return x }
function g() { return y }
return [f, g] // sharing a single [x,y] closure
}
var [f] = t() // discards g but still retains y Circling back to the example @viyaha posted: setImmediate(() => {
// simulate request-body middleware reading large body
req.body = [];
for (let i = 0; i < 18000000; i++) {
req.body.push({
a: "" + Math.random(),
});
}
// write response
for (let i = 0; i < 2000; i++) {
res.write(``)
}
res.end();
});
So, long story short: yes, the problem is in how expressjs/compression creates a closure that retains more than you'd at first glance expect. |
@bnoordhuis That's how I understood your first response, yes. And I understand that part. [Incidentally, I don't think it's a big problem here, since in our use-case with express, the response obviously needs to be kept alive until it is finished and it holds a reference to the request. So this behavior is not surprising. Still very interesting to know though, so thanks for that.] But from our perspective the issue is not that req is not released early within the handling of the HTTP request. The issue is that after the HTTP request is completely finished (i.e. the client has read the response), the GC still cannot free the memory, and if another large request comes in then node terminates with an OOM. And it's totally unclear to us, what is keeping the memory alive after the |
@mika-fischer That needs more explaining than I can probably fit into a single-page comment but the executive summary is that those 2,000 (There are reasons it works that way, it lets Node coalesce writes into fewer - in this case: one - TCP packet. Without that optimization, the example would send 2,000 + 1 packets.) The good news is that Node master is a bit smarter about it. The example runs without OOM errors there. |
@bnoordhuis Thanks for your explanations, but I think we're still talking somewhat past each other. The OOM does not happen within the handling of a request! I would understand the issue with that, but that's not the case here. Instead, the first request is completely finished (i.e. went over the network, and was received by the client), then a second request comes in and that triggers the OOM because the GC for some reason cannot free the memory associated with the first request (it is not deterministic, and might happen with the third or fourth, etc.). At the point in time, when the client has finished reading the first response, obviously all write and end requests have already been done (otherwise the client would not have finished reading the response) and so the GC should be able to free them if needed. But this is not happening. Note that in the extreme this even works with only a single So let me pose my confusion as a question: At the point in time when the client has finished receiving the response to the first request (i.e. all writes and the end request were handled, went over the network, got recieved by the client), is there any valid reason that the GC could fail to free the memory associated with this completed first request in the server? If so, what exactly is keeping these things alive? From my perspective, it can't be the We tried with Node 14, will also try with master. But do note that with Node 14, we already had to use |
The client can receive the response and fire off a new request before the next event loop tick happens on the server, when the write requests become eligible for garbage collection. Put another way: there's a short time frame where the memory footprint of two sequential client requests may co-exist. |
But this is not happening here, as the behavior can be triggered even with a pause of 1 second between requests (see commented out code in the initial code). |
@bnoordhuis All right, I've prepared a new version, this time with only 100 writes, 5 seconds(!) pause between requests, some more logging and tested with current master. It still OOMs reliably. I still fail to understand why the data of the first request cannot be GCed, 20 seconds(!) after the response was received by the client. const {createServer, get} = require("http");
const log = msg => console.log(`[${process.pid}:${''.padStart(16)}]${(process.uptime() * 1000).toFixed(0).padStart(9)} ms: ${msg}`);
let n = 0;
createServer((req, res) => {
setImmediate(() => { // setImmediate or Promises are necessary
log(`S: Got new request ${n}`);
// simulate compression middleware
const _write = res.write;
res.write = function (data) {
return _write.call(this, data);
}
log(`S: Reading request body ${n}`);
req.body = [];
for (let i = 0; i < 19 * 1000 * 1000; i++) {
req.body.push({a: `${i}`});
}
log(`S: Writing response ${n}`);
for (let i = 0; i < 100; i++) {
res.write("");
}
res.end(() => log(`S: Finished writing response ${n++} (res.end() cb)`));
});
}).listen(3000, () => { client().catch(console.error) });
async function client() {
for (let i = 0; ; i++) {
log(`C: Starting request ${i}`);
await new Promise(resolve => {
get("http://localhost:3000", res => {
res.on("data", () => {});
res.on("end", resolve);
});
});
log(`C: Finished reading response to request ${i}`);
log(`C: Pausing for 5 seconds`);
await new Promise(res => setTimeout(res, 5000));
}
} gives me:
|
There has been no activity on this issue for 3 years and it may no longer be relevant. It will be closed 1 month after the last non-automated comment. |
There has been no activity on this issue and it is being closed. If you feel closing this issue is not the right thing to do, please leave a comment. |
What steps will reproduce the bug?
Log:
How often does it reproduce? Is there a required condition?
Seems like always, most of the time not in an OOM (less Memory Consumption) but the Request-Object can't be disposed for some time.
It seems like following conditions are required:
What is the expected behavior?
After finish of the request, the GC should be able to dispose all request related objects (freeing the memory).
The text was updated successfully, but these errors were encountered: