worker_threads consuming so much memory and crash #32265
Comments
I am able to recreate this. The report data shows:
"javascriptHeap": {
"totalMemory": 4059136,
"totalCommittedMemory": 3299464,
"usedMemory": 2861680,
"availableMemory": 104855004168,
"memoryLimit": 104857600000,
"heapSpaces": {
"read_only_space": {
"memorySize": 262144,
"committedMemory": 33328,
"capacity": 33040,
"used": 33040,
"available": 0
},
"new_space": {
"memorySize": 1048576,
"committedMemory": 1047944,
"capacity": 1047424,
"used": 633768,
"available": 413656
},
"old_space": {
"memorySize": 1654784,
"committedMemory": 1602320,
"capacity": 1602528,
"used": 1600304,
"available": 2224
},
"code_space": {
"memorySize": 430080,
"committedMemory": 170720,
"capacity": 154336,
"used": 154336,
"available": 0
},
"map_space": {
"memorySize": 528384,
"committedMemory": 309984,
"capacity": 309120,
"used": 309120,
"available": 0
},
"large_object_space": {
"memorySize": 135168,
"committedMemory": 135168,
"capacity": 131112,
"used": 131112,
"available": 0
},
"code_large_object_space": {
"memorySize": 0,
"committedMemory": 0,
"capacity": 0,
"used": 0,
"available": 0
},
"new_large_object_space": {
"memorySize": 0,
"committedMemory": 0,
"capacity": 1047424,
"used": 0,
"available": 1047424
}
}
}
and top (just before the crash):
there are many spaces seen as exhausted - such as @nodejs/v8
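A quick way to list the spaces in question from a report like the one above is sketched here. It assumes that "exhausted" means a space whose `available` field is 0 while `used` is non-zero; the comment does not define the term, and whether such a value indicates a real problem is not established in this thread. The function name `exhaustedSpaces` is hypothetical.

```javascript
'use strict';

// Return the names of heap spaces whose `available` field is 0 but which
// hold data (`used` > 0), given the `heapSpaces` object of a report.
function exhaustedSpaces(heapSpaces) {
  return Object.entries(heapSpaces)
    .filter(([, space]) => space.available === 0 && space.used > 0)
    .map(([name]) => name);
}

// Values copied from part of the first report in this thread.
const sample = {
  new_space: { used: 633768, available: 413656 },
  old_space: { used: 1600304, available: 2224 },
  code_space: { used: 154336, available: 0 },
  map_space: { used: 309120, available: 0 },
  code_large_object_space: { used: 0, available: 0 },
};

console.log(exhaustedSpaces(sample)); // [ 'code_space', 'map_space' ]
```

Empty spaces such as `code_large_object_space` are skipped on purpose, since a space with nothing in it reports 0 available without being under pressure.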
I think the big hint there might actually be
but I have unlimited virtual memory :
plus the failing stack in the referenced issue has @oh-frontend1 - what is your
and my report.json:
"javascriptHeap": {
"totalMemory": 4452352,
"totalCommittedMemory": 3517904,
"usedMemory": 1448464,
"availableMemory": 85947560576,
"memoryLimit": 85949677568,
"heapSpaces": {
"read_only_space": {
"memorySize": 262144,
"committedMemory": 33088,
"capacity": 32808,
"used": 32808,
"available": 0
},
"new_space": {
"memorySize": 2097152,
"committedMemory": 1683416,
"capacity": 1047456,
"used": 188368,
"available": 859088
},
"old_space": {
"memorySize": 1396736,
"committedMemory": 1368440,
"capacity": 1064504,
"used": 897832,
"available": 166672
},
"code_space": {
"memorySize": 430080,
"committedMemory": 170400,
"capacity": 154016,
"used": 154016,
"available": 0
},
"map_space": {
"memorySize": 266240,
"committedMemory": 262560,
"capacity": 175440,
"used": 175440,
"available": 0
},
"large_object_space": {
"memorySize": 0,
"committedMemory": 0,
"capacity": 0,
"used": 0,
"available": 0
},
"code_large_object_space": {
"memorySize": 0,
"committedMemory": 0,
"capacity": 0,
"used": 0,
"available": 0
},
"new_large_object_space": {
"memorySize": 0,
"committedMemory": 0,
"capacity": 1047456,
"used": 0,
"available": 1047456
}
}
}
Thanks @oh-frontend1, so our failing contexts seem to match. Let me see if I can figure out what caused the GC to fail.
there are several
So by increasing the mapping count, I am able to create 4K threads, and consume up to 0.5T of virtual and 15G of
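For anyone wanting to check how close a process is to this limit, a minimal sketch follows. It assumes the "mapping count" is the Linux `vm.max_map_count` setting (readable at `/proc/sys/vm/max_map_count`) and that the current mappings of the process are listed one per line in `/proc/self/maps`; the comment above names neither. The function names are hypothetical.

```javascript
'use strict';
const fs = require('fs');

// Count memory mappings from the text of a /proc/<pid>/maps file:
// each non-empty line describes one mapping.
function countMappings(mapsText) {
  return mapsText.split('\n').filter((line) => line.trim() !== '').length;
}

// Read a file, or return null when it cannot be read
// (for example, on a system without procfs).
function readOrNull(path) {
  try {
    return fs.readFileSync(path, 'utf8');
  } catch (err) {
    return null;
  }
}

// Report current mappings against the limit. Assumption: Linux procfs paths.
function mappingUsage() {
  const maps = readOrNull('/proc/self/maps');
  const limit = readOrNull('/proc/sys/vm/max_map_count');
  if (maps === null || limit === null) return null;
  return { used: countMappings(maps), limit: Number.parseInt(limit, 10) };
}

console.log(mappingUsage());
```

Calling `mappingUsage()` periodically while spawning workers would show whether `used` approaches `limit` before the crash. On Linux an administrator can usually raise the limit with `sysctl -w vm.max_map_count=<value>`; the value used in this thread is not preserved above.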
@gireeshpunathil thank you, this solution also works on my real code.
@oh-frontend1 ... I was wondering if you wouldn't mind expanding on the reason why you need a worker thread pool of several thousand workers. What is the scenario / app case you're exploring here? The reason I'm asking is that we (NearForm) are doing some investigation into worker thread perf diagnostics, and the dynamics of profiling small worker pools (in the 4-50 range) are much different from profiling pools in the 2k-4k range, so we'd like to understand the use case a bit more.
@jasnell the application is confidential, so I can't say much. In the future, I have to monitor a large number of IoT devices. Having 500 network IO on the same thread causes a large bottleneck on the CPU, but split to And the simple case is one IO per thread. If I cannot resolve this problem, I will increase the number of IO per thread, but that will increase code complexity. In the real application, as I benchmarked it, I can only create about 200 threads before this error happens, so I created a minimal source code to reproduce it (in that case, the number of threads reached 1.5k before the error occurred).
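The fallback mentioned here, several IO per thread, can be sketched as a simple grouping step. This is only an illustration: `groupForThreads`, the `device-N` ids, and the figure of 10 devices per thread are hypothetical, and the real application's assignment logic is not shown in the thread.

```javascript
'use strict';

// Split `items` into groups of at most `perThread` entries,
// one group per worker thread.
function groupForThreads(items, perThread) {
  if (!Number.isInteger(perThread) || perThread < 1) {
    throw new RangeError('perThread must be a positive integer');
  }
  const groups = [];
  for (let i = 0; i < items.length; i += perThread) {
    groups.push(items.slice(i, i + perThread));
  }
  return groups;
}

// Hypothetical device ids: 2000 devices at 10 per thread need 200 threads.
const deviceIds = Array.from({ length: 2000 }, (_, i) => `device-${i}`);
console.log(groupForThreads(deviceIds, 10).length); // 200
```

Each group would then be handed to one worker, for example through `workerData`, so the thread count is set by `perThread` and not by the number of devices.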
Ok thank you! That is super helpful information @oh-frontend1! |
@oh-frontend1 what do you mean by "500 network IO"? Is it 500 client connections? If that's true and your application is I/O bound, Node should be able to handle much more than that. In most cases, you just need to follow the golden rule of Node (don't block the event loop). And if it's CPU-bound, then it's better to keep the number of worker threads close to the number of CPU cores and queue tasks when all members are busy (just like Sorry in advance if I misunderstood your needs.
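The pool-plus-queue approach described in this comment could look roughly like the sketch below. It is not taken from any library mentioned in the thread: `FixedPool`, `WORKER_SOURCE`, and `demo` are hypothetical names, and the worker body (squaring a number) stands in for real CPU-bound work.

```javascript
'use strict';
const os = require('os');
const { Worker } = require('worker_threads');

// Hypothetical worker body: squares each number it receives.
const WORKER_SOURCE = `
  const { parentPort } = require('worker_threads');
  parentPort.on('message', (n) => parentPort.postMessage(n * n));
`;

class FixedPool {
  constructor(size = Math.max(1, os.cpus().length)) {
    this.workers = []; // every worker, kept for shutdown
    this.idle = [];    // workers waiting for a task
    this.queue = [];   // tasks waiting for a worker
    for (let i = 0; i < size; i++) {
      const worker = new Worker(WORKER_SOURCE, { eval: true });
      this.workers.push(worker);
      this.idle.push(worker);
    }
  }

  // Queue a task; the promise resolves with the worker's reply.
  run(data) {
    return new Promise((resolve, reject) => {
      this.queue.push({ data, resolve, reject });
      this._dispatch();
    });
  }

  // Hand queued tasks to idle workers until one of the two runs out.
  _dispatch() {
    while (this.idle.length > 0 && this.queue.length > 0) {
      const worker = this.idle.pop();
      const task = this.queue.shift();
      const onMessage = (result) => {
        worker.off('error', onError);
        this.idle.push(worker);
        task.resolve(result);
        this._dispatch();
      };
      // A worker that errors is not returned to the pool in this sketch.
      const onError = (err) => {
        worker.off('message', onMessage);
        task.reject(err);
      };
      worker.once('message', onMessage);
      worker.once('error', onError);
      worker.postMessage(task.data);
    }
  }

  close() {
    return Promise.all(this.workers.map((w) => w.terminate()));
  }
}

// Five tasks share two workers; the extra tasks wait in the queue.
async function demo() {
  const pool = new FixedPool(2);
  const results = await Promise.all([1, 2, 3, 4, 5].map((n) => pool.run(n)));
  await pool.close();
  return results;
}
```

Running `demo().then(console.log)` would exercise the queue, since only two of the five tasks can be in flight at once. A production pool would also need to replace workers that fail, which this sketch leaves out.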
I have hit the same problem even with 1 worker thread which processes a huge amount of data (around a dozen GB).
What steps will reproduce the bug?
How often does it reproduce? Is there a required condition?
This problem always occurs.
What is the expected behavior?
I have to run at least 2000 worker threads at the same time.
What do you see instead?
The script crashes with a random GC error.
Additional information
I need to run at least 2000 threads at the same time, but I encounter 2 problems:
worker_threads consume a lot of memory, about 5MB of RSS for an empty thread, so I end up with 1500 threads using about 8GB of RAM, and more if the threads do any work. That is not the real problem, though, because my server has a large amount of RAM (>100GB).
With --max-old-space-size=81920 --max-semi-space-size=81920, the error is still there when RSS reaches 8GB.
Output of script
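To check the per-thread memory figure quoted in this report on another machine, a rough measurement can be sketched as follows. This is not the reporter's script, which is not preserved here; `startIdleWorkers` and `rssPerWorker` are hypothetical names, and the result is only an estimate because RSS changes for other reasons while the workers start.

```javascript
'use strict';
const { Worker } = require('worker_threads');

// Start `count` idle workers and resolve once all of them are online.
function startIdleWorkers(count) {
  const workers = [];
  const online = [];
  for (let i = 0; i < count; i++) {
    // An otherwise empty worker that stays alive by listening for messages.
    const worker = new Worker(
      "require('worker_threads').parentPort.on('message', () => {});",
      { eval: true }
    );
    workers.push(worker);
    online.push(new Promise((resolve, reject) => {
      worker.once('online', resolve);
      worker.once('error', reject);
    }));
  }
  return Promise.all(online).then(() => workers);
}

// Rough RSS growth per worker, in bytes.
async function rssPerWorker(count) {
  const before = process.memoryUsage().rss;
  const workers = await startIdleWorkers(count);
  const after = process.memoryUsage().rss;
  await Promise.all(workers.map((w) => w.terminate()));
  return (after - before) / count;
}
```

For example, `rssPerWorker(100).then((bytes) => console.log((bytes / 1048576).toFixed(1), 'MiB per worker'))` prints an estimate that can be compared with the roughly 5MB reported above. Flags such as `--max-old-space-size` bound the V8 JavaScript heap, while RSS also counts memory outside that heap, so raising them does not by itself lift other limits.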