worker_threads consuming so much memory and crash #32265

khoanguyen-3fc · 2020-03-14T17:59:17Z

Version: v12.16.1
Platform: Linux host-name 4.15.0-88-generic Proposal: return Promises as well as taking callbacks. #88-Ubuntu SMP Tue Feb 11 20:11:34 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Subsystem:

What steps will reproduce the bug?

const { Worker, isMainThread } = require('worker_threads');

if (isMainThread) {
    for (let i = 0; i < 2000; i++) {
        new Worker(__filename);
    }
} else {
    console.log(JSON.stringify(process.memoryUsage()));

    setInterval(() => {
        // Keep thread alive
    }, 1000);
}

How often does it reproduce? Is there a required condition?

This problem always occur.

What is the expected behavior?

I have to run at least 2000 worker thread at the same time.

What do you see instead?

The script crash with random GC error.

Additional information

I need to run at least 2000 thread at the same time, but there are 2 problem that I encounter:

The worker_thread are consuming so much memory, about 5MB in RSS for an empty thread, so I end up with 1500 threads and about 8GB RAM, and cost some more if the thread do something, but it wasn't the real problem, because my server have a large amount of RAM (>100GB)
The main problem is the script would crash at about 8GB RSS, I'd also try with --max-old-space-size=81920 --max-semi-space-size=81920, but the error still there when RSS reach 8GB

Output of script

// 1486 lines, 1487th line bellow
{"rss":8157556736,"heapTotal":4190208,"heapUsed":2382936,"external":802056}

<--- Last few GCs --->
[19127:0x7f5f4442fa80]    26606 ms: Scavenge 2.0 (2.7) -> 1.6 (3.7) MB, 1.8 / 0.0 ms  (average mu = 1.000, current mu = 1.000) allocation failure 


<--- JS stacktrace --->
Cannot get stack trace in GC.
FATAL ERROR: NewSpace::Rebalance Allocation failed - JavaScript heap out of memory
{"rss":8158093312,"heapTotal":4190208,"heapUsed":2388408,"external":802056}
 1: 0x9ef190 node::Abort() [node]

<--- Last few GCs --->
[19127:0x7f5fa442fb40]    24675 ms: Scavenge 2.0 (2.7) -> 1.6 (3.7) MB, 1.9 / 0.0 ms  (average mu = 1.000, current mu = 1.000) allocation failure 


<--- JS stacktrace --->
Cannot get stack trace in GC.
FATAL ERROR: NewSpace::Rebalance Allocation failed - JavaScript heap out of memory
{"rss":8158359552,"heapTotal":4190208,"heapUsed":2383584,"external":802056}
{"rss":8158359552,"heapTotal":4190208,"heapUsed":2375648,"external":802056}
 2: 0x9f13b2 node::OnFatalError(char const*, char const*) [node]

<--- Last few GCs --->
[19127:0x7f5fb842fb20]    27482 ms: Scavenge 2.0 (2.7) -> 1.6 (3.7) MB, 2.0 / 0.0 ms  (average mu = 1.000, current mu = 1.000) allocation failure 


<--- JS stacktrace --->
Cannot get stack trace in GC.
FATAL ERROR: NewSpace::Rebalance Allocation failed - JavaScript heap out of memory
 1: 0x9ef190 node::Abort() [node]
 3: 0xb5da9e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
 1: 0x9ef190 node::Abort() [node]
{"rss":8158846976,"heapTotal":4190208,"heapUsed":2375568,"external":802056}
 2: 0x9f13b2 node::OnFatalError(char const*, char const*) [node]
 4: 0xb5de19 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]

<--- Last few GCs --->
[19127:0x7f5fac42fb20]    27489 ms: Scavenge 2.0 (2.7) -> 1.6 (3.7) MB, 2.0 / 0.0 ms  (average mu = 1.000, current mu = 1.000) allocation failure 


<--- JS stacktrace --->
Cannot get stack trace in GC.
FATAL ERROR: NewSpace::Rebalance Allocation failed - JavaScript heap out of memory
 2: 0x9f13b2 node::OnFatalError(char const*, char const*) [node]
{"rss":8160190464,"heapTotal":4190208,"heapUsed":2390760,"external":802056}
{"rss":8160190464,"heapTotal":4190208,"heapUsed":2385720,"external":802056}
 3: 0xb5da9e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
 5: 0xd0a765  [node]
 1: 0x9ef190 node::Abort() [node]
 3: 0xb5da9e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
 4: 0xb5de19 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
 6: 0xd545ee  [node]

<--- Last few GCs --->
[19127:0x7f60b842fe20]    29692 ms: Scavenge 2.0 (2.7) -> 1.5 (3.7) MB, 1.8 / 0.0 ms  (average mu = 1.000, current mu = 1.000) allocation failure 


<--- JS stacktrace --->
Cannot get stack trace in GC.
FATAL ERROR: NewSpace::Rebalance Allocation failed - JavaScript heap out of memory
{"rss":8161587200,"heapTotal":3928064,"heapUsed":2385144,"external":802056}
 2: 0x9f13b2 node::OnFatalError(char const*, char const*) [node]
 4: 0xb5de19 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
 5: 0xd0a765  [node]
 7: 0xd58797 v8::internal::MarkCompactCollector::CollectGarbage() [node]
 3: 0xb5da9e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]

<--- Last few GCs --->
[19127:0x7f5ef4165610]    26880 ms: Scavenge 2.0 (2.7) -> 1.5 (3.7) MB, 1.8 / 0.0 ms  (average mu = 1.000, current mu = 1.000) allocation failure 


<--- JS stacktrace --->
Cannot get stack trace in GC.
FATAL ERROR: NewSpace::Rebalance Allocation failed - JavaScript heap out of memory
 1: 0x9ef190 node::Abort() [node]
 5: 0xd0a765  [node]
 6: 0xd545ee  [node]
 8: 0xd16c39 v8::internal::Heap::MarkCompact() [node]
 4: 0xb5de19 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
 2: 0x9f13b2 node::OnFatalError(char const*, char const*) [node]
{"rss":8161710080,"heapTotal":3928064,"heapUsed":2373360,"external":802056}
 1: 0x9ef190 node::Abort() [node]
 6: 0xd545ee  [node]
 7: 0xd58797 v8::internal::MarkCompactCollector::CollectGarbage() [node]
 9: 0xd179a3 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [node]
 5: 0xd0a765  [node]
 3: 0xb5da9e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
 2: 0x9f13b2 node::OnFatalError(char const*, char const*) [node]
 7: 0xd58797 v8::internal::MarkCompactCollector::CollectGarbage() [node]
 8: 0xd16c39 v8::internal::Heap::MarkCompact() [node]
10: 0xd18515 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
 6: 0xd545ee  [node]
 4: 0xb5de19 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
 3: 0xb5da9e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
 8: 0xd16c39 v8::internal::Heap::MarkCompact() [node]
 9: 0xd179a3 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [node]
11: 0xd1afcc v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
 7: 0xd58797 v8::internal::MarkCompactCollector::CollectGarbage() [node]
 5: 0xd0a765  [node]
{"rss":8161210368,"heapTotal":3665920,"heapUsed":2381024,"external":802056}
 4: 0xb5de19 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
 9: 0xd179a3 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [node]
10: 0xd18515 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
{"rss":8161210368,"heapTotal":3665920,"heapUsed":2388704,"external":802056}
12: 0xce7cae v8::internal::Factory::NewMap(v8::internal::InstanceType, int, v8::internal::ElementsKind, int) [node]
{"rss":8161210368,"heapTotal":3665920,"heapUsed":2360896,"external":802056}
 8: 0xd16c39 v8::internal::Heap::MarkCompact() [node]
 6: 0xd545ee  [node]
 5: 0xd0a765  [node]
10: 0xd18515 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
11: 0xd1afcc v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
13: 0xede9db v8::internal::Map::RawCopy(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, int, int) [node]
 9: 0xd179a3 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [node]
 7: 0xd58797 v8::internal::MarkCompactCollector::CollectGarbage() [node]
 6: 0xd545ee  [node]
{"rss":8161210368,"heapTotal":3665920,"heapUsed":2365384,"external":802056}
11: 0xd1afcc v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
{"rss":8160485376,"heapTotal":3403776,"heapUsed":2389984,"external":802056}
12: 0xce7cae v8::internal::Factory::NewMap(v8::internal::InstanceType, int, v8::internal::ElementsKind, int) [node]
{"rss":8160489472,"heapTotal":3403776,"heapUsed":2389456,"external":802056}
{"rss":8160489472,"heapTotal":3403776,"heapUsed":2397112,"external":802056}
14: 0xedf104 v8::internal::Map::CopyDropDescriptors(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>) [node]

<--- Last few GCs --->

[19127:0x7f60a8001010]    47768 ms: Scavenge 2.4 (4.2) -> 2.1 (4.0) MB, 1.6 / 0.0 ms  (average mu = 1.000, current mu = 1.000) allocation failure 


<--- JS stacktrace --->

FATAL ERROR: Committing semi space failed. Allocation failed - JavaScript heap out of memory
10: 0xd18515 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
 8: 0xd16c39 v8::internal::Heap::MarkCompact() [node]
 7: 0xd58797 v8::internal::MarkCompactCollector::CollectGarbage() [node]
12: 0xce7cae v8::internal::Factory::NewMap(v8::internal::InstanceType, int, v8::internal::ElementsKind, int) [node]
13: 0xede9db v8::internal::Map::RawCopy(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, int, int) [node]
{"rss":8150188032,"heapTotal":3403776,"heapUsed":2413400,"external":802056}
15: 0xedf1a6 v8::internal::Map::ShareDescriptor(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::DescriptorArray>, v8::internal::Descriptor*) [node]
 1: 0x9ef190 node::Abort() [node]
11: 0xd1afcc v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
 9: 0xd179a3 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [node]
 8: 0xd16c39 v8::internal::Heap::MarkCompact() [node]
13: 0xede9db v8::internal::Map::RawCopy(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, int, int) [node]
14: 0xedf104 v8::internal::Map::CopyDropDescriptors(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>) [node]
16: 0xedfcae v8::internal::Map::CopyAddDescriptor(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, v8::internal::Descriptor*, v8::internal::TransitionFlag) [node]
 2: 0x9f13b2 node::OnFatalError(char const*, char const*) [node]
12: 0xce7cae v8::internal::Factory::NewMap(v8::internal::InstanceType, int, v8::internal::ElementsKind, int) [node]
13: 0xede9db v8::internal::Map::RawCopy(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, int, int) [node]
 9: 0xd179a3 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [node]
14: 0xedf104 v8::internal::Map::CopyDropDescriptors(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>) [node]
15: 0xedf1a6 v8::internal::Map::ShareDescriptor(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::DescriptorArray>, v8::internal::Descriptor*) [node]
17: 0xedfe29 v8::internal::Map::CopyWithField(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::Name>, v8::internal::Handle<v8::internal::FieldType>, v8::internal::PropertyAttributes, v8::internal::PropertyConstness, v8::internal::Representation, v8::internal::TransitionFlag) [node]
 3: 0xb5da9e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
 4: 0xb5de19 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
14: 0xedf104 v8::internal::Map::CopyDropDescriptors(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>) [node]
15: 0xedf1a6 v8::internal::Map::ShareDescriptor(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::DescriptorArray>, v8::internal::Descriptor*) [node]
16: 0xedfcae v8::internal::Map::CopyAddDescriptor(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, v8::internal::Descriptor*, v8::internal::TransitionFlag) [node]
16: 0xedfcae v8::internal::Map::CopyAddDescriptor(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, v8::internal::Descriptor*, v8::internal::TransitionFlag) [node]
18: 0xee15f2 v8::internal::Map::TransitionToDataProperty(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::Name>, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::internal::PropertyConstness, v8::internal::StoreOrigin) [node]
10: 0xd18515 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
 5: 0xd0a765  [node]
10: 0xd18515 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
15: 0xedf1a6 v8::internal::Map::ShareDescriptor(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::DescriptorArray>, v8::internal::Descriptor*) [node]
17: 0xedfe29 v8::internal::Map::CopyWithField(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::Name>, v8::internal::Handle<v8::internal::FieldType>, v8::internal::PropertyAttributes, v8::internal::PropertyConstness, v8::internal::Representation, v8::internal::TransitionFlag) [node]
17: 0xedfe29 v8::internal::Map::CopyWithField(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::Name>, v8::internal::Handle<v8::internal::FieldType>, v8::internal::PropertyAttributes, v8::internal::PropertyConstness, v8::internal::Representation, v8::internal::TransitionFlag) [node]
19: 0xed1cdf v8::internal::LookupIterator::PrepareTransitionToDataProperty(v8::internal::Handle<v8::internal::JSReceiver>, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::internal::StoreOrigin) [node]
11: 0xd1a959 v8::internal::Heap::ReserveSpace(std::vector<v8::internal::Heap::Chunk, std::allocator<v8::internal::Heap::Chunk> >*, std::vector<unsigned long, std::allocator<unsigned long> >*) [node]
 6: 0xd182ee v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [node]
11: 0xd1afcc v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
16: 0xedfcae v8::internal::Map::CopyAddDescriptor(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, v8::internal::Descriptor*, v8::internal::TransitionFlag) [node]
18: 0xee15f2 v8::internal::Map::TransitionToDataProperty(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::Name>, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::internal::PropertyConstness, v8::internal::StoreOrigin) [node]
18: 0xee15f2 v8::internal::Map::TransitionToDataProperty(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::Name>, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::internal::PropertyConstness, v8::internal::StoreOrigin) [node]
19: 0xed1cdf v8::internal::LookupIterator::PrepareTransitionToDataProperty(v8::internal::Handle<v8::internal::JSReceiver>, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::internal::StoreOrigin) [node]
20: 0xf05566 v8::internal::Object::AddDataProperty(v8::internal::LookupIterator*, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::Maybe<v8::internal::ShouldThrow>, v8::internal::StoreOrigin) [node]
 7: 0xd18515 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
12: 0xce7cae v8::internal::Factory::NewMap(v8::internal::InstanceType, int, v8::internal::ElementsKind, int) [node]
17: 0xedfe29 v8::internal::Map::CopyWithField(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::Name>, v8::internal::Handle<v8::internal::FieldType>, v8::internal::PropertyAttributes, v8::internal::PropertyConstness, v8::internal::Representation, v8::internal::TransitionFlag) [node]
19: 0xed1cdf v8::internal::LookupIterator::PrepareTransitionToDataProperty(v8::internal::Handle<v8::internal::JSReceiver>, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::internal::StoreOrigin) [node]
20: 0xf05566 v8::internal::Object::AddDataProperty(v8::internal::LookupIterator*, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::Maybe<v8::internal::ShouldThrow>, v8::internal::StoreOrigin) [node]
21: 0xeb08e0 v8::internal::JSObject::DefineOwnPropertyIgnoreAttributes(v8::internal::LookupIterator*, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::Maybe<v8::internal::ShouldThrow>, v8::internal::JSObject::AccessorInfoHandling) [node]
22: 0xeb0bec v8::internal::JSObject::SetOwnPropertyIgnoreAttributes(v8::internal::Handle<v8::internal::JSObject>, v8::internal::Handle<v8::internal::Name>, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes) [node]
23: 0x10283ef  [node]
24: 0x102c399  [node]
25: 0x102d303 v8::internal::Runtime_CreateObjectLiteral(int, unsigned long*, v8::internal::Isolate*) [node]
26: 0x13a71b9  [node]

The text was updated successfully, but these errors were encountered:

gireeshpunathil · 2020-03-16T10:25:21Z

able to recreate. the report data shows this:

  "javascriptHeap": {
    "totalMemory": 4059136,
    "totalCommittedMemory": 3299464,
    "usedMemory": 2861680,
    "availableMemory": 104855004168,
    "memoryLimit": 104857600000,
    "heapSpaces": {
      "read_only_space": {
        "memorySize": 262144,
        "committedMemory": 33328,
        "capacity": 33040,
        "used": 33040,
        "available": 0
      },
      "new_space": {
        "memorySize": 1048576,
        "committedMemory": 1047944,
        "capacity": 1047424,
        "used": 633768,
        "available": 413656
      },
      "old_space": {
        "memorySize": 1654784,
        "committedMemory": 1602320,
        "capacity": 1602528,
        "used": 1600304,
        "available": 2224
      },
      "code_space": {
        "memorySize": 430080,
        "committedMemory": 170720,
        "capacity": 154336,
        "used": 154336,
        "available": 0
      },
      "map_space": {
        "memorySize": 528384,
        "committedMemory": 309984,
        "capacity": 309120,
        "used": 309120,
        "available": 0
      },
      "large_object_space": {
        "memorySize": 135168,
        "committedMemory": 135168,
        "capacity": 131112,
        "used": 131112,
        "available": 0
      },
      "code_large_object_space": {
        "memorySize": 0,
        "committedMemory": 0,
        "capacity": 0,
        "used": 0,
        "available": 0
      },
      "new_large_object_space": {
        "memorySize": 0,
        "committedMemory": 0,
        "capacity": 1047424,
        "used": 0,
        "available": 1047424
      }
    }
  }

and top (just before the crash):

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                            
 10138 root      20   0  0.319t 7.563g  13644 S 350.8  0.8   1:46.31 node

gireeshpunathil · 2020-03-16T10:27:09Z

there are many spaces seen as exhausted - such as code_space and map_space. how do I increase those? I am not sure which flags in node --v8-options to use

@nodejs/v8

addaleax · 2020-03-16T10:48:31Z

I think the big hint there might actually be VIRT reporting as 0.319t – maybe the process is running out of virtual memory? (That would be somewhat related to #25933)

gireeshpunathil · 2020-03-16T11:18:40Z

but I have unlimited virtual memory :

virtual memory (kbytes, -v) unlimited

plus the failing stack in the referenced issue has node::NewIsolate in it, which is not the case here, looks like we are doing gc?

@oh-frontend1 - what is your ulimit -v showing up?

khoanguyen-3fc · 2020-03-16T17:27:06Z

@gireeshpunathil

> ulimit -v
unlimited

and my report.json

  "javascriptHeap": {
    "totalMemory": 4452352,
    "totalCommittedMemory": 3517904,
    "usedMemory": 1448464,
    "availableMemory": 85947560576,
    "memoryLimit": 85949677568,
    "heapSpaces": {
      "read_only_space": {
        "memorySize": 262144,
        "committedMemory": 33088,
        "capacity": 32808,
        "used": 32808,
        "available": 0
      },
      "new_space": {
        "memorySize": 2097152,
        "committedMemory": 1683416,
        "capacity": 1047456,
        "used": 188368,
        "available": 859088
      },
      "old_space": {
        "memorySize": 1396736,
        "committedMemory": 1368440,
        "capacity": 1064504,
        "used": 897832,
        "available": 166672
      },
      "code_space": {
        "memorySize": 430080,
        "committedMemory": 170400,
        "capacity": 154016,
        "used": 154016,
        "available": 0
      },
      "map_space": {
        "memorySize": 266240,
        "committedMemory": 262560,
        "capacity": 175440,
        "used": 175440,
        "available": 0
      },
      "large_object_space": {
        "memorySize": 0,
        "committedMemory": 0,
        "capacity": 0,
        "used": 0,
        "available": 0
      },
      "code_large_object_space": {
        "memorySize": 0,
        "committedMemory": 0,
        "capacity": 0,
        "used": 0,
        "available": 0
      },
      "new_large_object_space": {
        "memorySize": 0,
        "committedMemory": 0,
        "capacity": 1047456,
        "used": 0,
        "available": 1047456
      }
    }
  },

gireeshpunathil · 2020-03-17T10:32:50Z

thanks @oh-frontend1 - so our failing contexts seem to match. Let me see if I can figure out what caused the gc to fail

gireeshpunathil · 2020-03-17T11:44:25Z

$ grep "ENOMEM" strace.txt | grep "mmap" | wc -l
1372184

there are several mmap calls that fail. Looking at the manual, the second probable reason stated is exhaustion of process' mappings.

$ sysctl vm.max_map_count
vm.max_map_count = 65530
$ sysctl -w vm.max_map_count=655300
vm.max_map_count = 655300

$ node --max-heap-size=100000 foo

{"rss":11898519552,"heapTotal":62894080,"heapUsed":32157136,"external":940898,"arrayBuffers":9386}
{"rss":14497640448,"heapTotal":62894080,"heapUsed":32192184,"external":940938,"arrayBuffers":9386}
{"rss":15572897792,"heapTotal":62894080,"heapUsed":32200208,"external":940938,"arrayBuffers":9386}
{"rss":16316686336,"heapTotal":62894080,"heapUsed":32203928,"external":940938,"arrayBuffers":9386}
...

$ top

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                            
 25537 root      20   0  0.575t 0.014t  13636 S 209.0  1.4   2:25.30 node

so by increasing the mapping count, I am able to create 4K threads, and consume upto 05.t of virtual and 15G of rss. So looks like adjusting maximum mapping count in the kernel is the solution for this. @oh-frontend1 - can you pls verify?

khoanguyen-3fc · 2020-03-17T14:51:59Z

@gireeshpunathil thank you, this solution also work on my real code.

jasnell · 2020-03-25T18:23:23Z

@oh-frontend1 ... I was wondering if you wouldn't mind expanding on the reason why you need a worker thread pool of several thousand workers. What is the scenario / app case you're exploring here. The reason I'm asking is that we (NearForm) are doing some investigation into worker thread perf diagnostics and the dynamic of profiling small worker pools (4-50) range is much different than profiling pools in the 2k-4k range and we'd like to understand the use case a bit more.

khoanguyen-3fc · 2020-03-27T14:32:25Z

@jasnell the application is confidential.

So, nothing much, in future, I have to monitor a large number of IoT device, having 500 network IO on same thread causing a large bottle neck on CPU, but split to child_process is hard to manage and communicate to master, so I decide using worker_thread.

And a simple case is one IO per thread, if I cannot resolve this problem, I would decide to increase number of IO per thread, but it will increase code complexity.

In this real application, as I benchmark, I can only create about ~200 threads and this error happened, so I would create a minimal source code to reproduce (and in this case, number of threads reached 1k5, before the error occurred)

jasnell · 2020-03-27T14:35:51Z

Ok thank you! That is super helpful information @oh-frontend1!

puzpuzpuz · 2020-03-27T17:08:33Z

@oh-frontend1 what do you mean by "500 network IO"? Is it 500 client connections? If that's true and your application is I/O bound, Node should be able to handle much more than that. In most cases, you just need to follow the golden rule of Node (don't block the event loop).

And if it's CPU-bound, then it's better to keep the number of worker threads close to number of CPU cores and queue tasks when all members are busy (just like ThreadPoolExecutor in Java does it). Otherwise, if you run CPU-bound tasks on a large number of threads, you will be wasting memory (due to the footprint per each worker thread) and CPU time spent on context switching on OS level.

Sorry in the advance, if I misunderstood your needs.

miklcct · 2025-01-13T11:13:51Z

I have hit the same problem even with 1 worker thread which processes a huge amount of data (around a dozen GB).

gireeshpunathil mentioned this issue Mar 17, 2020

assertion failure in ../src/node_worker.cc:647 with 4K worker threads #32319

Closed

khoanguyen-3fc closed this as completed Mar 17, 2020

addaleax mentioned this issue Jun 29, 2020

[Worker] heap out of memory nodejs/help#2809

Closed

drouarb mentioned this issue May 16, 2021

FATAL ERROR: NewSpace::Rebalance Allocation failed - JavaScript heap out of memory nodejs/help#2698

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

worker_threads consuming so much memory and crash #32265

worker_threads consuming so much memory and crash #32265

khoanguyen-3fc commented Mar 14, 2020

gireeshpunathil commented Mar 16, 2020

gireeshpunathil commented Mar 16, 2020

addaleax commented Mar 16, 2020

gireeshpunathil commented Mar 16, 2020

khoanguyen-3fc commented Mar 16, 2020 •

edited

Loading

gireeshpunathil commented Mar 17, 2020

gireeshpunathil commented Mar 17, 2020

khoanguyen-3fc commented Mar 17, 2020

jasnell commented Mar 25, 2020

khoanguyen-3fc commented Mar 27, 2020

jasnell commented Mar 27, 2020

puzpuzpuz commented Mar 27, 2020

miklcct commented Jan 13, 2025 •

edited

Loading

worker_threads consuming so much memory and crash #32265

worker_threads consuming so much memory and crash #32265

Comments

khoanguyen-3fc commented Mar 14, 2020

What steps will reproduce the bug?

How often does it reproduce? Is there a required condition?

What is the expected behavior?

What do you see instead?

Additional information

gireeshpunathil commented Mar 16, 2020

gireeshpunathil commented Mar 16, 2020

addaleax commented Mar 16, 2020

gireeshpunathil commented Mar 16, 2020

khoanguyen-3fc commented Mar 16, 2020 • edited Loading

gireeshpunathil commented Mar 17, 2020

gireeshpunathil commented Mar 17, 2020

khoanguyen-3fc commented Mar 17, 2020

jasnell commented Mar 25, 2020

khoanguyen-3fc commented Mar 27, 2020

jasnell commented Mar 27, 2020

puzpuzpuz commented Mar 27, 2020

miklcct commented Jan 13, 2025 • edited Loading

khoanguyen-3fc commented Mar 16, 2020 •

edited

Loading

miklcct commented Jan 13, 2025 •

edited

Loading