Commit cab95c7

joyeecheung authored and targos committed
src: fix near heap limit callback
- Use the allocated space size to calculate the raised heap limit, as that is what V8 uses to determine whether it should crash. Previously we used the used size for the calculation, which was too conservative and did not prevent the crashes effectively enough.
- Use RequestInterrupt() to take the snapshot, since we need to make sure that the heap limit is raised before the snapshot can be taken.
1 parent 2631a2f commit cab95c7

3 files changed: +87 -37 lines changed

doc/api/cli.md

Lines changed: 24 additions & 11 deletions
@@ -436,23 +436,36 @@ Writes a V8 heap snapshot to disk when the V8 heap usage is approaching the
 heap limit. `count` should be a non-negative integer (in which case
 Node.js will write no more than `max_count` snapshots to disk).
 
-When generating snapshots, garbage collection may be triggered and bring
-the heap usage down. Therefore multiple snapshots may be written to disk
-before the Node.js instance finally runs out of memory. These heap snapshots
-can be compared to determine what objects are being allocated during the
-time consecutive snapshots are taken. It's not guaranteed that Node.js will
-write exactly `max_count` snapshots to disk, but it will try
-its best to generate at least one and up to `max_count` snapshots before the
-Node.js instance runs out of memory when `max_count` is greater than `0`.
-
-Generating V8 snapshots takes time and memory (both memory managed by the
+Generating V8 heap snapshots takes time and memory (both memory managed by the
 V8 heap and native memory outside the V8 heap). The bigger the heap is,
-the more resources it needs. Node.js will adjust the V8 heap to accommodate
+the more resources it needs. When generating heap snapshots for this
+feature, Node.js will temporarily raise the V8 heap limit to accommodate
 the additional V8 heap memory overhead, and try its best to avoid using up
 all the memory available to the process. When the process uses
 more memory than the system deems appropriate, the process may be terminated
 abruptly by the system, depending on the system configuration.
 
+Heap snapshot generation could trigger garbage collections. If enough memory
+can be reclaimed after the garbage collection, the heap usage may go down
+and so multiple snapshots may be written to disk before the Node.js instance
+finally runs out of memory. On the other hand, since Node.js temporarily
+raises the heap limit before the heap snapshot is generated, and the limit
+only gets restored when the heap usage falls below it, if the application
+allocates reachable memory faster than what the garbage collector can keep up
+with, the heap usage could also go up and exceed the initial limit quite a bit
+until Node.js stops raising the heap limit.
+
+To control the number of heap snapshots to be written to disk, it is
+recommended to specify a value of `--heapsnapshot-near-heap-limit`.
+It's not guaranteed that Node.js will write exactly `max_count` snapshots
+to disk, but it will try its best to generate at least one and up to
+`max_count` snapshots before the Node.js instance runs out of memory when
+`max_count` is greater than `0`.
+
+When multiple heap snapshots are generated, they can be compared to determine
+what objects are being allocated during the time consecutive snapshots
+are taken.
+
 ```console
 $ node --max-old-space-size=100 --heapsnapshot-near-heap-limit=3 index.js
 Wrote snapshot to Heap.20200430.100036.49580.0.001.heapsnapshot

src/env.cc

Lines changed: 60 additions & 26 deletions
@@ -1562,14 +1562,31 @@ size_t Environment::NearHeapLimitCallback(void* data,
   size_t num_heap_spaces = env->isolate()->NumberOfHeapSpaces();
   for (size_t i = 0; i < num_heap_spaces; ++i) {
     env->isolate()->GetHeapSpaceStatistics(&stats, i);
+
+    Debug(env,
+          DebugCategory::DIAGNOSTICS,
+          "%s space_size = %" PRIu64 ", "
+          "space_used_size = %" PRIu64 ", "
+          "space_available_size = %" PRIu64 ", "
+          "physical_space_size = %" PRIu64 "\n",
+          stats.space_name(),
+          static_cast<uint64_t>(stats.space_size()),
+          static_cast<uint64_t>(stats.space_used_size()),
+          static_cast<uint64_t>(stats.space_available_size()),
+          static_cast<uint64_t>(stats.physical_space_size()));
+
+    // space_size() returns the allocated size of a given space,
+    // we use this to calculate the new limit because V8 also
+    // uses the allocated size to determine whether it should crash.
     if (strcmp(stats.space_name(), "new_space") == 0 ||
         strcmp(stats.space_name(), "new_large_object_space") == 0) {
-      young_gen_size += stats.space_used_size();
+      young_gen_size += stats.space_size();
     } else {
-      old_gen_size += stats.space_used_size();
+      old_gen_size += stats.space_size();
     }
   }
 
+  size_t total_size = young_gen_size + old_gen_size;
   Debug(env,
         DebugCategory::DIAGNOSTICS,
         "max_young_gen_size=%" PRIu64 ", "
@@ -1579,21 +1596,15 @@ size_t Environment::NearHeapLimitCallback(void* data,
         static_cast<uint64_t>(max_young_gen_size),
         static_cast<uint64_t>(young_gen_size),
         static_cast<uint64_t>(old_gen_size),
-        static_cast<uint64_t>(young_gen_size + old_gen_size));
+        static_cast<uint64_t>(total_size));
 
   uint64_t available = GuessMemoryAvailableToTheProcess();
   // TODO(joyeecheung): get a better estimate about the native memory
   // usage into the overhead, e.g. based on the count of objects.
-  uint64_t estimated_overhead = max_young_gen_size;
-  Debug(env,
-        DebugCategory::DIAGNOSTICS,
-        "Estimated available memory=%" PRIu64 ", "
-        "estimated overhead=%" PRIu64 "\n",
-        static_cast<uint64_t>(available),
-        static_cast<uint64_t>(estimated_overhead));
-
-  // This might be hit when the snapshot is being taken in another
-  // NearHeapLimitCallback invocation.
+  uint64_t estimated_overhead = young_gen_size;
+  // The new limit must be higher than current_heap_limit or V8 might
+  // crash.
+  uint64_t minimun_new_limit = static_cast<uint64_t>(current_heap_limit + 1);
   // When taking the snapshot, objects in the young generation may be
   // promoted to the old generation, result in increased heap usage,
   // but it should be no more than the young generation size.
@@ -1602,33 +1613,56 @@ size_t Environment::NearHeapLimitCallback(void* data,
   // new limit, so in a heap with unbounded growth the isolate
   // may eventually crash with this new limit - effectively raising
   // the heap limit to the new one.
+  uint64_t estimated_space_needed =
+      std::max(estimated_overhead + total_size, minimun_new_limit);
+
+  Debug(env,
+        DebugCategory::DIAGNOSTICS,
+        "Estimated available memory=%" PRIu64 ", "
+        "estimated overhead=%" PRIu64 "\n"
+        "estimated space needed=%" PRIu64 "\n",
+        static_cast<uint64_t>(available),
+        static_cast<uint64_t>(estimated_overhead),
+        static_cast<uint64_t>(estimated_space_needed));
+
+  // This might be hit when the snapshot is being taken in another
+  // NearHeapLimitCallback invocation.
+  // TODO(joyeecheung): turn this into
+  // DCHECK(!env->is_processing_heap_limit_callback_)
+  // when V8 ensures that the callback can't be nested.
   if (env->is_processing_heap_limit_callback_) {
-    size_t new_limit = current_heap_limit + max_young_gen_size;
     Debug(env,
           DebugCategory::DIAGNOSTICS,
           "Not generating snapshots in nested callback. "
           "new_limit=%" PRIu64 "\n",
-          static_cast<uint64_t>(new_limit));
-    return new_limit;
+          static_cast<uint64_t>(estimated_space_needed));
+    return estimated_space_needed;
   }
 
   // Estimate whether the snapshot is going to use up all the memory
   // available to the process. If so, just give up to prevent the system
   // from killing the process for a system OOM.
-  if (estimated_overhead > available) {
+  if (estimated_space_needed > available) {
     Debug(env,
           DebugCategory::DIAGNOSTICS,
           "Not generating snapshots because it's too risky.\n");
     env->isolate()->RemoveNearHeapLimitCallback(NearHeapLimitCallback,
                                                 initial_heap_limit);
-    // The new limit must be higher than current_heap_limit or V8 might
-    // crash.
-    return current_heap_limit + 1;
+
+    return minimun_new_limit;
   }
 
-  // Take the snapshot synchronously.
+  env->initial_heap_limit_ = initial_heap_limit;
   env->is_processing_heap_limit_callback_ = true;
+  env->isolate()->RequestInterrupt(TakeSnapshotInNearHeapLimitCallback, env);
+  // The new limit must be higher than current_heap_limit or V8 might
+  // crash.
+  return estimated_space_needed;
+}
 
+void Environment::TakeSnapshotInNearHeapLimitCallback(v8::Isolate* isolate,
+                                                      void* data) {
+  Environment* env = static_cast<Environment*>(data);
   std::string dir = env->options()->diagnostic_dir;
   if (dir.empty()) {
     dir = env->GetCwd();
@@ -1640,8 +1674,12 @@ size_t Environment::NearHeapLimitCallback(void* data,
 
   // Remove the callback first in case it's triggered when generating
   // the snapshot.
+  // TODO(joyeecheung): when V8 ensures that the callback can't be nested,
+  // we can simply remove the callback when env->heap_limit_snapshot_taken_
+  // reaches env->options_->heap_snapshot_near_heap_limit at the
+  // end of this interrupt.
   env->isolate()->RemoveNearHeapLimitCallback(NearHeapLimitCallback,
-                                              initial_heap_limit);
+                                              env->initial_heap_limit_);
 
   heap::WriteSnapshot(env, filename.c_str());
   env->heap_limit_snapshot_taken_ += 1;
@@ -1659,10 +1697,6 @@ size_t Environment::NearHeapLimitCallback(void* data,
   env->isolate()->AutomaticallyRestoreInitialHeapLimit(0.95);
 
   env->is_processing_heap_limit_callback_ = false;
-
-  // The new limit must be higher than current_heap_limit or V8 might
-  // crash.
-  return current_heap_limit + 1;
 }
 
 inline size_t Environment::SelfSize() const {
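
The limit arithmetic introduced in src/env.cc can be read in isolation from V8. The following is a minimal sketch of that calculation, not the Node.js implementation: `SpaceStat`, `HeapSizes`, `AccumulateSizes`, `EstimateSpaceNeeded`, and `TooRiskyToSnapshot` are hypothetical names invented for illustration, and the inputs stand in for what `GetHeapSpaceStatistics()` and `GuessMemoryAvailableToTheProcess()` would report.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for one v8::HeapSpaceStatistics entry.
struct SpaceStat {
  std::string name;
  uint64_t space_size;  // allocated size, as space_size() would report
};

// Hypothetical container for the per-generation totals.
struct HeapSizes {
  uint64_t young_gen_size = 0;
  uint64_t old_gen_size = 0;
};

// Sums allocated sizes per generation, using the same space names
// that the diff compares against.
inline HeapSizes AccumulateSizes(const std::vector<SpaceStat>& spaces) {
  HeapSizes sizes;
  for (const SpaceStat& s : spaces) {
    if (s.name == "new_space" || s.name == "new_large_object_space") {
      sizes.young_gen_size += s.space_size;
    } else {
      sizes.old_gen_size += s.space_size;
    }
  }
  return sizes;
}

// Follows the arithmetic in the diff: the overhead estimate is the young
// generation size, and the result is never lower than
// current_heap_limit + 1.
inline uint64_t EstimateSpaceNeeded(const HeapSizes& sizes,
                                    uint64_t current_heap_limit) {
  const uint64_t total_size = sizes.young_gen_size + sizes.old_gen_size;
  const uint64_t estimated_overhead = sizes.young_gen_size;
  const uint64_t minimum_new_limit = current_heap_limit + 1;
  return std::max(estimated_overhead + total_size, minimum_new_limit);
}

// True when the estimate exceeds the memory believed to be available,
// which is the case where the diff gives up on taking a snapshot.
inline bool TooRiskyToSnapshot(uint64_t estimated_space_needed,
                               uint64_t available) {
  return estimated_space_needed > available;
}
```

With a young generation of 16 units, an old generation of 90 units, and a current limit of 100, the sketch yields 16 + 106 = 122 as the raised limit. When the heap is small relative to the limit, the `current_heap_limit + 1` floor applies instead.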

src/env.h

Lines changed: 3 additions & 0 deletions
@@ -1392,6 +1392,8 @@ class Environment : public MemoryRetainer {
   inline void RemoveCleanupHook(CleanupCallback cb, void* arg);
   void RunCleanup();
 
+  static void TakeSnapshotInNearHeapLimitCallback(v8::Isolate* isolate,
+                                                  void* data);
   static size_t NearHeapLimitCallback(void* data,
                                       size_t current_heap_limit,
                                       size_t initial_heap_limit);
@@ -1524,6 +1526,7 @@ class Environment : public MemoryRetainer {
 
   bool is_processing_heap_limit_callback_ = false;
   int64_t heap_limit_snapshot_taken_ = 0;
+  size_t initial_heap_limit_ = 0;
 
   uint32_t module_id_counter_ = 0;
   uint32_t script_id_counter_ = 0;
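
The second change in this commit splits the work in two: the near-heap-limit callback only computes and returns the new limit, and the snapshot runs later from an interrupt, once the raised limit is in effect. The sketch below imitates that ordering without V8. `FakeIsolate`, `SnapshotState`, and `OnNearHeapLimit` are hypothetical, and the fake `RequestInterrupt` takes a `std::function` rather than the callback-plus-`void*` pair the diff passes to `RequestInterrupt()`.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>

// Hypothetical miniature of an engine that queues interrupts and runs
// them later. It is not the V8 API.
class FakeIsolate {
 public:
  using Interrupt = std::function<void()>;

  void RequestInterrupt(Interrupt cb) { pending_.push(std::move(cb)); }
  void set_heap_limit(size_t limit) { heap_limit_ = limit; }
  size_t heap_limit() const { return heap_limit_; }

  // Runs every queued interrupt in FIFO order.
  void RunPendingInterrupts() {
    while (!pending_.empty()) {
      Interrupt cb = std::move(pending_.front());
      pending_.pop();
      cb();
    }
  }

 private:
  std::queue<Interrupt> pending_;
  size_t heap_limit_ = 0;
};

// Hypothetical bookkeeping, loosely modelled on the Environment fields
// touched by the diff.
struct SnapshotState {
  bool is_processing = false;
  int snapshots_taken = 0;
  size_t limit_seen_by_snapshot = 0;
};

// The callback only picks the new limit and schedules the snapshot.
// A nested invocation returns the new limit without scheduling again.
inline size_t OnNearHeapLimit(FakeIsolate& isolate,
                              SnapshotState& state,
                              size_t new_limit) {
  if (state.is_processing) return new_limit;
  state.is_processing = true;
  isolate.RequestInterrupt([&isolate, &state] {
    // Stand-in for writing the snapshot: record the limit in effect.
    state.limit_seen_by_snapshot = isolate.heap_limit();
    state.snapshots_taken += 1;
    state.is_processing = false;
  });
  return new_limit;
}
```

In this model the caller applies the returned limit with `set_heap_limit()` before calling `RunPendingInterrupts()`, so the deferred snapshot step observes the raised limit rather than the one that triggered the callback.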
