-
Notifications
You must be signed in to change notification settings - Fork 30.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move 'internalBinding('heap_utils').createHeapDump()' to user land #23328
Comments
BTW this is also available as I think it makes sense to expose it through something like cc @addaleax @nodejs/v8 WDYT? |
I followed the discussion over the diagnostics group and @joyeecheung told that it's possible through the |
@vmarchaud Sort of, to get the heap snapshot through the inspector API the user needs to create an inspector session, which brings more overhead than just calling |
I'm not convinced that leveraging the inspector API should be needed to create a heapdump. Yes, it works, but it's not easy and quite error-prone. One could easily forget to close resources for example. Also, the inspector API has no backpressure mechanism, which could lead to memory issues. |
I see, but my point would be that if we expose a public API to make a HeapProfile, why not for the CPU Profile too ? I think it's more a debugging design question : should we expose all methods of the Inspector through high level method because it's "not easy" or "error-prone" ? I personally believe we should better document those methods with the |
I see your point. CPU profile would be welcome as well, now we come to speak about it ;-). Apart from the backpressure issue mentioned in my previous post, which makes using the inspector API not ideal, i guess it's a matter of taste. |
For the backpressure issue, i believe it's better for us to fix the issue upstream (in this case V8) than writing another implementation in Node ? I agree on the fact that it's a matter of taste, but i would personally argue that the core goal is to expose the API, and if it's too complex, a userland package could simplify it (which is easier to maintain/release). I believe the core made a similar decision with |
That is a good point, though I personally find the inspector API a bit strange in the case of taking profiles - it makes much more sense in the debugger/runtime domains. For one thing, if you take the profiles through the inspector API, there will be objects/functions in the inspector module showing up in the profiles, not a significant noise but there is still noise. Probably V8 can provide a way for us to mark those as hidden, but exposing the raw methods allows users to avoid the many hoops that the inspector jumps in order to call them in the backend - and they are already part of the V8 API anyway (and quite stable, actually). |
cc @nodejs/diagnostics |
@joyeecheung Sure, I’d be okay with making this public if that’s what people think is a good idea. One question might be whether we want to present them with parsed JSON or the “raw” stream we get from V8? |
@addaleax Maybe providing an option like |
A stream might be better, as it doesn't require the whole json blob to be retained in memory. |
Note that any decently sized heap dump is going to be too large to put into memory as a JS Object unless you want millions of objects to hit the GC and the size of them on the heap. DevTools libraries use typed arrays and views/short lived objects exactly because it is unusable for anything non-trivial. |
I think making parsing built-in might be a potential future improvement but it might make more sense as a userland-first feature (given that streaming JSON parsing isn't a thing core does yet). A stream of |
The V8 API only supports JSON as the only serialization format for heap snapshots right now, if we need to implement a serializer ourselves I'd prefer we defer that to the user land, or request V8 to implement other serialization formats instead of doing it ourselves.
Are you talking about how the inspector API emits chunks of heap snapshot? EDIT: oh, this is already available, we just need to implement another |
I think the stream interface is already part of the public API (lifted from v8::HeapProfiler* profiler = m_isolate->GetHeapProfiler();
v8::HeapSnapshot* snapshot = profiler->TakeHeapSnapshot(0, &resolver);
MyOutputStream stream();
snapshot->Serialize(&stream); Which should give us chunks of raw bytes I think. Actually, the current implementation in |
It can cause a lot of problems to return a heap snapshot in JS, since that uses heap. That doesn't mean it is never useful, but I hope it won't be the only way to get a heap in pure js. IIRC, the underlying V8 APIs allow setting a stream output destination, so the data can be written directly to disk and not fill up memory. If we add a wrapper around inspector, it would be useful to have similar functionality. |
@jkrems @sam-github The “issue” with that is that it’s still all happening synchronously… I think we’d want to change that in upstream V8 before exposing a real streaming interface? |
@addaleax Ah, sorry. Missed that part of your point. :) My vote would be to still expose it as a stream from day one. Even if it gets emitted in one synchronous burst of chunks, we should save on copying the data into strings (and/or parsing) and the interface wouldn't need to change if it becomes an asynchronous stream in the future. Not sure though if an asynchronous stream would be likely - maybe in the presence of the sampling heap profiler where the VM can continue running..? Anyhow, I think the point I care about is that I believe the initial interface should be Buffer chunks [details may vary] and not parsed or strings. Because my guess is that most people would process the data offline, not in a process they want to investigate. |
Can you do an async heap snapshot? If js is still being executed, the heap is mutating.... how can you snapshot a heap that is being changed? Seems a hard but maybe not impossible problem. With what we have now though, I think an API use case that is important is one in which no new js objects (even buffers) need to be allocated in order to retrieve the heap snapshot and write it to a file, which is possible (IIRC) with the sync streaming API exposed now. As another use case, If there was some way, as @jkrems says, to expose a streaming API for use in-node, that wouldn't have to change if the underlying implementation became a true stream, that would be great, but it might involve difficult predictions of the future. |
@sam-github I think what currently happens is that heap snapshot creation and serialization are two different steps. The creation step does not support streaming and is fully synchronous (and should be, to avoid JS heap mutations while it’s happening), the serialization step is the streaming part. That’s why I think async streaming could be doable if we want it to be – we don’t have to worry about heap allocations during streaming, because that’s always happening after the snapshot was taken. |
Just so we are clear: when talking about streaming, are we talking about the JS ReadableStreams? Or just some general mechanism like events emitted in the inspector API? |
@joyeecheung I think either would work fine? I was thinking about stream.Readable, but we could also do something simpler |
Is there anything i can do to move this forward? Or do we need more discussion on this topic? Thanks for the insights up until now. |
@paulrutter I guess the main issue here is still the question of what API format we want… If we are okay with a If we want something that requires no V8 changes, we need some kind of synchronous API instead. |
Is anyone working on moving this forward? Should it be put on the Diagnostics WG's agenda or something? Labeled |
Closing, as I believe this is done. Please reopen if I'm wrong. |
Thanks! In which major versions will it land? As this API is now available, the next step could be to have a way to hook into V8 isolate->SetOOMErrorHandler(OnOOMError); from JS code. |
I think exposing an OOM handler would be a separate request. Wouldn't its usefulness from JS be fairly limited though in an OOM situation? |
Ok, good to know.
I agree, but following the response of @joyeecheung, he would rather have a signal than directly create a heapdump, so the developer can decide what to do with it. |
You can create a heap snapshot via signal using the |
That's not my point; i would want a heapdump when an out of memory occurs. |
@paulrutter No, there currently is no signal or similar, and we can’t really execute JS from a real OOM handler. I do rememeber heapdump-on-OOM being discussed at the last @nodejs/diagnostics summit, but I can’t remember whether that was feasible or not. (Either way, we should either re-open this issue or open a new one, discussions on closed ones tend to get lost easily. I’d prefer opening a new one.) |
I'm unable to create heap dumps when the heap is large (eg.g 2GB and more). When trying to create the dump, the process memory just goes up and up to 3 and 4 times the size of the original heap until it eventually crashes. To reproduce I'm using the following code: let counter = 0; app.get("/memUsage", (req, res) => { app.get("/heapDump", (req, res) => { app.get("/", (req, res) => { app.listen(port, () => { I simply call /memUsage in a loop until the memory consumption reaches a few GBs: |
Not sure if it's the same issue, but i came across this a few times as well. |
@paulrutter thanks. This might indeed be the issue. Doesn't look like there's a solution though. |
This feature request comes from discussion in nodejs/diagnostics#239; ideally there would be a public API in nodejs that offers creating heapdumps.
There are several modules around (node-heapdump, v8-profiler, node-oom-heapdump etc) that offer this functionality, but it would make nodejs more mature if this is available in the public API.
@joyeecheung mentioned a way to generate a heapdump without help of an external module, but this seems a bit convoluted.
Phase 2 would be to have a way to hook into V8
isolate->SetOOMErrorHandler(OnOOMError);
from JS code, so a user can define custom actions in case of an out of memory. I will create a separate feature request for that one.The text was updated successfully, but these errors were encountered: