-
Notifications
You must be signed in to change notification settings - Fork 30.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
unhandledRejection does not have the correct stack trace. #5084
Comments
|
Correct. The stack leading upto See below an example of errors and
|
Just to make sure we're on the same page, and that I'm not missing anything, here's the tiny change that I made to node's master: diff --git a/src/node.cc b/src/node.cc
index ab58627..a91ece3 100644
--- a/src/node.cc
+++ b/src/node.cc
@@ -1011,6 +1011,9 @@ static bool ShouldAbortOnUncaughtException(Isolate* isolate) {
return isEmittingTopLevelDomainError || !DomainsStackHasErrorHandler(env);
}
+static void AbortOnPromiseRejection(PromiseRejectMessage message) {
+ abort();
+}
void SetupDomainUse(const FunctionCallbackInfo<Value>& args) {
Environment* env = Environment::GetCurrent(args);
@@ -1117,7 +1120,6 @@ void SetupPromises(const FunctionCallbackInfo<Value>& args) {
CHECK(args[0]->IsFunction());
- isolate->SetPromiseRejectCallback(PromiseRejectCallback);
env->set_promise_reject_function(args[0].As<Function>());
env->process_object()->Delete(
@@ -4136,6 +4138,8 @@ static void StartNodeInstance(void* arg) {
isolate->SetAbortOnUncaughtExceptionCallback(
ShouldAbortOnUncaughtException);
+ isolate->SetPromiseRejectCallback(
+ AbortOnPromiseRejection);
// Start debug agent when argv has --debug
if (instance_data->use_debug_agent()) and here's what I ran:
and finally here's the callstack that I get from the core file that was just generated:
The problem as I see it is that |
Correct, The reason they need to be in the call stack is far more evident if we use Seeing the arguments to functions on the stack is absolutely critical to debugging what is going on in Take for example the classic recursion problem, your trying to debug an infinite recursion exception about "Maximum call stack" you really want to see the functions that are being called recursively. Seeing the nearest promise that caught the recursion gives you less information then preferable. In fact seeing the stack to the nearest promise is probably as good as looking at the |
Thanks for doing this! This touches on something interesting. You're instrumenting the executor, which runs before For reference, that would look something like this: 'use strict'
const fs = require('fs')
function bar () {
throw new Error('unhandled rejection');
}
function foo () {
return bar();
}
const p = new Promise(function (resolve, reject) {
fs.readFile(__filename, (err, data) => { err ? reject(err) : resolve(data) })
}).then(() => {
foo()
}); I am curious: do you think the overlap between the post mortem debugging and promise-using audiences is such that there are a lot of folks that desire this debugging to be perfect? Or can the promise-using audience rely on advances in the Chrome dev tools, while non-promise-users rely on |
@chrisdickinson in terms of overlap and usages of nodejs in companies:
I think the overlap probably exists in enterprises. Take for example This definitely shows itself when you have 1 or 2 people writing for example, an "rpc" library or a high performance "json api" library that want to be able to do post-mortem debugging on their library which is embedded in >50 services and touched by >200 engineers. This becomes especially difficult if an application developer uses a database ORM library that has promises and then calls Our current response would be to try and debug their issue and then tell them that there is not enough information and we have to add a change their application to use libraries and techniques that enable post mortem debugging. This particular dynamic of application developers and library authors wrt Promises has not happened at Uber yet, but it will be far more likely to happen if promises are in core. |
I thought this was what the
I would think it's another data point and a great test to add to document and try to improve the post-mortem debugging support for promises. EDIT: since it suffers from similar issues as the previous test case.
I don't think it's about "perfect" debugging, it's about being able to debug anything at all using post-mortem debugging. As a lot of applications are built on top of third-party npm modules, it seems very likely (if not certain) that users of post-mortem debuggers will need to debug applications that use the promises-based APIs of node core using post-mortem debugging tools, or they'll need to make code changes to these modules. Making that more likely to happen before thinking about solutions to the issues mentioned in this thread would seem to be going against the work done by the @nodejs/post-mortem WG.
What do Chrome dev tools provide in terms of debugging promises? That sounds interesting!
I don't think post-mortem debugging users will have the luxury of not using any module making use of the new promises-based node core APIs. |
Even if post mortem debugging users have that luxury and control over their dependency tree (some of us do). The moment a caller wraps the callsite entry point into their code with a try/catch it hinders the ability to debug any bugs that occur in production using post mortem techniques. |
Sincere thanks to you all for taking the time to dig into this. Promises in core with good dumps would of course be incredible. |
What do you mean by this? Throwing/catching doesn't affect an error's |
@petkaantonov "Unwinding the stack" is a different case than the captured stack trace — it means that when the process aborts, the abort call is at the top of stack so that the resulting core file can be inspected to find the original stack leading to the exception. |
Oh I see. Btw the |
This flag doesn't make any sense with code you don't control. One needs to use a workaround for certain cases to avoid false positives, that cannot be expected from "low quality" npm modules. |
I think it's not possible to implement this. Thrown errors are caught no matter what. Explicitly rejecting should work, but it would result in false positives. For example Promise.reject().catch() would abort instantly |
The
The first message in this issue illustrates that one cannot use a
Could you please be more explicit? What "certain cases" do you have in mind? Is it the |
If you don't immediately call |
Right, this is impossible, it's not an artifact and it'll be the same way with any asynchronous error - those just get silenced with callbacks where promises provide a Of course, you can access the stack inside the |
Thank you, that's helpful to know! I'll definitely add that to the elements to consider when writing the results of my research. |
As @misterdjules notes, yes, let's keep this conversation polite & productive. It looks like there are a couple misunderstandings on both sides, so I'd like to define some terms here:
"Unwinding the Stack": This is separate from a stack trace — which gives an indication of "where" the error occurred, but less information about "how" an control passed to that point. When |
Promise rejection in general case is not an indication of "mortem". |
@vkurchatkin Continuing the discussion from #5020 (comment):
Right, the point is not necessarily that it's in a library or not, it's about differentiating programmer errors and operational errors. Users of post-mortem debugging handle programmer errors by generating a core file that can be inspected later to root-cause the problem. The issue with promises is that so far and If I understand them correctly, programmer errors and operational errors are not handled differently, and so it's impossible to differentiate them. Thus it seems promises in their current implementation are not compatible with post-mortem debugging. Adding a promises-based API in core means that core accepts and supports not being able to differentiate between the two types of errors, which I feel contradicts the work of the @nodejs/post-mortem WG and the current support for post-mortem debugging. Similar (not identical) issues have been caused by the addition of domains in core. Luckily, some work has been done to mitigate these issues, maybe the same work can be done with promises at least to inform the project on whether or not we want to accept promises-based API in core? |
Right, and that's not what I suggested. See my response above to @vkurchatkin: [Post-mortem debugging is] about differentiating programmer errors and operational errors. Users of post-mortem debugging handle programmer errors by generating a core file that can be inspected later to root-cause the problem. The issue with promises is that so far and If I understand them correctly, programmer errors and operational errors are not handled differently, and so it's impossible to differentiate them. |
You can do different kind of error handling depending on the context, the errors type and properties of the error. Suppose the program is leaking database handles. You will get handle allocation error, in this case you would abort the process and restart, to free all the leaked handles.. Suppose you get TypeError or ReferenceError or such. There might not be any need to abort, you can drop whatever context was related to this (e.g. one request). Suppose you get typical operational error like TimeoutError, in this case you would retry. Or if you have already done e.g. 5 retries, you would report it to the user. |
I sort of disagree with this definition: this describes an aspect of post-mortem debugging, but not post-mortem debugging as whole. I would broaden the definition — post-mortem debugging is about having the necessary tools and information available to piece back the series of events leading to an unexpected state in the program after the program has terminated. Determining the provenance of an error is a task one should be able to perform, but it's not the whole of the thing. |
But even currently you have no such thing since you only have the last event? In that case I'd argue the current promise setups in userland are already doing this much better with long stack traces that you can log in the unhandledRejection handler. This is currently unimplemented by Node.js for native promises, but it's in DevTools so I assume there is API for it. |
Yes, that's what's done in |
This doesn't work if someone catches and rethrows an error |
You can handle the Exception DebugEvent and manually PrintStackTrace and abort() there |
I don't think it's something that can be turned on in production. It looks like it debugger analyses every |
Absolutely, which is why users of post-mortem debugging carefully use try/catch blocks and throw statements. The problem that we're describing here is that promises implicitly add that try/catch block, and thus don't give you a choice of what errors are caught or uncaught. |
@misterdjules: How feasible do you think it would be for the V8 stack trace capture mechanism to additionally capture parameter values? If feasible, do you think that would make the case where the core contains an error object (but an undesirable stack state) more tenable? |
If you need to carefully use try catch and throw anyway, then what is the point of this issue? You dont need to be particularly careful about not using promises. |
@ChrisDickson @mrdjules @vkurchatin As an aside, Uber has a hacked up private fork of mdb that kind of works on Linux. Might be worth starting a conversation with post-mortem about Linux support. |
Because so far a lot of node's core (domains is a counter example of that) has been written in a way that core APIs throw when an error cannot be dealt with (on a programmer error, on an 'error' event being emitted but no handler being set, etc.). Users of post-mortem debugging rely on that convention to make their process abort and generate a core dump when such an error is thrown, and they can't handle it themselves with a try/catch block. Because promises add an implicit try/catch block around users' code, it seems that convention isn't respected anymore. |
That would be great! Would you mind opening an issue in nodejs/post-mortem? |
Yes but you need to use promises in order for this to happen. If you dont use promises then you are not affected. So non promise users who are not affected shouldnt block promise progress for thos who actually use them. |
What if I want to use the new promises-based core APIs and post-mortem debugging? How do we communicate that the two are basically incompatible? Like I said before, domains have similar issues with post-mortem debugging. I've fixed some of these issues and had to explain inconsistencies to many users. I think having two incompatible tools in the core runtime brings a lot of confusion and in the long term doesn't serve users and the project. Does it hurt to try to find a way to make them work together? That's all I'm trying to do here. And what if I can't avoid using the new promises-based core APIs? If core provides a promises-based API for most of its modules, then I would expect at least a significant part of the modules ecosystem to use them, otherwise, why bother doing it? A lot of node programs are built on top of small third-party modules, and post-mortem debugging users like myself unfortunately won't have the time to rewrite code that migrates to using core's promises based APIs. Of course any of those third-party modules could today wrap all their code in a giant try/catch, thus having a negative impact on post-mortem debugging too, but that's not what most of them are doing, because it's not the convention that has been built so far. By moving to using core's promises based APIs though, it seems that these modules, and thus the applications that use them, would get closer to that. Is it worth it to discuss and investigate how we can understand that better and maybe find solutions if we determine there are problems to consider? I think so.
I am not blocking anything, in fact I cannot block anything because that's not how the project works. I have raised questions around the impact of #5020 on post-mortem debugging, and I've provided some facts and code to illustrate these questions. We've already made progress in this discussion, and I'm sure we can make a lot more progress in the weeks to come to answers some of these questions. I believe answering these questions only helps the project and its users. Like @chrisdickinson mentioned before, the worse thing that can happen by doing that is that we understand better the potential problems, and by doing so better serve our users and the project. |
@misterdjules Indeed, I'm happy you broached this concern! I'm looking into what information V8 is privy to when generating stack traces, and it looks like it may contain the parameter values for each frame. If this true it opens an interesting venue for investigation: if we can preserve this information on the error object somehow instead of throwing it away when capturing the trace, this might greatly improve post-mortem capabilities in the general case, while resolving many of the concerns post-mortem has with the Promises PR. Specifically, this would relax the need for a process to abort at a given location on the stack — if the process aborts anywhere with an |
It's not really that post-mortem debuggers need a specific set of data and we could try to provide that data in different ways, but more that the core files they inspect need to be generated as soon as possible after the error happened, and without loosing any state. For instance, what if I want to inspect the content of a buffer instance? That content is in the heap, how do I know that that content has not been overwritten by some code that runs after the error has happened and before the process aborts if between these two events node and V8 can basically do a lot of extra processing, including garbage collection, memory allocation/deallocation, etc? What happens if I want to inspect the content of the stack frames but this content has been overwritten by calls to functions that handle errors after the uncaught error was thrown and before the process aborts? What if I want to inspect the state of globals objects that may have changed in the meantime too? These are just examples of the same idea that core files should reflect the exact state of the process at the time of the error. Does that answer your question? |
Just for fun, I experimented with a small hack that provides us with the ability to dump a core file at the time an uncaught error is thrown from within a promise's resolver. This hack currently covers only one specific use case that was mentioned in this issue, but I believe if we determine it's an approach worth considering, it wouldn't be too difficult to try to cover all other use cases. As seen in node's output when running this script and in mdb's I don't know yet if this hack can be considered a valid proof of concept, most importantly because contrary to the current behavior of At the very least, all other cases where an error can be thrown from a promise (not only from the resolver) would need to be handled, but it can be a starting point for discussing a potential fix for this problem. |
@misterdjules Maybe I missed it in the discussion but is there a reason something like this wouldn't work? diff --git a/src/node.cc b/src/node.cc
index 20be309..dde2be1 100644
--- a/src/node.cc
+++ b/src/node.cc
@@ -145,6 +145,7 @@ static bool use_debug_agent = false;
static bool debug_wait_connect = false;
static int debug_port = 5858;
static bool prof_process = false;
+static bool v8_abort_on_uncaught_exception = false;
static bool v8_is_profiling = false;
static bool node_is_initialized = false;
static node_module* modpending;
@@ -1094,6 +1095,9 @@ void SetupNextTick(const FunctionCallbackInfo<Value>& args) {
}
void PromiseRejectCallback(PromiseRejectMessage message) {
+ if (v8_abort_on_uncaught_exception)
+ abort();
+
Local<Promise> promise = message.GetPromise();
Isolate* isolate = promise->GetIsolate();
Local<Value> value = message.GetValue();
@@ -3864,10 +3868,12 @@ void Init(int* argc,
// behavior but it could also interfere with the user's intentions in ways
// we fail to anticipate. Dillema.
for (int i = 1; i < v8_argc; ++i) {
- if (strncmp(v8_argv[i], "--prof", sizeof("--prof") - 1) == 0) {
- v8_is_profiling = true;
- break;
- }
+ const char* const arg = v8_argv[i];
+#define V(flag, var) if (strncmp(arg, flag, sizeof(flag) - 1) == 0) var = true
+ V("--abort-on-uncaught-exception", v8_abort_on_uncaught_exception);
+ V("--abort_on_uncaught_exception", v8_abort_on_uncaught_exception);
+ V("--prof", v8_is_profiling);
+#undef V
}
#ifdef __POSIX__ The issue I see with your changes to promise.js is that they look to be breaking promise debugging when |
@bnoordhuis That would |
In addition to what @petkaantonov said, I had experimented with such a solution earlier and described why it isn't suitable.
OK, I'll look into that when I have more time tomorrow. Thank you for taking a look and giving your feedback! |
@misterdjules: Yes — though it seems to me that in many cases, aborting at the top of stack may not be possible. I suspect that this will be true of most promise-based programs. However, perhaps we can broaden the applicability of post-mortem debugging techniques to include programs that use The ephemeral stack data would be preserved and persisted by the error object, and the heap should be kept static for the most part. Only a small gap is introduced between entering the exceptional state and the abort process, while relaxing the need to exit at top of stack somewhat. In a promise chain like so:
If (I posted to this effect in the post-mortem WG thread as well.) |
V8 fires "unhandled rejection" and "rejection handled" events even when the debugger is not active. When the debugger is active however, V8 collects additional information when Promises are executed. I believe the problems you are talking about are all solved problems if the debugger is active. See https://github.com/v8/v8/tree/master/test/mjsunit/es6/debug-promises Short explanation: promise rejections are treated the same way as exception events in the debug listener. Rejected promises that have a reject handler are considered caught exceptions, those without are considered uncaught. The exception event is fired at the point of the rejection, so the stack has not been unwound yet. The promise object itself can be retrieved from the event data object. You can also use the promise mirror to get some reflection: https://github.com/v8/v8/blob/master/test/mjsunit/es6/mirror-promises.js Another gem you might be interested in: Chrome DevTools offers async stack traces. With that, you can actually see the asynchronous call chain leading up to Promise rejections. |
This issue is about being able to investigate production issues using core dumps. Enabling the debugger in production is, as far as I know, not standard practice. I would imagine that the impact on performance would not be acceptable, but I'd welcome any data that suggests otherwise.
Does that imply live debugging? If so that is outside of the scope of this issue, which is about post-mortem debugging in production. |
I see. Then the debugger indeed won't be a help. Note however that I don't expect the debugger to impact performance aside from calling to the callback on exception events though. JIT code only run slow if a break point has been set to it. |
Closing in favour of nodejs/post-mortem#16. For posterity, issue title is misleading: unhandled rejections in fact DO have the correct stack traces (that is, |
Given the recent PR ( #5020 ) I wanted to see if promises could enable core dump based debugging.
It looks like currently when an
unhandledRejection
occurs the stack is gone. This is very similar to how the stack is gone when anuncaughtException
occurs.We've added
--abort-on-uncaught-exception
flag to allow us to abort the process with the correct stack in C++. We need to implement--abort-on-unhandled-rejection
flag as well.See foo.js example
Note the error has the correct stacktrace including
foo
andreject
Note that the actual stack of the program when in the
unhandledRejection
listener is wrong. This means we cannot justprocess.abort()
in there since that's absolutely useless, I want to stack that mattered.I considered this a MUSTFIX before we land promises into core, if we cannot debug promises then there is not point in using it. The only good feature nodejs has is extreme debuggability.
cc: @chrisdickinson @misterdjules
The text was updated successfully, but these errors were encountered: