Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

EventEmitter fails to dispatch all handlers when preceeding handler throws && domains won't prevent #5114

Closed
CrabDude opened this issue Mar 22, 2013 · 31 comments

Comments

@CrabDude
Copy link

var EE = require('events').EventEmitter
var ee = new EE
ee.on('foo', function() {
  throw new Error
})
// This handler never fires b/c events are emitted synchronously and serially
ee.on('foo', console.log.bind(null, 'success!'))
ee.emit('foo')

Additionally, domains fail to prevent this (although they do catch the error):

var domain = require('domain')
var d = domain.create()
var EE = require('events').EventEmitter
var ee = new EE
ee.on('foo', function() {
  throw new Error
})
// This handler never fires b/c events are emitted synchronously and serially
ee.on('foo', console.log.bind(null, 'success!'))
d.on('error', console.log.bind(null, 'caught: '))
d.run(function() {
  ee.emit('foo')
})

Lastly, this error prevents sockets from being released back into the socket pool, because node is listening on the end event internally whose handler never gets fired because of an application code error (see #5107).

FWIW, this (both the above, and the socket release issue) works in trycatch because I wrap all handlers in a try/catch before dispatching.

If the error is occurring in core, we may not want to continue dispatching, but if it's occurring in application code, not only are we not in an undefined state, but the way EE short circuits is in fact creating an undefined state where sockets are not being released.

@trevnorris
Copy link

Thought there were some test cases covering this. I'll look into it.

@isaacs
Copy link

isaacs commented Mar 22, 2013

Just to be 100% clear: There has never been a guarantee that other handlers will fire if an error is thrown in an event handler.

We can try/finally to clean up sockets if there's an error in the end handler, but this is unequivocally NOT a new bug, and not something we've ever intended to support. If your domain catches an error, you are now in an undefined state. This was the same for process.on('uncaughtException'). It's stated all over the docs, and in every tutorial for domains.

Domains are not a "On Error Resume Next" type of thing. They're a "On Error Goto End" kind of thing. Your mission in the domain error handler is to die as gracefully as possible, as soon as possible.

@CrabDude
Copy link
Author

It seems this issue is abutting against other gray areas within node core. Rather than respond directly to your point,

Your mission in the domain error handler is to die as gracefully as possible, as soon as possible.

I'll try to make a more comprehensive point which should be clearer than responding directly to your point.

Error handling in node is remarkably inconsistent.

This inconsistency creates a lot of problems for a lot of people, and it doesn't help that the core team hasn't entirely solved the issues themselves yet. Amongst the error-related issues are:

  • To throw or not to throw (at all, ever)
  • When to try/catch
  • Which errors come back on the callback, which are thrown
  • When will the callback never be called
  • Application vs "runtime" errors
  • When and how to use uncaughtException
  • How to use domains

At the heart of all of these is consistency in how different types of errors are handled and responded to.

Typically, in these sorts of discussions it's broken down into "application" and "runtime", where application is the code the developer writes and runtime is everything else. Unfortunately, this categorization lacks important distinctions, most importantly that where the error occurs is independent of the type, an error or an exception. The difference between errors and exceptions conceptually is exacerbated a bit in JavaScript, because of the Error class, but more or less, exceptions are intended to be caught, while errors are often fatal in nature and recovery from errors is not possible. (If I'm technically getting the terminology wrong, it's not important to my point in that I'll use these definitions and terms consistently)

Instead, the following break down (from input to output) is a little more appropriate, and will help us better discuss appropriate response strategies:

Error Types

  1. Application Exceptions
  2. Input Exceptions
  3. Pre-IO Core Errors
  4. IO Errors
  5. IO Exceptions
  6. Post-IO Core Errors
  7. Post-IO Core Exceptions
  8. Repeat. (Application code)

The primary goal of this breakdown is to two-fold.

  • When should / do we throw and when should we pass?
  • Which require a process restart because of an undefined state, and which don't?

Application Exceptions

These are exceptions resulting from non-core code. These exceptions by definition have nothing to do with the current process state. Examples: null.foo, throw new Error('something\'s wrong.').
Characteristics:

  • Type: Exception
  • Thrown: Implicitly and Explicitly
  • Undefined State: false

Input Exceptions

These are exceptions that are currently thrown from node core when invalid input is provided.
Characteristics:

  • Type: Exception
  • Thrown: Explicitly
  • Undefined State: false

Pre-IO Core Errors

These are unexpected errors for any number of reason. These occur in core, and may be a result of a bug or a failure to check an input boundary that resulted in a pre-IO C++ error (See #4583).
Characteristics:

  • Type: Error
  • Thrown: Explicitly
  • Undefined State: true

IO Errors

These are exceptions external to node including: seg faults, out of memory, hardware issues, or network issues.
Characteristics:

  • Type: Exception
  • Thrown: No. Passed on callback
  • Undefined State: false

IO Exceptions

These are errors external to node including: file not found, etc..
Characteristics:

  • Type: Exception
  • Thrown: No. Passed on callback
  • Undefined State: false

Post-IO Core Errors

See "Pre-IO Core Errors".

Post-IO Core Exceptions

These in a way are a superset of "IO Errors" in that they are expected cases where we will be returning an error to the callback even when the bindings themselves don't return an error.
Characteristics:

  • Type: Exception
  • Thrown: No. Passed on callback
  • Undefined State: false

Post-IO Application Errors

See "Application Errors".

How Things Are

aka "How to deal with each type, and how to tell them apart"

There are essentially 3 major issues with the current Error handling that prevent us from ever catching any exceptions, and therefore always requiring a restart on any exception except "Post-IO Core Exceptions":

  1. There's no way to distinguish between "Input Exceptions" and "Pre-IO Core Errors". This is pretty major since effectively any failure to completely pre-validate input and avoid an input exception, will require a process restart. It's of course exacerbated by the fact that input requirements are not extensively documented and requires reading both the JavaScript and C++ source. (Again, see crypto.randomBytes throws in async mode #4583)
  2. try/catch is insufficient to catch application exceptions without parsing the stack trace (err.stack.split('\n')[1].indexOf(path.sep) !== -1) because you will also unintentionally catch Input Exceptions and Pre-IO Core Errors.
  3. In order to ensure callback execution, and therefore guarantee request response, any call to node core requires handling of both the "Input Exception" on the front-end and the "Post-IO Core Exception" on the back-end. In fact, you can't even rely upon domains to do this for you since they don't even catch synchronous Errors (domain: Synchronous error not caught #4770), requiring us to wrap all node core calls in a try/catch to avoid superfluous restarts caused by Invalid Input.

The result is that production code requires the following hoop jumping to ensure callbacks are called for all exceptions originating from a given function call:

function foo(callback) {
var d = domain.create()
try{
  d.run(function() {
    fs.readFile('bar', function(err, data) {
      d.exit()
      callback(err, data)
    })
  })
} catch(e) {
  process.nextTick(function() {
    callback(e)
  })
}

Why not just never use try/catch and use domains only for a quick 500 before shutting down?

  1. Exceptions are a common fact of life for a developer and their team.
  2. Node makes it nearly impossible to avoid invalid input exceptions due to point Writing to files in loops seems to be capped at 250 file.write calls. #1 above.
  3. Node's performance is severely hurt by constantly stopping and restarting processes.
  4. 99.99% of developers think domains were built to solve points 1 & 2. Based on your words here, they're obviously wrong.

There's no point binding every Buffer, EE, Stream, etc... to a domain if it's supposed to be collected very soon after on process exit. That's wasted effort and a hit to performance. That can be accomplished with a much lighter more performant implementation like I did in [email protected], which works great.
** I'm not downplaying domains. I think they should be utilized more fully.

Additionally, trycatch, addresses the exception issue as well. Yes, it catches all Errors, but since node has been increasingly stable, nearly all of them are "Application Exceptions", allowing you to return 500s before domains ever existed without requiring a restart. It runs in long running production code flawlessly without restarts or memory leaks with and without Errors. The server is much more reliable compared to the alternative of properly closing your server and gracefully handling any open requests before exiting, while constantly maintaining an appropriately sized process pool. I've done both. (Though I will add that I'll probably add support to trycatch to not catch core errors...)

Error Response Strategies

Now we can address the following:

  • Which require a process restart because of an undefined state, and which don't?
  • When should we throw and when should we pass?
  • How should we deal with each type?

Which Errors result in an undefined state and require a restart?

This is easy. Errors by definition require a restart, Exceptions do not. Both can be caught via process.on('uncaughtException', ...).

When should we throw and when should we pass?

This is easy too. Never throw, ever. Thrown Errors should be the exclusive domain of Errors in node.js, not Exceptions. We have an error passing paradigm, it's the callback. If it is a synchronous function, Exceptions should be passed as a return value similar to Google's Go. Please don't kill / force me to restart my server unnecessarily because your library throws instead of returning null or an Exception.

Even when there is no callback, it's likely that the program simply does not care about the result. Regarding node core throwing on invalid input, I would say node isn't "failing fast" so much as it's "failing unnecessarily" and hurting performance unnecessarily as well.

How should we deal with each type?

This is a bit lengthier, but here's a quick summary:

  • Errors should crash the process, and developers should restart gracefully.
  • Input Exceptions & Post-IO Exceptions should be passed by node to the callback.
  • Application Exceptions should be the only Errors caught by domains.

I'll update trycatch to fix all of the above.

Conclusion

Errors in node are broken, and none of the current solutions are sufficient. In direct response to your point,

Your mission in the domain error handler is to die as gracefully as possible, as soon as possible.

the core API, the domain documentation (See "Explicit Binding"), and my experience with error handling in node and JavaScript all seem to disagree with your assumption that it's necessary.

However, if it is true that this is how domains are meant to be used, it definitely needs to be added in bold to the documentation, and clarified why domains don't address the above in the ways they're capable because I've not met a person who uses domains as you suggest and it would be news to me and all of them.

Lastly,

  • Why do Application Exceptions require a process restart?
  • Why would wrapping EE handlers with try/catch create an undefined state? (especially when there's an active domain to pass the application Error to)

@mscdex
Copy link

mscdex commented Mar 23, 2013

RE: errors in general, what I usually do in my modules is throw if it's an error that can be detected immediately (bad/missing required input, current state does not permit the particular action, etc) and callback/'error' event for anything else.

@othiym23
Copy link

This is a fantastic writeup, Adam. It echoes a lot of my own frustration and confusion regarding how we're supposed to deal with exceptional cases in Node, and I agree in large part with what you have to say about domains. I think I need to spend spend some time thinking about this before I have any kind of substantial response, except to say that I think a lot of what you've done with trycatch belongs in Node proper, if only so we can handle errors consistently without having to pay a deoptimization penalty, if possible.

@trevnorris
Copy link

Nice breakdown. This has been a heavily debated topic (as I'm sure the
follow up posts will demonstrate). I would like a couple clarifications on
your part.

Are you suggesting node should start wrapped in a domain? In other words, a
domain is preemptively set.

While I mostly understand the theoretical behind your discussion, there are
a couple scenarios I can't account for:

  • JSON.parse will cause your application to crash if the JSON string is
    invalid. This would be an Input Exception, and is not part of node core. So
    it is dependent on the user to properly validate the string or place it in
    a try/catch.
  • There are differences in the API that require different handling of IO
    Exceptions. For example how to handle a File Not Found using read/readSync.
  • Errors like Out Of Memory force V8 to crash. We can pass a cc callback to
    be executed immediately, but we can't pass that asynchronously back to
    javascript.

@bnoordhuis
Copy link
Member

@CrabDude The undefined state that @isaacs is referring to is the fact that any mutable action that is interrupted by an exception can leave your application (and by extension, node core) in an inconsistent state. Consider this example:

function dispatch(arrayOfCallbacks) {
  if (!Array.isArray(arrayOfCallbacks)) throw new TypeError('not an array');  // (1)
  for (var i = 0; i < arrayOfCallbacks.length; i++)
    if (typeof arrayOfCallbacks[i] !== 'function') throw new TypeError('not a function');  // (2)
  for (var i = 0; i < arrayOfCallbacks.length; i++)
    arrayOfCallbacks[i]();  // (3)
}

Exceptions from (1) and (2) are safe because they don't mutate program state but only if the caller explicitly (and correctly!) handles the exceptions.

An exception from (3) is intrinsically unsafe because there is no telling a) how many callbacks ran before one raised an exception, and b) what side effects those callbacks had.

This is of course a contrived example but unless you code very, very carefully, you're going to run into a similar situation sooner or later. I don't trust myself to never make a mistake like that, let alone others. :-)

Apropos the 'uncaughtException' event: it's evil, don't use it.

@vicary
Copy link

vicary commented Mar 24, 2013

+1

Leaving participation for easy follow up, very nice discussion here.

@isaacs
Copy link

isaacs commented Mar 24, 2013

Ben covered the "undefined state" question quite nicely in his example. I'll only add: if throwing might sometimes cause an undefined state, then the question of whether or not it will cause an undefined state is, itself, undefined. You need to know exactly why that exception was thrown, and from where, and for what reason, in order to be sure that you're handling it correctly.

You list null.foo as an example of an "Application Exception". By your definition here, null.foo must be an error. Either:

  1. You expected an object here, but instead, you have null. This is a violation of your expectations, and who knows what other things might go wrong as a result. You're violating a contract by passing null. Or,
  2. You wrote a program that takes an expects null, but tries to read a property off of it. That program is clearly incorrect.

In JavaScript, there is no fundamental difference between "Error" and "Exception", really. Whenever you throw, you are jumping all the way up the call stack to the closest frame that had a try/catch wrapped around it, or all the way to the TryCatch object that node uses in C++ land. In the best case (where undefined state is not created, like you wrap a JSON.parse() in a try/catch), it's a goto. In the worst, it's a goto that jumps somewhere you can't predict.

@CrabDude
Copy link
Author

@trevnorris

  • JSON.parse throwing is generally (at least from what I've observed) acknowledge to be a poor API decision. The goal is to minimize the number of internal APIs that require being wrapped in a try/catch in order to determine success/failure.
    JSON.parse is a fantastic example of why it's bad to throw on Invalid Input. It is effectively cumbersome and redundant to pre-validate input, so it has now become a best practice to always wrap JSON.parse in a try/catch. Now, since every node API call is equivalent to JSON.parse, and assuming as Isaac suggests that on uncaughtException there's no way to determine if we're in an undefined state or not, we now have to wrap every sync and asyn, IO or non-IO, call in a try/catch! Or, conversely, requiring perfect input pre-validation in node is equivalent to expecting pre-validation of JSON before parsing it.
  • I'm less concerned about sync IO API error handling, though for consistency sake (in my proposal), I would prefer if they returned an error rather than throwing for the same reasons JSON.parse shouldn't throw. Again, there is precedent for this. (See Google Go (Really, read this, there's some good stuff here) and Google's C++ style guide)
  • Major Errors like that are exactly that, major. Node should do what it can to allow a graceful shutdown (logging, send 500s, etc...), but with errors like that, data loss is almost always unavoidable.

@bnoordhuis @isaacs
Neither of you answered my two questions as stated within the context of a proposed solution, (in other words, these are not general questions, they are contextual: "Taking into account my points about error types, node's error handling shortcomings, and my proposed solutions...")

  • Why do Application Exceptions require a process restart?
  • Why would wrapping EE handlers with try/catch create an undefined state?

nor did you address most of my points regarding both failures in the current error handling and potential solutions.

Isaac to be clear, I understand how throwing creates an undefined state. My response was entirely a summary of all possible avenues of errors and sources of undefined state, as well as a comprehensive list of potential strategies for addressing each avoiding undefined states.

The goal is to separate node (core) errors (and undefined state) from application errors (and undefined state). My proposal includes strategies for internally eliminating existing sources of undefined state, as well as separating application from core undefined states, thus requiring a restart only in issues like "Out of Memory" or core errors. Isaac, your example is correct that it could place the application in an undefined state, but from issues like #2582, most developers want that to be their concern, not node's. Equally, most developers want node to not ever enter into an undefined state to begin with. A major source of undefined state currently is application exceptions, which my proposal would eliminate entirely (their affect on core's state, not the exceptions themselves).

Per the original issue, EE handlers should be wrapped in try/catch to avoid any unnecessary internal undefined state. Errors in handlers and callbacks should be passed to the active domain, and then either emitted as uncaughtException with a property like err.internal === true or (my preference) emitted as an alternate event like process.on('error') or process.on('uncaughtApplicationException') to differentiate core errors (requiring restart) and application errors (node core safe, application state undefined).

In other words, developers want to be responsible for their own state, and want node core to be responsible for itself.

This is especially true in request/response-based daemons (server, SOA service, etc...) where most errors occur within the context of a request, and process can continue with a 500 on error.

I laid out when errors necessarily cause an undefined state, and when they do not. Please point out which of those are wrong, so we can discuss directly which assertions are correct and which are not rather than repeating what we are in agreement about, and what the original issue already addressed, that we are in fact in an undefined state in such situations. Please provide examples for your reasons regarding which assertions are wrong, or alternatively, provide reasons and examples for why the proposed strategies would be insufficient.

@gagle
Copy link

gagle commented Mar 26, 2013

@CrabDude, @isaacs resumed it very well:

Domains are not a "On Error Resume Next" type of thing. They're a "On Error Goto End" kind of thing. Your mission in the domain error handler is to die as gracefully as possible, as soon as possible.

When you get a domain error your mission is to clean up resources and die. There's no reason to continue alive, it's like having a NullPointerException in java programs. If your app is wrong then fix it and run it again.

You should have 1 domain error that wraps all your app and a domain per request. If the global domain receives an error you should kill the server. If a request throws an error it's ok to continue serving other requests but you should log the error with the highest priority in order to fix it asap.

If I'm correct, domains were introduced to wrap user requests/responses but have never been a global "try-catch-continue".

Concluding, if you get an error in the global domain, shutdown as graceful as you can. (spam)

@CrabDude
Copy link
Author

@gagle
Your comment contradicts itself and proceeds to repeat the original counter argument without providing any additional reasoning or examples to add to your point.

@isaacs stated that domains require a restart unequivocally. You agree and reiterated, yet go on to contradict yourself by suggesting:

 If a request throws an error it's ok to continue serving other requests

Either errors caught by domains require a restart in all cases or they don't. I agree completely that certain Errors in a request's async stack do not require a restart, and that's the argument I'm addressing, which cases do require a restart. My post lists them out. Please provide a direct response to one of my points. Repeating that uncaughtExceptions must always require a restart without providing either a reason why or an example unfortunately does not address any of my points.

Additionally, there's nothing special about Errors. We must learn to separate Errors from throw as it only exacerbates the issue.

/// I can throw a non-Error:
throw 'foo'
// And I can create and handle an error without throwing
callback(new Error('path required.'))

throw is special in its goto nature, but can be mitigated by wrapping the call site in a try/catch. Additionally, the second point is to separate Errors from Exceptions. The JSON.parse SyntaxError Exception is both a great example of being an Exception not an Error ( as no undefined state is created), and a horrible one because it throws its Exception when it should be returned (@isaacs agrees entirely on this point, see below). Say for example JSON.parse didn't throw a SyntaxError (an Exception) but threw an actual Error (one that creates an undefined state), well now the 99.99% of code that wraps it in a dumb try/catch (to catch the Exception) will catch the offending Error without ever realizing it was not the one they expected and continuing blindly in an undefined state. Of course the difference between node and JavaScript engines like V8 (at least in the browser) is that they do not really ever enter into an undefined state requiring a page refresh.

Servers do not require restarts on Invalid Input Exceptions, Application Exceptions, or Exceptions passed to callbacks. To suggest the first 2 require a restart when the latter does not merely because one throws where it should return and the other returns, is a contradiction. The undefined state is created only if the call site does not catch the Exception and allows it to propagate beyond the current stack slice, which is precisely why node core should not ever throw something @isaacs himself regularly points out as an anti-pattern.

@isaacs Honestly, I find your disagreement to be entirely inconsistent with your own previous stance on the subject. I agree completely with your previous stance on the issue.

Your words from try/catch/throw (was: My Humble Co-routine Proposal):

Throw is great.  I love throw.  Throw is how you crash a program.  It
should be like assert(0) in C.  If it's not worth crashing your
program, on par with a SyntaxError or something else that is a clear
indication of a programmer mistake (and not, say, invalid data
encountered at run time), it should not throw.

and:

The fact that JSON.parse throws is a bit hideous.  There is no
JSON.validate to check for syntax errors, so having JSON.parse throw
means that you *can't* interpret user-supplied JSON without a
try/catch.  I much prefer php's json_decode function, since it just
returns `null` on invalid input.  You can't reasonably argue that
this:

try {
  foo = JSON.parse(input)
} catch (er) {
  return "invalid data"
}

is more lightweight than:

foo = JSON.parse(input)
if (!foo) return "invalid data"

and finally:

In the end, in my opinion (which I am very ok with you disagreeing
with), the offensive thing about fibers and streamline is that they
encourage the use of try/catch, which blurs the line between errors
that are *mistakes* (accessing a property of null, calling .write() on
a stream after .end(), etc.), and those which are expected
application-level problems (invalid data, file missing, and so on).

Please reconcile your previous stance with my arguments here, which I suspect we are more in agreement on than not.

@gagle
Copy link

gagle commented Mar 26, 2013

@CrabDude It's not the same to handle a request error and handle an app error. Remember that node.js is a web server, it can do other things but its mainly purpose is to serve requests. My previous comment is not contradicted.

@CrabDude
Copy link
Author

@gagle You need to define "request error" as this is not a technical thing. (Obviously, I understand what you mean because in my original rebuttal I acknowledge the need to return 500s) What you refer to as a "request error" is what @isaacs refers to as an error. In other words, he does not make this distinction because there is not a distinction from a node perspective. From an application perspective, the distinction is one created by the developer and relative to the application.

Your contradiction is not that app errors and request erros are the same, it's that you fully agree with Isaac that domain errors unequivocally require a restart, yet suggest "requests errors" are an exclusion to that principle.

@isaacs
Copy link

isaacs commented Mar 26, 2013

Please reconcile your previous stance with my arguments here, which I suspect we are more in agreement on than not.

Ok, I'm lost. I still agree with what I said in that post you linked to, but I don't see how it actually has any relevance here. Can you make the link clearer?

JSON.parse() throws, which is annoying, because there's no way to reliably prevent it from receiving invalid input, so you end up having to wrap it in a try/catch. I don't think this is a good thing. I think it's a bad thing.

You keep talking about "all cases" and such, and really, I'm very uncomfortable throwing away context like that. (The point of domains is to add context to your throws, after all.)

Can you write like a short tl;dr summary of with some bullet points of what you're actually trying to accomplish here? This has gotten WAY off topic, and I have no idea what we're even talking about.

Regarding the OP:

  1. We're not going to wrap all handlers in a try{} in the emit() method. That would be prohibitively expensive, and usually completely the wrong thing. Furthermore, picking up right where we left off would be extremely complicated, since the language's tools for error handling are rather clumsy and simple.
  2. I will clarify the domain markdown to explain that it is almost always a good idea to shut down the process asap when a domain catches an error, unless you are 100% certain that you can safely continue.
  3. The failure to release the socket when a response 'end' handler throws an error is a great example of why throws mean you should get out of the process ASAP! If you look just a little harder, you'll find countless examples of memory leaks, invalid states, and blown assertions if you continue in the face of errors. This is not the Node Way.

So, I'm closing this issue, because the requested feature is not something we can or should do.

@isaacs isaacs closed this as completed Mar 26, 2013
@CrabDude
Copy link
Author

@isaacs
Response to your final points: (TLDR; to follow)

JSON.parse() throws, which is annoying, because there's no way to reliably prevent it from receiving invalid input, so you end up having to wrap it in a try/catch. I don't think this is a good thing. I think it's a bad thing.

This is exactly my point. The node.js API is unintentionally identical to JSON.parse by throwing on invalid input, yet in one case you believe catching is appropriate and the other a restart is required.

  1. try/catch is emphatically not slow when implemented properly. Wrapping handlers would be a net performance gain since it would avoid requiring a restart.
  2. We can significantly limit the number of situations where a domain error requires a restart, by handling "exceptions" better (mostly by passing them on the callback).
  3. Agreed, but only because node core is not properly catching and handling them properly.

@trevnorris
Copy link

@CrabDude Please refrain from posting any jsperf benchmarks as a reliable source. Instead write node specific benchmarks using the built in high precision timer process.hrtime, use IR Hydra to build execution graphs and analyse code paths and use a profiler to examine the amount of code that is actually running. Also you'll want to read additional comprehensive articles on how this type of context wrapping can kill performance.

Also I'm curious how else JSON.parse would let you know that it had trouble parsing the input? You can't check for null because JSON.parse('null') === null. You can't check for an Error object because JSON.parse(JSON.stringify(new Error)) is a valid object.

And can we please clarify Error and Exception? My understanding has always been Exceptions throw with the intention of being caught and Errors are unrecoverable, and will cause the program to crash. Though this can be confusing since throwing an Error (or any instance of it) is technically an Exception. And it's been accepted and documented that throw is used when an Exception is encountered, and that javascript's Exception types are Error constructors.

So really I think it boils down to this: Node must always throw for synchronous tasks, but we have the choice whether or not to throw or pass the Error when a callback is passed (of course if the callback is not a function then must still throw). Is it more complicated than this?

As a side, I do believe the documentation could be improved to be more specific about what input types are expected, and what the return type will be.

@CrabDude
Copy link
Author

@trevnorris
Thank you for your response.

I used jsperfs to avoid writing some myself (they're generally sufficient). I agree, we need proper performance test to verify any claims for node.

JSON.parse should return an Error, the same way I'm suggesting node sync calls should return and not throw Errors (similar to Go). Others in the community suggest the same (See @balupton's post on SO). JSON.parse(JSON.stringify(new Error)) should not throw since it is successfully parsing and returning {}.

You have Error vs Exception exactly right. Though, I am trying to separate the Exception vs Error semantic from the throw mechanism due the lack of async typed catches in node. I tried to distinguish Errors and Exceptions from Errors in my comments.

Node must always throw for synchronous tasks, but we have the choice
whether or not to throw or pass the Error when a callback is passed
(of course if the callback is not a function then must still throw).
Is it more complicated than this?

The OP / wrapping handlers is a separate issue, but apart from that you have it right[1]. Most of my responses are a comprehensive defense of the reasoning for passing these Exceptions to the callbacks. Every thrown Exception removed from core is one less case requiring a restart.

The closing of this issue was premature since I'm pretty sure we are in agreement, assuming @isaacs' 2 key concerns can be addressed:

  • Performance.
  • Would the following necessarily create an undefined state? (I've not seen any any evidence to support that it would)
  1. Catching handler errors
  2. Then emitting them on domain.active or uncaughtException (to allow for a process.exit)
  3. After the Exceptions is handled (process.exit not called) then ensuring all other handlers are executed.

We seem to disagree on the efficacy of the available solution(s).

If an undefined state can be avoided, you should avoid it, yet the current EE implementation not only fails to avoid it, but exacerbates it by not calling its own internal handler.

[1] I'm also arguing sync calls should return errors instead of throw, but I don't want to muddy the water now that we're on the same page.

@trevnorris
Copy link

Cool. To simplify this topic let's agree to continue based on the following:

  1. This discussion will only continue to address asynchronous functions (those which require a callback)
  2. Whether node should throw when validating argument types (e.g. expecting number, function, etc.)
  3. If all further Exceptions should be passed to the callback

For me (3) seems like a simple yes.

As for (2), there are a couple scenarios we must consider. For example, js developers are used to async functions returning undefined. So many times they won't check for a return value. Given the following:

var opts = { /* things here */ };
// oops. i'm assuming cb is a function
var cb = opts.cb;
var arg = opts.arg;
asyncFn(arg, cb);

If nothing is thrown here then it'll fail silently and you'll be in a world of hurt.

You also need to remember that Error constructors are very simplistic. If we don't generate an Error until the callback fails then you won't have a stacktrace back to the point at which the arguments were originally passed. This makes debugging a massive pain. The only alternative is to create an Error early then send it down later (see what had to be done in fs.js), but this is a massive perf hit for the 99.9% of the cases where you don't need it.

So imho (2) is a no.

@CrabDude
Copy link
Author

I don't entirely follow.

For Exceptions on Invalid Input:

throw new TypeError('Bad arguments');
// becomes
var err = new TypeError('Bad arguments');
return process.nextTick(function() {
  callback(err)
})

In your example, it currently fails silently anyways?

fs.readFile('foo') // no error

To me this is how it should be. If it is async, it should continue returning undefined and fail silently as this implies the user is unconcerned with any resulting Exceptions (fire and forget).

@CrabDude
Copy link
Author

@isaacs TLDR; #5149

@carlos8f
Copy link

To me this is how it should be. If it is async, it should continue returning undefined and fail silently as this implies the user is unconcerned with any resulting Exceptions (fire and forget).

The goal shouldn't be making it easier to ignore errors. After all, the reason for the err-first callback convention is to force you to deal with errors. Sometimes a crashed program (from an unexpected thrown exception) is just the right thing to break a lazy developer's cycle of ignoring errors, and for that, throw is actually a godsend :)

That said, I completely disagree that 2) async functions should throw (from input mistakes), since then there are 2 code paths that need to be created to deal with errors from the same function, making a bad interface. If you pass a callback, it should be guaranteed to run at some time or another, and should be the only way of handling errors.

@fresheneesz
Copy link

@isaacs

We're not going to wrap all handlers in a try{} in the emit() method. That would be prohibitively expensive

Can you justify that statement? I was under the impression that throwing and catching exceptions is the expensive part of a try catch - ie not the try itself. If an error doesn't happen, its very inexpensive, no?

@othiym23
Copy link

The JIT can't do anything with the statements inside a try block because throws are computed gotos and break any assumptions that might be made about inlining. Because EEs are used everywhere, in most of the hottest code paths inside Node's JS, this would have huge knock-on effects.

@bnoordhuis
Copy link
Member

That's true but it's easy to work around by moving the try/catch into a separate function. It's an extra function call but meh, lib/events.js uses the util.is*() functions all over the place and no one complains about that.

@othiym23
Copy link

Sure, just answering the question. Ultimately, this kind of stuff is up to @trevnorris, and his thoughts on this stuff are pretty widely known.

@fresheneesz
Copy link

Isn't the possibility of a thrown error true for almost every single line of code? Even if you don't have a try-catch block, you don't save yourself from the possibility of throws.

@othiym23
Copy link

It's not the throw, it's the catch, which requires the compiler to put guards around the possibility of a throw happening in the try block. If there's no catch, all the runtime has to do is capture the call stack on throw and hand the error off to the global exception handler (which is where Node's domain-handling code gets involved, incidentally). It's the possibility of handling a throw part way up the call stack that makes try / catch problematic for the compiler.

@fresheneesz
Copy link

Is this a problem in most languages that implement exceptions? I still don't quite understand, I would think a try-catch block would only require a simple context save, which could then be handled however unperformantly once an exception is actually thrown. What is the work that needs to be done when a catch is setup?

On a related note, I've heard that the performance hit comes from javascript (or perhaps v8 specifically) storing variables in the heap rather than on the stack when they're inside a try block. Do you know about the specifics of that?

@othiym23
Copy link

An optimizing compiler that does any kind of inlining, loop unrolling, or code removal is going to have problems with any control flow structures that can lead to unexpected branching. Sometimes static analysis can make strong guarantees about what happens on throw that can lead to efficient code being generated. When you have a dynamic language, little type information, and a JIT all in the mix, it's nearly impossible to do this on the fly, and V8 doesn't even try. Because the output of the JIT is machine code, the inline cache feeding the code generator needs to be able to reduce "context" to simple memory lookups, so it's not "just" saving a context or an activation record. (The same applies in the case of escape analysis where code in closures references the environment records outside the closure, which is what makes closures similarly tricky for V8's JIT.)

As to where memory gets allocated, I don't know much beyond the fact that the inline cache and the codegen (crankshaft) in V8 do what they can to keep hot variables in registers. I don't think that has an impact on how V8 deals with exceptions.

@fresheneesz
Copy link

Gotcha alright. Thanks for the explanations!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants