timers: fix subsequent enroll calls not working #19936

apapirovski · 2018-04-11T09:10:11Z

This PR resolves one recently introduced bug and one long-standing incompatibility with browser behaviour:

A bug was introduced in a recent semver-major PR which makes it impossible to enroll an already enrolled object. This resolves the issue by keeping the validation function at the top but moving the assignment of the potentially modified msecs below.
Node.js currently doesn't match browser behaviour when it comes to intervals that throw during their execution. In all browsers, the interval continues running but in Node.js it will never be rescheduled after a throw nor will the destroy async hook for that interval fire.

Edit: The second change has been backed out now.

Checklist

make -j4 test (UNIX), or vcbuild test (Windows) passes
tests and/or benchmarks are included
commit message follows commit guidelines

A bug was introduced in nodejs#17704 which meant that subsequent calls to enroll would unset the new _idleTimeout and the enrolled object could never again function as a timer.

apapirovski · 2018-04-11T09:13:33Z

CI: https://ci.nodejs.org/job/node-test-pull-request/14198/
CitGM: https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/1358/

benjamingr

Very nice

BridgeAR

LGTM

jasnell · 2018-04-11T14:22:58Z

Point 2 regarding throwing might push this into semver-major territory.

apapirovski · 2018-04-11T14:27:59Z

@jasnell It's marked as dont-land on all the other versions for now. The 1st change is fixing a semver-major commit that hasn't landed in any releases yet.

For the second change, I'm fine if we want to have it be semver-major. Or we could also see how it does in 10.x and backport if there's a reason to do so. (It does bring us in line with the browsers and the spec.)

Fishrock123

Ok uh I really think these should be in separate PRs:

PR 1 (or separate? whichever)

Commit 1, fixes an obvious bug of mine which I'm really surprised there wasn't a test for.
The part that makes the destroy hook fire should probably be fine and seem like a good idea... Maybe I'm for getting something though, since I'm pretty sure me and Trevor had it set like that for some reason.

PR 2

The browser compat part seems like a breaking change to me. Seems safer to separate it out.
- As in, technically it's a breaking change - but I think you might need to be doing pretty bad already to actually get caught by it.

Fishrock123 · 2018-04-11T22:30:52Z

test/parallel/test-timers-enroll-second-time.js

+};
+
+timers.enroll(enrollObj, 1);
+timers.enroll(enrollObj, 10);


More reliable: put the setTimeout before this and then just re-enroll with 1. The list order should guarantee the call order is as we expect without depending on libuv behavior.

Yeah but I want to indirectly test that it's getting the new value rather than just keeping the old one without depending on checking _idleTimeout. That could be another possible bug that someone could introduce.

apapirovski · 2018-04-12T06:24:10Z

Ok uh I really think these should be in separate PRs:

One of them fixes a breaking change and the other one, I suppose, is a breaking change (although I would call it a bug fix). Feels kind of unnecessary to open another PR now...

apapirovski · 2018-04-12T12:51:51Z

ping @nodejs/tsc since this would ideally make it into 10.x

benjamingr · 2018-04-12T12:55:47Z

I definitely think this should make it to v10 as semver-major, given a clean CITGM run

apapirovski · 2018-04-12T12:57:29Z

given a clean CITGM run

Think the one above was ok but I'm not 100% certain what's expected and what's not these days, given my absence for the past month and a bit.

The CI was fine too.

benjamingr · 2018-04-12T14:02:26Z

@apapirovski from what I can tell both were fine, it's just common phrasing to indicate that I'm +1 on landing and that that depends on the CI and citgm being green (which they currently are).

As in - if we make changes here (which I don't believe we need to at the moment tbh but others might disagree) we should run CITGM again.

apapirovski · 2018-04-12T14:03:52Z

As in - if we make changes here (which I don't believe we need to at the moment tbh but others might disagree) we should run CITGM again.

Haha we got our wires crossed. I think I got what you were saying but was trying to express I'm not 100% certain if the CITGM is actually fine or not since I'm not sure what the baseline is these days. 😆 (Indirectly prodding others to verify it for me...)

Fishrock123 · 2018-04-12T14:21:58Z

One of them fixes a breaking change and the other one, I suppose, is a breaking change (although I would call it a bug fix). Feels kind of unnecessary to open another PR now...

Not separating out breaking changes makes backporting significantly more difficult.

apapirovski · 2018-04-12T14:23:54Z

Not separating out breaking changes makes backporting significantly more difficult.

Yeah but there's nothing to back-port here... one is a fix for a semver-major PR (which is landing in 10.x) and the other is semver-major itself. I would understand if they were going into different release lines but right now there's no chance of that happening.

Fishrock123 · 2018-04-12T14:46:43Z

Oh heh. True.

Trott · 2018-04-12T20:12:11Z

Re-running the two failed CI subtasks:

https://ci.nodejs.org/job/node-test-commit-linux-containered/3658/
https://ci.nodejs.org/job/node-test-commit-arm/15352/

Trott

LGTM. Needs one more TSC approval to land, I think.

benjamingr · 2018-04-12T21:10:58Z

@Trott

LGTM. Needs one more TSC approval to land, I think.

Just wondering - Why?

Trott · 2018-04-12T21:41:55Z

Just wondering - Why?

Breaking changes require sign-off from two TSC members.

From https://github.com/nodejs/node/blob/565fd50b4ac81c8c3fca6a08ffcbc59a4b7b2504/COLLABORATOR_GUIDE.md#consensus-seeking:

Note that breaking changes (that is, pull requests that require an increase in
the major version number, known as semver-major changes) must be elevated for
review by the TSC. This does not necessarily mean that the
PR must be put onto the TSC meeting agenda. If multiple TSC members approve
(LGTM) the PR and no Collaborators oppose the PR, it should be landed. Where
there is disagreement among TSC members or objections from one or more
Collaborators, semver-major pull requests may be put on the TSC meeting
agenda.

(Also: If that text is not sufficiently clear, then that's my fault and let's fix that. I'm 100% certain that the "multiple TSC members" is meant as "approval by at least 2 TSC members is required before landing" because I'm the one who came up with the original language. #7955)

Fishrock123 · 2018-04-12T23:03:57Z

I'd like to think about the breaking change here a bit more still - we already don't follow browser behavior in several places and we aren't obligated to. Our platform is quite different from the browser for long running tasks and I'd like to consider the implications in people's applications a bit more.

Trott · 2018-04-13T05:45:50Z

I'd like to think about the breaking change here a bit more still - we already don't follow browser behavior in several places and we aren't obligated to. Our platform is quite different from the browser for long running tasks and I'd like to consider the implications in people's applications a bit more.

@Fishrock123 If you want to delay landing this to give more time to think about it, please put a red-X "Request Changes" on it. Otherwise, someone using git node land on this isn't going to get notified that it is in fact not yet ready to land.

Trott · 2018-04-13T05:46:59Z

(Also: It might be helpful if you could give a ballpark for how much longer you'd like to think. Days? Weeks? Something else?)

apapirovski · 2018-04-13T06:27:57Z

Our platform is quite different from the browser for long running tasks and I'd like to consider the implications in people's applications a bit more.

I don't think the current behaviour is intuitive. There's no indication that the timer won't get rescheduled just because the callback threw an error. In fact, it's mostly an implementation detail.

If someone is swallowing uncaughtException (which they probably shouldn't) or is using domains, then this interval just silently stops executing. In a situation where such an interval is meant to do some async work that is network dependant an error could be as simple as getting the wrong response from another service or something. Doesn't mean the interval should stop executing.

mcollina

I'm convinced our current behavior is better for Node.js.
However, I've not seen any good reasoning why we should adopt the browser way of handling errors in this regard, apart from "to increase compatibility". Can you please make a case for this change apart from "to increase compatibility"?

I'd prefer for the enroll() bug to be on a different PR.

mcollina · 2018-04-13T07:27:43Z

If someone is swallowing uncaughtException (which they probably shouldn't) or is using domains, then this interval just silently stops executing. In a situation where such an interval is meant to do some async work that is network dependant an error could be as simple as getting the wrong response from another service or something. Doesn't mean the interval should stop executing.

Keep executing it can also cause memory leaks, and sudden behavior changes and bugs exposed in not-well-tested path in code. This is a very edge case: basically we are saying that there is a programmer error (goes to uncaughtException) and the user did not crash their application. This story looks very much like a memory leak to me.
Rescheduling can definitely cause a memory or file descriptor leak: as an example, the setInterval could potentially create a new socket each time.
I do not see how doing this improve things for our users in this condition.

apapirovski · 2018-04-13T07:48:42Z

I fundamentally disagree that compatibility with the spec and browsers is a bad thing.

Anyway, I'll back out the setInterval rescheduling change from this PR and open a separate one for it.

mcollina · 2018-04-13T07:56:45Z

You got me wrong. I think that increasing compatibility is a good thing but I’m not convinced that potentially adding memory leaks to our users code is worth it.

apapirovski · 2018-04-13T08:04:14Z

@mcollina I backed out the change and left the bug fix. Do you mind rescinding the requested changes?

Opening another two PRs with two possible solutions to the interval issue.

mcollina

LGTM

apapirovski · 2018-04-13T08:48:19Z

Ok, opened two separate PRs for the setInterval issue:

non-semver-major that fixes the destroy hook not being called and can be backported across all release lines: #20001
semver-major from this PR: #20002

BridgeAR

Still LGTM

apapirovski · 2018-04-14T07:32:46Z

CI: https://ci.nodejs.org/job/node-test-pull-request/14277/

(one commit was taken out, let's make sure it wasn't crucial to tests passing)

jasnell · 2018-04-14T18:02:38Z

CI failure is unrelated.

BridgeAR · 2018-04-16T02:29:45Z

Seems like the test is flaky:

https://ci.nodejs.org/job/node-test-binary-arm/434/RUN_SUBSET=1,label=pi1-docker/console

05:14:26 not ok 227 parallel/test-timers-enroll-second-time
05:14:26   ---
05:14:26   duration_ms: 3.79
05:14:26   severity: fail
05:14:26   exitcode: 1
05:14:26   stack: |-
05:14:26     (node:14533) [DEP0095] DeprecationWarning: timers.enroll() is deprecated. Please use setTimeout instead.
05:14:26     assert.js:250
05:14:26         throw err;
05:14:26         ^
05:14:26     
05:14:26     AssertionError [ERR_ASSERTION]: The expression evaluated to a falsy value:
05:14:26     
05:14:26       assert(didCall)
05:14:26     
05:14:26         at Object.enrollObj._onTimeout.common.mustCall (/home/iojs/build/workspace/node-test-binary-arm/test/parallel/test-timers-enroll-second-time.js:10:37)
05:14:26         at Object._onTimeout (/home/iojs/build/workspace/node-test-binary-arm/test/common/index.js:467:15)
05:14:26         at ontimeout (timers.js:427:11)
05:14:26         at tryOnTimeout (timers.js:289:5)
05:14:26         at listOnTimeout (timers.js:252:5)
05:14:26         at Timer.processTimers (timers.js:212:10)

apapirovski · 2018-04-16T06:51:52Z

Didn't want to check _idleTimeout but I guess there's no other option to avoid flakiness (other than huge timeout values and/or putting it in sequential). Should be fixed.

New CI: https://ci.nodejs.org/job/node-test-pull-request/14323/

apapirovski · 2018-04-17T14:03:15Z

Landed in 734eb17

@jasnell this should get into v10.x if at all possible to prevent a regression.

A bug was introduced in #17704 which meant that subsequent calls to enroll would unset the new _idleTimeout and the enrolled object could never again function as a timer. PR-URL: #19936 Reviewed-By: Benjamin Gruenbaum <[email protected]> Reviewed-By: Ruben Bridgewater <[email protected]> Reviewed-By: Rich Trott <[email protected]> Reviewed-By: Matteo Collina <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Jeremiah Senkpiel <[email protected]>

Johnnyboy87

``

timers: fix subsequent enroll calls not working

e838531

A bug was introduced in nodejs#17704 which meant that subsequent calls to enroll would unset the new _idleTimeout and the enrolled object could never again function as a timer.

apapirovski added dont-land-on-v4.x labels Apr 11, 2018

apapirovski added this to the 10.0.0 milestone Apr 11, 2018

nodejs-github-bot added the timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout. label Apr 11, 2018

apapirovski force-pushed the patch-timers-bugs branch from 4423534 to 7ee052e Compare April 11, 2018 09:11

benjamingr approved these changes Apr 11, 2018

View reviewed changes

BridgeAR approved these changes Apr 11, 2018

View reviewed changes

BridgeAR added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Apr 11, 2018

jasnell requested a review from Fishrock123 April 11, 2018 14:21

Fishrock123 added the semver-major PRs that contain breaking changes and should be released in the next major version. label Apr 11, 2018

Fishrock123 reviewed Apr 11, 2018

View reviewed changes

Trott approved these changes Apr 12, 2018

View reviewed changes

mcollina requested changes Apr 13, 2018

View reviewed changes

apapirovski removed the semver-major PRs that contain breaking changes and should be released in the next major version. label Apr 13, 2018

apapirovski force-pushed the patch-timers-bugs branch from 7ee052e to e838531 Compare April 13, 2018 08:03

apapirovski changed the title ~~timers: resolve two edge-case bugs~~ timers: fix subsequent enroll calls not working Apr 13, 2018

mcollina approved these changes Apr 13, 2018

View reviewed changes

BridgeAR approved these changes Apr 13, 2018

View reviewed changes

jasnell approved these changes Apr 14, 2018

View reviewed changes

BridgeAR removed the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Apr 16, 2018

fixup: less flaky test

d487717

apapirovski added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Apr 16, 2018

Fishrock123 approved these changes Apr 16, 2018

View reviewed changes

apapirovski closed this Apr 17, 2018

apapirovski deleted the patch-timers-bugs branch April 17, 2018 14:03

Johnnyboy87 reviewed Feb 25, 2024

View reviewed changes

timers: fix subsequent enroll calls not working #19936

timers: fix subsequent enroll calls not working #19936

Conversation

apapirovski commented Apr 11, 2018 • edited Loading

Checklist

apapirovski commented Apr 11, 2018

benjamingr left a comment

Choose a reason for hiding this comment

BridgeAR left a comment

Choose a reason for hiding this comment

jasnell commented Apr 11, 2018

apapirovski commented Apr 11, 2018

Fishrock123 left a comment • edited Loading

Choose a reason for hiding this comment

PR 1 (or separate? whichever)

PR 2

Fishrock123 Apr 11, 2018

Choose a reason for hiding this comment

apapirovski Apr 12, 2018 • edited Loading

Choose a reason for hiding this comment

apapirovski commented Apr 12, 2018 • edited Loading

apapirovski commented Apr 12, 2018

benjamingr commented Apr 12, 2018

apapirovski commented Apr 12, 2018

benjamingr commented Apr 12, 2018

apapirovski commented Apr 12, 2018 • edited Loading

Fishrock123 commented Apr 12, 2018

apapirovski commented Apr 12, 2018

Fishrock123 commented Apr 12, 2018

Trott commented Apr 12, 2018

Trott left a comment

Choose a reason for hiding this comment

benjamingr commented Apr 12, 2018

Trott commented Apr 12, 2018

Fishrock123 commented Apr 12, 2018 • edited Loading

Trott commented Apr 13, 2018

Trott commented Apr 13, 2018

apapirovski commented Apr 13, 2018 • edited Loading

mcollina left a comment

Choose a reason for hiding this comment

mcollina commented Apr 13, 2018

apapirovski commented Apr 13, 2018

mcollina commented Apr 13, 2018

apapirovski commented Apr 13, 2018

mcollina left a comment

Choose a reason for hiding this comment

apapirovski commented Apr 13, 2018

BridgeAR left a comment

Choose a reason for hiding this comment

apapirovski commented Apr 14, 2018

jasnell commented Apr 14, 2018

BridgeAR commented Apr 16, 2018

apapirovski commented Apr 16, 2018

apapirovski commented Apr 17, 2018

Johnnyboy87 left a comment

Choose a reason for hiding this comment

apapirovski commented Apr 11, 2018 •

edited

Loading

Fishrock123 left a comment •

edited

Loading

apapirovski Apr 12, 2018 •

edited

Loading

apapirovski commented Apr 12, 2018 •

edited

Loading

apapirovski commented Apr 12, 2018 •

edited

Loading

Fishrock123 commented Apr 12, 2018 •

edited

Loading

apapirovski commented Apr 13, 2018 •

edited

Loading