Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix timer .unref() handle management once and for all #1330

Closed
wants to merge 5 commits into from

Conversation

indutny
Copy link
Member

@indutny indutny commented Apr 2, 2015

See: #1152
See: #1151
See: #1075 (may be)

@indutny
Copy link
Member Author

indutny commented Apr 2, 2015

function destroyOnTimeout() {
if (!this.owner._repeat)
this.owner.close();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should probably be:

function destroyOnTimeout() {
  if (!this.owner._repeat &&
      this.owner._idleNext._idleNext === this.owner)
    this.owner.close();
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, but may be it should not be there at all. I'll figure out the proper way of doing it in a bit.

@Fishrock123 Fishrock123 added the timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout. label Apr 2, 2015
@brycebaril
Copy link
Contributor

Built locally and tried out the test cases, did not appear fix the repro cases in #1151 or #1264

#1264 is a different timer issue, but the test case there is also a reliable test case for leaked timers -- if they leak it never exits.

bryce@x220:/tmp$ NODE_DEBUG=timer ~/forks/io.js/iojs 1151.js 
TIMER 2343: unenroll
TIMER 2343: unenroll: list empty

assert.js:88
  throw new assert.AssertionError({
        ^
AssertionError: 2 === 0
    at process.<anonymous> (/tmp/1151.js:16:10)
    at emitOne (events.js:77:13)
    at process.emit (events.js:166:7)
    at Immediate.process.exit (node.js:659:17)
    at processImmediate [as _immediateCallback] (timers.js:371:17)
bryce@x220:/tmp$ NODE_DEBUG=timer ~/forks/io.js/iojs 1264.js 
TIMER 2364: timeout callback 1
TIMER 2364: now: 2767035398
TIMER 2364: unenroll
TIMER 2364: unenroll: list empty
TIMER 2364: 1 list empty
TIMER 2364: timeout callback 1
TIMER 2364: now: 2767035551
TIMER 2364: 1 list empty
TIMER 2364: timeout callback 1
TIMER 2364: now: 2767035703
TIMER 2364: 1 list empty
TIMER 2364: timeout callback 1
TIMER 2364: now: 2767035856
TIMER 2364: 1 list empty
TIMER 2364: timeout callback 1
TIMER 2364: now: 2767036010
^C

@thlorenz has also been doing a lot of research around these various timers issues and may want to take a peek here

@Fishrock123
Copy link
Contributor

@brycebaril yes, #1264 is different.*

This (in general) fixes timer's .unref() in terms of handle use.

#1264 has to do with Thornsten's research into uv keeping the thread loop open.

Edit: * probably.. timers is the real worm can though
Edit2: ahhh it's getting late and I just saw that comment was just about the test specifically T_T

@Fishrock123 Fishrock123 changed the title Fix timers once and for all Fix timer .unref() handle management once and for all Apr 3, 2015
@brycebaril
Copy link
Contributor

@Fishrock123 what I'm trying to say is this patch doesn't fix the problem

The two fixes pulled from joyent/node improve behavior, then commit indutny@c93bb3a makes things worse.

#1264 is a different issue, but the test itself is good at illustrating the leaked handles in addition to to the uv issue.

I added a few more debug() calls and a timestamp, here is the output:

This pull request without commit indutny@c93bb3a:

bryce@x220:/tmp$ NODE_DEBUG=timer ~/forks/io.js/iojs 1264.js
TIMER 10155: 1428029019164: timeout callback 1
TIMER 10155: 1428029019174: now: 2777259515
TIMER 10155: 1428029019175: wrapper start
TIMER 10155: 1428029019176: unenroll
TIMER 10155: 1428029019176: unenroll: list empty
TIMER 10155: 1428029019176: creating hidden _handle
TIMER 10155: 1428029019253: have a _handle (already unrefd)
TIMER 10155: 1428029019253: wrapper end
TIMER 10155: 1428029019253: 1 list empty
TIMER 10155: 1428029019254: wrapper start
TIMER 10155: 1428029019254: forwarding unref to hidden _handle
TIMER 10155: 1428029019329: have a _handle (already unrefd)
TIMER 10155: 1428029019329: wrapper end
TIMER 10155: 1428029019329: wrapper start
TIMER 10155: 1428029019329: forwarding unref to hidden _handle
TIMER 10155: 1428029019402: have a _handle (already unrefd)
TIMER 10155: 1428029019402: wrapper end

assert.js:88
  throw new assert.AssertionError({
        ^
AssertionError: 3 == 1
    at process.<anonymous> (/tmp/1264.js:17:10)
    at emitOne (events.js:77:13)
    at process.emit (events.js:166:7)

The assertion error is because there is no fix for #1264, but we can't even get that far if the handles are leaked.

with indutny@c93bb3a:

bryce@x220:/tmp$ NODE_DEBUG=timer ~/forks/io.js/iojs 1264.js
TIMER 11032: 1428029307775: timeout callback 1
TIMER 11032: 1428029307784: now: 2777548124
TIMER 11032: 1428029307785: wrapper start
TIMER 11032: 1428029307785: unenroll
TIMER 11032: 1428029307785: unenroll: list empty
TIMER 11032: 1428029307785: creating hidden _handle
TIMER 11032: 1428029307860: have a _handle (already unrefd)
TIMER 11032: 1428029307861: wrapper end
TIMER 11032: 1428029307861: 1 list empty
TIMER 11032: 1428029307861: wrapper start
TIMER 11032: 1428029307861: forwarding unref to hidden _handle
TIMER 11032: 1428029307934: no _handle, setting active
TIMER 11032: 1428029307935: wrapper end
TIMER 11032: 1428029307936: timeout callback 1
TIMER 11032: 1428029307936: now: 2777548276
TIMER 11032: 1428029307936: wrapper start
TIMER 11032: 1428029307936: forwarding unref to hidden _handle
TIMER 11032: 1428029308011: have a _handle (already unrefd)
TIMER 11032: 1428029308011: wrapper end
TIMER 11032: 1428029308011: 1 list empty
TIMER 11032: 1428029308011: wrapper start
TIMER 11032: 1428029308011: forwarding unref to hidden _handle
TIMER 11032: 1428029308084: no _handle, setting active
TIMER 11032: 1428029308084: wrapper end
TIMER 11032: 1428029308085: timeout callback 1
TIMER 11032: 1428029307785: unenroll
TIMER 11032: 1428029307785: unenroll: list empty
TIMER 11032: 1428029307785: creating hidden _handle
TIMER 11032: 1428029307860: have a _handle (already unrefd)
TIMER 11032: 1428029307861: wrapper end
TIMER 11032: 1428029307861: 1 list empty
TIMER 11032: 1428029307861: wrapper start
TIMER 11032: 1428029307861: forwarding unref to hidden _handle
TIMER 11032: 1428029307934: no _handle, setting active
TIMER 11032: 1428029307935: wrapper end
TIMER 11032: 1428029307936: timeout callback 1
TIMER 11032: 1428029307936: now: 2777548276
TIMER 11032: 1428029307936: wrapper start
TIMER 11032: 1428029307936: forwarding unref to hidden _handle
TIMER 11032: 1428029308011: have a _handle (already unrefd)
TIMER 11032: 1428029308011: wrapper end
TIMER 11032: 1428029308011: 1 list empty
TIMER 11032: 1428029308011: wrapper start
TIMER 11032: 1428029308011: forwarding unref to hidden _handle
TIMER 11032: 1428029308084: no _handle, setting active
TIMER 11032: 1428029308084: wrapper end
TIMER 11032: 1428029308085: timeout callback 1
TIMER 11032: 1428029308086: now: 2777548426
TIMER 11032: 1428029308086: wrapper start
TIMER 11032: 1428029308086: forwarding unref to hidden _handle
TIMER 11032: 1428029308161: have a _handle (already unrefd)
TIMER 11032: 1428029308161: wrapper end
TIMER 11032: 1428029308161: 1 list empty
TIMER 11032: 1428029308161: wrapper start
TIMER 11032: 1428029308161: forwarding unref to hidden _handle
TIMER 11032: 1428029308234: no _handle, setting active
TIMER 11032: 1428029308234: wrapper end
TIMER 11032: 1428029308236: timeout callback 1
TIMER 11032: 1428029308236: now: 2777548576
TIMER 11032: 1428029308236: wrapper start
TIMER 11032: 1428029308236: forwarding unref to hidden _handle
^C

You can see the no _handle, setting active and creating hidden _handle multiple times because it is losing the reference. I also have to kill the application because the leaked handles cause #1264 to never allow exit.

As for the test for #1151:

With this pull request change but without indutny@c93bb3a:

bryce@x220:/tmp$ NODE_DEBUG=timer ~/forks/io.js/iojs 1151.js 
TIMER 11939: 1428029667416: unenroll
TIMER 11939: 1428029667425: unenroll: list empty
TIMER 11939: 1428029667426: creating hidden _handle
TIMER 11939: 1428029667427: wrapper start
TIMER 11939: 1428029667427: have a _handle (already unrefd)
TIMER 11939: 1428029667427: wrapper end

assert.js:88
  throw new assert.AssertionError({
        ^
AssertionError: 1 === 0
    at process.<anonymous> (/tmp/1151.js:16:10)
    at emitOne (events.js:77:13)
    at process.emit (events.js:166:7)
    at Immediate.process.exit (node.js:659:17)
    at processImmediate [as _immediateCallback] (timers.js:384:17)

With indutny@c93bb3a:

bryce@x220:/tmp$ NODE_DEBUG=timer ~/forks/io.js/iojs 1151.js 
TIMER 11068: 1428029583908: unenroll
TIMER 11068: 1428029583917: unenroll: list empty
TIMER 11068: 1428029583918: creating hidden _handle
TIMER 11068: 1428029583918: wrapper start
TIMER 11068: 1428029583918: no _handle, setting active
TIMER 11068: 1428029583919: wrapper end

assert.js:88
  throw new assert.AssertionError({
        ^
AssertionError: 2 === 0
    at process.<anonymous> (/tmp/1151.js:16:10)
    at emitOne (events.js:77:13)
    at process.emit (events.js:166:7)
    at Immediate.process.exit (node.js:659:17)
    at processImmediate [as _immediateCallback] (timers.js:385:17)

Here's the diff with the extra debug messages:

diff --git a/lib/timers.js b/lib/timers.js
index 225e2af..fbe6ce8 100644
--- a/lib/timers.js
+++ b/lib/timers.js
@@ -4,7 +4,15 @@ const Timer = process.binding('timer_wrap').Timer;
 const L = require('_linklist');
 const assert = require('assert').ok;
 const util = require('util');
-const debug = util.debuglog('timer');
+const _debug = util.debuglog('timer');
+function debug() {
+  var args = [];
+  for (var i = 0; i < arguments.length; i++) {
+    args[i] = arguments[i];
+  }
+  args[0] = Date.now() + ": " + args[0];
+  _debug.apply(null, args);
+}
 const kOnTimeout = Timer.kOnTimeout | 0;
 const kOnUnrefTimeout = Timer.kOnUnrefTimeout | 0;

@@ -272,14 +280,18 @@ exports.setInterval = function(callback, repeat) {
   return timer;

   function wrapper() {
+    debug('wrapper start');
     timer._repeat.call(this);
     // If timer is unref'd (or was - it's permanently removed from the list.)
     if (this._handle) {
+      debug('have a _handle (already unrefd)');
       this._handle.start(repeat, 0);
     } else {
+      debug('no _handle, setting active');
       timer._idleTimeout = repeat;
       exports.active(timer);
     }
+    debug('wrapper end');
   }
 };

@@ -311,6 +323,7 @@ function destroyOnTimeout() {

 Timeout.prototype.unref = function() {
   if (this._handle) {
+    debug('forwarding unref to hidden _handle');
     this._handle.unref();
   } else if (typeof(this._onTimeout) === 'function') {
     var now = Timer.now();
@@ -322,6 +335,7 @@ Timeout.prototype.unref = function() {
     // Prevent running cb again when unref() is called during the same cb
     if (this._called && !this._repeat) return;

+    debug('creating hidden _handle');
     this._handle = new Timer();
     this._handle.owner = this;
     this._handle[kOnUnrefTimeout] = destroyOnTimeout;

@brycebaril
Copy link
Contributor

@indutny the destroyOnTimeout function never gets called. You can add a process.abort(); to the first line of that function and make test still passes 100%.

@indutny
Copy link
Member Author

indutny commented Apr 3, 2015

@brycebaril I'm afraid that tests you are talking about are unrelated to what I'm trying to fix here.

@indutny indutny force-pushed the fix/timers-once-and-for-all branch from c93bb3a to 13cd8e2 Compare April 3, 2015 08:51
@indutny
Copy link
Member Author

indutny commented Apr 3, 2015

@Fishrock123 @trevnorris PTAL, I've fixed it in a proper way.

@brycebaril
Copy link
Contributor

@indutny they DO very much have to do with the timer leak, and your latest commits to the branch get us back into a workable state. 😄

bryce@x220:/tmp$ NODE_DEBUG=timer ~/forks/io.js/iojs 1264.js 
TIMER 12598: timeout callback 1
TIMER 12598: now: 2822822025
TIMER 12598: unenroll
TIMER 12598: unenroll: list empty
TIMER 12598: 1 list empty

assert.js:88
  throw new assert.AssertionError({
        ^
AssertionError: 2 == 1
    at process.<anonymous> (/tmp/1264.js:17:10)
    at emitOne (events.js:77:13)
    at process.emit (events.js:166:7)

The failed assert is a separate issue, so disregard that -- if the timers are leaked you can never get to the assert. Before your latest commits the destroyOnTimeout function was unreachable so the required cleanup never happened.

@indutny
Copy link
Member Author

indutny commented Apr 3, 2015

@brycebaril sorry, I'm terrible english speaker. Does it work now?

@brycebaril
Copy link
Contributor

@indutny yes, it works now

@indutny
Copy link
Member Author

indutny commented Apr 3, 2015

Yaaay!

@brycebaril
Copy link
Contributor

Here's a swag at a test that demonstrates the timers being leaked which is fixed by this PR:

var timers = require('timers');
var assert = require('assert');

var original = timers.active;
var calls = 0;
timers.active = function wrapActive(item) {
  calls++;
  original.call(null, item);
} 

var i = setInterval(function noop() {}, 10);
i.unref();

setTimeout(function () {
  // timers.active should be called exactly twice, once for the first setInterval,
  // and once for the setTimeout
  assert.equal(calls, 2);
}, 50);

@indutny
Copy link
Member Author

indutny commented Apr 3, 2015

@Fishrock123 added test. cc @brycebaril

@brycebaril
Copy link
Contributor

Test looks good, confirmed it is fixed by this patch.

@indutny
Copy link
Member Author

indutny commented Apr 3, 2015

Yay, thank you!

trevnorris and others added 3 commits April 4, 2015 01:30
The destructor isn't being called for timers that have been unref'd.

Fixes: nodejs/node-v0.x-archive#8364
This change fixes a regression introduced by commit
0d05123, which contained a typo that
would cause every unrefd interval to fire only once.

Fixes: nodejs/node-v0.x-archive#8900
Reviewed-By: Timothy J Fontaine <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-by: Trevor Norris <[email protected]>
Partially revert 776b73b.

Following code crashes after backported timer leak fixes:

```javascript
var timer = setInterval(function() {
  clearInterval(timer);
}, 10);
timer.unref();
```

Note that this is actually tested in a `test-timers-unref.js`, and is
crashing only with 776b73b.

Calling `clearInterval` leads to the crashes in case of `.unref()`ed
timers, and might lead to a extra timer spin in case of regular
intervals that was closed during the interval callback. All of these
happens because `.unref()`ed timer has it's own `_handle` and was used
after the `.close()`.
@indutny indutny force-pushed the fix/timers-once-and-for-all branch from c5c33aa to 452571b Compare April 3, 2015 22:35
@indutny
Copy link
Member Author

indutny commented Apr 3, 2015

function unrefdHandle() {
this.owner._onTimeout();
if (!this.owner._repeat)
this.owner.close();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will run this._handle.close() which runs uv_close(wrap->handle__, OnClose). Are we sure that the list of timers attached to this TimerWrap instance is empty before this runs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No list attached here for sure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great. then LGTM.

@trevnorris
Copy link
Contributor

One question, but other than that LGTM.

trevnorris added a commit that referenced this pull request Apr 3, 2015
The destructor isn't being called for timers that have been unref'd.

Fixes: nodejs/node-v0.x-archive#8364
PR-URL: #1330
Reviewed-By: Fedor Indutny <[email protected]>
Reviewed-By: Jeremiah Senkpiel <[email protected]>
indutny added a commit that referenced this pull request Apr 3, 2015
Partially revert 776b73b.

Following code crashes after backported timer leak fixes:

```javascript
var timer = setInterval(function() {
  clearInterval(timer);
}, 10);
timer.unref();
```

Note that this is actually tested in a `test-timers-unref.js`, and is
crashing only with 776b73b.

Calling `clearInterval` leads to the crashes in case of `.unref()`ed
timers, and might lead to a extra timer spin in case of regular
intervals that was closed during the interval callback. All of these
happens because `.unref()`ed timer has it's own `_handle` and was used
after the `.close()`.

PR-URL: #1330
Reviewed-by: Trevor Norris <[email protected]>
indutny added a commit that referenced this pull request Apr 3, 2015
indutny added a commit that referenced this pull request Apr 3, 2015
PR-URL: #1330
Reviewed-by: Trevor Norris <[email protected]>
@indutny
Copy link
Member Author

indutny commented Apr 3, 2015

Landed in 0e06197, cca5efb, d22b2a9, 6f72d87, and 416499c. Thank you, everyone!

@indutny indutny closed this Apr 3, 2015
@indutny indutny deleted the fix/timers-once-and-for-all branch April 3, 2015 23:35
@Fishrock123
Copy link
Contributor

\o/

Fishrock123 added a commit that referenced this pull request Apr 6, 2015
Notable changes:

* npm: upgrade npm to 2.7.5. See the npm CHANGELOG.md for details.
Includes two important security fixes.
https://github.com/npm/npm/blob/master/CHANGELOG.md#v275-2015-03-26
* openssl: preliminary work has been done for an upcoming upgrade to
OpenSSL 1.0.2a #1325 (Shigeki Ohtsu). See #589 for additional details.
* timers: a minor memory leak when timers are unreferenced was fixed,
alongside some related timers issues #1330 (Fedor Indutny). This
appears to have fixed the remaining leak reported in #1075.
* android: it is now possible to compile io.js for Android and related
devices #1307 (Giovanny Andres Gongora Granada).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants