Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

buffer: implement iterable interface #204

Closed
wants to merge 232 commits into from

Conversation

vkurchatkin
Copy link
Contributor

This makes possible to use for..of loop with buffers. Also related keys, values and entries methods are added for feature parity with Uint8Array.

This is basically the code from v8 array iterator implementation minus spec stuff.

Julien Gilli and others added 11 commits December 9, 2014 12:06
Dtrace probes were removed from libuv recently, but their usage by node
was not completely removed, causing build breaks on SmartOS.

Even though the build is working on other platforms, these probes are
not fired by libuv anymore, so there's no point in using them on these
platforms too.

Reviewed-by: Trevor Norris <[email protected]>
Marking these two tests as flaky, since they have been failing
intermittenly in recent builds:
test-debug-signal-cluster
test-cluster-basic
Clarify the fd option: it is preferred to the path parameter, omits
the "open" event if given, and is available on WriteStreams as well.

PR-URL: nodejs/node-v0.x-archive#7707
Fixes: nodejs/node-v0.x-archive#7707
Fixes: nodejs/node-v0.x-archive#7708
Fixes: nodejs/node-v0.x-archive#4367
Reviewed-By: Chris Dickinson <[email protected]>
Fix Interface.setBreakpoint() to correctly handle an attempt to set a
breakpoint in the current script when there is no current script.
This usually happens when the debugged process is not paused.

Fixes: nodejs/node-v0.x-archive#6453
PR-URL: nodejs/node-v0.x-archive#6460
Reviewed-By: Chris Dickinson <[email protected]>
Fix a Windows-only build error that was introduced in
commit 1183ba4 ("zlib: support concatenated gzip files").

Rename the NO_ERROR and FAILED enumerations, they conflict
with macros of the same name in <winerror.h>.

PR-URL: nodejs/node-v0.x-archive#8893
Reviewed-By: Fedor Indutny <[email protected]>
Reviewed-By: Rod Vagg <[email protected]>
Reviewed-by: Timothy J Fontaine <[email protected]>
In cases where many small writes are made to a stream
lacking _writev, the array data structure backing the
WriteReq buffer would greatly increase GC pressure.

Specifically, in the fs.WriteStream case, the
clearBuffer routine would only clear a single WriteReq
from the buffer before exiting, but would cause the
entire backing array to be GC'd. Switching to [].shift
lessened pressure, but still the bulk of the time was
spent in memcpy.

This replaces that structure with a linked list-backed
queue so that adding and removing from the queue is O(1).
In the _writev case, collecting the buffer requires an
O(N) loop over the buffer, but that was already being
performed to collect callbacks, so slowdown should be
neglible.

PR-URL: nodejs/node-v0.x-archive#8826
Reviewed-by: Timothy J Fontaine <[email protected]>
Reviewed-by: Trevor Norris <[email protected]>
Add documentation for the callback parameter of http.ClientRequest's and
http.ServerResponse's end methods.

Signed-off-by: Julien Gilli <[email protected]>
// 2
// 3

Additionaly, `buffer.values()`, `buffer.keys()` and `buffer.entries()`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Additionaly/Additionally/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!

if (kind === ITERATOR_KIND_ENTRIES)
return new BufferIteratorResult([index, buffer[index]], false);

return new BufferIteratorResult(index, false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious -- can iterators reuse results? If so that might be ideal, otherwise for large buffers we might see a huge number of throwaway objects generated when iterated with for... of.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't using flyweights here just be reinventing smis, badly? If you really wanted to do that, though, you could probably do it in the BufferIteratorResult constructor by having it return from a cache of previously-seen values. There's only 256 potential values when iterating via for..of, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope this would be somehow optimized eventually. I mean, the whole for..of thing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my unscientific benchmarking, using a cache makes it about 15% faster.

Benchmark:

var buf = Buffer(1 << 26), idx = 0;
buf.fill(0);  // Touch memory, force delayed allocation.
for (var b of buf) buf[idx++] = b;

Cache:

diff --git a/lib/buffer.js b/lib/buffer.js
index 9149b65..eb75761 100644
--- a/lib/buffer.js
+++ b/lib/buffer.js
@@ -969,6 +969,11 @@ function BufferIteratorResult(value, done) {
   this.done = done;
 }

+var cache = new Array(256);
+
+for (var i = 0; i < cache.length; i += 1)
+  cache[i] = new BufferIteratorResult(i, false);
+
 BufferIterator.prototype.next = function() {
   var buffer = this._buffer;
   var kind = this._kind;
@@ -980,7 +985,7 @@ BufferIterator.prototype.next = function() {
   this._index++;

   if (kind === ITERATOR_KIND_VALUES)
-    return new BufferIteratorResult(buffer[index], false);
+    return cache[buffer[index]];

   if (kind === ITERATOR_KIND_ENTRIES)
     return new BufferIteratorResult([index, buffer[index]], false);

Peak RSS and the number of page faults with that benchmark are consistently and repeatedly 2-3% lower so it makes sense that it's faster.

On a related topic, it would be good to add one or two benchmarks in benchmark/buffers/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@othiym23 ah, I was thinking that each iterator could instantiate a single BufferIteratorResult and swap its value on .next() -- to avoid the problem of having to instantiate a whole new object just to add a .done parameter. Could you expand on the "reinventing smis, badly" bit? I'm afraid I don't follow.

@vkurchatkin In that these objects will probably only exist in the new-generation pool and be quickly GC'd, it's optimized, though it would still increase GC pressure (I think.) If we can get away it, I imagine only creating two objects per iteration (vs. N + 1) would be less expensive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chrisdickinson I was actually thinking that it's possible to avoid object creation at all in for..of. Unfortunately, iterators can be consumed directly and it's possible to mutate result objects:

var b = new Buffer(255);

for (var i = 0; i < 255; i++)
  b[i] = i;

var it = b[Symbol.iterator](), r;

while ((r = it.next()) && !r.done)
  r.value = Math.floor(Math.random() * 255);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there anything in the spec that forbids making the the cached objects immutable with Object.freeze()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I think result object are supposed to be simple mutable data objects (https://people.mozilla.org/~jorendorff/es6-draft.html#sec-createiterresultobject). It is not specified anywhere that custom iterators should work the same way (https://people.mozilla.org/~jorendorff/es6-draft.html#sec-iteratorresult-interface), but I think they should.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Object.freeze is definitely allowed.

@domenic
Copy link
Contributor

domenic commented Dec 28, 2014

I think a simpler version of this patch (untested) would be

Buffer.prototype[Symbol.iterator] = Array.prototype[Symbol.iterator];
Buffer.prototype.values = Array.prototype.values;
Buffer.prototype.keys = Array.prototype.keys;
Buffer.prototype.entries = Array.prototype.entries;

@bnoordhuis
Copy link
Member

@domenic That makes the array methods polymorphic and probably slows them down.

@vkurchatkin
Copy link
Contributor Author

@domenic thought that too, typed arrays do that, but we obviously want some optimizations.

@bnoordhuis added a benchmark and result caching. Results on my machine:

Before:

buffers/buffer-iterate.js size=16 type=fast itertation=for: 8632
buffers/buffer-iterate.js size=16 type=fast itertation=forOf: 3806
buffers/buffer-iterate.js size=16 type=slow itertation=for: 9686
buffers/buffer-iterate.js size=16 type=slow itertation=forOf: 3690
buffers/buffer-iterate.js size=512 type=fast itertation=for: 4619
buffers/buffer-iterate.js size=512 type=fast itertation=forOf: 1558
buffers/buffer-iterate.js size=512 type=slow itertation=for: 4232
buffers/buffer-iterate.js size=512 type=slow itertation=forOf: 1646
buffers/buffer-iterate.js size=1024 type=fast itertation=for: 4059
buffers/buffer-iterate.js size=1024 type=fast itertation=forOf: 1414
buffers/buffer-iterate.js size=1024 type=slow itertation=for: 4585
buffers/buffer-iterate.js size=1024 type=slow itertation=forOf: 1575
buffers/buffer-iterate.js size=4096 type=fast itertation=for: 2727
buffers/buffer-iterate.js size=4096 type=fast itertation=forOf: 1015
buffers/buffer-iterate.js size=4096 type=slow itertation=for: 2973
buffers/buffer-iterate.js size=4096 type=slow itertation=forOf: 878
buffers/buffer-iterate.js size=16386 type=fast itertation=for: 1263
buffers/buffer-iterate.js size=16386 type=fast itertation=forOf: 527
buffers/buffer-iterate.js size=16386 type=slow itertation=for: 1205
buffers/buffer-iterate.js size=16386 type=slow itertation=forOf: 590

After:

buffers/buffer-iterate.js size=16 type=fast itertation=for: 9810
buffers/buffer-iterate.js size=16 type=fast itertation=forOf: 4436
buffers/buffer-iterate.js size=16 type=slow itertation=for: 10319
buffers/buffer-iterate.js size=16 type=slow itertation=forOf: 4712
buffers/buffer-iterate.js size=512 type=fast itertation=for: 4615
buffers/buffer-iterate.js size=512 type=fast itertation=forOf: 1776
buffers/buffer-iterate.js size=512 type=slow itertation=for: 4816
buffers/buffer-iterate.js size=512 type=slow itertation=forOf: 2355
buffers/buffer-iterate.js size=1024 type=fast itertation=for: 4570
buffers/buffer-iterate.js size=1024 type=fast itertation=forOf: 2068
buffers/buffer-iterate.js size=1024 type=slow itertation=for: 4474
buffers/buffer-iterate.js size=1024 type=slow itertation=forOf: 1885
buffers/buffer-iterate.js size=4096 type=fast itertation=for: 2810
buffers/buffer-iterate.js size=4096 type=fast itertation=forOf: 1491
buffers/buffer-iterate.js size=4096 type=slow itertation=for: 2388
buffers/buffer-iterate.js size=4096 type=slow itertation=forOf: 1634
buffers/buffer-iterate.js size=16386 type=fast itertation=for: 1301
buffers/buffer-iterate.js size=16386 type=fast itertation=forOf: 952
buffers/buffer-iterate.js size=16386 type=slow itertation=for: 1082
buffers/buffer-iterate.js size=16386 type=slow itertation=forOf: 983

@timoxley
Copy link
Contributor

@bnoordhuis none of keys/values or entries methods take any arguments. Is it still polymorphic if only the type of the this value is changing?

@vkurchatkin
Copy link
Contributor Author

@timoxley I think thisArg also counts

var bench = common.createBenchmark(main, {
size: [16, 512, 1024, 4096, 16386],
type: ['fast', 'slow'],
itertation: ['for', 'forOf']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: s/itertation/iteration/ (or maybe just 'method')

@timoxley
Copy link
Contributor

@vkurchatkin "buffer: iteratior optimization" might want to amend that commit message before the typo is made permanent.

Also running your tests I get:

for (b of buffer[Symbol.iterator]())
                                 ^
TypeError: undefined is not a function
    at Object.<anonymous> (/Users/timoxley/Projects/libs/io.js/test/parallel/test-buffer-iterator.js:43:34)
    at Module._compile (module.js:467:26)
    at Object.Module._extensions..js (module.js:485:10)
    at Module.load (module.js:362:32)
    at Function.Module._load (module.js:317:12)
    at Function.Module.runMain (module.js:508:10)
    at startup (node.js:132:16)
    at node.js:830:3

Have I forgotten to do something?

}

function benchForOf(buffer) {
bench.start();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style: indent error.

@vkurchatkin vkurchatkin force-pushed the iterable-buffer branch 2 times, most recently from 7ef9fe7 to 60cdc32 Compare December 29, 2014 14:06
arr = [];

for (b of buffer)
arr.push(b);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style: wrong indentation.

@vkurchatkin
Copy link
Contributor Author

@timoxley my bad! too many fuck ups for a simple commit

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buffer Issues and PRs related to the buffer subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.