-
Notifications
You must be signed in to change notification settings - Fork 30.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
net: allow reading data into a static buffer #25436
Conversation
659c22d
to
f84b416
Compare
I like the feature idea. Regarding the implementation:
|
I'm open to suggestions (especially specific code changes), this is very much just a proof of concept to show the potential, significant performance gains. |
@mscdex Since you’re asking for specific code changes: addaleax@d8b783f is about what I’d have in mind (applied on top of this PR). It also brings the combined diff nicely down from +225/−66 to +139/−45. |
@addaleax Merged, thanks. |
thoughts @nodejs/collaborators ? |
@mscdex The code LGTM here, but I’d also like other people to weigh in more. I assume you don’t want to do unnecessary work in case this doesn’t get merged, but I also guess for a lot of people docs and tests help; so, tl;dr on the API here: This PR adds an option to Usage looks something like this: const socket = net.connect({
host: 'example.com',
port: 80,
onread: {
buffer: Buffer.allocUnsafe(65536),
callback(bytesRead, buffer) {
// Do something with the first `bytesRead` bytes of `buffer`.
// The `buffer` will always be the same object, so all data
// needs to be handled synchronously here.
}
} |
There has been some talk of creating a node API layer that is closer to uv, so more efficient code can be written on top of it. Is that where this is going? cf http://docs.libuv.org/en/v1.x/stream.html#c.uv_read_start The onread use of a buffer seems to make unnecessary the uv alloc/read callback pair, that seems reasonable to me, it reduces roundtrip time in that only one C++ to js callback is needed. I guess the prices is the need for sync data handling. Maybe that's OK, as long as the data is immediately passed to a protocol parser, or unzipper, or something of the sort, though it wouldn't work for a simple pipe of data to re: lack of backpressure support, at the uv layer the caller has to explicitly start/stop data flow to exert backpressure. Isn't there a need for a readStart/Stop API as uv has? Or does the handle have that already? There is the Readable.pause() method, but I think the idea here is to bypass the streams API. |
18f7aca
to
86e015e
Compare
The |
86e015e
to
8cde1aa
Compare
I like the idea, we had quite a long discussion about it in denoland/deno#387 |
8cde1aa
to
475614d
Compare
ping @nodejs/collaborators ? |
@mscdex Unanswered questions in #25436 (comment) |
@sam-github I answered how starting and stopping is handled in this PR in #25436 (comment). Were you looking for some other information? |
My apologies @mscdex, that comment exactly answers my question, I don't know how I missed it. |
Ping @nodejs/streams to review ... in particular, @mcollina, I'd love your thoughts on this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you planning to add this to other areas of core? Specifically, are you planning to add it to TLS?
What will happen if pause
is called within callback
?
I think this API could be a bit limiting to what it could be achieved. What I would like to see is more a situation where the same buffer moves between the producer and the consumer, like this:
let oldBuffer = null
const socket = net.connect({
host: 'example.com',
port: 80,
onread: {
buffer() {
return oldBuffer || Buffer.allocUnsafe(65536)
},
callback(bytesRead, buffer) {
destination.write(buffer, (err) => {
oldBuffer = buffer
})
}
}
Probably.
With the PR as it currently is, node will not read from the socket.
I don't understand what exactly is expected with the suggested implementation. Is Also, due to the asynchronous nature of |
The key idea is to provide a mechanism to reuse multiple buffers, not just one. Whenever there is a need for a read, C++ can call |
That would probably cause a noticeable performance hit, unless you were willing to compromise a bit and instead have the function that the C++ side calls when the socket is read from return the next Buffer to use (by calling |
+1, that's what I meant, sorry for making it complicated. |
Refs: #25436 PR-URL: #35753 Refs: #25436 Reviewed-By: Matteo Collina <[email protected]> Reviewed-By: Minwoo Jung <[email protected]>
Refs: #25436 PR-URL: #35753 Refs: #25436 Reviewed-By: Matteo Collina <[email protected]> Reviewed-By: Minwoo Jung <[email protected]>
Refs: #25436 PR-URL: #35753 Refs: #25436 Reviewed-By: Matteo Collina <[email protected]> Reviewed-By: Minwoo Jung <[email protected]>
Refs: #25436 PR-URL: #35753 Refs: #25436 Reviewed-By: Matteo Collina <[email protected]> Reviewed-By: Minwoo Jung <[email protected]>
Refs: #25436 PR-URL: #35753 Refs: #25436 Reviewed-By: Matteo Collina <[email protected]> Reviewed-By: Minwoo Jung <[email protected]>
Refs: nodejs#25436 PR-URL: nodejs#35753 Refs: nodejs#25436 Reviewed-By: Matteo Collina <[email protected]> Reviewed-By: Minwoo Jung <[email protected]>
moved YAML changes element regarding to `onread` option from `socket.connect(options[, connectListener])` to `new net.Socket([options])` PR-URL: #55112 Refs: #25436 Reviewed-By: Luigi Pinca <[email protected]> Reviewed-By: Antoine du Hamel <[email protected]>
moved YAML changes element regarding to `onread` option from `socket.connect(options[, connectListener])` to `new net.Socket([options])` PR-URL: #55112 Refs: #25436 Reviewed-By: Luigi Pinca <[email protected]> Reviewed-By: Antoine du Hamel <[email protected]>
moved YAML changes element regarding to `onread` option from `socket.connect(options[, connectListener])` to `new net.Socket([options])` PR-URL: #55112 Refs: #25436 Reviewed-By: Luigi Pinca <[email protected]> Reviewed-By: Antoine du Hamel <[email protected]>
moved YAML changes element regarding to `onread` option from `socket.connect(options[, connectListener])` to `new net.Socket([options])` PR-URL: nodejs#55112 Refs: nodejs#25436 Reviewed-By: Luigi Pinca <[email protected]> Reviewed-By: Antoine du Hamel <[email protected]>
moved YAML changes element regarding to `onread` option from `socket.connect(options[, connectListener])` to `new net.Socket([options])` PR-URL: #55112 Refs: #25436 Reviewed-By: Luigi Pinca <[email protected]> Reviewed-By: Antoine du Hamel <[email protected]>
moved YAML changes element regarding to `onread` option from `socket.connect(options[, connectListener])` to `new net.Socket([options])` PR-URL: #55112 Refs: #25436 Reviewed-By: Luigi Pinca <[email protected]> Reviewed-By: Antoine du Hamel <[email protected]>
moved YAML changes element regarding to `onread` option from `socket.connect(options[, connectListener])` to `new net.Socket([options])` PR-URL: nodejs#55112 Refs: nodejs#25436 Reviewed-By: Luigi Pinca <[email protected]> Reviewed-By: Antoine du Hamel <[email protected]>
This is more or less the changes I described in nodejs/node-eps#27 with some updates since the C++ side had changed a fair amount. I thought I would put this out there to see if there was still interest in merging a feature like this (even if it's not this PR as-is).
This implementation is opt-in, only for
net
(although it could obviously be adapted for other places where dynamic buffers are used), and it goes the simple and easy route and completely bypasses the data streaming feature of sockets (although'end'
and other such non-data-related events are still emitted as usual).Having this kind of power is obviously not for all use cases, but for network protocol implementations it can make things a lot faster because the connection data is typically parsed synchronously and any raw data that needs to be passed on to end users could simply be copied.
Here are some example benchmark results (
recvbuflen=0
indicates old/current behavior of allocating a new buffer for each chunk read):I did not add documentation or tests yet because API is not set in stone.
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passes