-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stream performance improvement ideas #117
Comments
Not sure whether this is worth the effort. Is memory usage a problem? Number of streams in typical application * state size should be quite low, relatively speaking? I tried this once but the performance impact was negligible. I was thinking that being able to keep more of the state in a single cache line should make a impact but I was wrong. Might be different now...
I think adding
Not sure what this is? We already optimized for single pipe? What am I missing
Difficult.
Skeptical if this is worth it. Other stuff:
|
Do we have any actual profile data to back our efforts? |
TBH. I think the biggest impact we can have is to try and increase the highwatermark across the board, e.g. increasing from 16k to 128k has a theoretical perf improvement of 8x in the context of streams overhead. |
i'll open a PR and we can check but since streams are so omnipresent I would suspect so especially when people construct smaller streams (e.g. to use the iterator helpers). I feel like doing the work and seeing would be easier and I feel like I got sucked into a lot more talking than coding roles recently so that would be fun.
That would work
We always store a
No, I went through the systems analyzer with |
I don't see how this affects any hot path. |
Basically all streams become slightly smaller and lookup for the pipe becomes slightly faster which would have very small impact in "many big streams" but measurable in "many small streams" |
Said PR nodejs/node#49745 |
The main performance bottleneck in streams comes from the cost of buffering at each step of a stream pipeline. From both an algorithmic and memory perspective, this is heavy. Worsely, we do this double buffering in http, because both the socket and IncomingRequests are Readable. We should internalize a library such as syncthrough or minipass (or one built with the same principles) that does not perform any buffering. Look at the benchmarks. |
@mcollina we can explore/add a BYOB fast mode where different steps of the transform mutate the data in place instead of copying it which should ideally mean 0 copies |
That is impossible with the current API. We did a full design for this a long time ago, but there was no budget to make it happen. Using synchthrough/minipass where you'd use a Transform is a good intermediate step (you'd need this anyway for a BYOB solution). |
@mcollina why? We keep the same design and just emit the buffers that were passed to us, we keep a pointer to what we're currently using etc? |
@ronag trying FixedQueue suggestion in nodejs/node#49754 |
For a BYOB strategy to work, you need to "lock" the two streams sides so that others could not interject and inspect that. Our current stream implementation is designed as a one-to-many abstraction. |
Correct, we could for example have a base class that is not an event emitter and only |
The design for that was already done in https://github.com/Fishrock123/bob. I think this was 2017-2018 work. |
In Readable and ReadableState we do a stream instanceof DuplexStream. Same with Writable. Is there a way to avoid the instanceof? maybe using an symbol attribute kIsDuplexStream set to true DuplexStream in constructor and checked by Writable and Readable |
I checked it locally, and it would break for Writable. |
@Uzlopak why? |
Idk. But if you check... Writable has also not the following check if (!(this instanceof Writable))
return new Writable(options); |
Some object mode Readable implementations could benefit from a new |
Another thing we could look into is to merge the https://github.com/nodejs/undici/blob/main/lib/api/readable.js |
Sitting together with Ronag he pointed out some more important/useful ideas for the bitmap: errored -> move to flags (and mutate shape on error - make non-error path faster) We should also optimize for non-legacy streams and keep compat in a slow path. |
I know it sounds ignorant... Why do we call destroy in next tick? All the other operations are done in the current loop. What makes destroy so special? |
To support this flow: const server = net.createServer((socket) => {
socket.on('error', ...)
}) If we did not emit on the next tick, we could have streams that are errored before the user has a chance to attach an handler. Unfortunately this pattern is pervasive in our codebase and the ecosystem. A possible optimization in this regard is to not nextTick if there is a listener. |
Then we should consider this. Actually nextTick is the sole reason, that destroy is slower than the other ops. |
nextTick isn't that expensive (though its queue should use a deque if it doesn't assuming we can't move to V8 queueMicrotask for timing reasons) but the nextTick is also there to provide consistent ordering so code isn't sometimes synchronous it always fires listeners asynchronously |
Current state of recent improvements vs 20.7.0:
|
That's great |
Good work! |
We can:
|
cc @anonrig from our chat
More wild ideas:
cc @ronag @nodejs/streams
The text was updated successfully, but these errors were encountered: