Static buffer for onData #605
Comments
This is not applicable to us; we already know how to do things correctly. And it still seems they don't really have any real idea what they are doing.
I agree that this makes almost no sense for us.
@uasan But in your example you are suggesting allocating a 10 MB buffer up front for every request to your endpoint. A request could be cancelled before it completes, whereas onData only uses the RAM required at the time. For large bodies, maybe you should write to disk instead of buffering in RAM.
@e3dio No, I suggest defining, for each connection (accepted socket), either a buffer function or a static object (the latter is less flexible). If we know the value HTTP header
Resizable and growable ArrayBuffers will appear in V8 soon, and then it will become even more convenient to use a static ArrayBuffer for each socket.
The buffer function gives flexibility: we can slice a subarray from a static buffer, and its size will depend on the request. For a file upload it is enough to slice one chunk at a time and save it to a file. My main point is that we can reuse memory. For example, 1k RPS of JSON (each request body 1 KB) creates 1 MB of garbage per second. This is not critical, but it makes no sense )
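A minimal sketch of the memory reuse described above, assuming one scratch buffer per connection. The names `scratch`, `fillScratch` and `toArrayBuffer` and the 64 KiB size are hypothetical and are not part of uWebSockets.js or Node.js.

```javascript
// Hypothetical per-connection scratch buffer (64 KiB is an arbitrary size).
const scratch = new Uint8Array(64 * 1024);

// Copies the chunks of one request into `scratch` and returns a view of the
// filled part. The view shares memory with `scratch`, so it is only valid
// until the next request overwrites it.
function fillScratch(chunks) {
  let offset = 0;
  for (const chunk of chunks) {
    const bytes = new Uint8Array(chunk); // wraps the ArrayBuffer, no copy
    if (offset + bytes.length > scratch.length) {
      throw new RangeError('body does not fit in the scratch buffer');
    }
    scratch.set(bytes, offset); // the single copy made per chunk
    offset += bytes.length;
  }
  return scratch.subarray(0, offset); // a view, no new storage is allocated
}

// Demo helper: turn a string into a standalone ArrayBuffer.
function toArrayBuffer(text) {
  const bytes = new TextEncoder().encode(text);
  return bytes.buffer.slice(bytes.byteOffset, bytes.byteOffset + bytes.byteLength);
}

const firstBody = fillScratch([toArrayBuffer('{"a":'), toArrayBuffer('1}')]);
console.log(new TextDecoder().decode(firstBody)); // {"a":1}
```

The trade-off is that the returned view must be consumed (parsed, written to a file) before the next request on the same connection reuses the buffer.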
A 1 KB JSON string only needs one chunk and is zero copy.
Yes, that is the ideal case, when the whole body arrives in one call, but a static buffer would work just as well there.
HTTP is a stream; there is no guarantee you'll know the total size of the body up front. That's why the interface gives you chunks and tells you whether or not a chunk is the last one. What you do with these chunks is up to you; nobody forces you to use Buffer.concat. You can use whatever you think is best.
If you have a content-length header you can pre-allocate a buffer and fill it. That's already possible today.
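One way to read the pre-allocation suggestion is sketched below. `makeBodyFiller` is a hypothetical helper, not a uWebSockets.js API; it assumes the handler receives `(chunk, isLast)` with `chunk` as an ArrayBuffer, and that `contentLength` was parsed from the content-length header. A missing header parses to 0 with `Number('')`, so real code would need a fallback for bodies of unknown length.

```javascript
// Hypothetical helper: builds an onData-style handler that fills one
// pre-allocated buffer instead of concatenating chunks.
function makeBodyFiller(contentLength, onDone) {
  if (!Number.isInteger(contentLength) || contentLength < 0) {
    throw new RangeError('content-length must be a non-negative integer');
  }
  const body = Buffer.allocUnsafe(contentLength); // allocated once per request
  let offset = 0;
  return (chunk, isLast) => {
    const bytes = new Uint8Array(chunk); // wraps the ArrayBuffer, no copy
    if (offset + bytes.length > body.length) {
      throw new RangeError('body is longer than content-length');
    }
    body.set(bytes, offset); // each byte is copied exactly once
    offset += bytes.length;
    if (isLast) onDone(body.subarray(0, offset));
  };
}

// Possible wiring in a uWebSockets.js route (not executed here):
// app.post('/json', (res, req) => {
//   const size = Number(req.getHeader('content-length'));
//   res.onAborted(() => { /* stop using res */ });
//   res.onData(makeBodyFiller(size, (body) => res.end('got ' + body.length + ' bytes')));
// });
```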
Yes, you are right. I closed the ticket, but I am writing my thoughts here; maybe they will be useful to someone.
This is not true. You are given an ArrayBuffer that is zero copy, a reference. There is no garbage collection in this library and your links to discussions in the Node.js camp are not applicable. Inspiration and advice on performance is worth less than the dog shit under your shoe, if it comes from the Node.js people. These people have no vision, no idea about anything. These aspects, memory management, are architectural and will not change. How it works now is how it will remain.
The buffer copies are made at the Node.js level of the application; I'm sure everything is fine on your side )
No, Buffer.concat is only used when needed or wanted. Single-chunk posts like your 1 KB JSON example are zero copy.
Buffer.allocUnsafe only half solves this: it allocates a buffer slice from the shared pool, but there is no interface to free that slice. The slices just accumulate and wait for the GC to delete them, which is not efficient.
Well..... yes, that's what it means to be a GC'd language... GC languages use.... garbage collection. And yes, that sucks if you know what you're doing. But most programmers don't, and that's why GC works for most programmers. If you have any kind of vision or idea of what you're actually doing, don't program in a GC language.
I admire solutions that take the best of both worlds (manual memory management and GC). The buffer function tries to do this, but the use case is narrow: endpoints that receive data in more than one onData call and should not dump the data to a file.
You're not really listening to the explanation. uWS does not allocate a new Buffer for every chunk. So whatever they are doing is not applicable here. We've been zero copy through the entire stack, all the way up to JavaScript, since day 1. If you want to fill one single ArrayBuffer with all the data of all chunks, you can pre-allocate it, then pass it along and append to it until done. That's definitely going to outperform anything Node.js is doing.
I understand that very well. I have no complaints about the C++ runtime.
So you pretty much want onCompleteData, where the user is given the full body in memory? That's something you can build on top, but it kind of encourages poor solutions, as you'll fill your RAM with bodies.
Yes, not everyone needs this.
If you ran a survey asking who has lines like onData(chunk => {... copies = Buffer.concat([copies, chunk]) ...}) in their code, I think most would say they do.
.... Or.... People can learn how to write better code and do it themselves. Like I've said three times now; none of this requires any changes to uWS. You can achieve this by writing a simple NPM module, doing this. Call it, hmmm, uws-getfullbody or something and make it a Promise or something.
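A rough sketch of what such a module might look like, for bodies of unknown length. `getFullBody` is a hypothetical name; the only assumption about `res` is that it exposes `onData(handler)` and `onAborted(handler)` the way a uWebSockets.js HttpResponse does. Each chunk is copied inside the callback because the ArrayBuffer is only valid for the duration of that callback.

```javascript
// Hypothetical helper in the spirit of the "uws-getfullbody" idea.
function getFullBody(res) {
  return new Promise((resolve, reject) => {
    const parts = [];
    let total = 0;
    res.onAborted(() => reject(new Error('request aborted')));
    res.onData((chunk, isLast) => {
      // Buffer.from(Uint8Array) copies the bytes, so the data survives
      // after this callback returns.
      const copy = Buffer.from(new Uint8Array(chunk));
      parts.push(copy);
      total += copy.length;
      if (!isLast) return;
      // Single-chunk bodies need no concat at all; otherwise concat once
      // at the end so the cost stays linear in the body size.
      resolve(parts.length === 1 ? parts[0] : Buffer.concat(parts, total));
    });
  });
}

// Possible use in a route (not executed here):
// app.post('/json', (res, req) => {
//   getFullBody(res).then((body) => res.end('got ' + body.length + ' bytes'), () => {});
// });
```

As noted in the thread, this keeps whole bodies in RAM, so a size limit would be a sensible addition for anything exposed to untrusted clients.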
You will have exactly the same amount of copies doing this inside uWS or outside as JavaScript. Doing it inside uWS gives, like said 4 times now, zero gain.
No. That's definitely not how it works. The buffer would still be under regular GC. Just like anything written in JavaScript atop.
Yes. But if you really have an issue writing 10 lines of JavaScript, doing a simple buffer append on your own (which in itself is horrible practice), then you're not really displaying enough interest in uWS. The customers I work for / at use uWS to create solutions made to last and to deliver. They don't really care about 10 lines of extra code.
If anything we can improve the readJSON example: uWebSockets.js/examples/JsonPost.js Line 35 in 81ab2a1
Currently it uses Node.js's insane Buffer.concat (which is, like you say, incredibly wasteful). Changes to this function could make better use of content-length.
But why? If we always append to the end of one buffer, we do not have to copy the previous chunks; the result is one whole buffer with zero extra copies of the chunks.
That's a pity; I thought you were destroying it with some internal V8 method.
No, the third point is a bonus to sell the idea; for the JS community this is a benefit )
Yes, you can see it in the source code of Buffer.concat.
It has already been stated you can do |
Yes, where size = req.getHeader('content-length'); This makes a linear copy while Buffer.concat makes copies in a pyramid (which is just pure dog shit).
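The "linear versus pyramid" remark can be made concrete by counting bytes copied. The two functions below are illustrative names, not library calls. They model the pattern copies = Buffer.concat([copies, chunk]), which recopies everything held so far on every chunk, against filling one pre-allocated buffer, which copies each byte once.

```javascript
// Bytes copied when every new chunk triggers a concat of (all so far + chunk).
function bytesCopiedByRepeatedConcat(chunkSizes) {
  let held = 0;   // bytes accumulated so far
  let copied = 0; // total bytes moved by all concat calls
  for (const size of chunkSizes) {
    held += size;
    copied += held; // each concat copies the old data plus the new chunk
  }
  return copied;
}

// Bytes copied when each chunk is written once into a pre-allocated buffer.
function bytesCopiedIntoPreallocated(chunkSizes) {
  return chunkSizes.reduce((sum, size) => sum + size, 0);
}

const sizes = new Array(10).fill(64 * 1024); // ten 64 KiB chunks
console.log(bytesCopiedByRepeatedConcat(sizes)); // 3604480
console.log(bytesCopiedIntoPreallocated(sizes)); // 655360
```

For n equal chunks of size s the first count is s·n(n+1)/2 and the second is s·n, so the gap grows with the number of chunks, and a single-chunk body shows no difference.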
My code has a similar solution; I have already solved this problem.
Yes, I said that I have no copies of bytes because I have a solution similar to the one you wrote. The subject of our discussion is whether this should be done under the hood in uWS, in a new event handler, so that users have no chance of making extra copies.
A couple of years ago, the Node.js developers added a reasonable way to read from a socket into a static buffer. This reduces memory allocations for temporary buffer copies and reduces GC work.
nodejs/node#25436
Does this make sense for reading an HTTP body?
If yes, I suggest adding an onRead handler, making the interface the same as in Node.js
https://nodejs.org/api/net.html#net_socket_connect_options_connectlistener
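For reference, a small sketch of the Node.js side of this proposal, using the `onread` option of `net.connect`. The wrapper name `connectWithStaticBuffer`, the 16 KiB buffer size and the loopback host are assumptions made for illustration. With `onread` set, incoming data is written into the one supplied buffer and the stream does not emit 'data' events.

```javascript
const net = require('net');

// Hypothetical wrapper: connects to a local port and delivers incoming data
// through one reusable buffer instead of a new Buffer per read.
function connectWithStaticBuffer(port, onChunk) {
  const buffer = Buffer.alloc(16 * 1024); // reused for every read
  return net.connect({
    port,
    host: '127.0.0.1',
    onread: {
      buffer,
      callback(nread, buf) {
        // `buf` is the same buffer each time; only the first `nread` bytes
        // are valid, and only until the next read overwrites them.
        onChunk(buf.subarray(0, nread));
        return true; // returning false would pause the socket
      },
    },
  });
}
```

The consumer has to finish with each view, or copy it, before the callback returns, which is the same contract the thread describes for onData chunks.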