
Reduce the memory consumption of each Worker Thread #34823

Closed
BorisKozo opened this issue Aug 18, 2020 · 8 comments
Labels
feature request Issues that request new features to be added to Node.js. performance Issues and PRs related to the performance of Node.js. worker Issues and PRs related to Worker support.

Comments

@BorisKozo

Hi,
-- Please describe the problem you are trying to solve.
We are building a Node.js-based on-premise application that needs to run in a memory-restricted environment. The application uses its own implementation of worker threads because it was developed before worker threads were added to Node.js. Our implementation is becoming harder and harder to maintain (we have to keep our code in sync with new Node.js versions), so we are looking to migrate to the built-in implementation. Our application starts numerous worker threads, so the memory used by each worker thread is critical. In our implementation we emphasized low memory consumption, but paid for it in that our worker threads cannot access the Node.js standard APIs. Furthermore, our implementation is somewhat tailored to our use case and lacks the flexibility of the standard implementation (e.g. we are not able to use "import", and we have our own implementation of "require").

I have migrated all of the relevant code to use the native worker threads, but now each worker thread uses almost 3 times more memory than in our implementation. I must add that the general structure of our implementation is not dissimilar to the native one: it uses a libuv event loop and V8 Isolate separation in a similar fashion.

I ran some tests in which an empty worker thread is started every 10 seconds (i.e. every 10 seconds a new worker thread starts and then idles indefinitely). Here are the results for the memory usage of that process:

[Figure: New worker thread every 10 seconds]
Test was run with Node.js 12.8.3 on a Windows 10 machine.

As you can see, a Node.js process without any worker threads running takes ~11 MB, and starting a worker thread uses almost as much memory as the full process.
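For anyone who wants to reproduce this kind of measurement, the test described above could look roughly like the sketch below. This is not the script that was used for the original test; the function name `sampleWorkerMemory`, the idle worker source, and the choice of `process.memoryUsage().rss` as the metric are all assumptions made for illustration.

```javascript
'use strict';
const { Worker } = require('worker_threads');

// Source for an "empty" worker (assumption): it only keeps its event loop alive.
const IDLE_WORKER_SOURCE = 'setInterval(() => {}, 1 << 30);';

const toMB = (bytes) => Math.round((bytes / 1024 / 1024) * 10) / 10;

// Starts `count` idle workers, one every `intervalMs`, and records the
// process RSS after each worker has come online and the interval has passed.
async function sampleWorkerMemory(count, intervalMs) {
  const workers = [];
  const samples = [{ workers: 0, rssMB: toMB(process.memoryUsage().rss) }];
  try {
    for (let i = 1; i <= count; i++) {
      const worker = new Worker(IDLE_WORKER_SOURCE, { eval: true });
      workers.push(worker);
      await new Promise((resolve, reject) => {
        worker.once('online', resolve);
        worker.once('error', reject);
      });
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
      samples.push({ workers: i, rssMB: toMB(process.memoryUsage().rss) });
    }
  } finally {
    // Stop all workers so the process can exit.
    await Promise.all(workers.map((w) => w.terminate()));
  }
  return samples;
}

// Short interval so the sketch finishes quickly; the test above used 10 seconds.
sampleWorkerMemory(3, 200).then((samples) => console.table(samples));
```

RSS is only one possible metric and it varies between platforms and Node.js versions, so the absolute numbers will likely differ from the ones reported here.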

-- Please describe the desired behavior.
We would like to reduce the memory used by each worker thread. Any reduction would be greatly appreciated. I did some investigation into what exactly is causing each worker thread to use as much memory as the entire process, but did not reach any conclusive results (I am not an expert in this area and, unfortunately, the person who implemented our worker threads variant has left the company). I suspect that each worker thread loads all of the Node.js native library separately into its own memory, which could explain why each worker thread has the same memory usage as the entire process (i.e. the main thread). Maybe there is some way to share the same native (C) code across all worker threads to save on memory consumption (again, I am not an expert in C programming, so I might be writing nonsense here).

-- Please describe alternative solutions or features you have considered.
Currently our only alternative is to stay with our implementation.

Thank you.

@jasnell
Member

jasnell commented Aug 18, 2020

The reason each Worker consumes roughly the same amount of memory as the initial process is that each Worker maintains its own complete Node.js instance in memory (where a Node.js instance is the combination of the Environment (all the internal Node.js state), the Isolate, and the core APIs). Thus far, optimization efforts for Workers have focused on improving startup times, with the majority of activity going into enabling V8 heap snapshots to reduce the overall startup time of both the main process thread and all Workers. I would also very much like to see us introduce a new "lightweight" Worker that does not load the Node.js core APIs by default. It is something that I've talked about with @addaleax off and on and, unfortunately, it would not be trivial -- to the point that we'd really have to make sure there's value before taking the time to invest in the significant amount of work entailed. That said, definitely not out of the question.
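One way to observe that a Worker has its own Isolate and heap is to have it report its own V8 heap statistics and compare them with the main thread's. The following is a minimal sketch under that assumption; the helper name `getWorkerHeapStatistics` is hypothetical, and heap statistics cover only the V8 heap, not the rest of the per-Worker state described above.

```javascript
'use strict';
const v8 = require('v8');
const { Worker } = require('worker_threads');

// Runs inside the worker: report this thread's own V8 heap statistics.
const WORKER_SOURCE = `
  const v8 = require('v8');
  const { parentPort } = require('worker_threads');
  parentPort.postMessage(v8.getHeapStatistics());
`;

// Hypothetical helper: starts a worker and resolves with its heap statistics.
function getWorkerHeapStatistics() {
  return new Promise((resolve, reject) => {
    const worker = new Worker(WORKER_SOURCE, { eval: true });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

getWorkerHeapStatistics().then((workerHeap) => {
  const mainHeap = v8.getHeapStatistics();
  // Each thread reports figures for its own isolate.
  console.log('main   used_heap_size:', mainHeap.used_heap_size);
  console.log('worker used_heap_size:', workerHeap.used_heap_size);
});
```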

@jasnell jasnell added feature request Issues that request new features to be added to Node.js. performance Issues and PRs related to the performance of Node.js. worker Issues and PRs related to Worker support. labels Aug 18, 2020
@BorisKozo
Author

@jasnell thank you for looking into this.
I can see how a lightweight Worker may benefit many users, although probably not in our case.
The direction I had in mind, which may be simpler (or just as hard, I don't know), is to load all the stateless functionality of the Node.js instance only once and share it across all the Workers.

Maybe @YafimK (who implemented our workers) can chime in and share his thoughts.

@benjamingr
Member

Hey @BorisKozo !

Should this issue remain open?

@BorisKozo
Author

@benjamingr That depends, can someone look into it sometime in the future? No point keeping it open if no one is ever going to check it. I don't have enough knowledge to research the internals myself.

@manast

manast commented May 5, 2023

Wondering, what would be the advantage of using worker threads instead of forking child processes if the memory consumption is more or less the same?

@bnoordhuis
Member

Things like IPC are faster in-process.

Also, cheaper context switches on most (probably all) architectures.
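To get a rough feel for the messaging difference on a given machine, something like the sketch below could be used. It is an illustration, not a rigorous benchmark: the names `timeRoundTrips` and `compareIpc` are hypothetical, the echo scripts are assumptions, and the results will depend on payload size, platform, and Node.js version.

```javascript
'use strict';
const { Worker } = require('worker_threads');
const { spawn } = require('child_process');

// Echo scripts (assumptions): each peer sends every message straight back.
const WORKER_ECHO = `
  const { parentPort } = require('worker_threads');
  parentPort.on('message', (m) => parentPort.postMessage(m));
`;
const CHILD_ECHO = `process.on('message', (m) => process.send(m));`;

// Sends `rounds` messages one at a time; resolves with the elapsed milliseconds.
function timeRoundTrips(emitter, send, rounds) {
  return new Promise((resolve) => {
    let remaining = rounds;
    const start = process.hrtime.bigint();
    const onReply = () => {
      if (--remaining > 0) return send();
      emitter.off('message', onReply);
      resolve(Number(process.hrtime.bigint() - start) / 1e6);
    };
    emitter.on('message', onReply);
    send();
  });
}

async function compareIpc(rounds) {
  // In-process: a worker thread with postMessage.
  const worker = new Worker(WORKER_ECHO, { eval: true });
  const sendToWorker = () => worker.postMessage('ping');
  await timeRoundTrips(worker, sendToWorker, 1); // warm-up, excludes startup
  const workerMs = await timeRoundTrips(worker, sendToWorker, rounds);
  await worker.terminate();

  // Cross-process: a child Node.js process with an IPC channel.
  const child = spawn(process.execPath, ['-e', CHILD_ECHO], {
    stdio: ['ignore', 'inherit', 'inherit', 'ipc'],
  });
  const sendToChild = () => child.send('ping');
  await timeRoundTrips(child, sendToChild, 1); // warm-up, excludes startup
  const childMs = await timeRoundTrips(child, sendToChild, rounds);
  child.kill();

  return { workerMs, childMs };
}

compareIpc(1000).then((result) => console.log(result));
```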

@benjamingr
Member

> Also, cheaper context switches on most (probably all) architectures.

(Nit: technically, context-switching a process usually costs the same order of magnitude as context-switching a thread (on Linux). The thing that's different, IIRC, is that the translation lookaside buffer is invalidated, since different processes have different virtual address spaces.)

@bnoordhuis
Member

Yes, that's precisely my point. TLB flushes are expensive. It's why operating systems keep themselves mapped in the address space of user processes, to avoid invalidating page tables on system calls.

6 participants