Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

worker: improve integration with native addons (take II) #26175

Closed
wants to merge 11 commits into from

Conversation

addaleax
Copy link
Member

@addaleax addaleax commented Feb 17, 2019

Allow loading add-ons from multiple Node.js instances if they are
declared context-aware; in particular, this applies to N-API addons.

Also, plug a memory leak that occurred when registering N-API addons.

Refs: #23319
Fixes: #21481
Fixes: #21783
Fixes: #25662
Fixes: #20239

/cc @nodejs/workers @nodejs/n-api

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

@addaleax addaleax added semver-minor PRs that contain new features and should be released in the next minor version. addons Issues and PRs related to native addons. worker Issues and PRs related to Worker support. labels Feb 17, 2019
@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. node-api Issues and PRs related to the Node-API. labels Feb 17, 2019
@addaleax
Copy link
Member Author

@addaleax
Copy link
Member Author

Should have fixed the linter and Windows issues. New CI: https://ci.nodejs.org/job/node-test-pull-request/20895/

This might require some debugging on AIX.

@addaleax
Copy link
Member Author

I'll try to look into AIX, but I might have to request ssh access for that.

@gireeshpunathil Do you know anything off the top of your head that would lead to AIX-specific here? Maybe dlopen() returns different pointers when loading the same shared object twice, unlike other systems?

@gireeshpunathil
Copy link
Member

@addaleax: yes, AIX returns a unique value each time dlopen is called on the same object.

excerpt from the manual:

       The dlopen subroutine loads the module specified by FilePath into the executing process's
       address space. Dependents of the module are automatically loaded as well. If the module is
       already loaded, it is not loaded again, but a new, unique value will be returned by the dlopen
       subroutine.

cat dl.cc

#include <dlfcn.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
  for(int i = 0; i < 10; i++) {
    void* addr = dlopen(argv[1], RTLD_NOW | RTLD_GLOBAL);
    if (addr == NULL) {
      fprintf(stderr, "error loading the lib\n");
      return -1;
    }
    fprintf(stderr, "address %d: %p\n", i, addr);
 }
}

Linux:

./a.out foo.so

address 0: 0x1ede040
address 1: 0x1ede040
address 2: 0x1ede040
address 3: 0x1ede040
address 4: 0x1ede040
address 5: 0x1ede040
address 6: 0x1ede040
address 7: 0x1ede040
address 8: 0x1ede040
address 9: 0x1ede040

AIX:
./a.out foo.so

address 0: 3
address 1: 4
address 2: 5
address 3: 6
address 4: 7
address 5: 8
address 6: 9
address 7: a
address 8: b
address 9: c

Let me know if you want any other specific info, or assistance.

@addaleax
Copy link
Member Author

@gireeshpunathil Thanks, that is indeed very helpful. :)

I think what I’ll attempt to do for AIX is to implement wrappers for dlopen() that provides the same return value when re-loading an add-on, based on running stat() and checking the dev + ino values. It’s not perfect, but I think it should make this work.

If you have other ideas, I’d be happy to hear them.

@gireeshpunathil
Copy link
Member

@addaleax - I think that is a good approach. Other way is to use loadquery API to get the (long) list of loaded modules, search through that... I guess your approach is more simple and elegant.

@addaleax
Copy link
Member Author

@gireeshpunathil I’ve implemented that … it still does look a bit hacky (although probably less than a loadquery approach). I’m wondering if it makes sense to put the extra code into a separate file?

Either way, if you have the bandwidth I would definitely appreciate a review from you on this PR :)

CI: https://ci.nodejs.org/job/node-test-pull-request/20902/

@addaleax
Copy link
Member Author

CI on musl should be fixed now, too. 🤞

CI: https://ci.nodejs.org/job/node-test-pull-request/20908/

@gireeshpunathil
Copy link
Member

@addaleax - sure, will do. Need some time to go through the linked issues and previous PRs.

@addaleax
Copy link
Member Author

addaleax commented Feb 20, 2019

@gireeshpunathil
Copy link
Member

@addaleax - I am unable to connect some links, so thought I will ask:

  • Error: Module did not self-register: is the base problem right? how (where) are we addressing it (line numbers in the patch will help)

We are now managing the referencing count of loaded libraries.

Are these maintained across:

  • main thread (worker-less) loading the same library multiple times
    • repeated invokion of process.dlopen ends up only reference count increments
    • repeated unloads are ignored but reference adjusted. When reach 0, really unload
    • further requests are not honored
  • same scenarios, but in worker's case
  • worker loading libraries that were loaded by main
    • maintain the count within worker's scope, reset to the baseline upon exit
  • same scenarios, but between 2 or more workers without main's involvement
    ?

@addaleax
Copy link
Member Author

@gireeshpunathil Thanks for taking a look!

@addaleax - I am unable to connect some links, so thought I will ask:

* `Error: Module did not self-register`: is the base problem right? how (where) are we addressing it (line numbers in the patch will help)

Yes, that’s the error message that people currently get when they try to load a Node.js addon that uses the __attribute__((constructor)) approach for loading twice inside a process.

We are now managing the referencing count of loaded libraries.

For AIX, yes. Other operating systems/standard libraries already implement dlopen() using reference counting, so we don’t have to take care of that. (musl is the other outlier here, with not actually providing a functional dlclose() at all.) So, the whole #ifdef _AIX block is only there to align behaviour with other platforms, it’s not introducing new functionality.

Are these maintained across:

  • main thread (worker-less) loading the same library multiple times

    • repeated invokion of process.dlopen ends up only reference count increments

That’s correct. (As mentioned above, that’s already the case on almost all platforms). The core of this patch is the fact that, based on the handle returned from dlopen(), we store the node_module* pointer, and put a reference count on that as well.

  • repeated unloads are ignored but reference adjusted. When reach 0, really unload

That’s true, although we can’t realistically unload addons that were loaded on the main thread for backwards compatibility (that’s what #25577 fixed).

  • further requests are not honored

We should not be attempting to unload addons more times than we’ve loaded them, so I’m not sure if I’m understanding this correctly?

  • same scenarios, but in worker's case

  • worker loading libraries that were loaded by main

    • maintain the count within worker's scope, reset to the baseline upon exit

The reference counts, both for the shared object handle (whether we maintain that ourselves on AIX, or libc does that for us) and for the node_module* pointer, are both global/per-process, and not scoped to Workers or the main thread.

  • same scenarios, but between 2 or more workers without main's involvement

Hm – I’m not sure I understand what you’re asking here?

@gireeshpunathil
Copy link
Member

same scenarios, but between 2 or more workers without main's involvement

I mean to say, a dll was loaded by a worker, but a second worker that comes up starts loading the same. I guess I got the answer as, it is nothing different than a main - worker scenario. Am I right?

re: AIX, does this fact make anything better for you? i.e, AIX knows (internally) how many times the library load was attempted, and the unload unrolls that count. I closed more than I opened just to show that it actually remembers what it opened.

$ cat main.cc

#include <dlfcn.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
  int* cache[3];
  for(int i = 0; i < 3; i++) {
    cache[i] = (int*) dlopen(argv[1], RTLD_NOW | RTLD_GLOBAL);
    if (cache[i]== NULL) {
      fprintf(stderr, "error loading the lib\n");
      return -1;
    }
    fprintf(stderr, "index: %d, handle: %p\n", i, (void*) cache[i]);
 }
  for(int i = 0; i < 5; i++) {
    fprintf(stderr, "%d\n", dlclose((void*) cache[i % 10]));
  }
}

$ ./a.out ./libfoo.so

index: 0, handle: 3
index: 1, handle: 4
index: 2, handle: 5
0
0
0
22
22

@addaleax
Copy link
Member Author

same scenarios, but between 2 or more workers without main's involvement

I mean to say, a dll was loaded by a worker, but a second worker that comes up starts loading the same. I guess I got the answer as, it is nothing different than a main - worker scenario. Am I right?

Yeah, the only difference that Worker threads vs the main thread makes is that the main thread doesn’t unload addons.

So that’s why in the test here, we’re seeing 2 full load + unload cycles when using 2 Workers, but only 1 call to static constructors/destructors when using the main thread and a Worker thread.

re: AIX, does this fact make anything better for you? i.e, AIX knows (internally) how many times the library load was attempted, and the unload unrolls that count. I closed more than I opened just to show that it actually remembers what it opened.

I’m not sure… what we need is a fixed key for caching the node_module* for a given shared object. Would *(int*)handle be that? I don’t want to run into problems relying on AIX’s internals, given how much trouble we’ve been having with people hooking into our internals 😄

@gireeshpunathil
Copy link
Member

@addaleax - agreed. we can't rely on undocumented opaque variables. And those handles are not even native addresses, even to attempt one.

@addaleax
Copy link
Member Author

@gireeshpunathil Thank you!

@jasnell Does your approval still stand? The PR has changed a bit since you approved, but otherwise it should be ready. :)

@richardlau
Copy link
Member

In case anyone wants to reference the AIX documentation for dlopen() quoted by @gireeshpunathil, it can be found: https://www.ibm.com/support/knowledgecenter/en/ssw_aix_71/com.ibm.aix.basetrf1/dlopen.htm

@addaleax
Copy link
Member Author

Landed in 8ebd339 🎉

@addaleax addaleax closed this Feb 22, 2019
@addaleax addaleax deleted the worker-addons-ii branch February 22, 2019 20:54
addaleax added a commit that referenced this pull request Feb 22, 2019
Allow loading add-ons from multiple Node.js instances if they are
declared context-aware; in particular, this applies to N-API addons.

Also, plug a memory leak that occurred when registering N-API addons.

Refs: #23319

PR-URL: #26175
Fixes: #21481
Fixes: #21783
Fixes: #25662
Fixes: #20239
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Gireesh Punathil <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
addaleax added a commit that referenced this pull request Feb 25, 2019
Allow loading add-ons from multiple Node.js instances if they are
declared context-aware; in particular, this applies to N-API addons.

Also, plug a memory leak that occurred when registering N-API addons.

Refs: #23319

PR-URL: #26175
Fixes: #21481
Fixes: #21783
Fixes: #25662
Fixes: #20239
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Gireesh Punathil <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
@BridgeAR BridgeAR mentioned this pull request Feb 26, 2019
BridgeAR added a commit that referenced this pull request Feb 26, 2019
Notable changes:

* worker:
  * Improve integration with native addons
    (#26175)
rvagg pushed a commit that referenced this pull request Feb 28, 2019
Allow loading add-ons from multiple Node.js instances if they are
declared context-aware; in particular, this applies to N-API addons.

Also, plug a memory leak that occurred when registering N-API addons.

Refs: #23319

PR-URL: #26175
Fixes: #21481
Fixes: #21783
Fixes: #25662
Fixes: #20239
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Gireesh Punathil <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
BridgeAR added a commit that referenced this pull request Mar 5, 2019
Notable Changes

* n-api:
  * Implement date object (Jarrod Connolly)
    (#25917)
* util:
  * Group array elements together (Ruben Bridgewater)
    (#26269)
  * Add compact depth mode (Ruben Bridgewater)
    (#26269)
* worker:
  * Improve integration with native addons (Anna Henningsen)
    (#26175)
BridgeAR added a commit that referenced this pull request Mar 5, 2019
Notable Changes

* n-api:
  * Implement date object (Jarrod Connolly)
    #25917
* util:
  * Add compact depth mode for `util.inspect()` (Ruben Bridgewater)
    #26269
* worker:
  * Improve integration with native addons (Anna Henningsen)
    #26175
  * MessagePort.prototype.onmessage takes arguments closer to the Web
    specification now (Anna Henningsen)
    #26082
Trott pushed a commit to Trott/io.js that referenced this pull request Mar 6, 2019
Notable Changes

* n-api:
  * Implement date object (Jarrod Connolly)
    nodejs#25917
* util:
  * Add compact depth mode for `util.inspect()` (Ruben Bridgewater)
    nodejs#26269
* worker:
  * Improve integration with native addons (Anna Henningsen)
    nodejs#26175
  * MessagePort.prototype.onmessage takes arguments closer to the Web
    specification now (Anna Henningsen)
    nodejs#26082
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
addons Issues and PRs related to native addons. c++ Issues and PRs that require attention from people who are familiar with C++. node-api Issues and PRs related to the Node-API. semver-minor PRs that contain new features and should be released in the next minor version. worker Issues and PRs related to Worker support.
Projects
None yet
8 participants