Dynamically loading N-API functions #584

Closed
goto-bus-stop opened this issue Aug 8, 2020 · 6 comments
Comments

@goto-bus-stop
Member

goto-bus-stop commented Aug 8, 2020

This issue explores the N-API linking story on Windows and cross-platform.
I tried to make it understandable for people who haven't done any Windows
programming but please ask if anything's unclear 🙇‍♀️

Summary

We could manually load pointers to N-API functions using GetModuleHandle
and dlopen(). On Windows, this lets us avoid a compile-time dependency on
node.lib and makes the win_delay_load_hook unnecessary.

Background

When your Neon addon is loaded, it looks up the address of all the napi_*
functions that it needs. Otherwise, it could not call those functions. These
napi_* functions are provided by the Node.js executable itself.

On Linux, my understanding is that the process has one big namespace for all the
functions. So when the addon is loaded, the OS's runtime linker can get to work
and find all the napi_* addresses for you, no matter where in the current
process they are defined.

On Windows, things are a bit stricter. You not only need to tell it about the
name of the function, but also which module (exe or dll file) contains it. This
is done at compile time using a .lib file (node.lib for N-API). This .lib
file contains entries like "find napi_create_object in node.exe". When the addon
is loaded, the OS's runtime linker looks up all the napi_* addresses inside
node.exe.

The Problems

This setup causes two complications for Neon.

  1. We need to have this node.lib file available at build time, because we need
    to link to it. This file does not ship with Node.js installations, at least
    not on Windows.

  2. Not all Node.js executables are called "node.exe". Node.js can be embedded
    inside other applications. The major one is "electron.exe". If we build a
    Neon addon with Node.js's node.lib, Windows will look for N-API functions
    in "node.exe". But, if we are running the node addon inside Electron, there
    is no "node.exe": this causes Windows to look for a node.exe at some
    predefined paths on the system and load it if it exists. Whether "node.exe"
    exists or not, the end result is not good.

    So, what we really want to tell Windows is to look up napi_* functions in
    the host process, whether it is named "node.exe" or "electron.exe" or
    something else.

The obvious way to address problem 1 is by downloading node.lib in a build.rs
script. Problem 2 is the more interesting one.

A Solution

node-gyp solves this using a delayed loading hook. MSVC and Windows have a neat
feature where you can specify that a particular module that you link to should
not be loaded at startup, but only once you use its functions. For Neon, that
would mean that Windows won't look for all the napi_* functions as soon as the
addon is loaded, but only look them up once they are called. This is called
delayed loading.

Delayed loading also allows you to declare hooks. These hooks let you intercept
the loading of a module or a function. The interesting bit for us: we can
intercept loads of the module "node.exe", and return the correct value
ourselves. GetModuleHandle(NULL) returns a handle to the executable of the
calling process, which is the module we want.

This is what it looks like:
https://github.com/nodejs/node-gyp/blob/aaf33c30296ddb71c12e2b587a5ec5add3f8ace0/src/win_delay_load_hook.cc#L23-L35
(The final line, __pfnDliNotifyHook2 = load_exe_hook, declares a symbol that
the delayed loading code will look for when it loads something. That's how the
hooking is done.)

It is also possible to do this in Rust. The function looks quite similar:
https://github.com/goto-bus-stop/neon/blob/00d60dd0f6b70b70e32bededeba31ba729d44d6a/src/win_delay_load_hook.rs#L51-L72
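
Stripped of the Windows types, the decision the hook makes is roughly the following. This is only a sketch of the logic, not the linked code: `Notification`, `HOST_BINARIES` and `redirect_to_host` are names made up for illustration, and the real hook returns the handle from GetModuleHandle(NULL) rather than a bool.

```rust
/// Simplified stand-in for the delay-load notification codes.
/// (Hypothetical enum; the real values are defined in delayimp.h.)
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum Notification {
    /// Windows is about to load a delay-loaded module.
    PreLoadLibrary,
    /// Any other notification, which the hook ignores.
    Other,
}

/// Module names that should resolve to the host process (assumed list).
const HOST_BINARIES: [&str; 1] = ["node.exe"];

/// Returns true when the hook should answer with the host process handle
/// instead of letting Windows search for the named module.
pub fn redirect_to_host(event: Notification, dll_name: &str) -> bool {
    event == Notification::PreLoadLibrary
        && HOST_BINARIES
            .iter()
            // Windows module names are case-insensitive.
            .any(|host| host.eq_ignore_ascii_case(dll_name))
}
```

Every other module name falls through to the default loading behaviour, so only the lookup of the host binary is redirected.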

This lets us load Neon addons on Windows, even if the "node.exe" file has been
renamed 🎉

But… it's not perfect

We can now build addons for Windows, but there are still some pain points:

  1. The build script downloads a file from nodejs.org, outside of Cargo. What if
    nodejs.org is down, or blocked?
  2. We still need to handle Electron specially, because its node.lib file
    tells the runtime linker to look for napi_* functions in the module
    "electron.exe". This can be as simple as just checking for both "node.exe"
    and "electron.exe" in the code above. But what if there are other Node.js-
    based applications in the future, or a new Node.js fork?

Alternative Solution

Whenever we call an N-API function with delayed loading, Windows uses the
information from the node.lib file to essentially do this for us:

HMODULE node = GetModuleHandle("node.exe");
FARPROC fn = GetProcAddress(node, "napi_create_object");

So, what if we do this ourselves instead? Then, the node.lib file is
unnecessary. We then also control all the details. In particular, it lets us
make tweaks like this:

HMODULE node = GetModuleHandle(NULL);
FARPROC fn = GetProcAddress(node, "napi_create_object");

Now, we immediately tell the Windows API that we're looking for the host
process module, and we don't need to use the load hook to redirect "node.exe"
or "electron.exe" at all.
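
For comparison, here is a rough Unix-only sketch of the same idea using dlopen() and dlsym() directly, without any crates. Passing a null filename to dlopen() asks for the main program, which plays the role of GetModuleHandle(NULL). The `host` module, `lookup` and `strlen_via_lookup` are hypothetical names, and the example resolves libc's `strlen` only because that symbol exists in any process; an addon would ask for names like "napi_create_object" instead.

```rust
#[cfg(unix)]
pub mod host {
    use std::ffi::{c_void, CStr};
    use std::mem;
    use std::os::raw::{c_char, c_int};
    use std::ptr;

    extern "C" {
        fn dlopen(filename: *const c_char, flags: c_int) -> *mut c_void;
        fn dlsym(handle: *mut c_void, symbol: *const c_char) -> *mut c_void;
    }

    // RTLD_LAZY is 1 on both Linux and macOS.
    const RTLD_LAZY: c_int = 1;

    /// Looks up `name` in the running process. A null filename makes
    /// dlopen() return a handle for the main program.
    pub fn lookup(name: &CStr) -> Option<*mut c_void> {
        unsafe {
            let handle = dlopen(ptr::null(), RTLD_LAZY);
            if handle.is_null() {
                return None;
            }
            let sym = dlsym(handle, name.as_ptr());
            if sym.is_null() {
                None
            } else {
                Some(sym)
            }
        }
    }

    /// Example: resolve libc's `strlen` from the process and call it.
    pub fn strlen_via_lookup(s: &CStr) -> Option<usize> {
        let name = CStr::from_bytes_with_nul(b"strlen\0").ok()?;
        let sym = lookup(name)?;
        // The caller must know the real signature; dlsym() cannot check it.
        let f: unsafe extern "C" fn(*const c_char) -> usize =
            unsafe { mem::transmute(sym) };
        Some(unsafe { f(s.as_ptr()) })
    }
}
```

The transmute is the risky part: nothing verifies that the declared signature matches the symbol, which is the same trust we already place in the generated bindings.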

Bill Ticehurst wrote about this approach in a blog post. That was
for a different use case, but it seems like people are already doing things like
this.

In Rust, we can do this in a cross-platform way using libloading. We can
then store pointers to the napi_* functions that we need in a static location
in memory, so that it's very fast: one pointer dereference and one call
instruction. This is basically the same as what the runtime linker would do.

One possible implementation for this may be the recently proposed dynamic
linking PR for bindgen: #1846. It could generate a NodeApi struct with
all the N-API function pointers in it. When a Neon addon loads, we can create
an instance of this struct. Then we update all our napi_* callsites to
something like:

(napi().napi_create_object)(.. args ..)

Where napi() is a hypothetical function that returns the NodeApi struct. The
extra set of parens is necessary because .napi_create_object is a field
containing a pointer; it's not a method. This napi() function could likely
just return a static address, so it should always be inlined by the compiler.

The initialization can happen in the register_module! macro so that end users
don't have to worry about it.
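
To make the shape of this concrete, here is a minimal sketch with stand-in types. Everything in it is an assumption for illustration: `Env` and `Status` are simplified placeholders, the `napi_create_object` signature is not the real one, and `init`, `fake_create_object` and `demo` are hypothetical helpers. A generated struct would hold the real signatures and be filled from the host process.

```rust
use std::sync::OnceLock;

/// Simplified stand-ins for N-API types (the real ones are opaque pointers).
pub type Env = usize;
pub type Status = i32;

/// Table of function pointers, one field per N-API function we use.
pub struct NodeApi {
    pub napi_create_object: unsafe extern "C" fn(env: Env, result: *mut usize) -> Status,
}

/// Filled in once when the addon is loaded.
static NODE_API: OnceLock<NodeApi> = OnceLock::new();

/// Stores the table; returns false if it was already initialized.
pub fn init(api: NodeApi) -> bool {
    NODE_API.set(api).is_ok()
}

/// Returns the table. Panics if `init` has not run yet.
#[inline(always)]
pub fn napi() -> &'static NodeApi {
    NODE_API.get().expect("N-API function table not initialized")
}

/// Fake implementation so the sketch runs outside of Node.js.
unsafe extern "C" fn fake_create_object(_env: Env, result: *mut usize) -> Status {
    unsafe { *result = 42 };
    0
}

pub fn demo() -> usize {
    init(NodeApi { napi_create_object: fake_create_object });
    let mut out = 0usize;
    // The extra parens select the field before calling through the pointer.
    let status = unsafe { (napi().napi_create_object)(0, &mut out) };
    assert_eq!(status, 0);
    out
}
```

Note that `OnceLock` adds an initialization check on each `napi()` call, so this is slightly more than the single pointer dereference described above; a plain static written once during module registration would avoid that at the cost of some unsafe code.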

Advantages

With the dynamic loading approach, we don't need a node.lib file, and we
don't need to hook the delayed loading mechanism. This removes the hairiest
platform-specific part of the build system. The win_delay_load_hook can
also be hard to understand for outsiders, because it relies on an obscure
feature.

Drawbacks

The main ones I can see:

  1. Doing the dynamic loading ourselves is unnecessary on Linux. But AFAICT there
    is no simple way to switch between the "standard" approach and this dynamic
    approach at compile time, because the source code has to change at each N-API
    call site.
  2. Our code becomes a bit clunky with the parens and napi() call.
@kjvalencik
Member

kjvalencik commented Aug 8, 2020

I really like the alternative approach. I think it's acceptable to introduce the dynamic loading for Linux/macOS to reduce the cross platform differences.

However, if we want to avoid it, we could introduce a napi!(napi_method_name)(args) macro with platform specific implementations. On Windows it would expand to (napi().napi_method_name)(args) and on other OS nodejs_sys::napi_method_name(args).

Or something similar. I'm not sure if rust-analyzer would handle it better with the arguments inside the macro or outside.
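
A rough sketch of what such a macro could look like, with the arguments outside the macro. The `linked` module stands in for nodejs_sys, and `demo_add` is a made-up function rather than a real N-API call, so that the example runs on its own.

```rust
/// Stand-in for the statically linked bindings (nodejs_sys above).
mod linked {
    pub fn demo_add(a: i32, b: i32) -> i32 {
        a + b
    }
}

/// Stand-in for the table of dynamically loaded function pointers.
pub struct NodeApi {
    pub demo_add: fn(i32, i32) -> i32,
}

static API: NodeApi = NodeApi { demo_add: linked::demo_add };

pub fn napi() -> &'static NodeApi {
    &API
}

// On Windows, go through the function pointer table.
#[cfg(windows)]
macro_rules! napi {
    ($name:ident) => {
        (napi().$name)
    };
}

// Elsewhere, call the linked function directly.
#[cfg(not(windows))]
macro_rules! napi {
    ($name:ident) => {
        linked::$name
    };
}

pub fn demo() -> i32 {
    // The call site is identical on every platform.
    let sum = napi!(demo_add)(2, 3);
    sum
}
```

Because the macro expands to a plain path or field expression, the argument list stays ordinary Rust, which may be easier on editor tooling than passing the arguments into the macro.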

@goto-bus-stop
Member Author

goto-bus-stop commented Aug 9, 2020

Yes! A macro is a good idea, and would be great for other reasons too. We could then more easily introduce something similar to the NAPI_CALL() business shown here (scroll down a bit) to provide better error information instead of the assert_eq!(status, napi_ok) that we do right now, if we want to, or swap out the linking details again later down the line.
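
Something along these lines, as a sketch only. `Status`, `NapiError`, `napi_call!` and the two fake functions are all hypothetical; the point is that the macro can capture the function name and the status in one place.

```rust
/// Simplified stand-in for napi_status.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Status {
    Ok,
    InvalidArg,
}

/// Error carrying the name of the failing function and its status.
#[derive(Debug, PartialEq)]
pub struct NapiError {
    pub function: &'static str,
    pub status: Status,
}

// Fake functions so the sketch runs without Node.js.
fn fake_ok(value: i32, out: &mut i32) -> Status {
    *out = value;
    Status::Ok
}

fn fake_fail(_value: i32, _out: &mut i32) -> Status {
    Status::InvalidArg
}

/// Calls the function and turns a non-Ok status into a descriptive error.
macro_rules! napi_call {
    ($func:ident ( $($arg:expr),* $(,)? )) => {
        match $func($($arg),*) {
            Status::Ok => Ok(()),
            status => Err(NapiError { function: stringify!($func), status }),
        }
    };
}

pub fn demo_ok() -> Result<i32, NapiError> {
    let mut out = 0;
    napi_call!(fake_ok(7, &mut out))?;
    Ok(out)
}

pub fn demo_fail() -> Result<i32, NapiError> {
    let mut out = 0;
    napi_call!(fake_fail(7, &mut out))?;
    Ok(out)
}
```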

@tjallingt

I'm not sure how much of N-API is used by the average neon app. Would it be possible/difficult to lazily populate the NodeApi struct with the procedure addresses?

@goto-bus-stop
Member Author

goto-bus-stop commented Aug 9, 2020

With the current design of the bindgen PR, that is not an option. If we write our own Rust signatures for the N-API functions it would be possible. I don't think that's likely to be a performance bottleneck though.

That said, one more drawback of the dynamic loading approach with the current bindgen design is that we would be looking up every N-API function, instead of only the ones we use. If that turns out to be a problem we can definitely try lazy loading the functions or keeping our own list of them.
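
If we did write the signatures ourselves, lazy loading could look roughly like this: each function gets its own slot that is resolved on first use. This is a sketch with made-up names; `resolve` stands in for GetProcAddress or dlsym(), and the counter exists only to show that the lookup happens once.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Simplified signature; the real napi_create_object takes arguments.
type CreateObject = fn() -> u32;

/// Counts how often the (fake) symbol lookup runs.
static LOOKUPS: AtomicUsize = AtomicUsize::new(0);

fn fake_create_object() -> u32 {
    7
}

/// Stand-in for GetProcAddress / dlsym().
fn resolve(name: &str) -> Option<CreateObject> {
    LOOKUPS.fetch_add(1, Ordering::SeqCst);
    match name {
        "napi_create_object" => Some(fake_create_object as CreateObject),
        _ => None,
    }
}

/// Resolves the function on first use and caches the pointer afterwards.
pub fn napi_create_object() -> CreateObject {
    static SLOT: OnceLock<CreateObject> = OnceLock::new();
    *SLOT.get_or_init(|| resolve("napi_create_object").expect("symbol not found"))
}

pub fn lookup_count() -> usize {
    LOOKUPS.load(Ordering::SeqCst)
}
```

Functions that are never called are never looked up, at the cost of an initialization check on every call.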

@goto-bus-stop
Member Author

goto-bus-stop commented Aug 11, 2020

To address that last drawback, we could use .whitelist_function() to only generate bindings for the functions that we use.

@goto-bus-stop
Member Author

I'd say #646 has done this :)
