Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow passing FileHandle objects via postMessage #33772

Closed
wants to merge 11 commits into from
Closed
17 changes: 15 additions & 2 deletions doc/api/errors.md
Original file line number Diff line number Diff line change
Expand Up @@ -1566,6 +1566,17 @@ behavior. See the documentation for [policy][] manifests for more information.
An attempt was made to allocate memory (usually in the C++ layer) but it
failed.

<a id="ERR_MESSAGE_TARGET_CONTEXT_UNAVAILABLE"></a>
### `ERR_MESSAGE_TARGET_CONTEXT_UNAVAILABLE`
<!-- YAML
added: REPLACEME
-->

A message posted to a [`MessagePort`][] could not be deserialized in the target
[vm][] `Context`. Not all Node.js objects can be successfully instantiated in
any context at this time, and attempting to transfer them using `postMessage()`
can fail on the receiving side in that case.

<a id="ERR_METHOD_NOT_IMPLEMENTED"></a>
### `ERR_METHOD_NOT_IMPLEMENTED`

Expand All @@ -1589,8 +1600,9 @@ is thrown if a required option is missing.
<a id="ERR_MISSING_MESSAGE_PORT_IN_TRANSFER_LIST"></a>
### `ERR_MISSING_MESSAGE_PORT_IN_TRANSFER_LIST`

A `MessagePort` was found in the object passed to a `postMessage()` call,
but not provided in the `transferList` for that call.
An object that needs to be explicitly listed in the `transferList` argument
was found in the object passed to a `postMessage()` call, but not provided in
the `transferList` for that call. Usually, this is a `MessagePort`.

<a id="ERR_MISSING_PASSPHRASE"></a>
### `ERR_MISSING_PASSPHRASE`
Expand Down Expand Up @@ -2563,6 +2575,7 @@ such as `process.stdout.on('data')`.
[`Class: assert.AssertionError`]: assert.html#assert_class_assert_assertionerror
[`ERR_INVALID_ARG_TYPE`]: #ERR_INVALID_ARG_TYPE
[`EventEmitter`]: events.html#events_class_eventemitter
[`MessagePort`]: worker_threads.html#worker_threads_class_messageport
[`Object.getPrototypeOf`]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object/getPrototypeOf
[`Object.setPrototypeOf`]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object/setPrototypeOf
[`REPL`]: repl.html
Expand Down
30 changes: 28 additions & 2 deletions doc/api/worker_threads.md
Original file line number Diff line number Diff line change
Expand Up @@ -303,6 +303,15 @@ input of [`port.postMessage()`][].
Listeners on this event will receive a clone of the `value` parameter as passed
to `postMessage()` and no further arguments.

### Event: `'messageerror'`
<!-- YAML
added: REPLACEME
-->

* `error` {Error} An Error object

The `'messageerror'` event is emitted when deserializing a message failed.

### `port.close()`
<!-- YAML
added: v10.5.0
Expand All @@ -318,6 +327,10 @@ are part of the channel.
### `port.postMessage(value[, transferList])`
<!-- YAML
added: v10.5.0
changes:
- version: REPLACEME
pr-url: https://github.com/nodejs/node/pull/33772
description: Added `FileHandle` to the list of transferable types.
-->

* `value` {any}
Expand All @@ -335,7 +348,8 @@ In particular, the significant differences to `JSON` are:
* `value` may contain typed arrays, both using `ArrayBuffer`s
and `SharedArrayBuffer`s.
* `value` may contain [`WebAssembly.Module`][] instances.
* `value` may not contain native (C++-backed) objects other than `MessagePort`s.
* `value` may not contain native (C++-backed) objects other than `MessagePort`s
and [`FileHandle`][]s.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future, we might want to consider marking transferable objects with a special static read-only symbol property so we don't actually have to maintain this list. For instance, something like:

const { transferableSymbol } = require('worker_threads');

MessagePort[transferableSymbol]         // True!
FileHandle[transferableSymbol]          // True!
MyOtherHandleThing[transferableSymbol   // False!

Then we wouldn't necessarily have to maintain a list of such objects in the docs, we would just say that value may be any object where [transferableSymbol] === true

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jasnell I think that’s a good idea, but I do think it’s important to list the built-in types that are transferable here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, but I think we could put those in one place rather than having them listed in a couple different places (they're currently listed in the value and the transferList, etc)


```js
const { MessageChannel } = require('worker_threads');
Expand All @@ -349,7 +363,8 @@ circularData.foo = circularData;
port2.postMessage(circularData);
```

`transferList` may be a list of `ArrayBuffer` and `MessagePort` objects.
`transferList` may be a list of [`ArrayBuffer`][], [`MessagePort`][] and
[`FileHandle`][] objects.
After transferring, they will not be usable on the sending side of the channel
anymore (even if they are not contained in `value`). Unlike with
[child processes][], transferring handles such as network sockets is currently
Expand Down Expand Up @@ -675,6 +690,15 @@ See the [`port.on('message')`][] event for more details.
All messages sent from the worker thread will be emitted before the
[`'exit'` event][] is emitted on the `Worker` object.

### Event: `'messageerror'`
<!-- YAML
added: REPLACEME
-->

* `error` {Error} An Error object

The `'messageerror'` event is emitted when deserializing a message failed.

### Event: `'online'`
<!-- YAML
added: v10.5.0
Expand Down Expand Up @@ -816,13 +840,15 @@ active handle in the event system. If the worker is already `unref()`ed calling

[`'close'` event]: #worker_threads_event_close
[`'exit'` event]: #worker_threads_event_exit
[`ArrayBuffer`]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/ArrayBuffer
[`AsyncResource`]: async_hooks.html#async_hooks_class_asyncresource
[`Buffer`]: buffer.html
[`Buffer.allocUnsafe()`]: buffer.html#buffer_class_method_buffer_allocunsafe_size
[`ERR_MISSING_MESSAGE_PORT_IN_TRANSFER_LIST`]: errors.html#errors_err_missing_message_port_in_transfer_list
[`ERR_WORKER_NOT_RUNNING`]: errors.html#ERR_WORKER_NOT_RUNNING
[`EventEmitter`]: events.html
[`EventTarget`]: https://developer.mozilla.org/en-US/docs/Web/API/EventTarget
[`FileHandle`]: fs.html#fs_class_filehandle
[`MessagePort`]: #worker_threads_class_messageport
[`SharedArrayBuffer`]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/SharedArrayBuffer
[`Uint8Array`]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Uint8Array
Expand Down
1 change: 1 addition & 0 deletions lib/internal/bootstrap/node.js
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,7 @@ process._exiting = false;

// process.config is serialized config.gypi
process.config = JSONParse(internalBinding('native_module').config);
require('internal/worker/js_transferable').setup();

// Bootstrappers for all threads, including worker threads and main thread
const perThreadSetup = require('internal/process/per_thread');
Expand Down
28 changes: 26 additions & 2 deletions lib/internal/fs/promises.js
Original file line number Diff line number Diff line change
Expand Up @@ -65,13 +65,17 @@ const { promisify } = require('internal/util');
const kHandle = Symbol('kHandle');
const kFd = Symbol('kFd');
const { kUsePromises } = binding;
const {
JSTransferable, kDeserialize, kTransfer, kTransferList
} = require('internal/worker/js_transferable');

const getDirectoryEntriesPromise = promisify(getDirents);

class FileHandle {
class FileHandle extends JSTransferable {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this have an impact on the memory consumption or size of FileHandle objects?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the inspector, this adds 32 bytes in total, which seems to make sense to me (1 × vpointer + 3 × pointer-size fields in BaseObject).

For comparison, the close function on each FileHandle is 56 bytes in size.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For comparison, the close function on each FileHandle is 56 bytes in size.

That is actually pretty unfortunate to allocate a function per handle, but I guess maybe file handles are expected to be relatively expensive to hold?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benjamingr I would love to see our objects become smaller in general, but I think it’s fine here. There’s also the native FileHandle class, which also comes with 56 bytes per instance.

And as far as FileHandles as a resource are concerned, we maybe create thousands of them in a single process, but not millions.

constructor(filehandle) {
super();
this[kHandle] = filehandle;
this[kFd] = filehandle.fd;
this[kFd] = filehandle ? filehandle.fd : -1;
}

getAsyncId() {
Expand Down Expand Up @@ -142,6 +146,26 @@ class FileHandle {
this[kFd] = -1;
return this[kHandle].close();
}

[kTransfer]() {
const handle = this[kHandle];
this[kFd] = -1;
this[kHandle] = null;

return {
data: { handle },
deserializeInfo: 'internal/fs/promises:FileHandle'
};
}

[kTransferList]() {
return [ this[kHandle] ];
}

[kDeserialize]({ handle }) {
this[kHandle] = handle;
this[kFd] = handle.fd;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jasnell Happy to take suggestions on the API here. It’s a bit unfortunate that we can’t use the constructor and instead use this deserialization method to fill the object’s contents with the transferred data, but I don’t really know a way around this (See the comment in JSTransferable::Data::Deserialize() for details on why it seems necessary to me)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pattern is actually pretty common in JS for "hydrating" objects in my experience - though it is admittedly less neat than doing everything in the constructor and always getting a fully initialized object.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm good with this API. The one thing that I would be concerned about is whether we'll always be certain that kDeserialize is always going to be called with a non-null/non-undefined value, otherwise the ({ handle }) deconstruction will fail.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will always be called with the value in [kTransfer]().data, so yes, that should work out.

}

function validateFileHandle(handle) {
Expand Down
4 changes: 3 additions & 1 deletion lib/internal/worker.js
Original file line number Diff line number Diff line change
Expand Up @@ -192,7 +192,9 @@ class Worker extends EventEmitter {
transferList.push(...options.transferList);

this[kPublicPort] = port1;
this[kPublicPort].on('message', (message) => this.emit('message', message));
for (const event of ['message', 'messageerror']) {
this[kPublicPort].on(event, (message) => this.emit(event, message));
}
setupPortReferencing(this[kPublicPort], this, 'message');
this[kPort].postMessage({
argv,
Expand Down
39 changes: 39 additions & 0 deletions lib/internal/worker/js_transferable.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
'use strict';
const { Error } = primordials;
const {
messaging_deserialize_symbol,
messaging_transfer_symbol,
messaging_clone_symbol,
messaging_transfer_list_symbol
} = internalBinding('symbols');
const {
JSTransferable,
setDeserializerCreateObjectFunction
} = internalBinding('messaging');

function setup() {
// Register the handler that will be used when deserializing JS-based objects
// from .postMessage() calls. The format of `deserializeInfo` is generally
// 'module:Constructor', e.g. 'internal/fs/promises:FileHandle'.
setDeserializerCreateObjectFunction((deserializeInfo) => {
const [ module, ctor ] = deserializeInfo.split(':');
const Ctor = require(module)[ctor];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the module name be guarded with a whitelist here?

I don't think it should since worker threads don't provide a sandbox but I just want to make sure that this isn't exposing a new attack surface somehow.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benjamingr Hm yeah … this isn’t extending capabilities of the calling code in any way, but I agree that it might be good to be careful here. I’ll think about this a bit more.

Copy link
Member Author

@addaleax addaleax Jun 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benjamingr I’ve updated this PR so that this function now verifies that the target constructor is a subclass of JSTransferable, and added tests that verify that fact + the fact that this will only load internal modules (which was already the case before).

I think that’s as “locked down” as this can get. If user code makes other internal classes inherit from JSTransferable on the receiving side, and override the kTransfer method on the sending side – which is totally doable in most situations – then that means that you can already make arbitrary changes to Node.js internals and won’t even need to do this.

The one thing I could see happen would be side effects from require()ing the internal module itself, which the sending code can still trigger on the receiving side (and which would still typically result in an uncaught exception).

What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The one thing I could see happen would be side effects from require()ing the internal module itself, which the sending code can still trigger on the receiving side (and which would still typically result in an uncaught exception).

Fwiw that's the part that initially bugged me, but like we both said it doesn't increase the attack surface at all from what I can tell.

if (typeof Ctor !== 'function' ||
!(Ctor.prototype instanceof JSTransferable)) {
// Not one of the official errors because one should not be able to get
// here without messing with Node.js internals.
// eslint-disable-next-line no-restricted-syntax
throw new Error(`Unknown deserialize spec ${deserializeInfo}`);
}
return new Ctor();
});
}

module.exports = {
setup,
JSTransferable,
kClone: messaging_clone_symbol,
kDeserialize: messaging_deserialize_symbol,
kTransfer: messaging_transfer_symbol,
kTransferList: messaging_transfer_list_symbol
};
1 change: 1 addition & 0 deletions node.gyp
Original file line number Diff line number Diff line change
Expand Up @@ -220,6 +220,7 @@
'lib/internal/vm/module.js',
'lib/internal/worker.js',
'lib/internal/worker/io.js',
'lib/internal/worker/js_transferable.js',
'lib/internal/watchdog.js',
'lib/internal/streams/lazy_transform.js',
'lib/internal/streams/async_iterator.js',
Expand Down
1 change: 1 addition & 0 deletions src/async_wrap.cc
Original file line number Diff line number Diff line change
Expand Up @@ -557,6 +557,7 @@ Local<FunctionTemplate> AsyncWrap::GetConstructorTemplate(Environment* env) {
if (tmpl.IsEmpty()) {
tmpl = env->NewFunctionTemplate(nullptr);
tmpl->SetClassName(FIXED_ONE_BYTE_STRING(env->isolate(), "AsyncWrap"));
tmpl->Inherit(BaseObject::GetConstructorTemplate(env));
env->SetProtoMethod(tmpl, "getAsyncId", AsyncWrap::GetAsyncId);
env->SetProtoMethod(tmpl, "asyncReset", AsyncWrap::AsyncReset);
env->SetProtoMethod(tmpl, "getProviderType", AsyncWrap::GetProviderType);
Expand Down
15 changes: 15 additions & 0 deletions src/base_object-inl.h
Original file line number Diff line number Diff line change
Expand Up @@ -156,6 +156,7 @@ BaseObject::MakeLazilyInitializedJSTemplate(Environment* env) {
};

v8::Local<v8::FunctionTemplate> t = env->NewFunctionTemplate(constructor);
t->Inherit(BaseObject::GetConstructorTemplate(env));
t->InstanceTemplate()->SetInternalFieldCount(
BaseObject::kInternalFieldCount);
return t;
Expand Down Expand Up @@ -337,6 +338,20 @@ BaseObjectPtrImpl<T, kIsWeak>::operator bool() const {
return get() != nullptr;
}

template <typename T, bool kIsWeak>
template <typename U, bool kW>
bool BaseObjectPtrImpl<T, kIsWeak>::operator ==(
const BaseObjectPtrImpl<U, kW>& other) const {
return get() == other.get();
}

template <typename T, bool kIsWeak>
template <typename U, bool kW>
bool BaseObjectPtrImpl<T, kIsWeak>::operator !=(
const BaseObjectPtrImpl<U, kW>& other) const {
return get() != other.get();
}

template <typename T, typename... Args>
BaseObjectPtr<T> MakeBaseObject(Args&&... args) {
return BaseObjectPtr<T>(new T(std::forward<Args>(args)...));
Expand Down
55 changes: 54 additions & 1 deletion src/base_object.h
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,10 @@ class Environment;
template <typename T, bool kIsWeak>
class BaseObjectPtrImpl;

namespace worker {
class TransferData;
}

class BaseObject : public MemoryRetainer {
public:
enum InternalFields { kSlot, kInternalFieldCount };
Expand Down Expand Up @@ -98,7 +102,51 @@ class BaseObject : public MemoryRetainer {
// a BaseObjectPtr to this object.
inline void Detach();

protected:
static v8::Local<v8::FunctionTemplate> GetConstructorTemplate(
Environment* env);

// Interface for transferring BaseObject instances using the .postMessage()
// method of MessagePorts (and, by extension, Workers).
// GetTransferMode() returns a transfer mode that indicates how to deal with
// the current object:
// - kUntransferable:
// No transfer is possible, either because this type of BaseObject does
// not know how to be transfered, or because it is not in a state in
// which it is possible to do so (e.g. because it has already been
// transfered).
// - kTransferable:
// This object can be transfered in a destructive fashion, i.e. will be
// rendered unusable on the sending side of the channel in the process
// of being transfered. (In C++ this would be referred to as movable but
// not copyable.) Objects of this type need to be listed in the
// `transferList` argument of the relevant postMessage() call in order to
// make sure that they are not accidentally destroyed on the sending side.
// TransferForMessaging() will be called to get a representation of the
// object that is used for subsequent deserialization.
// The NestedTransferables() method can be used to transfer other objects
// along with this one, if a situation requires it.
// - kCloneable:
// This object can be cloned without being modified.
// CloneForMessaging() will be called to get a representation of the
// object that is used for subsequent deserialization, unless the
// object is listed in transferList, in which case TransferForMessaging()
// is attempted first.
// After a successful clone, FinalizeTransferRead() is called on the receiving
// end, and can read deserialize JS data possibly serialized by a previous
// FinalizeTransferWrite() call.
enum class TransferMode {
kUntransferable,
kTransferable,
kCloneable
};
virtual TransferMode GetTransferMode() const;
virtual std::unique_ptr<worker::TransferData> TransferForMessaging();
virtual std::unique_ptr<worker::TransferData> CloneForMessaging() const;
virtual v8::Maybe<std::vector<BaseObjectPtrImpl<BaseObject, false>>>
NestedTransferables() const;
virtual v8::Maybe<bool> FinalizeTransferRead(
v8::Local<v8::Context> context, v8::ValueDeserializer* deserializer);

virtual inline void OnGCCollect();

private:
Expand Down Expand Up @@ -197,6 +245,11 @@ class BaseObjectPtrImpl final {
inline T* operator->() const;
inline operator bool() const;

template <typename U, bool kW>
inline bool operator ==(const BaseObjectPtrImpl<U, kW>& other) const;
template <typename U, bool kW>
inline bool operator !=(const BaseObjectPtrImpl<U, kW>& other) const;

private:
union {
BaseObject* target; // Used for strong pointers.
Expand Down
11 changes: 11 additions & 0 deletions src/env.cc
Original file line number Diff line number Diff line change
Expand Up @@ -269,6 +269,7 @@ void Environment::CreateProperties() {
Local<FunctionTemplate> templ = FunctionTemplate::New(isolate());
templ->InstanceTemplate()->SetInternalFieldCount(
BaseObject::kInternalFieldCount);
templ->Inherit(BaseObject::GetConstructorTemplate(this));

set_binding_data_ctor_template(templ);
}
Expand Down Expand Up @@ -1112,4 +1113,14 @@ bool BaseObject::IsRootNode() const {
return !persistent_handle_.IsWeak();
}

Local<FunctionTemplate> BaseObject::GetConstructorTemplate(Environment* env) {
Local<FunctionTemplate> tmpl = env->base_object_ctor_template();
if (tmpl.IsEmpty()) {
tmpl = env->NewFunctionTemplate(nullptr);
tmpl->SetClassName(FIXED_ONE_BYTE_STRING(env->isolate(), "BaseObject"));
env->set_base_object_ctor_template(tmpl);
}
return tmpl;
}

} // namespace node
Loading