# Add onclose event to MessagePort #1766
It doesn't really work with the MessageChannel model as discussed in that email. Do you have a specific proposal that doesn't expose GC?
Reading over the thread and the spec again, that particular email's claim that there's ambiguity about what a port's 'owner' is turns out to be wrong. The spec does define which settings object owns the port, and sharing a reference doesn't change it. The pass-port-around case already looks basically broken. What about an event that fires:
I'll dupe https://www.w3.org/Bugs/Public/show_bug.cgi?id=28813 against this.
@isonmad the problem with that is that rather than exposing GC, you end up creating a memory leak by keeping the objects alive for the duration of the global. I think service workers ended up doing this without much thought about it.
That was in the thread too, and it turned out there was already a leak of MessagePorts, at least in Chrome in 2013; the spec language was written around the fact that devs should be careful regardless and not depend on GC. But is there a technical reason the MessagePort object itself actually has to be kept alive in its entirety in implementations of
I'd like to see this revived now that (My interest specifically comes from using MessagePorts in Electron as a general-purpose IPC tool.)
I use LockManager to shim web-worker `onclose`:

```js
/// worker
const lockReqId = 'process-live-' + Date.now() + Math.random();
navigator.locks.request(lockReqId, () => new Promise(() => {}));
postMessage(lockReqId);

/// master
worker.addEventListener('message', (me) => {
  if (typeof me.data === 'string' && me.data.startsWith('process-live-')) {
    navigator.locks.request(me.data, () => {
      worker.dispatchEvent(new CloseEvent('close'));
    });
  }
});
```
I'd like to revisit this; in Firefox we're implementing Web Locks and it seems like absent a MessagePort

I too re-read the original mailing-list thread linked from the top comment in this issue, and it seemed like the primary discussion points were:
With Web Locks (which is moving to WebApps WG) exposing same-storage-key agent termination, and ServiceWorkers exposing same-storage-key client enumeration via matchAll, there already exist same-origin mechanisms to surface lifecycle information. I think the main risk with the "close" event proposal here is that it could expose cross-origin lifecycles in an adversarial manner if not constrained. Specifically, if
@asutherland I don't think that would expose recipient agent lifetime without explicit action on the part of the recipient. The sender would not be able to distinguish between "the other end of the message channel was dropped due to GC" and "the recipient agent was terminated". If the recipient agent does not register any event listeners on the port it receives, and does not save a reference to it, then the port will be garbage collected and the close event would be emitted on the sender side, revealing no information about the lifecycle of the recipient.
@nornagon I agree that if we generate the close event on GC then we side-step the problem. I was unclear about which specific proposal I was discussing, which was the proposal comment in this thread at #1766 (comment). I read that proposal as explicitly not letting it be known when GC happens but instead when the global is terminated, which means that the message port post message steps (which explicitly deserialize, regardless of whether there are any listeners, and then run the transfer-receiving steps) would inherently need to tie the MessagePort to the global in a way that goes out of its way to ignore GC. @annevk pointed out issues related to this in his comment at #1766 (comment).

@domenic @annevk Do either of you have a feeling about what the current state of things is in terms of concerns about exposing GC? I know in Firefox that the existence of Web Locks and the (ServiceWorkers) Clients API means that it is conceptually possible to detect when Firefox GCs a Worker, for example.
I think there are two kinds of collection we need to differentiate: an object within an agent, and the agent itself. As has been expressed in the email threads, I would be worried about exposing the collection of objects inside an agent (a MessagePort) to another agent. Even with the existence of WeakRef and FinalizationRegistry, the capability to observe an object's lifecycle requires direct access to those APIs and to the object being observed. As such, tying the observable lifecycle of a remote MessagePort to program actions or to the remote agent's overall lifecycle seems more appropriate.

From what I understand, automatically tying the observable remote MessagePort lifecycle to the receiving agent creates its own problems. As @asutherland mentions, the solution is to tie the observation capability to explicit action by the program when receiving a MessagePort. More specifically, a sending agent should not be informed of a remote MessagePort disconnection unless the program running in the receiving agent starts the port (registers a message listener). At that point the sending agent would receive a close notification when either the receiving agent terminates or the program explicitly closes the received port. That is different from what @nornagon suggests: a sending agent cannot be made aware of the disconnection of the MessagePort if the receiving program ignores the received port.
I think since the introduction of WeakRefs the boundary has been broken a good deal, and anything equivalent to WeakRefs in power is allowable. It's still not something you want to do lightly, as minimizing the number of ways in which an application can be GC-dependent is important. But if it's useful (and I think this instance qualifies; we've been hearing this request for over a decade now) then it's probably reasonable to do. It sounds like we have a similar precedent with Web Locks and ServiceWorker for whole-agent GC, so I would extend the same reasoning to that case.
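As a concrete illustration of the precedent invoked here (a hedged sketch only; the variable names are invented, and this is not part of any proposal in this thread): WeakRef already makes the collection of an ordinary object observable from script, non-deterministically.

```js
// Sketch of the WeakRef precedent: GC of an object is already
// (non-deterministically) observable to code holding a WeakRef.
let target = { label: 'port-like object' }; // illustrative only
const ref = new WeakRef(target);

// While `target` is strongly reachable, deref() returns it.
const aliveBefore = ref.deref() !== undefined;

// After dropping the strong reference, a future GC may collect the
// object, at which point deref() would return undefined. There is no
// guarantee about *when*, which is exactly the GC-dependence concern.
target = null;
```

A close event generated on GC would be no more powerful than this pattern, which is the substance of the precedent argument.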
For the "close" event itself, CloseEvent is already a thing and is defined in the web-sockets chapter of HTML. Would we just reuse that? It has three attributes, defined (quoting from there) like so:

The defaults per CloseEventInit are
I don't think we should use CloseEvent for this, as it seems rather WebSocket-specific. We can just use Event.
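For illustration only (a sketch of the plain-Event direction suggested here; nothing was specced at this point, and the manual dispatch below just shows the API shape the user agent would drive):

```js
// Hedged sketch: a "close" notification modeled as a plain Event rather
// than a WebSocket-style CloseEvent (so no wasClean/code/reason fields).
const { port1, port2 } = new MessageChannel();

let receivedType = null;
port1.addEventListener('close', (event) => {
  receivedType = event.type; // a bare Event carries only type/target/etc.
});

// Manually dispatching here only to show the shape; in the actual
// proposal the user agent would fire this event.
port1.dispatchEvent(new Event('close'));

// Close both ends so nothing keeps the channel alive.
port1.close();
port2.close();
```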
@domenic I'd also like to add my two cents here. For a client, I am building a micro-frontend application where users can dynamically add and remove additional apps hosted in iframes loaded from various origins. We're connecting to those third-party browsing contexts via channel messaging, which works perfectly fine. However, we must do some cleanup work after an iframe is removed, the site is unloaded, the tab has crashed, etc. Unfortunately, due to the lack of an

For our use case, it would also be okay to opt in to exposing lifecycle events to third-party browsing contexts (as @asutherland mentioned), as the client apps are under our control. For us, it would also be fine if the event were dispatched some time after the message port has actually been closed. All in all, it would be really cool to have this event. :)
When we discussed this at TPAC in 2022, others shared the concern about running JS on shutdown, so it might still be better to only do this at deterministic times for now, i.e., when the specification dictates shutdown.
@annevk Can you clarify what you mean here? I understand the central decision here is about when to notify that a close has occurred for an opted-in MessagePort (opted in by starting it after it's been deserialized in the given global) for situations other than the obvious invocation of close. Those times are:
Specifically, which of the two above is compatible with the TPAC 2022 discussion? If the answer is neither, how do the concerns square with what Web Locks has already enabled? Thanks!
I think it indeed tied back to @isonmad's original proposal of embracing the memory leak in order to not expose GC. I also realize now that the script wouldn't run in the agent being shut down, as presumably we'd only dispatch the

I'm not sure what the model with Web Locks is and whether that can still change. It does seem like that would indeed end up exposing similar things. (The discussion at TPAC was quite brief and I somewhat regret bringing it up now, but on the other hand I learned a few things due to your reply...)
I'd say in that particular case, we could probably delegate the handling to the service tailored exactly for such cases -
I've written an explainer that covers the problem, the background, a proposal, and a discussion of alternatives. TL;DR: the proposal is to send a
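To make the shape of the proposal concrete, here is a hedged sketch of the intended usage, not a description of any shipped browser behavior. (Node.js's MessagePort happens to already fire a comparable close event, which is what makes this runnable outside a browser.)

```js
// Sketch of the proposed usage: the surviving end of a MessageChannel
// learns its peer is gone via a "close" event, instead of heartbeats.
const { port1, port2 } = new MessageChannel();

port1.addEventListener('close', () => {
  // Clean up state associated with the peer here.
});

// Closing one end would notify the other under the proposal. Note that
// posting on a closed/disentangled port is not an error per spec; the
// message is simply dropped.
port2.close();
port1.postMessage('this message goes nowhere');
port1.close();
```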
The explainer proposal is great, thank you. I'm on board with implementing this in Firefox.

Re: exposing navigation, the below is just me restating why this proposal works in relation to things I've previously said; I'm not raising any new issues.

My concern was a more abstract one of creating non-trivial new side channels, but that was only a concern if we were intentionally tying the MessagePort to the global in an intentional memory leak (to avoid exposing GC) in a way that would let navigation be detected in the future without an opt-in from the recipient global. In particular, with that capability, being able to "ship" the MessagePort outside the agent cluster would be a new capability beyond what having access to the WindowProxy would mean (or other similar browsing context hierarchy shenanigans).

This proposal does not create a non-trivial new side channel, because an ignored MessagePort postMessaged to a WindowProxy that doesn't care will be subject to GC. As the proposal explains in great detail (and consistent with @domenic's comment), this potential exposure of GC timing is fine because WeakRefs already expose GC.

As the explainer also points out, there is a trivial new side channel where, because the MessagePort will currently become entangled with the global[1], a close event could be delivered as a result of a very rapid navigation or something like that, but that's a very short timeline and we would expect the WindowProxy to still be available on that timeline, etc.

1: Per spec, at least, the deserialization in step 7.3 of the message port post message steps is unconditional. And I believe in Firefox we unconditionally do this, but I think I remember seeing a bug filed at one point suggesting that other engines may lazily perform the deserialization so that it only happens on demand.
Edit: Skip this comment; I was confused about when "source" was set, but @rakina's comment below cleared things up for me, and you can probably skip my comment explaining my confusion as well.

A related change we can make when introducing the close event, which came up while talking with @smaug----, is that 9.4.5 Ports and garbage collection normatively says:
Without having done the historical investigation, my impression of this text is that it was an effort to prevent GC/CC being observable. But given the changes here, it would seem to make sense to either remove this text or to alter it to make it clear that a strong reference should be conditioned on both 1) being entangled and 2) having "message" or "messageerror" listeners which could obtain a new strong reference to the MessagePort via the

Note that while the streams spec uses MessagePorts for transferred streams, its hookup explicitly adds the above event listeners and explicitly disentangles the ports without removing the event listeners, so the above proposed change shouldn't affect streams, just manually created MessageChannels/MessagePorts.
I would love to see this implemented. My case is that I have a shared web worker which may have more than one window connecting to it. Because there is no close event for the case where one of the windows goes away, I have to use a combination of WeakRef (on the worker side) and polling, alongside explicit window close events in the cases where I can get those on the window side, so that I can try to guess when a window is no longer available.

To combat the worker going under, I have added a polling timeout on the window side (when the timeout fires I try to poll the worker one last time before giving up). I have also added an explicit worker close event to the worker, for the case where an event happens to be processed during worker close, i.e. the closing flag is true (although I'm not sure in what situation that can happen other than the last connection dropping). With these combined I should be able to know whether a window needs to reconnect to the shared web worker.

All in all this is quite a lot of code just to see if the connection is still alive. Also, because of the polling, there is a fair amount of extra processing involved (I have to poll quite often to ensure that I can detect a missing window in a reasonable timeframe).

In my implementation I'm using the shared web worker as a communication platform between the windows. It serves as a secure side channel alongside window-to-window communication, as the window-to-window communication is not secure by design (at least in my case, since I'm routing most of the messages through third parties). Because the shared web worker knows all the connecting windows, it is imperative that it knows which windows are still available, to prevent unnecessary wait time (for the user) and/or unwanted behaviour (in the case where we think a window had gone under but it was just bogged down) when multiple windows are affected.
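The polling workaround described above boils down to roughly the following heartbeat loop. This is a hedged sketch with invented names (`track`, `sweep`, `HEARTBEAT_MS`), not the commenter's actual code; it shows the boilerplate a real "close" event would make unnecessary.

```js
// Heartbeat workaround sketch: the shared worker pings every connected
// port and forgets any port whose reply is overdue.
const HEARTBEAT_MS = 5000;
const connections = new Map(); // port -> timestamp of last reply

function track(port) {
  connections.set(port, Date.now());
  port.onmessage = (event) => {
    if (event.data === 'pong') connections.set(port, Date.now());
  };
}

function sweep(now = Date.now()) {
  const deadline = now - 2 * HEARTBEAT_MS;
  for (const [port, lastSeen] of connections) {
    if (lastSeen < deadline) {
      connections.delete(port); // assume the window is gone
    } else {
      port.postMessage('ping'); // expect a 'pong' before the next sweep
    }
  }
}

// In real use: setInterval(sweep, HEARTBEAT_MS) inside the shared
// worker, with each window answering 'ping' messages with 'pong'.
```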
FYI, Chromium has started implementing the proposal from the explainer. Please let us know if there are still unresolved questions that haven't been answered in the explainer, etc. We'll be sending out a spec PR soon too, but I have a question about @asutherland's comment above about GCs:
Can you elaborate more on why we would need this part to change? From the note in the spec, this seems to be there to allow the message event listener for the port to continue to run even when there are no references to the port itself. Is the intent behind changing it to allow the port to be GCed more often, or something else? How does that relate to the

Also, even if we need to change it, I don't think point #2 is relevant, since the source property isn't actually set on

(Thanks!)
Aha, I definitely had gotten confused and thought in the general MessagePort case that we would set the source. Thank you for correcting my misunderstanding! I believe this was a combination of 1) the similar but unrelated ServiceWorker.postMessage behavior, and 2) some extra flexibility in Gecko plumbing code that supports passing the MessagePort as a source, but we definitely do not pass it in this case. And then I think this clears up my confusion about 9.4.5 Ports and Garbage Collection that you link. The bit about how if we have a global gA and a global gB having entangled ports pA and pB in gA and gB respectively, it's saying either...
...makes more sense to me now, as I don't need to worry about the potential to create new strong references, so GCing a port and thereby automatically closing it is much more straightforward. I'll edit my comment above to indicate it should be skipped, to save at least future me headaches :) Thanks again!
Unfortunately Chrome's plan to implement this has been blocked by our security review. Their response is available here; quoted below:
Unfortunately the argument that this information is already leaked was not persuasive. I would rather not make sending the close event conditional on some kind of opt-in, because it leaves a bunch of cases uncovered, but it might be the best we can do.
A challenge with an opt-in is that a frame may navigate or be unloaded before it has had a chance to call the API on the

Assuming the act of posting a message opts in, the frame that listens for the
How about opt-in, but same-origin is automatically opted in? Because ports can be infinitely re-shipped, this might be most sanely modeled as a new flag, "has been shipped cross-origin". The flag gets set when entangling, based on the environment, so once a port goes across origins it's opt-in even if it's subsequently re-shipped to its original origin.
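Not spec text, just an illustrative model of the sticky flag being described here; the function name `entangleInto` and the field names are invented for the sketch.

```js
// Sketch of the "has been shipped cross-origin" flag: set the first
// time a port is entangled across origins and never cleared, so a port
// that once crossed an origin boundary stays opt-in even after being
// re-shipped back to its original origin.
function entangleInto(portState, senderOrigin, receiverOrigin) {
  if (senderOrigin !== receiverOrigin) {
    portState.shippedCrossOrigin = true; // sticky: never reset
  }
  // Ports that have never crossed an origin boundary are automatically
  // opted in to close notifications; all others require explicit opt-in.
  portState.autoCloseNotify = !portState.shippedCrossOrigin;
  return portState;
}
```

Under this model a port's entire shipping history, not just its current two endpoints, determines whether close fires without an explicit opt-in.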
@fergald @robertknight @asutherland can one of you please ensure this is tracked in a new issue? Also, if it's not resolved in some timely manner (say a couple of weeks) we should revert the specification change lest it accidentally get implemented as-is.
I've filed #10201. Let's move the conversation over there.
There's currently no reliable way to detect when a MessagePort becomes disentangled, whether due to purposefully calling .close(), or the tab being unexpectedly killed.
An onclose event was proposed and discussed several years ago, but it looks like the discussion petered out after that.
Any hope of reviving it?