-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Review of structured clone algorithms #2555
Comments
Can we specify what non-browsers must throw, and use annex b-like wording so that browsers must use a domexception per html? |
What does Node throw today? And are you planning to include Node as a browser? |
What about abort the AC with an |
What do you mean? Do you mean non-browser must throw with ES |
What does AbortController have to do with this? |
Oh it's typo, I mean abort the Abstract Operations. |
Ah. I was thinking it'd just return some spec-internal value like |
What is node's use of structured cloning? @Jack-Works yes, i mean exactly that. |
There are some substeps in the deep sections that also need a bit of care, e.g. StructuredSerializeInternal step 26.3.
My suggestion is that instead you use return values from the AO. So you can return either a JS value or Another thing that might need slight tweaks is that object creation is not as precise as it should be in the current algorithms. E.g. HTML says
which should probably be expanded to something like
|
There could be platform objects deep into the object graph, so it seems like there would need to be host hooks for serializing and deserializing host objects, that get invoked in StructuredSerializeInternal and StructuredDeserialize. Perhaps if the serialization hook returns a record (as opposed to |
From https://html.spec.whatwg.org/#structuredserializeinternal step 13.2.1:
The settings object in question seems to be the
which is a restriction meaningful, indeed necessary, in EcmaScript. This restriction needs rephrasing in purely EcmaScript terms and in a host independent manner. |
Structured clone has some pretty unique behaviors which (afaik) aren't found elsewhere currently. Some of these amount to (very hard to leverage) unique information sources. The best known one is that it's the sole place where one can observe [[ErrorData]], but a more surprising one I think is that it bypasses IsArray and therefore uniquely pokes a hole in the boundary that (otherwise would) prevent ES code from observing whether an object is a true array exotic object. I'd be curious to know if the quirky one-offs could be safely closed off (or whether folks would consider that desirable). I don't have much of an opinion on this; I'm just deeply fascinated by places where the unobservable becomes observable under a blood red moon and my god, structured clone delivers. |
I'm still absorbing. But IIUC, I find this both desirable and necessary. I left off the text after "observing" above because the concern is not specific to arrays. ANY case where this makes something observable that was previously unobservable is likely a bug that must be fixed. I look forward to going through cases. Thanks! |
https://html.spec.whatwg.org/#structuredserializeinternal step 17 refers to "platform object" which is not term used in ecma262. What does in mean in ecma262 terms? How would this be rephrased? |
The problem @bathos identifies at #2555 (comment) seems to be pervasive. For example, this algorithm dispatches on the presence of
Note that a proposal in progress (Error stacks by @ljharb and I) would likely makes |
@erights one of the things you'll likely find interesting is that it's "asymmetrical" because of how, given an arbitrary "untrusted" object, MOP ops will be invoked in steps prior to those which reveal special information. This is one factor that limits its real world utility: though novel knowledge can be obtained, it does not have the property of being reliably obtainable. For example we cannot use structured clone to know an object isn't an error, but we can use it to know that some objects definitely are. |
I don't understand that. Could you explain with errors specifically, since I just looked at that? Thanks. |
Just the obvious problem that this list needs to be updated to include "AggregateError". Which is indeed a great example of why this spec needs to move into tc39. |
yep: |
In web specs, a platform object is an object that implements a Web IDL interface – roughly an instance of a web API "class". I suspect ecma262 might want to use something like "host-defined object" instead (though that already has a meaning that might not exactly match). Since the (de)serialization behavior for such objects must be host-specific, and there might be host-defined objects deep into the object graph (so the caller to the structured clone algorithms can't do all the work), there would need to be host hooks to serialize and deserialize host-defined objects as I argued in my comment above. The default implementation of the serialize hook would be to return some empty marker (or Note that the global objects in the web are non-serializable platform objects, so serializing them fails. I'd argue this should probably still happen even with hosts that don't override the serialization hooks. |
Step 23:
is clearly incompatible with ecma262. How should this be repaired to restore practical membrane transparency? |
It is a bug/oversight that [[ErrorData]] is not currently observable, so anything that adds this observability is a good thing. |
Aside from "anything", I agree. It is a good thing only when coordinated with the rest of the language. This observability should enter via our error stacks proposal. In any case, it should not happen by accident. There seem to be lots of such accidents scattered in the structured clone serialization algorithm. |
I don’t agree that it should only enter that way - the stacks proposal is merely one way. I agree it should be added on purpose, but i think it’s perfectly fine if structured clone also adds it along with stacks. |
Any such observability change would at least need to go through the stage process, just as the error stack proposal itself needs to. |
Step 26.4:
How could a key obtained in step 4 fail the test in 4.1? Should 4.1 be an assert rather than a test? |
I am surprised that it serializes only enumerable own properties. I have no pro or con opinion on this yet; merely surprise. After all, elsewhere it reaches past public interfaces to access internal slots like |
At https://html.spec.whatwg.org/#structureddeserialize step 12.1:
is stated using only EcmaScript concepts. Can we fix #2555 (comment) to do likewise? |
I don’t know the rationale or history, but my sense is — given all the guards that exist just to prevent serializing any platform or intrinsic objects that aren’t specifically permitted and the choice to forbid functions and (problematically) PEOs, that this was “meant” for POJOs. I suspect that if there were actually a meaningful/reliable way to forbid instances of userland classes too, it might have done so. If you view it that way, it’s consistent with e.g. |
13:
Is this the only circumstance that can throw at this step? I suspect so. Can we state this explicitly with an assert? |
Unless I'm missing something, the good news is that deserialization is vastly less problematic than serialization. If true, this is unsurprising. But I felt it is worth stating explicitly. Are there any problems or concerns with deserialization that I may have missed? |
I'd like to say it's not feasible to change the behavior of structured clone. The most realistic way to repair this is to add host-specific carve outs here, but we're not going to be changing how structured clone already works on the web wrt Proxies (nor in Node, I expect). Would that be satisfactory? |
In general though thank you all for the detailed review comments. Before I try to synthesize the comments and come up with a concrete PR to review, I'd very much appreciate resolving the question of "are there concerns with the normative behavior" with @erights. To set expectations, normative changes to how structured clone works on the web or node is aren't on the table. I am fine, however, with having host-specific carve outs, if there are normative behavior you think are blockers for taking this into ecma262. |
To clarify, are you suggesting that due to the new observability points added here, taking over structured clone ought to go through the stage process with a call out to these new observability points? |
The later steps can mutate value to exclude key, but the list returned by EnumerableOwnPropertyNames would still contain it. So you could assert the first key in the list, but not the later ones. |
Yes it is. This should become explicit when you refactor to use the necessary create abstract operations for these objects. |
No, see #1435 (comment). You could make this more agnostic by having a bit on agent clusters that determines whether or not SAB are serializable in them. HTML could set that bit appropriately. (We could also derive exposure of SharedArrayBuffer on the global from that, but I still somewhat like the idea of eventually just exposing SharedArrayBuffer there all the time, with just serialization being restricted.) |
To be clear this doesn't actually add the |
@Jamesernator Correct. |
following up on node, it uses v8's default behavior which is an ordinary Error instance with the message "xyz could not be cloned". v8 allows hooking this error which is how chromium throws a dom exception. this matches nicely with throwing a host defined exception. we could also have something like ThrowDataCloneError which html normatively implements.
normally just for threads, same sorta thing as html, but we also expose a serialization api (https://nodejs.org/api/v8.html#serialization-api) |
Seeing https://es.discourse.group/t/symbol-clone-to-ease-out-structuredclone-implicit-conversion/2035/7 reminded me of this effort. What's the status of this transition? |
if worth anything, and big imho here:
I wrote a polyfill to help paving the path forward and because the current API is one way, I think we should consider that one way forward: you clone and deliver, what happens next, as long as your data was delivered and didn't throw instead because behind a Proxy, or data that can't be sent, is up to you (developer). |
the alternative that comes up to mind is a This is a whole new topic to eventually discuss though, and less relevant around current |
last for me, and sorry for the noise, the structuredClone API already has an That way people just serializing values wouldn't care, people interested into unserializing would, and provide their own returns when any data passes through it. This would unlock even more worlds, and I'd be more thank happy to provide a better polyfill for that in the meantime. |
We reached consensus on the general idea of taking over maintainership of the structured clone algorithm in the October 2021 TC39, pending a requested review by @erights of the concrete algorithm proposed to be refactored into ecma262.
The two algorithms proposed to be copied are StructuredSerializeInternal and StructuredDeserialize.
For StructuredSerializeInternal:
DataCloneError
"DOMException
" will be changed to "throw a host-defined exception".For StructuredDeserialize:
DataCloneError
"DOMException
" will be changed to "throw a host-defined exception".The text was updated successfully, but these errors were encountered: