Start adding processing model #19

Open: wants to merge 2 commits into base: gh-pages
130 changes: 127 additions & 3 deletions index.bs
@@ -20,6 +20,10 @@ Include MDN Panels: no
url: https://tc39.es/proposal-error-stacks/#sec-get-error.prototype-stack; type: dfn; spec: ERROR STACKS
text: Error.prototype.stack
</pre>
<pre class=link-defaults>
spec:reporting-1; type:dfn; text:endpoints
spec:infra; type:dfn; text:list
</pre>

Introduction {#intro}
============
@@ -47,21 +51,39 @@ Concepts {#concept}
Crash {#concept-crash}
-----

Any time a process controlling [=Documents=] is abruptly terminated, whether by
a supervising process within the user agent, or by the underlying operating
system, the process is said to have <dfn lt="crash">crashed</dfn>. This may be
due to a logic error in the browser, an unexpected hardware fault, the computer
@aluhrs13 aluhrs13 Dec 3, 2024

Suggested change
due to a logic error in the browser, an unexpected hardware fault, the computer
due to a logic error in the browser, an unexpected hardware fault,

Removing "the computer" probably genericizes slightly better to handle OS limits, browser-imposed limits, etc.

running out of needed resources, among other reasons. Specific kinds of
[=crashes=] are detailed below.

Out-of-Memory {#concept-oom}
-------------

Memory in computing systems is a finite resource, and is generally granted to
running processes on request. When a request to allocate additional memory is
denied, this may be treated as an unrecoverable error, and the requesting
process may be terminated. A process terminated for this reason has [=crashed=]
due to an <dfn>out-of-memory condition</dfn>.

Unresponsive {#concept-unresponsive}
------------

A process may be terminated in order to destroy a page which is no longer able
to respond to user input. This may happen, for instance, because of an infinite
loop in a script in the page, such that the script never completes and will
never return control back to the event loop. A process terminated for this
reason has [=crashed=] due to an <dfn>unresponsive condition</dfn>.

Should there be an "Other" concept here?

Crash Reports {#crash-report}
=============

<dfn>Crash reports</dfn> indicate that the user was unable to continue using the
page because the browser (or one of its processes necessary for the page)
crashed. For security reasons, no details of the crash are communicated except
for a unique identifier (which can be interpreted by the browser vendor), and
optionally the reason for the crash (such as "oom").
[=crashed=]. For security reasons, no details of the crash are communicated

Suggested change
[=crashed=]. For security reasons, no details of the crash are communicated
[=crashed=]. For security reasons, only high-level details of the crash are communicated

Minor tweak since "reason" and "stack" are technically details.

Suggested change
[=crashed=]. For security reasons, no details of the crash are communicated
[=crashed=]. Crash reports aren't sent for every crash, for instance, if the [=coordinator process=] crashes. For security reasons, no details of the crash are communicated

Adding a bit about crashes being best effort, and there are cases where they won't be sent. I use coordinator process as the example because it's well defined and obvious, but GPU might be better in terms of user impact?

except for a unique identifier (which can be interpreted by the browser vendor),

Unique ID isn't anywhere else in the spec yet, right? Maybe remove and re-add when adding ID to the IDL and examples?

and optionally the reason for the [=crash=] (such as "oom").

[=Crash reports=] are a type of [=report=].
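As a rough illustration of how little a crash report carries, the following Python sketch assembles a report of type "crash" whose body holds only the optional reason. The outer field names (`age`, `type`, `url`, `user_agent`, `body`) are assumed from the Reporting API's usual serialization, and `build_crash_report`, the URL, and the user agent string are hypothetical names chosen for this example; nothing here is text from the pull request.

```python
import json


def build_crash_report(url, user_agent, reason=None, age_ms=0):
    """Assemble a crash report as a plain dict (illustrative only)."""
    body = {}
    # The reason is optional; it is included only when one was supplied.
    if reason is not None:
        body["reason"] = reason
    return {
        "age": age_ms,            # assumed Reporting API field
        "type": "crash",
        "url": url,
        "user_agent": user_agent,
        "body": body,
    }


# Hypothetical values, for illustration only.
report = build_crash_report("https://example.com/", "ExampleUA/1.0", reason="oom")
print(json.dumps([report], indent=2))
```

Note that no stack, memory dump, or other diagnostic detail appears in this sketch, matching the "reason only" body described above.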

@@ -138,6 +160,108 @@ call stack, at the time of the crash, to be included in a crash report.
<xmp class="http">Document-Policy: include-js-call-stacks-in-crash-reports</xmp>
</div>

Processing Model {#processing}
================

Note: Browser architectures vary, and it is possible that a browser exists which
cannot support this specification as written due to its architecture.

This processing model makes certain assumptions about the underlying process
architecture. Namely:

* Each [=Document=] is controlled by a single process, such that if that process
crashes, the controlled [=Document=] will immediately and abruptly terminate.

Suggested change
crashes, the controlled [=Document=] will immediately and abruptly terminate.
crashes, the controlled [=Documents=] will immediately and abruptly terminate.

Making 1:N relationship clear here.


Note: It is not required that all such [=Documents=] are related to each
other in any way other than being controlled by the same process.

* A separate <dfn>coordinator process</dfn> exists which is responsible for
monitoring the processes which control documents, and which is expected to
survive even when those processes crash. This process is able to observe the
document process crash.

* The coordinator process maintains enough information about the state of each
[=Document=] that after a process crash, it can reconstruct the frame tree

Not sure I have concrete feedback, but "frame tree" feels out of place here. Cursory search shows that Gecko and WebKit both use the same concept and phrasing (Ladybird seems to have the concept without the exact wording), so probably not a problem.

that existed in each of that process' documents before the crash occurred.

If these assumptions do not hold for a particular user agent, then this
processing model may need to be modified.

These assumptions allow significant flexibility regarding the association of
documents and processes. The following are all possible arrangements which this
processing model is intended to support:

* A process may control multiple [=Documents=]. These [=Documents=] may be part
of the same page, or may reside in different pages. They may be [=/top-level
traversables=] or [=child navigables=].
* A process may control [=Documents=] which are cross-site or cross-origin to
each other.
* A [=Document=] in a process which has crashed may have contained child
documents in processes which have not crashed.
* Similarly, a [=Document=] in a process which has crashed may be contained
within a parent document in a process which has not crashed.

Registering Crash Reports {#registration}
-------------------------

The [=coordinator process=] has an <dfn>endpoint map</dfn>, which is a map of
[=Documents=] -> [=endpoints=], initially empty.

The [=coordinator process=] has a <dfn>process map</dfn>, which is a map of
process -> [=lists=] of [=Documents=].

Note: The maintenance of the [=process map=] is out of scope for this
specification. It is intended that the user agent track, for every process it
creates for controlling [=Documents=], a [=list=] of which [=Documents=] are
controlled by that process.

When a |document| [=initialize a global's endpoint list|initializes its endpoint
list=], it should run the following step:

1. If |document|'s [=endpoints=] contains `"default"`, then set
[=endpoint map=][|document|] to |document|'s [=endpoints=]["default"].

When a document is destroyed, remove it from the map.

Suggested change
When a document is destroyed, remove it from the map.
When a document is destroyed, remove it from the map (after sending [=crash reports=] if necessary).

Not sure how pedantic this needs to be.
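For readers following the registration text, here is a minimal Python sketch of the two maps held by the coordinator process. The class and method names (`Coordinator`, `attach`, `on_endpoints_initialized`, `on_document_destroyed`) are hypothetical, documents and processes are represented by plain strings, and "the map" in the destruction step is assumed to mean the endpoint map.

```python
class Coordinator:
    """Illustrative stand-in for the coordinator process's bookkeeping."""

    def __init__(self):
        self.endpoint_map = {}  # document -> endpoint, initially empty
        self.process_map = {}   # process -> list of documents

    def attach(self, process, document):
        # Hypothetical helper: the spec text leaves maintenance of the
        # process map out of scope, so this is only one possible approach.
        self.process_map.setdefault(process, []).append(document)

    def on_endpoints_initialized(self, document, endpoints):
        # Only the "default" endpoint is registered for crash reports.
        if "default" in endpoints:
            self.endpoint_map[document] = endpoints["default"]

    def on_document_destroyed(self, document):
        # Assumption: "the map" refers to the endpoint map.
        self.endpoint_map.pop(document, None)


coordinator = Coordinator()
coordinator.attach("process-1", "doc-a")
coordinator.on_endpoints_initialized("doc-a", {"default": "https://reports.example/crash"})
coordinator.on_endpoints_initialized("doc-b", {"csp": "https://reports.example/csp"})
print(coordinator.endpoint_map)
```

A document without a `"default"` endpoint never enters the endpoint map in this sketch, so a later crash would produce no report for it.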


Processing Crashes {#processing-crash}
------------------

<div class="algorithm" data-algorithm="process-crash">
Given a crashed process |process|, with a string |reason|, this algorithm sends
crash reports to the appropriate endpoints:

1. Let |affectedDocuments| be [=process map=][|process|].
1. For each |document| in |affectedDocuments|:
    1. If [=endpoint map=][|document|] does not exist, continue.
    1. Let |endpoint| be [=endpoint map=][|document|].
    1. Let |parent| be |document|'s [=node navigable=]'s [=navigable/parent=]'s
        [=active document=].
    1. If |parent| is in |affectedDocuments|:
        1. If [=endpoint map=][|parent|] = |endpoint|, continue.
    1. Let |body| be a new {{CrashReportBody}}, initialized
        as follows:

        : [=CrashReportBody/reason=]
        :: |reason|

    1. Execute [=generate and queue a report=] with "crash", |endpoint|, and

Is the Document Policy check and stack inclusion excluded on purpose? I can take a stab at adding in a follow-up PR if we want.

|body|.

</div>
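The steps of this algorithm can be sketched in Python as follows. This is a sketch under stated assumptions, not the specification's implementation: documents and processes are plain strings, `parent_of` is a hypothetical lookup standing in for "node navigable's parent's active document" (absent for top-level documents), and `queue_report` is a hypothetical callback standing in for "generate and queue a report".

```python
def process_crash(process, reason, process_map, endpoint_map, parent_of, queue_report):
    """Send at most one crash report per affected document."""
    affected = process_map.get(process, [])
    for document in affected:
        if document not in endpoint_map:
            continue  # no "default" endpoint was registered
        endpoint = endpoint_map[document]
        parent = parent_of.get(document)  # None for a top-level document
        # Skip a child whose crashed parent already reports to the same endpoint.
        if parent in affected and endpoint_map.get(parent) == endpoint:
            continue
        body = {"reason": reason}
        queue_report("crash", endpoint, body)


# Hypothetical frame tree: one process controls a page and three children.
process_map = {"p1": ["top", "child-a", "child-b", "child-c"]}
endpoint_map = {
    "top": "https://reports.example/a",
    "child-a": "https://reports.example/a",  # same endpoint as its parent
    "child-b": "https://reports.example/b",  # different endpoint
}
parent_of = {"child-a": "top", "child-b": "top", "child-c": "top"}

sent = []
process_crash("p1", "oom", process_map, endpoint_map, parent_of,
              lambda kind, endpoint, body: sent.append((kind, endpoint, body)))
print(sent)
```

In this example `child-a` is suppressed because its parent crashed in the same process and shares its endpoint, while `child-c` is skipped because it never registered one. The sketch does not include any Document Policy check or call stack, mirroring the algorithm as written.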

If a process controlling [=Documents=] crashes due to an
[=out-of-memory condition=], the [=coordinator process=] should process a crash
for |process| with the reason string "oom".

If the user agent destroyes a process due to an [=unresponsive condition=], the

Suggested change
If the user agent destroyes a process due to an [=unresponsive condition=], the
If the user agent destroys a process due to an [=unresponsive condition=], the

[=coordinator process=] should process a crash for |process| with the reason
string "unresponsive".

If the [=coordinator process=] observes a [=crash=] in a process which was not
due to an [=out-of-memory condition=] or an [=unresponsive condition=], then the
[=coordinator process=] should process a crash for |process| with the reason
string "".
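The three cases above amount to a small mapping from crash cause to reason string, sketched here in Python. The `CrashCause` enum and `reason_for` function are hypothetical names introduced for illustration; only the strings "oom", "unresponsive", and "" come from the text.

```python
from enum import Enum


class CrashCause(Enum):
    """Hypothetical classification made by the coordinator process."""
    OUT_OF_MEMORY = "out-of-memory condition"
    UNRESPONSIVE = "unresponsive condition"
    OTHER = "any other crash"


def reason_for(cause):
    """Return the reason string used when processing a crash."""
    if cause is CrashCause.OUT_OF_MEMORY:
        return "oom"
    if cause is CrashCause.UNRESPONSIVE:
        return "unresponsive"
    return ""  # crashes with no more specific classification


for cause in CrashCause:
    print(cause.name, repr(reason_for(cause)))
```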

Implementation Considerations {#implementation}
=============================
