-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Start adding processing model #19
base: gh-pages
Are you sure you want to change the base?
Conversation
@noamr, can you take a look at this? It's still a bit hand-wavy around definitions (what is a process, anyway? This spec doesn't seem like the place to define it) but it tries to explain the loose expectations about the relationships between document-controlling (renderer, in Chromium) and supervisor (browser) processes, that I expect we can find in any multi-process browser architecture. It also better defines what to actually do when a crash is noticed, now that individual documents in the same origin can set different reporting endpoints. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM with minor nits and questions. One thing I didn't put anywhere specifically is that it feels like there should be some mention of other shared processes (GPU, utility, etc. in Chromium) and how those are handled, but it also feels like it might be out of place?
Any time a process controlling [=Documents=] is abruptly terminated, whether by | ||
a supervising process within the user agent, or by the underlying operating | ||
system, the process is said to have <dfn lt="crash">crashed</dfn>. This may be | ||
due to a logic error in the browser, an unexpected hardware fault, the computer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
due to a logic error in the browser, an unexpected hardware fault, the computer | |
due to a logic error in the browser, an unexpected hardware fault, |
Removing "the computer" probably genericizes slightly better to handle OS limits, browser-imposed limits, etc.
to respond to user input. This may happen, for instance, because of an infinite | ||
loop in a script in the page, such that the script never completes and will | ||
never return control back to the event loop. A process terminated for this | ||
reason has [=crashed=] due to an <dfn>unresponsive condition</dfn>. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be an "Other" concept here?
for a unique identifier (which can be interpreted by the browser vendor), and | ||
optionally the reason for the crash (such as "oom"). | ||
[=crashed=]. For security reasons, no details of the crash are communicated | ||
except for a unique identifier (which can be interpreted by the browser vendor), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unique ID isn't anywhere else in the spec yet, right? Maybe remove and re-add when adding ID to the IDL and examples?
crashed. For security reasons, no details of the crash are communicated except | ||
for a unique identifier (which can be interpreted by the browser vendor), and | ||
optionally the reason for the crash (such as "oom"). | ||
[=crashed=]. For security reasons, no details of the crash are communicated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[=crashed=]. For security reasons, no details of the crash are communicated | |
[=crashed=]. For security reasons, only high-level details of the crash are communicated |
Minor tweak since "reason" and "stack" are technically details.
architecture. Namely: | ||
|
||
* Each [=Document=] is controlled by a single process, such that if that process | ||
crashes, the controlled [=Document=] will immediately and abruptly terminate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
crashes, the controlled [=Document=] will immediately and abruptly terminate. | |
crashes, the controlled [=Documents=] will immediately and abruptly terminate. |
Making 1:N relationship clear here.
document process crash. | ||
|
||
* The coordinator process maintains enough information about the state of each | ||
[=Document=] that after a process crash, it can reconstruct the frame tree |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I have concrete feedback, but "frame tree" feels out of place here. Cursory search shows that Gecko and WebKit both use the same concept and phrasing (Ladybird seems to have the concept without the exact wording), so probably not a problem.
crashed. For security reasons, no details of the crash are communicated except | ||
for a unique identifier (which can be interpreted by the browser vendor), and | ||
optionally the reason for the crash (such as "oom"). | ||
[=crashed=]. For security reasons, no details of the crash are communicated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[=crashed=]. For security reasons, no details of the crash are communicated | |
[=crashed=]. Crash reports aren't sent for every crash, for instance, if the [=coordinator process=] crashes. For security reasons, no details of the crash are communicated |
Adding a bit about crashes being best effort, and there are cases where they won't be sent. I use coordinator process as the example because it's well defined and obvious, but GPU might be better in terms of user impact?
1. If |document|'s [=endpoints=] contains `"default"`, then set | ||
[=endpoint map=][|document|] to |document|'s [=endpoints=]["default"]. | ||
|
||
When a document is destroyed, remove it from the map. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When a document is destroyed, remove it from the map. | |
When a document is destroyed, remove it from the map (after sending [=crash reports=] if necessary). |
Not sure how pedantic this needs to be.
: [=CrashReportBody/reason=] | ||
:: |reason| | ||
|
||
1. Execute [=generate and queue a report=] with "crash", |endpoint|, and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the Document Policy check and stack inclusion excluded on purpose? I can take a stab at adding in a follow-up PR if we want.
[=out-of-memory condition=], the [=coordinator process=] should process a crash | ||
for |process| with the reason string "oom". | ||
|
||
If the user agent destroyes a process due to an [=unresponsive condition=], the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the user agent destroyes a process due to an [=unresponsive condition=], the | |
If the user agent destroys a process due to an [=unresponsive condition=], the |
Add a proper processing model for crash reporting.
This model defines the relationships between processes and documents, and defines a few conditions, such as out-of-memory and unresponsive. It covers what the user agent should do when it notices that a process has crashed for any reason, and which endpoints should receive reports.
Fixes: #11
💥 Error: 500 Internal Server Error 💥
PR Preview failed to build. (Last tried on Oct 25, 2024, 5:50 PM UTC).
More
PR Preview relies on a number of web services to run. There seems to be an issue with the following one:
🚨 Spec Generator - Spec Generator is the web service used to build specs that rely on ReSpec.
🔗 [Related URL]([object Object])
If you don't have enough information above to solve the error by yourself (or to understand to which web service the error is related to, if any), please file an issue.