You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Invariant: Let the common elements be those elements that occur in both the stack of open elements and the list of active formatting elements. The common elements then always occur in the stack of open elements in the same order as they occur in the list of active formatting elements.
See below for a proof of the claim that this is in fact an invariant.
I believe that this invariant will be generally useful to readers of the standard that wish to understand the parser mechanics in depth, and in particular, the possible behaviors of the "adoption agency" algorithm.
More specifically, this invariant is very useful for implementers, like myself, who seek to represent the list of active formatting elements as an array (rather than a doubly linked list), and represent the active formatting elements bookmark of the "adoption agency" algorithm as an index into this array (rather than a pointer to an entry in a doubly linked list).
If there is an interest, or at least not too much resistance, I intend to follow up with a pull request with specific text changes.
Proof of invariant claim:
First of all, removals from either place (stack or list) cannot by themselves cause a violation of the proposed invariant. Further more, since all insertions are of newly created elements (no reinsertions after removal), we only need to consider the cases where elements are inserted in both places at once (stack and list). There are only a few places where this happens:
One of the steps (6) of the inner loop of the "adoption agency" algorithm where a recreated auxiliary formatting element (see definition below) replaces the original in both places (stack and list).
The two last steps (18 and 19) of the outer loop of the "adoption agency" algorithm where the recreated primary formatting element (see definition below) replaces the original in both places (stack and list).
In the case of the processing of start tags of formatting elements (item 1 in the list above), the new element is pushed at the bottom of the stack and appended to the list, so this will uphold the invariant.
In the case of the recreation of auxiliary formatting elements (item 2 in the list above), the new element is inserted at the position of the original, so this will uphold the invariant.
In the case of the recreation of the primary formatting element (item 3 in the list above), the new element is inserted into the stack of open elements below the furthest block, so below all auxiliary formatting elements and above any trailing formatting elements (see definition below). From the point of view of the list of active formatting elements, the new element is inserted after any auxiliary formatting elements (or in place of itself if there are no auxiliary formatting elements), and before any trailing formatting elements. Since the change in the two places is the same from the point of the order of common elements, the invariant is upheld.
The primary formatting element is the formatting element that is passed as argument to the "adoption agency" algorithm.
The auxiliary formatting elements are the active formatting elements that occur between the common ancestor and the furthest block in the stack of open elements (there may be none).
The trailing formatting elements are the active formatting elements that occur below the furthest block in the stack of open elements prior to the invocation of the "adoption agency" algorithm (there may be none).
The text was updated successfully, but these errors were encountered:
Thank you for the detailed feedback. I hope @hsivonen and @zcorpan can make time to evaluate this. Clarifying the HTML parser further seems like a good thing.
kspangsege
changed the title
Proposal to introduce invariant for order of active formatting elements that are also in stack of open elements
Parser: Proposal to introduce invariant for order of active formatting elements that are also in stack of open elements
Dec 18, 2024
What is the issue with the HTML Standard?
I propose to introduce the following non-normative invariant for the combined state of the stack of open elements and the list of active formatting elements:
Invariant: Let the common elements be those elements that occur in both the stack of open elements and the list of active formatting elements. The common elements then always occur in the stack of open elements in the same order as they occur in the list of active formatting elements.
See below for a proof of the claim that this is in fact an invariant.
I believe that this invariant will be generally useful to readers of the standard that wish to understand the parser mechanics in depth, and in particular, the possible behaviors of the "adoption agency" algorithm.
More specifically, this invariant is very useful for implementers, like myself, who seek to represent the list of active formatting elements as an array (rather than a doubly linked list), and represent the active formatting elements bookmark of the "adoption agency" algorithm as an index into this array (rather than a pointer to an entry in a doubly linked list).
If there is an interest, or at least not too much resistance, I intend to follow up with a pull request with specific text changes.
Proof of invariant claim:
First of all, removals from either place (stack or list) cannot by themselves cause a violation of the proposed invariant. Further more, since all insertions are of newly created elements (no reinsertions after removal), we only need to consider the cases where elements are inserted in both places at once (stack and list). There are only a few places where this happens:
In the case of the processing of start tags of formatting elements (item 1 in the list above), the new element is pushed at the bottom of the stack and appended to the list, so this will uphold the invariant.
In the case of the recreation of auxiliary formatting elements (item 2 in the list above), the new element is inserted at the position of the original, so this will uphold the invariant.
In the case of the recreation of the primary formatting element (item 3 in the list above), the new element is inserted into the stack of open elements below the furthest block, so below all auxiliary formatting elements and above any trailing formatting elements (see definition below). From the point of view of the list of active formatting elements, the new element is inserted after any auxiliary formatting elements (or in place of itself if there are no auxiliary formatting elements), and before any trailing formatting elements. Since the change in the two places is the same from the point of the order of common elements, the invariant is upheld.
The primary formatting element is the formatting element that is passed as argument to the "adoption agency" algorithm.
The auxiliary formatting elements are the active formatting elements that occur between the common ancestor and the furthest block in the stack of open elements (there may be none).
The trailing formatting elements are the active formatting elements that occur below the furthest block in the stack of open elements prior to the invocation of the "adoption agency" algorithm (there may be none).
The text was updated successfully, but these errors were encountered: