(design) Exterior whitespace handling #487
Provide a document with the options being considered for "pattern exterior whitespace"
A few additions proposed inline, but looks good.
I fixed the numbering and some of the formatting directly on the branch.
Co-authored-by: Eemeli Aro <[email protected]>
_What is this proposal trying to achieve?_

The WG is discussing how to handle "pattern exterior" whitespace,
which is ASCII whitespace (tab, CR, LF, or U+0020) that is **_part_** of the pattern
#487 (comment) (updated):
This definition of "ASCII whitespace" differs from that of the web platform (which additionally includes U+000C FORM FEED) and of Unicode (which additionally includes both U+000C FORM FEED and U+000B VERTICAL TABULATION).
I'd be fine with any of these whitespace definitions, as long as it includes `\n`, `\r`, `\t`, and space.
We omitted form feed and vtab, which are ASCII whitespace characters. We are consistent with JSON, HTML, and CSS's definition of whitespace. Many other languages also observe this definition (Java and I believe C++, although not sure about the latter). Is there a technical reason for us to permit form feed and vtab into our whitespace definition? If we don't permit them in our whitespace production, they would not be "pattern exterior" (for many of the cases in this document). We don't currently allow them in expressions, declarations, and the like.
> We omitted form feed and vtab, which are ASCII whitespace characters. We are consistent with JSON, HTML, and CSS's definition of whitespace.
This is consistent with JSON (ref), but not with HTML or CSS by my reading of those specifications:
- HTML uses the Infra Standard definition of ASCII whitespace (U+0009 TAB, U+000A LF, U+000C FF, U+000D CR, or U+0020 SPACE) that I linked to above (example).
- CSS is more complicated because preprocessing replaces U+000D CARRIAGE RETURN and U+000C FORM FEED with U+000A LINE FEED, but basically boils down to the union of those with U+0009 CHARACTER TABULATION and U+0020 SPACE, aligning with HTML (ref).
> Many other languages also observe this definition (Java and I believe C++, although not sure about the latter). Is there a technical reason for us to permit form feed and vtab into our whitespace definition? If we don't permit them in our whitespace production, they would not be "pattern exterior" (for many of the cases in this document). We don't currently allow them in expressions, declarations, and the like.
I'm not arguing for inclusion or exclusion of the extra control characters, just for clarity. "ASCII whitespace" is a term of art in the web platform that includes U+000C FORM FEED, and in Unicode implies an interpretation like "code points that are both ASCII and White_Space" that includes U+000C FORM FEED and U+000B VERTICAL TABULATION. This text was therefore misleading, but cd188aa resolved that. 👍
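To make the difference between the three definitions concrete, here is a minimal Python sketch that writes each set out by code point. The constant and function names are hypothetical and chosen only for illustration; they are not taken from the specification or from any implementation.

```python
# Whitespace sets discussed in this thread, spelled out as characters.
# All names here are illustrative assumptions, not spec terminology.

# The set named in the design document: tab, LF, CR, U+0020 (same as JSON).
DOC_WHITESPACE = {"\t", "\n", "\r", " "}

# Infra Standard "ASCII whitespace": the set above plus U+000C FORM FEED.
INFRA_ASCII_WHITESPACE = DOC_WHITESPACE | {"\f"}

# ASCII code points with the Unicode White_Space property:
# the Infra set plus U+000B VERTICAL TABULATION.
UNICODE_ASCII_WHITE_SPACE = INFRA_ASCII_WHITESPACE | {"\v"}


def describe(chars):
    """Return the characters as sorted U+XXXX labels."""
    return sorted(f"U+{ord(c):04X}" for c in chars)


# Code points each broader definition adds beyond the document's set.
extra_infra = describe(INFRA_ASCII_WHITESPACE - DOC_WHITESPACE)
extra_unicode = describe(UNICODE_ASCII_WHITE_SPACE - DOC_WHITESPACE)

print("Infra adds:  ", extra_infra)
print("Unicode adds:", extra_unicode)
```

The sets are hard-coded rather than derived from `str.isspace()`, because Python's `isspace()` also accepts the separator controls U+001C through U+001F, which none of the three definitions include.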
Can I get a "ship-it" (approval)?
Add requirements, clarify technical choices, add examples.