Thanks! That’s a good read, but a) it doesn’t answer the “invalid code points” part, and b) the CM spec already defines “A character is a Unicode code point [...] all code points count as characters for purposes of this spec”, so I’m not sure why the sections on references don’t simply use the word “character”.
The section on decimal numeric character references mentions “Invalid Unicode code points”, but nowhere is it defined what those are.
The section on hexadecimal numeric character references does not mention this limitation, though I guess it implies it (with “They too are parsed as the corresponding Unicode character”).
The HTML spec defines several limitations on numeric character references (https://html.spec.whatwg.org/multipage/parsing.html#numeric-character-reference-end-state), so I’m guessing some or all of that applies to CM as well.
However, HTML defines that some “invalid” references map to other characters (the table at the bottom of the linked section).
Why mention code points instead of characters? Is it just surrogates?
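To make the question concrete, here is a minimal Python sketch of one possible reading of “invalid”. It borrows only the replacement cases from the linked HTML state (zero, values above U+10FFFF, and surrogates all become U+FFFD) and deliberately skips HTML’s remapping table. The function name `resolve_numeric_reference` is hypothetical, and nothing in the CM spec confirms that this is the intended behavior.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def resolve_numeric_reference(digits: str, base: int = 10) -> str:
    """Hypothetical helper: map the digits of a numeric character
    reference to a string, under ASSUMED validity rules."""
    value = int(digits, base)
    if value == 0:                 # U+0000
        return REPLACEMENT
    if value > 0x10FFFF:           # beyond the Unicode code space
        return REPLACEMENT
    if 0xD800 <= value <= 0xDFFF:  # surrogate code points
        return REPLACEMENT
    # Assumption: everything else passes through unchanged, including
    # noncharacters and the C1 range that HTML's table remaps.
    return chr(value)


print(repr(resolve_numeric_reference("35")))        # decimal 35
print(repr(resolve_numeric_reference("D800", 16)))  # a surrogate
print(repr(resolve_numeric_reference("80", 16)))    # HTML would remap this one
```

Under these assumptions `&#x80;` stays U+0080, whereas HTML’s table maps it to U+20AC (€). That difference is exactly the part the spec text leaves open.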