Unified Hyphen in [Hexadecimal numeric character references] #728

Open
KiyanYang opened this issue Nov 17, 2022 · 2 comments

Comments

@KiyanYang

commonmark-spec/spec.txt

Lines 676 to 680 in 796666d

[Hexadecimal numeric character
references](@) consist of `&#` +
either `X` or `x` + a string of 1-6 hexadecimal digits + `;`.
They too are parsed as the corresponding Unicode character (this
time specified with a hexadecimal numeral instead of decimal).

The paragraph on hexadecimal numeric character references writes the digit range with a single hyphen, which differs from the rest of the spec. I suggest changing `a string of 1-6 hexadecimal digits` to `a string of 1--6 hexadecimal digits`. Here are examples from other locations that already use `--`:

commonmark-spec/spec.txt

Lines 661 to 667 in 796666d

[Decimal numeric character
references](@)
consist of `&#` + a string of 1--7 arabic digits + `;`. A
numeric character reference is parsed as the corresponding
Unicode character. Invalid Unicode code points will be replaced by
the REPLACEMENT CHARACTER (`U+FFFD`). For security reasons,
the code point `U+0000` will also be replaced by `U+FFFD`.
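
For readers who want to see the two quoted definitions side by side, here is a minimal Python sketch of how they could be decoded. It is not taken from any CommonMark implementation: the names `NUMERIC_REF` and `decode_numeric_refs` are made up for illustration, and treating surrogate code points and values above `U+10FFFF` as "invalid Unicode code points" is an assumption about what the spec means.

```python
import re

# `&#` + (`X` or `x` + 1 to 6 hex digits) or (1 to 7 decimal digits) + `;`,
# following the two spec paragraphs quoted above.
NUMERIC_REF = re.compile(r"&#(?:[Xx]([0-9A-Fa-f]{1,6})|([0-9]{1,7}));")

REPLACEMENT = "\ufffd"  # REPLACEMENT CHARACTER (U+FFFD)


def decode_numeric_refs(text: str) -> str:
    """Replace numeric character references with the characters they name."""

    def repl(match: re.Match) -> str:
        hex_digits, dec_digits = match.group(1), match.group(2)
        code = int(hex_digits, 16) if hex_digits is not None else int(dec_digits)
        # U+0000 is replaced for security reasons (per the quoted text).
        # Assumption: surrogates and out-of-range values count as invalid.
        if code == 0 or code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
            return REPLACEMENT
        return chr(code)

    return NUMERIC_REF.sub(repl, text)
```

Inputs with too many digits, such as `&#x1234567;`, do not match the pattern and are left untouched by this sketch.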

commonmark-spec/spec.txt

Lines 1099 to 1109 in 796666d

An [ATX heading](@)
consists of a string of characters, parsed as inline content, between an
opening sequence of 1--6 unescaped `#` characters and an optional
closing sequence of any number of unescaped `#` characters.
The opening sequence of `#` characters must be followed by spaces or tabs, or
by the end of line. The optional closing sequence of `#`s must be preceded by
spaces or tabs and may be followed by spaces or tabs only. The opening
`#` character may be preceded by up to three spaces of indentation. The raw
contents of the heading are stripped of leading and trailing space or tabs
before being parsed as inline content. The heading level is equal to the number
of `#` characters in the opening sequence.

commonmark-spec/spec.txt

Lines 2993 to 2994 in 796666d

An HTML block of types 1--6 can interrupt a paragraph, and need not be
preceded by a blank line.

commonmark-spec/spec.txt

Lines 4106 to 4110 in 796666d

An [ordered list marker](@)
is a sequence of 1--9 arabic digits (`0-9`), followed by either a
`.` character or a `)` character. (The reason for the length
limit is that with 10 digits we start seeing integer overflows
in some browsers.)
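
Other occurrences of the same inconsistency could be found mechanically. The following is a rough Python sketch, not an existing tool: `find_single_hyphen_ranges` is a hypothetical helper that flags numeric ranges written with a single hyphen, while skipping backtick code spans such as `0-9` where the single hyphen is intended.

```python
import re

# Inline code spans delimited by single backticks (a simplification).
CODE_SPAN = re.compile(r"`[^`]*`")
# digits, exactly one hyphen, digits; `1--6` is deliberately not matched.
SINGLE_HYPHEN_RANGE = re.compile(r"(?<![\d-])\d+-\d+(?![\d-])")


def find_single_hyphen_ranges(lines):
    """Return (line_number, match) pairs for single-hyphen ranges in prose."""
    hits = []
    for number, line in enumerate(lines, start=1):
        # Blank out code spans so their contents are ignored.
        prose = CODE_SPAN.sub(lambda m: " " * len(m.group()), line)
        for match in SINGLE_HYPHEN_RANGE.finditer(prose):
            hits.append((number, match.group()))
    return hits
```

Run over the excerpts quoted in this issue, it should report only the `1-6` in the hexadecimal paragraph; the `0-9` in the ordered list marker definition sits inside a code span and is ignored.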

@wooorm
Contributor

wooorm commented Nov 19, 2022

I’d personally prefer to use Unicode characters in the spec, rather than ancient ASCII-era typewriter pseudocharacters 😅

@jgm
Member

jgm commented Nov 19, 2022

Yes, it should indeed be `--`, which will turn into an en dash in the rendered version.
