example 31 is misplaced and unexplained #687

Open
rsc opened this issue Sep 4, 2021 · 4 comments · May be fixed by #690

Comments

@rsc

rsc commented Sep 4, 2021

In 0.30, examples 31-34 are introduced by:

Entity and numeric character references are recognized in any context besides code spans or code blocks, including URLs, link titles, and fenced code block info strings:

and then examples 35-36 are introduced by:

Entity and numeric character references are treated as literal text in code spans and code blocks:

But example 31 is an example of a context where entity and numeric character references are not recognized, namely raw HTML:

<a href="&ouml;&ouml;.html">

The two intros should probably be rewritten to list raw HTML as one of the exceptions:

Entity and numeric character references are recognized in any context besides code spans, code blocks or raw HTML, including URLs, link titles, and fenced code block info strings:

Entity and numeric character references are treated as literal text in code spans, code blocks, and raw HTML:

and then example 31 should be moved after current example 36.

(The argument can be made that they are "recognized" by the eventual HTML parser reading the output, but they are not recognized by CommonMark, or else the output of example 31 would say <a href="öö.html">. Unless CommonMark is saying that ö should be reescaped to &ouml; in output, but that isn't done in examples 32-34.)

@jgm
Member

jgm commented Sep 4, 2021

It's tricky to know what to say here so as not to be confusing.
If we say that they aren't recognized in raw HTML, people might think that means that &ouml; in raw HTML will be expanded as &amp;ouml; in HTML rendering -- as happens with &ouml; in code spans.
If we say they are recognized, that is also a bit misleading, since really they're just passed through.

@rsc
Author

rsc commented Sep 4, 2021

Indeed. One option would be to reverse the order of the two statements and insert a third between them:

Entity and numeric character references are treated as literal text in code spans and code blocks:

(NEW) Entity and numeric character references are passed through unaltered in raw HTML:

Entity and numeric character references are recognized in any other context, including URLs, link titles, and fenced code block info strings:

@rsc
Author

rsc commented Sep 4, 2021

And, assuming example 31 were in the new middle section, another useful example would be something using an HTML entity form that CommonMark does not recognize, such as &copy (no trailing semicolon), which is passed through in raw HTML rather than turned into &amp;copy.
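To make the three treatments discussed in this thread concrete, here is a minimal Python sketch. It is not taken from any CommonMark implementation: the `render` helper, its `context` names, and the regular expression are all assumptions made for illustration, and the escaping is simplified (only `&`, `<`, and `>` are escaped).

```python
import html
from html.entities import html5

# References that end in a semicolon: decimal, hexadecimal, or named.
# Illustrative pattern only; a real parser has more rules.
_REFERENCE = re_pattern = None  # placeholder, replaced just below

import re

_REFERENCE = re.compile(
    r"&(?:#[0-9]{1,7}|#[xX][0-9a-fA-F]{1,6}|[A-Za-z][A-Za-z0-9]*);"
)


def _decode(match):
    ref = match.group(0)
    if ref.startswith("&#"):
        # Numeric reference: let the standard library map it to a character.
        return html.unescape(ref)
    # Named reference: look up the exact name (with its semicolon);
    # unknown names are left as literal text.
    return html5.get(ref[1:], ref)


def render(source, context):
    """Hypothetical helper: HTML output for `source` in a given context."""
    if context == "code":
        # Literal text: the ampersand itself is escaped.
        return html.escape(source, quote=False)
    if context == "raw_html":
        # Passed through unaltered.
        return source
    if context == "text":
        # Recognized: decode references, then escape what remains.
        return html.escape(_REFERENCE.sub(_decode, source), quote=False)
    raise ValueError(f"unknown context: {context!r}")


print(render("&ouml;", "code"))      # &amp;ouml;
print(render("&ouml;", "raw_html"))  # &ouml;
print(render("&ouml;", "text"))      # ö
print(render("&copy", "text"))       # &amp;copy
print(render("&copy", "raw_html"))   # &copy
```

The last two lines show the case raised above: without a semicolon, &copy is not decoded in ordinary text and its ampersand is escaped, while in raw HTML it is left untouched.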

@rsc
Author

rsc commented Sep 4, 2021

I created PR #690 in case it is helpful. No worries if you'd rather do something different.
