A <script> tag's contents are escaped when the tag is sandwiched between two HTML tags and the first HTML element is directly followed by another HTML tag on the next line. To illustrate:
Input:
<p>This is immediately followed by a script tag, which is then broken</p>
<script type="text/javascript">
if (i < 3 && 'one' != "two") alert("ok");
</script>
<p>This ending tag is matched as the ending tag of the first paragraph</p>
Output:
<p>This is immediately followed by a script tag, which is then broken</p>
<script type="text/javascript">
if (i < 3 && 'one' != "two") alert("ok");
</script>
<p>This ending tag is matched as the ending tag of the first paragraph</p>
In this output, the contents of the script tag are escaped as if they were inside a p tag. Adding debug statements inside Lexer.prototype.token reveals that the if (cap = this.rules.html.exec(src)) { block is entered only once, and consumes the entire string. It looks like the HTML regex is matching the entire input document above as a single tag (or HTML block), which then sets the pre option to false and proceeds to escape special characters inside all tags in the document.
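One way this kind of over-matching can arise is sketched below. The regex is a simplified stand-in written for illustration, not marked's actual rule: it assumes a block rule that lazily matches up to a closing tag with the same name as the opening tag, and that only accepts the match if it ends at a blank line or at the end of the input. The name `firstHtmlBlock` and the list of tags that keep their contents unescaped are likewise assumptions.

```javascript
// Simplified stand-in for a block-level HTML rule (NOT marked's real regex).
// Assumed properties: lazy match up to a closing tag of the same name, and
// the block must end at a blank line or at the end of the input.
const blockHtml = /^ *<(\w+)[\s\S]+?<\/\1> *(?:\n{2,}|\s*$)/;

// Hypothetical helper: return the first HTML block found at the start of src.
function firstHtmlBlock(src) {
  const cap = blockHtml.exec(src);
  if (!cap) return null;
  return {
    raw: cap[0],
    tag: cap[1],
    // Assumption: only these tags keep their contents unescaped.
    pre: ['pre', 'script', 'style'].includes(cap[1]),
  };
}

const broken = [
  '<p>first</p>',
  '<script type="text/javascript">',
  'if (i < 3 && \'one\' != "two") alert("ok");',
  '</script>',
  '<p>last</p>',
].join('\n');

// Same document, but with a blank line after the first paragraph.
const spaced = broken.replace('</p>\n<script', '</p>\n\n<script');

console.log(firstHtmlBlock(broken).raw === broken); // true: whole input is one block
console.log(firstHtmlBlock(broken).pre);            // false: tag is "p", so contents get escaped
console.log(JSON.stringify(firstHtmlBlock(spaced).raw));
```

With only a single newline after the first `</p>`, the terminator cannot match there, so the lazy quantifier keeps expanding until the final `</p>`, which sits at the end of the input. The captured tag is then `p` rather than `script`, which would explain why the script body is treated as ordinary text.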
Additional examples: Input with blockquotes (so it's not just p tags):
<blockquote>This is immediately followed by a script tag, which is then broken</blockquote>
<script type="text/javascript">
if (i < 3 && 'one' != "two") alert("ok");
</script>
<blockquote>This ending tag is matched as the ending tag of the first paragraph</blockquote>
Output:
<blockquote>This is immediately followed by a script tag, which is then broken</blockquote>
<script type="text/javascript">
if (i < 3 && 'one' != "two") alert("ok");
</script>
<blockquote>This ending tag is matched as the ending tag of the first paragraph</blockquote>
Input with whitespace between the first tag and the script tag (works):
<p>This is followed by whitespace, which works</p>
<script type="text/javascript">
if (i < 3 && 'one' != "two") alert("ok");
</script>
<p>This ending tag is matched as the ending tag of the first paragraph</p>
Output:
<p>This is followed by whitespace, which works</p>
<script type="text/javascript">
if (i < 3 && 'one' != "two") alert("ok");
</script>
<p>This ending tag is matched as the ending tag of the first paragraph</p>
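Since the variant with whitespace before the script tag is reported to work, a possible stopgap is to pre-process the Markdown source before handing it to marked. The sketch below assumes "whitespace" means a blank line; `separateScriptTags` is a hypothetical helper, not part of marked, and it makes no attempt to skip code blocks that happen to contain a line starting with `<script`.

```javascript
// Hypothetical pre-processing helper (not part of marked): insert a blank
// line before every line that opens a <script> tag, unless the previous
// line is already blank or the tag is on the first line.
function separateScriptTags(src) {
  const out = [];
  for (const line of src.split('\n')) {
    const prev = out.length ? out[out.length - 1] : '';
    if (/^\s*<script\b/i.test(line) && prev.trim() !== '') {
      out.push('');
    }
    out.push(line);
  }
  return out.join('\n');
}

const source = '<p>a</p>\n<script>\nalert("ok");\n</script>\n<p>b</p>';
console.log(separateScriptTags(source));
```

The helper is idempotent, so running it on input that already has the blank line leaves the text unchanged.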
Input where the first tag is followed by a different HTML tag:
<p>This is immediately followed by a different tag, which breaks the script tag</p>
<div>This breaks the script tag</div>
<script type="text/javascript">
if (i < 3 && 'one' != "two") alert("ok");
</script>
<p>This ending tag is matched as the ending tag of the first paragraph</p>
Output:
<p>This is immediately followed by a different tag, which breaks the script tag</p>
<div>This breaks the script tag</div>
<script type="text/javascript">
if (i < 3 && 'one' != "two") alert("ok");
</script>
<p>This ending tag is matched as the ending tag of the first paragraph</p>
All the above were performed with version 0.3.5 using the default configuration, via marked -i repro.md where repro.md contains only the given contents.
I think this happens because the HTML regex cannot match each HTML line individually. To check, I overrode the renderer's html function:
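var renderer = new marked.Renderer();
// encode() is assumed to be an HTML-escaping helper defined elsewhere.
renderer.html = function (html) {
  return '<h5>' + encode(html) + ' has been matched</h5>';
};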
INPUT:
# title 1
<h1>aaaa</h1>
<h2>bbbb</h2>
<h1>aaaa</h1>
<h1>bbbb</h1>
<h2>cccc</h2>
OUTPUT:
<h1>aaaa</h1>
<h5><h2>bbbb</h2> has been matched<h5>
<h1>aaaa</h1>
<h5><h1>bbbb</h2> <br><h1>bbbb</h2> has been matched<h5>
<hr>
So, if only a single \n separates two or more HTML tags, the first HTML tag cannot be matched on its own.