Commit 07f9b21

josepharharlozy219
authored and committed
Add setHTMLUnsafe() and parseHTMLUnsafe()
These are modern HTML-parsing methods, replacing the innerHTML setter and (new DOMParser()).parseFromString(). See https://github.com/otherdaniel/purification/blob/explainer-examples/explainer.md#proposed-api for more background. The "unsafe" part of their names comes from the fact that safe versions, which sanitize by default, will be introduced in the future. Notable differences from the older versions include support for declarative shadow roots (whatwg#5465) by default, no mode-switching between XML and HTML, and (for parseHTMLUnsafe()) no inheritance from the outer document.
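As a rough illustration of how calling code might adopt these methods, the sketch below prefers the new entry points when they exist and otherwise falls back to the older APIs named in the commit message. The helper names `setHTML` and `parseHTML` and the `env` parameter are hypothetical and are not part of this commit; they exist only so the sketch can be exercised outside a browser.

```javascript
// Hypothetical helpers (not part of the commit): prefer the new methods when
// the environment provides them, otherwise fall back to the older APIs.

// Replaces the children of `target` (an Element or ShadowRoot) with parsed
// HTML. Returns the name of the API that was used.
function setHTML(target, html) {
  if (typeof target.setHTMLUnsafe === "function") {
    target.setHTMLUnsafe(html);
    return "setHTMLUnsafe";
  }
  // Older path: the innerHTML setter.
  target.innerHTML = html;
  return "innerHTML";
}

// Parses `html` into a new Document. `env` defaults to the global object and
// is only a parameter so that the lookup can be substituted (an assumption of
// this sketch).
function parseHTML(html, env = globalThis) {
  const DocumentCtor = env.Document;
  if (DocumentCtor && typeof DocumentCtor.parseHTMLUnsafe === "function") {
    return DocumentCtor.parseHTMLUnsafe(html);
  }
  // Older path: DOMParser with the text/html type.
  return new env.DOMParser().parseFromString(html, "text/html");
}
```

The two paths are not equivalent. Per the description above, the new methods support declarative shadow roots by default and `parseHTMLUnsafe()` does not inherit from the outer document, whereas the fallbacks keep the older behavior. Neither path sanitizes its input.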
1 parent 8a39823 commit 07f9b21

1 file changed: 142 additions and 22 deletions
source

@@ -10609,6 +10609,8 @@ typedef (<span>HTMLScriptElement</span> or <span>SVGScriptElement</span>) <dfn t
1060910609

1061010610
[<span>LegacyOverrideBuiltIns</span>]
1061110611
partial interface <dfn id="document" data-lt="">Document</dfn> {
10612+
static <code>Document</code> <span data-x="dom-parseHTMLUnsafe">parseHTMLUnsafe</span>(DOMString html);
10613+
1061210614
// <span>resource metadata management</span>
1061310615
[PutForwards=<span data-x="dom-location-href">href</span>, <span>LegacyUnforgeable</span>] readonly attribute <span>Location</span>? <span data-x="dom-document-location">location</span>;
1061410616
attribute USVString <span data-x="dom-document-domain">domain</span>;
@@ -109471,6 +109473,8 @@ document.body.appendChild(frame)</code></pre>
109471109473
also live here? -->
109472109474
<h3 id="dom-parsing-and-serialization">DOM parsing</h3>
109473109475

109476+
<h4>The <code>DOMParser</code> interface</h4>
109477+
109474109478
<p>The <code>DOMParser</code> interface allows authors to create new <code>Document</code> objects
109475109479
by parsing strings, as either HTML or XML.</p>
109476109480

@@ -109491,17 +109495,19 @@ document.body.appendChild(frame)</code></pre>
109491109495

109492109496
<p>Note that <code>script</code> elements are not evaluated during parsing, and the resulting
109493109497
document's <span data-x="document's character encoding">encoding</span> will always be
109494-
<span>UTF-8</span>.</p>
109498+
<span>UTF-8</span>. The document's <span data-x="concept-document-url">URL</span> will be
109499+
inherited from <var>parser</var>'s <span>relevant global object</span>.</p>
109495109500

109496109501
<p>Values other than the above for <var>type</var> will cause a <code>TypeError</code> exception
109497109502
to be thrown.</p>
109498109503
</dd>
109499109504
</dl>
109500109505

109501109506
<p class="note">The design of <code>DOMParser</code>, as a class that needs to be constructed and
109502-
then have its <code data-x="dom-DOMParser-parseFromString">parseFromString()</code> method called,
109503-
is an unfortunate historical artifact. If we were designing this functionality today it would be a
109504-
standalone function.</p>
109507+
then have its <code data-x="dom-DOMParser-parseFromString">parseFromString()</code> method
109508+
called, is an unfortunate historical artifact. If we were designing this functionality today it
109509+
would be a standalone function. For parsing HTML, the modern alternative is <code
109510+
data-x="dom-parseHTMLUnsafe">Document.parseHTMLUnsafe()</code>.</p>
109505109511

109506109512
<pre><code class="idl">[Exposed=Window]
109507109513
interface <dfn interface>DOMParser</dfn> {
@@ -109531,7 +109537,7 @@ enum <dfn enum>DOMParserSupportedType</dfn> {
109531109537
<li>
109532109538
<p>Let <var>document</var> be a new <code>Document</code>, whose <span
109533109539
data-x="concept-document-content-type">content type</span> is <var>type</var> and <span
109534-
data-x="concept-document-URL">url</span> is this's <span>relevant global object</span>'s <span
109540+
data-x="concept-document-URL">URL</span> is this's <span>relevant global object</span>'s <span
109535109541
data-x="concept-document-window">associated <code>Document</code></span>'s <span
109536109542
data-x="concept-document-URL">URL</span>.</p>
109537109543
<!-- When https://github.com/whatwg/html/issues/4792 gets fixed we need to investigate which of
@@ -109552,23 +109558,8 @@ enum <dfn enum>DOMParserSupportedType</dfn> {
109552109558
data-x="dom-DOMParserSupportedType-texthtml"><code>text/html</code>"</dfn></dt>
109553109559
<dd>
109554109560
<ol>
109555-
<li><p>Set <var>document</var>'s <span data-x="concept-document-type">type</span> to "<code
109556-
data-x="">html</code>".</p></li>
109557-
109558-
<li><p>Create an <span>HTML parser</span> <var>parser</var>, associated with
109559-
<var>document</var>.</p></li>
109560-
109561-
<li><p>Place <var>string</var> into the <span>input stream</span> for <var>parser</var>. The
109562-
encoding <span data-x="concept-encoding-confidence">confidence</span> is
109563-
<i>irrelevant</i>.</p></li>
109564-
109565-
<li>
109566-
<p>Start <var>parser</var> and let it run until it has consumed all the characters just
109567-
inserted into the input stream.</p>
109568-
109569-
<p class="note">This might mutate the document's <span
109570-
data-x="concept-document-mode">mode</span>.</p>
109571-
</li>
109561+
<li><p><span>Parse HTML from a string</span> given <var>document</var> and
109562+
<var>string</var>.</p></li>
109572109563
</ol>
109573109564

109574109565
<p class="note">Since <var>document</var> does not have a <span
@@ -109610,6 +109601,135 @@ enum <dfn enum>DOMParserSupportedType</dfn> {
109610109601
<li><p>Return <var>document</var>.</p>
109611109602
</ol>
109612109603

109604+
<p>To <dfn>parse HTML from a string</dfn>, given a <var>document</var> <code>Document</code> and a
109605+
<span>string</span> <var>html</var>:</p>
109606+
109607+
<ol>
109608+
<li><p>Set <var>document</var>'s <span data-x="concept-document-type">type</span> to "<code
109609+
data-x="">html</code>".</p></li>
109610+
109611+
<li><p>Create an <span>HTML parser</span> <var>parser</var>, associated with
109612+
<var>document</var>.</p></li>
109613+
109614+
<li><p>Place <var>html</var> into the <span>input stream</span> for <var>parser</var>. The
109615+
encoding <span data-x="concept-encoding-confidence">confidence</span> is
109616+
<i>irrelevant</i>.</p></li>
109617+
109618+
<li>
109619+
<p>Start <var>parser</var> and let it run until it has consumed all the characters just
109620+
inserted into the input stream.</p>
109621+
109622+
<p class="note">This might mutate the document's <span
109623+
data-x="concept-document-mode">mode</span>.</p>
109624+
</li>
109625+
</ol>
109626+
109627+
</div>
109628+
109629+
<h4>Unsafe HTML parsing methods</h4>
109630+
109631+
<dl class="domintro">
109632+
<dt><code data-x=""><var>element</var>.<span subdfn
109633+
data-x="dom-Element-setHTMLUnsafe">setHTMLUnsafe</span>(<var>html</var>)</code></dt>
109634+
109635+
<dd>
109636+
<p>Parses <var>html</var> using the HTML parser, and replaces the children of <var>element</var>
109637+
with the result. <var>element</var> provides context for the HTML parser.</p>
109638+
</dd>
109639+
109640+
<dt><code data-x=""><var>shadowRoot</var>.<span subdfn
109641+
data-x="dom-ShadowRoot-setHTMLUnsafe">setHTMLUnsafe</span>(<var>html</var>)</code></dt>
109642+
109643+
<dd>
109644+
<p>Parses <var>html</var> using the HTML parser, and replaces the children of
109645+
<var>shadowRoot</var> with the result. <var>shadowRoot</var>'s <span
109646+
data-x="concept-DocumentFragment-host">host</span> provides context for the HTML parser.</p>
109647+
</dd>
109648+
109649+
<dt><code data-x=""><var>doc</var> = Document.<span
109650+
data-x="dom-parseHTMLUnsafe">parseHTMLUnsafe</span>(<var>html</var>)</code></dt>
109651+
109652+
<dd>
109653+
<p>Parses <var>html</var> using the HTML parser, and returns the resulting
109654+
<code>Document</code>.</p>
109655+
109656+
<p>Note that <code>script</code> elements are not evaluated during parsing, and the resulting
109657+
document's <span data-x="document's character encoding">encoding</span> will always be
109658+
<span>UTF-8</span>. The document's <span data-x="concept-document-url">URL</span> will be
109659+
<code>about:blank</code>.</p>
109660+
</dd>
109661+
</dl>
109662+
109663+
<p class="warning">These methods perform no sanitization to remove potentially-dangerous elements
109664+
and attributes like <code>script</code> or <span>event handler content attributes</span>.</p>
109665+
109666+
<pre><code class="idl">partial interface <span id="Element-partial">Element</span> {
109667+
undefined <span data-x="dom-Element-setHTMLUnsafe">setHTMLUnsafe</span>(DOMString html);
109668+
};
109669+
109670+
partial interface <span id="ShadowRoot-partial">ShadowRoot</span> {
109671+
undefined <span data-x="dom-ShadowRoot-setHTMLUnsafe">setHTMLUnsafe</span>(DOMString html);
109672+
};</code></pre>
109673+
109674+
<div w-nodev>
109675+
109676+
<p><code>Element</code>'s <dfn method for="Element"><code
109677+
data-x="dom-Element-setHTMLUnsafe">setHTMLUnsafe(<var>html</var>)</code></dfn> method steps
109678+
are:</p>
109679+
109680+
<ol>
109681+
<li><p>Let <var>target</var> be <span>this</span>'s <span>template contents</span> if
109682+
<span>this</span> is a <code>template</code> element; otherwise <span>this</span>.</p></li>
109683+
109684+
<li><p><span>Unsafely set HTML</span> given <var>target</var>, <span>this</span>, and
109685+
<var>html</var>.</p></li>
109686+
</ol>
109687+
109688+
<p><code>ShadowRoot</code>'s <dfn method for="ShadowRoot"><code
109689+
data-x="dom-ShadowRoot-setHTMLUnsafe">setHTMLUnsafe(<var>html</var>)</code></dfn> method steps
109690+
are to <span>unsafely set HTML</span> given <span>this</span>, <span>this</span>'s <span
109691+
data-x="concept-DocumentFragment-host">shadow host</span>, and <var>html</var>.</p>
109692+
109693+
<p>To <dfn>unsafely set HTML</dfn>, given an <code>Element</code> or <code>DocumentFragment</code>
109694+
<var>target</var>, an <code>Element</code> <var>contextElement</var>, and a <span>string</span>
109695+
<var>html</var>:</p>
109696+
109697+
<ol>
109698+
<li><p>Let <var>newChildren</var> be the result of the <span>HTML fragment parsing algorithm</span>
109699+
given <var>contextElement</var> and <var>html</var>.</p></li>
109700+
109701+
<li><p>Let <var>fragment</var> be a new <code>DocumentFragment</code> whose <span>node
109702+
document</span> is <var>contextElement</var>'s <span>node document</span>.</p></li>
109703+
109704+
<li><p>For each <var>node</var> in <var>newChildren</var>, <span
109705+
data-x="concept-node-append">append</span> <var>node</var> to <var>fragment</var>.</p></li>
109706+
109707+
<li><p><span data-x="concept-node-replace-all">Replace all</span> with <var>fragment</var> within
109708+
<var>target</var>.</p></li>
109709+
</ol>
109710+
109711+
<hr>
109712+
109713+
<p>The static <dfn method for="Document"><code
109714+
data-x="dom-parseHTMLUnsafe">parseHTMLUnsafe(<var>html</var>)</code></dfn> method steps are:</p>
109715+
109716+
<ol>
109717+
<li>
109718+
<p>Let <var>document</var> be a new <code>Document</code>, whose <span
109719+
data-x="concept-document-content-type">content type</span> is "<code
109720+
data-x="">text/html</code>".</p>
109721+
109722+
<p class="note">Since <var>document</var> does not have a <span
109723+
data-x="concept-document-bc">browsing context</span>, <span data-x="concept-n-script">scripting
109724+
is disabled</span>.</p>
109725+
</li>
109726+
109727+
<li><p><span>Parse HTML from a string</span> given <var>document</var> and
109728+
<var>html</var>.</p></li>
109729+
109730+
<li><p>Return <var>document</var>.</p></li>
109731+
</ol>
109732+
109613109733
</div>
109614109734

109615109735
