Escape attributes before printing #1

oliverpool · 2023-11-07T08:24:26Z

// &lt; & co. get wrongly unescaped in the output
htmlformat.Fragment(&buf, strings.NewReader(`<button class="&lt;script&gt; &#34; href=&#34;#hacked">Escaped classes</button>`))

a.Val must be escaped using html.EscapeString in at least two places:

htmlformat/format.go

Line 63 in c3d4a33

if _, err = fmt.Fprintf(w, ` %s="%s"`, a.Key, val); err != nil {

htmlformat/format.go

Line 133 in c3d4a33

if _, err = fmt.Fprintf(w, ` %s="%s"`, a.Key, a.Val); err != nil {
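A minimal sketch of the proposed change, assuming the fix is simply to pass the value through html.EscapeString before formatting. The attribute type and the formatAttr helper are hypothetical stand-ins for html.Attribute and the Fprintf calls above; they are not taken from the htmlformat code.

```go
package main

import (
	"fmt"
	"html"
)

// attribute is a hypothetical stand-in for html.Attribute from
// golang.org/x/net/html; only the two fields used here are modelled.
type attribute struct {
	Key, Val string
}

// formatAttr renders one attribute the way the Fprintf calls above do,
// except that the value is escaped first.
func formatAttr(a attribute) string {
	return fmt.Sprintf(` %s="%s"`, a.Key, html.EscapeString(a.Val))
}

func main() {
	// The value as a parser would hand it over: entities already decoded.
	a := attribute{Key: "class", Val: `<script> " href="#hacked`}
	fmt.Println(formatAttr(a))
}
```

With the escaping in place, the decoded quote in the value can no longer terminate the attribute early.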

a-h · 2023-11-07T12:07:42Z

Ah, you're right. Thanks!

Unfortunately, I don't think it's as simple as just re-escaping a.Val, because if the attribute value is already invalid, e.g. class="&", running it through html.EscapeString will "fix" it so that the output is class="&amp;".
This was discussed in golang/go#52911, but I don't think the use case for having access to the raw value was really explained.
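The loss of information can be illustrated with the standard library alone. This is a rough sketch: html.UnescapeString is used here only as an approximation of the decoding the parser applies to attribute values, and roundTrip is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"html"
)

// roundTrip decodes entities (roughly what the parser does when it fills
// in a.Val) and then re-escapes the result for printing.
func roundTrip(raw string) string {
	return html.EscapeString(html.UnescapeString(raw))
}

func main() {
	// A bare "&" and a correctly written "&amp;" decode to the same value,
	// so after re-escaping they can no longer be told apart.
	for _, raw := range []string{"&", "&amp;", "&lt;script&gt;"} {
		fmt.Printf("%q -> %q\n", raw, roundTrip(raw))
	}
}
```

Both the invalid and the valid ampersand come out as &amp;, which is why escaping a.Val hides badly escaped input rather than exposing it.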

One use case is to unminify HTML created by a templating engine, so that we can compare it to expected output, and have an easy to visualise result.

Part of that use case is to find issues like invalid escaping, so fixing the invalid input isn't what we want.

One option is to fork the HTML parser and make the change to add the raw attribute value to the Node type, and then use it. There's only about 1 commit a month to the parser, so it probably wouldn't be too much of a maintenance burden. https://github.com/golang/net/commits/master/html

oliverpool · 2023-11-07T13:37:46Z

> having access to the raw value was really explained

Actually, getting the "value as interpreted by the browser" should be enough (which is what a.Val gives, I think). Even if the string representation differs, if the browser interpretation is the same, then the code should be safe.

Alternatively, you could reconstruct an html.Token from an html.Node (with a StartTagToken type) and call the t.String() method on it (but it will still escape previously unescaped values).

a-h · 2023-11-08T12:47:32Z

Thanks, agreed. I've fixed this in 5bd994f

a-h closed this as completed Nov 8, 2023
