Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Escape attributes before printing #1

Closed
oliverpool opened this issue Nov 7, 2023 · 3 comments
Closed

Escape attributes before printing #1

oliverpool opened this issue Nov 7, 2023 · 3 comments

Comments

@oliverpool
Copy link

As noticed in a-h/templ#284

 // < & co. gets wrongly removed
htmlformat.Fragment(&buf, strings.NewReader(`<button class="&lt;script&gt; &#34; href=&#34;#hacked">Escaped classes</button>`))

a.Val must be escaped using html.EscapeString in at least two places:

if _, err = fmt.Fprintf(w, ` %s="%s"`, a.Key, val); err != nil {

if _, err = fmt.Fprintf(w, ` %s="%s"`, a.Key, a.Val); err != nil {

@a-h
Copy link
Owner

a-h commented Nov 7, 2023

Ah, you're right. Thanks!

Unfortunately, I don't think it's as simple as just re-escaping the a.Val, because if the attribute value is already invalid, e.g. class="&", running it through html.EscapeString will "fix" it so that the output is class="&amp;".
This was discussed golang/go#52911 but I don't think the use case for having access to the raw value was really explained.

One use case is to unminify HTML created by a templating engine, so that we can compare it to expected output, and have an easy to visualise result.

Part of that use case is to find issues like invalid escaping, so fixing the invalid input isn't what we want.

One option is to fork the HTML parser and make the change to add the raw attribute value to the Node type, and then use it. There's only about 1 commit a month to the parser, so it probably wouldn't be too much of a maintenance burden. https://github.com/golang/net/commits/master/html

@oliverpool
Copy link
Author

having access to the raw value was really explained

Actually getting the "value as interpreted by the browser" should be enough (which is what a.Val gives I think). Even if the string representation differ, if the browser interpretation is the same, then the code should be safe.

Alternatively, you could reconstruct an html.Token from an html.Node (with a StartTagToken type) and call the t.String() method on it (but it will still escape previously unescaped values).

@a-h
Copy link
Owner

a-h commented Nov 8, 2023

Thanks, agreed. I've fixed this in 5bd994f

@a-h a-h closed this as completed Nov 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants