
Contributor

@mpkorstanje mpkorstanje commented Aug 10, 2025

🤔 What's changed?

In brief, and as Jon Surrell explains in more detail[1], the HTML parser interprets both </script> and <!-- inside a script element. We caught the first one, but not the second.

The HTML standard recommends replacing the < with \x3C[2] rather than escaping the /.

  1. https://sirre.al/2025/08/06/safe-json-in-script-tags-how-not-to-break-a-site/
  2. https://html.spec.whatwg.org/multipage/scripting.html#restrictions-for-contents-of-script-elements
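
As a rough illustration of the difference between the two approaches, here is a minimal sketch. The helper names are hypothetical and the exact form of the earlier slash escaping is an assumption; this is not the project's actual code.

```javascript
// Hypothetical helpers for illustration; not the project's actual code.

// Earlier approach (assumed form): escape "/" so "</script>" never appears.
function escapeSlash(json) {
  return json.replace(/\//g, "\\/");
}

// Approach in this change: escape "<" itself as the hex escape \x3C.
function escapeLessThan(json) {
  return json.replace(/</g, "\\x3C");
}

const json = JSON.stringify({ text: "</script> and <!-- comment" });

const slashEscaped = escapeSlash(json);
const lessThanEscaped = escapeLessThan(json);

console.log(slashEscaped.includes("</script>")); // false
console.log(slashEscaped.includes("<!--"));      // true, still present
console.log(lessThanEscaped.includes("<"));      // false
```

Escaping the < covers both sequences at once, because neither </script> nor <!-- can occur without it.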

🏷️ What kind of change is this?

  • 🐛 Bug fix (non-breaking change which fixes a defect)

📋 Checklist:

  • I agree to respect and uphold the Cucumber Community Code of Conduct
  • I've changed the behaviour of the code
    • I have added/updated tests to cover my changes.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly.
  • Users should know about my change
    • I have added an entry to the "Unreleased" section of the CHANGELOG, linking to this pull request.

@mpkorstanje mpkorstanje force-pushed the fix-escape-html-in-json branch 4 times, most recently from bfd7716 to 2f8f853 on August 10, 2025 11:34
Member

@gasparnagy gasparnagy left a comment

.NET is good

@mpkorstanje mpkorstanje merged commit 1177850 into main Aug 11, 2025
15 checks passed
@mpkorstanje mpkorstanje deleted the fix-escape-html-in-json branch August 11, 2025 09:34

sirreal commented Dec 1, 2025

👋 I'm glad you found the post helpful and acted on it!

I should mention an error in the post that could affect this change. In short, I'd recommend using the Unicode escape sequence \u003C instead of the hex escape \x3C because it's the correct escape for JSON and generally more portable.

I have updated the post and added a note about the different escape sequences.


\x3C is the escape sequence that the HTML standard recommends for script tags; however, it isn't actually a JSON escape sequence. JSON, to the best of my understanding, doesn't include hex escape sequences the way JavaScript does. JSON only supports Unicode escape sequences like \u003C. It's tricky to even detect this from inside JavaScript, but you can see it like this:

eval( String.raw`"\x3C"` )
// '<'
JSON.parse( String.raw`"\x3C"` )
// Uncaught SyntaxError: Bad escaped character in JSON at position 2 (line 1 column 3)
eval( String.raw`"\u003C"` )
// '<'
JSON.parse( String.raw`"\u003C"` )
// '<'

Depending on how the JSON is used, that may be fine. For example, printing the JSON as a JavaScript object means that the JSON strings are just JavaScript strings:

<!DOCTYPE html>
<script>
console.log(
  //  raw printed JSON      vvvvvv
  `The "\x3C" character: ${ "\x3C" }`
)
</script>

Or if the JSON is a string literal in JavaScript, that's also likely fine:

<!DOCTYPE html>
<script>
console.log(
  //  printed JSON in JavaScript string  vvvvvv
  `The "\x3C" character: ${ JSON.parse( '"\x3C"' ) }`
)
</script>

However, if the JSON is actually expected to be JSON, then hex escapes like \x3C are a bad choice. For example, the following will throw Uncaught SyntaxError: Bad escaped character in JSON at position 3:

<script type="application/json">
"\x3C"
</script>
<script>
console.log(
  JSON.parse( document.querySelector( 'script' ).textContent )
)
</script>

While this is just fine with the Unicode escape sequence:

<script type="application/json">
"\u003C"
</script>
<script>
console.log(
  JSON.parse( document.querySelector( 'script' ).textContent )
)
</script>
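
Putting the two cases together, a minimal sketch of escaping with the Unicode sequence might look like the following. The helper name is hypothetical and the snippet is only meant to show that the same output is accepted both as JSON and as a JavaScript expression.

```javascript
// Hypothetical helper; the name is an assumption for illustration.
function escapeLessThanUnicode(json) {
  return json.replace(/</g, "\\u003C");
}

const original = { text: "<!-- </script>" };
const escaped = escapeLessThanUnicode(JSON.stringify(original));

// Valid JSON: JSON.parse accepts the \u003C escape.
const fromJson = JSON.parse(escaped);

// Valid JavaScript: the same text evaluates as an object literal.
const fromJs = new Function(`return (${escaped});`)();

console.log(escaped.includes("<"));           // false
console.log(fromJson.text === original.text); // true
console.log(fromJs.text === original.text);   // true
```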

Contributor Author

mpkorstanje commented Dec 1, 2025

Cheers! The proactive follow-up is much appreciated. How did you even find us? 😄

Or if the JSON is a string literal in JavaScript, that's also likely fine:

This is the case that applies to us. We're filling in this part of the template with a comma-separated list of JSON objects, so on evaluation it's all JavaScript.

<script>
window.CUCUMBER_MESSAGES = [{{messages}}];
</script>

Though we should probably rename the JsonInHtmlWriter to JsonInJavascriptInHtmlWriter to reflect that accurately.
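
For illustration only, the template filling described above could be sketched roughly like this. The template string mirrors the snippet above; renderMessages and fakeWindow are hypothetical names, and this is an assumption about the general shape rather than the actual writer.

```javascript
// Hypothetical sketch; only the template text comes from the snippet above.
const template = "window.CUCUMBER_MESSAGES = [{{messages}}];";

function renderMessages(messages) {
  const list = messages
    .map((m) => JSON.stringify(m).replace(/</g, "\\x3C"))
    .join(",");
  // Use a function replacer so "$" in the data is not treated as a pattern.
  return template.replace("{{messages}}", () => list);
}

const script = renderMessages([{ name: "<!-- a -->" }, { name: "</script>" }]);

// Evaluate the script body as JavaScript, standing in for the browser.
const fakeWindow = {};
new Function("window", script)(fakeWindow);

console.log(script.includes("<"));                 // false
console.log(fakeWindow.CUCUMBER_MESSAGES[1].name); // </script>
```

Because the filled-in text is evaluated as JavaScript and never passed to JSON.parse, the hex escape is decoded back to < in this setup.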


sirreal commented Dec 1, 2025

How did you even find us?

Referrer traffic in my stats. A benefit of your nicely linking to my post (thank you) is that I get insight and can follow up like this.
