Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

easier way of using HTML entities #42

Closed
pekeler opened this issue Aug 30, 2016 · 10 comments
Closed

easier way of using HTML entities #42

pekeler opened this issue Aug 30, 2016 · 10 comments

Comments

@pekeler
Copy link

pekeler commented Aug 30, 2016

Using innerHTML to insert the odd HTML entity such as   as described in the wiki is OK. But when HTML entities are used all over the place, this quickly becomes unwieldy. I wish there was some way to keep using the entities inline.

Is there another reason to escape all text other than prevention of XSS from user entered text? If there isn't, I think the default of escaping everything is wrong and should be changed (if possible).

When writing plain old HTML, people are used their <tags> being interpreted accordingly, just like their &entities;, and the rest used as is (i.e. unescaped). I wish bel could mirror that experience.

@shama
Copy link
Member

shama commented Aug 30, 2016

Could you describe some specific examples you're running into? There might be some things possible to make it easier.

Is there another reason to escape all text other than prevention of XSS from user entered text? If there isn't, I think the default of escaping everything is wrong and should be changed (if possible).

I don't agree. You should rarely ever need to use innerHTML. If you find your app is using innerHTML quite a bit it means your HTML is being largely generated somewhere else. In which case, you should just use that other thing to generate your HTML instead of additionally feeding through bel.

Plus I'm not going to sacrifice security for developer convenience. Inserting unsafe HTML into your app is an awkward task on purpose.

@pekeler
Copy link
Author

pekeler commented Aug 30, 2016

I'm not actually using bel directly. I'm using choo. And I'm honestly not sure if the escaping happens at this level of the stack or at hyperx'.

I have 5 lines of content that can be blank. In those cases, to keep the layout steady, every line needs a nonbreaking space.

  while (numberOfTimes-- > 0) {
    const blankLine = html`<div></div>`
    blankLine.innerHTML = '&nbsp;'
    times.push(blankLine)
  }
  return html`
    <div class="example-lines">
      ${times}
    </div>
  `

Another case I ran into is for formatting two dates:

2016-10-01&nbsp;13:00 &gt; 2016-10-01&nbsp;10:00

Is there a more elegant way to render these?

Plus I'm not going to sacrifice security for developer convenience. Inserting unsafe HTML into your app is an awkward task on purpose.

I understand your position. But unless you're working on something like a wiki, isn't rendering user input the exception? Using the textContent attribute to render user input is common. Using innerHTML to render HTML entities is unusual.

@shama
Copy link
Member

shama commented Aug 30, 2016

But unless you're working on something like a wiki, isn't rendering user input the exception?

Think about any application that retrieves data from a database (for example via a JSON API). A user entered that data.

Enter Your Name: [<script>alert('My name is haxor')</script>]

One could argue what if scenarios all day but in the end, if you're not escaping your HTML your app is vulnerable to a XSS attack.


That being said, I'm all for making embedding entities easier within the template strings but only if it can be done in a safe way. Here is the line of code responsible.

@pekeler
Copy link
Author

pekeler commented Aug 30, 2016

Hm, there doesn't seem to be a document.createEntity. And Entity isn't implemented in Firefox. I can't figure out a way to programmatically create entities unless I wrap them each in some other element which isn't desirable.

I think this workaround is nicer than using innerHTML:

  const NBSP = '\u00A0'
  bel`<div>12${NBSP}pm</div>`

@pekeler
Copy link
Author

pekeler commented Aug 30, 2016

Something like this may work:

function createBelCreateElement (tag, props, children, safe) {
  return function belCreateElement (tag, props, children) {
    // ...
    function appendChild (childs) {
      // ...
      if (safe) {
        node = document.createTextNode(node)
      }

      if (node && node.nodeType) {
        if (safe) {
          el.appendChild(node)
        } else {
          el.innerHTML = el.innerHTML + node
        }
      }
    }
    appendChild(children)

    return el
  }
}

module.exports = hyperx(createBelCreateElement(true))
module.exports.unsafe = hyperx(createBelCreateElement(false))
module.exports.createElement = belCreateElement

@shama
Copy link
Member

shama commented Aug 31, 2016

Sorry, I don't want to provide an unsafe element creator in this library. Happy to consider a safe way to support entities embedded into the template string though.

@pekeler
Copy link
Author

pekeler commented Aug 31, 2016

OK, something like this would be a more generic approach that detects HTML entities in the children and renders them with innerHTML while the rest keeps being rendered with createTextNode:

function appendChild (childs) {
  if (!Array.isArray(childs)) return
  for (var i = 0; i < childs.length; i++) {
    var node = childs[i]
    // ...
    if (typeof node === 'string') {
      // parts is array of entities and regular text
      const parts = node.replace(/(&[a-z]+;)/gi, '\u0000$1\u0000').split('\u0000').filter(p => p)
      if (parts.length > 1) {
        appendChild(parts)
        continue
      }
      if (/^&[a-z]+;$/gi.test(node)) {
        el.innerHTML = el.innerHTML + node
        continue
      }
      if (el.lastChild && el.lastChild.nodeName === '#text') {
        el.lastChild.nodeValue += node
        continue
      }
      node = document.createTextNode(node)
    }

    if (node && node.nodeType) {
      el.appendChild(node)
    }
  }
}

The regex would have to be a bit more complex to also deal with numerical entities like &#160; and &#x000A0;.

Anyways, I'm not sure downstream libraries that use bel would appreciate the impact on performance this has. There's a reason why Handlebars, Mustache, ERB, Underscore, Haml, etc. all have an explicit way for rendering without escaping.

Update: added .filter to remove empty parts

@jondashkyle
Copy link
Contributor

jondashkyle commented Jul 12, 2017

Just wanted to chime in on this now that the module has evolved some. I hear the safety concerns, but with the added SSR capabilities of Pelo, this would be really handy. For example, I’m formatting text, which I’ve properly sanitized, as markdown formatter. Currently to do this requires:

var input = 'Some **sanitized** text!'
var text = markdown(input)
var element = bel`<div></div>`
element.setInnerHTML = text

Up till the addition of Pelo this was cool, but when using .toString() for some SSR jazz, the DOM isn’t around, and the element ends up being blank. I’ve hit this a few times recently, both when working on some SSR stuff, and when creating a little static site generator with Choo. Obv loosing a big benefit of SSR when used on a page with a lot of copy.

You can see this in action when visiting my site:
https://jon-kyle.com/

Source here:
https://github.com/jondashkyle/jon-kyle.com

Could be a great add?! 🎉
(cc: @yoshuawuyts)

@yoshuawuyts
Copy link
Member

yoshuawuyts commented Jul 12, 2017 via email

@goto-bus-stop
Copy link
Member

added in #86

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants