Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normalizer seems overzealous when stripping whitespace between elements #149

Closed
paulchaplin opened this issue Jun 18, 2014 · 8 comments
Closed

Comments

@paulchaplin
Copy link

My browser (Chrome 35) doesn't insert &nbsp; between e.g. <b>foo</b> <i>bar</i> when I've selected and formatted each word, so the normaliser eats the space when running setHTML(), giving the run-together "foobar" rather than the expected "foo bar".

I locally changed what equates to line 117 in src/normalizer.coffee from:

    html = html.replace(/\>\s+\</g, '><')

to:

    html = html.replace(/\>\s+\</g, '> <')

to normalise to a single space and solve my problem, although depending on what else that might be required for, it could make the line redundant due to normal HTML whitespace-collapsing.

Is there another (better) way to solve this, and prevent genuine spaces from being removed, without manually inserting &nbsp;?

@Kilian
Copy link

Kilian commented Jun 19, 2014

+1

@jhchen
Copy link
Member

jhchen commented Jun 26, 2014

Just to clarify this is limited to the setHTML call only correct? If you have two words "one two" and bold one and italicize two through the UI the space remains (at least this is the behavior I'm seeing on my Chrome).

@paulchaplin
Copy link
Author

That's correct -- there's no InsertOp entry for the space (since it's pre-stripped), so the HTML is set without it, and getHTML naturally then returns the editor's contents without it.

@bjnsn
Copy link

bjnsn commented Oct 16, 2014

+1

@bjnsn
Copy link

bjnsn commented Oct 16, 2014

If I understand the purpose of the whitespace stripping, a more complex RegExp that avoids stripping space between inline elements might work.

Instead of this (in Normalize.stripWhitespace):
html = html.replace(/>\s+</g, '><');

Use this:
html = html.replace(/>\s+<(?!\s*(?:b|big|i|small|tt|abbr|acronym|cite|code|dfn|em|kbd|strong|samp|var|a|bd|br|img|map|object|q|script|span|sub|sub|sup|button|input|label|select|textarea))/g, '><');

In that example, there is a negative lookahead which skips any cases that match these tags noted by MDN: https://developer.mozilla.org/en-US/docs/Web/HTML/Inline_elemente#Elements

@thomsbg
Copy link
Contributor

thomsbg commented Nov 24, 2015

I think this issue needs to be re-opened. Unfortunately, simply patching getHTML() to insert &nbsp; as appropriate is not sufficient to support copy / paste. See the following repro steps:

  1. visit http://quilljs.com/
  2. select the space between the words Quill and Rich in the 1st line
  3. use the toolbar or the keyboard to apply the bold format
  4. select all text, cut and then paste over the selected contents
  5. notice that the space between Quill and Rich has disappeared

While these repro steps may seem a bit convoluted (who would apply formatting to just whitespace?), in practice this happens often while editing rich text, especially when the authorship module is enabled (which frequently wraps runs of pure whitespace in <span class="author-1234"> </span> tags).

One could probably patch the paste-manager module to handle this case correctly, but I must admit I am confused why stripWhitespace is needed at all... it seems better to have Document#setHTML not require &nbsp; between inline tags to work correctly.

Perhaps Normalizer.stripWhitespace could replace runs of whitespace with a single space, rather than the empty string? Or simply not used at all.

I need to fix this issue on my fork before 1.0 is released. Which path forward would you anticipate would be most compatible with the 1.0 branch?

@jhchen
Copy link
Member

jhchen commented Nov 24, 2015

Normally HTML dictates consecutive whitespace be collapsed into one. To explicitly display more than one whitespace character you can use HTML entities such as &nbsp; for spaces. But all browsers insert a space, not an HTML entity, when you hit the spacebar on a contenteditable element. This could be replaced immediately by Quill but it's always tricky to replace content around where the cursor is so ideally the editor could just work with the native browser behavior. The CSS white-space property allows this and also has the benefit of showing tab characters properly and this is currently the path Quill choses to take.

The only issue introduced by this solution is when HTML is written for human readability (including tabs and newlines for nested HTML tags) it's surprising to discover those characters would actually show up. Also copying and pasting the example docs code to try out Quill (which does include nicely indented HTML) is a use case I want to support without the aforementioned surprise. So the quick solution was to get rid of whitespace between tags which is proving to not be such a robust solution.

I'm willing to believe a more complex regex could be the solution (which I'll explore for 1.0) but if it's the case that you're saving/loading Quill's content through the API and not with nicely indented HTML in the markup then the best and easiest solution is probably to get rid of stripWhitespace in the editor.

@fracz
Copy link

fracz commented May 3, 2016

You can override this behaviour in your application. No need to modify the sources.

normalizer = Quill.require('normalizer')
stripWhitespace = normalizer.stripWhitespace
normalizer.stripWhitespace = (html) ->
  stripped = stripWhitespace(html)
  stripped.replace(/></g, '> <')

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants