Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New lines in shortcode content are ignored #4456

Closed
2 tasks
bobbingwide opened this issue Jan 14, 2018 · 6 comments · Fixed by #7892
Closed
2 tasks

New lines in shortcode content are ignored #4456

bobbingwide opened this issue Jan 14, 2018 · 6 comments · Fixed by #7892
Assignees
Labels
Backwards Compatibility Issues or PRs that impact backwards compatability [Feature] Shortcodes Related to shortcode functionality [Type] Bug An existing feature does not function as intended

Comments

@bobbingwide
Copy link
Contributor

bobbingwide commented Jan 14, 2018

Issue Overview

The new editor does not respect new lines in shortcode content. It's impossible to use the new editor to edit posts which depend on this. For me this is a blocker; it breaks my sites.

I have a number of shortcodes where each line in the content is expected to be processed separately. With Gutenberg v2.0.0 it's impossible to safely edit a post containing these shortcodes without the content being broken. Take for example the [bw_csv] shortcode which converts CSV content into a table.

[bw_csv]One,Two,Three
A,B,C
D,E,F
[/bw_csv]

During shortcode expansion the first line becomes table the table heading.
Subsequent lines are table rows. So the expected output is like this.

One Two Three
A B C
D E F

When the new lines are removed by the editor the shortcode does not expand correctly. Everything is treated as the table heading.

The basic problem can be demonstrated even without the shortcode being active.
In the current editor you can quite happily switch between Visual and Text and the site is not broken.
Once the content is converted the new lines in the original source are lost.

Steps to Reproduce (for bugs)

  1. Create a new post using Posts > All Posts > Add New > Classic Editor
  2. Enter content as in the shortcode above
  3. View the post.
  4. The content should appear as typed.

Repeat the above steps for another new post
5. Now Edit the post using Gutenberg
6. The new line characters do not survive the conversion.
7. Any attempt to recreate the original content using Gutenberg will fail.
8. Switching to the Code editor to remove unwanted paragraphs ( #3900 ) doesn't work as that ignores new lines as well. They disappear as soon as you click on Update.

I've also tried using different block types: Shortcode, Custom HTML.

Expected Behavior

  • The editor should not be removing new lines.
  • It should not add anything either.
  • It works in WordPress 4.9.1
  • In my opinion the Shortcode block should support content created AS IS. It should respect any new line characters in the content.

Current Behavior

I've been unable to find any existing block that respects my new lines.

  • Custom HTML strips the new lines on save.
  • The shortcode block also strips new lines.

Possible Solution

Shortcode block should respect the content as typed.

Screenshots / Video

image
image
image

Related Issues and/or PRs

#3900
#1709 - see #1709 (comment)

Todos

  • Tests
  • Documentation
@bobbingwide
Copy link
Contributor Author

During a direct discussion with @aduth regarding finding a solution to respect new lines in the content for my CSS block, Andrew commented

Hmm, it seems like the beautifier can be a bit aggressive. Maybe need some metrics to tell it to not perform (maybe if there’s no tags? if a block opts out?)

I'm going to look into the options.

But first, I'd like to know why the function is needed at all?
For the shortcode block, replacing

	return beautifyHtml( content, {
		indent_inner_html: true,
		wrap_line_length: 0,
	} );

with

 return content;

appeared, at first glance, to resolve my issues nicely.
Why do other blocks need it?

@aduth
Copy link
Member

aduth commented Apr 8, 2018

Related: #633

With work toward a custom serializer in #5897, I'm wondering if we ought to just build some basic beautification into the serializer, rather than a step which occurs after block output has been generated. This way we can have more awareness of what should and shouldn't have changes applied.

@danielbachhuber danielbachhuber added the [Feature] Shortcodes Related to shortcode functionality label Apr 10, 2018
@danielbachhuber danielbachhuber added this to the Merge Proposal milestone Apr 10, 2018
@karmatosed karmatosed modified the milestones: Merge Proposal, Merge Proposal: Back Compat Apr 12, 2018
@bobbingwide
Copy link
Contributor Author

bobbingwide commented Apr 15, 2018

During WordCamp London contributor day I discussed this issue with @dmsnell. He suggested revisiting the reason that the HTML beautifier was implemented in #633.
So the original question stands... why do we need to invoke beautifyHtml at all?

@dmsnell
Copy link
Member

dmsnell commented Apr 16, 2018

I'll drop by here as well and share a brief thought on beautification, which actually I like. We have a fundamental problem with post_content as soon as we have shortcodes because they immediately break anything that operates on HTML. Something like jsbeautify is probably guaranteed to break in random unexpected ways because it thinks it has HTML when actually it doesn't - the same kinds of problems we have had with WordPress posts for years.

That is, if we want to beautify we probably need to build a custom beautifier and attempt to preserve shortcodes. Frankly, this is probably also a losing venture since we cannot preserve shortcodes in the general sense because they are unspecified. However, we might be able to preserve shortcodes within a block and make a hard cut-off saying "no shortcodes may span block boundaries."

Actually this problem makes me want to cry because basically I can't see any practical way to beautify and not destroy shortcodes. We could build another layer in the parser to detect some shortcodes which wrap content (but we could not get them all) and then wrap those in something before beautifying to try to preserve them, but that also would add considerable computational complexity.

  • Is there a strong enough justification for beautification to take us head-to-head with this problem?
  • Should we consider adding a warning whenever we detect a shortcode in the post to say that "oh yeah this might blow up the world and all that's good?"
  • Do we need to start thinking about a kind of inline-block whereby shortcodes could be contained and protected automatically?

@aduth
Copy link
Member

aduth commented Apr 16, 2018

I have a local branch which contains most of the work toward building beautification into the serializer, in a way which is not particularly opinionated (e.g. if newlines existed in a shortcode, the serializer would output them verbatim). The remaining work is largely edge cases, but the recurring pain point in working through them is an understanding of what we mean with "beautification".

As I understand it, the original issue #633 was targeted at what was at-the-time a large continuous string of text for the entirety of post content. A simplest solution may have been to add newlines between block demarcations. Still not great for more complex markup; traditionally one would expect some indented hierarchy of elements to be most readable. But what are the rules for said indentation?

js-beautify considers some objectives like ideal line length, which may be overkill for what we're trying to accomplish.

In my branch, the working naive implementation I adopted was "place on own line and indent non-inline descendent nodes", producing something like (from my own test cases):

const result = renderElement(
	<section>
		{ '\n\tString' }
		<h2>Hello <em>World!</em></h2>
		<span>On previous line</span>
		<p>This is content.</p>
		<div>
			<span>Nesting:</span>
			<p>Can get out of control.</p>
		</div><pre><code>{ 'foo\nbar\t' }</code></pre>
	</section>
);
<section>
	String
	<h2>Hello <em>World!</em></h2><span>On previous line</span>
	<p>This is content.</p>
	<div><span>Nesting:</span>
		<p>Can get out of control.</p>
	</div>
	<pre><code>foo
bar	</code></pre>
</section>

@iandunn
Copy link
Member

iandunn commented Jun 29, 2018

FWIW, another example of this is the [sourcecode] shortcode from the SyntaxHighlighter plugin, where you end up with many lines of code squished onto a single line.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Backwards Compatibility Issues or PRs that impact backwards compatability [Feature] Shortcodes Related to shortcode functionality [Type] Bug An existing feature does not function as intended
Projects
None yet
7 participants