AMP_Content_Sanitizer::sanitize() strips out amp-bind Bindings #836

Closed
d4mation opened this issue Dec 26, 2017 · 3 comments · Fixed by #895

@d4mation

I've made some template files that loop over Child Posts, pull in their data, and run AMP_Content_Sanitizer::sanitize() on the Post Content of each item that is pulled in.

The Post Content of the pulled in Posts (When viewed in AMP) utilizes amp-bind to toggle a Hidden state on elements. However, I noticed that when the DOMDocument Object is created from the passed in $content as part of AMP_DOM_Utils::get_dom_from_content(), it ignores any of the special amp-bind Binding attributes. This is presumably due to PHP not recognizing them as valid and throwing them out.

This only affects cases where amp-bind Binding attributes are run through the Sanitization method directly. If the markup simply exists in the template file (and is therefore not sanitized in the same way Post Content is), it is not affected.

This is primarily an issue when you want a Shortcode to still function similarly when viewed on an AMP version of a page, such as hiding/showing/expanding/etc. different elements based on user interaction.

Reproduction steps:

Place the following code at the top of https://github.com/Automattic/amp-wp/blob/8b416f9bbdba64e07082dcc02a9b64aa34c2d270/includes/templates/class-amp-content-sanitizer.php#L22 to simulate it being a part of the Post Content:

$content .= '<p id="test-stripping-amp-bind" [text]="foo">test</p>';
$content .= "<button on=\"tap:AMP.setState({foo: 'amp-bind'})\">Set to amp-bind</button>";

And then the following to load in amp-bind:

add_filter( 'amp_post_template_data', function( $data ) {

    if ( is_singular() || is_front_page() ) {

        if ( empty( $data['amp_component_scripts']['amp-bind'] ) ) {
            $data['amp_component_scripts']['amp-bind'] = 'https://cdn.ampproject.org/v0/amp-bind-0.1.js';
        }

    }

    return $data;

}, 21 );

Your <p> tag will appear without the [text] attribute but the <button> will still set the AMP State correctly (Although it will not be able to do much without amp-bind Bindings that use it).

@westonruter westonruter added this to the v0.7 milestone Jan 19, 2018
@westonruter
Member

I can confirm this is indeed a problem. Here's a standalone test case that shows the problem with parsing:

<?php
$dom = new DOMDocument();
$test_html = '<p id="test-stripping-amp-bind" [text]="foo">test</p>' . PHP_EOL;
$test_html .= "<button on=\"tap:AMP.setState({foo: 'amp-bind'})\">Set to amp-bind</button>\n";
$dom->loadHTML( sprintf( '<!doctype html><html amp><head><meta charset="utf-8"></head><body>%s</body></html>', PHP_EOL . $test_html ) );
echo $dom->saveHTML();

The result is this:

PHP Warning:  DOMDocument::loadHTML(): error parsing attribute name in Entity, line: 2 in /Users/westonruter/Projects/vvv/www/wordpress-develop/public_html/src/wp-content/plugins/amp/vendor/amphtml/standalone-amp-bind-test.php on line 5
PHP Stack trace:
PHP   1. {main}() standalone-amp-bind-test.php:0
PHP   2. DOMDocument->loadHTML() standalone-amp-bind-test.php:5

Warning: DOMDocument::loadHTML(): error parsing attribute name in Entity, line: 2 in ...amp/vendor/amphtml/standalone-amp-bind-test.php on line 5

Call Stack:
    0.0002     366128   1. {main}() ...standalone-amp-bind-test.php:0
    0.0007     366720   2. DOMDocument->loadHTML() ...standalone-amp-bind-test.php:5
<!DOCTYPE html>
<html amp><head><meta charset="utf-8"></head><body>
<p id="test-stripping-amp-bind">test</p>
<button on="tap:AMP.setState({foo: 'amp-bind'})">Set to amp-bind</button>
</body></html>

The binding attribute [text] does indeed get stripped out, and it causes an HTML warning.

The only thing I can think of is to do a global replacement of such attributes before parsing so that they take the form data-amp-bind-property, where “property” is the name of the binding. Then, after parsing, treat any such data-amp-bind-* attributes as if they were [*]. When serializing them back out, they should get saved back to the original bracketed form.
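A rough sketch of that replacement idea, to make the round trip concrete. The function names and the `AMP_BIND_PREFIX` constant are hypothetical and are not part of the plugin. The regexes are deliberately simple: they assume attribute values contain no `>` character, and they could also rewrite a `[name]=` sequence that appears inside another attribute's quoted value.

```php
<?php
// Hypothetical prefix, following the data-amp-bind-* naming proposed above.
const AMP_BIND_PREFIX = 'data-amp-bind-';

// Rewrite [prop]="..." to data-amp-bind-prop="..." inside start tags only,
// so bracketed text in element content is left alone.
function convert_bindings_to_data_attrs( $html ) {
    return preg_replace_callback(
        '#<[a-zA-Z][^>]*>#',
        function ( $tag ) {
            return preg_replace(
                '#(\s)\[([a-zA-Z0-9_\-]+)\](\s*=)#',
                '$1' . AMP_BIND_PREFIX . '$2$3',
                $tag[0]
            );
        },
        $html
    );
}

// Reverse step: turn data-amp-bind-prop="..." back into [prop]="...".
function restore_bindings( $html ) {
    return preg_replace_callback(
        '#<[a-zA-Z][^>]*>#',
        function ( $tag ) {
            return preg_replace(
                '#(\s)' . preg_quote( AMP_BIND_PREFIX, '#' ) . '([a-zA-Z0-9_\-]+)(\s*=)#',
                '$1[$2]$3',
                $tag[0]
            );
        },
        $html
    );
}

// Parse the converted markup with DOMDocument, serialize the body children,
// then restore the bracketed attribute names.
function roundtrip_through_dom( $html ) {
    $dom = new DOMDocument();
    $dom->loadHTML(
        '<!doctype html><html><head><meta charset="utf-8"></head><body>'
        . convert_bindings_to_data_attrs( $html )
        . '</body></html>'
    );

    $body = $dom->getElementsByTagName( 'body' )->item( 0 );
    $out  = '';
    foreach ( $body->childNodes as $node ) {
        $out .= $dom->saveHTML( $node );
    }

    return restore_bindings( $out );
}

echo roundtrip_through_dom( '<p id="test-stripping-amp-bind" [text]="foo">test</p>' ), PHP_EOL;
```

With this in place, the `[text]` binding from the test case should survive parsing, because DOMDocument only ever sees an ordinary `data-*` attribute. A real implementation would also have to decide what happens when a document already contains attributes that start with the chosen prefix.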

@jdalk

jdalk commented Oct 10, 2019

I have a similar problem:

I load HTML content via XHR through amp-script and a form into the div xhrcontent.
Normal HTML tags and also amp-img in the XHR content are shown, but things like amp-ad, amp-youtube, amp-facebook and others are sanitized away. :-/

<amp-script src="https://www.XXX.de/script.js">
  <div id="xhrcontent"></div>
  <form method="post" action-xhr="/form" target="_top" class="xhrform">
    &nbsp;
  </form>
</amp-script>

@westonruter
Member

@jdalk Please open a new issue and provide the code you are using to generate the AMP markup, and the full code you are using to fetch it.
