-
Notifications
You must be signed in to change notification settings - Fork 385
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Incorporate AMP validator into post authorship flow, to alert if content within the post editor is not compatible with AMP #843
Comments
Request To Work On This Hi @ThierryA and @westonruter, If so, I'll write a simple solution design here so we're on the same page. Update: I'm now working on a sub-issue of #839. Thanks! |
@kienstra Please do. I see this highly related to #842, but probably a different mechanism. For content, I suppose it would involve hooking into the whitelist sanitizer to find elements that are dropped due to being illegal? If this could identify the specific paragraph in the content that has invalid content in it, that would be ideal. Naturally this would straightforward to accomplish with Gutenberg, and perhaps Gutenberg should be the focus for this here then. For #842 my thought was do do something at a higher level, when hooks are run, to run the validator on the entire page output rather than on just the content. I don't see these two as being mutually exclusive. One is focused on the content authorship—telling users what they can't author—and the other is focused on the site administration—telling users what plugins are doing illegally. |
In Development Hi @westonruter, |
Let's chat about the approach to take here. |
Possible Approach Hi @westonruter, Unfortunately, this only applies on editing a block. Not on saving it, as you mentioned. Here's a very rough prototype that you could paste in the console on a Gutenberg var isValidAMP,
previousWp = wp,
el = wp.element.createElement,
Notice = wp.components.Notice,
blockType = 'core/html',
htmlBlock = wp.blocks.unregisterBlockType( blockType ),
OriginalBlockEdit = htmlBlock.edit;
/**
* Whether markup is valid AMP.
*
* Use the AMP validator from the AMP CDN.
* And place the passed markup inside the <body> tag of a basic valid AMP page.
*
* @param string markup
* @returns {boolean} $valid Whether the passed markup is valid AMP.
*/
isValidAMP = function( markup ) {
var ampDocument = `<!doctype html><html ⚡><head><meta charset="utf-8"><link rel="canonical" href="./regular-html-version.html"><meta name="viewport" content="width=device-width,minimum-scale=1"><style amp-boilerplate>body{-webkit-animation:-amp-start 8s steps(1,end) 0s 1 normal both;-moz-animation:-amp-start 8s steps(1,end) 0s 1 normal both;-ms-animation:-amp-start 8s steps(1,end) 0s 1 normal both;animation:-amp-start 8s steps(1,end) 0s 1 normal both}@-webkit-keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}@-moz-keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}@-ms-keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}@-o-keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}@keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}</style><noscript><style amp-boilerplate>body{-webkit-animation:none;-moz-animation:none;-ms-animation:none;animation:none}</style></noscript><script async src="https://cdn.ampproject.org/v0.js"></script></head><body>${markup}</body></html>`,
validated = amp.validator.validateString( ampDocument );
return ( validated.hasOwnProperty( 'status' ) && 'PASS' === validated.status );
};
jQuery.getScript( 'https://cdn.ampproject.org/v0/validator.js', function() {
/*
* Workaround, not recommended in actual plugin.
* Replaces the wp value that the script above overwrote.
*/
wp = previousWp;
} );
/**
* Overwrites the 'Custom HTML' block's edit() function.
*
* Outputs the original component, in OriginalBlockEdit.
* If it's not valid AMP, it also outputs a Notice.
*/
htmlBlock.edit = function( props ) {
var content = props.attributes.hasOwnProperty( 'content' ) ? props.attributes.content : '';
result = [ el( OriginalBlockEdit, Object.assign( props, { key: 'original' } ) ) ];
if ( ! isValidAMP( content ) ) {
result.unshift( el(
Notice,
{
key: 'notice',
title: 'Notice',
status: 'warning',
content: 'This is not valid AMP',
}
) );
}
return result;
};
wp.blocks.registerBlockType( blockType, htmlBlock );
Issues:
This is based on the 'Adding a caption' section in this post. |
@kienstra that looks amazing! Having that client-side JS-based validation is perfect as it is realtime and it is block-based. You should absolutely keep going with this. I don't see this in any way conflicting with the validation of the frontend as a whole. When saving, we can send out a request to get the frontend HTML and then run it through the validator as you're doing at the block level. This will need a bit more
Why would it do this? That seems like a big flaw in the validator script. Apparently it's not encapsulating the minified function names which is causing |
Would you mind filing an issue on the AMP github tracker: |
In wp-admin/post.php, load a script. This overwrites the edit() function of registered blocks. The new function validates the block's content. And conditionally displays a notice. @todo: consider not using this for all blocks. Some blocks are valid once the back-end sanitizer alters them. Like the 'Audio' block. This simply uses the save() function of the block to get the markup.
AMP Validation In 'Classic' Editor Here's a screencast of validating a post in the 'classic' editor. On saving the post, the JavaScript file makes an AJAX request to the permalink. Then, it validates that entire HTML document for AMP compatibility, using But the back-end sanitizers work really well, and it's hard to enter content that won't be AMP-compatible by the time the page is rendered. Next steps include applying similar front-page validation to Gutenberg. |
On saving a post in post.php, make request to the permalink for the post. And validate that for AMP compatibility. This applies to the 'classic' editor, not AMP. But refactoring validatePage() would allow it to be used on Gutenberg.
Building upon Weston's work and solution design, Add a class to track whenever a node or attribute is removed. And a method to get whether a node was removed. The format of the stored nodes and attributes might change. This will probably depend on the error reporting needed in the REST API and GET request response.
There was an error: Class file names should be based on the class name with 'class-' But the format of the other test files is different. So use that format, and exclude this rule for test files.
The 'mutation_callback' will then track removed nodes and attributes. Also, change the way in which we pass the 'mutation_callback.' Before, it was part of the constructor of: AMP_Tag_And_Attribute_Sanitizer. Instead, move it to the $args of: AMP_Content_Sanitizer::sanitize(). This will pass it to all of the sanitizer/* files when they're instantiated. @todo: look at whether to call the callback for all node removals.
Before, there were 3 places in the file that called removeChild(). This was fine, but they now need to call the mutation callback. So abstract these into remove_child(). Also, call the mutation callback in AMP_Video_Sanitizer.
Per Weston's description in PR #912, It allows sending a POST with markup for validation. The headers should have 'Content-Type' of 'application/json.' And it should pass the markup in the param 'markup.' The current response only has 'is_error.' @todo: look at returning more in the response, like the stripped tags and attributes. Also, add nonce verification.
This is only one approach. But for now, the response has counts for: 'removed_nodes' and 'removed_attributes'. If a <script> is removed, 'removed_nodes' will be: {"script":1}. The count will increment every time the same node type is removed. There is a similar histogram for 'removed_attributes'.
Abstract the logic for the response into get_response(). This enables using it for the existing REST API logic, And the new use-case of full-page GET requests.
Before, this was on the 'save_post' action. But as Weston mentioned, this could be in one place, and this removes the need for a nonce. Also, remove the function is_authorized(). This checked for a nonce. Replace this with the existing has_cap().
Because 'wpautop' runs on 'the_content', process_markup() showed errors from removing <p> tags. For example, it created this markup: <p><script async src=https://example.com/script></script></p> It seemed to have removed the <p> tag, As it contained a disallowed element. But it's not needed to report that the <p> is removed. So remove 'wpautop' as a callback for 'the_content.' Gutenberg also does this, unless the post has no block. @see gutenberg_wpautop().
Instead of wrapping the 'invalid_callback' in this, simply exist process_markup(). If that callback isn't added, there's no need for the rest of the function.
Moving Into QA Hi @westonruter, Steps To Test
Known Issue |
This is due to the whitelist stanitizer removing empty parent elements when an invalid element is removed. I believe it is this logic here: At the very least this can be changed from |
@westonruter @kienstra what is the reason for the preprocessor remove the parent if the parent is valid, is the intention not leave empty HTML Tag? |
Removed Parent Element Hi @ThierryA, In that case, there's probably no reason to report its removal, as we don't know if it's invalid. @westonruter's suggestion works locally: $parent = $parent->parentNode;
if ( $parent ) {
- $this->remove_invalid_child( $node );
+ $parent->removeChild( $node );
} There's also a slightly-relate edge case where this doesn't report the removal of a parent if the child is removed. For example:
This only reports:
Making this change to replace_node_with_children() seems to address this: // Replace node with fragment.
$node->parentNode->replaceChild( $fragment, $node );
+ if ( isset( $this->args['remove_invalid_callback'] ) ) {
+ call_user_func( $this->args['remove_invalid_callback'], $node );
+ } |
@kienstra , not seeing any warning here |
verified in QA |
Request For Regression Testing Hi @csossi, Please create a post using the classic editor. Here are the test steps. |
Acceptance Criteria
AC1: On saving a post on /wp-admin/post.php, if there’s post content that’s not compatible with AMP, display an error at the top of the page.
AC2: The error will not display specifically what is incompatible.
AC3: The error will display the following text: “Notice: your post fails AMP validation”.
The text was updated successfully, but these errors were encountered: