Tidy does not move <style>-Tag to head #567
Here a simple test file:
|
@suchafreak thank you for the issue... I know very little about jTidy, but, for sure, current 5.5.31 Tidy refuses to move Current tidy still has the message Going back in Tidy versions, my earliest, But the fix was gone by our first release 5.0.0, circa 2015... the first release since 2009... but that is all just history... It will take more digging into the code, to find out exactly when and why this was changed, and hopefully change it back, unless, as stated, there is some valid reason for this change... Will look deeper into this soonest, and report... and would appreciate any further feedback... thanks |
Thank you for your answer. As of now, I consider this to be a bug as well. The W3C validator reports an error for each Please let me know if I can be of any assistance. (Unfortunately, I am not versed too well in C, as I had to learn the hard way trying to access libtidy via JNA from Java, to no avail.) Thanks for your hard work! |
@suchafreak, ok, found the exact line that causes this... It changed from -
to -
But, at some point And even if that And thus never gets to the code, which does exactly what it says -
Would be relatively easy to add code like -
Full patch -
diff --git a/src/parser.c b/src/parser.c
index a037280..41ff2d0 100644
--- a/src/parser.c
+++ b/src/parser.c
@@ -4121,6 +4121,14 @@ void TY_(ParseBody)(TidyDocImpl* doc, Node *body, GetTokenMode mode)
             }
         }
 
+        /* Issue #567 - <style> tags found in this <body> parsing
+           should be moved to the <head>, always... */
+        if (nodeIsSTYLE(node))
+        {
+            MoveToHead(doc, body, node);
+            continue;
+        }
+
         if (node->type == EndTag)
         {
             if ( nodeIsBR(node) )

Is there any reason this would not be true? Need to give this a little more thought... testing and feedback very welcome... thanks... |
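To make the idea behind the patch concrete, here is a minimal sketch on a toy tree. It is not libtidy code: `ToyNode`, `toy_append`, `toy_detach` and `toy_move_direct_styles` are hypothetical names invented for illustration, and the real `Node` and `MoveToHead` in `src/parser.c` are more involved. The sketch assumes the check only ever sees the direct children of the body, so a `<style>` nested inside another element stays where it is.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy node (NOT libtidy's Node): parent, first child, next sibling. */
typedef struct ToyNode {
    const char *tag;
    struct ToyNode *parent;
    struct ToyNode *first;
    struct ToyNode *next;
} ToyNode;

/* Append child as the last child of parent. */
void toy_append(ToyNode *parent, ToyNode *child)
{
    ToyNode **slot;
    assert(parent != NULL && child != NULL);
    child->parent = parent;
    child->next = NULL;
    for (slot = &parent->first; *slot != NULL; slot = &(*slot)->next)
        ;
    *slot = child;
}

/* Unlink child from its parent's child list. */
void toy_detach(ToyNode *child)
{
    ToyNode **slot;
    assert(child != NULL && child->parent != NULL);
    for (slot = &child->parent->first; *slot != NULL; slot = &(*slot)->next) {
        if (*slot == child) {
            *slot = child->next;
            break;
        }
    }
    child->parent = NULL;
    child->next = NULL;
}

/* Move every <style> that is a DIRECT child of body to the end of head.
   Returns the number of nodes moved; nested <style> nodes are not visited. */
int toy_move_direct_styles(ToyNode *head, ToyNode *body)
{
    int moved = 0;
    ToyNode *node = body->first;
    while (node != NULL) {
        ToyNode *next = node->next;  /* save before unlinking */
        if (strcmp(node->tag, "style") == 0) {
            toy_detach(node);
            toy_append(head, node);
            moved++;
        }
        node = next;
    }
    return moved;
}
```

With a body holding one `<style>` directly and another inside a `<div>`, this sketch moves only the first one, which is the limitation to keep in mind when testing such a patch.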
The only reason I see for this check is that at some point there were plans to support scoped style elements in HTML5. But as far as I can tell, this no longer seems to be the case, see: http://w3c.github.io/html/document-metadata.html#the-style-element No current browser supports this anyway. |
@suchafreak thanks for the links and feedback... wow, there has certainly been a lot said about allowing And while few browsers, other than Firefox, support the One can read the clear comments from @sideshowbarker, who worked on tidy a while back, that he added the error to W3C So if the aim of Tidy is to produce valid html, then I too think it should move it to the No, the addition of As mentioned, the moving of the So removing that bit would be one solution, but could have other consequences... although none seen in the 235 But on balance, it seems the proposed patch is a clean, very specific, and clear solution, once it is agreed Tidy should issue this warning, and make this document fix... Will ponder this some more... as always feedback very welcome... thanks.. |
@suchafreak to facilitate easy testing have pushed the changes to an Have also pushed some sample |
Hi Thanks a lot for the prompt response. Is there any way to get hold of a binary of this fixed version for Windows? Unfortunately, I lack the necessary build environment to build tidy.exe from the sources. If not, what is the estimated time frame for the release of this fix? MTIA |
@suchafreak well I do like working on bugs ;=)) It's my kind of fun...
Well, we have no present place to I could copy one to a temporary place, but you would probably still need some other But just because it is easy, I have copied a zip - http://geoffair.org/tmp/tidy-5.5.31.I567-w64-vc14-md.zip - but this assumes a Windows 64-bits, and you may have to install the appropriate vc_redist.x64, if not already done...
But this is so easy to set up... you just need 3 things
This is documented in our BUILD.md... but repeated here, once the above 3 tools are installed... Start in
This may default to a 32-bit build, but if in a 64-bit system, change 4. to That does not seem too hard... Why not try it? It is fun! Look forward to further feedback... thanks... |
Hi I tested the patched version with this input:
The good: The style-Tag directly within body gets moved to the head correctly. This is the output:
Hope this helps to nail down the problem further. Please let me know if I can be of any assistance. |
@suchafreak yup, the patch I proposed will only deal with moving the But when the token So if we want to catch any and every Then tidy runs Here the complete node tree can be processed again, node by node... This is where, for example, inline Reading So as indicated, to move all Will give this some thought, and cycle back to it as time permits, but if you, or others, have some ideas, then feedback, patches, PR very welcome... thanks... Glad to hear you were able to clone and build tidy. The next step would be to fork Then you can make code changes, usually in a branch, push these to your fork, and present a PR... the process is simple... after you have forked the repo -
I usually leave step 4. until after 8., and load the MSVCxx IDE, and make code changes using its very helpful IDE editor... Then as in 9. you can build and test in the IDE... Advise if you need any help with this... thanks... |
Thank you for your reply. Regards, |
@suchafreak ok, took another look at this, but not pushed anything yet... but as before have copied an experimental build to http://geoffair.org/tmp/tidy-5.5.31.I567-2-w64-vc14-md.zip Note it is experimental, and have not yet tested all cases... It would really help if you could find a sample where it fails... And if I get the chance will push this to an issue-567-2 But now two things trouble me -
If I load my in_567-3.html test file in a browser that supports So there would need to be an option like Of course, it seems Chrome and IE, and maybe others, show both words in green... And at the moment, this is a silent fix! I think I would want to know, at least as an SOOO, really seek further feedback on this... thanks... |
Hi @geoffmcl Thank you for your work. We have a large number of pages we can use to test your changes, some of them contain rather "nasty" markup. I will get back to you as soon as I have results. As far as I understand it, the support for scoped style definitions has been dropped from HTML5, probably because web components offer better isolation of CSS. So, from where I stand, the only correct behavior is to move any style tag found in the body to the head. A relaxed implementation could leave scoped styles untouched until Mozilla drops style support, see: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/style:
Hope this helps. I will report my findings asap. |
Hi Unfortunately, the experimental version found at http://geoffair.org/tmp/tidy-5.5.31.I567-2-w64-vc14-md.zip does not move any style tags to the head anymore. What am I missing? |
@suchafreak thanks for the feedback and links... I understand the And if I was into browser development I too would be against it, since you would have to attach rendering attributes to just about every level of the DOM (like) tree... rather ugly and quite difficult... aside from consuming more memory, to hold rendering attributes at every context level... YIKES!!! But to play it safe, I have proceeded to implement a It seems to work on the five samples I have created in my test repo, see in_567*.html... maybe you can confirm the above So it is my turn to say What am I missing? ;=)) Can you provide at least one little sample, hopefully the smallest, where it fails... please try to trim it down to the basics... Maybe I need to do more to fully traverse the tidy DOM like tree... I guess what I have now will miss if the Am sure I can fix that by fully iterating every node in the So, like I say, still considering an But the first thing is to get it working fully, and hope you can help with a small sample that fails... thanks... |
Hi @geoffmcl Sorry for not getting back to you sooner. The experimental version found at http://geoffair.org/tmp/tidy-5.5.31.I567-2-w64-vc14-md.zip does not move any style tags to the head anymore, not even in this simple case:
Cheers, |
@suchafreak really do not understand why http://geoffair.org/tmp/tidy-5.5.31.I567-2-w64-vc14-md.zip works fine for me. Given your above input, I get the following output -
That is the But as stated, I do recognize there may be cases where the fix in Current tidy 5.5.31 will already fix the following sample -
<table>
<tr>
<td>
<style type="text/css">
p { color: red }
</style>
<p>red text</p>
With the output:
line 1 column 1 - Warning: missing <!DOCTYPE> declaration
line 1 column 1 - Warning: inserting implicit <body>
line 4 column 1 - Warning: <style> isn't allowed in <td> elements
line 3 column 1 - Info: <td> previously mentioned
line 1 column 1 - Warning: missing </table>
line 1 column 1 - Warning: inserting missing 'title' element
Info: Document content looks like HTML5
Tidy found 5 warnings and 0 errors!
<!DOCTYPE html>
<html>
<head>
<meta name="generator" content=
"HTML Tidy for HTML5 for Windows version 5.5.31">
<style type="text/css">
p { color: red }
</style>
<title></title>
</head>
<body>
<table>
<tr>
<td>
<p>red text</p>
</td>
</tr>
</table>
</body>
</html>
So am still searching for an example that breaks my current patch... so I can work on it further... But still really puzzled why version Anyway, will work on this as time permits... thanks...
PS: Found a test case that FAILED! in_567-7.html, and did some more fixes... Still testing, but copied a test http://geoffair.org/tmp/tidy-5.5.31.I567-3-w64-vc14-md.zip... hope you get the chance to test... hopefully reporting success... thanks... |
Hi @geoffmcl I will test the new version. Maybe the problem lies in the settings we use:
I will get back to you asap. |
Good news: the new version works 👍
Next, I will test more tricky scenarios and report my findings here asap. |
Me again: Ran some more tests on files that contain style tags in divs nested up to five levels deep. Tidy moved all the encountered style tags to the head. @geoffmcl : Should we also test files containing even deeper nested structures? Many thanks! |
@suchafreak thank you for testing the version 3, I had tested with your given See tidyXmlTags, where tidy will return from Reading around it seems But many
I do not think this is necessary. The way I have re-arranged to fix, it iterates through every tag in the I will shortly get around to pushing my latest changes to the But in general, starting to feel good about this fix... thanks... |
Add option TidyStyleTags, --fix-style-tags, Bool, to turn off this action.
Add warning messages MOVED_STYLE_TO_HEAD, and FOUND_STYLE_IN_BODY.
Fully iterate ALL nodes in the body, in search of style tags...
Changes to be committed:
    modified: include/tidyenum.h
    modified: src/clean.c
    modified: src/config.c
    modified: src/language_en.h
    modified: src/message.c
All changes pushed to |
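The commit message above describes a full iteration over all nodes in the body, with a Bool option to turn the action off. A minimal sketch of that shape on a toy tree might look as follows. Everything here is an assumption made for illustration: `ToyNode` and the `toy_*` functions are hypothetical and are not the code in `src/clean.c`, and the `enabled` parameter merely stands in for the `TidyStyleTags` option named in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy node (NOT libtidy's Node): parent, first child, next sibling. */
typedef struct ToyNode {
    const char *tag;
    struct ToyNode *parent;
    struct ToyNode *first;
    struct ToyNode *next;
} ToyNode;

/* Append child as the last child of parent. */
void toy_append(ToyNode *parent, ToyNode *child)
{
    ToyNode **slot;
    assert(parent != NULL && child != NULL);
    child->parent = parent;
    child->next = NULL;
    for (slot = &parent->first; *slot != NULL; slot = &(*slot)->next)
        ;
    *slot = child;
}

/* Unlink node from its current parent, then append it to new_parent. */
void toy_move_to(ToyNode *new_parent, ToyNode *node)
{
    ToyNode **slot;
    assert(node != NULL && node->parent != NULL);
    for (slot = &node->parent->first; *slot != NULL; slot = &(*slot)->next) {
        if (*slot == node) {
            *slot = node->next;
            break;
        }
    }
    toy_append(new_parent, node);
}

/* Visit every descendant of parent in document order and move each
   <style> found, at any depth, to the end of head. Returns the count. */
int toy_collect_styles(ToyNode *head, ToyNode *parent)
{
    int moved = 0;
    ToyNode *node = parent->first;
    while (node != NULL) {
        ToyNode *next = node->next;  /* save before the node is unlinked */
        if (strcmp(node->tag, "style") == 0) {
            toy_move_to(head, node);
            moved++;
        } else {
            moved += toy_collect_styles(head, node);
        }
        node = next;
    }
    return moved;
}

/* Entry point: does nothing when the (stand-in) option is off. */
int toy_fix_style_tags(ToyNode *head, ToyNode *body, int enabled)
{
    if (!enabled)
        return 0;
    return toy_collect_styles(head, body);
}
```

Saving the next sibling before unlinking keeps the walk valid while the tree is being modified, and because the walk starts at the body, nodes already appended to the head are never visited again.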
All tests on our side show success! This is very good news. Many thanks for your work. |
If it helps cement this decision any, it's not even "could be" anymore: Firefox devs announced plans a month ago to unship |
We just recently switched from old jTidy to tidy-html5 in our Java application. We rely on tidying to correct user provided html that can be quite messy.
Now, we hit a problem regarding style tags in the body: jTidy and the online tidy at https://infohound.net/tidy/ both move style tags found anywhere in the body to the head correctly. This is exactly what we want!
Unfortunately, we cannot seem to get tidy-html5 to do the same: Tidying reports no errors even though there are several style tags nested in divs like so:
Unfortunately, they are not moved to the head as I would expect. (Again, jTidy and the online version at https://infohound.net/tidy/ do this).
Has this behavior changed or am I doing something wrong?
Any help is very much appreciated!
MTIA
-sascha