Problems formatting HTML #147

Closed
hugoqribeiro opened this issue Mar 12, 2021 · 5 comments

@hugoqribeiro

Hi.

I have the following code to format a string containing HTML:

using System;
using System.Collections.Generic;
using System.IO;
using SmartFormat;
using SmartFormat.Core.Settings;

// Values to substitute for the placeholders in the HTML
Dictionary<string, object> parameters = new Dictionary<string, object>()
{
    ["IDSBrand"] = "MyBrand",
    ["IDSColor1"] = "MyColor",
    ["IDSSalutation"] = "MySalutation",
    ["IDSBody"] = "MyBody",
    ["IDSActionLink"] = "MyActionLink",
    ["IDSAction"] = "MyAction",
    ["IDSTimestampDescription"] = "MyTimestampDescription",
    ["IDSCopyrightDescription"] = "MyCopyrightDescription",
    ["NSTimestampEN"] = "MyTimestamp"
};

string inputFile = File.ReadAllText(Path.Combine(Environment.CurrentDirectory, "Files\\InputHtml.txt"));

// Keep unknown or malformed tokens in the output instead of throwing
SmartFormatter formatter = Smart.CreateDefaultSmartFormat();
formatter.Settings.CaseSensitivity = CaseSensitivityType.CaseInsensitive;
formatter.Settings.ParseErrorAction = ErrorAction.MaintainTokens;
formatter.Settings.FormatErrorAction = ErrorAction.MaintainTokens;
formatter.Settings.ConvertCharacterStringLiterals = false;

string outputFile = formatter.Format(inputFile, parameters);

File.WriteAllText(Path.Combine(Environment.CurrentDirectory, "Files\\OutputHtml262.txt"), outputFile);

If I run this with version 2.6.2 the formatter does nothing (no replacements).

If I run this with version 2.5.2 the formatter works more or less correctly (does the replacements but it introduces some errors in the HTML).

This is visible with the following files:

InputHtml.txt - the input HTML
OutputHtml262.txt - the output produced with 2.6.2
OutputHtml252.txt - the output produced with 2.5.2
OutputHtml.txt - the expected output

Comparing 2.6.2 with the expected output... no replacements at all.
Comparing 2.5.2 with the expected output... replacements OK but errors in HTML (e.g. line 68, col 9).

Is this a bug? Something that I need to change in the upgrade from 2.5.2 to 2.6.2?

@axunonb
Member

axunonb commented Mar 12, 2021

Parsing HTML works fine except, as in your case, when it includes CSS styles or JavaScript. Both use curly braces in their syntax, which Smart.Format interprets as { placeholders }. This has not worked with any Smart.Format version.
You can do one of the following:

  1. Replace the CSS style section of your HTML with a placeholder for Smart.Format (e.g. { CssPlaceholder }) and fill it with the content of a variable.
  2. Load the CSS styles and JavaScript from separate files (the simplest option).
  3. Use AngleSharp to exclude style and script tags (e.g. apply Smart.Format only to the body tag).
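
Option 1 can also be applied to HTML that arrives as user input. The following is only a rough sketch using the .NET standard library, not part of Smart.Format: the `RawBlockMasker` class and the `RawBlock0`, `RawBlock1`, ... key names are hypothetical, and it assumes no user-supplied key collides with them. It moves each whole `<style>` or `<script>` element into the data dictionary and leaves a placeholder in its place. A regular expression is not a full HTML parser, so unusual markup may slip through.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RawBlockMasker
{
    // Matches whole <style>...</style> and <script>...</script> elements,
    // across line breaks and regardless of tag casing.
    private static readonly Regex RawBlock = new Regex(
        @"<(style|script)\b[^>]*>.*?</\1\s*>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Replaces each matched element with a placeholder such as {RawBlock0}
    // and stores the original element text in 'data' under that key, so a
    // later formatting pass puts it back unchanged.
    public static string Mask(string html, IDictionary<string, object> data)
    {
        int index = 0;
        return RawBlock.Replace(html, match =>
        {
            string key = "RawBlock" + index++;
            data[key] = match.Value;
            return "{" + key + "}";
        });
    }
}
```

With the code from the first post, the call would then look like `formatter.Format(RawBlockMasker.Mask(inputFile, parameters), parameters)`. Curly braces elsewhere in the document, for example in inline event handlers, are not covered by this sketch.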

@hugoqribeiro
Author

The HTML is user input, including the variables to replace. It makes it hard to implement your suggestions. :)

Did you see the difference between version 2.5.2 and 2.6.2?
One would expect at least the same behavior, given that I'm using MaintainTokens.

@axunonb
Member

axunonb commented Mar 13, 2021

If it is all user input, it takes only a few lines of code to solve this with AngleSharp:

using AngleSharp;
using AngleSharp.Html.Dom;
using AngleSharp.Html.Parser;
using SmartFormat;

// Create a new parser front-end (can be re-used)
var parser = new HtmlParser();
// Get the DOM representation
IHtmlDocument htmlDocument = parser.ParseDocument(yourUserHtmlInput);
// Format only the body, so style and script content in the head is left untouched
htmlDocument.Body.InnerHtml = Smart.Format(htmlDocument.Body.InnerHtml, yourDataItems);
// This gets the complete HTML as a string
var result = htmlDocument.ToHtml();

Hope this helps.

I did not analyze the differences between the versions, because HTML containing CSS styles and/or JavaScript cannot be processed reliably with Smart.Format: both use curly braces in their syntax.

@hugoqribeiro
Author

Thank you @axunonb

@axunonb
Member

axunonb commented Mar 14, 2021

You're welcome.

@axunonb axunonb closed this as completed Mar 14, 2021
axunonb added a commit that referenced this issue Apr 2, 2021
axunonb added a commit that referenced this issue Apr 2, 2021
axunonb added a commit that referenced this issue Apr 9, 2021
* Reference to issues #148, #147, #143

* Reference to issues #148, #147, #143

* Updated README.md

* Fix for #149 (comment)
axunonb added a commit that referenced this issue Apr 10, 2021
* Reference to issues #148, #147, #143
* Updated README.md
* Fix for #149 (comment)
* Update CHANGES.md