@maxammann
Contributor

@maxammann maxammann commented Oct 23, 2020

This PR improves how libxml errors are handled. Because there can be many duplicate errors, I decided to group similar errors and report how often each one occurred. The log output for multiple failures looks like this:

{"reqId":"EYcnDGGmhWZ8kXC4Te7E","level":2,"time":"2020-10-25T18:02:51+00:00","remoteAddr":"2a02:810d:640:1e52:3f8c:68c2:b41d:b462","user":"max","app":"no app in context","method":"POST","url":"/index.php/apps/cookbook/import","message":"libxml: Error 68 occurred 907 times while parsing https://www.chefkoch.de/rezepte/18711004787789/Risotto-mit-gruenem-Spargel-und-Parmesan.html. Last time in line 7728 and column 119: htmlParseEntityRef: no name\n","userAgent":"Mozilla/5.0 (X11; Linux x86_64; rv:81.0) Gecko/20100101 Firefox/81.0","version":"19.0.4.2"}
{"reqId":"EYcnDGGmhWZ8kXC4Te7E","level":2,"time":"2020-10-25T18:02:51+00:00","remoteAddr":"2a02:810d:640:1e52:3f8c:68c2:b41d:b462","user":"max","app":"no app in context","method":"POST","url":"/index.php/apps/cookbook/import","message":"libxml: Error 801 occurred 329 times while parsing https://www.chefkoch.de/rezepte/18711004787789/Risotto-mit-gruenem-Spargel-und-Parmesan.html. Last time in line 10225 and column 50: Tag amp-install-serviceworker invalid\n","userAgent":"Mozilla/5.0 (X11; Linux x86_64; rv:81.0) Gecko/20100101 Firefox/81.0","version":"19.0.4.2"}
{"reqId":"EYcnDGGmhWZ8kXC4Te7E","level":2,"time":"2020-10-25T18:02:51+00:00","remoteAddr":"2a02:810d:640:1e52:3f8c:68c2:b41d:b462","user":"max","app":"no app in context","method":"POST","url":"/index.php/apps/cookbook/import","message":"libxml: Error 42 occurred 3 times while parsing https://www.chefkoch.de/rezepte/18711004787789/Risotto-mit-gruenem-Spargel-und-Parmesan.html. Last time in line 5041 and column 109: Attribute parentid redefined\n","userAgent":"Mozilla/5.0 (X11; Linux x86_64; rv:81.0) Gecko/20100101 Firefox/81.0","version":"19.0.4.2"}
{"reqId":"EYcnDGGmhWZ8kXC4Te7E","level":2,"time":"2020-10-25T18:02:51+00:00","remoteAddr":"2a02:810d:640:1e52:3f8c:68c2:b41d:b462","user":"max","app":"no app in context","method":"POST","url":"/index.php/apps/cookbook/import","message":"libxml: Error 23 occurred 333 times while parsing https://www.chefkoch.de/rezepte/18711004787789/Risotto-mit-gruenem-Spargel-und-Parmesan.html. Last time in line 10210 and column 96: htmlParseEntityRef: expecting ';'\n","userAgent":"Mozilla/5.0 (X11; Linux x86_64; rv:81.0) Gecko/20100101 Firefox/81.0","version":"19.0.4.2"}
{"reqId":"EYcnDGGmhWZ8kXC4Te7E","level":2,"time":"2020-10-25T18:02:51+00:00","remoteAddr":"2a02:810d:640:1e52:3f8c:68c2:b41d:b462","user":"max","app":"no app in context","method":"POST","url":"/index.php/apps/cookbook/import","message":"libxml: Error 76 occurred 10 times while parsing https://www.chefkoch.de/rezepte/18711004787789/Risotto-mit-gruenem-Spargel-und-Parmesan.html. Last time in line 6509 and column 7: Unexpected end tag : div\n","userAgent":"Mozilla/5.0 (X11; Linux x86_64; rv:81.0) Gecko/20100101 Firefox/81.0","version":"19.0.4.2"}

The problem with the previous handling is that Nextcloud always logs the parameters of the function in which a libxml error occurred. This means that in the above example the $html is logged a few hundred times.
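
As a rough illustration of the grouping idea described above, here is a minimal Python sketch. It is not the PHP code from this PR: `ParseError` and `summarize_errors` are hypothetical names, and the error records are assumed to carry a code, line, column and message, as the log output suggests. Errors are grouped by code, counted, and only the last occurrence of each code is kept for the message.

```python
from dataclasses import dataclass


@dataclass
class ParseError:
    # Hypothetical record modeled on the fields visible in the log output.
    code: int
    line: int
    column: int
    message: str


def summarize_errors(errors, url):
    """Collapse repeated parser errors into one log message per error code."""
    grouped = {}  # code -> (count, last occurrence); keeps first-seen order
    for err in errors:
        count, _ = grouped.get(err.code, (0, None))
        grouped[err.code] = (count + 1, err)

    return [
        f"libxml: Error {code} occurred {count} times while parsing {url}. "
        f"Last time in line {last.line} and column {last.column}: {last.message}"
        for code, (count, last) in grouped.items()
    ]


if __name__ == "__main__":
    sample = [
        ParseError(68, 10, 5, "htmlParseEntityRef: no name"),
        ParseError(76, 20, 7, "Unexpected end tag : div"),
        ParseError(68, 30, 119, "htmlParseEntityRef: no name"),
    ]
    for message in summarize_errors(sample, "https://example.org/recipe.html"):
        print(message)
```

With this approach the number of log entries is bounded by the number of distinct error codes rather than by the number of errors, so the function parameters are logged far less often.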

@christianlupus
Collaborator

First of all: Thank you for contributing to the app!

For the record: This should solve #344.

Two comments on your contribution, @maxammann:

  1. You should sign off your commit. Depending on your local environment, this is done with git commit -s. As only one commit has been made, you might be able to run git commit --amend -s && git push --force-with-lease. If you are using a GUI or your IDE, you will have to look up how to do this there.
  2. Just a hint for future work: You had many changes to empty lines. In fact, you removed the indentation of these lines. Of course, this does not change the functionality, but it might cause merge conflicts with other people's work. If your IDE is doing this "behind your back", you might want to add only the relevant lines to your commits to keep the history of the git repository cleaner.

Thank you very much.

@christianlupus christianlupus linked an issue Oct 24, 2020 that may be closed by this pull request
Signed-off-by: Maximilian Ammann <[email protected]>
@maxammann
Contributor Author

@christianlupus thanks for the heads up :) The changes to empty lines have been reverted and I signed that commit off. There's still some testing I need to do.

Signed-off-by: Maximilian Ammann <[email protected]>
@maxammann maxammann marked this pull request as ready for review October 25, 2020 18:06
@christianlupus
Collaborator

All in all, this looks good to me. I will merge it, and in case any issues arise, we can fix them then.

@christianlupus christianlupus merged commit a9aa1b8 into nextcloud:master Oct 26, 2020
@maxammann maxammann deleted the habdle-libxml-errors branch October 28, 2020 09:00