Robustness to control chars #205

Open
bittlingmayer opened this issue Jun 27, 2020 · 0 comments
Certain input causes the error "not well-formed (invalid token)".

(That's the value of e in parser.on('error', e...).)

If we run the xmllint command-line tool on the same input, the error message is more revealing:

parser error : PCDATA invalid Char value 8

ASCII char 8 is of course a control char, Backspace.

Is this expected? Or is there an option to let the parser handle or ignore such segments?

Right now a large file can fail cryptically just because one or two segments in a million contain this character, which, like any ASCII character, is not especially exotic.
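For context, XML 1.0 does not allow most C0 control characters in a document: of the code points below U+0020, only tab (U+0009), line feed (U+000A), and carriage return (U+000D) are legal, so a strict parser rejecting Backspace is consistent with the spec. A possible workaround, sketched below, is to strip those bytes before the data reaches the parser. This is not an option of the library. The names `stripForbiddenXmlBytes` and `createXmlSanitizer` are hypothetical, and the sketch assumes the input is UTF-8 (or another ASCII-compatible encoding), where bytes below 0x20 never occur inside a multi-byte character.

```javascript
const { Transform } = require('stream');

// True for bytes below 0x20 that XML 1.0 forbids, i.e. every C0 control
// except tab (0x09), line feed (0x0A), and carriage return (0x0D).
function isForbiddenXmlByte(b) {
  return b < 0x20 && b !== 0x09 && b !== 0x0a && b !== 0x0d;
}

// Hypothetical helper: return a copy of `buf` without the forbidden bytes.
// Assumes UTF-8 input, so filtering byte by byte cannot split a character.
function stripForbiddenXmlBytes(buf) {
  return Buffer.from(buf.filter((b) => !isForbiddenXmlByte(b)));
}

// Hypothetical helper: a Transform stream that cleans each chunk, so a
// large file can be sanitized without loading it into memory.
function createXmlSanitizer() {
  return new Transform({
    transform(chunk, _encoding, callback) {
      callback(null, stripForbiddenXmlBytes(chunk));
    },
  });
}
```

The sanitizer could sit between the file read stream and whatever stream the parser consumes, for example `fs.createReadStream(path).pipe(createXmlSanitizer())`. Note that this silently drops the characters, so it only suits cases where losing them is acceptable.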
