Make it possible for the parser to successfully parse any XML document that contains characters that are invalid according to the XML spec.
A document that contains a character entity reference that resolves to an invalid character according to the XML spec – for example
<doc></doc>
– is technically invalid according to the XML specification. However, replacing such characters with U+FFFD is better for the consuming application, because the application does not need to pre-process the document before parsing it. This takes previous work that already handled this for surrogate pairs and extends it to any invalid Unicode character.
Like the previous work, this technically makes the XML parser non-conformant, so it is put behind the existing `replace_unknown_entity_references` field on `ParserConfig`.
Previous context at:
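For illustration only, the replacement behaviour described in this PR could be sketched roughly as below. This is not the code from the change itself: `is_xml_char`, `resolve_char_reference`, and the `replace_invalid` parameter are hypothetical names, with `replace_invalid` standing in for the `replace_unknown_entity_references` setting. The valid ranges follow the `Char` production of XML 1.0.

```rust
/// Hypothetical helper: true if `cp` is a legal character per the XML 1.0
/// `Char` production (#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] |
/// [#x10000-#x10FFFF]).
fn is_xml_char(cp: u32) -> bool {
    matches!(
        cp,
        0x9 | 0xA | 0xD | 0x20..=0xD7FF | 0xE000..=0xFFFD | 0x10000..=0x10FFFF
    )
}

/// Hypothetical sketch of resolving a numeric character reference.
/// `replace_invalid` is an assumed stand-in for the parser option; when it
/// is set, invalid code points become U+FFFD instead of producing an error.
fn resolve_char_reference(cp: u32, replace_invalid: bool) -> Result<char, String> {
    if is_xml_char(cp) {
        // Every XML 1.0 `Char` is a Unicode scalar value (the ranges exclude
        // surrogates and stay within U+10FFFF), so this conversion succeeds.
        Ok(char::from_u32(cp).expect("XML Char is always a valid Rust char"))
    } else if replace_invalid {
        // Lenient, non-conformant mode: substitute the replacement character.
        Ok('\u{FFFD}')
    } else {
        // Strict, spec-conformant mode: report the invalid reference.
        Err(format!("invalid character reference: U+{:04X}", cp))
    }
}
```

Under these assumptions, a reference to code point 0 or to a lone surrogate such as 0xD800 yields U+FFFD when the option is enabled and an error when it is disabled, while valid references are unaffected by the option.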