-
Notifications
You must be signed in to change notification settings - Fork 5.2k
Fix ArgumentOutOfRangeException in XmlReader when parsing malformed UTF-8 sequences #118081
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Co-authored-by: jeffhandley <[email protected]>
src/libraries/System.Private.Xml/tests/XmlReader/Tests/ReaderEncodingTests.cs
Outdated
Show resolved
Hide resolved
|
@krwq What do you think of this fix approach. On the surface, it seems to be potentially a broader change than this specific scenario, which worries me. But at the same time, it does seem like a logical fix. |
| } | ||
| else if (bytesLeft < 0) | ||
| { | ||
| // This can happen when encoding switch causes bytePos to be calculated incorrectly for malformed data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to detect this situation earlier to avoid introducing invalid ParsingState in the first place?
It is hard to reason about what all else can misbehave due to invalid ParsingState.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree and that was my hesitation in this comment too. It was worth a shot seeing if Copilot could tackle this, but I'm going to close the PR and leave the issue open for further investigation later.
|
Closing as this fix approach is questionable. |
This PR fixes an issue where
XmlReader.Create(stream)throws an undocumentedArgumentOutOfRangeExceptioninstead of the expectedXmlExceptionwhen parsing malformed XML containing invalid UTF-8 sequences in the XML declaration.Problem
When parsing XML like
<?xml version="1.0\xbf"?>from aMemoryStream, the following exception was thrown:This is problematic because:
ArgumentOutOfRangeExceptionis not documented forXmlReadermethodsXmlExceptionRoot Cause
The issue occurs in
XmlTextReaderImpl.ReadData()when:UnDecodeChars()calculates_ps.bytePosusing_ps.encoding.GetByteCount()for malformed UTF-8 sequences_ps.bytePosbecomes greater than_ps.bytesUseddue to encoding issuesbytesLeft = _ps.bytesUsed - _ps.bytePosbecomes negative (-2)Buffer.BlockCopy(), causing the exceptionSolution
Added bounds checking in
XmlTextReaderImpl.ReadData()to detect whenbytesLeftis negative and throw an appropriateXmlExceptionwith the message "Invalid character in the given encoding" instead of allowing the negative value to reachBuffer.BlockCopy().The fix is minimal and surgical - it only adds validation where the problem occurs without changing broader parsing logic.
Testing
Added a regression test
ReadWithMalformedUtf8InXmlDeclaration()that verifies:ArgumentOutOfRangeExceptionXmlExceptionis thrown insteadFixes #113061.
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.