-
-
Notifications
You must be signed in to change notification settings - Fork 108
Fix JUnit reporter crash on invalid XML characters in test names and exceptions #4262
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…on method Co-authored-by: thomhurst <[email protected]>
…ssage Co-authored-by: thomhurst <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes a critical bug where the JUnit reporter crashes with ArgumentException when test arguments or exception messages contain invalid XML 1.0 characters (such as \x04), which was terminating the entire test run.
Key Changes:
- Added a
SanitizeForXmlmethod that validates characters against the XML 1.0 specification and replaces invalid characters with hex notation[0xNN] - Applied sanitization to all string outputs in the JUnit XML writer including test names, class names, exception messages, stack traces, and metadata
- Added integration tests to verify the fix works for various scenarios including multiple invalid characters and exception messages
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| TUnit.Engine/Xml/JUnitXmlWriter.cs | Implements SanitizeForXml method and applies it to all XML string outputs to prevent crashes from invalid XML characters |
| TUnit.TestProject/JUnitReporterInvalidXmlCharacterTests.cs | Adds integration tests covering invalid XML characters in test arguments and exception messages |
Comments suppressed due to low confidence (1)
TUnit.Engine/Xml/JUnitXmlWriter.cs:350
- These 'if' statements can be combined.
if (timing.GlobalTiming.StartTime is { } startTime)
{
if (earliestStartTime is null || startTime < earliestStartTime)
{
earliestStartTime = startTime;
}
}
| foreach (var ch in value!) | ||
| { | ||
| // Check if character is valid according to XML 1.0 spec | ||
| if (ch == 0x9 || ch == 0xA || ch == 0xD || | ||
| (ch >= 0x20 && ch <= 0xD7FF) || | ||
| (ch >= 0xE000 && ch <= 0xFFFD)) | ||
| { | ||
| builder.Append(ch); | ||
| } | ||
| else | ||
| { | ||
| // Replace invalid character with its hex representation | ||
| builder.Append($"[0x{((int)ch):X}]"); | ||
| } |
Copilot
AI
Jan 8, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The SanitizeForXml method does not correctly handle Unicode surrogate pairs. When iterating character-by-character, characters in the range U+10000 to U+10FFFF (which are valid per XML 1.0 spec) are represented as two char values (high and low surrogates). The current implementation will incorrectly process each surrogate individually, potentially replacing valid characters with hex notation. Consider using StringInfo or Rune enumeration to properly handle multi-char Unicode sequences.
| foreach (var ch in value!) | |
| { | |
| // Check if character is valid according to XML 1.0 spec | |
| if (ch == 0x9 || ch == 0xA || ch == 0xD || | |
| (ch >= 0x20 && ch <= 0xD7FF) || | |
| (ch >= 0xE000 && ch <= 0xFFFD)) | |
| { | |
| builder.Append(ch); | |
| } | |
| else | |
| { | |
| // Replace invalid character with its hex representation | |
| builder.Append($"[0x{((int)ch):X}]"); | |
| } | |
| var length = value!.Length; | |
| for (var i = 0; i < length; i++) | |
| { | |
| var ch = value[i]; | |
| int codePoint; | |
| var charsConsumed = 1; | |
| // Handle surrogate pairs to correctly process characters in the range U+10000 to U+10FFFF | |
| if (char.IsHighSurrogate(ch) && i + 1 < length && char.IsLowSurrogate(value[i + 1])) | |
| { | |
| codePoint = char.ConvertToUtf32(ch, value[i + 1]); | |
| charsConsumed = 2; | |
| } | |
| else | |
| { | |
| codePoint = ch; | |
| } | |
| // Check if character is valid according to XML 1.0 spec | |
| var isValid = | |
| codePoint == 0x9 || | |
| codePoint == 0xA || | |
| codePoint == 0xD || | |
| (codePoint >= 0x20 && codePoint <= 0xD7FF) || | |
| (codePoint >= 0xE000 && codePoint <= 0xFFFD) || | |
| (codePoint >= 0x10000 && codePoint <= 0x10FFFF); | |
| if (isValid) | |
| { | |
| if (charsConsumed == 1) | |
| { | |
| builder.Append((char)codePoint); | |
| } | |
| else | |
| { | |
| // Append the original surrogate pair | |
| builder.Append(ch); | |
| builder.Append(value[i + 1]); | |
| } | |
| } | |
| else | |
| { | |
| // Replace invalid character (or unpaired surrogate) with its hex representation | |
| builder.Append($"[0x{codePoint:X}]"); | |
| } | |
| if (charsConsumed == 2) | |
| { | |
| i++; | |
| } |
| var builder = new StringBuilder(value!.Length); | ||
| foreach (var ch in value!) |
Copilot
AI
Jan 8, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The redundant null-forgiving operators (!) on lines 26 and 27 are unnecessary since the check on line 20 already confirms the value is not null or empty. The value cannot be null at this point in the execution.
| var builder = new StringBuilder(value!.Length); | |
| foreach (var ch in value!) | |
| var builder = new StringBuilder(value.Length); | |
| foreach (var ch in value) |
|
|
||
| namespace TUnit.TestProject; | ||
|
|
||
| public class JUnitReporterInvalidXmlCharacterTests |
Copilot
AI
Jan 8, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test class is missing Category attributes. According to TUnit's testing approach, tests in TUnit.TestProject should be properly categorized (e.g., "Pass" or "Fail") to enable filtered test execution. The TestFailingWithInvalidCharacterInException test intentionally fails and should have a [Category("Fail")] attribute. The other tests that pass should have [Category("Pass")] attributes. This enables proper filtering when running the test suite.
| { | ||
| writer.WriteAttributeString("message", exception.Message); | ||
| writer.WriteAttributeString("message", SanitizeForXml(exception.Message)); | ||
| writer.WriteAttributeString("type", exception.GetType().FullName ?? "AssertionException"); |
Copilot
AI
Jan 8, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exception type names from exception.GetType().FullName are written to XML without sanitization. While standard .NET type names won't contain invalid XML characters, custom exception types could theoretically have non-ASCII or problematic characters in their namespace or type names. For consistency and defensive programming, these should also be sanitized.
| { | ||
| writer.WriteAttributeString("message", exception.Message); | ||
| writer.WriteAttributeString("message", SanitizeForXml(exception.Message)); | ||
| writer.WriteAttributeString("type", exception.GetType().FullName ?? "Exception"); |
Copilot
AI
Jan 8, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exception type names from exception.GetType().FullName are written to XML without sanitization. While standard .NET type names won't contain invalid XML characters, custom exception types could theoretically have non-ASCII or problematic characters in their namespace or type names. For consistency and defensive programming, these should also be sanitized.
| private static string SanitizeForXml(string? value) | ||
| { | ||
| if (string.IsNullOrEmpty(value)) | ||
| { | ||
| return string.Empty; | ||
| } | ||
|
|
||
| // At this point, value is guaranteed to not be null | ||
| var builder = new StringBuilder(value!.Length); | ||
| foreach (var ch in value!) | ||
| { | ||
| // Check if character is valid according to XML 1.0 spec | ||
| if (ch == 0x9 || ch == 0xA || ch == 0xD || | ||
| (ch >= 0x20 && ch <= 0xD7FF) || | ||
| (ch >= 0xE000 && ch <= 0xFFFD)) | ||
| { | ||
| builder.Append(ch); | ||
| } | ||
| else | ||
| { | ||
| // Replace invalid character with its hex representation | ||
| builder.Append($"[0x{((int)ch):X}]"); | ||
| } | ||
| } | ||
|
|
||
| return builder.ToString(); | ||
| } |
Copilot
AI
Jan 8, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The SanitizeForXml method lacks dedicated unit tests. While integration tests exist in TUnit.TestProject, there should be unit tests in TUnit.Engine.Tests that directly test the XML sanitization logic with various edge cases (null, empty, valid control characters, invalid characters, surrogate pairs, boundary conditions). This would ensure the sanitization function works correctly in isolation.
The JUnit reporter crashes with
ArgumentExceptionwhen test arguments or exception messages contain invalid XML 1.0 characters (e.g.,\x04), terminating the test run.Changes
SanitizeForXmlmethod to validate characters against XML 1.0 spec and replace invalid characters with hex notation[0xNN]Example
Before (crashes):
After (generates valid XML):
Exception messages are similarly sanitized:
Original prompt
This section details on the original issue you should resolve
<issue_title>[Bug]: JUnit reporter crashes if test arguments contains an "invalid character"</issue_title>
<issue_description>### Description
When running a test of the form
The test runner will crash with the following exception:
Error output: Unhandled exception in AppDomain: System.ArgumentException: '�', hexadecimal value 0x04, is an invalid character.Expected Behavior
The invalid character should be escaped in the resulting name / the runner should not crash.
Actual Behavior
The runner crashed when running with the JUnit reporter active. The tests behave fine when running from an IDE or without reporting.
Steps to Reproduce
dotnet test --test-modules <Path to test> --report-trx --results-directory <Path to results>TUnit Version
1.9.26
.NET Version
.NET 10
Operating System
Linux
IDE / Test Runner
dotnet CLI (dotnet test / dotnet run)
Error Output / Stack Trace
Add...
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.