Updated the Sanitize method in LogHelper.cs to use SearchValues<char> for improved performance on .NET 8+, while maintaining full backward compatibility with older frameworks.
Changes Made
Added conditional using System.Buffers directive for .NET 8+
Created static SearchValues<char> field with all 108 control and format characters
Implemented efficient SearchValues-based sanitization for .NET 8+
Maintained original implementation as fallback for older frameworks
Preserved identical output format (\r, \n, \t, \uXXXX)
Added 15 comprehensive tests covering all character types and edge cases
Fixed encoding issues on lines 462 and 482 (preserved original file encoding)
Added targeted benchmark for log sanitization performance
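The pattern described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from LogHelper.cs: the class name SanitizeSketch, the helper names, and the shortened character list are all assumptions made for the example, and the fallback branch is a simple stand-in for the original implementation that the PR retains.

```csharp
using System;
using System.Text;
#if NET8_0_OR_GREATER
using System.Buffers;
#endif

// Hypothetical stand-in for LogHelper; every name here is illustrative.
public static class SanitizeSketch
{
    // Abbreviated list for the sketch; the PR hardcodes all 108 characters.
    private const string UnsafeChars =
        "\r\n\t\u0000\u001B\u007F\u0085\u200B\u200E\u200F\u202E\uFEFF";

#if NET8_0_OR_GREATER
    // .NET 8+: SearchValues precomputes a lookup optimized for IndexOfAny.
    private static readonly SearchValues<char> s_unsafe = SearchValues.Create(UnsafeChars);
#else
    // Older frameworks: plain char[] lookup (assumed fallback, not the PR's).
    private static readonly char[] s_unsafe = UnsafeChars.ToCharArray();
#endif

    // Returns the index of the next character to escape at or after 'start', or -1.
    private static int IndexOfUnsafe(string value, int start)
    {
#if NET8_0_OR_GREATER
        int i = value.AsSpan(start).IndexOfAny(s_unsafe);
        return i < 0 ? -1 : start + i;
#else
        return value.IndexOfAny(s_unsafe, start);
#endif
    }

    public static string Sanitize(string value)
    {
        if (string.IsNullOrEmpty(value))
            return value;

        int hit = IndexOfUnsafe(value, 0);
        if (hit < 0)
            return value; // fast path: nothing to escape, no allocation

        var sb = new StringBuilder(value.Length + 16);
        int copied = 0;
        while (hit >= 0)
        {
            sb.Append(value, copied, hit - copied); // copy the clean run
            AppendEscaped(sb, value[hit]);
            copied = hit + 1;
            hit = copied < value.Length ? IndexOfUnsafe(value, copied) : -1;
        }
        sb.Append(value, copied, value.Length - copied); // trailing clean run
        return sb.ToString();
    }

    // Output format from the PR description: \r, \n, \t, otherwise \uXXXX.
    private static void AppendEscaped(StringBuilder sb, char c)
    {
        switch (c)
        {
            case '\r': sb.Append("\\r"); break;
            case '\n': sb.Append("\\n"); break;
            case '\t': sb.Append("\\t"); break;
            default: sb.Append("\\u").Append(((int)c).ToString("X4")); break;
        }
    }
}
```

With this sketch, an input containing a carriage return and line feed between "a" and "b" comes back as the six characters a\r\nb, and an input with no listed characters is returned as the same string instance. Sharing one character list between both branches keeps the two code paths equivalent, which is the property the PR's output-format tests are meant to guard.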
Benchmark Added
Created LogSanitizationBenchmarks.cs with the following scenarios:
Baseline: String with no special characters (tests early return optimization)
Common control chars: Strings with \r, \n, \t
Unicode format chars: Zero-width spaces, directional marks
Mixed chars: Combination of control and format characters
Long strings: Both with and without special characters to test performance at scale
Run with: dotnet run -c release -f net9.0 --filter Microsoft.IdentityModel.Benchmarks.LogSanitizationBenchmarks*
Character Set (108 total)
All C0 and C1 control characters (U+0000-U+001F, U+007F-U+009F)
All Unicode format characters (Cf category)
Including: zero-width spaces, directional marks, and other format characters
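One way to produce or cross-check such a hardcoded list is to scan every UTF-16 code unit and keep those in the Control (Cc) or Format (Cf) categories. The sketch below assumes that approach; the class and method names are hypothetical, and the exact result depends on the Unicode data shipped with the runtime it runs on.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper for generating or verifying the hardcoded character list.
public static class CharSetSketch
{
    // Collects every char (U+0000-U+FFFF) whose category is Cc or Cf.
    public static string BuildUnsafeChars()
    {
        var sb = new StringBuilder();
        for (int i = char.MinValue; i <= char.MaxValue; i++)
        {
            char c = (char)i;
            UnicodeCategory category = CharUnicodeInfo.GetUnicodeCategory(c);
            if (category == UnicodeCategory.Control || category == UnicodeCategory.Format)
                sb.Append(c);
        }
        return sb.ToString();
    }

    // Renders the set as "U+XXXX" entries, convenient for pasting into source.
    public static string Describe(string chars)
    {
        var sb = new StringBuilder();
        foreach (char c in chars)
            sb.Append("U+").Append(((int)c).ToString("X4")).Append(' ');
        return sb.ToString().TrimEnd();
    }
}
```

Comparing the length of BuildUnsafeChars() against the hardcoded field in a unit test would flag drift if a future runtime changes which characters fall into these categories.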
Testing
All 72 tests pass on .NET 8 (57 original + 15 new)
All 72 tests pass on .NET 9
Output format verified identical to previous implementation
Backward compatibility maintained for all target frameworks
Benchmark builds successfully for net6.0, net8.0, and net9.0
File encoding preserved on lines 462 and 482
Original prompt
This section details the original issue you should resolve.
<issue_title>Log sanitization with use of SearchValues v1</issue_title>
<issue_description>Problem Statement.
We want to update the Sanitize method:
to utilize SearchValues. All values which need to be sanitized can be added to a SearchValues collection.
Specific Characters to Sanitize:
'\r' (Carriage Return, U+000D)
'\n' (Line Feed, U+000A)
'\t' (Tab, U+0009)
All other ASCII control characters: U+0000-U+0008, U+000B-U+000C, U+000E-U+001F, U+007F-U+009F
All Unicode characters where char.IsControl(c) is true (Unicode category Cc)
All Unicode characters where CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.Format (Unicode category Cf), e.g.:
U+200B (Zero Width Space)
U+200C (Zero Width Non-Joiner)
U+200D (Zero Width Joiner)
U+200E (Left-to-Right Mark)
U+200F (Right-to-Left Mark)
U+202A-U+202E (Directional formatting)
U+2060-U+206F (Various format characters)
U+FEFF (Zero Width No-Break Space)
Implementation Clarifications:
The referenced SanitizeEntryFromFilePath method from dotnet/runtime should be used as a performance pattern reference only. The actual character set for sanitization should follow this issue's explicit enumeration.
The new implementation must produce identical output to the current method, including string formatting (e.g., using "\u{(int)c:X4}" for control/format characters).
Steps:
Update the Sanitize method to have a SearchValues collection that includes the above values. The list of control and format characters should be fully enumerated and hardcoded into a static SearchValues collection.
Update tests accordingly. If tests covering the following edge cases are not present, add them: strings with no sanitizable characters (performance fast path), strings with only ASCII control characters, strings with Unicode format characters, very long strings, and null/empty string handling.
Only update the Sanitize method as described above; do not update any other methods.</issue_description>
Comments on the Issue (you are @copilot in this section)
CopilotAI changed the title from "[WIP] Update Sanitize method to use SearchValues for sanitization" to "Optimize log sanitization with SearchValues for improved performance" on Oct 8, 2025.