-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Switch NameMangler to UTF-8 #122128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch NameMangler to UTF-8 #122128
Conversation
|
Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR optimizes the ILCompiler by switching the NameMangler from UTF-16 strings to UTF-8, eliminating unnecessary encoding conversions during compilation. The change reduces string allocations by approximately 101,000 instances when compiling a hello world application (from 403,000 to 302,000), though it increases byte array allocations slightly (from 529,000 to 552,000).
Key Changes:
- Modified
NameMangler.GetMangledTypeName()and related methods to returnUtf8Stringinstead ofstring - Implemented UTF-8 versions of name sanitization and disambiguation logic
- Added UTF-8 string writing overloads in debug info builders and section writers
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| src/coreclr/tools/Common/Compiler/NameMangler.cs | Changed abstract method signatures from string to Utf8String for type and name mangling |
| src/coreclr/tools/Common/Compiler/NativeAotNameMangler.cs | Implemented UTF-8 versions of name mangling, sanitization, and disambiguation logic; added CountDigits helper for efficient numeric formatting |
| src/coreclr/tools/Common/Internal/Text/Utf8String.cs | Added Concat method to combine multiple UTF-8 strings without conversion |
| src/coreclr/tools/Common/Internal/Text/Utf8StringBuilder.cs | Added constructor accepting initial capacity parameter |
| src/coreclr/tools/Common/Compiler/ObjectWriter/SectionWriter.cs | Added WriteUtf8String(Utf8String) overload to write UTF-8 strings directly |
| src/coreclr/tools/Common/TypeSystem/TypesDebugInfoWriter/TypesDebugInfoWriter.cs | Changed interface methods and type descriptors to use Utf8String |
| src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/ObjectWriter/Dwarf/DwarfInfoWriter.cs | Added WriteStringReference(Utf8String) overload for UTF-8 string references |
| src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/ObjectWriter/Dwarf/DwarfBuilder.cs | Changed GetMangledName to return Utf8String |
| src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/ObjectWriter/CodeView/CodeViewTypesBuilder.cs | Updated user-defined types to use Utf8String; added Write(Utf8String) overload |
| src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/ObjectWriter/CodeView/CodeViewSymbolsBuilder.cs | Updated WriteUserDefinedTypes signature and added Write(Utf8String) overload |
| src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/WindowsNodeMangler.cs | Added ToString() call to convert UTF-8 result to string for method table names |
| src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/UnixNodeMangler.cs | Added ToString() call to convert UTF-8 result to string for method table names |
...coreclr/tools/aot/ILCompiler.Compiler/Compiler/ObjectWriter/CodeView/CodeViewTypesBuilder.cs
Show resolved
Hide resolved
...reclr/tools/aot/ILCompiler.Compiler/Compiler/ObjectWriter/CodeView/CodeViewSymbolsBuilder.cs
Show resolved
Hide resolved
agocke
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Do we have any unit tests for the name mangler? If not I could have copilot give it a shot
src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/UnixNodeMangler.cs
Outdated
Show resolved
Hide resolved
src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/WindowsNodeMangler.cs
Outdated
Show resolved
Hide resolved
I validated this by comparing the output object file (it was byte identical). We don't have unit tests; we can try giving it a shot if copilot comes up with something useful. But name mangling is typically not touched so I can only see the unit test getting in the way if we ever change how we want to mangle things (it is an implementation detail and not a contract). A unit test would definitely make this PR more annoying to do. |
Removes some unnecessary UTF-8 -> UTF-16 -> UTF-8 conversions.
Before this change, when compiling hello world: 529000
byte[]allocations, 403000stringallocations.After this change: 552000
byte[]allocations, 302000stringallocations.Cc @dotnet/ilc-contrib