Description
The short time format on Fedora 38 has replaced its breaking space by a 'NARROW NO-BREAK SPACE' (U+202F).
That space gets removed by:
runtime/src/libraries/System.Private.CoreLib/src/System/Globalization/CultureData.Icu.cs, line 225 in 34c0472:
    private static string ConvertIcuTimeFormatString(ReadOnlySpan<char> icuFormatString)
Any character not explicitly recognized by this function (like U+202F) gets removed.
This function includes a specific case for a regular 'NO-BREAK SPACE' (U+00A0):
runtime/src/libraries/System.Private.CoreLib/src/System/Globalization/CultureData.Icu.cs, lines 261 to 264 in 34c0472:
    case '\u00A0':
        // Convert nonbreaking spaces into regular spaces
        result[resultPos++] = ' ';
        break;
There are two ways to fix this:
- Instead of removing unknown characters, we pass characters (including these non-breaking spaces) as is.
- Or, we add U+202F so it also gets converted to a regular space.
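To make the difference between the two options concrete, here is a minimal sketch. It is not the actual `ConvertIcuTimeFormatString` implementation: it leaves out quoted literals and other details, and the names `TimeFormatSketch`, `Convert`, and `UnknownCharPolicy` are hypothetical, introduced only for illustration.

```csharp
using System;
using System.Text;

// Hypothetical policy names, for illustration only.
public enum UnknownCharPolicy
{
    Drop,                 // current behavior: unrecognized characters are removed
    PassThrough,          // option 1: unrecognized characters are kept as is
    MapNarrowNbspToSpace  // option 2: U+202F is converted to a regular space
}

public static class TimeFormatSketch
{
    // Simplified stand-in for ConvertIcuTimeFormatString (assumption: only a
    // small subset of the pattern characters is modeled here).
    public static string Convert(ReadOnlySpan<char> icuFormat, UnknownCharPolicy policy)
    {
        var result = new StringBuilder(icuFormat.Length + 1);
        bool amPmAdded = false;

        foreach (char c in icuFormat)
        {
            switch (c)
            {
                case ':': case '.': case 'H': case 'h': case 'm': case 's': case ' ':
                    result.Append(c);
                    break;
                case '\u00A0':
                    // Existing special case: NO-BREAK SPACE becomes a regular space.
                    result.Append(' ');
                    break;
                case 'a':
                    // ICU AM/PM marker becomes the .NET "tt" specifier, once.
                    if (!amPmAdded) { amPmAdded = true; result.Append("tt"); }
                    break;
                case '\u202F' when policy == UnknownCharPolicy.MapNarrowNbspToSpace:
                    // Option 2: treat NARROW NO-BREAK SPACE like U+00A0.
                    result.Append(' ');
                    break;
                default:
                    // Option 1 keeps the character; otherwise it is dropped.
                    if (policy == UnknownCharPolicy.PassThrough) result.Append(c);
                    break;
            }
        }
        return result.ToString();
    }
}
```

For an ICU pattern such as `"h:mm\u202Fa"`, this sketch gives `h:mmtt` with `Drop`, `h:mm tt` with `MapNarrowNbspToSpace`, and `h:mm` + U+202F + `tt` with `PassThrough`. A real pass-through fix would probably still need to decide what to do with ICU pattern letters that mean something different in .NET format strings.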
I don't know why the current implementation opted for the second option for U+00A0.
I think it may be to have the same time format as on Windows under en_US.
I have a slight preference for the first option, because the second overwrites part of the format information from ICU. Also, the first option would have prevented this issue from occurring.
What is the preferred option?
cc @omajid