string.IndexOf bug when using Thai culture #75616
Comments
Most likely cause: starting with .NET 5, the globalization APIs use ICU libraries (rather than NLS) on Windows 10.
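To check which globalization backend a process is actually using, a small probe along the following lines can help. This is a sketch that was not posted in this thread. It relies on the documented observation that under ICU the `SortId` of the invariant culture's sort version encodes the same number as `FullVersion`. The class and method names (`GlobalizationProbe`, `IsIcuMode`) are illustrative.

```csharp
using System;
using System.Globalization;

public static class GlobalizationProbe
{
    // Returns true when the runtime appears to be using ICU for collation.
    // Assumption: under ICU, the first four bytes of SortId hold the same
    // version number that FullVersion reports; under NLS they do not.
    public static bool IsIcuMode()
    {
        SortVersion sortVersion = CultureInfo.InvariantCulture.CompareInfo.Version;
        byte[] bytes = sortVersion.SortId.ToByteArray();
        int version = bytes[3] << 24 | bytes[2] << 16 | bytes[1] << 8 | bytes[0];
        return version != 0 && version == sortVersion.FullVersion;
    }

    public static void Main()
    {
        Console.WriteLine("ICU mode: " + IsIcuMode());
    }
}
```

If the probe reports ICU and the previous behavior is needed application-wide, the .NET documentation describes a `System.Globalization.UseNls` runtime configuration switch for Windows. Whether that is appropriate depends on the application.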
Tagging subscribers to this area: @dotnet/area-system-globalization
The Thai language has specific collation behavior, which is what you are seeing here. It treats some characters like ...
I am closing the issue, but feel free to send any questions and we'll be happy to help answer them.
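The exact workaround referred to in this thread is not reproduced above. One common way to keep a search independent of culture-specific collation is to request an ordinal comparison explicitly, as in the sketch below. The input strings are placeholders (the original repro strings are not shown here), `th-TH` is assumed as the Thai culture name, and `OrdinalSearchDemo` / `OrdinalIndexOf` are hypothetical names.

```csharp
using System;
using System.Globalization;

public static class OrdinalSearchDemo
{
    // Hypothetical helper: searches by UTF-16 code units only,
    // so the current culture's collation rules never apply.
    public static int OrdinalIndexOf(string text, string value)
    {
        return text.IndexOf(value, StringComparison.Ordinal);
    }

    public static void Main()
    {
        // Assumption: "th-TH" stands in for the Thai culture from the report.
        CultureInfo.CurrentCulture = new CultureInfo("th-TH");

        // Placeholder inputs, not the strings from the original repro.
        string text = "abc";
        string value = "[";

        // string.IndexOf(string) is culture-sensitive (current culture),
        // so its result can differ between NLS and ICU and between cultures.
        Console.WriteLine("culture-sensitive: " + text.IndexOf(value));

        // The ordinal search only matches when the exact characters occur.
        Console.WriteLine("ordinal: " + OrdinalIndexOf(text, value));
    }
}
```

For these placeholder inputs the ordinal call returns -1, because `[` does not occur in `abc`. The culture-sensitive result is deliberately left unstated, since it depends on the runtime and collation data in use.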
@tarekgh, thanks for the explanation, which makes a lot of sense. Thanks even more for the workaround, which we will need to apply since we do not own the problematic source code (https://github.com/ClosedXML/ClosedXML/blob/78150efbbd4a36d65e95ef3c793f12feb12c1a9c/ClosedXML/Excel/XLWorkbook_Load.cs#L1249).
I realize that you've already answered very similar questions here: ...
...and I suspect that tons of applications will stop functioning in Thailand (and probably elsewhere, too) as we speak due to this change...
In case anyone reading this cares, here's the Thai alphabet in UTF-8 (nota bene: it does indeed lack the square bracket or any other special character...): https://www.utf8-chartable.de/unicode-utf8-table.pl?start=3584&number=128&utf8=0x
Thanks @dnickless for the feedback.
I have opened an issue for that library to get this fixed on their side: ClosedXML/ClosedXML#1862. If you see similar issues elsewhere, I suggest you open issues for those cases or contact us and we can follow up.
Unicode lists characters for different languages, and the ASCII characters do not need to be added to each language's character list. The collation for the language still decides how characters from the ASCII range behave.
How to reproduce:
Create a net472 Console project and paste the following code:
Output (as expected): -1
Switch the .csproj to net6 TFM and run again.
Output (not expected): 0