Unicode character collation properties and ordering #117

Open
ericpruitt opened this issue Jan 24, 2018 · 2 comments

Comments

@ericpruitt

I'm looking into implementing the Unicode Collation Algorithm using utf8proc for normalization and character property information. I know different locales use different collation rules, but the standard supplies DUCET, the Default Unicode Collation Element Table, which generally produces much more palatable results than something like a naive wcscmp(3). Would y'all consider accepting a patch that implements a collation comparison function and adds DUCET data to the property tables?
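
To illustrate the gap between code-point comparison and a multi-level collation key, here is a rough sketch in Python using only the standard library's `unicodedata` module. It is not the Unicode Collation Algorithm and uses no DUCET weights; the `toy_collation_key` function and the word list are made up for the example. It only shows the general idea of normalizing first and then comparing base letters before accents.

```python
import unicodedata


def toy_collation_key(s):
    """Toy two-level sort key. NOT the UCA and NOT based on DUCET weights."""
    # Normalize to NFD so accented letters become base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", s)
    # Primary level: base letters only, case-folded, combining marks dropped.
    primary = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    ).casefold()
    # Secondary level: the full decomposed string, used only to break ties.
    return (primary, decomposed)


# Hypothetical sample data; "\u00e9clair" is "eclair" with an acute accent.
words = ["zebra", "Zulu", "\u00e9clair", "apple", "eclair"]

# Plain code-point ordering, roughly what a naive wcscmp-style compare gives.
naive = sorted(words)

# Ordering by the toy key: accents and case no longer dominate the result.
toy = sorted(words, key=toy_collation_key)

print(naive)
print(toy)
```

With code-point ordering, "Zulu" sorts before "apple" and the accented word lands after "zebra"; with the toy key, the accented and unaccented spellings sit next to each other. A real implementation would look up per-character collation elements from DUCET instead of relying on case folding and mark stripping.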

@stevengj
Member

That seems reasonable.

@vadz

vadz commented Aug 23, 2021

Just curious, has there been any work on this since then, or are there any plans to add collation support to utf8proc? It would be great to have it, as right now ICU is required to sort things correctly.
