Incorrect reversal of the U+0489 character #14

Open
lunakurame opened this issue Feb 8, 2018 · 1 comment

Comments

@lunakurame

I've got a string with this character:
҉ U+0489 COMBINING CYRILLIC MILLIONS SIGN

Before reversing: te҉st (te\u0489st)
After reversing: ts҉et (ts\u0489et)

I might be wrong, but I expected tse҉t tse\u0489t instead. Is there a reason why it behaves like this or is it just a bug? I found it when my unit tests failed while checking my code using random zalgo examples.

@mathiasbynens
Owner

mathiasbynens commented Feb 12, 2018

This is an example of a symbol with the Grapheme_Extend property. IIUC they can only really follow Grapheme_Base symbols, but for esrever’s purposes we could probably just treat them like regular combining marks.
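For illustration, here is a minimal sketch of what "treat them like regular combining marks" could look like. It is not esrever's actual implementation: `reverseKeepingMarks` is a hypothetical helper, and it assumes a JavaScript engine that supports Unicode property escapes (`\p{Grapheme_Extend}` with the `u` flag). It only keeps each symbol together with the Grapheme_Extend code points that follow it; it does not attempt full grapheme cluster segmentation.

```javascript
// Hypothetical helper, not part of esrever's API.
// Reverses a string while keeping each code point together with any
// Grapheme_Extend code points (such as U+0489) that directly follow it.
function reverseKeepingMarks(input) {
  // A cluster is either one non-extending code point followed by zero or more
  // Grapheme_Extend code points, or a run of leading Grapheme_Extend code
  // points that has no base symbol in front of it.
  const clusterPattern =
    /\P{Grapheme_Extend}\p{Grapheme_Extend}*|\p{Grapheme_Extend}+/gu;
  const clusters = input.match(clusterPattern) || [];
  return clusters.reverse().join('');
}

// Naive code point reversal, for comparison: the mark is moved on its own,
// so it ends up attached to a different letter.
function reverseCodePoints(input) {
  return Array.from(input).reverse().join('');
}

console.log(reverseCodePoints('te\u0489st') === 'ts\u0489et');   // true
console.log(reverseKeepingMarks('te\u0489st') === 'tse\u0489t'); // true
```

With this approach the mark stays on the `e` after reversal, which matches the `tse\u0489t` result expected in the report.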
