Combining characters not supported #6

jepler · 2017-05-09T01:58:11Z

For instance, the Unicode sequence "x\u0300" renders in a standard Unicode terminal using a single character cell, but string-width counts it as 2. (Tested in xfce4-terminal and rxvt-unicode on Debian Jessie.)

This could be represented as a (non-passing) test:

t.is(m('x\u0300'), 1);


sindresorhus · 2017-05-09T06:19:21Z

Can you open this on https://github.com/sindresorhus/is-fullwidth-code-point instead? It's the module actually doing the guessing. It uses a list of codepoints from the Unicode spec. So I'm surprised it doesn't work.

jepler · 2017-05-09T11:46:15Z

With respect, I don't think the bug is in is-fullwidth-code-point. "x" is not a full-width code point, and "\u0300" is not a full-width code point, so is-fullwidth-code-point is working as designed when it returns false for those inputs.

Rather, the algorithm in this module is incomplete because it does not have correct (any) treatment for Unicode characters with the category "Mn" (or Mc or Me).

You might want to study https://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c for inspiration.
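To make the missing step concrete, here is a minimal sketch of a width count that skips combining marks. It is not the string-width implementation: `sketchWidth` and `ZERO_WIDTH_MARK` are hypothetical names, the sketch assumes a runtime with Unicode property escapes in regular expressions, and it deliberately ignores full-width and control characters. It treats categories Mn and Me as zero width and leaves Mc at width 1, which is an assumption rather than a settled rule.

```javascript
// Hypothetical sketch, not the module's code: count one cell per code point,
// except for nonspacing (Mn) and enclosing (Me) marks, which add no width.
// Assumption: spacing marks (Mc) still take a cell, and every other
// code point is exactly one cell wide (no full-width handling here).
const ZERO_WIDTH_MARK = /^[\p{Mn}\p{Me}]$/u;

function sketchWidth(string) {
  let width = 0;
  // for...of iterates by code point, so astral characters are not split.
  for (const character of string) {
    if (ZERO_WIDTH_MARK.test(character)) {
      continue;
    }
    width += 1;
  }
  return width;
}

console.log(sketchWidth('x\u0300'));
```

With this sketch, 'x\u0300' counts as one cell because U+0300 has category Mn.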

As the author of both the github repos / npm modules in question, I'll leave the decision to you about how to handle it, rather than opening a new issue right now. Thanks for your time.

sindresorhus · 2017-05-09T12:26:59Z

Sorry, you're right. I was a bit quick with this one... I've added a failing test.

jepler · 2017-07-18T21:20:15Z

@gucong3000 thanks for your work to improve this. @sindresorhus thanks for accepting #12.

While the range you name in your patch is a range of combining characters, I think it does not cover all combining characters in Unicode. For example, consider the sequence of code points “◌᷀” (U+25CC, dotted circle + U+1DC0, combining dotted grave accent).

This implementation of wcwidth for Unicode has a list of sorted, non-overlapping ranges of combining characters (search for interval combining), as well as a Unix command line to build such a table from uniset.
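A table of sorted, non-overlapping ranges can be searched with a binary search. The sketch below shows the idea under stated assumptions: `SAMPLE_COMBINING_RANGES` and `inTable` are hypothetical names, and the table holds only a few ranges taken from Unicode combining-mark block boundaries for illustration. It is nowhere near a complete list, and a real table would be generated from Unicode data.

```javascript
// Illustrative only: a handful of [start, end] ranges (inclusive), sorted and
// non-overlapping. This is NOT a complete list of combining characters.
const SAMPLE_COMBINING_RANGES = [
  [0x0300, 0x036f], // Combining Diacritical Marks
  [0x1dc0, 0x1dff], // Combining Diacritical Marks Supplement
  [0x20d0, 0x20f0], // Combining Diacritical Marks for Symbols (assigned part)
  [0xfe20, 0xfe2f], // Combining Half Marks
];

// Binary search: returns true if codePoint falls inside any range in table.
// Relies on the table being sorted by start and free of overlaps.
function inTable(codePoint, table) {
  let low = 0;
  let high = table.length - 1;
  while (low <= high) {
    const mid = (low + high) >> 1;
    const [start, end] = table[mid];
    if (codePoint < start) {
      high = mid - 1;
    } else if (codePoint > end) {
      low = mid + 1;
    } else {
      return true;
    }
  }
  return false;
}

console.log(inTable(0x1dc0, SAMPLE_COMBINING_RANGES));
```

A range table like this keeps lookups logarithmic in the number of ranges and makes a missing range, such as the one containing U+1DC0, easy to spot and add.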

sindresorhus closed this as completed May 9, 2017

sindresorhus reopened this May 9, 2017

sindresorhus added a commit that referenced this issue May 9, 2017

Add failing test for #6

130d0ad

sindresorhus added bug help wanted labels May 9, 2017

gucong3000 mentioned this issue Jul 13, 2017

Support combining characters #12

Merged

sindresorhus closed this as completed in #12 Jul 18, 2017

mknj mentioned this issue Jun 7, 2022

thai combining charactern not working #43

Closed
