Emoji skin modifiers don't work #942
Comments
I got it working with the following code:
However, doing this screws up the prompt, as 2 characters (according to the shell) become one, which means you can go left into the prompt. We should fix that before going forward with supporting modifiers and the zero-width joiner. Not sure how best to do this; perhaps characters need some more metadata indicating how many characters are joined into the cell, and then the left/right arrow repeats the action that number of times?
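The per-cell metadata idea could be sketched roughly as follows. This is only an illustration, not xterm.js code: the `Cell` shape, `makeCell`, and `arrowSequence` are hypothetical names, and it assumes the shell counts one "character" per code point.

```typescript
// Hypothetical cell record: the rendered text plus how many shell-visible
// characters (assumed here to be code points) were joined into it.
interface Cell {
  text: string;
  joinedCount: number;
}

// Count code points by treating each valid surrogate pair as one unit.
function countCodePoints(text: string): number {
  let count = 0;
  for (let i = 0; i < text.length; i++) {
    const hi = text.charCodeAt(i);
    if (hi >= 0xd800 && hi <= 0xdbff && i + 1 < text.length) {
      const lo = text.charCodeAt(i + 1);
      if (lo >= 0xdc00 && lo <= 0xdfff) {
        i++; // skip the low surrogate of this pair
      }
    }
    count++;
  }
  return count;
}

function makeCell(text: string): Cell {
  return { text: text, joinedCount: countCodePoints(text) };
}

// Repeat the cursor key once per joined character so the shell's idea of
// the cursor position stays in step with the visible cell.
function arrowSequence(cell: Cell, direction: "left" | "right"): string {
  const key = direction === "left" ? "\x1b[D" : "\x1b[C";
  let out = "";
  for (let i = 0; i < cell.joinedCount; i++) {
    out += key;
  }
  return out;
}
```

With this sketch, moving left over a cell holding an emoji plus one skin modifier would send the cursor-left sequence twice.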
@Tyriar Imho this needs a more advanced character evaluation than xterm.js has atm - the current approach can only handle leading combining chars, while this one follows the char to be modified (http://unicode.org/reports/tr51/#Emoji_Modifiers_Table). The info about the correct rendering is in the sequence of chars (+ unicode spec), it is just a nightmare to cover all those edge cases (emoji modifiers were not part of the official spec until 2015).
Oh, I was wrong with my last statement: the input does not support leading but only following combining characters atm (combining chars with zero width will end up in the last active cell). So the emoji problem should be fixable by adding the modifiers to the wcwidth combining char table.
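As a rough sketch of that suggestion: the five emoji modifiers occupy U+1F3FB through U+1F3FF, so a width lookup could report them as zero width and let them fall into the previous cell like any other combining char. The names below are hypothetical, and `baseWidth` merely stands in for whatever wcwidth lookup already exists.

```typescript
// Emoji (skin tone) modifiers occupy U+1F3FB..U+1F3FF per UTS #51.
const EMOJI_MODIFIER_FIRST = 0x1f3fb;
const EMOJI_MODIFIER_LAST = 0x1f3ff;

function isEmojiModifier(codePoint: number): boolean {
  return codePoint >= EMOJI_MODIFIER_FIRST && codePoint <= EMOJI_MODIFIER_LAST;
}

// Hypothetical wrapper around an existing width lookup: modifiers are
// reported as zero width, everything else is passed through unchanged.
function widthWithModifiers(
  codePoint: number,
  baseWidth: (cp: number) => number
): number {
  return isEmojiModifier(codePoint) ? 0 : baseWidth(codePoint);
}
```

Note that this treats the modifier as combining on every system, which is exactly the assumption that later comments in this thread call into question.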
Ok, the problem gets quite nasty - the correct handling of an emoji + modifier in the browser depends on the system font renderer. There is no platform-independent way to solve this; it seems this feature is too new. Testing results:
That's the point where the wcwidth implementation has to mirror the system wcwidth to get correct widths. The only way I can think of to solve this is to prerender the emoji+modifier offscreen and measure the text width to see if it actually combines. If it does not combine, the Unicode spec suggests printing them both or only the first one. The latter seems reasonable in a terminal environment to preserve the row width.
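A minimal sketch of the measuring idea, under stated assumptions: the `measure` callback is a hypothetical injection point (in a browser it could wrap `CanvasRenderingContext2D.measureText(text).width` on an offscreen canvas), and "the pair is no wider than the base alone" is only a heuristic for "the font combined them".

```typescript
// Hypothetical measuring hook: returns the rendered width of `text`.
type Measure = (text: string) => number;

// Heuristic: if the font has a merged glyph, emoji + modifier should not
// render wider than the emoji on its own.
function combines(base: string, modifier: string, measure: Measure): boolean {
  return measure(base + modifier) <= measure(base);
}

// If the pair does not combine, keep only the base emoji so the row width
// is preserved (the second option mentioned above).
function resolveForCell(base: string, modifier: string, measure: Measure): string {
  return combines(base, modifier, measure) ? base + modifier : base;
}
```

Because the result depends on the installed fonts, a real implementation would probably want to cache it per emoji/modifier pair rather than measure on every write.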
Well, it would be good to support it, I'm not exactly too keen on jumping on it atm though. The offscreen measuring thing looks like it would solve the problem. I'm not sure we would want to touch wcwidth though? Isn't that used to determine the width that the character consumes, not the number of characters in a single "cell"?
Well, wcwidth is part of the deal: currently combining chars end up in the last cell. This is because wcwidth determined it to be a char with zero width. With systems treating emoji modifiers differently, it does not work that easily anymore:
If you have a look at the Unicode spec above, it gets even trickier with the ZERO WIDTH JOINER (http://unicode.org/reports/tr51/#Emoji_ZWJ_Sequences). Maybe it's time to get our hands dirty with splitting up the stream into graphemes (see https://github.com/devongovett/grapheme-breaker) and feeding those to wcwidth. wcwidth then needs a way to dynamically determine the width of unknown graphemes, which could be some offscreen thingy. NOTE - any implementation in that direction will have a big negative performance impact, since every char has to be inspected at least once (like now with wcwidth) and multiple times in case combining chars are following.
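To make the grapheme idea concrete, here is a deliberately simplified sketch. It is not the grapheme-breaker library and not a full UAX #29 implementation: it only attaches emoji modifiers, VARIATION SELECTOR-16, and ZWJ-joined code points to the preceding cluster. All function names are hypothetical.

```typescript
const ZWJ = 0x200d;   // ZERO WIDTH JOINER
const VS16 = 0xfe0f;  // VARIATION SELECTOR-16

// Decode a UTF-16 string into code points, pairing surrogates by hand.
function toCodePoints(text: string): number[] {
  const out: number[] = [];
  for (let i = 0; i < text.length; i++) {
    const hi = text.charCodeAt(i);
    if (hi >= 0xd800 && hi <= 0xdbff && i + 1 < text.length) {
      const lo = text.charCodeAt(i + 1);
      if (lo >= 0xdc00 && lo <= 0xdfff) {
        out.push(((hi - 0xd800) << 10) + (lo - 0xdc00) + 0x10000);
        i++;
        continue;
      }
    }
    out.push(hi);
  }
  return out;
}

// Group code points into clusters that should share one terminal cell.
function groupClusters(text: string): number[][] {
  const clusters: number[][] = [];
  const codePoints = toCodePoints(text);
  let joinNext = false; // true right after a ZWJ
  for (let i = 0; i < codePoints.length; i++) {
    const cp = codePoints[i];
    const isModifier = cp >= 0x1f3fb && cp <= 0x1f3ff;
    const attaches = joinNext || isModifier || cp === ZWJ || cp === VS16;
    if (attaches && clusters.length > 0) {
      clusters[clusters.length - 1].push(cp);
    } else {
      clusters.push([cp]);
    }
    joinNext = cp === ZWJ;
  }
  return clusters;
}
```

Each resulting cluster could then be handed to a width function as a unit. The performance concern raised above still applies, since every code point is inspected at least once.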
Related: #1059
To align to the behavior of remote https://github.com/mintty/mintty/wiki/Tips#installing-emoji-resources |
@mandel59 Well, yes and no. The underlying problem is that xterm.js can run on a totally different system than the backend due to the browser/network stack in between, and both systems involved can differ about the wcwidth and the final render output width (which is even worse because it depends on font capabilities). Currently xterm.js's wcwidth is not frontend aware, it simply applies the same rules for all systems. But even being frontend system aware does not solve the problem: Imagine the following for char XY:
Which one to apply now? If we take the 1 of the backend, the rendered char is still drawn as a wide char due to the way the frontend wants to render it - the cursor ends up on top of or "in the middle" of the glyph. If we take the 2 of the frontend, we violate the grid layout of the terminal and will run one cell short at the end of the line, because the pty on the backend side just saw the 1 there. Imho there is no real solution for this problem. In a perfect world wcwidth is the same for backend and frontend and we don't face this problem at all. For systems where the backend runs on the same system as the frontend (like a local terminal) it can be fixed this way, yes. For remote systems, it is a matter of priority - go with the backend for correct cell alignment with possible over/under draws, or go with the frontend for correct draw but broken cell alignment? Note that most Unicode chars have not changed their wcwidth for years and do not create these hassles. Emoji were changed recently from a width of 1 to 2, hence the problem.
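The trade-off can be spelled out with a small sketch. It is purely illustrative: `WidthPolicy`, `WidthOutcome`, and `applyPolicy` are hypothetical names, and the two numbers are the backend width of 1 and frontend width of 2 from the example above.

```typescript
type WidthPolicy = "backend" | "frontend";

interface WidthOutcome {
  cellsUsed: number;  // cells the terminal grid advances by
  overdraw: number;   // cells the drawn glyph spills past its allotted cells
  gridDrift: number;  // cells of disagreement with what the pty assumed
}

function applyPolicy(
  backendWidth: number,
  frontendWidth: number,
  policy: WidthPolicy
): WidthOutcome {
  if (policy === "backend") {
    // Grid stays aligned with the pty, but the glyph may be drawn wider
    // (positive overdraw) or narrower (negative) than its cells.
    return { cellsUsed: backendWidth, overdraw: frontendWidth - backendWidth, gridDrift: 0 };
  }
  // Glyph is drawn correctly, but the grid no longer matches the pty.
  return { cellsUsed: frontendWidth, overdraw: 0, gridDrift: frontendWidth - backendWidth };
}
```

For the example above, the backend policy yields one cell of overdraw and no drift, while the frontend policy yields no overdraw but one cell of drift per such char on the line.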
Background on this: Those composed characters are part of the grapheme parsing rules in Unicode, same as country flags and others. We will not see grapheme support anytime soon in xterm.js, as all cmdline programs get them wrong currently. Also note that they are specced to produce different outputs; technically the hand symbol + the box is correct. If the font renderer has a merged symbol for the codepoints, it is free to use that probably shorter symbol. That ambiguity is a nightmare for monospaced grids, but Unicode does not care much in this regard. @Tyriar Should we close this issue as won't fix?
Pasting "👍🏻 👈🏿"
It looks like the skin modifier is being put into a new cell, when it should actually be added to the single emoji.