Some Chinese characters displayed as Japanese glyphs on macos #1637

Closed
hans-x opened this issue Mar 31, 2024 · 9 comments · Fixed by #1734

@hans-x
Collaborator

hans-x commented Mar 31, 2024

image

@hans-x hans-x changed the title from "Some Chinese characters displayed as Japanese glyphs on" to "Some Chinese characters displayed as Japanese glyphs on macos" Mar 31, 2024
@hans-x
Collaborator Author

hans-x commented Mar 31, 2024

直径

@qwerasd205
Collaborator

These are not explicitly Chinese characters; they are characters in the Unicode CJK Unified Ideographs block. What you are observing is the same characters rendered in different fonts. Whatever font Ghostty picks as the fallback for these characters happens to use the shinjitai forms of these glyphs rather than the simplified forms. This could be considered a bug if font discovery is picking a strange or unexpected font for the fallback. Either way, as a fix you can apply yourself, I believe you can configure your font family to explicitly specify a fallback font that uses the desired forms.

See https://en.wiktionary.org/wiki/%E5%BE%84#Alternative_forms for info on the alternative form(s) situation.
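
To make the "same characters, different fonts" point concrete, here is a small Python sketch (standard library only) that looks up the two characters from the report. It only shows that each one is a single code point in the CJK Unified Ideographs block with no language attached; the helper name `describe` is made up for illustration.

```python
import unicodedata


def describe(text):
    """Return (character, code point, Unicode name) for each character."""
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# The code point is identical no matter which font (Japanese or Chinese)
# ends up drawing it; only the glyph shape differs between fonts.
for ch, cp, name in describe("直径"):
    print(ch, cp, name)
```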

@qwerasd205
Collaborator

qwerasd205 commented Mar 31, 2024

To specify a fallback for your font family configuration, simply include the key a second time, e.g.

font-family = "JetBrains Mono"
font-family = "Hei Regular"

to use JetBrains Mono as your primary font and fall back to Hei Regular (which I believe should have Simplified Chinese forms for CJK Unified Ideograph glyphs) for glyphs not present in JetBrains Mono.

Obviously, you may choose whatever fonts you like best for your main and fallback fonts; the above is just an example.

@mitchellh
Contributor

Thanks @qwerasd205.

I agree with that assessment. In the eyes of Ghostty, we're seeing a Unicode codepoint and asking the system what font can satisfy that codepoint. We do this one codepoint at a time, so it appears we're getting a Japanese font first from the system since... codepoints are codepoints.

If you want to ensure that a certain font is used, please do what @qwerasd205 suggested.

If algorithmically you (or anyone!) can suggest something better we can do to improve this, I'd also be happy to hear it!
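
As a rough illustration of why per-codepoint lookup behaves this way, here is a minimal Python sketch. It is a toy model, not Ghostty's code: the font names and coverage sets are hypothetical, and the only thing it demonstrates is that "first font that covers the codepoint wins" depends entirely on the order of the list.

```python
# Hypothetical fonts and coverage sets, for illustration only.
SYSTEM_FONTS = [
    ("ExampleGothicJP", {0x76F4, 0x5F84}),  # draws Japanese-style glyphs
    ("ExampleHeiSC", {0x76F4, 0x5F84}),     # draws Simplified Chinese-style glyphs
]


def pick_fallback(codepoint, fonts):
    """Return the name of the first font whose coverage includes codepoint."""
    for name, coverage in fonts:
        if codepoint in coverage:
            return name
    return None


# Both fonts cover U+5F84, so whichever is listed first is chosen.
print(pick_fallback(0x5F84, SYSTEM_FONTS))
print(pick_fallback(0x5F84, list(reversed(SYSTEM_FONTS))))
```

In this model, listing a second font-family in the config is like putting your preferred font ahead of whatever the system would offer.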

@hans-x
Collaborator Author

hans-x commented Apr 1, 2024

20240401-101553.mp4

@mitchellh
Contributor

Thanks, let me take a closer look at this. I wonder if there's some locale information or something that we can pass through to the OS APIs to get them to return a different preferred language. The codepoints are just plain old codepoints, but maybe there's some metadata macOS is giving us that I'm not using yet.

@mitchellh mitchellh reopened this Apr 1, 2024
@mitchellh mitchellh added the bug, os/macos, and needs-confirmation labels Apr 1, 2024
@mitchellh
Contributor

mitchellh commented May 5, 2024

Okay, I've had some time to investigate more.

If I configure the Japanese IME first (in Settings.app) and then Chinese, and never use Japanese input, the characters show up as Japanese characters even from Chinese input. This happens in Terminal.app and Alacritty. However, if I configure Chinese first and then the Japanese IME (again never using Japanese input), the characters show up as Chinese characters. If you try this, you must restart your apps in between.

There seems to be some preferred-language-order search happening in CoreText that Ghostty is not leveraging. We do our own font sorting; that's probably it. I need to do more research.
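
A minimal Python sketch of what a preferred-language ordering could look like, purely as a model of the behavior observed above. This is an assumption about the shape of the idea, not a description of what CoreText does internally; the font names, language tags, and the `sort_by_language` helper are all hypothetical.

```python
def sort_by_language(candidates, preferred_languages):
    """Stable-sort (font_name, language) pairs by the user's language order.

    Fonts whose language is not in preferred_languages keep their relative
    order and go last.
    """
    def rank(entry):
        _, lang = entry
        if lang in preferred_languages:
            return preferred_languages.index(lang)
        return len(preferred_languages)

    return sorted(candidates, key=rank)


# Hypothetical candidates that all cover the same codepoint.
candidates = [("ExampleGothicJP", "ja"), ("ExampleHeiSC", "zh-Hans")]

# Chinese configured before Japanese: the Chinese font is tried first.
print(sort_by_language(candidates, ["zh-Hans", "ja"])[0][0])
# Japanese configured before Chinese: the Japanese font is tried first.
print(sort_by_language(candidates, ["ja", "zh-Hans"])[0][0])
```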

@mitchellh
Contributor

The issue appears to be that we're doing a custom fallback search rather than using CTFontCreateForString. CTFontCreateForString returns a Chinese font as its preference when the locale is English, but the issue I have is that our custom search does a better job of finding monospace fonts. Still, I'd prefer to match the fallback behavior of other macOS programs by default, and as @qwerasd205 pointed out, users can override this...

Looking into it.

@mitchellh mitchellh added the core label and removed the needs-confirmation label May 6, 2024
@mitchellh mitchellh self-assigned this May 6, 2024
@mitchellh
Contributor

#1734 fixes this issue. I compromised by only using CoreText's mechanism for CJK blocks, and our custom font fallback for everything else.
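
The compromise can be sketched roughly like this in Python. This is not the code from #1734: the list of ranges is an assumption (the thread does not say which blocks are treated as CJK), and `system_lookup` / `custom_lookup` are hypothetical stand-ins for the CoreText path and the custom search.

```python
# Assumed set of blocks; the real change may cover more or fewer ranges.
CJK_RANGES = [
    (0x3400, 0x4DBF),    # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),    # CJK Unified Ideographs
    (0x20000, 0x2A6DF),  # CJK Unified Ideographs Extension B
]


def is_cjk(codepoint):
    """True if codepoint falls in one of the assumed CJK ranges."""
    return any(lo <= codepoint <= hi for lo, hi in CJK_RANGES)


def choose_fallback(codepoint, system_lookup, custom_lookup):
    """Use the system's choice for CJK, the custom search for everything else."""
    if is_cjk(codepoint):
        return system_lookup(codepoint)
    return custom_lookup(codepoint)


# Stand-in lookups that just report which path was taken.
print(choose_fallback(0x5F84, lambda cp: "system", lambda cp: "custom"))
print(choose_fallback(ord("A"), lambda cp: "system", lambda cp: "custom"))
```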

mitchellh added a commit that referenced this issue Oct 26, 2024
Fixes #2499

We rely on CoreText's font discovery to find the best font for a
fallback by using the character set attribute. It appears that for some
codepoints, the character set attribute is not enough to find a font
that supports the codepoint.

In this case, we use CTFontCreateForString to find the font that
CoreText would use. The one subtlety here is we need to ignore the
last resort font, which just has replacement glyphs for all codepoints.

We already had a function to do this for CJK characters (#1637)
thankfully so we can just reuse that!

This also fixes a bug in how we call CTFontCreateForString: its range
param expects the length in UTF-16 code units, not UTF-32.
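
For anyone unsure why the UTF-16 vs UTF-32 distinction matters for the range length, here is a small Python sketch (standard library only, helper names made up for illustration). The two counts agree for characters in the Basic Multilingual Plane and differ for anything above U+FFFF, which needs a surrogate pair in UTF-16.

```python
def utf16_length(text):
    """Number of UTF-16 code units needed to encode text."""
    return len(text.encode("utf-16-le")) // 2


def utf32_length(text):
    """Number of code points (one UTF-32 unit each); Python str length."""
    return len(text)


bmp = "径"               # U+5F84, inside the Basic Multilingual Plane
non_bmp = "\U00020000"   # U+20000, CJK Unified Ideographs Extension B

print(utf16_length(bmp), utf32_length(bmp))
print(utf16_length(non_bmp), utf32_length(non_bmp))
```

Passing the code point count as the length for a string like the second one would cover only half of the surrogate pair.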