Use UTF-8 code page when using native ANSI sequence processing #4968

Merged

@dscho dscho commented May 25, 2024

In #4700, I introduced a change in Git for Windows' behavior where it would favor recent Windows 10 versions' native ANSI sequence processing over Git for Windows' home-grown one.

What I missed was that the home-grown processing also ensured that text written to the Win32 Console was carefully converted from UTF-8 to UTF-16 encoding, while the native ANSI sequence processing would respect the currently-set code page.

However, Git for Windows does not use the current code page at all, always using UTF-8 encoded text internally. So let's make sure that the code page is CP_UTF8 when Git for Windows uses the native ANSI sequence processing.

This fixes #4851.
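
For illustration, the idea can be sketched in C roughly as follows. This is not the code from the commit: `use_utf8_output_code_page()` and `restore_output_code_page()` are hypothetical helper names, and the decision to restore the previous code page afterwards is an assumption of this sketch. Only `GetConsoleOutputCP()`, `SetConsoleOutputCP()` and `CP_UTF8` are actual Win32 API. The `#ifdef _WIN32` guards merely keep the sketch compilable on other platforms.

```c
#ifdef _WIN32
#include <windows.h>
#endif

/*
 * Hypothetical helper (not Git for Windows' actual code): switch the
 * console output code page to UTF-8. Returns the previous code page so
 * that the caller can restore it, or 0 if nothing was changed.
 */
unsigned int use_utf8_output_code_page(void)
{
#ifdef _WIN32
	UINT previous = GetConsoleOutputCP();

	if (previous == CP_UTF8)
		return 0; /* already UTF-8, nothing to restore */
	if (!previous || !SetConsoleOutputCP(CP_UTF8))
		return 0; /* no console attached, or the switch failed */
	return previous;
#else
	return 0; /* code pages are a Windows concept */
#endif
}

/* Hypothetical counterpart: undo the switch, if one was made. */
void restore_output_code_page(unsigned int previous)
{
#ifdef _WIN32
	if (previous)
		SetConsoleOutputCP(previous);
#else
	(void)previous; /* nothing to restore outside Windows */
#endif
}
```

A caller would invoke the first helper once native ANSI sequence processing has been enabled, and could pass the returned value to the second helper (for example from an `atexit()` handler) when it is done writing.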

win32: use native ANSI sequence processing, if possible

Windows 10 version 1511 (also known as the November Update), according to
https://learn.microsoft.com/en-us/windows/console/console-virtual-terminal-sequences
introduced native support for ANSI sequence processing. This allows
using colors from the entire 24-bit color range.

All we need to do is test whether the console's "virtual processing
support" can be enabled. If it can, we do not even need to start the
`console_thread` to handle ANSI sequences.

Or, almost all we need to do: When `console_thread()` does its work, it
uses the Unicode-aware `write_console()` function to write to the Win32
Console, which supports Git for Windows' implicit convention that all
text that is written is encoded in UTF-8. The same is not necessarily
true if native ANSI sequence processing is used, as the output is then
subject to the current code page. Let's ensure that the code page is set
to `CP_UTF8` as long as Git writes to it.

Signed-off-by: Johannes Schindelin <[email protected]>
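
The test described in the commit message, whether the console's virtual terminal processing can be enabled, might look roughly like the sketch below. It is an assumption-laden illustration rather than the committed code: `try_enable_native_ansi()` is a hypothetical name, and the fallback `#define` is only there for older SDK headers. `GetStdHandle()`, `GetConsoleMode()`, `SetConsoleMode()` and `ENABLE_VIRTUAL_TERMINAL_PROCESSING` are the actual Win32 names.

```c
#ifdef _WIN32
#include <windows.h>

/* Older SDK headers may lack this flag; 0x0004 is its documented value. */
#ifndef ENABLE_VIRTUAL_TERMINAL_PROCESSING
#define ENABLE_VIRTUAL_TERMINAL_PROCESSING 0x0004
#endif
#endif

/*
 * Hypothetical probe: returns 1 if stdout is a Win32 Console that has
 * (or accepted) ENABLE_VIRTUAL_TERMINAL_PROCESSING, and 0 otherwise.
 */
int try_enable_native_ansi(void)
{
#ifdef _WIN32
	HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
	DWORD mode;

	if (out == INVALID_HANDLE_VALUE || out == NULL)
		return 0; /* no standard output handle */
	if (!GetConsoleMode(out, &mode))
		return 0; /* not a console, e.g. redirected to a file or pipe */
	if (mode & ENABLE_VIRTUAL_TERMINAL_PROCESSING)
		return 1; /* already enabled */
	if (!SetConsoleMode(out, mode | ENABLE_VIRTUAL_TERMINAL_PROCESSING))
		return 0; /* console refused the flag: use the emulation instead */
	return 1;
#else
	return 0; /* no Win32 console mode to change */
#endif
}
```

If the probe returns 1, the ANSI-emulating thread is unnecessary, but the output code page then has to be `CP_UTF8` for UTF-8 text to be displayed correctly.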
@dscho dscho self-assigned this May 25, 2024
@dscho dscho added this to the Next release milestone May 25, 2024
@dscho dscho merged commit 9cf5174 into git-for-windows:main May 26, 2024
44 checks passed
@dscho dscho deleted the ensure-utf-8-code-page-on-older-windows branch May 26, 2024 18:22
dscho commented May 26, 2024

/add relnote bug When Git for Windows v2.44.0 introduced the ability to use native Win32 Console ANSI sequence processing, an inadvertent fallout was that in this instance, non-ASCII characters were no longer printed correctly unless the current code page was set to 65001. This bug has been fixed.

The workflow run was started

github-actions bot pushed a commit to git-for-windows/build-extra that referenced this pull request May 26, 2024
When Git for Windows v2.44.0 introduced the ability [to use native Win32
Console ANSI sequence
processing](git-for-windows/git#4700), an
inadvertent fallout was that in this instance, [non-ASCII characters
were no longer printed correctly unless the current code page was set to
65001](git-for-windows/git#4851). This bug
[has been fixed](git-for-windows/git#4968).

Signed-off-by: gitforwindowshelper[bot] <[email protected]>
Development

Successfully merging this pull request may close these issues.

Encoding problems with non-ASCII characters with Git 2.44.0