File-listing output is not converted with iconv automatically, so git status/ls-files/add -i print garbled filenames.
#5004
Closed
1 task done
GitPopcorn opened this issue
Jun 12, 2024
· 5 comments
What did you expect to occur after running these commands?
Filenames containing CJK characters should be printed correctly, as they are in the output of commands like git log and git diff.
What actually happened instead?
Filenames containing CJK characters are encoded by Git as UTF-8 and then decoded by CMD as GBK, so they are rendered as garbled text (a plain wrong-encoding result like '闂淇', not escaped Unicode like '\u95ee\u9898\u4fee\u590d'; the original text is '问题修复').
At the same time, the very same filenames are printed correctly in the output of commands like git log and git diff, so I don't think the problem is caused by my configuration.
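The mismatch described above can be reproduced outside Git. The following is a minimal sketch, assuming Python's built-in `gbk` codec is a close enough stand-in for code page 936; the function names are hypothetical and only illustrate the two cases (raw UTF-8 bytes reaching a GBK console versus bytes converted to GBK first).

```python
ORIGINAL = "问题修复"  # the filename text quoted in this report


def as_seen_without_conversion(text: str) -> str:
    """Simulate UTF-8 bytes being decoded by a GBK console (CHCP 936)."""
    utf8_bytes = text.encode("utf-8")
    # Byte sequences that are not valid GBK are replaced, so characters get lost.
    return utf8_bytes.decode("gbk", errors="replace")


def as_seen_with_conversion(text: str) -> str:
    """Simulate the same text being converted to GBK before the console decodes it."""
    gbk_bytes = text.encode("gbk")
    return gbk_bytes.decode("gbk")
```

Calling `as_seen_without_conversion(ORIGINAL)` yields mojibake of the kind shown above, while `as_seen_with_conversion(ORIGINAL)` returns the original text unchanged.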
Any other details?
1. Why do I think it is caused by missing encoding conversion?
Because commands like git log and git diff work fine while all of my configuration settings and environment variables related to character sets are set to 'UTF-8'. Git cannot print CJK characters correctly in a CMD window that decodes as GBK unless it performs an additional conversion.
2. As far as I remember, this issue did not appear in every version of Git for Windows; it only started after an upgrade. I am sorry that I cannot remember the exact version. I am also trying to downgrade to an older version that works.
3. I found a dynamic link library, libiconv-2.dll, under %GIT_HOME%\mingw64\bin, which seems to be used for encoding conversion.
4. After finding the general cause of the problem, I ran some tests with the standalone iconv command and got interesting results:
4.1. git status | iconv -f UTF-8 -t GBK: The output is back to normal, but the terminal colors are lost.
4.2. git status | iconv -f UTF-8 -t UTF-8: The output is not correct.
4.3. git config --global alias.st2 "!f(){ git status | iconv -f UTF-8 -t GBK; };f" && git st2: The output is not correct and shows a different kind of garbled text.
4.4. git config --global alias.st2 "!f(){ git status | iconv -f UTF-8 -t UTF-8; };f" && git st2: The output is back to normal, but the terminal colors are lost.
4.5. git config --global alias.st2 "!f(){ git status | grep \".*\"; };f" && git st2: The output is back to normal, but the terminal colors are lost.
4.6. What causes the difference between the native command and the alias? I think the output of an alias that runs a shell command is no longer the original bytes, because it has to run in a shell environment, and Git detects the output environment and automatically converts the text printed by the shell command into the matching encoding. So the output is correct only when the command runs inside an alias function with pipeline handling. The native git status command does not do this: it sends bytes that are already encoded as UTF-8 directly to CMD, and no Git configuration that I know of changes the character set used for that encoding.
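As a rough illustration of test 4.1, the conversion that `iconv -f UTF-8 -t GBK` performs can be sketched in Python. This is a workaround sketch, not how Git handles console output: `transcode` and `git_status_for_console` are hypothetical names, and Python's `gbk` codec is assumed to approximate code page 936. Passing `-c color.status=always` asks Git to emit colors even when its output is a pipe, which is why the plain pipeline loses them; whether CMD actually renders the escape sequences depends on the console.

```python
import subprocess


def transcode(data: bytes, source: str = "utf-8", target: str = "gbk") -> bytes:
    """Re-encode bytes from one character set to another, roughly like iconv.

    Bytes that cannot be decoded or encoded are replaced rather than raising.
    """
    return data.decode(source, errors="replace").encode(target, errors="replace")


def git_status_for_console(target: str = "gbk") -> bytes:
    """Run `git status` with forced colors and convert its output for the console."""
    result = subprocess.run(
        ["git", "-c", "color.status=always", "status"],
        capture_output=True,  # stdout is a pipe here, so Git would otherwise drop colors
        check=True,
    )
    return transcode(result.stdout, "utf-8", target)
```

The returned bytes could then be written with `sys.stdout.buffer.write(...)` so that Python does not re-encode them a second time.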
If the problem was occurring with a specific repository, can you provide the URL to that repository to help us with testing?
This issue occurs in any CMD window running with CHCP 936 and in any repository that contains files with CJK characters in their names, so I don't think a specific repository is needed to reproduce it.
Well, I found that it works just fine with Git-2.42.0.2-64-bit, which I downloaded on October 22, 2023. I made no configuration changes during the reinstallation.
So now I'm pretty sure that something changed in the source code of Git for Windows that eventually caused this.
Setup
defaults?
to the issue you're seeing?
All related environment variables and Git configuration settings for character sets were set to UTF-8, as follows:
Details
Which terminal/shell are you running Git from? e.g Bash/CMD/PowerShell/other
CMD, with CHCP 936 (decoding as GBK)
What commands did you run to trigger this issue? If you can provide a Minimal, Complete, and Verifiable example this will help us understand the issue.