cannot encode <Cyrillic-names> to local encoding 2252 #6810

Closed
voroyam opened this issue Oct 12, 2018 · 12 comments

@voroyam
Contributor

voroyam commented Oct 12, 2018

A user on central reported this:

https://central.owncloud.org/t/some-folders-with-cyrillic-names-are-ignored/16196
Client seems to have a problem with Cyrillic characters. Error message: "cannot encode За принтиране to local encoding 2252"

Client version 2.5

oC version 10.0.10

Platform win 7 pro

From his post:

For several weeks my desktop client has ignored the main folders with Cyrillic names and everything they contain. This issue affects only one of the users. The other two have no problems.
Interestingly, some folders, again with Cyrillic names, were synced after I renamed the folder they were in.

Expected behaviour

Sync all folders and files within to the server and from the server

Actual behaviour

The client shows the folder as ignored with issue: "The filename cannot be encoded on your system."

Steps to reproduce

  1. Rename folder ..\ownCloud\Представяне to "Presentation"
  2. All files in it are uploaded
  3. Rename again to "Представяне"
  4. Client logs "Folder Presentation - deleted"
  5. The "Not Synced" tab shows the issue: "The filename cannot be encoded on your system."

Server configuration

Shared hosting
Operating system: Linux

Web server: Apache 2.4.27

Database: mySQL

Server PHP version: 5.3.24
For ownCloud: 5.6

ownCloud version: latest - 10.0.10.4

Storage backend (external storage): -

Client configuration

Client version: 2.5.0

Operating system: Windows 7 Professional

OS language: English

Installation path of client: C:\Program Files (x86)\ownCloud

Logs

Client log output shows a lot of similar lines

10-12 11:00:09:037 [ info sync.csync.updater ]: cannot encode За принтиране to local encoding 2252
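When the log contains many such lines, it can help to collect the distinct affected names. The following is a minimal sketch that assumes every relevant line follows the format of the single line quoted above; the pattern and the helper name `unencodable_names` are illustrative and not part of the client.

```python
import re

# Pattern modelled on the one log line quoted in this report; other client
# versions may word the message differently.
LOG_PATTERN = re.compile(
    r"cannot encode (?P<name>.+) to local encoding (?P<encoding_id>\d+)"
)


def unencodable_names(log_lines):
    """Return the sorted, de-duplicated (name, encoding id) pairs in the log."""
    found = set()
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if match:
            found.add((match.group("name"), int(match.group("encoding_id"))))
    return sorted(found)


sample = [
    "10-12 11:00:09:037 [ info sync.csync.updater ]: "
    "cannot encode За принтиране to local encoding 2252",
]
print(unencodable_names(sample))
```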

  1. Client logfile: Output of owncloud --logwindow or owncloud --logfile log.txt
    (On Windows using cmd.exe, you might need to first cd into the ownCloud directory)
    (See also http://doc.owncloud.org/desktop/2.2/troubleshooting.html#client-logfile )

I hope this is enough to understand the problem.

@ogoffart
Contributor

ogoffart commented Oct 12, 2018

This happens because the system locale codec is Windows-1252, which cannot encode Cyrillic characters. At least that's what Qt tells us.
Normally, most modern systems use UTF-8.
But perhaps this user configured his system to use Windows-1252 instead of UTF-8?

On Linux this matters, since the file system uses that locale. On Windows, however, the file system is always UCS-2, to my knowledge, so we can probably encode everything.

@ckamm: what do you think? Should we remove the "canEncode" test in _csync_detect_update on windows?
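The encodability problem described in this comment can be reproduced outside the client. This is a rough sketch using Python's built-in codecs rather than Qt's, so it only illustrates the principle; `can_encode` is a hypothetical helper, not the client's function.

```python
def can_encode(name: str, codec: str) -> bool:
    """Return True if every character of `name` is representable in `codec`."""
    try:
        name.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


folder = "За принтиране"
for codec in ("cp1252", "utf-8", "utf-16-le"):
    # cp1252 is Python's name for the Windows-1252 code page.
    print(codec, can_encode(folder, codec))
```

Windows-1252 has no Cyrillic letters at all, so the check fails for the whole name, while both Unicode encodings accept it.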

@guruz guruz added this to the 2.5.1 milestone Oct 15, 2018
@ckamm
Contributor

ckamm commented Oct 17, 2018

@ogoffart It does indeed sound like we should not use the local encoding for this check on Windows. Is Windows UCS-2 or UTF-16? If it's UCS-2, keeping the canEncode() check would make sense to reject the 3 and 4 byte code points.

@ckamm
Contributor

ckamm commented Oct 17, 2018

According to Wikipedia, Windows migrated to UTF-16 with Windows 2000, so no check is needed.
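The practical difference between the two encodings is small: UCS-2 covers only the Basic Multilingual Plane (U+0000 to U+FFFF), while UTF-16 reaches higher code points through surrogate pairs. Only code points above U+FFFF, which take four bytes in UTF-8, would fall outside UCS-2. A small sketch, with hypothetical helper names:

```python
def utf16_code_units(text: str) -> int:
    """Number of 16-bit code units needed to store `text` as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def fits_in_ucs2(text: str) -> bool:
    """UCS-2 has no surrogate pairs, so every code point must be <= U+FFFF."""
    return all(ord(ch) <= 0xFFFF for ch in text)


cyrillic = "За"             # two BMP characters
clef = "\U0001D11E"         # MUSICAL SYMBOL G CLEF, outside the BMP

print(utf16_code_units(cyrillic), fits_in_ucs2(cyrillic))
print(utf16_code_units(clef), fits_in_ucs2(clef))
```

The single clef character needs two UTF-16 code units, which is exactly the case UCS-2 cannot represent.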

@ckamm
Contributor

ckamm commented Oct 17, 2018

Though, to be fully correct, we might need to be wary of unpaired surrogates when reading existing paths (rust-lang/rust#12056). I'm not sure how the Qt filesystem abstraction handles these cases when converting to QString.
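To illustrate the unpaired-surrogate concern: a sequence of 16-bit units may contain a lone surrogate, which is not valid Unicode text and cannot be converted to UTF-8 by a strict encoder. The sketch below uses Python strings to model such a name; it says nothing about how Qt behaves, and the file name is made up.

```python
# A made-up file name containing a lone high surrogate (U+D800).
lone_surrogate_name = "report_\ud800.txt"


def to_utf8_strict(name: str):
    """Encode to UTF-8, returning None if the name is not valid Unicode text."""
    try:
        return name.encode("utf-8")
    except UnicodeEncodeError:
        return None


print(to_utf8_strict("report.txt"))
print(to_utf8_strict(lone_surrogate_name))

# The same name can still be written out as raw 16-bit units when the
# encoder is told to pass surrogates through unchanged.
raw_units = lone_surrogate_name.encode("utf-16-le", "surrogatepass")
print(len(raw_units))
```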

@ogoffart
Contributor

@ckamm: That's not the same thing; that would be taken care of by csync_vio_local_readdir. On Linux we do handle files that cannot be converted to UTF-8, but we don't seem to do so on Windows. But that's another topic.

ogoffart added a commit that referenced this issue Oct 17, 2018
Because on windows, all filename can be encoded: The file system uses
UTF-16, regardless of the locale

Issue: #6810
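The idea stated in the commit message can be sketched as a platform-conditional check. This is not the client's code (which is C++); the function `filename_is_syncable` and its parameters are hypothetical and only model the reasoning that the locale codec is irrelevant on Windows.

```python
import sys


def filename_is_syncable(name: str, locale_codec: str,
                         platform: str = sys.platform) -> bool:
    """Hypothetical stand-in for the encodability check discussed in this issue."""
    if platform.startswith("win"):
        # Assumption taken from the commit message: Windows file names are
        # stored as UTF-16, so the locale codec does not limit them.
        return True
    try:
        name.encode(locale_codec)
        return True
    except UnicodeEncodeError:
        return False


print(filename_is_syncable("Представяне", "cp1252", platform="win32"))
print(filename_is_syncable("Представяне", "cp1252", platform="linux"))
```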
ogoffart added a commit that referenced this issue Oct 18, 2018
Because on windows, all filename can be encoded: The file system uses
UTF-16, regardless of the locale

Issue: #6810
ogoffart added a commit that referenced this issue Oct 18, 2018
Because on windows, all filename can be encoded: The file system uses
UTF-16, regardless of the locale

Issue: #6810
@ogoffart ogoffart added the ReadyToTest QA, please validate the fix/enhancement label Oct 18, 2018
@amc2002

amc2002 commented Oct 23, 2018

Same thing here with Chinese characters in 2.5.0 build 10560. When my boss went back to the previous version, sync issues were no longer an issue.

I tried a test on my own client (Windows 10), and I had the same issue:
10-23 11:29:12:178 [ warning sync.propagator ]: Could not complete propagation of "发送时.txt" by OCC::PropagateIgnoreJob(0x7f34e08) with status 6 and error: "The filename cannot be encoded on your file system."

Will the proposed fix also take care of this issue? Thanks.

@ogoffart
Contributor

@amc2002 It should, please try the 2.5.1 daily build.

@guruz
Contributor

guruz commented Oct 24, 2018

@amc2002

amc2002 commented Oct 24, 2018

Thank you, @guruz. I didn't install yesterday because I wasn't sure which daily I should grab. Just installed the one at the link you posted and can confirm the files with Chinese characters synced as expected. Thanks everyone!

@guruz
Contributor

guruz commented Oct 25, 2018

Great! :)

@guruz guruz closed this as completed Oct 25, 2018
@c13mn14k

c13mn14k commented Jul 28, 2023

I encountered a similar issue today, but in nextcloud. I'll leave a comment here because it might help someone using owncloud.

The Nextcloud client didn't warn me that some files could not be synced. It just skipped checking these files on the remote server, and they didn't appear in my Nextcloud folder as virtual files. When I disabled virtual file support, Nextcloud just ignored them silently. I did some debugging, including running convmv -f utf-8 -t utf-8 -r --notest --nfc nextcloud-data-path server-side, which didn't change any files. I had this in my Nextcloud client log: Cannot encode "Zdjęcia" to local encoding "windows-1252"
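That the convmv run changed nothing suggests the names on the server were probably already in NFC form, so normalization was not the cause. A short sketch of why normalization and the code-page error are separate problems, using only Python's standard library (the helper `is_nfc` is illustrative):

```python
import unicodedata


def is_nfc(name: str) -> bool:
    """Return True if `name` is already in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", name) == name


nfc_name = "Zdj\u0119cia"                          # precomposed "ę" (U+0119)
nfd_name = unicodedata.normalize("NFD", nfc_name)  # "e" + combining ogonek

print(is_nfc(nfc_name), len(nfc_name))
print(is_nfc(nfd_name), len(nfd_name))

# Neither form is representable in Windows-1252, so renormalizing the
# names cannot make this particular error go away.
for candidate in (nfc_name, nfd_name):
    try:
        candidate.encode("cp1252")
        print("encodable")
    except UnicodeEncodeError:
        print("not encodable in cp1252")
```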

Solution

Go to Settings -> Language -> Administrative Language Settings -> Change system locale -> Tick Beta: (lol) Use Unicode UTF-8 for worldwide language support

Reboot

My OS Windows 10.19044.3208
Server OS Ubuntu 22.04.2 LTS
Nextcloud 27.0.1
Nextcloud client 3.9.1

@michaelstingl
Contributor

I encountered a similar issue today, but in nextcloud. I'll leave a comment here because it might help someone using owncloud.

Supposed to be fixed on oC client 2.5.1 back in 2018 🤣
