[Bug]: Cannot encode *polish diacritics* to local encoding "windows-1252" #5935

Closed
5 of 8 tasks
c13mn14k opened this issue Jul 28, 2023 · 9 comments · Fixed by #5971
Assignees
Labels
0. Needs triage approved bug approved by the team

Comments

@c13mn14k

c13mn14k commented Jul 28, 2023


Bug description

The Nextcloud client didn't warn me that some files could not be synced. It simply skipped those files when checking the remote server, so they never appeared in my Nextcloud folder as virtual files. When I disabled virtual file support, Nextcloud silently ignored them. As part of debugging I ran convmv -f utf-8 -t utf-8 -r --notest --nfc nextcloud-data-path on the server, which didn't change any files: on Linux they already had the proper encoding. The Nextcloud debug archive log contained: Cannot encode "Zdjęcia" to local encoding "windows-1252".
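The log message can be reproduced outside the client. The snippet below is only an illustration in Python, not the client's code: it assumes Python's cp1252 codec as a stand-in for the "windows-1252" local encoding and shows that "ę" (U+0119) has no representation there, while UTF-8 and UTF-16 handle it fine.

```python
# Illustration only: Python's "cp1252" codec stands in for windows-1252.
name = "Zdjęcia"

try:
    name.encode("cp1252")
    encodable = True
except UnicodeEncodeError as exc:
    encodable = False
    # exc.start/exc.end delimit the character(s) the codec could not map.
    failing_char = name[exc.start:exc.end]
    print(f"Cannot encode {name!r} to cp1252: offending character {failing_char!r}")

# Unicode encodings represent the same name without loss.
utf8_bytes = name.encode("utf-8")
utf16_bytes = name.encode("utf-16-le")
```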

Solution

Go to Settings -> Language -> Administrative Language Settings -> Change system locale -> Tick Beta: (lol) Use Unicode UTF-8 for worldwide language support

Reboot
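One way to confirm the setting took effect after the reboot is to query the active ANSI code page. This is a sketch that assumes Python is installed on the Windows machine; GetACP is the Win32 call that returns the code page, 65001 is the identifier for UTF-8, and the helper name active_ansi_code_page is made up for this example.

```python
import ctypes
import sys

CP_UTF8 = 65001  # Windows code page identifier for UTF-8


def active_ansi_code_page():
    """Return the active ANSI code page on Windows, or None on other systems."""
    if sys.platform != "win32":
        return None
    return ctypes.windll.kernel32.GetACP()


acp = active_ansi_code_page()
if acp is None:
    print("Not running on Windows; nothing to check.")
elif acp == CP_UTF8:
    print("System ANSI code page is UTF-8 (65001).")
else:
    print(f"System ANSI code page is {acp}; some file names may not be representable.")
```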

My OS Windows 10.19044.3208
Server OS Ubuntu 22.04.2 LTS
Nextcloud 27.0.1
Nextcloud client 3.9.1

Steps to reproduce

  1. Have a Windows install that uses a legacy (non-UTF-8) system encoding such as windows-1252
  2. Create a folder with Polish diacritic "ę"
  3. Try to sync
  4. profit???

Expected behavior

In my opinion Nextcloud isn't responsible for Windows having a random encoding, but the desktop client shouldn't fail silently on this. It should show a helpful suggestion to check the filesystem encoding.
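As a rough idea of what such a warning could be based on, here is a standalone Python sketch that walks a folder and reports every name that cannot be encoded in a given code page. It is not how the client works internally; the function name find_unencodable_names and the cp1252 default are assumptions made for this example.

```python
import os
import tempfile


def find_unencodable_names(root, encoding="cp1252"):
    """Yield paths under root whose last component cannot be encoded."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                name.encode(encoding)
            except UnicodeEncodeError:
                yield os.path.join(dirpath, name)


# Demo on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "Zdjęcia"))
    os.mkdir(os.path.join(root, "Photos"))
    flagged = sorted(find_unencodable_names(root))
    flagged_names = [os.path.basename(path) for path in flagged]
    for path in flagged:
        print(f"warning: {path!r} cannot be encoded as cp1252 and may be skipped")
```

Running it against a real sync folder would mean replacing the temporary directory with that folder's path.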

Which files are affected by this bug

nextcloud.sync.discovery C:\Users\sysadmin\AppData\Local\Temp\2\windows-16700\client-building\desktop\src\libsync\discovery.cpp:302

Operating system

Windows

Which version of the operating system you are running.

Windows 10

Package

Other

Nextcloud Server version

27.0.1

Nextcloud Desktop Client version

3.9.1

Is this bug present after an update or on a fresh install?

Fresh desktop client install

Are you using the Nextcloud Server Encryption module?

Encryption is Disabled

Are you using an external user-backend?

  • Default internal user-backend
  • LDAP/ Active Directory
  • SSO - SAML
  • Other

Nextcloud Server logs

No response

Additional info

This was on a brand new install, since I reinstalled nextcloud as part of debugging process. I'm using nextcloud-aio on docker-compose.

Seems similar: owncloud/issue/6810, nextcloud forum post

@Elberet

Elberet commented Jul 29, 2023

> Solution
>
> Go to Settings -> Language -> Administrative Language Settings -> Change system locale -> Tick Beta: (lol) Use Unicode UTF-8 for worldwide language support

Can confirm that this can be used as a work-around. Thank you!

Still, it's obvious that the Nextcloud client is doing something wrong. NTFS fully supports Unicode in filenames, so converting filenames to a locale-dependent character set for any purpose other than displaying them in a UI seems like a mistake.

@cedvan

cedvan commented Aug 3, 2023

Thank you for your solution, enabling the beta option solved my synchronisation problem!

@cedvan

cedvan commented Aug 3, 2023

Oops, not all good :/

  • Files by artist "Ayọ" were ignored because of the special "ọ" -> solved
  • Files with an "X-X" pattern in their name are ignored ("In-Between.flac", "Hors-Saison.flac", etc.) -> still ignored
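A quick check, sketched in Python, of which of these names fit into windows-1252. It assumes the names are spelled exactly as typed above, with a plain ASCII hyphen, which may not match the real files. Under that assumption the hyphenated names are encodable, so the code page limitation alone would not explain why they are skipped; the thread does not say what does.

```python
# Names as typed in this comment; the real files may use a different dash.
names = ["Ayọ", "In-Between.flac", "Hors-Saison.flac"]


def encodable(name, encoding="cp1252"):
    """Return True if name can be encoded without loss in the given encoding."""
    try:
        name.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


results = {name: encodable(name) for name in names}
for name, ok in results.items():
    print(f"{name!r}: {'encodable' if ok else 'NOT encodable'} in cp1252")
```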

@Elberet

Elberet commented Aug 3, 2023

FWIW, I should add that I encountered this bug in this environment:

  • Windows 11 22621.2070
  • Localized for Germany (de-de), and using locale-appropriate language settings throughout.
  • Source volume is NTFS on a standard GPT disk. chkdsk reports no errors.
  • Nextcloud server is AIO 27.0.1 running fully dockerized.
  • Nextcloud client is version 3.9.1 and is using virtual files.
  • I've removed the default ignore patterns *~ and *.~* as they're matching in-use files that I do not wish to rename.

I've encountered this problem when trying to sync an existing folder to an empty cloud directory; all files currently in Nextcloud's data directory have been created there by Nextcloud itself; all files' filenames are valid UTF-8 encoded strings and none contain any non-printable characters, Unicode control chars, or characters that would be illegal as part of a Windows filename.

At least one directory affected by the bug contains the character U+2606 (☆).
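The name checks described above can be sketched in Python as follows. This is an approximation, not an exhaustive Windows validity test: it only looks for the reserved characters < > : " / \ | ? *, for control characters, and for names that are not NFC-normalized. The helper name name_problems and the sample folder name are invented for the example.

```python
import unicodedata

# Characters that Windows does not allow in a file or folder name.
WINDOWS_ILLEGAL = set('<>:"/\\|?*')


def name_problems(name):
    """Return a list of problems found in a single file or folder name."""
    problems = []
    if any(ch in WINDOWS_ILLEGAL for ch in name):
        problems.append("illegal Windows character")
    if any(unicodedata.category(ch) == "Cc" for ch in name):
        problems.append("control character")
    if unicodedata.normalize("NFC", name) != name:
        problems.append("not NFC-normalized")
    return problems


star_name = "Favourites \u2606"  # hypothetical folder name containing U+2606
star_problems = name_problems(star_name)
star_char_name = unicodedata.name("\u2606")
print(star_name, star_problems, star_char_name)
```

A name containing U+2606 passes all three checks, which fits the observation that the affected names are valid and only the conversion to the local encoding fails.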

@allexzander
Contributor

The 3.9.2 release that was just published fixes the issue. However, if a folder with such files was already synced with 3.9.1, you need to force a resync of that folder by creating an empty file in it, either in the client or on the server. I'm keeping this issue open for now: 3.9.2, with some reverts of MSI build dependencies, was the fastest solution, but an automatic fix will be introduced in later releases, since it is currently unclear how to detect this scenario.
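On the client side, creating the empty file can be scripted. The sketch below assumes Python is available and that the path passed in is the affected folder inside the local sync directory; the function name touch_resync_marker and the marker file name are arbitrary choices for this example, and the demo uses a temporary directory instead of a real sync folder.

```python
import tempfile
from pathlib import Path


def touch_resync_marker(folder, marker_name="resync-marker.txt"):
    """Create an empty file in folder so the folder registers a change."""
    marker = Path(folder) / marker_name
    marker.touch(exist_ok=True)
    return marker


# Demo against a temporary directory standing in for the affected folder.
with tempfile.TemporaryDirectory() as synced_folder:
    marker = touch_resync_marker(synced_folder)
    marker_exists = marker.exists()
    marker_size = marker.stat().st_size
    print(f"created {marker.name} ({marker_size} bytes)")
```

The marker file can be deleted again once the folder has been resynced.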

@allexzander allexzander self-assigned this Aug 9, 2023
@Elberet

Elberet commented Aug 10, 2023

Out of pure curiosity, can you point out the commit that fixed the filename encoding problem? Maybe I'm just blind, but I've found nothing in v3.9.1...v3.9.2 that handles filenames or filepaths in any way.

@allexzander
Contributor

> Out of pure curiosity, can you point out the commit that fixed the filename encoding problem? Maybe I'm just blind, but I've found nothing in v3.9.1...v3.9.2 that handles filenames or filepaths in any way.

Those commits are not in the desktop client repository; they are all in client-building. The issue is not caused by how the client code handles file paths but by a yet-to-be-investigated problem with the pre-built dependencies. See my reverts here: https://github.com/nextcloud/client-building/commits/master
This is partly why it is hard to simply fix it without a proper investigation.

@lssong99

> > Out of pure curiosity, can you point out the commit that fixed the filename encoding problem? Maybe I'm just blind, but I've found nothing in v3.9.1...v3.9.2 that handles filenames or filepaths in any way.
>
> Those are not in the desktop client branch, all in client-building, the issue is not caused by handling file paths in client code but by yet to be investigated issue with pre-built dependencies. See my reverts here https://github.com/nextcloud/client-building/commits/master This, is partially why it is hard to simply fix it without a proper investigation.

Thank you for the effort!
Now I'm really interested in the underlying issue. I also thought it was related to the handling of paths or file names, but it seems to go deeper. I know we non-developers (of this repo) are in no position to ask, but once it has been investigated, I hope a developer can share some background on the issue so we can avoid the same thing in our own development.

@allexzander
Contributor

@mgallien FYI, this is related to building the MSI with Craft dependencies

@allexzander allexzander added the approved bug approved by the team label Aug 11, 2023
mgallien added a commit that referenced this issue Aug 16, 2023
mgallien added a commit that referenced this issue Aug 17, 2023

5 participants