SQL metadata mtime truncation to microsecond interferes with backups #3122

Open
seth-hunter opened this issue Dec 23, 2022 · 6 comments
Labels: kind/feature (New feature or request)

@seth-hunter

What happened: Files copied into JuiceFS with mtime preservation (e.g. rsync -axHAX) should retain exactly the same mtime. Failure to do so can, for example, cause backup software that uses mtime to decide if a file is changed to unnecessarily read/download from cloud storage. With SQL-based metadata, JuiceFS truncates mtime to the microsecond, resulting in an mtime change.

What you expected to happen: For a file copied into JuiceFS with mtime preservation (e.g. rsync -axHAX), nanosecond-level mtime precision should be retained.

How to reproduce it (as minimally and precisely as possible): rsync -axHAX a file into JuiceFS, then use 'stat' to compare the mtime of the original and the resulting file. Notice that the mtime is truncated to the microsecond: the last three digits of the nanoseconds field are zeroed out.
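The comparison in the reproduction steps can also be scripted. The following is a minimal sketch in Python, not part of the original report; the helper names `classify_mtime` and `compare_files` are hypothetical, and the paths passed to `compare_files` are whatever source and JuiceFS-side copies you want to check.

```python
import os


def classify_mtime(src_ns: int, dst_ns: int) -> str:
    """Classify how a copy's mtime (in nanoseconds) relates to the source's."""
    if src_ns == dst_ns:
        return "identical"
    # Truncation to the microsecond drops the last three decimal digits
    # of the nanosecond timestamp.
    if dst_ns == src_ns - src_ns % 1000:
        return "truncated to microsecond"
    return "different"


def compare_files(src_path: str, dst_path: str) -> str:
    """Compare the nanosecond mtimes of two files on disk."""
    src_ns = os.stat(src_path).st_mtime_ns
    dst_ns = os.stat(dst_path).st_mtime_ns
    return classify_mtime(src_ns, dst_ns)
```

Calling `compare_files(original, copy_on_juicefs)` after the rsync would be expected to return "truncated to microsecond" if the behavior described in this issue is present, assuming the source filesystem stores nanosecond timestamps.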

Environment:

  • JuiceFS version (use juicefs --version) or Hadoop Java SDK version: juicefs version 1.0.2+2022-12-18.1ba6bbd7
  • OS (e.g cat /etc/os-release): Linux
  • Kernel (e.g. uname -a): 4.18.0, x86_64
  • Object storage (cloud provider and region, or self maintained): Backblaze B2
  • Metadata engine info (version, cloud provider managed or self maintained): sqlite
@seth-hunter added the kind/bug (Something isn't working) label Dec 23, 2022
@SandyXSD
Contributor

This is by design and won't be changed in the v1.0 releases.
You may use Redis or TKV (e.g. badger) as the metadata engine if you need nanosecond precision.

@SandyXSD added the kind/feature (New feature or request) label and removed the kind/bug (Something isn't working) label Dec 26, 2022
@SandyXSD self-assigned this Jan 3, 2023
@flipfloptech

I'm seeing a slightly different issue where mtime is being updated to match ctime even when set to preserve. My expectation is that if I opt to preserve the atime and mtime via something like "cp -p", they should be preserved. I'm not sure why or where the mtime is being updated to match the ctime, or where to even start looking.

I'm using minio as object storage, TiKV for metadata.

@NodeGuy

NodeGuy commented Mar 8, 2023

Why isn't this considered a bug?

@davies
Contributor

davies commented May 22, 2023

> Why isn't this considered a bug?

This is a good question. The SQL engine implementation makes this tradeoff for efficiency (it uses a single int64 column for mtime), but it introduces different behavior between meta engines, which could be confusing for users. We should fix that.
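To illustrate the tradeoff, here is a minimal Python sketch of how a single integer column holding microseconds loses the sub-microsecond digits on a round trip, compared with a lossless seconds-plus-nanoseconds pair. This is an illustration only, not JuiceFS's actual schema or code; all function names are hypothetical.

```python
NS_PER_US = 1_000
NS_PER_SEC = 1_000_000_000


def to_us_column(mtime_ns: int) -> int:
    """Encode a nanosecond timestamp as one integer of microseconds (lossy)."""
    return mtime_ns // NS_PER_US


def from_us_column(value_us: int) -> int:
    """Decode the microsecond column back to nanoseconds."""
    return value_us * NS_PER_US


def to_split_columns(mtime_ns: int) -> tuple[int, int]:
    """Encode as (seconds, nanoseconds), which keeps full precision."""
    return divmod(mtime_ns, NS_PER_SEC)


def from_split_columns(sec: int, nsec: int) -> int:
    """Decode the (seconds, nanoseconds) pair back to nanoseconds."""
    return sec * NS_PER_SEC + nsec
```

For example, `from_us_column(to_us_column(1671800000123456789))` gives `1671800000123456000`, while the split encoding round-trips the original value unchanged.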

@davies
Contributor

davies commented May 22, 2023

> I'm seeing a slightly different issue where mtime is being updated to match ctime even when set to preserve. My expectation is that if I opt to preserve the atime and mtime via something like "cp -p", they should be preserved. I'm not sure why or where the mtime is being updated to match the ctime, or where to even start looking.
>
> I'm using minio as object storage, TiKV for metadata.

The mtime was updated when data was written to the object store, which overwrote the mtime set by cp -p. This issue is fixed in 1.1 by #3552.
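A quick way to check whether a given mount still shows this second problem is to copy a file with metadata preservation and compare the mtimes afterwards. The sketch below uses Python's `shutil.copy2`, which copies the data first and then the metadata, roughly like `cp -p`; the helper names `mtime_preserved` and `copy_and_check` are hypothetical.

```python
import os
import shutil


def mtime_preserved(src: str, dst: str) -> bool:
    """Return True if dst has exactly the same nanosecond mtime as src."""
    return os.stat(src).st_mtime_ns == os.stat(dst).st_mtime_ns


def copy_and_check(src: str, dst: str) -> bool:
    """Copy src to dst with metadata, then report whether mtime survived."""
    shutil.copy2(src, dst)  # data is written first, then timestamps are set
    return mtime_preserved(src, dst)
```

If `copy_and_check` returns False on a JuiceFS mount, comparing the destination's mtime with its ctime should help tell this overwrite problem apart from the microsecond truncation reported at the top of the issue.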

@davies added this to the Release 1.1 milestone May 22, 2023
@davies closed this as completed May 24, 2023
@seth-hunter
Author

> different behavior between meta engines, could be confusing for users, we should fix that

@davies: in the quote above you seemed to agree that the original issue reported here should be fixed, but...

> The mtime was updated when data is written into object store, which overwrites the mtime set by cp -p. This issue is fixed in 1.1 by #3552

... here you closed the issue, while the referenced #3552 fixed a different issue (mtime/ctime confusion). The original issue still seems to be present, i.e. the SQL backend truncates nanoseconds, resulting in different behavior between metadata engines. I'd still recommend that this be considered a problem and that the issue be re-opened.

@davies reopened this Dec 16, 2024