
JuiceFS does not support OFD lock (Linux only) #4081

Closed
FabriceLuo opened this issue Oct 7, 2023 · 10 comments

FabriceLuo commented Oct 7, 2023

What happened:
I store virtual machine image files on JuiceFS and merge differential data with qemu-img commit. If the process crashes or does not actively release its lock during the first commit, the second commit fails with an error saying the lock could not be acquired.

What you expected to happen:
Regardless of whether the first commit succeeds, the second commit should succeed.

How to reproduce it (as minimally and precisely as possible):
Besides my own development environment, this can be reproduced on a stock Debian Linux system with the following steps:

  1. Under one JuiceFS mount directory, create two qcow2 files:
     qemu-img create -f qcow2 test1.qcow2 10G
     qemu-img create -f qcow2 test2.qcow2 10G
  2. Rebase test2.qcow2 onto test1.qcow2 as its backing file:
     qemu-img rebase -u -b test1.qcow2 test2.qcow2
  3. Commit data from test2.qcow2 to test1.qcow2; the first commit succeeds:
     qemu-img commit test2.qcow2
  4. Run the commit again; the second commit fails:
     qemu-img commit test2.qcow2
     # error message
     qemu-img-real: Could not open 'test2.qcow2': Could not open backing file: Failed to get shared "write" lock
     Is another process using the image [test1.qcow2]?

Anything else we need to know?
I added some logging to pkg/vfs/vfs.go and found that the lockOwner passed in as a parameter was inconsistent with the one stored in f, so the lock was never released. In Flush, I changed the owner argument of v.Meta.Setlk from lockOwner to f.flockOwner, and the problem no longer occurs. I don't know whether this modification is reasonable.
(screenshots attached)
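To illustrate the failure mode described above, here is a minimal, hypothetical sketch (not the actual JuiceFS code; the lockTable type and owner values are invented): locks are tracked per owner token, so releasing with a token that differs from the one used to acquire the lock is a no-op and the lock lingers.

```go
// Hypothetical illustration of the owner mismatch: releasing a lock with a
// different owner token than the one used to acquire it leaves the lock held.
package main

import "fmt"

type lockKey struct {
	inode uint64
	owner uint64
}

// lockTable is a toy stand-in for per-owner lock bookkeeping.
type lockTable map[lockKey]bool

func (t lockTable) setlk(inode, owner uint64, lock bool) {
	if lock {
		t[lockKey{inode, owner}] = true
	} else {
		delete(t, lockKey{inode, owner}) // no-op when the owner does not match
	}
}

func main() {
	t := lockTable{}
	t.setlk(1, 0xAAAA, true)  // acquired with the file handle's owner token
	t.setlk(1, 0xBBBB, false) // released on flush with a different owner token
	fmt.Println("still locked:", t[lockKey{1, 0xAAAA}]) // true: the lock lingers
}
```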

The access log:
access.log

Environment:

  • JuiceFS version (use juicefs --version) or Hadoop Java SDK version:
    juicefs version 1.1.0+2023-09-04.08c4ae622953

  • Cloud provider or hardware configuration running JuiceFS:
    virtual machine on KVM, 8 CPU cores, 16 GiB memory
    ./juicefs mount --cache-size=0 --no-usage-report -d 'mysql://lzzc86_s3_user1@(200.201.44.45:3306)/lzzc86_s3_db1?tls=skip-verify' /mnt/jfs

  • OS (e.g cat /etc/os-release):
    PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
    NAME="Debian GNU/Linux"
    VERSION_ID="11"
    VERSION="11 (bullseye)"
    VERSION_CODENAME=bullseye
    ID=debian
    HOME_URL="https://www.debian.org/"
    SUPPORT_URL="https://www.debian.org/support"
    BUG_REPORT_URL="https://bugs.debian.org/"

  • Kernel (e.g. uname -a):
    Linux debian11 5.10.0-22-amd64 #1 SMP Debian 5.10.178-3 (2023-04-22) x86_64 GNU/Linux

  • Object storage (cloud provider and region, or self maintained):
    MinIO maintained by self

  • Metadata engine info (version, cloud provider managed or self maintained):
    MySQL maintained by self

  • Network connectivity (JuiceFS to metadata engine, JuiceFS to object storage):
    Gigabit Ethernet

  • Others:

FabriceLuo added the kind/bug label on Oct 7, 2023
FabriceLuo changed the title from "When the user process crashes, the plock held is not released, causing the next lock acquisition to fail." to "When the user process crashes or does not actively release the lock, the plock held is not released, causing the next lock acquisition to fail." on Oct 7, 2023

davies commented Oct 8, 2023

qemu uses open file description (OFD) locks [1], which are quite different from POSIX record locks. FUSE (and therefore JuiceFS) does not support them right now, and unfortunately there is no workaround.

We will update the docs about this.

[1] https://man7.org/linux/man-pages/man2/fcntl.2.html
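For reference, a minimal sketch of taking an OFD lock from Go via golang.org/x/sys/unix (the file name is illustrative). Unlike a POSIX record lock taken with F_SETLK, an OFD lock taken with F_OFD_SETLK is owned by the open file description rather than the process, which is the behavior qemu relies on and the one FUSE cannot currently pass through.

```go
// Minimal sketch of acquiring an OFD write lock on Linux (illustrative only).
package main

import (
	"fmt"

	"golang.org/x/sys/unix"
)

func main() {
	fd, err := unix.Open("test1.qcow2", unix.O_RDWR, 0)
	if err != nil {
		panic(err)
	}
	defer unix.Close(fd)

	lk := unix.Flock_t{
		Type:   unix.F_WRLCK, // exclusive "write" lock
		Whence: unix.SEEK_SET,
		Start:  0,
		Len:    0, // 0 = lock the whole file
		// Pid must be left as 0 for OFD locks.
	}

	// F_OFD_SETLK attaches the lock to the open file description; with the
	// traditional F_SETLK the lock would belong to the calling process instead.
	if err := unix.FcntlFlock(uintptr(fd), unix.F_OFD_SETLK, &lk); err != nil {
		fmt.Println("failed to get OFD write lock:", err)
		return
	}
	fmt.Println("OFD write lock acquired")
}
```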


davies commented Oct 8, 2023

We tried to fix this in #2649, but it conflicted with POSIX record locks and was eventually reverted.

davies removed the kind/bug label on Oct 8, 2023
davies changed the title from "When the user process crashes or does not actively release the lock, the plock held is not released, causing the next lock acquisition to fail." to "JuiceFS does not support OFD lock (Linux only)" on Oct 8, 2023

davies commented Oct 8, 2023

We have a workaround for the simple case in #4083, can you try that?

FabriceLuo commented:

Thanks! I found a FUSE kernel patch that seems to address this problem. Could this patch fix it?
https://lore.kernel.org/lkml/[email protected]/T/#m3696f9f4052d4b6f116fdecb22f95f3ea0e37d5d


davies commented Oct 9, 2023

#4083 provides similar behavior without patching the kernel; can you try that?


davies commented Oct 24, 2023

@FabriceLuo Does #4083 work in your case?


dbotwinick commented Nov 24, 2023

@davies I wouldn't take this as a guaranteed test, but I was able to reproduce write lock errors on juicefs version 1.1.0+2023-09-04.08c4ae62 following the poster's instructions... and then I compiled the ofd_lock branch [#4083] (juicefs version 1.2.0-dev+2023-10-08.08478fea) and was able to run the same commands without error messages about write locks.


dbotwinick commented Nov 24, 2023

@davies I actually found this issue while trying to get QuestDB working on Kubernetes with the JuiceFS CSI driver, and I spent a few hours figuring out how to make it work. From what I've found (and I don't think the problem is exclusive to QuestDB), the database tries to "downgrade" an exclusive lock to a shared lock by calling flock again (without the non-blocking flag), and then it hangs. The flock man page says that should be valid usage.
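For context, a minimal sketch of the downgrade pattern described above, using golang.org/x/sys/unix (the lock file name is illustrative, not QuestDB's code). Per flock(2), a second flock call with LOCK_SH on a descriptor that already holds LOCK_EX converts the lock to shared and should not block:

```go
// Minimal sketch of downgrading an exclusive flock to a shared one (illustrative).
package main

import (
	"fmt"

	"golang.org/x/sys/unix"
)

func main() {
	fd, err := unix.Open("data.lock", unix.O_RDWR|unix.O_CREAT, 0o644)
	if err != nil {
		panic(err)
	}
	defer unix.Close(fd)

	// Take an exclusive lock first.
	if err := unix.Flock(fd, unix.LOCK_EX); err != nil {
		panic(err)
	}
	fmt.Println("exclusive lock held")

	// Downgrade to a shared lock. This second, blocking flock call (no LOCK_NB)
	// is the step that was reported to hang on JuiceFS.
	if err := unix.Flock(fd, unix.LOCK_SH); err != nil {
		panic(err)
	}
	fmt.Println("downgraded to shared lock")
}
```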

I've submitted a PR against the ofd_lock branch that addresses this for the Redis meta backend: #4197. It definitely needs more testing and thought, but at a glance it seems to fix the problem for me, and it's a fairly compact change that (hopefully) shouldn't cause any other issues.


davies commented Nov 25, 2023

@dbotwinick The downgrade issue was fixed by #4179 in the main branch; it will be released next week.

dbotwinick commented:

> @dbotwinick The downgrade issue was fixed by #4179 in the main branch; it will be released next week.

I wish I had noticed that earlier! But oh well... it was a good exercise. Keep up the good work.

davies closed this as completed on Nov 30, 2023