
Creating a lot of hardlinks can cause the filesystem to become broken. #3689

Closed · mingyang91 opened this issue May 26, 2023 · 6 comments
Labels: kind/bug Something isn't working

@mingyang91

What happened:
Creating a lot of hardlinks can cause the filesystem to become broken.

What you expected to happen:
Hardlinks work fine.

How to reproduce it (as minimally and precisely as possible):

First create the source file:

echo hello > hello.txt

Then save the following script as reproduce.js:

const fs = require('fs').promises;

async function main() {
    const origin = 'hello.txt'

    // create 1000 hard links to the origin file concurrently and time it
    console.time('hardlink')
    const promises = Array(1000).fill(0).map((_, i) => fs.link(origin, `link-${i}.obj`))
    await Promise.all(promises)
    console.timeEnd('hardlink')

    await sleep(10000)

    // remove the 1000 hard links concurrently and time it
    console.time('remove hardlink')
    const promises2 = Array(1000).fill(0).map((_, i) => fs.unlink(`link-${i}.obj`))
    await Promise.all(promises2)
    console.timeEnd('remove hardlink')
}

main()

function sleep(ms) {
    // wait ms milliseconds before resolving
    return new Promise(resolve => setTimeout(resolve, ms))
}

Run it:

node reproduce.js

Then we can see that the inode reports an incorrect link count.

root@ubuntu:/default/hardlink-verify# node link-benchmark.js 
hardlink: 55177.086ms
remove hardlink: 54855.869ms
root@ubuntu:/default/hardlink-verify# ll
drwxr-xr-x    2 root root  4096 May 26 07:08 ./
drwxrwxrwx    9 root root  4096 May 26 06:23 ../
-rw-r--r-- 1001 root root     5 May 26 07:06 hello.txt # <- only 1 file exists, but a link count of 1001 is reported

Then, if you re-run this script, it fails with an error and the current directory is left in an inconsistent state: the ls command stops responding, and attempting to rm or unlink the files results in an error.

root@ubuntu:/default/hardlink-verify# node link-benchmark.js 
hardlink: 55406.505ms
(node:4551) UnhandledPromiseRejectionWarning: Error: EIO: i/o error, unlink 'link-2.obj'
...
# <- Ctrl + C
root@ubuntu:/default/hardlink-verify# ll hello.txt
-rw-r--r-- 2002 root root 5 May 26 07:06 hello.txt
root@ubuntu:/default/hardlink-verify# ll link-1.obj
-rw-r--r-- 2002 root root 5 May 26 07:06 link-1.obj
root@ubuntu:/default/hardlink-verify# rm link-1.obj
rm: cannot remove 'link-1.obj': Input/output error

However, other directories do not seem to be affected by this inconsistent state.

Is this a bug, or is it a limitation of the implementation or the metadata engine?

In my scenario, I want to create 10,000 hard links within 1 second. Is this possible with JuiceFS? For reference, a throttled variant of the script I would use for this workload is sketched below.
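
This is only a sketch: the batch size of 100 is an arbitrary choice for illustration, not a JuiceFS recommendation, and link-${i}.obj follows the naming used in the reproduction above.

const fs = require('fs').promises;

// Create `total` hard links to `origin` in batches of `batchSize`,
// so that only a limited number of link() calls are in flight at once.
async function createLinks(origin, total, batchSize) {
    console.time('hardlink');
    for (let start = 0; start < total; start += batchSize) {
        const batch = [];
        for (let i = start; i < Math.min(start + batchSize, total); i++) {
            batch.push(fs.link(origin, `link-${i}.obj`));
        }
        await Promise.all(batch);
    }
    console.timeEnd('hardlink');
}

createLinks('hello.txt', 10000, 100).catch(console.error);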

Anything else we need to know?

Environment:

  • JuiceFS version (use juicefs --version) or Hadoop Java SDK version: juicefs version 1.0.0+2022-08-08.cf0c269b
  • Cloud provider or hardware configuration running JuiceFS: Aliyun
  • OS (e.g. cat /etc/os-release): Ubuntu 22.04.2 LTS
  • Kernel (e.g. uname -a): 4.19.91-26.6.al7.x86_64
  • Object storage (cloud provider and region, or self maintained): Aliyun OSS
  • Metadata engine info (version, cloud provider managed or self maintained): PostgreSQL 14.0 Aliyun RDS
  • Network connectivity (JuiceFS to metadata engine, JuiceFS to object storage): Aliyun VPC
  • Others:
mingyang91 added the kind/bug (Something isn't working) label on May 26, 2023
@SandyXSD (Contributor)

It's probably caused by the trash feature. The nlink value is not reduced when you remove a hardlink, because the link is only moved to the trash rather than actually deleted. The EIO is caused by name conflicts when creating the trash entry (and yes, that is a bug for hardlinks).

You may try the test again with trash disabled by running juicefs config META-URL --trash-days 0.
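
To confirm the behavior, the reported link count can be checked directly; here is a minimal sketch using Node's fs.promises.stat, assuming it is run in the same directory as hello.txt from the reproduction above.

const fs = require('fs').promises;

// Print the link count the filesystem reports for hello.txt.
// With trash disabled (--trash-days 0), nlink should drop back to 1
// once all link-*.obj entries have been removed.
fs.stat('hello.txt')
    .then(st => console.log('nlink =', st.nlink))
    .catch(console.error);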

@mingyang91 (Author) commented on May 26, 2023

@SandyXSD, thank you for your reply. Also, does my usage align with the intended design of the hard link implementation?
In my scenario, hundreds to thousands of hard links are frequently created at the same time, and a latency of about 1 second is expected.

@SandyXSD (Contributor)

It sounds like a pretty special case, as few users create that many hardlinks in such a short time.

From the results we got previously, it's possible to create 1K+ hardlinks within 1 second when using Redis as the metadata engine.

@SandyXSD (Contributor)

The EIO is fixed by #3706

@mingyang91 (Author)

Awesome work! I'm glad you fixed the issue. Do you know when the next PATCH release is? I hope it's not too far away.

@SandyXSD (Contributor)

v1.0.5 should be released in July or August, if no critical bugs are found. There is also a minor release, v1.1-beta, on the way; it contains this fix as well and will probably be published next week.
