Deadlock during concurrent truncate and release #3454

Closed
hit1943 opened this issue Apr 10, 2023 · 2 comments · Fixed by #3457
Labels
kind/bug Something isn't working

Comments

hit1943 commented Apr 10, 2023

What happened:
Deadlock during concurrent truncate and release

goroutine 467799 [select]:
runtime.gopark(0xc001fdfbe8?, 0x2?, 0x9?, 0x18?, 0xc001fdfbbc?)
/usr/local/go/src/runtime/proc.go:363 +0xd6 fp=0xc001fdfa38 sp=0xc001fdfa18 pc=0x4575f6
runtime.selectgo(0xc001fdfbe8, 0xc001fdfbb8, 0x1b55746?, 0x0, 0xc001fdfc10?, 0x1)
/usr/local/go/src/runtime/select.go:328 +0x7bc fp=0xc001fdfb78 sp=0xc001fdfa38 pc=0x46773c
github.com/juicedata/juicefs/pkg/utils.(*Cond).WaitWithTimeout(0xc0311e1938, 0x3b9aca00?)
/home/liulei.88/juicefs/pkg/utils/cond.go:72 +0x125 fp=0xc001fdfc20 sp=0xc001fdfb78 pc=0xa2a505
github.com/juicedata/juicefs/pkg/vfs.(*handle).Wlock(0xc01c5a7a40, {0x303e260?, 0xc0022add80})
/home/liulei.88/juicefs/pkg/vfs/handle.go:115 +0x99 fp=0xc001fdfc88 sp=0xc001fdfc20 pc=0x1b55f19
github.com/juicedata/juicefs/pkg/vfs.(*VFS).Truncate(0xc001738160, {0x303e260?, 0xc0022add80}, 0x2f373f, 0x0, 0x70?, 0x5210c6?)
/home/liulei.88/juicefs/pkg/vfs/vfs.go:427 +0x132 fp=0xc001fdfd08 sp=0xc001fdfc88 pc=0x1b64032
github.com/juicedata/juicefs/pkg/vfs.(*VFS).SetAttr(0xc001738160, {0x303e260?, 0xc0022add80}, 0x2f373f, 0x248, 0x0?, 0x0, 0x0, 0x0, 0x0, ...)
/home/liulei.88/juicefs/pkg/vfs/vfs_unix.go:159 +0x1e5 fp=0xc001fdfdd0 sp=0xc001fdfd08 pc=0x1b6bbe5
github.com/juicedata/juicefs/pkg/fuse.(*fileSystem).SetAttr(0xc00174c600, 0x21a4158?, 0xc0031701b0, 0xc001f85c08)
/home/liulei.88/juicefs/pkg/fuse/fuse.go:127 +0x10e fp=0xc001fdfe78 sp=0xc001fdfdd0 pc=0x21aa16e
github.com/hanwen/go-fuse/v2/fuse.doSetattr(0xc001f85b00?, 0xc001f85b00)
/home/liulei.88/go/pkg/mod/code.byted.org/data-system-ste/go-fuse/[email protected]/fuse/opcode.go:195 +0x55 fp=0xc001fdfea8 sp=0xc001fdfe78 pc=0x21995b5
github.com/hanwen/go-fuse/v2/fuse.(*Server).handleRequest(0xc001736840, 0xc001f85b00)
/home/liulei.88/go/pkg/mod/code.byted.org/data-system-ste/go-fuse/[email protected]/fuse/server.go:483 +0x1f3 fp=0xc001fdff50 sp=0xc001fdfea8 pc=0x21a4fd3
github.com/hanwen/go-fuse/v2/fuse.(*Server).loop(0xc001736840, 0xf0?)
/home/liulei.88/go/pkg/mod/code.byted.org/data-system-ste/go-fuse/[email protected]/fuse/server.go:456 +0x108 fp=0xc001fdffc0 sp=0xc001fdff50 pc=0x21a4c68
github.com/hanwen/go-fuse/v2/fuse.(*Server).readRequest.func3()
/home/liulei.88/go/pkg/mod/code.byted.org/data-system-ste/go-fuse/[email protected]/fuse/server.go:323 +0x2b fp=0xc001fdffe0 sp=0xc001fdffc0 pc=0x21a42eb
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc001fdffe8 sp=0xc001fdffe0 pc=0x489941
created by github.com/hanwen/go-fuse/v2/fuse.(*Server).readRequest
/home/liulei.88/go/pkg/mod/code.byted.org/data-system-ste/go-fuse/[email protected]/fuse/server.go:323 +0x53e

goroutine 472123 [select]:
runtime.gopark(0xc000c2ed38?, 0x2?, 0xa6?, 0xd?, 0xc000c2ed0c?)
/usr/local/go/src/runtime/proc.go:363 +0xd6 fp=0xc000c2eb88 sp=0xc000c2eb68 pc=0x4575f6
runtime.selectgo(0xc000c2ed38, 0xc000c2ed08, 0x0?, 0x0, 0x2?, 0x1)
/usr/local/go/src/runtime/select.go:328 +0x7bc fp=0xc000c2ecc8 sp=0xc000c2eb88 pc=0x46773c
github.com/juicedata/juicefs/pkg/utils.(*Cond).WaitWithTimeout(0xc0311e1578, 0x3b9aca00?)
/home/liulei.88/juicefs/pkg/utils/cond.go:72 +0x125 fp=0xc000c2ed70 sp=0xc000c2ecc8 pc=0xa2a505
github.com/juicedata/juicefs/pkg/vfs.(*VFS).Release(0xc001738160, {0x303e260?, 0xc0037e97c0}, 0x2f373f, 0x1af80e)
/home/liulei.88/juicefs/pkg/vfs/vfs.go:462 +0x27a fp=0xc000c2ee28 sp=0xc000c2ed70 pc=0x1b6457a
github.com/juicedata/juicefs/pkg/fuse.(*fileSystem).Release(0xc00174c600, 0xc00380c1b0?, 0xc000a51c98)
/home/liulei.88/juicefs/pkg/fuse/fuse.go:296 +0x96 fp=0xc000c2ee80 sp=0xc000c2ee28 pc=0x21ac336
github.com/hanwen/go-fuse/v2/fuse.doRelease(0xc000a51b00?, 0xc000a51b00?)
/home/liulei.88/go/pkg/mod/code.byted.org/data-system-ste/go-fuse/[email protected]/fuse/opcode.go:387 +0x30 fp=0xc000c2eea8 sp=0xc000c2ee80 pc=0x219aa70
github.com/hanwen/go-fuse/v2/fuse.(*Server).handleRequest(0xc001736840, 0xc000a51b00)
/home/liulei.88/go/pkg/mod/code.byted.org/data-system-ste/go-fuse/[email protected]/fuse/server.go:483 +0x1f3 fp=0xc000c2ef50 sp=0xc000c2eea8 pc=0x21a4fd3
github.com/hanwen/go-fuse/v2/fuse.(*Server).loop(0xc001736840, 0xa0?)
/home/liulei.88/go/pkg/mod/code.byted.org/data-system-ste/go-fuse/[email protected]/fuse/server.go:456 +0x108 fp=0xc000c2efc0 sp=0xc000c2ef50 pc=0x21a4c68
github.com/hanwen/go-fuse/v2/fuse.(*Server).readRequest.func3()
/home/liulei.88/go/pkg/mod/code.byted.org/data-system-ste/go-fuse/[email protected]/fuse/server.go:323 +0x2b fp=0xc000c2efe0 sp=0xc000c2efc0 pc=0x21a42eb
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc000c2efe8 sp=0xc000c2efe0 pc=0x489941
created by github.com/hanwen/go-fuse/v2/fuse.(*Server).readRequest
/home/liulei.88/go/pkg/mod/code.byted.org/data-system-ste/go-fuse/[email protected]/fuse/server.go:323 +0x53e

What you expected to happen:
Concurrent truncate and release on the same file should complete without deadlocking.
How to reproduce it (as minimally and precisely as possible):

/tmp/jfs is a directory mounted by JuiceFS. Run the following script:
for i in $(seq 1 100)
do
    truncate -s 1 /tmp/jfs/xixixi &
    echo "sdfs" > /tmp/jfs/xixixi &
done
Anything else we need to know?

Environment:

  • JuiceFS version (use juicefs --version) or Hadoop Java SDK version:
  • Cloud provider or hardware configuration running JuiceFS:
  • OS (e.g cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Object storage (cloud provider and region, or self maintained):
  • Metadata engine info (version, cloud provider managed or self maintained):
  • Network connectivity (JuiceFS to metadata engine, JuiceFS to object storage):
  • Others:
@hit1943 hit1943 added the kind/bug Something isn't working label Apr 10, 2023
Contributor

davies commented Apr 10, 2023

Fixed by #1383

@davies davies closed this as completed Apr 10, 2023
Author

hit1943 commented Apr 10, 2023

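func (v *VFS) releaseHandle(inode Ino, fh uint64) {
	v.hanleM.Lock()
	defer v.hanleM.Unlock()
	hs := v.handles[inode]
	for i, f := range hs {
		if f.fh == fh {
			if i+1 < len(hs) {
				hs[i] = hs[len(hs)-1]
				/*
				 * This runs under the big v.hanleM lock and removes one handle
				 * pointer from hs by overwriting the element to delete with the
				 * last element of the slice. At this point two elements of hs
				 * hold the same handle pointer. A concurrent Truncate only holds
				 * v.hanleM inside findAllHandles, so it can work on the same hs
				 * slice at the same time and try to take the write lock on every
				 * handle in hs. While iterating the slice it visits the same
				 * handle twice and locks it twice, which causes the deadlock.
				 */
			}
			if len(hs) > 1 {
				v.handles[inode] = hs[:len(hs)-1]
			} else {
				delete(v.handles, inode)
			}
			break
		}
	}
}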
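
The aliasing described in the comment inside releaseHandle can be shown in isolation. The following is a minimal sketch, not JuiceFS code: `handle`, `swapDelete`, `snapshot`, and `countDuplicates` are hypothetical names made up for illustration, and the concurrent release is serialized so the result is deterministic. It shows that a caller still holding the old slice header sees the same pointer twice after a swap-delete, while a caller holding a copy does not. Copying the slice under the lock is one possible mitigation and is assumed here; it is not necessarily what the linked fix does.

```go
package main

import "fmt"

// handle is a hypothetical stand-in for the vfs handle type; only fh matters here.
type handle struct{ fh uint64 }

// swapDelete mirrors the in-place removal in releaseHandle: it overwrites the
// matching element with the last one and returns the slice shortened by one.
// The backing array stays shared with any slice header taken earlier.
func swapDelete(hs []*handle, fh uint64) []*handle {
	for i, h := range hs {
		if h.fh == fh {
			hs[i] = hs[len(hs)-1]
			return hs[:len(hs)-1]
		}
	}
	return hs
}

// snapshot returns an independent copy, so later in-place edits are not visible.
func snapshot(hs []*handle) []*handle {
	return append([]*handle(nil), hs...)
}

// countDuplicates reports how many entries repeat an earlier pointer.
func countDuplicates(hs []*handle) int {
	seen := make(map[*handle]bool)
	dups := 0
	for _, h := range hs {
		if seen[h] {
			dups++
		}
		seen[h] = true
	}
	return dups
}

func main() {
	handles := []*handle{{fh: 1}, {fh: 2}, {fh: 3}}
	aliased := handles          // what a caller holds if the slice is returned directly
	copied := snapshot(handles) // what a caller holds if the slice is copied under the lock

	handles = swapDelete(handles, 1) // release of fh 1, serialized here for clarity

	fmt.Println("aliased duplicates:", countDuplicates(aliased))
	fmt.Println("copied duplicates:", countDuplicates(copied))
}
```

A caller that write-locks every entry of the aliased view would try to lock the duplicated handle twice, which matches the hang seen in the Truncate goroutine above.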
