Skip to content

Commit 28a2359

Browse files
fdmananamasoncl
authored andcommitted
Btrfs: fix lockdep warning on deadlock against an inode's log mutex
Commit 44f714d ("Btrfs: improve performance on fsync against new inode after rename/unlink"), which landed in 4.8-rc2, introduced a possibility for a deadlock due to double locking of an inode's log mutex by the same task, which lockdep reports with: [23045.433975] ============================================= [23045.434748] [ INFO: possible recursive locking detected ] [23045.435426] 4.7.0-rc6-btrfs-next-34+ rib#1 Not tainted [23045.436044] --------------------------------------------- [23045.436044] xfs_io/3688 is trying to acquire lock: [23045.436044] (&ei->log_mutex){+.+...}, at: [<ffffffffa038552d>] btrfs_log_inode+0x13a/0xc95 [btrfs] [23045.436044] but task is already holding lock: [23045.436044] (&ei->log_mutex){+.+...}, at: [<ffffffffa038552d>] btrfs_log_inode+0x13a/0xc95 [btrfs] [23045.436044] other info that might help us debug this: [23045.436044] Possible unsafe locking scenario: [23045.436044] CPU0 [23045.436044] ---- [23045.436044] lock(&ei->log_mutex); [23045.436044] lock(&ei->log_mutex); [23045.436044] *** DEADLOCK *** [23045.436044] May be due to missing lock nesting notation [23045.436044] 3 locks held by xfs_io/3688: [23045.436044] #0: (&sb->s_type->i_mutex_key#15){+.+...}, at: [<ffffffffa035f2ae>] btrfs_sync_file+0x14e/0x425 [btrfs] [23045.436044] rib#1: (sb_internal#2){.+.+.+}, at: [<ffffffff8118446b>] __sb_start_write+0x5f/0xb0 [23045.436044] rib#2: (&ei->log_mutex){+.+...}, at: [<ffffffffa038552d>] btrfs_log_inode+0x13a/0xc95 [btrfs] [23045.436044] stack backtrace: [23045.436044] CPU: 4 PID: 3688 Comm: xfs_io Not tainted 4.7.0-rc6-btrfs-next-34+ rib#1 [23045.436044] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014 [23045.436044] 0000000000000000 ffff88022f5f7860 ffffffff8127074d ffffffff82a54b70 [23045.436044] ffffffff82a54b70 ffff88022f5f7920 ffffffff81092897 ffff880228015d68 [23045.436044] 0000000000000000 ffffffff82a54b70 ffffffff829c3f00 ffff880228015d68 [23045.436044] Call Trace: [23045.436044] [<ffffffff8127074d>] dump_stack+0x67/0x90 [23045.436044] [<ffffffff81092897>] __lock_acquire+0xcbb/0xe4e [23045.436044] [<ffffffff8109155f>] ? mark_lock+0x24/0x201 [23045.436044] [<ffffffff8109179a>] ? mark_held_locks+0x5e/0x74 [23045.436044] [<ffffffff81092de0>] lock_acquire+0x12f/0x1c3 [23045.436044] [<ffffffff81092de0>] ? lock_acquire+0x12f/0x1c3 [23045.436044] [<ffffffffa038552d>] ? btrfs_log_inode+0x13a/0xc95 [btrfs] [23045.436044] [<ffffffffa038552d>] ? btrfs_log_inode+0x13a/0xc95 [btrfs] [23045.436044] [<ffffffff814a51a4>] mutex_lock_nested+0x77/0x3a7 [23045.436044] [<ffffffffa038552d>] ? btrfs_log_inode+0x13a/0xc95 [btrfs] [23045.436044] [<ffffffffa039705e>] ? btrfs_release_delayed_node+0xb/0xd [btrfs] [23045.436044] [<ffffffffa038552d>] btrfs_log_inode+0x13a/0xc95 [btrfs] [23045.436044] [<ffffffffa038552d>] ? btrfs_log_inode+0x13a/0xc95 [btrfs] [23045.436044] [<ffffffff810a0ed1>] ? vprintk_emit+0x453/0x465 [23045.436044] [<ffffffffa0385a61>] btrfs_log_inode+0x66e/0xc95 [btrfs] [23045.436044] [<ffffffffa03c084d>] log_new_dir_dentries+0x26c/0x359 [btrfs] [23045.436044] [<ffffffffa03865aa>] btrfs_log_inode_parent+0x4a6/0x628 [btrfs] [23045.436044] [<ffffffffa0387552>] btrfs_log_dentry_safe+0x5a/0x75 [btrfs] [23045.436044] [<ffffffffa035f464>] btrfs_sync_file+0x304/0x425 [btrfs] [23045.436044] [<ffffffff811acaf4>] vfs_fsync_range+0x8c/0x9e [23045.436044] [<ffffffff811acb22>] vfs_fsync+0x1c/0x1e [23045.436044] [<ffffffff811acc79>] do_fsync+0x31/0x4a [23045.436044] [<ffffffff811ace99>] SyS_fsync+0x10/0x14 [23045.436044] [<ffffffff814a88e5>] entry_SYSCALL_64_fastpath+0x18/0xa8 [23045.436044] [<ffffffff8108f039>] ? trace_hardirqs_off_caller+0x3f/0xaa An example reproducer for this is: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ mkdir /mnt/dir $ touch /mnt/dir/foo $ sync $ mv /mnt/dir/foo /mnt/dir/bar $ touch /mnt/dir/foo $ xfs_io -c "fsync" /mnt/dir/bar This is because while logging the inode of file bar we end up logging its parent directory (since its inode has an unlink_trans field matching the current transaction id due to the rename operation), which in turn logs the inodes for all its new dentries, so that the new inode for the new file named foo gets logged which in turn triggered another logging attempt for the inode we are fsync'ing, since that inode had an old name that corresponds to the name of the new inode. So fix this by ensuring that when logging the inode for a new dentry that has a name matching an old name of some other inode, we don't log again the original inode that we are fsync'ing. Fixes: 44f714d ("Btrfs: improve performance on fsync against new inode after rename/unlink") Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Chris Mason <[email protected]>
1 parent 1ba98d0 commit 28a2359

File tree

3 files changed

+8
-4
lines changed

3 files changed

+8
-4
lines changed

fs/btrfs/file.c

+1-1
Original file line numberDiff line numberDiff line change
@@ -2070,7 +2070,7 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
20702070
}
20712071
trans->sync = true;
20722072

2073-
btrfs_init_log_ctx(&ctx);
2073+
btrfs_init_log_ctx(&ctx, inode);
20742074

20752075
ret = btrfs_log_dentry_safe(trans, root, dentry, start, end, &ctx);
20762076
if (ret < 0) {

fs/btrfs/tree-log.c

+3-2
Original file line numberDiff line numberDiff line change
@@ -2823,7 +2823,7 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans,
28232823
*/
28242824
mutex_unlock(&root->log_mutex);
28252825

2826-
btrfs_init_log_ctx(&root_log_ctx);
2826+
btrfs_init_log_ctx(&root_log_ctx, NULL);
28272827

28282828
mutex_lock(&log_root_tree->log_mutex);
28292829
atomic_inc(&log_root_tree->log_batch);
@@ -4757,7 +4757,8 @@ static int btrfs_log_inode(struct btrfs_trans_handle *trans,
47574757
if (ret < 0) {
47584758
err = ret;
47594759
goto out_unlock;
4760-
} else if (ret > 0) {
4760+
} else if (ret > 0 && ctx &&
4761+
other_ino != btrfs_ino(ctx->inode)) {
47614762
struct btrfs_key inode_key;
47624763
struct inode *other_inode;
47634764

fs/btrfs/tree-log.h

+4-1
Original file line numberDiff line numberDiff line change
@@ -30,15 +30,18 @@ struct btrfs_log_ctx {
3030
int log_transid;
3131
int io_err;
3232
bool log_new_dentries;
33+
struct inode *inode;
3334
struct list_head list;
3435
};
3536

36-
static inline void btrfs_init_log_ctx(struct btrfs_log_ctx *ctx)
37+
static inline void btrfs_init_log_ctx(struct btrfs_log_ctx *ctx,
38+
struct inode *inode)
3739
{
3840
ctx->log_ret = 0;
3941
ctx->log_transid = 0;
4042
ctx->io_err = 0;
4143
ctx->log_new_dentries = false;
44+
ctx->inode = inode;
4245
INIT_LIST_HEAD(&ctx->list);
4346
}
4447

0 commit comments

Comments
 (0)