btrfs-progs: doc: add a warning when converting to a profile with lower duplication #945

Open: adam900710 (Collaborator) wants to merge 1 commit into base: devel

[BUG]
There is a bug report that when one device of a running btrfs is deleted through sysfs ('/sys/block/<dev>/device/delete'), btrfs still reads from and writes to that device.

Normally this is fine as long as every chunk can tolerate the removed device (e.g. all chunks are RAID1).

The problem appears when one tries to lower the duplication by converting to a less safe profile:

  # mkfs.btrfs -f -m raid1 -d raid1 /dev/sdd /dev/sde
  # mount /dev/sdd /mnt
  # echo 1 > /sys/block/sde/device/delete
  # btrfs balance start --force -mconvert=dup -dconvert=single /mnt

This leads to the fs being forced read-only (RO), with the following error messages:

sd 6:0:0:0: [sde] Synchronizing SCSI cache
ata7.00: Entering standby power mode
btrfs: attempt to access beyond end of device
sde: rw=6145, sector=21696, nr_sectors = 32 limit=0
btrfs: attempt to access beyond end of device
sde: rw=6145, sector=21728, nr_sectors = 32 limit=0
btrfs: attempt to access beyond end of device
sde: rw=6145, sector=21760, nr_sectors = 32 limit=0
BTRFS error (device sdd): bdev /dev/sde errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
BTRFS error (device sdd): bdev /dev/sde errs: wr 2, rd 0, flush 0, corrupt 0, gen 0
BTRFS error (device sdd): bdev /dev/sde errs: wr 3, rd 0, flush 0, corrupt 0, gen 0
BTRFS error (device sdd): bdev /dev/sde errs: wr 3, rd 0, flush 1, corrupt 0, gen 0
btrfs: attempt to access beyond end of device
sde: rw=145409, sector=128, nr_sectors = 8 limit=0
BTRFS warning (device sdd): lost super block write due to IO error on /dev/sde (-5)
BTRFS error (device sdd): bdev /dev/sde errs: wr 4, rd 0, flush 1, corrupt 0, gen 0
btrfs: attempt to access beyond end of device
sde: rw=14337, sector=131072, nr_sectors = 8 limit=0
BTRFS warning (device sdd): lost super block write due to IO error on /dev/sde (-5)
BTRFS error (device sdd): bdev /dev/sde errs: wr 5, rd 0, flush 1, corrupt 0, gen 0
BTRFS error (device sdd): error writing primary super block to device 2
BTRFS info (device sdd): balance: start -dconvert=single -mconvert=dup -sconvert=dup
BTRFS info (device sdd): relocating block group 1372585984 flags data|raid1
BTRFS error (device sdd): bdev /dev/sde errs: wr 5, rd 0, flush 2, corrupt 0, gen 0
BTRFS warning (device sdd): chunk 2446327808 missing 1 devices, max tolerance is 0 for writable mount
BTRFS: error (device sdd) in write_all_supers:4044: errno=-5 IO failure (errors while submitting device barriers.)
BTRFS info (device sdd state E): forced readonly
BTRFS warning (device sdd state E): Skipping commit of aborted transaction.
BTRFS error (device sdd state EA): Transaction aborted (error -5)
BTRFS: error (device sdd state EA) in cleanup_transaction:2017: errno=-5 IO failure
BTRFS info (device sdd state EA): balance: ended with status: -5
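
The per-device error counters in the log above are the earliest sign that a device has gone away. As a rough sketch (the helper name `btrfs_log_dev_errors` is made up for illustration and is not part of btrfs-progs), kernel log text can be scanned for those lines so that only the latest counters per device are shown:

```shell
# Hypothetical helper, not part of btrfs-progs: read kernel log text on
# stdin and print the most recent btrfs error counters seen per device.
btrfs_log_dev_errors() {
    awk '
    /BTRFS error .*: bdev [^ ]+ errs: / {
        # Drop everything up to and including "bdev " (also strips any
        # timestamp prefix), then remove the commas between counters.
        sub(/^.*bdev /, "")
        gsub(/,/, "")
        # Fields are now: <dev> errs: wr N rd N flush N corrupt N gen N
        last[$1] = sprintf("wr=%s rd=%s flush=%s corrupt=%s gen=%s",
                           $4, $6, $8, $10, $12)
    }
    END {
        for (dev in last)
            print dev, last[dev]
    }'
}
```

A typical use would be `dmesg | btrfs_log_dev_errors`. The same counters are also reported by `btrfs device stats <mountpoint>`, which may be the more convenient check on a live system.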

[CAUSE]
Btrfs doesn't have any runtime device error handling; it fully relies on the extra copies provided by the profile.

For sysfs block device removal, there is normally a device shutdown callback into the running fs, but unfortunately btrfs doesn't support that callback either.

Thus even after the device is removed, btrfs keeps accessing it (both reads and writes, even though they will fail).

For a fully RAID1 btrfs, reading and writing the fs as usual still works.
The proper action is to replace the removed/missing/failing device with a new one using `btrfs device replace`.
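
The replacement step could look roughly like the sketch below. It assumes the operation is invoked through the `btrfs replace` command group described in btrfs-replace(8). The function name, the devid `2` and the target `/dev/sdf` are illustrative only. Because the removed device no longer has a usable path, the source is given by devid. The function only prints the commands unless `DRY_RUN=0` is set.

```shell
# Sketch only: prints the commands by default; set DRY_RUN=0 to execute.
# replace_missing_device is a hypothetical wrapper, not a btrfs-progs tool.
replace_missing_device() {
    devid=$1     # btrfs devid of the removed/missing device
    target=$2    # new block device that takes its place
    mnt=$3       # mount point of the affected filesystem

    run=
    [ "${DRY_RUN:-1}" = 0 ] || run=echo

    # List the member devices so the devid can be confirmed first.
    $run btrfs filesystem show "$mnt" &&
    # -B keeps the replace in the foreground until it finishes.
    $run btrfs replace start -B "$devid" "$target" "$mnt" &&
    # Report the final state of the replace operation.
    $run btrfs replace status "$mnt"
}
```

For example, `replace_missing_device 2 /dev/sdf /mnt` prints the three commands for review. Any profile conversion is best left until the replace has completed and the filesystem again has all of its devices.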

But during the convert, btrfs allocates new metadata chunks on the removed device (which loses all writes).

Since the new metadata profile is DUP, which cannot tolerate a missing device for that metadata chunk, this trips the last line of protection at transaction commit time, which flips the fs RO before any real data loss occurs.
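
The "max tolerance is 0" message in the log reflects how many missing devices each profile can survive. The sketch below encodes those per-profile limits and a conservative pre-check. Both function names are hypothetical, and the check treats any missing device as affecting the target profile, which is stricter than the kernel's per-chunk test.

```shell
# Number of missing devices each block group profile can tolerate.
tolerated_failures() {
    case "$1" in
        single|dup|raid0)   echo 0 ;;
        raid1|raid10|raid5) echo 1 ;;
        raid1c3|raid6)      echo 2 ;;
        raid1c4)            echo 3 ;;
        *) echo "unknown profile: $1" >&2; return 1 ;;
    esac
}

# Hypothetical pre-check: succeed only if the target profile tolerates
# the given number of missing devices. Returns 2 for an unknown profile.
convert_is_safe() {
    missing=$1   # number of devices known to be missing or failing
    target=$2    # profile the balance would convert to
    limit=$(tolerated_failures "$target") || return 2
    [ "$missing" -le "$limit" ]
}
```

With one device missing, `convert_is_safe 1 dup` fails while `convert_is_safe 1 raid1` succeeds, which matches the situation in this report: converting metadata to DUP leaves no room for the removed device.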

[DOC ENHANCEMENT]
Add a warning to the `convert` filter documentation about the danger of converting to a less safe profile when there is a known failing/removed device.

Also mention the proper way to handle such a failing/missing device.

The root fix is to introduce failing/removed device detection in btrfs, but that is a pretty big feature and will take quite some time to land upstream.

Reported-by: Jeff Siddall [email protected]
Link: https://lore.kernel.org/linux-btrfs/[email protected]/

Signed-off-by: Qu Wenruo <[email protected]>
adam900710 added the docs (Changes in documentation or help text) label on Feb 3, 2025