Detaching the spares returned the pools to a healthy state. Here is the procedure our admins used to get the pool back to normal:
1. zpool detach <pool> <GUID of L6/old>
   1. It detached, but the vdev was still left with 2 ONLINE spares.
2. zpool detach <pool> draid2-0-1
   1. The spare detached and the good L6 moved up one indentation level, but draid2-0-0 didn't auto-detach.
3. zpool detach <pool> draid2-0-0
   1. The spare detached, leaving everything looking normal.
4. Started a scrub.
Describe how to reproduce the problem
We will need to develop a test case to reproduce this. I think it would be roughly:
1. Create a dRAID pool with 2 spares.
2. Fault one of the disks, call it disk1.
3. Let the dRAID spare kick in.
4. Replace disk1 with a new disk, called disk1-new.
5. While it's resilvering to disk1-new, fault disk1-new.
6. See if the 2nd spare kicks in.
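The steps above could be sketched as a shell script along the following lines. This is only a rough, untested outline and not the test case that was eventually written: the pool name, the directory, the file-backed vdevs, and the `run`/`repro` helpers are all assumptions made for illustration. It prints the commands by default and only executes them when `DRY_RUN=0` is set, which should only be done as root on a disposable test machine. Whether the second spare actually attaches depends on ZED running, and on the fault landing while the resilver is still in progress, which in practice needs data in the pool.

```shell
#!/bin/sh
# Hypothetical reproduction sketch. POOL, DIR, and the file-backed disks are
# assumptions, not taken from the original report.
POOL=${POOL:-testpool}
DIR=${DIR:-/var/tmp/draid-repro}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

repro() {
    # Ten file-backed children for the dRAID vdev, plus one replacement disk.
    run mkdir -p "$DIR"
    DISKS=""
    for i in 1 2 3 4 5 6 7 8 9 10; do
        run truncate -s 1G "$DIR/disk$i"
        DISKS="$DISKS $DIR/disk$i"
    done
    run truncate -s 1G "$DIR/disk1-new"

    # 1. Create a dRAID2 pool with 2 distributed spares
    #    (they show up as draid2-0-0 and draid2-0-1).
    # $DISKS is left unquoted on purpose so it splits into separate arguments.
    run zpool create "$POOL" draid2:2s $DISKS

    # 2. Force disk1 into the FAULTED state.
    run zpool offline -f "$POOL" "$DIR/disk1"

    # 3. Attach the first spare by hand; with ZED running this would
    #    normally happen automatically.
    run zpool replace "$POOL" "$DIR/disk1" draid2-0-0

    # 4. Replace the failed disk with the new one.
    run zpool replace "$POOL" "$DIR/disk1" "$DIR/disk1-new"

    # 5. Fault the new disk while it is (ideally) still resilvering.
    run zpool offline -f "$POOL" "$DIR/disk1-new"

    # 6. Check whether a second spare was attached to the same vdev.
    run zpool status "$POOL"
}

repro
```

On an empty pool the resilver in step 4 finishes almost immediately, so a real test would first need to write data to the pool so that step 5 lands mid-resilver.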
Include any warning/errors/backtraces from the system logs
It's possible for two spares to get attached to a single failed vdev.
This happens when you have a failed disk that is spared, and then you
replace the failed disk with a new disk, but during the resilver
the new disk fails, and ZED kicks in a spare for the failed new
disk. This commit checks for that condition and disallows it.
Reviewed-by: Akash B <[email protected]>
Reviewed-by: Ameer Hamza <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes: #16547
Closes: #17231
System information
Describe the problem you're observing
We've seen cases where two spares were assigned to the same failed vdev: