ZFS write regression on kernel 6.12 in Fedora 41 #17034
Comments
Fedora 41/6.12 got my attention as I've just upgraded myself. WD60EFAX drives are SMR, which will tank performance, but the kernel change is interesting. It could well be that booting back to 6.8 gives this drive more time to soak up writes before performance inevitably drops off. Try both kernels but monitor performance closely: if performance is good for a while and then suddenly drops off, that will just be the joy of using an SMR drive, unfortunately. The other one, the WD60EFRX, is CMR, so it should be fine. It looks like you're in the process of replacing them all with WD120EFBX drives? If so, replace that slow drive next, or swap it for a WD60EFRX if you have one spare.
I was expecting the SMR drive to be the bottleneck, and it is under the 6.8 kernel, so I could probably get even faster throughput once I replace it. But for some reason, under 6.12 it's the least-slow drive: the first two drives in the array have 100+ ms write waits.
Considering how complicated and unpredictable SMR drive firmware can be, it is not obvious to me that the kernel version has anything to do with it, unless you switched back and forth a dozen times in random order. Otherwise it may just be that different drives started some internal housekeeping at different times.
I cannot reproduce it on a raidz2-0 with 10 physical hard drives, with more data written than there is RAM.
Reproducible on my F41 machine with these kernel/zfs combos:
I have a mirrored pool with 18TB Exos drives, and it exhibited performance similar to the OP's, with asyncq_wait in the hundreds of ms:
I don't have any kernels old enough to test where performance was back to normal, but I'm struggling to write more than ~40MB/s on a pool that used to sustain several hundred MB/s of write throughput. Read speed seems normal.
System information
Describe the problem you're observing
I just upgraded my installation of Fedora 40 running kernel 6.8.5 to Fedora 41 running kernel 6.12.11 and noticed a major decrease in write throughput.
I am using a single 17 GB file that I'm copying from an ext4 SSD onto a raidz1 HDD array of 4 disks. The command I'm running is
rsync --info=progress2 test_file <destination>
(can rerun with fio if that's preferred).
Baseline: SSD -> SSD
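Since fio was offered as an alternative, a job file along these lines would roughly approximate the rsync workload (a sketch only: the directory path, block size, and ioengine are assumptions, not taken from the report):

```ini
; hypothetical fio job approximating a single 17 GB sequential copy
; onto the pool; adjust directory= to a dataset on the raidz1 pool
[seq-write]
directory=/tank/test
rw=write
bs=1M
size=17G
ioengine=psync
end_fsync=1
```

Run with `fio seqwrite.fio` and compare the reported bandwidth across the two kernels.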
Putting this here just to show that the file source is not the bottleneck.
SSD -> ZFS raidz1 on kernel 6.12.11
Results of
zpool iostat -vyl 30 1
taken in the middle of the transfer:
It should not take 23 minutes to copy 17 GB. The disk waits are very long.
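As a quick sanity check on those numbers (assuming decimal gigabytes), the implied average throughput works out to roughly 12 MB/s, far below what a 4-disk raidz1 of CMR drives should sustain on a sequential write:

```python
# Implied average throughput for the reported copy:
# 17 GB (decimal) transferred in about 23 minutes.
size_bytes = 17 * 10**9
seconds = 23 * 60
throughput_mb_s = size_bytes / seconds / 10**6
print(f"{throughput_mb_s:.1f} MB/s")  # -> 12.3 MB/s
```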
SSD -> ZFS raidz1 on kernel 6.8.5
I rebooted the machine onto the 6.8.5 kernel that was retained from the upgrade. Everything else is the same.
I can see that my one WD60EFAX disk is the write bottleneck, but the overall disk bandwidth is much better than on 6.12.11. I don't have empirical pre-upgrade numbers, but these are on par with how the system behaved in the past.
Is there any known change between kernels 6.8 and 6.12 that could explain such a drop in write bandwidth?
ZFS config