ZFS 2.3.0 ignores zfs_arc_max, exhausts system memory #17052
@maru-sama The fact that the ARC is allowed to use almost all of RAM in 2.3 is not a bug but a feature. Once memory pressure appears from the kernel and other consumers, it should shrink.
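For anyone who wants to watch that grow-and-shrink behavior live, here is a minimal sketch; the kstat path is the standard one on Linux, though exact field layout can differ between releases:

```bash
# Report the ARC's current size against its hard ceiling. Values are in
# bytes in the third column of arcstats.
awk '$1 == "size" || $1 == "c_max" { printf "%-5s %8.0f MiB\n", $1, $3 / 1048576 }' \
    /proc/spl/kstat/zfs/arcstats
```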
Hello, thanks for the reply. I removed my initial comment since the behaviour I described is standard (as you said) and the original poster is seeing a different issue.
@brian-maloney As I see it, most of your ARC size is reported as non-evictable metadata, which means something is actively referencing it. It might be metadata blocks backing dnodes, referenced by the kernel via inodes. ZFS should start the inode-pruning process for the kernel once the percentage of non-evictable metadata goes above the threshold and there is a need for eviction. So either the pruning does not start, or it does not work right, or perhaps it simply can't, because those files are actually open and the kernel doesn't want to let them go.
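A hedged way to check whether that is what is happening on an affected system, using arcstats field names from recent OpenZFS releases (names may vary between versions):

```bash
# How much of the ARC is metadata/dnodes, and is the prune callback
# firing at all? arc_prune is a monotonically increasing counter, so
# sample it twice to see whether pruning is being requested.
grep -E '^(size|c_max|metadata_size|dnode_size|bonus_size|arc_prune) ' \
    /proc/spl/kstat/zfs/arcstats
```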
@brian-maloney Do you use Docker?
I do have Docker running on the system, but the application that triggers the memory exhaustion (duplicacy) is not running in a container. So if the mere presence of containers on the system is enough to trigger the issue, then that might be of use, but most of the time (all except for 10-15 minutes per day) the ARC max is honored; it's only while duplicacy is running that I see the problem behavior. To @amotin's comment: it's likely that duplicacy does hold many files open while computing chunks to check for in the backup storage location. I am not that familiar with the implementation.
I did some testing with this. The dangerous interaction seems to occur only during the first phase of the backup operation, which is a depth-first listing of all files on the system (as seen in ListLocalFiles). I'm not sure what exactly about this implementation would be causing this behavior, but I did do a […] during this first phase. I am hopeful that these are enough clues to help someone more knowledgeable than myself on ZFS internals and the 2.3.0 changes either suggest a code fix or a workaround. All of my attempts to tune so far have been completely ineffective. Thanks in advance for any advice you can provide!
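For reference, a hypothetical stand-in for that first phase (not duplicacy's actual code): walk a dataset depth-first and stat every file while logging ARC size once a second. `/tank` is a placeholder for the dataset being backed up:

```bash
# Monitor ARC size in the background while the walk runs.
while sleep 1; do
    awk -v t="$(date +%T)" '$1 == "size" { printf "%s arc size: %.0f MiB\n", t, $3 / 1048576 }' \
        /proc/spl/kstat/zfs/arcstats
done &
mon=$!

# Depth-first walk that touches every inode, roughly what a file-listing
# backup phase does. -xdev keeps the walk on one filesystem.
find /tank -xdev -depth -exec stat --format='%i' {} + > /dev/null

kill "$mon"
```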
@amotin A quick test shows that, even on 2.3, setting […]
@shodanshok Yes, […]
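The exact tunable under discussion is lost in the two comments above, but for completeness, OpenZFS module parameters are generally adjusted like this (`zfs_arc_max` is used purely as an illustrative parameter):

```bash
# Runtime change; takes effect immediately but does not survive reboot.
echo 2147483648 > /sys/module/zfs/parameters/zfs_arc_max

# Persistent change: set the module option, and rebuild the initramfs if
# your distribution loads ZFS from it.
echo 'options zfs zfs_arc_max=2147483648' >> /etc/modprobe.d/zfs.conf
```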
@amotin: Does that mean that the 50% RAM limit for ARC that was in place for ZFS 2.2 and before no longer applies? Where can I read more about this change? I haven't found a note in the changelog.
Right.
It was changed here: #15437. It was discussed among developers and accompanied by several other changes to better coexist with the Linux kernel, including some fixes to the Linux kernel itself.
Just to (hopefully) rule out any issues related to Docker, I stopped the docker service (and all containers) and ran another backup. The same issue is present, so I think it's probably not Docker-related.
System information
Describe the problem you're observing
Possibly a duplicate of #16325, but I didn't see the behavior with the versions listed in that bug.
After upgrading from kernel 6.6.63 with ZFS 2.2.6 to kernel 6.12.13 with ZFS 2.3.0, I am experiencing issues with ARC memory utilization when my system's daily backups occur. My system has 16 GB of RAM and I have `zfs_arc_max` set to `2147483648` (2 GiB). When the backup occurs now, the ARC grows to a large size and the system locks up due to memory exhaustion. I can help the system get through the backup by using `/proc/sys/vm/drop_caches` to clear the cache manually. I'm attempting to tune other ZFS settings to see if there's another way around it, but I've never seen it exceed `zfs_arc_max` by this much before.
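A sketch of that manual workaround; the value written is not specified in the report (2 drops reclaimable slab objects such as dentries and inodes, which release their references to ARC metadata; 3 additionally drops the page cache):

```bash
# Flush dirty data first, then ask the kernel to drop caches so the ARC
# can actually shrink back under zfs_arc_max.
sync
echo 3 > /proc/sys/vm/drop_caches
```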
Describe how to reproduce the problem
This is reproducible any time I run a backup task.
Include any warning/errors/backtraces from the system logs
Here's an `arc_summary -d` run exhibiting the issue:
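The pasted output itself is not reproduced here. As a live complement to a one-shot `arc_summary -d`, the `arcstat` utility shipped with OpenZFS can sample the same counters while the backup runs:

```bash
# Default columns include reads, misses, current ARC size, and the
# target size c, among other counters; sample once per second.
arcstat 1
```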