ZTS: FreeBSD 13 panics on cp_stress.ksh (reproducible) #16297

Open
tonyhutter opened this issue Jun 24, 2024 · 4 comments
Labels
Type: Defect (Incorrect behavior, e.g. crash or hang)

Comments

@tonyhutter
Contributor

tonyhutter commented Jun 24, 2024

System information

Type                 | Version/Name
Distribution Name    | FreeBSD
Distribution Version | 13.2-RELEASE-p10
Kernel Version       |
Architecture         | x86-64
OpenZFS Version      | master (c98295e)

Describe the problem you're observing

You can easily panic FreeBSD 13 by running the cp_stress.ksh ZTS test.

Describe how to reproduce the problem

Run the cp_stress tests on FreeBSD 13:

 ./scripts/zfs-tests.sh -x -t `pwd`/tests/zfs-tests/tests/functional/cp_files/cp_stress.ksh

Sometimes it passes, but it typically panics within 1-3 tries. I hit it running on a VM with 4 vCPUs.

Include any warning/errors/backtraces from the system logs

panic: (link->list_next == NULL) is equivalent to (link->list_prev == NULL)
cpuid = 2
time = 1719250843
KDB: stack backtrace:
#0 0xffffffff80c53ff5 at kdb_backtrace+0x65
#1 0xffffffff80c06971 at vpanic+0x151
#2 0xffffffff824419ba at spl_panic+0x3a
#3 0xffffffff82440095 at list_link_active+0x55
#4 0xffffffff824ec3d3 at dnode_is_dirty+0x93
#5 0xffffffff824c6e87 at dmu_offset_next+0x57
#6 0xffffffff8264eb0d at zfs_holey+0x14d
#7 0xffffffff8246272f at zfs_freebsd_ioctl+0x4f
#8 0xffffffff80cf9474 at vn_ioctl+0x1a4
#9 0xffffffff80cf9dac at vn_seek+0x20c
#10 0xffffffff80cf289b at kern_lseek+0x6b
#11 0xffffffff810b289c at amd64_syscall+0x10c
#12 0xffffffff81089a8b at fast_syscall_common+0xf8
Uptime: 27m0s
Dumping 406 out of 4062 MB:..4%..12%..24%..32%..44%..52%..63%..71%..83%..91%
@tonyhutter tonyhutter added the Type: Defect (Incorrect behavior, e.g. crash or hang) label Jun 24, 2024
tonyhutter added a commit to tonyhutter/zfs that referenced this issue Jun 24, 2024
Mark it as pass until openzfs#16297
is fixed.
@robn
Member

robn commented Jun 25, 2024

Unable to reproduce on 13.2-RELEASE-p11, OpenZFS c98295e. Tried VM with 2x and 4x cores, and 2G and 16G RAM.

Typical run:

robn@freebsd13:~/zfs $ ./scripts/zfs-tests.sh -Dvxt cp_stress

--- Cleanup ---
Removing pool(s):
Removing loopback(s):
Removing files(s):

--- Configuration ---
Runfiles:        /var/tmp/zfs-tests.2773.run
STF_TOOLS:       /home/robn/zfs/tests/test-runner
STF_SUITE:       /home/robn/zfs/tests/zfs-tests
STF_PATH:        /home/robn/zfs/tests/zfs-tests/bin
FILEDIR:         /var/tmp
FILES:           /var/tmp/file-vdev0 /var/tmp/file-vdev1 /var/tmp/file-vdev2
LOOPBACKS:       md0 md1 md2
DISKS:           md0 md1 md2
NUM_DISKS:       3
FILESIZE:        4G
ITERATIONS:      1
TAGS:            functional
STACK_TRACER:    no
Keep pool(s):    rpool
Missing util(s): arc_summary arcstat zilstat dbufstat mount.zfs zed zgenhostid devname2devid file_fadvise getversion mmap_libaio randfree_file read_dos_attributes renameat2 user_ns_exec write_dos_attributes xattrtest zed_fd_spill-zedlet idmap_util fio net pamtester rsync

/home/robn/zfs/tests/test-runner/bin/test-runner.py  -D   -c "/var/tmp/zfs-tests.2773.run" -T "functional" -i "/home/robn/zfs/tests/zfs-tests" -I "1"
NOTE: begin default_setup_noexit
SUCCESS: zpool create -f testpool md0
SUCCESS: zfs create testpool/testfs
SUCCESS: zfs set mountpoint=/var/tmp/testdir testpool/testfs
Test: /home/robn/zfs/tests/zfs-tests/tests/functional/cp_files/setup (run as root) [00:00] [PASS]
ASSERTION: Run the 'seekflood' binary repeatedly to try to trigger #15526
SUCCESS: mkdir /testpool/cp_stress
SUCCESS: /home/robn/zfs/tests/zfs-tests/tests/functional/cp_files/seekflood 2000 4
SUCCESS: /home/robn/zfs/tests/zfs-tests/tests/functional/cp_files/seekflood 2000 4
SUCCESS: /home/robn/zfs/tests/zfs-tests/tests/functional/cp_files/seekflood 2000 4
No corruption detected
NOTE: Performing local cleanup via log_onexit (cleanup)
Test: /home/robn/zfs/tests/zfs-tests/tests/functional/cp_files/cp_stress.ksh (run as root) [00:15] [PASS]
SUCCESS: zpool destroy -f testpool
SUCCESS: rm -rf /var/tmp/testdir
Test: /home/robn/zfs/tests/zfs-tests/tests/functional/cp_files/cleanup (run as root) [00:00] [PASS]

Results Summary
PASS	  3

Running Time:	00:00:15
Percent passed:	100.0%
Log directory:	/var/tmp/test_results/20240625T132626

Tests with results other than PASS that are expected:

Tests with result of PASS that are unexpected:

Tests with results other than PASS that are unexpected:

Anything interesting in your ZTS config? Specifically, I'm wondering about whether or not you're using "real" disks or the default files in /var/tmp. If the latter, is /var/tmp itself backed by ZFS, or UFS?

I'll stick it in a loop for an hour, try a bit harder. If that doesn't turn up anything, I'll make a real pool and put seekflood on a long run.

@robn
Member

robn commented Jun 26, 2024

I ran the test over and over for a few hours (I forgot about it...), no dice. I set it up to run many thousands of files & threads for a good long while, no change there either. Finally, I reran all that against the builtin OpenZFS in 13.2, which also refused to blow up.

robn@freebsd13:~ $ zfs version
zfs-2.1.9-FreeBSD_g92e0d9d18
zfs-kmod-2.1.9-FreeBSD_g92e0d9d18

robn@freebsd13:~ $ uname -a
FreeBSD freebsd13 13.2-RELEASE-p11 FreeBSD 13.2-RELEASE-p11 GENERIC amd64

So more info needed!

@tonyhutter
Contributor Author

Specifically, I'm wondering about whether or not you're using "real" disks or the default files in /var/tmp. If the latter, is /var/tmp itself backed by ZFS, or UFS?

Just UFS for all of /. I'm using the defaults for ZTS so I assume it's the /var/tmp disks.

I can still hit this 100% reliably in ZTS, but when I run the same seekflood binary manually to re-create the test by hand, I'm unable to hit the panic. It's very weird.

tonyhutter added a commit to tonyhutter/zfs that referenced this issue Jul 22, 2024
Mark it as pass until openzfs#16297
is fixed.
tonyhutter added a commit to tonyhutter/zfs that referenced this issue Jul 22, 2024
Mark it as pass until openzfs#16297
is fixed.
@mcmilk
Contributor

mcmilk commented Aug 4, 2024
