Crash at login on Qemu #123

Closed
shenki opened this issue Jan 30, 2017 · 2 comments

Comments

@shenki
Member

shenki commented Jan 30, 2017

Linux version 4.7.10-e62c7f11175525c028d9a6abaf6a2a5c275664d1 (jenkins@hudson) (gcc version 5.3.0 (GCC) ) #1 Wed Nov 16 04:25:06 UTC 2016

romulus login: ftgmac100 1e660000.ethernet eth0: NCSI interface up
root
Unable to handle kernel NULL pointer dereference at virtual address 00000094
pgd = 9e754000
[00000094] *pgd=00000000
Internal error: Oops: 5 [#1] ARM
CPU: 0 PID: 682 Comm: systemd-network Not tainted 4.7.10-e62c7f11175525c028d9a6abaf6a2a5c275664d1 #1
Hardware name: ASpeed SoC
task: 9d4e4520 ti: 9e722000 task.ti: 9e722000
PC is at fib6_prune_clone+0x0/0x10
LR is at fib6_clean_node+0xac/0x15c
pc : [<803a005c>]    lr : [<803a1e1c>]    psr: 20000013
sp : 9e723c48  ip : 9ec19000  fp : 00000400
r10: 00000008  r9 : 9cfdcb18  r8 : 9e723dc8
r7 : 806217b0  r6 : 8060200c  r5 : 00000001  r4 : 9e723c98
r3 : 803a005c  r2 : 8060200c  r1 : 00000000  r0 : 00000001
Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 00c5387d  Table: 9e754008  DAC: 00000051
Process systemd-network (pid: 682, stack limit = 0x9e722188)
Stack: (0x9e723c48 to 0x9e724000)
3c40:                   9e479000 00000001 00000000 80617380 00000000 fa8e0beb
3c60: 02080020 9e723c98 00000002 80617380 9e6e5700 803a01e8 9e723c98 9cfdcb0c
3c80: 80617380 803a032c 8060200c 9cfdcb0c 00000000 803a0398 8061766c 8061766c
3ca0: 9cfdcb0c 9d0e09e0 9d055634 00000002 8061ac01 00000000 00000001 803a1d70
3cc0: 9cfdcb0c 80617380 803a005c 00000000 00000000 fa8e0beb 9d057020 803a164c
3ce0: 00000003 00000001 00000000 9e6e577c 9e723d2c 00000000 00000000 9cfdcb00
3d00: 9d057020 9e723d54 8060200c 9e740c00 9e723e40 00000000 8060200c 8039baf0
3d20: 00000000 8039e8dc 00000000 00000000 00000000 fa8e0beb 00000000 8060200c
3d40: 9d5ad000 0000000a 8060200c 8039e984 8042faae 000000fe 00000400 00000000
3d60: 00000000 00000002 00000003 00000009 00000001 00000000 00000000 00000000
3d80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
3da0: 00000000 000080fe 00000000 00000000 02000000 00000000 00000000 00000000
3dc0: 00000000 00000000 9d5ad000 80617380 00000001 00000000 00000000 fa8e0beb
3de0: 00000060 80329730 8060200c 803379d4 9e6fe000 9e723df4 9e723df4 80337c28
3e00: 9e740c00 fa8e0beb 9e723e3b 9d5ad000 9e740c00 80329590 8060200c 9e6fe000
3e20: 9e723e40 80339084 9e740c00 00000048 9e740c00 80329578 9e479000 80337b88
3e40: 7fffffff fa8e0beb 9e740c00 00000000 9e6fe000 9e723ed8 00000048 00000000
3e60: 00000008 80338398 7ee73a24 558d77d8 00000050 803a9a48 9d4c07c0 00000000
3e80: 000002aa 000003e0 000003e0 fa8e0beb 8060200c 9e723f04 00000000 8060200c
3ea0: 9e04f7c0 00000010 7ee7381c 00000000 558d7828 803012ac 9e723f04 80302488
3ec0: 9e723ee0 00000002 00000000 00000001 558d6f30 00000048 9e723f04 00000010
3ee0: 00000001 00000000 00000000 9e723ed8 00000000 00000000 00000000 00000000
3f00: 9d5b6d21 00000010 00000000 00000000 00000000 1efb5dd8 39af39bf 00000000
3f20: ffffffff 00000000 136f51a4 00b2d05e 00000000 8062a700 7ee739b8 8060200c
3f40: 9e723f8c 00000001 00000107 80102384 9e722000 00000051 00000001 00000107
3f60: 80102384 7ee73824 00000008 00000000 00000000 80276aac 7ee73824 8060200c
3f80: 00000001 fa8e0beb 00000001 7ee7381c 00000010 76f93ce8 00000122 80102384
3fa0: 9e722000 801021c0 7ee7381c 00000010 00000003 558d6f30 00000048 00000000
3fc0: 7ee7381c 00000010 76f93ce8 00000122 000002aa 558d3530 00000000 558d7828
3fe0: 00000000 7ee73800 7ee737dc 76eeceac 80000010 00000003 e5941008 e594300c
[<803a005c>] (fib6_prune_clone) from [<803a1e1c>] (fib6_clean_node+0xac/0x15c)
[<803a1e1c>] (fib6_clean_node) from [<803a01e8>] (fib6_walk_continue+0xe0/0x164)
[<803a01e8>] (fib6_walk_continue) from [<803a032c>] (fib6_walk+0x28/0x44)
[<803a032c>] (fib6_walk) from [<803a0398>] (fib6_prune_clones+0x50/0x78)
[<803a0398>] (fib6_prune_clones) from [<803a164c>] (fib6_add+0x788/0x8c4)
[<803a164c>] (fib6_add) from [<8039baf0>] (__ip6_ins_rt+0x34/0x50)
[<8039baf0>] (__ip6_ins_rt) from [<8039e8dc>] (ip6_route_add+0x5c/0xc4)
[<8039e8dc>] (ip6_route_add) from [<8039e984>] (inet6_rtm_newroute+0x40/0x60)
[<8039e984>] (inet6_rtm_newroute) from [<80329730>] (rtnetlink_rcv_msg+0x1a0/0x1d0)
[<80329730>] (rtnetlink_rcv_msg) from [<80339084>] (netlink_rcv_skb+0x58/0xb4)
[<80339084>] (netlink_rcv_skb) from [<80329578>] (rtnetlink_rcv+0x18/0x24)
[<80329578>] (rtnetlink_rcv) from [<80337b88>] (netlink_unicast+0x13c/0x210)
[<80337b88>] (netlink_unicast) from [<80338398>] (netlink_sendmsg+0x314/0x338)
[<80338398>] (netlink_sendmsg) from [<803012ac>] (sock_sendmsg+0x14/0x24)
[<803012ac>] (sock_sendmsg) from [<80302488>] (SyS_sendto+0xc4/0x100)
[<80302488>] (SyS_sendto) from [<801021c0>] (ret_fast_syscall+0x0/0x3c)
Code: e201101f e3a00001 e0030110 e12fff1e (e5d00093) 
---[ end trace ff0108a6246179b5 ]---
Kernel panic - not syncing: Fatal exception in interrupt
---[ end Kernel panic - not syncing: Fatal exception in interrupt
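
One way to cross-check the trace above: the bracketed word on the `Code:` line, `e5d00093`, is the faulting instruction. Under the ARM A32 single data transfer encoding it reads as `ldrb r0, [r0, #0x93]`, and with `r0 : 00000001` from the register dump the access lands on `0x94`, which matches the reported fault address. The sketch below only redoes that arithmetic. The struct and function names are hypothetical helpers, and this is a reading of the log, not a confirmed diagnosis.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fields of an A32 LDR/STR with a 12-bit immediate offset (hypothetical helper type). */
struct ldr_imm {
    bool pre_index;  /* P bit: offset applied before the access */
    bool add;        /* U bit: offset added (1) or subtracted (0) */
    bool is_byte;    /* B bit: byte access (LDRB/STRB) */
    bool is_load;    /* L bit: load (1) or store (0) */
    unsigned rn;     /* base register number */
    unsigned rd;     /* destination/source register number */
    uint32_t imm12;  /* unsigned 12-bit offset */
};

/* Decode only the immediate-offset form (bits 27:25 == 0b010); return false otherwise. */
static bool decode_ldr_str_imm(uint32_t insn, struct ldr_imm *out)
{
    if (((insn >> 25) & 0x7u) != 0x2u)
        return false;
    out->pre_index = (insn >> 24) & 1u;
    out->add       = (insn >> 23) & 1u;
    out->is_byte   = (insn >> 22) & 1u;
    out->is_load   = (insn >> 20) & 1u;
    out->rn        = (insn >> 16) & 0xfu;
    out->rd        = (insn >> 12) & 0xfu;
    out->imm12     = insn & 0xfffu;
    return true;
}

/* Address touched by the access, given the current value of the base register. */
static uint32_t effective_address(const struct ldr_imm *d, uint32_t base)
{
    if (!d->pre_index)
        return base;  /* post-indexed: the access uses the unmodified base */
    return d->add ? base + d->imm12 : base - d->imm12;
}
```

Feeding in `0xe5d00093` and a base of `0x00000001` gives a byte load from `0x94`, so the first argument of `fib6_prune_clone` appears to have been `1` rather than a valid pointer when the fault was taken.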
@amboar
Member

amboar commented Jan 30, 2017

This is in similar territory to #101

@legoater

legoater commented Jul 7, 2017

Moved to openbmc/qemu: openbmc/qemu#11

@legoater legoater closed this as completed Jul 7, 2017
shenki pushed a commit that referenced this issue Jun 4, 2019
[ Upstream commit a9fd095 ]

Leaving the dev_init_lock mutex locked in probe causes a BUG and a WARNING when
the kernel is compiled with CONFIG_PROVE_LOCKING. Convert the mutex to a completion,
which silences those warnings and improves code readability.
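
A minimal userspace sketch of the idea, assuming pthreads as a stand-in for the kernel primitives: a completion lets probe return without holding any lock, while a later context signals that initialization has finished. The `struct devinfo` layout, the `dev_init_done` field and the `probe`, `init_finished` and `disconnect` functions are hypothetical illustrations, not the driver's actual code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-in for a completion: a flag guarded by a mutex and condvar. */
struct completion {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool done;
};

static void init_completion(struct completion *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->done = false;
}

/* Mark the event as finished and wake every waiter. */
static void complete_all(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    c->done = true;
    pthread_cond_broadcast(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

/* Block until the event has finished; returns at once if it already has. */
static void wait_for_completion(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    while (!c->done)
        pthread_cond_wait(&c->cond, &c->lock);
    pthread_mutex_unlock(&c->lock);
}

static bool completion_done(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    bool done = c->done;
    pthread_mutex_unlock(&c->lock);
    return done;
}

/* Hypothetical device state for the sketch. */
struct devinfo {
    struct completion dev_init_done;
    bool ready;
};

/* Probe starts asynchronous setup and returns with no lock held. */
static void probe(struct devinfo *d)
{
    init_completion(&d->dev_init_done);
    d->ready = false;
}

/* Runs later, possibly in another context, once setup has finished. */
static void init_finished(struct devinfo *d)
{
    d->ready = true;
    complete_all(&d->dev_init_done);
}

/* Teardown waits for setup to finish before touching the device. */
static bool disconnect(struct devinfo *d)
{
    wait_for_completion(&d->dev_init_done);
    return d->ready;
}
```

Because nothing stays locked between `probe` and `init_finished`, there is no held lock for a lock checker to report when the probing context returns, which is the situation the BUG and WARNING below describe.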

This fixes the errors below, seen when connecting the USB WiFi dongle:

brcmfmac: brcmf_fw_alloc_request: using brcm/brcmfmac43143 for chip BCM43143/2
BUG: workqueue leaked lock or atomic: kworker/0:2/0x00000000/434
     last function: hub_event
1 lock held by kworker/0:2/434:
 #0: 18d5dcdf (&devinfo->dev_init_lock){+.+.}, at: brcmf_usb_probe+0x78/0x550 [brcmfmac]
CPU: 0 PID: 434 Comm: kworker/0:2 Not tainted 4.19.23-00084-g454a789-dirty #123
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
Workqueue: usb_hub_wq hub_event
[<8011237c>] (unwind_backtrace) from [<8010d74c>] (show_stack+0x10/0x14)
[<8010d74c>] (show_stack) from [<809c4324>] (dump_stack+0xa8/0xd4)
[<809c4324>] (dump_stack) from [<8014195c>] (process_one_work+0x710/0x808)
[<8014195c>] (process_one_work) from [<80141a80>] (worker_thread+0x2c/0x564)
[<80141a80>] (worker_thread) from [<80147bcc>] (kthread+0x13c/0x16c)
[<80147bcc>] (kthread) from [<801010b4>] (ret_from_fork+0x14/0x20)
Exception stack(0xed1d9fb0 to 0xed1d9ff8)
9fa0:                                     00000000 00000000 00000000 00000000
9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9fe0: 00000000 00000000 00000000 00000000 00000013 00000000

======================================================
WARNING: possible circular locking dependency detected
4.19.23-00084-g454a789-dirty #123 Not tainted
------------------------------------------------------
kworker/0:2/434 is trying to acquire lock:
e29cf799 ((wq_completion)"events"){+.+.}, at: process_one_work+0x174/0x808

but task is already holding lock:
18d5dcdf (&devinfo->dev_init_lock){+.+.}, at: brcmf_usb_probe+0x78/0x550 [brcmfmac]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&devinfo->dev_init_lock){+.+.}:
       mutex_lock_nested+0x1c/0x24
       brcmf_usb_probe+0x78/0x550 [brcmfmac]
       usb_probe_interface+0xc0/0x1bc
       really_probe+0x228/0x2c0
       __driver_attach+0xe4/0xe8
       bus_for_each_dev+0x68/0xb4
       bus_add_driver+0x19c/0x214
       driver_register+0x78/0x110
       usb_register_driver+0x84/0x148
       process_one_work+0x228/0x808
       worker_thread+0x2c/0x564
       kthread+0x13c/0x16c
       ret_from_fork+0x14/0x20
         (null)

-> #1 (brcmf_driver_work){+.+.}:
       worker_thread+0x2c/0x564
       kthread+0x13c/0x16c
       ret_from_fork+0x14/0x20
         (null)

-> #0 ((wq_completion)"events"){+.+.}:
       process_one_work+0x1b8/0x808
       worker_thread+0x2c/0x564
       kthread+0x13c/0x16c
       ret_from_fork+0x14/0x20
         (null)

other info that might help us debug this:

Chain exists of:
  (wq_completion)"events" --> brcmf_driver_work --> &devinfo->dev_init_lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&devinfo->dev_init_lock);
                               lock(brcmf_driver_work);
                               lock(&devinfo->dev_init_lock);
  lock((wq_completion)"events");

 *** DEADLOCK ***

1 lock held by kworker/0:2/434:
 #0: 18d5dcdf (&devinfo->dev_init_lock){+.+.}, at: brcmf_usb_probe+0x78/0x550 [brcmfmac]

stack backtrace:
CPU: 0 PID: 434 Comm: kworker/0:2 Not tainted 4.19.23-00084-g454a789-dirty #123
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
Workqueue: events request_firmware_work_func
[<8011237c>] (unwind_backtrace) from [<8010d74c>] (show_stack+0x10/0x14)
[<8010d74c>] (show_stack) from [<809c4324>] (dump_stack+0xa8/0xd4)
[<809c4324>] (dump_stack) from [<80172838>] (print_circular_bug+0x210/0x330)
[<80172838>] (print_circular_bug) from [<80175940>] (__lock_acquire+0x160c/0x1a30)
[<80175940>] (__lock_acquire) from [<8017671c>] (lock_acquire+0xe0/0x268)
[<8017671c>] (lock_acquire) from [<80141404>] (process_one_work+0x1b8/0x808)
[<80141404>] (process_one_work) from [<80141a80>] (worker_thread+0x2c/0x564)
[<80141a80>] (worker_thread) from [<80147bcc>] (kthread+0x13c/0x16c)
[<80147bcc>] (kthread) from [<801010b4>] (ret_from_fork+0x14/0x20)
Exception stack(0xed1d9fb0 to 0xed1d9ff8)
9fa0:                                     00000000 00000000 00000000 00000000
9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9fe0: 00000000 00000000 00000000 00000000 00000013 00000000

Signed-off-by: Piotr Figiel <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
shenki pushed a commit that referenced this issue Jan 31, 2022
commit 380a009 upstream.

We hit the following issue when running syzkaller:
[  167.936972] EXT4-fs error (device loop0): __ext4_remount:6314: comm rep: Abort forced by user
[  167.938306] EXT4-fs (loop0): Remounting filesystem read-only
[  167.981637] Assertion failure in ext4_getblk() at fs/ext4/inode.c:847: '(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) || handle != NULL || create == 0'
[  167.983601] ------------[ cut here ]------------
[  167.984245] kernel BUG at fs/ext4/inode.c:847!
[  167.984882] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
[  167.985624] CPU: 7 PID: 2290 Comm: rep Tainted: G    B             5.16.0-rc5-next-20211217+ #123
[  167.986823] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
[  167.988590] RIP: 0010:ext4_getblk+0x17e/0x504
[  167.989189] Code: c6 01 74 28 49 c7 c0 a0 a3 5c 9b b9 4f 03 00 00 48 c7 c2 80 9c 5c 9b 48 c7 c6 40 b6 5c 9b 48 c7 c7 20 a4 5c 9b e8 77 e3 fd ff <0f> 0b 8b 04 244
[  167.991679] RSP: 0018:ffff8881736f7398 EFLAGS: 00010282
[  167.992385] RAX: 0000000000000094 RBX: 1ffff1102e6dee75 RCX: 0000000000000000
[  167.993337] RDX: 0000000000000001 RSI: ffffffff9b6e29e0 RDI: ffffed102e6dee66
[  167.994292] RBP: ffff88816a076210 R08: 0000000000000094 R09: ffffed107363fa09
[  167.995252] R10: ffff88839b1fd047 R11: ffffed107363fa08 R12: ffff88816a0761e8
[  167.996205] R13: 0000000000000000 R14: 0000000000000021 R15: 0000000000000001
[  167.997158] FS:  00007f6a1428c740(0000) GS:ffff88839b000000(0000) knlGS:0000000000000000
[  167.998238] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  167.999025] CR2: 00007f6a140716c8 CR3: 0000000133216000 CR4: 00000000000006e0
[  167.999987] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  168.000944] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  168.001899] Call Trace:
[  168.002235]  <TASK>
[  168.007167]  ext4_bread+0xd/0x53
[  168.007612]  ext4_quota_write+0x20c/0x5c0
[  168.010457]  write_blk+0x100/0x220
[  168.010944]  remove_free_dqentry+0x1c6/0x440
[  168.011525]  free_dqentry.isra.0+0x565/0x830
[  168.012133]  remove_tree+0x318/0x6d0
[  168.014744]  remove_tree+0x1eb/0x6d0
[  168.017346]  remove_tree+0x1eb/0x6d0
[  168.019969]  remove_tree+0x1eb/0x6d0
[  168.022128]  qtree_release_dquot+0x291/0x340
[  168.023297]  v2_release_dquot+0xce/0x120
[  168.023847]  dquot_release+0x197/0x3e0
[  168.024358]  ext4_release_dquot+0x22a/0x2d0
[  168.024932]  dqput.part.0+0x1c9/0x900
[  168.025430]  __dquot_drop+0x120/0x190
[  168.025942]  ext4_clear_inode+0x86/0x220
[  168.026472]  ext4_evict_inode+0x9e8/0xa22
[  168.028200]  evict+0x29e/0x4f0
[  168.028625]  dispose_list+0x102/0x1f0
[  168.029148]  evict_inodes+0x2c1/0x3e0
[  168.030188]  generic_shutdown_super+0xa4/0x3b0
[  168.030817]  kill_block_super+0x95/0xd0
[  168.031360]  deactivate_locked_super+0x85/0xd0
[  168.031977]  cleanup_mnt+0x2bc/0x480
[  168.033062]  task_work_run+0xd1/0x170
[  168.033565]  do_exit+0xa4f/0x2b50
[  168.037155]  do_group_exit+0xef/0x2d0
[  168.037666]  __x64_sys_exit_group+0x3a/0x50
[  168.038237]  do_syscall_64+0x3b/0x90
[  168.038751]  entry_SYSCALL_64_after_hwframe+0x44/0xae

To reproduce this problem, the following conditions need to be met:
1. An ext4 filesystem with no journal;
2. A filesystem image with incorrect quota data;
3. A filesystem abort forced by the user;
4. The filesystem is unmounted.

As in ext4_quota_write:
...
         if (EXT4_SB(sb)->s_journal && !handle) {
                 ext4_msg(sb, KERN_WARNING, "Quota write (off=%llu, len=%llu)"
                         " cancelled because transaction is not started",
                         (unsigned long long)off, (unsigned long long)len);
                 return -EIO;
         }
...
We only check whether handle is NULL when the filesystem has a journal. The
handle also needs to be checked for NULL when the filesystem has no journal.
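
The difference between the quoted check and the one the message asks for can be sketched in plain C. This is a userspace model under stated assumptions: `struct fake_sb`, `struct fake_handle` and both function names are hypothetical stand-ins, and only the NULL-handle condition is modelled, not the rest of `ext4_quota_write`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins: only the two facts the check looks at. */
struct fake_sb { bool has_journal; };  /* models EXT4_SB(sb)->s_journal != NULL */
struct fake_handle { int unused; };    /* models a started transaction handle */

/* Check as quoted above: a NULL handle is rejected only when a journal exists. */
static int quota_write_check_old(const struct fake_sb *sb,
                                 const struct fake_handle *handle)
{
    if (sb->has_journal && !handle)
        return -EIO;
    return 0;
}

/* Check as the message asks for: a NULL handle is rejected in every case. */
static int quota_write_check_new(const struct fake_sb *sb,
                                 const struct fake_handle *handle)
{
    (void)sb;  /* journal presence no longer matters for this check */
    if (!handle)
        return -EIO;
    return 0;
}
```

With no journal and a NULL handle, the old check returns 0 and lets the write continue toward the assertion in `ext4_getblk()`, while the new check returns `-EIO` first.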

Signed-off-by: Ye Bin <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>