Oops 80f/17 Unable to handle kernel paging request at virtual address ffffffec #1019

Closed
schuhmam opened this issue Jun 10, 2015 · 4 comments

Comments

@schuhmam

Today I received my Raspberry Pi 2 Model B. To be sure of having a proper power supply, I bought a kit containing a transparent case, a power supply with 2 A output, and some cooling.
The microSDHC card is a Transcend 16 GB class 10 (TS16GUSDHC10U1).

The device is supposed to work as a router, replacing a virtual machine running Ubuntu on Microsoft Hyper-V (so that the server can be turned off overnight). Everything works, but the system often stops responding. When that happens, I only see the attached message on the display.

The most recent occurrence was after I tried to configure Perl CPAN with perl -MCPAN -e shell. After I pressed Enter twice (two questions with the default answer 'yes'), the error occurred. After a reboot I retried, and it happened again at the same spot. To be clear, the same error had already occurred before this. I am not 100% sure, but I think the reported virtual address is always the same.
I configure the system over SSH using PuTTY.

20150610_oops

Sorry about the light bulb in the photo; without it my Windows Phone cannot autofocus properly.

Unable to handle kernel paging request at virtual address 7f1dd14c
pgd = 80004000
[7f1dd14c] *pgd=393ad811, *pte=3611a4df, *ppte=3611a65e
Internal error: Oops: 80f [#1] PREEMPT SMP ARM
Modules linked in: [...]
[...]
Process kworker/0:1 (pid: 34)
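
For readers trying to interpret the number in "Oops: 80f": on 32-bit ARM, the value printed after "Oops:" is the fault status register. The sketch below splits it into fields, assuming the ARMv7 short-descriptor layout (no LPAE). The function name `decode_fsr` and the `FAULT_NAMES` table are made up for illustration and cover only a few status codes.

```python
# Sketch: decode the hex number printed after "Oops:" on 32-bit ARM.
# Assumes the ARMv7 short-descriptor fault status layout (no LPAE).
# decode_fsr and FAULT_NAMES are illustrative names, not kernel APIs.

FAULT_NAMES = {
    0b00101: "translation fault (section)",
    0b00111: "translation fault (page)",
    0b01101: "permission fault (section)",
    0b01111: "permission fault (page)",
}


def decode_fsr(fsr: int) -> dict:
    """Split a fault status value into status, domain, and write flag."""
    # FS[3:0] lives in bits 3:0, FS[4] lives in bit 10.
    status = (fsr & 0xF) | ((fsr >> 6) & 0x10)
    return {
        "status": status,
        "name": FAULT_NAMES.get(status, "other/unknown"),
        "domain": (fsr >> 4) & 0xF,
        "write": bool(fsr & (1 << 11)),  # bit 11: fault caused by a write
    }


if __name__ == "__main__":
    for code in (0x80F, 0x17):
        print(hex(code), decode_fsr(code))
```

Under these assumptions, 80f reads as a write that hit a page-level permission fault, and the 17 from the issue title reads as a page-level translation fault on a read. That only describes the faulting access; it does not say what caused it.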

I tried raising "vm.min_free_kbytes" in /etc/sysctl.conf to 16384, but it did not help. While googling, I turned off the VLANs in the switch for the Pi so that I could use the old virtual machine for internet access. During that time the Pi ran fine. Very strange...
Routing, IPv6 and so on are very basic functions; I can't imagine that they would cause so many problems. Memory consumption was about 68 MiB. The device is not in the case yet. I can't imagine that it is getting too hot: it is not even overclocked, and handling some network packets can't be that demanding.
Any ideas?

The distribution is Ubuntu with kernel 3.18.0-20-rpi2 #21-Ubuntu.

The original source is my recorded video:
https://drive.google.com/open?id=0B0R5aKIdB-XzQ2FIdUhVeDN4ZmM&authuser=0
(alternative) http://1drv.ms/1S5OGfH

@schuhmam
Author

Additional info:

Some "specials" of my system (I don't know if it is that special but I better mention it):

  • installed with apt-get install: build-essential, vim, htop, wide-dhcpv6-client, vlan (don't remember 100%)
  • modprobe 8021q for VLAN usage; added VLAN 1, 4 and 3 on eth0 with vconfig add eth0 1; then added the appropriate IP address configurations in the /etc/network/interfaces
  • set in "/etc/sysctl.conf": net.ipv4.tcp_syncookies=1, net.ipv6.conf.all.forwarding=1, net.ipv6.conf.ppp0.accept_ra=2 and vm.min_free_kbytes = 16384
  • changed the keyboard layout to German and changed timezone with dpkg-reconfigure
  • added "ipv6 ipv6cp-use-ipaddr" in /etc/ppp/options to receive the IPv6 address from ISP (dynamic prefix :/)
  • installed curl and ran rpi-update (after the error occurred)
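
When comparing setups (for example Ubuntu against Raspbian), it can help to confirm that the sysctl values listed above are actually active on the running system. A minimal sketch, assuming the usual mapping from a sysctl key to a file under /proc/sys; the helper names `sysctl_path` and `check` are hypothetical:

```python
# Sketch: compare the sysctl values from the list above with the live
# values under /proc/sys. Helper names are illustrative only.
from pathlib import Path

EXPECTED = {
    "net.ipv4.tcp_syncookies": "1",
    "net.ipv6.conf.all.forwarding": "1",
    "net.ipv6.conf.ppp0.accept_ra": "2",
    "vm.min_free_kbytes": "16384",
}


def sysctl_path(key: str, root: str = "/proc/sys") -> Path:
    """Map a dotted sysctl key to its file; naive split on dots."""
    return Path(root, *key.split("."))


def check(expected: dict, root: str = "/proc/sys") -> dict:
    """Return {key: (wanted, current)}; current is None if unreadable."""
    result = {}
    for key, want in expected.items():
        try:
            have = sysctl_path(key, root).read_text().strip()
        except OSError:
            have = None  # e.g. ppp0 is not up, or not running on Linux
        result[key] = (want, have)
    return result


if __name__ == "__main__":
    for key, (want, have) in check(EXPECTED).items():
        flag = "ok" if want == have else "DIFFERS"
        print(f"{key}: wanted {want}, current {have} [{flag}]")
```

The naive split on dots would not work for an interface whose name itself contains a dot (a VLAN interface may be named that way), so such keys would need separate handling.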

@popcornmix
Collaborator

I assume you've seen no warning squares in the top right of the display (for under-voltage or over-temperature)?

Do you get the same failure with raspbian?
Ubuntu is not something tested or directly supported by us so it's hard to offer advice.
If the issue doesn't occur with raspbian then it may suggest a problem with Ubuntu's kernel configuration or other setup.

@schuhmam
Author

The warning squares? I didn't know they existed, but I think I would have noticed them.

But you are right, Ubuntu is not official for the Raspberry Pi. I already knew Ubuntu, so I wanted to give it a try first. I have now started configuring the system again, this time using Raspbian.

I have now finished setting up the system and will keep an eye on its stability. Currently it runs fine, with no crash so far. I hope it stays that way, because I have absolutely no idea where the error is hidden.
Thank you so far.

@schuhmam
Author

Looking good so far. I will come back if any problems show up.
Thanks.

anholt pushed a commit to anholt/linux that referenced this issue Jun 18, 2015
For mq qdisc, we add per tx queue qdisc to root qdisc
for display purpose, however, that happens too early,
before the new dev->qdisc is finally set, this causes
q->list points to an old root qdisc which is going to be
freed right before assigning with a new one.

Fix this by moving ->attach() after setting dev->qdisc.

For the record, this fixes the following crash:

 ------------[ cut here ]------------
 WARNING: CPU: 1 PID: 975 at lib/list_debug.c:59 __list_del_entry+0x5a/0x98()
 list_del corruption. prev->next should be ffff8800d1998ae8, but was 6b6b6b6b6b6b6b6b
 CPU: 1 PID: 975 Comm: tc Not tainted 4.1.0-rc4+ raspberrypi#1019
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  0000000000000009 ffff8800d73fb928 ffffffff81a44e7f 0000000047574756
  ffff8800d73fb978 ffff8800d73fb968 ffffffff810790da ffff8800cfc4cd20
  ffffffff814e725b ffff8800d1998ae8 ffffffff82381250 0000000000000000
 Call Trace:
  [<ffffffff81a44e7f>] dump_stack+0x4c/0x65
  [<ffffffff810790da>] warn_slowpath_common+0x9c/0xb6
  [<ffffffff814e725b>] ? __list_del_entry+0x5a/0x98
  [<ffffffff81079162>] warn_slowpath_fmt+0x46/0x48
  [<ffffffff81820eb0>] ? dev_graft_qdisc+0x5e/0x6a
  [<ffffffff814e725b>] __list_del_entry+0x5a/0x98
  [<ffffffff814e72a7>] list_del+0xe/0x2d
  [<ffffffff81822f05>] qdisc_list_del+0x1e/0x20
  [<ffffffff81820cd1>] qdisc_destroy+0x30/0xd6
  [<ffffffff81822676>] qdisc_graft+0x11d/0x243
  [<ffffffff818233c1>] tc_get_qdisc+0x1a6/0x1d4
  [<ffffffff810b5eaf>] ? mark_lock+0x2e/0x226
  [<ffffffff817ff8f5>] rtnetlink_rcv_msg+0x181/0x194
  [<ffffffff817ff72e>] ? rtnl_lock+0x17/0x19
  [<ffffffff817ff72e>] ? rtnl_lock+0x17/0x19
  [<ffffffff817ff774>] ? __rtnl_unlock+0x17/0x17
  [<ffffffff81855dc6>] netlink_rcv_skb+0x4d/0x93
  [<ffffffff817ff756>] rtnetlink_rcv+0x26/0x2d
  [<ffffffff818544b2>] netlink_unicast+0xcb/0x150
  [<ffffffff81161db9>] ? might_fault+0x59/0xa9
  [<ffffffff81854f78>] netlink_sendmsg+0x4fa/0x51c
  [<ffffffff817d6e09>] sock_sendmsg_nosec+0x12/0x1d
  [<ffffffff817d8967>] sock_sendmsg+0x29/0x2e
  [<ffffffff817d8cf3>] ___sys_sendmsg+0x1b4/0x23a
  [<ffffffff8100a1b8>] ? native_sched_clock+0x35/0x37
  [<ffffffff810a1d83>] ? sched_clock_local+0x12/0x72
  [<ffffffff810a1fd4>] ? sched_clock_cpu+0x9e/0xb7
  [<ffffffff810def2a>] ? current_kernel_time+0xe/0x32
  [<ffffffff810b4bc5>] ? lock_release_holdtime.part.29+0x71/0x7f
  [<ffffffff810ddebf>] ? read_seqcount_begin.constprop.27+0x5f/0x76
  [<ffffffff810b6292>] ? trace_hardirqs_on_caller+0x17d/0x199
  [<ffffffff811b14d5>] ? __fget_light+0x50/0x78
  [<ffffffff817d9808>] __sys_sendmsg+0x42/0x60
  [<ffffffff817d9838>] SyS_sendmsg+0x12/0x1c
  [<ffffffff81a50e97>] system_call_fastpath+0x12/0x6f
 ---[ end trace ef29d3fb28e97ae7 ]---

For long term, we probably need to clean up the qdisc_graft() code
in case it hides other bugs like this.

Fixes: 95dc192 ("pkt_sched: give visibility to mq slave qdiscs")
Cc: Jamal Hadi Salim <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
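
The ordering problem described in the commit message above can be pictured with a toy model. This is not kernel code: the classes and function names below are invented for illustration, and "freed" is just a flag standing in for memory being released.

```python
# Toy model of the ordering issue from the commit message: a per-queue
# qdisc gets linked to whatever dev.qdisc points at when attach() runs.
# All names here are hypothetical; this only mimics the order of steps.


class Qdisc:
    def __init__(self, name):
        self.name = name
        self.children = []
        self.freed = False
        self.listed_on = None


class Device:
    def __init__(self, root):
        self.qdisc = root


def attach(dev, child):
    """Link child into the list of the device's current root qdisc."""
    child.listed_on = dev.qdisc
    dev.qdisc.children.append(child)


def graft_attach_first(dev, new_root, child):
    """Problem order: attach runs while dev.qdisc is still the old root."""
    old_root = dev.qdisc
    attach(dev, child)
    old_root.freed = True  # old root goes away, child still points at it
    dev.qdisc = new_root


def graft_attach_last(dev, new_root, child):
    """Fixed order: set dev.qdisc first, then attach."""
    old_root = dev.qdisc
    old_root.freed = True
    dev.qdisc = new_root
    attach(dev, child)


if __name__ == "__main__":
    for graft in (graft_attach_first, graft_attach_last):
        dev, child = Device(Qdisc("old")), Qdisc("txq0")
        graft(dev, Qdisc("new"), child)
        print(graft.__name__, "-> listed on", child.listed_on.name,
              "freed:", child.listed_on.freed)
```

In the first ordering the child ends up on the list of a root that has already been released, which corresponds to the list corruption warning in the trace; in the second it is linked to the new root.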
davet321 pushed a commit to davet321/rpi-linux that referenced this issue Jun 23, 2015
[ Upstream commit 86e363d ]

Signed-off-by: Greg Kroah-Hartman <[email protected]>