-
Notifications
You must be signed in to change notification settings - Fork 566
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Board will not completely powered off #1
Comments
RobertCNelson
pushed a commit
that referenced
this issue
Aug 14, 2014
[ Upstream commit 1be9a95 ] Jason reported an oops caused by SCTP on his ARM machine with SCTP authentication enabled: Internal error: Oops: 17 [#1] ARM CPU: 0 PID: 104 Comm: sctp-test Not tainted 3.13.0-68744-g3632f30c9b20-dirty #1 task: c6eefa40 ti: c6f52000 task.ti: c6f52000 PC is at sctp_auth_calculate_hmac+0xc4/0x10c LR is at sg_init_table+0x20/0x38 pc : [<c024bb80>] lr : [<c00f32dc>] psr: 40000013 sp : c6f538e8 ip : 00000000 fp : c6f53924 r10: c6f50d80 r9 : 00000000 r8 : 00010000 r7 : 00000000 r6 : c7be4000 r5 : 00000000 r4 : c6f56254 r3 : c00c8170 r2 : 00000001 r1 : 00000008 r0 : c6f1e660 Flags: nZcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 0005397f Table: 06f28000 DAC: 00000015 Process sctp-test (pid: 104, stack limit = 0xc6f521c0) Stack: (0xc6f538e8 to 0xc6f54000) [...] Backtrace: [<c024babc>] (sctp_auth_calculate_hmac+0x0/0x10c) from [<c0249af8>] (sctp_packet_transmit+0x33c/0x5c8) [<c02497bc>] (sctp_packet_transmit+0x0/0x5c8) from [<c023e96c>] (sctp_outq_flush+0x7fc/0x844) [<c023e170>] (sctp_outq_flush+0x0/0x844) from [<c023ef78>] (sctp_outq_uncork+0x24/0x28) [<c023ef54>] (sctp_outq_uncork+0x0/0x28) from [<c0234364>] (sctp_side_effects+0x1134/0x1220) [<c0233230>] (sctp_side_effects+0x0/0x1220) from [<c02330b0>] (sctp_do_sm+0xac/0xd4) [<c0233004>] (sctp_do_sm+0x0/0xd4) from [<c023675c>] (sctp_assoc_bh_rcv+0x118/0x160) [<c0236644>] (sctp_assoc_bh_rcv+0x0/0x160) from [<c023d5bc>] (sctp_inq_push+0x6c/0x74) [<c023d550>] (sctp_inq_push+0x0/0x74) from [<c024a6b0>] (sctp_rcv+0x7d8/0x888) While we already had various kind of bugs in that area ec0223e ("net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable") and b14878c ("net: sctp: cache auth_enable per endpoint"), this one is a bit of a different kind. Giving a bit more background on why SCTP authentication is needed can be found in RFC4895: SCTP uses 32-bit verification tags to protect itself against blind attackers. These values are not changed during the lifetime of an SCTP association. Looking at new SCTP extensions, there is the need to have a method of proving that an SCTP chunk(s) was really sent by the original peer that started the association and not by a malicious attacker. To cause this bug, we're triggering an INIT collision between peers; normal SCTP handshake where both sides intent to authenticate packets contains RANDOM; CHUNKS; HMAC-ALGO parameters that are being negotiated among peers: ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ----------> <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] --------- -------------------- COOKIE-ECHO --------------------> <-------------------- COOKIE-ACK --------------------- RFC4895 says that each endpoint therefore knows its own random number and the peer's random number *after* the association has been established. The local and peer's random number along with the shared key are then part of the secret used for calculating the HMAC in the AUTH chunk. Now, in our scenario, we have 2 threads with 1 non-blocking SEQ_PACKET socket each, setting up common shared SCTP_AUTH_KEY and SCTP_AUTH_ACTIVE_KEY properly, and each of them calling sctp_bindx(3), listen(2) and connect(2) against each other, thus the handshake looks similar to this, e.g.: ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ----------> <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] --------- <--------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ----------- -------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] --------> ... Since such collisions can also happen with verification tags, the RFC4895 for AUTH rather vaguely says under section 6.1: In case of INIT collision, the rules governing the handling of this Random Number follow the same pattern as those for the Verification Tag, as explained in Section 5.2.4 of RFC 2960 [5]. Therefore, each endpoint knows its own Random Number and the peer's Random Number after the association has been established. In RFC2960, section 5.2.4, we're eventually hitting Action B: B) In this case, both sides may be attempting to start an association at about the same time but the peer endpoint started its INIT after responding to the local endpoint's INIT. Thus it may have picked a new Verification Tag not being aware of the previous Tag it had sent this endpoint. The endpoint should stay in or enter the ESTABLISHED state but it MUST update its peer's Verification Tag from the State Cookie, stop any init or cookie timers that may running and send a COOKIE ACK. In other words, the handling of the Random parameter is the same as behavior for the Verification Tag as described in Action B of section 5.2.4. Looking at the code, we exactly hit the sctp_sf_do_dupcook_b() case which triggers an SCTP_CMD_UPDATE_ASSOC command to the side effect interpreter, and in fact it properly copies over peer_{random, hmacs, chunks} parameters from the newly created association to update the existing one. Also, the old asoc_shared_key is being released and based on the new params, sctp_auth_asoc_init_active_key() updated. However, the issue observed in this case is that the previous asoc->peer.auth_capable was 0, and has *not* been updated, so that instead of creating a new secret, we're doing an early return from the function sctp_auth_asoc_init_active_key() leaving asoc->asoc_shared_key as NULL. However, we now have to authenticate chunks from the updated chunk list (e.g. COOKIE-ACK). That in fact causes the server side when responding with ... <------------------ AUTH; COOKIE-ACK ----------------- ... to trigger a NULL pointer dereference, since in sctp_packet_transmit(), it discovers that an AUTH chunk is being queued for xmit, and thus it calls sctp_auth_calculate_hmac(). Since the asoc->active_key_id is still inherited from the endpoint, and the same as encoded into the chunk, it uses asoc->asoc_shared_key, which is still NULL, as an asoc_key and dereferences it in ... crypto_hash_setkey(desc.tfm, &asoc_key->data[0], asoc_key->len) ... causing an oops. All this happens because sctp_make_cookie_ack() called with the *new* association has the peer.auth_capable=1 and therefore marks the chunk with auth=1 after checking sctp_auth_send_cid(), but it is *actually* sent later on over the then *updated* association's transport that didn't initialize its shared key due to peer.auth_capable=0. Since control chunks in that case are not sent by the temporary association which are scheduled for deletion, they are issued for xmit via SCTP_CMD_REPLY in the interpreter with the context of the *updated* association. peer.auth_capable was 0 in the updated association (which went from COOKIE_WAIT into ESTABLISHED state), since all previous processing that performed sctp_process_init() was being done on temporary associations, that we eventually throw away each time. The correct fix is to update to the new peer.auth_capable value as well in the collision case via sctp_assoc_update(), so that in case the collision migrated from 0 -> 1, sctp_auth_asoc_init_active_key() can properly recalculate the secret. This therefore fixes the observed server panic. Fixes: 730fc3d ("[SCTP]: Implete SCTP-AUTH parameter processing") Reported-by: Jason Gunthorpe <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Tested-by: Jason Gunthorpe <[email protected]> Cc: Vlad Yasevich <[email protected]> Acked-by: Vlad Yasevich <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
It was never ported from 3.8... lots of little things are missing.. If you find it before me, feel free to send a patch. ;) |
RobertCNelson
pushed a commit
that referenced
this issue
Aug 21, 2014
omap hwmod is really sensitive to hwmod misconfiguration. Getting a minor clock wrong always ended up in a crash. Attempt to be more resilient by not assigning variables with error codes and then attempting to use them. Without this patch, missing a clock ends up with something like this: omap_hwmod: ehrpwm0: cannot clk_get opt_clk ehrpwm0_tbclk! Unable to handle kernel NULL pointer dereference at virtual address 0000002a! pgd = c0004000! [0000002a] *pgd=00000000! Internal error: Oops: 5 [#1] SMP ARM! Modules linked in:! CPU: 0 Not tainted (3.8.0-rc2-12157-g76c7825-dirty #291)! PC is at __clk_prepare+0x10/0x70! LR is at clk_prepare+0x1c/0x34! pc : [<c03e37f0>] lr : [<c03e386c>] psr: a0000113! sp : cf04fef8 ip : 22222222 fp : 00000000! r10: ffffffea r9 : 00000000 r8 : 00000000! r7 : fffffffe r6 : 00000001 r5 : fffffffe r4 : fffffffe! r3 : cf041ac0 r2 : cf04ff00 r1 : 22222222 r0 : fffffffe! Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel! Control: 10c5387d Table: 80004019 DAC: 00000015! Process swapper/0 (pid: 1, stack limit = 0xcf04e240)! Stack: (0xcf04fef8 to 0xcf050000)! fee0: cf041ac0 c07749f4! ff00: fffffffe c03e386c c07499cc c073c070 c073d2fc c06d4e4c c073c070 c071cc18! ff20: c06d4c4c 00000000 00000000 c0708284 c06c91bc c0025e28 c073a030 00000001! ff40: c06f5f60 c06f5f40 c06d5324 c06d533c cf04e000 c0008870 c06d5324 c060abe8! ff60: c07082e8 00000002 00000001 c06f5f60 c06f5f40 c077d700 00000099 c04a43d4! ff80: 00000001 00000001 c06c91bc 00000000 00000000 c04a42dc 00000000 00000000! ffa0: 00000000 00000000 00000000 c000d678 00000000 00000000 00000000 00000000! ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000! ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 9e7befee f7bbaaab! [<c03e37f0>] (__clk_prepare+0x10/0x70) from [<c03e386c>] (clk_prepare+0x1c/0x34)! [<c03e386c>] (clk_prepare+0x1c/0x34) from [<c06d4e4c>] (_init+0x200/0x288)! [<c06d4e4c>] (_init+0x200/0x288) from [<c0025e28>] (omap_hwmod_for_each+0x28/0x58)! [<c0025e28>] (omap_hwmod_for_each+0x28/0x58) from [<c06d533c>] (omap_hwmod_setup_all+0x18/0x34)! [<c06d533c>] (omap_hwmod_setup_all+0x18/0x34) from [<c0008870>] (do_one_initcall+0x90/0x160)! [<c0008870>] (do_one_initcall+0x90/0x160) from [<c04a43d4>] (kernel_init+0xf8/0x290)! [<c04a43d4>] (kernel_init+0xf8/0x290) from [<c000d678>] (ret_from_fork+0x14/0x3c)! Code: e92d4038 e2504000 01a05004 0a000015 (e594302c) ! ---[ end trace 1b75b31a2719ed1c ]---! Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b! Signed-off-by: Pantelis Antoniou <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
The important bugfix here is that we must not unlink the vma when we keep it around as a placeholder for the execbuf code. Since then we won't find it again when execbuf gets interrupt and restarted and create a 2nd vma. And since the code as-is isn't fit yet to deal with more than one vma, hilarity ensues. Specifically the dma map/unmap of the sg table isn't adjusted for multiple vmas yet and will blow up like this: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: [<ffffffffa008fb37>] i915_gem_gtt_finish_object+0x73/0xc8 [i915] PGD 56bb5067 PUD ad3dd067 PMD 0 Oops: 0000 [beagleboard#1] SMP Modules linked in: tcp_lp ppdev parport_pc lp parport ipv6 dm_mod dcdbas snd_hda_codec_hdmi pcspkr snd_hda_codec_realtek serio_raw i2c_i801 iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_codec lpc_ich snd_hwdep mfd_core snd_pcm snd_page_alloc snd_timer snd soundcore acpi_cpufreq i915 video button drm_kms_helper drm mperf freq_table CPU: 1 PID: 16650 Comm: fbo-maxsize Not tainted 3.11.0-rc4_nightlytop_d93f59_debug_20130814_+ #6957 Hardware name: Dell Inc. OptiPlex 9010/03JR84, BIOS A01 05/04/2012 task: ffff8800563b3f00 ti: ffff88004bdf4000 task.ti: ffff88004bdf4000 RIP: 0010:[<ffffffffa008fb37>] [<ffffffffa008fb37>] i915_gem_gtt_finish_object+0x73/0xc8 [i915] RSP: 0018:ffff88004bdf5958 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff8801135e0000 RCX: ffff8800ad3bf8e0 RDX: ffff8800ad3bf8e0 RSI: 0000000000000000 RDI: ffff8801007ee780 RBP: ffff88004bdf5978 R08: ffff8800ad3bf8e0 R09: 0000000000000000 R10: ffffffff86ca1810 R11: ffff880036a17101 R12: ffff8801007ee780 R13: 0000000000018001 R14: ffff880118c4e000 R15: ffff8801007ee780 FS: 00007f401a0ce740(0000) GS:ffff88011e280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 000000005635c000 CR4: 00000000001407e0 Stack: ffff8801007ee780 ffff88005c253180 0000000000018000 ffff8801135e0000 ffff88004bdf59a8 ffffffffa0088e55 0000000000000011 ffff8801007eec00 0000000000018000 ffff880036a17101 ffff88004bdf5a08 ffffffffa0089026 Call Trace: [<ffffffffa0088e55>] i915_vma_unbind+0xdf/0x1ab [i915] [<ffffffffa0089026>] __i915_gem_shrink+0x105/0x177 [i915] [<ffffffffa0089452>] i915_gem_object_get_pages_gtt+0x108/0x309 [i915] [<ffffffffa0085ba9>] i915_gem_object_get_pages+0x61/0x90 [i915] [<ffffffffa008f22b>] ? gen6_ppgtt_insert_entries+0x103/0x125 [i915] [<ffffffffa008a113>] i915_gem_object_pin+0x1fa/0x5df [i915] [<ffffffffa008cdfe>] i915_gem_execbuffer_reserve_object.isra.6+0x8d/0x1bc [i915] [<ffffffffa008d156>] i915_gem_execbuffer_reserve+0x229/0x367 [i915] [<ffffffffa008dbf6>] i915_gem_do_execbuffer.isra.12+0x4dc/0xf3a [i915] [<ffffffff810fc823>] ? might_fault+0x40/0x90 [<ffffffffa008eb89>] i915_gem_execbuffer2+0x187/0x222 [i915] [<ffffffffa000971c>] drm_ioctl+0x308/0x442 [drm] [<ffffffffa008ea02>] ? i915_gem_execbuffer+0x3ae/0x3ae [i915] [<ffffffff817db156>] ? __do_page_fault+0x3dd/0x481 [<ffffffff8112fdba>] vfs_ioctl+0x26/0x39 [<ffffffff811306a2>] do_vfs_ioctl+0x40e/0x451 [<ffffffff817deda7>] ? sysret_check+0x1b/0x56 [<ffffffff8113073c>] SyS_ioctl+0x57/0x87 [<ffffffff8135bbfe>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff817ded82>] system_call_fastpath+0x16/0x1b Code: 48 c7 c6 84 30 0e a0 31 c0 e8 d0 e9 f7 ff bf c6 a7 00 00 e8 07 af 2c e1 41 f6 84 24 03 01 00 00 10 75 44 49 8b 84 24 08 01 00 00 <8b> 50 08 48 8b 30 49 8b 86 b0 04 00 00 48 89 c7 48 81 c7 98 00 RIP [<ffffffffa008fb37>] i915_gem_gtt_finish_object+0x73/0xc8 [i915] RSP <ffff88004bdf5958> CR2: 0000000000000008 As a consequence we need to change the "only one vma for now" check in vma_unbind - since vma_destroy isn't always called the obj->vma_list might not be empty. Instead check that the vma list is singular at the beginning of vma_unbind. This is also more symmetric with bind_to_vm. This fixes the igt/gem_evict_everything|alignment testcases. v2: - Add a paranoid WARN to mark_free in the eviction code to make sure we never try to evict a vma used by the execbuf code right now. - Move the check for a temporary execbuf vma into vma_destroy - otherwise the failure path cleanup in bind_to_vm will blow up. Our first attempting at fixing this was commit 1be81a2f2cfd8789a627401d470423358fba2d76 Author: Chris Wilson <[email protected]> Date: Tue Aug 20 12:56:40 2013 +0100 drm/i915: Don't destroy the vma placeholder during execbuffer reservation Squash with this when merging! v3: Improvements suggested in Chris' review: - Move the WARN_ON in vma_destroy that checks for vmas with an drm_mm allocation before the early return. - Bail out if we hit the WARN in mark_free to hopefully make the kernel survive for long enough to capture it. Cc: Chris Wilson <[email protected]> Cc: Ben Widawsky <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68298 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68171 Tested-by: lu hua <[email protected]> (v2) Reviewed-by: Chris Wilson <[email protected]> Signed-off-by: Daniel Vetter <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
Recently had this Oops reported to me on the 3.10 kernel: [ 807.554955] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [ 807.562799] IP: [<ffffffff814e6fc7>] skb_dequeue+0x47/0x70 [ 807.568296] PGD 20c889067 PUD 20c8b8067 PMD 0 [ 807.572769] Oops: 0002 [beagleboard#1] SMP [ 807.655597] Hardware name: Dell Inc. PowerEdge R415/0DDT2D, BIOS 1.8.6 12/06/2011 [ 807.663079] Workqueue: events fcoe_ctlr_recv_work [libfcoe] [ 807.668656] task: ffff88020b42a160 ti: ffff88020ae6c000 task.ti: ffff88020ae6c000 [ 807.676126] RIP: 0010:[<ffffffff814e6fc7>] [<ffffffff814e6fc7>] skb_dequeue+0x47/0x70 [ 807.684046] RSP: 0000:ffff88020ae6dd70 EFLAGS: 00010097 [ 807.689349] RAX: 0000000000000246 RBX: ffff8801d04d6700 RCX: 0000000000000000 [ 807.696474] RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff88020df26434 [ 807.703598] RBP: ffff88020ae6dd88 R08: 00000000000173e0 R09: ffff880216e173e0 [ 807.710723] R10: ffffffff814e5897 R11: ffffea0007413580 R12: ffff88020df26420 [ 807.717847] R13: ffff88020df26434 R14: 0000000000000004 R15: ffff8801d04c42ce [ 807.724972] FS: 00007fdaab6048c0(0000) GS:ffff880216e00000(0000) knlGS:0000000000000000 [ 807.733049] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 807.738785] CR2: 0000000000000008 CR3: 000000020cbc9000 CR4: 00000000000006f0 [ 807.745910] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 807.753033] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 807.760156] Stack: [ 807.762162] ffff8801d04d6700 0000000000000001 ffff88020df26400 ffff88020ae6de20 [ 807.769586] ffffffffa0444409 ffff88020b046a00 ffff88020ae6dde8 ffffffff810105be [ 807.777008] ffff88020b42a868 0000000000000000 ffff88020df264a8 ffff88020df26348 [ 807.784431] Call Trace: [ 807.786885] [<ffffffffa0444409>] fcoe_ctlr_recv_work+0x59/0x9a0 [libfcoe] [ 807.793755] [<ffffffff810105be>] ? __switch_to+0x13e/0x4a0 [ 807.799324] [<ffffffff8107d0e6>] process_one_work+0x176/0x420 [ 807.805151] [<ffffffff8107dd0b>] worker_thread+0x11b/0x3a0 [ 807.810717] [<ffffffff8107dbf0>] ? rescuer_thread+0x350/0x350 [ 807.816545] [<ffffffff810842b0>] kthread+0xc0/0xd0 [ 807.821416] [<ffffffff810841f0>] ? insert_kthread_work+0x40/0x40 [ 807.827503] [<ffffffff8160ce2c>] ret_from_fork+0x7c/0xb0 [ 807.832897] [<ffffffff810841f0>] ? insert_kthread_work+0x40/0x40 [ 807.858500] RIP [<ffffffff814e6fc7>] skb_dequeue+0x47/0x70 [ 807.864076] RSP <ffff88020ae6dd70> [ 807.867558] CR2: 0000000000000008 Looks like the root cause is the fact that the packet recieve function fcoe_ctlr_recv enqueues the skb to a sk_buff_head_list prior to ensuring that the skb is unshared. This can happen when multiple packet listeners recieve an skb, as the deliver_skb function just increments skb->users for each handler. As a result, having multiple users of a single skb results in multiple manipulators of its methods, implying list corruption, and the oops recorded above. The fix is pretty easy, just make sure that we clone the skb if its got multiple users with the skb_share_check function, like other protocols do. Signed-off-by: Neil Horman <[email protected]> Signed-off-by: Robert Love <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
If memory allocation of in pcpu_embed_first_chunk() fails, the allocated memory is not released correctly. In the release loop also the non-allocated elements are released which leads to the following kernel BUG on systems with very little memory: [ 0.000000] kernel BUG at mm/bootmem.c:307! [ 0.000000] illegal operation: 0001 [beagleboard#1] PREEMPT SMP DEBUG_PAGEALLOC [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.10.0 beagleboard#22 [ 0.000000] task: 0000000000a20ae0 ti: 0000000000a08000 task.ti: 0000000000a08000 [ 0.000000] Krnl PSW : 0400000180000000 0000000000abda7a (__free+0x116/0x154) [ 0.000000] R:0 T:1 IO:0 EX:0 Key:0 M:0 W:0 P:0 AS:0 CC:0 PM:0 EA:3 ... [ 0.000000] [<0000000000abdce2>] mark_bootmem_node+0xde/0xf0 [ 0.000000] [<0000000000abdd9c>] mark_bootmem+0xa8/0x118 [ 0.000000] [<0000000000abcbba>] pcpu_embed_first_chunk+0xe7a/0xf0c [ 0.000000] [<0000000000abcc96>] setup_per_cpu_areas+0x4a/0x28c To fix the problem now only allocated elements are released. This then leads to the correct kernel panic: [ 0.000000] Kernel panic - not syncing: Failed to initialize percpu areas. ... [ 0.000000] Call Trace: [ 0.000000] ([<000000000011307e>] show_trace+0x132/0x150) [ 0.000000] [<0000000000113160>] show_stack+0xc4/0xd4 [ 0.000000] [<00000000007127dc>] dump_stack+0x74/0xd8 [ 0.000000] [<00000000007123fe>] panic+0xea/0x264 [ 0.000000] [<0000000000b14814>] setup_per_cpu_areas+0x5c/0x28c tj: Flipped if conditional so that it doesn't need "continue". Signed-off-by: Michael Holzheu <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
__get_cpu_var() is used for multiple purposes in the kernel source. One of them is address calculation via the form &__get_cpu_var(x). This calculates the address for the instance of the percpu variable of the current processor based on an offset. Other use cases are for storing and retrieving data from the current processors percpu area. __get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment. __get_cpu_var() is defined as : __get_cpu_var() always only does an address determination. However, store and retrieve operations could use a segment prefix (or global register on other platforms) to avoid the address calculation. this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use optimized assembly code to read and write per cpu variables. This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr() or into a use of this_cpu operations that use the offset. Thereby address calcualtions are avoided and less registers are used when code is generated. At the end of the patchset all uses of __get_cpu_var have been removed so the macro is removed too. The patchset includes passes over all arches as well. Once these operations are used throughout then specialized macros can be defined in non -x86 arches as well in order to optimize per cpu access by f.e. using a global register that may be set to the per cpu base. Transformations done to __get_cpu_var() 1. Determine the address of the percpu instance of the current processor. DEFINE_PER_CPU(int, y); int *x = &__get_cpu_var(y); Converts to int *x = this_cpu_ptr(&y); 2. Same as beagleboard#1 but this time an array structure is involved. DEFINE_PER_CPU(int, y[20]); int *x = __get_cpu_var(y); Converts to int *x = this_cpu_ptr(y); 3. Retrieve the content of the current processors instance of a per cpu variable. DEFINE_PER_CPU(int, u); int x = __get_cpu_var(y) Converts to int x = __this_cpu_read(y); 4. Retrieve the content of a percpu struct DEFINE_PER_CPU(struct mystruct, y); struct mystruct x = __get_cpu_var(y); Converts to memcpy(this_cpu_ptr(&x), y, sizeof(x)); 5. Assignment to a per cpu variable DEFINE_PER_CPU(int, y) __get_cpu_var(y) = x; Converts to this_cpu_write(y, x); 6. Increment/Decrement etc of a per cpu variable DEFINE_PER_CPU(int, y); __get_cpu_var(y)++ Converts to this_cpu_inc(y) Signed-off-by: Christoph Lameter <[email protected]> [ paulmck: Address conflicts. ] Signed-off-by: Paul E. McKenney <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
Tested via g_ether: $ modprobe g_ether using random self ethernet address using random host ethernet address usb0: HOST MAC 42:b5:26:a9:48:21 usb0: MAC 36:a6:85:9b:9e:13 using random self ethernet address using random host ethernet address g_ether gadget: Ethernet Gadget, version: Memorial Day 2008 g_ether gadget: g_ether ready $ g_ether gadget: high-speed config beagleboard#1: CDC Ethernet (ECM) $ ifconfig usb0 10.0.0.2 Then on the PC host side: ~$ sudo ifconfig usb0 10.0.0.1 ~$ ping 10.0.0.2 PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data. 64 bytes from 10.0.0.2: icmp_req=1 ttl=64 time=1.26 ms 64 bytes from 10.0.0.2: icmp_req=2 ttl=64 time=0.280 ms 64 bytes from 10.0.0.2: icmp_req=3 ttl=64 time=0.297 ms Signed-off-by: Fabio Estevam <[email protected]> Signed-off-by: Shawn Guo <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
The SELinux/NetLabel glue code has a locking bug that affects systems with NetLabel enabled, see the kernel error message below. This patch corrects this problem by converting the bottom half socket lock to a more conventional, and correct for this call-path, lock_sock() call. =============================== [ INFO: suspicious RCU usage. ] 3.11.0-rc3+ beagleboard#19 Not tainted ------------------------------- net/ipv4/cipso_ipv4.c:1928 suspicious rcu_dereference_protected() usage! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by ping/731: #0: (slock-AF_INET/1){+.-...}, at: [...] selinux_netlbl_socket_connect beagleboard#1: (rcu_read_lock){.+.+..}, at: [<...>] netlbl_conn_setattr stack backtrace: CPU: 1 PID: 731 Comm: ping Not tainted 3.11.0-rc3+ beagleboard#19 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 0000000000000001 ffff88006f659d28 ffffffff81726b6a ffff88003732c500 ffff88006f659d58 ffffffff810e4457 ffff88006b845a00 0000000000000000 000000000000000c ffff880075aa2f50 ffff88006f659d90 ffffffff8169bec7 Call Trace: [<ffffffff81726b6a>] dump_stack+0x54/0x74 [<ffffffff810e4457>] lockdep_rcu_suspicious+0xe7/0x120 [<ffffffff8169bec7>] cipso_v4_sock_setattr+0x187/0x1a0 [<ffffffff8170f317>] netlbl_conn_setattr+0x187/0x190 [<ffffffff8170f195>] ? netlbl_conn_setattr+0x5/0x190 [<ffffffff8131ac9e>] selinux_netlbl_socket_connect+0xae/0xc0 [<ffffffff81303025>] selinux_socket_connect+0x135/0x170 [<ffffffff8119d127>] ? might_fault+0x57/0xb0 [<ffffffff812fb146>] security_socket_connect+0x16/0x20 [<ffffffff815d3ad3>] SYSC_connect+0x73/0x130 [<ffffffff81739a85>] ? sysret_check+0x22/0x5d [<ffffffff810e5e2d>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [<ffffffff81373d4e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff815d52be>] SyS_connect+0xe/0x10 [<ffffffff81739a59>] system_call_fastpath+0x16/0x1b Cc: [email protected] Signed-off-by: Paul Moore <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
…nfs_open() Use i_writecount to control whether to get an fscache cookie in nfs_open() as NFS does not do write caching yet. I *think* this is the cause of a problem encountered by Mark Moseley whereby __fscache_uncache_page() gets a NULL pointer dereference because cookie->def is NULL: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 IP: [<ffffffff812a1903>] __fscache_uncache_page+0x23/0x160 PGD 0 Thread overran stack, or stack corrupted Oops: 0000 [beagleboard#1] SMP Modules linked in: ... CPU: 7 PID: 18993 Comm: php Not tainted 3.11.1 beagleboard#1 Hardware name: Dell Inc. PowerEdge R420/072XWF, BIOS 1.3.5 08/21/2012 task: ffff8804203460c0 ti: ffff880420346640 RIP: 0010:[<ffffffff812a1903>] __fscache_uncache_page+0x23/0x160 RSP: 0018:ffff8801053af878 EFLAGS: 00210286 RAX: 0000000000000000 RBX: ffff8800be2f8780 RCX: ffff88022ffae5e8 RDX: 0000000000004c66 RSI: ffffea00055ff440 RDI: ffff8800be2f8780 RBP: ffff8801053af898 R08: 0000000000000001 R09: 0000000000000003 R10: 0000000000000000 R11: 0000000000000000 R12: ffffea00055ff440 R13: 0000000000001000 R14: ffff8800c50be538 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88042fc60000(0063) knlGS:00000000e439c700 CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 CR2: 0000000000000010 CR3: 0000000001d8f000 CR4: 00000000000607f0 Stack: ... Call Trace: [<ffffffff81365a72>] __nfs_fscache_invalidate_page+0x42/0x70 [<ffffffff813553d5>] nfs_invalidate_page+0x75/0x90 [<ffffffff811b8f5e>] truncate_inode_page+0x8e/0x90 [<ffffffff811b90ad>] truncate_inode_pages_range.part.12+0x14d/0x620 [<ffffffff81d6387d>] ? __mutex_lock_slowpath+0x1fd/0x2e0 [<ffffffff811b95d3>] truncate_inode_pages_range+0x53/0x70 [<ffffffff811b969d>] truncate_inode_pages+0x2d/0x40 [<ffffffff811b96ff>] truncate_pagecache+0x4f/0x70 [<ffffffff81356840>] nfs_setattr_update_inode+0xa0/0x120 [<ffffffff81368de4>] nfs3_proc_setattr+0xc4/0xe0 [<ffffffff81357f78>] nfs_setattr+0xc8/0x150 [<ffffffff8122d95b>] notify_change+0x1cb/0x390 [<ffffffff8120a55b>] do_truncate+0x7b/0xc0 [<ffffffff8121f96c>] do_last+0xa4c/0xfd0 [<ffffffff8121ffbc>] path_openat+0xcc/0x670 [<ffffffff81220a0e>] do_filp_open+0x4e/0xb0 [<ffffffff8120ba1f>] do_sys_open+0x13f/0x2b0 [<ffffffff8126aaf6>] compat_SyS_open+0x36/0x50 [<ffffffff81d7204c>] sysenter_dispatch+0x7/0x24 The code at the instruction pointer was disassembled: > (gdb) disas __fscache_uncache_page > Dump of assembler code for function __fscache_uncache_page: > ... > 0xffffffff812a18ff <+31>: mov 0x48(%rbx),%rax > 0xffffffff812a1903 <+35>: cmpb $0x0,0x10(%rax) > 0xffffffff812a1907 <+39>: je 0xffffffff812a19cd <__fscache_uncache_page+237> These instructions make up: ASSERTCMP(cookie->def->type, !=, FSCACHE_COOKIE_TYPE_INDEX); That cmpb is the faulting instruction (%rax is 0). So cookie->def is NULL - which presumably means that the cookie has already been at least partway through __fscache_relinquish_cookie(). What I think may be happening is something like a three-way race on the same file: PROCESS 1 PROCESS 2 PROCESS 3 =============== =============== =============== open(O_TRUNC|O_WRONLY) open(O_RDONLY) open(O_WRONLY) -->nfs_open() -->nfs_fscache_set_inode_cookie() nfs_fscache_inode_lock() nfs_fscache_disable_inode_cookie() __fscache_relinquish_cookie() nfs_inode->fscache = NULL <--nfs_fscache_set_inode_cookie() -->nfs_open() -->nfs_fscache_set_inode_cookie() nfs_fscache_inode_lock() nfs_fscache_enable_inode_cookie() __fscache_acquire_cookie() nfs_inode->fscache = cookie <--nfs_fscache_set_inode_cookie() <--nfs_open() -->nfs_setattr() ... ... -->nfs_invalidate_page() -->__nfs_fscache_invalidate_page() cookie = nfsi->fscache -->nfs_open() -->nfs_fscache_set_inode_cookie() nfs_fscache_inode_lock() nfs_fscache_disable_inode_cookie() -->__fscache_relinquish_cookie() -->__fscache_uncache_page(cookie) <crash> <--__fscache_relinquish_cookie() nfs_inode->fscache = NULL <--nfs_fscache_set_inode_cookie() What is needed is something to prevent process beagleboard#2 from reacquiring the cookie - and I think checking i_writecount should do the trick. It's also possible to have a two-way race on this if the file is opened O_TRUNC|O_RDONLY instead. Reported-by: Mark Moseley <[email protected]> Signed-off-by: David Howells <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
As the new x86 CPU bootup printout format code maintainer, I am taking immediate action to improve and clean (and thus indulge my OCD) the reporting of the cores when coming up online. Fix padding to a right-hand alignment, cleanup code and bind reporting width to the max number of supported CPUs on the system, like this: [ 0.074509] smpboot: Booting Node 0, Processors: beagleboard#1 beagleboard#2 beagleboard#3 beagleboard#4 beagleboard#5 beagleboard#6 beagleboard#7 OK [ 0.644008] smpboot: Booting Node 1, Processors: beagleboard#8 beagleboard#9 beagleboard#10 beagleboard#11 beagleboard#12 beagleboard#13 beagleboard#14 beagleboard#15 OK [ 1.245006] smpboot: Booting Node 2, Processors: beagleboard#16 beagleboard#17 beagleboard#18 beagleboard#19 beagleboard#20 beagleboard#21 beagleboard#22 beagleboard#23 OK [ 1.864005] smpboot: Booting Node 3, Processors: beagleboard#24 beagleboard#25 beagleboard#26 beagleboard#27 beagleboard#28 beagleboard#29 beagleboard#30 beagleboard#31 OK [ 2.489005] smpboot: Booting Node 4, Processors: beagleboard#32 beagleboard#33 beagleboard#34 beagleboard#35 beagleboard#36 beagleboard#37 beagleboard#38 beagleboard#39 OK [ 3.093005] smpboot: Booting Node 5, Processors: beagleboard#40 beagleboard#41 beagleboard#42 beagleboard#43 beagleboard#44 beagleboard#45 beagleboard#46 beagleboard#47 OK [ 3.698005] smpboot: Booting Node 6, Processors: beagleboard#48 beagleboard#49 beagleboard#50 beagleboard#51 beagleboard#52 beagleboard#53 beagleboard#54 beagleboard#55 OK [ 4.304005] smpboot: Booting Node 7, Processors: beagleboard#56 beagleboard#57 beagleboard#58 beagleboard#59 beagleboard#60 beagleboard#61 beagleboard#62 beagleboard#63 OK [ 4.961413] Brought up 64 CPUs and this: [ 0.072367] smpboot: Booting Node 0, Processors: beagleboard#1 beagleboard#2 beagleboard#3 beagleboard#4 beagleboard#5 beagleboard#6 beagleboard#7 OK [ 0.686329] Brought up 8 CPUs Signed-off-by: Borislav Petkov <[email protected]> Cc: Libin <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
Michael Semon reported that xfs/299 generated this lockdep warning: ============================================= [ INFO: possible recursive locking detected ] 3.12.0-rc2+ beagleboard#2 Not tainted --------------------------------------------- touch/21072 is trying to acquire lock: (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64 but task is already holding lock: (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&xfs_dquot_other_class); lock(&xfs_dquot_other_class); *** DEADLOCK *** May be due to missing lock nesting notation 7 locks held by touch/21072: #0: (sb_writers#10){++++.+}, at: [<c11185b6>] mnt_want_write+0x1e/0x3e beagleboard#1: (&type->i_mutex_dir_key#4){+.+.+.}, at: [<c11078ee>] do_last+0x245/0xe40 beagleboard#2: (sb_internal#2){++++.+}, at: [<c122c9e0>] xfs_trans_alloc+0x1f/0x35 beagleboard#3: (&(&ip->i_lock)->mr_lock/1){+.+...}, at: [<c126cd1b>] xfs_ilock+0x100/0x1f1 beagleboard#4: (&(&ip->i_lock)->mr_lock){++++-.}, at: [<c126cf52>] xfs_ilock_nowait+0x105/0x22f beagleboard#5: (&dqp->q_qlock){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64 beagleboard#6: (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64 The lockdep annotation for dquot lock nesting only understands locking for user and "other" dquots, not user, group and quota dquots. Fix the annotations to match the locking heirarchy we now have. Reported-by: Michael L. Semon <[email protected]> Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Ben Myers <[email protected]> Signed-off-by: Ben Myers <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
Turn it into (for example): [ 0.073380] x86: Booting SMP configuration: [ 0.074005] .... node #0, CPUs: beagleboard#1 beagleboard#2 beagleboard#3 beagleboard#4 beagleboard#5 beagleboard#6 beagleboard#7 [ 0.603005] .... node beagleboard#1, CPUs: beagleboard#8 beagleboard#9 beagleboard#10 beagleboard#11 beagleboard#12 beagleboard#13 beagleboard#14 beagleboard#15 [ 1.200005] .... node beagleboard#2, CPUs: beagleboard#16 beagleboard#17 beagleboard#18 beagleboard#19 beagleboard#20 beagleboard#21 beagleboard#22 beagleboard#23 [ 1.796005] .... node beagleboard#3, CPUs: beagleboard#24 beagleboard#25 beagleboard#26 beagleboard#27 beagleboard#28 beagleboard#29 beagleboard#30 beagleboard#31 [ 2.393005] .... node beagleboard#4, CPUs: beagleboard#32 beagleboard#33 beagleboard#34 beagleboard#35 beagleboard#36 beagleboard#37 beagleboard#38 beagleboard#39 [ 2.996005] .... node beagleboard#5, CPUs: beagleboard#40 beagleboard#41 beagleboard#42 beagleboard#43 beagleboard#44 beagleboard#45 beagleboard#46 beagleboard#47 [ 3.600005] .... node beagleboard#6, CPUs: beagleboard#48 beagleboard#49 beagleboard#50 beagleboard#51 beagleboard#52 beagleboard#53 beagleboard#54 beagleboard#55 [ 4.202005] .... node beagleboard#7, CPUs: beagleboard#56 beagleboard#57 beagleboard#58 beagleboard#59 beagleboard#60 beagleboard#61 beagleboard#62 beagleboard#63 [ 4.811005] .... node beagleboard#8, CPUs: beagleboard#64 beagleboard#65 beagleboard#66 beagleboard#67 beagleboard#68 beagleboard#69 beagleboard#70 beagleboard#71 [ 5.421006] .... node beagleboard#9, CPUs: beagleboard#72 beagleboard#73 beagleboard#74 beagleboard#75 beagleboard#76 beagleboard#77 beagleboard#78 beagleboard#79 [ 6.032005] .... node beagleboard#10, CPUs: beagleboard#80 beagleboard#81 beagleboard#82 beagleboard#83 beagleboard#84 beagleboard#85 beagleboard#86 beagleboard#87 [ 6.648006] .... node beagleboard#11, CPUs: beagleboard#88 beagleboard#89 beagleboard#90 beagleboard#91 beagleboard#92 beagleboard#93 beagleboard#94 beagleboard#95 [ 7.262005] .... node beagleboard#12, CPUs: beagleboard#96 beagleboard#97 beagleboard#98 beagleboard#99 beagleboard#100 beagleboard#101 beagleboard#102 beagleboard#103 [ 7.865005] .... node beagleboard#13, CPUs: beagleboard#104 beagleboard#105 beagleboard#106 beagleboard#107 beagleboard#108 beagleboard#109 beagleboard#110 beagleboard#111 [ 8.466005] .... node beagleboard#14, CPUs: beagleboard#112 beagleboard#113 beagleboard#114 beagleboard#115 beagleboard#116 beagleboard#117 beagleboard#118 beagleboard#119 [ 9.073006] .... node beagleboard#15, CPUs: beagleboard#120 beagleboard#121 beagleboard#122 beagleboard#123 beagleboard#124 beagleboard#125 beagleboard#126 beagleboard#127 [ 9.679901] x86: Booted up 16 nodes, 128 CPUs and drop useless elements. Change num_digits() to hpa's division-avoiding, cell-phone-typed version which he went at great lengths and pains to submit on a Saturday evening. Signed-off-by: Borislav Petkov <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Linus Torvalds <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
IPv4 mapped addresses cause kernel panic. The patch juste check whether the IPv6 address is an IPv4 mapped address. If so, use IPv4 API instead of IPv6. [ 940.026915] general protection fault: 0000 [beagleboard#1] [ 940.026915] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core pppox ppp_generic slhc loop psmouse [ 940.026915] CPU: 0 PID: 3184 Comm: memcheck-amd64- Not tainted 3.11.0+ beagleboard#1 [ 940.026915] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 940.026915] task: ffff880007130e20 ti: ffff88000737e000 task.ti: ffff88000737e000 [ 940.026915] RIP: 0010:[<ffffffff81333780>] [<ffffffff81333780>] ip6_xmit+0x276/0x326 [ 940.026915] RSP: 0018:ffff88000737fd28 EFLAGS: 00010286 [ 940.026915] RAX: c748521a75ceff48 RBX: ffff880000c30800 RCX: 0000000000000000 [ 940.026915] RDX: ffff88000075cc4e RSI: 0000000000000028 RDI: ffff8800060e5a40 [ 940.026915] RBP: ffff8800060e5a40 R08: 0000000000000000 R09: ffff88000075cc90 [ 940.026915] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88000737fda0 [ 940.026915] R13: 0000000000000000 R14: 0000000000002000 R15: ffff880005d3b580 [ 940.026915] FS: 00007f163dc5e800(0000) GS:ffffffff81623000(0000) knlGS:0000000000000000 [ 940.026915] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 940.026915] CR2: 00000004032dc940 CR3: 0000000005c25000 CR4: 00000000000006f0 [ 940.026915] Stack: [ 940.026915] ffff88000075cc4e ffffffff81694e90 ffff880000c30b38 0000000000000020 [ 940.026915] 11000000523c4bac ffff88000737fdb4 0000000000000000 ffff880000c30800 [ 940.026915] ffff880005d3b580 ffff880000c30b38 ffff8800060e5a40 0000000000000020 [ 940.026915] Call Trace: [ 940.026915] [<ffffffff81356cc3>] ? inet6_csk_xmit+0xa4/0xc4 [ 940.026915] [<ffffffffa0038535>] ? l2tp_xmit_skb+0x503/0x55a [l2tp_core] [ 940.026915] [<ffffffff812b8d3b>] ? pskb_expand_head+0x161/0x214 [ 940.026915] [<ffffffffa003e91d>] ? pppol2tp_xmit+0xf2/0x143 [l2tp_ppp] [ 940.026915] [<ffffffffa00292e0>] ? ppp_channel_push+0x36/0x8b [ppp_generic] [ 940.026915] [<ffffffffa00293fe>] ? ppp_write+0xaf/0xc5 [ppp_generic] [ 940.026915] [<ffffffff8110ead4>] ? vfs_write+0xa2/0x106 [ 940.026915] [<ffffffff8110edd6>] ? SyS_write+0x56/0x8a [ 940.026915] [<ffffffff81378ac0>] ? system_call_fastpath+0x16/0x1b [ 940.026915] Code: 00 49 8b 8f d8 00 00 00 66 83 7c 11 02 00 74 60 49 8b 47 58 48 83 e0 fe 48 8b 80 18 01 00 00 48 85 c0 74 13 48 8b 80 78 02 00 00 <48> ff 40 28 41 8b 57 68 48 01 50 30 48 8b 54 24 08 49 c7 c1 51 [ 940.026915] RIP [<ffffffff81333780>] ip6_xmit+0x276/0x326 [ 940.026915] RSP <ffff88000737fd28> [ 940.057945] ---[ end trace be8aba9a61c8b7f3 ]--- [ 940.058583] Kernel panic - not syncing: Fatal exception in interrupt Signed-off-by: François CACHEREUL <[email protected]> Signed-off-by: David S. Miller <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
With DT-based boot, the GPMC OneNAND sync mode setup does not work correctly. During the async mode setup, sync flags gets incorrectly set in the onenand_async data and the system crashes during the async setup. Also, the sync mode never gets set in gpmc_onenand_data->flags, so even without the crash, the actual sync mode setup would never be called. The patch fixes this by adjusting the gpmc_onenand_data->flags when the data is read from the DT. Also while doing this we force the onenand_async to be always async. The patch enables to use the following DTS chunk (that should correspond the arch/arm/mach-omap2/board-rm680.c board file setup) with Nokia N950, which currently crashes with 3.12-rc1. The crash output can be also found below. &gpmc { ranges = <0 0 0x04000000 0x20000000>; onenand@0,0 { #address-cells = <1>; #size-cells = <1>; reg = <0 0 0x20000000>; gpmc,sync-read; gpmc,sync-write; gpmc,burst-length = <16>; gpmc,burst-read; gpmc,burst-wrap; gpmc,burst-write; gpmc,device-width = <2>; gpmc,mux-add-data = <2>; gpmc,cs-on-ns = <0>; gpmc,cs-rd-off-ns = <87>; gpmc,cs-wr-off-ns = <87>; gpmc,adv-on-ns = <0>; gpmc,adv-rd-off-ns = <10>; gpmc,adv-wr-off-ns = <10>; gpmc,oe-on-ns = <15>; gpmc,oe-off-ns = <87>; gpmc,we-on-ns = <0>; gpmc,we-off-ns = <87>; gpmc,rd-cycle-ns = <112>; gpmc,wr-cycle-ns = <112>; gpmc,access-ns = <81>; gpmc,page-burst-access-ns = <15>; gpmc,bus-turnaround-ns = <0>; gpmc,cycle2cycle-delay-ns = <0>; gpmc,wait-monitoring-ns = <0>; gpmc,clk-activation-ns = <5>; gpmc,wr-data-mux-bus-ns = <30>; gpmc,wr-access-ns = <81>; gpmc,sync-clk-ps = <15000>; }; }; [ 1.467559] GPMC CS0: cs_on : 0 ticks, 0 ns (was 0 ticks) 0 ns [ 1.474822] GPMC CS0: cs_rd_off : 1 ticks, 5 ns (was 24 ticks) 5 ns [ 1.482116] GPMC CS0: cs_wr_off : 14 ticks, 71 ns (was 24 ticks) 71 ns [ 1.489349] GPMC CS0: adv_on : 0 ticks, 0 ns (was 0 ticks) 0 ns [ 1.496582] GPMC CS0: adv_rd_off: 3 ticks, 15 ns (was 3 ticks) 15 ns [ 1.503845] GPMC CS0: adv_wr_off: 3 ticks, 15 ns (was 3 ticks) 15 ns [ 1.511077] GPMC CS0: oe_on : 3 ticks, 15 ns (was 4 ticks) 15 ns [ 1.518310] GPMC CS0: oe_off : 1 ticks, 5 ns (was 24 ticks) 5 ns [ 1.525543] GPMC CS0: we_on : 0 ticks, 0 ns (was 0 ticks) 0 ns [ 1.532806] GPMC CS0: we_off : 8 ticks, 40 ns (was 24 ticks) 40 ns [ 1.540039] GPMC CS0: rd_cycle : 4 ticks, 20 ns (was 29 ticks) 20 ns [ 1.547302] GPMC CS0: wr_cycle : 4 ticks, 20 ns (was 29 ticks) 20 ns [ 1.554504] GPMC CS0: access : 0 ticks, 0 ns (was 23 ticks) 0 ns [ 1.561767] GPMC CS0: page_burst_access: 0 ticks, 0 ns (was 3 ticks) 0 ns [ 1.569641] GPMC CS0: bus_turnaround: 0 ticks, 0 ns (was 0 ticks) 0 ns [ 1.577270] GPMC CS0: cycle2cycle_delay: 0 ticks, 0 ns (was 0 ticks) 0 ns [ 1.585144] GPMC CS0: wait_monitoring: 0 ticks, 0 ns (was 0 ticks) 0 ns [ 1.592834] GPMC CS0: clk_activation: 0 ticks, 0 ns (was 0 ticks) 0 ns [ 1.600463] GPMC CS0: wr_data_mux_bus: 5 ticks, 25 ns (was 8 ticks) 25 ns [ 1.608154] GPMC CS0: wr_access : 0 ticks, 0 ns (was 23 ticks) 0 ns [ 1.615386] GPMC CS0 CLK period is 5 ns (div 1) [ 1.625122] Unhandled fault: external abort on non-linefetch (0x1008) at 0xf009e442 [ 1.633178] Internal error: : 1008 [beagleboard#1] ARM [ 1.637573] Modules linked in: [ 1.640777] CPU: 0 PID: 1 Comm: swapper Not tainted 3.12.0-rc1-n9xx-los.git-5318619-00006-g4baa700-dirty beagleboard#26 [ 1.651123] task: ef04c000 ti: ef050000 task.ti: ef050000 [ 1.656799] PC is at gpmc_onenand_setup+0x98/0x1e0 [ 1.661865] LR is at gpmc_cs_set_timings+0x494/0x5a4 [ 1.667083] pc : [<c002e040>] lr : [<c001f384>] psr: 60000113 [ 1.667083] sp : ef051d10 ip : ef051ce0 fp : ef051d94 [ 1.679138] r10: c0caaf60 r9 : ef050000 r8 : ef18b32c [ 1.684631] r7 : f0080000 r6 : c0caaf60 r5 : 00000000 r4 : f009e400 [ 1.691497] r3 : f009e442 r2 : 80050000 r1 : 00000014 r0 : 00000000 [ 1.698333] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel [ 1.706024] Control: 10c5387d Table: af290019 DAC: 00000015 [ 1.712066] Process swapper (pid: 1, stack limit = 0xef050240) [ 1.718200] Stack: (0xef051d10 to 0xef052000) [ 1.722778] 1d00: 00004000 00001402 00000000 00000005 [ 1.731384] 1d20: 00000047 00000000 0000000f 0000000f 00000000 00000028 0000000f 00000005 [ 1.739990] 1d40: 00000000 00000000 00000014 00000014 00000000 00000000 00000000 00000000 [ 1.748596] 1d60: 00000000 00000019 00000000 00000000 ef18b000 ef099c50 c0c8cb30 00000000 [ 1.757171] 1d80: c0488074 c048f868 ef051dcc ef051d98 c024447c c002dfb4 00000000 c048f868 [ 1.765777] 1da0: 00000000 00000000 c010e4a4 c0dbbb7c c0c8cb40 00000000 c0ca2500 c0488074 [ 1.774383] 1dc0: ef051ddc ef051dd0 c01fd508 c0244370 ef051dfc ef051de0 c01fc204 c01fd4f4 [ 1.782989] 1de0: c0c8cb40 c0ca2500 c0c8cb74 00000000 ef051e1c ef051e00 c01fc3b0 c01fc104 [ 1.791595] 1e00: ef0983bc 00000000 c0ca2500 c01fc31c ef051e44 ef051e20 c01fa794 c01fc328 [ 1.800201] 1e20: ef03634c ef0983b0 ef27d534 c0ca2500 ef27d500 c0c9a2f8 ef051e54 ef051e48 [ 1.808807] 1e40: c01fbcfc c01fa744 ef051e84 ef051e58 c01fb838 c01fbce4 c0411df8 c0caa040 [ 1.817413] 1e60: ef051e84 c0ca2500 00000006 c0caa040 00000066 c0488074 ef051e9c ef051e88 [ 1.825988] 1e80: c01fca30 c01fb768 c04975b8 00000006 ef051eac ef051ea0 c01fd728 c01fc9bc [ 1.834594] 1ea0: ef051ebc ef051eb0 c048808c c01fd6e4 ef051f4c ef051ec0 c0008888 c0488080 [ 1.843200] 1ec0: 0000006f c046bae8 00000000 00000000 ef051efc ef051ee0 ef051f04 ef051ee8 [ 1.851806] 1ee0: c046d400 c0181218 c046d410 c18da8d5 c036a8e4 00000066 ef051f4c ef051f08 [ 1.860412] 1f00: c004b9a8 c046d41c c048f840 00000006 00000006 c046b488 00000000 c043ec08 [ 1.869018] 1f20: ef051f4c c04975b8 00000006 c0caa040 00000066 c046d410 c048f85c c048f868 [ 1.877593] 1f40: ef051f94 ef051f50 c046db8c c00087a0 00000006 00000006 c046d410 ffffffff [ 1.886199] 1f60: ffffffff ffffffff ffffffff 00000000 c0348fd0 00000000 00000000 00000000 [ 1.894805] 1f80: 00000000 00000000 ef051fac ef051f98 c0348fe0 c046daa8 00000000 00000000 [ 1.903411] 1fa0: 00000000 ef051fb0 c000e7f8 c0348fdc 00000000 00000000 00000000 00000000 [ 1.912017] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 1.920623] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 ffffffff ffffffff [ 1.929199] Backtrace: [ 1.931793] [<c002dfa8>] (gpmc_onenand_setup+0x0/0x1e0) from [<c024447c>] (omap2_onenand_probe+0x118/0x49c) [ 1.942047] [<c0244364>] (omap2_onenand_probe+0x0/0x49c) from [<c01fd508>] (platform_drv_probe+0x20/0x24) [ 1.952117] r8:c0488074 r7:c0ca2500 r6:00000000 r5:c0c8cb40 r4:c0dbbb7c [ 1.959197] [<c01fd4e8>] (platform_drv_probe+0x0/0x24) from [<c01fc204>] (driver_probe_device+0x10c/0x224) [ 1.969360] [<c01fc0f8>] (driver_probe_device+0x0/0x224) from [<c01fc3b0>] (__driver_attach+0x94/0x98) [ 1.979125] r7:00000000 r6:c0c8cb74 r5:c0ca2500 r4:c0c8cb40 [ 1.985107] [<c01fc31c>] (__driver_attach+0x0/0x98) from [<c01fa794>] (bus_for_each_dev+0x5c/0x90) [ 1.994506] r6:c01fc31c r5:c0ca2500 r4:00000000 r3:ef0983bc [ 2.000488] [<c01fa738>] (bus_for_each_dev+0x0/0x90) from [<c01fbcfc>] (driver_attach+0x24/0x28) [ 2.009735] r6:c0c9a2f8 r5:ef27d500 r4:c0ca2500 [ 2.014587] [<c01fbcd8>] (driver_attach+0x0/0x28) from [<c01fb838>] (bus_add_driver+0xdc/0x260) [ 2.023742] [<c01fb75c>] (bus_add_driver+0x0/0x260) from [<c01fca30>] (driver_register+0x80/0xfc) [ 2.033081] r8:c0488074 r7:00000066 r6:c0caa040 r5:00000006 r4:c0ca2500 [ 2.040161] [<c01fc9b0>] (driver_register+0x0/0xfc) from [<c01fd728>] (__platform_driver_register+0x50/0x64) [ 2.050476] r5:00000006 r4:c04975b8 [ 2.054260] [<c01fd6d8>] (__platform_driver_register+0x0/0x64) from [<c048808c>] (omap2_onenand_driver_init+0x18/0x20) [ 2.065490] [<c0488074>] (omap2_onenand_driver_init+0x0/0x20) from [<c0008888>] (do_one_initcall+0xf4/0x150) [ 2.075836] [<c0008794>] (do_one_initcall+0x0/0x150) from [<c046db8c>] (kernel_init_freeable+0xf0/0x1b4) [ 2.085815] [<c046da9c>] (kernel_init_freeable+0x0/0x1b4) from [<c0348fe0>] (kernel_init+0x10/0xec) [ 2.095336] [<c0348fd0>] (kernel_init+0x0/0xec) from [<c000e7f8>] (ret_from_fork+0x14/0x3c) [ 2.104125] r4:00000000 r3:00000000 [ 2.107879] Code: ebffc3ae e2505000 ba00002e e2843042 (e1d320b0) [ 2.114318] ---[ end trace b8ee3e3e5e002451 ]--- Signed-off-by: Aaro Koskinen <[email protected]> Signed-off-by: Tony Lindgren <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
The current test for an attached enabled encoder fails if we have multiple connectors aliased to the same encoder - both connectors believe they own the enabled encoder and so we attempt to both enable and disable DPMS on the encoder, leading to hilarity and an OOPs: [ 354.803064] WARNING: CPU: 0 PID: 482 at /usr/src/linux/dist/3.11.2/drivers/gpu/drm/i915/intel_display.c:3869 intel_modeset_check_state+0x764/0x770 [i915]() [ 354.803064] wrong connector dpms state [ 354.803084] Modules linked in: nfsd auth_rpcgss oid_registry exportfs nfs lockd sunrpc xt_nat iptable_nat nf_nat_ipv4 nf_nat xt_limit xt_LOG xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT ipv6 xt_recent xt_conntrack nf_conntrack iptable_filter ip_tables x_tables snd_hda_codec_realtek snd_hda_codec_hdmi x86_pkg_temp_thermal snd_hda_intel coretemp kvm_intel snd_hda_codec i915 kvm snd_hwdep snd_pcm_oss snd_mixer_oss crc32_pclmul snd_pcm crc32c_intel e1000e intel_agp igb ghash_clmulni_intel intel_gtt aesni_intel cfbfillrect aes_x86_64 cfbimgblt lrw cfbcopyarea drm_kms_helper ptp video thermal processor gf128mul snd_page_alloc drm snd_timer glue_helper 8250_pci snd pps_core ablk_helper agpgart cryptd sg soundcore fan i2c_algo_bit sr_mod thermal_sys 8250 i2c_i801 serial_core hwmon cdrom i2c_core evdev button [ 354.803086] CPU: 0 PID: 482 Comm: kworker/0:1 Not tainted 3.11.2 beagleboard#1 [ 354.803087] Hardware name: Supermicro X10SAE/X10SAE, BIOS 1.00 05/03/2013 [ 354.803091] Workqueue: events console_callback [ 354.803092] 0000000000000009 ffff88023611db48 ffffffff814048ac ffff88023611db90 [ 354.803093] ffff88023611db80 ffffffff8103d4e3 ffff880230d82800 ffff880230f9b800 [ 354.803094] ffff880230f99000 ffff880230f99448 ffff8802351c0e00 ffff88023611dbe0 [ 354.803094] Call Trace: [ 354.803098] [<ffffffff814048ac>] dump_stack+0x54/0x8d [ 354.803101] [<ffffffff8103d4e3>] warn_slowpath_common+0x73/0x90 [ 354.803103] [<ffffffff8103d547>] warn_slowpath_fmt+0x47/0x50 [ 354.803109] [<ffffffffa089f1be>] ? intel_ddi_connector_get_hw_state+0x5e/0x110 [i915] [ 354.803114] [<ffffffffa0896974>] intel_modeset_check_state+0x764/0x770 [i915] [ 354.803117] [<ffffffffa08969bb>] intel_connector_dpms+0x3b/0x60 [i915] [ 354.803120] [<ffffffffa037e1d0>] drm_fb_helper_dpms.isra.11+0x120/0x160 [drm_kms_helper] [ 354.803122] [<ffffffffa037e24e>] drm_fb_helper_blank+0x3e/0x80 [drm_kms_helper] [ 354.803123] [<ffffffff812116c2>] fb_blank+0x52/0xc0 [ 354.803125] [<ffffffff8121e04b>] fbcon_blank+0x21b/0x2d0 [ 354.803127] [<ffffffff81062243>] ? update_rq_clock.part.74+0x13/0x30 [ 354.803129] [<ffffffff81047486>] ? lock_timer_base.isra.30+0x26/0x50 [ 354.803130] [<ffffffff810472b2>] ? internal_add_timer+0x12/0x40 [ 354.803131] [<ffffffff81047f48>] ? mod_timer+0xf8/0x1c0 [ 354.803133] [<ffffffff81266d61>] do_unblank_screen+0xa1/0x1c0 [ 354.803134] [<ffffffff81268087>] poke_blanked_console+0xc7/0xd0 [ 354.803136] [<ffffffff812681cf>] console_callback+0x13f/0x160 [ 354.803137] [<ffffffff81053258>] process_one_work+0x148/0x3d0 [ 354.803138] [<ffffffff81053f19>] worker_thread+0x119/0x3a0 [ 354.803140] [<ffffffff81053e00>] ? manage_workers.isra.30+0x2a0/0x2a0 [ 354.803141] [<ffffffff8105994b>] kthread+0xbb/0xc0 [ 354.803142] [<ffffffff81059890>] ? kthread_create_on_node+0x120/0x120 [ 354.803144] [<ffffffff8140b32c>] ret_from_fork+0x7c/0xb0 [ 354.803145] [<ffffffff81059890>] ? kthread_create_on_node+0x120/0x120 This regression goes back to the big modeset rework and the conversion to the new dpms helpers which started with: commit 5ab432e Author: Daniel Vetter <[email protected]> Date: Sat Jun 30 08:59:56 2012 +0200 drm/i915/hdmi: convert to encoder->disable/enable Fixes: igt/kms_flip/dpms-off-confusion Reported-and-tested-by: Wakko Warner <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68030 Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Chris Wilson <[email protected]> Cc: [email protected] [danvet: Add regression citation, mention the igt testcase this fixes and slap a cc: stable on the patch.] Signed-off-by: Daniel Vetter <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
->needs_read_fill is used to implement the following behaviors. 1. Ensure buffer filling on the first read. 2. Force buffer filling after a write. 3. Force buffer filling after a successful poll. However, beagleboard#2 and beagleboard#3 don't really work as sysfs doesn't reset file position. While the read buffer would be refilled, the next read would continue from the position after the last read or write, requiring an explicit seek to the start for it to be useful, which makes ->needs_read_fill superflous as read buffer is always refilled if f_pos == 0. Update sysfs_read_file() to test buffer->page for beagleboard#1 instead and remove ->needs_read_fill. While this changes behavior in extreme corner cases - e.g. re-reading a sysfs file after seeking to non-zero position after a write or poll, it's highly unlikely to lead to actual breakage. This change is to prepare for using seq_file in the read path. While at it, reformat a comment in fill_write_buffer(). Signed-off-by: Tejun Heo <[email protected]> Cc: Kay Sievers <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
Commit 1400eb6 (MIPS: r4k,octeon,r2300: stack protector: change canary per task) was merged in v3.11 and introduced assembly in the MIPS resume functions to update the value of the current canary in __stack_chk_guard. However it used PTR_L resulting in a load of the canary value, instead of PTR_LA to construct its address. The value is intended to be random but is then treated as an address in the subsequent LONG_S (store). This was observed to cause a fault and panic: CPU 0 Unable to handle kernel paging request at virtual address 139fea20, epc == 8000cc0c, ra == 8034f2a4 Oops[beagleboard#1]: ... $24 : 139fea20 1e1f7cb6 ... Call Trace: [<8000cc0c>] resume+0xac/0x118 [<8034f2a4>] __schedule+0x5f8/0x78c [<8034f4e0>] schedule_preempt_disabled+0x20/0x2c [<80348eec>] rest_init+0x74/0x84 [<804dc990>] start_kernel+0x43c/0x454 Code: 3c18804b 8f184030 8cb901f8 <af190000> 00c0e021 8cb002f0 8cb102f4 8cb202f8 8cb302fc This can also be forced by modifying arch/mips/include/asm/stackprotector.h so that the default __stack_chk_guard value is more likely to be a bad (or unaligned) pointer. Fix it to use PTR_LA instead, to load the address of the canary value, which the LONG_S can then use to write into it. Reported-by: bobjones (via #mipslinux on IRC) Signed-off-by: James Hogan <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Gregory Fong <[email protected]> Cc: [email protected] Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/6026/ Signed-off-by: Ralf Baechle <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
… smc beagleboard#1 Here is new version (v4) of omap secure part patch: Other secure functions omap_smc1() and omap_smc2() calling instruction smc #0 but Nokia RX-51 board needs to call smc beagleboard#1 for PPA access. Signed-off-by: Ivaylo Dimitrov <[email protected]> Signed-off-by: Pali Rohár <[email protected]> Signed-off-by: Tony Lindgren <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
Previously we coalesced windows by expanding the first overlapping one and making the second invalid. But we never look at the expanded first window again, so we fail to notice other windows that overlap it. For example, we coalesced these: [io 0x0000-0x03af] // #0 [io 0x03e0-0x0cf7] // beagleboard#1 [io 0x0000-0xdfff] // beagleboard#2 into these, which still overlap: [io 0x0000-0xdfff] // #0 [io 0x03e0-0x0cf7] // beagleboard#1 The fix is to expand the *second* overlapping resource and ignore the first, so we get this instead with no overlaps: [io 0x0000-0xdfff] // beagleboard#2 [bhelgaas: changelog] Reference: https://bugzilla.kernel.org/show_bug.cgi?id=62511 Signed-off-by: Alexey Neyman <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
François Cachereul made a very nice bug report and suspected the bh_lock_sock() / bh_unlok_sock() pair used in l2tp_xmit_skb() from process context was not good. This problem was added by commit 6af88da ("l2tp: Fix locking in l2tp_core.c"). l2tp_eth_dev_xmit() runs from BH context, so we must disable BH from other l2tp_xmit_skb() users. [ 452.060011] BUG: soft lockup - CPU#1 stuck for 23s! [accel-pppd:6662] [ 452.061757] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core pppoe pppox ppp_generic slhc ipv6 ext3 mbcache jbd virtio_balloon xfs exportfs dm_mod virtio_blk ata_generic virtio_net floppy ata_piix libata virtio_pci virtio_ring virtio [last unloaded: scsi_wait_scan] [ 452.064012] CPU 1 [ 452.080015] BUG: soft lockup - CPU#2 stuck for 23s! [accel-pppd:6643] [ 452.080015] CPU 2 [ 452.080015] [ 452.080015] Pid: 6643, comm: accel-pppd Not tainted 3.2.46.mini beagleboard#1 Bochs Bochs [ 452.080015] RIP: 0010:[<ffffffff81059f6c>] [<ffffffff81059f6c>] do_raw_spin_lock+0x17/0x1f [ 452.080015] RSP: 0018:ffff88007125fc18 EFLAGS: 00000293 [ 452.080015] RAX: 000000000000aba9 RBX: ffffffff811d0703 RCX: 0000000000000000 [ 452.080015] RDX: 00000000000000ab RSI: ffff8800711f6896 RDI: ffff8800745c8110 [ 452.080015] RBP: ffff88007125fc18 R08: 0000000000000020 R09: 0000000000000000 [ 452.080015] R10: 0000000000000000 R11: 0000000000000280 R12: 0000000000000286 [ 452.080015] R13: 0000000000000020 R14: 0000000000000240 R15: 0000000000000000 [ 452.080015] FS: 00007fdc0cc24700(0000) GS:ffff8800b6f00000(0000) knlGS:0000000000000000 [ 452.080015] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 452.080015] CR2: 00007fdb054899b8 CR3: 0000000074404000 CR4: 00000000000006a0 [ 452.080015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 452.080015] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 452.080015] Process accel-pppd (pid: 6643, threadinfo ffff88007125e000, task ffff8800b27e6dd0) [ 452.080015] Stack: [ 452.080015] ffff88007125fc28 ffffffff81256559 ffff88007125fc98 ffffffffa01b2bd1 [ 452.080015] ffff88007125fc58 000000000000000c 00000000029490d0 0000009c71dbe25e [ 452.080015] 000000000000005c 000000080000000e 0000000000000000 ffff880071170600 [ 452.080015] Call Trace: [ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core] [ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.080015] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.080015] Code: 81 48 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 <8a> 07 eb f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 [ 452.080015] Call Trace: [ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core] [ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.080015] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.064012] [ 452.064012] Pid: 6662, comm: accel-pppd Not tainted 3.2.46.mini beagleboard#1 Bochs Bochs [ 452.064012] RIP: 0010:[<ffffffff81059f6e>] [<ffffffff81059f6e>] do_raw_spin_lock+0x19/0x1f [ 452.064012] RSP: 0018:ffff8800b6e83ba0 EFLAGS: 00000297 [ 452.064012] RAX: 000000000000aaa9 RBX: ffff8800b6e83b40 RCX: 0000000000000002 [ 452.064012] RDX: 00000000000000aa RSI: 000000000000000a RDI: ffff8800745c8110 [ 452.064012] RBP: ffff8800b6e83ba0 R08: 000000000000c802 R09: 000000000000001c [ 452.064012] R10: ffff880071096c4e R11: 0000000000000006 R12: ffff8800b6e83b18 [ 452.064012] R13: ffffffff8125d51e R14: ffff8800b6e83ba0 R15: ffff880072a589c0 [ 452.064012] FS: 00007fdc0b81e700(0000) GS:ffff8800b6e80000(0000) knlGS:0000000000000000 [ 452.064012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 452.064012] CR2: 0000000000625208 CR3: 0000000074404000 CR4: 00000000000006a0 [ 452.064012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 452.064012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 452.064012] Process accel-pppd (pid: 6662, threadinfo ffff88007129a000, task ffff8800744f7410) [ 452.064012] Stack: [ 452.064012] ffff8800b6e83bb0 ffffffff81256559 ffff8800b6e83bc0 ffffffff8121c64a [ 452.064012] ffff8800b6e83bf0 ffffffff8121ec7a ffff880072a589c0 ffff880071096c62 [ 452.064012] 0000000000000011 ffffffff81430024 ffff8800b6e83c80 ffffffff8121f276 [ 452.064012] Call Trace: [ 452.064012] <IRQ> [ 452.064012] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb [ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269 [ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae [ 452.064012] [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0 [ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c [ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5 [ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84 [ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3 [ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269 [ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb [ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7 [ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e [ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b [ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net] [ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8 [ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12 [ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26 [ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82 [ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c [ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5 [ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e [ 452.064012] <EOI> [ 452.064012] [<ffffffff810b82a1>] ? kfree+0x8a/0xa3 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.064012] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.064012] Code: 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 8a 07 <eb> f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 55 48 [ 452.064012] Call Trace: [ 452.064012] <IRQ> [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb [ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269 [ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae [ 452.064012] [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0 [ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c [ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5 [ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84 [ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3 [ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269 [ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb [ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7 [ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e [ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b [ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net] [ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8 [ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12 [ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26 [ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82 [ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c [ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5 [ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e [ 452.064012] <EOI> [<ffffffff810b82a1>] ? kfree+0x8a/0xa3 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.064012] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b Reported-by: François Cachereul <[email protected]> Tested-by: François Cachereul <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Cc: James Chapman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
Booting a mx6 with CONFIG_PROVE_LOCKING we get: ====================================================== [ INFO: possible circular locking dependency detected ] 3.12.0-rc4-next-20131009+ beagleboard#34 Not tainted ------------------------------------------------------- swapper/0/1 is trying to acquire lock: (&imx_drm_device->mutex){+.+.+.}, at: [<804575a8>] imx_drm_encoder_get_mux_id+0x28/0x98 but task is already holding lock: (&crtc->mutex){+.+...}, at: [<802fe778>] drm_modeset_lock_all+0x40/0x54 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> beagleboard#2 (&crtc->mutex){+.+...}: [<800777d0>] __lock_acquire+0x18d4/0x1c24 [<80077fec>] lock_acquire+0x68/0x7c [<805ead5c>] _mutex_lock_nest_lock+0x58/0x3a8 [<802fec50>] drm_crtc_init+0x48/0xa8 [<80457c88>] imx_drm_add_crtc+0xd4/0x144 [<8045e2e8>] ipu_drm_probe+0x114/0x1fc [<80312278>] platform_drv_probe+0x20/0x50 [<80310c68>] driver_probe_device+0x110/0x22c [<80310e20>] __driver_attach+0x9c/0xa0 [<8030f218>] bus_for_each_dev+0x5c/0x90 [<80310750>] driver_attach+0x20/0x28 [<8031034c>] bus_add_driver+0xdc/0x1dc [<803114d8>] driver_register+0x80/0xfc [<80312198>] __platform_driver_register+0x50/0x64 [<808172fc>] ipu_drm_driver_init+0x18/0x20 [<800088c0>] do_one_initcall+0xfc/0x160 [<807e7c5c>] kernel_init_freeable+0x104/0x1d4 [<805e2930>] kernel_init+0x10/0xec [<8000ea68>] ret_from_fork+0x14/0x2c -> beagleboard#1 (&dev->mode_config.mutex){+.+.+.}: [<800777d0>] __lock_acquire+0x18d4/0x1c24 [<80077fec>] lock_acquire+0x68/0x7c [<805eb100>] mutex_lock_nested+0x54/0x3a4 [<802fe758>] drm_modeset_lock_all+0x20/0x54 [<802fead4>] drm_encoder_init+0x20/0x7c [<80457ae4>] imx_drm_add_encoder+0x88/0xec [<80459838>] imx_ldb_probe+0x344/0x4fc [<80312278>] platform_drv_probe+0x20/0x50 [<80310c68>] driver_probe_device+0x110/0x22c [<80310e20>] __driver_attach+0x9c/0xa0 [<8030f218>] bus_for_each_dev+0x5c/0x90 [<80310750>] driver_attach+0x20/0x28 [<8031034c>] bus_add_driver+0xdc/0x1dc [<803114d8>] driver_register+0x80/0xfc [<80312198>] __platform_driver_register+0x50/0x64 [<8081722c>] imx_ldb_driver_init+0x18/0x20 [<800088c0>] do_one_initcall+0xfc/0x160 [<807e7c5c>] kernel_init_freeable+0x104/0x1d4 [<805e2930>] kernel_init+0x10/0xec [<8000ea68>] ret_from_fork+0x14/0x2c -> #0 (&imx_drm_device->mutex){+.+.+.}: [<805e510c>] print_circular_bug+0x74/0x2e0 [<80077ad0>] __lock_acquire+0x1bd4/0x1c24 [<80077fec>] lock_acquire+0x68/0x7c [<805eb100>] mutex_lock_nested+0x54/0x3a4 [<804575a8>] imx_drm_encoder_get_mux_id+0x28/0x98 [<80459a98>] imx_ldb_encoder_prepare+0x34/0x114 [<802ef724>] drm_crtc_helper_set_mode+0x1f0/0x4c0 [<802f0344>] drm_crtc_helper_set_config+0x828/0x99c [<802ff270>] drm_mode_set_config_internal+0x5c/0xdc [<802eebe0>] drm_fb_helper_set_par+0x50/0xb4 [<802af580>] fbcon_init+0x490/0x500 [<802dd104>] visual_init+0xa8/0xf8 [<802df414>] do_bind_con_driver+0x140/0x37c [<802df764>] do_take_over_console+0x114/0x1c4 [<802af65c>] do_fbcon_takeover+0x6c/0xd4 [<802b2b30>] fbcon_event_notify+0x7c8/0x818 [<80049954>] notifier_call_chain+0x4c/0x8c [<80049cd8>] __blocking_notifier_call_chain+0x50/0x68 [<80049d10>] blocking_notifier_call_chain+0x20/0x28 [<802a75f0>] fb_notifier_call_chain+0x1c/0x24 [<802a9224>] register_framebuffer+0x188/0x268 [<802ee994>] drm_fb_helper_initial_config+0x2bc/0x4b8 [<802f118c>] drm_fbdev_cma_init+0x7c/0xec [<80817288>] imx_fb_helper_init+0x54/0x90 [<800088c0>] do_one_initcall+0xfc/0x160 [<807e7c5c>] kernel_init_freeable+0x104/0x1d4 [<805e2930>] kernel_init+0x10/0xec [<8000ea68>] ret_from_fork+0x14/0x2c other info that might help us debug this: Chain exists of: &imx_drm_device->mutex --> &dev->mode_config.mutex --> &crtc->mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&crtc->mutex); lock(&dev->mode_config.mutex); lock(&crtc->mutex); lock(&imx_drm_device->mutex); *** DEADLOCK *** 6 locks held by swapper/0/1: #0: (registration_lock){+.+.+.}, at: [<802a90bc>] register_framebuffer+0x20/0x268 beagleboard#1: (&fb_info->lock){+.+.+.}, at: [<802a7a90>] lock_fb_info+0x20/0x44 beagleboard#2: (console_lock){+.+.+.}, at: [<802a9218>] register_framebuffer+0x17c/0x268 beagleboard#3: ((fb_notifier_list).rwsem){.+.+.+}, at: [<80049cbc>] __blocking_notifier_call_chain+0x34/0x68 beagleboard#4: (&dev->mode_config.mutex){+.+.+.}, at: [<802fe758>] drm_modeset_lock_all+0x20/0x54 beagleboard#5: (&crtc->mutex){+.+...}, at: [<802fe778>] drm_modeset_lock_all+0x40/0x54 In order to avoid this lockdep warning, remove the locking from imx_drm_encoder_get_mux_id() and imx_drm_crtc_panel_format_pins(). Tested on a mx6sabrelite and mx53qsb. Reported-by: Russell King <[email protected]> Tested-by: Russell King <[email protected]> Signed-off-by: Fabio Estevam <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
Fixing the below dump: root@freescale ~$ modprobe g_serial g_serial gadget: Gadget Serial v2.4 g_serial gadget: g_serial ready BUG: sleeping function called from invalid context at /home/b29397/work/projects/upstream/usb/usb/drivers/base/power/runtime.c:952 in_atomic(): 1, irqs_disabled(): 128, pid: 805, name: modprobe 2 locks held by modprobe/805: #0: (udc_lock){+.+.+.}, at: [<7f000a74>] usb_gadget_probe_driver+0x44/0xb4 [udc_core] beagleboard#1: (&(&ci->lock)->rlock){......}, at: [<7f033488>] ci_udc_start+0x94/0x110 [ci_hdrc] irq event stamp: 3878 hardirqs last enabled at (3877): [<806b6720>] _raw_spin_unlock_irqrestore+0x40/0x6c hardirqs last disabled at (3878): [<806b6474>] _raw_spin_lock_irqsave+0x2c/0xa8 softirqs last enabled at (3872): [<8002ec0c>] __do_softirq+0x1c8/0x2e8 softirqs last disabled at (3857): [<8002f180>] irq_exit+0xbc/0x110 CPU: 0 PID: 805 Comm: modprobe Not tainted 3.11.0-next-20130910+ beagleboard#85 [<80016b94>] (unwind_backtrace+0x0/0xf8) from [<80012e0c>] (show_stack+0x20/0x24) [<80012e0c>] (show_stack+0x20/0x24) from [<806af554>] (dump_stack+0x9c/0xc4) [<806af554>] (dump_stack+0x9c/0xc4) from [<8005940c>] (__might_sleep+0xf4/0x134) [<8005940c>] (__might_sleep+0xf4/0x134) from [<803a04a4>] (__pm_runtime_resume+0x94/0xa0) [<803a04a4>] (__pm_runtime_resume+0x94/0xa0) from [<7f0334a4>] (ci_udc_start+0xb0/0x110 [ci_hdrc]) [<7f0334a4>] (ci_udc_start+0xb0/0x110 [ci_hdrc]) from [<7f0009b4>] (udc_bind_to_driver+0x5c/0xd8 [udc_core]) [<7f0009b4>] (udc_bind_to_driver+0x5c/0xd8 [udc_core]) from [<7f000ab0>] (usb_gadget_probe_driver+0x80/0xb4 [udc_core]) [<7f000ab0>] (usb_gadget_probe_driver+0x80/0xb4 [udc_core]) from [<7f008618>] (usb_composite_probe+0xac/0xd8 [libcomposite]) [<7f008618>] (usb_composite_probe+0xac/0xd8 [libcomposite]) from [<7f04b168>] (init+0x8c/0xb4 [g_serial]) [<7f04b168>] (init+0x8c/0xb4 [g_serial]) from [<800088e8>] (do_one_initcall+0x108/0x16c) [<800088e8>] (do_one_initcall+0x108/0x16c) from [<8008e518>] (load_module+0x1b00/0x20a4) [<8008e518>] (load_module+0x1b00/0x20a4) from [<8008eba8>] (SyS_init_module+0xec/0x100) [<8008eba8>] (SyS_init_module+0xec/0x100) from [<8000ec40>] (ret_fast_syscall+0x0/0x48) Signed-off-by: Peter Chen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
This patch adds a batch support to nfnetlink. Basically, it adds two new control messages: * NFNL_MSG_BATCH_BEGIN, that indicates the beginning of a batch, the nfgenmsg->res_id indicates the nfnetlink subsystem ID. * NFNL_MSG_BATCH_END, that results in the invocation of the ss->commit callback function. If not specified or an error ocurred in the batch, the ss->abort function is invoked instead. The end message represents the commit operation in nftables, the lack of end message results in an abort. This patch also adds the .call_batch function that is only called from the batch receival path. This patch adds atomic rule updates and dumps based on bitmask generations. This allows to atomically commit a set of rule-set updates incrementally without altering the internal state of existing nf_tables expressions/matches/targets. The idea consists of using a generation cursor of 1 bit and a bitmask of 2 bits per rule. Assuming the gencursor is 0, then the genmask (expressed as a bitmask) can be interpreted as: 00 active in the present, will be active in the next generation. 01 inactive in the present, will be active in the next generation. 10 active in the present, will be deleted in the next generation. ^ gencursor Once you invoke the transition to the next generation, the global gencursor is updated: 00 active in the present, will be active in the next generation. 01 active in the present, needs to zero its future, it becomes 00. 10 inactive in the present, delete now. ^ gencursor If a dump is in progress and nf_tables enters a new generation, the dump will stop and return -EBUSY to let userspace know that it has to retry again. In order to invalidate dumps, a global genctr counter is increased everytime nf_tables enters a new generation. This new operation can be used from the user-space utility that controls the firewall, eg. nft -f restore The rule updates contained in `file' will be applied atomically. cat file ----- add filter INPUT ip saddr 1.1.1.1 counter accept beagleboard#1 del filter INPUT ip daddr 2.2.2.2 counter drop beagleboard#2 -EOF- Note that the rule 1 will be inactive until the transition to the next generation, the rule 2 will be evicted in the next generation. There is a penalty during the rule update due to the branch misprediction in the packet matching framework. But that should be quickly resolved once the iteration over the commit list that contain rules that require updates is finished. Event notification happens once the rule-set update has been committed. So we skip notifications is case the rule-set update is aborted, which can happen in case that the rule-set is tested to apply correctly. This patch squashed the following patches from Pablo: * nf_tables: atomic rule updates and dumps * nf_tables: get rid of per rule list_head for commits * nf_tables: use per netns commit list * nfnetlink: add batch support and use it from nf_tables * nf_tables: all rule updates are transactional * nf_tables: attach replacement rule after stale one * nf_tables: do not allow deletion/replacement of stale rules * nf_tables: remove unused NFTA_RULE_FLAGS Signed-off-by: Pablo Neira Ayuso <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
…cm/linux/kernel/git/tmlind/linux-omap into next/boards From Tony Lindgren: Platform data changes for omaps for the display subsystem and n900 secure mode changes. Note that the n900 secure mode changes will still be needed for device tree based booting also. * tag 'omap-for-v3.13/board-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: (508 commits) ARM: OMAP2+: display: Create omap_vout device inside omap_display_init ARM: OMAP2+: display: Create omapvrfb and omapfb devices inside omap_display_init ARM: OMAP2+: display: Create omapdrm device inside omap_display_init ARM: OMAP2+: drm: Don't build device for DMM RX-51: Add support for OMAP3 ROM Random Number Generator ARM: OMAP3: RX-51: ARM errata 430973 workaround ARM: OMAP3: Add secure function omap_smc3() which calling instruction smc beagleboard#1 +Linux 3.12-rc4 Signed-off-by: Kevin Hilman <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
If EM Transmit bit is busy during init ata_msleep() is called. It is wrong - msleep() should be used instead of ata_msleep(), because if EM Transmit bit is busy for one port, it will be busy for all other ports too, so using ata_msleep() causes wasting tries for another ports. The most common scenario looks like that now (six ports try to transmit a LED meaasege): - port #0 tries for the 1st time and succeeds - ports beagleboard#1-5 try for the 1st time and sleeps - port beagleboard#1 tries for the 2nd time and succeeds - ports beagleboard#2-5 try for the 2nd time and sleeps - port beagleboard#2 tries for the 3rd time and succeeds - ports beagleboard#3-5 try for the 3rd time and sleeps - port beagleboard#3 tries for the 4th time and succeeds - ports beagleboard#4-5 try for the 4th time and sleeps - port beagleboard#4 tries for the 5th time and succeeds - port beagleboard#5 tries for the 5th time and sleeps At this moment port beagleboard#5 wasted all its five tries and failed to initialize. Because there are only 5 (EM_MAX_RETRY) tries available usually only five ports succeed to initialize. The sixth port and next ones usually will fail. If msleep() is used instead of ata_msleep() the first port succeeds to initialize in the first try and next ones usually succeed to initialize in the second try. tj: updated comment Signed-off-by: Lukasz Dorau <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
This fixes an oops caused by buslogic driver when initializing a BusLogic MultiMaster adapter. Initialization code used scope of a variable incorrectly which created a NULL pointer. Oops message is below: BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: [<c150c137>] blogic_init_mm_probeinfo.isra.17+0x20a/0x583 *pde = 00000000 Oops: 002 [beagleboard#1] PREEMPT SMP Modules linked in: CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.11.1.puz1 beagleboard#1 Hardware name: /Canterwood, BIOS 6.00 PG 05/16/2003 task: f7050000 ti: f7054000 task.ti: f7054000 EIP: 0060:[<c150c137>] EFLAGS: 00010246 CPU:1 EIP is at blogic_init_mm_probeinfo.isra.17+0x20a/0x583 EAX: 00000013 EBX: 00000000 ECX: 00000000 EDX: f8001000 ESI: f71cb800 EDI: f7388000 EBP: 00007800 ESP: f7055c84 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 CR0: 8005003b CR2: 0000000c CR3: 0154f000 CR4: 000007d0 Stack: 0000001c 00000000 c11a59f6 f7055c98 00008130 ffffffff ffffffff 00000000 00000003 00000000 00000000 00000000 00000013 f8001000 00000001 000003d0 00000000 00000000 00000000 c14e3f84 f78803c8 00000000 f738c000 000000e9 Call Trace: [<c11a59f6>] ? pci_get_subsys+0x33/0x38 [<c150c4fb>] ? blogic_init_probeinfo_list+0x4b/0x19e [<c108d593>] ? __alloc_pages_nodemask+0xe3/0x623 [<c108d593>] ? __alloc_pages_nodemask+0xe3/0x623 [<c10fb99e>] ? sysfs_link_sibling+0x61/0x8d [<c10b0519>] ? kmem_cache_alloc+0x8b/0xb5 [<c150cce5>] ? blogic_init+0xa1/0x10e8 [<c10fc0a8>] ? sysfs_add_one+0x10/0x9d [<c10fc18a>] ? sysfs_addrm_finish+0x12/0x85 [<c10fca37>] ? sysfs_do_create_link_sd+0x9d/0x1b4 [<c117c272>] ? blk_register_queue+0x69/0xb3 [<c10fcb68>] ? sysfs_create_link+0x1a/0x2c [<c1181a07>] ? add_disk+0x1a1/0x3c7 [<c138737b>] ? klist_next+0x60/0xc3 [<c122cc3a>] ? scsi_dh_detach+0x68/0x68 [<c1213e36>] ? bus_for_each_dev+0x51/0x61 [<c1000356>] ? do_one_initcall+0x22/0x12c [<c10f3688>] ? __proc_create+0x8c/0xba [<c150cc44>] ? blogic_setup+0x5f6/0x5f6 [<c14e94aa>] ? repair_env_string+0xf/0x4d [<c14e949b>] ? do_early_param+0x71/0x71 [<c103efaa>] ? parse_args+0x21f/0x33d [<c14e9a54>] ? kernel_init_freeable+0xdf/0x17d [<c14e949b>] ? do_early_param+0x71/0x71 [<c1388b64>] ? kernel_init+0x8/0xc0 [<c1392222>] ? ret_from_kernel_thread+0x6/0x28 [<c1392227>] ? ret_from_kernel_thread+0x1b/0x28 [<c1388b5c>] ? rest_init+0x6c/0x6c Code: 89 44 24 10 0f b6 44 24 3d 89 44 24 0c c7 44 24 08 00 00 00 00 c7 44 24 04 38 62 46 c1 c7 04 24 02 00 00 00 e8 78 13 d2 ff 31 db <89> 6b 0c b0 20 89 ea ee c7 44 24 08 04 00 00 00 8d 44 24 4c 89 EIP: [<c150c137>] blogic_init_mm_probeinfo.isra.17+0x20a/0x583 SS:ESP 0068:f7055c84 CR2: 000000000000000c ---[ end trace 17f45f5196d40487 ]--- Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 Signed-off-by: Khalid Aziz <[email protected]> Cc: <[email protected]> # 3.11.x Reported-by: Pierre Uszynski <[email protected]> Tested-by: Pierre Uszynski <[email protected]> Signed-off-by: James Bottomley <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
When CMA fails to initialize in v3.12-rc4, the chipidea driver oopses the kernel while trying to remove and put the HCD which doesn't exist: WARNING: CPU: 0 PID: 6 at /home/rmk/git/linux-rmk/arch/arm/mm/dma-mapping.c:511 __dma_alloc+0x200/0x240() coherent pool not initialised! Modules linked in: CPU: 0 PID: 6 Comm: kworker/u2:0 Tainted: G W 3.12.0-rc4+ beagleboard#56 Workqueue: deferwq deferred_probe_work_func Backtrace: [<c001218c>] (dump_backtrace+0x0/0x10c) from [<c0012328>] (show_stack+0x18/0x1c) r6:c05fd9cc r5:000001ff r4:00000000 r3:df86ad00 [<c0012310>] (show_stack+0x0/0x1c) from [<c05f3a4c>] (dump_stack+0x70/0x8c) [<c05f39dc>] (dump_stack+0x0/0x8c) from [<c00230a8>] (warn_slowpath_common+0x6c/0x8c) r4:df883a60 r3:df86ad00 [<c002303c>] (warn_slowpath_common+0x0/0x8c) from [<c002316c>] (warn_slowpath_fmt+0x38/0x40) r8:ffffffff r7:00001000 r6:c083b808 r5:00000000 r4:df2efe80 [<c0023134>] (warn_slowpath_fmt+0x0/0x40) from [<c00196bc>] (__dma_alloc+0x200/0x240) r3:00000000 r2:c05fda00 [<c00194bc>] (__dma_alloc+0x0/0x240) from [<c001982c>] (arm_dma_alloc+0x88/0xa0) [<c00197a4>] (arm_dma_alloc+0x0/0xa0) from [<c03e2904>] (ehci_setup+0x1f4/0x438) [<c03e2710>] (ehci_setup+0x0/0x438) from [<c03cbd60>] (usb_add_hcd+0x18c/0x664) [<c03cbbd4>] (usb_add_hcd+0x0/0x664) from [<c03e89f4>] (host_start+0xf0/0x180) [<c03e8904>] (host_start+0x0/0x180) from [<c03e7c34>] (ci_hdrc_probe+0x360/0x670 ) r6:df2ef410 r5:00000000 r4:df2c3010 r3:c03e8904 [<c03e78d4>] (ci_hdrc_probe+0x0/0x670) from [<c0311044>] (platform_drv_probe+0x20/0x24) [<c0311024>] (platform_drv_probe+0x0/0x24) from [<c030fcac>] (driver_probe_device+0x9c/0x234) ... ---[ end trace c88ccaf3969e8422 ]--- Unable to handle kernel NULL pointer dereference at virtual address 00000028 pgd = c0004000 [00000028] *pgd=00000000 Internal error: Oops: 17 [beagleboard#1] SMP ARM Modules linked in: CPU: 0 PID: 6 Comm: kworker/u2:0 Tainted: G W 3.12.0-rc4+ beagleboard#56 Workqueue: deferwq deferred_probe_work_func task: df86ad00 ti: df882000 task.ti: df882000 PC is at usb_remove_hcd+0x10/0x150 LR is at host_stop+0x1c/0x3c pc : [<c03cacec>] lr : [<c03e88e4>] psr: 60000013 sp : df883b50 ip : df883b78 fp : df883b74 r10: c11f4c54 r9 : c0836450 r8 : df30c400 r7 : fffffff4 r6 : df2ef410 r5 : 00000000 r4 : df2c3010 r3 : 00000000 r2 : 00000000 r1 : df86b0a0 r0 : 00000000 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 10c53c7d Table: 2f29404a DAC: 00000015 Process kworker/u2:0 (pid: 6, stack limit = 0xdf882240) Stack: (0xdf883b50 to 0xdf884000) ... Backtrace: [<c03cacdc>] (usb_remove_hcd+0x0/0x150) from [<c03e88e4>] (host_stop+0x1c/0x3c) r6:df2ef410 r5:00000000 r4:df2c3010 [<c03e88c8>] (host_stop+0x0/0x3c) from [<c03e8aa0>] (ci_hdrc_host_destroy+0x1c/0x20) r5:00000000 r4:df2c3010 [<c03e8a84>] (ci_hdrc_host_destroy+0x0/0x20) from [<c03e7c80>] (ci_hdrc_probe+0x3ac/0x670) [<c03e78d4>] (ci_hdrc_probe+0x0/0x670) from [<c0311044>] (platform_drv_probe+0x20/0x24) [<c0311024>] (platform_drv_probe+0x0/0x24) from [<c030fcac>] (driver_probe_device+0x9c/0x234) [<c030fc10>] (driver_probe_device+0x0/0x234) from [<c030ff28>] (__device_attach+0x44/0x48) ... ---[ end trace c88ccaf3969e8423 ]--- Fix this so at least we can continue booting and get to a shell prompt. Signed-off-by: Russell King <[email protected]> Tested-by: Russell King <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
Since we set IEEE80211_HW_QUEUE_CONTROL, we can let mac80211 do the queue assignement and don't need to override its decisions. While reassiging the same values is harmless of course, it triggered a WARNING when iwlwifi and mac80211 came to different conclusions. This happened when mac80211 set IEEE80211_TX_CTL_SEND_AFTER_DTIM, but didn't route the packet to the cab_queue because no stations were asleep. iwlwifi should not override mac80211's decicions for offchannel packets and packets to be sent after DTIM, but it should override mac80211's decision for AMPDUs since we have a special queue for them. So for AMPDU, we still override info->hw_queue by the AMPDU queue. This avoids: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 2531 at drivers/net/wireless/iwlwifi/dvm/tx.c:456 iwlagn_tx_skb+0x6c5/0x883() Modules linked in: CPU: 0 PID: 2531 Comm: hostapd Not tainted 3.12.0-rc5+ beagleboard#1 Hardware name: /D53427RKE, BIOS RKPPT10H.86A.0017.2013.0425.1251 04/25/2013 0000000000000000 0000000000000009 ffffffff8189aa62 0000000000000000 ffffffff8105a4f2 ffff880058339a48 ffffffff815f8a04 0000000000000000 ffff8800560097b0 0000000000000208 0000000000000000 ffff8800561a9e5e Call Trace: [<ffffffff8189aa62>] ? dump_stack+0x41/0x51 [<ffffffff8105a4f2>] ? warn_slowpath_common+0x78/0x90 [<ffffffff815f8a04>] ? iwlagn_tx_skb+0x6c5/0x883 [<ffffffff815f8a04>] ? iwlagn_tx_skb+0x6c5/0x883 [<ffffffff818a0040>] ? put_cred+0x15/0x15 [<ffffffff815f6db4>] ? iwlagn_mac_tx+0x19/0x2f [<ffffffff8186cc45>] ? __ieee80211_tx+0x226/0x29b [<ffffffff8186e6bd>] ? ieee80211_tx+0xa6/0xb5 [<ffffffff8186e98b>] ? ieee80211_monitor_start_xmit+0x1e9/0x204 [<ffffffff8171ce5f>] ? dev_hard_start_xmit+0x271/0x3ec [<ffffffff817351ac>] ? sch_direct_xmit+0x66/0x164 [<ffffffff8171d1bf>] ? dev_queue_xmit+0x1e5/0x3c8 [<ffffffff817fac5a>] ? packet_sendmsg+0xac5/0xb3d [<ffffffff81709a09>] ? sock_sendmsg+0x37/0x52 [<ffffffff810f9e0c>] ? __do_fault+0x338/0x36b [<ffffffff81713820>] ? verify_iovec+0x44/0x94 [<ffffffff81709e63>] ? ___sys_sendmsg+0x1f1/0x283 [<ffffffff81140a73>] ? __inode_wait_for_writeback+0x67/0xae [<ffffffff8111735e>] ? __cache_free.isra.46+0x178/0x187 [<ffffffff811173b1>] ? kmem_cache_free+0x44/0x84 [<ffffffff81132c22>] ? dentry_kill+0x13d/0x149 [<ffffffff81132f6f>] ? dput+0xe5/0xef [<ffffffff81136e04>] ? fget_light+0x2e/0x7c [<ffffffff8170ae62>] ? __sys_sendmsg+0x39/0x57 [<ffffffff818a7e39>] ? system_call_fastpath+0x16/0x1b ---[ end trace 1b3eb79359c1d1e6 ]--- Reported-by: Sander Eikelenboom <[email protected]> Reviewed-by: Johannes Berg <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
jadonk
pushed a commit
to jadonk/linux-1
that referenced
this issue
Aug 25, 2014
Fix random kernel panic with below messages when remove dongle. [ 2212.355447] BUG: unable to handle kernel NULL pointer dereference at 0000000000000250 [ 2212.355527] IP: [<ffffffffa02667f2>] rt2x00usb_kick_tx_entry+0x12/0x160 [rt2x00usb] [ 2212.355599] PGD 0 [ 2212.355626] Oops: 0000 [beagleboard#1] SMP [ 2212.355664] Modules linked in: rt2800usb rt2x00usb rt2800lib crc_ccitt rt2x00lib mac80211 cfg80211 tun arc4 fuse rfcomm bnep snd_hda_codec_realtek snd_hda_intel snd_hda_codec btusb uvcvideo bluetooth snd_hwdep x86_pkg_temp_thermal snd_seq coretemp aesni_intel aes_x86_64 snd_seq_device glue_helper snd_pcm ablk_helper videobuf2_vmalloc sdhci_pci videobuf2_memops videobuf2_core sdhci videodev mmc_core serio_raw snd_page_alloc microcode i2c_i801 snd_timer hid_multitouch thinkpad_acpi lpc_ich mfd_core snd tpm_tis wmi tpm tpm_bios soundcore acpi_cpufreq i915 i2c_algo_bit drm_kms_helper drm i2c_core video [last unloaded: cfg80211] [ 2212.356224] CPU: 0 PID: 34 Comm: khubd Not tainted 3.12.0-rc3-wl+ beagleboard#3 [ 2212.356268] Hardware name: LENOVO 3444CUU/3444CUU, BIOS G6ET93WW (2.53 ) 02/04/2013 [ 2212.356319] task: ffff880212f687c0 ti: ffff880212f66000 task.ti: ffff880212f66000 [ 2212.356392] RIP: 0010:[<ffffffffa02667f2>] [<ffffffffa02667f2>] rt2x00usb_kick_tx_entry+0x12/0x160 [rt2x00usb] [ 2212.356481] RSP: 0018:ffff880212f67750 EFLAGS: 00010202 [ 2212.356519] RAX: 000000000000000c RBX: 000000000000000c RCX: 0000000000000293 [ 2212.356568] RDX: ffff8801f4dc219a RSI: 0000000000000000 RDI: 0000000000000240 [ 2212.356617] RBP: ffff880212f67778 R08: ffffffffa02667e0 R09: 0000000000000002 [ 2212.356665] R10: 0001f95254ab4b40 R11: ffff880212f675be R12: ffff8801f4dc2150 [ 2212.356712] R13: 0000000000000000 R14: ffffffffa02667e0 R15: 000000000000000d [ 2212.356761] FS: 0000000000000000(0000) GS:ffff88021e200000(0000) knlGS:0000000000000000 [ 2212.356813] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2212.356852] CR2: 0000000000000250 CR3: 0000000001a0c000 CR4: 00000000001407f0 [ 2212.356899] Stack: [ 2212.356917] 000000000000000c ffff8801f4dc2150 0000000000000000 ffffffffa02667e0 [ 2212.356980] 000000000000000d ffff880212f677b8 ffffffffa03a31ad ffff8801f4dc219a [ 2212.357038] ffff8801f4dc2150 0000000000000000 ffff8800b93217a0 ffff8801f49bc800 [ 2212.357099] Call Trace: [ 2212.357122] [<ffffffffa02667e0>] ? rt2x00usb_interrupt_txdone+0x90/0x90 [rt2x00usb] [ 2212.357174] [<ffffffffa03a31ad>] rt2x00queue_for_each_entry+0xed/0x170 [rt2x00lib] [ 2212.357244] [<ffffffffa026701c>] rt2x00usb_kick_queue+0x5c/0x60 [rt2x00usb] [ 2212.357314] [<ffffffffa03a3682>] rt2x00queue_flush_queue+0x62/0xa0 [rt2x00lib] [ 2212.357386] [<ffffffffa03a2930>] rt2x00mac_flush+0x30/0x70 [rt2x00lib] [ 2212.357470] [<ffffffffa04edded>] ieee80211_flush_queues+0xbd/0x140 [mac80211] [ 2212.357555] [<ffffffffa0502e52>] ieee80211_set_disassoc+0x2d2/0x3d0 [mac80211] [ 2212.357645] [<ffffffffa0506da3>] ieee80211_mgd_deauth+0x1d3/0x240 [mac80211] [ 2212.357718] [<ffffffff8108b17c>] ? try_to_wake_up+0xec/0x290 [ 2212.357788] [<ffffffffa04dbd18>] ieee80211_deauth+0x18/0x20 [mac80211] [ 2212.357872] [<ffffffffa0418ddc>] cfg80211_mlme_deauth+0x9c/0x140 [cfg80211] [ 2212.357913] [<ffffffffa041907c>] cfg80211_mlme_down+0x5c/0x60 [cfg80211] [ 2212.357962] [<ffffffffa041cd18>] cfg80211_disconnect+0x188/0x1a0 [cfg80211] [ 2212.358014] [<ffffffffa04013bc>] ? __cfg80211_stop_sched_scan+0x1c/0x130 [cfg80211] [ 2212.358067] [<ffffffffa03f8954>] cfg80211_leave+0xc4/0xe0 [cfg80211] [ 2212.358124] [<ffffffffa03f8d1b>] cfg80211_netdev_notifier_call+0x3ab/0x5e0 [cfg80211] [ 2212.358177] [<ffffffff815140f8>] ? inetdev_event+0x38/0x510 [ 2212.358217] [<ffffffff81085a94>] ? __wake_up+0x44/0x50 [ 2212.358254] [<ffffffff8155995c>] notifier_call_chain+0x4c/0x70 [ 2212.358293] [<ffffffff81081156>] raw_notifier_call_chain+0x16/0x20 [ 2212.358361] [<ffffffff814b6dd5>] call_netdevice_notifiers_info+0x35/0x60 [ 2212.358429] [<ffffffff814b6ec9>] __dev_close_many+0x49/0xd0 [ 2212.358487] [<ffffffff814b7028>] dev_close_many+0x88/0x100 [ 2212.358546] [<ffffffff814b8150>] rollback_registered_many+0xb0/0x220 [ 2212.358612] [<ffffffff814b8319>] unregister_netdevice_many+0x19/0x60 [ 2212.358694] [<ffffffffa04d8eb2>] ieee80211_remove_interfaces+0x112/0x190 [mac80211] [ 2212.358791] [<ffffffffa04c585f>] ieee80211_unregister_hw+0x4f/0x100 [mac80211] [ 2212.361994] [<ffffffffa03a1221>] rt2x00lib_remove_dev+0x161/0x1a0 [rt2x00lib] [ 2212.365240] [<ffffffffa0266e2e>] rt2x00usb_disconnect+0x2e/0x70 [rt2x00usb] [ 2212.368470] [<ffffffff81419ce4>] usb_unbind_interface+0x64/0x1c0 [ 2212.371734] [<ffffffff813b446f>] __device_release_driver+0x7f/0xf0 [ 2212.374999] [<ffffffff813b4503>] device_release_driver+0x23/0x30 [ 2212.378131] [<ffffffff813b3c98>] bus_remove_device+0x108/0x180 [ 2212.381358] [<ffffffff813b0565>] device_del+0x135/0x1d0 [ 2212.384454] [<ffffffff81417760>] usb_disable_device+0xb0/0x270 [ 2212.387451] [<ffffffff8140d9cd>] usb_disconnect+0xad/0x1d0 [ 2212.390294] [<ffffffff8140f6cd>] hub_thread+0x63d/0x1660 [ 2212.393034] [<ffffffff8107c860>] ? wake_up_atomic_t+0x30/0x30 [ 2212.395728] [<ffffffff8140f090>] ? hub_port_debounce+0x130/0x130 [ 2212.398412] [<ffffffff8107baa0>] kthread+0xc0/0xd0 [ 2212.401058] [<ffffffff8107b9e0>] ? insert_kthread_work+0x40/0x40 [ 2212.403639] [<ffffffff8155de3c>] ret_from_fork+0x7c/0xb0 [ 2212.406193] [<ffffffff8107b9e0>] ? insert_kthread_work+0x40/0x40 [ 2212.408732] Code: 24 58 08 00 00 bf 80 00 00 00 e8 3a c3 e0 e0 5b 41 5c 5d c3 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 <48> 8b 47 10 48 89 fb 4c 8b 6f 28 4c 8b 20 49 8b 04 24 4c 8b 30 [ 2212.414671] RIP [<ffffffffa02667f2>] rt2x00usb_kick_tx_entry+0x12/0x160 [rt2x00usb] [ 2212.417646] RSP <ffff880212f67750> [ 2212.420547] CR2: 0000000000000250 [ 2212.441024] ---[ end trace 5442918f33832bce ]--- Cc: [email protected] Signed-off-by: Stanislaw Gruszka <[email protected]> Acked-by: Helmut Schaa <[email protected]> Signed-off-by: John W. Linville <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Mar 12, 2024
commit 79d72c6 upstream. When configuring a hugetlb filesystem via the fsconfig() syscall, there is a possible NULL dereference in hugetlbfs_fill_super() caused by assigning NULL to ctx->hstate in hugetlbfs_parse_param() when the requested pagesize is non valid. E.g: Taking the following steps: fd = fsopen("hugetlbfs", FSOPEN_CLOEXEC); fsconfig(fd, FSCONFIG_SET_STRING, "pagesize", "1024", 0); fsconfig(fd, FSCONFIG_CMD_CREATE, NULL, NULL, 0); Given that the requested "pagesize" is invalid, ctxt->hstate will be replaced with NULL, losing its previous value, and we will print an error: ... ... case Opt_pagesize: ps = memparse(param->string, &rest); ctx->hstate = h; if (!ctx->hstate) { pr_err("Unsupported page size %lu MB\n", ps / SZ_1M); return -EINVAL; } return 0; ... ... This is a problem because later on, we will dereference ctxt->hstate in hugetlbfs_fill_super() ... ... sb->s_blocksize = huge_page_size(ctx->hstate); ... ... Causing below Oops. Fix this by replacing cxt->hstate value only when then pagesize is known to be valid. kernel: hugetlbfs: Unsupported page size 0 MB kernel: BUG: kernel NULL pointer dereference, address: 0000000000000028 kernel: #PF: supervisor read access in kernel mode kernel: #PF: error_code(0x0000) - not-present page kernel: PGD 800000010f66c067 P4D 800000010f66c067 PUD 1b22f8067 PMD 0 kernel: Oops: 0000 [#1] PREEMPT SMP PTI kernel: CPU: 4 PID: 5659 Comm: syscall Tainted: G E 6.8.0-rc2-default+ #22 5a47c3fef76212addcc6eb71344aabc35190ae8f kernel: Hardware name: Intel Corp. GROVEPORT/GROVEPORT, BIOS GVPRCRB1.86B.0016.D04.1705030402 05/03/2017 kernel: RIP: 0010:hugetlbfs_fill_super+0xb4/0x1a0 kernel: Code: 48 8b 3b e8 3e c6 ed ff 48 85 c0 48 89 45 20 0f 84 d6 00 00 00 48 b8 ff ff ff ff ff ff ff 7f 4c 89 e7 49 89 44 24 20 48 8b 03 <8b> 48 28 b8 00 10 00 00 48 d3 e0 49 89 44 24 18 48 8b 03 8b 40 28 kernel: RSP: 0018:ffffbe9960fcbd48 EFLAGS: 00010246 kernel: RAX: 0000000000000000 RBX: ffff9af5272ae780 RCX: 0000000000372004 kernel: RDX: ffffffffffffffff RSI: ffffffffffffffff RDI: ffff9af555e9b000 kernel: RBP: ffff9af52ee66b00 R08: 0000000000000040 R09: 0000000000370004 kernel: R10: ffffbe9960fcbd48 R11: 0000000000000040 R12: ffff9af555e9b000 kernel: R13: ffffffffa66b86c0 R14: ffff9af507d2f400 R15: ffff9af507d2f400 kernel: FS: 00007ffbc0ba4740(0000) GS:ffff9b0bd7000000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: 0000000000000028 CR3: 00000001b1ee0000 CR4: 00000000001506f0 kernel: Call Trace: kernel: <TASK> kernel: ? __die_body+0x1a/0x60 kernel: ? page_fault_oops+0x16f/0x4a0 kernel: ? search_bpf_extables+0x65/0x70 kernel: ? fixup_exception+0x22/0x310 kernel: ? exc_page_fault+0x69/0x150 kernel: ? asm_exc_page_fault+0x22/0x30 kernel: ? __pfx_hugetlbfs_fill_super+0x10/0x10 kernel: ? hugetlbfs_fill_super+0xb4/0x1a0 kernel: ? hugetlbfs_fill_super+0x28/0x1a0 kernel: ? __pfx_hugetlbfs_fill_super+0x10/0x10 kernel: vfs_get_super+0x40/0xa0 kernel: ? __pfx_bpf_lsm_capable+0x10/0x10 kernel: vfs_get_tree+0x25/0xd0 kernel: vfs_cmd_create+0x64/0xe0 kernel: __x64_sys_fsconfig+0x395/0x410 kernel: do_syscall_64+0x80/0x160 kernel: ? syscall_exit_to_user_mode+0x82/0x240 kernel: ? do_syscall_64+0x8d/0x160 kernel: ? syscall_exit_to_user_mode+0x82/0x240 kernel: ? do_syscall_64+0x8d/0x160 kernel: ? exc_page_fault+0x69/0x150 kernel: entry_SYSCALL_64_after_hwframe+0x6e/0x76 kernel: RIP: 0033:0x7ffbc0cb87c9 kernel: Code: 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 97 96 0d 00 f7 d8 64 89 01 48 kernel: RSP: 002b:00007ffc29d2f388 EFLAGS: 00000206 ORIG_RAX: 00000000000001af kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ffbc0cb87c9 kernel: RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000003 kernel: RBP: 00007ffc29d2f3b0 R08: 0000000000000000 R09: 0000000000000000 kernel: R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000 kernel: R13: 00007ffc29d2f4c0 R14: 0000000000000000 R15: 0000000000000000 kernel: </TASK> kernel: Modules linked in: rpcsec_gss_krb5(E) auth_rpcgss(E) nfsv4(E) dns_resolver(E) nfs(E) lockd(E) grace(E) sunrpc(E) netfs(E) af_packet(E) bridge(E) stp(E) llc(E) iscsi_ibft(E) iscsi_boot_sysfs(E) intel_rapl_msr(E) intel_rapl_common(E) iTCO_wdt(E) intel_pmc_bxt(E) sb_edac(E) iTCO_vendor_support(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) rfkill(E) ipmi_ssif(E) kvm(E) acpi_ipmi(E) irqbypass(E) pcspkr(E) igb(E) ipmi_si(E) mei_me(E) i2c_i801(E) joydev(E) intel_pch_thermal(E) i2c_smbus(E) dca(E) lpc_ich(E) mei(E) ipmi_devintf(E) ipmi_msghandler(E) acpi_pad(E) tiny_power_button(E) button(E) fuse(E) efi_pstore(E) configfs(E) ip_tables(E) x_tables(E) ext4(E) mbcache(E) jbd2(E) hid_generic(E) usbhid(E) sd_mod(E) t10_pi(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) polyval_clmulni(E) ahci(E) xhci_pci(E) polyval_generic(E) gf128mul(E) ghash_clmulni_intel(E) sha512_ssse3(E) sha256_ssse3(E) xhci_pci_renesas(E) libahci(E) ehci_pci(E) sha1_ssse3(E) xhci_hcd(E) ehci_hcd(E) libata(E) kernel: mgag200(E) i2c_algo_bit(E) usbcore(E) wmi(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E) scsi_common(E) aesni_intel(E) crypto_simd(E) cryptd(E) kernel: Unloaded tainted modules: acpi_cpufreq(E):1 fjes(E):1 kernel: CR2: 0000000000000028 kernel: ---[ end trace 0000000000000000 ]--- kernel: RIP: 0010:hugetlbfs_fill_super+0xb4/0x1a0 kernel: Code: 48 8b 3b e8 3e c6 ed ff 48 85 c0 48 89 45 20 0f 84 d6 00 00 00 48 b8 ff ff ff ff ff ff ff 7f 4c 89 e7 49 89 44 24 20 48 8b 03 <8b> 48 28 b8 00 10 00 00 48 d3 e0 49 89 44 24 18 48 8b 03 8b 40 28 kernel: RSP: 0018:ffffbe9960fcbd48 EFLAGS: 00010246 kernel: RAX: 0000000000000000 RBX: ffff9af5272ae780 RCX: 0000000000372004 kernel: RDX: ffffffffffffffff RSI: ffffffffffffffff RDI: ffff9af555e9b000 kernel: RBP: ffff9af52ee66b00 R08: 0000000000000040 R09: 0000000000370004 kernel: R10: ffffbe9960fcbd48 R11: 0000000000000040 R12: ffff9af555e9b000 kernel: R13: ffffffffa66b86c0 R14: ffff9af507d2f400 R15: ffff9af507d2f400 kernel: FS: 00007ffbc0ba4740(0000) GS:ffff9b0bd7000000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: 0000000000000028 CR3: 00000001b1ee0000 CR4: 00000000001506f0 Link: https://lkml.kernel.org/r/[email protected] Fixes: 3202198 ("hugetlbfs: Convert to fs_context") Signed-off-by: Michal Hocko <[email protected]> Signed-off-by: Oscar Salvador <[email protected]> Acked-by: Muchun Song <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Mar 12, 2024
…39_FILTER) commit efe7cf8 upstream. Lock jsk->sk to prevent UAF when setsockopt(..., SO_J1939_FILTER, ...) modifies jsk->filters while receiving packets. Following trace was seen on affected system: ================================================================== BUG: KASAN: slab-use-after-free in j1939_sk_recv_match_one+0x1af/0x2d0 [can_j1939] Read of size 4 at addr ffff888012144014 by task j1939/350 CPU: 0 PID: 350 Comm: j1939 Tainted: G W OE 6.5.0-rc5 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: print_report+0xd3/0x620 ? kasan_complete_mode_report_info+0x7d/0x200 ? j1939_sk_recv_match_one+0x1af/0x2d0 [can_j1939] kasan_report+0xc2/0x100 ? j1939_sk_recv_match_one+0x1af/0x2d0 [can_j1939] __asan_load4+0x84/0xb0 j1939_sk_recv_match_one+0x1af/0x2d0 [can_j1939] j1939_sk_recv+0x20b/0x320 [can_j1939] ? __kasan_check_write+0x18/0x20 ? __pfx_j1939_sk_recv+0x10/0x10 [can_j1939] ? j1939_simple_recv+0x69/0x280 [can_j1939] ? j1939_ac_recv+0x5e/0x310 [can_j1939] j1939_can_recv+0x43f/0x580 [can_j1939] ? __pfx_j1939_can_recv+0x10/0x10 [can_j1939] ? raw_rcv+0x42/0x3c0 [can_raw] ? __pfx_j1939_can_recv+0x10/0x10 [can_j1939] can_rcv_filter+0x11f/0x350 [can] can_receive+0x12f/0x190 [can] ? __pfx_can_rcv+0x10/0x10 [can] can_rcv+0xdd/0x130 [can] ? __pfx_can_rcv+0x10/0x10 [can] __netif_receive_skb_one_core+0x13d/0x150 ? __pfx___netif_receive_skb_one_core+0x10/0x10 ? __kasan_check_write+0x18/0x20 ? _raw_spin_lock_irq+0x8c/0xe0 __netif_receive_skb+0x23/0xb0 process_backlog+0x107/0x260 __napi_poll+0x69/0x310 net_rx_action+0x2a1/0x580 ? __pfx_net_rx_action+0x10/0x10 ? __pfx__raw_spin_lock+0x10/0x10 ? handle_irq_event+0x7d/0xa0 __do_softirq+0xf3/0x3f8 do_softirq+0x53/0x80 </IRQ> <TASK> __local_bh_enable_ip+0x6e/0x70 netif_rx+0x16b/0x180 can_send+0x32b/0x520 [can] ? __pfx_can_send+0x10/0x10 [can] ? __check_object_size+0x299/0x410 raw_sendmsg+0x572/0x6d0 [can_raw] ? __pfx_raw_sendmsg+0x10/0x10 [can_raw] ? apparmor_socket_sendmsg+0x2f/0x40 ? __pfx_raw_sendmsg+0x10/0x10 [can_raw] sock_sendmsg+0xef/0x100 sock_write_iter+0x162/0x220 ? __pfx_sock_write_iter+0x10/0x10 ? __rtnl_unlock+0x47/0x80 ? security_file_permission+0x54/0x320 vfs_write+0x6ba/0x750 ? __pfx_vfs_write+0x10/0x10 ? __fget_light+0x1ca/0x1f0 ? __rcu_read_unlock+0x5b/0x280 ksys_write+0x143/0x170 ? __pfx_ksys_write+0x10/0x10 ? __kasan_check_read+0x15/0x20 ? fpregs_assert_state_consistent+0x62/0x70 __x64_sys_write+0x47/0x60 do_syscall_64+0x60/0x90 ? do_syscall_64+0x6d/0x90 ? irqentry_exit+0x3f/0x50 ? exc_page_fault+0x79/0xf0 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Allocated by task 348: kasan_save_stack+0x2a/0x50 kasan_set_track+0x29/0x40 kasan_save_alloc_info+0x1f/0x30 __kasan_kmalloc+0xb5/0xc0 __kmalloc_node_track_caller+0x67/0x160 j1939_sk_setsockopt+0x284/0x450 [can_j1939] __sys_setsockopt+0x15c/0x2f0 __x64_sys_setsockopt+0x6b/0x80 do_syscall_64+0x60/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Freed by task 349: kasan_save_stack+0x2a/0x50 kasan_set_track+0x29/0x40 kasan_save_free_info+0x2f/0x50 __kasan_slab_free+0x12e/0x1c0 __kmem_cache_free+0x1b9/0x380 kfree+0x7a/0x120 j1939_sk_setsockopt+0x3b2/0x450 [can_j1939] __sys_setsockopt+0x15c/0x2f0 __x64_sys_setsockopt+0x6b/0x80 do_syscall_64+0x60/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Fixes: 9d71dd0 ("can: add support of SAE J1939 protocol") Reported-by: Sili Luo <[email protected]> Suggested-by: Sili Luo <[email protected]> Acked-by: Oleksij Rempel <[email protected]> Cc: [email protected] Signed-off-by: Oleksij Rempel <[email protected]> Link: https://lore.kernel.org/all/[email protected] Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Mar 12, 2024
[ Upstream commit b513d30 ] When task_tag >= 32 (in MCQ mode) and sizeof(unsigned int) == 4, 1U << task_tag will out of bounds for a u32 mask. Fix this up to prevent SHIFT_ISSUE (bitwise shifts that are out of bounds for their data type). [name:debug_monitors&]Unexpected kernel BRK exception at EL1 [name:traps&]Internal error: BRK handler: 00000000f2005514 [#1] PREEMPT SMP [name:mediatek_cpufreq_hw&]cpufreq stop DVFS log done [name:mrdump&]Kernel Offset: 0x1ba5800000 from 0xffffffc008000000 [name:mrdump&]PHYS_OFFSET: 0x80000000 [name:mrdump&]pstate: 22400005 (nzCv daif +PAN -UAO) [name:mrdump&]pc : [0xffffffdbaf52bb2c] ufshcd_clear_cmd+0x280/0x288 [name:mrdump&]lr : [0xffffffdbaf52a774] ufshcd_wait_for_dev_cmd+0x3e4/0x82c [name:mrdump&]sp : ffffffc0081471b0 <snip> Workqueue: ufs_eh_wq_0 ufshcd_err_handler Call trace: dump_backtrace+0xf8/0x144 show_stack+0x18/0x24 dump_stack_lvl+0x78/0x9c dump_stack+0x18/0x44 mrdump_common_die+0x254/0x480 [mrdump] ipanic_die+0x20/0x30 [mrdump] notify_die+0x15c/0x204 die+0x10c/0x5f8 arm64_notify_die+0x74/0x13c do_debug_exception+0x164/0x26c el1_dbg+0x64/0x80 el1h_64_sync_handler+0x3c/0x90 el1h_64_sync+0x68/0x6c ufshcd_clear_cmd+0x280/0x288 ufshcd_wait_for_dev_cmd+0x3e4/0x82c ufshcd_exec_dev_cmd+0x5bc/0x9ac ufshcd_verify_dev_init+0x84/0x1c8 ufshcd_probe_hba+0x724/0x1ce0 ufshcd_host_reset_and_restore+0x260/0x574 ufshcd_reset_and_restore+0x138/0xbd0 ufshcd_err_handler+0x1218/0x2f28 process_one_work+0x5fc/0x1140 worker_thread+0x7d8/0xe20 kthread+0x25c/0x468 ret_from_fork+0x10/0x20 Signed-off-by: Alice Chao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Stanley Jhu <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Mar 12, 2024
[ Upstream commit 4551b30 ] With default config, the value of NR_CPUS is 64. When HW platform has more then 64 cpus, system will crash on these platforms. MAX_CORE_PIC is the maximum cpu number in MADT table (max physical number) which can exceed the supported maximum cpu number (NR_CPUS, max logical number), but kernel should not crash. Kernel should boot cpus with NR_CPUS, let the remainder cpus stay in BIOS. The potential crash reason is that the array acpi_core_pic[NR_CPUS] can be overflowed when parsing MADT table, and it is obvious that CORE_PIC should be corresponding to physical core rather than logical core, so it is better to define the array as acpi_core_pic[MAX_CORE_PIC]. With the patch, system can boot up 64 vcpus with qemu parameter -smp 128, otherwise system will crash with the following message. [ 0.000000] CPU 0 Unable to handle kernel paging request at virtual address 0000420000004259, era == 90000000037a5f0c, ra == 90000000037a46ec [ 0.000000] Oops[#1]: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-rc2+ #192 [ 0.000000] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022 [ 0.000000] pc 90000000037a5f0c ra 90000000037a46ec tp 9000000003c90000 sp 9000000003c93d60 [ 0.000000] a0 0000000000000019 a1 9000000003d93bc0 a2 0000000000000000 a3 9000000003c93bd8 [ 0.000000] a4 9000000003c93a74 a5 9000000083c93a67 a6 9000000003c938f0 a7 0000000000000005 [ 0.000000] t0 0000420000004201 t1 0000000000000000 t2 0000000000000001 t3 0000000000000001 [ 0.000000] t4 0000000000000003 t5 0000000000000000 t6 0000000000000030 t7 0000000000000063 [ 0.000000] t8 0000000000000014 u0 ffffffffffffffff s9 0000000000000000 s0 9000000003caee98 [ 0.000000] s1 90000000041b0480 s2 9000000003c93da0 s3 9000000003c93d98 s4 9000000003c93d90 [ 0.000000] s5 9000000003caa000 s6 000000000a7fd000 s7 000000000f556b60 s8 000000000e0a4330 [ 0.000000] ra: 90000000037a46ec platform_init+0x214/0x250 [ 0.000000] ERA: 90000000037a5f0c efi_runtime_init+0x30/0x94 [ 0.000000] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) [ 0.000000] PRMD: 00000000 (PPLV0 -PIE -PWE) [ 0.000000] EUEN: 00000000 (-FPE -SXE -ASXE -BTE) [ 0.000000] ECFG: 00070800 (LIE=11 VS=7) [ 0.000000] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0) [ 0.000000] BADV: 0000420000004259 [ 0.000000] PRID: 0014c010 (Loongson-64bit, Loongson-3A5000) [ 0.000000] Modules linked in: [ 0.000000] Process swapper (pid: 0, threadinfo=(____ptrval____), task=(____ptrval____)) [ 0.000000] Stack : 9000000003c93a14 9000000003800898 90000000041844f8 90000000037a46ec [ 0.000000] 000000000a7fd000 0000000008290000 0000000000000000 0000000000000000 [ 0.000000] 0000000000000000 0000000000000000 00000000019d8000 000000000f556b60 [ 0.000000] 000000000a7fd000 000000000f556b08 9000000003ca7700 9000000003800000 [ 0.000000] 9000000003c93e50 9000000003800898 9000000003800108 90000000037a484c [ 0.000000] 000000000e0a4330 000000000f556b60 000000000a7fd000 000000000f556b08 [ 0.000000] 9000000003ca7700 9000000004184000 0000000000200000 000000000e02b018 [ 0.000000] 000000000a7fd000 90000000037a0790 9000000003800108 0000000000000000 [ 0.000000] 0000000000000000 000000000e0a4330 000000000f556b60 000000000a7fd000 [ 0.000000] 000000000f556b08 000000000eaae298 000000000eaa5040 0000000000200000 [ 0.000000] ... [ 0.000000] Call Trace: [ 0.000000] [<90000000037a5f0c>] efi_runtime_init+0x30/0x94 [ 0.000000] [<90000000037a46ec>] platform_init+0x214/0x250 [ 0.000000] [<90000000037a484c>] setup_arch+0x124/0x45c [ 0.000000] [<90000000037a0790>] start_kernel+0x90/0x670 [ 0.000000] [<900000000378b0d8>] kernel_entry+0xd8/0xdc Signed-off-by: Bibo Mao <[email protected]> Signed-off-by: Huacai Chen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Mar 12, 2024
[ Upstream commit fa765c4 ] shutdown_pirq and startup_pirq are not taking the irq_mapping_update_lock because they can't due to lock inversion. Both are called with the irq_desc->lock being taking. The lock order, however, is first irq_mapping_update_lock and then irq_desc->lock. This opens multiple races: - shutdown_pirq can be interrupted by a function that allocates an event channel: CPU0 CPU1 shutdown_pirq { xen_evtchn_close(e) __startup_pirq { EVTCHNOP_bind_pirq -> returns just freed evtchn e set_evtchn_to_irq(e, irq) } xen_irq_info_cleanup() { set_evtchn_to_irq(e, -1) } } Assume here event channel e refers here to the same event channel number. After this race the evtchn_to_irq mapping for e is invalid (-1). - __startup_pirq races with __unbind_from_irq in a similar way. Because __startup_pirq doesn't take irq_mapping_update_lock it can grab the evtchn that __unbind_from_irq is currently freeing and cleaning up. In this case even though the event channel is allocated, its mapping can be unset in evtchn_to_irq. The fix is to first cleanup the mappings and then close the event channel. In this way, when an event channel gets allocated it's potential previous evtchn_to_irq mappings are guaranteed to be unset already. This is also the reverse order of the allocation where first the event channel is allocated and then the mappings are setup. On a 5.10 kernel prior to commit 3fcdaf3 ("xen/events: modify internal [un]bind interfaces"), we hit a BUG like the following during probing of NVMe devices. The issue is that during nvme_setup_io_queues, pci_free_irq is called for every device which results in a call to shutdown_pirq. With many nvme devices it's therefore likely to hit this race during boot because there will be multiple calls to shutdown_pirq and startup_pirq are running potentially in parallel. ------------[ cut here ]------------ blkfront: xvda: barrier or flush: disabled; persistent grants: enabled; indirect descriptors: enabled; bounce buffer: enabled kernel BUG at drivers/xen/events/events_base.c:499! invalid opcode: 0000 [#1] SMP PTI CPU: 44 PID: 375 Comm: kworker/u257:23 Not tainted 5.10.201-191.748.amzn2.x86_64 #1 Hardware name: Xen HVM domU, BIOS 4.11.amazon 08/24/2006 Workqueue: nvme-reset-wq nvme_reset_work RIP: 0010:bind_evtchn_to_cpu+0xdf/0xf0 Code: 5d 41 5e c3 cc cc cc cc 44 89 f7 e8 2b 55 ad ff 49 89 c5 48 85 c0 0f 84 64 ff ff ff 4c 8b 68 30 41 83 fe ff 0f 85 60 ff ff ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 RSP: 0000:ffffc9000d533b08 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006 RDX: 0000000000000028 RSI: 00000000ffffffff RDI: 00000000ffffffff RBP: ffff888107419680 R08: 0000000000000000 R09: ffffffff82d72b00 R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000001ed R13: 0000000000000000 R14: 00000000ffffffff R15: 0000000000000002 FS: 0000000000000000(0000) GS:ffff88bc8b500000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000002610001 CR4: 00000000001706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? show_trace_log_lvl+0x1c1/0x2d9 ? show_trace_log_lvl+0x1c1/0x2d9 ? set_affinity_irq+0xdc/0x1c0 ? __die_body.cold+0x8/0xd ? die+0x2b/0x50 ? do_trap+0x90/0x110 ? bind_evtchn_to_cpu+0xdf/0xf0 ? do_error_trap+0x65/0x80 ? bind_evtchn_to_cpu+0xdf/0xf0 ? exc_invalid_op+0x4e/0x70 ? bind_evtchn_to_cpu+0xdf/0xf0 ? asm_exc_invalid_op+0x12/0x20 ? bind_evtchn_to_cpu+0xdf/0xf0 ? bind_evtchn_to_cpu+0xc5/0xf0 set_affinity_irq+0xdc/0x1c0 irq_do_set_affinity+0x1d7/0x1f0 irq_setup_affinity+0xd6/0x1a0 irq_startup+0x8a/0xf0 __setup_irq+0x639/0x6d0 ? nvme_suspend+0x150/0x150 request_threaded_irq+0x10c/0x180 ? nvme_suspend+0x150/0x150 pci_request_irq+0xa8/0xf0 ? __blk_mq_free_request+0x74/0xa0 queue_request_irq+0x6f/0x80 nvme_create_queue+0x1af/0x200 nvme_create_io_queues+0xbd/0xf0 nvme_setup_io_queues+0x246/0x320 ? nvme_irq_check+0x30/0x30 nvme_reset_work+0x1c8/0x400 process_one_work+0x1b0/0x350 worker_thread+0x49/0x310 ? process_one_work+0x350/0x350 kthread+0x11b/0x140 ? __kthread_bind_mask+0x60/0x60 ret_from_fork+0x22/0x30 Modules linked in: ---[ end trace a11715de1eee1873 ]--- Fixes: d46a78b ("xen: implement pirq type event channels") Cc: [email protected] Co-debugged-by: Andrew Panyakin <[email protected]> Signed-off-by: Maximilian Heyne <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Mar 12, 2024
commit e6f57c6 upstream. Unfortunately the commit `fd8958efe877` introduced another error causing the `descs` array to overflow. This reults in further crashes easily reproducible by `sendmsg` system call. [ 1080.836473] general protection fault, probably for non-canonical address 0x400300015528b00a: 0000 [#1] PREEMPT SMP PTI [ 1080.869326] RIP: 0010:hfi1_ipoib_build_ib_tx_headers.constprop.0+0xe1/0x2b0 [hfi1] -- [ 1080.974535] Call Trace: [ 1080.976990] <TASK> [ 1081.021929] hfi1_ipoib_send_dma_common+0x7a/0x2e0 [hfi1] [ 1081.027364] hfi1_ipoib_send_dma_list+0x62/0x270 [hfi1] [ 1081.032633] hfi1_ipoib_send+0x112/0x300 [hfi1] [ 1081.042001] ipoib_start_xmit+0x2a9/0x2d0 [ib_ipoib] [ 1081.046978] dev_hard_start_xmit+0xc4/0x210 -- [ 1081.148347] __sys_sendmsg+0x59/0xa0 crash> ipoib_txreq 0xffff9cfeba229f00 struct ipoib_txreq { txreq = { list = { next = 0xffff9cfeba229f00, prev = 0xffff9cfeba229f00 }, descp = 0xffff9cfeba229f40, coalesce_buf = 0x0, wait = 0xffff9cfea4e69a48, complete = 0xffffffffc0fe0760 <hfi1_ipoib_sdma_complete>, packet_len = 0x46d, tlen = 0x0, num_desc = 0x0, desc_limit = 0x6, next_descq_idx = 0x45c, coalesce_idx = 0x0, flags = 0x0, descs = {{ qw = {0x8024000120dffb00, 0x4} # SDMA_DESC0_FIRST_DESC_FLAG (bit 63) }, { qw = { 0x3800014231b108, 0x4} }, { qw = { 0x310000e4ee0fcf0, 0x8} }, { qw = { 0x3000012e9f8000, 0x8} }, { qw = { 0x59000dfb9d0000, 0x8} }, { qw = { 0x78000e02e40000, 0x8} }} }, sdma_hdr = 0x400300015528b000, <<< invalid pointer in the tx request structure sdma_status = 0x0, SDMA_DESC0_LAST_DESC_FLAG (bit 62) complete = 0x0, priv = 0x0, txq = 0xffff9cfea4e69880, skb = 0xffff9d099809f400 } If an SDMA send consists of exactly 6 descriptors and requires dword padding (in the 7th descriptor), the sdma_txreq descriptor array is not properly expanded and the packet will overflow into the container structure. This results in a panic when the send completion runs. The exact panic varies depending on what elements of the container structure get corrupted. The fix is to use the correct expression in _pad_sdma_tx_descs() to test the need to expand the descriptor array. With this patch the crashes are no longer reproducible and the machine is stable. Fixes: fd8958e ("IB/hfi1: Fix sdma.h tx->num_descs off-by-one errors") Cc: [email protected] Reported-by: Mats Kronberg <[email protected]> Tested-by: Mats Kronberg <[email protected]> Signed-off-by: Daniel Vacek <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Mar 12, 2024
commit 136cfac upstream. The gtp_net_ops pernet operations structure for the subsystem must be registered before registering the generic netlink family. Syzkaller hit 'general protection fault in gtp_genl_dump_pdp' bug: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 1 PID: 5826 Comm: gtp Not tainted 6.8.0-rc3-std-def-alt1 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-alt1 04/01/2014 RIP: 0010:gtp_genl_dump_pdp+0x1be/0x800 [gtp] Code: c6 89 c6 e8 64 e9 86 df 58 45 85 f6 0f 85 4e 04 00 00 e8 c5 ee 86 df 48 8b 54 24 18 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 de 05 00 00 48 8b 44 24 18 4c 8b 30 4c 39 f0 74 RSP: 0018:ffff888014107220 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000002 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: ffff88800fcda588 R14: 0000000000000001 R15: 0000000000000000 FS: 00007f1be4eb05c0(0000) GS:ffff88806ce80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1be4e766cf CR3: 000000000c33e000 CR4: 0000000000750ef0 PKRU: 55555554 Call Trace: <TASK> ? show_regs+0x90/0xa0 ? die_addr+0x50/0xd0 ? exc_general_protection+0x148/0x220 ? asm_exc_general_protection+0x22/0x30 ? gtp_genl_dump_pdp+0x1be/0x800 [gtp] ? __alloc_skb+0x1dd/0x350 ? __pfx___alloc_skb+0x10/0x10 genl_dumpit+0x11d/0x230 netlink_dump+0x5b9/0xce0 ? lockdep_hardirqs_on_prepare+0x253/0x430 ? __pfx_netlink_dump+0x10/0x10 ? kasan_save_track+0x10/0x40 ? __kasan_kmalloc+0x9b/0xa0 ? genl_start+0x675/0x970 __netlink_dump_start+0x6fc/0x9f0 genl_family_rcv_msg_dumpit+0x1bb/0x2d0 ? __pfx_genl_family_rcv_msg_dumpit+0x10/0x10 ? genl_op_from_small+0x2a/0x440 ? cap_capable+0x1d0/0x240 ? __pfx_genl_start+0x10/0x10 ? __pfx_genl_dumpit+0x10/0x10 ? __pfx_genl_done+0x10/0x10 ? security_capable+0x9d/0xe0 Cc: [email protected] Signed-off-by: Vasiliy Kovalev <[email protected]> Fixes: 459aa66 ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Mar 12, 2024
[ Upstream commit 5ba4e6d ] Avoid the following warning by making sure to free the allocated resources in case that qedr_init_user_queue() fail. -----------[ cut here ]----------- WARNING: CPU: 0 PID: 143192 at drivers/infiniband/core/rdma_core.c:874 uverbs_destroy_ufile_hw+0xcf/0xf0 [ib_uverbs] Modules linked in: tls target_core_user uio target_core_pscsi target_core_file target_core_iblock ib_srpt ib_srp scsi_transport_srp nfsd nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs 8021q garp mrp stp llc ext4 mbcache jbd2 opa_vnic ib_umad ib_ipoib sunrpc rdma_ucm ib_isert iscsi_target_mod target_core_mod ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_cm hfi1 intel_rapl_msr intel_rapl_common mgag200 qedr sb_edac drm_shmem_helper rdmavt x86_pkg_temp_thermal drm_kms_helper intel_powerclamp ib_uverbs coretemp i2c_algo_bit kvm_intel dell_wmi_descriptor ipmi_ssif sparse_keymap kvm ib_core rfkill syscopyarea sysfillrect video sysimgblt irqbypass ipmi_si ipmi_devintf fb_sys_fops rapl iTCO_wdt mxm_wmi iTCO_vendor_support intel_cstate pcspkr dcdbas intel_uncore ipmi_msghandler lpc_ich acpi_power_meter mei_me mei fuse drm xfs libcrc32c qede sd_mod ahci libahci t10_pi sg crct10dif_pclmul crc32_pclmul crc32c_intel qed libata tg3 ghash_clmulni_intel megaraid_sas crc8 wmi [last unloaded: ib_srpt] CPU: 0 PID: 143192 Comm: fi_rdm_tagged_p Kdump: loaded Not tainted 5.14.0-408.el9.x86_64 #1 Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 2.14.0 01/25/2022 RIP: 0010:uverbs_destroy_ufile_hw+0xcf/0xf0 [ib_uverbs] Code: 5d 41 5c 41 5d 41 5e e9 0f 26 1b dd 48 89 df e8 67 6a ff ff 49 8b 86 10 01 00 00 48 85 c0 74 9c 4c 89 e7 e8 83 c0 cb dd eb 92 <0f> 0b eb be 0f 0b be 04 00 00 00 48 89 df e8 8e f5 ff ff e9 6d ff RSP: 0018:ffffb7c6cadfbc60 EFLAGS: 00010286 RAX: ffff8f0889ee3f60 RBX: ffff8f088c1a5200 RCX: 00000000802a0016 RDX: 00000000802a0017 RSI: 0000000000000001 RDI: ffff8f0880042600 RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000 R10: ffff8f11fffd5000 R11: 0000000000039000 R12: ffff8f0d5b36cd80 R13: ffff8f088c1a5250 R14: ffff8f1206d91000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff8f11d7c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000147069200e20 CR3: 00000001c7210002 CR4: 00000000001706f0 Call Trace: <TASK> ? show_trace_log_lvl+0x1c4/0x2df ? show_trace_log_lvl+0x1c4/0x2df ? ib_uverbs_close+0x1f/0xb0 [ib_uverbs] ? uverbs_destroy_ufile_hw+0xcf/0xf0 [ib_uverbs] ? __warn+0x81/0x110 ? uverbs_destroy_ufile_hw+0xcf/0xf0 [ib_uverbs] ? report_bug+0x10a/0x140 ? handle_bug+0x3c/0x70 ? exc_invalid_op+0x14/0x70 ? asm_exc_invalid_op+0x16/0x20 ? uverbs_destroy_ufile_hw+0xcf/0xf0 [ib_uverbs] ib_uverbs_close+0x1f/0xb0 [ib_uverbs] __fput+0x94/0x250 task_work_run+0x5c/0x90 do_exit+0x270/0x4a0 do_group_exit+0x2d/0x90 get_signal+0x87c/0x8c0 arch_do_signal_or_restart+0x25/0x100 ? ib_uverbs_ioctl+0xc2/0x110 [ib_uverbs] exit_to_user_mode_loop+0x9c/0x130 exit_to_user_mode_prepare+0xb6/0x100 syscall_exit_to_user_mode+0x12/0x40 do_syscall_64+0x69/0x90 ? syscall_exit_work+0x103/0x130 ? syscall_exit_to_user_mode+0x22/0x40 ? do_syscall_64+0x69/0x90 ? syscall_exit_work+0x103/0x130 ? syscall_exit_to_user_mode+0x22/0x40 ? do_syscall_64+0x69/0x90 ? do_syscall_64+0x69/0x90 ? common_interrupt+0x43/0xa0 entry_SYSCALL_64_after_hwframe+0x72/0xdc RIP: 0033:0x1470abe3ec6b Code: Unable to access opcode bytes at RIP 0x1470abe3ec41. RSP: 002b:00007fff13ce9108 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: fffffffffffffffc RBX: 00007fff13ce9218 RCX: 00001470abe3ec6b RDX: 00007fff13ce9200 RSI: 00000000c0181b01 RDI: 0000000000000004 RBP: 00007fff13ce91e0 R08: 0000558d9655da10 R09: 0000558d9655dd00 R10: 00007fff13ce95c0 R11: 0000000000000246 R12: 00007fff13ce9358 R13: 0000000000000013 R14: 0000558d9655db50 R15: 00007fff13ce9470 </TASK> --[ end trace 888a9b92e04c5c97 ]-- Fixes: df15856 ("RDMA/qedr: restructure functions that create/destroy QPs") Signed-off-by: Kamal Heib <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Mar 12, 2024
[ Upstream commit 5761eb9 ] Correct blk-mq registration issue with module parameter disable_managed_interrupts enabled. When we turn off the default PCI_IRQ_AFFINITY flag, the driver needs to register with blk-mq using blk_mq_map_queues(). The driver is currently calling blk_mq_pci_map_queues() which results in a stack trace and possibly undefined behavior. Stack Trace: [ 7.860089] scsi host2: smartpqi [ 7.871934] WARNING: CPU: 0 PID: 238 at block/blk-mq-pci.c:52 blk_mq_pci_map_queues+0xca/0xd0 [ 7.889231] Modules linked in: sd_mod t10_pi sg uas smartpqi(+) crc32c_intel scsi_transport_sas usb_storage dm_mirror dm_region_hash dm_log dm_mod ipmi_devintf ipmi_msghandler fuse [ 7.924755] CPU: 0 PID: 238 Comm: kworker/0:3 Not tainted 4.18.0-372.88.1.el8_6_smartpqi_test.x86_64 #1 [ 7.944336] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 03/08/2022 [ 7.963026] Workqueue: events work_for_cpu_fn [ 7.978275] RIP: 0010:blk_mq_pci_map_queues+0xca/0xd0 [ 7.978278] Code: 48 89 de 89 c7 e8 f6 0f 4f 00 3b 05 c4 b7 8e 01 72 e1 5b 31 c0 5d 41 5c 41 5d 41 5e 41 5f e9 7d df 73 00 31 c0 e9 76 df 73 00 <0f> 0b eb bc 90 90 0f 1f 44 00 00 41 57 49 89 ff 41 56 41 55 41 54 [ 7.978280] RSP: 0018:ffffa95fc3707d50 EFLAGS: 00010216 [ 7.978283] RAX: 00000000ffffffff RBX: 0000000000000000 RCX: 0000000000000010 [ 7.978284] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffff9190c32d4310 [ 7.978286] RBP: 0000000000000000 R08: ffffa95fc3707d38 R09: ffff91929b81ac00 [ 7.978287] R10: 0000000000000001 R11: ffffa95fc3707ac0 R12: 0000000000000000 [ 7.978288] R13: ffff9190c32d4000 R14: 00000000ffffffff R15: ffff9190c4c950a8 [ 7.978290] FS: 0000000000000000(0000) GS:ffff9193efc00000(0000) knlGS:0000000000000000 [ 7.978292] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8.172814] CR2: 000055d11166c000 CR3: 00000002dae10002 CR4: 00000000007706f0 [ 8.172816] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 8.172817] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 8.172818] PKRU: 55555554 [ 8.172819] Call Trace: [ 8.172823] blk_mq_alloc_tag_set+0x12e/0x310 [ 8.264339] scsi_add_host_with_dma.cold.9+0x30/0x245 [ 8.279302] pqi_ctrl_init+0xacf/0xc8e [smartpqi] [ 8.294085] ? pqi_pci_probe+0x480/0x4c8 [smartpqi] [ 8.309015] pqi_pci_probe+0x480/0x4c8 [smartpqi] [ 8.323286] local_pci_probe+0x42/0x80 [ 8.337855] work_for_cpu_fn+0x16/0x20 [ 8.351193] process_one_work+0x1a7/0x360 [ 8.364462] ? create_worker+0x1a0/0x1a0 [ 8.379252] worker_thread+0x1ce/0x390 [ 8.392623] ? create_worker+0x1a0/0x1a0 [ 8.406295] kthread+0x10a/0x120 [ 8.418428] ? set_kthread_struct+0x50/0x50 [ 8.431532] ret_from_fork+0x1f/0x40 [ 8.444137] ---[ end trace 1bf0173d39354506 ]--- Fixes: cf15c3e ("scsi: smartpqi: Add module param to disable managed ints") Tested-by: Yogesh Chandra Pandey <[email protected]> Reviewed-by: Scott Benesh <[email protected]> Reviewed-by: Scott Teel <[email protected]> Reviewed-by: Mahesh Rajashekhara <[email protected]> Reviewed-by: Mike McGowen <[email protected]> Reviewed-by: Kevin Barnett <[email protected]> Signed-off-by: Don Brace <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Tomas Henzl <[email protected]> Reviewed-by: Ewan D. Milne <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Mar 12, 2024
…ntroller [ Upstream commit a5c57fd ] When a PCI device is dynamically added, the kernel oopses with a NULL pointer dereference: BUG: Kernel NULL pointer dereference on read at 0x00000030 Faulting instruction address: 0xc0000000006bbe5c Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries Modules linked in: rpadlpar_io rpaphp rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs xsk_diag bonding nft_compat nf_tables nfnetlink rfkill binfmt_misc dm_multipath rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_target_mod target_core_mod ib_umad ib_iser libiscsi scsi_transport_iscsi ib_ipoib rdma_cm iw_cm ib_cm mlx5_ib ib_uverbs ib_core pseries_rng drm drm_panel_orientation_quirks xfs libcrc32c mlx5_core mlxfw sd_mod t10_pi sg tls ibmvscsi ibmveth scsi_transport_srp vmx_crypto pseries_wdt psample dm_mirror dm_region_hash dm_log dm_mod fuse CPU: 17 PID: 2685 Comm: drmgr Not tainted 6.7.0-203405+ #66 Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NH1060_008) hv:phyp pSeries NIP: c0000000006bbe5c LR: c000000000a13e68 CTR: c0000000000579f8 REGS: c00000009924f240 TRAP: 0300 Not tainted (6.7.0-203405+) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24002220 XER: 20040006 CFAR: c000000000a13e64 DAR: 0000000000000030 DSISR: 40000000 IRQMASK: 0 ... NIP sysfs_add_link_to_group+0x34/0x94 LR iommu_device_link+0x5c/0x118 Call Trace: iommu_init_device+0x26c/0x318 (unreliable) iommu_device_link+0x5c/0x118 iommu_init_device+0xa8/0x318 iommu_probe_device+0xc0/0x134 iommu_bus_notifier+0x44/0x104 notifier_call_chain+0xb8/0x19c blocking_notifier_call_chain+0x64/0x98 bus_notify+0x50/0x7c device_add+0x640/0x918 pci_device_add+0x23c/0x298 of_create_pci_dev+0x400/0x884 of_scan_pci_dev+0x124/0x1b0 __of_scan_bus+0x78/0x18c pcibios_scan_phb+0x2a4/0x3b0 init_phb_dynamic+0xb8/0x110 dlpar_add_slot+0x170/0x3b8 [rpadlpar_io] add_slot_store.part.0+0xb4/0x130 [rpadlpar_io] kobj_attr_store+0x2c/0x48 sysfs_kf_write+0x64/0x78 kernfs_fop_write_iter+0x1b0/0x290 vfs_write+0x350/0x4a0 ksys_write+0x84/0x140 system_call_exception+0x124/0x330 system_call_vectored_common+0x15c/0x2ec Commit a940904 ("powerpc/iommu: Add iommu_ops to report capabilities and allow blocking domains") broke DLPAR add of PCI devices. The above added iommu_device structure to pci_controller. During system boot, PCI devices are discovered and this newly added iommu_device structure is initialized by a call to iommu_device_register(). During DLPAR add of a PCI device, a new pci_controller structure is allocated but there are no calls made to iommu_device_register() interface. Fix is to register the iommu device during DLPAR add as well. Fixes: a940904 ("powerpc/iommu: Add iommu_ops to report capabilities and allow blocking domains") Signed-off-by: Gaurav Batra <[email protected]> Reviewed-by: Brian King <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Jun 26, 2024
Tested on PocketBeagle + ATUSB + CC1352R SensorTag: debian@beaglebone:~$ uname -a Linux beaglebone 5.8.18-bone23 #1xross PREEMPT Tue Dec 15 15:41:20 IST 2020 armv7l GNU/Linux All drivers probed correctly as per manifest, but a kernel panic occurs after sometime: debian@beaglebone:~$ dmesg [ 218.793223] greybus 1-svc: no primary interface detected on module 2 [ 218.876678] greybus 1-2.2: Interface added (greybus) [ 218.876707] greybus 1-2.2: GMP VID=0x00000126, PID=0x00000126 [ 218.876739] greybus 1-2.2: DDBL1 Manufacturer=0x00000126, Product=0x00000126 [ 219.296368] greybus 1-2.2: excess descriptors in interface manifest [ 220.922786] mikrobus:mikrobus_port_gb_register: mikrobus gb_probe , num cports= 3 [ 220.922793] mikrobus:mikrobus_port_gb_register: protocol added 11 [ 220.922824] mikrobus:mikrobus_port_gb_register: protocol added 3 [ 220.922829] mikrobus:mikrobus_port_gb_register: protocol added 2 [ 220.922896] mikrobus:mikrobus_port_register: registering port mikrobus-1 [ 220.931490] mikrobus_manifest:mikrobus_manifest_attach_device: parsed device 1, driver=bme280, protocol=3, reg=76 [ 220.931526] mikrobus_manifest:mikrobus_manifest_attach_device: parsed device 2, driver=opt3001, protocol=3, reg=44 [ 220.931557] mikrobus_manifest:mikrobus_manifest_parse: Greybus Service Sample Application manifest parsed with 2 devices [ 220.931602] mikrobus mikrobus-1: registering device : bme280 [ 220.937043] mikrobus mikrobus-1: registering device : opt3001 [ 221.628678] bmp280 3-0076: supply vddd not found, using dummy regulator [ 221.628976] bmp280 3-0076: supply vdda not found, using dummy regulator [ 221.630669] opt3001 3-0044: Found TI OPT3001 debian@beaglebone:~$ ls /sys/bus/iio/devices/iio\:device iio:device0/ iio:device1/ iio:device2/ debian@beaglebone:~$ ls /sys/bus/iio/devices/iio\:device1 current_timestamp_clock dev events in_illuminance_input in_illuminance_integration_time integration_time_available name power subsystem uevent debian@beaglebone:~$ ls /sys/bus/iio/devices/iio\:device2 dev in_humidityrelative_input in_humidityrelative_oversampling_ratio in_pressure_input in_pressure_oversampling_ratio in_temp_input in_temp_oversampling_ratio name power subsystem uevent . . . [ 235.947231] 8<--- cut here --- [ 235.951059] Unable to handle kernel NULL pointer dereference at virtual address 00000004 [ 235.964814] pgd = 1f321289 [ 235.970754] [00000004] *pgd=00000000 [ 235.977393] Internal error: Oops: 5 [#1] PREEMPT THUMB2 [ 235.983348] Modules linked in: bmp280_i2c bmp280 opt3001 gb_netlink gb_loopback(C) gb_gpio(C) gb_i2c(C) gb_spi(C) gb_gbphy(C) gb_spilib(C) greybus nhc_udp nhc_routing nhc_ipv6 nhc_mobility nhc_hop nhc_fragment nhc_dest ieee802154_6lowpan 6lowpan evdev usb_f_acm u_serial usb_f_ncm usb_f_mass_storage usb_f_rndis u_ether libcomposite iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_filter ip_tables x_tables atusb mac802154 ieee802154 spidev [last unloaded: gb_netlink] [ 236.028629] CPU: 0 PID: 5 Comm: kworker/0:0 Tainted: G C O 5.8.18-bone23 #1xross [ 236.037887] Hardware name: Generic AM33XX (Flattened Device Tree) [ 236.044761] Workqueue: events do_work [greybus] [ 236.050022] PC is at skb_release_data+0x52/0xe4 [ 236.055268] LR is at kfree_skb+0x23/0xa8 [ 236.059903] pc : [<c0857e6a>] lr : [<c0857583>] psr: 400f0033 [ 236.066892] sp : dc139e60 ip : a00f0013 fp : c0f1369c [ 236.072833] r10: da5b5200 r9 : da5b5214 r8 : 00000001 [ 236.078775] r7 : da5b5300 r6 : db633c00 r5 : da5b5300 r4 : 00000000 [ 236.086026] r3 : 00000008 r2 : 00000000 r1 : 00000000 r0 : 00000000 [ 236.093279] Flags: nZcv IRQs on FIQs on Mode SVC_32 ISA Thumb Segment none [ 236.101316] Control: 50c5387d Table: 9b5a4019 DAC: 00000051 [ 236.107785] Process kworker/0:0 (pid: 5, stack limit = 0x39163ebd) [ 236.114689] Stack: (0xdc139e60 to 0xdc13a000) [ 236.119766] 9e60: da5b5200 db633c00 ffffff91 bf9391cd db51c000 c0857583 db633c00 ffffff91 [ 236.128678] 9e80: dae21168 bf9391cd 00000000 00000001 00000000 c0f05288 d64e7ea0 00000000 [ 236.137590] 9ea0: 000007d0 00000cc0 da6b0e00 00000000 00000000 bf90f84d 00000cc0 d64e7ea0 [ 236.146502] 9ec0: 00000000 00000000 00000013 bf90f8b9 d64e7ea0 bf90f94d 00000000 00000cc0 [ 236.155415] 9ee0: d6543d40 db51c400 00000000 dfa07300 00000000 00000000 d6543d44 bf90e973 [ 236.164328] 9f00: 00000000 00000000 000007d0 c0153c45 c0f05288 bf90eb39 00000000 c0f05288 [ 236.173240] 9f20: dfa07305 d6543d40 dc0e5b00 00000000 dfa07300 c01377c7 dc139f50 c0137a53 [ 236.182153] 9f40: dc0e5b00 c0f1365c dc0e5b14 c0f1d060 c0f13670 dc138000 c0f1365c c0137af9 [ 236.191065] 9f60: 00000000 dc0fd080 dc0fd380 dc138000 00000000 c0137a2d dc0e5b00 dc117eb8 [ 236.199979] 9f80: dc0fd0a0 c013ba45 00000000 dc0fd380 c013b955 00000000 00000000 00000000 [ 236.208891] 9fa0: 00000000 00000000 00000000 c0100159 00000000 00000000 00000000 00000000 [ 236.217804] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 236.226716] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000 [ 236.235638] [<c0857e6a>] (skb_release_data) from [<c0857583>] (kfree_skb+0x23/0xa8) [ 236.244032] [<c0857583>] (kfree_skb) from [<bf9391cd>] (message_send+0x91/0x12c [gb_netlink]) [ 236.253333] [<bf9391cd>] (message_send [gb_netlink]) from [<bf90f84d>] (gb_operation_request_send+0xad/0xfc [greybus]) [ 236.264799] [<bf90f84d>] (gb_operation_request_send [greybus]) from [<bf90f8b9>] (gb_operation_request_send_sync_timeout+0x1d/0x4c [greybus]) [ 236.278268] [<bf90f8b9>] (gb_operation_request_send_sync_timeout [greybus]) from [<bf90f94d>] (gb_operation_sync_timeout+0x65/0xb0 [greybus]) [ 236.291736] [<bf90f94d>] (gb_operation_sync_timeout [greybus]) from [<bf90e973>] (gb_svc_ping+0x23/0x28 [greybus]) [ 236.302849] [<bf90e973>] (gb_svc_ping [greybus]) from [<bf90eb39>] (do_work+0x19/0xe8 [greybus]) [ 236.312387] [<bf90eb39>] (do_work [greybus]) from [<c01377c7>] (process_one_work+0x127/0x38c) [ 236.321651] [<c01377c7>] (process_one_work) from [<c0137af9>] (worker_thread+0xcd/0x3d0) [ 236.330482] [<c0137af9>] (worker_thread) from [<c013ba45>] (kthread+0xf1/0x124) [ 236.338526] [<c013ba45>] (kthread) from [<c0100159>] (ret_from_fork+0x11/0x38) [ 236.346474] Exception stack(0xdc139fb0 to 0xdc139ff8) [ 236.352244] 9fa0: 00000000 00000000 00000000 00000000 [ 236.361156] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 236.370067] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 [ 236.377413] Code: 78bb 42a3 dd1b 6aa8 (6843) 07da [ 236.394147] ---[ end trace dc655e652b764722 ]--- Signed-off-by: Vaishnav M A <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Jun 27, 2024
[ Upstream commit db2a3cc ] Syzbot found the following issue: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000016 Mem abort info: ESR = 0x0000000096000006 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x06: level 2 translation fault Data abort info: ISV = 0, ISS = 0x00000006 CM = 0, WnR = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=000000010af56000 [0000000000000016] pgd=08000001090da003, p4d=08000001090da003, pud=08000001090ce003, pmd=0000000000000000 Internal error: Oops: 0000000096000006 [#1] PREEMPT SMP Modules linked in: CPU: 1 PID: 3036 Comm: syz-executor206 Not tainted 6.0.0-rc6-syzkaller-17739-g16c9f284e746 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022 pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : is_rec_inuse fs/ntfs3/ntfs.h:313 [inline] pc : ni_write_inode+0xac/0x798 fs/ntfs3/frecord.c:3232 lr : ni_write_inode+0xa0/0x798 fs/ntfs3/frecord.c:3226 sp : ffff8000126c3800 x29: ffff8000126c3860 x28: 0000000000000000 x27: ffff0000c8b02000 x26: ffff0000c7502320 x25: ffff0000c7502288 x24: 0000000000000000 x23: ffff80000cbec91c x22: ffff0000c8b03000 x21: ffff0000c8b02000 x20: 0000000000000001 x19: ffff0000c75024d8 x18: 00000000000000c0 x17: ffff80000dd1b198 x16: ffff80000db59158 x15: ffff0000c4b6b500 x14: 00000000000000b8 x13: 0000000000000000 x12: ffff0000c4b6b500 x11: ff80800008be1b60 x10: 0000000000000000 x9 : ffff0000c4b6b500 x8 : 0000000000000000 x7 : ffff800008be1b50 x6 : 0000000000000000 x5 : 0000000000000000 x4 : 0000000000000001 x3 : 0000000000000000 x2 : 0000000000000008 x1 : 0000000000000001 x0 : 0000000000000000 Call trace: is_rec_inuse fs/ntfs3/ntfs.h:313 [inline] ni_write_inode+0xac/0x798 fs/ntfs3/frecord.c:3232 ntfs_evict_inode+0x54/0x84 fs/ntfs3/inode.c:1744 evict+0xec/0x334 fs/inode.c:665 iput_final fs/inode.c:1748 [inline] iput+0x2c4/0x324 fs/inode.c:1774 ntfs_new_inode+0x7c/0xe0 fs/ntfs3/fsntfs.c:1660 ntfs_create_inode+0x20c/0xe78 fs/ntfs3/inode.c:1278 ntfs_create+0x54/0x74 fs/ntfs3/namei.c:100 lookup_open fs/namei.c:3413 [inline] open_last_lookups fs/namei.c:3481 [inline] path_openat+0x804/0x11c4 fs/namei.c:3688 do_filp_open+0xdc/0x1b8 fs/namei.c:3718 do_sys_openat2+0xb8/0x22c fs/open.c:1311 do_sys_open fs/open.c:1327 [inline] __do_sys_openat fs/open.c:1343 [inline] __se_sys_openat fs/open.c:1338 [inline] __arm64_sys_openat+0xb0/0xe0 fs/open.c:1338 __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline] invoke_syscall arch/arm64/kernel/syscall.c:52 [inline] el0_svc_common+0x138/0x220 arch/arm64/kernel/syscall.c:142 do_el0_svc+0x48/0x164 arch/arm64/kernel/syscall.c:206 el0_svc+0x58/0x150 arch/arm64/kernel/entry-common.c:636 el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:654 el0t_64_sync+0x18c/0x190 Code: 97dafee4 340001b4 f9401328 2a1f03e0 (79402d14) ---[ end trace 0000000000000000 ]--- Above issue may happens as follows: ntfs_new_inode mi_init mi->mrec = kmalloc(sbi->record_size, GFP_NOFS); -->failed to allocate memory if (!mi->mrec) return -ENOMEM; iput iput_final evict ntfs_evict_inode ni_write_inode is_rec_inuse(ni->mi.mrec)-> As 'ni->mi.mrec' is NULL trigger NULL-ptr-deref To solve above issue if new inode failed make inode bad before call 'iput()' in 'ntfs_new_inode()'. Reported-by: [email protected] Signed-off-by: Ye Bin <[email protected]> Signed-off-by: Konstantin Komarov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Jun 27, 2024
[ Upstream commit 00d6a28 ] Commit a5a9230 (fbdev: fbcon: Properly revert changes when vc_resize() failed) started restoring old font data upon failure (of vc_resize()). But it performs so only for user fonts. It means that the "system"/internal fonts are not restored at all. So in result, the very first call to fbcon_do_set_font() performs no restore at all upon failing vc_resize(). This can be reproduced by Syzkaller to crash the system on the next invocation of font_get(). It's rather hard to hit the allocation failure in vc_resize() on the first font_set(), but not impossible. Esp. if fault injection is used to aid the execution/failure. It was demonstrated by Sirius: BUG: unable to handle page fault for address: fffffffffffffff8 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD cb7b067 P4D cb7b067 PUD cb7d067 PMD 0 Oops: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 8007 Comm: poc Not tainted 6.7.0-g9d1694dc91ce #20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 RIP: 0010:fbcon_get_font+0x229/0x800 drivers/video/fbdev/core/fbcon.c:2286 Call Trace: <TASK> con_font_get drivers/tty/vt/vt.c:4558 [inline] con_font_op+0x1fc/0xf20 drivers/tty/vt/vt.c:4673 vt_k_ioctl drivers/tty/vt/vt_ioctl.c:474 [inline] vt_ioctl+0x632/0x2ec0 drivers/tty/vt/vt_ioctl.c:752 tty_ioctl+0x6f8/0x1570 drivers/tty/tty_io.c:2803 vfs_ioctl fs/ioctl.c:51 [inline] ... So restore the font data in any case, not only for user fonts. Note the later 'if' is now protected by 'old_userfont' and not 'old_data' as the latter is always set now. (And it is supposed to be non-NULL. Otherwise we would see the bug above again.) Signed-off-by: Jiri Slaby (SUSE) <[email protected]> Fixes: a5a9230 ("fbdev: fbcon: Properly revert changes when vc_resize() failed") Reported-and-tested-by: Ubisectech Sirius <[email protected]> Cc: Ubisectech Sirius <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Helge Deller <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Daniel Vetter <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Jun 27, 2024
commit 616d82c upstream. The gtp_link_ops operations structure for the subsystem must be registered after registering the gtp_net_ops pernet operations structure. Syzkaller hit 'general protection fault in gtp_genl_dump_pdp' bug: [ 1010.702740] gtp: GTP module unloaded [ 1010.715877] general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] SMP KASAN NOPTI [ 1010.715888] KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] [ 1010.715895] CPU: 1 PID: 128616 Comm: a.out Not tainted 6.8.0-rc6-std-def-alt1 #1 [ 1010.715899] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-alt1 04/01/2014 [ 1010.715908] RIP: 0010:gtp_newlink+0x4d7/0x9c0 [gtp] [ 1010.715915] Code: 80 3c 02 00 0f 85 41 04 00 00 48 8b bb d8 05 00 00 e8 ed f6 ff ff 48 89 c2 48 89 c5 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 4f 04 00 00 4c 89 e2 4c 8b 6d 00 48 b8 00 00 00 [ 1010.715920] RSP: 0018:ffff888020fbf180 EFLAGS: 00010203 [ 1010.715929] RAX: dffffc0000000000 RBX: ffff88800399c000 RCX: 0000000000000000 [ 1010.715933] RDX: 0000000000000001 RSI: ffffffff84805280 RDI: 0000000000000282 [ 1010.715938] RBP: 000000000000000d R08: 0000000000000001 R09: 0000000000000000 [ 1010.715942] R10: 0000000000000001 R11: 0000000000000001 R12: ffff88800399cc80 [ 1010.715947] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000400 [ 1010.715953] FS: 00007fd1509ab5c0(0000) GS:ffff88805b300000(0000) knlGS:0000000000000000 [ 1010.715958] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1010.715962] CR2: 0000000000000000 CR3: 000000001c07a000 CR4: 0000000000750ee0 [ 1010.715968] PKRU: 55555554 [ 1010.715972] Call Trace: [ 1010.715985] ? __die_body.cold+0x1a/0x1f [ 1010.715995] ? die_addr+0x43/0x70 [ 1010.716002] ? exc_general_protection+0x199/0x2f0 [ 1010.716016] ? asm_exc_general_protection+0x1e/0x30 [ 1010.716026] ? gtp_newlink+0x4d7/0x9c0 [gtp] [ 1010.716034] ? gtp_net_exit+0x150/0x150 [gtp] [ 1010.716042] __rtnl_newlink+0x1063/0x1700 [ 1010.716051] ? rtnl_setlink+0x3c0/0x3c0 [ 1010.716063] ? is_bpf_text_address+0xc0/0x1f0 [ 1010.716070] ? kernel_text_address.part.0+0xbb/0xd0 [ 1010.716076] ? __kernel_text_address+0x56/0xa0 [ 1010.716084] ? unwind_get_return_address+0x5a/0xa0 [ 1010.716091] ? create_prof_cpu_mask+0x30/0x30 [ 1010.716098] ? arch_stack_walk+0x9e/0xf0 [ 1010.716106] ? stack_trace_save+0x91/0xd0 [ 1010.716113] ? stack_trace_consume_entry+0x170/0x170 [ 1010.716121] ? __lock_acquire+0x15c5/0x5380 [ 1010.716139] ? mark_held_locks+0x9e/0xe0 [ 1010.716148] ? kmem_cache_alloc_trace+0x35f/0x3c0 [ 1010.716155] ? __rtnl_newlink+0x1700/0x1700 [ 1010.716160] rtnl_newlink+0x69/0xa0 [ 1010.716166] rtnetlink_rcv_msg+0x43b/0xc50 [ 1010.716172] ? rtnl_fdb_dump+0x9f0/0x9f0 [ 1010.716179] ? lock_acquire+0x1fe/0x560 [ 1010.716188] ? netlink_deliver_tap+0x12f/0xd50 [ 1010.716196] netlink_rcv_skb+0x14d/0x440 [ 1010.716202] ? rtnl_fdb_dump+0x9f0/0x9f0 [ 1010.716208] ? netlink_ack+0xab0/0xab0 [ 1010.716213] ? netlink_deliver_tap+0x202/0xd50 [ 1010.716220] ? netlink_deliver_tap+0x218/0xd50 [ 1010.716226] ? __virt_addr_valid+0x30b/0x590 [ 1010.716233] netlink_unicast+0x54b/0x800 [ 1010.716240] ? netlink_attachskb+0x870/0x870 [ 1010.716248] ? __check_object_size+0x2de/0x3b0 [ 1010.716254] netlink_sendmsg+0x938/0xe40 [ 1010.716261] ? netlink_unicast+0x800/0x800 [ 1010.716269] ? __import_iovec+0x292/0x510 [ 1010.716276] ? netlink_unicast+0x800/0x800 [ 1010.716284] __sock_sendmsg+0x159/0x190 [ 1010.716290] ____sys_sendmsg+0x712/0x880 [ 1010.716297] ? sock_write_iter+0x3d0/0x3d0 [ 1010.716304] ? __ia32_sys_recvmmsg+0x270/0x270 [ 1010.716309] ? lock_acquire+0x1fe/0x560 [ 1010.716315] ? drain_array_locked+0x90/0x90 [ 1010.716324] ___sys_sendmsg+0xf8/0x170 [ 1010.716331] ? sendmsg_copy_msghdr+0x170/0x170 [ 1010.716337] ? lockdep_init_map_type+0x2c7/0x860 [ 1010.716343] ? lockdep_hardirqs_on_prepare+0x430/0x430 [ 1010.716350] ? debug_mutex_init+0x33/0x70 [ 1010.716360] ? percpu_counter_add_batch+0x8b/0x140 [ 1010.716367] ? lock_acquire+0x1fe/0x560 [ 1010.716373] ? find_held_lock+0x2c/0x110 [ 1010.716384] ? __fd_install+0x1b6/0x6f0 [ 1010.716389] ? lock_downgrade+0x810/0x810 [ 1010.716396] ? __fget_light+0x222/0x290 [ 1010.716403] __sys_sendmsg+0xea/0x1b0 [ 1010.716409] ? __sys_sendmsg_sock+0x40/0x40 [ 1010.716419] ? lockdep_hardirqs_on_prepare+0x2b3/0x430 [ 1010.716425] ? syscall_enter_from_user_mode+0x1d/0x60 [ 1010.716432] do_syscall_64+0x30/0x40 [ 1010.716438] entry_SYSCALL_64_after_hwframe+0x62/0xc7 [ 1010.716444] RIP: 0033:0x7fd1508cbd49 [ 1010.716452] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ef 70 0d 00 f7 d8 64 89 01 48 [ 1010.716456] RSP: 002b:00007fff18872348 EFLAGS: 00000202 ORIG_RAX: 000000000000002e [ 1010.716463] RAX: ffffffffffffffda RBX: 000055f72bf0eac0 RCX: 00007fd1508cbd49 [ 1010.716468] RDX: 0000000000000000 RSI: 0000000020000280 RDI: 0000000000000006 [ 1010.716473] RBP: 00007fff18872360 R08: 00007fff18872360 R09: 00007fff18872360 [ 1010.716478] R10: 00007fff18872360 R11: 0000000000000202 R12: 000055f72bf0e1b0 [ 1010.716482] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 1010.716491] Modules linked in: gtp(+) udp_tunnel ib_core uinput af_packet rfkill qrtr joydev hid_generic usbhid hid kvm_intel iTCO_wdt intel_pmc_bxt iTCO_vendor_support kvm snd_hda_codec_generic ledtrig_audio irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel snd_hda_intel nls_utf8 snd_intel_dspcfg nls_cp866 psmouse aesni_intel vfat crypto_simd fat cryptd glue_helper snd_hda_codec pcspkr snd_hda_core i2c_i801 snd_hwdep i2c_smbus xhci_pci snd_pcm lpc_ich xhci_pci_renesas xhci_hcd qemu_fw_cfg tiny_power_button button sch_fq_codel vboxvideo drm_vram_helper drm_ttm_helper ttm vboxsf vboxguest snd_seq_midi snd_seq_midi_event snd_seq snd_rawmidi snd_seq_device snd_timer snd soundcore msr fuse efi_pstore dm_mod ip_tables x_tables autofs4 virtio_gpu virtio_dma_buf drm_kms_helper cec rc_core drm virtio_rng virtio_scsi rng_core virtio_balloon virtio_blk virtio_net virtio_console net_failover failover ahci libahci libata evdev scsi_mod input_leds serio_raw virtio_pci intel_agp [ 1010.716674] virtio_ring intel_gtt virtio [last unloaded: gtp] [ 1010.716693] ---[ end trace 04990a4ce61e174b ]--- Cc: [email protected] Signed-off-by: Alexander Ofitserov <[email protected]> Fixes: 459aa66 ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Reviewed-by: Jiri Pirko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Jun 27, 2024
commit 6b1ba3f upstream. Turning on CONFIG_DMA_API_DEBUG_SG results in the following warning: DMA-API: mmci-pl18x 48220000.mmc: cacheline tracking EEXIST, overlapping mappings aren't supported WARNING: CPU: 1 PID: 51 at kernel/dma/debug.c:568 add_dma_entry+0x234/0x2f4 Modules linked in: CPU: 1 PID: 51 Comm: kworker/1:2 Not tainted 6.1.28 #1 Hardware name: STMicroelectronics STM32MP257F-EV1 Evaluation Board (DT) Workqueue: events_freezable mmc_rescan Call trace: add_dma_entry+0x234/0x2f4 debug_dma_map_sg+0x198/0x350 __dma_map_sg_attrs+0xa0/0x110 dma_map_sg_attrs+0x10/0x2c sdmmc_idma_prep_data+0x80/0xc0 mmci_prep_data+0x38/0x84 mmci_start_data+0x108/0x2dc mmci_request+0xe4/0x190 __mmc_start_request+0x68/0x140 mmc_start_request+0x94/0xc0 mmc_wait_for_req+0x70/0x100 mmc_send_tuning+0x108/0x1ac sdmmc_execute_tuning+0x14c/0x210 mmc_execute_tuning+0x48/0xec mmc_sd_init_uhs_card.part.0+0x208/0x464 mmc_sd_init_card+0x318/0x89c mmc_attach_sd+0xe4/0x180 mmc_rescan+0x244/0x320 DMA API debug brings to light leaking dma-mappings as dma_map_sg and dma_unmap_sg are not correctly balanced. If an error occurs in mmci_cmd_irq function, only mmci_dma_error function is called and as this API is not managed on stm32 variant, dma_unmap_sg is never called in this error path. Signed-off-by: Christophe Kerello <[email protected]> Fixes: 46b723d ("mmc: mmci: add stm32 sdmmc variant") Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Jun 27, 2024
commit d6a9608 upstream. Syzbot and Eric reported a lockdep splat in the subflow diag: WARNING: possible circular locking dependency detected 6.8.0-rc4-syzkaller-00212-g40b9385dd8e6 #0 Not tainted syz-executor.2/24141 is trying to acquire lock: ffff888045870130 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: tcp_diag_put_ulp net/ipv4/tcp_diag.c:100 [inline] ffff888045870130 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: tcp_diag_get_aux+0x738/0x830 net/ipv4/tcp_diag.c:137 but task is already holding lock: ffffc9000135e488 (&h->lhash2[i].lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] ffffc9000135e488 (&h->lhash2[i].lock){+.+.}-{2:2}, at: inet_diag_dump_icsk+0x39f/0x1f80 net/ipv4/inet_diag.c:1038 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&h->lhash2[i].lock){+.+.}-{2:2}: lock_acquire+0x1e3/0x530 kernel/locking/lockdep.c:5754 __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline] _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154 spin_lock include/linux/spinlock.h:351 [inline] __inet_hash+0x335/0xbe0 net/ipv4/inet_hashtables.c:743 inet_csk_listen_start+0x23a/0x320 net/ipv4/inet_connection_sock.c:1261 __inet_listen_sk+0x2a2/0x770 net/ipv4/af_inet.c:217 inet_listen+0xa3/0x110 net/ipv4/af_inet.c:239 rds_tcp_listen_init+0x3fd/0x5a0 net/rds/tcp_listen.c:316 rds_tcp_init_net+0x141/0x320 net/rds/tcp.c:577 ops_init+0x352/0x610 net/core/net_namespace.c:136 __register_pernet_operations net/core/net_namespace.c:1214 [inline] register_pernet_operations+0x2cb/0x660 net/core/net_namespace.c:1283 register_pernet_device+0x33/0x80 net/core/net_namespace.c:1370 rds_tcp_init+0x62/0xd0 net/rds/tcp.c:735 do_one_initcall+0x238/0x830 init/main.c:1236 do_initcall_level+0x157/0x210 init/main.c:1298 do_initcalls+0x3f/0x80 init/main.c:1314 kernel_init_freeable+0x42f/0x5d0 init/main.c:1551 kernel_init+0x1d/0x2a0 init/main.c:1441 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1b/0x30 arch/x86/entry/entry_64.S:242 -> #0 (k-sk_lock-AF_INET6){+.+.}-{0:0}: check_prev_add kernel/locking/lockdep.c:3134 [inline] check_prevs_add kernel/locking/lockdep.c:3253 [inline] validate_chain+0x18ca/0x58e0 kernel/locking/lockdep.c:3869 __lock_acquire+0x1345/0x1fd0 kernel/locking/lockdep.c:5137 lock_acquire+0x1e3/0x530 kernel/locking/lockdep.c:5754 lock_sock_fast include/net/sock.h:1723 [inline] subflow_get_info+0x166/0xd20 net/mptcp/diag.c:28 tcp_diag_put_ulp net/ipv4/tcp_diag.c:100 [inline] tcp_diag_get_aux+0x738/0x830 net/ipv4/tcp_diag.c:137 inet_sk_diag_fill+0x10ed/0x1e00 net/ipv4/inet_diag.c:345 inet_diag_dump_icsk+0x55b/0x1f80 net/ipv4/inet_diag.c:1061 __inet_diag_dump+0x211/0x3a0 net/ipv4/inet_diag.c:1263 inet_diag_dump_compat+0x1c1/0x2d0 net/ipv4/inet_diag.c:1371 netlink_dump+0x59b/0xc80 net/netlink/af_netlink.c:2264 __netlink_dump_start+0x5df/0x790 net/netlink/af_netlink.c:2370 netlink_dump_start include/linux/netlink.h:338 [inline] inet_diag_rcv_msg_compat+0x209/0x4c0 net/ipv4/inet_diag.c:1405 sock_diag_rcv_msg+0xe7/0x410 netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2543 sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:280 netlink_unicast_kernel net/netlink/af_netlink.c:1341 [inline] netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1367 netlink_sendmsg+0xa3b/0xd70 net/netlink/af_netlink.c:1908 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x221/0x270 net/socket.c:745 ____sys_sendmsg+0x525/0x7d0 net/socket.c:2584 ___sys_sendmsg net/socket.c:2638 [inline] __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667 do_syscall_64+0xf9/0x240 entry_SYSCALL_64_after_hwframe+0x6f/0x77 As noted by Eric we can break the lock dependency chain avoid dumping any extended info for the mptcp subflow listener: nothing actually useful is presented there. Fixes: b8adb69 ("mptcp: fix lockless access in subflow ULP diag") Cc: [email protected] Reported-by: Eric Dumazet <[email protected]> Closes: https://lore.kernel.org/netdev/CANn89iJ=Oecw6OZDwmSYc9HJKQ_G32uN11L+oUcMu+TOD5Xiaw@mail.gmail.com/ Suggested-by: Eric Dumazet <[email protected]> Signed-off-by: Paolo Abeni <[email protected]> Reviewed-by: Matthieu Baerts (NGI0) <[email protected]> Signed-off-by: Matthieu Baerts (NGI0) <[email protected]> Link: https://lore.kernel.org/r/20240223-upstream-net-20240223-misc-fixes-v1-9-162e87e48497@kernel.org Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Jun 27, 2024
…SR-IOV [ Upstream commit 09a3c1e ] When kdump kernel tries to copy dump data over SR-IOV, LPAR panics due to NULL pointer exception: Kernel attempted to read user page (0) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x00000000 Faulting instruction address: 0xc000000020847ad4 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries Modules linked in: mlx5_core(+) vmx_crypto pseries_wdt papr_scm libnvdimm mlxfw tls psample sunrpc fuse overlay squashfs loop CPU: 12 PID: 315 Comm: systemd-udevd Not tainted 6.4.0-Test102+ #12 Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NH1060_008) hv:phyp pSeries NIP: c000000020847ad4 LR: c00000002083b2dc CTR: 00000000006cd18c REGS: c000000029162ca0 TRAP: 0300 Not tainted (6.4.0-Test102+) MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 48288244 XER: 00000008 CFAR: c00000002083b2d8 DAR: 0000000000000000 DSISR: 40000000 IRQMASK: 1 ... NIP _find_next_zero_bit+0x24/0x110 LR bitmap_find_next_zero_area_off+0x5c/0xe0 Call Trace: dev_printk_emit+0x38/0x48 (unreliable) iommu_area_alloc+0xc4/0x180 iommu_range_alloc+0x1e8/0x580 iommu_alloc+0x60/0x130 iommu_alloc_coherent+0x158/0x2b0 dma_iommu_alloc_coherent+0x3c/0x50 dma_alloc_attrs+0x170/0x1f0 mlx5_cmd_init+0xc0/0x760 [mlx5_core] mlx5_function_setup+0xf0/0x510 [mlx5_core] mlx5_init_one+0x84/0x210 [mlx5_core] probe_one+0x118/0x2c0 [mlx5_core] local_pci_probe+0x68/0x110 pci_call_probe+0x68/0x200 pci_device_probe+0xbc/0x1a0 really_probe+0x104/0x540 __driver_probe_device+0xb4/0x230 driver_probe_device+0x54/0x130 __driver_attach+0x158/0x2b0 bus_for_each_dev+0xa8/0x130 driver_attach+0x34/0x50 bus_add_driver+0x16c/0x300 driver_register+0xa4/0x1b0 __pci_register_driver+0x68/0x80 mlx5_init+0xb8/0x100 [mlx5_core] do_one_initcall+0x60/0x300 do_init_module+0x7c/0x2b0 At the time of LPAR dump, before kexec hands over control to kdump kernel, DDWs (Dynamic DMA Windows) are scanned and added to the FDT. For the SR-IOV case, default DMA window "ibm,dma-window" is removed from the FDT and DDW added, for the device. Now, kexec hands over control to the kdump kernel. When the kdump kernel initializes, PCI busses are scanned and IOMMU group/tables created, in pci_dma_bus_setup_pSeriesLP(). For the SR-IOV case, there is no "ibm,dma-window". The original commit: b1fc44e, fixes the path where memory is pre-mapped (direct mapped) to the DDW. When TCEs are direct mapped, there is no need to initialize IOMMU tables. iommu_table_setparms_lpar() only considers "ibm,dma-window" property when initiallizing IOMMU table. In the scenario where TCEs are dynamically allocated for SR-IOV, newly created IOMMU table is not initialized. Later, when the device driver tries to enter TCEs for the SR-IOV device, NULL pointer execption is thrown from iommu_area_alloc(). The fix is to initialize the IOMMU table with DDW property stored in the FDT. There are 2 points to remember: 1. For the dedicated adapter, kdump kernel would encounter both default and DDW in FDT. In this case, DDW property is used to initialize the IOMMU table. 2. A DDW could be direct or dynamic mapped. kdump kernel would initialize IOMMU table and mark the existing DDW as "dynamic". This works fine since, at the time of table initialization, iommu_table_clear() makes some space in the DDW, for some predefined number of TCEs which are needed for kdump to succeed. Fixes: b1fc44e ("pseries/iommu/ddw: Fix kdump to work in absence of ibm,dma-window") Signed-off-by: Gaurav Batra <[email protected]> Reviewed-by: Brian King <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Jun 27, 2024
commit fa765c4 upstream. shutdown_pirq and startup_pirq are not taking the irq_mapping_update_lock because they can't due to lock inversion. Both are called with the irq_desc->lock being taking. The lock order, however, is first irq_mapping_update_lock and then irq_desc->lock. This opens multiple races: - shutdown_pirq can be interrupted by a function that allocates an event channel: CPU0 CPU1 shutdown_pirq { xen_evtchn_close(e) __startup_pirq { EVTCHNOP_bind_pirq -> returns just freed evtchn e set_evtchn_to_irq(e, irq) } xen_irq_info_cleanup() { set_evtchn_to_irq(e, -1) } } Assume here event channel e refers here to the same event channel number. After this race the evtchn_to_irq mapping for e is invalid (-1). - __startup_pirq races with __unbind_from_irq in a similar way. Because __startup_pirq doesn't take irq_mapping_update_lock it can grab the evtchn that __unbind_from_irq is currently freeing and cleaning up. In this case even though the event channel is allocated, its mapping can be unset in evtchn_to_irq. The fix is to first cleanup the mappings and then close the event channel. In this way, when an event channel gets allocated it's potential previous evtchn_to_irq mappings are guaranteed to be unset already. This is also the reverse order of the allocation where first the event channel is allocated and then the mappings are setup. On a 5.10 kernel prior to commit 3fcdaf3 ("xen/events: modify internal [un]bind interfaces"), we hit a BUG like the following during probing of NVMe devices. The issue is that during nvme_setup_io_queues, pci_free_irq is called for every device which results in a call to shutdown_pirq. With many nvme devices it's therefore likely to hit this race during boot because there will be multiple calls to shutdown_pirq and startup_pirq are running potentially in parallel. ------------[ cut here ]------------ blkfront: xvda: barrier or flush: disabled; persistent grants: enabled; indirect descriptors: enabled; bounce buffer: enabled kernel BUG at drivers/xen/events/events_base.c:499! invalid opcode: 0000 [#1] SMP PTI CPU: 44 PID: 375 Comm: kworker/u257:23 Not tainted 5.10.201-191.748.amzn2.x86_64 #1 Hardware name: Xen HVM domU, BIOS 4.11.amazon 08/24/2006 Workqueue: nvme-reset-wq nvme_reset_work RIP: 0010:bind_evtchn_to_cpu+0xdf/0xf0 Code: 5d 41 5e c3 cc cc cc cc 44 89 f7 e8 2b 55 ad ff 49 89 c5 48 85 c0 0f 84 64 ff ff ff 4c 8b 68 30 41 83 fe ff 0f 85 60 ff ff ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 RSP: 0000:ffffc9000d533b08 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006 RDX: 0000000000000028 RSI: 00000000ffffffff RDI: 00000000ffffffff RBP: ffff888107419680 R08: 0000000000000000 R09: ffffffff82d72b00 R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000001ed R13: 0000000000000000 R14: 00000000ffffffff R15: 0000000000000002 FS: 0000000000000000(0000) GS:ffff88bc8b500000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000002610001 CR4: 00000000001706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? show_trace_log_lvl+0x1c1/0x2d9 ? show_trace_log_lvl+0x1c1/0x2d9 ? set_affinity_irq+0xdc/0x1c0 ? __die_body.cold+0x8/0xd ? die+0x2b/0x50 ? do_trap+0x90/0x110 ? bind_evtchn_to_cpu+0xdf/0xf0 ? do_error_trap+0x65/0x80 ? bind_evtchn_to_cpu+0xdf/0xf0 ? exc_invalid_op+0x4e/0x70 ? bind_evtchn_to_cpu+0xdf/0xf0 ? asm_exc_invalid_op+0x12/0x20 ? bind_evtchn_to_cpu+0xdf/0xf0 ? bind_evtchn_to_cpu+0xc5/0xf0 set_affinity_irq+0xdc/0x1c0 irq_do_set_affinity+0x1d7/0x1f0 irq_setup_affinity+0xd6/0x1a0 irq_startup+0x8a/0xf0 __setup_irq+0x639/0x6d0 ? nvme_suspend+0x150/0x150 request_threaded_irq+0x10c/0x180 ? nvme_suspend+0x150/0x150 pci_request_irq+0xa8/0xf0 ? __blk_mq_free_request+0x74/0xa0 queue_request_irq+0x6f/0x80 nvme_create_queue+0x1af/0x200 nvme_create_io_queues+0xbd/0xf0 nvme_setup_io_queues+0x246/0x320 ? nvme_irq_check+0x30/0x30 nvme_reset_work+0x1c8/0x400 process_one_work+0x1b0/0x350 worker_thread+0x49/0x310 ? process_one_work+0x350/0x350 kthread+0x11b/0x140 ? __kthread_bind_mask+0x60/0x60 ret_from_fork+0x22/0x30 Modules linked in: ---[ end trace a11715de1eee1873 ]--- Fixes: d46a78b ("xen: implement pirq type event channels") Cc: [email protected] Co-debugged-by: Andrew Panyakin <[email protected]> Signed-off-by: Maximilian Heyne <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Maximilian Heyne <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Jun 27, 2024
[ Upstream commit e6a7df9 ] The change try to fix below error specific to RV platform: BUG: kernel NULL pointer dereference, address: 0000000000000008 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 4 PID: 917 Comm: sway Not tainted 6.3.9-arch1-1 #1 124dc55df4f5272ccb409f39ef4872fc2b3376a2 Hardware name: LENOVO 20NKS01Y00/20NKS01Y00, BIOS R12ET61W(1.31 ) 07/28/2022 RIP: 0010:drm_dp_atomic_find_time_slots+0x5e/0x260 [drm_display_helper] Code: 01 00 00 48 8b 85 60 05 00 00 48 63 80 88 00 00 00 3b 43 28 0f 8d 2e 01 00 00 48 8b 53 30 48 8d 04 80 48 8d 04 c2 48 8b 40 18 <48> 8> RSP: 0018:ffff960cc2df77d8 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffff8afb87e81280 RCX: 0000000000000224 RDX: ffff8afb9ee37c00 RSI: ffff8afb8da1a578 RDI: ffff8afb87e81280 RBP: ffff8afb83d67000 R08: 0000000000000001 R09: ffff8afb9652f850 R10: ffff960cc2df7908 R11: 0000000000000002 R12: 0000000000000000 R13: ffff8afb8d7688a0 R14: ffff8afb8da1a578 R15: 0000000000000224 FS: 00007f4dac35ce00(0000) GS:ffff8afe30b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 000000010ddc6000 CR4: 00000000003506e0 Call Trace: <TASK> ? __die+0x23/0x70 ? page_fault_oops+0x171/0x4e0 ? plist_add+0xbe/0x100 ? exc_page_fault+0x7c/0x180 ? asm_exc_page_fault+0x26/0x30 ? drm_dp_atomic_find_time_slots+0x5e/0x260 [drm_display_helper 0e67723696438d8e02b741593dd50d80b44c2026] ? drm_dp_atomic_find_time_slots+0x28/0x260 [drm_display_helper 0e67723696438d8e02b741593dd50d80b44c2026] compute_mst_dsc_configs_for_link+0x2ff/0xa40 [amdgpu 62e600d2a75e9158e1cd0a243bdc8e6da040c054] ? fill_plane_buffer_attributes+0x419/0x510 [amdgpu 62e600d2a75e9158e1cd0a243bdc8e6da040c054] compute_mst_dsc_configs_for_state+0x1e1/0x250 [amdgpu 62e600d2a75e9158e1cd0a243bdc8e6da040c054] amdgpu_dm_atomic_check+0xecd/0x1190 [amdgpu 62e600d2a75e9158e1cd0a243bdc8e6da040c054] drm_atomic_check_only+0x5c5/0xa40 drm_mode_atomic_ioctl+0x76e/0xbc0 ? _copy_to_user+0x25/0x30 ? drm_ioctl+0x296/0x4b0 ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 drm_ioctl_kernel+0xcd/0x170 drm_ioctl+0x26d/0x4b0 ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 amdgpu_drm_ioctl+0x4e/0x90 [amdgpu 62e600d2a75e9158e1cd0a243bdc8e6da040c054] __x64_sys_ioctl+0x94/0xd0 do_syscall_64+0x60/0x90 ? do_syscall_64+0x6c/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc RIP: 0033:0x7f4dad17f76f Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c> RSP: 002b:00007ffd9ae859f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 000055e255a55900 RCX: 00007f4dad17f76f RDX: 00007ffd9ae85a90 RSI: 00000000c03864bc RDI: 000000000000000b RBP: 00007ffd9ae85a90 R08: 0000000000000003 R09: 0000000000000003 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c03864bc R13: 000000000000000b R14: 000055e255a7fc60 R15: 000055e255a01eb0 </TASK> Modules linked in: rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device ccm cmac algif_hash algif_skcipher af_alg joydev mousedev bnep > typec libphy k10temp ipmi_msghandler roles i2c_scmi acpi_cpufreq mac_hid nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_mas> CR2: 0000000000000008 ---[ end trace 0000000000000000 ]--- RIP: 0010:drm_dp_atomic_find_time_slots+0x5e/0x260 [drm_display_helper] Code: 01 00 00 48 8b 85 60 05 00 00 48 63 80 88 00 00 00 3b 43 28 0f 8d 2e 01 00 00 48 8b 53 30 48 8d 04 80 48 8d 04 c2 48 8b 40 18 <48> 8> RSP: 0018:ffff960cc2df77d8 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffff8afb87e81280 RCX: 0000000000000224 RDX: ffff8afb9ee37c00 RSI: ffff8afb8da1a578 RDI: ffff8afb87e81280 RBP: ffff8afb83d67000 R08: 0000000000000001 R09: ffff8afb9652f850 R10: ffff960cc2df7908 R11: 0000000000000002 R12: 0000000000000000 R13: ffff8afb8d7688a0 R14: ffff8afb8da1a578 R15: 0000000000000224 FS: 00007f4dac35ce00(0000) GS:ffff8afe30b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 000000010ddc6000 CR4: 00000000003506e0 With a second DP monitor connected, drm_atomic_state in dm atomic check sequence does not include the connector state for the old/existing/first DP monitor. In such case, dsc determination policy would hit a null ptr when it tries to iterate the old/existing stream that does not have a valid connector state attached to it. When that happens, dm atomic check should call drm_atomic_get_connector_state for a new connector state. Existing dm has already done that, except for RV due to it does not have official support of dsc where .num_dsc is not defined in dcn10 resource cap, that prevent from getting drm_atomic_get_connector_state called. So, skip dsc determination policy for ASICs that don't have DSC support. Cc: [email protected] # 6.1+ Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2314 Reviewed-by: Wayne Lin <[email protected]> Acked-by: Hamza Mahfooz <[email protected]> Signed-off-by: Fangzhi Zuo <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Jun 27, 2024
[ Upstream commit 32019c6 ] When trying to use copy_from_kernel_nofault() to read vsyscall page through a bpf program, the following oops was reported: BUG: unable to handle page fault for address: ffffffffff600000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 3231067 P4D 3231067 PUD 3233067 PMD 3235067 PTE 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 20390 Comm: test_progs ...... 6.7.0+ #58 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ...... RIP: 0010:copy_from_kernel_nofault+0x6f/0x110 ...... Call Trace: <TASK> ? copy_from_kernel_nofault+0x6f/0x110 bpf_probe_read_kernel+0x1d/0x50 bpf_prog_2061065e56845f08_do_probe_read+0x51/0x8d trace_call_bpf+0xc5/0x1c0 perf_call_bpf_enter.isra.0+0x69/0xb0 perf_syscall_enter+0x13e/0x200 syscall_trace_enter+0x188/0x1c0 do_syscall_64+0xb5/0xe0 entry_SYSCALL_64_after_hwframe+0x6e/0x76 </TASK> ...... ---[ end trace 0000000000000000 ]--- The oops is triggered when: 1) A bpf program uses bpf_probe_read_kernel() to read from the vsyscall page and invokes copy_from_kernel_nofault() which in turn calls __get_user_asm(). 2) Because the vsyscall page address is not readable from kernel space, a page fault exception is triggered accordingly. 3) handle_page_fault() considers the vsyscall page address as a user space address instead of a kernel space address. This results in the fix-up setup by bpf not being applied and a page_fault_oops() is invoked due to SMAP. Considering handle_page_fault() has already considered the vsyscall page address as a userspace address, fix the problem by disallowing vsyscall page read for copy_from_kernel_nofault(). Originally-by: Thomas Gleixner <[email protected]> Reported-by: [email protected] Closes: https://lore.kernel.org/bpf/CAG48ez06TZft=ATH1qh2c5mpS5BT8UakwNkzi6nvK5_djC-4Nw@mail.gmail.com Reported-by: xingwei lee <[email protected]> Closes: https://lore.kernel.org/bpf/CABOYnLynjBoFZOf3Z4BhaZkc5hx_kHfsjiW+UWLoB=w33LvScw@mail.gmail.com Signed-off-by: Hou Tao <[email protected]> Reviewed-by: Sohil Mehta <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Jun 27, 2024
[ Upstream commit 9636951e4468f02c72cc75a82dc65d003077edbc ] When QoS is disabled, the queue priority value will not map to the correct ieee80211 queue since there is only one queue. Stop/wake queue 0 when QoS is disabled to prevent trying to stop/wake a non-existent queue and failing to stop/wake the actual queue instantiated. Log of issue before change (with kernel parameter qos=0): [ +5.112651] ------------[ cut here ]------------ [ +0.000005] WARNING: CPU: 7 PID: 25513 at net/mac80211/util.c:449 __ieee80211_wake_queue+0xd5/0x180 [mac80211] [ +0.000067] Modules linked in: b43(O) snd_seq_dummy snd_hrtimer snd_seq snd_seq_device nft_chain_nat xt_MASQUERADE nf_nat xfrm_user xfrm_algo xt_addrtype overlay ccm af_packet amdgpu snd_hda_codec_cirrus snd_hda_codec_generic ledtrig_audio drm_exec amdxcp gpu_sched xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6t_rpfilter ipt_rpfilter xt_pkttype xt_LOG nf_log_syslog xt_tcpudp nft_compat nf_tables nfnetlink sch_fq_codel btusb uinput iTCO_wdt ctr btrtl intel_pmc_bxt i915 intel_rapl_msr mei_hdcp mei_pxp joydev at24 watchdog btintel atkbd libps2 serio radeon btbcm vivaldi_fmap btmtk intel_rapl_common snd_hda_codec_hdmi bluetooth uvcvideo nls_iso8859_1 applesmc nls_cp437 x86_pkg_temp_thermal snd_hda_intel intel_powerclamp vfat videobuf2_vmalloc coretemp fat snd_intel_dspcfg crc32_pclmul uvc polyval_clmulni snd_intel_sdw_acpi loop videobuf2_memops snd_hda_codec tun drm_suballoc_helper polyval_generic drm_ttm_helper drm_buddy tap ecdh_generic videobuf2_v4l2 gf128mul macvlan ttm ghash_clmulni_intel ecc tg3 [ +0.000044] videodev bridge snd_hda_core rapl crc16 drm_display_helper cec mousedev snd_hwdep evdev intel_cstate bcm5974 hid_appleir videobuf2_common stp mac_hid libphy snd_pcm drm_kms_helper acpi_als mei_me intel_uncore llc mc snd_timer intel_gtt industrialio_triggered_buffer apple_mfi_fastcharge i2c_i801 mei snd lpc_ich agpgart ptp i2c_smbus thunderbolt apple_gmux i2c_algo_bit kfifo_buf video industrialio soundcore pps_core wmi tiny_power_button sbs sbshc button ac cordic bcma mac80211 cfg80211 ssb rfkill libarc4 kvm_intel kvm drm irqbypass fuse backlight firmware_class efi_pstore configfs efivarfs dmi_sysfs ip_tables x_tables autofs4 dm_crypt cbc encrypted_keys trusted asn1_encoder tee tpm rng_core input_leds hid_apple led_class hid_generic usbhid hid sd_mod t10_pi crc64_rocksoft crc64 crc_t10dif crct10dif_generic ahci libahci libata uhci_hcd ehci_pci ehci_hcd crct10dif_pclmul crct10dif_common sha512_ssse3 sha512_generic sha256_ssse3 sha1_ssse3 aesni_intel usbcore scsi_mod libaes crypto_simd cryptd scsi_common [ +0.000055] usb_common rtc_cmos btrfs blake2b_generic libcrc32c crc32c_generic crc32c_intel xor raid6_pq dm_snapshot dm_bufio dm_mod dax [last unloaded: b43(O)] [ +0.000009] CPU: 7 PID: 25513 Comm: irq/17-b43 Tainted: G W O 6.6.7 #1-NixOS [ +0.000003] Hardware name: Apple Inc. MacBookPro8,3/Mac-942459F5819B171B, BIOS 87.0.0.0.0 06/13/2019 [ +0.000001] RIP: 0010:__ieee80211_wake_queue+0xd5/0x180 [mac80211] [ +0.000046] Code: 00 45 85 e4 0f 85 9b 00 00 00 48 8d bd 40 09 00 00 f0 48 0f ba ad 48 09 00 00 00 72 0f 5b 5d 41 5c 41 5d 41 5e e9 cb 6d 3c d0 <0f> 0b 5b 5d 41 5c 41 5d 41 5e c3 cc cc cc cc 48 8d b4 16 94 00 00 [ +0.000002] RSP: 0018:ffffc90003c77d60 EFLAGS: 00010097 [ +0.000001] RAX: 0000000000000001 RBX: 0000000000000002 RCX: 0000000000000000 [ +0.000001] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff88820b924900 [ +0.000002] RBP: ffff88820b924900 R08: ffffc90003c77d90 R09: 000000000003bfd0 [ +0.000001] R10: ffff88820b924900 R11: ffffc90003c77c68 R12: 0000000000000000 [ +0.000001] R13: 0000000000000000 R14: ffffc90003c77d90 R15: ffffffffc0fa6f40 [ +0.000001] FS: 0000000000000000(0000) GS:ffff88846fb80000(0000) knlGS:0000000000000000 [ +0.000001] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000001] CR2: 00007fafda7ae008 CR3: 000000046d220005 CR4: 00000000000606e0 [ +0.000002] Call Trace: [ +0.000003] <TASK> [ +0.000001] ? __ieee80211_wake_queue+0xd5/0x180 [mac80211] [ +0.000044] ? __warn+0x81/0x130 [ +0.000005] ? __ieee80211_wake_queue+0xd5/0x180 [mac80211] [ +0.000045] ? report_bug+0x171/0x1a0 [ +0.000004] ? handle_bug+0x41/0x70 [ +0.000004] ? exc_invalid_op+0x17/0x70 [ +0.000003] ? asm_exc_invalid_op+0x1a/0x20 [ +0.000005] ? __ieee80211_wake_queue+0xd5/0x180 [mac80211] [ +0.000043] ieee80211_wake_queue+0x4a/0x80 [mac80211] [ +0.000044] b43_dma_handle_txstatus+0x29c/0x3a0 [b43] [ +0.000016] ? __pfx_irq_thread_fn+0x10/0x10 [ +0.000002] b43_handle_txstatus+0x61/0x80 [b43] [ +0.000012] b43_interrupt_thread_handler+0x3f9/0x6b0 [b43] [ +0.000011] irq_thread_fn+0x23/0x60 [ +0.000002] irq_thread+0xfe/0x1c0 [ +0.000002] ? __pfx_irq_thread_dtor+0x10/0x10 [ +0.000001] ? __pfx_irq_thread+0x10/0x10 [ +0.000001] kthread+0xe8/0x120 [ +0.000003] ? __pfx_kthread+0x10/0x10 [ +0.000003] ret_from_fork+0x34/0x50 [ +0.000002] ? __pfx_kthread+0x10/0x10 [ +0.000002] ret_from_fork_asm+0x1b/0x30 [ +0.000004] </TASK> [ +0.000001] ---[ end trace 0000000000000000 ]--- [ +0.000065] ------------[ cut here ]------------ [ +0.000001] WARNING: CPU: 0 PID: 56077 at net/mac80211/util.c:514 __ieee80211_stop_queue+0xcc/0xe0 [mac80211] [ +0.000077] Modules linked in: b43(O) snd_seq_dummy snd_hrtimer snd_seq snd_seq_device nft_chain_nat xt_MASQUERADE nf_nat xfrm_user xfrm_algo xt_addrtype overlay ccm af_packet amdgpu snd_hda_codec_cirrus snd_hda_codec_generic ledtrig_audio drm_exec amdxcp gpu_sched xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6t_rpfilter ipt_rpfilter xt_pkttype xt_LOG nf_log_syslog xt_tcpudp nft_compat nf_tables nfnetlink sch_fq_codel btusb uinput iTCO_wdt ctr btrtl intel_pmc_bxt i915 intel_rapl_msr mei_hdcp mei_pxp joydev at24 watchdog btintel atkbd libps2 serio radeon btbcm vivaldi_fmap btmtk intel_rapl_common snd_hda_codec_hdmi bluetooth uvcvideo nls_iso8859_1 applesmc nls_cp437 x86_pkg_temp_thermal snd_hda_intel intel_powerclamp vfat videobuf2_vmalloc coretemp fat snd_intel_dspcfg crc32_pclmul uvc polyval_clmulni snd_intel_sdw_acpi loop videobuf2_memops snd_hda_codec tun drm_suballoc_helper polyval_generic drm_ttm_helper drm_buddy tap ecdh_generic videobuf2_v4l2 gf128mul macvlan ttm ghash_clmulni_intel ecc tg3 [ +0.000073] videodev bridge snd_hda_core rapl crc16 drm_display_helper cec mousedev snd_hwdep evdev intel_cstate bcm5974 hid_appleir videobuf2_common stp mac_hid libphy snd_pcm drm_kms_helper acpi_als mei_me intel_uncore llc mc snd_timer intel_gtt industrialio_triggered_buffer apple_mfi_fastcharge i2c_i801 mei snd lpc_ich agpgart ptp i2c_smbus thunderbolt apple_gmux i2c_algo_bit kfifo_buf video industrialio soundcore pps_core wmi tiny_power_button sbs sbshc button ac cordic bcma mac80211 cfg80211 ssb rfkill libarc4 kvm_intel kvm drm irqbypass fuse backlight firmware_class efi_pstore configfs efivarfs dmi_sysfs ip_tables x_tables autofs4 dm_crypt cbc encrypted_keys trusted asn1_encoder tee tpm rng_core input_leds hid_apple led_class hid_generic usbhid hid sd_mod t10_pi crc64_rocksoft crc64 crc_t10dif crct10dif_generic ahci libahci libata uhci_hcd ehci_pci ehci_hcd crct10dif_pclmul crct10dif_common sha512_ssse3 sha512_generic sha256_ssse3 sha1_ssse3 aesni_intel usbcore scsi_mod libaes crypto_simd cryptd scsi_common [ +0.000084] usb_common rtc_cmos btrfs blake2b_generic libcrc32c crc32c_generic crc32c_intel xor raid6_pq dm_snapshot dm_bufio dm_mod dax [last unloaded: b43] [ +0.000012] CPU: 0 PID: 56077 Comm: kworker/u16:17 Tainted: G W O 6.6.7 #1-NixOS [ +0.000003] Hardware name: Apple Inc. MacBookPro8,3/Mac-942459F5819B171B, BIOS 87.0.0.0.0 06/13/2019 [ +0.000001] Workqueue: phy7 b43_tx_work [b43] [ +0.000019] RIP: 0010:__ieee80211_stop_queue+0xcc/0xe0 [mac80211] [ +0.000076] Code: 74 11 48 8b 78 08 0f b7 d6 89 e9 4c 89 e6 e8 ab f4 00 00 65 ff 0d 9c b7 34 3f 0f 85 55 ff ff ff 0f 1f 44 00 00 e9 4b ff ff ff <0f> 0b 5b 5d 41 5c 41 5d c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 [ +0.000002] RSP: 0000:ffffc90004157d50 EFLAGS: 00010097 [ +0.000002] RAX: 0000000000000001 RBX: 0000000000000002 RCX: 0000000000000000 [ +0.000002] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff8882d65d0900 [ +0.000002] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000001 [ +0.000001] R10: 00000000000000ff R11: ffff88814d0155a0 R12: ffff8882d65d0900 [ +0.000002] R13: 0000000000000000 R14: ffff8881002d2800 R15: 00000000000000d0 [ +0.000002] FS: 0000000000000000(0000) GS:ffff88846f800000(0000) knlGS:0000000000000000 [ +0.000003] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000002] CR2: 00007f2e8c10c880 CR3: 0000000385b66005 CR4: 00000000000606f0 [ +0.000002] Call Trace: [ +0.000001] <TASK> [ +0.000001] ? __ieee80211_stop_queue+0xcc/0xe0 [mac80211] [ +0.000075] ? __warn+0x81/0x130 [ +0.000004] ? __ieee80211_stop_queue+0xcc/0xe0 [mac80211] [ +0.000075] ? report_bug+0x171/0x1a0 [ +0.000005] ? handle_bug+0x41/0x70 [ +0.000003] ? exc_invalid_op+0x17/0x70 [ +0.000004] ? asm_exc_invalid_op+0x1a/0x20 [ +0.000004] ? __ieee80211_stop_queue+0xcc/0xe0 [mac80211] [ +0.000076] ieee80211_stop_queue+0x36/0x50 [mac80211] [ +0.000077] b43_dma_tx+0x550/0x780 [b43] [ +0.000023] b43_tx_work+0x90/0x130 [b43] [ +0.000018] process_one_work+0x174/0x340 [ +0.000003] worker_thread+0x27b/0x3a0 [ +0.000004] ? __pfx_worker_thread+0x10/0x10 [ +0.000002] kthread+0xe8/0x120 [ +0.000003] ? __pfx_kthread+0x10/0x10 [ +0.000004] ret_from_fork+0x34/0x50 [ +0.000002] ? __pfx_kthread+0x10/0x10 [ +0.000003] ret_from_fork_asm+0x1b/0x30 [ +0.000006] </TASK> [ +0.000001] ---[ end trace 0000000000000000 ]--- Fixes: e6f5b93 ("b43: Add QOS support") Signed-off-by: Rahul Rameshbabu <[email protected]> Reviewed-by: Julian Calaby <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Jun 27, 2024
[ Upstream commit f1d71576d2c9ec8fdb822173fa7f3de79475e9bd ] When the generic SCMI code tears down a channel, it calls the chan_free callback function, defined by each transport. Since multiple protocols might share the same transport_info member, chan_free() might want to clean up the same member multiple times within the given SCMI transport implementation. In this case, it is SMC transport. This will lead to a NULL pointer dereference at the second time: | scmi_protocol scmi_dev.1: Enabled polling mode TX channel - prot_id:16 | arm-scmi firmware:scmi: SCMI Notifications - Core Enabled. | arm-scmi firmware:scmi: unable to communicate with SCMI | Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 | Mem abort info: | ESR = 0x0000000096000004 | EC = 0x25: DABT (current EL), IL = 32 bits | SET = 0, FnV = 0 | EA = 0, S1PTW = 0 | FSC = 0x04: level 0 translation fault | Data abort info: | ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 | CM = 0, WnR = 0, TnD = 0, TagAccess = 0 | GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 | user pgtable: 4k pages, 48-bit VAs, pgdp=0000000881ef8000 | [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 | Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP | Modules linked in: | CPU: 4 PID: 1 Comm: swapper/0 Not tainted 6.7.0-rc2-00124-g455ef3d016c9-dirty #793 | Hardware name: FVP Base RevC (DT) | pstate: 61400009 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) | pc : smc_chan_free+0x3c/0x6c | lr : smc_chan_free+0x3c/0x6c | Call trace: | smc_chan_free+0x3c/0x6c | idr_for_each+0x68/0xf8 | scmi_cleanup_channels.isra.0+0x2c/0x58 | scmi_probe+0x434/0x734 | platform_probe+0x68/0xd8 | really_probe+0x110/0x27c | __driver_probe_device+0x78/0x12c | driver_probe_device+0x3c/0x118 | __driver_attach+0x74/0x128 | bus_for_each_dev+0x78/0xe0 | driver_attach+0x24/0x30 | bus_add_driver+0xe4/0x1e8 | driver_register+0x60/0x128 | __platform_driver_register+0x28/0x34 | scmi_driver_init+0x84/0xc0 | do_one_initcall+0x78/0x33c | kernel_init_freeable+0x2b8/0x51c | kernel_init+0x24/0x130 | ret_from_fork+0x10/0x20 | Code: f0004701 910a0021 aa1403e5 97b91c70 (b9400280) | ---[ end trace 0000000000000000 ]--- Simply check for the struct pointer being NULL before trying to access its members, to avoid this situation. This was found when a transport doesn't really work (for instance no SMC service), the probe routines then tries to clean up, and triggers a crash. Signed-off-by: Andre Przywara <[email protected]> Fixes: 1dc6558 ("firmware: arm_scmi: Add smc/hvc transport") Reviewed-by: Cristian Marussi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sudeep Holla <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Jun 27, 2024
[ Upstream commit 65e8fbde64520001abf1c8d0e573561b4746ef38 ] There is this reported crash when experimenting with the lvm2 testsuite. The list corruption is caused by the fact that the postsuspend and resume methods were not paired correctly; there were two consecutive calls to the origin_postsuspend function. The second call attempts to remove the "hash_list" entry from a list, while it was already removed by the first call. Fix __dm_internal_resume so that it calls the preresume and resume methods of the table's targets. If a preresume method of some target fails, we are in a tricky situation. We can't return an error because dm_internal_resume isn't supposed to return errors. We can't return success, because then the "resume" and "postsuspend" methods would not be paired correctly. So, we set the DMF_SUSPENDED flag and we fake normal suspend - it may confuse userspace tools, but it won't cause a kernel crash. ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:56! invalid opcode: 0000 [#1] PREEMPT SMP CPU: 1 PID: 8343 Comm: dmsetup Not tainted 6.8.0-rc6 #4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 RIP: 0010:__list_del_entry_valid_or_report+0x77/0xc0 <snip> RSP: 0018:ffff8881b831bcc0 EFLAGS: 00010282 RAX: 000000000000004e RBX: ffff888143b6eb80 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffffffff819053d0 RDI: 00000000ffffffff RBP: ffff8881b83a3400 R08: 00000000fffeffff R09: 0000000000000058 R10: 0000000000000000 R11: ffffffff81a24080 R12: 0000000000000001 R13: ffff88814538e000 R14: ffff888143bc6dc0 R15: ffffffffa02e4bb0 FS: 00000000f7c0f780(0000) GS:ffff8893f0a40000(0000) knlGS:0000000000000000 CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 CR2: 0000000057fb5000 CR3: 0000000143474000 CR4: 00000000000006b0 Call Trace: <TASK> ? die+0x2d/0x80 ? do_trap+0xeb/0xf0 ? __list_del_entry_valid_or_report+0x77/0xc0 ? do_error_trap+0x60/0x80 ? __list_del_entry_valid_or_report+0x77/0xc0 ? exc_invalid_op+0x49/0x60 ? __list_del_entry_valid_or_report+0x77/0xc0 ? asm_exc_invalid_op+0x16/0x20 ? table_deps+0x1b0/0x1b0 [dm_mod] ? __list_del_entry_valid_or_report+0x77/0xc0 origin_postsuspend+0x1a/0x50 [dm_snapshot] dm_table_postsuspend_targets+0x34/0x50 [dm_mod] dm_suspend+0xd8/0xf0 [dm_mod] dev_suspend+0x1f2/0x2f0 [dm_mod] ? table_deps+0x1b0/0x1b0 [dm_mod] ctl_ioctl+0x300/0x5f0 [dm_mod] dm_compat_ctl_ioctl+0x7/0x10 [dm_mod] __x64_compat_sys_ioctl+0x104/0x170 do_syscall_64+0x184/0x1b0 entry_SYSCALL_64_after_hwframe+0x46/0x4e RIP: 0033:0xf7e6aead <snip> ---[ end trace 0000000000000000 ]--- Fixes: ffcc393 ("dm: enhance internal suspend and resume interface") Signed-off-by: Mikulas Patocka <[email protected]> Signed-off-by: Mike Snitzer <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Jun 27, 2024
[ Upstream commit 251a658bbfceafb4d58c76b77682c8bf7bcfad65 ] A call to listxattr() with a buffer size = 0 returns the actual size of the buffer needed for a subsequent call. When size > 0, nfs4_listxattr() does not return an error because either generic_listxattr() or nfs4_listxattr_nfs4_label() consumes exactly all the bytes then size is 0 when calling nfs4_listxattr_nfs4_user() which then triggers the following kernel BUG: [ 99.403778] kernel BUG at mm/usercopy.c:102! [ 99.404063] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP [ 99.408463] CPU: 0 PID: 3310 Comm: python3 Not tainted 6.6.0-61.fc40.aarch64 #1 [ 99.415827] Call trace: [ 99.415985] usercopy_abort+0x70/0xa0 [ 99.416227] __check_heap_object+0x134/0x158 [ 99.416505] check_heap_object+0x150/0x188 [ 99.416696] __check_object_size.part.0+0x78/0x168 [ 99.416886] __check_object_size+0x28/0x40 [ 99.417078] listxattr+0x8c/0x120 [ 99.417252] path_listxattr+0x78/0xe0 [ 99.417476] __arm64_sys_listxattr+0x28/0x40 [ 99.417723] invoke_syscall+0x78/0x100 [ 99.417929] el0_svc_common.constprop.0+0x48/0xf0 [ 99.418186] do_el0_svc+0x24/0x38 [ 99.418376] el0_svc+0x3c/0x110 [ 99.418554] el0t_64_sync_handler+0x120/0x130 [ 99.418788] el0t_64_sync+0x194/0x198 [ 99.418994] Code: aa0003e3 d000a3e0 91310000 97f49bdb (d4210000) Issue is reproduced when generic_listxattr() returns 'system.nfs4_acl', thus calling lisxattr() with size = 16 will trigger the bug. Add check on nfs4_listxattr() to return ERANGE error when it is called with size > 0 and the return value is greater than size. Fixes: 012a211 ("NFSv4.2: hook in the user extended attribute handlers") Signed-off-by: Jorge Mora <[email protected]> Reviewed-by: Benjamin Coddington <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Jun 27, 2024
[ Upstream commit d27e2da94a42655861ca4baea30c8cd65546f25d ] Fix race condition leading to system crash during EEH error handling During EEH error recovery, the bnx2x driver's transmit timeout logic could cause a race condition when handling reset tasks. The bnx2x_tx_timeout() schedules reset tasks via bnx2x_sp_rtnl_task(), which ultimately leads to bnx2x_nic_unload(). In bnx2x_nic_unload() SGEs are freed using bnx2x_free_rx_sge_range(). However, this could overlap with the EEH driver's attempt to reset the device using bnx2x_io_slot_reset(), which also tries to free SGEs. This race condition can result in system crashes due to accessing freed memory locations in bnx2x_free_rx_sge() 799 static inline void bnx2x_free_rx_sge(struct bnx2x *bp, 800 struct bnx2x_fastpath *fp, u16 index) 801 { 802 struct sw_rx_page *sw_buf = &fp->rx_page_ring[index]; 803 struct page *page = sw_buf->page; .... where sw_buf was set to NULL after the call to dma_unmap_page() by the preceding thread. EEH: Beginning: 'slot_reset' PCI 0011:01:00.0#10000: EEH: Invoking bnx2x->slot_reset() bnx2x: [bnx2x_io_slot_reset:14228(eth1)]IO slot reset initializing... bnx2x 0011:01:00.0: enabling device (0140 -> 0142) bnx2x: [bnx2x_io_slot_reset:14244(eth1)]IO slot reset --> driver unload Kernel attempted to read user page (0) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x00000000 Faulting instruction address: 0xc0080000025065fc Oops: Kernel access of bad area, sig: 11 [#1] ..... Call Trace: [c000000003c67a20] [c00800000250658c] bnx2x_io_slot_reset+0x204/0x610 [bnx2x] (unreliable) [c000000003c67af0] [c0000000000518a8] eeh_report_reset+0xb8/0xf0 [c000000003c67b60] [c000000000052130] eeh_pe_report+0x180/0x550 [c000000003c67c70] [c00000000005318c] eeh_handle_normal_event+0x84c/0xa60 [c000000003c67d50] [c000000000053a84] eeh_event_handler+0xf4/0x170 [c000000003c67da0] [c000000000194c58] kthread+0x1c8/0x1d0 [c000000003c67e10] [c00000000000cf64] ret_from_kernel_thread+0x5c/0x64 To solve this issue, we need to verify page pool allocations before freeing. Fixes: 4cace67 ("bnx2x: Alloc 4k fragment for each rx ring buffer element") Signed-off-by: Thinh Tran <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Jul 2, 2024
[ Upstream commit 984af1e ] echo max of u64 to cost.model can cause divide by 0 error. # echo 8:0 rbps=18446744073709551615 > /sys/fs/cgroup/io.cost.model divide error: 0000 [#1] PREEMPT SMP RIP: 0010:calc_lcoefs+0x4c/0xc0 Call Trace: <TASK> ioc_refresh_params+0x2b3/0x4f0 ioc_cost_model_write+0x3cb/0x4c0 ? _copy_from_iter+0x6d/0x6c0 ? kernfs_fop_write_iter+0xfc/0x270 cgroup_file_write+0xa0/0x200 kernfs_fop_write_iter+0x17d/0x270 vfs_write+0x414/0x620 ksys_write+0x73/0x160 __x64_sys_write+0x1e/0x30 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd calc_lcoefs() uses the input value of cost.model in DIV_ROUND_UP_ULL, overflow would happen if bps plus IOC_PAGE_SIZE is greater than ULLONG_MAX, it can cause divide by 0 error. Fix the problem by setting basecost Signed-off-by: Li Nan <[email protected]> Signed-off-by: Yu Kuai <[email protected]> Acked-by: Tejun Heo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Jul 2, 2024
[ Upstream commit 984af1e ] echo max of u64 to cost.model can cause divide by 0 error. # echo 8:0 rbps=18446744073709551615 > /sys/fs/cgroup/io.cost.model divide error: 0000 [#1] PREEMPT SMP RIP: 0010:calc_lcoefs+0x4c/0xc0 Call Trace: <TASK> ioc_refresh_params+0x2b3/0x4f0 ioc_cost_model_write+0x3cb/0x4c0 ? _copy_from_iter+0x6d/0x6c0 ? kernfs_fop_write_iter+0xfc/0x270 cgroup_file_write+0xa0/0x200 kernfs_fop_write_iter+0x17d/0x270 vfs_write+0x414/0x620 ksys_write+0x73/0x160 __x64_sys_write+0x1e/0x30 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd calc_lcoefs() uses the input value of cost.model in DIV_ROUND_UP_ULL, overflow would happen if bps plus IOC_PAGE_SIZE is greater than ULLONG_MAX, it can cause divide by 0 error. Fix the problem by setting basecost Signed-off-by: Li Nan <[email protected]> Signed-off-by: Yu Kuai <[email protected]> Acked-by: Tejun Heo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Jul 2, 2024
Tested on PocketBeagle + ATUSB + CC1352R SensorTag: debian@beaglebone:~$ uname -a Linux beaglebone 5.8.18-bone23 #1xross PREEMPT Tue Dec 15 15:41:20 IST 2020 armv7l GNU/Linux All drivers probed correctly as per manifest, but a kernel panic occurs after sometime: debian@beaglebone:~$ dmesg [ 218.793223] greybus 1-svc: no primary interface detected on module 2 [ 218.876678] greybus 1-2.2: Interface added (greybus) [ 218.876707] greybus 1-2.2: GMP VID=0x00000126, PID=0x00000126 [ 218.876739] greybus 1-2.2: DDBL1 Manufacturer=0x00000126, Product=0x00000126 [ 219.296368] greybus 1-2.2: excess descriptors in interface manifest [ 220.922786] mikrobus:mikrobus_port_gb_register: mikrobus gb_probe , num cports= 3 [ 220.922793] mikrobus:mikrobus_port_gb_register: protocol added 11 [ 220.922824] mikrobus:mikrobus_port_gb_register: protocol added 3 [ 220.922829] mikrobus:mikrobus_port_gb_register: protocol added 2 [ 220.922896] mikrobus:mikrobus_port_register: registering port mikrobus-1 [ 220.931490] mikrobus_manifest:mikrobus_manifest_attach_device: parsed device 1, driver=bme280, protocol=3, reg=76 [ 220.931526] mikrobus_manifest:mikrobus_manifest_attach_device: parsed device 2, driver=opt3001, protocol=3, reg=44 [ 220.931557] mikrobus_manifest:mikrobus_manifest_parse: Greybus Service Sample Application manifest parsed with 2 devices [ 220.931602] mikrobus mikrobus-1: registering device : bme280 [ 220.937043] mikrobus mikrobus-1: registering device : opt3001 [ 221.628678] bmp280 3-0076: supply vddd not found, using dummy regulator [ 221.628976] bmp280 3-0076: supply vdda not found, using dummy regulator [ 221.630669] opt3001 3-0044: Found TI OPT3001 debian@beaglebone:~$ ls /sys/bus/iio/devices/iio\:device iio:device0/ iio:device1/ iio:device2/ debian@beaglebone:~$ ls /sys/bus/iio/devices/iio\:device1 current_timestamp_clock dev events in_illuminance_input in_illuminance_integration_time integration_time_available name power subsystem uevent debian@beaglebone:~$ ls /sys/bus/iio/devices/iio\:device2 dev in_humidityrelative_input in_humidityrelative_oversampling_ratio in_pressure_input in_pressure_oversampling_ratio in_temp_input in_temp_oversampling_ratio name power subsystem uevent . . . [ 235.947231] 8<--- cut here --- [ 235.951059] Unable to handle kernel NULL pointer dereference at virtual address 00000004 [ 235.964814] pgd = 1f321289 [ 235.970754] [00000004] *pgd=00000000 [ 235.977393] Internal error: Oops: 5 [#1] PREEMPT THUMB2 [ 235.983348] Modules linked in: bmp280_i2c bmp280 opt3001 gb_netlink gb_loopback(C) gb_gpio(C) gb_i2c(C) gb_spi(C) gb_gbphy(C) gb_spilib(C) greybus nhc_udp nhc_routing nhc_ipv6 nhc_mobility nhc_hop nhc_fragment nhc_dest ieee802154_6lowpan 6lowpan evdev usb_f_acm u_serial usb_f_ncm usb_f_mass_storage usb_f_rndis u_ether libcomposite iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_filter ip_tables x_tables atusb mac802154 ieee802154 spidev [last unloaded: gb_netlink] [ 236.028629] CPU: 0 PID: 5 Comm: kworker/0:0 Tainted: G C O 5.8.18-bone23 #1xross [ 236.037887] Hardware name: Generic AM33XX (Flattened Device Tree) [ 236.044761] Workqueue: events do_work [greybus] [ 236.050022] PC is at skb_release_data+0x52/0xe4 [ 236.055268] LR is at kfree_skb+0x23/0xa8 [ 236.059903] pc : [<c0857e6a>] lr : [<c0857583>] psr: 400f0033 [ 236.066892] sp : dc139e60 ip : a00f0013 fp : c0f1369c [ 236.072833] r10: da5b5200 r9 : da5b5214 r8 : 00000001 [ 236.078775] r7 : da5b5300 r6 : db633c00 r5 : da5b5300 r4 : 00000000 [ 236.086026] r3 : 00000008 r2 : 00000000 r1 : 00000000 r0 : 00000000 [ 236.093279] Flags: nZcv IRQs on FIQs on Mode SVC_32 ISA Thumb Segment none [ 236.101316] Control: 50c5387d Table: 9b5a4019 DAC: 00000051 [ 236.107785] Process kworker/0:0 (pid: 5, stack limit = 0x39163ebd) [ 236.114689] Stack: (0xdc139e60 to 0xdc13a000) [ 236.119766] 9e60: da5b5200 db633c00 ffffff91 bf9391cd db51c000 c0857583 db633c00 ffffff91 [ 236.128678] 9e80: dae21168 bf9391cd 00000000 00000001 00000000 c0f05288 d64e7ea0 00000000 [ 236.137590] 9ea0: 000007d0 00000cc0 da6b0e00 00000000 00000000 bf90f84d 00000cc0 d64e7ea0 [ 236.146502] 9ec0: 00000000 00000000 00000013 bf90f8b9 d64e7ea0 bf90f94d 00000000 00000cc0 [ 236.155415] 9ee0: d6543d40 db51c400 00000000 dfa07300 00000000 00000000 d6543d44 bf90e973 [ 236.164328] 9f00: 00000000 00000000 000007d0 c0153c45 c0f05288 bf90eb39 00000000 c0f05288 [ 236.173240] 9f20: dfa07305 d6543d40 dc0e5b00 00000000 dfa07300 c01377c7 dc139f50 c0137a53 [ 236.182153] 9f40: dc0e5b00 c0f1365c dc0e5b14 c0f1d060 c0f13670 dc138000 c0f1365c c0137af9 [ 236.191065] 9f60: 00000000 dc0fd080 dc0fd380 dc138000 00000000 c0137a2d dc0e5b00 dc117eb8 [ 236.199979] 9f80: dc0fd0a0 c013ba45 00000000 dc0fd380 c013b955 00000000 00000000 00000000 [ 236.208891] 9fa0: 00000000 00000000 00000000 c0100159 00000000 00000000 00000000 00000000 [ 236.217804] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 236.226716] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000 [ 236.235638] [<c0857e6a>] (skb_release_data) from [<c0857583>] (kfree_skb+0x23/0xa8) [ 236.244032] [<c0857583>] (kfree_skb) from [<bf9391cd>] (message_send+0x91/0x12c [gb_netlink]) [ 236.253333] [<bf9391cd>] (message_send [gb_netlink]) from [<bf90f84d>] (gb_operation_request_send+0xad/0xfc [greybus]) [ 236.264799] [<bf90f84d>] (gb_operation_request_send [greybus]) from [<bf90f8b9>] (gb_operation_request_send_sync_timeout+0x1d/0x4c [greybus]) [ 236.278268] [<bf90f8b9>] (gb_operation_request_send_sync_timeout [greybus]) from [<bf90f94d>] (gb_operation_sync_timeout+0x65/0xb0 [greybus]) [ 236.291736] [<bf90f94d>] (gb_operation_sync_timeout [greybus]) from [<bf90e973>] (gb_svc_ping+0x23/0x28 [greybus]) [ 236.302849] [<bf90e973>] (gb_svc_ping [greybus]) from [<bf90eb39>] (do_work+0x19/0xe8 [greybus]) [ 236.312387] [<bf90eb39>] (do_work [greybus]) from [<c01377c7>] (process_one_work+0x127/0x38c) [ 236.321651] [<c01377c7>] (process_one_work) from [<c0137af9>] (worker_thread+0xcd/0x3d0) [ 236.330482] [<c0137af9>] (worker_thread) from [<c013ba45>] (kthread+0xf1/0x124) [ 236.338526] [<c013ba45>] (kthread) from [<c0100159>] (ret_from_fork+0x11/0x38) [ 236.346474] Exception stack(0xdc139fb0 to 0xdc139ff8) [ 236.352244] 9fa0: 00000000 00000000 00000000 00000000 [ 236.361156] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 236.370067] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 [ 236.377413] Code: 78bb 42a3 dd1b 6aa8 (6843) 07da [ 236.394147] ---[ end trace dc655e652b764722 ]--- Signed-off-by: Vaishnav M A <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Jul 22, 2024
[ Upstream commit 984af1e ] echo max of u64 to cost.model can cause divide by 0 error. # echo 8:0 rbps=18446744073709551615 > /sys/fs/cgroup/io.cost.model divide error: 0000 [#1] PREEMPT SMP RIP: 0010:calc_lcoefs+0x4c/0xc0 Call Trace: <TASK> ioc_refresh_params+0x2b3/0x4f0 ioc_cost_model_write+0x3cb/0x4c0 ? _copy_from_iter+0x6d/0x6c0 ? kernfs_fop_write_iter+0xfc/0x270 cgroup_file_write+0xa0/0x200 kernfs_fop_write_iter+0x17d/0x270 vfs_write+0x414/0x620 ksys_write+0x73/0x160 __x64_sys_write+0x1e/0x30 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd calc_lcoefs() uses the input value of cost.model in DIV_ROUND_UP_ULL, overflow would happen if bps plus IOC_PAGE_SIZE is greater than ULLONG_MAX, it can cause divide by 0 error. Fix the problem by setting basecost Signed-off-by: Li Nan <[email protected]> Signed-off-by: Yu Kuai <[email protected]> Acked-by: Tejun Heo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson
pushed a commit
that referenced
this issue
Jul 22, 2024
[ Upstream commit 984af1e ] echo max of u64 to cost.model can cause divide by 0 error. # echo 8:0 rbps=18446744073709551615 > /sys/fs/cgroup/io.cost.model divide error: 0000 [#1] PREEMPT SMP RIP: 0010:calc_lcoefs+0x4c/0xc0 Call Trace: <TASK> ioc_refresh_params+0x2b3/0x4f0 ioc_cost_model_write+0x3cb/0x4c0 ? _copy_from_iter+0x6d/0x6c0 ? kernfs_fop_write_iter+0xfc/0x270 cgroup_file_write+0xa0/0x200 kernfs_fop_write_iter+0x17d/0x270 vfs_write+0x414/0x620 ksys_write+0x73/0x160 __x64_sys_write+0x1e/0x30 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd calc_lcoefs() uses the input value of cost.model in DIV_ROUND_UP_ULL, overflow would happen if bps plus IOC_PAGE_SIZE is greater than ULLONG_MAX, it can cause divide by 0 error. Fix the problem by setting basecost Signed-off-by: Li Nan <[email protected]> Signed-off-by: Yu Kuai <[email protected]> Acked-by: Tejun Heo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
With Kernel 3.14 a BeagleBoneBlack will not completely powered off,
wehen I use commands like 'poweroff' or 'shutdown -h now'.
With Kernel 3.8 this was possible ...
In Kernel 3.8 there was a function 'rtc_power_off' in 'rtc_omap.c'
with this function it was possible to power off the board after shutdown.
in all later Kernels this function is missing ...
What's happend to this function? Why it was removed?
How can I power off the board completely?
The text was updated successfully, but these errors were encountered: