Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
GIT dda3e15231b35840fe6f0973f803cc70ddb86281 commit b0f2853b56a2acaff19cca2c6a608f8ec268d21a Author: Christoph Hellwig <[email protected]> Date: Wed Jan 17 22:04:38 2018 +0100 nvme-pci: take sglist coalescing in dma_map_sg into account Some iommu implementations can merge physically and/or virtually contiguous segments inside sg_map_dma. The NVMe SGL support does not take this into account and will warn because of falling off a loop. Pass the number of mapped segments to nvme_pci_setup_sgls so that the SGL setup can take the number of mapped segments into account. Reported-by: Fangjian (Turing) <[email protected]> Fixes: a7a7cbe3 ("nvme-pci: add SGL support") Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]> commit 20469a37aed12a886d0deda5a07c04037923144a Author: Keith Busch <[email protected]> Date: Wed Jan 17 22:04:37 2018 +0100 nvme-pci: check segement valid for SGL use The driver needs to verify there is a payload with a command before seeing if it should use SGLs to map it. Fixes: 955b1b5a00ba ("nvme-pci: move use_sgl initialization to nvme_init_iod()") Reported-by: Paul Menzel <[email protected]> Reviewed-by: Paul Menzel <[email protected]> Signed-off-by: Keith Busch <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]> commit 091f02483df7b56615b524491f404e574c5e0668 Author: Russell King <[email protected]> Date: Sat Jan 13 12:11:26 2018 +0000 ARM: net: bpf: clarify tail_call index As per 90caccdd8cc0 ("bpf: fix bpf_tail_call() x64 JIT"), the index used for array lookup is defined to be 32-bit wide. Update a misleading comment that suggests it is 64-bit wide. Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler") Signed-off-by: Russell King <[email protected]> commit ec19e02b343db991d2d1610c409efefebf4e2ca9 Author: Russell King <[email protected]> Date: Sat Jan 13 21:06:16 2018 +0000 ARM: net: bpf: fix LDX instructions When the source and destination register are identical, our JIT does not generate correct code, which leads to kernel oopses. Fix this by (a) generating more efficient code, and (b) making use of the temporary earlier if we will overwrite the address register. Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler") Signed-off-by: Russell King <[email protected]> commit 02088d9b392f605c892894b46aa8c83e3abd0115 Author: Russell King <[email protected]> Date: Sat Jan 13 22:38:18 2018 +0000 ARM: net: bpf: fix register saving When an eBPF program tail-calls another eBPF program, it enters it after the prologue to avoid having complex stack manipulations. This can lead to kernel oopses, and similar. Resolve this by always using a fixed stack layout, a CPU register frame pointer, and using this when reloading registers before returning. Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler") Signed-off-by: Russell King <[email protected]> commit 0005e55a79cfda88199e41a406a829c88d708c67 Author: Russell King <[email protected]> Date: Sat Jan 13 22:51:27 2018 +0000 ARM: net: bpf: correct stack layout documentation The stack layout documentation incorrectly suggests that the BPF JIT scratch space starts immediately below BPF_FP. This is not correct, so let's fix the documentation to reflect reality. Signed-off-by: Russell King <[email protected]> commit 70ec3a6c2c11e4b0e107a65de943a082f9aff351 Author: Russell King <[email protected]> Date: Sat Jan 13 21:26:14 2018 +0000 ARM: net: bpf: move stack documentation Move the stack documentation towards the top of the file, where it's relevant for things like the register layout. Signed-off-by: Russell King <[email protected]> commit d1220efd23484c72c82d5471f05daeb35b5d1916 Author: Russell King <[email protected]> Date: Sat Jan 13 16:10:07 2018 +0000 ARM: net: bpf: fix stack alignment As per 2dede2d8e925 ("ARM EABI: stack pointer must be 64-bit aligned after a CPU exception") the stack should be aligned to a 64-bit boundary on EABI systems. Ensure that the eBPF JIT appropraitely aligns the stack. Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler") Signed-off-by: Russell King <[email protected]> commit f4483f2cc1fdc03488c8a1452e545545ae5bda93 Author: Russell King <[email protected]> Date: Sat Jan 13 11:39:54 2018 +0000 ARM: net: bpf: fix tail call jumps When a tail call fails, it is documented that the tail call should continue execution at the following instruction. An example tail call sequence is: 12: (85) call bpf_tail_call#12 13: (b7) r0 = 0 14: (95) exit The ARM assembler for the tail call in this case ends up branching to instruction 14 instead of instruction 13, resulting in the BPF filter returning a non-zero value: 178: ldr r8, [sp, #588] ; insn 12 17c: ldr r6, [r8, r6] 180: ldr r8, [sp, #580] 184: cmp r8, r6 188: bcs 0x1e8 18c: ldr r6, [sp, #524] 190: ldr r7, [sp, #528] 194: cmp r7, #0 198: cmpeq r6, #32 19c: bhi 0x1e8 1a0: adds r6, r6, #1 1a4: adc r7, r7, #0 1a8: str r6, [sp, #524] 1ac: str r7, [sp, #528] 1b0: mov r6, #104 1b4: ldr r8, [sp, #588] 1b8: add r6, r8, r6 1bc: ldr r8, [sp, #580] 1c0: lsl r7, r8, #2 1c4: ldr r6, [r6, r7] 1c8: cmp r6, #0 1cc: beq 0x1e8 1d0: mov r8, #32 1d4: ldr r6, [r6, r8] 1d8: add r6, r6, #44 1dc: bx r6 1e0: mov r0, #0 ; insn 13 1e4: mov r1, #0 1e8: add sp, sp, #596 ; insn 14 1ec: pop {r4, r5, r6, r7, r8, sl, pc} For other sequences, the tail call could end up branching midway through the following BPF instructions, or maybe off the end of the function, leading to unknown behaviours. Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler") Signed-off-by: Russell King <[email protected]> commit e9062481824384f00299971f923fecf6b3668001 Author: Russell King <[email protected]> Date: Sat Jan 13 11:35:15 2018 +0000 ARM: net: bpf: avoid 'bx' instruction on non-Thumb capable CPUs Avoid the 'bx' instruction on CPUs that have no support for Thumb and thus do not implement this instruction by moving the generation of this opcode to a separate function that selects between: bx reg and mov pc, reg according to the capabilities of the CPU. Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler") Signed-off-by: Russell King <[email protected]> commit 45d55e7bac4028af93f5fa324e69958a0b868e96 Author: Thomas Gleixner <[email protected]> Date: Tue Jan 16 12:20:18 2018 +0100 x86/apic/vector: Fix off by one in error path Keith reported the following warning: WARNING: CPU: 28 PID: 1420 at kernel/irq/matrix.c:222 irq_matrix_remove_managed+0x10f/0x120 x86_vector_free_irqs+0xa1/0x180 x86_vector_alloc_irqs+0x1e4/0x3a0 msi_domain_alloc+0x62/0x130 The reason for this is that if the vector allocation fails the error handling code tries to free the failed vector as well, which causes the above imbalance warning to trigger. Adjust the error path to handle this correctly. Fixes: b5dc8e6c21e7 ("x86/irq: Use hierarchical irqdomain to manage CPU interrupt vectors") Reported-by: Keith Busch <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Keith Busch <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801161217300.1823@nanos commit d47924417319e3b6a728c0b690f183e75bc2a702 Author: Thomas Gleixner <[email protected]> Date: Tue Jan 16 19:59:59 2018 +0100 x86/intel_rdt/cqm: Prevent use after free intel_rdt_iffline_cpu() -> domain_remove_cpu() frees memory first and then proceeds accessing it. BUG: KASAN: use-after-free in find_first_bit+0x1f/0x80 Read of size 8 at addr ffff883ff7c1e780 by task cpuhp/31/195 find_first_bit+0x1f/0x80 has_busy_rmid+0x47/0x70 intel_rdt_offline_cpu+0x4b4/0x510 Freed by task 195: kfree+0x94/0x1a0 intel_rdt_offline_cpu+0x17d/0x510 Do the teardown first and then free memory. Fixes: 24247aeeabe9 ("x86/intel_rdt/cqm: Improve limbo list processing") Reported-by: Joseph Salisbury <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Ravi Shankar <[email protected]> Cc: Peter Zilstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Vikas Shivappa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: "Roderick W. Smith" <[email protected]> Cc: [email protected] Cc: Fenghua Yu <[email protected]> Cc: Tony Luck <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801161957510.2366@nanos commit 6cfb521ac0d5b97470883ff9b7facae264b7ab12 Author: Andi Kleen <[email protected]> Date: Tue Jan 16 12:52:28 2018 -0800 module: Add retpoline tag to VERMAGIC Add a marker for retpoline to the module VERMAGIC. This catches the case when a non RETPOLINE compiled module gets loaded into a retpoline kernel, making it insecure. It doesn't handle the case when retpoline has been runtime disabled. Even in this case the match of the retcompile status will be enforced. This implies that even with retpoline run time disabled all modules loaded need to be recompiled. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Greg Kroah-Hartman <[email protected]> Acked-by: David Woodhouse <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] commit 4fdec2034b7540dda461c6ba33325dfcff345c64 Author: Paolo Bonzini <[email protected]> Date: Tue Jan 16 16:42:25 2018 +0100 x86/cpufeature: Move processor tracing out of scattered features Processor tracing is already enumerated in word 9 (CPUID[7,0].EBX), so do not duplicate it in the scattered features word. Besides being more tidy, this will be useful for KVM when it presents processor tracing to the guests. KVM selects host features that are supported by both the host kernel (depending on command line options, CPU errata, or whatever) and KVM. Whenever a full feature word exists, KVM's code is written in the expectation that the CPUID bit number matches the X86_FEATURE_* bit number, but this is not the case for X86_FEATURE_INTEL_PT. Signed-off-by: Paolo Bonzini <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Luwei Kang <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Radim Krčmář <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]> commit 07c7b6a52503ac13ae357a8b3ef3456590a64b65 Author: Linus Walleij <[email protected]> Date: Tue Jan 16 09:51:51 2018 +0100 gpio: mmio: Also read bits that are zero The code for .get_multiple() has bugs: 1. The simple .get_multiple() just reads a register, masks it and sets the return value. This is not correct: we only want to assign values (whether 0 or 1) to the bits that are set in the mask. Fix this by using &= ~mask to clear all bits in the mask and then |= val & mask to set the corresponding bits from the read. 2. The bgpio_get_multiple_be() call has a similar problem: it uses the |= operator to set the bits, so only the bits in the mask are affected, but it misses to clear all returned bits from the mask initially, so some bits will be returned erroneously set to 1. 3. The bgpio_get_set_multiple() again fails to clear the bits from the mask. 4. find_next_bit() wasn't handled correctly, use a totally different approach for one function and change the other function to follow the design pattern of assigning the first bit to -1, then use bit + 1 in the for loop and < num_iterations as break condition. Fixes: 80057cb417b2 ("gpio-mmio: Use the new .get_multiple() callback") Cc: Bartosz Golaszewski <[email protected]> Reported-by: Clemens Gruber <[email protected]> Tested-by: Clemens Gruber <[email protected]> Reported-by: Lukas Wunner <[email protected]> Signed-off-by: Linus Walleij <[email protected]> commit 81d947e2b8dd2394586c3eaffdd2357797d3bf59 Author: Daniel Borkmann <[email protected]> Date: Mon Jan 15 23:12:09 2018 +0100 net, sched: fix panic when updating miniq {b,q}stats While working on fixing another bug, I ran into the following panic on arm64 by simply attaching clsact qdisc, adding a filter and running traffic on ingress to it: [...] [ 178.188591] Unable to handle kernel read from unreadable memory at virtual address 810fb501f000 [ 178.197314] Mem abort info: [ 178.200121] ESR = 0x96000004 [ 178.203168] Exception class = DABT (current EL), IL = 32 bits [ 178.209095] SET = 0, FnV = 0 [ 178.212157] EA = 0, S1PTW = 0 [ 178.215288] Data abort info: [ 178.218175] ISV = 0, ISS = 0x00000004 [ 178.222019] CM = 0, WnR = 0 [ 178.224997] user pgtable: 4k pages, 48-bit VAs, pgd = 0000000023cb3f33 [ 178.231531] [0000810fb501f000] *pgd=0000000000000000 [ 178.236508] Internal error: Oops: 96000004 [#1] SMP [...] [ 178.311855] CPU: 73 PID: 2497 Comm: ping Tainted: G W 4.15.0-rc7+ #5 [ 178.319413] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB18A 03/31/2017 [ 178.326887] pstate: 60400005 (nZCv daif +PAN -UAO) [ 178.331685] pc : __netif_receive_skb_core+0x49c/0xac8 [ 178.336728] lr : __netif_receive_skb+0x28/0x78 [ 178.341161] sp : ffff00002344b750 [ 178.344465] x29: ffff00002344b750 x28: ffff810fbdfd0580 [ 178.349769] x27: 0000000000000000 x26: ffff000009378000 [...] [ 178.418715] x1 : 0000000000000054 x0 : 0000000000000000 [ 178.424020] Process ping (pid: 2497, stack limit = 0x000000009f0a3ff4) [ 178.430537] Call trace: [ 178.432976] __netif_receive_skb_core+0x49c/0xac8 [ 178.437670] __netif_receive_skb+0x28/0x78 [ 178.441757] process_backlog+0x9c/0x160 [ 178.445584] net_rx_action+0x2f8/0x3f0 [...] Reason is that sch_ingress and sch_clsact are doing mini_qdisc_pair_init() which sets up miniq pointers to cpu_{b,q}stats from the underlying qdisc. Problem is that this cannot work since they are actually set up right after the qdisc ->init() callback in qdisc_create(), so first packet going into sch_handle_ingress() tries to call mini_qdisc_bstats_cpu_update() and we therefore panic. In order to fix this, allocation of {b,q}stats needs to happen before we call into ->init(). In net-next, there's already such option through commit d59f5ffa59d8 ("net: sched: a dflt qdisc may be used with per cpu stats"). However, the bug needs to be fixed in net still for 4.15. Thus, include these bits to reduce any merge churn and reuse the static_flags field to set TCQ_F_CPUSTATS, and remove the allocation from qdisc_create() since there is no other user left. Prashant Bhole ran into the same issue but for net-next, thus adding him below as well as co-author. Same issue was also reported by Sandipan Das when using bcc. Fixes: 46209401f8f6 ("net: core: introduce mini_Qdisc and eliminate usage of tp->q for clsact fastpath") Reference: https://lists.iovisor.org/pipermail/iovisor-dev/2018-January/001190.html Reported-by: Sandipan Das <[email protected]> Co-authored-by: Prashant Bhole <[email protected]> Co-authored-by: John Fastabend <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Cc: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 70eeff66c4696cee4076d6388b6bede5bd7ff71c Author: Roland Dreier <[email protected]> Date: Mon Jan 15 12:24:49 2018 -0800 qed: Fix potential use-after-free in qed_spq_post() We need to check if p_ent->comp_mode is QED_SPQ_MODE_EBLOCK before calling qed_spq_add_entry(). The test is fine is the mode is EBLOCK, but if it isn't then qed_spq_add_entry() might kfree(p_ent). Signed-off-by: Roland Dreier <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 0d9c9f0f40ca262b67fc06a702b85f3976f5e1a1 Author: Jakub Kicinski <[email protected]> Date: Mon Jan 15 11:47:53 2018 -0800 nfp: use the correct index for link speed table sts variable is holding link speed as well as state. We should be using ls to index into ls_to_ethtool. Fixes: 265aeb511bd5 ("nfp: add support for .get_link_ksettings()") Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit a5b1379afbfabf91e3a689e82ac619a7157336b3 Author: Yuiko Oshino <[email protected]> Date: Mon Jan 15 13:24:28 2018 -0500 lan78xx: Fix failure in USB Full Speed Fix initialize the uninitialized tx_qlen to an appropriate value when USB Full Speed is used. Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver") Signed-off-by: Yuiko Oshino <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit c5006b8aa74599ce19104b31d322d2ea9ff887cc Author: Xin Long <[email protected]> Date: Mon Jan 15 17:02:00 2018 +0800 sctp: do not allow the v4 socket to bind a v4mapped v6 address The check in sctp_sockaddr_af is not robust enough to forbid binding a v4mapped v6 addr on a v4 socket. The worse thing is that v4 socket's bind_verify would not convert this v4mapped v6 addr to a v4 addr. syzbot even reported a crash as the v4 socket bound a v6 addr. This patch is to fix it by doing the common sa.sa_family check first, then AF_INET check for v4mapped v6 addrs. Fixes: 7dab83de50c7 ("sctp: Support ipv6only AF_INET6 sockets.") Reported-by: [email protected] Acked-by: Neil Horman <[email protected]> Signed-off-by: Xin Long <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit a0ff660058b88d12625a783ce9e5c1371c87951f Author: Xin Long <[email protected]> Date: Mon Jan 15 17:01:36 2018 +0800 sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf After commit cea0cc80a677 ("sctp: use the right sk after waking up from wait_buf sleep"), it may change to lock another sk if the asoc has been peeled off in sctp_wait_for_sndbuf. However, the asoc's new sk could be already closed elsewhere, as it's in the sendmsg context of the old sk that can't avoid the new sk's closing. If the sk's last one refcnt is held by this asoc, later on after putting this asoc, the new sk will be freed, while under it's own lock. This patch is to revert that commit, but fix the old issue by returning error under the old sk's lock. Fixes: cea0cc80a677 ("sctp: use the right sk after waking up from wait_buf sleep") Reported-by: [email protected] Signed-off-by: Xin Long <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 625637bf4afa45204bd87e4218645182a919485a Author: Xin Long <[email protected]> Date: Mon Jan 15 17:01:19 2018 +0800 sctp: reinit stream if stream outcnt has been change by sinit in sendmsg After introducing sctp_stream structure, sctp uses stream->outcnt as the out stream nums instead of c.sinit_num_ostreams. However when users use sinit in cmsg, it only updates c.sinit_num_ostreams in sctp_sendmsg. At that moment, stream->outcnt is still using previous value. If it's value is not updated, the sinit_num_ostreams of sinit could not really work. This patch is to fix it by updating stream->outcnt and reiniting stream if stream outcnt has been change by sinit in sendmsg. Fixes: a83863174a61 ("sctp: prepare asoc stream for stream reconf") Signed-off-by: Xin Long <[email protected]> Acked-by: Neil Horman <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 3d1661304f0b2b51a8a43785b764822611dbdd53 Author: Thomas Falcon <[email protected]> Date: Wed Jan 10 19:39:52 2018 -0600 ibmvnic: Fix pending MAC address changes Due to architecture limitations, the IBM VNIC client driver is unable to perform MAC address changes unless the device has "logged in" to its backing device. Currently, pending MAC changes are handled before login, resulting in an error and failure to change the MAC address. Moving that chunk to the end of the ibmvnic_login function, when we are sure that it was successful, fixes that. The MAC address can be changed when the device is up or down, so only check if the device is in a "PROBED" state before setting the MAC address. Fixes: c26eba03e407 ("ibmvnic: Update reset infrastructure to support tunable parameters") Signed-off-by: Thomas Falcon <[email protected]> Reviewed-by: John Allen <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit c96f5471ce7d2aefd0dda560cc23f08ab00bc65d Author: Josh Snyder <[email protected]> Date: Mon Dec 18 16:15:10 2017 +0000 delayacct: Account blkio completion on the correct task Before commit: e33a9bba85a8 ("sched/core: move IO scheduling accounting from io_schedule_timeout() into scheduler") delayacct_blkio_end() was called after context-switching into the task which completed I/O. This resulted in double counting: the task would account a delay both waiting for I/O and for time spent in the runqueue. With e33a9bba85a8, delayacct_blkio_end() is called by try_to_wake_up(). In ttwu, we have not yet context-switched. This is more correct, in that the delay accounting ends when the I/O is complete. But delayacct_blkio_end() relies on 'get_current()', and we have not yet context-switched into the task whose I/O completed. This results in the wrong task having its delay accounting statistics updated. Instead of doing that, pass the task_struct being woken to delayacct_blkio_end(), so that it can update the statistics of the correct task. Signed-off-by: Josh Snyder <[email protected]> Acked-by: Tejun Heo <[email protected]> Acked-by: Balbir Singh <[email protected]> Cc: <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Fixes: e33a9bba85a8 ("sched/core: move IO scheduling accounting from io_schedule_timeout() into scheduler") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]> commit 107cd2532181b96c549e8f224cdcca8631c3076b Author: Tom Lendacky <[email protected]> Date: Wed Jan 10 13:26:34 2018 -0600 x86/mm: Encrypt the initrd earlier for BSP microcode update Currently the BSP microcode update code examines the initrd very early in the boot process. If SME is active, the initrd is treated as being encrypted but it has not been encrypted (in place) yet. Update the early boot code that encrypts the kernel to also encrypt the initrd so that early BSP microcode updates work. Tested-by: Gabriel Craciunescu <[email protected]> Signed-off-by: Tom Lendacky <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brijesh Singh <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]> commit cc5f01e28d6c60f274fd1e33b245f679f79f543c Author: Tom Lendacky <[email protected]> Date: Wed Jan 10 13:26:26 2018 -0600 x86/mm: Prepare sme_encrypt_kernel() for PAGE aligned encryption In preparation for encrypting more than just the kernel, the encryption support in sme_encrypt_kernel() needs to support 4KB page aligned encryption instead of just 2MB large page aligned encryption. Update the routines that populate the PGD to support non-2MB aligned addresses. This is done by creating PTE page tables for the start and end portion of the address range that fall outside of the 2MB alignment. This results in, at most, two extra pages to hold the PTE entries for each mapping of a range. Tested-by: Gabriel Craciunescu <[email protected]> Signed-off-by: Tom Lendacky <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brijesh Singh <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]> commit 2b5d00b6c2cdd94f6d6a494a6f6c0c0fc7b8e711 Author: Tom Lendacky <[email protected]> Date: Wed Jan 10 13:26:16 2018 -0600 x86/mm: Centralize PMD flags in sme_encrypt_kernel() In preparation for encrypting more than just the kernel during early boot processing, centralize the use of the PMD flag settings based on the type of mapping desired. When 4KB aligned encryption is added, this will allow either PTE flags or large page PMD flags to be used without requiring the caller to adjust. Tested-by: Gabriel Craciunescu <[email protected]> Signed-off-by: Tom Lendacky <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brijesh Singh <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]> commit bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4 Author: Tom Lendacky <[email protected]> Date: Wed Jan 10 13:26:05 2018 -0600 x86/mm: Use a struct to reduce parameters for SME PGD mapping In preparation for follow-on patches, combine the PGD mapping parameters into a struct to reduce the number of function arguments and allow for direct updating of the next pagetable mapping area pointer. Tested-by: Gabriel Craciunescu <[email protected]> Signed-off-by: Tom Lendacky <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brijesh Singh <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]> commit 1303880179e67c59e801429b7e5d0f6b21137d99 Author: Tom Lendacky <[email protected]> Date: Wed Jan 10 13:25:56 2018 -0600 x86/mm: Clean up register saving in the __enc_copy() assembly code Clean up the use of PUSH and POP and when registers are saved in the __enc_copy() assembly function in order to improve the readability of the code. Move parameter register saving into general purpose registers earlier in the code and move all the pushes to the beginning of the function with corresponding pops at the end. We do this to prepare fixes. Tested-by: Gabriel Craciunescu <[email protected]> Signed-off-by: Tom Lendacky <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brijesh Singh <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]> commit 385d11b152c4eb638eeb769edcb3249533bb9a00 Author: Josh Poimboeuf <[email protected]> Date: Mon Jan 15 08:17:08 2018 -0600 objtool: Improve error message for bad file argument If a nonexistent file is supplied to objtool, it complains with a non-helpful error: open: No such file or directory Improve it to: objtool: Can't open 'foo': No such file or directory Reported-by: Markus <[email protected]> Signed-off-by: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/406a3d00a21225eee2819844048e17f68523ccf6.1516025651.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <[email protected]> commit 2a0098d70640dda192a79966c14d449e7a34d675 Author: Josh Poimboeuf <[email protected]> Date: Mon Jan 15 08:17:07 2018 -0600 objtool: Fix seg fault with gold linker Objtool segfaults when the gold linker is used with CONFIG_MODVERSIONS=y and CONFIG_UNWINDER_ORC=y. With CONFIG_MODVERSIONS=y, the .o file gets passed to the linker before being passed to objtool. The gold linker seems to strip unused ELF symbols by default, which confuses objtool and causes the seg fault when it's trying to generate ORC metadata. Objtool should really be running immediately after GCC anyway, without a linker call in between. Change the makefile ordering so that objtool is called before the linker. Reported-and-tested-by: Markus <[email protected]> Signed-off-by: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder") Link: http://lkml.kernel.org/r/355f04da33581f4a3bf82e5b512973624a1e23a2.1516025651.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <[email protected]> commit ae59c3f0b6cfd472fed96e50548a799b8971d876 Author: Leon Romanovsky <[email protected]> Date: Fri Jan 12 07:58:39 2018 +0200 RDMA/mlx5: Fix out-of-bound access while querying AH The rdma_ah_find_type() accesses the port array based on an index controlled by userspace. The existing bounds check is after the first use of the index, so userspace can generate an out of bounds access, as shown by the KASN report below. ================================================================== BUG: KASAN: slab-out-of-bounds in to_rdma_ah_attr+0xa8/0x3b0 Read of size 4 at addr ffff880019ae2268 by task ibv_rc_pingpong/409 CPU: 0 PID: 409 Comm: ibv_rc_pingpong Not tainted 4.15.0-rc2-00031-gb60a3faf5b83-dirty #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 Call Trace: dump_stack+0xe9/0x18f print_address_description+0xa2/0x350 kasan_report+0x3a5/0x400 to_rdma_ah_attr+0xa8/0x3b0 mlx5_ib_query_qp+0xd35/0x1330 ib_query_qp+0x8a/0xb0 ib_uverbs_query_qp+0x237/0x7f0 ib_uverbs_write+0x617/0xd80 __vfs_write+0xf7/0x500 vfs_write+0x149/0x310 SyS_write+0xca/0x190 entry_SYSCALL_64_fastpath+0x18/0x85 RIP: 0033:0x7fe9c7a275a0 RSP: 002b:00007ffee5498738 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007fe9c7ce4b00 RCX: 00007fe9c7a275a0 RDX: 0000000000000018 RSI: 00007ffee5498800 RDI: 0000000000000003 RBP: 000055d0c8d3f010 R08: 00007ffee5498800 R09: 0000000000000018 R10: 00000000000000ba R11: 0000000000000246 R12: 0000000000008000 R13: 0000000000004fb0 R14: 000055d0c8d3f050 R15: 00007ffee5498560 Allocated by task 1: __kmalloc+0x3f9/0x430 alloc_mad_private+0x25/0x50 ib_mad_post_receive_mads+0x204/0xa60 ib_mad_init_device+0xa59/0x1020 ib_register_device+0x83a/0xbc0 mlx5_ib_add+0x50e/0x5c0 mlx5_add_device+0x142/0x410 mlx5_register_interface+0x18f/0x210 mlx5_ib_init+0x56/0x63 do_one_initcall+0x15b/0x270 kernel_init_freeable+0x2d8/0x3d0 kernel_init+0x14/0x190 ret_from_fork+0x24/0x30 Freed by task 0: (stack is not available) The buggy address belongs to the object at ffff880019ae2000 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 104 bytes to the right of 512-byte region [ffff880019ae2000, ffff880019ae2200) The buggy address belongs to the page: page:000000005d674e18 count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0 flags: 0x4000000000008100(slab|head) raw: 4000000000008100 0000000000000000 0000000000000000 00000001000c000c raw: dead000000000100 dead000000000200 ffff88001a402000 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff880019ae2100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff880019ae2180: 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc >ffff880019ae2200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff880019ae2280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff880019ae2300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ================================================================== Disabling lock debugging due to kernel taint Cc: <[email protected]> Fixes: 44c58487d51a ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types") Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> commit 6311b7ce42e0c1d6d944bc099dc47e936c20cf11 Author: Johannes Berg <[email protected]> Date: Mon Jan 15 12:42:25 2018 +0100 netlink: extack: avoid parenthesized string constant warning NL_SET_ERR_MSG() and NL_SET_ERR_MSG_ATTR() lead to the following warning in newer versions of gcc: warning: array initialized from parenthesized string constant Just remove the parentheses, they're not needed in this context since anyway since there can be no operator precendence issues or similar. Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit cd9ff4de0107c65d69d02253bb25d6db93c3dbc1 Author: Jim Westfall <[email protected]> Date: Sun Jan 14 04:18:51 2018 -0800 ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY Map all lookup neigh keys to INADDR_ANY for loopback/point-to-point devices to avoid making an entry for every remote ip the device needs to talk to. This used the be the old behavior but became broken in a263b3093641f (ipv4: Make neigh lookups directly in output packet path) and later removed in 0bb4087cbec0 (ipv4: Fix neigh lookup keying over loopback/point-to-point devices) because it was broken. Signed-off-by: Jim Westfall <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 096b9854c04df86f03b38a97d40b6506e5730919 Author: Jim Westfall <[email protected]> Date: Sun Jan 14 04:18:50 2018 -0800 net: Allow neigh contructor functions ability to modify the primary_key Use n->primary_key instead of pkey to account for the possibility that a neigh constructor function may have modified the primary_key value. Signed-off-by: Jim Westfall <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 17d0fb0caa68f2bfd8aaa8125ff15abebfbfa1d7 Author: Sergei Shtylyov <[email protected]> Date: Sat Jan 13 20:22:01 2018 +0300 sh_eth: fix dumping ARSTR ARSTR is always located at the start of the TSU register region, thus using add_reg() instead of add_tsu_reg() in __sh_eth_get_regs() to dump it causes EDMR or EDSR (depending on the register layout) to be dumped instead of ARSTR. Use the correct condition/macro there... Fixes: 6b4b4fead342 ("sh_eth: Implement ethtool register dump operations") Signed-off-by: Sergei Shtylyov <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 95a332088ecb113c2e8753fa3f1df9b0dda9beec Author: William Tu <[email protected]> Date: Fri Jan 12 12:29:22 2018 -0800 Revert "openvswitch: Add erspan tunnel support." This reverts commit ceaa001a170e43608854d5290a48064f57b565ed. The OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS attr should be designed as a nested attribute to support all ERSPAN v1 and v2's fields. The current attr is a be32 supporting only one field. Thus, this patch reverts it and later patch will redo it using nested attr. Signed-off-by: William Tu <[email protected]> Cc: Jiri Benc <[email protected]> Cc: Pravin Shelar <[email protected]> Acked-by: Jiri Benc <[email protected]> Acked-by: Pravin B Shelar <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 30be8f8dba1bd2aff73e8447d59228471233a3d4 Author: [email protected] <[email protected]> Date: Fri Jan 12 15:42:06 2018 +0100 net/tls: Fix inverted error codes to avoid endless loop sendfile() calls can hang endless with using Kernel TLS if a socket error occurs. Socket error codes must be inverted by Kernel TLS before returning because they are stored with positive sign. If returned non-inverted they are interpreted as number of bytes sent, causing endless looping of the splice mechanic behind sendfile(). Signed-off-by: Robert Hering <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 95ef498d977bf44ac094778fd448b98af158a3e6 Author: Eric Dumazet <[email protected]> Date: Thu Jan 11 22:31:18 2018 -0800 ipv6: ip6_make_skb() needs to clear cork.base.dst In my last patch, I missed fact that cork.base.dst was not initialized in ip6_make_skb() : If ip6_setup_cork() returns an error, we might attempt a dst_release() on some random pointer. Fixes: 862c03ee1deb ("ipv6: fix possible mem leaks in ipv6_make_skb()") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: syzbot <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 68e76e034b6b1c1ce2eece1ab8ae4008e14be470 Author: Randy Dunlap <[email protected]> Date: Mon Jan 15 11:07:27 2018 -0800 tracing: Prevent PROFILE_ALL_BRANCHES when FORTIFY_SOURCE=y I regularly get 50 MB - 60 MB files during kernel randconfig builds. These large files mostly contain (many repeats of; e.g., 124,594): In file included from ../include/linux/string.h:6:0, from ../include/linux/uuid.h:20, from ../include/linux/mod_devicetable.h:13, from ../scripts/mod/devicetable-offsets.c:3: ../include/linux/compiler.h:64:4: warning: '______f' is static but declared in inline function 'strcpy' which is not static [enabled by default] ______f = { \ ^ ../include/linux/compiler.h:56:23: note: in expansion of macro '__trace_if' ^ ../include/linux/string.h:425:2: note: in expansion of macro 'if' if (p_size == (size_t)-1 && q_size == (size_t)-1) ^ This only happens when CONFIG_FORTIFY_SOURCE=y and CONFIG_PROFILE_ALL_BRANCHES=y, so prevent PROFILE_ALL_BRANCHES if FORTIFY_SOURCE=y. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]> commit 37f47bc90c7481e7959703ad1defc4fc9f5d85e3 Author: Marcelo Ricardo Leitner <[email protected]> Date: Thu Jan 11 14:22:06 2018 -0200 sctp: avoid compiler warning on implicit fallthru These fall-through are expected. Signed-off-by: Marcelo Ricardo Leitner <[email protected]> Acked-by: Neil Horman <[email protected]> Reviewed-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 6503a30440962f1e1ccb8868816b4e18201218d4 Author: Lorenzo Colitti <[email protected]> Date: Thu Jan 11 18:36:26 2018 +0900 net: ipv4: Make "ip route get" match iif lo rules again. Commit 3765d35ed8b9 ("net: ipv4: Convert inet_rtm_getroute to rcu versions of route lookup") broke "ip route get" in the presence of rules that specify iif lo. Host-originated traffic always has iif lo, because ip_route_output_key_hash and ip6_route_output_flags set the flow iif to LOOPBACK_IFINDEX. Thus, putting "iif lo" in an ip rule is a convenient way to select only originated traffic and not forwarded traffic. inet_rtm_getroute used to match these rules correctly because even though it sets the flow iif to 0, it called ip_route_output_key which overwrites iif with LOOPBACK_IFINDEX. But now that it calls ip_route_output_key_hash_rcu, the ifindex will remain 0 and not match the iif lo in the rule. As a result, "ip route get" will return ENETUNREACH. Fixes: 3765d35ed8b9 ("net: ipv4: Convert inet_rtm_getroute to rcu versions of route lookup") Tested: https://android.googlesource.com/kernel/tests/+/master/net/test/multinetwork_test.py passes again Signed-off-by: Lorenzo Colitti <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit cbbdf8433a5f117b1a2119ea30fc651b61ef7570 Author: David Ahern <[email protected]> Date: Wed Jan 10 13:00:39 2018 -0800 netlink: extack needs to be reset each time through loop syzbot triggered the WARN_ON in netlink_ack testing the bad_attr value. The problem is that netlink_rcv_skb loops over the skb repeatedly invoking the callback and without resetting the extack leaving potentially stale data. Initializing each time through avoids the WARN_ON. Fixes: 2d4bc93368f5a ("netlink: extended ACK reporting") Reported-by: [email protected] Signed-off-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 59b36613e85fb16ebf9feaf914570879cd5c2a21 Author: Cong Wang <[email protected]> Date: Wed Jan 10 12:50:25 2018 -0800 tipc: fix a memory leak in tipc_nl_node_get_link() When tipc_node_find_by_name() fails, the nlmsg is not freed. While on it, switch to a goto label to properly free it. Fixes: be9c086715c ("tipc: narrow down exposure of struct tipc_node") Reported-by: Dmitry Vyukov <[email protected]> Cc: Jon Maloy <[email protected]> Cc: Ying Xue <[email protected]> Signed-off-by: Cong Wang <[email protected]> Acked-by: Ying Xue <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 749439bfac6e1a2932c582e2699f91d329658196 Author: Mike Maloney <[email protected]> Date: Wed Jan 10 12:45:10 2018 -0500 ipv6: fix udpv6 sendmsg crash caused by too small MTU The logic in __ip6_append_data() assumes that the MTU is at least large enough for the headers. A device's MTU may be adjusted after being added while sendmsg() is processing data, resulting in __ip6_append_data() seeing any MTU. For an mtu smaller than the size of the fragmentation header, the math results in a negative 'maxfraglen', which causes problems when refragmenting any previous skb in the skb_write_queue, leaving it possibly malformed. Instead sendmsg returns EINVAL when the mtu is calculated to be less than IPV6_MIN_MTU. Found by syzkaller: kernel BUG at ./include/linux/skbuff.h:2064! invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 14216 Comm: syz-executor5 Not tainted 4.13.0-rc4+ #2 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 task: ffff8801d0b68580 task.stack: ffff8801ac6b8000 RIP: 0010:__skb_pull include/linux/skbuff.h:2064 [inline] RIP: 0010:__ip6_make_skb+0x18cf/0x1f70 net/ipv6/ip6_output.c:1617 RSP: 0018:ffff8801ac6bf570 EFLAGS: 00010216 RAX: 0000000000010000 RBX: 0000000000000028 RCX: ffffc90003cce000 RDX: 00000000000001b8 RSI: ffffffff839df06f RDI: ffff8801d9478ca0 RBP: ffff8801ac6bf780 R08: ffff8801cc3f1dbc R09: 0000000000000000 R10: ffff8801ac6bf7a0 R11: 43cb4b7b1948a9e7 R12: ffff8801cc3f1dc8 R13: ffff8801cc3f1d40 R14: 0000000000001036 R15: dffffc0000000000 FS: 00007f43d740c700(0000) GS:ffff8801dc100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f7834984000 CR3: 00000001d79b9000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ip6_finish_skb include/net/ipv6.h:911 [inline] udp_v6_push_pending_frames+0x255/0x390 net/ipv6/udp.c:1093 udpv6_sendmsg+0x280d/0x31a0 net/ipv6/udp.c:1363 inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 [inline] sock_sendmsg+0xca/0x110 net/socket.c:643 SYSC_sendto+0x352/0x5a0 net/socket.c:1750 SyS_sendto+0x40/0x50 net/socket.c:1718 entry_SYSCALL_64_fastpath+0x1f/0xbe RIP: 0033:0x4512e9 RSP: 002b:00007f43d740bc08 EFLAGS: 00000216 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00000000007180a8 RCX: 00000000004512e9 RDX: 000000000000002e RSI: 0000000020d08000 RDI: 0000000000000005 RBP: 0000000000000086 R08: 00000000209c1000 R09: 000000000000001c R10: 0000000000040800 R11: 0000000000000216 R12: 00000000004b9c69 R13: 00000000ffffffff R14: 0000000000000005 R15: 00000000202c2000 Code: 9e 01 fe e9 c5 e8 ff ff e8 7f 9e 01 fe e9 4a ea ff ff 48 89 f7 e8 52 9e 01 fe e9 aa eb ff ff e8 a8 b6 cf fd 0f 0b e8 a1 b6 cf fd <0f> 0b 49 8d 45 78 4d 8d 45 7c 48 89 85 78 fe ff ff 49 8d 85 ba RIP: __skb_pull include/linux/skbuff.h:2064 [inline] RSP: ffff8801ac6bf570 RIP: __ip6_make_skb+0x18cf/0x1f70 net/ipv6/ip6_output.c:1617 RSP: ffff8801ac6bf570 Reported-by: syzbot <[email protected]> Signed-off-by: Mike Maloney <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 6200b430220f3b9207861b16f57916950f4ecd8e Author: Arnd Bergmann <[email protected]> Date: Wed Jan 10 17:30:22 2018 +0100 net: cs89x0: add MODULE_LICENSE This driver lacks a MODULE_LICENSE tag, leading to a Kbuild warning: WARNING: modpost: missing MODULE_LICENSE() in drivers/net/ethernet/cirrus/cs89x0.o This adds license, author, and description according to the comment block at the start of the file. Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 0171c41835591e9aa2e384b703ef9a6ae367c610 Author: Guillaume Nault <[email protected]> Date: Wed Jan 10 16:24:45 2018 +0100 ppp: unlock all_ppp_mutex before registering device ppp_dev_uninit(), which is the .ndo_uninit() handler of PPP devices, needs to lock pn->all_ppp_mutex. Therefore we mustn't call register_netdevice() with pn->all_ppp_mutex already locked, or we'd deadlock in case register_netdevice() fails and calls .ndo_uninit(). Fortunately, we can unlock pn->all_ppp_mutex before calling register_netdevice(). This lock protects pn->units_idr, which isn't used in the device registration process. However, keeping pn->all_ppp_mutex locked during device registration did ensure that no device in transient state would be published in pn->units_idr. In practice, unlocking it before calling register_netdevice() doesn't change this property: ppp_unit_register() is called with 'ppp_mutex' locked and all searches done in pn->units_idr hold this lock too. Fixes: 8cb775bc0a34 ("ppp: fix device unregistration upon netns deletion") Reported-and-tested-by: [email protected] Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 66940f35d5a81d5969bb5543171c70a434fc5110 Author: Michael S. Tsirkin <[email protected]> Date: Wed Jan 10 16:03:05 2018 +0200 ptr_ring: document usage around __ptr_ring_peek This explains why is the net usage of __ptr_ring_peek actually ok without locks. Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: John Fastabend <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit d542296a4d0d9f41d0186edcac2baba1b674d02f Author: Stephen Hemminger <[email protected]> Date: Mon Jan 8 08:23:18 2018 -0800 9p: add missing module license for xen transport The 9P of Xen module is missing required license and module information. See https://bugzilla.kernel.org/show_bug.cgi?id=198109 Reported-by: Alan Bartlett <[email protected]> Fixes: 868eb122739a ("xen/9pfs: introduce Xen 9pfs transport driver") Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit a0e3a18f4baf8e3754ac1e56f0ade924d0c0c721 Author: Steven Rostedt (VMware) <[email protected]> Date: Mon Jan 15 10:47:09 2018 -0500 ring-buffer: Bring back context level recursive checks Commit 1a149d7d3f45 ("ring-buffer: Rewrite trace_recursive_(un)lock() to be simpler") replaced the context level recursion checks with a simple counter. This would prevent the ring buffer code from recursively calling itself more than the max number of contexts that exist (Normal, softirq, irq, nmi). But this change caused a lockup in a specific case, which was during suspend and resume using a global clock. Adding a stack dump to see where this occurred, the issue was in the trace global clock itself: trace_buffer_lock_reserve+0x1c/0x50 __trace_graph_entry+0x2d/0x90 trace_graph_entry+0xe8/0x200 prepare_ftrace_return+0x69/0xc0 ftrace_graph_caller+0x78/0xa8 queued_spin_lock_slowpath+0x5/0x1d0 trace_clock_global+0xb0/0xc0 ring_buffer_lock_reserve+0xf9/0x390 The function graph tracer traced queued_spin_lock_slowpath that was called by trace_clock_global. This pointed out that the trace_clock_global() is not reentrant, as it takes a spin lock. It depended on the ring buffer recursive lock from letting that happen. By removing the context detection and adding just a max number of allowable recursions, it allowed the trace_clock_global() to be entered again and try to retake the spinlock it already held, causing a deadlock. Fixes: 1a149d7d3f45 ("ring-buffer: Rewrite trace_recursive_(un)lock() to be simpler") Reported-by: David Weinehall <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]> commit 499ed50f603b4c9834197b2411ba3bd9aaa624d4 Author: Benoît Thébaudeau <[email protected]> Date: Sun Jan 14 19:43:05 2018 +0100 mmc: sdhci-esdhc-imx: Fix i.MX53 eSDHCv3 clock Commit 5143c953a786 ("mmc: sdhci-esdhc-imx: Allow all supported prescaler values") made it possible to set SYSCTL.SDCLKFS to 0 in SDR mode, thus bypassing the SD clock frequency prescaler, in order to be able to get higher SD clock frequencies in some contexts. However, that commit missed the fact that this value is illegal on the eSDHCv3 instance of the i.MX53. This seems to be the only exception on i.MX, this value being legal even for the eSDHCv2 instances of the i.MX53. Fix this issue by changing the minimum prescaler value if the i.MX53 eSDHCv3 is detected. According to the i.MX53 reference manual, if DLLCTRL[10] can be set, then the controller is eSDHCv3, else it is eSDHCv2. This commit fixes the following issue, which was preventing the i.MX53 Loco (IMX53QSB) board from booting Linux 4.15.0-rc5: [ 1.882668] mmcblk1: error -84 transferring data, sector 2048, nr 8, cmd response 0x900, card status 0xc00 [ 2.002255] mmcblk1: error -84 transferring data, sector 2050, nr 6, cmd response 0x900, card status 0xc00 [ 12.645056] mmc1: Timeout waiting for hardware interrupt. [ 12.650473] mmc1: sdhci: ============ SDHCI REGISTER DUMP =========== [ 12.656921] mmc1: sdhci: Sys addr: 0x00000000 | Version: 0x00001201 [ 12.663366] mmc1: sdhci: Blk size: 0x00000004 | Blk cnt: 0x00000000 [ 12.669813] mmc1: sdhci: Argument: 0x00000000 | Trn mode: 0x00000013 [ 12.676258] mmc1: sdhci: Present: 0x01f8028f | Host ctl: 0x00000013 [ 12.682703] mmc1: sdhci: Power: 0x00000002 | Blk gap: 0x00000000 [ 12.689148] mmc1: sdhci: Wake-up: 0x00000000 | Clock: 0x0000003f [ 12.695594] mmc1: sdhci: Timeout: 0x0000008e | Int stat: 0x00000000 [ 12.702039] mmc1: sdhci: Int enab: 0x107f004b | Sig enab: 0x107f004b [ 12.708485] mmc1: sdhci: AC12 err: 0x00000000 | Slot int: 0x00001201 [ 12.714930] mmc1: sdhci: Caps: 0x07eb0000 | Caps_1: 0x08100810 [ 12.721375] mmc1: sdhci: Cmd: 0x0000163a | Max curr: 0x00000000 [ 12.727821] mmc1: sdhci: Resp[0]: 0x00000920 | Resp[1]: 0x00000000 [ 12.734265] mmc1: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000000 [ 12.740709] mmc1: sdhci: Host ctl2: 0x00000000 [ 12.745157] mmc1: sdhci: ADMA Err: 0x00000001 | ADMA Ptr: 0xc8049200 [ 12.751601] mmc1: sdhci: ============================================ [ 12.758110] print_req_error: I/O error, dev mmcblk1, sector 2050 [ 12.764135] Buffer I/O error on dev mmcblk1p1, logical block 0, lost sync page write [ 12.775163] EXT4-fs (mmcblk1p1): mounted filesystem without journal. Opts: (null) [ 12.782746] VFS: Mounted root (ext4 filesystem) on device 179:9. [ 12.789151] mmcblk1: response CRC error sending SET_BLOCK_COUNT command, card status 0x900 Signed-off-by: Benoît Thébaudeau <[email protected]> Reported-by: Wladimir J. van der Laan <[email protected]> Tested-by: Wladimir J. van der Laan <[email protected]> Fixes: 5143c953a786 ("mmc: sdhci-esdhc-imx: Allow all supported prescaler values") Cc: <[email protected]> # v4.13+ Signed-off-by: Ulf Hansson <[email protected]> commit 59b179b48ce2a6076448a44531242ac2b3f6cef2 Author: Johannes Berg <[email protected]> Date: Mon Jan 15 09:58:27 2018 +0100 cfg80211: check dev_set_name() return value syzbot reported a warning from rfkill_alloc(), and after a while I think that the reason is that it was doing fault injection and the dev_set_name() failed, leaving the name NULL, and we didn't check the return value and got to rfkill_alloc() with a NULL name. Since we really don't want a NULL name, we ought to check the return value. Fixes: fb28ad35906a ("net: struct device - replace bus_id with dev_name(), dev_set_name()") Reported-by: [email protected] Signed-off-by: Johannes Berg <[email protected]> commit 51a1aaa631c90223888d8beac4d649dc11d2ca55 Author: Johannes Berg <[email protected]> Date: Mon Jan 15 09:32:36 2018 +0100 mac80211_hwsim: validate number of different channels When creating a new radio on the fly, hwsim allows this to be done with an arbitrary number of channels, but cfg80211 only supports a limited number of simultaneous channels, leading to a warning. Fix this by validating the number - this requires moving the define for the maximum out to a visible header file. Reported-by: [email protected] Fixes: b59ec8dd4394 ("mac80211_hwsim: fix number of channels in interface combinations") Signed-off-by: Johannes Berg <[email protected]> commit b71d856ab536f25eb97c011a351ecddf5518de41 Author: Benjamin Beichler <[email protected]> Date: Wed Jan 10 17:42:51 2018 +0100 mac80211_hwsim: add workqueue to wait for deferred radio deletion on mod unload When closing multiple wmediumd instances with many radios and try to unload the mac80211_hwsim module, it may happen that the work items live longer than the module. To wait especially for this deletion work items, add a work queue, otherwise flush_scheduled_work would be necessary. Signed-off-by: Benjamin Beichler <[email protected]> Signed-off-by: Johannes Berg <[email protected]> commit 7a94b8c2eee7083ddccd0515830f8c81a8e44b1a Author: Dominik Brodowski <[email protected]> Date: Mon Jan 15 08:12:15 2018 +0100 nl80211: take RCU read lock when calling ieee80211_bss_get_ie() As ieee80211_bss_get_ie() derefences an RCU to return ssid_ie, both the call to this function and any operation on this variable need protection by the RCU read lock. Fixes: 44905265bc15 ("nl80211: don't expose wdev->ssid for most interfaces") Signed-off-by: Dominik Brodowski <[email protected]> Signed-off-by: Johannes Berg <[email protected]> commit a48a52b7bea81c046fe1c1288f84d0eba214cba0 Author: Johannes Berg <[email protected]> Date: Mon Jan 15 09:12:05 2018 +0100 cfg80211: fully initialize old channel for event Paul reported that he got a report about undefined behaviour that seems to me to originate in using uninitialized memory when the channel structure here is used in the event code in nl80211 later. He never reported whether this fixed it, and I wasn't able to trigger this so far, but we should do the right thing and fully initialize the on-stack structure anyway. Reported-by: Paul Menzel <[email protected]> Signed-off-by: Johannes Berg <[email protected]> commit 28d437d550e1e39f805d99f9f8ac399c778827b7 Author: Tom Lendacky <[email protected]> Date: Sat Jan 13 17:27:30 2018 -0600 x86/retpoline: Add LFENCE to the retpoline/RSB filling RSB macros The PAUSE instruction is currently used in the retpoline and RSB filling macros as a speculation trap. The use of PAUSE was originally suggested because it showed a very, very small difference in the amount of cycles/time used to execute the retpoline as compared to LFENCE. On AMD, the PAUSE instruction is not a serializing instruction, so the pause/jmp loop will use excess power as it is speculated over waiting for return to mispredict to the correct target. The RSB filling macro is applicable to AMD, and, if software is unable to verify that LFENCE is serializing on AMD (possible when running under a hypervisor), the generic retpoline support will be used and, so, is also applicable to AMD. Keep the current usage of PAUSE for Intel, but add an LFENCE instruction to the speculation trap for AMD. The same sequence has been adopted by GCC for the GCC generated retpolines. Signed-off-by: Tom Lendacky <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Acked-by: David Woodhouse <[email protected]> Acked-by: Arjan van de Ven <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Paul Turner <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tim Chen <[email protected]> Cc: Jiri Kosina <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Dan Williams <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Kees Cook <[email protected]> Link: https://lkml.kernel.org/r/[email protected] commit c995efd5a740d9cbafbf58bde4973e8b50b4d761 Author: David Woodhouse <[email protected]> Date: Fri Jan 12 17:49:25 2018 +0000 x86/retpoline: Fill RSB on context switch for affected CPUs On context switch from a shallow call stack to a deeper one, as the CPU does 'ret' up the deeper side it may encounter RSB entries (predictions for where the 'ret' goes to) which were populated in userspace. This is problematic if neither SMEP nor KPTI (the latter of which marks userspace pages as NX for the kernel) are active, as malicious code in userspace may then be executed speculatively. Overwrite the CPU's return prediction stack with calls which are predicted to return to an infinite loop, to "capture" speculation if this happens. This is required both for retpoline, and also in conjunction with IBRS for !SMEP && !KPTI. On Skylake+ the problem is slightly different, and an *underflow* of the RSB may cause errant branch predictions to occur. So there it's not so much overwrite, as *filling* the RSB to attempt to prevent it getting empty. This is only a partial solution for Skylake+ since there are many other conditions which may result in the RSB becoming empty. The full solution on Skylake+ is to use IBRS, which will prevent the problem even when the RSB becomes empty. With IBRS, the RSB-stuffing will not be required on context switch. [ tglx: Added missing vendor check and slighty massaged comments and changelog ] Signed-off-by: David Woodhouse <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Arjan van de Ven <[email protected]> Cc: [email protected] Cc: Rik van Riel <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: [email protected] Cc: Peter Zijlstra <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Jiri Kosina <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Kees Cook <[email protected]> Cc: Tim Chen <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Paul Turner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] commit 0d39e…
- Loading branch information