origin · 0day-ci/linux@733ce59

GIT 62d18ecfa64137349fac9c5817784fbd48b54f48

commit 009f8c90f571d87855914dbc20e6c0ea2a3b19ae
Author: Lukas Wunner <[email protected]>
Date:   Thu May 24 19:01:07 2018 +0200

    ALSA: hda - Fix runtime PM
    
    Before commit 3b5b899ca67d ("ALSA: hda: Make use of core codec functions
    to sync power state"), hda_set_power_state() returned the response to
    the Get Power State verb, a 32-bit unsigned integer whose expected value
    is 0x233 after transitioning a codec to D3, and 0x0 after transitioning
    it to D0.
    
    The response value is significant because hda_codec_runtime_suspend()
    does not clear the codec's bit in the codec_powered bitmask unless the
    AC_PWRST_CLK_STOP_OK bit (0x200) is set in the response value.  That in
    turn prevents the HDA controller from runtime suspending because
    azx_runtime_idle() checks that the codec_powered bitmask is zero.
    
    Since commit 3b5b899ca67d, hda_set_power_state() only returns 0x0 or
    0x1, thereby breaking runtime PM for any HDA controller.  That's because
    an inline function introduced by the commit returns a bool instead of a
    32-bit unsigned int.  The change was likely erroneous and resulted from
    copying and pasting snd_hda_check_power_state(), which is immediately
    preceding the newly introduced inline function.  Fix it.
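    
    A minimal userspace sketch of the truncation described above. The helper
    names are hypothetical stand-ins, not the driver's functions; only the
    values 0x233 and 0x200 come from this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit value quoted in the commit message. */
#define AC_PWRST_CLK_STOP_OK 0x200u

/* Hypothetical stand-in for the buggy inline helper: returning bool
 * collapses any non-zero response to 1. */
static bool sync_power_state_buggy(uint32_t response)
{
	return response;
}

/* Hypothetical stand-in for the fixed helper: the full response survives. */
static uint32_t sync_power_state_fixed(uint32_t response)
{
	return response;
}

/* Mirrors the check described above: the codec's bit is only cleared
 * when CLK_STOP_OK is set in the returned state. */
static bool clk_stop_ok(uint32_t state)
{
	return (state & AC_PWRST_CLK_STOP_OK) != 0;
}

/* Expected behaviour for the D3 response value 0x233. */
static void check_power_state_examples(void)
{
	assert(sync_power_state_buggy(0x233) == 1);
	assert(!clk_stop_ok(sync_power_state_buggy(0x233)));
	assert(clk_stop_ok(sync_power_state_fixed(0x233)));
}
```

    With the bool return type the 0x200 bit can never be observed by the
    caller, which matches the symptom reported here.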
    
    Link: https://bugs.freedesktop.org/show_bug.cgi?id=106597
    Fixes: 3b5b899ca67d ("ALSA: hda: Make use of core codec functions to sync power state")
    Cc: Alex Deucher <[email protected]>
    Cc: Abhijeet Kumar <[email protected]>
    Reported-and-tested-by: Gunnar Krüger <[email protected]>
    Signed-off-by: Lukas Wunner <[email protected]>
    Acked-by: Alex Deucher <[email protected]>
    Signed-off-by: Takashi Iwai <[email protected]>

commit d883c6cf3b39f1f42506e82ad2779fb88004acf3
Author: Joonsoo Kim <[email protected]>
Date:   Wed May 23 10:18:21 2018 +0900

    Revert "mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE"
    
    This reverts the following commits that change CMA design in MM.
    
     3d2054ad8c2d ("ARM: CMA: avoid double mapping to the CMA area if CONFIG_HIGHMEM=y")
    
     1d47a3ec09b5 ("mm/cma: remove ALLOC_CMA")
    
     bad8c6c0b114 ("mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE")
    
    Ville reported the following error on i386.
    
      Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
      microcode: microcode updated early to revision 0x4, date = 2013-06-28
      Initializing CPU#0
      Initializing HighMem for node 0 (000377fe:00118000)
      Initializing Movable for node 0 (00000001:00118000)
      BUG: Bad page state in process swapper  pfn:377fe
      page:f53effc0 count:0 mapcount:-127 mapping:00000000 index:0x0
      flags: 0x80000000()
      raw: 80000000 00000000 00000000 ffffff80 00000000 00000100 00000200 00000001
      page dumped because: nonzero mapcount
      Modules linked in:
      CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0-rc5-elk+ #145
      Hardware name: Dell Inc. Latitude E5410/03VXMC, BIOS A15 07/11/2013
      Call Trace:
       dump_stack+0x60/0x96
       bad_page+0x9a/0x100
       free_pages_check_bad+0x3f/0x60
       free_pcppages_bulk+0x29d/0x5b0
       free_unref_page_commit+0x84/0xb0
       free_unref_page+0x3e/0x70
       __free_pages+0x1d/0x20
       free_highmem_page+0x19/0x40
       add_highpages_with_active_regions+0xab/0xeb
       set_highmem_pages_init+0x66/0x73
       mem_init+0x1b/0x1d7
       start_kernel+0x17a/0x363
       i386_start_kernel+0x95/0x99
       startup_32_smp+0x164/0x168
    
    The reason for this error is that the span of ZONE_MOVABLE is extended
    to the whole node span for future CMA initialization, and normal memory
    is wrongly freed here.  I submitted a fix and it seems to work, but
    another problem then appeared.
    
    It is too late to fix the latter problem, so I have decided to revert
    the series.
    
    Reported-by: Ville Syrjälä <[email protected]>
    Acked-by: Laura Abbott <[email protected]>
    Acked-by: Michal Hocko <[email protected]>
    Cc: Andrew Morton <[email protected]>
    Signed-off-by: Joonsoo Kim <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

commit 4544e403eb25552aed7f0ee181a7a506b8800403
Author: Mika Westerberg <[email protected]>
Date:   Thu May 24 11:12:16 2018 +0300

    ahci: Add PCI ID for Cannon Lake PCH-LP AHCI
    
    This one should be using the default LPM policy for mobile chipsets so
    add the PCI ID to the driver list of supported devices.
    
    Signed-off-by: Mika Westerberg <[email protected]>
    Signed-off-by: Tejun Heo <[email protected]>
    Cc: [email protected]

commit 82034c23fcbc2389c73d97737f61fa2dd6526413
Author: Laura Abbott <[email protected]>
Date:   Wed May 23 11:43:46 2018 -0700

    arm64: Make sure permission updates happen for pmd/pud
    
    Commit 15122ee2c515 ("arm64: Enforce BBM for huge IO/VMAP mappings")
    disallowed block mappings for ioremap since that code does not honor
    break-before-make. The same APIs are also used for permission updating
    though and the extra checks prevent the permission updates from happening,
    even though this should be permitted. This results in read-only permissions
    not being fully applied. Visibly, this can occasionally be seen as a failure
    on the built-in rodata test when the test data ends up in a section or
    as an odd RW gap on the page table dump. Fix this by using
    pgattr_change_is_safe instead of p*d_present for determining if the
    change is permitted.
    
    Reviewed-by: Kees Cook <[email protected]>
    Tested-by: Peter Robinson <[email protected]>
    Reported-by: Peter Robinson <[email protected]>
    Fixes: 15122ee2c515 ("arm64: Enforce BBM for huge IO/VMAP mappings")
    Signed-off-by: Laura Abbott <[email protected]>
    Signed-off-by: Will Deacon <[email protected]>

commit d50147381aa0c9725d63a677c138c47f55d6d3bc
Author: Omar Sandoval <[email protected]>
Date:   Tue May 22 09:47:58 2018 -0700

    Btrfs: fix error handling in btrfs_truncate()
    
    Jun Wu at Facebook reported that an internal service was seeing a return
    value of 1 from ftruncate() on Btrfs in some cases. This is coming from
    the NEED_TRUNCATE_BLOCK return value from btrfs_truncate_inode_items().
    
    btrfs_truncate() uses two variables for error handling, ret and err.
    When btrfs_truncate_inode_items() returns non-zero, we set err to the
    return value. However, NEED_TRUNCATE_BLOCK is not an error. Make sure we
    only set err if ret is an error (i.e., negative).
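    
    The ret/err distinction can be sketched in isolation. This is an
    illustration only: the function names are hypothetical, and the value 1
    for NEED_TRUNCATE_BLOCK is assumed from the return value reported above.

```c
#include <assert.h>
#include <errno.h>

/* Assumed value: the message says ftruncate() leaked a return of 1. */
#define NEED_TRUNCATE_BLOCK 1

/* Pre-fix logic (hypothetical reduction): any non-zero ret is treated
 * as an error and propagated. */
static int record_result_buggy(int ret)
{
	int err = 0;

	if (ret)
		err = ret;
	return err;
}

/* Post-fix logic as described above: only negative values are errors. */
static int record_result_fixed(int ret)
{
	int err = 0;

	if (ret < 0)
		err = ret;
	return err;
}

/* A positive status must not leak out as an error; a real errno must. */
static void check_truncate_examples(void)
{
	assert(record_result_buggy(NEED_TRUNCATE_BLOCK) == 1);
	assert(record_result_fixed(NEED_TRUNCATE_BLOCK) == 0);
	assert(record_result_fixed(-ENOSPC) == -ENOSPC);
}
```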
    
    To reproduce the issue: mount a filesystem with -o compress-force=zstd
    and the following program will encounter return value of 1 from
    ftruncate:
    
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    
    int main(void) {
            char buf[256] = { 0 };
            int ret;
            int fd;
    
            fd = open("test", O_CREAT | O_WRONLY | O_TRUNC, 0666);
            if (fd == -1) {
                    perror("open");
                    return EXIT_FAILURE;
            }
    
            if (write(fd, buf, sizeof(buf)) != sizeof(buf)) {
                    perror("write");
                    close(fd);
                    return EXIT_FAILURE;
            }
    
            if (fsync(fd) == -1) {
                    perror("fsync");
                    close(fd);
                    return EXIT_FAILURE;
            }
    
            ret = ftruncate(fd, 128);
            if (ret) {
                    printf("ftruncate() returned %d\n", ret);
                    close(fd);
                    return EXIT_FAILURE;
            }
    
            close(fd);
            return EXIT_SUCCESS;
    }
    
    Fixes: ddfae63cc8e0 ("btrfs: move btrfs_truncate_block out of trans handle")
    CC: [email protected] # 4.15+
    Reported-by: Jun Wu <[email protected]>
    Signed-off-by: Omar Sandoval <[email protected]>
    Signed-off-by: David Sterba <[email protected]>

commit 55ba49cbcef37053d973f9a45bc58818c333fe13
Author: oulijun <[email protected]>
Date:   Tue May 22 20:47:15 2018 +0800

    RDMA/hns: Move the location for initializing tmp_len
    
    When posting a work request, the driver needs to compute the total
    length of all sges of every wr and fill it into the msg_len field of
    the send wqe. Thus, while posting multiple wrs, tmp_len should be
    reinitialized to zero for each wr.
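    
    A simplified sketch of the per-wr reset, assuming a hypothetical
    struct demo_wr that only carries sge lengths. The driver's real
    structures and loop are not shown in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified work request: a list of sge lengths. */
struct demo_wr {
	const uint32_t *sge_len;
	size_t num_sge;
};

/* Fill msg_len[i] with the total sge length of wr[i]. tmp_len is reset
 * at the top of every iteration so lengths do not accumulate across wrs. */
static void fill_msg_len(const struct demo_wr *wr, size_t nwr,
			 uint32_t *msg_len)
{
	assert(wr != NULL && msg_len != NULL);

	for (size_t i = 0; i < nwr; i++) {
		uint32_t tmp_len = 0;	/* reinitialized per work request */

		for (size_t j = 0; j < wr[i].num_sge; j++)
			tmp_len += wr[i].sge_len[j];
		msg_len[i] = tmp_len;
	}
}
```

    If tmp_len were initialized once outside the outer loop, the second
    wr's msg_len would also include the first wr's bytes.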
    
    Fixes: 8b9b8d143b46 ("RDMA/hns: Fix the endian problem for hns")
    Signed-off-by: Lijun Ou <[email protected]>
    Signed-off-by: Jason Gunthorpe <[email protected]>

commit 05d6a4ddb654ef6f2fbbcf9dcb3b263184baa8e4
Author: oulijun <[email protected]>
Date:   Tue May 22 20:47:14 2018 +0800

    RDMA/hns: Bugfix for cq record db for kernel
    
    When using the cq record db for the kernel, the driver needs to set
    hr_cq->db_en to 1 and configure the dma address of the record cq db
    of the qp context.
    
    Fixes: 86188a8810ed ("RDMA/hns: Support cq record doorbell for kernel space")
    Signed-off-by: Lijun Ou <[email protected]>
    Signed-off-by: Jason Gunthorpe <[email protected]>

commit f4602cbb0a2478dda8238a4f382867da425daa8e
Author: Jason Gunthorpe <[email protected]>
Date:   Tue May 22 15:56:51 2018 -0600

    IB/uverbs: Fix uverbs_attr_get_obj
    
    The err pointer comes from uverbs_attr_get, not from the uobject member,
    which does not store an ERR_PTR.
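    
    The pattern can be illustrated in userspace with imitations of the
    kernel's ERR_PTR helpers. Everything named demo_* below is hypothetical,
    and the helper bodies are a sketch of the encoding, not the kernel's
    definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitations of ERR_PTR(), PTR_ERR() and IS_ERR(). */
static void *err_ptr(long error) { return (void *)(intptr_t)error; }
static long ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }
static bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct demo_attr {
	void *uobject;	/* plain pointer, never an encoded error */
};

/* Hypothetical lookup: returns an encoded error when nothing is found. */
static struct demo_attr *demo_attr_get(struct demo_attr *attr)
{
	return attr ? attr : err_ptr(-ENOENT);
}

/* Check the lookup result itself; only then read the member. */
static void *demo_attr_get_obj(struct demo_attr *maybe_attr)
{
	struct demo_attr *attr = demo_attr_get(maybe_attr);

	if (is_err(attr))
		return attr;	/* propagate the encoded error */
	return attr->uobject;
}

/* A missing attribute must surface as -ENOENT, not as a dereference. */
static void check_attr_examples(void)
{
	void *res = demo_attr_get_obj(NULL);

	assert(is_err(res));
	assert(ptr_err(res) == -ENOENT);
}
```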
    
    Fixes: be934cca9e98 ("IB/uverbs: Add device memory registration ioctl support")
    Signed-off-by: Jason Gunthorpe <[email protected]>
    Reviewed-by: Leon Romanovsky <[email protected]>

commit 30bf066cd9989fef34aeeef9080368867fe42be7
Author: Kalderon, Michal <[email protected]>
Date:   Tue May 15 15:13:33 2018 +0300

    RDMA/qedr: Fix doorbell bar mapping for dpi > 1
    
    Each user_context receives a separate dpi value and thus a different
    address on the doorbell bar. The qedr_mmap function needs to validate
    the address and map the doorbell bar accordingly.
    The current implementation always checked against dpi=0 doorbell range
    leading to a wrong mapping for doorbell bar. (It entered an else case
    that mapped the address differently). qedr_mmap should only be used
    for doorbells, so the else was actually wrong in the first place.
    This only has an effect on the arm architecture and is not an issue on
    x86-based architectures.
    This led to doorbells not occurring on arm-based systems and left
    applications that use more than one dpi (or several applications
    running simultaneously) hanging.
    
    Fixes: ac1b36e55a51 ("qedr: Add support for user context verbs")
    Signed-off-by: Ariel Elior <[email protected]>
    Signed-off-by: Michal Kalderon <[email protected]>
    Reviewed-by: Leon Romanovsky <[email protected]>
    Signed-off-by: Jason Gunthorpe <[email protected]>

commit 6a93cea15ed38e2dba4a0552483d28b7a87a03bd
Author: Thomas Hellstrom <[email protected]>
Date:   Wed May 23 16:14:54 2018 +0200

    drm/vmwgfx: Schedule an fb dirty update after resume
    
    We have had problems displaying fbdev after a resume and as a
    workaround we have had to call vmw_fb_refresh(). This has had
    a number of unwanted side-effects. The root of the problem was,
    however, that the coalesced fbdev dirty region was not empty on
    the first dirty_mark() after a resume, so a flush was never
    scheduled.
    
    Fix this by force scheduling an fbdev flush after resume, and
    remove the workaround.
    
    Signed-off-by: Thomas Hellstrom <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Deepak Rawat <[email protected]>

commit f37230c0ad481091bc136788ff8b37dc86300c6d
Author: Thomas Hellstrom <[email protected]>
Date:   Wed May 23 16:13:20 2018 +0200

    drm/vmwgfx: Fix host logging / guestinfo reading error paths
    
    The error paths were leaking opened channels.
    Fix by using dedicated error paths.
    
    Cc: <[email protected]>
    Signed-off-by: Thomas Hellstrom <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Sinclair Yeh <[email protected]>

commit 938ae7259c908ad031da35d551da297640bb640c
Author: Thomas Hellstrom <[email protected]>
Date:   Wed May 23 16:11:24 2018 +0200

    drm/vmwgfx: Fix 32-bit VMW_PORT_HB_[IN|OUT] macros
    
    Depending on whether the kernel is compiled with frame-pointer or not,
    the temporary memory location used for the bp parameter in these macros
    is referenced relative to the stack pointer or the frame pointer.
    Hence we can never reference that parameter when we've modified either
    the stack pointer or the frame pointer, because then the compiler would
    generate an incorrect stack reference.
    
    Fix this by pushing the temporary memory parameter on a known location on
    the stack before modifying the stack- and frame pointers.
    
    Cc: <[email protected]>
    Signed-off-by: Thomas Hellstrom <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Sinclair Yeh <[email protected]>

commit 11799564fc7eedff50801950090773928f867996
Author: Brian Norris <[email protected]>
Date:   Tue May 22 17:23:10 2018 -0700

    mfd: cros_ec: Retry commands when EC is known to be busy
    
    Commit 001dde9400d5 ("mfd: cros ec: spi: Fix "in progress" error
    signaling") pointed out some bad code, but its analysis and conclusion
    were not 100% correct.
    
    It *is* correct that we should not propagate result==EC_RES_IN_PROGRESS
    for transport errors, because this has a special meaning -- that we
    should follow up with EC_CMD_GET_COMMS_STATUS until the EC is no longer
    busy. This is definitely the wrong thing for many commands, because
    among other problems, EC_CMD_GET_COMMS_STATUS doesn't actually retrieve
    any RX data from the EC, so commands that expected some data back will
    instead start processing junk.
    
    For such commands, the right answer is to either propagate the error
    (and return that error to the caller) or resend the original command
    (*not* EC_CMD_GET_COMMS_STATUS).
    
    Unfortunately, commit 001dde9400d5 forgets a crucial point: that for
    some long-running operations, the EC physically cannot respond to
    commands any more. For example, with EC_CMD_FLASH_ERASE, the EC may be
    re-flashing its own code regions, so it can't respond to SPI interrupts.
    Instead, the EC prepares us ahead of time for being busy for a "long"
    time, and fills its hardware buffer with EC_SPI_PAST_END. Thus, we
    expect to see several "transport" errors (or, messages filled with
    EC_SPI_PAST_END). So we should really translate that to a retryable
    error (-EAGAIN) and continue sending EC_CMD_GET_COMMS_STATUS until we
    get a ready status.
    
    IOW, it is actually important to treat some of these "junk" values as
    retryable errors.
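    
    A rough sketch of the "treat busy bytes as retryable" idea. The names
    are hypothetical and the byte value is a placeholder for
    EC_SPI_PAST_END; the driver's actual transfer and polling code is not
    reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder for EC_SPI_PAST_END; the real value lives in the EC headers. */
#define DEMO_SPI_PAST_END 0xedu

/* Classify the first byte of a reply: "past end" means the EC is busy
 * and the caller may retry; anything else is passed through as success. */
static int demo_classify(uint8_t first_byte)
{
	return first_byte == DEMO_SPI_PAST_END ? -EAGAIN : 0;
}

/* Poll a scripted sequence of reply bytes until one is not retryable or
 * max_tries is exhausted. Returns the number of polls used on success,
 * or -EAGAIN if the EC never became ready. */
static int demo_poll_until_ready(const uint8_t *replies, size_t n,
				 size_t max_tries)
{
	assert(replies != NULL);

	for (size_t i = 0; i < n && i < max_tries; i++) {
		if (demo_classify(replies[i]) != -EAGAIN)
			return (int)(i + 1);
	}
	return -EAGAIN;
}
```

    Aborting on the first busy byte corresponds to max_tries == 1, which is
    the behaviour that broke long-running commands.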
    
    Together with commit 001dde9400d5, this resolves bugs like the
    following:
    
    1. EC_CMD_FLASH_ERASE now works again (with commit 001dde9400d5, we
       would abort the first time we saw EC_SPI_PAST_END)
    2. Before commit 001dde9400d5, transport errors (e.g.,
       EC_SPI_RX_BAD_DATA) seen in other commands (e.g.,
       EC_CMD_RTC_GET_VALUE) used to yield junk data in the RX buffer; they
       will now yield -EAGAIN return values, and tools like 'hwclock' will
       simply fail instead of retrieving and re-programming undefined time
       values
    
    Fixes: 001dde9400d5 ("mfd: cros ec: spi: Fix "in progress" error signaling")
    Signed-off-by: Brian Norris <[email protected]>
    Signed-off-by: Lee Jones <[email protected]>

commit 92d7223a74235054f2aa7227d207d9c57f84dca0
Author: Sinan Kaya <[email protected]>
Date:   Mon Apr 16 18:16:56 2018 -0400

    alpha: io: reorder barriers to guarantee writeX() and iowriteX() ordering #2
    
    memory-barriers.txt has been updated with the following requirement.
    
    "When using writel(), a prior wmb() is not needed to guarantee that the
    cache coherent memory writes have completed before writing to the MMIO
    region."
    
    Current writeX() and iowriteX() implementations on alpha are not
    satisfying this requirement as the barrier is after the register write.
    
    Move mb() in writeX() and iowriteX() functions to guarantee that HW
    observes memory changes before performing register operations.
    
    Signed-off-by: Sinan Kaya <[email protected]>
    Reported-by: Arnd Bergmann <[email protected]>
    Signed-off-by: Matt Turner <[email protected]>

commit f5e82fa26063e6fad10624ff600457d878fa6e41
Author: Christoph Hellwig <[email protected]>
Date:   Wed May 9 16:04:52 2018 +0200

    alpha: simplify get_arch_dma_ops
    
    Remove the dma_ops indirection.
    
    Signed-off-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Matt Turner <[email protected]>

commit 6db615431a21b6057f68ed87583a663ee69f7601
Author: Christoph Hellwig <[email protected]>
Date:   Wed May 9 16:04:51 2018 +0200

    alpha: use dma_direct_ops for jensen
    
    The generic dma_direct implementation does the same thing as the alpha
    pci-noop implementation, just with more bells and whistles.  And unlike
    the current code it at least has a theoretical chance to actually compile.
    
    Signed-off-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Matt Turner <[email protected]>

commit cc19846079a70abcfd91b5a0791a5f17d69458a5
Author: Peter Maydell <[email protected]>
Date:   Tue May 22 17:11:20 2018 +0100

    arm64: fault: Don't leak data in ESR context for user fault on kernel VA
    
    If userspace faults on a kernel address, handing them the raw ESR
    value on the sigframe as part of the delivered signal can leak data
    useful to attackers who are using information about the underlying hardware
    fault type (e.g. translation vs permission) as a mechanism to defeat KASLR.
    
    However there are also legitimate uses for the information provided
    in the ESR -- notably the GCC and LLVM sanitizers use this to report
    whether wild pointer accesses by the application are reads or writes
    (since a wild write is a more serious bug than a wild read), so we
    don't want to drop the ESR information entirely.
    
    For faulting addresses in the kernel, sanitize the ESR. We choose
    to present userspace with the illusion that there is nothing mapped
    in the kernel's part of the address space at all, by reporting all
    faults as level 0 translation faults taken to EL1.
    
    These fields are safe to pass through to userspace as they depend
    only on the instruction that userspace used to provoke the fault:
     EC IL (always)
     ISV CM WNR (for all data aborts)
    All the other fields in ESR except DFSC are architecturally RES0
    for an L0 translation fault taken to EL1, so can be zeroed out
    without confusing userspace.
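    
    The masking for the data abort case might look roughly like the sketch
    below. The DEMO_* bit positions reflect my reading of the ESR_ELx layout
    and are assumptions to be checked against the architecture manual; the
    function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed ESR_ELx field positions (verify before relying on them). */
#define DEMO_ESR_EC_MASK  (0x3fu << 26)	/* exception class */
#define DEMO_ESR_IL       (1u << 25)	/* instruction length */
#define DEMO_ESR_ISV      (1u << 24)	/* instruction syndrome valid */
#define DEMO_ESR_CM       (1u << 8)	/* cache maintenance */
#define DEMO_ESR_WNR      (1u << 6)	/* write, not read */
#define DEMO_ESR_FSC_MASK 0x3fu		/* fault status code */
#define DEMO_FSC_L0_TRANS 0x04u		/* level 0 translation fault */

/* Keep only the fields that depend on the faulting instruction, then
 * report the fault status as a level 0 translation fault. */
static uint32_t demo_sanitize_dabt_esr(uint32_t esr)
{
	esr &= DEMO_ESR_EC_MASK | DEMO_ESR_IL | DEMO_ESR_ISV |
	       DEMO_ESR_CM | DEMO_ESR_WNR;
	esr |= DEMO_FSC_L0_TRANS;
	return esr;
}

/* A write that hit a permission fault keeps WnR but loses the real
 * fault status code. */
static void check_esr_examples(void)
{
	uint32_t in = (0x24u << 26) | DEMO_ESR_IL | DEMO_ESR_WNR | 0x0fu;
	uint32_t out = demo_sanitize_dabt_esr(in);

	assert((out & DEMO_ESR_FSC_MASK) == DEMO_FSC_L0_TRANS);
	assert(out & DEMO_ESR_WNR);
	assert((out & DEMO_ESR_EC_MASK) == (0x24u << 26));
}
```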
    
    The illusion is not entirely perfect, as there is a tiny wrinkle
    where we will report an alignment fault that was not due to the memory
    type (for instance a LDREX to an unaligned address) as a translation
    fault, whereas if you do this on real unmapped memory the alignment
    fault takes precedence. This is not likely to trip anybody up in
    practice, as the only users we know of for the ESR information who
    care about the behaviour for kernel addresses only really want to
    know about the WnR bit.
    
    Signed-off-by: Peter Maydell <[email protected]>
    Signed-off-by: Will Deacon <[email protected]>

commit c62ec4610c40bcc44f2d3d5ed1c312737279e2f3
Author: Rafael J. Wysocki <[email protected]>
Date:   Tue May 22 13:02:17 2018 +0200

    PM / core: Fix direct_complete handling for devices with no callbacks
    
    Commit 08810a4119aa (PM / core: Add NEVER_SKIP and SMART_PREPARE
    driver flags) inadvertently prevented the power.direct_complete flag
    from being set for devices without PM callbacks and with disabled
    runtime PM which also prevents power.direct_complete from being set
    for their parents.  That led to problems including a resume crash on
    HP ZBook 14u.
    
    Restore the previous behavior by causing power.direct_complete to be
    set for those devices again, but do that in a more direct way to
    avoid overlooking that case in the future.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=199693
    Fixes: 08810a4119aa (PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags)
    Reported-by: Thomas Martitz <[email protected]>
    Tested-by: Thomas Martitz <[email protected]>
    Cc: 4.15+ <[email protected]> # 4.15+
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Reviewed-by: Ulf Hansson <[email protected]>
    Reviewed-by: Johan Hovold <[email protected]>

commit a048a07d7f4535baa4cbad6bc024f175317ab938
Author: Nicholas Piggin <[email protected]>
Date:   Tue May 22 09:00:00 2018 +1000

    powerpc/64s: Add support for a store forwarding barrier at kernel entry/exit
    
    On some CPUs we can prevent a vulnerability related to store-to-load
    forwarding by preventing store forwarding between privilege domains,
    by inserting a barrier in kernel entry and exit paths.
    
    This is known to be the case on at least Power7, Power8 and Power9
    powerpc CPUs.
    
    Barriers must generally be inserted before the first load after moving
    to a higher privilege, and after the last store before moving to a
    lower privilege.  HV and PR privilege transitions must be protected.
    
    Barriers are added as patch sections, with all kernel/hypervisor entry
    points patched, and the exit points to lower privilege levels patched
    similarly to the RFI flush patching.
    
    Firmware advertisement is not implemented yet, so CPU flush types
    are hard coded.
    
    Thanks to Michal Suchánek for bug fixes and review.
    
    Signed-off-by: Nicholas Piggin <[email protected]>
    Signed-off-by: Mauricio Faria de Oliveira <[email protected]>
    Signed-off-by: Michael Neuling <[email protected]>
    Signed-off-by: Michal Suchánek <[email protected]>
    Signed-off-by: Michael Ellerman <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

commit eedffa28c9b00ca2dcb4d541b5a530f4c917052d
Author: Jeff Layton <[email protected]>
Date:   Mon May 21 14:35:03 2018 -0400

    loop: clear wb_err in bd_inode when detaching backing file
    
    When a loop block device encounters a writeback error, that error will
    get propagated to the bd_inode's wb_err field. If we then detach the
    backing file from it, attach another and fsync it, we'll get back the
    writeback error that we had from the previous backing file.
    
    This is a bit of a grey area as POSIX doesn't cover loop devices, but it
    is somewhat counterintuitive.
    
    If we detach a backing file from the loopdev while there are still
    unreported errors, take it as a sign that we're no longer interested in
    the previous file, and clear out the wb_err in the loop blockdev.
    
    Reported-and-Tested-by: Theodore Y. Ts'o <[email protected]>
    Signed-off-by: Jeff Layton <[email protected]>
    Signed-off-by: Jens Axboe <[email protected]>

commit baf10564fbb66ea222cae66fbff11c444590ffd9
Author: Al Viro <[email protected]>
Date:   Sun May 20 16:46:23 2018 -0400

    aio: fix io_destroy(2) vs. lookup_ioctx() race
    
    kill_ioctx() used to have an explicit RCU delay between removing the
    reference from ->ioctx_table and percpu_ref_kill() dropping the refcount.
    At some point that delay had been removed, on the theory that
    percpu_ref_kill() itself contained an RCU delay.  Unfortunately, that was
    the wrong kind of RCU delay and it didn't care about rcu_read_lock() used
    by lookup_ioctx().  As a result, we could get ctx freed right under
    lookup_ioctx().  Tejun has fixed that in a6d7cff472e ("fs/aio: Add explicit
    RCU grace period when freeing kioctx"); however, that fix is not enough.
    
    Suppose io_destroy() from one thread races with e.g. io_setup() from another;
    CPU1 removes the reference from current->mm->ioctx_table[...] just as CPU2
    has picked it (under rcu_read_lock()).  Then CPU1 proceeds to drop the
    refcount, getting it to 0 and triggering a call of free_ioctx_users(),
    which proceeds to drop the secondary refcount and once that reaches zero
    calls free_ioctx_reqs().  That does
            INIT_RCU_WORK(&ctx->free_rwork, free_ioctx);
            queue_rcu_work(system_wq, &ctx->free_rwork);
    and schedules freeing the whole thing after RCU delay.
    
    In the meanwhile CPU2 has gotten around to percpu_ref_get(), bumping the
    refcount from 0 to 1 and returned the reference to io_setup().
    
    Tejun's fix (that queue_rcu_work() in there) guarantees that ctx won't get
    freed until after percpu_ref_get().  Sure, we'd increment the counter before
    ctx can be freed.  Now we are out of rcu_read_lock() and there's nothing to
    stop freeing of the whole thing.  Unfortunately, CPU2 assumes that since it
    has grabbed the reference, ctx is *NOT* going away until it gets around to
    dropping that reference.
    
    The fix is obvious - use percpu_ref_tryget_live() and treat failure as miss.
    It's not costlier than what we currently do in normal case, it's safe to
    call since freeing *is* delayed and it closes the race window - either
    lookup_ioctx() comes before percpu_ref_kill() (in which case ctx->users
    won't reach 0 until the caller of lookup_ioctx() drops it) or lookup_ioctx()
    fails, ctx->users is unaffected and caller of lookup_ioctx() doesn't see
    the object in question at all.
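    
    The difference between an unconditional get and a tryget_live can be
    modelled with a deliberately simplified, single-threaded refcount. This
    is not how percpu_ref is implemented and the demo_* names are
    hypothetical; it only shows why a lookup should fail once the reference
    has been killed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical single-threaded model of a killable reference count. */
struct demo_ref {
	long count;
	bool dead;	/* set once the owner has killed the ref */
};

/* Unconditional get: happily bumps a count that may already be 0. */
static void demo_ref_get(struct demo_ref *ref)
{
	ref->count++;
}

/* Conditional get: fails once the ref has been killed. */
static bool demo_ref_tryget_live(struct demo_ref *ref)
{
	if (ref->dead)
		return false;
	ref->count++;
	return true;
}

/* Kill: mark dead and drop the initial reference. Killing twice is a bug. */
static void demo_ref_kill(struct demo_ref *ref)
{
	assert(ref != NULL && !ref->dead);
	ref->dead = true;
	ref->count--;
}
```

    In this model a lookup that uses demo_ref_tryget_live() after the kill
    is reported as a miss, whereas demo_ref_get() would resurrect a count
    that has already reached zero.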
    
    Cc: [email protected]
    Fixes: a6d7cff472e "fs/aio: Add explicit RCU grace period when freeing kioctx"
    Signed-off-by: Al Viro <[email protected]>

commit 5aa1437d2d9a068c0334bd7c9dafa8ec4f97f13b
Author: Al Viro <[email protected]>
Date:   Thu May 17 17:18:30 2018 -0400

    ext2: fix a block leak
    
    open file, unlink it, then use ioctl(2) to make it immutable or
    append only.  Now close it and watch the blocks *not* freed...
    
    Immutable/append-only checks belong in ->setattr().
    Note: the bug is old and backport to anything prior to 737f2e93b972
    ("ext2: convert to use the new truncate convention") will need
    these checks lifted into ext2_setattr().
    
    Cc: [email protected]
    Signed-off-by: Al Viro <[email protected]>

commit 3819bb0d79f50b05910db5bdc6d9ef512184e3b1
Author: Al Viro <[email protected]>
Date:   Fri May 11 17:03:19 2018 -0400

    nfsd: vfs_mkdir() might succeed leaving dentry negative unhashed
    
    That can (and does, on some filesystems) happen - ->mkdir() (and thus
    vfs_mkdir()) can legitimately leave its argument negative and just
    unhash it, counting upon the lookup to pick the object we'd created
    next time we try to look at that name.
    
    Some vfs_mkdir() callers forget about that possibility...
    
    Acked-by: J. Bruce Fields <[email protected]>
    Signed-off-by: Al Viro <[email protected]>

commit 9c3e9025a3f7ed25c99a0add8af65431c8043800
Author: Al Viro <[email protected]>
Date:   Thu May 10 22:59:45 2018 -0400

    cachefiles: vfs_mkdir() might succeed leaving dentry negative unhashed
    
    That can (and does, on some filesystems) happen - ->mkdir() (and thus
    vfs_mkdir()) can legitimately leave its argument negative and just
    unhash it, counting upon the lookup to pick the object we'd created
    next time we try to look at that name.
    
    Some vfs_mkdir() callers forget about that possibility...
    
    Signed-off-by: Al Viro <[email protected]>

commit 7b745a4e4051e1bbce40e0b1c2cf636c70583aa4
Author: Al Viro <[email protected]>
Date:   Mon May 14 00:03:34 2018 -0400

    unfuck sysfs_mount()
    
    new_sb is left uninitialized in case of early failures in kernfs_mount_ns(),
    and while IS_ERR(root) is true in all such cases, using IS_ERR(root) || !new_sb
    is not a solution - IS_ERR(root) is true in some cases when new_sb is true.
    
    Make sure new_sb is initialized (and matches the reality) in all cases and
    fix the condition for dropping kobj reference - we want it done precisely
    in those situations where the reference has not been transferred into a new
    super_block instance.
    
    Signed-off-by: Al Viro <[email protected]>

commit 82382acec0c97b91830fff7130d0acce4ac4f3f3
Author: Al Viro <[email protected]>
Date:   Tue Apr 3 00:22:29 2018 -0400

    kernfs: deal with kernfs_fill_super() failures
    
    make sure that info->node is initialized early, so that kernfs_kill_sb()
    can list_del() it safely.
    
    Signed-off-by: Al Viro <[email protected]>

commit 08a8f3086880325433d66b2dc9cdfb3f095adddf
Author: Joe Perches <[email protected]>
Date:   Sun May 13 15:05:47 2018 -0700

    cramfs: Fix IS_ENABLED typo
    
    There's an extra C here...
    
    Fixes: 99c18ce580c6 ("cramfs: direct memory access support")
    Acked-by: Nicolas Pitre <[email protected]>
    Signed-off-by: Joe Perches <[email protected]>
    Signed-off-by: Al Viro <[email protected]>

commit f4e4d434fe3f5eceea470bf821683677dabe39c4
Author: Al Viro <[email protected]>
Date:   Mon Apr 30 19:02:02 2018 -0400

    befs_lookup(): use d_splice_alias()
    
    RTFS(Documentation/filesystems/nfs/Exporting) if you try to make
    something exportable.
    
    Fixes: ac632f5b6301 "befs: add NFS export support"
    Signed-off-by: Al Viro <[email protected]>

commit 87fbd639c02ec96d67738e40b6521fb070ed7168
Author: Al Viro <[email protected]>
Date:   Sun May 6 12:20:40 2018 -0400

    affs_lookup: switch to d_splice_alias()
    
    Making something exportable takes more than providing ->s_export_ops.
    In particular, ->lookup() *MUST* use d_splice_alias() instead of
    d_add().
    
    Reading Documentation/filesystems/nfs/Exporting would've been a good idea;
    as it is, exporting AFFS is badly (and exploitably) broken.
    
    Partially-Fixes: ed4433d72394 "fs/affs: make affs exportable"
    Acked-by: David Sterba <[email protected]>
    Signed-off-by: Al Viro <[email protected]>

commit 30da870ce4a4e007c901858a96e9e394a1daa74a
Author: Al Viro <[email protected]>
Date:   Sun May 6 12:15:20 2018 -0400

    affs_lookup(): close a race with affs_remove_link()
    
    we unlock the directory hash too early - if we are looking at secondary
    link and primary (in another directory) gets removed just as we unlock,
    we could have the old primary moved in place of the secondary, leaving
    us to look into freed entry (and leaving our dentry with ->d_fsdata
    pointing to a freed entry).
    
    Cc: [email protected] # 2.4.4+
    Acked-by: David Sterba <[email protected]>
    Signed-off-by: Al Viro <[email protected]>

commit f7068114d45ec55996b9040e98111afa56e010fe
Author: Jens Axboe <[email protected]>
Date:   Mon May 21 12:21:14 2018 -0600

    sr: pass down correctly sized SCSI sense buffer
    
    We're casting the CDROM layer request_sense to the SCSI sense
    buffer, but the former is 64 bytes and the latter is 96 bytes.
    As we generally allocate these on the stack, we end up blowing
    up the stack.
    
    Fix this by wrapping the scsi_execute() call with a properly
    sized sense buffer, and copying back the bits for the CDROM
    layer.
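    
    The wrapping approach can be sketched as follows. The 64 and 96 byte
    sizes come from this message; the demo_* functions are hypothetical
    stand-ins for the CDROM caller and the SCSI layer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sizes quoted in the commit message. */
#define DEMO_CDROM_SENSE_SIZE 64
#define DEMO_SCSI_SENSE_SIZE  96

/* Hypothetical lower layer that always writes a full SCSI-sized buffer. */
static void demo_lower_layer_fill(unsigned char *sense)
{
	assert(sense != NULL);
	memset(sense, 0xa5, DEMO_SCSI_SENSE_SIZE);
}

/* Wrapper: give the lower layer a correctly sized scratch buffer, then
 * copy back only as many bytes as the caller's buffer can hold. */
static void demo_execute(unsigned char *cdrom_sense)
{
	unsigned char scsi_sense[DEMO_SCSI_SENSE_SIZE];

	demo_lower_layer_fill(scsi_sense);
	if (cdrom_sense)
		memcpy(cdrom_sense, scsi_sense, DEMO_CDROM_SENSE_SIZE);
}
```

    Passing the 64-byte buffer straight to demo_lower_layer_fill() would
    write 32 bytes past its end, which is the overflow described above.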
    
    Cc: [email protected]
    Reported-by: Piotr Gabriel Kosinski <[email protected]>
    Reported-by: Daniel Shapira <[email protected]>
    Tested-by: Kees Cook <[email protected]>
    Fixes: 82ed4db499b8 ("block: split scsi_request out of struct request")
    Signed-off-by: Jens Axboe <[email protected]>

commit 255845fc43a3aaf806852a1d3bc89bff1411ebe3
Author: Jason A. Donenfeld <[email protected]>
Date:   Sat Apr 28 00:42:52 2018 +0200

    arm64: export tishift functions to modules
    
    Otherwise modules that use these arithmetic operations will fail to
    link. We accomplish this with the usual EXPORT_SYMBOL, which on most
    architectures goes in the .S file but the ARM64 maintainers prefer that
    instead it goes into arm64ksyms.
    
    While we're at it, we also fix this up to use SPDX, and I personally
    choose to relicense this as GPL2||BSD so that these symbols don't need
    to be export_symbol_gpl, so all modules can use the routines, since
    these are important general purpose compiler-generated function calls.
    
    Signed-off-by: Jason A. Donenfeld <[email protected]>
    Reported-by: PaX Team <[email protected]>
    Cc: [email protected]
    Signed-off-by: Will Deacon <[email protected]>

commit 32c3fa7cdf0c4a3eb8405fc3e13398de019e828b
Author: Will Deacon <[email protected]>
Date:   Mon May 21 17:44:57 2018 +0100

    arm64: lse: Add early clobbers to some input/output asm operands
    
    For LSE atomics that read and write a register operand, we need to
    ensure that these operands are annotated as "early clobber" if the
    register is written before all of the input operands have been consumed.
    Failure to do so can result in the compiler allocating the same register
    to both operands, leading to splats such as:
    
     Unable to handle kernel paging request at virtual address 11111122222221
     [...]
     x1 : 1111111122222222 x0 : 1111111122222221
     Process swapper/0 (pid: 1, stack limit = 0x000000008209f908)
     Call trace:
      test_atomic64+0x1360/0x155c
    
    where x0 has been allocated as both the value to be stored and also the
    atomic_t pointer.
    
    This patch adds the missing clobbers.
    
    Cc: <[email protected]>
    Cc: Dave Martin <[email protected]>
    Cc: Robin Murphy <[email protected]>
    Reported-by: Mark Salter <[email protected]>
    Signed-off-by: Will Deacon <[email protected]>

commit 136d769e0b3475d71350aa3648a116a6ee7a8f6c
Author: Sudip Mukherjee <[email protected]>
Date:   Sat May 19 22:29:36 2018 +0100

    libata: blacklist Micron 500IT SSD with MU01 firmware
    
    While whitelisting Micron M500DC drives, the tweaked blacklist entry
    also enabled queued TRIM for M500IT variants, but these do not support
    queued TRIM. While using those SSDs with the latest kernel we have
    seen errors and even the partition table getting corrupted.
    
    Part of the dmesg:
    [    6.727384] ata1.00: ATA-9: Micron_M500IT_MTFDDAK060MBD, MU01, max UDMA/133
    [    6.727390] ata1.00: 117231408 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
    [    6.741026] ata1.00: supports DRM functions and may not be fully accessible
    [    6.759887] ata1.00: configured for UDMA/133
    [    6.762256] scsi 0:0:0:0: Direct-Access     ATA      Micron_M500IT_MT MU01 PQ: 0 ANSI: 5
    
    and then for the error:
    [  120.860334] ata1.00: exception Emask 0x1 SAct 0x7ffc0007 SErr 0x0 action 0x6 frozen
    [  120.860338] ata1.00: irq_stat 0x40000008
    [  120.860342] ata1.00: failed command: SEND FPDMA QUEUED
    [  120.860351] ata1.00: cmd 64/01:00:00:00:00/00:00:00:00:00/a0 tag 0 ncq dma 512 out
             res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x5 (timeout)
    [  120.860353] ata1.00: status: { DRDY }
    [  120.860543] ata1: hard resetting link
    [  121.166128] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    [  121.166376] ata1.00: supports DRM functions and may not be fully accessible
    [  121.186238] ata1.00: supports DRM functions and may not be fully accessible
    [  121.204445] ata1.00: configured for UDMA/133
    [  121.204454] ata1.00: device reported invalid CHS sector 0
    [  121.204541] sd 0:0:0:0: [sda] tag#18 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
    [  121.204546] sd 0:0:0:0: [sda] tag#18 Sense Key : 0x5 [current]
    [  121.204550] sd 0:0:0:0: [sda] tag#18 ASC=0x21 ASCQ=0x4
    [  121.204555] sd 0:0:0:0: [sda] tag#18 CDB: opcode=0x93 93 08 00 00 00 00 00 04 28 80 00 00 00 30 00 00
    [  121.204559] print_req_error: I/O error, dev sda, sector 272512
    
    After a few reboots with these errors, the SSD is corrupted.
    After blacklisting it, the errors are not seen and the SSD does not get
    corrupted any more.
    
    Fixes: 243918be6393 ("libata: Do not blacklist Micron M500DC")
    Cc: Martin K. Petersen <[email protected]>
    Cc: [email protected]
    Signed-off-by: Sudip Mukherjee <[email protected]>
    Signed-off-by: Tejun Heo <[email protected]>

commit 3de06d5a1f05c11c94cbb68af14dbfa7fb81d78b
Author: Corneliu Doban <[email protected]>
Date:   Fri May 18 15:03:57 2018 -0700

    mmc: sdhci-iproc: add SDHCI_QUIRK2_HOST_OFF_CARD_ON for cygnus
    
    The SDHCI_QUIRK2_HOST_OFF_CARD_ON is needed for the driver to
    properly reset the host controller (reset all) on initialization
    after exiting deep sleep.
    
    Signed-off-by: Corneliu Doban <[email protected]>
    Signed-off-by: Scott Branden <[email protected]>
    Reviewed-by: Ray Jui <[email protected]>
    Reviewed-by: Srinath Mannam <[email protected]>
    Fixes: c833e92bbb60 ("mmc: sdhci-iproc: support standard byte register accesses")
    Cc: [email protected] # v4.10+
    Signed-off-by: Ulf Hansson <[email protected]>

commit 5f651b870485ee60f5abbbd85195a6852978894a
Author: Corneliu Doban <[email protected]>
Date:   Fri May 18 15:03:56 2018 -0700

    mmc: sdhci-iproc: fix 32bit writes for TRANSFER_MODE register
    
    When the host controller accepts only 32bit writes, the value of the
    16bit TRANSFER_MODE register, which shares its 32bit address with the
    16bit COMMAND register, needs to be saved and written in a single
    32bit write together with the command, as this write triggers the
    host to send the command on the SD interface.
    When sending the tuning command, TRANSFER_MODE is written and then
    sdhci_set_transfer_mode reads it back to clear the AUTO_CMD12 bit and
    writes it again. This results in a wrong value being written, because
    the initial write was only saved in a shadow and the read-back
    returned a wrong value from the register.
    Fix sdhci_iproc_readw to return the saved value of TRANSFER_MODE
    when a saved value exists.
    Apply the same fix to reads of the BLOCK_SIZE and BLOCK_COUNT
    registers, which are saved for a different reason, although a scenario
    that causes the mentioned problem on these registers is not probable.
    
    Fixes: b580c52d58d9 ("mmc: sdhci-iproc: add IPROC SDHCI driver")
    Signed-off-by: Corneliu Doban <[email protected]>
    Signed-off-by: Scott Branden <[email protected]>
    Cc: [email protected] # v4.1+
    Signed-off-by: Ulf Hansson <[email protected]>
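The shadow-register behaviour described in the sdhci-iproc commit above can be modelled in a few lines. This is a simplified sketch, not the driver code: the struct and function names are hypothetical, and it assumes TRANSFER_MODE occupies the low 16 bits and COMMAND the high 16 bits of the shared 32-bit word.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative model; field and function names are hypothetical. */
struct host_model {
    uint32_t reg32;        /* 32-bit word: TRANSFER_MODE low, COMMAND high */
    uint16_t shadow_mode;  /* TRANSFER_MODE saved until COMMAND is written */
    bool     shadow_valid; /* true while a saved value is pending */
};

/* A TRANSFER_MODE write is only saved; nothing reaches the register yet. */
void write_transfer_mode(struct host_model *h, uint16_t val)
{
    h->shadow_mode = val;
    h->shadow_valid = true;
}

/* Read back: prefer the pending shadow value, else the register itself. */
uint16_t read_transfer_mode(const struct host_model *h)
{
    if (h->shadow_valid)
        return h->shadow_mode;
    return (uint16_t)(h->reg32 & 0xffff);
}

/* A COMMAND write flushes both halves in a single 32-bit write. */
void write_command(struct host_model *h, uint16_t cmd)
{
    uint16_t mode = read_transfer_mode(h);

    h->reg32 = ((uint32_t)cmd << 16) | mode;
    h->shadow_valid = false;
}
```

If `read_transfer_mode` ignored the shadow, a read-modify-write of TRANSFER_MODE before the command is issued would start from whatever the register still held, which is the failure the commit describes.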

commit 4c94238f37af87a2165c3fb491b4a8b50e90649c
Author: Srinath Mannam <[email protected]>
Date:   Fri May 18 15:03:55 2018 -0700

    mmc: sdhci-iproc: remove hard coded mmc cap 1.8v
    
    Remove hard coded mmc cap 1.8v from platform data as it is board specific.
    The 1.8v DDR mmc caps can be enabled using DTS property for those
    boards that support it.
    
    Fixes: b17b4ab8ce38 ("mmc: sdhci-iproc: define MMC caps in platform data")
    Signed-off-by: Srinath Mannam <[email protected]>
    Signed-off-by: Scott Branden <[email protected]>
    Reviewed-by: Ray Jui <[email protected]>
    Cc: [email protected] # v4.8+
    Signed-off-by: Ulf Hansson <[email protected]>

commit b25b750df99bcba29317d3f9d9f93c4ec58890e6
Author: Mathieu Malaterre <[email protected]>
Date:   Wed May 16 21:20:20 2018 +0200

    mmc: block: propagate correct returned value in mmc_rpmb_ioctl
    
    In commit 97548575bef3 ("mmc: block: Convert RPMB to a character device") a
    new function `mmc_rpmb_ioctl` was added. The final return is simply
    returning a value of `0` instead of propagating the correct return code.
    
    Discovered during a compilation with W=1, silence the following gcc warning
    
    drivers/mmc/core/block.c:2470:6: warning: variable ‘ret’ set but not used
    [-Wunused-but-set-variable]
    
    Signed-off-by: Mathieu Malaterre <[email protected]>
    Reviewed-by: Shawn Lin <[email protected]>
    Fixes: 97548575bef3 ("mmc: block: Convert RPMB to a character device")
    Cc: [email protected] # v4.15+
    Signed-off-by: Ulf Hansson <[email protected]>
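The shape of the mmc_rpmb_ioctl bug above (a return code that is set but never used) is easy to reproduce in isolation. The functions below are hypothetical stand-ins, not the kernel code; they only contrast "always return 0" with "propagate the helper's result".

```c
#include <errno.h>

/* Hypothetical helper standing in for the real work done by the ioctl. */
int do_ioctl_work(int should_fail)
{
    return should_fail ? -EINVAL : 0;
}

/* Buggy shape: 'ret' is assigned, but the function always reports success. */
int ioctl_buggy(int should_fail)
{
    int ret;

    ret = do_ioctl_work(should_fail);
    (void)ret; /* keeps this sketch warning-free; the original never read it */
    return 0;
}

/* Fixed shape: propagate the helper's return code to the caller. */
int ioctl_fixed(int should_fail)
{
    int ret;

    ret = do_ioctl_work(should_fail);
    return ret;
}
```

Without the `(void)ret;` cast, compiling the buggy variant with the relevant warning enabled reports the same class of diagnostic quoted in the commit message.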

commit 643ca198aacc671f32ef7c0c2783f0b539070a36
Author: Laurent Pinchart <[email protected]>
Date:   Fri Apr 27 22:40:21 2018 +0300

    drm: rcar-du: lvds: Fix crash in .atomic_check when disabling connector
    
    The connector .atomic_check() handler can be called with a NULL crtc
    pointer in the connector state when the connector gets disabled
    explicitly (through performing a legacy mode set or setting the
    connector's CRTC_ID property to 0). This causes a crash as the crtc
    pointer is dereferenced without any check.
    
    Fix it by returning from the .atomic_check() handler when the crtc
    pointer is NULL, as there is no check to be performed when the connector
    gets disabled.
    
    Fixes: c6a27fa41fab ("drm: rcar-du: Convert LVDS encoder code to bridge driver")
    Signed-off-by: Laurent Pinchart <[email protected]>
    Reviewed-by: Kieran Bingham <[email protected]>
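The early return described in the rcar-du commit above follows a common pattern: bail out of a check when the object it inspects is absent. The sketch below uses made-up types (`crtc_model`, `conn_state_model`) and a made-up clock check; only the "return when crtc is NULL" step reflects the commit message.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the DRM structures. */
struct crtc_model {
    int max_clock_khz;
};

struct conn_state_model {
    const struct crtc_model *crtc; /* NULL when the connector is disabled */
    int clock_khz;
};

/* Returns 0 when the state is acceptable, -1 otherwise. */
int connector_atomic_check(const struct conn_state_model *state)
{
    /* Disabled connector: nothing to validate, and crtc must not be used. */
    if (!state->crtc)
        return 0;

    return state->clock_khz <= state->crtc->max_clock_khz ? 0 : -1;
}
```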

commit b80d0b93b991e551a32157e0d9d38fc5bc9348a7
Author: William Tu <[email protected]>
Date:   Fri May 18 19:22:28 2018 -0700

    net: ip6_gre: fix tunnel metadata device sharing.
    
    Currently ip6gre and ip6erspan share a single metadata mode device,
    using 'collect_md_tun'.  Thus, doing:
      ip link add dev ip6gre11 type ip6gretap external
      ip link add dev ip6erspan12 type ip6erspan external
      RTNETLINK answers: File exists
    simply fails because the 2nd command tries to create the same collect_md_tun.
    
    The patch fixes it by adding a separate collect md tunnel device
    for the ip6erspan, 'collect_md_tun_erspan'.  As a result, a couple
    of places need to be refactored/split up in order to distinguish ip6gre
    and ip6erspan.
    
    First, move the collect_md check at ip6gre_tunnel_{unlink,link} and
    create separate functions {ip6gre,ip6erspan}_tunnel_{link_md,unlink_md}.
    Then before link/unlink, make sure the link_md/unlink_md is called.
    Finally, a separate ndo_uninit is created for ip6erspan.  Tested it
    using the samples/bpf/test_tunnel_bpf.sh.
    
    Fixes: ef7baf5e083c ("ip6_gre: add ip6 erspan collect_md mode")
    Signed-off-by: William Tu <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit af86ca4e3088fe5eacf2f7e58c01fa68ca067672
Author: Alexei Starovoitov <[email protected]>
Date:   Tue May 15 09:27:05 2018 -0700

    bpf: Prevent memory disambiguation attack
    
    Detect code patterns where malicious 'speculative store bypass' can be used
    and sanitize such patterns.
    
     39: (bf) r3 = r10
     40: (07) r3 += -216
     41: (79) r8 = *(u64 *)(r7 +0)   // slow read
     42: (7a) *(u64 *)(r10 -72) = 0  // verifier inserts this instruction
     43: (7b) *(u64 *)(r8 +0) = r3   // this store becomes slow due to r8
     44: (79) r1 = *(u64 *)(r6 +0)   // cpu speculatively executes this load
     45: (71) r2 = *(u8 *)(r1 +0)    // speculatively arbitrary 'load byte'
                                     // is now sanitized
    
    Above code after x86 JIT becomes:
     e5: mov    %rbp,%rdx
     e8: add    $0xffffffffffffff28,%rdx
     ef: mov    0x0(%r13),%r14
     f3: movq   $0x0,-0x48(%rbp)
     fb: mov    %rdx,0x0(%r14)
     ff: mov    0x0(%rbx),%rdi
    103: movzbq 0x0(%rdi),%rsi
    
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>

commit 4855c92dbb7b3b85c23e88ab7ca04f99b9677b41
Author: Joe Jin <[email protected]>
Date:   Thu May 17 12:33:28 2018 -0700

    xen-swiotlb: fix the check condition for xen_swiotlb_free_coherent
    
    When running raidconfig from Dom0 we found that the Xen DMA heap is reduced,
    but Dom Heap is increased by the same size. Tracing raidconfig we found
    that the related ioctl() in megaraid_sas calls dma_alloc_coherent()
    to allocate memory. If the memory allocated by Dom0 is not in the DMA area,
    it exchanges memory with Xen to meet the requirement. Later drivers
    call dma_free_coherent() to free the memory; in xen_swiotlb_free_coherent()
    the check condition (dev_addr + size - 1 <= dma_mask) is always false,
    which prevents calling xen_destroy_contiguous_region() to return the memory
    to the Xen DMA heap.
    
    This issue was introduced by commit 6810df88dcfc2 "xen-swiotlb: When doing
    coherent alloc/dealloc check before swizzling the MFNs.".
    
    Signed-off-by: Joe Jin <[email protected]>
    Tested-by: John Sobecki <[email protected]>
    Reviewed-by: Rzeszutek Wilk <[email protected]>
    Cc: [email protected]
    Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>

commit d775f26b295a0a303f7a73d7da46e04296484fe7
Author: Rahul Lakkireddy <[email protected]>
Date:   Fri May 18 19:13:37 2018 +0530

    cxgb4: fix offset in collecting TX rate limit info
    
    Correct the indirect register offsets in collecting TX rate limit info
    in UP CIM logs.
    
    Also, T5 doesn't support these indirect register offsets, so remove
    them from collection logic.
    
    Fixes: be6e36d916b1 ("cxgb4: collect TX rate limit info in UP CIM logs")
    Signed-off-by: Rahul Lakkireddy <[email protected]>
    Signed-off-by: Ganesh Goudar <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 44a63b137f7b6e4c7bd6c9cc21615941cb36509d
Author: Paolo Abeni <[email protected]>
Date:   Fri May 18 14:51:44 2018 +0200

    net: sched: red: avoid hashing NULL child
    
    Hangbin reported an Oops triggered by the syzkaller qdisc rules:
    
     kasan: GPF could be caused by NULL-ptr deref or user memory access
     general protection fault: 0000 [#1] SMP KASAN PTI
     Modules linked in: sch_red
     CPU: 0 PID: 28699 Comm: syz-executor5 Not tainted 4.17.0-rc4.kcov #1
     Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
     RIP: 0010:qdisc_hash_add+0x26/0xa0
     RSP: 0018:ffff8800589cf470 EFLAGS: 00010203
     RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff824ad971
     RDX: 0000000000000007 RSI: ffffc9000ce9f000 RDI: 000000000000003c
     RBP: 0000000000000001 R08: ffffed000b139ea2 R09: ffff8800589cf4f0
     R10: ffff8800589cf50f R11: ffffed000b139ea2 R12: ffff880054019fc0
     R13: ffff880054019fb4 R14: ffff88005c0af600 R15: ffff880054019fb0
     FS:  00007fa6edcb1700(0000) GS:ffff88005ce00000(0000) knlGS:0000000000000000
     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     CR2: 0000000020000740 CR3: 000000000fc16000 CR4: 00000000000006f0
     DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
     DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
     Call Trace:
      red_change+0x2d2/0xed0 [sch_red]
      qdisc_create+0x57e/0xef0
      tc_modify_qdisc+0x47f/0x14e0
      rtnetlink_rcv_msg+0x6a8/0x920
      netlink_rcv_skb+0x2a2/0x3c0
      netlink_unicast+0x511/0x740
      netlink_sendmsg+0x825/0xc30
      sock_sendmsg+0xc5/0x100
      ___sys_sendmsg+0x778/0x8e0
      __sys_sendmsg+0xf5/0x1b0
      do_syscall_64+0xbd/0x3b0
      entry_SYSCALL_64_after_hwframe+0x44/0xa9
     RIP: 0033:0x450869
     RSP: 002b:00007fa6edcb0c48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
     RAX: ffffffffffffffda RBX: 00007fa6edcb16b4 RCX: 0000000000450869
     RDX: 0000000000000000 RSI: 00000000200000c0 RDI: 0000000000000013
     RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
     R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
     R13: 0000000000008778 R14: 0000000000702838 R15: 00007fa6edcb1700
     Code: e9 0b fe ff ff 0f 1f 44 00 00 55 53 48 89 fb 89 f5 e8 3f 07 f3 fe 48 8d 7b 3c 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 04 84 d2 75 51
     RIP: qdisc_hash_add+0x26/0xa0 RSP: ffff8800589cf470
    
    When a red qdisc is updated with a 0 limit, the child qdisc is left
    unmodified and no additional scheduler is created in red_change();
    the 'child' local variable is rightfully NULL and must not be added
    to the hash table.
    
    This change addresses the above issue by moving qdisc_hash_add() right
    after the child qdisc creation. It additionally removes unneeded checks
    for noop_qdisc.
    
    Reported-by: Hangbin Liu <[email protected]>
    Fixes: 49b499718fa1 ("net: sched: make default fifo qdiscs appear in the dump")
    Signed-off-by: Paolo Abeni <[email protected]>
    Acked-by: Jiri Kosina <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
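The reordering described in the sch_red commit above, hashing the child right where it is created so a NULL child is never hashed, can be sketched as follows. All names are hypothetical and the "hash" is reduced to a flag; this is a model of the control flow only, not the scheduler code.

```c
#include <stddef.h>

/* Hypothetical stand-in for a qdisc; 'hashed' models hash-table membership. */
struct qdisc_model {
    int hashed;
};

/* Hypothetical child creation: hands back the caller-provided storage. */
struct qdisc_model *create_child(struct qdisc_model *storage)
{
    storage->hashed = 0;
    return storage;
}

/* Returns 0 on success, -1 if the child could not be created. */
int change_model(struct qdisc_model *storage, unsigned int limit)
{
    struct qdisc_model *child = NULL;

    if (limit > 0) {
        child = create_child(storage);
        if (!child)
            return -1;
        /* Hash immediately after creation: 'child' is known non-NULL here. */
        child->hashed = 1;
    }

    /* With limit == 0, 'child' stays NULL and is never touched. */
    return 0;
}
```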

commit 9709020c86f6bf8439ca3effc58cfca49a5de192
Author: Eric Dumazet <[email protected]>
Date:   Fri May 18 04:47:55 2018 -0700

    sock_diag: fix use-after-free read in __sk_free
    
    We must not call sock_diag_has_destroy_listeners(sk) on a socket
    that has no reference on net structure.
    
    BUG: KASAN: use-after-free in sock_diag_has_destroy_listeners include/linux/sock_diag.h:75 [inline]
    BUG: KASAN: use-after-free in __sk_free+0x329/0x340 net/core/sock.c:1609
    Read of size 8 at addr ffff88018a02e3a0 by task swapper/1/0
    
    CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.17.0-rc5+ #54
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
     <IRQ>
     __dump_stack lib/dump_stack.c:77 [inline]
     dump_stack+0x1b9/0x294 lib/dump_stack.c:113
     print_address_description+0x6c/0x20b mm/kasan/report.c:256
     kasan_report_error mm/kasan/report.c:354 [inline]
     kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
     __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
     sock_diag_has_destroy_listeners include/linux/sock_diag.h:75 [inline]
     __sk_free+0x329/0x340 net/core/sock.c:1609
     sk_free+0x42/0x50 net/core/sock.c:1623
     sock_put include/net/sock.h:1664 [inline]
     reqsk_free include/net/request_sock.h:116 [inline]
     reqsk_put include/net/request_sock.h:124 [inline]
     inet_csk_reqsk_queue_drop_and_put net/ipv4/inet_connection_sock.c:672 [inline]
     reqsk_timer_handler+0xe27/0x10e0 net/ipv4/inet_connection_sock.c:739
     call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
     expire_timers kernel/time/timer.c:1363 [inline]
     __run_timers+0x79e/0xc50 kernel/time/timer.c:1666
     run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
     __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
     invoke_softirq kernel/softirq.c:365 [inline]
     irq_exit+0x1d1/0x200 kernel/softirq.c:405
     exiting_irq arch/x86/include/asm/apic.h:525 [inline]
     smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052
     apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863
     </IRQ>
    RIP: 0010:native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:54
    RSP: 0018:ffff8801d9ae7c38 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
    RAX: dffffc0000000000 RBX: 1ffff1003b35cf8a RCX: 0000000000000000
    RDX: 1ffffffff11a30d0 RSI: 0000000000000001 RDI: ffffffff88d18680
    RBP: ffff8801d9ae7c38 R08: ffffed003b5e46c3 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
    R13: ffff8801d9ae7cf0 R14: ffffffff897bef20 R15: 0000000000000000
     arch_safe_halt arch/x86/include/asm/paravirt.h:94 [inline]
     default_idle+0xc2/0x440 arch/x86/kernel/process.c:354
     arch_cpu_idle+0x10/0x20 arch/x86/kernel/process.c:345
     default_idle_call+0x6d/0x90 kernel/sched/idle.c:93
     cpuidle_idle_call kernel/sched/idle.c:153 [inline]
     do_idle+0x395/0x560 kernel/sched/idle.c:262
     cpu_startup_entry+0x104/0x120 kernel/sched/idle.c:368
     start_secondary+0x426/0x5b0 arch/x86/kernel/smpboot.c:269
     secondary_startup_64+0xa5/0xb0 arch/x86/kernel/head_64.S:242
    
    Allocated by task 4557:
     save_stack+0x43/0xd0 mm/kasan/kasan.c:448
     set_track mm/kasan/kasan.c:460 [inline]
     kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
     kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
     kmem_cache_alloc+0x12e/0x760 mm/slab.c:3554
     kmem_cache_zalloc include/linux/slab.h:691 [inline]
     net_alloc net/core/net_namespace.c:383 [inline]
     copy_net_ns+0x159/0x4c0 net/core/net_namespace.c:423
     create_new_namespaces+0x69d/0x8f0 kernel/nsproxy.c:107
     unshare_nsproxy_namespaces+0xc3/0x1f0 kernel/nsproxy.c:206
     ksys_unshare+0x708/0xf90 kernel/fork.c:2408
     __do_sys_unshare kernel/fork.c:2476 [inline]
     __se_sys_unshare kernel/fork.c:2474 [inline]
     __x64_sys_unshare+0x31/0x40 kernel/fork.c:2474
     do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
     entry_SYSCALL_64_after_hwframe+0x49/0xbe
    
    Freed by task 69:
     save_stack+0x43/0xd0 mm/kasan/kasan.c:448
     set_track mm/kasan/kasan.c:460 [inline]
     __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521
     kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
     __cache_free mm/slab.c:3498 [inline]
     kmem_cache_free+0x86/0x2d0 mm/slab.c:3756
     net_free net/core/net_namespace.c:399 [inline]
     net_drop_ns.part.14+0x11a/0x130 net/core/net_namespace.c:406
     net_drop_ns net/core/net_namespace.c:405 [inline]
     cleanup_net+0x6a1/0xb20 net/core/net_namespace.c:541
     process_one_work+0xc1e/0x1b50 kernel/workqueue.c:2145
     worker_thread+0x1cc/0x1440 kernel/workqueue.c:2279
     kthread+0x345/0x410 kernel/kthread.c:240
     ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
    
    The buggy address belongs to the object at ffff88018a02c140
     which belongs to the cache net_namespace of size 8832
    The buggy address is located 8800 bytes inside of
     8832-byte region [ffff88018a02c140, ffff88018a02e3c0)
    The buggy address belongs to the page:
    page:ffffea0006280b00 count:1 mapcount:0 mapping:ffff88018a02c140 index:0x0 compound_mapcount: 0
    flags: 0x2fffc0000008100(slab|head)
    raw: 02fffc0000008100 ffff88018a02c140 0000000000000000 0000000100000001
    raw: ffffea00062a1320 ffffea0006268020 ffff8801d9bdde40 0000000000000000
    page dumped because: kasan: bad access detected
    
    Fixes: b922622ec6ef ("sock_diag: don't broadcast kernel sockets")
    Signed-off-by: Eric Dumazet <[email protected]>
    Cc: Craig Gallek <[email protected]>
    Reported-by: syzbot <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit b16a960ddbf0d4fd6aaabee42d7ec4c4c3ec836d
Author: Geert Uytterhoeven <[email protected]>
Date:   Fri May 18 12:52:51 2018 +0200

    sh_eth: Change platform check to CONFIG_ARCH_RENESAS
    
    Since commit 9b5ba0df4ea4f940 ("ARM: shmobile: Introduce ARCH_RENESAS"),
    CONFIG_ARCH_RENESAS is a more appropriate platform check than the legacy
    CONFIG_ARCH_SHMOBILE, hence use the former.
    
    Renesas SuperH SH-Mobile SoCs are still covered by the CONFIG_CPU_SH4
    check.
    
    This will allow dropping ARCH_SHMOBILE on ARM and ARM64 in the near
    future.
    
    Signed-off-by: Geert Uytterhoeven <[email protected]>
    Acked-by: Arnd Bergmann <[email protected]>
    Acked-by: Sergei Shtylyov <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 5447d78623da2eded06d4cd9469d1a71eba43bc4
Author: Florian Fainelli <[email protected]>
Date:   Thu May 17 16:55:39 2018 -0700

    net: dsa: Do not register devlink for unused ports
    
    Even if commit 1d27732f411d ("net: dsa: setup and teardown ports") indicated
    that registering a devlink instance for unused ports is not a problem, and this
    is true, this can be confusing nonetheless, so let's not do it.
    
    Fixes: 1d27732f411d ("net: dsa: setup and teardown ports")
    Reported-by: Jiri Pirko <[email protected]>
    Signed-off-by: Florian Fainelli <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 6358d49ac23995fdfe157cc8747ab0f274d3954b
Author: Amritha Nambiar <[email protected]>
Date:   Thu May 17 14:50:44 2018 -0700

    net: Fix a bug in removing queues from XPS map
    
    While removing queues from the XPS map, the individual CPU ID
    alone was used to index the CPUs map; this should be changed to also
    factor in the traffic class mapping for the CPU-to-queue lookup.
    
    Fixes: 184c449f91fe ("net: Add support for XPS with QoS via traffic classes")
    Signed-off-by: Amritha Nambiar <[email protected]>
    Acked-by: Alexander Duyck <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
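To see why the traffic class matters for the XPS lookup above, assume (this layout is an assumption of the sketch, not something the commit message states) that the map is a flat array with one slot per (CPU, traffic class) pair. Indexing by CPU alone then lands on the wrong slot whenever there is more than one class. Function names are hypothetical.

```c
#include <stddef.h>

/* Index that ignores the traffic class (the problematic shape). */
size_t xps_index_cpu_only(size_t cpu)
{
    return cpu;
}

/* Index that folds in the traffic class, assuming num_tc slots per CPU. */
size_t xps_index_cpu_tc(size_t cpu, size_t num_tc, size_t tc)
{
    return cpu * num_tc + tc;
}
```

With a single traffic class both functions agree; with four classes, CPU 2 and class 1 map to slot 9 rather than slot 2.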

commit a45b599ad808c3c982fdcdc12b0b8611c2f92824
Author: Alexander Potapenko <[email protected]>
Date:   Fri May 18 16:23:18 2018 +0200

    scsi: sg: allocate with __GFP_ZERO in sg_build_indirect()
    
    This should help avoid copying uninitialized memory to userspace when
    calling ioctl(fd, SG_IO) with an empty command.
    
    Reported-by: [email protected]
    Cc: [email protected]
    Signed-off-by: Alexander Potapenko <[email protected]>
    Acked-by: Douglas Gilbert <[email protected]>
    Reviewed-by: Johannes Thumshirn <[email protected]>
    Signed-off-by: Martin K. Petersen <[email protected]>
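The effect of zeroing at allocation time, as in the sg commit above, can be shown with a userspace analogy. `calloc` plays the role of a zeroing allocation here; `alloc_zeroed` and `copy_out` are hypothetical helpers, and nothing below is kernel API.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a zero-filled buffer of len bytes, or NULL; the caller frees it. */
unsigned char *alloc_zeroed(size_t len)
{
    return calloc(1, len);
}

/*
 * Models copying a buffer out to the user. If nothing wrote to 'kbuf'
 * (an "empty command"), a zeroed allocation guarantees only zeros leak.
 */
void copy_out(unsigned char *user, const unsigned char *kbuf, size_t len)
{
    memcpy(user, kbuf, len);
}
```

Had the buffer come from `malloc`, the bytes copied out would be whatever the allocator last left there.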

commit 240da953fcc6a9008c92fae5b1f727ee5ed167ab
Author: Konrad Rzeszutek Wilk <[email protected]>
Date:   Wed May 16 23:18:09 2018 -0400

    x86/bugs: Rename SSBD_NO to SSB_NO
    
    The "336996 Speculative Execution Side Channel Mitigations" from
    May defines this as SSB_NO, hence let's sync up.
    
    Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>

commit 3ae180972564846e6d794e3615e1ab0a1e6c4ef9
Author: Ben Hutchings <[email protected]>
Date:   Thu May 17 22:34:39 2018 +0100

    ALSA: timer: Fix pause event notification
    
    Commit f65e0d299807 ("ALSA: timer: Call notifier in the same spinlock")
    combined the start/continue and stop/pause functions, and in doing so
    changed the event code for the pause case to SNDRV_TIMER_EVENT_CONTINUE.
    Change it back to SNDRV_TIMER_EVENT_PAUSE.
    
    Fixes: f65e0d299807 ("ALSA: timer: Call notifier in the same spinlock")
    Signed-off-by: Ben Hutchings <[email protected]>
    Cc: [email protected]
    Signed-off-by: Takashi Iwai <[email protected]>

commit faf37c44a105f3608115785f17cbbf3500f8bc71
Author: Michael Neuling <[email protected]>
Date:   Fri May 18 11:37:42 2018 +1000

    powerpc/64s: Clear PCR on boot
    
    Clear the PCR (Processor Compatibility Register) on boot to ensure we
    are not running in a compatibility mode.
    
    We've seen this cause problems when a crash (and kdump) occurs while
    running compat mode guests. The kdump kernel then runs with the PCR
    set and causes problems. The symptom in the kdump kernel (also seen in
    petitboot after fast-reboot) is early userspace programs taking
    sigills on newer instructions (seen in libc).
    
    Signed-off-by: Michael Neuling <[email protected]>
    Cc: [email protected]
    Signed-off-by: Michael Ellerman <[email protected]>

commit 050fad7c4534c13c8eb1d9c2ba66012e014773cb
Author: Daniel Borkmann <[email protected]>
Date:   Thu May 17 01:44:11 2018 +0200

    bpf: fix truncated jump targets on heavy expansions
    
    Recently during testing, I ran into the following panic:
    
      [  207.892422] Internal error: Accessing user space memory outside uaccess.h routines: 96000004 [#1] SMP
      [  207.901637] Modules linked in: binfmt_misc [...]
      [  207.966530] CPU: 45 PID: 2256 Comm: test_verifier Tainted: G        W         4.17.0-rc3+ #7
      [  207.974956] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB18A 03/31/2017
      [  207.982428] pstate: 60400005 (nZCv daif +PAN -UAO)
      [  207.987214] pc : bpf_skb_load_helper_8_no_cache+0x34/0xc0
      [  207.992603] lr : 0xffff000000bdb754
      [  207.996080] sp : ffff000013703ca0
      [  207.999384] x29: ffff000013703ca0 x28: 0000000000000001
      [  208.004688] x27: 0000000000000001 x26: 0000000000000000
      [  208.009992] x25: ffff000013703ce0 x24: ffff800fb4afcb00
      [  208.015295] x23: ffff00007d2f5038 x22: ffff00007d2f5000
      [  208.020599] x21: fffffffffeff2a6f x20: 000000000000000a
      [  208.025903] x19: ffff000009578000 x18: 0000000000000a03
      [  208.031206] x17: 0000000000000000 x16: 0000000000000000
      [  208.036510] x15: 0000ffff9de83000 x14: 0000000000000000
      [  208.041813] x13: 0000000000000000 x12: 0000000000000000
      [  208.047116] x11: 0000000000000001 x10: ffff0000089e7f18
      [  208.052419] x9 : fffffffffeff2a6f x8 : 0000000000000000
      [  208.057723] x7 : 000000000000000a x6 : 00280c6160000000
      [  208.063026] x5 : 0000000000000018 x4 : 0000000000007db6
      [  208.068329] x3 : 000000000008647a x2 : 19868179b1484500
      [  208.073632] x1 : 0000000000000000 x0 : ffff000009578c08
      [  208.078938] Process test_verifier (pid: 2256, stack limit = 0x0000000049ca7974)
      [  208.086235] Call trace:
      [  208.088672]  bpf_skb_load_helper_8_no_cache+0x34/0xc0
      [  208.093713]  0xffff000000bdb754
      [  208.096845]  bpf_test_run+0x78/0xf8
      [  208.100324]  bpf_prog_test_run_skb+0x148/0x230
      [  208.104758]  sys_bpf+0x314/0x1198
      [  208.108064]  el0_svc_naked+0x30/0x34
      [  208.111632] Code: 91302260 f9400001 f9001fa1 d2800001 (29500680)
      [  208.117717] ---[ end trace 263cb8a59b5bf29f ]---
    
    The program itself which caused this had a long jump over the whole
    instruction sequence where all of the inner instructions required
    heavy expansions into multiple BPF instructions. Additionally, I also
    had BPF hardening enabled, which once more requires rewrites of all
    constant values in order to blind them. Each time we rewrite insns,
    bpf_adj_branches() would need to potentially adjust branch targets
    which cross the patchlet boundary to accommodate the additional
    delta. Eventually that led to the case where the target offset could
    no longer fit into insn->off's upper 0x7fff limit, at which point the
    offset wraps around and becomes negative (in s16 universe), or vice
    versa depending on the jump direction.
    
    Therefore it becomes necessary to detect and reject any such occasions
    in a generic way for native eBPF and cBPF to eBPF migrations. For
    the latter we can simply check bounds in the bpf_convert_filter()'s
    BPF_EMIT_JMP helper macro and bail out once we surpass limits. The
    bpf_patch_insn_single() for native eBPF (and cBPF to eBPF in case
    of subsequent hardening) is a bit more complex in that we need to
    detect such truncations before hitting the bpf_prog_realloc(). Thus
    the latter is split into an extra pass to probe problematic offsets
    on the original program in order to fail early. With that in place
    and carefully tested I no longer hit the panic and the rewrites are
    rejected properly. The above example panic I've seen on bpf-next,
    though the issue itself is generic in that a guard against this issue
    in bpf seems more appropriate in this case.
    
    Signed-off-by: Daniel Borkmann <[email protected]>
    Acked-by: Martin KaFai Lau <[email protected]>
    Signed-off-by: Alexei Starovoitov <[email protected]>
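The bounds problem in the bpf jump-target commit above comes down to whether an adjusted offset still fits a signed 16-bit field. The helper below is a hypothetical illustration of that check, not the verifier code: it does the arithmetic in a wider type and refuses the adjustment rather than letting the value wrap.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Adds 'delta' to a jump offset. Returns true and stores the result in
 * '*out' only if it still fits a signed 16-bit field; returns false
 * (leaving '*out' untouched) when it would fall outside that range.
 */
bool jump_off_fits(int32_t cur_off, int32_t delta, int16_t *out)
{
    int64_t adjusted = (int64_t)cur_off + delta;

    if (adjusted < INT16_MIN || adjusted > INT16_MAX)
        return false;

    *out = (int16_t)adjusted;
    return true;
}
```

Rejecting up front is the point: once the value has been stored in the 16-bit field, the overflow is no longer detectable.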

commit 9617456054a6160f5e11e892b713fade78aea2e9
Author: John Fastabend <[email protected]>
Date:   Thu May 17 14:06:40 2018 -0700

    bpf: parse and verdict prog attach may race with bpf map update
    
    In the sockmap design BPF programs (SK_SKB_STREAM_PARSER,
    SK_SKB_STREAM_VERDICT and SK_MSG_VERDICT) are attached to the sockmap
    map type and when a sock is added to the map the programs are used by
    the socket. However, sockmap updates from both userspace and BPF
    programs can happen concurrently with the attach and detach of these
    programs.
    
    To resolve this we use the bpf_prog_inc_not_zero and a READ_ONCE()
    primitive to ensure the program pointer is not refeched and
    possibly NULL'd before the refcnt increment. This happens inside
    a RCU critical section so although the pointer reference in the map
    object may be NULL (by a concurrent detach operation) the reference
    from READ_ONCE will not be free'd until after grace period. This
    ensures the object returned by READ_ONCE() is valid through the
    RCU critical section and safe to use as long as we "know" it may
    be free'd shortly.
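
The snapshot-then-increment pattern described above can be sketched in userspace C. This is only an illustration under stated assumptions: C11 atomics stand in for READ_ONCE() and bpf_prog_inc_not_zero(), the names `struct prog`, `struct map`, `prog_inc_not_zero` and `map_get_prog` are hypothetical, and the RCU grace period that keeps the snapshot's memory valid in the kernel is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects. */
struct prog {
    atomic_int refcnt;
};

struct map {
    _Atomic(struct prog *) verdict;   /* may be set to NULL by a detach */
};

/* Take a reference only while the count is still non-zero. */
bool prog_inc_not_zero(struct prog *p)
{
    int old;

    assert(p != NULL);
    old = atomic_load(&p->refcnt);
    while (old != 0) {
        /* On failure, old is reloaded with the current count. */
        if (atomic_compare_exchange_weak(&p->refcnt, &old, old + 1))
            return true;
    }
    return false;
}

/*
 * Read the map slot exactly once and use only that snapshot afterwards.
 * Re-reading m->verdict after the NULL check is the bug the text describes:
 * a parallel detach may have cleared the slot in between.
 */
struct prog *map_get_prog(struct map *m)
{
    struct prog *p = atomic_load(&m->verdict);   /* plays the role of READ_ONCE() */

    if (p == NULL)
        return NULL;
    return prog_inc_not_zero(p) ? p : NULL;
}
```

A caller that receives a non-NULL pointer from `map_get_prog()` owns one reference and must drop it later; a NULL result means the program was detached or is already being torn down.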
    
    Daniel spotted a case in the sock update API where instead of using
    the READ_ONCE() program reference we used the pointer from the
    original map, stab->bpf_{verdict|parse|txmsg}. The problem with this
    is the logic checks the object returned from the READ_ONCE() is not
    NULL and then tries to reference the object again but using the
    above map pointer, which may have already been NULL'd by a parallel
    detach operation. If this happened bpf_porg…


akpm00 authored and hnaz committed May 25, 2018

1 parent 771c577 commit 733ce59

Documentation/ABI/testing/sysfs-devices-system-cpu

-Original file line number
+Diff line change
@@ Expand Up / @@ -478,6 +478,7 @@ What: /sys/devices/system/cpu/vulnerabilities @@
     		/sys/devices/system/cpu/vulnerabilities/meltdown
     		/sys/devices/system/cpu/vulnerabilities/spectre_v1
     		/sys/devices/system/cpu/vulnerabilities/spectre_v2
+    		/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
     Date:		January 2018
     Contact:	Linux kernel mailing list <[email protected]>
     Description:	Information about CPU vulnerabilities
@@ Expand Down @@
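
The new sysfs entry can be read like any other attribute file. The following is a minimal sketch, assuming a Linux system that exposes the file; the helper names `read_first_line` and `print_ssb_status` are hypothetical and not part of this commit, and the exact strings the kernel reports are not assumed here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SSB_SYSFS_PATH \
    "/sys/devices/system/cpu/vulnerabilities/spec_store_bypass"

/*
 * Hypothetical helper: copy the first line of `path` into `buf`, without the
 * trailing newline. Returns 0 on success, -1 if the file cannot be read.
 */
int read_first_line(const char *path, char *buf, size_t len)
{
    FILE *f;

    assert(buf != NULL && len > 0);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/*
 * Print whatever the kernel reports, or a note when the file is absent
 * (for example on kernels that predate this change).
 */
void print_ssb_status(void)
{
    char line[256];

    if (read_first_line(SSB_SYSFS_PATH, line, sizeof(line)) == 0)
        printf("spec_store_bypass: %s\n", line);
    else
        printf("spec_store_bypass: not reported by this kernel\n");
}
```

Calling `print_ssb_status()` from a small `main()` is enough to check a running system; the helper deliberately treats a missing file as a normal case rather than an error.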

Documentation/admin-guide/kernel-parameters.txt

-Original file line number
+Diff line change
@@ Expand Up / @@ -2680,6 +2680,9 @@ @@
     			allow data leaks with this option, which is equivalent
     			to spectre_v2=off.
+    	nospec_store_bypass_disable
+    			[HW] Disable all mitigations for the Speculative Store Bypass vulnerability
     	noxsave		[BUGS=X86] Disables x86 extended register state save
     			and restore using xsave. The kernel will fallback to
     			enabling legacy floating-point and sse state.
@@ Expand Down Expand Up / @@ -4025,6 +4028,48 @@ @@
     			Not specifying this option is equivalent to
     			spectre_v2=auto.
+    	spec_store_bypass_disable=
+    			[HW] Control Speculative Store Bypass (SSB) Disable mitigation
+    			(Speculative Store Bypass vulnerability)
+    			Certain CPUs are vulnerable to an exploit against a
+    			common industry wide performance optimization known
+    			as "Speculative Store Bypass" in which recent stores
+    			to the same memory location may not be observed by
+    			later loads during speculative execution. The idea
+    			is that such stores are unlikely and that they can
+    			be detected prior to instruction retirement at the
+    			end of a particular speculation execution window.
+    			In vulnerable processors, the speculatively forwarded
+    			store can be used in a cache side channel attack, for
+    			example to read memory to which the attacker does not
+    			directly have access (e.g. inside sandboxed code).
+    			This parameter controls whether the Speculative Store
+    			Bypass optimization is used.
+    			on      - Unconditionally disable Speculative Store Bypass
+    			off     - Unconditionally enable Speculative Store Bypass
+    			auto    - Kernel detects whether the CPU model contains an
+    				  implementation of Speculative Store Bypass and
+    				  picks the most appropriate mitigation. If the
+    				  CPU is not vulnerable, "off" is selected. If the
+    				  CPU is vulnerable the default mitigation is
+    				  architecture and Kconfig dependent. See below.
+    			prctl   - Control Speculative Store Bypass per thread
+    				  via prctl. Speculative Store Bypass is enabled
+    				  for a process by default. The state of the control
+    				  is inherited on fork.
+    			seccomp - Same as "prctl" above, but all seccomp threads
+    				  will disable SSB unless they explicitly opt out.
+    			Not specifying this option is equivalent to
+    			spec_store_bypass_disable=auto.
+    			Default mitigations:
+    			X86:	If CONFIG_SECCOMP=y "seccomp", otherwise "prctl"
     	spia_io_base=	[HW,MTD]
     	spia_fio_base=
     	spia_pedr=
@@ Expand Down @@

Documentation/devicetree/bindings/net/micrel-ksz90x1.txt

-Original file line number
+Diff line change
@@ Expand Up / @@ -57,6 +57,13 @@ KSZ9031: @@
           - txd2-skew-ps : Skew control of TX data 2 pad
           - txd3-skew-ps : Skew control of TX data 3 pad
+        - micrel,force-master:
+            Boolean, force phy to master mode. Only set this option if the phy
+            reference clock provided at CLK125_NDO pin is used as MAC reference
+            clock because the clock jitter in slave mode is too high (errata#2).
+            Attention: The link partner must be configurable as slave otherwise
+            no link will be established.
     Examples:
     	mdio {
@@ Expand Down @@

Documentation/userspace-api/index.rst

-Original file line number
+Diff line change
@@ Expand Up / @@ -19,6 +19,7 @@ place where this information is gathered. @@
        no_new_privs
        seccomp_filter
        unshare
+       spec_ctrl
     .. only::  subproject and html
@@ Expand Down @@

Documentation/userspace-api/spec_ctrl.rst

-Original file line number
+Diff line change
@@ -0,0 +1,94 @@
+    ===================
+    Speculation Control
+    ===================
+    Quite some CPUs have speculation-related misfeatures which are in
+    fact vulnerabilities causing data leaks in various forms even across
+    privilege domains.
+    The kernel provides mitigation for such vulnerabilities in various
+    forms. Some of these mitigations are compile-time configurable and some
+    can be supplied on the kernel command line.
+    There is also a class of mitigations which are very expensive, but they can
+    be restricted to a certain set of processes or tasks in controlled
+    environments. The mechanism to control these mitigations is via
+    :manpage:`prctl(2)`.
+    There are two prctl options which are related to this:
+     * PR_GET_SPECULATION_CTRL
+     * PR_SET_SPECULATION_CTRL
+    PR_GET_SPECULATION_CTRL
+    -----------------------
+    PR_GET_SPECULATION_CTRL returns the state of the speculation misfeature
+    which is selected with arg2 of prctl(2). The return value uses bits 0-3 with
+    the following meaning:
+    ==== ===================== ===================================================
+    Bit  Define                Description
+    ==== ===================== ===================================================
+    0    PR_SPEC_PRCTL         Mitigation can be controlled per task by
+                               PR_SET_SPECULATION_CTRL.
+    1    PR_SPEC_ENABLE        The speculation feature is enabled, mitigation is
+                               disabled.
+    2    PR_SPEC_DISABLE       The speculation feature is disabled, mitigation is
+                               enabled.
+    3    PR_SPEC_FORCE_DISABLE Same as PR_SPEC_DISABLE, but cannot be undone. A
+                               subsequent prctl(..., PR_SPEC_ENABLE) will fail.
+    ==== ===================== ===================================================
+    If all bits are 0 the CPU is not affected by the speculation misfeature.
+    If PR_SPEC_PRCTL is set, then the per-task control of the mitigation is
+    available. If not set, prctl(PR_SET_SPECULATION_CTRL) for the speculation
+    misfeature will fail.
+    PR_SET_SPECULATION_CTRL
+    -----------------------
+    PR_SET_SPECULATION_CTRL allows to control the speculation misfeature, which
+    is selected by arg2 of :manpage:`prctl(2)` per task. arg3 is used to hand
+    in the control value, i.e. either PR_SPEC_ENABLE or PR_SPEC_DISABLE or
+    PR_SPEC_FORCE_DISABLE.
+    Common error codes
+    ------------------
+    ======= =================================================================
+    Value   Meaning
+    ======= =================================================================
+    EINVAL  The prctl is not implemented by the architecture or unused
+            prctl(2) arguments are not 0.
+    ENODEV  arg2 is selecting a not supported speculation misfeature.
+    ======= =================================================================
+    PR_SET_SPECULATION_CTRL error codes
+    -----------------------------------
+    ======= =================================================================
+    Value   Meaning
+    ======= =================================================================
+    0       Success
+    ERANGE  arg3 is incorrect, i.e. it's neither PR_SPEC_ENABLE nor
+            PR_SPEC_DISABLE nor PR_SPEC_FORCE_DISABLE.
+    ENXIO   Control of the selected speculation misfeature is not possible.
+            See PR_GET_SPECULATION_CTRL.
+    EPERM   Speculation was disabled with PR_SPEC_FORCE_DISABLE and caller
+            tried to enable it again.
+    ======= =================================================================
+    Speculation misfeature controls
+    -------------------------------
+    - PR_SPEC_STORE_BYPASS: Speculative Store Bypass
+      Invocations:
+       * prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, 0, 0, 0);
+       * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_ENABLE, 0, 0);
+       * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_DISABLE, 0, 0);
+       * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_FORCE_DISABLE, 0, 0);
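
The invocations documented in spec_ctrl.rst can be combined into a small userspace sketch. It assumes a Linux system with `<sys/prctl.h>`; the fallback numeric values for the prctl options are an assumption for toolchains whose headers predate this interface, the bit values follow the bit 0-3 ordering described for PR_GET_SPECULATION_CTRL, and the helper names `spec_state_describe`, `ssb_get_state` and `ssb_disable_for_task` are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

/* Fallbacks for older userspace headers (assumed values, see lead-in). */
#ifndef PR_GET_SPECULATION_CTRL
#define PR_GET_SPECULATION_CTRL 52
#define PR_SET_SPECULATION_CTRL 53
#endif
#ifndef PR_SPEC_STORE_BYPASS
#define PR_SPEC_STORE_BYPASS    0
#endif
#ifndef PR_SPEC_PRCTL
#define PR_SPEC_PRCTL           (1UL << 0)
#define PR_SPEC_ENABLE          (1UL << 1)
#define PR_SPEC_DISABLE         (1UL << 2)
#define PR_SPEC_FORCE_DISABLE   (1UL << 3)
#endif

/* Decode a PR_GET_SPECULATION_CTRL return value into a short description. */
const char *spec_state_describe(long state)
{
    if (state < 0)
        return "query failed";
    if (state == 0)
        return "CPU not affected";
    if (state & PR_SPEC_FORCE_DISABLE)
        return "speculation force-disabled";
    if (state & PR_SPEC_DISABLE)
        return "speculation disabled, mitigation enabled";
    if (state & PR_SPEC_ENABLE)
        return "speculation enabled, mitigation disabled";
    return "unknown state";
}

/* Query the Speculative Store Bypass state of the calling task. */
long ssb_get_state(void)
{
    /* Unused arguments must be 0; pass them as unsigned long explicitly. */
    return prctl(PR_GET_SPECULATION_CTRL,
                 (unsigned long)PR_SPEC_STORE_BYPASS, 0UL, 0UL, 0UL);
}

/* Disable Speculative Store Bypass for the calling task if per-task control exists. */
int ssb_disable_for_task(void)
{
    long state = ssb_get_state();

    if (state < 0)
        return -1;          /* errno was set by prctl() */
    if (!(state & PR_SPEC_PRCTL)) {
        errno = ENXIO;      /* mirrors the documented "control not possible" code */
        return -1;
    }
    return prctl(PR_SET_SPECULATION_CTRL,
                 (unsigned long)PR_SPEC_STORE_BYPASS,
                 (unsigned long)PR_SPEC_DISABLE, 0UL, 0UL);
}
```

A caller would typically print `spec_state_describe(ssb_get_state())` before and after `ssb_disable_for_task()`. Checking PR_SPEC_PRCTL first avoids issuing a set request that the documentation says will fail.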

MAINTAINERS

-Original file line number
+Diff line change
@@ Expand Up / @@ -5388,7 +5388,6 @@ S: Maintained @@
     F:	drivers/iommu/exynos-iommu.c
     EZchip NPS platform support
-    M:	Elad Kanfi <[email protected]>
     M:	Vineet Gupta <[email protected]>
     S:	Supported
     F:	arch/arc/plat-eznps
@@ Expand Down Expand Up / @@ -9021,7 +9020,6 @@ Q: http://patchwork.ozlabs.org/project/netdev/list/ @@
     F:	drivers/net/ethernet/mellanox/mlx5/core/en_*
     MELLANOX ETHERNET INNOVA DRIVER
-    M:	Ilan Tayari <[email protected]>
     R:	Boris Pismenny <[email protected]>
     L:	[email protected]
     S:	Supported
@@ Expand All / @@ -9031,7 +9029,6 @@ F: drivers/net/ethernet/mellanox/mlx5/core/fpga/* @@
     F:	include/linux/mlx5/mlx5_ifc_fpga.h
     MELLANOX ETHERNET INNOVA IPSEC DRIVER
-    M:	Ilan Tayari <[email protected]>
     R:	Boris Pismenny <[email protected]>
     L:	[email protected]
     S:	Supported
@@ Expand Down Expand Up / @@ -9087,7 +9084,6 @@ F: include/uapi/rdma/mlx4-abi.h @@
     MELLANOX MLX5 core VPI driver
     M:	Saeed Mahameed <[email protected]>
-    M:	Matan Barak <[email protected]>
     M:	Leon Romanovsky <[email protected]>
     L:	[email protected]
     L:	[email protected]
@@ Expand All / @@ -9098,7 +9094,6 @@ F: drivers/net/ethernet/mellanox/mlx5/core/ @@
     F:	include/linux/mlx5/
     MELLANOX MLX5 IB driver
-    M:	Matan Barak <[email protected]>
     M:	Leon Romanovsky <[email protected]>
     L:	[email protected]
     W:	http://www.mellanox.com
@@ Expand Down Expand Up / @@ -9832,7 +9827,6 @@ F: net/netfilter/xt_CONNSECMARK.c @@
     F:	net/netfilter/xt_SECMARK.c
     NETWORKING [TLS]
-    M:	Ilya Lesokhin <[email protected]>
     M:	Aviad Yehezkel <[email protected]>
     M:	Dave Watson <[email protected]>
     L:	[email protected]
@@ Expand Down @@

arch/alpha/Kconfig

-Original file line number
+Diff line change
@@ Expand Up / @@ -211,6 +211,7 @@ config ALPHA_EIGER @@
     config ALPHA_JENSEN
     	bool "Jensen"
     	depends on BROKEN
+    	select DMA_DIRECT_OPS
     	help
     	  DEC PC 150 AXP (aka Jensen): This is a very old Digital system - one
     	  of the first-generation Alpha systems. A number of these systems
@@ Expand Down @@

arch/alpha/include/asm/dma-mapping.h

            
-Original file line number
+Diff line change
@@ -2,11 +2,15 @@
     #ifndef _ALPHA_DMA_MAPPING_H
     #define _ALPHA_DMA_MAPPING_H
-    extern const struct dma_map_ops *dma_ops;
+    extern const struct dma_map_ops alpha_pci_ops;
     static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
     {
-    	return dma_ops;
+    #ifdef CONFIG_ALPHA_JENSEN
+    	return &dma_direct_ops;
+    #else
+    	return &alpha_pci_ops;
+    #endif
     }
     #endif	/* _ALPHA_DMA_MAPPING_H */

arch/alpha/kernel/io.c

-Original file line number
+Diff line change
@@ Expand Up / @@ -37,20 +37,20 @@ unsigned int ioread32(void __iomem *addr) @@
     void iowrite8(u8 b, void __iomem *addr)
     {
-    	IO_CONCAT(__IO_PREFIX,iowrite8)(b, addr);
     	mb();
+    	IO_CONCAT(__IO_PREFIX,iowrite8)(b, addr);
     }
     void iowrite16(u16 b, void __iomem *addr)
     {
-    	IO_CONCAT(__IO_PREFIX,iowrite16)(b, addr);
     	mb();
+    	IO_CONCAT(__IO_PREFIX,iowrite16)(b, addr);
     }
     void iowrite32(u32 b, void __iomem *addr)
     {
-    	IO_CONCAT(__IO_PREFIX,iowrite32)(b, addr);
     	mb();
+    	IO_CONCAT(__IO_PREFIX,iowrite32)(b, addr);
     }
     EXPORT_SYMBOL(ioread8);
@@ Expand Down Expand Up / @@ -176,26 +176,26 @@ u64 readq(const volatile void __iomem *addr) @@
     void writeb(u8 b, volatile void __iomem *addr)
     {
-    	__raw_writeb(b, addr);
     	mb();
+    	__raw_writeb(b, addr);
     }
     void writew(u16 b, volatile void __iomem *addr)
     {
-    	__raw_writew(b, addr);
     	mb();
+    	__raw_writew(b, addr);
     }
     void writel(u32 b, volatile void __iomem *addr)
     {
-    	__raw_writel(b, addr);
     	mb();
+    	__raw_writel(b, addr);
     }
     void writeq(u64 b, volatile void __iomem *addr)
     {
-    	__raw_writeq(b, addr);
     	mb();
+    	__raw_writeq(b, addr);
     }
     EXPORT_SYMBOL(readb);
@@ Expand Down @@

arch/alpha/kernel/pci-noop.c

-Original file line number
+Diff line change
@@ Expand Up @@
     	else
     		return -ENODEV;
     }
-    static void *alpha_noop_alloc_coherent(struct device *dev, size_t size,
-    				       dma_addr_t *dma_handle, gfp_t gfp,
-    				       unsigned long attrs)
-    {
-    	void *ret;
-    	if (!dev || *dev->dma_mask >= 0xffffffffUL)
-    		gfp &= ~GFP_DMA;
-    	ret = (void *)__get_free_pages(gfp, get_order(size));
-    	if (ret) {
-    		memset(ret, 0, size);
-    		*dma_handle = virt_to_phys(ret);
-    	}
-    	return ret;
-    }
-    static int alpha_noop_supported(struct device *dev, u64 mask)
-    {
-    	return mask < 0x00ffffffUL ? 0 : 1;
-    }
-    const struct dma_map_ops alpha_noop_ops = {
-    	.alloc			= alpha_noop_alloc_coherent,
-    	.free			= dma_noop_free_coherent,
-    	.map_page		= dma_noop_map_page,
-    	.map_sg			= dma_noop_map_sg,
-    	.mapping_error		= dma_noop_mapping_error,
-    	.dma_supported		= alpha_noop_supported,
-    };
-    const struct dma_map_ops *dma_ops = &alpha_noop_ops;
-    EXPORT_SYMBOL(dma_ops);

arch/alpha/kernel/pci_iommu.c

-Original file line number
+Diff line change
@@ Expand Up / @@ -950,6 +950,4 @@ const struct dma_map_ops alpha_pci_ops = { @@
     	.mapping_error		= alpha_pci_mapping_error,
     	.dma_supported		= alpha_pci_supported,
     };
-    const struct dma_map_ops *dma_ops = &alpha_pci_ops;
-    EXPORT_SYMBOL(dma_ops);
+    EXPORT_SYMBOL(alpha_pci_ops);

arch/arm/mm/dma-mapping.c

-Original file line number
+Diff line change
@@ Expand Up @@
     void __init dma_contiguous_remap(void)
     {
     	int i;
-    	if (!dma_mmu_remap_num)
-    		return;
-    	/* call flush_cache_all() since CMA area would be large enough */
-    	flush_cache_all();
     	for (i = 0; i < dma_mmu_remap_num; i++) {
     		phys_addr_t start = dma_mmu_remap[i].base;
     		phys_addr_t end = start + dma_mmu_remap[i].size;
@@ Expand Down Expand Up / @@ -504,15 +498,7 @@ void __init dma_contiguous_remap(void) @@
     		flush_tlb_kernel_range(__phys_to_virt(start),
     				       __phys_to_virt(end));
-    		/*
-    		 * All the memory in CMA region will be on ZONE_MOVABLE.
-    		 * If that zone is considered as highmem, the memory in CMA
-    		 * region is also considered as highmem even if it's
-    		 * physical address belong to lowmem. In this case,
-    		 * re-mapping isn't required.
-    		 */
-    		if (!is_highmem_idx(ZONE_MOVABLE))
-    			iotable_init(&map, 1);
+    		iotable_init(&map, 1);
     	}
     }
@@ Expand Down @@
