Android samsung 2.6.35 Touchscreen again... #5

Merged
2 commits merged into from
Feb 22, 2011

@DW1985 DW1985 commented Feb 21, 2011

Forgot to change two values and set the maximum number of points to 10. Fixed now. Sorry :)

TheEscapist13 pushed a commit to TheEscapist13/android_kernel_samsung_aries that referenced this pull request May 28, 2011
[ Upstream commit e226930 ]

This code has been broken forever, but in several different and
creative ways.

So far as I can work out, the R6040 MAC filter has 4 exact-match
entries, the first of which the driver uses for its assigned unicast
address, plus a 64-entry hash-based filter for multicast addresses
(maybe unicast as well?).

The original version of this code would write the first 4 multicast
addresses as exact-match entries from offset 1 (bug #1: there is no
entry 4 so this could write to some PHY registers).  It would fill the
remainder of the exact-match entries with the broadcast address (bug #2:
this would overwrite the last used entry).  If more than 4 multicast
addresses were configured, it would set up the hash table, write some
random crap to the MAC control register (bug #3) and finally walk off
the end of the list when filling the exact-match entries (bug #4).

All of this seems to be pointless, since it sets the promiscuous bit
when the interface is made promiscuous or if >4 multicast addresses
are enabled, and never clears it (bug #5, masking bug #2).

The recent(ish) changes to the multicast list fixed bug #4, but
completely removed the limit on iteration over the exact-match entries
(bug #6).

Bug #4 was reported as
<https://bugzilla.kernel.org/show_bug.cgi?id=15355> and more recently
as <http://bugs.debian.org/600155>.  Florian Fainelli attempted to fix
these in commit 3bcf822, but that
actually dealt with bugs #1-3, bug #4 having been fixed in mainline at
that point.

That commit fixes the most important current bug #6.
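
The filter layout described above can be sketched in userspace as follows. This is only an illustration under stated assumptions: `struct mac_filter`, `set_filter()` and `mac_crc()` are hypothetical names, the CRC-based bucket choice is a guess at how a 64-entry hash filter might be indexed, and unused exact-match slots are simply left zeroed. It is not the r6040 driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXACT_SLOTS 4   /* slot 0 is reserved for the unicast address */

struct mac_filter {
    uint8_t  exact[EXACT_SLOTS][6];
    uint64_t hash;          /* one bit per bucket of a 64-entry hash filter */
    int      promiscuous;
};

/* Bitwise CRC-32 over a MAC address; used here only as an example hash. */
static uint32_t mac_crc(const uint8_t *addr)
{
    uint32_t crc = 0xffffffffu;
    for (size_t i = 0; i < 6; i++) {
        crc ^= addr[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ ((crc & 1u) ? 0xedb88320u : 0u);
    }
    return crc;
}

static void set_filter(struct mac_filter *f, const uint8_t *unicast,
                       const uint8_t (*mc)[6], size_t n_mc, int promisc)
{
    assert(f != NULL && unicast != NULL);
    memset(f, 0, sizeof(*f));
    memcpy(f->exact[0], unicast, 6);
    /* Recomputed on every call, so the bit can also be cleared again. */
    f->promiscuous = promisc;

    if (n_mc <= EXACT_SLOTS - 1) {
        /* Bounded loop: never writes past exact[EXACT_SLOTS - 1]. */
        for (size_t i = 0; i < n_mc; i++)
            memcpy(f->exact[i + 1], mc[i], 6);
    } else {
        /* Too many addresses for exact matching: use the hash only. */
        for (size_t i = 0; i < n_mc; i++)
            f->hash |= (uint64_t)1 << (mac_crc(mc[i]) >> 26);
    }
}
```

The point of the sketch is that both loops are bounded by a value checked against `EXACT_SLOTS` or `n_mc`, which is the property bugs #1, #4 and #6 violated.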

Signed-off-by: Ben Hutchings <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
TheEscapist13 pushed a commit to TheEscapist13/android_kernel_samsung_aries that referenced this pull request May 28, 2011
Commit: b0a0f66 upstream

> ===================================================
> [ INFO: suspicious rcu_dereference_check() usage. ]
> ---------------------------------------------------
> /home/greearb/git/linux.wireless-testing/kernel/sched.c:618 invoked rcu_dereference_check() without protection!
>
> other info that might help us debug this:
>
> rcu_scheduler_active = 1, debug_locks = 1
> 1 lock held by ifup/23517:
>   #0:  (&rq->lock){-.-.-.}, at: [<c042f782>] task_fork_fair+0x3b/0x108
>
> stack backtrace:
> Pid: 23517, comm: ifup Not tainted 2.6.36-rc6-wl+ #5
> Call Trace:
>   [<c075e219>] ? printk+0xf/0x16
>   [<c0455842>] lockdep_rcu_dereference+0x74/0x7d
>   [<c0426854>] task_group+0x6d/0x79
>   [<c042686e>] set_task_rq+0xe/0x57
>   [<c042f79e>] task_fork_fair+0x57/0x108
>   [<c042e965>] sched_fork+0x82/0xf9
>   [<c04334b3>] copy_process+0x569/0xe8e
>   [<c0433ef0>] do_fork+0x118/0x262
>   [<c076302f>] ? do_page_fault+0x16a/0x2cf
>   [<c044b80c>] ? up_read+0x16/0x2a
>   [<c04085ae>] sys_clone+0x1b/0x20
>   [<c04030a5>] ptregs_clone+0x15/0x30
>   [<c0402f1c>] ? sysenter_do_call+0x12/0x38

Here a newly created task is having its runqueue assigned.  The new task
is not yet on the tasklist, so it cannot go away.  This is therefore a
false positive; suppress it with an RCU read-side critical section.

Reported-by: Ben Greear <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Andi Kleen <[email protected]>
Tested-by: Ben Greear <[email protected]>
Signed-off-by: Mike Galbraith <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
zachariasmaladroit pushed a commit to zachariasmaladroit/android_kernel_samsung_aries that referenced this pull request Aug 3, 2011
mm: unify module_alloc code for vmalloc

    Four architectures (arm, mips, sparc, x86) use __vmalloc_area() for
    module_init().  Much of the code is duplicated and can be generalized in a
    globally accessible function, __vmalloc_node_range().

    __vmalloc_node() now calls into __vmalloc_node_range() with a range of
    [VMALLOC_START, VMALLOC_END) for functionally equivalent behavior.

    Each architecture may then use __vmalloc_node_range() directly to remove
    the duplication of code.
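
The refactoring pattern can be sketched in userspace like this: one general first-fit allocator takes an explicit `[start, end)` range, and the existing entry points become thin wrappers around it. All names (`toy_alloc_range`, `toy_vmalloc`, `toy_module_alloc`) and the address constants are made up for the example and do not match the kernel's signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE          0x1000u
#define TOY_VMALLOC_START 0x1000u
#define TOY_VMALLOC_END   0x9000u
#define TOY_MODULES_START 0x9000u
#define TOY_MODULES_END   0xa000u
#define MAX_AREAS         16

struct area { uintptr_t start, end; };
static struct area areas[MAX_AREAS];
static size_t n_areas;

static int overlaps(uintptr_t s, uintptr_t e)
{
    for (size_t i = 0; i < n_areas; i++)
        if (s < areas[i].end && areas[i].start < e)
            return 1;
    return 0;
}

/* General allocator: first fit of `size` bytes in [start, end); 0 on failure. */
static uintptr_t toy_alloc_range(size_t size, uintptr_t start, uintptr_t end)
{
    assert(start < end);
    if (size == 0 || n_areas == MAX_AREAS)
        return 0;
    for (uintptr_t s = start; s + size <= end; s += TOY_PAGE) {
        if (!overlaps(s, s + size)) {
            areas[n_areas++] = (struct area){ s, s + size };
            return s;
        }
    }
    return 0;
}

/* The generic entry point keeps its old behaviour by passing the default range. */
static uintptr_t toy_vmalloc(size_t size)
{
    return toy_alloc_range(size, TOY_VMALLOC_START, TOY_VMALLOC_END);
}

/* An architecture's module allocator only differs in the range it passes. */
static uintptr_t toy_module_alloc(size_t size)
{
    return toy_alloc_range(size, TOY_MODULES_START, TOY_MODULES_END);
}
```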

    Signed-off-by: David Rientjes <[email protected]>
    Cc: Christoph Lameter <[email protected]>
    Cc: Russell King <[email protected]>
    Cc: Ralf Baechle <[email protected]>
    Cc: "David S. Miller" <[email protected]>
    Cc: Ingo Molnar <[email protected]>
    Cc: "H. Peter Anvin" <[email protected]>
    Cc: Thomas Gleixner <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

mm: clear PageError bit in msync & fsync

    Temporary IO failures, e.g. due to loss of both multipath paths, can
    permanently leave the PageError bit set on a page, resulting in msync or
    fsync returning -EIO over and over again, even if IO is now getting to the
    disk correctly.

    We already clear the AS_ENOSPC and AS_IO bits in mapping->flags in the
    filemap_fdatawait_range function.  Also clearing the PageError bit on the
    page allows subsequent msync or fsync calls on this file to return without
    an error, if the subsequent IO succeeds.

    Unfortunately data written out in the msync or fsync call that returned
    -EIO can still get lost, because the page dirty bit appears to not get
    restored on IO error.  However, the alternative could be potentially all
    of memory filling up with uncleanable dirty pages, hanging the system, so
    there is no nice choice here...
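
A minimal sketch of the test-and-clear behaviour described above, using a hypothetical `struct toy_page` and `toy_fdatawait()` rather than the kernel's page flags API: the error is reported once, and the flag is cleared so that a later call can succeed if the IO now works.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_page { bool error; };

/* Wait on a range of pages; report -EIO once and clear each error flag seen. */
static int toy_fdatawait(struct toy_page *pages, size_t n)
{
    int ret = 0;

    assert(pages != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (pages[i].error) {          /* test ... */
            pages[i].error = false;    /* ... and clear */
            ret = -EIO;
        }
    }
    return ret;
}
```

Without the clearing step, every later call over the same pages would keep returning -EIO, which is the behaviour the patch description complains about.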

    Signed-off-by: Rik van Riel <[email protected]>
    Acked-by: Valerie Aurora <[email protected]>
    Acked-by: Jeff Layton <[email protected]>
    Cc: Theodore Ts'o <[email protected]>
    Acked-by: Jan Kara <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

mm: set correct numa_zonelist_order string when configured on the kernel command line

    When the numa_zonelist_order parameter is set to "node" or "zone" on
    the command line, sysctl still shows it as "default".  That's because
    the early_param parsing function changes only the user_zonelist_order
    variable.  Fix this by copying the user-provided string to
    numa_zonelist_order if it was successfully parsed.
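
Roughly, the fix keeps two pieces of state in sync: the parsed value and the string reported back to the user. The sketch below is a userspace approximation; `parse_order()` and `setup_order()` are hypothetical helpers, and this parser only accepts the three exact strings, which may be stricter than the real one.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { ORDER_DEFAULT, ORDER_NODE, ORDER_ZONE };

static int  user_zonelist_order = ORDER_DEFAULT;
static char numa_zonelist_order[16] = "default";   /* what sysctl would show */

/* Parse the option; returns 0 on success, -1 if the string is not recognised. */
static int parse_order(const char *s)
{
    if (strcmp(s, "default") == 0)
        user_zonelist_order = ORDER_DEFAULT;
    else if (strcmp(s, "node") == 0)
        user_zonelist_order = ORDER_NODE;
    else if (strcmp(s, "zone") == 0)
        user_zonelist_order = ORDER_ZONE;
    else
        return -1;
    return 0;
}

/* Command-line handler: copy the string only if parsing succeeded. */
static int setup_order(const char *s)
{
    int ret;

    assert(s != NULL);
    ret = parse_order(s);
    if (ret == 0)
        snprintf(numa_zonelist_order, sizeof(numa_zonelist_order), "%s", s);
    return ret;
}
```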

    Signed-off-by: Volodymyr G Lukiianyk <[email protected]>
    Acked-by: KAMEZAWA Hiroyuki <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

mm: clear pages_scanned only if draining a pcp adds pages to the buddy allocator

    Commit 0e093d9 ("writeback: do not sleep on the congestion queue if
    there are no congested BDIs or if significant congestion is not being
    encountered in the current zone") uncovered a livelock in the page
    allocator that resulted in tasks infinitely looping trying to find
    memory and kswapd running at 100% cpu.

    The issue occurs because drain_all_pages() is called immediately
    following direct reclaim when no memory is freed and try_to_free_pages()
    returns non-zero because all zones in the zonelist do not have their
    all_unreclaimable flag set.

    When draining the per-cpu pagesets back to the buddy allocator for each
    zone, the zone->pages_scanned counter is cleared to avoid erroneously
    setting zone->all_unreclaimable later.  The problem is that no pages may
    actually be drained and, thus, the unreclaimable logic never fails
    direct reclaim so the oom killer may be invoked.

    This apparently only manifested after wait_iff_congested() was
    introduced and the zone was full of anonymous memory that would not
    congest the backing store.  The page allocator would infinitely loop if
    there were no other tasks waiting to be scheduled and clear
    zone->pages_scanned because of drain_all_pages() as the result of this
    change before kswapd could scan enough pages to trigger the reclaim
    logic.  Additionally, with every loop of the page allocator and in the
    reclaim path, kswapd would be kicked and would end up running at 100%
    cpu.  In this scenario, current and kswapd are all running continuously
    with kswapd incrementing zone->pages_scanned and current clearing it.

    The problem is even more pronounced when current swaps some of its
    memory to swap cache and the reclaimable logic then considers all active
    anonymous memory in the all_unreclaimable logic, which requires a much
    higher zone->pages_scanned value for try_to_free_pages() to return zero
    that is never attainable in this scenario.

    Before wait_iff_congested(), the page allocator would incur an
    unconditional timeout and allow kswapd to elevate zone->pages_scanned to
    a level that the oom killer would be called the next time it loops.

    The fix is to only attempt to drain pcp pages if there is actually a
    quantity to be drained.  The unconditional clearing of
    zone->pages_scanned in free_pcppages_bulk() need not be changed since
    other callers already ensure that draining will occur.  This patch
    ensures that free_pcppages_bulk() will actually free memory before
    calling into it from drain_all_pages() so zone->pages_scanned is only
    cleared if appropriate.
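
The shape of the fix can be sketched as follows, with hypothetical `toy_` structures standing in for the zone and per-cpu pageset. The bulk-free helper still clears `pages_scanned` unconditionally; the caller simply does not call it when there is nothing to drain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_zone { unsigned long pages_scanned; unsigned long free_pages; };
struct toy_pcp  { unsigned long count; };

/* Bulk free: unconditionally resets pages_scanned, as described above. */
static void toy_free_pcppages_bulk(struct toy_zone *zone, struct toy_pcp *pcp)
{
    zone->pages_scanned = 0;
    zone->free_pages += pcp->count;
    pcp->count = 0;
}

/* Drain only if there is something to drain; returns true if pages were freed. */
static bool toy_drain_pages(struct toy_zone *zone, struct toy_pcp *pcp)
{
    assert(zone != NULL && pcp != NULL);
    if (pcp->count == 0)
        return false;   /* leave pages_scanned alone so reclaim can make progress */
    toy_free_pcppages_bulk(zone, pcp);
    return true;
}
```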

    Signed-off-by: David Rientjes <[email protected]>
    Cc: Mel Gorman <[email protected]>
    Reviewed-by: Johannes Weiner <[email protected]>
    Cc: Minchan Kim <[email protected]>
    Cc: Wu Fengguang <[email protected]>
    Cc: KAMEZAWA Hiroyuki <[email protected]>
    Cc: KOSAKI Motohiro <[email protected]>
    Reviewed-by: Rik van Riel <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

mm: <asm-generic/pgtable.h> must include <linux/mm_types.h>

    Commit e2cda32 ("thp: add pmd mangling generic functions") replaced
    some macros in <asm-generic/pgtable.h> with inline functions.

    If the functions are to be defined (not all architectures need them)
    then struct vm_area_struct must be defined first.  So include
    <linux/mm_types.h>.

    Fixes a build failure seen in Debian:

        CC [M]  drivers/media/dvb/mantis/mantis_pci.o
      In file included from arch/arm/include/asm/pgtable.h:460,
                       from drivers/media/dvb/mantis/mantis_pci.c:25:
      include/asm-generic/pgtable.h: In function 'ptep_test_and_clear_young':
      include/asm-generic/pgtable.h:29: error: dereferencing pointer to incomplete type

    Signed-off-by: Ben Hutchings <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

mm: use alloc_bootmem_node_nopanic() on really needed path

    commit 8f389a9 upstream.

    Stefan found nobootmem does not work on his system that has only 8M of
    RAM.  This causes an early panic:

      BIOS-provided physical RAM map:
       BIOS-88: 0000000000000000 - 000000000009f000 (usable)
       BIOS-88: 0000000000100000 - 0000000000840000 (usable)
      bootconsole [earlyser0] enabled
      Notice: NX (Execute Disable) protection missing in CPU or disabled in BIOS!
      DMI not present or invalid.
      last_pfn = 0x840 max_arch_pfn = 0x100000
      init_memory_mapping: 0000000000000000-0000000000840000
      8MB LOWMEM available.
        mapped low ram: 0 - 00840000
        low ram: 0 - 00840000
      Zone PFN ranges:
        DMA      0x00000001 -> 0x00001000
        Normal   empty
      Movable zone start PFN for each node
      early_node_map[2] active PFN ranges
          0: 0x00000001 -> 0x0000009f
          0: 0x00000100 -> 0x00000840
      BUG: Int 6: CR2 (null)
           EDI c034663c  ESI (null)  EBP c0329f38  ESP c0329ef4
           EBX c0346380  EDX 00000006  ECX ffffffff  EAX fffffff4
           err (null)  EIP c0353191   CS c0320060  flg 00010082
      Stack: (null) c030c533 000007cd (null) c030c533 00000001 (null) (null)
             00000003 0000083f 00000018 00000002 00000002 c0329f6c c03534d6 (null)
             (null) 00000100 00000840 (null) c0329f64 00000001 00001000 (null)
      Pid: 0, comm: swapper Not tainted 2.6.36 #5
      Call Trace:
       [<c02e3707>] ? 0xc02e3707
       [<c035e6e5>] 0xc035e6e5
       [<c0353191>] ? 0xc0353191
       [<c03534d6>] 0xc03534d6
       [<c034f1cd>] 0xc034f1cd
       [<c034a824>] 0xc034a824
       [<c03513cb>] ? 0xc03513cb
       [<c0349432>] 0xc0349432
       [<c0349066>] 0xc0349066

    It turns out that we should ignore the low limit of 16M.

    Use alloc_bootmem_node_nopanic() in this case.

    [[email protected]: less mess]
    Signed-off-by: Yinghai LU <[email protected]>
    Reported-by: Stefan Hellermann <[email protected]>
    Tested-by: Stefan Hellermann <[email protected]>
    Cc: Ingo Molnar <[email protected]>
    Cc: "H. Peter Anvin" <[email protected]>
    Cc: Thomas Gleixner <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm: vmscan: correctly check if reclaimer should schedule during shrink_slab

    commit f06590b upstream.

    It has been reported on some laptops that kswapd is consuming large
    amounts of CPU and not being scheduled when SLUB is enabled during large
    amounts of file copying.  It is expected that this is due to kswapd
    missing every cond_resched() point because:

    shrink_page_list() calls cond_resched() if inactive pages were isolated
            which in turn may not happen if all_unreclaimable is set in
            shrink_zones(). If for whatever reason, all_unreclaimable is
            set on all zones, we can miss calling cond_resched().

    balance_pgdat() only calls cond_resched if the zones are not
            balanced. For a high-order allocation that is balanced, it
            checks order-0 again. During that window, order-0 might have
            become unbalanced so it loops again for order-0 and returns
            that it was reclaiming for order-0 to kswapd(). It can then
            find that a caller has rewoken kswapd for a high-order and
            re-enters balance_pgdat() without ever calling cond_resched().

    shrink_slab only calls cond_resched() if we are reclaiming slab
            pages. If there are a large number of direct reclaimers, the
            shrinker_rwsem can be contended and prevent kswapd calling
            cond_resched().

    This patch modifies the shrink_slab() case.  If the semaphore is
    contended, the caller will still check cond_resched().  After each
    successful call into a shrinker, the check for cond_resched() remains in
    case one shrinker is particularly slow.

    [[email protected]: preserve call to cond_resched after each call into shrinker]
    Signed-off-by: Mel Gorman <[email protected]>
    Signed-off-by: Minchan Kim <[email protected]>
    Cc: Rik van Riel <[email protected]>
    Cc: Johannes Weiner <[email protected]>
    Cc: Wu Fengguang <[email protected]>
    Cc: James Bottomley <[email protected]>
    Tested-by: Colin King <[email protected]>
    Cc: Raghavendra D Prabhu <[email protected]>
    Cc: Jan Kara <[email protected]>
    Cc: Chris Mason <[email protected]>
    Cc: Christoph Lameter <[email protected]>
    Cc: Pekka Enberg <[email protected]>
    Cc: Rik van Riel <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

memcg: more mem_cgroup_uncharge() batching

    It seems odd that truncate_inode_pages_range(), called not only when
    truncating but also when evicting inodes, has mem_cgroup_uncharge_start
    and _end() batching in its second loop to clear up a few leftovers, but
    not in its first loop that does almost all the work: add them there too.

    Signed-off-by: Hugh Dickins <[email protected]>
    Acked-by: KAMEZAWA Hiroyuki <[email protected]>
    Acked-by: Balbir Singh <[email protected]>
    Acked-by: Daisuke Nishimura <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

mm/migrate.c: fix compilation error

    GCC complained about update_mmu_cache() not being defined in migrate.c.
    Including <asm/tlbflush.h> seems to solve the problem.

    Signed-off-by: Michal Nazarewicz <[email protected]>
    Signed-off-by: Kyungmin Park <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>
DerTeufel pushed a commit to DerTeufel/samsung-kernel-aries that referenced this pull request Jun 9, 2012
commit a18a920 upstream.

This patch validates sdev pointer in scsi_dh_activate before proceeding further.

Without this check we might see the panic as below. I have seen this
panic multiple times.

Call trace:

 #0 [ffff88007d647b50] machine_kexec at ffffffff81020902
 #1 [ffff88007d647ba0] crash_kexec at ffffffff810875b0
 #2 [ffff88007d647c70] oops_end at ffffffff8139c650
 #3 [ffff88007d647c90] __bad_area_nosemaphore at ffffffff8102dd15
 #4 [ffff88007d647d50] page_fault at ffffffff8139b8cf
    [exception RIP: scsi_dh_activate+0x82]
    RIP: ffffffffa0041922  RSP: ffff88007d647e00  RFLAGS: 00010046
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: 00000000000093c5
    RDX: 00000000000093c5  RSI: ffffffffa02e6640  RDI: ffff88007cc88988
    RBP: 000000000000000f   R8: ffff88007d646000   R9: 0000000000000000
    R10: ffff880082293790  R11: 00000000ffffffff  R12: ffff88007cc88988
    R13: 0000000000000000  R14: 0000000000000286  R15: ffff880037b845e0
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
 #5 [ffff88007d647e38] run_workqueue at ffffffff81060268
 #6 [ffff88007d647e78] worker_thread at ffffffff81060386
 #7 [ffff88007d647ee8] kthread at ffffffff81064436
 #8 [ffff88007d647f48] kernel_thread at ffffffff81003fba

Signed-off-by: Babu Moger <[email protected]>
Signed-off-by: James Bottomley <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
DerTeufel pushed a commit to DerTeufel/samsung-kernel-aries that referenced this pull request Jun 9, 2012
…(try #5)

commit 1788ea6 upstream.

commit d953126 changed how nfs_atomic_lookup handles an -EISDIR return
from an OPEN call. Prior to that patch, that caused the client to fall
back to doing a normal lookup. When that patch went in, the code began
returning that error to userspace. The d_revalidate codepath however
never had the corresponding change, so it was still possible to end up
with a NULL ctx->state pointer after that.

That patch caused a regression. When we attempt to open a directory that
does not have a cached dentry, that open now errors out with EISDIR. If
you attempt the same open with a cached dentry, it will succeed.

Fix this by reverting the change in nfs_atomic_lookup and allowing
attempts to open directories to fall back to a normal lookup.

Also, add a NFSv4-specific f_ops->open routine that just returns
-ENOTDIR. This should never be called if things are working properly,
but if it ever is, then the dprintk may help in debugging.

To facilitate this, a new file_operations field is also added to the
nfs_rpc_ops struct.

Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
hiepgia added a commit to hiepgia/subZero that referenced this pull request Jun 13, 2012
DerTeufel pushed a commit to DerTeufel/samsung-kernel-aries that referenced this pull request Jun 27, 2012
…condition

commit 26c1917 upstream.

When holding the mmap_sem for reading, pmd_offset_map_lock should only
run on a pmd_t that has been read atomically from the pmdp pointer,
otherwise we may read only half of it leading to this crash.

PID: 11679  TASK: f06e8000  CPU: 3   COMMAND: "do_race_2_panic"
 #0 [f06a9dd8] crash_kexec at c049b5ec
 #1 [f06a9e2c] oops_end at c083d1c2
 #2 [f06a9e40] no_context at c0433ded
 #3 [f06a9e64] bad_area_nosemaphore at c043401a
 #4 [f06a9e6c] __do_page_fault at c0434493
 #5 [f06a9eec] do_page_fault at c083eb45
 #6 [f06a9f04] error_code (via page_fault) at c083c5d5
    EAX: 01fb470c EBX: fff35000 ECX: 00000003 EDX: 00000100 EBP: 00000000
    DS:  007b     ESI: 9e201000 ES:  007b     EDI: 01fb4700 GS:  00e0
    CS:  0060     EIP: c083bc14 ERR: ffffffff EFLAGS: 00010246
 #7 [f06a9f38] _spin_lock at c083bc14
 #8 [f06a9f44] sys_mincore at c0507b7d
 #9 [f06a9fb0] system_call at c083becd
                         start           len
    EAX: ffffffda  EBX: 9e200000  ECX: 00001000  EDX: 6228537f
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 003d0f00
    SS:  007b      ESP: 62285354  EBP: 62285388  GS:  0033
    CS:  0073      EIP: 00291416  ERR: 000000da  EFLAGS: 00000286

This should be a longstanding bug affecting x86 32bit PAE without THP.
Only archs with 64bit large pmd_t and 32bit unsigned long should be
affected.

With THP enabled the barrier() in pmd_none_or_trans_huge_or_clear_bad()
would partly hide the bug when the pmd transitions from none to stable,
by forcing a re-read of the *pmd in pmd_offset_map_lock, but when THP is
enabled a new set of problems arises from the fact that the pmd could then
transition freely among the none, pmd_trans_huge and pmd_trans_stable
states.  So making the barrier in pmd_none_or_trans_huge_or_clear_bad()
unconditional isn't a good idea and it would be a flaky solution.

This should be fully fixed by introducing a pmd_read_atomic that reads
the pmd in order with THP disabled, or by reading the pmd atomically
with cmpxchg8b with THP enabled.

Luckily this new race condition only triggers in the places that must
already be covered by pmd_none_or_trans_huge_or_clear_bad(), so the fix
is localized there, but this bug is not related to THP.

NOTE: this can trigger on x86 32bit systems with PAE enabled and more
than 4G of ram; otherwise the high part of the pmd would be zero at all
times and could never be truncated, which in turn hides the SMP race.

This bug was discovered and fully debugged by Ulrich, quote:

----
[..]
pmd_none_or_trans_huge_or_clear_bad() loads the content of edx and
eax.

    static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t *pmd)
    {
            /* depend on compiler for an atomic pmd read */
            pmd_t pmdval = *pmd;

                                // edi = pmd pointer
0xc0507a74 <sys_mincore+548>:   mov    0x8(%esp),%edi
...
                                // edx = PTE page table high address
0xc0507a84 <sys_mincore+564>:   mov    0x4(%edi),%edx
...
                                // eax = PTE page table low address
0xc0507a8e <sys_mincore+574>:   mov    (%edi),%eax

[..]

Please note that the PMD is not read atomically. These are two "mov"
instructions where the high order bits of the PMD entry are fetched
first. Hence, the above machine code is prone to the following race.

-  The PMD entry {high|low} is 0x0000000000000000.
   The "mov" at 0xc0507a84 loads 0x00000000 into edx.

-  A page fault (on another CPU) sneaks in between the two "mov"
   instructions and instantiates the PMD.

-  The PMD entry {high|low} is now 0x00000003fda38067.
   The "mov" at 0xc0507a8e loads 0xfda38067 into eax.
----
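
The race Ulrich describes can be replayed deterministically in userspace. In the sketch below, a hook called between the two 32-bit loads stands in for the page fault on another CPU. `struct toy_pmd`, the hook and both read functions are illustrative names only, and the ordered read is an approximation of the "read the pmd in order" idea for the case where the entry can only go from none to populated while the reader runs. Real SMP code would also need a read barrier between the loads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A 64-bit entry stored as two 32-bit words, as on 32-bit PAE. */
struct toy_pmd { volatile uint32_t low, high; };

/* Hook run between the two loads; stands in for a fault on another CPU. */
typedef void (*race_hook)(struct toy_pmd *);

static void instantiate(struct toy_pmd *p)
{
    p->high = 0x00000003u;
    p->low  = 0xfda38067u;
}

/* Racy order from the disassembly: high word first, then low. */
static uint64_t read_high_then_low(struct toy_pmd *p, race_hook hook)
{
    assert(p != NULL);
    uint32_t high = p->high;
    if (hook)
        hook(p);
    uint32_t low = p->low;
    return ((uint64_t)high << 32) | low;
}

/* Ordered read: low word first; fetch the high word only if low is non-zero. */
static uint64_t read_low_then_high(struct toy_pmd *p, race_hook hook)
{
    assert(p != NULL);
    uint64_t val = p->low;
    if (hook)
        hook(p);
    if (val != 0)
        val |= (uint64_t)p->high << 32;   /* a read barrier belongs before this load */
    return val;
}
```

With the hook installed, the first function returns a value whose low half is populated and whose high half is still zero, which matches the torn value in the quoted analysis. The second returns either "none" or the complete entry.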

Reported-by: Ulrich Obergfell <[email protected]>
Signed-off-by: Andrea Arcangeli <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Larry Woodman <[email protected]>
Cc: Petr Matousek <[email protected]>
Cc: Rik van Riel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
DerTeufel pushed a commit to DerTeufel/samsung-kernel-aries that referenced this pull request Aug 12, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.
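
The flag-based approach might look roughly like this in a userspace model. `TOY_PF_FSTRANS`, `struct toy_task` and the helper functions are stand-ins invented for the sketch; the allocation callback pretends that memory reclaim calls back into the release-page path, and it succeeds only when that path would skip the blocking commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PF_FSTRANS 0x1u

struct toy_task { unsigned int flags; };

/* Release-page side: a blocking commit is allowed only outside the flagged region. */
static bool toy_may_block_on_commit(const struct toy_task *t)
{
    return !(t->flags & TOY_PF_FSTRANS);
}

/* Stand-in for an allocation that triggers reclaim; a blocking commit here would deadlock. */
static bool toy_alloc_with_reclaim(const struct toy_task *t)
{
    return !toy_may_block_on_commit(t);
}

/* Run an allocation-prone step with the flag set, then restore the previous state. */
static bool toy_setup_socket(struct toy_task *t, bool (*alloc)(const struct toy_task *))
{
    assert(t != NULL && alloc != NULL);
    unsigned int saved = t->flags & TOY_PF_FSTRANS;

    t->flags |= TOY_PF_FSTRANS;
    bool ok = alloc(t);
    t->flags = (t->flags & ~TOY_PF_FSTRANS) | saved;
    return ok;
}
```

Saving and restoring the old flag value keeps the helper safe to call from a context that already has the flag set.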

Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
DerTeufel pushed a commit to DerTeufel/samsung-kernel-aries that referenced this pull request Oct 15, 2012
commit 160c942 upstream.

Interface #5 on ZTE MF683 is a QMI/wwan interface.

Signed-off-by: Bjørn Mork <[email protected]>
Cc: Shawn J. Goff <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
DerTeufel pushed a commit to DerTeufel/samsung-kernel-aries that referenced this pull request Nov 18, 2012
commit 3e7abe2 upstream.

When unbinding a device so that I could pass it through to a KVM VM, I
got the lockdep report below.  It looks like a legitimate lock
ordering problem:

 - domain_context_mapping_one() takes iommu->lock and calls
   iommu_support_dev_iotlb(), which takes device_domain_lock (inside
   iommu->lock).

 - domain_remove_one_dev_info() starts by taking device_domain_lock
   then takes iommu->lock inside it (near the end of the function).

So this is the classic AB-BA deadlock.  It looks like a safe fix is to
simply release device_domain_lock a bit earlier, since as far as I can
tell, it doesn't protect any of the stuff accessed at the end of
domain_remove_one_dev_info() anyway.
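
To make the ordering problem concrete, here is a tiny single-threaded lock-order tracker, loosely in the spirit of what lockdep reports. It is a hypothetical sketch (`toy_lock`, `toy_unlock` and the two lock IDs are made up) that records which lock was held when another was taken and flags the first acquisition that inverts a previously seen order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { LOCK_IOMMU, LOCK_DEVICE_DOMAIN, N_LOCKS };

static bool held[N_LOCKS];
/* seen_before[a][b] is true if a was held at some point when b was taken. */
static bool seen_before[N_LOCKS][N_LOCKS];

/* Record an acquisition; returns false if it inverts an order seen earlier. */
static bool toy_lock(int l)
{
    assert(l >= 0 && l < N_LOCKS);
    bool ok = true;

    for (int h = 0; h < N_LOCKS; h++) {
        if (!held[h] || h == l)
            continue;
        if (seen_before[l][h])
            ok = false;             /* earlier: l then h; now: h then l */
        seen_before[h][l] = true;
    }
    held[l] = true;
    return ok;
}

static void toy_unlock(int l)
{
    assert(l >= 0 && l < N_LOCKS);
    held[l] = false;
}
```

Taking the locks nested in one order and later nested in the opposite order trips the check, while dropping device_domain_lock before taking iommu->lock does not, which is the approach the fix takes.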

BTW, the use of device_domain_lock looks a bit unsafe to me... it's
at least not obvious to me why we aren't vulnerable to the race below:

  iommu_support_dev_iotlb()
                                          domain_remove_dev_info()

  lock device_domain_lock
    find info
  unlock device_domain_lock

                                          lock device_domain_lock
                                            find same info
                                          unlock device_domain_lock

                                          free_devinfo_mem(info)

  do stuff with info after it's free

However I don't understand the locking here well enough to know if
this is a real problem, let alone what the best fix is.

Anyway here's the full lockdep output that prompted all of this:

     =======================================================
     [ INFO: possible circular locking dependency detected ]
     2.6.39.1+ #1
     -------------------------------------------------------
     bash/13954 is trying to acquire lock:
      (&(&iommu->lock)->rlock){......}, at: [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230

     but task is already holding lock:
      (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230

     which lock already depends on the new lock.

     the existing dependency chain (in reverse order) is:

     -> #1 (device_domain_lock){-.-...}:
            [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
            [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
            [<ffffffff812f8350>] domain_context_mapping_one+0x600/0x750
            [<ffffffff812f84df>] domain_context_mapping+0x3f/0x120
            [<ffffffff812f9175>] iommu_prepare_identity_map+0x1c5/0x1e0
            [<ffffffff81ccf1ca>] intel_iommu_init+0x88e/0xb5e
            [<ffffffff81cab204>] pci_iommu_init+0x16/0x41
            [<ffffffff81002165>] do_one_initcall+0x45/0x190
            [<ffffffff81ca3d3f>] kernel_init+0xe3/0x168
            [<ffffffff8157ac24>] kernel_thread_helper+0x4/0x10

     -> #0 (&(&iommu->lock)->rlock){......}:
            [<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10
            [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
            [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
            [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
            [<ffffffff812f8b42>] device_notifier+0x72/0x90
            [<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0
            [<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0
            [<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20
            [<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0
            [<ffffffff81373ccf>] device_release_driver+0x2f/0x50
            [<ffffffff81372ee3>] driver_unbind+0xa3/0xc0
            [<ffffffff813724ac>] drv_attr_store+0x2c/0x30
            [<ffffffff811e4506>] sysfs_write_file+0xe6/0x170
            [<ffffffff8117569e>] vfs_write+0xce/0x190
            [<ffffffff811759e4>] sys_write+0x54/0xa0
            [<ffffffff81579a82>] system_call_fastpath+0x16/0x1b

     other info that might help us debug this:

     6 locks held by bash/13954:
      #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff811e4464>] sysfs_write_file+0x44/0x170
      #1:  (s_active#3){++++.+}, at: [<ffffffff811e44ed>] sysfs_write_file+0xcd/0x170
      #2:  (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81372edb>] driver_unbind+0x9b/0xc0
      #3:  (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81373cc7>] device_release_driver+0x27/0x50
      #4:  (&(&priv->bus_notifier)->rwsem){.+.+.+}, at: [<ffffffff8108974f>] __blocking_notifier_call_chain+0x5f/0xb0
      #5:  (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230

     stack backtrace:
     Pid: 13954, comm: bash Not tainted 2.6.39.1+ #1
     Call Trace:
      [<ffffffff810993a7>] print_circular_bug+0xf7/0x100
      [<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10
      [<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10
      [<ffffffff8109d57d>] ? trace_hardirqs_on_caller+0x13d/0x180
      [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
      [<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230
      [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
      [<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230
      [<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10
      [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
      [<ffffffff812f8b42>] device_notifier+0x72/0x90
      [<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0
      [<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0
      [<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20
      [<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0
      [<ffffffff81373ccf>] device_release_driver+0x2f/0x50
      [<ffffffff81372ee3>] driver_unbind+0xa3/0xc0
      [<ffffffff813724ac>] drv_attr_store+0x2c/0x30
      [<ffffffff811e4506>] sysfs_write_file+0xe6/0x170
      [<ffffffff8117569e>] vfs_write+0xce/0x190
      [<ffffffff811759e4>] sys_write+0x54/0xa0
      [<ffffffff81579a82>] system_call_fastpath+0x16/0x1b

Signed-off-by: Roland Dreier <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
Cc: Steven Rostedt <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
DerTeufel pushed a commit to DerTeufel/samsung-kernel-aries that referenced this pull request Dec 12, 2012
commit 412d32e upstream.

A rescue thread exiting TASK_INTERRUPTIBLE can lead to a task scheduling
off, never to be seen again.  In the case where this occurred, an exiting
thread hit reiserfs homebrew conditional resched while holding a mutex,
bringing the box to its knees.

PID: 18105  TASK: ffff8807fd412180  CPU: 5   COMMAND: "kdmflush"
 #0 [ffff8808157e7670] schedule at ffffffff8143f489
 #1 [ffff8808157e77b8] reiserfs_get_block at ffffffffa038ab2d [reiserfs]
 #2 [ffff8808157e79a8] __block_write_begin at ffffffff8117fb14
 #3 [ffff8808157e7a98] reiserfs_write_begin at ffffffffa0388695 [reiserfs]
 #4 [ffff8808157e7ad8] generic_perform_write at ffffffff810ee9e2
 #5 [ffff8808157e7b58] generic_file_buffered_write at ffffffff810eeb41
 #6 [ffff8808157e7ba8] __generic_file_aio_write at ffffffff810f1a3a
 #7 [ffff8808157e7c58] generic_file_aio_write at ffffffff810f1c88
 #8 [ffff8808157e7cc8] do_sync_write at ffffffff8114f850
 #9 [ffff8808157e7dd8] do_acct_process at ffffffff810a268f
    [exception RIP: kernel_thread_helper]
    RIP: ffffffff8144a5c0  RSP: ffff8808157e7f58  RFLAGS: 00000202
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: ffffffff8107af60  RDI: ffff8803ee491d18
    RBP: 0000000000000000   R8: 0000000000000000   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000000  R12: 0000000000000000
    R13: 0000000000000000  R14: 0000000000000000  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018

Signed-off-by: Mike Galbraith <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
humberos referenced this pull request in humberos/android_kernel_samsung_aries May 6, 2013
…>cpu

A kernel panic was observed when passing the sc->request->cpu = -1 to
retrieve the per_cpu variable pointer:
 #0 [ffff880011203960] machine_kexec at ffffffff81022bc3
 #1 [ffff8800112039b0] crash_kexec at ffffffff81088630
 #2 [ffff880011203a80] __die at ffffffff8139ea20
 #3 [ffff880011203aa0] no_context at ffffffff8102f3a7
 #4 [ffff880011203ae0] __bad_area_nosemaphore at ffffffff8102f665
 #5 [ffff880011203ba0] retint_signal at ffffffff8139dd1f
 #6 [ffff880011203cc8] bnx2i_indicate_kcqe at ffffffffa03dc4f2
 #7 [ffff880011203da8] service_kcqes at ffffffffa03cb04f
 #8 [ffff880011203e68] cnic_service_bnx2x_kcq at ffffffffa03cb14a
 #9 [ffff880011203e88] cnic_service_bnx2x_bh at ffffffffa03cb1b3

The problem lies in the slow path sg_io (and perhaps sg_scsi_ioctl) call to
blk_get_request->get_request/wait->blk_alloc_request->blk_rq_init, which
re-initializes request->cpu to -1.  Nothing assigns cpu between that point
and the request_fn call into the low-level drivers.

When this happens, sc->request->cpu still holds its initial value of
-1.  bnx2i then panics because it uses that value to look up the
per_cpu variable pointer.

This change adds a guard against that case, and against cases where
bio affinity/queue completion to the same cpu is not enabled.  In those
cases, request->cpu also remains -1.

This bug was introduced by commit b5cf6b6

For the case when the blk layer did not setup the request->cpu, bnx2i
will complete the sc with the current CPU of the thread.
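
The guard described above can be sketched in plain C, outside the kernel. This is only an illustration under stated assumptions: `completion_cpu`, `REQ_CPU_UNSET`, and the `nr_cpus` bound are hypothetical names, not bnx2i code, and the upper-bound check is an extra precaution that the commit message does not mention.

```c
#include <assert.h>

/* Hypothetical name for the "no CPU assigned" value that blk_rq_init leaves behind. */
#define REQ_CPU_UNSET (-1)

/*
 * Pick the CPU whose per-CPU completion data will be used for a command.
 *   request_cpu: value copied from the request (may be REQ_CPU_UNSET)
 *   current_cpu: CPU the calling thread is running on
 *   nr_cpus:     number of per-CPU slots (assumed bound, not from the commit)
 */
int completion_cpu(int request_cpu, int current_cpu, int nr_cpus)
{
    assert(current_cpu >= 0 && current_cpu < nr_cpus);

    /* Guard: an unset or out-of-range value must never index per-CPU data. */
    if (request_cpu <= REQ_CPU_UNSET || request_cpu >= nr_cpus)
        return current_cpu;

    return request_cpu;
}
```

With this shape, a request that went through the sg_io slow path (cpu left at -1) falls back to the current CPU, while a request with a valid cpu keeps its affinity.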

Signed-off-by: Eddie Wai <[email protected]>
Signed-off-by: James Bottomley <[email protected]>
humberos referenced this pull request in humberos/android_kernel_samsung_aries May 6, 2013
The rtnl cannot be held during the fcoe_interface_put.
If it is the last reference on the fcoe_interface the
fcoe_ctlr_destroy will be called as a part of the
cleanup, ultimately calling cancel_work_sync(&fip->recv_work);

If we are processing a flogi response we will be in
the recv_work context and we will lock the rtnl to
add a new unicast MAC address. This is how the deadlock
can occur.

The fix is simply to move the rtnl_lock/unlock into
fcoe_interface_cleanup so that it can be unlocked before
fcoe_interface_put is called.

Here is the lockdep report:

Jul 21 11:26:35 bubba [  223.870702]
Jul 21 11:26:35 bubba [  223.870704] =======================================================
Jul 21 11:26:35 bubba [  223.871255] [ INFO: possible circular locking dependency detected ]
Jul 21 11:26:35 bubba [  223.871530] 3.0.0-rc7+ #1
Jul 21 11:26:35 bubba [  223.871797] -------------------------------------------------------
Jul 21 11:26:35 bubba [  223.872072] lockdeptest.sh/3464 is trying to acquire lock:
Jul 21 11:26:35 bubba [  223.872345]  ((&fip->recv_work)
Jul 21 11:26:35 bubba ){+.+.+.}
Jul 21 11:26:35 bubba , at:
Jul 21 11:26:35 bubba [<ffffffff810531f1>] wait_on_work+0x0/0xbd
Jul 21 11:26:35 bubba [  223.873022]
Jul 21 11:26:35 bubba [  223.873023] but task is already holding lock:
Jul 21 11:26:35 bubba [  223.873555]  (rtnl_mutex
Jul 21 11:26:35 bubba ){+.+.+.}
Jul 21 11:26:35 bubba , at:
Jul 21 11:26:35 bubba [<ffffffff813e8233>] rtnl_lock+0x12/0x14
Jul 21 11:26:35 bubba [  223.874229]
Jul 21 11:26:35 bubba [  223.874230] which lock already depends on the new lock.
Jul 21 11:26:35 bubba [  223.874231]
Jul 21 11:26:35 bubba [  223.875032]
Jul 21 11:26:35 bubba [  223.875033] the existing dependency chain (in reverse order) is:
Jul 21 11:26:35 bubba [  223.875573]
Jul 21 11:26:35 bubba [  223.875573] -> #1
Jul 21 11:26:35 bubba (rtnl_mutex
Jul 21 11:26:35 bubba ){+.+.+.}
Jul 21 11:26:35 bubba :
Jul 21 11:26:35 bubba [  223.876301]
Jul 21 11:26:35 bubba [<ffffffff8106c14a>] lock_acquire+0xd2/0xf7
Jul 21 11:26:35 bubba [  223.876645]
Jul 21 11:26:35 bubba [<ffffffff8151d975>] __mutex_lock_common+0x47/0x30d
Jul 21 11:26:35 bubba [  223.876991]
Jul 21 11:26:35 bubba [<ffffffff8151dd36>] mutex_lock_nested+0x3b/0x40
Jul 21 11:26:35 bubba [  223.877334]
Jul 21 11:26:35 bubba [<ffffffff813e8233>] rtnl_lock+0x12/0x14
Jul 21 11:26:35 bubba [  223.877675]
Jul 21 11:26:35 bubba [<ffffffffa003d5a0>] fcoe_update_src_mac+0x2b/0x80 [fcoe]
Jul 21 11:26:35 bubba [  223.878022]
Jul 21 11:26:35 bubba [<ffffffffa003d698>] fcoe_flogi_resp+0x5e/0x79 [fcoe]
Jul 21 11:26:35 bubba [  223.878366]
Jul 21 11:26:35 bubba [<ffffffffa001566f>] fc_exch_recv+0x7f5/0x9da [libfc]
Jul 21 11:26:35 bubba [  223.878713]
Jul 21 11:26:35 bubba [<ffffffffa00327d8>] fcoe_ctlr_recv_work+0x71f/0x10dc [libfcoe]
Jul 21 11:26:35 bubba [  223.879258]
Jul 21 11:26:35 bubba [<ffffffff81053761>] process_one_work+0x1d7/0x347
Jul 21 11:26:35 bubba [  223.879601]
Jul 21 11:26:35 bubba [<ffffffff81054ade>] worker_thread+0xf8/0x17c
Jul 21 11:26:35 bubba [  223.879944]
Jul 21 11:26:35 bubba [<ffffffff81058184>] kthread+0x7d/0x85
Jul 21 11:26:35 bubba [  223.880287]
Jul 21 11:26:35 bubba [<ffffffff81526414>] kernel_thread_helper+0x4/0x10
Jul 21 11:26:35 bubba [  223.880634]
Jul 21 11:26:35 bubba [  223.880635] -> #0
Jul 21 11:26:35 bubba ((&fip->recv_work)
Jul 21 11:26:35 bubba ){+.+.+.}
Jul 21 11:26:35 bubba :
Jul 21 11:26:35 bubba [  223.881357]
Jul 21 11:26:35 bubba [<ffffffff8106b93e>] __lock_acquire+0xb1d/0xe2c
Jul 21 11:26:35 bubba [  223.881695]
Jul 21 11:26:35 bubba [<ffffffff8106c14a>] lock_acquire+0xd2/0xf7
Jul 21 11:26:35 bubba [  223.882033]
Jul 21 11:26:35 bubba [<ffffffff81053241>] wait_on_work+0x50/0xbd
Jul 21 11:26:35 bubba [  223.882378]
Jul 21 11:26:35 bubba [<ffffffff81053b32>] __cancel_work_timer+0xb6/0xf4
Jul 21 11:26:35 bubba [  223.882718]
Jul 21 11:26:35 bubba [<ffffffff81053b8a>] cancel_work_sync+0xb/0xd
Jul 21 11:26:35 bubba [  223.883057]
Jul 21 11:26:35 bubba [<ffffffffa00317e6>] fcoe_ctlr_destroy+0x1d/0x67 [libfcoe]
Jul 21 11:26:35 bubba [  223.883399]
Jul 21 11:26:35 bubba [<ffffffffa003e51e>] fcoe_interface_release+0x21/0x45 [fcoe]
Jul 21 11:26:35 bubba [  223.883940]
Jul 21 11:26:35 bubba [<ffffffff811fbbe6>] kref_put+0x43/0x4d
Jul 21 11:26:35 bubba [  223.884280]
Jul 21 11:26:35 bubba [<ffffffffa003ebba>] fcoe_interface_put+0x17/0x19 [fcoe]
Jul 21 11:26:35 bubba [  223.884624]
Jul 21 11:26:35 bubba [<ffffffffa003f2a6>] fcoe_interface_cleanup+0x188/0x193 [fcoe]
Jul 21 11:26:35 bubba [  223.885163]
Jul 21 11:26:35 bubba [<ffffffffa003f303>] fcoe_destroy+0x52/0x72 [fcoe]
Jul 21 11:26:35 bubba [  223.885502]
Jul 21 11:26:35 bubba [<ffffffffa00340a4>] fcoe_transport_destroy+0xab/0x110 [libfcoe]
Jul 21 11:26:35 bubba [  223.886045]
Jul 21 11:26:35 bubba [<ffffffff81056153>] param_attr_store+0x43/0x62
Jul 21 11:26:35 bubba [  223.886385]
Jul 21 11:26:35 bubba [<ffffffff8105602d>] module_attr_store+0x21/0x25
Jul 21 11:26:35 bubba [  223.886728]
Jul 21 11:26:35 bubba [<ffffffff8114c23d>] sysfs_write_file+0x103/0x13f
Jul 21 11:26:35 bubba [  223.887068]
Jul 21 11:26:35 bubba [<ffffffff810f3e7b>] vfs_write+0xa7/0xfa
Jul 21 11:26:35 bubba [  223.887406]
Jul 21 11:26:35 bubba [<ffffffff810f4073>] sys_write+0x45/0x69
Jul 21 11:26:35 bubba [  223.887742]
Jul 21 11:26:35 bubba [<ffffffff815252bb>] system_call_fastpath+0x16/0x1b
Jul 21 11:26:35 bubba [  223.888083]
Jul 21 11:26:35 bubba [  223.888084] other info that might help us debug this:
Jul 21 11:26:35 bubba [  223.888085]
Jul 21 11:26:35 bubba [  223.888879]  Possible unsafe locking scenario:
Jul 21 11:26:35 bubba [  223.888881]
Jul 21 11:26:35 bubba [  223.889411]        CPU0                    CPU1
Jul 21 11:26:35 bubba [  223.889683]        ----                    ----
Jul 21 11:26:35 bubba [  223.889955]   lock(
Jul 21 11:26:35 bubba rtnl_mutex
Jul 21 11:26:35 bubba );
Jul 21 11:26:35 bubba [  223.890349]                                lock(
Jul 21 11:26:35 bubba (&fip->recv_work)
Jul 21 11:26:35 bubba );
Jul 21 11:26:35 bubba [  223.890751]                                lock(
Jul 21 11:26:35 bubba rtnl_mutex
Jul 21 11:26:35 bubba );
Jul 21 11:26:35 bubba [  223.891154]   lock(
Jul 21 11:26:35 bubba (&fip->recv_work)
Jul 21 11:26:35 bubba );
Jul 21 11:26:35 bubba [  223.891549]
Jul 21 11:26:35 bubba [  223.891550]  *** DEADLOCK ***
Jul 21 11:26:35 bubba [  223.891551]
Jul 21 11:26:35 bubba [  223.892347] 6 locks held by lockdeptest.sh/3464:
Jul 21 11:26:35 bubba [  223.892621]  #0:
Jul 21 11:26:35 bubba (&buffer->mutex
Jul 21 11:26:35 bubba ){+.+.+.}
Jul 21 11:26:35 bubba , at:
Jul 21 11:26:35 bubba [<ffffffff8114c171>] sysfs_write_file+0x37/0x13f
Jul 21 11:26:35 bubba [  223.893359]  #1:
Jul 21 11:26:35 bubba (s_active
Jul 21 11:26:35 bubba ){++++.+}
Jul 21 11:26:35 bubba , at:
Jul 21 11:26:35 bubba [<ffffffff8114c21c>] sysfs_write_file+0xe2/0x13f
Jul 21 11:26:35 bubba [  223.894094]  #2:
Jul 21 11:26:35 bubba (param_lock
Jul 21 11:26:35 bubba ){+.+.+.}
Jul 21 11:26:35 bubba , at:
Jul 21 11:26:35 bubba [<ffffffff81056146>] param_attr_store+0x36/0x62
Jul 21 11:26:35 bubba [  223.894835]  #3:
Jul 21 11:26:35 bubba (ft_mutex
Jul 21 11:26:35 bubba ){+.+.+.}
Jul 21 11:26:35 bubba , at:
Jul 21 11:26:35 bubba [<ffffffffa0034017>] fcoe_transport_destroy+0x1e/0x110 [libfcoe]
Jul 21 11:26:35 bubba [  223.895574]  #4:
Jul 21 11:26:35 bubba (fcoe_config_mutex
Jul 21 11:26:35 bubba ){+.+.+.}
Jul 21 11:26:35 bubba , at:
Jul 21 11:26:35 bubba [<ffffffffa003f2c9>] fcoe_destroy+0x18/0x72 [fcoe]
Jul 21 11:26:35 bubba [  223.896314]  #5:
Jul 21 11:26:35 bubba (rtnl_mutex
Jul 21 11:26:35 bubba ){+.+.+.}
Jul 21 11:26:35 bubba , at:
Jul 21 11:26:35 bubba [<ffffffff813e8233>] rtnl_lock+0x12/0x14
Jul 21 11:26:35 bubba [  223.897047]
Jul 21 11:26:35 bubba [  223.897048] stack backtrace:
Jul 21 11:26:35 bubba [  223.897578] Pid: 3464, comm: lockdeptest.sh Not tainted 3.0.0-rc7+ #1
Jul 21 11:26:35 bubba [  223.897853] Call Trace:
Jul 21 11:26:35 bubba [  223.898128]  [<ffffffff81068e16>] print_circular_bug+0x1f8/0x209
Jul 21 11:26:35 bubba [  223.898416]  [<ffffffff8106b93e>] __lock_acquire+0xb1d/0xe2c
Jul 21 11:26:35 bubba [  223.898699]  [<ffffffff810531f1>] ? wait_on_cpu_work+0xe6/0xe6
Jul 21 11:26:35 bubba [  223.898982]  [<ffffffff8106c14a>] lock_acquire+0xd2/0xf7
Jul 21 11:26:35 bubba [  223.899263]  [<ffffffff810531f1>] ? wait_on_cpu_work+0xe6/0xe6
Jul 21 11:26:35 bubba [  223.899547]  [<ffffffff8104a097>] ? mod_timer+0x8f/0x98
Jul 21 11:26:35 bubba [  223.899827]  [<ffffffff81053241>] wait_on_work+0x50/0xbd
Jul 21 11:26:35 bubba [  223.900108]  [<ffffffff810531f1>] ? wait_on_cpu_work+0xe6/0xe6
Jul 21 11:26:35 bubba [  223.900390]  [<ffffffff81053b32>] __cancel_work_timer+0xb6/0xf4
Jul 21 11:26:35 bubba [  223.900671]  [<ffffffff81053b8a>] cancel_work_sync+0xb/0xd
Jul 21 11:26:35 bubba [  223.900953]  [<ffffffffa00317e6>] fcoe_ctlr_destroy+0x1d/0x67 [libfcoe]
Jul 21 11:26:35 bubba [  223.901237]  [<ffffffffa003e51e>] fcoe_interface_release+0x21/0x45 [fcoe]
Jul 21 11:26:35 bubba [  223.901522]  [<ffffffffa003e4fd>] ? fcoe_enable+0x6b/0x6b [fcoe]
Jul 21 11:26:35 bubba [  223.901803]  [<ffffffff811fbbe6>] kref_put+0x43/0x4d
Jul 21 11:26:35 bubba [  223.902083]  [<ffffffffa003ebba>] fcoe_interface_put+0x17/0x19 [fcoe]
Jul 21 11:26:35 bubba [  223.902367]  [<ffffffffa003f2a6>] fcoe_interface_cleanup+0x188/0x193 [fcoe]
Jul 21 11:26:35 bubba [  223.902653]  [<ffffffff8151dd36>] ? mutex_lock_nested+0x3b/0x40
Jul 21 11:26:35 bubba [  223.902939]  [<ffffffffa003f303>] fcoe_destroy+0x52/0x72 [fcoe]
Jul 21 11:26:35 bubba [  223.903223]  [<ffffffffa00340a4>] fcoe_transport_destroy+0xab/0x110 [libfcoe]
Jul 21 11:26:35 bubba [  223.903508]  [<ffffffff81056153>] param_attr_store+0x43/0x62
Jul 21 11:26:35 bubba [  223.903792]  [<ffffffff8105602d>] module_attr_store+0x21/0x25
Jul 21 11:26:35 bubba [  223.904075]  [<ffffffff8114c23d>] sysfs_write_file+0x103/0x13f
Jul 21 11:26:35 bubba [  223.904357]  [<ffffffff810f3e7b>] vfs_write+0xa7/0xfa
Jul 21 11:26:35 bubba [  223.904642]  [<ffffffff810f51d6>] ? fget_light+0x35/0x96
Jul 21 11:26:35 bubba [  223.904923]  [<ffffffff810f4073>] sys_write+0x45/0x69
Jul 21 11:26:35 bubba [  223.905204]  [<ffffffff815252bb>] system_call_fastpath+0x16/0x1b
Jul 21 11:26:36 bubba [  223.964438] ixgbe 0000:05:00.0: eth3: detected SFP+: 5
Jul 21 11:26:37 bubba [  225.196702] ixgbe 0000:05:00.0: eth3: NIC Link is Up 10 Gbps, Flow Control: None

Signed-off-by: Robert Love <[email protected]>
Tested-by: Ross Brattain <[email protected]>
Reviewed-by: Yi Zou <[email protected]>
Signed-off-by: James Bottomley <[email protected]>
humberos referenced this pull request in humberos/android_kernel_samsung_aries May 6, 2013
I got:
	Generating server: Tehuti.onmicrosoft.com

	[email protected]
	#< #5.1.1 smtp;550 5.1.1 RESOLVER.ADR.RecipNotFound; not found> #SMTP#

Signed-off-by: Ian Campbell <[email protected]>
Cc: Alexander Indenbaum <[email protected]>
Cc: Andy Gospodarek <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: [email protected]
Signed-off-by: David S. Miller <[email protected]>
humberos referenced this pull request in humberos/android_kernel_samsung_aries May 6, 2013
When unbinding a device so that I could pass it through to a KVM VM, I
got the lockdep report below.  It looks like a legitimate lock
ordering problem:

 - domain_context_mapping_one() takes iommu->lock and calls
   iommu_support_dev_iotlb(), which takes device_domain_lock (inside
   iommu->lock).

 - domain_remove_one_dev_info() starts by taking device_domain_lock
   then takes iommu->lock inside it (near the end of the function).

So this is the classic AB-BA deadlock.  It looks like a safe fix is to
simply release device_domain_lock a bit earlier, since as far as I can
tell, it doesn't protect any of the stuff accessed at the end of
domain_remove_one_dev_info() anyway.
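
As a rough userspace illustration of the ordering fix described above, the sketch below uses pthread mutexes in place of the kernel spinlocks. The function names `map_device`, `remove_device`, and `devices_mapped` and the two counters are hypothetical; only the lock names mirror the report. The mapping path nests device_domain_lock inside the iommu lock, and the removal path drops device_domain_lock before it takes the iommu lock, so the two locks are never held in the opposite order.

```c
#include <pthread.h>

/* Userspace stand-ins for the two kernel spinlocks named in the report. */
static pthread_mutex_t iommu_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t device_domain_lock = PTHREAD_MUTEX_INITIALIZER;

static int mapped_count; /* protected by iommu_lock (hypothetical state) */
static int info_count;   /* protected by device_domain_lock (hypothetical state) */

/* Mapping path: iommu_lock outside, device_domain_lock nested inside. */
void map_device(void)
{
    pthread_mutex_lock(&iommu_lock);
    pthread_mutex_lock(&device_domain_lock);
    info_count++;
    pthread_mutex_unlock(&device_domain_lock);
    mapped_count++;
    pthread_mutex_unlock(&iommu_lock);
}

/* Removal path after the fix: the locks are taken one after the other, never nested. */
void remove_device(void)
{
    pthread_mutex_lock(&device_domain_lock);
    info_count--;
    pthread_mutex_unlock(&device_domain_lock); /* released before the second lock */

    pthread_mutex_lock(&iommu_lock);
    mapped_count--;
    pthread_mutex_unlock(&iommu_lock);
}

/* Read the mapped-device count under its lock. */
int devices_mapped(void)
{
    int n;

    pthread_mutex_lock(&iommu_lock);
    n = mapped_count;
    pthread_mutex_unlock(&iommu_lock);
    return n;
}
```

This only removes the AB-BA ordering; it says nothing about whether the data touched after the early unlock needs other protection, which is the open question raised in the message.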

BTW, the use of device_domain_lock looks a bit unsafe to me... it's
at least not obvious to me why we aren't vulnerable to the race below:

  iommu_support_dev_iotlb()
                                          domain_remove_dev_info()

  lock device_domain_lock
    find info
  unlock device_domain_lock

                                          lock device_domain_lock
                                            find same info
                                          unlock device_domain_lock

                                          free_devinfo_mem(info)

  do stuff with info after it's free

However I don't understand the locking here well enough to know if
this is a real problem, let alone what the best fix is.

Anyway here's the full lockdep output that prompted all of this:

     =======================================================
     [ INFO: possible circular locking dependency detected ]
     2.6.39.1+ #1
     -------------------------------------------------------
     bash/13954 is trying to acquire lock:
      (&(&iommu->lock)->rlock){......}, at: [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230

     but task is already holding lock:
      (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230

     which lock already depends on the new lock.

     the existing dependency chain (in reverse order) is:

     -> #1 (device_domain_lock){-.-...}:
            [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
            [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
            [<ffffffff812f8350>] domain_context_mapping_one+0x600/0x750
            [<ffffffff812f84df>] domain_context_mapping+0x3f/0x120
            [<ffffffff812f9175>] iommu_prepare_identity_map+0x1c5/0x1e0
            [<ffffffff81ccf1ca>] intel_iommu_init+0x88e/0xb5e
            [<ffffffff81cab204>] pci_iommu_init+0x16/0x41
            [<ffffffff81002165>] do_one_initcall+0x45/0x190
            [<ffffffff81ca3d3f>] kernel_init+0xe3/0x168
            [<ffffffff8157ac24>] kernel_thread_helper+0x4/0x10

     -> #0 (&(&iommu->lock)->rlock){......}:
            [<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10
            [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
            [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
            [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
            [<ffffffff812f8b42>] device_notifier+0x72/0x90
            [<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0
            [<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0
            [<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20
            [<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0
            [<ffffffff81373ccf>] device_release_driver+0x2f/0x50
            [<ffffffff81372ee3>] driver_unbind+0xa3/0xc0
            [<ffffffff813724ac>] drv_attr_store+0x2c/0x30
            [<ffffffff811e4506>] sysfs_write_file+0xe6/0x170
            [<ffffffff8117569e>] vfs_write+0xce/0x190
            [<ffffffff811759e4>] sys_write+0x54/0xa0
            [<ffffffff81579a82>] system_call_fastpath+0x16/0x1b

     other info that might help us debug this:

     6 locks held by bash/13954:
      #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff811e4464>] sysfs_write_file+0x44/0x170
      #1:  (s_active#3){++++.+}, at: [<ffffffff811e44ed>] sysfs_write_file+0xcd/0x170
      #2:  (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81372edb>] driver_unbind+0x9b/0xc0
      #3:  (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81373cc7>] device_release_driver+0x27/0x50
      #4:  (&(&priv->bus_notifier)->rwsem){.+.+.+}, at: [<ffffffff8108974f>] __blocking_notifier_call_chain+0x5f/0xb0
      #5:  (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230

     stack backtrace:
     Pid: 13954, comm: bash Not tainted 2.6.39.1+ #1
     Call Trace:
      [<ffffffff810993a7>] print_circular_bug+0xf7/0x100
      [<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10
      [<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10
      [<ffffffff8109d57d>] ? trace_hardirqs_on_caller+0x13d/0x180
      [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
      [<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230
      [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
      [<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230
      [<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10
      [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
      [<ffffffff812f8b42>] device_notifier+0x72/0x90
      [<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0
      [<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0
      [<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20
      [<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0
      [<ffffffff81373ccf>] device_release_driver+0x2f/0x50
      [<ffffffff81372ee3>] driver_unbind+0xa3/0xc0
      [<ffffffff813724ac>] drv_attr_store+0x2c/0x30
      [<ffffffff811e4506>] sysfs_write_file+0xe6/0x170
      [<ffffffff8117569e>] vfs_write+0xce/0x190
      [<ffffffff811759e4>] sys_write+0x54/0xa0
      [<ffffffff81579a82>] system_call_fastpath+0x16/0x1b

Signed-off-by: Roland Dreier <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
humberos referenced this pull request in humberos/android_kernel_samsung_aries May 6, 2013
BUG: sleeping function called from invalid context at /local/scratch/dariof/linux/kernel/mutex.c:271
in_atomic(): 1, irqs_disabled(): 0, pid: 3256, name: qemu-dm
1 lock held by qemu-dm/3256:
 #0:  (&(&priv->lock)->rlock){......}, at: [<ffffffff813223da>] gntdev_ioctl+0x2bd/0x4d5
Pid: 3256, comm: qemu-dm Tainted: G        W   3.1.0-rc8+ #5
Call Trace:
 [<ffffffff81054594>] __might_sleep+0x131/0x135
 [<ffffffff816bd64f>] mutex_lock_nested+0x25/0x45
 [<ffffffff8131c7c8>] free_xenballooned_pages+0x20/0xb1
 [<ffffffff8132194d>] gntdev_put_map+0xa8/0xdb
 [<ffffffff816be546>] ? _raw_spin_lock+0x71/0x7a
 [<ffffffff813223da>] ? gntdev_ioctl+0x2bd/0x4d5
 [<ffffffff8132243c>] gntdev_ioctl+0x31f/0x4d5
 [<ffffffff81007d62>] ? check_events+0x12/0x20
 [<ffffffff811433bc>] do_vfs_ioctl+0x488/0x4d7
 [<ffffffff81007d4f>] ? xen_restore_fl_direct_reloc+0x4/0x4
 [<ffffffff8109168b>] ? lock_release+0x21c/0x229
 [<ffffffff81135cdd>] ? rcu_read_unlock+0x21/0x32
 [<ffffffff81143452>] sys_ioctl+0x47/0x6a
 [<ffffffff816bfd82>] system_call_fastpath+0x16/0x1b

gntdev_put_map tries to acquire a mutex when freeing pages back to the
xenballoon pool, so it cannot be called with a spinlock held. In
gntdev_release, the spinlock is not needed as we are freeing the
structure later; in the ioctl, only the list manipulation needs to be
under the lock.
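
The pattern described above (unlink under the lock, release after dropping it) can be sketched in userspace C. All names here (`map_list`, `map_entry`, `map_add`, `map_detach`, `map_release`) are hypothetical and are not the gntdev API; a pthread mutex stands in for the spinlock, and `free()` stands in for the step that may sleep.

```c
#include <pthread.h>
#include <stdlib.h>

struct map_entry {
    int index;
    struct map_entry *next;
};

struct map_list {
    pthread_mutex_t lock;   /* stand-in for the spinlock */
    struct map_entry *head;
};

void map_list_init(struct map_list *list)
{
    pthread_mutex_init(&list->lock, NULL);
    list->head = NULL;
}

/* Allocate outside the lock, link under it. Returns 0 on success, -1 on failure. */
int map_add(struct map_list *list, int index)
{
    struct map_entry *entry = malloc(sizeof(*entry));

    if (entry == NULL)
        return -1;
    entry->index = index;

    pthread_mutex_lock(&list->lock);
    entry->next = list->head;
    list->head = entry;
    pthread_mutex_unlock(&list->lock);
    return 0;
}

/* Only the list manipulation happens under the lock; nothing is freed here. */
struct map_entry *map_detach(struct map_list *list, int index)
{
    struct map_entry *found = NULL;
    struct map_entry **link;

    pthread_mutex_lock(&list->lock);
    for (link = &list->head; *link != NULL; link = &(*link)->next) {
        if ((*link)->index == index) {
            found = *link;
            *link = found->next;
            found->next = NULL;
            break;
        }
    }
    pthread_mutex_unlock(&list->lock);
    return found;
}

/* Called with no lock held, so it is free to block. */
void map_release(struct map_entry *entry)
{
    free(entry);
}
```

A caller would use `map_detach()` followed by `map_release()`; because the entry is already off the list when it is released, no other thread can reach it through the list.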

Reported-and-Tested-By: Dario Faggioli <[email protected]>
Signed-off-by: Daniel De Graaf <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
humberos referenced this pull request in humberos/android_kernel_samsung_aries May 6, 2013
This patch validates sdev pointer in scsi_dh_activate before proceeding further.

Without this check we might see the panic below. I have seen this
panic multiple times.

Call trace:

 #0 [ffff88007d647b50] machine_kexec at ffffffff81020902
 #1 [ffff88007d647ba0] crash_kexec at ffffffff810875b0
 #2 [ffff88007d647c70] oops_end at ffffffff8139c650
 #3 [ffff88007d647c90] __bad_area_nosemaphore at ffffffff8102dd15
 #4 [ffff88007d647d50] page_fault at ffffffff8139b8cf
    [exception RIP: scsi_dh_activate+0x82]
    RIP: ffffffffa0041922  RSP: ffff88007d647e00  RFLAGS: 00010046
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: 00000000000093c5
    RDX: 00000000000093c5  RSI: ffffffffa02e6640  RDI: ffff88007cc88988
    RBP: 000000000000000f   R8: ffff88007d646000   R9: 0000000000000000
    R10: ffff880082293790  R11: 00000000ffffffff  R12: ffff88007cc88988
    R13: 0000000000000000  R14: 0000000000000286  R15: ffff880037b845e0
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
 #5 [ffff88007d647e38] run_workqueue at ffffffff81060268
 #6 [ffff88007d647e78] worker_thread at ffffffff81060386
 #7 [ffff88007d647ee8] kthread at ffffffff81064436
 #8 [ffff88007d647f48] kernel_thread at ffffffff81003fba

Signed-off-by: Babu Moger <[email protected]>
Cc: [email protected]
Signed-off-by: James Bottomley <[email protected]>
humberos referenced this pull request in humberos/android_kernel_samsung_aries May 6, 2013
…(try #5)

commit d953126 changed how nfs_atomic_lookup handles an -EISDIR return
from an OPEN call. Prior to that patch, that caused the client to fall
back to doing a normal lookup. When that patch went in, the code began
returning that error to userspace. The d_revalidate codepath however
never had the corresponding change, so it was still possible to end up
with a NULL ctx->state pointer after that.

That patch caused a regression. When we attempt to open a directory that
does not have a cached dentry, that open now errors out with EISDIR. If
you attempt the same open with a cached dentry, it will succeed.

Fix this by reverting the change in nfs_atomic_lookup and allowing
attempts to open directories to fall back to a normal lookup.

Also, add a NFSv4-specific f_ops->open routine that just returns
-ENOTDIR. This should never be called if things are working properly,
but if it ever is, then the dprintk may help in debugging.

To facilitate this, a new file_operations field is also added to the
nfs_rpc_ops struct.

Cc: [email protected]
Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
humberos referenced this pull request in humberos/android_kernel_samsung_aries May 6, 2013
…inux-nfs

* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS: Revert pnfs ugliness from the generic NFS read code path
  SUNRPC: destroy freshly allocated transport in case of sockaddr init error
  NFS: Fix a regression in the referral code
  nfs: move nfs_file_operations declaration to bottom of file.c (try #2)
  nfs: when attempting to open a directory, fall back on normal lookup (try #5)
humberos referenced this pull request in humberos/android_kernel_samsung_aries May 6, 2013
If the pte mapping in generic_perform_write() is unmapped between
iov_iter_fault_in_readable() and iov_iter_copy_from_user_atomic(), the
"copied" parameter to ->end_write can be zero. ext4 couldn't cope with
it with delayed allocations enabled. This skips the i_disksize
enlargement logic if copied is zero and no new data was appended to
the inode.
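
To make the failure concrete, here is a small standalone C sketch of the arithmetic, using the values from the gdb session in this message. It is a model under assumptions, not the ext4 code: the function names are hypothetical, a 4096-byte page is assumed, and 64-bit and 32-bit fixed-width types stand in for the kernel's unsigned long and unsigned int on x86_64. The guard in `should_check_disksize` is one plausible reading of "skips the i_disksize enlargement logic if copied is zero".

```c
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u /* assumed page size */

/* Offset within the page of the last byte written. */
uint64_t last_byte_in_page(uint64_t pos, uint64_t copied)
{
    uint64_t start = pos & (MODEL_PAGE_SIZE - 1);

    /* With start == 0 and copied == 0 this wraps to 0xffffffffffffffff. */
    return start + copied - 1;
}

/* Index of the buffer holding that byte, truncated to 32 bits as in the gdb output. */
uint32_t buffer_index(uint64_t end, unsigned int blkbits)
{
    return (uint32_t)(end >> blkbits);
}

/* Hypothetical guard: only consider enlarging i_disksize when data was copied. */
int should_check_disksize(uint64_t copied, uint64_t new_i_size, uint64_t i_disksize)
{
    return copied != 0 && new_i_size > i_disksize;
}
```

For pos=0x108000, copied=0 and i_blkbits=0xc, `last_byte_in_page` returns 0xffffffffffffffff and `buffer_index` returns 0xffffffff, matching `end` and `idx` in the trace. A loop bounded by that index walks about four billion buffer heads, which is the apparent hang.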

 gdb> bt
 #0  0xffffffff811afe80 in ext4_da_should_update_i_disksize (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x1\
 08000, len=0x1000, copied=0x0, page=0xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2467
 #1  ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\
 xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
 #2  0xffffffff810d97f1 in generic_perform_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value o\
 ptimized out>, pos=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2440
 #3  generic_file_buffered_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value optimized out>, p\
 os=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2482
 #4  0xffffffff810db5d1 in __generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, ppos=0\
 xffff88001e26be40) at mm/filemap.c:2600
 #5  0xffffffff810db853 in generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=<value optimi\
 zed out>, pos=<value optimized out>) at mm/filemap.c:2632
 #6  0xffffffff811a71aa in ext4_file_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, pos=0x108000) a\
 t fs/ext4/file.c:136
 #7  0xffffffff811375aa in do_sync_write (filp=0xffff88003f606a80, buf=<value optimized out>, len=<value optimized out>, \
 ppos=0xffff88001e26bf48) at fs/read_write.c:406
 #8  0xffffffff81137e56 in vfs_write (file=0xffff88003f606a80, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x4\
 000, pos=0xffff88001e26bf48) at fs/read_write.c:435
 #9  0xffffffff8113816c in sys_write (fd=<value optimized out>, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x\
 4000) at fs/read_write.c:487
 #10 <signal handler called>
 #11 0x00007f120077a390 in __brk_reservation_fn_dmi_alloc__ ()
 #12 0x0000000000000000 in ?? ()
 gdb> print offset
 $22 = 0xffffffffffffffff
 gdb> print idx
 $23 = 0xffffffff
 gdb> print inode->i_blkbits
 $24 = 0xc
 gdb> up
 #1  ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\
 xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
 2512                    if (ext4_da_should_update_i_disksize(page, end)) {
 gdb> print start
 $25 = 0x0
 gdb> print end
 $26 = 0xffffffffffffffff
 gdb> print pos
 $27 = 0x108000
 gdb> print new_i_size
 $28 = 0x108000
 gdb> print ((struct ext4_inode_info *)((char *)inode-((int)(&((struct ext4_inode_info *)0)->vfs_inode))))->i_disksize
 $29 = 0xd9000
 gdb> down
 2467            for (i = 0; i < idx; i++)
 gdb> print i
 $30 = 0xd44acbee

This is 100% reproducible with some autonuma development code tuned in
a very aggressive manner (not normal way even for knumad) which does
"exotic" changes to the ptes. It wouldn't normally trigger but I don't
see why it can't happen normally if the page is added to swap cache in
between the two faults leading to "copied" being zero (which then
hangs in ext4). So it should be fixed. Especially possible with lumpy
reclaim (albeit disabled if compaction is enabled) as that would
ignore the young bits in the ptes.

Signed-off-by: Andrea Arcangeli <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
Cc: [email protected]
humberos referenced this pull request in humberos/android_kernel_samsung_aries May 6, 2013
…S block during isolation for migration

commit 0bf380b upstream.

When isolating for migration, migration starts at the start of a zone
which is not necessarily pageblock aligned.  Further, it stops isolating
when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally
not aligned.  This allows isolate_migratepages() to call pfn_to_page() on
an invalid PFN which can result in a crash.  This was originally reported
against a 3.0-based kernel with the following trace in a crash dump.

PID: 9902   TASK: d47aecd0  CPU: 0   COMMAND: "memcg_process_s"
 #0 [d72d3ad0] crash_kexec at c028cfdb
 #1 [d72d3b24] oops_end at c05c5322
 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60
 #3 [d72d3bec] bad_area at c0227fb6
 #4 [d72d3c00] do_page_fault at c05c72ec
 #5 [d72d3c80] error_code (via page_fault) at c05c47a4
    EAX: 00000000  EBX: 000c0000  ECX: 00000001  EDX: 00000807  EBP: 000c0000
    DS:  007b      ESI: 00000001  ES:  007b      EDI: f3000a80  GS:  6f50
    CS:  0060      EIP: c030b15a  ERR: ffffffff  EFLAGS: 00010002
 #6 [d72d3cb4] isolate_migratepages at c030b15a
 #7 [d72d3d14] zone_watermark_ok at c02d26cb
 #8 [d72d3d2c] compact_zone at c030b8de
 #9 [d72d3d68] compact_zone_order at c030bba1
#10 [d72d3db4] try_to_compact_pages at c030bc84
#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
#14 [d72d3eb8] alloc_pages_vma at c030a845
#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
#16 [d72d3f00] handle_mm_fault at c02f36c6
#17 [d72d3f30] do_page_fault at c05c70ed
#18 [d72d3fb0] error_code (via page_fault) at c05c47a4
    EAX: b71ff000  EBX: 00000001  ECX: 00001600  EDX: 00000431
    DS:  007b      ESI: 08048950  ES:  007b      EDI: bfaa3788
    SS:  007b      ESP: bfaa36e0  EBP: bfaa3828  GS:  6f50
    CS:  0073      EIP: 080487c8  ERR: ffffffff  EFLAGS: 00010202

It was also reported by Herbert van den Bergh against 3.1-based kernel
with the following snippet from the console log.

BUG: unable to handle kernel paging request at 01c00008
IP: [<c0522399>] isolate_migratepages+0x119/0x390
*pdpt = 000000002f7ce001 *pde = 0000000000000000

It is expected that it also affects 3.2.x and current mainline.

The problem is that pfn_valid is only called on the first PFN being
checked and that PFN is not necessarily aligned.  Let's say we have a case
like this:

H = MAX_ORDER_NR_PAGES boundary
| = pageblock boundary
m = cc->migrate_pfn
f = cc->free_pfn
o = memory hole

H------|------H------|----m-Hoooooo|ooooooH-f----|------H

The migrate_pfn is just below a memory hole and the free scanner is beyond
the hole.  When isolate_migratepages started, it scans from migrate_pfn to
migrate_pfn+pageblock_nr_pages which is now in a memory hole.  It checks
pfn_valid() on the first PFN but then scans into the hole where there are
not necessarily valid struct pages.

This patch ensures that isolate_migratepages calls pfn_valid when
necessary.
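
The scan described above can be modelled in user space. What follows is a
minimal sketch, not the upstream patch: MODEL_MAX_ORDER_NR_PAGES,
struct model_hole, model_pfn_valid() and model_scan() are hypothetical names,
and the hole is assumed to start and end on MAX_ORDER_NR_PAGES boundaries as
in the diagram. Instead of checking validity only on the first PFN, the model
re-checks it each time the scanner crosses such a boundary.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, tiny value so the example can be followed by hand. */
#define MODEL_MAX_ORDER_NR_PAGES 8UL

/* PFNs in [start, end) have no valid struct page (a memory hole). */
struct model_hole {
	unsigned long start;
	unsigned long end;
};

/* Stand-in for pfn_valid(): true when the PFN lies outside the hole. */
static bool model_pfn_valid(unsigned long pfn, const struct model_hole *hole)
{
	return pfn < hole->start || pfn >= hole->end;
}

/*
 * Walk [low_pfn, end_pfn) and return how many PFNs were inspected.
 * On every MAX_ORDER_NR_PAGES boundary the validity check is repeated,
 * and an invalid block is skipped as a whole. *touched_invalid reports
 * whether any PFN inside the hole would still have been inspected.
 */
static unsigned long model_scan(unsigned long low_pfn, unsigned long end_pfn,
				const struct model_hole *hole,
				bool *touched_invalid)
{
	unsigned long inspected = 0;
	unsigned long pfn;

	assert(low_pfn <= end_pfn);
	*touched_invalid = false;

	for (pfn = low_pfn; pfn < end_pfn; pfn++) {
		if ((pfn & (MODEL_MAX_ORDER_NR_PAGES - 1)) == 0 &&
		    !model_pfn_valid(pfn, hole)) {
			/* Skip the rest of this block; the loop adds the last 1. */
			pfn += MODEL_MAX_ORDER_NR_PAGES - 1;
			continue;
		}
		if (!model_pfn_valid(pfn, hole))
			*touched_invalid = true;
		inspected++;
	}
	return inspected;
}
```

With a hole covering PFNs 16 to 31 and a scan from 13 to 40, the model
inspects 11 PFNs (13 to 15 and 32 to 39) and never touches one inside the
hole.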

Reported-by: Herbert van den Bergh <[email protected]>
Tested-by: Herbert van den Bergh <[email protected]>
Signed-off-by: Mel Gorman <[email protected]>
Acked-by: Michal Nazarewicz <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
humberos referenced this pull request in humberos/android_kernel_samsung_aries May 6, 2013
… CPUs

commit a956bd6 upstream.

Loading the microcode driver on an unsupported CPU and subsequently
unloading the driver causes

 WARNING: at fs/sysfs/group.c:138 mc_device_remove+0x5f/0x70 [microcode]()
 Hardware name: 01972NG
 sysfs group ffffffffa00013d0 not found for kobject 'cpu0'
 Modules linked in: snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_intel btusb snd_hda_codec bluetooth thinkpad_acpi rfkill microcode(-) [last unloaded: cfg80211]
 Pid: 4560, comm: modprobe Not tainted 3.4.0-rc2-00002-g258f742 #5
 Call Trace:
  [<ffffffff8103113b>] ? warn_slowpath_common+0x7b/0xc0
  [<ffffffff81031235>] ? warn_slowpath_fmt+0x45/0x50
  [<ffffffff81120e74>] ? sysfs_remove_group+0x34/0x120
  [<ffffffffa00000ef>] ? mc_device_remove+0x5f/0x70 [microcode]
  [<ffffffff81331eb9>] ? subsys_interface_unregister+0x69/0xa0
  [<ffffffff81563526>] ? mutex_lock+0x16/0x40
  [<ffffffffa0000c3e>] ? microcode_exit+0x50/0x92 [microcode]
  [<ffffffff8107051d>] ? sys_delete_module+0x16d/0x260
  [<ffffffff810a0065>] ? wait_iff_congested+0x45/0x110
  [<ffffffff815656af>] ? page_fault+0x1f/0x30
  [<ffffffff81565ba2>] ? system_call_fastpath+0x16/0x1b

on recent kernels.

This is due to commit 8a25a2f ("cpu: convert 'cpu' and
'machinecheck' sysdev_class to a regular subsystem") which renders
commit 6c53cbf ("x86, microcode: Correct sysdev_add error path")
useless.

See http://marc.info/?l=linux-kernel&m=133416246406478

Avoid above warning by restoring the old driver behaviour before
6c53cbf ("x86, microcode: Correct sysdev_add error path").

Cc: Tigran Aivazian <[email protected]>
Signed-off-by: Andreas Herrmann <[email protected]>
Acked-by: Greg Kroah-Hartman <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
[bwh: Backported to 3.2: deleted line uses sys_dev, not dev]
Signed-off-by: Ben Hutchings <[email protected]>
storm31 referenced this pull request in storm31/android_kernel_samsung_aries Jun 30, 2013
xfs_sync_worker checks the MS_ACTIVE flag in s_flags to avoid doing
work during mount and unmount.  This flag can be cleared by unmount
after the xfs_sync_worker checks it but before the work is completed.
This has caused crashes in the completion handler for the dummy
transaction committed by xfs_sync_worker:

PID: 27544  TASK: ffff88013544e040  CPU: 3   COMMAND: "kworker/3:0"
 #0 [ffff88016fdff930] machine_kexec at ffffffff810244e9
 #1 [ffff88016fdff9a0] crash_kexec at ffffffff8108d053
 #2 [ffff88016fdffa70] oops_end at ffffffff813ad1b8
 #3 [ffff88016fdffaa0] no_context at ffffffff8102bd48
 #4 [ffff88016fdffaf0] __bad_area_nosemaphore at ffffffff8102c04d
 #5 [ffff88016fdffb40] bad_area_nosemaphore at ffffffff8102c12e
 #6 [ffff88016fdffb50] do_page_fault at ffffffff813afaee
 #7 [ffff88016fdffc60] page_fault at ffffffff813ac635
    [exception RIP: xlog_get_lowest_lsn+0x30]
    RIP: ffffffffa04a9910  RSP: ffff88016fdffd10  RFLAGS: 00010246
    RAX: ffffc90014e48000  RBX: ffff88014d879980  RCX: ffff88014d879980
    RDX: ffff8802214ee4c0  RSI: 0000000000000000  RDI: 0000000000000000
    RBP: ffff88016fdffd10   R8: ffff88014d879a80   R9: 0000000000000000
    R10: 0000000000000001  R11: 0000000000000000  R12: ffff8802214ee400
    R13: ffff88014d879980  R14: 0000000000000000  R15: ffff88022fd96605
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffff88016fdffd18] xlog_state_do_callback at ffffffffa04aa186 [xfs]
 #9 [ffff88016fdffd98] xlog_state_done_syncing at ffffffffa04aa568 [xfs]

Protect xfs_sync_worker by using the s_umount semaphore at the read
level to provide exclusion with unmount while work is progressing.

Reviewed-by: Mark Tinguely <[email protected]>
Signed-off-by: Ben Myers <[email protected]>
storm31 referenced this pull request in storm31/android_kernel_samsung_aries Jun 30, 2013
The logic that allows a short TFD queue was completely wrong.
We do maintain 256 Transmit Frame Descriptors, but they point to
recycled buffers. We used to attach and detach different TFDs for
the same buffer and it worked since they pointed to the same buffer.

Also zero the number of BDs after unmapping a TFD. This seems not
necessary since we don't reclaim the same TFD twice, but I like
housekeeping.

This patch solves this warning:

[ 6427.079855] WARNING: at lib/dma-debug.c:866 check_unmap+0x727/0x7a0()
[ 6427.079859] Hardware name: Latitude E6410
[ 6427.079865] iwlwifi 0000:02:00.0: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x00000000296d393c] [size=8 bytes]
[ 6427.079870] Modules linked in: ...
[ 6427.079950] Pid: 6613, comm: ifconfig Tainted: G           O 3.3.3 #5
[ 6427.079954] Call Trace:
[ 6427.079963]  [<c10337a2>] warn_slowpath_common+0x72/0xa0
[ 6427.079982]  [<c1033873>] warn_slowpath_fmt+0x33/0x40
[ 6427.079988]  [<c12dcb77>] check_unmap+0x727/0x7a0
[ 6427.079995]  [<c12dcdaa>] debug_dma_unmap_page+0x5a/0x80
[ 6427.080024]  [<fe2312ac>] iwlagn_unmap_tfd+0x12c/0x180 [iwlwifi]
[ 6427.080048]  [<fe231349>] iwlagn_txq_free_tfd+0x49/0xb0 [iwlwifi]
[ 6427.080071]  [<fe228e37>] iwl_tx_queue_unmap+0x67/0x90 [iwlwifi]
[ 6427.080095]  [<fe22d221>] iwl_trans_pcie_stop_device+0x341/0x7b0 [iwlwifi]
[ 6427.080113]  [<fe204b0e>] iwl_down+0x17e/0x260 [iwlwifi]
[ 6427.080132]  [<fe20efec>] iwlagn_mac_stop+0x6c/0xf0 [iwlwifi]
[ 6427.080168]  [<fd8480ce>] ieee80211_stop_device+0x5e/0x190 [mac80211]
[ 6427.080198]  [<fd833208>] ieee80211_do_stop+0x288/0x620 [mac80211]
[ 6427.080243]  [<fd8335b7>] ieee80211_stop+0x17/0x20 [mac80211]
[ 6427.080250]  [<c148dac1>] __dev_close_many+0x81/0xd0
[ 6427.080270]  [<c148db3d>] __dev_close+0x2d/0x50
[ 6427.080276]  [<c148d152>] __dev_change_flags+0x82/0x150
[ 6427.080282]  [<c148e3e3>] dev_change_flags+0x23/0x60
[ 6427.080289]  [<c14f6320>] devinet_ioctl+0x6a0/0x770
[ 6427.080296]  [<c14f8705>] inet_ioctl+0x95/0xb0
[ 6427.080304]  [<c147a0f0>] sock_ioctl+0x70/0x270

Cc: [email protected]
Reported-by: Antonio Quartulli <[email protected]>
Tested-by: Antonio Quartulli <[email protected]>
Signed-off-by: Emmanuel Grumbach <[email protected]>
Reviewed-by: Wey-Yi W Guy <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: John W. Linville <[email protected]>
storm31 referenced this pull request in storm31/android_kernel_samsung_aries Jun 30, 2013
…condition

When holding the mmap_sem for reading, pmd_offset_map_lock should only
run on a pmd_t that has been read atomically from the pmdp pointer,
otherwise we may read only half of it leading to this crash.

PID: 11679  TASK: f06e8000  CPU: 3   COMMAND: "do_race_2_panic"
 #0 [f06a9dd8] crash_kexec at c049b5ec
 #1 [f06a9e2c] oops_end at c083d1c2
 #2 [f06a9e40] no_context at c0433ded
 #3 [f06a9e64] bad_area_nosemaphore at c043401a
 #4 [f06a9e6c] __do_page_fault at c0434493
 #5 [f06a9eec] do_page_fault at c083eb45
 #6 [f06a9f04] error_code (via page_fault) at c083c5d5
    EAX: 01fb470c EBX: fff35000 ECX: 00000003 EDX: 00000100 EBP:
    00000000
    DS:  007b     ESI: 9e201000 ES:  007b     EDI: 01fb4700 GS:  00e0
    CS:  0060     EIP: c083bc14 ERR: ffffffff EFLAGS: 00010246
 #7 [f06a9f38] _spin_lock at c083bc14
 #8 [f06a9f44] sys_mincore at c0507b7d
 #9 [f06a9fb0] system_call at c083becd
                         start           len
    EAX: ffffffda  EBX: 9e200000  ECX: 00001000  EDX: 6228537f
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 003d0f00
    SS:  007b      ESP: 62285354  EBP: 62285388  GS:  0033
    CS:  0073      EIP: 00291416  ERR: 000000da  EFLAGS: 00000286

This should be a longstanding bug affecting x86 32bit PAE without THP.
Only archs with 64bit large pmd_t and 32bit unsigned long should be
affected.

With THP enabled the barrier() in pmd_none_or_trans_huge_or_clear_bad()
would partly hide the bug when the pmd transitions from none to stable,
by forcing a re-read of the *pmd in pmd_offset_map_lock, but when THP is
enabled a new set of problems arises from the fact that the pmd could then
transition freely between any of the none, pmd_trans_huge or
pmd_trans_stable states. So making the barrier in
pmd_none_or_trans_huge_or_clear_bad() unconditional isn't a good idea and
it would be a flaky solution.

This should be fully fixed by introducing a pmd_read_atomic that reads
the pmd in order with THP disabled, or by reading the pmd atomically
with cmpxchg8b with THP enabled.

Luckily this new race condition only triggers in the places that must
already be covered by pmd_none_or_trans_huge_or_clear_bad() so the fix
is localized there but this bug is not related to THP.

NOTE: this can trigger on x86 32bit systems with PAE enabled with more
than 4G of ram, otherwise the high part of the pmd will never risk to be
truncated because it would be zero at all times, in turn so hiding the
SMP race.

This bug was discovered and fully debugged by Ulrich, quote:

----
[..]
pmd_none_or_trans_huge_or_clear_bad() loads the content of edx and
eax.

    static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t *pmd)
    {
            /* depend on compiler for an atomic pmd read */
            pmd_t pmdval = *pmd;

                                // edi = pmd pointer
0xc0507a74 <sys_mincore+548>:   mov    0x8(%esp),%edi
...
                                // edx = PTE page table high address
0xc0507a84 <sys_mincore+564>:   mov    0x4(%edi),%edx
...
                                // eax = PTE page table low address
0xc0507a8e <sys_mincore+574>:   mov    (%edi),%eax

[..]

Please note that the PMD is not read atomically. These are two "mov"
instructions where the high order bits of the PMD entry are fetched
first. Hence, the above machine code is prone to the following race.

-  The PMD entry {high|low} is 0x0000000000000000.
   The "mov" at 0xc0507a84 loads 0x00000000 into edx.

-  A page fault (on another CPU) sneaks in between the two "mov"
   instructions and instantiates the PMD.

-  The PMD entry {high|low} is now 0x00000003fda38067.
   The "mov" at 0xc0507a8e loads 0xfda38067 into eax.
----
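
The "read the pmd in order" idea can be illustrated in user space. This is
only a sketch under stated assumptions, not the kernel's pmd_read_atomic:
struct model_pmd and model_pmd_read_in_order() are hypothetical names, the
entry is assumed to change only from none to populated, and the writer is
assumed to publish the high word before the low word. Reading the low word
first means a zero low word can be reported as "none" without ever combining
it with a high word from a different moment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit PAE pmd stored as two 32-bit words. */
struct model_pmd {
	_Atomic uint32_t low;	/* present bit and low address bits */
	_Atomic uint32_t high;	/* high address bits */
};

/*
 * Read the low word first. If it is zero the entry is treated as none
 * and the high word is not consulted at all. Otherwise the acquire load
 * orders the later read of the high word after the read of the low word.
 */
static uint64_t model_pmd_read_in_order(struct model_pmd *pmd)
{
	uint32_t low, high;

	assert(pmd);
	low = atomic_load_explicit(&pmd->low, memory_order_acquire);
	if (low == 0)
		return 0;

	high = atomic_load_explicit(&pmd->high, memory_order_relaxed);
	return ((uint64_t)high << 32) | low;
}
```

In this model an entry whose high word has been written but whose low word is
still zero reads as none, which is the opposite order from the two "mov"
instructions quoted above.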

Reported-by: Ulrich Obergfell <[email protected]>
Signed-off-by: Andrea Arcangeli <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Larry Woodman <[email protected]>
Cc: Petr Matousek <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
storm31 referenced this pull request in storm31/android_kernel_samsung_aries Jun 30, 2013
The warning below triggers on AMD MCM packages because physical package
IDs on the cores of a _physical_ socket are the same. I.e., this field
says which CPUs belong to the same physical package.

However, the same two CPUs belong to two different internal, i.e.
"logical" nodes in the same physical socket which is reflected in the
CPU-to-node map on x86 with NUMA.

Which makes this check wrong on the above topologies so circumvent it.

[    0.444413] Booting Node   0, Processors  #1 #2 #3 #4 #5 Ok.
[    0.461388] ------------[ cut here ]------------
[    0.465997] WARNING: at arch/x86/kernel/smpboot.c:310 topology_sane.clone.1+0x6e/0x81()
[    0.473960] Hardware name: Dinar
[    0.477170] sched: CPU #6's mc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
[    0.486860] Booting Node   1, Processors  #6
[    0.491104] Modules linked in:
[    0.494141] Pid: 0, comm: swapper/6 Not tainted 3.4.0+ #1
[    0.499510] Call Trace:
[    0.501946]  [<ffffffff8144bf92>] ? topology_sane.clone.1+0x6e/0x81
[    0.508185]  [<ffffffff8102f1fc>] warn_slowpath_common+0x85/0x9d
[    0.514163]  [<ffffffff8102f2b7>] warn_slowpath_fmt+0x46/0x48
[    0.519881]  [<ffffffff8144bf92>] topology_sane.clone.1+0x6e/0x81
[    0.525943]  [<ffffffff8144c234>] set_cpu_sibling_map+0x251/0x371
[    0.532004]  [<ffffffff8144c4ee>] start_secondary+0x19a/0x218
[    0.537729] ---[ end trace 4eaa2a86a8e2da22 ]---
[    0.628197]  #7 #8 #9 #10 #11 Ok.
[    0.807108] Booting Node   3, Processors  #12 #13 #14 #15 #16 #17 Ok.
[    0.897587] Booting Node   2, Processors  #18 #19 #20 #21 #22 #23 Ok.
[    0.917443] Brought up 24 CPUs

We ran a topology sanity check test we have here on it and
it all looks ok... hopefully :).

Signed-off-by: Borislav Petkov <[email protected]>
Cc: Andreas Herrmann <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
storm31 referenced this pull request in storm31/android_kernel_samsung_aries Jun 30, 2013
Jian found that when he ran fsx on a 32 bit arch with a large wsize the
process and one of the bdi writeback kthreads would sometimes deadlock
with a stack trace like this:

crash> bt
PID: 2789   TASK: f02edaa0  CPU: 3   COMMAND: "fsx"
 #0 [eed63cbc] schedule at c083c5b3
 #1 [eed63d80] kmap_high at c0500ec8
 #2 [eed63db0] cifs_async_writev at f7fabcd7 [cifs]
 #3 [eed63df0] cifs_writepages at f7fb7f5c [cifs]
 #4 [eed63e50] do_writepages at c04f3e32
 #5 [eed63e54] __filemap_fdatawrite_range at c04e152a
 #6 [eed63ea4] filemap_fdatawrite at c04e1b3e
 #7 [eed63eb4] cifs_file_aio_write at f7fa111a [cifs]
 #8 [eed63ecc] do_sync_write at c052d202
 #9 [eed63f74] vfs_write at c052d4ee
#10 [eed63f94] sys_write at c052df4c
#11 [eed63fb0] ia32_sysenter_target at c0409a98
    EAX: 00000004  EBX: 00000003  ECX: abd73b73  EDX: 012a65c6
    DS:  007b      ESI: 012a65c6  ES:  007b      EDI: 00000000
    SS:  007b      ESP: bf8db178  EBP: bf8db1f8  GS:  0033
    CS:  0073      EIP: 40000424  ERR: 00000004  EFLAGS: 00000246

Each task would kmap part of its address array before getting stuck, but
not enough to actually issue the write.

This patch fixes this by serializing the marshal_iov operations for
async reads and writes. The idea here is to ensure that cifs
aggressively tries to populate a request before attempting to fulfill
another one. As soon as all of the pages are kmapped for a request, then
we can unlock and allow another one to proceed.

There's no need to do this serialization on non-CONFIG_HIGHMEM arches
however, so optimize all of this out when CONFIG_HIGHMEM isn't set.

Cc: <[email protected]>
Reported-by: Jian Li <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Steve French <[email protected]>
storm31 referenced this pull request in storm31/android_kernel_samsung_aries Jun 30, 2013
…d reasons

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.
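
The flag-based guard described above can be sketched in user space. Everything
here is a hypothetical model rather than the NFS or sunrpc code: struct
model_task, MODEL_PF_FSTRANS and both functions are made-up names, and the bit
value is arbitrary. The point is only that the allocating task marks itself
before allocating, and the release path declines to commit while the mark is
set.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PF_FSTRANS 0x1u	/* hypothetical bit value */

/* Hypothetical stand-in for the per-task flags word. */
struct model_task {
	unsigned int flags;
};

/* Release path: only issue a blocking commit if the task is not marked. */
static bool model_release_page_may_commit(const struct model_task *task)
{
	assert(task);
	return !(task->flags & MODEL_PF_FSTRANS);
}

/*
 * Allocation path: mark the task, allocate (reclaim may call back into the
 * release path at this point), then restore the previous state of the mark.
 * Returns what the release path would have decided during the allocation.
 */
static bool model_alloc_socket(struct model_task *task)
{
	unsigned int saved;
	bool may_commit;

	assert(task);
	saved = task->flags & MODEL_PF_FSTRANS;
	task->flags |= MODEL_PF_FSTRANS;

	may_commit = model_release_page_may_commit(task);

	task->flags = (task->flags & ~MODEL_PF_FSTRANS) | saved;
	return may_commit;
}
```

Saving and restoring the previous bit, rather than clearing it
unconditionally, keeps the sketch correct when the allocation path is entered
by a task that was already marked.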

Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
Cc: [email protected]
storm31 referenced this pull request in storm31/android_kernel_samsung_aries Jun 30, 2013
On architectures where cputime_t is a 64-bit type, it is possible to trigger
a divide by zero on the do_div(temp, (__force u32) total) line if total is a
nonzero number whose lower 32 bits are zero. Removing the cast is not
a good solution since some do_div() implementations cast to u32
internally.
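
The truncation can be reproduced with plain integer arithmetic. The sketch
below is a user-space illustration, not the kernel fix:
model_divisor_truncates_to_zero() and model_scale_utime() are hypothetical
names, and the possible overflow of the 64-bit product is ignored for brevity.
It shows which divisors become zero after a cast to 32 bits and what a full
64-bit division returns for the same inputs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when a nonzero 64-bit divisor becomes zero once cast to 32 bits. */
static bool model_divisor_truncates_to_zero(uint64_t total)
{
	return total != 0 && (uint32_t)total == 0;
}

/*
 * Scale utime by rtime / total using a full 64-bit divisor, so a divisor
 * with zero low bits is still handled. The caller must pass total != 0.
 */
static uint64_t model_scale_utime(uint64_t utime, uint64_t rtime,
				  uint64_t total)
{
	assert(total != 0);
	return (rtime * utime) / total;
}
```

For example, a total of 0x100000000 is nonzero but truncates to zero as a
32-bit value, while the 64-bit division still yields a result.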

This problem can be triggered in practice on very long lived processes:

  PID: 2331   TASK: ffff880472814b00  CPU: 2   COMMAND: "oraagent.bin"
   #0 [ffff880472a51b70] machine_kexec at ffffffff8103214b
   #1 [ffff880472a51bd0] crash_kexec at ffffffff810b91c2
   #2 [ffff880472a51ca0] oops_end at ffffffff814f0b00
   #3 [ffff880472a51cd0] die at ffffffff8100f26b
   #4 [ffff880472a51d00] do_trap at ffffffff814f03f4
   #5 [ffff880472a51d60] do_divide_error at ffffffff8100cfff
   #6 [ffff880472a51e00] divide_error at ffffffff8100be7b
      [exception RIP: thread_group_times+0x56]
      RIP: ffffffff81056a16  RSP: ffff880472a51eb8  RFLAGS: 00010046
      RAX: bc3572c9fe12d194  RBX: ffff880874150800  RCX: 0000000110266fad
      RDX: 0000000000000000  RSI: ffff880472a51eb8  RDI: 001038ae7d9633dc
      RBP: ffff880472a51ef8   R8: 00000000b10a3a64   R9: ffff880874150800
      R10: 00007fcba27ab680  R11: 0000000000000202  R12: ffff880472a51f08
      R13: ffff880472a51f10  R14: 0000000000000000  R15: 0000000000000007
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
   #7 [ffff880472a51f00] do_sys_times at ffffffff8108845d
   #8 [ffff880472a51f40] sys_times at ffffffff81088524
   #9 [ffff880472a51f80] system_call_fastpath at ffffffff8100b0f2
      RIP: 0000003808caac3a  RSP: 00007fcba27ab6d8  RFLAGS: 00000202
      RAX: 0000000000000064  RBX: ffffffff8100b0f2  RCX: 0000000000000000
      RDX: 00007fcba27ab6e0  RSI: 000000000076d58e  RDI: 00007fcba27ab6e0
      RBP: 00007fcba27ab700   R8: 0000000000000020   R9: 000000000000091b
      R10: 00007fcba27ab680  R11: 0000000000000202  R12: 00007fff9ca41940
      R13: 0000000000000000  R14: 00007fcba27ac9c0  R15: 00007fff9ca41940
      ORIG_RAX: 0000000000000064  CS: 0033  SS: 002b

Cc: [email protected]
Signed-off-by: Stanislaw Gruszka <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
storm31 referenced this pull request in storm31/android_kernel_samsung_aries Jun 30, 2013
This moves ARM over to the asm-generic/unaligned.h header. This has the
benefit of better code generated especially for ARMv7 on gcc 4.7+
compilers.

As Arnd Bergmann points out: The asm-generic version uses the "struct"
version for native-endian unaligned access and the "byteshift" version
for the opposite endianness. The current ARM version however uses the
"byteshift" implementation for both.
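
For readers unfamiliar with the two flavours, here is a small user-space
sketch of each. It is an illustration, not the kernel headers: the model_*
names are hypothetical, the packed attribute is a GCC/Clang extension, and a
memcpy-based reader is included only as a portable reference for the
native-endian result.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* "struct" flavour: a packed wrapper tells the compiler not to assume alignment. */
struct model_una_u32 {
	uint32_t x;
} __attribute__((packed));

static uint32_t model_get_unaligned_struct(const void *p)
{
	const struct model_una_u32 *ptr = p;

	assert(p);
	return ptr->x;	/* native-endian load, instruction choice left to the compiler */
}

/* "byteshift" flavour: assemble a little-endian value one byte at a time. */
static uint32_t model_get_unaligned_le32_byteshift(const void *p)
{
	const uint8_t *b = p;

	assert(p);
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Portable reference for the native-endian value. */
static uint32_t model_get_unaligned_memcpy(const void *p)
{
	uint32_t v;

	assert(p);
	memcpy(&v, p, sizeof(v));
	return v;
}
```

The byteshift reader returns the same value on any host, whereas the struct
reader returns whatever the host byte order dictates and lets the compiler
pick a single load where the target allows it.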

Thanks to Nicolas Pitre for the excellent analysis:

Test case:

int foo (int *x) { return get_unaligned(x); }
long long bar (long long *x) { return get_unaligned(x); }

With the current ARM version:

foo:
	ldrb	r3, [r0, #2]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 2B], MEM[(const u8 *)x_1(D) + 2B]
	ldrb	r1, [r0, #1]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 1B], MEM[(const u8 *)x_1(D) + 1B]
	ldrb	r2, [r0, #0]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D)], MEM[(const u8 *)x_1(D)]
	mov	r3, r3, asl #16	@ tmp154, MEM[(const u8 *)x_1(D) + 2B],
	ldrb	r0, [r0, #3]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 3B], MEM[(const u8 *)x_1(D) + 3B]
	orr	r3, r3, r1, asl #8	@, tmp155, tmp154, MEM[(const u8 *)x_1(D) + 1B],
	orr	r3, r3, r2	@ tmp157, tmp155, MEM[(const u8 *)x_1(D)]
	orr	r0, r3, r0, asl #24	@,, tmp157, MEM[(const u8 *)x_1(D) + 3B],
	bx	lr	@

bar:
	stmfd	sp!, {r4, r5, r6, r7}	@,
	mov	r2, #0	@ tmp184,
	ldrb	r5, [r0, #6]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 6B], MEM[(const u8 *)x_1(D) + 6B]
	ldrb	r4, [r0, #5]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 5B], MEM[(const u8 *)x_1(D) + 5B]
	ldrb	ip, [r0, #2]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 2B], MEM[(const u8 *)x_1(D) + 2B]
	ldrb	r1, [r0, #4]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 4B], MEM[(const u8 *)x_1(D) + 4B]
	mov	r5, r5, asl #16	@ tmp175, MEM[(const u8 *)x_1(D) + 6B],
	ldrb	r7, [r0, #1]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 1B], MEM[(const u8 *)x_1(D) + 1B]
	orr	r5, r5, r4, asl #8	@, tmp176, tmp175, MEM[(const u8 *)x_1(D) + 5B],
	ldrb	r6, [r0, #7]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 7B], MEM[(const u8 *)x_1(D) + 7B]
	orr	r5, r5, r1	@ tmp178, tmp176, MEM[(const u8 *)x_1(D) + 4B]
	ldrb	r4, [r0, #0]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D)], MEM[(const u8 *)x_1(D)]
	mov	ip, ip, asl #16	@ tmp188, MEM[(const u8 *)x_1(D) + 2B],
	ldrb	r1, [r0, #3]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 3B], MEM[(const u8 *)x_1(D) + 3B]
	orr	ip, ip, r7, asl #8	@, tmp189, tmp188, MEM[(const u8 *)x_1(D) + 1B],
	orr	r3, r5, r6, asl #24	@,, tmp178, MEM[(const u8 *)x_1(D) + 7B],
	orr	ip, ip, r4	@ tmp191, tmp189, MEM[(const u8 *)x_1(D)]
	orr	ip, ip, r1, asl #24	@, tmp194, tmp191, MEM[(const u8 *)x_1(D) + 3B],
	mov	r1, r3	@,
	orr	r0, r2, ip	@ tmp171, tmp184, tmp194
	ldmfd	sp!, {r4, r5, r6, r7}
	bx	lr

In both cases the code is slightly suboptimal.  One may wonder why
wasting r2 with the constant 0 in the second case for example.  And all
the mov's could be folded in subsequent orr's, etc.

Now with the asm-generic version:

foo:
	ldr	r0, [r0, #0]	@ unaligned	@,* x
	bx	lr	@

bar:
	mov	r3, r0	@ x, x
	ldr	r0, [r0, #0]	@ unaligned	@,* x
	ldr	r1, [r3, #4]	@ unaligned	@,
	bx	lr	@

This is way better of course, but only because this was compiled for
ARMv7. In this case the compiler knows that the hardware can do
unaligned word access.  This isn't that obvious for foo(), but if we
remove the get_unaligned() from bar as follows:

long long bar (long long *x) {return *x; }

then the resulting code is:

bar:
	ldmia	r0, {r0, r1}	@ x,,
	bx	lr	@

So this proves that the presumed aligned vs unaligned cases do influence
the instructions the compiler may use and that the above
unaligned code results are not just an accident.

Still... this isn't fully conclusive without at least looking at the
resulting assembly from a pre-ARMv6 compilation.  Let's see with an
ARMv5 target:

foo:
	ldrb	r3, [r0, #0]	@ zero_extendqisi2	@ tmp139,* x
	ldrb	r1, [r0, #1]	@ zero_extendqisi2	@ tmp140,
	ldrb	r2, [r0, #2]	@ zero_extendqisi2	@ tmp143,
	ldrb	r0, [r0, #3]	@ zero_extendqisi2	@ tmp146,
	orr	r3, r3, r1, asl #8	@, tmp142, tmp139, tmp140,
	orr	r3, r3, r2, asl #16	@, tmp145, tmp142, tmp143,
	orr	r0, r3, r0, asl #24	@,, tmp145, tmp146,
	bx	lr	@

bar:
	stmfd	sp!, {r4, r5, r6, r7}	@,
	ldrb	r2, [r0, #0]	@ zero_extendqisi2	@ tmp139,* x
	ldrb	r7, [r0, #1]	@ zero_extendqisi2	@ tmp140,
	ldrb	r3, [r0, #4]	@ zero_extendqisi2	@ tmp149,
	ldrb	r6, [r0, #5]	@ zero_extendqisi2	@ tmp150,
	ldrb	r5, [r0, #2]	@ zero_extendqisi2	@ tmp143,
	ldrb	r4, [r0, #6]	@ zero_extendqisi2	@ tmp153,
	ldrb	r1, [r0, #7]	@ zero_extendqisi2	@ tmp156,
	ldrb	ip, [r0, #3]	@ zero_extendqisi2	@ tmp146,
	orr	r2, r2, r7, asl #8	@, tmp142, tmp139, tmp140,
	orr	r3, r3, r6, asl #8	@, tmp152, tmp149, tmp150,
	orr	r2, r2, r5, asl #16	@, tmp145, tmp142, tmp143,
	orr	r3, r3, r4, asl #16	@, tmp155, tmp152, tmp153,
	orr	r0, r2, ip, asl #24	@,, tmp145, tmp146,
	orr	r1, r3, r1, asl #24	@,, tmp155, tmp156,
	ldmfd	sp!, {r4, r5, r6, r7}
	bx	lr

Compared to the initial results, this is really nicely optimized and I
couldn't do much better if I were to hand code it myself.

Signed-off-by: Rob Herring <[email protected]>
Reviewed-by: Nicolas Pitre <[email protected]>
Tested-by: Thomas Petazzoni <[email protected]>
Reviewed-by: Arnd Bergmann <[email protected]>
Signed-off-by: Russell King <[email protected]>
storm31 referenced this pull request in storm31/android_kernel_samsung_aries Jun 30, 2013
Fixes following lockdep splat :

[ 1614.734896] =============================================
[ 1614.734898] [ INFO: possible recursive locking detected ]
[ 1614.734901] 3.6.0-rc3+ #782 Not tainted
[ 1614.734903] ---------------------------------------------
[ 1614.734905] swapper/11/0 is trying to acquire lock:
[ 1614.734907]  (slock-AF_INET){+.-...}, at: [<ffffffffa0209d72>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.734920]
[ 1614.734920] but task is already holding lock:
[ 1614.734922]  (slock-AF_INET){+.-...}, at: [<ffffffff815fce23>] tcp_v4_err+0x163/0x6b0
[ 1614.734932]
[ 1614.734932] other info that might help us debug this:
[ 1614.734935]  Possible unsafe locking scenario:
[ 1614.734935]
[ 1614.734937]        CPU0
[ 1614.734938]        ----
[ 1614.734940]   lock(slock-AF_INET);
[ 1614.734943]   lock(slock-AF_INET);
[ 1614.734946]
[ 1614.734946]  *** DEADLOCK ***
[ 1614.734946]
[ 1614.734949]  May be due to missing lock nesting notation
[ 1614.734949]
[ 1614.734952] 7 locks held by swapper/11/0:
[ 1614.734954]  #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff81592801>] __netif_receive_skb+0x251/0xd00
[ 1614.734964]  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff815d319c>] ip_local_deliver_finish+0x4c/0x4e0
[ 1614.734972]  #2:  (rcu_read_lock){.+.+..}, at: [<ffffffff8160d116>] icmp_socket_deliver+0x46/0x230
[ 1614.734982]  #3:  (slock-AF_INET){+.-...}, at: [<ffffffff815fce23>] tcp_v4_err+0x163/0x6b0
[ 1614.734989]  #4:  (rcu_read_lock){.+.+..}, at: [<ffffffff815da240>] ip_queue_xmit+0x0/0x680
[ 1614.734997]  #5:  (rcu_read_lock_bh){.+....}, at: [<ffffffff815d9925>] ip_finish_output+0x135/0x890
[ 1614.735004]  #6:  (rcu_read_lock_bh){.+....}, at: [<ffffffff81595680>] dev_queue_xmit+0x0/0xe00
[ 1614.735012]
[ 1614.735012] stack backtrace:
[ 1614.735016] Pid: 0, comm: swapper/11 Not tainted 3.6.0-rc3+ #782
[ 1614.735018] Call Trace:
[ 1614.735020]  <IRQ>  [<ffffffff810a50ac>] __lock_acquire+0x144c/0x1b10
[ 1614.735033]  [<ffffffff810a334b>] ? check_usage+0x9b/0x4d0
[ 1614.735037]  [<ffffffff810a6762>] ? mark_held_locks+0x82/0x130
[ 1614.735042]  [<ffffffff810a5df0>] lock_acquire+0x90/0x200
[ 1614.735047]  [<ffffffffa0209d72>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.735051]  [<ffffffff810a69ad>] ? trace_hardirqs_on+0xd/0x10
[ 1614.735060]  [<ffffffff81749b31>] _raw_spin_lock+0x41/0x50
[ 1614.735065]  [<ffffffffa0209d72>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.735069]  [<ffffffffa0209d72>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.735075]  [<ffffffffa014f7f2>] l2tp_eth_dev_xmit+0x32/0x60 [l2tp_eth]
[ 1614.735079]  [<ffffffff81595112>] dev_hard_start_xmit+0x502/0xa70
[ 1614.735083]  [<ffffffff81594c6e>] ? dev_hard_start_xmit+0x5e/0xa70
[ 1614.735087]  [<ffffffff815957c1>] ? dev_queue_xmit+0x141/0xe00
[ 1614.735093]  [<ffffffff815b622e>] sch_direct_xmit+0xfe/0x290
[ 1614.735098]  [<ffffffff81595865>] dev_queue_xmit+0x1e5/0xe00
[ 1614.735102]  [<ffffffff81595680>] ? dev_hard_start_xmit+0xa70/0xa70
[ 1614.735106]  [<ffffffff815b4daa>] ? eth_header+0x3a/0xf0
[ 1614.735111]  [<ffffffff8161d33e>] ? fib_get_table+0x2e/0x280
[ 1614.735117]  [<ffffffff8160a7e2>] arp_xmit+0x22/0x60
[ 1614.735121]  [<ffffffff8160a863>] arp_send+0x43/0x50
[ 1614.735125]  [<ffffffff8160b82f>] arp_solicit+0x18f/0x450
[ 1614.735132]  [<ffffffff8159d9da>] neigh_probe+0x4a/0x70
[ 1614.735137]  [<ffffffff815a191a>] __neigh_event_send+0xea/0x300
[ 1614.735141]  [<ffffffff815a1c93>] neigh_resolve_output+0x163/0x260
[ 1614.735146]  [<ffffffff815d9cf5>] ip_finish_output+0x505/0x890
[ 1614.735150]  [<ffffffff815d9925>] ? ip_finish_output+0x135/0x890
[ 1614.735154]  [<ffffffff815dae79>] ip_output+0x59/0xf0
[ 1614.735158]  [<ffffffff815da1cd>] ip_local_out+0x2d/0xa0
[ 1614.735162]  [<ffffffff815da403>] ip_queue_xmit+0x1c3/0x680
[ 1614.735165]  [<ffffffff815da240>] ? ip_local_out+0xa0/0xa0
[ 1614.735172]  [<ffffffff815f4402>] tcp_transmit_skb+0x402/0xa60
[ 1614.735177]  [<ffffffff815f5a11>] tcp_retransmit_skb+0x1a1/0x620
[ 1614.735181]  [<ffffffff815f7e93>] tcp_retransmit_timer+0x393/0x960
[ 1614.735185]  [<ffffffff815fce23>] ? tcp_v4_err+0x163/0x6b0
[ 1614.735189]  [<ffffffff815fd317>] tcp_v4_err+0x657/0x6b0
[ 1614.735194]  [<ffffffff8160d116>] ? icmp_socket_deliver+0x46/0x230
[ 1614.735199]  [<ffffffff8160d19e>] icmp_socket_deliver+0xce/0x230
[ 1614.735203]  [<ffffffff8160d116>] ? icmp_socket_deliver+0x46/0x230
[ 1614.735208]  [<ffffffff8160d464>] icmp_unreach+0xe4/0x2c0
[ 1614.735213]  [<ffffffff8160e520>] icmp_rcv+0x350/0x4a0
[ 1614.735217]  [<ffffffff815d3285>] ip_local_deliver_finish+0x135/0x4e0
[ 1614.735221]  [<ffffffff815d319c>] ? ip_local_deliver_finish+0x4c/0x4e0
[ 1614.735225]  [<ffffffff815d3ffa>] ip_local_deliver+0x4a/0x90
[ 1614.735229]  [<ffffffff815d37b7>] ip_rcv_finish+0x187/0x730
[ 1614.735233]  [<ffffffff815d425d>] ip_rcv+0x21d/0x300
[ 1614.735237]  [<ffffffff81592a1b>] __netif_receive_skb+0x46b/0xd00
[ 1614.735241]  [<ffffffff81592801>] ? __netif_receive_skb+0x251/0xd00
[ 1614.735245]  [<ffffffff81593368>] process_backlog+0xb8/0x180
[ 1614.735249]  [<ffffffff81593cf9>] net_rx_action+0x159/0x330
[ 1614.735257]  [<ffffffff810491f0>] __do_softirq+0xd0/0x3e0
[ 1614.735264]  [<ffffffff8109ed24>] ? tick_program_event+0x24/0x30
[ 1614.735270]  [<ffffffff8175419c>] call_softirq+0x1c/0x30
[ 1614.735278]  [<ffffffff8100425d>] do_softirq+0x8d/0xc0
[ 1614.735282]  [<ffffffff8104983e>] irq_exit+0xae/0xe0
[ 1614.735287]  [<ffffffff8175494e>] smp_apic_timer_interrupt+0x6e/0x99
[ 1614.735291]  [<ffffffff81753a1c>] apic_timer_interrupt+0x6c/0x80
[ 1614.735293]  <EOI>  [<ffffffff810a14ad>] ? trace_hardirqs_off+0xd/0x10
[ 1614.735306]  [<ffffffff81336f85>] ? intel_idle+0xf5/0x150
[ 1614.735310]  [<ffffffff81336f7e>] ? intel_idle+0xee/0x150
[ 1614.735317]  [<ffffffff814e6ea9>] cpuidle_enter+0x19/0x20
[ 1614.735321]  [<ffffffff814e7538>] cpuidle_idle_call+0xa8/0x630
[ 1614.735327]  [<ffffffff8100c1ba>] cpu_idle+0x8a/0xe0
[ 1614.735333]  [<ffffffff8173762e>] start_secondary+0x220/0x222

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
storm31 referenced this pull request in storm31/android_kernel_samsung_aries Jun 30, 2013
Use spin_lock_irq() to quiet warning:

         [    8.232324] BUG: spinlock trylock failure on UP on CPU#0, reboot/85
         [    8.234138]  lock: c161c760, .magic: dead4ead, .owner: reboot/85, .owner_cpu: 0
         [    8.236132] Pid: 85, comm: reboot Not tainted 3.4.0-rc7-00656-g82163ed #5
         [    8.237965] Call Trace:
         [    8.238648]  [<c13dfd20>] ? printk+0x18/0x1a
         [    8.239827]  [<c122a5e0>] spin_dump+0x80/0xd0
         [    8.241016]  [<c122a652>] spin_bug+0x22/0x30
         [    8.242181]  [<c122a93b>] do_raw_spin_trylock+0x5b/0x70
         [    8.243611]  [<c13e8bae>] _raw_spin_trylock+0xe/0x60
         [    8.244975]  [<c1392230>] ? keypad_send_key.constprop.9+0xe0/0xe0
 ==>     [    8.246638]  [<c13922ea>] panel_scan_timer+0xba/0x570
         [    8.248019]  [<c1392230>] ? keypad_send_key.constprop.9+0xe0/0xe0
         [    8.249689]  [<c102f6f5>] run_timer_softirq+0x1e5/0x370
         [    8.251191]  [<c102f645>] ? run_timer_softirq+0x135/0x370
         [    8.252718]  [<c1392230>] ? keypad_send_key.constprop.9+0xe0/0xe0
         [    8.254462]  [<c102a592>] __do_softirq+0xc2/0x1c0
         [    8.255758]  [<c102a4d0>] ? local_bh_enable_ip+0x130/0x130
         [    8.257228]  <IRQ>  [<c102a855>] ? irq_exit+0x65/0x70
         [    8.258647]  [<c1013ff9>] ? smp_apic_timer_interrupt+0x49/0x80
         [    8.260226]  [<c13e96c1>] ? apic_timer_interrupt+0x31/0x38
         [    8.261737]  [<c12700e0>] ? drm_vm_open_locked+0x70/0xb0
         [    8.263166]  [<c122489a>] ? delay_tsc+0x1a/0x30
         [    8.264452]  [<c12248c9>] ? __delay+0x9/0x10
         [    8.265621]  [<c12248ec>] ? __const_udelay+0x1c/0x20
 ==>     [    8.266967]  [<c139136c>] ? lcd_clear_fast_p8+0x9c/0xe0
         [    8.268386]  [<c1391a66>] ? lcd_write+0x226/0x810
         [    8.269653]  [<c1367900>] ? md_set_readonly+0xc0/0xc0
         [    8.271013]  [<c122a9ed>] ? do_raw_spin_unlock+0x9d/0xe0
         [    8.272470]  [<c1392a98>] ? panel_lcd_print+0x38/0x40
         [    8.273837]  [<c1392ace>] ? panel_notify_sys+0x2e/0x60
         [    8.275224]  [<c1046634>] ? notifier_call_chain+0x84/0xb0
         [    8.276754]  [<c10469ce>] ? __blocking_notifier_call_chain+0x3e/0x60
         [    8.278576]  [<c1046a0a>] ? blocking_notifier_call_chain+0x1a/0x20
         [    8.280267]  [<c1036a14>] ? kernel_restart_prepare+0x14/0x40
         [    8.281901]  [<c1036a8e>] ? kernel_restart+0xe/0x50
         [    8.283216]  [<c1036ce9>] ? sys_reboot+0x149/0x1e0
         [    8.284532]  [<c10b3fb3>] ? handle_pte_fault+0x93/0xd70
         [    8.285956]  [<c1019e35>] ? do_page_fault+0x215/0x5e0
         [    8.287330]  [<c101a113>] ? do_page_fault+0x4f3/0x5e0
         [    8.288704]  [<c1045ac6>] ? up_read+0x16/0x30
         [    8.289890]  [<c101a113>] ? do_page_fault+0x4f3/0x5e0
         [    8.291252]  [<c10d4486>] ? iterate_supers+0x86/0xd0
         [    8.292615]  [<c122a9ed>] ? do_raw_spin_unlock+0x9d/0xe0
         [    8.294049]  [<c13e8dcd>] ? _raw_spin_unlock+0x1d/0x20
         [    8.295449]  [<c10d44ab>] ? iterate_supers+0xab/0xd0
         [    8.296795]  [<c10fb620>] ? __sync_filesystem+0xa0/0xa0
         [    8.298199]  [<c13e9b03>] ? sysenter_do_call+0x12/0x37
         [    8.306899] Restarting system.
         [    8.307747] machine restart

Signed-off-by: Fengguang Wu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
storm31 referenced this pull request in storm31/android_kernel_samsung_aries Jun 30, 2013
Cancel work of the xfs_sync_worker before teardown of the log in
xfs_unmountfs.  This prevents occasional crashes on unmount like so:

PID: 21602  TASK: ee9df060  CPU: 0   COMMAND: "kworker/0:3"
 #0 [c5377d28] crash_kexec at c0292c94
 #1 [c5377d80] oops_end at c07090c2
 #2 [c5377d98] no_context at c06f614e
 #3 [c5377dbc] __bad_area_nosemaphore at c06f6281
 #4 [c5377df4] bad_area_nosemaphore at c06f629b
 #5 [c5377e00] do_page_fault at c070b0cb
 #6 [c5377e7c] error_code (via page_fault) at c070892c
    EAX: f300c6a8  EBX: f300c6a8  ECX: 000000c0  EDX: 000000c0  EBP: c5377ed0
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000001  GS:  ffffad20
    CS:  0060      EIP: c0481ad0  ERR: ffffffff  EFLAGS: 00010246
 #7 [c5377eb0] atomic64_read_cx8 at c0481ad0
 #8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs]
 #9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs]
#10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs]
#11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs]
#12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs]
#13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs]
#14 [c5377f58] process_one_work at c024ee4c
#15 [c5377f98] worker_thread at c024f43d
#16 [c5377fbc] kthread at c025326b
#17 [c5377fe8] kernel_thread_helper at c070e834

PID: 26653  TASK: e79143b0  CPU: 3   COMMAND: "umount"
 #0 [cde0fda0] __schedule at c0706595
 #1 [cde0fe28] schedule at c0706b89
 #2 [cde0fe30] schedule_timeout at c0705600
 #3 [cde0fe94] __down_common at c0706098
 #4 [cde0fec8] __down at c0706122
 #5 [cde0fed0] down at c025936f
 #6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs]
 #7 [cde0ff00] xfs_freesb at f7cc2236 [xfs]
 #8 [cde0ff10] xfs_fs_put_super at f7c80f21 [xfs]
 #9 [cde0ff1c] generic_shutdown_super at c0333d7a
#10 [cde0ff38] kill_block_super at c0333e0f
#11 [cde0ff48] deactivate_locked_super at c0334218
#12 [cde0ff58] deactivate_super at c033495d
#13 [cde0ff68] mntput_no_expire at c034bc13
#14 [cde0ff7c] sys_umount at c034cc69
#15 [cde0ffa0] sys_oldumount at c034ccd4
#16 [cde0ffb0] system_call at c0707e66

commit 11159a0 added this to xfs_log_unmount and needs to be cleaned up
at a later date.

Signed-off-by: Ben Myers <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
Reviewed-by: Mark Tinguely <[email protected]>
storm31 referenced this pull request in storm31/android_kernel_samsung_aries Jun 30, 2013
One of the modes of Huawei E367 has this QMI/wwan interface:

 I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=01 Prot=07 Driver=(none)
 E:  Ad=83(I) Atr=03(Int.) MxPS=  64 Ivl=2ms
 E:  Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=4ms

Huawei use subclass and protocol to identify vendor specific
functions, so adding a new vendor rule for this combination.

The Pantech devices UML290 (106c:3718) and P4200 (106c:3721) use
the same subclass to identify the QMI/wwan function.  Replace the
existing device specific UML290 entries with generic vendor matching,
adding support for the Pantech P4200.

The ZTE MF683 has 6 vendor specific interfaces, all using
ff/ff/ff for cls/sub/prot.  Adding a match on interface #5 which
is a QMI/wwan interface.
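
The vendor rule described above can be sketched in plain userspace C. This is only an illustration of matching on vendor ID plus the class/subclass/protocol triple, not the driver's actual ID table: the struct names, `iface_matches()`, and the Huawei vendor ID `0x12d1` are assumptions that do not come from the commit message, while the `ff/01/07` triple is taken from the descriptor dump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rule: vendor ID plus interface class/subclass/protocol. */
struct vendor_rule {
    uint16_t vendor;
    uint8_t cls, sub, prot;
};

/* Hypothetical summary of one USB interface descriptor. */
struct usb_iface {
    uint16_t vendor;
    uint16_t product;
    uint8_t number;
    uint8_t cls, sub, prot;
};

static const struct vendor_rule rules[] = {
    /* Huawei (vendor ID assumed to be 0x12d1): Cls=ff Sub=01 Prot=07 */
    { 0x12d1, 0xff, 0x01, 0x07 },
};

/* True if any rule matches; product ID and interface number are ignored. */
bool iface_matches(const struct usb_iface *i)
{
    assert(i != NULL);
    for (size_t n = 0; n < sizeof(rules) / sizeof(rules[0]); n++) {
        const struct vendor_rule *r = &rules[n];
        if (i->vendor == r->vendor && i->cls == r->cls &&
            i->sub == r->sub && i->prot == r->prot)
            return true;
    }
    return false;
}
```

Because the rule ignores the product ID, a new device from the same vendor that reuses the triple would match without a new table entry, which is the point of replacing device-specific entries with vendor matching.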

Cc: Fangxiaozhi (Franko) <[email protected]>
Cc: Thomas Schäfer <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Shawn J. Goff <[email protected]>
Signed-off-by: Bjørn Mork <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
storm31 referenced this pull request in storm31/android_kernel_samsung_aries Jun 30, 2013
Interface #5 on ZTE MF683 is a QMI/wwan interface.

Signed-off-by: Bjørn Mork <[email protected]>
Cc: stable <[email protected]>
Cc: Shawn J. Goff <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
storm31 referenced this pull request in storm31/android_kernel_samsung_aries Jun 30, 2013
netlink_register_notifier requires notify functions to not sleep.
nfc_stop_poll locks device mutex and must not be called from notifier.
Create workqueue that will handle this for all devices.
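
The deferral pattern can be sketched in userspace C as follows. This is a single-threaded simulation under stated assumptions, not kernel code: `deferred_queue`, `notifier_enqueue()`, `worker_drain()` and `stop_poll_may_sleep()` are hypothetical names, and the real patch uses the kernel workqueue API rather than a hand-rolled array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

/* Hypothetical deferred-work queue: IDs whose polling must be stopped. */
struct deferred_queue {
    unsigned int ids[MAX_PENDING];
    size_t count;
};

/* Counts how many times the "may sleep" operation actually ran. */
unsigned int stops_done;

/* Stand-in for work that may sleep (for example, taking a mutex). */
void stop_poll_may_sleep(unsigned int id)
{
    (void)id;
    stops_done++;
}

/* Called from the notifier: only records the request, never blocks. */
bool notifier_enqueue(struct deferred_queue *q, unsigned int id)
{
    assert(q != NULL);
    if (q->count == MAX_PENDING)
        return false;
    q->ids[q->count++] = id;
    return true;
}

/* Runs later in a context that is allowed to sleep; drains the queue. */
size_t worker_drain(struct deferred_queue *q)
{
    assert(q != NULL);
    size_t handled = q->count;
    for (size_t i = 0; i < q->count; i++)
        stop_poll_may_sleep(q->ids[i]);
    q->count = 0;
    return handled;
}
```

The notifier side does nothing but bookkeeping, so it stays safe in a non-sleeping context; everything that might block is pushed to the drain step.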

BUG: sleeping function called from invalid context at kernel/mutex.c:269
in_atomic(): 0, irqs_disabled(): 0, pid: 4497, name: neard
1 lock held by neard/4497:
Pid: 4497, comm: neard Not tainted 3.5.0-999-nfc+ #5
Call Trace:
[<ffffffff810952c5>] __might_sleep+0x145/0x200
[<ffffffff81743dde>] mutex_lock_nested+0x2e/0x50
[<ffffffff816ffd19>] nfc_stop_poll+0x39/0xb0
[<ffffffff81700a17>] nfc_genl_rcv_nl_event+0x77/0xc0
[<ffffffff8174aa8c>] notifier_call_chain+0x5c/0x120
[<ffffffff8174abd6>] __atomic_notifier_call_chain+0x86/0x140
[<ffffffff8174ab50>] ? notifier_call_chain+0x120/0x120
[<ffffffff815e1347>] ? skb_dequeue+0x67/0x90
[<ffffffff8174aca6>] atomic_notifier_call_chain+0x16/0x20
[<ffffffff8162119a>] netlink_release+0x24a/0x280
[<ffffffff815d7aa8>] sock_release+0x28/0xa0
[<ffffffff815d7be7>] sock_close+0x17/0x30
[<ffffffff811b2a7c>] __fput+0xcc/0x250
[<ffffffff811b2c0e>] ____fput+0xe/0x10
[<ffffffff81085009>] task_work_run+0x69/0x90
[<ffffffff8101b951>] do_notify_resume+0x81/0xd0
[<ffffffff8174ef22>] int_signal+0x12/0x17

Signed-off-by: Szymon Janc <[email protected]>
Signed-off-by: Samuel Ortiz <[email protected]>
storm31 referenced this pull request in storm31/android_kernel_samsung_aries Jun 30, 2013
…aves

P_Key change and guid change events are not of interest to all slaves,
but only to those slaves which "see" the table slots whose contents
have changed.

For example, if the guid at port 1, index 5 has changed in the PPF, we
wish to propagate the gid-change event only to the function which has
that guid index mapped to its port/guid table (in this case it is
slave #5). Other functions should not get the event, since the event
does not affect them.

Similarly with P_Keys -- P_Key change events are forwarded only to
slaves which have that P_Key index mapped to their virtual P_Key table.
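
A minimal userspace sketch of this filtering, assuming each slave keeps a bitmap of the table indexes it has mapped. The `slave_map` layout and `slaves_to_notify()` are hypothetical and only illustrate the selection rule; they are not the driver's data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_SLAVES 8

/* Hypothetical per-slave bitmap: bit i set means GUID index i is mapped. */
struct slave_map {
    uint32_t guid_idx_mask[NUM_SLAVES];
};

/* Returns a bitmask of the slaves that should see a change at changed_idx. */
uint32_t slaves_to_notify(const struct slave_map *m, unsigned int changed_idx)
{
    assert(m != NULL && changed_idx < 32);
    uint32_t targets = 0;
    for (unsigned int s = 0; s < NUM_SLAVES; s++)
        if (m->guid_idx_mask[s] & (UINT32_C(1) << changed_idx))
            targets |= UINT32_C(1) << s;
    return targets;
}
```

The same lookup would apply to P_Key indexes with a second bitmap per slave.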

Signed-off-by: Jack Morgenstein <[email protected]>
Signed-off-by: Roland Dreier <[email protected]>
storm31 referenced this pull request in storm31/android_kernel_samsung_aries Jun 30, 2013
If a qdisc is installed on a bonding device, it's possible to get the
following lockdep splat under stress:

 =============================================
 [ INFO: possible recursive locking detected ]
 3.6.0+ #211 Not tainted
 ---------------------------------------------
 ping/4876 is trying to acquire lock:
  (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+.-...}, at: [<ffffffff8157a191>] dev_queue_xmit+0xe1/0x830

 but task is already holding lock:
  (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+.-...}, at: [<ffffffff8157a191>] dev_queue_xmit+0xe1/0x830

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(dev->qdisc_tx_busylock ?: &qdisc_tx_busylock);
   lock(dev->qdisc_tx_busylock ?: &qdisc_tx_busylock);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 6 locks held by ping/4876:
  #0:  (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff815e5030>] raw_sendmsg+0x600/0xc30
  #1:  (rcu_read_lock_bh){.+....}, at: [<ffffffff815ba4bd>] ip_finish_output+0x12d/0x870
  #2:  (rcu_read_lock_bh){.+....}, at: [<ffffffff8157a0b0>] dev_queue_xmit+0x0/0x830
  #3:  (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+.-...}, at: [<ffffffff8157a191>] dev_queue_xmit+0xe1/0x830
  #4:  (&bond->lock){++.?..}, at: [<ffffffffa02128c1>] bond_start_xmit+0x31/0x4b0 [bonding]
  #5:  (rcu_read_lock_bh){.+....}, at: [<ffffffff8157a0b0>] dev_queue_xmit+0x0/0x830

 stack backtrace:
 Pid: 4876, comm: ping Not tainted 3.6.0+ #211
 Call Trace:
  [<ffffffff810a0145>] __lock_acquire+0x715/0x1b80
  [<ffffffff810a256b>] ? mark_held_locks+0x9b/0x100
  [<ffffffff810a1bf2>] lock_acquire+0x92/0x1d0
  [<ffffffff8157a191>] ? dev_queue_xmit+0xe1/0x830
  [<ffffffff81726b7c>] _raw_spin_lock+0x3c/0x50
  [<ffffffff8157a191>] ? dev_queue_xmit+0xe1/0x830
  [<ffffffff8106264d>] ? rcu_read_lock_bh_held+0x5d/0x90
  [<ffffffff8157a191>] dev_queue_xmit+0xe1/0x830
  [<ffffffff8157a0b0>] ? netdev_pick_tx+0x570/0x570
  [<ffffffffa0212a6a>] bond_start_xmit+0x1da/0x4b0 [bonding]
  [<ffffffff815796d0>] dev_hard_start_xmit+0x240/0x6b0
  [<ffffffff81597c6e>] sch_direct_xmit+0xfe/0x2a0
  [<ffffffff8157a249>] dev_queue_xmit+0x199/0x830
  [<ffffffff8157a0b0>] ? netdev_pick_tx+0x570/0x570
  [<ffffffff815ba96f>] ip_finish_output+0x5df/0x870
  [<ffffffff815ba4bd>] ? ip_finish_output+0x12d/0x870
  [<ffffffff815bb964>] ip_output+0x54/0xf0
  [<ffffffff815bad48>] ip_local_out+0x28/0x90
  [<ffffffff815bc444>] ip_send_skb+0x14/0x50
  [<ffffffff815bc4b2>] ip_push_pending_frames+0x32/0x40
  [<ffffffff815e536a>] raw_sendmsg+0x93a/0xc30
  [<ffffffff8128d570>] ? selinux_file_send_sigiotask+0x1f0/0x1f0
  [<ffffffff8109ddb4>] ? __lock_is_held+0x54/0x80
  [<ffffffff815f6730>] ? inet_recvmsg+0x220/0x220
  [<ffffffff8109ddb4>] ? __lock_is_held+0x54/0x80
  [<ffffffff815f6855>] inet_sendmsg+0x125/0x240
  [<ffffffff815f6730>] ? inet_recvmsg+0x220/0x220
  [<ffffffff8155cddb>] sock_sendmsg+0xab/0xe0
  [<ffffffff810a1650>] ? lock_release_non_nested+0xa0/0x2e0
  [<ffffffff810a1650>] ? lock_release_non_nested+0xa0/0x2e0
  [<ffffffff8155d18c>] __sys_sendmsg+0x37c/0x390
  [<ffffffff81195b2a>] ? fsnotify+0x2ca/0x7e0
  [<ffffffff811958e8>] ? fsnotify+0x88/0x7e0
  [<ffffffff81361f36>] ? put_ldisc+0x56/0xd0
  [<ffffffff8116f98a>] ? fget_light+0x3da/0x510
  [<ffffffff8155f6c4>] sys_sendmsg+0x44/0x80
  [<ffffffff8172fc22>] system_call_fastpath+0x16/0x1b

Avoid this problem using a distinct lock_class_key for bonding
devices.

Signed-off-by: Eric Dumazet <[email protected]>
Cc: Jay Vosburgh <[email protected]>
Cc: Andy Gospodarek <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
storm31 referenced this pull request in storm31/android_kernel_samsung_aries Jun 30, 2013
A rescue thread exiting TASK_INTERRUPTIBLE can lead to a task scheduling
off, never to be seen again.  In the case where this occurred, an exiting
thread hit reiserfs homebrew conditional resched while holding a mutex,
bringing the box to its knees.

PID: 18105  TASK: ffff8807fd412180  CPU: 5   COMMAND: "kdmflush"
 #0 [ffff8808157e7670] schedule at ffffffff8143f489
 #1 [ffff8808157e77b8] reiserfs_get_block at ffffffffa038ab2d [reiserfs]
 #2 [ffff8808157e79a8] __block_write_begin at ffffffff8117fb14
 #3 [ffff8808157e7a98] reiserfs_write_begin at ffffffffa0388695 [reiserfs]
 #4 [ffff8808157e7ad8] generic_perform_write at ffffffff810ee9e2
 #5 [ffff8808157e7b58] generic_file_buffered_write at ffffffff810eeb41
 #6 [ffff8808157e7ba8] __generic_file_aio_write at ffffffff810f1a3a
 #7 [ffff8808157e7c58] generic_file_aio_write at ffffffff810f1c88
 #8 [ffff8808157e7cc8] do_sync_write at ffffffff8114f850
 #9 [ffff8808157e7dd8] do_acct_process at ffffffff810a268f
    [exception RIP: kernel_thread_helper]
    RIP: ffffffff8144a5c0  RSP: ffff8808157e7f58  RFLAGS: 00000202
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: ffffffff8107af60  RDI: ffff8803ee491d18
    RBP: 0000000000000000   R8: 0000000000000000   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000000  R12: 0000000000000000
    R13: 0000000000000000  R14: 0000000000000000  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018

Signed-off-by: Mike Galbraith <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
Cc: [email protected]
storm31 referenced this pull request in storm31/android_kernel_samsung_aries Jun 30, 2013
All boards, except Amstrad E3, mark USB config with __initdata.

As a result, when you compile USB into modules, they will try to refer to
already released platform data and the behaviour is undefined. For example,
on Nokia 770, I get the following kernel panic when modprobing ohci-hcd:

[    3.462158] Unable to handle kernel paging request at virtual address e7fddef0
[    3.477050] pgd = c3434000
[    3.487365] [e7fddef0] *pgd=00000000
[    3.498535] Internal error: Oops: 80000005 [#1] ARM
[    3.510955] Modules linked in: ohci_hcd(+)
[    3.522705] CPU: 0    Not tainted  (3.7.0-770_tiny+ #5)
[    3.535552] PC is at 0xe7fddef0
[    3.546508] LR is at ohci_omap_init+0x5c/0x144 [ohci_hcd]
[    3.560272] pc : [<e7fddef0>]    lr : [<bf003140>]    psr: a0000013
[    3.560272] sp : c344bdb0  ip : c344bce0  fp : c344bdcc
[    3.589782] r10: 00000001  r9 : 00000000  r8 : 00000000
[    3.604553] r7 : 00000026  r6 : 000000de  r5 : c0227300  r4 : c342d620
[    3.621032] r3 : e7fddef0  r2 : c048b880  r1 : 00000000  r0 : 0000000a
[    3.637786] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[    3.655822] Control: 0005317f  Table: 13434000  DAC: 00000015
[    3.672790] Process modprobe (pid: 425, stack limit = 0xc344a1b8)
[    3.690643] Stack: (0xc344bdb0 to 0xc344c000)
[    3.707031] bda0:                                     bf0030e4 c342d620 00000000 c049e62c
[    3.727905] bdc0: c344be04 c344bdd0 c0150ff0 bf0030f4 bf001b88 00000000 c048a4ac c345b020
[    3.748870] bde0: c342d620 00000000 c048a468 bf003968 00000001 bf006000 c344be34 c344be08
[    3.769836] be00: bf001bf0 c0150e48 00000000 c344be18 c00b9bfc c048a478 c048a4ac bf0037f8
[    3.790985] be20: c012ca04 c000e024 c344be44 c344be38 c012d968 bf001a84 c344be64 c344be48
[    3.812164] be40: c012c8ac c012d95c 00000000 c048a478 c048a4ac bf0037f8 c344be84 c344be68
[    3.833740] be60: c012ca74 c012c80c 20000013 00000000 c344be88 bf0037f8 c344beac c344be88
[    3.855468] be80: c012b038 c012ca14 c38093cc c383ee10 bf0037f8 c35be5a0 c049d5e8 00000000
[    3.877166] bea0: c344bebc c344beb0 c012c40c c012aff4 c344beec c344bec0 c012bfc0 c012c3fc
[    3.898834] bec0: bf00378c 00000000 c344beec bf0037f8 00067f39 00000000 00005c44 c000e024
[    3.920837] bee0: c344bf14 c344bef0 c012cd54 c012befc c04ce080 00067f39 00000000 00005c44
[    3.943023] bf00: c000e024 bf006000 c344bf24 c344bf18 c012db14 c012ccc0 c344bf3c c344bf28
[    3.965423] bf20: bf00604c c012dad8 c344a000 bf003834 c344bf7c c344bf40 c00087ac bf006010
[    3.987976] bf40: 0000000f bf003834 00067f39 00000000 00005c44 bf003834 00067f39 00000000
[    4.010711] bf60: 00005c44 c000e024 c344a000 00000000 c344bfa4 c344bf80 c004c35c c0008720
[    4.033569] bf80: c344bfac c344bf90 01422192 01427ea0 00000000 00000080 00000000 c344bfa8
[    4.056518] bfa0: c000dec0 c004c2f0 01422192 01427ea0 01427ea0 00005c44 00067f39 00000000
[    4.079406] bfc0: 01422192 01427ea0 00000000 00000080 b6e11008 014221aa be941fcc b6e1e008
[    4.102569] bfe0: b6ef6300 be941758 0000e93c b6ef6310 60000010 01427ea0 00000000 00000000
[    4.125946] Backtrace:
[    4.143463] [<bf0030e4>] (ohci_omap_init+0x0/0x144 [ohci_hcd]) from [<c0150ff0>] (usb_add_hcd+0x1b8/0x61c)
[    4.183898]  r6:c049e62c r5:00000000 r4:c342d620 r3:bf0030e4
[    4.205596] [<c0150e38>] (usb_add_hcd+0x0/0x61c) from [<bf001bf0>] (ohci_hcd_omap_drv_probe+0x17c/0x224 [ohci_hcd])
[    4.248138] [<bf001a74>] (ohci_hcd_omap_drv_probe+0x0/0x224 [ohci_hcd]) from [<c012d968>] (platform_drv_probe+0x1c/0x20)
[    4.292144]  r8:c000e024 r7:c012ca04 r6:bf0037f8 r5:c048a4ac r4:c048a478
[    4.316192] [<c012d94c>] (platform_drv_probe+0x0/0x20) from [<c012c8ac>] (driver_probe_device+0xb0/0x208)
[    4.360168] [<c012c7fc>] (driver_probe_device+0x0/0x208) from [<c012ca74>] (__driver_attach+0x70/0x94)
[    4.405548]  r6:bf0037f8 r5:c048a4ac r4:c048a478 r3:00000000
[    4.429809] [<c012ca04>] (__driver_attach+0x0/0x94) from [<c012b038>] (bus_for_each_dev+0x54/0x90)
[    4.475708]  r6:bf0037f8 r5:c344be88 r4:00000000 r3:20000013
[    4.500366] [<c012afe4>] (bus_for_each_dev+0x0/0x90) from [<c012c40c>] (driver_attach+0x20/0x28)
[    4.528442]  r7:00000000 r6:c049d5e8 r5:c35be5a0 r4:bf0037f8
[    4.553466] [<c012c3ec>] (driver_attach+0x0/0x28) from [<c012bfc0>] (bus_add_driver+0xd4/0x228)
[    4.581878] [<c012beec>] (bus_add_driver+0x0/0x228) from [<c012cd54>] (driver_register+0xa4/0x134)
[    4.629730]  r8:c000e024 r7:00005c44 r6:00000000 r5:00067f39 r4:bf0037f8
[    4.656738] [<c012ccb0>] (driver_register+0x0/0x134) from [<c012db14>] (platform_driver_register+0x4c/0x60)
[    4.706542] [<c012dac8>] (platform_driver_register+0x0/0x60) from [<bf00604c>] (ohci_hcd_mod_init+0x4c/0x8c [ohci_hcd])
[    4.757843] [<bf006000>] (ohci_hcd_mod_init+0x0/0x8c [ohci_hcd]) from [<c00087ac>] (do_one_initcall+0x9c/0x174)
[    4.808990]  r4:bf003834 r3:c344a000
[    4.832641] [<c0008710>] (do_one_initcall+0x0/0x174) from [<c004c35c>] (sys_init_module+0x7c/0x194)
[    4.881530] [<c004c2e0>] (sys_init_module+0x0/0x194) from [<c000dec0>] (ret_fast_syscall+0x0/0x2c)
[    4.930664]  r7:00000080 r6:00000000 r5:01427ea0 r4:01422192
[    4.956481] Code: bad PC value
[    4.978729] ---[ end trace 58280240f08342c4 ]---
[    5.002258] Kernel panic - not syncing: Fatal exception

Fix this by taking a copy of the data. Also mark Amstrad E3's data with
__initdata to save some memory with multi-board kernels.
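
The "take a copy" fix can be illustrated in userspace C. This is a sketch under assumptions: `usb_config`, its fields, and `usb_register_config()` are hypothetical stand-ins, and the poisoned buffer in the usage note only imitates init memory being released.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical USB board configuration. */
struct usb_config {
    unsigned int hmc_mode;
    unsigned int pins;
};

/* Persistent storage that stays valid after init memory is released. */
static struct usb_config saved_config;

/* Copy the caller's (possibly init-only) data instead of keeping its pointer. */
const struct usb_config *usb_register_config(const struct usb_config *init_cfg)
{
    assert(init_cfg != NULL);
    memcpy(&saved_config, init_cfg, sizeof(saved_config));
    return &saved_config;
}
```

If the caller's structure is later overwritten or freed, a module that reads the returned pointer still sees the original values, whereas a stored pointer to the init data would dangle.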

Signed-off-by: Aaro Koskinen <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
storm31 referenced this pull request in storm31/android_kernel_samsung_aries Jun 30, 2013
Pursuant to this review https://lkml.org/lkml/2012/11/12/500
by Stefan Richter, update the TODO file.
- Clarify purpose of TODO file
- Remove firewire item #4. As discussed in this conversation
  https://lkml.org/lkml/2012/11/13/564 knowing the AR buffer size
  is not a hard requirement. The required rx buffer size can be
  determined experimentally.
- Remove firewire item #5. This was a private note for further
  experimentation.
- Change firewire item #1. Change suggested header from uapi header
  to kernel-only header.

Signed-off-by: Peter Hurley <[email protected]>
Acked-by: Stefan Richter <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
storm31 referenced this pull request in storm31/android_kernel_samsung_aries Jun 30, 2013
The following lines of code produce a kernel oops.

fd = socket(PF_FILE, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0);
fchmod(fd, 0666);

[  139.922364] BUG: unable to handle kernel NULL pointer dereference at   (null)
[  139.924982] IP: [<  (null)>]   (null)
[  139.924982] *pde = 00000000
[  139.924982] Oops: 0000 [#5] SMP
[  139.924982] Modules linked in: fuse dm_crypt dm_mod i2c_piix4 serio_raw evdev binfmt_misc button
[  139.924982] Pid: 3070, comm: acpid Tainted: G      D      3.8.0-rc2-kds+ #465 Bochs Bochs
[  139.924982] EIP: 0060:[<00000000>] EFLAGS: 00010246 CPU: 0
[  139.924982] EIP is at 0x0
[  139.924982] EAX: cf5ef000 EBX: cf5ef000 ECX: c143d600 EDX: c15225f2
[  139.924982] ESI: cf4d2a1c EDI: cf4d2a1c EBP: cc02df10 ESP: cc02dee4
[  139.924982]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  139.924982] CR0: 80050033 CR2: 00000000 CR3: 0c059000 CR4: 000006d0
[  139.924982] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[  139.924982] DR6: ffff0ff0 DR7: 00000400
[  139.924982] Process acpid (pid: 3070, ti=cc02c000 task=d7705340 task.ti=cc02c000)
[  139.924982] Stack:
[  139.924982]  c1203c88 00000000 cc02def4 cf4d2a1c ae21eefa 471b60d5 1083c1ba c26a5940
[  139.924982]  e891fb5e 00000041 00000004 cc02df1c c1203964 00000000 cc02df4c c10e20c3
[  139.924982]  00000002 00000000 00000000 22222222 c1ff2222 cf5ef000 00000000 d76efb08
[  139.924982] Call Trace:
[  139.924982]  [<c1203c88>] ? evm_update_evmxattr+0x5b/0x62
[  139.924982]  [<c1203964>] evm_inode_post_setattr+0x22/0x26
[  139.924982]  [<c10e20c3>] notify_change+0x25f/0x281
[  139.924982]  [<c10cbf56>] chmod_common+0x59/0x76
[  139.924982]  [<c10e27a1>] ? put_unused_fd+0x33/0x33
[  139.924982]  [<c10cca09>] sys_fchmod+0x39/0x5c
[  139.924982]  [<c13f4f30>] syscall_call+0x7/0xb
[  139.924982] Code:  Bad EIP value.

This happens because sockets do not define the removexattr operation.
Before removing the xattr, verify the removexattr function pointer is
not NULL.
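
The guard can be sketched in userspace C as below. The `inode_ops` table, `safe_removexattr()`, `demo_removexattr()` and the choice of `-EOPNOTSUPP` as the return value are assumptions made for illustration; the commit message does not say what the real code returns when the operation is missing.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical operations table; some object types leave removexattr unset. */
struct inode_ops {
    int (*removexattr)(const char *name);
};

/* Example callback used only for demonstration. */
int demo_removexattr(const char *name)
{
    (void)name;
    return 0;
}

/* Returns an error instead of calling through a NULL function pointer. */
int safe_removexattr(const struct inode_ops *ops, const char *name)
{
    assert(ops != NULL);
    if (ops->removexattr == NULL)
        return -EOPNOTSUPP;
    return ops->removexattr(name);
}
```

An operations table with the pointer left NULL, as for sockets in the report above, now yields an error code rather than a jump to address zero.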

Signed-off-by: Dmitry Kasatkin <[email protected]>
Signed-off-by: Mimi Zohar <[email protected]>
Cc: [email protected]
Signed-off-by: James Morris <[email protected]>
JackpotClavin referenced this pull request in JackpotClavin/android_kernel_samsung_venturi Jul 28, 2013
commit f7a1dd6 upstream.

The reason for this patch is a crash in kmemdup
caused by returning from get_callid with uninitialized
matchoff and matchlen.

Remove the zero check of matchlen since it's done by ct_sip_get_header().
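
The bug class can be shown with a small userspace C sketch. Everything here is hypothetical: `find_header()` merely stands in for a lookup that fills its output parameters only on success, and this `get_callid()` is an illustration written for this note, not the kernel function of the same name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical header lookup: returns 1 and sets *off and *len on success,
 * returns 0 and leaves them untouched otherwise. Never reports length 0.
 */
int find_header(const char *msg, const char *name, size_t *off, size_t *len)
{
    assert(msg && name && off && len);
    const char *p = strstr(msg, name);
    if (p == NULL)
        return 0;
    p += strlen(name);
    size_t n = strcspn(p, "\r\n");
    if (n == 0)
        return 0;
    *off = (size_t)(p - msg);
    *len = n;
    return 1;
}

/* Initialize the outputs and report failure instead of returning garbage. */
int get_callid(const char *msg, size_t *off, size_t *len)
{
    *off = 0;
    *len = 0;
    if (!find_header(msg, "Call-ID: ", off, len))
        return -1;
    return 0;
}
```

A caller that checks the return value before copying `len` bytes from `msg + off` never acts on stale stack contents, and the separate zero-length check becomes redundant because the lookup already rejects empty values.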

BUG: unable to handle kernel paging request at ffff880457b5763f
IP: [<ffffffff810df7fc>] kmemdup+0x2e/0x35
PGD 27f6067 PUD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: xt_state xt_helper nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_mangle xt_connmark xt_conntrack ip6_tables nf_conntrack_ftp ip_vs_ftp nf_nat xt_tcpudp iptable_mangle xt_mark ip_tables x_tables ip_vs_rr ip_vs_lblcr ip_vs_pe_sip ip_vs nf_conntrack_sip nf_conntrack bonding igb i2c_algo_bit i2c_core
CPU 5
Pid: 0, comm: swapper/5 Not tainted 3.9.0-rc5+ #5                  /S1200KP
RIP: 0010:[<ffffffff810df7fc>]  [<ffffffff810df7fc>] kmemdup+0x2e/0x35
RSP: 0018:ffff8803fea03648  EFLAGS: 00010282
RAX: ffff8803d61063e0 RBX: 0000000000000003 RCX: 0000000000000003
RDX: 0000000000000003 RSI: ffff880457b5763f RDI: ffff8803d61063e0
RBP: ffff8803fea03658 R08: 0000000000000008 R09: 0000000000000011
R10: 0000000000000011 R11: 00ffffffff81a8a3 R12: ffff880457b5763f
R13: ffff8803d67f786a R14: ffff8803fea03730 R15: ffffffffa0098e90
FS:  0000000000000000(0000) GS:ffff8803fea00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff880457b5763f CR3: 0000000001a0c000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper/5 (pid: 0, threadinfo ffff8803ee18c000, task ffff8803ee18a480)
Stack:
 ffff8803d822a080 000000000000001c ffff8803fea036c8 ffffffffa000937a
 ffffffff81f0d8a0 000000038135fdd5 ffff880300000014 ffff880300110000
 ffffffff150118ac ffff8803d7e8a000 ffff88031e0118ac 0000000000000000
Call Trace:
 <IRQ>

 [<ffffffffa000937a>] ip_vs_sip_fill_param+0x13a/0x187 [ip_vs_pe_sip]
 [<ffffffffa007b209>] ip_vs_sched_persist+0x2c6/0x9c3 [ip_vs]
 [<ffffffff8107dc53>] ? __lock_acquire+0x677/0x1697
 [<ffffffff8100972e>] ? native_sched_clock+0x3c/0x7d
 [<ffffffff8100972e>] ? native_sched_clock+0x3c/0x7d
 [<ffffffff810649bc>] ? sched_clock_cpu+0x43/0xcf
 [<ffffffffa007bb1e>] ip_vs_schedule+0x181/0x4ba [ip_vs]
...

Signed-off-by: Hans Schillstrom <[email protected]>
Acked-by: Julian Anastasov <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Cc: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
humberos referenced this pull request in humberos/android_kernel_samsung_aries Aug 15, 2013
commit ea3768b upstream.

We used to keep the port's char device structs and the /sys entries
around till the last reference to the port was dropped.  This is
actually unnecessary, and resulted in buggy behaviour:

1. Open port in guest
2. Hot-unplug port
3. Hot-plug a port with the same 'name' property as the unplugged one

This resulted in hot-plug being unsuccessful, as a port with the same
name already exists (even though it was unplugged).

This behaviour resulted in a warning message like this one:

-------------------8<---------------------------------------
WARNING: at fs/sysfs/dir.c:512 sysfs_add_one+0xc9/0x130() (Not tainted)
Hardware name: KVM
sysfs: cannot create duplicate filename
'/devices/pci0000:00/0000:00:04.0/virtio0/virtio-ports/vport0p1'

Call Trace:
 [<ffffffff8106b607>] ? warn_slowpath_common+0x87/0xc0
 [<ffffffff8106b6f6>] ? warn_slowpath_fmt+0x46/0x50
 [<ffffffff811f2319>] ? sysfs_add_one+0xc9/0x130
 [<ffffffff811f23e8>] ? create_dir+0x68/0xb0
 [<ffffffff811f2469>] ? sysfs_create_dir+0x39/0x50
 [<ffffffff81273129>] ? kobject_add_internal+0xb9/0x260
 [<ffffffff812733d8>] ? kobject_add_varg+0x38/0x60
 [<ffffffff812734b4>] ? kobject_add+0x44/0x70
 [<ffffffff81349de4>] ? get_device_parent+0xf4/0x1d0
 [<ffffffff8134b389>] ? device_add+0xc9/0x650

-------------------8<---------------------------------------

Instead of relying on guest applications to release all references to
the ports, we should go ahead and unregister the port from all the core
layers.  Any open/read calls on the port will then just return errors,
and an unplug/plug operation on the host will succeed as expected.

This also caused buggy behaviour on device removal (not just
a port): when the device was removed (which means all ports on that
device are removed automatically as well), the ports with active
users would clean up only when the last references were dropped -- and
it would be too late then to be referencing char device pointers,
resulting in oopses:

-------------------8<---------------------------------------
PID: 6162   TASK: ffff8801147ad500  CPU: 0   COMMAND: "cat"
 #0 [ffff88011b9d5a90] machine_kexec at ffffffff8103232b
 #1 [ffff88011b9d5af0] crash_kexec at ffffffff810b9322
 #2 [ffff88011b9d5bc0] oops_end at ffffffff814f4a50
 #3 [ffff88011b9d5bf0] die at ffffffff8100f26b
 #4 [ffff88011b9d5c20] do_general_protection at ffffffff814f45e2
 #5 [ffff88011b9d5c50] general_protection at ffffffff814f3db5
    [exception RIP: strlen+2]
    RIP: ffffffff81272ae2  RSP: ffff88011b9d5d00  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: ffff880118901c18  RCX: 0000000000000000
    RDX: ffff88011799982c  RSI: 00000000000000d0  RDI: 3a303030302f3030
    RBP: ffff88011b9d5d38   R8: 0000000000000006   R9: ffffffffa0134500
    R10: 0000000000001000  R11: 0000000000001000  R12: ffff880117a1cc10
    R13: 00000000000000d0  R14: 0000000000000017  R15: ffffffff81aff700
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #6 [ffff88011b9d5d00] kobject_get_path at ffffffff8126dc5d
 #7 [ffff88011b9d5d40] kobject_uevent_env at ffffffff8126e551
 #8 [ffff88011b9d5dd0] kobject_uevent at ffffffff8126e9eb
 #9 [ffff88011b9d5de0] device_del at ffffffff813440c7

-------------------8<---------------------------------------

So clean up when we have all the context, and all that's left to do when
the references to the port have dropped is to free up the port struct
itself.
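
The lifetime split described here (unregister the name at unplug time, free the struct only when the last reference goes away) can be sketched in user-space C. This is a simplified illustration, not the virtio_console code: struct port, the registry array and all port_*() functions below are hypothetical, and the array merely stands in for the sysfs and char device namespaces.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PORTS 4

/* Hypothetical port object; only the fields the sketch needs. */
struct port {
    char name[16];
    int refs;          /* one ref for the registry, one per open handle */
    bool unplugged;    /* set once the port is hot-unplugged */
};

/* Stand-in for the namespaces in which a port name must be unique. */
static struct port *registry[MAX_PORTS];

/* Register a new port; fails if the name is taken or the table is full. */
static struct port *port_plug(const char *name)
{
    struct port *p;
    int i, slot = -1;

    for (i = 0; i < MAX_PORTS; i++) {
        if (registry[i] &&
            strncmp(registry[i]->name, name, sizeof(p->name) - 1) == 0)
            return NULL;                /* duplicate name */
        if (!registry[i] && slot < 0)
            slot = i;
    }
    if (slot < 0)
        return NULL;

    p = calloc(1, sizeof(*p));          /* zeroed, so name stays terminated */
    if (!p)
        return NULL;
    strncpy(p->name, name, sizeof(p->name) - 1);
    p->refs = 1;                        /* reference held by the registry */
    registry[slot] = p;
    return p;
}

/* Drop one reference; the struct is freed only when the last one goes. */
static void port_put(struct port *p)
{
    assert(p->refs > 0);
    if (--p->refs == 0)
        free(p);
}

/* Open takes a reference, but refuses ports that are already gone. */
static int port_open(struct port *p)
{
    if (p->unplugged)
        return -ENODEV;
    p->refs++;
    return 0;
}

/* Reads on an unplugged port just return an error. */
static int port_read(struct port *p)
{
    return p->unplugged ? -ENODEV : 0;
}

/* Unplug: release the name immediately, even if users still hold refs. */
static void port_unplug(struct port *p)
{
    int i;

    for (i = 0; i < MAX_PORTS; i++)
        if (registry[i] == p)
            registry[i] = NULL;
    p->unplugged = true;
    port_put(p);                        /* drop the registry's reference */
}
```

With this shape, a port that is still held open can be unplugged and a new port with the same name plugged in right away; the old struct lingers only until its last user calls port_put().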

Reported-by: chayang <[email protected]>
Reported-by: YOGANANTH SUBRAMANIAN <[email protected]>
Reported-by: FuXiangChun <[email protected]>
Reported-by: Qunfang Zhang <[email protected]>
Reported-by: Sibiao Luo <[email protected]>
Signed-off-by: Amit Shah <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
This pull request was closed.