Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix remoteproc to work with the PRU GNU Binutils port #96

Closed
wants to merge 137 commits into from

Conversation

dinuxbg
Copy link

@dinuxbg dinuxbg commented Jul 24, 2016

PRU IRAM addresses need to be masked before being handled to
remoteproc. This is due to PRU Binutils' lack of separate address
spaces for IRAM and DRAM.

Signed-off-by: Dimitar Dimitrov [email protected]


I tested this using bone-debian-8.5-iot-armhf-2016-06-19-4gb.img.xz (kernel 4.4.15-ti-r37).

RobertCNelson and others added 30 commits July 19, 2016 09:41
Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
… is defined

'owner' member of 'struct mutex' is defined as below
in 'include/linux/mutex.h':

struct mutex {
...
if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_MUTEX_SPIN_ON_OWNER)
        struct task_struct      *owner;
endif
...

But function au_pin_hdir_set_owner() called owner as below:

 void au_pin_hdir_set_owner(struct au_pin *p, struct task_struct *task)
 {
if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_SMP)
        p->hdir->hi_inode->i_mutex.owner = task;
endif
 }

So if Kernel doesn't define 'DEBUG_MUTEXES' and 'MUTEX_SPIN_ON_OWNER',
but defines SMP, compiler will report the below error:

fs/aufs/i_op.c: In function 'au_pin_hdir_set_owner':
fs/aufs/i_op.c:593:28: error: 'struct mutex' has no member named 'owner'
  p->hdir->hi_inode->i_mutex.owner = task;
                            ^

Signed-off-by: Yanjiang Jin <[email protected]>
Signed-off-by: Bruce Ashfield <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Beyond the warning:

 drivers/tty/serial/8250/8250.c:1613:6: warning: unused variable ‘pass_counter’ [-Wunused-variable]

the solution of just looping infinitely was ugly - up it to 1 million to
give it a chance to continue in some really ugly situation.

Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Adds support for using a OMAP dual-mode timer with PWM capability
as a Linux PWM device. The driver controls the timer by using the
dmtimer API.

Add a platform_data structure for each pwm-omap-dmtimer nodes containing
the dmtimers functions in order to get driver not rely on platform
specific functions.

Cc: Grant Erickson <[email protected]>
Cc: NeilBrown <[email protected]>
Cc: Joachim Eastwood <[email protected]>
Suggested-by: Tony Lindgren <[email protected]>
Signed-off-by: Neil Armstrong <[email protected]>
Acked-by: Tony Lindgren <[email protected]>
[[email protected]: coding style bikeshed, fix timer leak]
Signed-off-by: Thierry Reding <[email protected]>
"omap" is NULL so we can't dereference it.

Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
In order to set the currently platform dependent dmtimer
functions pointers as platform data for the pwm-omap-dmtimer
platform driver, add it to plat-omap auxdata_lookup table.

Suggested-by: Tony Lindgren <[email protected]>
Signed-off-by: Neil Armstrong <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Fix the calculation of load_value and match_value. Currently they
are slightly too low, which produces a noticeably wrong PWM rate with
sufficiently short periods (i.e. when 1/period approaches clk_rate/2).

Example:
 clk_rate=32768Hz, period=122070ns, duty_cycle=61035ns (8192Hz/50% PWM)
 Correct values: load = 0xfffffffc, match = 0xfffffffd
 Current values: load = 0xfffffffa, match = 0xfffffffc
 effective PWM: period=183105ns, duty_cycle=91553ns (5461Hz/50% PWM)

Fixes: 6604c65 ("pwm: Add PWM driver for OMAP using dual-mode timers")
Signed-off-by: David Rivshin <[email protected]>
Acked-by: Neil Armstrong <[email protected]>
Tested-by: Adam Ford <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
Add sanity checking to ensure that we do not program load or match values
that are out of range if a user requests period or duty_cycle values which
are not achievable. The match value cannot be less than the load value (but
can be equal), and neither can be 0xffffffff. This means that there must be
at least one fclk cycle between load and match, and another between match
and overflow.

Fixes: 6604c65 ("pwm: Add PWM driver for OMAP using dual-mode timers")
Signed-off-by: David Rivshin <[email protected]>
Acked-by: Neil Armstrong <[email protected]>
[[email protected]: minor coding style cleanups]
Signed-off-by: Thierry Reding <[email protected]>
When converting period and duty_cycle from nanoseconds to fclk cycles,
the error introduced by the integer division can be appreciable, especially
in the case of slow fclk or short period. Use DIV_ROUND_CLOSEST_ULL() so
that the error is kept to +/- 0.5 clock cycles.

Fixes: 6604c65 ("pwm: Add PWM driver for OMAP using dual-mode timers")
Signed-off-by: David Rivshin <[email protected]>
Acked-by: Neil Armstrong <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
After going through the math and constraints checking to compute load
and match values, it is helpful to know what the resultant period and
duty cycle are.

Signed-off-by: David Rivshin <[email protected]>
Acked-by: Neil Armstrong <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
… there is also a hidden SSID in the scan list

Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Use kobj_to_dev() instead of open-coding it.

Signed-off-by: Geliang Tang <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
RobertCNelson and others added 15 commits July 19, 2016 09:41
Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
I have encountered the same issue(s) on A6A boards.

I couldn't find a patch,  so I wrote this patch to update the device tree
in the davinci_mdio driver in the 3.15.1 tree, it seems to correct it. I
would welcome any input on a different approach.

https://groups.google.com/d/msg/beagleboard/9mctrG26Mc8/SRlnumt0LoMJ

v4.1-rcX: added hack around CONFIG_OF_OVERLAY
v4.2-rc3+: added if (of_machine_is_compatible("ti,am335x-bone")) so we do
not break dual ethernet am335x devices

Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Commit 520bd7a ("mmc: core: Optimize boot time by detecting cards
simultaneously") causes regressions for some platforms.

These platforms relies on fixed mmcblk device indexes, instead of
deploying the defacto standard with UUID/PARTUUID. In other words their
rootfs needs to be available at hardcoded paths, like /dev/mmcblk0p2.

Such guarantees have never been made by the kernel, but clearly the above
commit changes the behaviour. More precisely, because of that the order
changes of how cards becomes detected, so do their corresponding mmcblk
device indexes.

As the above commit significantly improves boot time for some platforms
(magnitude of seconds), let's avoid reverting this change but instead
restore the behaviour of how mmcblk device indexes becomes picked.

By using the same index for the mmcblk device as for the corresponding mmc
host device, the probe order of mmc host devices decides the index we get
for the mmcblk device.

For those platforms that suffers from a regression, one could expect that
this updated behaviour should be sufficient to meet their expectations of
"fixed" mmcblk device indexes.

Another side effect from this change, is that the same index is used for
the mmc host device, the mmcblk device and the mmc block queue. That
should clarify their relationship.

Reported-by: Peter Hurley <[email protected]>
Reported-by: Laszlo Fiat <[email protected]>
Cc: Linus Torvalds <[email protected]>
Fixes: 520bd7a ("mmc: core: Optimize boot time by detecting cards
simultaneously")
Cc: <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>
PRU IRAM addresses need to be masked before being handled to
remoteproc. This is due to PRU Binutils' lack of separate address
spaces for IRAM and DRAM.

Signed-off-by: Dimitar Dimitrov <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Sep 8, 2016
commit 088bf2f upstream.

seq_read() is a nasty piece of work, not to mention buggy.

It has (I think) an old bug which allows unprivileged userspace to read
beyond the end of m->buf.

I was getting these:

    BUG: KASAN: slab-out-of-bounds in seq_read+0xcd2/0x1480 at addr ffff880116889880
    Read of size 2713 by task trinity-c2/1329
    CPU: 2 PID: 1329 Comm: trinity-c2 Not tainted 4.8.0-rc1+ #96
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
    Call Trace:
      kasan_object_err+0x1c/0x80
      kasan_report_error+0x2cb/0x7e0
      kasan_report+0x4e/0x80
      check_memory_region+0x13e/0x1a0
      kasan_check_read+0x11/0x20
      seq_read+0xcd2/0x1480
      proc_reg_read+0x10b/0x260
      do_loop_readv_writev.part.5+0x140/0x2c0
      do_readv_writev+0x589/0x860
      vfs_readv+0x7b/0xd0
      do_readv+0xd8/0x2c0
      SyS_readv+0xb/0x10
      do_syscall_64+0x1b3/0x4b0
      entry_SYSCALL64_slow_path+0x25/0x25
    Object at ffff880116889100, in cache kmalloc-4096 size: 4096
    Allocated:
    PID = 1329
      save_stack_trace+0x26/0x80
      save_stack+0x46/0xd0
      kasan_kmalloc+0xad/0xe0
      __kmalloc+0x1aa/0x4a0
      seq_buf_alloc+0x35/0x40
      seq_read+0x7d8/0x1480
      proc_reg_read+0x10b/0x260
      do_loop_readv_writev.part.5+0x140/0x2c0
      do_readv_writev+0x589/0x860
      vfs_readv+0x7b/0xd0
      do_readv+0xd8/0x2c0
      SyS_readv+0xb/0x10
      do_syscall_64+0x1b3/0x4b0
      return_from_SYSCALL_64+0x0/0x6a
    Freed:
    PID = 0
    (stack is not available)
    Memory state around the buggy address:
     ffff88011688a000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     ffff88011688a080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    >ffff88011688a100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
		       ^
     ffff88011688a180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
     ffff88011688a200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ==================================================================
    Disabling lock debugging due to kernel taint

This seems to be the same thing that Dave Jones was seeing here:

  https://lkml.org/lkml/2016/8/12/334

There are multiple issues here:

  1) If we enter the function with a non-empty buffer, there is an attempt
     to flush it. But it was not clearing m->from after doing so, which
     means that if we try to do this flush twice in a row without any call
     to traverse() in between, we are going to be reading from the wrong
     place -- the splat above, fixed by this patch.

  2) If there's a short write to userspace because of page faults, the
     buffer may already contain multiple lines (i.e. pos has advanced by
     more than 1), but we don't save the progress that was made so the
     next call will output what we've already returned previously. Since
     that is a much less serious issue (and I have a headache after
     staring at seq_read() for the past 8 hours), I'll leave that for now.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Vegard Nossum <[email protected]>
Reported-by: Dave Jones <[email protected]>
Cc: Al Viro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Sep 8, 2016
commit 088bf2f upstream.

seq_read() is a nasty piece of work, not to mention buggy.

It has (I think) an old bug which allows unprivileged userspace to read
beyond the end of m->buf.

I was getting these:

    BUG: KASAN: slab-out-of-bounds in seq_read+0xcd2/0x1480 at addr ffff880116889880
    Read of size 2713 by task trinity-c2/1329
    CPU: 2 PID: 1329 Comm: trinity-c2 Not tainted 4.8.0-rc1+ #96
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
    Call Trace:
      kasan_object_err+0x1c/0x80
      kasan_report_error+0x2cb/0x7e0
      kasan_report+0x4e/0x80
      check_memory_region+0x13e/0x1a0
      kasan_check_read+0x11/0x20
      seq_read+0xcd2/0x1480
      proc_reg_read+0x10b/0x260
      do_loop_readv_writev.part.5+0x140/0x2c0
      do_readv_writev+0x589/0x860
      vfs_readv+0x7b/0xd0
      do_readv+0xd8/0x2c0
      SyS_readv+0xb/0x10
      do_syscall_64+0x1b3/0x4b0
      entry_SYSCALL64_slow_path+0x25/0x25
    Object at ffff880116889100, in cache kmalloc-4096 size: 4096
    Allocated:
    PID = 1329
      save_stack_trace+0x26/0x80
      save_stack+0x46/0xd0
      kasan_kmalloc+0xad/0xe0
      __kmalloc+0x1aa/0x4a0
      seq_buf_alloc+0x35/0x40
      seq_read+0x7d8/0x1480
      proc_reg_read+0x10b/0x260
      do_loop_readv_writev.part.5+0x140/0x2c0
      do_readv_writev+0x589/0x860
      vfs_readv+0x7b/0xd0
      do_readv+0xd8/0x2c0
      SyS_readv+0xb/0x10
      do_syscall_64+0x1b3/0x4b0
      return_from_SYSCALL_64+0x0/0x6a
    Freed:
    PID = 0
    (stack is not available)
    Memory state around the buggy address:
     ffff88011688a000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     ffff88011688a080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    >ffff88011688a100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
		       ^
     ffff88011688a180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
     ffff88011688a200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ==================================================================
    Disabling lock debugging due to kernel taint

This seems to be the same thing that Dave Jones was seeing here:

  https://lkml.org/lkml/2016/8/12/334

There are multiple issues here:

  1) If we enter the function with a non-empty buffer, there is an attempt
     to flush it. But it was not clearing m->from after doing so, which
     means that if we try to do this flush twice in a row without any call
     to traverse() in between, we are going to be reading from the wrong
     place -- the splat above, fixed by this patch.

  2) If there's a short write to userspace because of page faults, the
     buffer may already contain multiple lines (i.e. pos has advanced by
     more than 1), but we don't save the progress that was made so the
     next call will output what we've already returned previously. Since
     that is a much less serious issue (and I have a headache after
     staring at seq_read() for the past 8 hours), I'll leave that for now.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Vegard Nossum <[email protected]>
Reported-by: Dave Jones <[email protected]>
Cc: Al Viro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
@dinuxbg
Copy link
Author

dinuxbg commented Sep 10, 2016

Patch appears to have been merged in 4.4.20-ti-rt-r43. Thanks.

@dinuxbg dinuxbg closed this Sep 10, 2016
RobertCNelson pushed a commit that referenced this pull request Jun 12, 2018
[ Upstream commit f8ba22a ]

Pipe clock comes out of the phy and is available as long as
the phy is turned on. Clock controller fails to gate this
clock after the phy is turned off and generates a warning.

/ # [   33.048561] gcc_usb3_phy_pipe_clk status stuck at 'on'
[   33.048585] ------------[ cut here ]------------
[   33.052621] WARNING: CPU: 1 PID: 18 at ../drivers/clk/qcom/clk-branch.c:97 clk_branch_wait+0xf0/0x108
[   33.057384] Modules linked in:
[   33.066497] CPU: 1 PID: 18 Comm: kworker/1:0 Tainted: G        W       4.12.0-rc7-00024-gfe926e34c36d-dirty #96
[   33.069451] Hardware name: Qualcomm Technologies, Inc. DB820c (DT)
...
[   33.278565] [<ffff00000849b27c>] clk_branch_wait+0xf0/0x108
[   33.286375] [<ffff00000849b2f4>] clk_branch2_disable+0x28/0x34
[   33.291761] [<ffff0000084868dc>] clk_core_disable+0x5c/0x88
[   33.297660] [<ffff000008487d68>] clk_core_disable_lock+0x20/0x34
[   33.303129] [<ffff000008487d98>] clk_disable+0x1c/0x24
[   33.309384] [<ffff0000083ccd78>] qcom_qmp_phy_poweroff+0x20/0x48
[   33.314328] [<ffff0000083c53f4>] phy_power_off+0x80/0xdc
[   33.320492] [<ffff00000875c950>] dwc3_core_exit+0x94/0xa0
[   33.325784] [<ffff00000875c9ac>] dwc3_suspend_common+0x50/0x60
[   33.331080] [<ffff00000875ca04>] dwc3_runtime_suspend+0x48/0x6c
[   33.336810] [<ffff0000085b82f4>] pm_generic_runtime_suspend+0x28/0x38
[   33.342627] [<ffff0000085bace0>] __rpm_callback+0x150/0x254
[   33.349222] [<ffff0000085bae08>] rpm_callback+0x24/0x78
[   33.354604] [<ffff0000085b9fd8>] rpm_suspend+0xe0/0x4e4
[   33.359813] [<ffff0000085bb784>] pm_runtime_work+0xdc/0xf0
[   33.365028] [<ffff0000080d7b30>] process_one_work+0x12c/0x28c
[   33.370576] [<ffff0000080d7ce8>] worker_thread+0x58/0x3b8
[   33.376393] [<ffff0000080dd4a8>] kthread+0x100/0x12c
[   33.381776] [<ffff0000080836c0>] ret_from_fork+0x10/0x50

Fix this by disabling it as the first thing in phy_exit().

Fixes: e78f3d1 ("phy: qcom-qmp: new qmp phy driver for qcom-chipsets")
Signed-off-by: Vivek Gautam <[email protected]>
Signed-off-by: Manu Gautam <[email protected]>
Signed-off-by: Kishon Vijay Abraham I <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
crow-misia pushed a commit to crow-misia/linux that referenced this pull request Mar 24, 2019
commit f16eb8a upstream.

If SSDT overlay is loaded via ConfigFS and then unloaded the device,
we would like to have OF modalias for, already gone. Thus, acpi_get_name()
returns no allocated buffer for such case and kernel crashes afterwards:

 ACPI: Host-directed Dynamic ACPI Table Unload
 ads7950 spi-PRP0001:00: Dropping the link to regulator.0
 BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
 #PF error: [normal kernel read fault]
 PGD 80000000070d6067 P4D 80000000070d6067 PUD 70d0067 PMD 0
 Oops: 0000 [beagleboard#1] SMP PTI
 CPU: 0 PID: 40 Comm: kworker/u4:2 Not tainted 5.0.0+ beagleboard#96
 Hardware name: Intel Corporation Merrifield/BODEGA BAY, BIOS 542 2015.01.21:18.19.48
 Workqueue: kacpi_hotplug acpi_device_del_work_fn
 RIP: 0010:create_of_modalias.isra.1+0x4c/0x150
 Code: 00 00 48 89 44 24 18 31 c0 48 8d 54 24 08 48 c7 44 24 10 00 00 00 00 48 c7 44 24 08 ff ff ff ff e8 7a b0 03 00 48 8b 4c 24 10 <0f> b6 01 84 c0 74 27 48 c7 c7 00 09 f4 a5 0f b6 f0 8d 50 20 f6 04
 RSP: 0000:ffffa51040297c10 EFLAGS: 00010246
 RAX: 0000000000001001 RBX: 0000000000000785 RCX: 0000000000000000
 RDX: 0000000000001001 RSI: 0000000000000286 RDI: ffffa2163dc042e0
 RBP: ffffa216062b1196 R08: 0000000000001001 R09: ffffa21639873000
 R10: ffffffffa606761d R11: 0000000000000001 R12: ffffa21639873218
 R13: ffffa2163deb5060 R14: ffffa216063d1010 R15: 0000000000000000
 FS:  0000000000000000(0000) GS:ffffa2163e000000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 0000000007114000 CR4: 00000000001006f0
 Call Trace:
  __acpi_device_uevent_modalias+0xb0/0x100
  spi_uevent+0xd/0x40

 ...

In order to fix above let create_of_modalias() check the status returned
by acpi_get_name() and bail out in case of failure.

Fixes: 8765c5b ("ACPI / scan: Rework modalias creation when "compatible" is present")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201381
Reported-by: Ferry Toth <[email protected]>
Tested-by: Ferry Toth<[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
Reviewed-by: Mika Westerberg <[email protected]>
Cc: 4.1+ <[email protected]> # 4.1+
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Mar 27, 2019
commit f16eb8a upstream.

If SSDT overlay is loaded via ConfigFS and then unloaded the device,
we would like to have OF modalias for, already gone. Thus, acpi_get_name()
returns no allocated buffer for such case and kernel crashes afterwards:

 ACPI: Host-directed Dynamic ACPI Table Unload
 ads7950 spi-PRP0001:00: Dropping the link to regulator.0
 BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
 #PF error: [normal kernel read fault]
 PGD 80000000070d6067 P4D 80000000070d6067 PUD 70d0067 PMD 0
 Oops: 0000 [#1] SMP PTI
 CPU: 0 PID: 40 Comm: kworker/u4:2 Not tainted 5.0.0+ #96
 Hardware name: Intel Corporation Merrifield/BODEGA BAY, BIOS 542 2015.01.21:18.19.48
 Workqueue: kacpi_hotplug acpi_device_del_work_fn
 RIP: 0010:create_of_modalias.isra.1+0x4c/0x150
 Code: 00 00 48 89 44 24 18 31 c0 48 8d 54 24 08 48 c7 44 24 10 00 00 00 00 48 c7 44 24 08 ff ff ff ff e8 7a b0 03 00 48 8b 4c 24 10 <0f> b6 01 84 c0 74 27 48 c7 c7 00 09 f4 a5 0f b6 f0 8d 50 20 f6 04
 RSP: 0000:ffffa51040297c10 EFLAGS: 00010246
 RAX: 0000000000001001 RBX: 0000000000000785 RCX: 0000000000000000
 RDX: 0000000000001001 RSI: 0000000000000286 RDI: ffffa2163dc042e0
 RBP: ffffa216062b1196 R08: 0000000000001001 R09: ffffa21639873000
 R10: ffffffffa606761d R11: 0000000000000001 R12: ffffa21639873218
 R13: ffffa2163deb5060 R14: ffffa216063d1010 R15: 0000000000000000
 FS:  0000000000000000(0000) GS:ffffa2163e000000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 0000000007114000 CR4: 00000000001006f0
 Call Trace:
  __acpi_device_uevent_modalias+0xb0/0x100
  spi_uevent+0xd/0x40

 ...

In order to fix above let create_of_modalias() check the status returned
by acpi_get_name() and bail out in case of failure.

Fixes: 8765c5b ("ACPI / scan: Rework modalias creation when "compatible" is present")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201381
Reported-by: Ferry Toth <[email protected]>
Tested-by: Ferry Toth<[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
Reviewed-by: Mika Westerberg <[email protected]>
Cc: 4.1+ <[email protected]> # 4.1+
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Nov 4, 2019
Current code doesn't limit the number of nested devices.
Nested devices would be handled recursively and this needs huge stack
memory. So, unlimited nested devices could make stack overflow.

This patch adds upper_level and lower_level, they are common variables
and represent maximum lower/upper depth.
When upper/lower device is attached or dettached,
{lower/upper}_level are updated. and if maximum depth is bigger than 8,
attach routine fails and returns -EMLINK.

In addition, this patch converts recursive routine of
netdev_walk_all_{lower/upper} to iterator routine.

Test commands:
    ip link add dummy0 type dummy
    ip link add link dummy0 name vlan1 type vlan id 1
    ip link set vlan1 up

    for i in {2..55}
    do
	    let A=$i-1

	    ip link add vlan$i link vlan$A type vlan id $i
    done
    ip link del dummy0

Splat looks like:
[  155.513226][  T908] BUG: KASAN: use-after-free in __unwind_start+0x71/0x850
[  155.514162][  T908] Write of size 88 at addr ffff8880608a6cc0 by task ip/908
[  155.515048][  T908]
[  155.515333][  T908] CPU: 0 PID: 908 Comm: ip Not tainted 5.4.0-rc3+ #96
[  155.516147][  T908] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  155.517233][  T908] Call Trace:
[  155.517627][  T908]
[  155.517918][  T908] Allocated by task 0:
[  155.518412][  T908] (stack is not available)
[  155.518955][  T908]
[  155.519228][  T908] Freed by task 0:
[  155.519885][  T908] (stack is not available)
[  155.520452][  T908]
[  155.520729][  T908] The buggy address belongs to the object at ffff8880608a6ac0
[  155.520729][  T908]  which belongs to the cache names_cache of size 4096
[  155.522387][  T908] The buggy address is located 512 bytes inside of
[  155.522387][  T908]  4096-byte region [ffff8880608a6ac0, ffff8880608a7ac0)
[  155.523920][  T908] The buggy address belongs to the page:
[  155.524552][  T908] page:ffffea0001822800 refcount:1 mapcount:0 mapping:ffff88806c657cc0 index:0x0 compound_mapcount:0
[  155.525836][  T908] flags: 0x100000000010200(slab|head)
[  155.526445][  T908] raw: 0100000000010200 ffffea0001813808 ffffea0001a26c08 ffff88806c657cc0
[  155.527424][  T908] raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
[  155.528429][  T908] page dumped because: kasan: bad access detected
[  155.529158][  T908]
[  155.529410][  T908] Memory state around the buggy address:
[  155.530060][  T908]  ffff8880608a6b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  155.530971][  T908]  ffff8880608a6c00: fb fb fb fb fb f1 f1 f1 f1 00 f2 f2 f2 f3 f3 f3
[  155.531889][  T908] >ffff8880608a6c80: f3 fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  155.532806][  T908]                                            ^
[  155.533509][  T908]  ffff8880608a6d00: fb fb fb fb fb fb fb fb fb f1 f1 f1 f1 00 00 00
[  155.534436][  T908]  ffff8880608a6d80: f2 f3 f3 f3 f3 fb fb fb 00 00 00 00 00 00 00 00
[ ... ]

Signed-off-by: Taehee Yoo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Nov 4, 2019
The IFF_BONDING means bonding master or bonding slave device.
->ndo_add_slave() sets IFF_BONDING flag and ->ndo_del_slave() unsets
IFF_BONDING flag.

bond0<--bond1

Both bond0 and bond1 are bonding device and these should keep having
IFF_BONDING flag until they are removed.
But bond1 would lose IFF_BONDING at ->ndo_del_slave() because that routine
do not check whether the slave device is the bonding type or not.
This patch adds the interface type check routine before removing
IFF_BONDING flag.

Test commands:
    ip link add bond0 type bond
    ip link add bond1 type bond
    ip link set bond1 master bond0
    ip link set bond1 nomaster
    ip link del bond1 type bond
    ip link add bond1 type bond

Splat looks like:
[  226.665555] proc_dir_entry 'bonding/bond1' already registered
[  226.666440] WARNING: CPU: 0 PID: 737 at fs/proc/generic.c:361 proc_register+0x2a9/0x3e0
[  226.667571] Modules linked in: bonding af_packet sch_fq_codel ip_tables x_tables unix
[  226.668662] CPU: 0 PID: 737 Comm: ip Not tainted 5.4.0-rc3+ #96
[  226.669508] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  226.670652] RIP: 0010:proc_register+0x2a9/0x3e0
[  226.671612] Code: 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 39 01 00 00 48 8b 04 24 48 89 ea 48 c7 c7 a0 0b 14 9f 48 8b b0 e
0 00 00 00 e8 07 e7 88 ff <0f> 0b 48 c7 c7 40 2d a5 9f e8 59 d6 23 01 48 8b 4c 24 10 48 b8 00
[  226.675007] RSP: 0018:ffff888050e17078 EFLAGS: 00010282
[  226.675761] RAX: dffffc0000000008 RBX: ffff88805fdd0f10 RCX: ffffffff9dd344e2
[  226.676757] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff88806c9f6b8c
[  226.677751] RBP: ffff8880507160f3 R08: ffffed100d940019 R09: ffffed100d940019
[  226.678761] R10: 0000000000000001 R11: ffffed100d940018 R12: ffff888050716008
[  226.679757] R13: ffff8880507160f2 R14: dffffc0000000000 R15: ffffed100a0e2c1e
[  226.680758] FS:  00007fdc217cc0c0(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000
[  226.681886] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  226.682719] CR2: 00007f49313424d0 CR3: 0000000050e46001 CR4: 00000000000606f0
[  226.683727] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  226.684725] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  226.685681] Call Trace:
[  226.687089]  proc_create_seq_private+0xb3/0xf0
[  226.687778]  bond_create_proc_entry+0x1b3/0x3f0 [bonding]
[  226.691458]  bond_netdev_event+0x433/0x970 [bonding]
[  226.692139]  ? __module_text_address+0x13/0x140
[  226.692779]  notifier_call_chain+0x90/0x160
[  226.693401]  register_netdevice+0x9b3/0xd80
[  226.694010]  ? alloc_netdev_mqs+0x854/0xc10
[  226.694629]  ? netdev_change_features+0xa0/0xa0
[  226.695278]  ? rtnl_create_link+0x2ed/0xad0
[  226.695849]  bond_newlink+0x2a/0x60 [bonding]
[  226.696422]  __rtnl_newlink+0xb9f/0x11b0
[  226.696968]  ? rtnl_link_unregister+0x220/0x220
[ ... ]

Fixes: 0b680e7 ("[PATCH] bonding: Add priv_flag to avoid event mishandling")
Signed-off-by: Taehee Yoo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Nov 4, 2019
All bonding device has same lockdep key and subclass is initialized with
nest_level.
But actual nest_level value can be changed when a lower device is attached.
And at this moment, the subclass should be updated but it seems to be
unsafe.
So this patch makes bonding use dynamic lockdep key instead of the
subclass.

Test commands:
    ip link add bond0 type bond

    for i in {1..5}
    do
	    let A=$i-1
	    ip link add bond$i type bond
	    ip link set bond$i master bond$A
    done
    ip link set bond5 master bond0

Splat looks like:
[  307.992912] WARNING: possible recursive locking detected
[  307.993656] 5.4.0-rc3+ #96 Tainted: G        W
[  307.994367] --------------------------------------------
[  307.995092] ip/761 is trying to acquire lock:
[  307.995710] ffff8880513aac60 (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0xb8/0x500 [bonding]
[  307.997045]
	       but task is already holding lock:
[  307.997923] ffff88805fcbac60 (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0xb8/0x500 [bonding]
[  307.999215]
	       other info that might help us debug this:
[  308.000251]  Possible unsafe locking scenario:

[  308.001137]        CPU0
[  308.001533]        ----
[  308.001915]   lock(&(&bond->stats_lock)->rlock#2/2);
[  308.002609]   lock(&(&bond->stats_lock)->rlock#2/2);
[  308.003302]
		*** DEADLOCK ***

[  308.004310]  May be due to missing lock nesting notation

[  308.005319] 3 locks held by ip/761:
[  308.005830]  #0: ffffffff9fcc42b0 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x466/0x8a0
[  308.006894]  #1: ffff88805fcbac60 (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0xb8/0x500 [bonding]
[  308.008243]  #2: ffffffff9f9219c0 (rcu_read_lock){....}, at: bond_get_stats+0x9f/0x500 [bonding]
[  308.009422]
	       stack backtrace:
[  308.010124] CPU: 0 PID: 761 Comm: ip Tainted: G        W         5.4.0-rc3+ #96
[  308.011097] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  308.012179] Call Trace:
[  308.012601]  dump_stack+0x7c/0xbb
[  308.013089]  __lock_acquire+0x269d/0x3de0
[  308.013669]  ? register_lock_class+0x14d0/0x14d0
[  308.014318]  lock_acquire+0x164/0x3b0
[  308.014858]  ? bond_get_stats+0xb8/0x500 [bonding]
[  308.015520]  _raw_spin_lock_nested+0x2e/0x60
[  308.016129]  ? bond_get_stats+0xb8/0x500 [bonding]
[  308.017215]  bond_get_stats+0xb8/0x500 [bonding]
[  308.018454]  ? bond_arp_rcv+0xf10/0xf10 [bonding]
[  308.019710]  ? rcu_read_lock_held+0x90/0xa0
[  308.020605]  ? rcu_read_lock_sched_held+0xc0/0xc0
[  308.021286]  ? bond_get_stats+0x9f/0x500 [bonding]
[  308.021953]  dev_get_stats+0x1ec/0x270
[  308.022508]  bond_get_stats+0x1d1/0x500 [bonding]

Fixes: d3fff6c ("net: add netdev_lockdep_set_classes() helper")
Signed-off-by: Taehee Yoo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Nov 4, 2019
team interface could be nested and it's lock variable could be nested too.
But this lock uses static lockdep key and there is no nested locking
handling code such as mutex_lock_nested() and so on.
so the Lockdep would warn about the circular locking scenario that
couldn't happen.
In order to fix, this patch makes the team module to use dynamic lock key
instead of static key.

Test commands:
    ip link add team0 type team
    ip link add team1 type team
    ip link set team0 master team1
    ip link set team0 nomaster
    ip link set team1 master team0
    ip link set team1 nomaster

Splat that looks like:
[   40.364352] WARNING: possible recursive locking detected
[   40.364964] 5.4.0-rc3+ #96 Not tainted
[   40.365405] --------------------------------------------
[   40.365973] ip/750 is trying to acquire lock:
[   40.366542] ffff888060b34c40 (&team->lock){+.+.}, at: team_set_mac_address+0x151/0x290 [team]
[   40.367689]
	       but task is already holding lock:
[   40.368729] ffff888051201c40 (&team->lock){+.+.}, at: team_del_slave+0x29/0x60 [team]
[   40.370280]
	       other info that might help us debug this:
[   40.371159]  Possible unsafe locking scenario:

[   40.371942]        CPU0
[   40.372338]        ----
[   40.372673]   lock(&team->lock);
[   40.373115]   lock(&team->lock);
[   40.373549]
	       *** DEADLOCK ***

[   40.374432]  May be due to missing lock nesting notation

[   40.375338] 2 locks held by ip/750:
[   40.375851]  #0: ffffffffabcc42b0 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x466/0x8a0
[   40.376927]  #1: ffff888051201c40 (&team->lock){+.+.}, at: team_del_slave+0x29/0x60 [team]
[   40.377989]
	       stack backtrace:
[   40.378650] CPU: 0 PID: 750 Comm: ip Not tainted 5.4.0-rc3+ #96
[   40.379368] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   40.380574] Call Trace:
[   40.381208]  dump_stack+0x7c/0xbb
[   40.381959]  __lock_acquire+0x269d/0x3de0
[   40.382817]  ? register_lock_class+0x14d0/0x14d0
[   40.383784]  ? check_chain_key+0x236/0x5d0
[   40.384518]  lock_acquire+0x164/0x3b0
[   40.385074]  ? team_set_mac_address+0x151/0x290 [team]
[   40.385805]  __mutex_lock+0x14d/0x14c0
[   40.386371]  ? team_set_mac_address+0x151/0x290 [team]
[   40.387038]  ? team_set_mac_address+0x151/0x290 [team]
[   40.387632]  ? mutex_lock_io_nested+0x1380/0x1380
[   40.388245]  ? team_del_slave+0x60/0x60 [team]
[   40.388752]  ? rcu_read_lock_sched_held+0x90/0xc0
[   40.389304]  ? rcu_read_lock_bh_held+0xa0/0xa0
[   40.389819]  ? lock_acquire+0x164/0x3b0
[   40.390285]  ? lockdep_rtnl_is_held+0x16/0x20
[   40.390797]  ? team_port_get_rtnl+0x90/0xe0 [team]
[   40.391353]  ? __module_text_address+0x13/0x140
[   40.391886]  ? team_set_mac_address+0x151/0x290 [team]
[   40.392547]  team_set_mac_address+0x151/0x290 [team]
[   40.393111]  dev_set_mac_address+0x1f0/0x3f0
[ ... ]

Fixes: 3d249d4 ("net: introduce ethernet teaming device")
Signed-off-by: Taehee Yoo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Nov 4, 2019
virt_wifi_newlink() calls netdev_upper_dev_link() and it internally
holds reference count of lower interface.

Current code does not release a reference count of the lower interface
when the lower interface is being deleted.
So, reference count leaks occur.

Test commands:
    ip link add dummy0 type dummy
    ip link add vw1 link dummy0 type virt_wifi
    ip link del dummy0

Splat looks like:
[  133.787526][  T788] WARNING: CPU: 1 PID: 788 at net/core/dev.c:8274 rollback_registered_many+0x835/0xc80
[  133.788355][  T788] Modules linked in: virt_wifi cfg80211 dummy team af_packet sch_fq_codel ip_tables x_tables unix
[  133.789377][  T788] CPU: 1 PID: 788 Comm: ip Not tainted 5.4.0-rc3+ #96
[  133.790069][  T788] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  133.791167][  T788] RIP: 0010:rollback_registered_many+0x835/0xc80
[  133.791906][  T788] Code: 00 4d 85 ff 0f 84 b5 fd ff ff ba c0 0c 00 00 48 89 de 4c 89 ff e8 9b 58 04 00 48 89 df e8 30
[  133.794317][  T788] RSP: 0018:ffff88805ba3f338 EFLAGS: 00010202
[  133.795080][  T788] RAX: ffff88805e57e801 RBX: ffff88805ba34000 RCX: ffffffffa9294723
[  133.796045][  T788] RDX: 1ffff1100b746816 RSI: 0000000000000008 RDI: ffffffffabcc4240
[  133.797006][  T788] RBP: ffff88805ba3f4c0 R08: fffffbfff5798849 R09: fffffbfff5798849
[  133.797993][  T788] R10: 0000000000000001 R11: fffffbfff5798848 R12: dffffc0000000000
[  133.802514][  T788] R13: ffff88805ba3f440 R14: ffff88805ba3f400 R15: ffff88805ed622c0
[  133.803237][  T788] FS:  00007f2e9608c0c0(0000) GS:ffff88806cc00000(0000) knlGS:0000000000000000
[  133.804002][  T788] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  133.804664][  T788] CR2: 00007f2e95610603 CR3: 000000005f68c004 CR4: 00000000000606e0
[  133.805363][  T788] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  133.806073][  T788] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  133.806787][  T788] Call Trace:
[  133.807069][  T788]  ? generic_xdp_install+0x310/0x310
[  133.807612][  T788]  ? lock_acquire+0x164/0x3b0
[  133.808077][  T788]  ? is_bpf_text_address+0x5/0xf0
[  133.808640][  T788]  ? deref_stack_reg+0x9c/0xd0
[  133.809138][  T788]  ? __nla_validate_parse+0x98/0x1ab0
[  133.809944][  T788]  unregister_netdevice_many.part.122+0x13/0x1b0
[  133.810599][  T788]  rtnl_delete_link+0xbc/0x100
[  133.811073][  T788]  ? rtnl_af_register+0xc0/0xc0
[  133.811672][  T788]  rtnl_dellink+0x30e/0x8a0
[  133.812205][  T788]  ? is_bpf_text_address+0x5/0xf0
[ ... ]

[  144.110530][  T788] unregister_netdevice: waiting for dummy0 to become free. Usage count = 1

This patch adds notifier routine to delete upper interface before deleting
lower interface.

Fixes: c7cdba3 ("mac80211-next: rtnetlink wifi simulation device")
Signed-off-by: Taehee Yoo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.