Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add quirks for Sabrent Rocket Q on darp7, galp5, lemp9, and lemp10 #35

Closed
wants to merge 5,543 commits into from

Conversation

jacobgkau
Copy link
Member

This gets the Sabrent Rocket Q 8TB drive working in the products where it was not previously working (which would quadruple the maximum storage we're able to offer in these products.) I think it would be a good idea to confirm if the 4TB drive also needs the same fixes, so it can be added to the same section.

NVME_QUIRK_NO_DEEPEST_PS is needed to prevent random failures while using the drive (as it enters/exits PS4). NVME_QUIRK_SIMPLE_SUSPEND is needed to prevent failure when resuming from suspend.

If we merge this here, then I'll also try to submit it upstream.

svens-s390 and others added 30 commits January 21, 2021 13:21
BugLink: https://bugs.launchpad.net/bugs/1912027

commit b5e438e upstream.

Not resetting the SMT siblings might leave them in unpredictable
state. One of the observed problems was that the CPU timer wasn't
reset and therefore large system time values where accounted during
CPU bringup.

Cc: <[email protected]> # 4.0
Fixes: 10ad34b ("s390: add SMT support")
Reviewed-by: Heiko Carstens <[email protected]>
Signed-off-by: Sven Schnelle <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit 613775d upstream.

diag308 subcode 0 performes a clear reset which inlcudes the reset of
all registers in the system. While this is the preferred behavior when
loading a normal kernel via kexec it prevents the crash kernel to store
the register values in the dump. To prevent this use subcode 1 when
loading a crash kernel instead.

Fixes: ee337f5 ("s390/kexec_file: Add crash support to image loader")
Cc: <[email protected]> # 4.17
Signed-off-by: Philipp Rudo <[email protected]>
Reported-by: Xiaoying Yan <[email protected]>
Tested-by: Lianbo Jiang <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit e259b3f upstream.

During removal of the critical section cleanup the calculation
of mt_cycles during idle was removed. This causes invalid
accounting on systems with SMT enabled.

Fixes: 0b0ed65 ("s390: remove critical section cleanup from entry.S")
Cc: <[email protected]> # 5.8
Reviewed-by: Heiko Carstens <[email protected]>
Signed-off-by: Sven Schnelle <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit 454efcf upstream.

When a machine check interrupt is triggered during idle, the code
is using the async timer/clock for idle time calculation. It should use
the machine check enter timer/clock which is passed to the macro.

Fixes: 0b0ed65 ("s390: remove critical section cleanup from entry.S")
Cc: <[email protected]> # 5.8
Reviewed-by: Heiko Carstens <[email protected]>
Signed-off-by: Sven Schnelle <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit 658a337 upstream.

For an LCU update a read unit address configuration IO is required.
This is started using sleep_on(), which has early exit paths in case the
device is not usable for IO. For example when it is in offline processing.

In those cases the LCU update should fail and not be retried.
Therefore lcu_update_work checks if EOPNOTSUPP is returned or not.

Commit 4199534 ("s390/dasd: fix endless loop after read unit address configuration")
accidentally removed the EOPNOTSUPP return code from
read_unit_address_configuration(), which in turn might lead to an endless
loop of the LCU update in offline processing.

Fix by returning EOPNOTSUPP again if the device is not able to perform the
request.

Fixes: 4199534 ("s390/dasd: fix endless loop after read unit address configuration")
Cc: [email protected] #5.3
Signed-off-by: Stefan Haberland <[email protected]>
Reviewed-by: Jan Hoeppner <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit a29ea01 upstream.

Prevent _lcu_update from adding a device to a pavgroup if the LCU still
requires an update. The data is not reliable any longer and in parallel
devices might have been moved on the lists already.
This might lead to list corruptions or invalid PAV grouping.
Only add devices to a pavgroup if the LCU is up to date. Additional steps
are taken by the scheduled lcu update.

Fixes: 8e09f21 ("[S390] dasd: add hyper PAV support to DASD device driver, part 1")
Cc: [email protected]
Signed-off-by: Stefan Haberland <[email protected]>
Reviewed-by: Jan Hoeppner <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit 0ede91f upstream.

dasd_alias_add_device() moves devices to the active_devices list in case
of a scheduled LCU update regardless if they have previously been in a
pavgroup or not.

Example: device A and B are in the same pavgroup.

Device A has already been in a pavgroup and the private->pavgroup pointer
is set and points to a valid pavgroup. While going through dasd_add_device
it is moved from the pavgroup to the active_devices list.

In parallel device B might be removed from the same pavgroup in
remove_device_from_lcu() which in turn checks if the group is empty
and deletes it accordingly because device A has already been removed from
there.

When now device A enters remove_device_from_lcu() it is tried to remove it
from the pavgroup again because the pavgroup pointer is still set and again
the empty group will be cleaned up which leads to a list corruption.

Fix by setting private->pavgroup to NULL in dasd_add_device.

If the device has been the last device on the pavgroup an empty pavgroup
remains but this will be cleaned up by the scheduled lcu_update which
iterates over all existing pavgroups.

Fixes: 8e09f21 ("[S390] dasd: add hyper PAV support to DASD device driver, part 1")
Cc: [email protected]
Signed-off-by: Stefan Haberland <[email protected]>
Reviewed-by: Jan Hoeppner <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit 53a7f65 upstream.

In dasd_alias_disconnect_device_from_lcu the device is removed from any
list on the LCU. Afterwards the LCU is removed from the lcu list if it
does not contain devices any longer.

The lcu->lock protects the lcu from parallel updates. But to cancel all
workers and wait for completion the lcu->lock has to be unlocked.

If two devices are removed in parallel and both are removed from the LCU
the first device that takes the lcu->lock again will delete the LCU because
it is already empty but the second device also tries to free the LCU which
leads to a list corruption of the lcu list.

Fix by removing the device right before the lcu is checked without
unlocking the lcu->lock in between.

Fixes: 8e09f21 ("[S390] dasd: add hyper PAV support to DASD device driver, part 1")
Cc: [email protected]
Signed-off-by: Stefan Haberland <[email protected]>
Reviewed-by: Jan Hoeppner <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit 0f966cb upstream.

Add a per-transaction flag to indicate that the buffer
must be cleared when the transaction is complete to
prevent copies of sensitive data from being preserved
in memory.

Signed-off-by: Todd Kjos <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Cc: stable <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit 0d024a8 upstream.

The cx2072x codec driver defines multiple DAIs with the same stream
name "Playback" and "Capture".  Although the current code works more
or less as is as the secondary streams are never used, it still leads
the error message like:
 debugfs: File 'Playback' in directory 'dapm' already present!
 debugfs: File 'Capture' in directory 'dapm' already present!

Fix it by renaming the secondary streams to unique names.

Fixes: a497a43 ("ASoC: Add support for Conexant CX2072X CODEC")
Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
…IOS)

BugLink: https://bugs.launchpad.net/bugs/1912027

commit 718c406 upstream.

Users reported that some Lenovo AMD platforms do not have ACP microphone,
but the BIOS advertises it via ACPI.

This patch create a simple DMI table, where those machines with the broken
BIOS can be added. The DMI description for Lenovo IdeaPad 5 and
IdeaPad Flex 5 devices are added there.

Also describe the dmic_acpi_check kernel module parameter in a more
understandable way.

Cc: <[email protected]>
Cc: Vijendar Mukunda <[email protected]>
Cc: Mark Brown <[email protected]>
Signed-off-by: Jaroslav Kysela <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit 55d8e6a upstream.

The Raven and Renoir ACP can be distinguished by the PCI revision.
Let's do the check very early, otherwise the wrong probe code
can be run.

Link: https://lore.kernel.org/alsa-devel/[email protected]/
Cc: <[email protected]>
Cc: Vijendar Mukunda <[email protected]>
Cc: Mark Brown <[email protected]>
Signed-off-by: Jaroslav Kysela <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit 56c9045 upstream.

I have had reports from two different people that attempts to read the
analog input channels of the MF624 board fail with an `ETIMEDOUT` error.

After triggering the conversion, the code calls `comedi_timeout()` with
`mf6x4_ai_eoc()` as the callback function to check if the conversion is
complete.  The callback returns 0 if complete or `-EBUSY` if not yet
complete.  `comedi_timeout()` returns `-ETIMEDOUT` if it has not
completed within a timeout period which is propagated as an error to the
user application.

The existing code considers the conversion to be complete when the EOLC
bit is high.  However, according to the user manuals for the MF624 and
MF634 boards, this test is incorrect because EOLC is an active low
signal that goes high when the conversion is triggered, and goes low
when the conversion is complete.  Fix the problem by inverting the test
of the EOLC bit state.

Fixes: 04b5650 ("comedi: Humusoft MF634 and MF624 DAQ cards driver")
Cc: <[email protected]> # v4.4+
Cc: Rostislav Lisovy <[email protected]>
Signed-off-by: Ian Abbott <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit fc54886 upstream.

Patch series "z3fold: stability / rt fixes".

Address z3fold stability issues under stress load, primarily in the
reclaim and free aspects.  Besides, it fixes the locking problems that
were only seen in real-time kernel configuration.

This patch (of 3):

There used to be two places in the code where slots could be freed, namely
when freeing the last allocated handle from the slots and when releasing
the z3fold header these slots aree linked to.  The logic to decide on
whether to free certain slots was complicated and error prone in both
functions and it led to failures in RT case.

To fix that, make free_handle() the single point of freeing slots.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Vitaly Wool <[email protected]>
Tested-by: Mike Galbraith <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit dcf5aed upstream.

Use temporary slots in reclaim function to avoid possible race when
freeing those.

While at it, make sure we check CLAIMED flag under page lock in the
reclaim function to make sure we are not racing with z3fold_alloc().

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Vitaly Wool <[email protected]>
Cc: <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit 306e3e9 upstream.

The event CYCLE_ACTIVITY.STALLS_MEM_ANY (0x14a3) should be available on
all 8 GP counters on ICL, but it's only scheduled on the first four
counters due to the current ICL constraint table.

Add a line for the CYCLE_ACTIVITY.STALLS_MEM_ANY event in the ICL
constraint table.
Correct the comments for the CYCLE_ACTIVITY.CYCLES_MEM_ANY event.

Fixes: 6017608 ("perf/x86/intel: Add Icelake support")
Reported-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit 46b72e1 upstream.

According to the event list from icelake_core_v1.09.json, the encoding
of the RTM_RETIRED.ABORTED event on Ice Lake should be,
    "EventCode": "0xc9",
    "UMask": "0x04",
    "EventName": "RTM_RETIRED.ABORTED",

Correct the wrong encoding.

Fixes: 6017608 ("perf/x86/intel: Add Icelake support")
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
…ace.

BugLink: https://bugs.launchpad.net/bugs/1912027

commit aa8e21c upstream.

Perf event attritube supports exclude_kernel flag to avoid
sampling/profiling in supervisor state (kernel). Based on this event
attr flag, Monitor Mode Control Register bit is set to freeze on
supervisor state. But sometimes (due to hardware limitation), Sampled
Instruction Address Register (SIAR) locks on to kernel address even
when freeze on supervisor is set. Patch here adds a check to drop
those samples.

Cc: [email protected]
Signed-off-by: Athira Rajeev <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit e40ad84 upstream.

When turbo has been disabled by the BIOS, but HWP_CAP.GUARANTEED is
changed later, user space may want to take advantage of this increased
guaranteed performance.

HWP_CAP.GUARANTEED is not a static value.  It can be adjusted by an
out-of-band agent or during an Intel Speed Select performance level
change.  The HWP_CAP.MAX is still the maximum achievable performance
with turbo disabled by the BIOS, so HWP_CAP.GUARANTEED can still
change as long as it remains less than or equal to HWP_CAP.MAX.

When HWP_CAP.GUARANTEED is changed, the sysfs base_frequency
attribute shows the most recent guaranteed frequency value. This
attribute can be used by user space software to update the scaling
min/max limits of the CPU.

Currently, the ->setpolicy() callback already uses the latest
HWP_CAP values when setting HWP_REQ, but the ->verify() callback will
restrict the user settings to the to old guaranteed performance value
which prevents user space from making use of the extra CPU capacity
theoretically available to it after increasing HWP_CAP.GUARANTEED.

To address this, read HWP_CAP in intel_pstate_verify_cpu_policy()
to obtain the maximum P-state that can be used and use that to
confine the policy max limit instead of using the cached and
possibly stale pstate.max_freq value for this purpose.

For consistency, update intel_pstate_update_perf_limits() to use the
maximum available P-state returned by intel_pstate_get_hwp_max() to
compute the maximum frequency instead of using the return value of
intel_pstate_get_max_freq() which, again, may be stale.

This issue is a side-effect of fixing the scaling frequency limits in
commit eacc9c5 ("cpufreq: intel_pstate: Fix intel_pstate_get_hwp_max()
for turbo disabled") which corrected the setting of the reduced scaling
frequency values, but caused stale HWP_CAP.GUARANTEED to be used in
the case at hand.

Fixes: eacc9c5 ("cpufreq: intel_pstate: Fix intel_pstate_get_hwp_max() for turbo disabled")
Reported-by: Srinivas Pandruvada <[email protected]>
Tested-by: Srinivas Pandruvada <[email protected]>
Cc: 5.8+ <[email protected]> # 5.8+
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit 17858b1 upstream.

ecdh_set_secret() casts a void* pointer to a const u64* in order to
feed it into ecc_is_key_valid(). This is not generally permitted by
the C standard, and leads to actual misalignment faults on ARMv6
cores. In some cases, these are fixed up in software, but this still
leads to performance hits that are entirely avoidable.

So let's copy the key into the ctx buffer first, which we will do
anyway in the common case, and which guarantees correct alignment.

Cc: <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit f3456b9 upstream.

ARM Cortex-A57 and Cortex-A72 cores running in 32-bit mode are affected
by silicon errata #1742098 and #1655431, respectively, where the second
instruction of a AES instruction pair may execute twice if an interrupt
is taken right after the first instruction consumes an input register of
which a single 32-bit lane has been updated the last time it was modified.

This is not such a rare occurrence as it may seem: in counter mode, only
the least significant 32-bit word is incremented in the absence of a
carry, which makes our counter mode implementation susceptible to these
errata.

So let's shuffle the counter assignments around a bit so that the most
recent updates when the AES instruction pair executes are 128-bit wide.

[0] ARM-EPM-049219 v23 Cortex-A57 MPCore Software Developers Errata Notice
[1] ARM-EPM-012079 v11.0 Cortex-A72 MPCore Software Developers Errata Notice

Cc: <[email protected]> # v5.4+
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit a7b5458 upstream.

Don't add platform resources that won't be used. This avoids a
recently-added warning from the driver core, that can show up on a
multi-platform kernel when !MACH_IS_MAC.

------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at drivers/base/platform.c:224 platform_get_irq_optional+0x8e/0xce
0 is an invalid IRQ number
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.9.0-multi #1
Stack from 004b3f04:
        004b3f04 00462c2f 00462c2f 004b3f2 0002e128 004754db 004b6ad4 004b3f4c
        0002e19c 004754f7 000000e0 00285ba0 00000009 00000000 004b3f44 ffffffff
        004754db 004b3f64 004b3f74 00285ba0 004754f7 000000e0 00000009 004754db
        004fdf0c 005269e2 004fdf0c 00000000 004b3f88 00285cae 004b6964 00000000
        004fdf0c 004b3fac 0051cc68 004b6964 00000000 004b6964 00000200 00000000
        0051cc3e 0023c18a 004b3fc0 0051cd8a 004fdf0c 00000002 0052b43c 004b3fc8
Call Trace: [<0002e128>] __warn+0xa6/0xd6
 [<0002e19c>] warn_slowpath_fmt+0x44/0x76
 [<00285ba0>] platform_get_irq_optional+0x8e/0xce
 [<00285ba0>] platform_get_irq_optional+0x8e/0xce
 [<00285cae>] platform_get_irq+0x12/0x4c
 [<0051cc68>] pmz_init_port+0x2a/0xa6
 [<0051cc3e>] pmz_init_port+0x0/0xa6
 [<0023c18a>] strlen+0x0/0x22
 [<0051cd8a>] pmz_probe+0x34/0x88
 [<0051cde6>] pmz_console_init+0x8/0x28
 [<00511776>] console_init+0x1e/0x28
 [<0005a3bc>] printk+0x0/0x16
 [<0050a8a6>] start_kernel+0x368/0x4ce
 [<005094f8>] _sinittext+0x4f8/0xc48
random: get_random_bytes called from print_oops_end_marker+0x56/0x80 with crng_init=0
---[ end trace 392d8e82eed68d6c ]---

Commit a85a6c8 ("driver core: platform: Clarify that IRQ 0 is invalid"),
which introduced the WARNING, suggests that testing for irq == 0 is
undesirable. Instead of that comparison, just test for resource existence.

Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Joshua Thompson <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: [email protected] # v5.8+
Reported-by: Laurent Vivier <[email protected]>
Signed-off-by: Finn Thain <[email protected]>
Link: https://lore.kernel.org/r/0c0fe1e4f11ccec202d4df09ea7d9d98155d101a.1606001297.git.fthain@telegraphics.com.au
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit 83ff51c upstream.

Instead of raw access, use readl() to access MMIO registers of
memory controller to avoid possible compiler re-ordering.

Fixes: d4dc89d ("EDAC, i10nm: Add a driver for Intel 10nm server processors")
Cc: <[email protected]>
Signed-off-by: Qiuxu Zhuo <[email protected]>
Signed-off-by: Tony Luck <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit 706657b upstream.

In order to setup its PCI component, the driver needs any node private
instance in order to get a reference to the PCI device and hand that
into edac_pci_create_generic_ctl(). For convenience, it uses the 0th
memory controller descriptor under the assumption that if any, the 0th
will be always present.

However, this assumption goes wrong when the 0th node doesn't have
memory and the driver doesn't initialize an instance for it:

  EDAC amd64: F17h detected (node 0).
  ...
  EDAC amd64: Node 0: No DIMMs detected.

But looking up node instances is not really needed - all one needs is
the pointer to the proper device which gets discovered during instance
init.

So stash that pointer into a variable and use it when setting up the
EDAC PCI component.

Clear that variable when the driver needs to unwind due to some
instances failing init to avoid any registration imbalance.

Cc: <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit 406100f upstream.

One of our machines keeled over trying to rebuild the scheduler domains.
Mainline produces the same splat:

  BUG: unable to handle page fault for address: 0000607f820054db
  CPU: 2 PID: 149 Comm: kworker/1:1 Not tainted 5.10.0-rc1-master+ #6
  Workqueue: events cpuset_hotplug_workfn
  RIP: build_sched_domains
  Call Trace:
   partition_sched_domains_locked
   rebuild_sched_domains_locked
   cpuset_hotplug_workfn

It happens with cgroup2 and exclusive cpusets only.  This reproducer
triggers it on an 8-cpu vm and works most effectively with no
preexisting child cgroups:

  cd $UNIFIED_ROOT
  mkdir cg1
  echo 4-7 > cg1/cpuset.cpus
  echo root > cg1/cpuset.cpus.partition

  # with smt/control reading 'on',
  echo off > /sys/devices/system/cpu/smt/control

RIP maps to

  sd->shared = *per_cpu_ptr(sdd->sds, sd_id);

from sd_init().  sd_id is calculated earlier in the same function:

  cpumask_and(sched_domain_span(sd), cpu_map, tl->mask(cpu));
  sd_id = cpumask_first(sched_domain_span(sd));

tl->mask(cpu), which reads cpu_sibling_map on x86, returns an empty mask
and so cpumask_first() returns >= nr_cpu_ids, which leads to the bogus
value from per_cpu_ptr() above.

The problem is a race between cpuset_hotplug_workfn() and a later
offline of CPU N.  cpuset_hotplug_workfn() updates the effective masks
when N is still online, the offline clears N from cpu_sibling_map, and
then the worker uses the stale effective masks that still have N to
generate the scheduling domains, leading the worker to read
N's empty cpu_sibling_map in sd_init().

rebuild_sched_domains_locked() prevented the race during the cgroup2
cpuset series up until the Fixes commit changed its check.  Make the
check more robust so that it can detect an offline CPU in any exclusive
cpuset's effective mask, not just the top one.

Fixes: 0ccea8f ("cpuset: Make generate_sched_domains() work with partition")
Signed-off-by: Daniel Jordan <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Tejun Heo <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit 975323a upstream.

The parallel-port restore operations is called when a driver claims the
port and is supposed to restore the provided state (e.g. saved when
releasing the port).

Fixes: b69578d ("USB: usbserial: mos7720: add support for parallel port on moschip 7715")
Cc: stable <[email protected]>     # 2.6.35
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit 5098e77 upstream.

The driver must not call tty_wakeup() while holding its private lock as
line disciplines are allowed to call back into write() from
write_wakeup(), leading to a deadlock.

Also remove the unneeded work struct that was used to defer wakeup in
order to work around a possible race in ancient times (see comment about
n_tty write_chan() in commit 14b54e3 ("USB: serial: remove
changelogs and old todo entries")).

Fixes: 1da177e ("Linux-2.6.12-rc2")
Cc: [email protected]
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit 696c541 upstream.

Commit c528fcb ("USB: serial: keyspan_pda: fix receive sanity
checks") broke write-unthrottle handling by dropping well-formed
unthrottle-interrupt packets which are precisely two bytes long. This
could lead to blocked writers not being woken up when buffer space again
becomes available.

Instead, stop unconditionally printing the third byte which is
(presumably) only valid on modem-line changes.

Fixes: c528fcb ("USB: serial: keyspan_pda: fix receive sanity checks")
Cc: stable <[email protected]>     # 4.11
Acked-by: Sebastian Andrzej Siewior <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit 7353cad upstream.

The write() callback can be called in interrupt context (e.g. when used
as a console) so interrupts must be disabled while holding the port lock
to prevent a possible deadlock.

Fixes: e81ee63 ("usb-serial: possible irq lock inversion (PPP vs. usb/serial)")
Fixes: 507ca9b ("[PATCH] USB: add ability for usb-serial drivers to determine if their write urb is currently being used.")
Cc: stable <[email protected]>     # 2.6.19
Acked-by: Sebastian Andrzej Siewior <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/1912027

commit c01d2c5 upstream.

Make sure to clear the write-busy flag also in case no new data was
submitted due to lack of device buffer space so that writing is
resumed once space again becomes available.

Fixes: 507ca9b ("[PATCH] USB: add ability for usb-serial drivers to determine if their write urb is currently being used.")
Cc: stable <[email protected]>     # 2.6.13
Acked-by: Sebastian Andrzej Siewior <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Kelsey Skunberg <[email protected]>
@techaround
Copy link

Interesting. I wouldn't necessarily have expected this to improve stability over the boot option (other than suspend), but glad to >>hear it's helping.
There is randomness -- also I woke up to the system in read only mode - so it's not fixed (note that suspend is completely disabled)


I will be getting a 2TB nvme on monday and I would be happy to run your test then.

The 8TB is in Slot 2 right now. I plan to switch that when I get the 2TB since it can use the PCIe 4.0.

(Sabrent 2TB Rocket Q4 NVMe PCIe 4.0 )

@techaround
Copy link

I have not run your tests yet.

I got the new drive and installed it. I think it is ok, but acts weird with the 8TB is installed.
I had to do this (nvme_core.default_ps_max_latency_us=2000) to even see the 8TB drive.

If I don't, after the disk unencrypts, there is a 30 second delay before it boots and gparted only sees the boot drive.

Do you want me to run your test with nvme_core.default_ps_max_latency_us=2000?

Will running "nvme_core.default_ps_max_latency_us=2000" in the TB2 drive affect the performance? If so, I'll probably end up living without the 8TB drive until this get sorted out.

Though happy to test

@jacobgkau
Copy link
Member Author

jacobgkau commented Feb 16, 2021

I had to do this (nvme_core.default_ps_max_latency_us=2000) to even see the 8TB drive.

Are you booting from a new OS installation on the new drive, or the same OS installation on the old drive? If it's a new installation, please add the sabrent-rocket-q kernel version using the same commands as before in my earlier message. Does that get rid of the weirdness without using nvme_core.default_ps_max_latency_us=2000?

Do you want me to run your test with nvme_core.default_ps_max_latency_us=2000?

Yes, while the custom kernel is preferred, I'm also able to reproduce I/O errors with the kernel option, so knowing whether or not you're able to would tell me if we're seeing the same behavior or if my drive is potentially bad (it sounds like we're seeing the same behavior, since you were still seeing some instability with the custom kernel and with the option before.)

Will running "nvme_core.default_ps_max_latency_us=2000" in the TB2 drive affect the performance?

Not performance, just power efficiency. An NVMe drive has various APST states, and each state takes a certain amount of time to enter/exit. What this option does is says: "If it would take more than 2000 microseconds to wake up from this level of power saving, then don't use it." With the Sabrent 4TB and 8TB drives in particular, this disables PS4 (the deepest power-saving state) and leaves only PS1 through PS3 available. (If anything, it's improving performance by not having to spend time entering/exiting PS4, but the ~2ms difference won't be noticeable.)

You can view the list of states enabled with your kernel/boot options using sudo nvme get-feature /dev/nvme# --feature-id=12 -H, and you can see the current power state with sudo nvme get-feature /dev/nvme# --feature-id=2 -H.

If so, I'll probably end up living without the 8TB drive until this get sorted out.

Just to be clear, there is no guarantee this will be sorted out. This PR was an attempt to get the drives working (by disabling standard features that the drive was not handling properly), and while it does improve behavior, it doesn't seem to be enough. This seems like an issue with the drive that may not be fixable without a firmware update for the drive from Sabrent.

@techaround
Copy link

I had to do this (nvme_core.default_ps_max_latency_us=2000) to even see the 8TB drive.

Are you booting from a new OS installation on the new drive, or the same OS installation on the old drive? If it's a new installation, please add the sabrent-rocket-q kernel version using the same commands as before in my earlier message. Does that get rid of the weirdness without using nvme_core.default_ps_max_latency_us=2000?

I am booting off the 2TB drive.

Do you want me to run your test with nvme_core.default_ps_max_latency_us=2000?

Yes, while the custom kernel is preferred, I'm also able to reproduce I/O errors with the kernel option, so knowing whether or not you're able to would tell me if we're seeing the same behavior or if my drive is potentially bad (it sounds like we're seeing the same behavior, since you were still seeing some instability with the custom kernel and with the option before.)

I will run several tests with different configurations and get back to you.

Will running "nvme_core.default_ps_max_latency_us=2000" in the TB2 drive affect the performance?

Not performance, just power efficiency. An NVMe drive has various APST states, and each state takes a certain amount of time to enter/exit. What this option does is says: "If it would take more than 2000 microseconds to wake up from this level of power saving, then don't use it." With the Sabrent 4TB and 8TB drives in particular, this disables PS4 (the deepest power-saving state) and leaves only PS1 through PS3 available. (If anything, it's improving performance by not having to spend time entering/exiting PS4, but the ~2ms difference won't be noticeable.)

You can view the list of states enabled with your kernel/boot options using sudo nvme get-feature /dev/nvme# --feature-id=12 -H, and you can see the current power state with sudo nvme get-feature /dev/nvme# --feature-id=2 -H.

If so, I'll probably end up living without the 8TB drive until this get sorted out.
Just to be clear, there is no guarantee this will be sorted out. This PR was an attempt to get the drives working (by disabling standard features that the drive was not handling properly), and while it does improve behavior, it doesn't seem to be enough. This seems like an issue with the drive that may not be fixable without a firmware update for the drive from Sabrent.

Too bad I like this computer so much. I can sell the drive I guess -- seem very unfortunate.

@techaround
Copy link

testing
2tb in slot 2 (boot)
8tb in slot 2

Using the default 20.04 kernel with nvme_core.default_ps_max_latency_us=2000

I was able to boot and see both drives.
I ran your test as indicated -- very quickly the 8tb disk went read only.

Installed the custom kernel and rebooted.
could not see the 8TB at all.
tried booting the custom kernel with nvme_core.default_ps_max_latency_us=2000
still cannot see the 8TB drive

reinstalled the default kernel and rebooted
still cannot see the 8TB drive.

@dphurst
Copy link

dphurst commented Feb 17, 2021

Would Sabrent engineers talk to System76 and work with y'all to get the drives in a robust state? I'd sure like to have 4 or 8 TB in my galp5.
Thanks,
Dow Hurst

@jacobgkau
Copy link
Member Author

reinstalled the default kernel and rebooted
still cannot see the 8TB drive.

It doesn't sound like the modified kernel was necessarily responsible for the lack of detection there, if it still wasn't detected after going back to the normal kernel. I've had cases where I need to reseat the SSD after a failure before it will detect again, so you might want to install the modified kernel again, power off, reseat it, and power on again.

@techaround
Copy link

I was running out of time to return the drive so I went ahead and did that -- good luck though

@techaround
Copy link

Has there been any progress find any working 4 TB or greater drives? What about non-Sabrent drives? This is very frustrating.

@dphurst
Copy link

dphurst commented Aug 26, 2021 via email

jackpot51 pushed a commit that referenced this pull request Sep 24, 2021
commit e4571b8 upstream.

[BUG]
It's easy to trigger NULL pointer dereference, just by removing a
non-existing device id:

 # mkfs.btrfs -f -m single -d single /dev/test/scratch1 \
				     /dev/test/scratch2
 # mount /dev/test/scratch1 /mnt/btrfs
 # btrfs device remove 3 /mnt/btrfs

Then we have the following kernel NULL pointer dereference:

 BUG: kernel NULL pointer dereference, address: 0000000000000000
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] PREEMPT SMP NOPTI
 CPU: 9 PID: 649 Comm: btrfs Not tainted 5.14.0-rc3-custom+ #35
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
 RIP: 0010:btrfs_rm_device+0x4de/0x6b0 [btrfs]
  btrfs_ioctl+0x18bb/0x3190 [btrfs]
  ? lock_is_held_type+0xa5/0x120
  ? find_held_lock.constprop.0+0x2b/0x80
  ? do_user_addr_fault+0x201/0x6a0
  ? lock_release+0xd2/0x2d0
  ? __x64_sys_ioctl+0x83/0xb0
  __x64_sys_ioctl+0x83/0xb0
  do_syscall_64+0x3b/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae

[CAUSE]
Commit a27a94c ("btrfs: Make btrfs_find_device_by_devspec return
btrfs_device directly") moves the "missing" device path check into
btrfs_rm_device().

But btrfs_rm_device() itself can have case where it only receives
@devid, with NULL as @device_path.

In that case, calling strcmp() on NULL will trigger the NULL pointer
dereference.

Before that commit, we handle the "missing" case inside
btrfs_find_device_by_devspec(), which will not check @device_path at all
if @devid is provided, thus no way to trigger the bug.

[FIX]
Before calling strcmp(), also make sure @device_path is not NULL.

Fixes: a27a94c ("btrfs: Make btrfs_find_device_by_devspec return btrfs_device directly")
CC: [email protected] # 5.4+
Reported-by: butt3rflyh4ck <[email protected]>
Reviewed-by: Anand Jain <[email protected]>
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
@jackpot51
Copy link
Member

This is pretty out of date now

@jackpot51
Copy link
Member

We can rebase this, and target it against the master_impish kernel, after arm64 builds are finished

@jackpot51 jackpot51 closed this Sep 28, 2021
@jackpot51 jackpot51 deleted the sabrent-rocket-q branch September 28, 2021 21:28
jackpot51 pushed a commit that referenced this pull request Feb 7, 2022
…ncurrent video sessions

[ Upstream commit 91f2b7d ]

In existing implementation, core_clk_setrate() is getting called
concurrently in concurrent video sessions. Before the previous call to
core_clk_setrate returns, new call to core_clk_setrate is invoked from
another video session running concurrently. This results in latest
calculated frequency being set (higher/lower) instead of actual frequency
required for that video session. It also results in stability crashes
mention below. These resources are specific to video core, hence keeping
under core lock would ensure that they are estimated for all running video
sessions and called once for the video core.

Crash logs:

[    1.900089] WARNING: CPU: 4 PID: 1 at drivers/opp/debugfs.c:33 opp_debug_remove_one+0x2c/0x48
[    1.908493] Modules linked in:
[    1.911524] CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.10.67 #35 f8edb8c30cf2dd6838495dd9ef9be47af7f5f60c
[    1.921036] Hardware name: Qualcomm Technologies, Inc. sc7280 IDP SKU2 platform (DT)
[    1.928673] pstate: 60800009 (nZCv daif -PAN +UAO -TCO BTYPE=--)
[    1.934608] pc : opp_debug_remove_one+0x2c/0x48
[    1.939080] lr : opp_debug_remove_one+0x2c/0x48
[    1.943560] sp : ffffffc011d7b7f0
[    1.946836] pmr_save: 000000e0
[    1.949854] x29: ffffffc011d7b7f0 x28: ffffffc010733bbc
[    1.955104] x27: ffffffc010733ba8 x26: ffffff8083cedd00
[    1.960355] x25: 0000000000000001 x24: 0000000000000000
[    1.965603] x23: ffffff8083cc2878 x22: ffffff8083ceb900
[    1.970852] x21: ffffff8083ceb910 x20: ffffff8083cc2800
[    1.976101] x19: ffffff8083ceb900 x18: 00000000ffff0a10
[    1.981352] x17: ffffff80837a5620 x16: 00000000000000ec
[    1.986601] x15: ffffffc010519ad4 x14: 0000000000000003
[    1.991849] x13: 0000000000000004 x12: 0000000000000001
[    1.997100] x11: c0000000ffffdfff x10: 00000000ffffffff
[    2.002348] x9 : d2627c580300dc00 x8 : d2627c580300dc00
[    2.007596] x7 : 0720072007200720 x6 : ffffff80802ecf00
[    2.012845] x5 : 0000000000190004 x4 : 0000000000000000
[    2.018094] x3 : ffffffc011d7b478 x2 : ffffffc011d7b480
[    2.023343] x1 : 00000000ffffdfff x0 : 0000000000000017
[    2.028594] Call trace:
[    2.031022]  opp_debug_remove_one+0x2c/0x48
[    2.035160]  dev_pm_opp_put+0x94/0xb0
[    2.038780]  _opp_remove_all+0x7c/0xc8
[    2.042486]  _opp_remove_all_static+0x54/0x7c
[    2.046796]  dev_pm_opp_remove_table+0x74/0x98
[    2.051183]  devm_pm_opp_of_table_release+0x18/0x24
[    2.056001]  devm_action_release+0x1c/0x28
[    2.060053]  release_nodes+0x23c/0x2b8
[    2.063760]  devres_release_group+0xcc/0xd0
[    2.067900]  component_bind+0xac/0x168
[    2.071608]  component_bind_all+0x98/0x124
[    2.075664]  msm_drm_bind+0x1e8/0x678
[    2.079287]  try_to_bring_up_master+0x60/0x134
[    2.083674]  component_master_add_with_match+0xd8/0x120
[    2.088834]  msm_pdev_probe+0x20c/0x2a0
[    2.092629]  platform_drv_probe+0x9c/0xbc
[    2.096598]  really_probe+0x11c/0x46c
[    2.100217]  driver_probe_device+0x8c/0xf0
[    2.104270]  device_driver_attach+0x54/0x78
[    2.108407]  __driver_attach+0x48/0x148
[    2.112201]  bus_for_each_dev+0x88/0xd4
[    2.115998]  driver_attach+0x2c/0x38
[    2.119534]  bus_add_driver+0x10c/0x200
[    2.123330]  driver_register+0x6c/0x104
[    2.127122]  __platform_driver_register+0x4c/0x58
[    2.131767]  msm_drm_register+0x6c/0x70
[    2.135560]  do_one_initcall+0x64/0x23c
[    2.139357]  do_initcall_level+0xac/0x15c
[    2.143321]  do_initcalls+0x5c/0x9c
[    2.146778]  do_basic_setup+0x2c/0x38
[    2.150401]  kernel_init_freeable+0xf8/0x15c
[    2.154622]  kernel_init+0x1c/0x11c
[    2.158079]  ret_from_fork+0x10/0x30
[    2.161615] ---[ end trace a2cc45a0f784b212 ]---

[    2.166272] Removing OPP: 300000000

Signed-off-by: Mansur Alisha Shaik <[email protected]>
Signed-off-by: Stanimir Varbanov <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
jackpot51 pushed a commit that referenced this pull request Feb 28, 2022
…ncurrent video sessions

[ Upstream commit 91f2b7d ]

In existing implementation, core_clk_setrate() is getting called
concurrently in concurrent video sessions. Before the previous call to
core_clk_setrate returns, new call to core_clk_setrate is invoked from
another video session running concurrently. This results in latest
calculated frequency being set (higher/lower) instead of actual frequency
required for that video session. It also results in stability crashes
mention below. These resources are specific to video core, hence keeping
under core lock would ensure that they are estimated for all running video
sessions and called once for the video core.

Crash logs:

[    1.900089] WARNING: CPU: 4 PID: 1 at drivers/opp/debugfs.c:33 opp_debug_remove_one+0x2c/0x48
[    1.908493] Modules linked in:
[    1.911524] CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.10.67 #35 f8edb8c30cf2dd6838495dd9ef9be47af7f5f60c
[    1.921036] Hardware name: Qualcomm Technologies, Inc. sc7280 IDP SKU2 platform (DT)
[    1.928673] pstate: 60800009 (nZCv daif -PAN +UAO -TCO BTYPE=--)
[    1.934608] pc : opp_debug_remove_one+0x2c/0x48
[    1.939080] lr : opp_debug_remove_one+0x2c/0x48
[    1.943560] sp : ffffffc011d7b7f0
[    1.946836] pmr_save: 000000e0
[    1.949854] x29: ffffffc011d7b7f0 x28: ffffffc010733bbc
[    1.955104] x27: ffffffc010733ba8 x26: ffffff8083cedd00
[    1.960355] x25: 0000000000000001 x24: 0000000000000000
[    1.965603] x23: ffffff8083cc2878 x22: ffffff8083ceb900
[    1.970852] x21: ffffff8083ceb910 x20: ffffff8083cc2800
[    1.976101] x19: ffffff8083ceb900 x18: 00000000ffff0a10
[    1.981352] x17: ffffff80837a5620 x16: 00000000000000ec
[    1.986601] x15: ffffffc010519ad4 x14: 0000000000000003
[    1.991849] x13: 0000000000000004 x12: 0000000000000001
[    1.997100] x11: c0000000ffffdfff x10: 00000000ffffffff
[    2.002348] x9 : d2627c580300dc00 x8 : d2627c580300dc00
[    2.007596] x7 : 0720072007200720 x6 : ffffff80802ecf00
[    2.012845] x5 : 0000000000190004 x4 : 0000000000000000
[    2.018094] x3 : ffffffc011d7b478 x2 : ffffffc011d7b480
[    2.023343] x1 : 00000000ffffdfff x0 : 0000000000000017
[    2.028594] Call trace:
[    2.031022]  opp_debug_remove_one+0x2c/0x48
[    2.035160]  dev_pm_opp_put+0x94/0xb0
[    2.038780]  _opp_remove_all+0x7c/0xc8
[    2.042486]  _opp_remove_all_static+0x54/0x7c
[    2.046796]  dev_pm_opp_remove_table+0x74/0x98
[    2.051183]  devm_pm_opp_of_table_release+0x18/0x24
[    2.056001]  devm_action_release+0x1c/0x28
[    2.060053]  release_nodes+0x23c/0x2b8
[    2.063760]  devres_release_group+0xcc/0xd0
[    2.067900]  component_bind+0xac/0x168
[    2.071608]  component_bind_all+0x98/0x124
[    2.075664]  msm_drm_bind+0x1e8/0x678
[    2.079287]  try_to_bring_up_master+0x60/0x134
[    2.083674]  component_master_add_with_match+0xd8/0x120
[    2.088834]  msm_pdev_probe+0x20c/0x2a0
[    2.092629]  platform_drv_probe+0x9c/0xbc
[    2.096598]  really_probe+0x11c/0x46c
[    2.100217]  driver_probe_device+0x8c/0xf0
[    2.104270]  device_driver_attach+0x54/0x78
[    2.108407]  __driver_attach+0x48/0x148
[    2.112201]  bus_for_each_dev+0x88/0xd4
[    2.115998]  driver_attach+0x2c/0x38
[    2.119534]  bus_add_driver+0x10c/0x200
[    2.123330]  driver_register+0x6c/0x104
[    2.127122]  __platform_driver_register+0x4c/0x58
[    2.131767]  msm_drm_register+0x6c/0x70
[    2.135560]  do_one_initcall+0x64/0x23c
[    2.139357]  do_initcall_level+0xac/0x15c
[    2.143321]  do_initcalls+0x5c/0x9c
[    2.146778]  do_basic_setup+0x2c/0x38
[    2.150401]  kernel_init_freeable+0xf8/0x15c
[    2.154622]  kernel_init+0x1c/0x11c
[    2.158079]  ret_from_fork+0x10/0x30
[    2.161615] ---[ end trace a2cc45a0f784b212 ]---

[    2.166272] Removing OPP: 300000000

Signed-off-by: Mansur Alisha Shaik <[email protected]>
Signed-off-by: Stanimir Varbanov <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
jackpot51 pushed a commit that referenced this pull request Apr 11, 2022
[ Upstream commit f2703de ]

After enabling CONFIG_SCHED_CORE (landed during 5.14 cycle),
2-core 2-thread-per-core interAptiv (CPS-driven) started emitting
the following:

[    0.025698] CPU1 revision is: 0001a120 (MIPS interAptiv (multi))
[    0.048183] ------------[ cut here ]------------
[    0.048187] WARNING: CPU: 1 PID: 0 at kernel/sched/core.c:6025 sched_core_cpu_starting+0x198/0x240
[    0.048220] Modules linked in:
[    0.048233] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.17.0-rc3+ #35 b7b319f24073fd9a3c2aa7ad15fb7993eec0b26f
[    0.048247] Stack : 817f0000 00000004 327804c8 810eb050 00000000 00000004 00000000 c314fdd1
[    0.048278]         830cbd64 819c0000 81800000 817f0000 83070bf4 00000001 830cbd08 00000000
[    0.048307]         00000000 00000000 815fcbc4 00000000 00000000 00000000 00000000 00000000
[    0.048334]         00000000 00000000 00000000 00000000 817f0000 00000000 00000000 817f6f34
[    0.048361]         817f0000 818a3c00 817f0000 00000004 00000000 00000000 4dc33260 0018c933
[    0.048389]         ...
[    0.048396] Call Trace:
[    0.048399] [<8105a7bc>] show_stack+0x3c/0x140
[    0.048424] [<8131c2a0>] dump_stack_lvl+0x60/0x80
[    0.048440] [<8108b5c0>] __warn+0xc0/0xf4
[    0.048454] [<8108b658>] warn_slowpath_fmt+0x64/0x10c
[    0.048467] [<810bd418>] sched_core_cpu_starting+0x198/0x240
[    0.048483] [<810c6514>] sched_cpu_starting+0x14/0x80
[    0.048497] [<8108c0f8>] cpuhp_invoke_callback_range+0x78/0x140
[    0.048510] [<8108d914>] notify_cpu_starting+0x94/0x140
[    0.048523] [<8106593c>] start_secondary+0xbc/0x280
[    0.048539]
[    0.048543] ---[ end trace 0000000000000000 ]---
[    0.048636] Synchronize counters for CPU 1: done.

...for each but CPU 0/boot.
Basic debug printks right before the mentioned line say:

[    0.048170] CPU: 1, smt_mask:

So smt_mask, which is sibling mask obviously, is empty when entering
the function.
This is critical, as sched_core_cpu_starting() calculates
core-scheduling parameters only once per CPU start, and it's crucial
to have all the parameters filled in at that moment (at least it
uses cpu_smt_mask() which in fact is `&cpu_sibling_map[cpu]` on
MIPS).

A bit of debugging led me to that set_cpu_sibling_map() performing
the actual map calculation, was being invocated after
notify_cpu_start(), and exactly the latter function starts CPU HP
callback round (sched_core_cpu_starting() is basically a CPU HP
callback).
While the flow is same on ARM64 (maps after the notifier, although
before calling set_cpu_online()), x86 started calculating sibling
maps earlier than starting the CPU HP callbacks in Linux 4.14 (see
[0] for the reference). Neither me nor my brief tests couldn't find
any potential caveats in calculating the maps right after performing
delay calibration, but the WARN splat is now gone.
The very same debug prints now yield exactly what I expected from
them:

[    0.048433] CPU: 1, smt_mask: 0-1

[0] https://git.kernel.org/pub/scm/linux/kernel/git/mips/linux.git/commit/?id=76ce7cfe35ef

Signed-off-by: Alexander Lobakin <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Thomas Bogendoerfer <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
mmstick pushed a commit that referenced this pull request Jul 26, 2023
[ Upstream commit 2aaa8a1 ]

With some IPv6 Ext Hdr (RPL, SRv6, etc.), we can send a packet that
has the link-local address as src and dst IP and will be forwarded to
an external IP in the IPv6 Ext Hdr.

For example, the script below generates a packet whose src IP is the
link-local address and dst is updated to 11::.

  # for f in $(find /proc/sys/net/ -name *seg6_enabled*); do echo 1 > $f; done
  # python3
  >>> from socket import *
  >>> from scapy.all import *
  >>>
  >>> SRC_ADDR = DST_ADDR = "fe80::5054:ff:fe12:3456"
  >>>
  >>> pkt = IPv6(src=SRC_ADDR, dst=DST_ADDR)
  >>> pkt /= IPv6ExtHdrSegmentRouting(type=4, addresses=["11::", "22::"], segleft=1)
  >>>
  >>> sk = socket(AF_INET6, SOCK_RAW, IPPROTO_RAW)
  >>> sk.sendto(bytes(pkt), (DST_ADDR, 0))

For such a packet, we call ip6_route_input() to look up a route for the
next destination in these three functions depending on the header type.

  * ipv6_rthdr_rcv()
  * ipv6_rpl_srh_rcv()
  * ipv6_srh_rcv()

If no route is found, ip6_null_entry is set to skb, and the following
dst_input(skb) calls ip6_pkt_drop().

Finally, in icmp6_dev(), we dereference skb_rt6_info(skb)->rt6i_idev->dev
as the input device is the loopback interface.  Then, we have to check if
skb_rt6_info(skb)->rt6i_idev is NULL or not to avoid NULL pointer deref
for ip6_null_entry.

BUG: kernel NULL pointer dereference, address: 0000000000000000
 PF: supervisor read access in kernel mode
 PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 157 Comm: python3 Not tainted 6.4.0-11996-gb121d614371c #35
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:icmp6_send (net/ipv6/icmp.c:436 net/ipv6/icmp.c:503)
Code: fe ff ff 48 c7 40 30 c0 86 5d 83 e8 c6 44 1c 00 e9 c8 fc ff ff 49 8b 46 58 48 83 e0 fe 0f 84 4a fb ff ff 48 8b 80 d0 00 00 00 <48> 8b 00 44 8b 88 e0 00 00 00 e9 34 fb ff ff 4d 85 ed 0f 85 69 01
RSP: 0018:ffffc90000003c70 EFLAGS: 00000286
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 00000000000000e0
RDX: 0000000000000021 RSI: 0000000000000000 RDI: ffff888006d72a18
RBP: ffffc90000003d80 R08: 0000000000000000 R09: 0000000000000001
R10: ffffc90000003d98 R11: 0000000000000040 R12: ffff888006d72a10
R13: 0000000000000000 R14: ffff8880057fb800 R15: ffffffff835d86c0
FS:  00007f9dc72ee740(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000000057b2000 CR4: 00000000007506f0
PKRU: 55555554
Call Trace:
 <IRQ>
 ip6_pkt_drop (net/ipv6/route.c:4513)
 ipv6_rthdr_rcv (net/ipv6/exthdrs.c:640 net/ipv6/exthdrs.c:686)
 ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:437 (discriminator 5))
 ip6_input_finish (./include/linux/rcupdate.h:781 net/ipv6/ip6_input.c:483)
 __netif_receive_skb_one_core (net/core/dev.c:5455)
 process_backlog (./include/linux/rcupdate.h:781 net/core/dev.c:5895)
 __napi_poll (net/core/dev.c:6460)
 net_rx_action (net/core/dev.c:6529 net/core/dev.c:6660)
 __do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/irq.h:142 kernel/softirq.c:554)
 do_softirq (kernel/softirq.c:454 kernel/softirq.c:441)
 </IRQ>
 <TASK>
 __local_bh_enable_ip (kernel/softirq.c:381)
 __dev_queue_xmit (net/core/dev.c:4231)
 ip6_finish_output2 (./include/net/neighbour.h:544 net/ipv6/ip6_output.c:135)
 rawv6_sendmsg (./include/net/dst.h:458 ./include/linux/netfilter.h:303 net/ipv6/raw.c:656 net/ipv6/raw.c:914)
 sock_sendmsg (net/socket.c:725 net/socket.c:748)
 __sys_sendto (net/socket.c:2134)
 __x64_sys_sendto (net/socket.c:2146 net/socket.c:2142 net/socket.c:2142)
 do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
 entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
RIP: 0033:0x7f9dc751baea
Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
RSP: 002b:00007ffe98712c38 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007ffe98712cf8 RCX: 00007f9dc751baea
RDX: 0000000000000060 RSI: 00007f9dc6460b90 RDI: 0000000000000003
RBP: 00007f9dc56e8be0 R08: 00007ffe98712d70 R09: 000000000000001c
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: ffffffffc4653600 R14: 0000000000000001 R15: 00007f9dc6af5d1b
 </TASK>
Modules linked in:
CR2: 0000000000000000
 ---[ end trace 0000000000000000 ]---
RIP: 0010:icmp6_send (net/ipv6/icmp.c:436 net/ipv6/icmp.c:503)
Code: fe ff ff 48 c7 40 30 c0 86 5d 83 e8 c6 44 1c 00 e9 c8 fc ff ff 49 8b 46 58 48 83 e0 fe 0f 84 4a fb ff ff 48 8b 80 d0 00 00 00 <48> 8b 00 44 8b 88 e0 00 00 00 e9 34 fb ff ff 4d 85 ed 0f 85 69 01
RSP: 0018:ffffc90000003c70 EFLAGS: 00000286
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 00000000000000e0
RDX: 0000000000000021 RSI: 0000000000000000 RDI: ffff888006d72a18
RBP: ffffc90000003d80 R08: 0000000000000000 R09: 0000000000000001
R10: ffffc90000003d98 R11: 0000000000000040 R12: ffff888006d72a10
R13: 0000000000000000 R14: ffff8880057fb800 R15: ffffffff835d86c0
FS:  00007f9dc72ee740(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000000057b2000 CR4: 00000000007506f0
PKRU: 55555554
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: disabled

Fixes: 4832c30 ("net: ipv6: put host and anycast routes on device with address")
Reported-by: Wang Yufen <[email protected]>
Closes: https://lore.kernel.org/netdev/[email protected]/
Signed-off-by: Kuniyuki Iwashima <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
mmstick pushed a commit that referenced this pull request Jan 2, 2024
[ Upstream commit 1834d62 ]

We should pass a pointer to global_hook to the get_proto_defrag_hook()
instead of its value, since the passed value won't be updated even if
the request module was loaded successfully.

Log:

[   54.915713] nf_defrag_ipv4 has bad registration
[   54.915779] WARNING: CPU: 3 PID: 6323 at net/netfilter/nf_bpf_link.c:62 get_proto_defrag_hook+0x137/0x160
[   54.915835] CPU: 3 PID: 6323 Comm: fentry Kdump: loaded Tainted: G            E      6.7.0-rc2+ #35
[   54.915839] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
[   54.915841] RIP: 0010:get_proto_defrag_hook+0x137/0x160
[   54.915844] Code: 4f 8c e8 2c cf 68 ff 80 3d db 83 9a 01 00 0f 85 74 ff ff ff 48 89 ee 48 c7 c7 8f 12 4f 8c c6 05 c4 83 9a 01 01 e8 09 ee 5f ff <0f> 0b e9 57 ff ff ff 49 8b 3c 24 4c 63 e5 e8 36 28 6c ff 4c 89 e0
[   54.915849] RSP: 0018:ffffb676003fbdb0 EFLAGS: 00010286
[   54.915852] RAX: 0000000000000023 RBX: ffff9596503d5600 RCX: ffff95996fce08c8
[   54.915854] RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff95996fce08c0
[   54.915855] RBP: ffffffff8c4f12de R08: 0000000000000000 R09: 00000000fffeffff
[   54.915859] R10: ffffb676003fbc70 R11: ffffffff8d363ae8 R12: 0000000000000000
[   54.915861] R13: ffffffff8e1f75c0 R14: ffffb676003c9000 R15: 00007ffd15e78ef0
[   54.915864] FS:  00007fb6e9cab740(0000) GS:ffff95996fcc0000(0000) knlGS:0000000000000000
[   54.915867] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   54.915868] CR2: 00007ffd15e75c40 CR3: 0000000101e62006 CR4: 0000000000360ef0
[   54.915870] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   54.915871] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   54.915873] Call Trace:
[   54.915891]  <TASK>
[   54.915894]  ? __warn+0x84/0x140
[   54.915905]  ? get_proto_defrag_hook+0x137/0x160
[   54.915908]  ? __report_bug+0xea/0x100
[   54.915925]  ? report_bug+0x2b/0x80
[   54.915928]  ? handle_bug+0x3c/0x70
[   54.915939]  ? exc_invalid_op+0x18/0x70
[   54.915942]  ? asm_exc_invalid_op+0x1a/0x20
[   54.915948]  ? get_proto_defrag_hook+0x137/0x160
[   54.915950]  bpf_nf_link_attach+0x1eb/0x240
[   54.915953]  link_create+0x173/0x290
[   54.915969]  __sys_bpf+0x588/0x8f0
[   54.915974]  __x64_sys_bpf+0x20/0x30
[   54.915977]  do_syscall_64+0x45/0xf0
[   54.915989]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[   54.915998] RIP: 0033:0x7fb6e9daa51d
[   54.916001] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2b 89 0c 00 f7 d8 64 89 01 48
[   54.916003] RSP: 002b:00007ffd15e78ed8 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
[   54.916006] RAX: ffffffffffffffda RBX: 00007ffd15e78fc0 RCX: 00007fb6e9daa51d
[   54.916007] RDX: 0000000000000040 RSI: 00007ffd15e78ef0 RDI: 000000000000001c
[   54.916009] RBP: 000000000000002d R08: 00007fb6e9e73a60 R09: 0000000000000001
[   54.916010] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000006
[   54.916012] R13: 0000000000000006 R14: 0000000000000000 R15: 0000000000000000
[   54.916014]  </TASK>
[   54.916015] ---[ end trace 0000000000000000 ]---

Fixes: 91721c2 ("netfilter: bpf: Support BPF_F_NETFILTER_IP_DEFRAG in netfilter link")
Signed-off-by: D. Wythe <[email protected]>
Acked-by: Daniel Xu <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
mmstick pushed a commit that referenced this pull request Jan 2, 2024
[ Upstream commit fe57575 ]

The `cgrp_local_storage` test triggers a kernel panic like:

  # ./test_progs -t cgrp_local_storage
  Can't find bpf_testmod.ko kernel module: -2
  WARNING! Selftests relying on bpf_testmod.ko will be skipped.
  [  550.930632] CPU 1 Unable to handle kernel paging request at virtual address 0000000000000080, era == ffff80000200be34, ra == ffff80000200be00
  [  550.931781] Oops[#1]:
  [  550.931966] CPU: 1 PID: 1303 Comm: test_progs Not tainted 6.7.0-rc2-loong-devel-g2f56bb0d2327 #35 a896aca3f4164f09cc346f89f2e09832e07be5f6
  [  550.932215] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022
  [  550.932403] pc ffff80000200be34 ra ffff80000200be00 tp 9000000108350000 sp 9000000108353dc0
  [  550.932545] a0 0000000000000000 a1 0000000000000517 a2 0000000000000118 a3 00007ffffbb15558
  [  550.932682] a4 00007ffffbb15620 a5 90000001004e7700 a6 0000000000000021 a7 0000000000000118
  [  550.932824] t0 ffff80000200bdc0 t1 0000000000000517 t2 0000000000000517 t3 00007ffff1c06ee0
  [  550.932961] t4 0000555578ae04d0 t5 fffffffffffffff8 t6 0000000000000004 t7 0000000000000020
  [  550.933097] t8 0000000000000040 u0 00000000000007b8 s9 9000000108353e00 s0 90000001004e7700
  [  550.933241] s1 9000000004005000 s2 0000000000000001 s3 0000000000000000 s4 0000555555eb2ec8
  [  550.933379] s5 00007ffffbb15bb8 s6 00007ffff1dafd60 s7 000055555663f610 s8 00007ffff1db0050
  [  550.933520]    ra: ffff80000200be00 bpf_prog_98f1b9e767be2a84_on_enter+0x40/0x200
  [  550.933911]   ERA: ffff80000200be34 bpf_prog_98f1b9e767be2a84_on_enter+0x74/0x200
  [  550.934105]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
  [  550.934596]  PRMD: 00000004 (PPLV0 +PIE -PWE)
  [  550.934712]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
  [  550.934836]  ECFG: 00071c1c (LIE=2-4,10-12 VS=7)
  [  550.934976] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
  [  550.935097]  BADV: 0000000000000080
  [  550.935181]  PRID: 0014c010 (Loongson-64bit, Loongson-3A5000)
  [  550.935291] Modules linked in:
  [  550.935391] Process test_progs (pid: 1303, threadinfo=000000006c3b1c41, task=0000000061f84a55)
  [  550.935643] Stack : 00007ffffbb15bb8 0000555555eb2ec8 0000000000000000 0000000000000001
  [  550.935844]         9000000004005000 ffff80001b864000 00007ffffbb15450 90000000029aa034
  [  550.935990]         0000000000000000 9000000108353ec0 0000000000000118 d07d9dfb09721a09
  [  550.936175]         0000000000000001 0000000000000000 9000000108353ec0 0000000000000118
  [  550.936314]         9000000101d46ad0 900000000290abf0 000055555663f610 0000000000000000
  [  550.936479]         0000000000000003 9000000108353ec0 00007ffffbb15450 90000000029d7288
  [  550.936635]         00007ffff1dafd60 000055555663f610 0000000000000000 0000000000000003
  [  550.936779]         9000000108353ec0 90000000035dd1f0 00007ffff1dafd58 9000000002841c5c
  [  550.936939]         0000000000000119 0000555555eea5a8 00007ffff1d78780 00007ffffbb153e0
  [  550.937083]         ffffffffffffffda 00007ffffbb15518 0000000000000040 00007ffffbb15558
  [  550.937224]         ...
  [  550.937299] Call Trace:
  [  550.937521] [<ffff80000200be34>] bpf_prog_98f1b9e767be2a84_on_enter+0x74/0x200
  [  550.937910] [<90000000029aa034>] bpf_trace_run2+0x90/0x154
  [  550.938105] [<900000000290abf0>] syscall_trace_enter.isra.0+0x1cc/0x200
  [  550.938224] [<90000000035dd1f0>] do_syscall+0x48/0x94
  [  550.938319] [<9000000002841c5c>] handle_syscall+0xbc/0x158
  [  550.938477]
  [  550.938607] Code: 580009ae  50016000  262402e4 <28c20085> 14092084  03a00084  16000024  03240084  00150006
  [  550.938851]
  [  550.939021] ---[ end trace 0000000000000000 ]---

Further investigation shows that this panic is triggered by memory
load operations:

  ptr = bpf_cgrp_storage_get(&map_a, task->cgroups->dfl_cgrp, 0,
                             BPF_LOCAL_STORAGE_GET_F_CREATE);

The expression `task->cgroups->dfl_cgrp` involves two memory load.
Since the field offset fits in imm12 or imm14, we use ldd or ldptrd
instructions. But both instructions have the side effect that it will
signed-extended the imm operand. Finally, we got the wrong addresses
and panics is inevitable.

Use a generic ldxd instruction to avoid this kind of issues.

With this change, we have:

  # ./test_progs -t cgrp_local_storage
  Can't find bpf_testmod.ko kernel module: -2
  WARNING! Selftests relying on bpf_testmod.ko will be skipped.
  test_cgrp_local_storage:PASS:join_cgroup /cgrp_local_storage 0 nsec
  #48/1    cgrp_local_storage/tp_btf:OK
  test_attach_cgroup:PASS:skel_open 0 nsec
  test_attach_cgroup:PASS:prog_attach 0 nsec
  test_attach_cgroup:PASS:prog_attach 0 nsec
  libbpf: prog 'update_cookie_tracing': failed to attach: ERROR: strerror_r(-524)=22
  test_attach_cgroup:FAIL:prog_attach unexpected error: -524
  #48/2    cgrp_local_storage/attach_cgroup:FAIL
  test_recursion:PASS:skel_open_and_load 0 nsec
  libbpf: prog 'on_lookup': failed to attach: ERROR: strerror_r(-524)=22
  libbpf: prog 'on_lookup': failed to auto-attach: -524
  test_recursion:FAIL:skel_attach unexpected error: -524 (errno 524)
  #48/3    cgrp_local_storage/recursion:FAIL
  #48/4    cgrp_local_storage/negative:OK
  #48/5    cgrp_local_storage/cgroup_iter_sleepable:OK
  test_yes_rcu_lock:PASS:skel_open 0 nsec
  test_yes_rcu_lock:PASS:skel_load 0 nsec
  libbpf: prog 'yes_rcu_lock': failed to attach: ERROR: strerror_r(-524)=22
  libbpf: prog 'yes_rcu_lock': failed to auto-attach: -524
  test_yes_rcu_lock:FAIL:skel_attach unexpected error: -524 (errno 524)
  #48/6    cgrp_local_storage/yes_rcu_lock:FAIL
  #48/7    cgrp_local_storage/no_rcu_lock:OK
  #48      cgrp_local_storage:FAIL

  All error logs:
  test_cgrp_local_storage:PASS:join_cgroup /cgrp_local_storage 0 nsec
  test_attach_cgroup:PASS:skel_open 0 nsec
  test_attach_cgroup:PASS:prog_attach 0 nsec
  test_attach_cgroup:PASS:prog_attach 0 nsec
  libbpf: prog 'update_cookie_tracing': failed to attach: ERROR: strerror_r(-524)=22
  test_attach_cgroup:FAIL:prog_attach unexpected error: -524
  #48/2    cgrp_local_storage/attach_cgroup:FAIL
  test_recursion:PASS:skel_open_and_load 0 nsec
  libbpf: prog 'on_lookup': failed to attach: ERROR: strerror_r(-524)=22
  libbpf: prog 'on_lookup': failed to auto-attach: -524
  test_recursion:FAIL:skel_attach unexpected error: -524 (errno 524)
  #48/3    cgrp_local_storage/recursion:FAIL
  test_yes_rcu_lock:PASS:skel_open 0 nsec
  test_yes_rcu_lock:PASS:skel_load 0 nsec
  libbpf: prog 'yes_rcu_lock': failed to attach: ERROR: strerror_r(-524)=22
  libbpf: prog 'yes_rcu_lock': failed to auto-attach: -524
  test_yes_rcu_lock:FAIL:skel_attach unexpected error: -524 (errno 524)
  #48/6    cgrp_local_storage/yes_rcu_lock:FAIL
  #48      cgrp_local_storage:FAIL
  Summary: 0/4 PASSED, 0 SKIPPED, 1 FAILED

No panics any more (The test still failed because lack of BPF trampoline
which I am actively working on).

Fixes: 5dc6155 ("LoongArch: Add BPF JIT support")
Signed-off-by: Hengqi Chen <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
mmstick pushed a commit that referenced this pull request Jan 2, 2024
[ Upstream commit 5d47ec2 ]

The `cls_redirect` test triggers a kernel panic like:

  # ./test_progs -t cls_redirect
  Can't find bpf_testmod.ko kernel module: -2
  WARNING! Selftests relying on bpf_testmod.ko will be skipped.
  [   30.938489] CPU 3 Unable to handle kernel paging request at virtual address fffffffffd814de0, era == ffff800002009fb8, ra == ffff800002009f9c
  [   30.939331] Oops[#1]:
  [   30.939513] CPU: 3 PID: 1260 Comm: test_progs Not tainted 6.7.0-rc2-loong-devel-g2f56bb0d2327 #35 a896aca3f4164f09cc346f89f2e09832e07be5f6
  [   30.939732] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022
  [   30.939901] pc ffff800002009fb8 ra ffff800002009f9c tp 9000000104da4000 sp 9000000104da7ab0
  [   30.940038] a0 fffffffffd814de0 a1 9000000104da7a68 a2 0000000000000000 a3 9000000104da7c10
  [   30.940183] a4 9000000104da7c14 a5 0000000000000002 a6 0000000000000021 a7 00005555904d7f90
  [   30.940321] t0 0000000000000110 t1 0000000000000000 t2 fffffffffd814de0 t3 0004c4b400000000
  [   30.940456] t4 ffffffffffffffff t5 00000000c3f63600 t6 0000000000000000 t7 0000000000000000
  [   30.940590] t8 000000000006d803 u0 0000000000000020 s9 9000000104da7b10 s0 900000010504c200
  [   30.940727] s1 fffffffffd814de0 s2 900000010504c200 s3 9000000104da7c10 s4 9000000104da7ad0
  [   30.940866] s5 0000000000000000 s6 90000000030e65bc s7 9000000104da7b44 s8 90000000044f6fc0
  [   30.941015]    ra: ffff800002009f9c bpf_prog_846803e5ae81417f_cls_redirect+0xa0/0x590
  [   30.941535]   ERA: ffff800002009fb8 bpf_prog_846803e5ae81417f_cls_redirect+0xbc/0x590
  [   30.941696]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
  [   30.942224]  PRMD: 00000004 (PPLV0 +PIE -PWE)
  [   30.942330]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
  [   30.942453]  ECFG: 00071c1c (LIE=2-4,10-12 VS=7)
  [   30.942612] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
  [   30.942764]  BADV: fffffffffd814de0
  [   30.942854]  PRID: 0014c010 (Loongson-64bit, Loongson-3A5000)
  [   30.942974] Modules linked in:
  [   30.943078] Process test_progs (pid: 1260, threadinfo=00000000ce303226, task=000000007d10bb76)
  [   30.943306] Stack : 900000010a064000 90000000044f6fc0 9000000104da7b48 0000000000000000
  [   30.943495]         0000000000000000 9000000104da7c14 9000000104da7c10 900000010504c200
  [   30.943626]         0000000000000001 ffff80001b88c000 9000000104da7b70 90000000030e6668
  [   30.943785]         0000000000000000 9000000104da7b58 ffff80001b88c048 9000000003d05000
  [   30.943936]         900000000303ac88 0000000000000000 0000000000000000 9000000104da7b70
  [   30.944091]         0000000000000000 0000000000000001 0000000731eeab00 0000000000000000
  [   30.944245]         ffff80001b88c000 0000000000000000 0000000000000000 54b99959429f83b8
  [   30.944402]         ffff80001b88c000 90000000044f6fc0 9000000101d70000 ffff80001b88c000
  [   30.944538]         000000000000005a 900000010504c200 900000010a064000 900000010a067000
  [   30.944697]         9000000104da7d88 0000000000000000 9000000003d05000 90000000030e794c
  [   30.944852]         ...
  [   30.944924] Call Trace:
  [   30.945120] [<ffff800002009fb8>] bpf_prog_846803e5ae81417f_cls_redirect+0xbc/0x590
  [   30.945650] [<90000000030e6668>] bpf_test_run+0x1ec/0x2f8
  [   30.945958] [<90000000030e794c>] bpf_prog_test_run_skb+0x31c/0x684
  [   30.946065] [<90000000026d4f68>] __sys_bpf+0x678/0x2724
  [   30.946159] [<90000000026d7288>] sys_bpf+0x20/0x2c
  [   30.946253] [<90000000032dd224>] do_syscall+0x7c/0x94
  [   30.946343] [<9000000002541c5c>] handle_syscall+0xbc/0x158
  [   30.946492]
  [   30.946549] Code: 0015030e  5c0009c0  5001d000 <28c00304> 02c00484  29c00304  00150009  2a42d2e4  0280200d
  [   30.946793]
  [   30.946971] ---[ end trace 0000000000000000 ]---
  [   32.093225] Kernel panic - not syncing: Fatal exception in interrupt
  [   32.093526] Kernel relocated by 0x2320000
  [   32.093630]  .text @ 0x9000000002520000
  [   32.093725]  .data @ 0x9000000003400000
  [   32.093792]  .bss  @ 0x9000000004413200
  [   34.971998] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

This is because we signed-extend function return values. When subprog
mode is enabled, we have:

  cls_redirect()
    -> get_global_metrics() returns pcpu ptr 0xfffffefffc00b480

The pointer returned is later signed-extended to 0xfffffffffc00b480 at
`BPF_JMP | BPF_EXIT`. During BPF prog run, this triggers unhandled page
fault and a kernel panic.

Drop the unnecessary signed-extension on return values like other
architectures do.

With this change, we have:

  # ./test_progs -t cls_redirect
  Can't find bpf_testmod.ko kernel module: -2
  WARNING! Selftests relying on bpf_testmod.ko will be skipped.
  #51/1    cls_redirect/cls_redirect_inlined:OK
  #51/2    cls_redirect/IPv4 TCP accept unknown (no hops, flags: SYN):OK
  #51/3    cls_redirect/IPv6 TCP accept unknown (no hops, flags: SYN):OK
  #51/4    cls_redirect/IPv4 TCP accept unknown (no hops, flags: ACK):OK
  #51/5    cls_redirect/IPv6 TCP accept unknown (no hops, flags: ACK):OK
  #51/6    cls_redirect/IPv4 TCP forward unknown (one hop, flags: ACK):OK
  #51/7    cls_redirect/IPv6 TCP forward unknown (one hop, flags: ACK):OK
  #51/8    cls_redirect/IPv4 TCP accept known (one hop, flags: ACK):OK
  #51/9    cls_redirect/IPv6 TCP accept known (one hop, flags: ACK):OK
  #51/10   cls_redirect/IPv4 UDP accept unknown (no hops, flags: none):OK
  #51/11   cls_redirect/IPv6 UDP accept unknown (no hops, flags: none):OK
  #51/12   cls_redirect/IPv4 UDP forward unknown (one hop, flags: none):OK
  #51/13   cls_redirect/IPv6 UDP forward unknown (one hop, flags: none):OK
  #51/14   cls_redirect/IPv4 UDP accept known (one hop, flags: none):OK
  #51/15   cls_redirect/IPv6 UDP accept known (one hop, flags: none):OK
  #51/16   cls_redirect/cls_redirect_subprogs:OK
  #51/17   cls_redirect/IPv4 TCP accept unknown (no hops, flags: SYN):OK
  #51/18   cls_redirect/IPv6 TCP accept unknown (no hops, flags: SYN):OK
  #51/19   cls_redirect/IPv4 TCP accept unknown (no hops, flags: ACK):OK
  #51/20   cls_redirect/IPv6 TCP accept unknown (no hops, flags: ACK):OK
  #51/21   cls_redirect/IPv4 TCP forward unknown (one hop, flags: ACK):OK
  #51/22   cls_redirect/IPv6 TCP forward unknown (one hop, flags: ACK):OK
  #51/23   cls_redirect/IPv4 TCP accept known (one hop, flags: ACK):OK
  #51/24   cls_redirect/IPv6 TCP accept known (one hop, flags: ACK):OK
  #51/25   cls_redirect/IPv4 UDP accept unknown (no hops, flags: none):OK
  #51/26   cls_redirect/IPv6 UDP accept unknown (no hops, flags: none):OK
  #51/27   cls_redirect/IPv4 UDP forward unknown (one hop, flags: none):OK
  #51/28   cls_redirect/IPv6 UDP forward unknown (one hop, flags: none):OK
  #51/29   cls_redirect/IPv4 UDP accept known (one hop, flags: none):OK
  #51/30   cls_redirect/IPv6 UDP accept known (one hop, flags: none):OK
  #51/31   cls_redirect/cls_redirect_dynptr:OK
  #51/32   cls_redirect/IPv4 TCP accept unknown (no hops, flags: SYN):OK
  #51/33   cls_redirect/IPv6 TCP accept unknown (no hops, flags: SYN):OK
  #51/34   cls_redirect/IPv4 TCP accept unknown (no hops, flags: ACK):OK
  #51/35   cls_redirect/IPv6 TCP accept unknown (no hops, flags: ACK):OK
  #51/36   cls_redirect/IPv4 TCP forward unknown (one hop, flags: ACK):OK
  #51/37   cls_redirect/IPv6 TCP forward unknown (one hop, flags: ACK):OK
  #51/38   cls_redirect/IPv4 TCP accept known (one hop, flags: ACK):OK
  #51/39   cls_redirect/IPv6 TCP accept known (one hop, flags: ACK):OK
  #51/40   cls_redirect/IPv4 UDP accept unknown (no hops, flags: none):OK
  #51/41   cls_redirect/IPv6 UDP accept unknown (no hops, flags: none):OK
  #51/42   cls_redirect/IPv4 UDP forward unknown (one hop, flags: none):OK
  #51/43   cls_redirect/IPv6 UDP forward unknown (one hop, flags: none):OK
  #51/44   cls_redirect/IPv4 UDP accept known (one hop, flags: none):OK
  #51/45   cls_redirect/IPv6 UDP accept known (one hop, flags: none):OK
  #51      cls_redirect:OK
  Summary: 1/45 PASSED, 0 SKIPPED, 0 FAILED

Fixes: 5dc6155 ("LoongArch: Add BPF JIT support")
Signed-off-by: Hengqi Chen <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
mmstick pushed a commit that referenced this pull request Jan 8, 2024
commit 04a342c upstream.

When we are slave role and receives l2cap conn req when encryption has
started, we should check the enc key size to avoid KNOB attack or BLUFFS
attack.
From SIG recommendation, implementations are advised to reject
service-level connections on an encrypted baseband link with key
strengths below 7 octets.
A simple and clear way to achieve this is to place the enc key size
check in hci_cc_read_enc_key_size()

The btmon log below shows the case that lacks enc key size check.

> HCI Event: Connect Request (0x04) plen 10
        Address: BB:22:33:44:55:99 (OUI BB-22-33)
        Class: 0x480104
          Major class: Computer (desktop, notebook, PDA, organizers)
          Minor class: Desktop workstation
          Capturing (Scanner, Microphone)
          Telephony (Cordless telephony, Modem, Headset)
        Link type: ACL (0x01)
< HCI Command: Accept Connection Request (0x01|0x0009) plen 7
        Address: BB:22:33:44:55:99 (OUI BB-22-33)
        Role: Peripheral (0x01)
> HCI Event: Command Status (0x0f) plen 4
      Accept Connection Request (0x01|0x0009) ncmd 2
        Status: Success (0x00)
> HCI Event: Connect Complete (0x03) plen 11
        Status: Success (0x00)
        Handle: 1
        Address: BB:22:33:44:55:99 (OUI BB-22-33)
        Link type: ACL (0x01)
        Encryption: Disabled (0x00)
...

> HCI Event: Encryption Change (0x08) plen 4
        Status: Success (0x00)
        Handle: 1 Address: BB:22:33:44:55:99 (OUI BB-22-33)
        Encryption: Enabled with E0 (0x01)
< HCI Command: Read Encryption Key Size (0x05|0x0008) plen 2
        Handle: 1 Address: BB:22:33:44:55:99 (OUI BB-22-33)
> HCI Event: Command Complete (0x0e) plen 7
      Read Encryption Key Size (0x05|0x0008) ncmd 2
        Status: Success (0x00)
        Handle: 1 Address: BB:22:33:44:55:99 (OUI BB-22-33)
        Key size: 6
// We should check the enc key size
...

> ACL Data RX: Handle 1 flags 0x02 dlen 12
      L2CAP: Connection Request (0x02) ident 3 len 4
        PSM: 25 (0x0019)
        Source CID: 64
< ACL Data TX: Handle 1 flags 0x00 dlen 16
      L2CAP: Connection Response (0x03) ident 3 len 8
        Destination CID: 64
        Source CID: 64
        Result: Connection pending (0x0001)
        Status: Authorization pending (0x0002)
> HCI Event: Number of Completed Packets (0x13) plen 5
        Num handles: 1
        Handle: 1 Address: BB:22:33:44:55:99 (OUI BB-22-33)
        Count: 1
        #35: len 16 (25 Kb/s)
        Latency: 5 msec (2-7 msec ~4 msec)
< ACL Data TX: Handle 1 flags 0x00 dlen 16
      L2CAP: Connection Response (0x03) ident 3 len 8
        Destination CID: 64
        Source CID: 64
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)

Cc: [email protected]
Signed-off-by: Alex Lu <[email protected]>
Signed-off-by: Max Chou <[email protected]>
Signed-off-by: Luiz Augusto von Dentz <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.