HACL* Raw RSA Integration #11

Closed
wants to merge 10,000 commits into from

karthikbhargavan

This PR adds an alternative implementation for the core RSA encryption and decryption functions in crypto/rsa.c. The code is taken from HACL*, but required changes to the source F* code, since it was embedded within RSA-PSS and had to be factored out. The RSA functions are quite (perhaps too) strict on their input and output lengths, so this code needs extensive testing to be sure that it handles all the kinds of inputs the kernel may throw at it.

tititiou36 and others added 30 commits September 19, 2023 12:30
[ Upstream commit 9c37785 ]

Some error paths don't call acpi_put_table() before returning.
Branch to the correct place instead of doing some direct return.

Fixes: 4d27328 ("tpm_crb: Add support for CRB devices based on Pluton")
Signed-off-by: Christophe JAILLET <[email protected]>
Acked-by: Matthew Garrett <[email protected]>
Reviewed-by: Jarkko Sakkinen <[email protected]>
Signed-off-by: Jarkko Sakkinen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 6df373b ]

In gfs2_logd(), switch from an open-coded wait loop to
wait_event_interruptible_timeout().

Signed-off-by: Andreas Gruenbacher <[email protected]>
Stable-dep-of: b74cd55 ("gfs2: low-memory forced flush fixes")
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit b74cd55 ]

First, function gfs2_ail_flush_reqd checks the SDF_FORCE_AIL_FLUSH flag
to determine if an AIL flush should be forced in low-memory situations.
However, it also immediately clears the flag, and when called repeatedly
as in function gfs2_logd, the flag will be lost.  Fix that by pulling
the SDF_FORCE_AIL_FLUSH flag check out of gfs2_ail_flush_reqd.
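The first problem can be shown in isolation. The following is a minimal sketch in plain C, not the gfs2 code: a plain `bool` stands in for the SDF_FORCE_AIL_FLUSH bit, and the names `force_flush`, `flush_reqd_clearing`, `flush_reqd_pure`, `caller_should_flush`, `pinned_blocks`, and `threshold` are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the SDF_FORCE_AIL_FLUSH bit. */
static bool force_flush;

/* Problematic shape: the predicate consumes the flag as a side effect. */
static bool flush_reqd_clearing(void)
{
	if (force_flush) {
		force_flush = false;
		return true;
	}
	return false;
}

/* Alternative shape: the predicate no longer looks at the flag at all ... */
static bool flush_reqd_pure(int pinned_blocks, int threshold)
{
	return pinned_blocks >= threshold;
}

/* ... and the caller tests and clears the flag itself, exactly once. */
static bool caller_should_flush(int pinned_blocks, int threshold)
{
	bool forced = force_flush;

	force_flush = false;
	return forced || flush_reqd_pure(pinned_blocks, threshold);
}
```

With the first shape, a second call in the same loop iteration no longer sees the request; with the second shape, the predicate can be called repeatedly without losing it.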

Second, function gfs2_writepages sets the SDF_FORCE_AIL_FLUSH flag
whether or not enough pages were written.  If enough pages could be
written, flushing the AIL is unnecessary, though.

Third, gfs2_writepages doesn't wake up logd after setting the
SDF_FORCE_AIL_FLUSH flag, so it can take a long time for logd to react.
It would be preferable to wake up logd, but that hurts the performance
of some workloads and we don't yet understand why, so don't wake up
logd for now.

Fixes: b066a4e ("gfs2: forcibly flush ail to relieve memory pressure")
Signed-off-by: Andreas Gruenbacher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit a493208 ]

Breaking out early when a match is found leads to an incorrect num_chans
value when more than one ipcc mailbox channel is used by the same device.
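The effect of the early exit can be sketched with a plain array scan. This is an illustration only, not the qcom-ipcc driver code; the `owner` array and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Early exit: reports at most one channel, however many belong to dev. */
static size_t count_chans_break(const int *owner, size_t n, int dev)
{
	size_t num = 0;

	for (size_t i = 0; i < n; i++) {
		if (owner[i] == dev) {
			num++;
			break;
		}
	}
	return num;
}

/* Full scan: every channel owned by dev is counted. */
static size_t count_chans_all(const int *owner, size_t n, int dev)
{
	size_t num = 0;

	for (size_t i = 0; i < n; i++)
		if (owner[i] == dev)
			num++;
	return num;
}
```

For an `owner` array of `{ 7, 3, 7, 7 }` and `dev == 7`, the first function returns 1 and the second returns 3.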

Fixes: e9d50e4 ("mailbox: qcom-ipcc: Dynamic alloc for channel arrangement")
Signed-off-by: Jonathan Marek <[email protected]>
Signed-off-by: Jassi Brar <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit a3b7039 ]

Buffer 'new_argv' is accessed without bound check after accessing with
bound check via 'new_argc' index.
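The general pattern, checking the index before every store rather than only the first, might look like the sketch below. It is not the kconfig code: `MAX_ARGS`, `collect_args`, and its parameters are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ARGS 4	/* hypothetical capacity, chosen for illustration */

/*
 * Copy up to MAX_ARGS pointers from in[] to out[].
 * Returns the number stored, or -1 if the input does not fit.
 */
static int collect_args(const char *const *in, size_t n_in, const char **out)
{
	int argc = 0;

	for (size_t i = 0; i < n_in; i++) {
		/* Check the index before every store, not only the first. */
		if (argc >= MAX_ARGS)
			return -1;
		out[argc++] = in[i];
	}
	return argc;
}
```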

Fixes: e298f3b ("kconfig: add built-in function support")
Co-developed-by: Ivanov Mikhail <[email protected]>
Signed-off-by: Konstantin Meskhidze <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 7f33105 ]

Commit 97d5f2e ("tools api fs: More thread safety for global
filesystem variables") introduces pthread_once, so the libpthread
should be added at link time, or we'll meet the following compile
error when 'make -C tools/mm':

  gcc -Wall -Wextra -I../lib/ -o page-types page-types.c ../lib/api/libapi.a
  ~/linux/tools/lib/api/fs/fs.c:146: undefined reference to `pthread_once'
  ~/linux/tools/lib/api/fs/fs.c:147: undefined reference to `pthread_once'
  ~/linux/tools/lib/api/fs/fs.c:148: undefined reference to `pthread_once'
  ~/linux/tools/lib/api/fs/fs.c:149: undefined reference to `pthread_once'
  ~/linux/tools/lib/api/fs/fs.c:150: undefined reference to `pthread_once'
  /usr/bin/ld: ../lib/api/libapi.a(libapi-in.o):~/linux/tools/lib/api/fs/fs.c:151:
  more undefined references to `pthread_once' follow
  collect2: error: ld returned 1 exit status
  make: *** [Makefile:22: page-types] Error 1

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 97d5f2e ("tools api fs: More thread safety for global filesystem variables")
Signed-off-by: Xie XiuQi <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 2e00b8b ]

If the device drops into ultra-low-power mode before being placed
into normal-power mode as part of ATI being triggered, the device
does not assert any interrupts until the ATI routine is restarted
two seconds later.

Solve this problem by adopting the vendor's recommendation, which
calls for the device to be placed into normal-power mode prior to
being configured and ATI being triggered.

The original implementation followed this sequence, but the order
was inadvertently changed as part of the resolution of a separate
erratum.

Fixes: 1e4189d ("Input: iqs7222 - protect volatile registers")
Signed-off-by: Jeff LaBundy <[email protected]>
Link: https://lore.kernel.org/r/ZKrpHc2Ji9qR25r2@nixie71
Signed-off-by: Dmitry Torokhov <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 7962ef1 ]

Commit 3cb4d5e ("perf trace: Free syscall tp fields in
evsel->priv") freed evsel->priv only if strcmp(evsel->tp_format->system,
"syscalls") returned zero, while the corresponding initialization of
evsel->priv was performed only if it was _not_ zero, i.e. if the tp
system wasn't 'syscalls'.

Just stop looking for that and free it if evsel->priv was set, which
should be equivalent.

Also use the pre-existing evsel_trace__delete() function.

This resolves these leaks, detected with:

  $ make EXTRA_CFLAGS="-fsanitize=address" BUILD_BPF_SKEL=1 CORESIGHT=1 O=/tmp/build/perf-tools-next -C tools/perf install-bin

  =================================================================
  ==481565==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 40 byte(s) in 1 object(s) allocated from:
      #0 0x7f7343cba097 in calloc (/lib64/libasan.so.8+0xba097)
      #1 0x987966 in zalloc (/home/acme/bin/perf+0x987966)
      #2 0x52f9b9 in evsel_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:307
      #3 0x52f9b9 in evsel__syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:333
      #4 0x52f9b9 in evsel__init_raw_syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:458
      #5 0x52f9b9 in perf_evsel__raw_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:480
      #6 0x540e8b in trace__add_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3212
      #7 0x540e8b in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3891
      #8 0x540e8b in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5156
      #9 0x5ef262 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323
      #10 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377
      #11 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421
      #12 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537
      #13 0x7f7342c4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)

  Direct leak of 40 byte(s) in 1 object(s) allocated from:
      #0 0x7f7343cba097 in calloc (/lib64/libasan.so.8+0xba097)
      #1 0x987966 in zalloc (/home/acme/bin/perf+0x987966)
      #2 0x52f9b9 in evsel_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:307
      #3 0x52f9b9 in evsel__syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:333
      #4 0x52f9b9 in evsel__init_raw_syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:458
      #5 0x52f9b9 in perf_evsel__raw_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:480
      #6 0x540dd1 in trace__add_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3205
      #7 0x540dd1 in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3891
      #8 0x540dd1 in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5156
      #9 0x5ef262 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323
      #10 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377
      #11 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421
      #12 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537
      #13 0x7f7342c4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)

  SUMMARY: AddressSanitizer: 80 byte(s) leaked in 2 allocation(s).
  [root@quaco ~]#

With this we plug all leaks with "perf trace sleep 1".

Fixes: 3cb4d5e ("perf trace: Free syscall tp fields in evsel->priv")
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 0323e8f ]

Allocate driver data as the first resource in the probe function. This way it
can be used during allocation of the other resources (instead of assigning
these to local variables first and updating driver data only once it is
allocated). Also, as driver data is allocated using a devm function, this
should happen first so that the error path and the remove function free
resources in the reverse order of allocation.

Signed-off-by: Uwe Kleine-König <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
Stable-dep-of: c116223 ("pwm: atmel-tcb: Fix resource freeing in error path and remove")
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit c116223 ]

Several resources were not freed in the error path and the remove
function. Add the forgotten items.

Fixes: 34cbcd7 ("pwm: atmel-tcb: Add sama5d2 support")
Fixes: 061f857 ("pwm: atmel-tcb: Switch to new binding")
Signed-off-by: Uwe Kleine-König <[email protected]>
Reviewed-by: Claudiu Beznea <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 4c09e20 ]

As pointed out by Uwe Kleine-König[1], the changes introduced in
commit c1ff7da ("video: backlight: lp855x: Get PWM for PWM mode
during probe") caused the PWM state set up by the bootloader to be
re-set when the driver is probed. This differs from the behavior from
before that patch, where the PWM state would be initialized on the
first brightness change.

Fix this by moving the PWM state initialization into the PWM control
function. Add a new variable, needs_pwm_init, to the device info struct
to allow us to check whether we need the initialization, or whether it
has already been done.
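The deferred-initialization pattern described here can be sketched as follows. Only the field name `needs_pwm_init` comes from the commit message; the struct, the other fields, and the function name are hypothetical, and the real driver programs PWM hardware rather than incrementing a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device state; only needs_pwm_init is named in the commit. */
struct bl_dev {
	bool needs_pwm_init;
	int init_calls;	/* counts how often the one-time setup ran */
	int duty;
};

/* Deferred setup: runs on the first brightness change, not at probe. */
static void bl_pwm_ctrl(struct bl_dev *dev, int brightness)
{
	if (dev->needs_pwm_init) {
		dev->init_calls++;
		dev->needs_pwm_init = false;
	}
	dev->duty = brightness;
}
```

Because probe only sets the flag, whatever state the bootloader left behind stays untouched until the first call to the control function.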

[1] https://lore.kernel.org/lkml/[email protected]/

Fixes: c1ff7da ("video: backlight: lp855x: Get PWM for PWM mode during probe")
Signed-off-by: Artur Weber <[email protected]>
Reviewed-by: Daniel Thompson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lee Jones <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
…al power state

[ Upstream commit fe1328b ]

So, let's drop output GPIO direction check and only check GPIO value to set
the initial power state.

Fixes: 706dc68 ("backlight: gpio: Explicitly set the direction of the GPIO")
Signed-off-by: Liu Ying <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Acked-by: Linus Walleij <[email protected]>
Acked-by: Bartosz Golaszewski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lee Jones <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit a7a3252 ]

Split cases in event_pmu for greater accuracy.

Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Stable-dep-of: b30d4f0 ("perf parse-events: Additional error reporting")
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 77cdd78 ]

Migration to improve error reporting as YYABORT cases should carry
event parsing errors.

Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Stable-dep-of: b30d4f0 ("perf parse-events: Additional error reporting")
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit b52cb99 ]

Add PE_ABORT that will YYNOMEM or YYABORT accordingly.

Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Stable-dep-of: b30d4f0 ("perf parse-events: Additional error reporting")
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit b30d4f0 ]

When no events or PMUs match report an error for event_pmu:

Before:
```
$ perf stat -e 'asdfasdf' -a sleep 1
Run 'perf list' for a list of valid events

 Usage: perf stat [<options>] [<command>]

    -e, --event <event>   event selector. use 'perf list' to list available events
```

After:
```
$ perf stat -e 'asdfasdf' -a sleep 1
event syntax error: 'asdfasdf'
                     \___ Bad event name

Unabled to find PMU or event on a PMU of 'asdfasdf'
Run 'perf list' for a list of valid events

 Usage: perf stat [<options>] [<command>]

    -e, --event <event>   event selector. use 'perf list' to list available events
```

Fixes the inadvertent removal when hybrid parsing was modified.

Fixes: 70c90e4 ("perf parse-events: Avoid scanning PMUs before parsing")
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 389fbbe ]

Immediately mark NMIs as unmasked in response to #VMGEXIT(NMI complete)
instead of setting awaiting_iret_completion and waiting until the *next*
VM-Exit to unmask NMIs.  The whole point of "NMI complete" is that the
guest is responsible for telling the hypervisor when it's safe to inject
an NMI, i.e. there's no need to wait.  And because there's no IRET to
single-step, the next VM-Exit could be a long time coming, i.e. KVM could
incorrectly hold an NMI pending for far longer than what is required and
expected.

Opportunistically fix a stale reference to HF_IRET_MASK.

Fixes: 916b54a ("KVM: x86: Move HF_NMI_MASK and HF_IRET_MASK into "struct vcpu_svm"")
Fixes: 4444dfe ("KVM: SVM: Add NMI support for an SEV-ES guest")
Cc: Tom Lendacky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 687fe7d ]

Remove option having i2c client contain raw gpio number instead of proper
IRQ number. There are no users of this facility in mainline and it will
allow cleaning up the driver code with regard to wakeup handling, etc.

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Dmitry Torokhov <[email protected]>
Stable-dep-of: cc141c3 ("Input: tca6416-keypad - fix interrupt enable disbalance")
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit cc141c3 ]

The driver has been switched to use IRQF_NO_AUTOEN, but in the error
unwinding and remove paths calls to enable_irq() were left in place, which
will lead to an incorrect enable counter value.

Fixes: bcd9730 ("Input: move to use request_irq by IRQF_NO_AUTOEN flag")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Dmitry Torokhov <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 979e9c9 ]

In 616b14b ("perf build: Conditionally define NDEBUG") we
started using NDEBUG=1 when DEBUG=1 isn't present, so code that is
enclosed with assert() is not called.

In dd317df ("perf build: Make binutil libraries opt in") we
stopped linking against binutils-devel, for licensing reasons.

Recently people asked me why annotation of BPF programs wasn't working,
i.e. this:

  $ perf annotate bpf_prog_5280546344e3f45c_kfree_skb

was returning:

  case SYMBOL_ANNOTATE_ERRNO__NO_LIBOPCODES_FOR_BPF:
     scnprintf(buf, buflen, "Please link with binutils's libopcode to enable BPF annotation");

This was on a Fedora RPM, so it's new enough that I had to try to test by
rebuilding using BUILD_NONDISTRO=1, only to get it segfaulting on me.

This combination caused this libopcode function not to be called:

        assert(bfd_check_format(bfdf, bfd_object));

Changing it to:

	if (!bfd_check_format(bfdf, bfd_object))
		abort();

That made it work. Looking at this "check" function made me realize it
changes the 'bfdf' internal state, i.e. we had better call it.

So stop using assert() on it, just call it and abort if it fails.
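The underlying C behavior is easy to reproduce outside perf. In the sketch below, `check_format` is a hypothetical stand-in for a function with a side effect such as bfd_check_format(); the file defines NDEBUG for one function and undefines it again afterwards, relying on the standard guarantee that each inclusion of `<assert.h>` redefines `assert()` according to the current state of NDEBUG.

```c
#include <stdbool.h>
#include <stdlib.h>

static int calls;

/* Hypothetical stand-in for a check function with a side effect. */
static bool check_format(void)
{
	calls++;
	return true;
}

#define NDEBUG
#include <assert.h>

/* Under NDEBUG the assert() argument is never evaluated. */
static int calls_with_assert(void)
{
	calls = 0;
	assert(check_format());
	return calls;
}

#undef NDEBUG
#include <assert.h>	/* re-including redefines assert() as active */

/* An explicit check always runs, whatever NDEBUG says. */
static int calls_with_explicit_check(void)
{
	calls = 0;
	if (!check_format())
		abort();
	return calls;
}
```

Here `calls_with_assert()` returns 0 because the call was compiled out, while `calls_with_explicit_check()` returns 1.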

Probably it is better to propagate the error, etc, but it seems it is
unlikely to fail from the usage done so far and we really need to stop
using libopcodes, so do the quick fix above and move on.

With it we have BPF annotation back working when built with
BUILD_NONDISTRO=1:

  ⬢[acme@toolbox perf-tools-next]$ perf annotate --stdio2 bpf_prog_5280546344e3f45c_kfree_skb   | head
  No kallsyms or vmlinux with build-id 939bc71a1a51cdc434e60af93c7e734f7d5c0e7e was found
  Samples: 12  of event 'cpu-clock:ppp', 4000 Hz, Event count (approx.): 3000000, [percent: local period]
  bpf_prog_5280546344e3f45c_kfree_skb() bpf_prog_5280546344e3f45c_kfree_skb
  Percent      int kfree_skb(struct trace_event_raw_kfree_skb *args) {
                 nop
   33.33         xchg   %ax,%ax
                 push   %rbp
                 mov    %rsp,%rbp
                 sub    $0x180,%rsp
                 push   %rbx
                 push   %r13
  ⬢[acme@toolbox perf-tools-next]$

Fixes: 6987561 ("perf annotate: Enable annotation of BPF programs")
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mohamed Mahmoud <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Dave Tucker <[email protected]>
Cc: Derek Barbosa <[email protected]>
Cc: Song Liu <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
…vm()

[ Upstream commit 5df8ecf ]

Drop the explicit check on the extended CPUID level in cpu_has_svm(), the
kernel's cached CPUID info will leave the entire SVM leaf unset if said
leaf is not supported by hardware.  Prior to using cached information,
the check was needed to avoid false positives due to Intel's rather crazy
CPUID behavior of returning the values of the maximum supported leaf if
the specified leaf is unsupported.

Fixes: 682a810 ("x86/kvm/svm: Simplify cpu_has_svm()")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 8c49c6e ]

Commit 3fd7a16 ("perf script: Add 'cgroup' field for output")
added support for printing cgroup path in perf script output.

It was okay if you didn't want any stacks:

    $ sudo perf script --comms jpegtran:23f4bf -F comm,tid,cpu,time,cgroup
    jpegtran:23f4bf 3321915 [013] 404718.587488:  /idle.slice/polish.service
    jpegtran:23f4bf 3321915 [031] 404718.592073:  /idle.slice/polish.service

With stacks it gets messier as cgroup is printed after the stack:

    $ perf script --comms jpegtran:23f4bf -F comm,tid,cpu,time,cgroup,ip,sym
    jpegtran:23f4bf 3321915 [013] 404718.587488:
                    5c554 compress_output
                    570d9 jpeg_finish_compress
                    3476e jpegtran_main
                    330ee jpegtran::main
                    326e2 core::ops::function::FnOnce::call_once (inlined)
                    326e2 std::sys_common::backtrace::__rust_begin_short_backtrace
    /idle.slice/polish.service
    jpegtran:23f4bf 3321915 [031] 404718.592073:
                    8474d jsimd_encode_mcu_AC_first_prepare_sse2.PADDING
                55af68e62fff [unknown]
    /idle.slice/polish.service

Let's instead print cgroup on the same line as comm:

    $ perf script --comms jpegtran:23f4bf -F comm,tid,cpu,time,cgroup,ip,sym
    jpegtran:23f4bf 3321915 [013] 404718.587488:  /idle.slice/polish.service
                    5c554 compress_output
                    570d9 jpeg_finish_compress
                    3476e jpegtran_main
                    330ee jpegtran::main
                    326e2 core::ops::function::FnOnce::call_once (inlined)
                    326e2 std::sys_common::backtrace::__rust_begin_short_backtrace

    jpegtran:23f4bf 3321915 [031] 404718.592073:  /idle.slice/polish.service
                    8474d jsimd_encode_mcu_AC_first_prepare_sse2.PADDING
                55af68e62fff [unknown]

Fixes: 3fd7a16 ("perf script: Add 'cgroup' field for output")
Signed-off-by: Ivan Babrou <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit dc7f01f ]

For logical OR operator, the actual sample_flags are in the 'groups'
list so it needs to check entries in the list instead.  Otherwise it
would show the following error message.

  $ sudo perf record -a -e cycles:p --filter 'period > 100 || weight > 0' sleep 1
  Error: cycles:p event does not have sample flags 0
  failed to set filter "BPF" on event cycles:p with 2 (No such file or directory)

Actually it should warn that 'weight' is used without the WEIGHT flag.

  Error: cycles:p event does not have PERF_SAMPLE_WEIGHT
   Hint: please add -W option to perf record
  failed to set filter "BPF" on event cycles:p with 2 (No such file or directory)

Fixes: 4310551 ("perf bpf filter: Show warning for missing sample flags")
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
…find_symbol_fb()

[ Upstream commit 42c6dd9 ]

As thread__find_symbol_fb() will end up calling thread__find_map() and
it in turn will call these on uninitialized memory:

        maps__zput(al->maps);
        map__zput(al->map);
        thread__zput(al->thread);

Fixes: 0dd5041 ("perf addr_location: Add init/exit/copy functions")
Reviewed-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Disha Goel <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 82b0a10 ]

Add perf_dlfilter_fns.al_cleanup() to do addr_location__exit() on data
passed via perf_dlfilter_fns.resolve_address().

Add dlfilter-test-api-v2 to the "dlfilter C API" test to test it.

Update documentation, clarifying that data returned by APIs should not
be dereferenced after filter_event() and filter_event_early() return.

Fixes: 0dd5041 ("perf addr_location: Add init/exit/copy functions")
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
…latform

[ Upstream commit 3286f88 ]

Update the description for some of the JSON/events for power10 platform.

Fixes: 32daa5d ("perf vendor events: Initial JSON/events list for power10 platform")
Signed-off-by: Kajol Jain <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Disha Goel <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit e104df9 ]

Drop some of the JSON/events for power10 platform due to counter
data mismatch.

Fixes: 32daa5d ("perf vendor events: Initial JSON/events list for power10 platform")
Signed-off-by: Kajol Jain <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Disha Goel <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
…tform

[ Upstream commit 4836b9a ]

Drop the STORES_PER_INST metric event for the power10 platform, as its
metric expression uses the dropped event PM_ST_FIN.

Fixes: 3ca3af7 ("perf vendor events power10: Add metric events JSON file for power10 platform")
Signed-off-by: Kajol Jain <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Disha Goel <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
… platform

[ Upstream commit 7d473f4 ]

Move some of the power10 JSON/events to appropriate files.

Fixes: 32daa5d ("perf vendor events: Initial JSON/events list for power10 platform")
Signed-off-by: Kajol Jain <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Disha Goel <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit edd65d2 ]

Update metric event name for some of the JSON/metric events for
power10 platform.

Fixes: 3ca3af7 ("perf vendor events power10: Add metric events JSON file for power10 platform")
Signed-off-by: Kajol Jain <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Disha Goel <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
github-actions bot pushed a commit to sirdarckcat/linux-1 that referenced this pull request Oct 10, 2024
commit 9af2efe upstream.

The fields in the hist_entry are filled on-demand which means they only
have meaningful values when relevant sort keys are used.

So if neither of 'dso' nor 'sym' sort keys are used, the map/symbols in
the hist entry can be garbage.  So it shouldn't access it
unconditionally.

I got a segfault, when I wanted to see cgroup profiles.

  $ sudo perf record -a --all-cgroups --synth=cgroup true

  $ sudo perf report -s cgroup

  Program received signal SIGSEGV, Segmentation fault.
  0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48
  48		return RC_CHK_ACCESS(map)->dso;
  (gdb) bt
  #0  0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48
  #1  0x00005555557aa39b in map__load (map=0x0) at util/map.c:344
  #2  0x00005555557aa592 in map__find_symbol (map=0x0, addr=140736115941088) at util/map.c:385
  #3  0x00005555557ef000 in hists__findnew_entry (hists=0x555556039d60, entry=0x7fffffffa4c0, al=0x7fffffffa8c0, sample_self=true)
      at util/hist.c:644
  #4  0x00005555557ef61c in __hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0,
      block_info=0x0, sample=0x7fffffffaa90, sample_self=true, ops=0x0) at util/hist.c:761
  #5  0x00005555557ef71f in hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0,
      sample=0x7fffffffaa90, sample_self=true) at util/hist.c:779
  #6  0x00005555557f00fb in iter_add_single_normal_entry (iter=0x7fffffffa900, al=0x7fffffffa8c0) at util/hist.c:1015
  #7  0x00005555557f09a7 in hist_entry_iter__add (iter=0x7fffffffa900, al=0x7fffffffa8c0, max_stack_depth=127, arg=0x7fffffffbce0)
      at util/hist.c:1260
  #8  0x00005555555ba7ce in process_sample_event (tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0,
      machine=0x5555560388e8) at builtin-report.c:334
  #9  0x00005555557b30c8 in evlist__deliver_sample (evlist=0x555556039010, tool=0x7fffffffbce0, event=0x7ffff7c14128,
      sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at util/session.c:1232
  #10 0x00005555557b32bc in machines__deliver_event (machines=0x5555560388e8, evlist=0x555556039010, event=0x7ffff7c14128,
      sample=0x7fffffffaa90, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1271
  #11 0x00005555557b3848 in perf_session__deliver_event (session=0x5555560386d0, event=0x7ffff7c14128, tool=0x7fffffffbce0,
      file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1354
  #12 0x00005555557affaf in ordered_events__deliver_event (oe=0x555556038e60, event=0x555556135aa0) at util/session.c:132
  #13 0x00005555557bb605 in do_flush (oe=0x555556038e60, show_progress=false) at util/ordered-events.c:245
  #14 0x00005555557bb95c in __ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND, timestamp=0) at util/ordered-events.c:324
  #15 0x00005555557bba46 in ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND) at util/ordered-events.c:342
  #16 0x00005555557b1b3b in perf_event__process_finished_round (tool=0x7fffffffbce0, event=0x7ffff7c15bb8, oe=0x555556038e60)
      at util/session.c:780
  #17 0x00005555557b3b27 in perf_session__process_user_event (session=0x5555560386d0, event=0x7ffff7c15bb8, file_offset=117688,
      file_path=0x555556038ff0 "perf.data") at util/session.c:1406

As you can see, entry->ms.map was NULL even though he->ms.map has a
value.  This is because the 'sym' sort key is not given, so the code cannot
assume that he->ms.sym and entry->ms.sym are the same.  I only checked the
'sym' sort key here as it implies 'dso' behavior (so maps are the same).
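
The guard described above can be sketched in plain C as follows. This is a
minimal userspace model, not the perf code: the structs are simplified
stand-ins that borrow perf's names, and lookup_symbol(), update_entry() and
the sort_by_sym flag are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for illustration only; not the real perf types. */
struct symbol { const char *name; };
struct map { int id; };
struct map_symbol { struct map *map; struct symbol *sym; };
struct hist_entry { struct map_symbol ms; };

/* Hypothetical lookup: like the crashing path, it must never see a NULL map. */
static struct symbol *lookup_symbol(struct map *map, struct symbol *table)
{
	assert(map != NULL);
	return &table[map->id];
}

/*
 * Refresh he's map and symbol from a new sample entry, but only when the
 * 'sym' sort key is in use.  Without that key the ms fields of 'entry'
 * were never filled in, so they must not be read.
 */
static void update_entry(struct hist_entry *he, const struct hist_entry *entry,
			 bool sort_by_sym, struct symbol *table)
{
	if (!sort_by_sym)
		return; /* ms.map / ms.sym may be unset: leave he alone */

	if (he->ms.map != entry->ms.map) {
		he->ms.map = entry->ms.map;
		he->ms.sym = lookup_symbol(entry->ms.map, table);
	}
}
```

With the flag cleared, an entry whose map pointer is NULL is simply ignored
instead of being passed to the lookup.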

Fixes: ac01c8c ("perf hist: Update hist symbol when updating maps")
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Matt Fleming <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
github-actions bot pushed a commit to sirdarckcat/linux-1 that referenced this pull request Oct 10, 2024
commit 23dfdb5 upstream.

The following kernel trace can be triggered with fstest generic/629 when
executed against a filesystem with the fast-commit feature enabled:

INFO: trying to register non-static key.
The code is fine but needs lockdep annotation, or maybe
you didn't initialize this object before use?
turning off the locking correctness validator.
CPU: 0 PID: 866 Comm: mount Not tainted 6.10.0+ #11
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-prebuilt.qemu.org 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x66/0x90
 register_lock_class+0x759/0x7d0
 __lock_acquire+0x85/0x2630
 ? __find_get_block+0xb4/0x380
 lock_acquire+0xd1/0x2d0
 ? __ext4_journal_get_write_access+0xd5/0x160
 _raw_spin_lock+0x33/0x40
 ? __ext4_journal_get_write_access+0xd5/0x160
 __ext4_journal_get_write_access+0xd5/0x160
 ext4_reserve_inode_write+0x61/0xb0
 __ext4_mark_inode_dirty+0x79/0x270
 ? ext4_ext_replay_set_iblocks+0x2f8/0x450
 ext4_ext_replay_set_iblocks+0x330/0x450
 ext4_fc_replay+0x14c8/0x1540
 ? jread+0x88/0x2e0
 ? rcu_is_watching+0x11/0x40
 do_one_pass+0x447/0xd00
 jbd2_journal_recover+0x139/0x1b0
 jbd2_journal_load+0x96/0x390
 ext4_load_and_init_journal+0x253/0xd40
 ext4_fill_super+0x2cc6/0x3180
...

In the replay path there's an attempt to lock sbi->s_bdev_wb_lock in
function ext4_check_bdev_write_error().  Unfortunately, at this point this
spinlock has not been initialized yet.  Moving its initialization to an
earlier point in __ext4_fill_super() fixes this splat.
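
The ordering problem can be illustrated with a small userspace model. This
is only a sketch under stated assumptions: every model_* name below is
hypothetical, and the "lock" is a pair of flags that reports the
uninitialized use that lockdep complains about in the trace above. It does
not reproduce any ext4 code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; not ext4 or kernel definitions. */
struct model_lock { bool initialized; bool held; };
struct model_sb { struct model_lock bdev_wb_lock; int write_errors_checked; };

static void model_lock_init(struct model_lock *l)
{
	l->initialized = true;
	l->held = false;
}

/* Returns false where a lock validator would warn about an uninitialized lock. */
static bool model_lock_acquire(struct model_lock *l)
{
	if (!l->initialized)
		return false;
	l->held = true;
	return true;
}

static void model_lock_release(struct model_lock *l)
{
	l->held = false;
}

/* Stands in for the write-error check reached from the replay path. */
static bool model_check_write_error(struct model_sb *sb)
{
	if (!model_lock_acquire(&sb->bdev_wb_lock))
		return false;
	sb->write_errors_checked++;
	model_lock_release(&sb->bdev_wb_lock);
	return true;
}

/* Mount-time setup: journal replay runs in the middle of it. */
static bool model_fill_super(struct model_sb *sb, bool init_lock_early)
{
	bool ok;

	assert(sb != 0);
	if (init_lock_early)
		model_lock_init(&sb->bdev_wb_lock); /* fixed ordering */
	ok = model_check_write_error(sb);           /* replay path takes the lock */
	if (!init_lock_early)
		model_lock_init(&sb->bdev_wb_lock); /* too late: replay already ran */
	return ok;
}
```

Calling model_fill_super() with init_lock_early set to false models the
reported splat; setting it to true models the reordered initialization.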

Signed-off-by: Luis Henriques (SUSE) <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
github-actions bot pushed a commit to sirdarckcat/linux-1 that referenced this pull request Oct 10, 2024
commit ac01c8c upstream.

AddressSanitizer found a use-after-free bug in the symbol code which
manifested as 'perf top' segfaulting.

  ==1238389==ERROR: AddressSanitizer: heap-use-after-free on address 0x60b00c48844b at pc 0x5650d8035961 bp 0x7f751aaecc90 sp 0x7f751aaecc80
  READ of size 1 at 0x60b00c48844b thread T193
      #0 0x5650d8035960 in _sort__sym_cmp util/sort.c:310
      #1 0x5650d8043744 in hist_entry__cmp util/hist.c:1286
      #2 0x5650d8043951 in hists__findnew_entry util/hist.c:614
      #3 0x5650d804568f in __hists__add_entry util/hist.c:754
      #4 0x5650d8045bf9 in hists__add_entry util/hist.c:772
      #5 0x5650d8045df1 in iter_add_single_normal_entry util/hist.c:997
      #6 0x5650d8043326 in hist_entry_iter__add util/hist.c:1242
      #7 0x5650d7ceeefe in perf_event__process_sample /home/matt/src/linux/tools/perf/builtin-top.c:845
      #8 0x5650d7ceeefe in deliver_event /home/matt/src/linux/tools/perf/builtin-top.c:1208
      #9 0x5650d7fdb51b in do_flush util/ordered-events.c:245
      #10 0x5650d7fdb51b in __ordered_events__flush util/ordered-events.c:324
      #11 0x5650d7ced743 in process_thread /home/matt/src/linux/tools/perf/builtin-top.c:1120
      #12 0x7f757ef1f133 in start_thread nptl/pthread_create.c:442
      #13 0x7f757ef9f7db in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

When updating hist maps it's also necessary to update the hist symbol
reference because the old one gets freed in map__put().

While this bug was probably introduced with 5c24b67 ("perf
tools: Replace map->referenced & maps->removed_maps with map->refcnt"),
the symbol objects were leaked until c087e94 ("perf machine:
Fix refcount usage when processing PERF_RECORD_KSYMBOL") was merged so
the bug was masked.

Fixes: c087e94 ("perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL")
Reported-by: Yunzhao Li <[email protected]>
Signed-off-by: Matt Fleming (Cloudflare) <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: [email protected]
Cc: Namhyung Kim <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Cc: [email protected] # v5.13+
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
github-actions bot pushed a commit to sirdarckcat/linux-1 that referenced this pull request Oct 18, 2024
We're seeing crashes from rq_qos_wake_function that look like this:

  BUG: unable to handle page fault for address: ffffafe180a40084
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0
  Oops: Oops: 0002 [#1] PREEMPT SMP PTI
  CPU: 17 UID: 0 PID: 0 Comm: swapper/17 Not tainted 6.12.0-rc3-00013-geca631b8fe80 #11
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
  RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40
  Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 2c 0b 00
  RSP: 0018:ffffafe180580ca0 EFLAGS: 00010046
  RAX: 0000000000000000 RBX: ffffafe180a3f7a8 RCX: 0000000000000011
  RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffafe180a40084
  RBP: 0000000000000000 R08: 00000000001e7240 R09: 0000000000000011
  R10: 0000000000000028 R11: 0000000000000888 R12: 0000000000000002
  R13: ffffafe180a40084 R14: 0000000000000000 R15: 0000000000000003
  FS:  0000000000000000(0000) GS:ffff9aaf1f280000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffffafe180a40084 CR3: 000000010e428002 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  PKRU: 55555554
  Call Trace:
   <IRQ>
   try_to_wake_up+0x5a/0x6a0
   rq_qos_wake_function+0x71/0x80
   __wake_up_common+0x75/0xa0
   __wake_up+0x36/0x60
   scale_up.part.0+0x50/0x110
   wb_timer_fn+0x227/0x450
   ...

So rq_qos_wake_function() calls wake_up_process(data->task), which calls
try_to_wake_up(), which faults in raw_spin_lock_irqsave(&p->pi_lock).

p comes from data->task, and data comes from the waitqueue entry, which
is stored on the waiter's stack in rq_qos_wait(). Analyzing the core
dump with drgn, I found that the waiter had already woken up and moved
on to a completely unrelated code path, clobbering what was previously
data->task. Meanwhile, the waker was passing the clobbered garbage in
data->task to wake_up_process(), leading to the crash.

What's happening is that in between rq_qos_wake_function() deleting the
waitqueue entry and calling wake_up_process(), rq_qos_wait() is finding
that it already got a token and returning. The race looks like this:

rq_qos_wait()                           rq_qos_wake_function()
==============================================================
prepare_to_wait_exclusive()
                                        data->got_token = true;
                                        list_del_init(&curr->entry);
if (data.got_token)
        break;
finish_wait(&rqw->wait, &data.wq);
  ^- returns immediately because
     list_empty_careful(&wq_entry->entry)
     is true
... return, go do something else ...
                                        wake_up_process(data->task)
                                          (NO LONGER VALID!)-^

Normally, finish_wait() is supposed to synchronize against the waker.
But, as noted above, it is returning immediately because the waitqueue
entry has already been removed from the waitqueue.

The bug is that rq_qos_wake_function() is accessing the waitqueue entry
AFTER deleting it. Note that autoremove_wake_function() wakes the waiter
and THEN deletes the waitqueue entry, which is the proper order.

Fix it by swapping the order. We also need to use
list_del_init_careful() to match the list_empty_careful() in
finish_wait().
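
The corrected ordering can be sketched with a single-threaded userspace
model. This is an illustration under stated assumptions, not the kernel
patch: the structs, entry_* helpers, wake_task() and wake_function_fixed()
are hypothetical stand-ins, and the plain stores used here do not provide
the memory ordering that list_del_init_careful() gives the real code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct task { int wakeups; };

struct wait_entry {
	struct wait_entry *next, *prev; /* points to itself when not queued */
};

struct wait_data {
	struct wait_entry wq; /* lives on the waiter's stack */
	struct task *task;
	bool got_token;
};

static void entry_init(struct wait_entry *e)
{
	e->next = e->prev = e;
}

static bool entry_unlinked(const struct wait_entry *e)
{
	return e->next == e;
}

static void entry_add(struct wait_entry *head, struct wait_entry *e)
{
	e->next = head->next;
	e->prev = head;
	head->next->prev = e;
	head->next = e;
}

static void entry_del_init(struct wait_entry *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	entry_init(e);
}

static void wake_task(struct task *t)
{
	t->wakeups++;
}

/*
 * Wake first, unlink last.  Once the entry is unlinked the waiter may
 * return and reuse its stack, so nothing in *data is read after that.
 */
static int wake_function_fixed(struct wait_data *data)
{
	assert(data != NULL);
	data->got_token = true;
	wake_task(data->task);     /* last read of data->task */
	entry_del_init(&data->wq); /* final access to the waiter's memory */
	return 1;
}
```

The point of the model is only the statement order inside
wake_function_fixed(): the unlink is the last access to memory owned by
the waiter.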

Fixes: 38cfb5a ("blk-wbt: improve waking of tasks")
Cc: [email protected]
Signed-off-by: Omar Sandoval <[email protected]>
Acked-by: Tejun Heo <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Link: https://lore.kernel.org/r/d3bee2463a67b1ee597211823bf7ad3721c26e41.1729014591.git.osandov@fb.com
Signed-off-by: Jens Axboe <[email protected]>
abajk pushed a commit to abajk/linux-stable that referenced this pull request Oct 19, 2024
…g the sock

[ Upstream commit 3cf7203 ]

There is a race condition in vxlan: when a vxlan device is deleted
while packets are being received, the sock may be released after
vxlan_sock vs has been read from sk_user_data. A later call to
vxlan_ecn_decapsulate() or vxlan_get_sk_family() then hits a
NULL pointer dereference, e.g.

   #0 [ffffa25ec6978a38] machine_kexec at ffffffff8c669757
   #1 [ffffa25ec6978a90] __crash_kexec at ffffffff8c7c0a4d
   #2 [ffffa25ec6978b58] crash_kexec at ffffffff8c7c1c48
   #3 [ffffa25ec6978b60] oops_end at ffffffff8c627f2b
   #4 [ffffa25ec6978b80] page_fault_oops at ffffffff8c678fcb
   #5 [ffffa25ec6978bd8] exc_page_fault at ffffffff8d109542
   #6 [ffffa25ec6978c00] asm_exc_page_fault at ffffffff8d200b62
      [exception RIP: vxlan_ecn_decapsulate+0x3b]
      RIP: ffffffffc1014e7b  RSP: ffffa25ec6978cb0  RFLAGS: 00010246
      RAX: 0000000000000008  RBX: ffff8aa000888000  RCX: 0000000000000000
      RDX: 000000000000000e  RSI: ffff8a9fc7ab803e  RDI: ffff8a9fd1168700
      RBP: ffff8a9fc7ab803e   R8: 0000000000700000   R9: 00000000000010ae
      R10: ffff8a9fcb748980  R11: 0000000000000000  R12: ffff8a9fd1168700
      R13: ffff8aa000888000  R14: 00000000002a0000  R15: 00000000000010ae
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
   #7 [ffffa25ec6978ce8] vxlan_rcv at ffffffffc10189cd [vxlan]
   #8 [ffffa25ec6978d90] udp_queue_rcv_one_skb at ffffffff8cfb6507
   #9 [ffffa25ec6978dc0] udp_unicast_rcv_skb at ffffffff8cfb6e45
  #10 [ffffa25ec6978dc8] __udp4_lib_rcv at ffffffff8cfb8807
  #11 [ffffa25ec6978e20] ip_protocol_deliver_rcu at ffffffff8cf76951
  #12 [ffffa25ec6978e48] ip_local_deliver at ffffffff8cf76bde
  #13 [ffffa25ec6978ea0] __netif_receive_skb_one_core at ffffffff8cecde9b
  #14 [ffffa25ec6978ec8] process_backlog at ffffffff8cece139
  #15 [ffffa25ec6978f00] __napi_poll at ffffffff8ceced1a
  #16 [ffffa25ec6978f28] net_rx_action at ffffffff8cecf1f3
  #17 [ffffa25ec6978fa0] __softirqentry_text_start at ffffffff8d4000ca
  #18 [ffffa25ec6978ff0] do_softirq at ffffffff8c6fbdc3

Reproducer: https://github.com/Mellanox/ovs-tests/blob/master/test-ovs-vxlan-remove-tunnel-during-traffic.sh

Fix this by waiting for all sk_user_data readers to finish before
releasing the sock.

Reported-by: Jianlin Shi <[email protected]>
Suggested-by: Jakub Sitnicki <[email protected]>
Fixes: 6a93cc9 ("udp-tunnel: Add a few more UDP tunnel APIs")
Signed-off-by: Hangbin Liu <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
github-actions bot pushed a commit to sirdarckcat/linux-1 that referenced this pull request Oct 22, 2024
commit e972b08 upstream.

We're seeing crashes from rq_qos_wake_function that look like this:

  BUG: unable to handle page fault for address: ffffafe180a40084
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0
  Oops: Oops: 0002 [#1] PREEMPT SMP PTI
  CPU: 17 UID: 0 PID: 0 Comm: swapper/17 Not tainted 6.12.0-rc3-00013-geca631b8fe80 #11
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
  RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40
  Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 2c 0b 00
  RSP: 0018:ffffafe180580ca0 EFLAGS: 00010046
  RAX: 0000000000000000 RBX: ffffafe180a3f7a8 RCX: 0000000000000011
  RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffafe180a40084
  RBP: 0000000000000000 R08: 00000000001e7240 R09: 0000000000000011
  R10: 0000000000000028 R11: 0000000000000888 R12: 0000000000000002
  R13: ffffafe180a40084 R14: 0000000000000000 R15: 0000000000000003
  FS:  0000000000000000(0000) GS:ffff9aaf1f280000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffffafe180a40084 CR3: 000000010e428002 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  PKRU: 55555554
  Call Trace:
   <IRQ>
   try_to_wake_up+0x5a/0x6a0
   rq_qos_wake_function+0x71/0x80
   __wake_up_common+0x75/0xa0
   __wake_up+0x36/0x60
   scale_up.part.0+0x50/0x110
   wb_timer_fn+0x227/0x450
   ...

So rq_qos_wake_function() calls wake_up_process(data->task), which calls
try_to_wake_up(), which faults in raw_spin_lock_irqsave(&p->pi_lock).

p comes from data->task, and data comes from the waitqueue entry, which
is stored on the waiter's stack in rq_qos_wait(). Analyzing the core
dump with drgn, I found that the waiter had already woken up and moved
on to a completely unrelated code path, clobbering what was previously
data->task. Meanwhile, the waker was passing the clobbered garbage in
data->task to wake_up_process(), leading to the crash.

What's happening is that in between rq_qos_wake_function() deleting the
waitqueue entry and calling wake_up_process(), rq_qos_wait() is finding
that it already got a token and returning. The race looks like this:

rq_qos_wait()                           rq_qos_wake_function()
==============================================================
prepare_to_wait_exclusive()
                                        data->got_token = true;
                                        list_del_init(&curr->entry);
if (data.got_token)
        break;
finish_wait(&rqw->wait, &data.wq);
  ^- returns immediately because
     list_empty_careful(&wq_entry->entry)
     is true
... return, go do something else ...
                                        wake_up_process(data->task)
                                          (NO LONGER VALID!)-^

Normally, finish_wait() is supposed to synchronize against the waker.
But, as noted above, it is returning immediately because the waitqueue
entry has already been removed from the waitqueue.

The bug is that rq_qos_wake_function() is accessing the waitqueue entry
AFTER deleting it. Note that autoremove_wake_function() wakes the waiter
and THEN deletes the waitqueue entry, which is the proper order.

Fix it by swapping the order. We also need to use
list_del_init_careful() to match the list_empty_careful() in
finish_wait().

Fixes: 38cfb5a ("blk-wbt: improve waking of tasks")
Cc: [email protected]
Signed-off-by: Omar Sandoval <[email protected]>
Acked-by: Tejun Heo <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Link: https://lore.kernel.org/r/d3bee2463a67b1ee597211823bf7ad3721c26e41.1729014591.git.osandov@fb.com
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
github-actions bot pushed a commit to sirdarckcat/linux-1 that referenced this pull request Oct 31, 2024
Enabling CONFIG_PROVE_RCU_LIST with its dependency CONFIG_RCU_EXPERT
creates this splat when an MPTCP socket is created:

  =============================
  WARNING: suspicious RCU usage
  6.12.0-rc2+ #11 Not tainted
  -----------------------------
  net/mptcp/sched.c:44 RCU-list traversed in non-reader section!!

  other info that might help us debug this:

  rcu_scheduler_active = 2, debug_locks = 1
  no locks held by mptcp_connect/176.

  stack backtrace:
  CPU: 0 UID: 0 PID: 176 Comm: mptcp_connect Not tainted 6.12.0-rc2+ #11
  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  Call Trace:
   <TASK>
   dump_stack_lvl (lib/dump_stack.c:123)
   lockdep_rcu_suspicious (kernel/locking/lockdep.c:6822)
   mptcp_sched_find (net/mptcp/sched.c:44 (discriminator 7))
   mptcp_init_sock (net/mptcp/protocol.c:2867 (discriminator 1))
   ? sock_init_data_uid (arch/x86/include/asm/atomic.h:28)
   inet_create.part.0.constprop.0 (net/ipv4/af_inet.c:386)
   ? __sock_create (include/linux/rcupdate.h:347 (discriminator 1))
   __sock_create (net/socket.c:1576)
   __sys_socket (net/socket.c:1671)
   ? __pfx___sys_socket (net/socket.c:1712)
   ? do_user_addr_fault (arch/x86/mm/fault.c:1419 (discriminator 1))
   __x64_sys_socket (net/socket.c:1728)
   do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1))
   entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)

That's because when the socket is initialised, rcu_read_lock() is not
used despite the explicit comment written above the declaration of
mptcp_sched_find() in sched.c. Adding the missing lock/unlock avoids the
warning.
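
As a rough illustration of the rule being enforced, here is a user-space
model, not the MPTCP code. All names (read_lock_model(), sched_find_model(),
init_sock_model(), suspicious_uses) are hypothetical; the nesting counter
only imitates the kind of check lockdep performs and provides none of the
real RCU guarantees.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int read_depth;       /* models RCU read-side nesting */
static int suspicious_uses;  /* lookups made outside a reader section */

static void read_lock_model(void)
{
	read_depth++;
}

static void read_unlock_model(void)
{
	assert(read_depth > 0);
	read_depth--;
}

/* Hypothetical scheduler list entry. */
struct sched_model {
	const char *name;
	struct sched_model *next;
};

static struct sched_model default_sched = { "default", NULL };
static struct sched_model *sched_list = &default_sched;

/* Must be called inside a reader section; records a violation otherwise. */
static struct sched_model *sched_find_model(const char *name)
{
	struct sched_model *s;

	if (read_depth == 0)
		suspicious_uses++;

	for (s = sched_list; s; s = s->next)
		if (strcmp(s->name, name) == 0)
			return s;
	return NULL;
}

/* Caller side with the lookup wrapped in lock/unlock. */
static struct sched_model *init_sock_model(const char *name)
{
	struct sched_model *s;

	read_lock_model();
	s = sched_find_model(name);
	read_unlock_model();
	return s;
}
```

The model returns a pointer to static data, so using it after the unlock is
harmless here. With real RCU, the result must either be used before
rcu_read_unlock() or be pinned by a reference.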

Fixes: 1730b2b ("mptcp: add sched in mptcp_sock")
Cc: [email protected]
Closes: multipath-tcp/mptcp_net-next#523
Reviewed-by: Geliang Tang <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
AndreySV pushed a commit to AndreySV/linux-stable that referenced this pull request Nov 8, 2024
Disable strict aliasing, as has been done in the kernel proper for decades
(literally since before git history) to fix issues where gcc will optimize
away loads in code that looks 100% correct, but is _technically_ undefined
behavior, and thus can be thrown away by the compiler.

E.g. arm64's vPMU counter access test casts a uint64_t (unsigned long)
pointer to a u64 (unsigned long long) pointer when setting PMCR.N via
u64p_replace_bits(), which gcc-13 detects and optimizes away, i.e. ignores
the result and uses the original PMCR.

The issue is most easily observed by making set_pmcr_n() noinline and
wrapping the call with printf(), e.g. sans comments, for this code:

  printf("orig = %lx, next = %lx, want = %lu\n", pmcr_orig, pmcr, pmcr_n);
  set_pmcr_n(&pmcr, pmcr_n);
  printf("orig = %lx, next = %lx, want = %lu\n", pmcr_orig, pmcr, pmcr_n);

gcc-13 generates:

 0000000000401c90 <set_pmcr_n>:
  401c90:       f9400002        ldr     x2, [x0]
  401c94:       b3751022        bfi     x2, x1, #11, #5
  401c98:       f9000002        str     x2, [x0]
  401c9c:       d65f03c0        ret

 0000000000402660 <test_create_vpmu_vm_with_pmcr_n>:
  402724:       aa1403e3        mov     x3, x20
  402728:       aa1503e2        mov     x2, x21
  40272c:       aa1603e0        mov     x0, x22
  402730:       aa1503e1        mov     x1, x21
  402734:       940060ff        bl      41ab30 <_IO_printf>
  402738:       aa1403e1        mov     x1, x20
  40273c:       910183e0        add     x0, sp, #0x60
  402740:       97fffd54        bl      401c90 <set_pmcr_n>
  402744:       aa1403e3        mov     x3, x20
  402748:       aa1503e2        mov     x2, x21
  40274c:       aa1503e1        mov     x1, x21
  402750:       aa1603e0        mov     x0, x22
  402754:       940060f7        bl      41ab30 <_IO_printf>

with the value stored in [sp + 0x60] ignored by both printf() above and
in the test proper, resulting in a false failure due to vcpu_set_reg()
simply storing the original value, not the intended value.

  $ ./vpmu_counter_access
  Random seed: 0x6b8b4567
  orig = 3040, next = 3040, want = 0
  orig = 3040, next = 3040, want = 0
  ==== Test Assertion Failure ====
    aarch64/vpmu_counter_access.c:505: pmcr_n == get_pmcr_n(pmcr)
    pid=71578 tid=71578 errno=9 - Bad file descriptor
       1        0x400673: run_access_test at vpmu_counter_access.c:522
       2         (inlined by) main at vpmu_counter_access.c:643
       3        0x4132d7: __libc_start_call_main at libc-start.o:0
       4        0x413653: __libc_start_main at ??:0
       5        0x40106f: _start at ??:0
    Failed to update PMCR.N to 0 (received: 6)

Somewhat bizarrely, gcc-11 also exhibits the same behavior, but only if
set_pmcr_n() is marked noinline, whereas gcc-13 fails even if set_pmcr_n()
is inlined in its sole caller.
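
For reference, a source-level way to avoid this class of problem without
relying on compiler flags is to copy through a temporary of the callee's
type instead of casting the pointer. The sketch below is a user-space
illustration, not the selftest code and not what this patch does (the
patch disables strict aliasing). replace_bits_ull(), set_field_n() and
get_field_n() are hypothetical names; the field position (shift 11,
width 5) is taken from the bfi instruction in the disassembly above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FIELD_SHIFT 11
#define FIELD_WIDTH 5
#define FIELD_MASK  (((1ULL << FIELD_WIDTH) - 1) << FIELD_SHIFT)

_Static_assert(sizeof(unsigned long long) == sizeof(uint64_t),
	       "the copy below assumes both types are 64 bits wide");

/* Helper that takes an unsigned long long pointer, like a u64-based API. */
static void replace_bits_ull(unsigned long long *p, unsigned long long val)
{
	*p = (*p & ~FIELD_MASK) | ((val << FIELD_SHIFT) & FIELD_MASK);
}

/*
 * Alias-safe wrapper: uint64_t may be unsigned long, which is a distinct
 * type from unsigned long long, so the value is copied via memcpy() rather
 * than passing a cast pointer to the helper.
 */
static void set_field_n(uint64_t *reg, uint64_t n)
{
	unsigned long long tmp;

	assert(n < (1ULL << FIELD_WIDTH));
	memcpy(&tmp, reg, sizeof(tmp));
	replace_bits_ull(&tmp, n);
	memcpy(reg, &tmp, sizeof(tmp));
}

/* Extracts the field so the update can be checked. */
static uint64_t get_field_n(uint64_t reg)
{
	return (reg & FIELD_MASK) >> FIELD_SHIFT;
}
```

Compilers typically turn the two memcpy() calls into plain register moves,
so the wrapper should not cost anything at -O2, while remaining well defined
regardless of -fstrict-aliasing.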

Cc: [email protected]
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116912
Signed-off-by: Sean Christopherson <[email protected]>
gregkh pushed a commit that referenced this pull request Nov 8, 2024
[ Upstream commit 3deb12c ]

Enabling CONFIG_PROVE_RCU_LIST with its dependency CONFIG_RCU_EXPERT
creates this splat when an MPTCP socket is created:

  =============================
  WARNING: suspicious RCU usage
  6.12.0-rc2+ #11 Not tainted
  -----------------------------
  net/mptcp/sched.c:44 RCU-list traversed in non-reader section!!

  other info that might help us debug this:

  rcu_scheduler_active = 2, debug_locks = 1
  no locks held by mptcp_connect/176.

  stack backtrace:
  CPU: 0 UID: 0 PID: 176 Comm: mptcp_connect Not tainted 6.12.0-rc2+ #11
  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  Call Trace:
   <TASK>
   dump_stack_lvl (lib/dump_stack.c:123)
   lockdep_rcu_suspicious (kernel/locking/lockdep.c:6822)
   mptcp_sched_find (net/mptcp/sched.c:44 (discriminator 7))
   mptcp_init_sock (net/mptcp/protocol.c:2867 (discriminator 1))
   ? sock_init_data_uid (arch/x86/include/asm/atomic.h:28)
   inet_create.part.0.constprop.0 (net/ipv4/af_inet.c:386)
   ? __sock_create (include/linux/rcupdate.h:347 (discriminator 1))
   __sock_create (net/socket.c:1576)
   __sys_socket (net/socket.c:1671)
   ? __pfx___sys_socket (net/socket.c:1712)
   ? do_user_addr_fault (arch/x86/mm/fault.c:1419 (discriminator 1))
   __x64_sys_socket (net/socket.c:1728)
   do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1))
   entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)

That's because when the socket is initialised, rcu_read_lock() is not
used despite the explicit comment written above the declaration of
mptcp_sched_find() in sched.c. Adding the missing lock/unlock avoids the
warning.

Fixes: 1730b2b ("mptcp: add sched in mptcp_sock")
Cc: [email protected]
Closes: multipath-tcp/mptcp_net-next#523
Reviewed-by: Geliang Tang <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
gregkh pushed a commit that referenced this pull request Nov 8, 2024
commit e972b08 upstream.

We're seeing crashes from rq_qos_wake_function that look like this:

  BUG: unable to handle page fault for address: ffffafe180a40084
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0
  Oops: Oops: 0002 [#1] PREEMPT SMP PTI
  CPU: 17 UID: 0 PID: 0 Comm: swapper/17 Not tainted 6.12.0-rc3-00013-geca631b8fe80 #11
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
  RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40
  Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 2c 0b 00
  RSP: 0018:ffffafe180580ca0 EFLAGS: 00010046
  RAX: 0000000000000000 RBX: ffffafe180a3f7a8 RCX: 0000000000000011
  RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffafe180a40084
  RBP: 0000000000000000 R08: 00000000001e7240 R09: 0000000000000011
  R10: 0000000000000028 R11: 0000000000000888 R12: 0000000000000002
  R13: ffffafe180a40084 R14: 0000000000000000 R15: 0000000000000003
  FS:  0000000000000000(0000) GS:ffff9aaf1f280000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffffafe180a40084 CR3: 000000010e428002 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  PKRU: 55555554
  Call Trace:
   <IRQ>
   try_to_wake_up+0x5a/0x6a0
   rq_qos_wake_function+0x71/0x80
   __wake_up_common+0x75/0xa0
   __wake_up+0x36/0x60
   scale_up.part.0+0x50/0x110
   wb_timer_fn+0x227/0x450
   ...

So rq_qos_wake_function() calls wake_up_process(data->task), which calls
try_to_wake_up(), which faults in raw_spin_lock_irqsave(&p->pi_lock).

p comes from data->task, and data comes from the waitqueue entry, which
is stored on the waiter's stack in rq_qos_wait(). Analyzing the core
dump with drgn, I found that the waiter had already woken up and moved
on to a completely unrelated code path, clobbering what was previously
data->task. Meanwhile, the waker was passing the clobbered garbage in
data->task to wake_up_process(), leading to the crash.

What's happening is that in between rq_qos_wake_function() deleting the
waitqueue entry and calling wake_up_process(), rq_qos_wait() is finding
that it already got a token and returning. The race looks like this:

rq_qos_wait()                           rq_qos_wake_function()
==============================================================
prepare_to_wait_exclusive()
                                        data->got_token = true;
                                        list_del_init(&curr->entry);
if (data.got_token)
        break;
finish_wait(&rqw->wait, &data.wq);
  ^- returns immediately because
     list_empty_careful(&wq_entry->entry)
     is true
... return, go do something else ...
                                        wake_up_process(data->task)
                                          (NO LONGER VALID!)-^

Normally, finish_wait() is supposed to synchronize against the waker.
But, as noted above, it is returning immediately because the waitqueue
entry has already been removed from the waitqueue.

The bug is that rq_qos_wake_function() is accessing the waitqueue entry
AFTER deleting it. Note that autoremove_wake_function() wakes the waiter
and THEN deletes the waitqueue entry, which is the proper order.

Fix it by swapping the order. We also need to use
list_del_init_careful() to match the list_empty_careful() in
finish_wait().
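
The corrected ordering can be modeled in userspace as below. This is a
minimal sketch under stated assumptions, not the kernel patch: struct
waiter, wake_fixed(), waiter_may_return() and the on_queue flag are
hypothetical stand-ins for the on-stack wait data, the wake function and
waitqueue membership, and C11 release/acquire atomics stand in for
list_del_init_careful()/list_empty_careful().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the wait data that lives on the waiter's stack. */
struct waiter {
	int task_id;          /* models data->task */
	bool got_token;       /* models data->got_token */
	atomic_bool on_queue; /* models "entry is still linked on the waitqueue" */
	int woken_task_id;    /* records which task the waker woke */
};

/* Models the fixed waker: every access to *w happens before the unlink. */
static void wake_fixed(struct waiter *w)
{
	assert(atomic_load_explicit(&w->on_queue, memory_order_relaxed));

	int task = w->task_id;   /* read while the waiter is still pinned */

	w->got_token = true;
	w->woken_task_id = task; /* models wake_up_process(data->task) */

	/*
	 * Release store, playing the role of list_del_init_careful().
	 * This must be the LAST access to *w: once the waiter observes it,
	 * the waiter may return and its stack frame may be reused.
	 */
	atomic_store_explicit(&w->on_queue, false, memory_order_release);
}

/* Models the lockless check in finish_wait(): true once the waiter may return. */
static bool waiter_may_return(struct waiter *w)
{
	/* Acquire load, playing the role of list_empty_careful(). */
	return !atomic_load_explicit(&w->on_queue, memory_order_acquire);
}
```

In this model the waiter cannot see the entry as unlinked until the waker
has finished using it, which is the property the buggy ordering lacked.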

Fixes: 38cfb5a ("blk-wbt: improve waking of tasks")
Cc: [email protected]
Signed-off-by: Omar Sandoval <[email protected]>
Acked-by: Tejun Heo <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Link: https://lore.kernel.org/r/d3bee2463a67b1ee597211823bf7ad3721c26e41.1729014591.git.osandov@fb.com
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
qaz6750 added a commit to qaz6750/linux-longterm that referenced this pull request Nov 16, 2024
commit 9b5aad3a7498c261116a0251fe57f14ba9c4c6cf
Author: Greg Kroah-Hartman <[email protected]>
Date:   Fri Nov 8 16:28:28 2024 +0100

    Linux 6.6.60

    Link: https://lore.kernel.org/r/[email protected]
    Tested-by: SeongJae Park <[email protected]>
    Tested-by: Shuah Khan <[email protected]>
    Tested-by: Linux Kernel Functional Testing <[email protected]>
    Tested-by: Peter Schneider <[email protected]>
    Tested-by: Takeshi Ogasawara <[email protected]>
    Tested-by: Jon Hunter <[email protected]>
    Tested-by: Florian Fainelli <[email protected]>
    Tested-by: Ron Economos <[email protected]>
    Tested-by: Hardik Garg <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit cc082e50375a29596153fc3f1f8fc85ad1b0b5b9
Author: Konstantin Komarov <[email protected]>
Date:   Thu Sep 5 15:03:48 2024 +0300

    fs/ntfs3: Sequential field availability check in mi_enum_attr()

    commit 090f612756a9720ec18b0b130e28be49839d7cb5 upstream.

    The code is slightly reformatted to consistently check field availability
    without duplication.

    Fixes: 556bdf27c2dd ("ntfs3: Add bounds checking to mi_enum_attr()")
    Signed-off-by: Konstantin Komarov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 10c20d79d59cadfe572480d98cec271a89ffb024
Author: Srinivasan Shanmugam <[email protected]>
Date:   Mon May 27 20:15:21 2024 +0530

    drm/amd/display: Add null checks for 'stream' and 'plane' before dereferencing

    commit 15c2990e0f0108b9c3752d7072a97d45d4283aea upstream.

    This commit adds null checks for the 'stream' and 'plane' variables in
    the dcn30_apply_idle_power_optimizations function. The check at line 922
    already treats these variables as possibly null, but they were used later
    in the code without a null check. This could lead to a null pointer
    dereference, which would cause a crash.

    The null checks ensure that 'stream' and 'plane' are not null before
    they are used, preventing potential crashes.

    Fixes the below static smatch checker:
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:938 dcn30_apply_idle_power_optimizations() error: we previously assumed 'stream' could be null (see line 922)
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:940 dcn30_apply_idle_power_optimizations() error: we previously assumed 'plane' could be null (see line 922)

    Cc: Tom Chung <[email protected]>
    Cc: Nicholas Kazlauskas <[email protected]>
    Cc: Bhawanpreet Lakha <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Hersen Wu <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Aurabindo Pillai <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    [Xiangyu: Modified file path to backport this commit]
    Signed-off-by: Xiangyu Chen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit e979a6a626abf1358a5bb79219eea82ac160d3d3
Author: Peter Ujfalusi <[email protected]>
Date:   Tue Sep 19 13:31:15 2023 +0300

    ASoC: SOF: ipc4-control: Add support for ALSA enum control

    commit 07a866a41982c896dc46476f57d209a200602946 upstream.

    Enum controls use generic param_id and a generic struct where the data
    is passed to the firmware.

    Signed-off-by: Peter Ujfalusi <[email protected]>
    Reviewed-by: Bard Liao <[email protected]>
    Reviewed-by: Pierre-Louis Bossart <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 3facc0417d3d7b3ba5822e74155bcb1267ce62c1
Author: Peter Ujfalusi <[email protected]>
Date:   Tue Sep 19 13:31:14 2023 +0300

    ASoC: SOF: ipc4-control: Add support for ALSA switch control

    commit 4a2fd607b7ca6128ee3532161505da7624197f55 upstream.

    Volume controls with a max value of 1 are switches.
    Switch controls use generic param_id and a generic struct where the data
    is passed to the firmware.

    Signed-off-by: Peter Ujfalusi <[email protected]>
    Reviewed-by: Bard Liao <[email protected]>
    Reviewed-by: Pierre-Louis Bossart <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit f01d8fc623711046e1efee00827bff6d5882cdfd
Author: Peter Ujfalusi <[email protected]>
Date:   Tue Sep 19 13:31:13 2023 +0300

    ASoC: SOF: ipc4-topology: Add definition for generic switch/enum control

    commit 060a07cd9bc69eba2da33ed96b1fa69ead60bab1 upstream.

    Currently IPC4 has no notion of a switch or enum type of control which is
    a generic concept in ALSA.

    The generic support for these control types will be as follows:
    - large config is used to send the channel-value pair array
    - param_id of a SWITCH type is 200
    - param_id of an ENUM type is 201

    Each module that needs to support a switch and/or enum must handle these
    universal param_ids.
    The message payload is described by struct sof_ipc4_control_msg_payload.
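
    As an illustration of the scheme above, a minimal sketch follows. Only
    the values 200 and 201 come from this message; every identifier in the
    block is hypothetical and the pair layout is an assumption, not the
    layout of struct sof_ipc4_control_msg_payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* param_id values as stated in the text; the macro names are hypothetical. */
#define EXAMPLE_PARAM_ID_SWITCH 200u
#define EXAMPLE_PARAM_ID_ENUM   201u

enum example_ctl_kind { EXAMPLE_CTL_SWITCH, EXAMPLE_CTL_ENUM };

/* One channel-value pair; assumed layout, for illustration only. */
struct example_chan_value {
	uint32_t channel;
	uint32_t value;
};

/* Pick the generic param_id for a control kind. */
static uint32_t example_param_id(enum example_ctl_kind kind)
{
	return kind == EXAMPLE_CTL_SWITCH ? EXAMPLE_PARAM_ID_SWITCH
					  : EXAMPLE_PARAM_ID_ENUM;
}

/* Fill out[] with one pair per channel; returns the number of pairs written. */
static size_t example_fill_pairs(struct example_chan_value *out, size_t cap,
				 const uint32_t *values, size_t num_channels)
{
	assert(out && values);

	size_t n = num_channels < cap ? num_channels : cap;

	for (size_t ch = 0; ch < n; ch++) {
		out[ch].channel = (uint32_t)ch;
		out[ch].value = values[ch];
	}
	return n;
}
```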

    Signed-off-by: Peter Ujfalusi <[email protected]>
    Reviewed-by: Bard Liao <[email protected]>
    Reviewed-by: Pierre-Louis Bossart <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit d54afaef6570c277070c3cafe1ed73dcdc129e0a
Author: Chuck Lever <[email protected]>
Date:   Tue Sep 19 11:35:15 2023 -0400

    SUNRPC: Remove BUG_ON call sites

    commit 789ce196a31dd13276076762204bee87df893e53 upstream.

    There is no need to take down the whole system for these assertions.

    I'd rather not attempt a heroic save here, as some bug has occurred
    that has left the transport data structures in an unknown state.
    Just warn and then leak the left-over resources.

    Acked-by: Christian Brauner <[email protected]>
    Reviewed-by: NeilBrown <[email protected]>
    Reviewed-by: Jeff Layton <[email protected]>
    Signed-off-by: Chuck Lever <[email protected]>
    Signed-off-by: Dominique Martinet <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 27a58a19bd20a7afe369da2ce6d4ebea70768acd
Author: Michael Walle <[email protected]>
Date:   Fri Jun 21 14:09:29 2024 +0200

    mtd: spi-nor: winbond: fix w25q128 regression

    commit d35df77707bf5ae1221b5ba1c8a88cf4fcdd4901 upstream.

    Commit 83e824a4a595 ("mtd: spi-nor: Correct flags for Winbond w25q128")
    removed the flags for non-SFDP devices. It was assumed that it wasn't in
    use anymore. This wasn't true. Add the no_sfdp_flags as well as the size
    again.

    We add the additional flags for dual and quad read because they have
    been reported to work properly by Hartmut using both older and newer
    versions of this flash, the similar flashes with 64Mbit and 256Mbit
    already have these flags and because it will (luckily) trigger our
    legacy SFDP parsing, so newer versions with SFDP support will still get
    the parameters from the SFDP tables.

    Reported-by: Hartmut Birr <[email protected]>
    Closes: https://lore.kernel.org/r/CALxbwRo_-9CaJmt7r7ELgu+vOcgk=xZcGHobnKf=oT2=u4d4aA@mail.gmail.com/
    Fixes: 83e824a4a595 ("mtd: spi-nor: Correct flags for Winbond w25q128")
    Reviewed-by: Linus Walleij <[email protected]>
    Signed-off-by: Michael Walle <[email protected]>
    Acked-by: Tudor Ambarus <[email protected]>
    Reviewed-by: Esben Haabendal <[email protected]>
    Reviewed-by: Pratyush Yadav <[email protected]>
    Signed-off-by: Pratyush Yadav <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Link: https://lore.kernel.org/r/[email protected]
    [Backported to v6.6 - vastly different due to upstream changes]
    Reviewed-by: Tudor Ambarus <[email protected]>
    Signed-off-by: Linus Walleij <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 3d544942c0010feedc048b048ee0c35d2d921100
Author: David Hildenbrand <[email protected]>
Date:   Fri Oct 11 12:24:45 2024 +0200

    mm: don't install PMD mappings when THPs are disabled by the hw/process/vma

    commit 2b0f922323ccfa76219bcaacd35cd50aeaa13592 upstream.

    We (or rather, readahead logic :) ) might be allocating a THP in the
    pagecache and then try mapping it into a process that explicitly disabled
    THP: we might end up installing PMD mappings.

    This is a problem for s390x KVM, which explicitly remaps all PMD-mapped
    THPs to be PTE-mapped in s390_enable_sie()->thp_split_mm(), before
    starting the VM.

    For example, starting a VM backed on a file system with large folios
    supported makes the VM crash when the VM tries accessing such a mapping
    using KVM.

    Is it also a problem when the HW disabled THP using
    TRANSPARENT_HUGEPAGE_UNSUPPORTED?  At least on x86 this would be the case
    without X86_FEATURE_PSE.

    In the future, we might be able to do better on s390x and only disallow
    PMD mappings -- what s390x and likely TRANSPARENT_HUGEPAGE_UNSUPPORTED
    really wants.  For now, fix it by essentially performing the same check as
    would be done in __thp_vma_allowable_orders() or in shmem code, where this
    works as expected, and disallow PMD mappings, making us fallback to PTE
    mappings.
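
    The shape of that check can be sketched as a small userspace model.
    This is an assumption-laden illustration, not the kernel code: struct
    vma_model, thp_disabled_for_vma() and may_install_pmd() are
    hypothetical names, and the two booleans merely stand for "disabled
    for this vma" and "disabled for this process".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-mapping state that can disable THP. */
struct vma_model {
	bool nohugepage;      /* models a VMA marked as "no huge pages" */
	bool mm_thp_disabled; /* models a process-wide THP opt-out */
};

/* True if THP is disabled for this mapping by the vma or by the process. */
static bool thp_disabled_for_vma(const struct vma_model *vma)
{
	assert(vma);
	return vma->nohugepage || vma->mm_thp_disabled;
}

/*
 * True if a PMD mapping may be installed. When this returns false the
 * caller is expected to fall back to PTE mappings.
 */
static bool may_install_pmd(bool hw_supports_thp, const struct vma_model *vma)
{
	return hw_supports_thp && !thp_disabled_for_vma(vma);
}
```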

    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 793917d997df ("mm/readahead: Add large folio readahead")
    Signed-off-by: David Hildenbrand <[email protected]>
    Reported-by: Leo Fu <[email protected]>
    Tested-by: Thomas Huth <[email protected]>
    Cc: Thomas Huth <[email protected]>
    Cc: Matthew Wilcox (Oracle) <[email protected]>
    Cc: Ryan Roberts <[email protected]>
    Cc: Christian Borntraeger <[email protected]>
    Cc: Janosch Frank <[email protected]>
    Cc: Claudio Imbrenda <[email protected]>
    Cc: Hugh Dickins <[email protected]>
    Cc: Kefeng Wang <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: David Hildenbrand <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 02ec4b3bba49e8d3abb25a3feba6875cae12da92
Author: Kefeng Wang <[email protected]>
Date:   Fri Oct 11 12:24:44 2024 +0200

    mm: huge_memory: add vma_thp_disabled() and thp_disabled_by_hw()

    commit 963756aac1f011d904ddd9548ae82286d3a91f96 upstream.

    Patch series "mm: don't install PMD mappings when THPs are disabled by the
    hw/process/vma".

    During testing, it was found that we can get PMD mappings in processes
    where THP (and more precisely, PMD mappings) are supposed to be disabled.
    While it works as expected for anon+shmem, the pagecache is the
    problematic bit.

    For s390 KVM this currently means that a VM backed by a file located on a
    filesystem with large folio support can crash when KVM tries accessing the
    problematic page, because the readahead logic might decide to use a
    PMD-sized THP and faulting it into the page tables will install a PMD
    mapping, something that s390 KVM cannot tolerate.

    This might also be a problem with HW that does not support PMD mappings,
    but I did not try reproducing it.

    Fix it by respecting the ways to disable THPs when deciding whether we can
    install a PMD mapping.  khugepaged should already be taking care of not
    collapsing if THPs are effectively disabled for the hw/process/vma.

    This patch (of 2):

    Add vma_thp_disabled() and thp_disabled_by_hw() helpers to be shared by
    shmem_allowable_huge_orders() and __thp_vma_allowable_orders().

    [[email protected]: rename to vma_thp_disabled(), split out thp_disabled_by_hw() ]
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 793917d997df ("mm/readahead: Add large folio readahead")
    Signed-off-by: Kefeng Wang <[email protected]>
    Signed-off-by: David Hildenbrand <[email protected]>
    Reported-by: Leo Fu <[email protected]>
    Tested-by: Thomas Huth <[email protected]>
    Reviewed-by: Ryan Roberts <[email protected]>
    Cc: Boqiao Fu <[email protected]>
    Cc: Christian Borntraeger <[email protected]>
    Cc: Claudio Imbrenda <[email protected]>
    Cc: Hugh Dickins <[email protected]>
    Cc: Janosch Frank <[email protected]>
    Cc: Matthew Wilcox <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: David Hildenbrand <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit fc621e7a043de346c33bd7ae7e2e0c651d6152ef
Author: Johannes Berg <[email protected]>
Date:   Wed Oct 23 09:17:44 2024 +0200

    wifi: iwlwifi: mvm: fix 6 GHz scan construction

    commit 7245012f0f496162dd95d888ed2ceb5a35170f1a upstream.

    If more than 255 colocated APs exist for the set of all
    APs found during 2.4/5 GHz scanning, then the 6 GHz scan
    construction will loop forever since the loop variable
    has type u8, which can never reach the number found when
    that's bigger than 255, and is stored in a u32 variable.
    Also move it into the loops to have a smaller scope.

    Using a u32 there is fine, we limit the number of APs in
    the scan list and each has a limit on the number of RNR
    entries due to the frame size. With a limit of 1000 scan
    results, a frame size upper bound of 4096 (really it's
    more like ~2300) and a TBTT entry size of at least 11,
    we get an upper bound for the number of ~372k, well in
    the bounds of a u32.
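
    Both points can be reproduced in a few lines of userspace C. This is a
    sketch only: the function names are hypothetical, and the max_steps
    guard exists solely so that the demonstration terminates where the
    real loop would not.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count the iterations of a loop whose counter is a u8 while the bound is
 * a u32. Returns -1 if the loop has not finished within max_steps.
 */
static long steps_with_u8_counter(uint32_t n_entries, long max_steps)
{
	long steps = 0;

	assert(max_steps > 0);
	for (uint8_t i = 0; i < n_entries; i++) { /* i wraps from 255 to 0 */
		if (++steps > max_steps)
			return -1;
	}
	return steps;
}

/* The same loop with a u32 counter always reaches the bound. */
static long steps_with_u32_counter(uint32_t n_entries)
{
	long steps = 0;

	for (uint32_t i = 0; i < n_entries; i++)
		steps++;
	return steps;
}

/* Upper bound from the text: results * (frame bytes / TBTT entry bytes). */
static uint32_t max_colocated_aps(uint32_t results, uint32_t frame_bytes,
				  uint32_t entry_bytes)
{
	assert(entry_bytes != 0);
	return results * (frame_bytes / entry_bytes);
}
```

    With 1000 results, a 4096-byte frame and 11-byte entries the bound
    evaluates to 372000, matching the ~372k figure above.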

    Cc: [email protected]
    Fixes: eae94cf82d74 ("iwlwifi: mvm: add support for 6GHz")
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219375
    Link: https://patch.msgid.link/20241023091744.f4baed5c08a1.I8b417148bbc8c5d11c101e1b8f5bf372e17bf2a7@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit f2f1fa446676c21edb777e6d2bc4fa8f956fab68
Author: Ryusuke Konishi <[email protected]>
Date:   Fri Oct 18 04:33:10 2024 +0900

    nilfs2: fix kernel bug due to missing clearing of checked flag

    commit 41e192ad2779cae0102879612dfe46726e4396aa upstream.

    Syzbot reported that in directory operations after nilfs2 detects
    filesystem corruption and degrades to read-only,
    __block_write_begin_int(), which is called to prepare block writes, may
    fail the BUG_ON check for accesses exceeding the folio/page size,
    triggering a kernel bug.

    This was found to be because the "checked" flag of a page/folio was not
    cleared when it was discarded by nilfs2's own routine, which causes the
    sanity check of directory entries to be skipped when the directory
    page/folio is reloaded.  So, fix that.

    This was necessary when the use of nilfs2's own page discard routine was
    applied to more than just metadata files.

    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 8c26c4e2694a ("nilfs2: fix issue with flush kernel thread after remount in RO mode because of driver's internal error or metadata corruption")
    Signed-off-by: Ryusuke Konishi <[email protected]>
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=d6ca2daf692c7a82f959
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit a53c2d847627b790fb3bd8b00e02c247941b17e0
Author: Zong-Zhe Yang <[email protected]>
Date:   Mon Jun 17 19:52:17 2024 +0800

    wifi: mac80211: fix NULL dereference at band check in starting tx ba session

    commit 021d53a3d87eeb9dbba524ac515651242a2a7e3b upstream.

    In MLD connection, link_data/link_conf are dynamically allocated. They
    don't point to vif->bss_conf. So, there will be no chanreq assigned to
    vif->bss_conf and then the chan will be NULL. Tweak the code to check
    ht_supported/vht_supported/has_he/has_eht on sta deflink.

    Crash log (with rtw89 version under MLO development):
    [ 9890.526087] BUG: kernel NULL pointer dereference, address: 0000000000000000
    [ 9890.526102] #PF: supervisor read access in kernel mode
    [ 9890.526105] #PF: error_code(0x0000) - not-present page
    [ 9890.526109] PGD 0 P4D 0
    [ 9890.526114] Oops: 0000 [#1] PREEMPT SMP PTI
    [ 9890.526119] CPU: 2 PID: 6367 Comm: kworker/u16:2 Kdump: loaded Tainted: G           OE      6.9.0 #1
    [ 9890.526123] Hardware name: LENOVO 2356AD1/2356AD1, BIOS G7ETB3WW (2.73 ) 11/28/2018
    [ 9890.526126] Workqueue: phy2 rtw89_core_ba_work [rtw89_core]
    [ 9890.526203] RIP: 0010:ieee80211_start_tx_ba_session (net/mac80211/agg-tx.c:618 (discriminator 1)) mac80211
    [ 9890.526279] Code: f7 e8 d5 93 3e ea 48 83 c4 28 89 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc 49 8b 84 24 e0 f1 ff ff 48 8b 80 90 1b 00 00 <83> 38 03 0f 84 37 fe ff ff bb ea ff ff ff eb cc 49 8b 84 24 10 f3
    All code
    ========
       0:	f7 e8                	imul   %eax
       2:	d5                   	(bad)
       3:	93                   	xchg   %eax,%ebx
       4:	3e ea                	ds (bad)
       6:	48 83 c4 28          	add    $0x28,%rsp
       a:	89 d8                	mov    %ebx,%eax
       c:	5b                   	pop    %rbx
       d:	41 5c                	pop    %r12
       f:	41 5d                	pop    %r13
      11:	41 5e                	pop    %r14
      13:	41 5f                	pop    %r15
      15:	5d                   	pop    %rbp
      16:	c3                   	retq
      17:	cc                   	int3
      18:	cc                   	int3
      19:	cc                   	int3
      1a:	cc                   	int3
      1b:	49 8b 84 24 e0 f1 ff 	mov    -0xe20(%r12),%rax
      22:	ff
      23:	48 8b 80 90 1b 00 00 	mov    0x1b90(%rax),%rax
      2a:*	83 38 03             	cmpl   $0x3,(%rax)		<-- trapping instruction
      2d:	0f 84 37 fe ff ff    	je     0xfffffffffffffe6a
      33:	bb ea ff ff ff       	mov    $0xffffffea,%ebx
      38:	eb cc                	jmp    0x6
      3a:	49                   	rex.WB
      3b:	8b                   	.byte 0x8b
      3c:	84 24 10             	test   %ah,(%rax,%rdx,1)
      3f:	f3                   	repz

    Code starting with the faulting instruction
    ===========================================
       0:	83 38 03             	cmpl   $0x3,(%rax)
       3:	0f 84 37 fe ff ff    	je     0xfffffffffffffe40
       9:	bb ea ff ff ff       	mov    $0xffffffea,%ebx
       e:	eb cc                	jmp    0xffffffffffffffdc
      10:	49                   	rex.WB
      11:	8b                   	.byte 0x8b
      12:	84 24 10             	test   %ah,(%rax,%rdx,1)
      15:	f3                   	repz
    [ 9890.526285] RSP: 0018:ffffb8db09013d68 EFLAGS: 00010246
    [ 9890.526291] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff9308e0d656c8
    [ 9890.526295] RDX: 0000000000000000 RSI: ffffffffab99460b RDI: ffffffffab9a7685
    [ 9890.526300] RBP: ffffb8db09013db8 R08: 0000000000000000 R09: 0000000000000873
    [ 9890.526304] R10: ffff9308e0d64800 R11: 0000000000000002 R12: ffff9308e5ff6e70
    [ 9890.526308] R13: ffff930952500e20 R14: ffff9309192a8c00 R15: 0000000000000000
    [ 9890.526313] FS:  0000000000000000(0000) GS:ffff930b4e700000(0000) knlGS:0000000000000000
    [ 9890.526316] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 9890.526318] CR2: 0000000000000000 CR3: 0000000391c58005 CR4: 00000000001706f0
    [ 9890.526321] Call Trace:
    [ 9890.526324]  <TASK>
    [ 9890.526327] ? show_regs (arch/x86/kernel/dumpstack.c:479)
    [ 9890.526335] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
    [ 9890.526340] ? page_fault_oops (arch/x86/mm/fault.c:713)
    [ 9890.526347] ? search_module_extables (kernel/module/main.c:3256 (discriminator 3))
    [ 9890.526353] ? ieee80211_start_tx_ba_session (net/mac80211/agg-tx.c:618 (discriminator 1)) mac80211

    Signed-off-by: Zong-Zhe Yang <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Xiangyu Chen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 6a91a5816b289018e0b42a25444c0b4f8c637dca
Author: Pavel Begunkov <[email protected]>
Date:   Wed Apr 10 02:26:54 2024 +0100

    io_uring: always lock __io_cqring_overflow_flush

    commit 8d09a88ef9d3cb7d21d45c39b7b7c31298d23998 upstream.

    Conditional locking is never great, and in the case of
    __io_cqring_overflow_flush(), which is a slow path, it's not justified.
    Don't handle IOPOLL separately, always grab uring_lock for overflow
    flushing.

    Signed-off-by: Pavel Begunkov <[email protected]>
    Link: https://lore.kernel.org/r/162947df299aa12693ac4b305dacedab32ec7976.1712708261.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit e3fb0e6afcc399660770428a35162b4880e2e14e
Author: Haibo Chen <[email protected]>
Date:   Thu Sep 5 17:43:38 2024 +0800

    arm64: dts: imx8ulp: correct the flexspi compatible string

    commit 409dc5196d5b6eb67468a06bf4d2d07d7225a67b upstream.

    The flexspi on imx8ulp only has 16 LUTs, while the imx8mm flexspi has
    32 LUTs, so correct the compatible string here. Otherwise the driver
    triggers the warning below:

    [    1.119072] ------------[ cut here ]------------
    [    1.123926] WARNING: CPU: 0 PID: 1 at drivers/spi/spi-nxp-fspi.c:855 nxp_fspi_exec_op+0xb04/0xb64
    [    1.133239] Modules linked in:
    [    1.136448] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-rc6-next-20240902-00001-g131bf9439dd9 #69
    [    1.146821] Hardware name: NXP i.MX8ULP EVK (DT)
    [    1.151647] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [    1.158931] pc : nxp_fspi_exec_op+0xb04/0xb64
    [    1.163496] lr : nxp_fspi_exec_op+0xa34/0xb64
    [    1.168060] sp : ffff80008002b2a0
    [    1.171526] x29: ffff80008002b2d0 x28: 0000000000000000 x27: 0000000000000000
    [    1.179002] x26: ffff2eb645542580 x25: ffff800080610014 x24: ffff800080610000
    [    1.186480] x23: ffff2eb645548080 x22: 0000000000000006 x21: ffff2eb6455425e0
    [    1.193956] x20: 0000000000000000 x19: ffff80008002b5e0 x18: ffffffffffffffff
    [    1.201432] x17: ffff2eb644467508 x16: 0000000000000138 x15: 0000000000000002
    [    1.208907] x14: 0000000000000000 x13: ffff2eb6400d8080 x12: 00000000ffffff00
    [    1.216378] x11: 0000000000000000 x10: ffff2eb6400d8080 x9 : ffff2eb697adca80
    [    1.223850] x8 : ffff2eb697ad3cc0 x7 : 0000000100000000 x6 : 0000000000000001
    [    1.231324] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000000007a6
    [    1.238795] x2 : 0000000000000000 x1 : 00000000000001ce x0 : 00000000ffffff92
    [    1.246267] Call trace:
    [    1.248824]  nxp_fspi_exec_op+0xb04/0xb64
    [    1.253031]  spi_mem_exec_op+0x3a0/0x430
    [    1.257139]  spi_nor_read_id+0x80/0xcc
    [    1.261065]  spi_nor_scan+0x1ec/0xf10
    [    1.264901]  spi_nor_probe+0x108/0x2fc
    [    1.268828]  spi_mem_probe+0x6c/0xbc
    [    1.272574]  spi_probe+0x84/0xe4
    [    1.275958]  really_probe+0xbc/0x29c
    [    1.279713]  __driver_probe_device+0x78/0x12c
    [    1.284277]  driver_probe_device+0xd8/0x15c
    [    1.288660]  __device_attach_driver+0xb8/0x134
    [    1.293316]  bus_for_each_drv+0x88/0xe8
    [    1.297337]  __device_attach+0xa0/0x190
    [    1.301353]  device_initial_probe+0x14/0x20
    [    1.305734]  bus_probe_device+0xac/0xb0
    [    1.309752]  device_add+0x5d0/0x790
    [    1.313408]  __spi_add_device+0x134/0x204
    [    1.317606]  of_register_spi_device+0x3b4/0x590
    [    1.322348]  spi_register_controller+0x47c/0x754
    [    1.327181]  devm_spi_register_controller+0x4c/0xa4
    [    1.332289]  nxp_fspi_probe+0x1cc/0x2b0
    [    1.336307]  platform_probe+0x68/0xc4
    [    1.340145]  really_probe+0xbc/0x29c
    [    1.343893]  __driver_probe_device+0x78/0x12c
    [    1.348457]  driver_probe_device+0xd8/0x15c
    [    1.352838]  __driver_attach+0x90/0x19c
    [    1.356857]  bus_for_each_dev+0x7c/0xdc
    [    1.360877]  driver_attach+0x24/0x30
    [    1.364624]  bus_add_driver+0xe4/0x208
    [    1.368552]  driver_register+0x5c/0x124
    [    1.372573]  __platform_driver_register+0x28/0x34
    [    1.377497]  nxp_fspi_driver_init+0x1c/0x28
    [    1.381888]  do_one_initcall+0x80/0x1c8
    [    1.385908]  kernel_init_freeable+0x1c4/0x28c
    [    1.390472]  kernel_init+0x20/0x1d8
    [    1.394138]  ret_from_fork+0x10/0x20
    [    1.397885] ---[ end trace 0000000000000000 ]---
    [    1.407908] ------------[ cut here ]------------

    Fixes: ef89fd56bdfc ("arm64: dts: imx8ulp: add flexspi node")
    Cc: [email protected]
    Signed-off-by: Haibo Chen <[email protected]>
    Signed-off-by: Shawn Guo <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 1a49b96c51063d38be296a0c1537928a06f02d6e
Author: Gregory Price <[email protected]>
Date:   Fri Oct 25 10:17:24 2024 -0400

    vmscan,migrate: fix page count imbalance on node stats when demoting pages

    [ Upstream commit 35e41024c4c2b02ef8207f61b9004f6956cf037b ]

    When NUMA balancing is enabled with demotion, vmscan will call
    migrate_pages when shrinking LRUs.  migrate_pages will decrement the
    node's isolated page count, leading to an imbalanced count when
    invoked from (MG)LRU code.

    The result is dmesg output like such:

    $ cat /proc/sys/vm/stat_refresh

    [77383.088417] vmstat_refresh: nr_isolated_anon -103212
    [77383.088417] vmstat_refresh: nr_isolated_file -899642

    This negative value may impact compaction and reclaim throttling.

    The following path produces the decrement:

    shrink_folio_list
      demote_folio_list
        migrate_pages
          migrate_pages_batch
            migrate_folio_move
              migrate_folio_done
                mod_node_page_state(-ve) <- decrement

    This path happens for SUCCESSFUL migrations, not failures.  Typically
    callers to migrate_pages are required to handle putback/accounting for
    failures, but this is already handled in the shrink code.

    When accounting for migrations, instead do not decrement the count when
    the migration reason is MR_DEMOTION.  As of v6.11, this demotion logic
    is the only source of MR_DEMOTION.
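
    The accounting rule can be modeled as below. This is a simplified
    sketch, assuming the reclaim path both isolates the pages and later
    subtracts them again itself; all identifiers are hypothetical and the
    counter stands in for the node's isolated page count.

```c
#include <assert.h>

/* Hypothetical subset of migration reasons. */
enum reason_model { REASON_COMPACTION, REASON_DEMOTION };

/* Hypothetical stand-in for the per-node isolated page counter. */
struct node_stats_model {
	long nr_isolated;
};

/* Models the completion of a successful migration with the fix applied. */
static void migrate_done_model(struct node_stats_model *node, long nr_pages,
			       enum reason_model reason)
{
	assert(node && nr_pages >= 0);

	/* Demotion is accounted by the reclaim path, so skip it here. */
	if (reason != REASON_DEMOTION)
		node->nr_isolated -= nr_pages;
}

/* Models reclaim: isolate, demote successfully, then do its own accounting. */
static long reclaim_demote_model(struct node_stats_model *node, long nr_pages)
{
	node->nr_isolated += nr_pages;                       /* isolation */
	migrate_done_model(node, nr_pages, REASON_DEMOTION); /* demotion */
	node->nr_isolated -= nr_pages;                       /* reclaim accounting */
	return node->nr_isolated;
}
```

    Without the MR_DEMOTION exception the model would subtract the batch
    twice and leave the counter negative, as in the dmesg output above.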

    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 26aa2d199d6f ("mm/migrate: demote pages during reclaim")
    Signed-off-by: Gregory Price <[email protected]>
    Reviewed-by: Yang Shi <[email protected]>
    Reviewed-by: Davidlohr Bueso <[email protected]>
    Reviewed-by: Shakeel Butt <[email protected]>
    Reviewed-by: "Huang, Ying" <[email protected]>
    Reviewed-by: Oscar Salvador <[email protected]>
    Cc: Dave Hansen <[email protected]>
    Cc: Wei Xu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 003d2996964c03dfd34860500428f4cdf1f5879e
Author: Jens Axboe <[email protected]>
Date:   Thu Oct 31 08:05:44 2024 -0600

    io_uring/rw: fix missing NOWAIT check for O_DIRECT start write

    [ Upstream commit 1d60d74e852647255bd8e76f5a22dc42531e4389 ]

    When io_uring starts a write, it'll call kiocb_start_write() to bump the
    super block rwsem, preventing any freezes from happening while that
    write is in-flight. The freeze side will grab that rwsem for writing,
    excluding any new writers from happening and waiting for existing writes
    to finish. But io_uring unconditionally uses kiocb_start_write(), which
    will block if someone is currently attempting to freeze the mount point.
    This causes a deadlock where freeze is waiting for previous writes to
    complete, but the previous writes cannot complete, as the task that is
    supposed to complete them is blocked waiting on starting a new write.
    This results in the following stuck trace showing that dependency with
    the write blocked starting a new write:

    task:fio             state:D stack:0     pid:886   tgid:886   ppid:876
    Call trace:
     __switch_to+0x1d8/0x348
     __schedule+0x8e8/0x2248
     schedule+0x110/0x3f0
     percpu_rwsem_wait+0x1e8/0x3f8
     __percpu_down_read+0xe8/0x500
     io_write+0xbb8/0xff8
     io_issue_sqe+0x10c/0x1020
     io_submit_sqes+0x614/0x2110
     __arm64_sys_io_uring_enter+0x524/0x1038
     invoke_syscall+0x74/0x268
     el0_svc_common.constprop.0+0x160/0x238
     do_el0_svc+0x44/0x60
     el0_svc+0x44/0xb0
     el0t_64_sync_handler+0x118/0x128
     el0t_64_sync+0x168/0x170
    INFO: task fsfreeze:7364 blocked for more than 15 seconds.
          Not tainted 6.12.0-rc5-00063-g76aaf945701c #7963

    with the attempting freezer stuck trying to grab the rwsem:

    task:fsfreeze        state:D stack:0     pid:7364  tgid:7364  ppid:995
    Call trace:
     __switch_to+0x1d8/0x348
     __schedule+0x8e8/0x2248
     schedule+0x110/0x3f0
     percpu_down_write+0x2b0/0x680
     freeze_super+0x248/0x8a8
     do_vfs_ioctl+0x149c/0x1b18
     __arm64_sys_ioctl+0xd0/0x1a0
     invoke_syscall+0x74/0x268
     el0_svc_common.constprop.0+0x160/0x238
     do_el0_svc+0x44/0x60
     el0_svc+0x44/0xb0
     el0t_64_sync_handler+0x118/0x128
     el0t_64_sync+0x168/0x170

    Fix this by having the io_uring side honor IOCB_NOWAIT, and only attempt a
    blocking grab of the super block rwsem if it isn't set. For normal issue
    where IOCB_NOWAIT would always be set, this returns -EAGAIN which will
    have io_uring core issue a blocking attempt of the write. That will in
    turn also get completions run, ensuring forward progress.

    Since freezing requires CAP_SYS_ADMIN in the first place, this isn't
    something that can be triggered by a regular user.

    Cc: [email protected] # 5.10+
    Reported-by: Peter Mann <[email protected]>
    Link: https://lore.kernel.org/io-uring/[email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 70bbe8d0a949413df1bb6532fd6b19fbf0f88feb
Author: Andrey Konovalov <[email protected]>
Date:   Tue Oct 22 18:07:06 2024 +0200

    kasan: remove vmalloc_percpu test

    [ Upstream commit 330d8df81f3673d6fb74550bbc9bb159d81b35f7 ]

    Commit 1a2473f0cbc0 ("kasan: improve vmalloc tests") added the
    vmalloc_percpu KASAN test with the assumption that __alloc_percpu always
    uses vmalloc internally, which is tagged by KASAN.

    However, __alloc_percpu might allocate memory from the first per-CPU
    chunk, which is not allocated via vmalloc().  As a result, the test might
    fail.

    Remove the test until proper KASAN annotations for per-CPU allocations
    are added; tracked in https://bugzilla.kernel.org/show_bug.cgi?id=215019.

    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 1a2473f0cbc0 ("kasan: improve vmalloc tests")
    Signed-off-by: Andrey Konovalov <[email protected]>
    Reported-by: Samuel Holland <[email protected]>
    Link: https://lore.kernel.org/all/[email protected]/
    Reported-by: Sabyrzhan Tasbolatov <[email protected]>
    Link: https://lore.kernel.org/all/CACzwLxiWzNqPBp4C1VkaXZ2wDwvY3yZeetCi1TLGFipKW77drA@mail.gmail.com/
    Cc: Alexander Potapenko <[email protected]>
    Cc: Andrey Ryabinin <[email protected]>
    Cc: Dmitry Vyukov <[email protected]>
    Cc: Marco Elver <[email protected]>
    Cc: Sabyrzhan Tasbolatov <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit c60af16e1d6cc2237d58336546d6adfc067b6b8f
Author: Vitaliy Shevtsov <[email protected]>
Date:   Mon Sep 16 22:41:37 2024 +0500

    nvmet-auth: assign dh_key to NULL after kfree_sensitive

    [ Upstream commit d2f551b1f72b4c508ab9298419f6feadc3b5d791 ]

    ctrl->dh_key might be used across multiple calls to nvmet_setup_dhgroup()
    for the same controller. So it's better to nullify it after release on
    error path in order to avoid double free later in nvmet_destroy_auth().

    Found by Linux Verification Center (linuxtesting.org) with Svace.

    Fixes: 7a277c37d352 ("nvmet-auth: Diffie-Hellman key exchange support")
    Cc: [email protected]
    Signed-off-by: Vitaliy Shevtsov <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Reviewed-by: Hannes Reinecke <[email protected]>
    Signed-off-by: Keith Busch <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
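
A minimal user-space sketch of the free-then-NULL pattern the commit above describes. All names here (`ctrl_model`, `free_sensitive`, `setup_dhgroup_model`, `destroy_auth_model`) are hypothetical stand-ins, not the nvmet code; `free_sensitive` only imitates what kfree_sensitive() does in the kernel.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the controller; only the key fields are modeled. */
struct ctrl_model {
	unsigned char *dh_key;
	size_t dh_keysize;
};

/*
 * User-space imitation of kfree_sensitive(): wipe, then free. Accepts NULL.
 * A plain memset() before free() may be optimized away by the compiler;
 * the kernel helper uses an explicit wipe for that reason.
 */
static void free_sensitive(void *p, size_t len)
{
	if (!p)
		return;
	memset(p, 0, len);
	free(p);
}

/* Setup that can be called repeatedly and may fail after allocating the key. */
static int setup_dhgroup_model(struct ctrl_model *c, size_t keysize, int fail)
{
	/* Release any key left over from an earlier successful call. */
	free_sensitive(c->dh_key, c->dh_keysize);
	c->dh_key = malloc(keysize);
	if (!c->dh_key) {
		c->dh_keysize = 0;
		return -1;
	}
	c->dh_keysize = keysize;

	if (fail) {
		free_sensitive(c->dh_key, c->dh_keysize);
		c->dh_key = NULL;	/* the point of the fix: no dangling pointer */
		c->dh_keysize = 0;
		return -1;
	}
	return 0;
}

/* Teardown frees whatever is still referenced; NULL is harmless. */
static void destroy_auth_model(struct ctrl_model *c)
{
	free_sensitive(c->dh_key, c->dh_keysize);
	c->dh_key = NULL;
	c->dh_keysize = 0;
}
```

Without the assignment to NULL on the error path, the later teardown would free the same pointer a second time.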

commit 4a39320977f9c665faa37efaa8093b8e82dd8c41
Author: Christoffer Sandberg <[email protected]>
Date:   Tue Oct 29 16:16:53 2024 +0100

    ALSA: hda/realtek: Fix headset mic on TUXEDO Stellaris 16 Gen6 mb1

    [ Upstream commit e49370d769e71456db3fbd982e95bab8c69f73e8 ]

    Quirk is needed to enable headset microphone on missing pin 0x19.

    Signed-off-by: Christoffer Sandberg <[email protected]>
    Signed-off-by: Werner Sembach <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit b42adef85aca72b51eab1a812a79913ff5aeb584
Author: Christoffer Sandberg <[email protected]>
Date:   Tue Oct 29 16:16:52 2024 +0100

    ALSA: hda/realtek: Fix headset mic on TUXEDO Gemini 17 Gen3

    [ Upstream commit 0b04fbe886b4274c8e5855011233aaa69fec6e75 ]

    Quirk is needed to enable headset microphone on missing pin 0x19.

    Signed-off-by: Christoffer Sandberg <[email protected]>
    Signed-off-by: Werner Sembach <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 77ddc732416b017180893cbb2356e9f0a414c575
Author: Christoph Hellwig <[email protected]>
Date:   Wed Oct 23 15:37:22 2024 +0200

    xfs: fix finding a last resort AG in xfs_filestream_pick_ag

    [ Upstream commit dc60992ce76fbc2f71c2674f435ff6bde2108028 ]

    When the main loop in xfs_filestream_pick_ag fails to find a suitable
    AG it tries to just pick the online AG.  But the loop for that uses
    args->pag as loop iterator while the later code expects pag to be
    set.  Fix this by reusing the max_pag case for this last resort, and
    also add a check for the impossible case of no AG, just to make sure that
    the uninitialized pag cannot escape even in theory.

    Reported-by: [email protected]
    Signed-off-by: Christoph Hellwig <[email protected]>
    Tested-by: [email protected]
    Fixes: f8f1ed1ab3baba ("xfs: return a referenced perag from filestreams allocator")
    Cc: <[email protected]> # v6.3
    Reviewed-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Carlos Maiolino <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 8e886e44397ba89f6e8da8471386112b4f5b67b7
Author: Matt Johnston <[email protected]>
Date:   Tue Oct 22 18:25:14 2024 +0800

    mctp i2c: handle NULL header address

    [ Upstream commit 01e215975fd80af81b5b79f009d49ddd35976c13 ]

    daddr can be NULL if there is no neighbour table entry present,
    in that case the tx packet should be dropped.

    saddr will usually be set by MCTP core, but check for NULL in case a
    packet is transmitted by a different protocol.

    Fixes: f5b8abf9fc3d ("mctp i2c: MCTP I2C binding driver")
    Cc: [email protected]
    Reported-by: Dung Cao <[email protected]>
    Signed-off-by: Matt Johnston <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/20241022-mctp-i2c-null-dest-v3-1-e929709956c5@codeconstruct.com.au
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
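
A minimal sketch of the NULL checks described in the commit above, written as stand-alone C. The `hdr_model` layout and `build_header_model()` are hypothetical and far simpler than the real MCTP I2C header; the choice of -EINVAL as the error code is likewise an assumption of this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified link-layer header: one address byte each way. */
struct hdr_model {
	uint8_t dest;
	uint8_t source;
};

/*
 * Build a header from optional addresses.
 * daddr == NULL models "no neighbour table entry": the packet is dropped.
 * saddr == NULL models a packet sent by a protocol that did not set it.
 */
static int build_header_model(struct hdr_model *hdr,
			      const uint8_t *daddr, const uint8_t *saddr)
{
	if (!daddr || !saddr)
		return -EINVAL;

	hdr->dest = *daddr;
	hdr->source = *saddr;
	return 0;
}
```

A caller that gets a negative return value is expected to drop the packet rather than transmit a partially filled header.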

commit 88f97a4b5843ce21c1286e082c02a5fb4d8eb473
Author: Edward Adam Davis <[email protected]>
Date:   Wed Oct 16 19:43:47 2024 +0800

    ocfs2: pass u64 to ocfs2_truncate_inline maybe overflow

    [ Upstream commit bc0a2f3a73fcdac651fca64df39306d1e5ebe3b0 ]

    Syzbot reported a kernel BUG in ocfs2_truncate_inline.  There are two
    reasons for this: first, the parameter value passed is greater than
    ocfs2_max_inline_data_with_xattr; second, the start and end parameters of
    ocfs2_truncate_inline are "unsigned int", so a u64 argument can be
    truncated.

    So add a sanity check for byte_start and byte_len right before
    ocfs2_truncate_inline() in ocfs2_remove_inode_range(): if they are greater
    than ocfs2_max_inline_data_with_xattr, return -EINVAL.

    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 1afc32b95233 ("ocfs2: Write support for inline data")
    Signed-off-by: Edward Adam Davis <[email protected]>
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=81092778aac03460d6b7
    Reviewed-by: Joseph Qi <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Joseph Qi <[email protected]>
    Cc: Mark Fasheh <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
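
A minimal sketch of the kind of range check the commit above describes, assuming a hypothetical helper `check_inline_range()` and a caller-supplied `max_inline` limit. It is not the ocfs2 code; it only illustrates validating 64-bit values before they reach a function that takes 32-bit parameters.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Reject a byte range that does not fit inside an inline-data area of
 * max_inline bytes. The individual comparisons come first, so the sum
 * below cannot wrap around in 64 bits.
 */
static int check_inline_range(uint64_t byte_start, uint64_t byte_len,
			      unsigned int max_inline)
{
	if (byte_start > max_inline || byte_len > max_inline)
		return -EINVAL;
	if (byte_start + byte_len > max_inline)
		return -EINVAL;
	return 0;
}
```

The check matters because a value such as 0x100000010 silently becomes 0x10 when narrowed to 32 bits, so the callee would see an offset that looks valid.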

commit c117a980185ee3812612e7e453e356a6a4f05305
Author: Sabyrzhan Tasbolatov <[email protected]>
Date:   Wed Oct 16 20:24:07 2024 +0500

    x86/traps: move kmsan check after instrumentation_begin

    [ Upstream commit 1db272864ff250b5e607283eaec819e1186c8e26 ]

    During an x86_64 kernel build with CONFIG_KMSAN, objtool emits the
    following warning:

      AR      built-in.a
      AR      vmlinux.a
      LD      vmlinux.o
    vmlinux.o: warning: objtool: handle_bug+0x4: call to
        kmsan_unpoison_entry_regs() leaves .noinstr.text section
      OBJCOPY modules.builtin.modinfo
      GEN     modules.builtin
      MODPOST Module.symvers
      CC      .vmlinux.export.o

    Moving kmsan_unpoison_entry_regs() _after_ instrumentation_begin() fixes
    the warning.

    The decode_bug(regs->ip, &imm) call is left before the KMSAN unpoisoning
    because it has an early-return condition; moving it after
    instrumentation_begin() results in the warning "return with
    instrumentation enabled".  As a consequence, I'm concerned that regs will
    not be KMSAN unpoisoned if `ud_type == BUG_NONE` is true.

    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: ba54d194f8da ("x86/traps: avoid KMSAN bugs originating from handle_bug()")
    Signed-off-by: Sabyrzhan Tasbolatov <[email protected]>
    Reviewed-by: Alexander Potapenko <[email protected]>
    Cc: Borislav Petkov (AMD) <[email protected]>
    Cc: Dave Hansen <[email protected]>
    Cc: Ingo Molnar <[email protected]>
    Cc: Thomas Gleixner <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 86ee1845cbbf52eff6d41ce438d5f7e9ab6f4602
Author: Gatlin Newhouse <[email protected]>
Date:   Wed Jul 24 00:01:55 2024 +0000

    x86/traps: Enable UBSAN traps on x86

    [ Upstream commit 7424fc6b86c8980a87169e005f5cd4438d18efe6 ]

    Currently ARM64 extracts which specific sanitizer has caused a trap via
    encoded data in the trap instruction. Clang on x86 currently encodes the
    same data in the UD1 instruction but x86 handle_bug() and
    is_valid_bugaddr() currently only look at UD2.

    Bring x86 to parity with ARM64, similar to commit 25b84002afb9 ("arm64:
    Support Clang UBSAN trap codes for better reporting"). See the llvm
    links for information about the code generation.

    Enable the reporting of UBSAN sanitizer details on x86 compiled with clang
    when CONFIG_UBSAN_TRAP=y by analysing UD1 and retrieving the type immediate
    which is encoded by the compiler after the UD1.

    [ tglx: Simplified it by moving the printk() into handle_bug() ]

    Signed-off-by: Gatlin Newhouse <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Acked-by: Peter Zijlstra (Intel) <[email protected]>
    Cc: Kees Cook <[email protected]>
    Link: https://lore.kernel.org/all/[email protected]
    Link: https://github.com/llvm/llvm-project/commit/c5978f42ec8e9#diff-bb68d7cd885f41cfc35843998b0f9f534adb60b415f647109e597ce448e92d9f
    Link: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86InstrSystem.td#L27
    Stable-dep-of: 1db272864ff2 ("x86/traps: move kmsan check after instrumentation_begin")
    Signed-off-by: Sasha Levin <[email protected]>

commit b958948ae1cb3e39c48e9f805436fd652103c71e
Author: Matt Fleming <[email protected]>
Date:   Fri Oct 11 13:07:37 2024 +0100

    mm/page_alloc: let GFP_ATOMIC order-0 allocs access highatomic reserves

    [ Upstream commit 281dd25c1a018261a04d1b8bf41a0674000bfe38 ]

    Under memory pressure it's possible for GFP_ATOMIC order-0 allocations to
    fail even though free pages are available in the highatomic reserves.
    GFP_ATOMIC allocations cannot trigger unreserve_highatomic_pageblock()
    since it's only run from reclaim.

    Given that such allocations will pass the watermarks in
    __zone_watermark_unusable_free(), it makes sense to fallback to highatomic
    reserves the same way that ALLOC_OOM can.

    This fixes order-0 page allocation failures observed on Cloudflare's fleet
    when handling network packets:

      kswapd1: page allocation failure: order:0, mode:0x820(GFP_ATOMIC),
      nodemask=(null),cpuset=/,mems_allowed=0-7
      CPU: 10 PID: 696 Comm: kswapd1 Kdump: loaded Tainted: G           O 6.6.43-CUSTOM #1
      Hardware name: MACHINE
      Call Trace:
       <IRQ>
       dump_stack_lvl+0x3c/0x50
       warn_alloc+0x13a/0x1c0
       __alloc_pages_slowpath.constprop.0+0xc9d/0xd10
       __alloc_pages+0x327/0x340
       __napi_alloc_skb+0x16d/0x1f0
       bnxt_rx_page_skb+0x96/0x1b0 [bnxt_en]
       bnxt_rx_pkt+0x201/0x15e0 [bnxt_en]
       __bnxt_poll_work+0x156/0x2b0 [bnxt_en]
       bnxt_poll+0xd9/0x1c0 [bnxt_en]
       __napi_poll+0x2b/0x1b0
       bpf_trampoline_6442524138+0x7d/0x1000
       __napi_poll+0x5/0x1b0
       net_rx_action+0x342/0x740
       handle_softirqs+0xcf/0x2b0
       irq_exit_rcu+0x6c/0x90
       sysvec_apic_timer_interrupt+0x72/0x90
       </IRQ>

    [[email protected]: update comment]
      Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lore.kernel.org/all/CAGis_TWzSu=P7QJmjD58WWiu3zjMTVKSzdOwWE8ORaGytzWJwQ@mail.gmail.com/
    Fixes: 1d91df85f399 ("mm/page_alloc: handle a missing case for memalloc_nocma_{save/restore} APIs")
    Signed-off-by: Matt Fleming <[email protected]>
    Suggested-by: Vlastimil Babka <[email protected]>
    Reviewed-by: Vlastimil Babka <[email protected]>
    Cc: Mel Gorman <[email protected]>
    Cc: Michal Hocko <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 4882a352b5df897c30f9d64fba340a219a6604d0
Author: Alexander Usyskin <[email protected]>
Date:   Tue Oct 15 15:31:57 2024 +0300

    mei: use kvmalloc for read buffer

    [ Upstream commit 4adf613e01bf99e1739f6ff3e162ad5b7d578d1a ]

    The read buffer is allocated according to the max message size reported
    by the firmware and may reach 64K in systems with a pxp client.
    A contiguous 64K allocation may fail under memory pressure.
    The read buffer is used as in-driver message storage and is not required
    to be contiguous.
    Use kvmalloc to allow the kernel to allocate non-contiguous memory.

    Fixes: 3030dc056459 ("mei: add wrapper for queuing control commands.")
    Cc: stable <[email protected]>
    Reported-by: Rohit Agarwal <[email protected]>
    Closes: https://lore.kernel.org/all/[email protected]/
    Tested-by: Brian Geffon <[email protected]>
    Signed-off-by: Alexander Usyskin <[email protected]>
    Acked-by: Tomas Winkler <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit cb8b81ad3e893a6d18dcdd3754cc2ea2a42c0136
Author: Matthieu Baerts (NGI0) <[email protected]>
Date:   Mon Oct 21 12:25:26 2024 +0200

    mptcp: init: protect sched with rcu_read_lock

    [ Upstream commit 3deb12c788c385e17142ce6ec50f769852fcec65 ]

    Enabling CONFIG_PROVE_RCU_LIST with its dependency CONFIG_RCU_EXPERT
    creates this splat when an MPTCP socket is created:

      =============================
      WARNING: suspicious RCU usage
      6.12.0-rc2+ #11 Not tainted
      -----------------------------
      net/mptcp/sched.c:44 RCU-list traversed in non-reader section!!

      other info that might help us debug this:

      rcu_scheduler_active = 2, debug_locks = 1
      no locks held by mptcp_connect/176.

      stack backtrace:
      CPU: 0 UID: 0 PID: 176 Comm: mptcp_connect Not tainted 6.12.0-rc2+ #11
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      Call Trace:
       <TASK>
       dump_stack_lvl (lib/dump_stack.c:123)
       lockdep_rcu_suspicious (kernel/locking/lockdep.c:6822)
       mptcp_sched_find (net/mptcp/sched.c:44 (discriminator 7))
       mptcp_init_sock (net/mptcp/protocol.c:2867 (discriminator 1))
       ? sock_init_data_uid (arch/x86/include/asm/atomic.h:28)
       inet_create.part.0.constprop.0 (net/ipv4/af_inet.c:386)
       ? __sock_create (include/linux/rcupdate.h:347 (discriminator 1))
       __sock_create (net/socket.c:1576)
       __sys_socket (net/socket.c:1671)
       ? __pfx___sys_socket (net/socket.c:1712)
       ? do_user_addr_fault (arch/x86/mm/fault.c:1419 (discriminator 1))
       __x64_sys_socket (net/socket.c:1728)
       do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1))
       entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)

    That's because when the socket is initialised, rcu_read_lock() is not
    used despite the explicit comment written above the declaration of
    mptcp_sched_find() in sched.c. Adding the missing lock/unlock avoids the
    warning.

    Fixes: 1730b2b2c5a5 ("mptcp: add sched in mptcp_sock")
    Cc: [email protected]
    Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/523
    Reviewed-by: Geliang Tang <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 4f7ffa83fa79dd52efbaef366c850aaaae06a469
Author: Hugh Dickins <[email protected]>
Date:   Sun Oct 27 15:23:23 2024 -0700

    iov_iter: fix copy_page_from_iter_atomic() if KMAP_LOCAL_FORCE_MAP

    [ Upstream commit c749d9b7ebbc5716af7a95f7768634b30d9446ec ]

    generic/077 on x86_32 CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP=y with highmem,
    on huge=always tmpfs, issues a warning and then hangs (interruptibly):

    WARNING: CPU: 5 PID: 3517 at mm/highmem.c:622 kunmap_local_indexed+0x62/0xc9
    CPU: 5 UID: 0 PID: 3517 Comm: cp Not tainted 6.12.0-rc4 #2
    ...
    copy_page_from_iter_atomic+0xa6/0x5ec
    generic_perform_write+0xf6/0x1b4
    shmem_file_write_iter+0x54/0x67

    Fix copy_page_from_iter_atomic() by limiting it in that case
    (include/linux/skbuff.h skb_frag_must_loop() does similar).

    But going forward, perhaps CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP is too
    surprising, has outlived its usefulness, and should just be removed?

    Fixes: 908a1ad89466 ("iov_iter: Handle compound highmem pages in copy_page_from_iter_atomic()")
    Signed-off-by: Hugh Dickins <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Christoph Hellwig <[email protected]>
    Cc: [email protected]
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit ade91f6e9848b370add44d89c976e070ccb492ef
Author: Shawn Wang <[email protected]>
Date:   Fri Oct 25 10:22:08 2024 +0800

    sched/numa: Fix the potential null pointer dereference in task_numa_work()

    [ Upstream commit 9c70b2a33cd2aa6a5a59c5523ef053bd42265209 ]

    When running the stress-ng-vm-segv test, we found a null pointer dereference
    error in task_numa_work(). Here is the backtrace:

      [323676.066985] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
      ......
      [323676.067108] CPU: 35 PID: 2694524 Comm: stress-ng-vm-se
      ......
      [323676.067113] pstate: 23401009 (nzCv daif +PAN -UAO +TCO +DIT +SSBS BTYPE=--)
      [323676.067115] pc : vma_migratable+0x1c/0xd0
      [323676.067122] lr : task_numa_work+0x1ec/0x4e0
      [323676.067127] sp : ffff8000ada73d20
      [323676.067128] x29: ffff8000ada73d20 x28: 0000000000000000 x27: 000000003e89f010
      [323676.067130] x26: 0000000000080000 x25: ffff800081b5c0d8 x24: ffff800081b27000
      [323676.067133] x23: 0000000000010000 x22: 0000000104d18cc0 x21: ffff0009f7158000
      [323676.067135] x20: 0000000000000000 x19: 0000000000000000 x18: ffff8000ada73db8
      [323676.067138] x17: 0001400000000000 x16: ffff800080df40b0 x15: 0000000000000035
      [323676.067140] x14: ffff8000ada73cc8 x13: 1fffe0017cc72001 x12: ffff8000ada73cc8
      [323676.067142] x11: ffff80008001160c x10: ffff000be639000c x9 : ffff8000800f4ba4
      [323676.067145] x8 : ffff000810375000 x7 : ffff8000ada73974 x6 : 0000000000000001
      [323676.067147] x5 : 0068000b33e26707 x4 : 0000000000000001 x3 : ffff0009f7158000
      [323676.067149] x2 : 0000000000000041 x1 : 0000000000004400 x0 : 0000000000000000
      [323676.067152] Call trace:
      [323676.067153]  vma_migratable+0x1c/0xd0
      [323676.067155]  task_numa_work+0x1ec/0x4e0
      [323676.067157]  task_work_run+0x78/0xd8
      [323676.067161]  do_notify_resume+0x1ec/0x290
      [323676.067163]  el0_svc+0x150/0x160
      [323676.067167]  el0t_64_sync_handler+0xf8/0x128
      [323676.067170]  el0t_64_sync+0x17c/0x180
      [323676.067173] Code: d2888001 910003fd f9000bf3 aa0003f3 (f9401000)
      [323676.067177] SMP: stopping secondary CPUs
      [323676.070184] Starting crashdump kernel...

    stress-ng-vm-segv in stress-ng is used to stress test the SIGSEGV error
    handling function of the system, which tries to cause a SIGSEGV error on
    return from unmapping the whole address space of the child process.

    Normally this program will not cause kernel crashes. But before the
    munmap system call returns to user mode, a potential task_numa_work()
    for numa balancing could be added and executed. In this scenario, since the
    child process has no vma after munmap, the vma_next() in task_numa_work()
    will return a null pointer even if the vma iterator restarts from 0.

    Recheck the vma pointer before dereferencing it in task_numa_work().

    Fixes: 214dbc428137 ("sched: convert to vma iterator")
    Signed-off-by: Shawn Wang <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Cc: [email protected] # v6.2+
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>
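
A minimal user-space sketch of the recheck described in the commit above. The types `vma_model` and `mm_model` and the functions `vma_next_model()` and `scan_model()` are hypothetical; they only imitate an iterator that can return NULL even after being restarted from 0, which is the case the fix guards against.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a mapping and an address space. */
struct vma_model {
	unsigned long start;
};

struct mm_model {
	struct vma_model *vmas;
	size_t count;
};

/* Return the next mapping, or NULL when the iterator is exhausted. */
static struct vma_model *vma_next_model(struct mm_model *mm, size_t *idx)
{
	if (*idx >= mm->count)
		return NULL;
	return &mm->vmas[(*idx)++];
}

/*
 * Scan from start_idx, wrapping to 0 once if nothing is found there.
 * The loop condition rechecks the pointer before every use, so an
 * address space with no mappings at all is handled without a dereference.
 */
static int scan_model(struct mm_model *mm, size_t start_idx)
{
	size_t idx = start_idx;
	struct vma_model *vma = vma_next_model(mm, &idx);
	int visited = 0;

	if (!vma) {
		idx = 0;
		vma = vma_next_model(mm, &idx);
	}

	for (; vma; vma = vma_next_model(mm, &idx))
		visited++;	/* the real work would dereference vma here */

	return visited;
}
```

A do/while style loop that touches the pointer before testing it would crash on the empty case; testing first costs nothing on the common path.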

commit 8c9a1ec39c698cbc38f4efa9113185f885137f8b
Author: Dan Williams <[email protected]>
Date:   Tue Oct 22 18:43:40 2024 -0700

    cxl/acpi: Ensure ports ready at cxl_acpi_probe() return

    [ Upstream commit 48f62d38a07d464a499fa834638afcfd2b68f852 ]

    In order to ensure root CXL ports are enabled upon cxl_acpi_probe()
    when the 'cxl_port' driver is built as a module, arrange for the
    module to be pre-loaded or built-in.

    The "Fixes:" but no "Cc: stable" on this patch reflects that the issue
    is merely by inspection since the bug that triggered the discovery of
    this potential problem [1] is fixed by other means. However, a stable
    backport should do no harm.

    Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver")
    Link: http://lore.kernel.org/[email protected] [1]
    Signed-off-by: Dan Williams <[email protected]>
    Tested-by: Gregory Price <[email protected]>
    Reviewed-by: Jonathan Cameron <[email protected]>
    Reviewed-by: Ira Weiny <[email protected]>
    Link: https://patch.msgid.link/172964781969.81806.17276352414854540808.stgit@dwillia2-xfh.jf.intel.com
    Signed-off-by: Ira Weiny <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit a9ed67f39f888bb6e5729112ad45f15d9c5a3ef8
Author: Dan Williams <[email protected]>
Date:   Tue Oct 22 18:43:32 2024 -0700

    cxl/port: Fix cxl_bus_rescan() vs bus_rescan_devices()

    [ Upstream commit 3d6ebf16438de5d712030fefbb4182b46373d677 ]

    It turns out since its original introduction, pre-2.6.12,
    bus_rescan_devices() has skipped devices that might be in the process of
    attaching or detaching from their driver. For CXL this behavior is
    unwanted; CXL expects cxl_bus_rescan() to be a probe barrier.

    That behavior is simple enough to achieve with bus_for_each_dev() paired
    with a call to device_attach(), and it is unclear why bus_rescan_devices()
    took the position of lockless consumption of dev->driver which is racy.

    The "Fixes:" but no "Cc: stable" on this patch reflects that the issue
    is merely by inspection since the bug that triggered the discovery of
    this potential problem [1] is fixed by other means.  However, a stable
    backport should do no harm.

    Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver")
    Link: http://lore.kernel.org/[email protected] [1]
    Signed-off-by: Dan Williams <[email protected]>
    Tested-by: Gregory Price <[email protected]>
    Reviewed-by: Jonathan Cameron <[email protected]>
    Reviewed-by: Ira Weiny <[email protected]>
    Link: https://patch.msgid.link/172964781104.81806.4277549800082443769.stgit@dwillia2-xfh.jf.intel.com
    Signed-off-by: Ira Weiny <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit d210bc87cc4fdde62f757002530a08c3d109d94a
Author: Chunyan Zhang <[email protected]>
Date:   Tue Oct 8 17:41:39 2024 +0800

    riscv: Remove duplicated GET_RM

    [ Upstream commit 164f66de6bb6ef454893f193c898dc8f1da6d18b ]

    The macro GET_RM is defined twice in this file; one definition can be
    removed.

    Reviewed-by: Alexandre Ghiti <[email protected]>
    Signed-off-by: Chunyan Zhang <[email protected]>
    Fixes: 956d705dd279 ("riscv: Unaligned load/store handling for M_MODE")
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 6d84e1b2e5ac04511e68bcf5577fc8369e73f4ed
Author: Chunyan Zhang <[email protected]>
Date:   Tue Oct 8 17:41:38 2024 +0800

    riscv: Remove unused GENERATING_ASM_OFFSETS

    [ Upstream commit 46d4e5ac6f2f801f97bcd0ec82365969197dc9b1 ]

    The macro is not used in the current version of the kernel, so it looks
    like it can be removed to avoid a build warning:

    ../arch/riscv/kernel/asm-offsets.c: At top level:
    ../arch/riscv/kernel/asm-offsets.c:7: warning: macro "GENERATING_ASM_OFFSETS" is not used [-Wunused-macros]
        7 | #define GENERATING_ASM_OFFSETS

    Fixes: 9639a44394b9 ("RISC-V: Provide a cleaner raw_smp_processor_id()")
    Cc: [email protected]
    Reviewed-by: Alexandre Ghiti <[email protected]>
    Tested-by: Alexandre Ghiti <[email protected]>
    Signed-off-by: Chunyan Zhang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit a63ba17207c50da91b19150b6cde09d199b34c2c
Author: WangYuli <[email protected]>
Date:   Thu Oct 17 11:20:10 2024 +0800

    riscv: Use '%u' to format the output of 'cpu'

    [ Upstream commit e0872ab72630dada3ae055bfa410bf463ff1d1e0 ]

    'cpu' is an unsigned integer, so its conversion specifier should
    be %u, not %d.

    Suggested-by: Wentao Guan <[email protected]>
    Suggested-by: Maciej W. Rozycki <[email protected]>
    Link: https://lore.kernel.org/all/[email protected]/
    Signed-off-by: WangYuli <[email protected]>
    Reviewed-by: Charlie Jenkins <[email protected]>
    Tested-by: Charlie Jenkins <[email protected]>
    Fixes: f1e58583b9c7 ("RISC-V: Support cpu hotplug")
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
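
A small stand-alone illustration of why the conversion specifier matters for an unsigned value. The helper `format_cpu()` and the message text are assumptions made for this example, not the kernel's code.

```c
#include <stdio.h>

/*
 * Format an unsigned CPU number. With %u the value is printed as-is;
 * passing an unsigned int to %d is a type mismatch and would print large
 * values as negative numbers on common platforms.
 */
static int format_cpu(char *buf, size_t len, unsigned int cpu)
{
	return snprintf(buf, len, "CPU%u: off", cpu);
}
```

Real CPU numbers are small, so the mismatch is mostly a correctness-of-types issue that compilers flag with -Wformat, but matching the specifier to the type keeps the output right for every value.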

commit 909e71f28e9615410f52fca1b54acfd3d61c61c2
Author: Heinrich Schuchardt <[email protected]>
Date:   Sun Sep 29 16:02:33 2024 +0200

    riscv: efi: Set NX compat flag in PE/COFF header

    [ Upstream commit d41373a4b910961df5a5e3527d7bde6ad45ca438 ]

    The IMAGE_DLLCHARACTERISTICS_NX_COMPAT informs the firmware that the
    EFI binary does not rely on pages that are both executable and
    writable.

    The flag is used by some distro versions of GRUB to decide if the EFI
    binary may be executed.

    As the Linux kernel neither has RWX sections nor needs RWX pages for
    relocation we should set the flag.

    Cc: Ard Biesheuvel <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Heinrich Schuchardt <[email protected]>
    Reviewed-by: Emil Renner Berthing <[email protected]>
    Fixes: cb7d2dd5612a ("RISC-V: Add PE/COFF header for EFI stub")
    Acked-by: Ard Biesheuvel <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 58e78589ade880330e359587bb50b1474f43aa12
Author: Kailang Yang <[email protected]>
Date:   Fri Oct 18 13:53:24 2024 +0800

    ALSA: hda/realtek: Limit internal Mic boost on Dell platform

    [ Upstream commit 78e7be018784934081afec77f96d49a2483f9188 ]

    Dell wants to limit the internal mic boost on all Dell platforms.

    Signed-off-by: Kailang Yang <[email protected]>
    Cc: <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit ceec8ad09135c27890cdee5a9bb0bf5f58c23720
Author: Dmitry Torokhov <[email protected]>
Date:   Fri Oct 18 17:17:48 2024 -0700

    Input: edt-ft5x06 - fix regmap leak when probe fails

    [ Upstream commit bffdf9d7e51a7be8eeaac2ccf9e54a5fde01ff65 ]

    The driver neglects to free the instance of I2C regmap constructed at
    the beginning of the edt_ft5x06_ts_probe() method when probe fails.
    Additionally edt_ft5x06_ts_remove() is freeing the regmap too early,
    before the rest of the device resources that are managed by devm are
    released.

    Fix this by installing a custom devm action that will ensure that the
    regmap is released at the right time during normal teardown as well as
    in case of probe failure.

    Note that devm_regmap_init_i2c() could not be used because the driver
    may replace the original regmap with a regmap specific for M06 devices
    in the middle of the probe, and using devm_regmap_init_i2c() would
    result in releasing the M06 regmap too early.

    Reported-by: Li Zetao <[email protected]>
    Fixes: 9dfd9708ffba ("Input: edt-ft5x06 - convert to use regmap API")
    Cc: [email protected]
    Reviewed-by: Oliver Graute <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Dmitry Torokhov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit c19a0c171d37f86ab7267c638d475321fd9f0b77
Author: Alexandre Ghiti <[email protected]>
Date:   Wed Oct 16 10:36:24 2024 +0200

    riscv: vdso: Prevent the compiler from inserting calls to memset()

    [ Upstream commit bf40167d54d55d4b54d0103713d86a8638fb9290 ]

    The compiler is smart enough to insert a call to memset() in
    riscv_vdso_get_cpus(), which generates a dynamic relocation.

    So prevent this by using the -fno-builtin option.

    Fixes: e2c0cdfba7f6 ("RISC-V: User-facing API")
    Cc: [email protected]
    Signed-off-by: Alexandre Ghiti <[email protected]>
    Reviewed-by: Guo Ren <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit e79c1f1c9100b4adc91c6512985db2cc961aafaa
Author: Frank Li <[email protected]>
Date:   Wed Oct 23 16:30:32 2024 -0400

    spi: spi-fsl-dspi: Fix crash when not using GPIO chip select

    [ Upstream commit 25f00a13dccf8e45441265768de46c8bf58e08f6 ]

    Add check for the return value of spi_get_csgpiod() to avoid passing a NULL
    pointer to gpiod_direction_output(), preventing a crash when GPIO chip
    select is not used.

    Fix below crash:
    [    4.251960] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
    [    4.260762] Mem abort info:
    [    4.263556]   ESR = 0x0000000096000004
    [    4.267308]   EC = 0x25: DABT (current EL), IL = 32 bits
    [    4.272624]   SET = 0, FnV = 0
    [    4.275681]   EA = 0, S1PTW = 0
    [    4.278822]   FSC = 0x04: level 0 translation fault
    [    4.283704] Data abort info:
    [    4.286583]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
    [    4.292074]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
    [    4.297130]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
    [    4.302445] [0000000000000000] user address but active_mm is swapper
    [    4.308805] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
    [    4.315072] Modules linked in:
    [    4.318124] CPU: 2 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0-rc4-next-20241023-00008-ga20ec42c5fc1 #359
    [    4.328130] Hardware name: LS1046A QDS Board (DT)
    [    4.332832] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [    4.339794] pc : gpiod_direction_output+0x34/0x5c
    [    4.344505] lr : gpiod_direction_output+0x18/0x5c
    [    4.349208] sp : ffff80008003b8f0
    [    4.352517] x29: ffff80008003b8f0 x28: 0000000000000000 x27: ffffc96bcc7e9068
    [    4.359659] x26: ffffc96bcc6e00b0 x25: ffffc96bcc598398 x24: ffff447400132810
    [    4.366800] x23: 0000000000000000 x22: 0000000011e1a300 x21: 0000000000020002
    [    4.373940] x20: 0000000000000000 x19: 0000000000000000 x18: ffffffffffffffff
    [    4.381081] x17: ffff44740016e600 x16: 0000000500000003 x15: 0000000000000007
    [    4.388221] x14: 0000000000989680 x13: 0000000000020000 x12: 000000000000001e
    [    4.395362] x11: 0044b82fa09b5a53 x10: 0000000000000019 x9 : 0000000000000008
    [    4.402502] x8 : 0000000000000002 x7 : 0000000000000007 …
github-actions bot pushed a commit to sirdarckcat/linux-1 that referenced this pull request Nov 17, 2024
commit cf96b8e upstream.

ASan reports a memory leak caused by evlist not being deleted on exit in
perf-report, perf-script and perf-data.
The problem is caused by session->evlist not being deleted, which is
allocated in perf_session__read_header, called in perf_session__new if
perf_data is in read mode.
In case of write mode, the session->evlist is filled by the caller.
This patch solves the problem by calling evlist__delete in
perf_session__delete if perf_data is in read mode.
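
The ownership rule described above can be sketched in plain C as follows. This is a simplified model under stated assumptions, not perf's code: the `demo_` structs, the `read_mode` flag and the free counter are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct demo_evlist { int nr_entries; };

struct demo_session {
	struct demo_evlist *evlist;
	bool read_mode;
};

static int demo_evlist_frees;	/* counts frees so the behavior is observable */

static void demo_evlist_delete(struct demo_evlist *evlist)
{
	if (!evlist)
		return;
	free(evlist);
	demo_evlist_frees++;
}

/* In read mode the session allocates its own evlist; otherwise the caller supplies one. */
struct demo_session *demo_session_new(bool read_mode, struct demo_evlist *caller_evlist)
{
	struct demo_session *s;

	assert(read_mode || caller_evlist != NULL);

	s = calloc(1, sizeof(*s));
	if (!s)
		return NULL;

	s->read_mode = read_mode;
	s->evlist = read_mode ? calloc(1, sizeof(*s->evlist)) : caller_evlist;
	if (!s->evlist) {
		free(s);
		return NULL;
	}
	return s;
}

/* Free the evlist only when the session allocated it (read mode). */
void demo_session_delete(struct demo_session *s)
{
	if (!s)
		return;
	if (s->read_mode)
		demo_evlist_delete(s->evlist);
	free(s);
}
```

In write mode the caller keeps ownership and remains responsible for releasing the evlist it passed in.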

Changes in v2:
 - call evlist__delete from within perf_session__delete

v1: https://lore.kernel.org/lkml/[email protected]/

ASan report follows:

$ ./perf script report flamegraph
=================================================================
==227640==ERROR: LeakSanitizer: detected memory leaks

<SNIP unrelated>

Indirect leak of 2704 byte(s) in 1 object(s) allocated from:
    #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137)
    #1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9
    #2 0x7f999e in evlist__new /home/user/linux/tools/perf/util/evlist.c:77:26
    #3 0x8ad938 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3797:20
    #4 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
    #5 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
    #6 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
    #7 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
    #8 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
    #9 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
    #10 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
    #11 0x7f5260654b74  (/lib64/libc.so.6+0x27b74)

Indirect leak of 568 byte(s) in 1 object(s) allocated from:
    #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137)
    #1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9
    #2 0x80ce88 in evsel__new_idx /home/user/linux/tools/perf/util/evsel.c:268:24
    #3 0x8aed93 in evsel__new /home/user/linux/tools/perf/util/evsel.h:210:9
    #4 0x8ae07e in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3853:11
    #5 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
    #6 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
    #7 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
    #8 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
    #9 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
    #10 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
    #11 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
    #12 0x7f5260654b74  (/lib64/libc.so.6+0x27b74)

Indirect leak of 264 byte(s) in 1 object(s) allocated from:
    #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137)
    #1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9
    #2 0xbe3e70 in xyarray__new /home/user/linux/tools/lib/perf/xyarray.c:10:23
    #3 0xbd7754 in perf_evsel__alloc_id /home/user/linux/tools/lib/perf/evsel.c:361:21
    #4 0x8ae201 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3871:7
    #5 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
    #6 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
    #7 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
    #8 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
    #9 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
    #10 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
    #11 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
    #12 0x7f5260654b74  (/lib64/libc.so.6+0x27b74)

Indirect leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137)
    #1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9
    #2 0xbd77e0 in perf_evsel__alloc_id /home/user/linux/tools/lib/perf/evsel.c:365:14
    #3 0x8ae201 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3871:7
    #4 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
    #5 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
    #6 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
    #7 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
    #8 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
    #9 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
    #10 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
    #11 0x7f5260654b74  (/lib64/libc.so.6+0x27b74)

Indirect leak of 7 byte(s) in 1 object(s) allocated from:
    #0 0x4b8207 in strdup (/home/user/linux/tools/perf/perf+0x4b8207)
    #1 0x8b4459 in evlist__set_event_name /home/user/linux/tools/perf/util/header.c:2292:16
    #2 0x89d862 in process_event_desc /home/user/linux/tools/perf/util/header.c:2313:3
    #3 0x8af319 in perf_file_section__process /home/user/linux/tools/perf/util/header.c:3651:9
    #4 0x8aa6e9 in perf_header__process_sections /home/user/linux/tools/perf/util/header.c:3427:9
    #5 0x8ae3e7 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3886:2
    #6 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
    #7 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
    #8 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
    #9 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
    #10 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
    #11 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
    #12 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
    #13 0x7f5260654b74  (/lib64/libc.so.6+0x27b74)

SUMMARY: AddressSanitizer: 3728 byte(s) leaked in 7 allocation(s).

Signed-off-by: Riccardo Mancini <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: [email protected] # 5.10.228
Signed-off-by: Shuai Xue <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
piso77 pushed a commit to piso77/linux that referenced this pull request Nov 21, 2024
Daniel Machon says:

====================
net: sparx5: add support for lan969x switch device

== Description:

This series is the second of a multi-part series, that prepares and adds
support for the new lan969x switch driver.

The upstreaming effort is split into multiple series (might change a
bit as we go along):

        1) Prepare the Sparx5 driver for lan969x (merged)

    --> 2) Add support for lan969x (same basic features as Sparx5
           provides excl. FDMA and VCAP).

        3) Add support for lan969x VCAP, FDMA and RGMII

== Lan969x in short:

The lan969x Ethernet switch family [1] provides a rich set of
switching features and port configurations (up to 30 ports) from 10Mbps
to 10Gbps, with support for RGMII, SGMII, QSGMII, USGMII, and USXGMII,
ideal for industrial & process automation infrastructure applications,
transport, grid automation, power substation automation, and ring &
intra-ring topologies. The LAN969x family is hardware and software
compatible and scalable supporting 46Gbps to 102Gbps switch bandwidths.

== Preparing Sparx5 for lan969x:

The main preparation work for lan969x has already been merged [1].

After this series is applied, lan969x will have the same functionality
as Sparx5, except for VCAP and FDMA support. QoS features that require
the VCAP (e.g. PSFP, port mirroring) will obviously not work until VCAP
support is added later.

== Patch breakdown:

Patch #1-#4  do some preparation work for lan969x

Patch #5     adds new registers required by lan969x

Patch #6     adds initial match data for all lan969x targets

Patch #7     defines the lan969x register differences

Patch #8     adds lan969x constants to match data

Patch #9     adds some lan969x ops in bulk

Patch #10    adds PTP function to ops

Patch #11    adds lan969x_calendar.c for calculating the calendar

Patch #12    makes additional use of the is_sparx5() macro to branch out
             in certain places.

Patch #13    documents lan969x in the dt-bindings

Patch #14    adds lan969x compatible string to sparx5 driver

Patch #15    introduces new concept of per-target features

[1] https://lore.kernel.org/netdev/20241004-b4-sparx5-lan969x-switch-driver-v2-0-d3290f581663@microchip.com/

v1: https://lore.kernel.org/20241021-sparx5-lan969x-switch-driver-2-v1-0-c8c49ef21e0f@microchip.com
====================

Link: https://patch.msgid.link/20241024-sparx5-lan969x-switch-driver-2-v2-0-a0b5fae88a0f@microchip.com
Signed-off-by: Jakub Kicinski <[email protected]>
piso77 pushed a commit to piso77/linux that referenced this pull request Nov 21, 2024
This commit provides a watchdog timer that sets a limit of how long a
single sub-test could run:
- if sub-test runs for 10 seconds, the name of the test is printed
  (currently the name of the test is printed only after it finishes);
- if sub-test runs for 120 seconds, the running thread is terminated
  with SIGSEGV (to trigger crash_handler() and get a stack trace).

Specifically:
- the timer is armed on each call to run_one_test();
- re-armed at each call to test__start_subtest();
- is stopped when exiting run_one_test().

The default timeout can be overridden using the '-w' or '--watchdog-timeout'
options. Value 0 can be used to turn the timer off.
Here is an example execution:

    $ ./ssh-exec.sh ./test_progs -w 5 -t \
      send_signal/send_signal_perf_thread_remote,send_signal/send_signal_nmi_thread_remote
    WATCHDOG: test case send_signal/send_signal_nmi_thread_remote executes for 5 seconds, terminating with SIGSEGV
    Caught signal #11!
    Stack trace:
    ./test_progs(crash_handler+0x1f)[0x9049ef]
    /lib64/libc.so.6(+0x40d00)[0x7f1f1184fd00]
    /lib64/libc.so.6(read+0x4a)[0x7f1f1191cc4a]
    ./test_progs[0x720dd3]
    ./test_progs[0x71ef7a]
    ./test_progs(test_send_signal+0x1db)[0x71edeb]
    ./test_progs[0x9066c5]
    ./test_progs(main+0x5ed)[0x9054ad]
    /lib64/libc.so.6(+0x2a088)[0x7f1f11839088]
    /lib64/libc.so.6(__libc_start_main+0x8b)[0x7f1f1183914b]
    ./test_progs(_start+0x25)[0x527385]
    #292     send_signal:FAIL
    test_send_signal_common:PASS:reading pipe 0 nsec
    test_send_signal_common:PASS:reading pipe error: size 0 0 nsec
    test_send_signal_common:PASS:incorrect result 0 nsec
    test_send_signal_common:PASS:pipe_write 0 nsec
    test_send_signal_common:PASS:setpriority 0 nsec

Timer is implemented using timer_{create,settime} librt API.
Internally librt uses pthreads for SIGEV_THREAD timers,
so this change adds a background timer thread to the test process.
Because of this a few checks in tests 'bpf_iter' and 'iters'
need an update to account for an extra thread.
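
A minimal sketch of a SIGEV_THREAD watchdog built on POSIX timer_create() and timer_settime(), to make the mechanism concrete. It is not the test_progs implementation: the `demo_` names are hypothetical, and the callback only sets a flag where the real watchdog prints the test name or terminates the stuck thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>

static atomic_int demo_watchdog_fired;

/* Runs on a helper thread created by the C library when the timer expires. */
static void demo_watchdog_cb(union sigval sv)
{
	(void)sv;
	atomic_store(&demo_watchdog_fired, 1);
}

/* Create the timer (initially disarmed); returns 0 on success, -1 on failure. */
int demo_watchdog_create(timer_t *timer)
{
	struct sigevent sev;

	memset(&sev, 0, sizeof(sev));
	sev.sigev_notify = SIGEV_THREAD;
	sev.sigev_notify_function = demo_watchdog_cb;
	return timer_create(CLOCK_MONOTONIC, &sev, timer);
}

/* Arm or re-arm the one-shot timer; a timeout of 0 ms disarms it. */
int demo_watchdog_arm(timer_t timer, long timeout_ms)
{
	struct itimerspec its;

	assert(timeout_ms >= 0);
	memset(&its, 0, sizeof(its));
	its.it_value.tv_sec = timeout_ms / 1000;
	its.it_value.tv_nsec = (timeout_ms % 1000) * 1000000L;
	return timer_settime(timer, 0, &its, NULL);
}

int demo_watchdog_has_fired(void)
{
	return atomic_load(&demo_watchdog_fired);
}
```

Calling demo_watchdog_arm() again before expiry simply replaces the pending deadline, which mirrors the re-arming on each sub-test start described above. On older C libraries this may need linking with -lrt and -pthread.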

For the parallelized scenario, a watchdog is also created for each worker
fork. If one of the workers gets stuck, it is terminated by its
watchdog. In theory, this might lead to a scenario where all worker
threads are exhausted; however, this should not be a problem for
server_main(), as it would exit with some of the tests not run.

Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
gregkh pushed a commit that referenced this pull request Nov 22, 2024
commit 5b188cc upstream.

Disable strict aliasing, as has been done in the kernel proper for decades
(literally since before git history) to fix issues where gcc will optimize
away loads in code that looks 100% correct, but is _technically_ undefined
behavior, and thus can be thrown away by the compiler.

E.g. arm64's vPMU counter access test casts a uint64_t (unsigned long)
pointer to a u64 (unsigned long long) pointer when setting PMCR.N via
u64p_replace_bits(), which gcc-13 detects and optimizes away, i.e. ignores
the result and uses the original PMCR.

The issue is most easily observed by making set_pmcr_n() noinline and
wrapping the call with printf(), e.g. sans comments, for this code:

  printf("orig = %lx, next = %lx, want = %lu\n", pmcr_orig, pmcr, pmcr_n);
  set_pmcr_n(&pmcr, pmcr_n);
  printf("orig = %lx, next = %lx, want = %lu\n", pmcr_orig, pmcr, pmcr_n);

gcc-13 generates:

 0000000000401c90 <set_pmcr_n>:
  401c90:       f9400002        ldr     x2, [x0]
  401c94:       b3751022        bfi     x2, x1, #11, #5
  401c98:       f9000002        str     x2, [x0]
  401c9c:       d65f03c0        ret

 0000000000402660 <test_create_vpmu_vm_with_pmcr_n>:
  402724:       aa1403e3        mov     x3, x20
  402728:       aa1503e2        mov     x2, x21
  40272c:       aa1603e0        mov     x0, x22
  402730:       aa1503e1        mov     x1, x21
  402734:       940060ff        bl      41ab30 <_IO_printf>
  402738:       aa1403e1        mov     x1, x20
  40273c:       910183e0        add     x0, sp, #0x60
  402740:       97fffd54        bl      401c90 <set_pmcr_n>
  402744:       aa1403e3        mov     x3, x20
  402748:       aa1503e2        mov     x2, x21
  40274c:       aa1503e1        mov     x1, x21
  402750:       aa1603e0        mov     x0, x22
  402754:       940060f7        bl      41ab30 <_IO_printf>

with the value stored in [sp + 0x60] ignored by both printf() above and
in the test proper, resulting in a false failure due to vcpu_set_reg()
simply storing the original value, not the intended value.

  $ ./vpmu_counter_access
  Random seed: 0x6b8b4567
  orig = 3040, next = 3040, want = 0
  orig = 3040, next = 3040, want = 0
  ==== Test Assertion Failure ====
    aarch64/vpmu_counter_access.c:505: pmcr_n == get_pmcr_n(pmcr)
    pid=71578 tid=71578 errno=9 - Bad file descriptor
       1        0x400673: run_access_test at vpmu_counter_access.c:522
       2         (inlined by) main at vpmu_counter_access.c:643
       3        0x4132d7: __libc_start_call_main at libc-start.o:0
       4        0x413653: __libc_start_main at ??:0
       5        0x40106f: _start at ??:0
    Failed to update PMCR.N to 0 (received: 6)

Somewhat bizarrely, gcc-11 also exhibits the same behavior, but only if
set_pmcr_n() is marked noinline, whereas gcc-13 fails even if set_pmcr_n()
is inlined in its sole caller.
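
One way to sidestep this class of problem in userspace code, sketched below, is to hand the helper a correctly typed temporary instead of casting a pointer between two distinct 64-bit integer types. The `demo_` names are hypothetical; the field position (5 bits starting at bit 11) is taken from the `bfi x2, x1, #11, #5` instruction in the disassembly above.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PMCR_N_SHIFT 11
#define DEMO_PMCR_N_MASK  (UINT64_C(0x1f) << DEMO_PMCR_N_SHIFT)

/* Helper that takes unsigned long long *, like an API declared with u64 *. */
static void demo_replace_n(unsigned long long *reg, unsigned long long n)
{
	*reg = (*reg & ~DEMO_PMCR_N_MASK) |
	       ((n << DEMO_PMCR_N_SHIFT) & DEMO_PMCR_N_MASK);
}

/*
 * uint64_t may be unsigned long, a type distinct from unsigned long long
 * even when both are 64 bits wide. Rather than casting &pmcr, copy the
 * value into a correctly typed temporary and copy the result back.
 */
uint64_t demo_set_pmcr_n(uint64_t pmcr, uint64_t n)
{
	unsigned long long tmp = pmcr;

	assert(n <= 0x1f);
	demo_replace_n(&tmp, n);
	return tmp;
}

/* Extract the 5-bit N field. */
unsigned int demo_get_pmcr_n(uint64_t pmcr)
{
	return (unsigned int)((pmcr & DEMO_PMCR_N_MASK) >> DEMO_PMCR_N_SHIFT);
}
```

With the value from the failing run, demo_get_pmcr_n(0x3040) yields 6, matching the "received: 6" in the assertion message. The commit itself takes a different route and disables strict aliasing for the whole build.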

Cc: [email protected]
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116912
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
qaz6750 added a commit to qaz6750/linux-longterm that referenced this pull request Nov 24, 2024
commit 9b5aad3a7498c261116a0251fe57f14ba9c4c6cf
Author: Greg Kroah-Hartman <[email protected]>
Date:   Fri Nov 8 16:28:28 2024 +0100

    Linux 6.6.60

    Link: https://lore.kernel.org/r/[email protected]
    Tested-by: SeongJae Park <[email protected]>
    Tested-by: Shuah Khan <[email protected]>
    Tested-by: Linux Kernel Functional Testing <[email protected]>
    Tested-by: Peter Schneider <[email protected]>
    Tested-by: Takeshi Ogasawara <[email protected]>
    Tested-by: Jon Hunter <[email protected]>
    Tested-by: Florian Fainelli <[email protected]>
    Tested-by: Ron Economos <[email protected]>
    Tested-by: Hardik Garg <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit cc082e50375a29596153fc3f1f8fc85ad1b0b5b9
Author: Konstantin Komarov <[email protected]>
Date:   Thu Sep 5 15:03:48 2024 +0300

    fs/ntfs3: Sequential field availability check in mi_enum_attr()

    commit 090f612756a9720ec18b0b130e28be49839d7cb5 upstream.

    The code is slightly reformatted to consistently check field availability
    without duplication.

    Fixes: 556bdf27c2dd ("ntfs3: Add bounds checking to mi_enum_attr()")
    Signed-off-by: Konstantin Komarov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 10c20d79d59cadfe572480d98cec271a89ffb024
Author: Srinivasan Shanmugam <[email protected]>
Date:   Mon May 27 20:15:21 2024 +0530

    drm/amd/display: Add null checks for 'stream' and 'plane' before dereferencing

    commit 15c2990e0f0108b9c3752d7072a97d45d4283aea upstream.

    This commit adds null checks for the 'stream' and 'plane' variables in
    the dcn30_apply_idle_power_optimizations function. These variables were
    previously assumed to be null at line 922, but they were used later in
    the code without checking if they were null. This could potentially lead
    to a null pointer dereference, which would cause a crash.

    The null checks ensure that 'stream' and 'plane' are not null before
    they are used, preventing potential crashes.

    Fixes the following smatch static checker errors:
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:938 dcn30_apply_idle_power_optimizations() error: we previously assumed 'stream' could be null (see line 922)
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:940 dcn30_apply_idle_power_optimizations() error: we previously assumed 'plane' could be null (see line 922)

    Cc: Tom Chung <[email protected]>
    Cc: Nicholas Kazlauskas <[email protected]>
    Cc: Bhawanpreet Lakha <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Hersen Wu <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Aurabindo Pillai <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    [Xiangyu: Modified file path to backport this commit]
    Signed-off-by: Xiangyu Chen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit e979a6a626abf1358a5bb79219eea82ac160d3d3
Author: Peter Ujfalusi <[email protected]>
Date:   Tue Sep 19 13:31:15 2023 +0300

    ASoC: SOF: ipc4-control: Add support for ALSA enum control

    commit 07a866a41982c896dc46476f57d209a200602946 upstream.

    Enum controls use generic param_id and a generic struct where the data
    is passed to the firmware.

    Signed-off-by: Peter Ujfalusi <[email protected]>
    Reviewed-by: Bard Liao <[email protected]>
    Reviewed-by: Pierre-Louis Bossart <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 3facc0417d3d7b3ba5822e74155bcb1267ce62c1
Author: Peter Ujfalusi <[email protected]>
Date:   Tue Sep 19 13:31:14 2023 +0300

    ASoC: SOF: ipc4-control: Add support for ALSA switch control

    commit 4a2fd607b7ca6128ee3532161505da7624197f55 upstream.

    Volume controls with a max value of 1 are switches.
    Switch controls use generic param_id and a generic struct where the data
    is passed to the firmware.

    Signed-off-by: Peter Ujfalusi <[email protected]>
    Reviewed-by: Bard Liao <[email protected]>
    Reviewed-by: Pierre-Louis Bossart <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit f01d8fc623711046e1efee00827bff6d5882cdfd
Author: Peter Ujfalusi <[email protected]>
Date:   Tue Sep 19 13:31:13 2023 +0300

    ASoC: SOF: ipc4-topology: Add definition for generic switch/enum control

    commit 060a07cd9bc69eba2da33ed96b1fa69ead60bab1 upstream.

    Currently IPC4 has no notion of a switch or enum type of control which is
    a generic concept in ALSA.

    The generic support for these control types will be as follows:
    - large config is used to send the channel-value pair array
    - param_id of a SWITCH type is 200
    - param_id of an ENUM type is 201

    Each module that needs to support a switch and/or enum control must
    handle these universal param_ids.
    The message payload is described by struct sof_ipc4_control_msg_payload.
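
A small sketch of how the two universal param_ids might be selected and how per-channel values could be laid out. Only the values 200 and 201 come from the text above; the `demo_` names and the pair layout are hypothetical and do not reproduce struct sof_ipc4_control_msg_payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum demo_ctrl_type { DEMO_CTRL_SWITCH, DEMO_CTRL_ENUM };

/* param_id values quoted in the commit message */
#define DEMO_PARAM_ID_SWITCH 200u
#define DEMO_PARAM_ID_ENUM   201u

/* Hypothetical channel/value entry; the real payload layout may differ. */
struct demo_chan_value {
	uint32_t channel;
	uint32_t value;
};

/* Map a control type to the generic param_id a module must handle. */
uint32_t demo_param_id(enum demo_ctrl_type type)
{
	return type == DEMO_CTRL_SWITCH ? DEMO_PARAM_ID_SWITCH
					: DEMO_PARAM_ID_ENUM;
}

/* Fill one entry per channel with the same value; returns entries written. */
size_t demo_fill_pairs(struct demo_chan_value *pairs, size_t num_channels,
		       uint32_t value)
{
	assert(pairs != NULL || num_channels == 0);

	for (size_t ch = 0; ch < num_channels; ch++) {
		pairs[ch].channel = (uint32_t)ch;
		pairs[ch].value = value;
	}
	return num_channels;
}
```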

    Signed-off-by: Peter Ujfalusi <[email protected]>
    Reviewed-by: Bard Liao <[email protected]>
    Reviewed-by: Pierre-Louis Bossart <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit d54afaef6570c277070c3cafe1ed73dcdc129e0a
Author: Chuck Lever <[email protected]>
Date:   Tue Sep 19 11:35:15 2023 -0400

    SUNRPC: Remove BUG_ON call sites

    commit 789ce196a31dd13276076762204bee87df893e53 upstream.

    There is no need to take down the whole system for these assertions.

    I'd rather not attempt a heroic save here, as some bug has occurred
    that has left the transport data structures in an unknown state.
    Just warn and then leak the left-over resources.

    Acked-by: Christian Brauner <[email protected]>
    Reviewed-by: NeilBrown <[email protected]>
    Reviewed-by: Jeff Layton <[email protected]>
    Signed-off-by: Chuck Lever <[email protected]>
    Signed-off-by: Dominique Martinet <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 27a58a19bd20a7afe369da2ce6d4ebea70768acd
Author: Michael Walle <[email protected]>
Date:   Fri Jun 21 14:09:29 2024 +0200

    mtd: spi-nor: winbond: fix w25q128 regression

    commit d35df77707bf5ae1221b5ba1c8a88cf4fcdd4901 upstream.

    Commit 83e824a4a595 ("mtd: spi-nor: Correct flags for Winbond w25q128")
    removed the flags for non-SFDP devices. It was assumed that it wasn't in
    use anymore. This wasn't true. Add the no_sfdp_flags as well as the size
    again.

    We add the additional flags for dual and quad read because they have
    been reported to work properly by Hartmut using both older and newer
    versions of this flash, the similar flashes with 64Mbit and 256Mbit
    already have these flags and because it will (luckily) trigger our
    legacy SFDP parsing, so newer versions with SFDP support will still get
    the parameters from the SFDP tables.

    Reported-by: Hartmut Birr <[email protected]>
    Closes: https://lore.kernel.org/r/CALxbwRo_-9CaJmt7r7ELgu+vOcgk=xZcGHobnKf=oT2=u4d4aA@mail.gmail.com/
    Fixes: 83e824a4a595 ("mtd: spi-nor: Correct flags for Winbond w25q128")
    Reviewed-by: Linus Walleij <[email protected]>
    Signed-off-by: Michael Walle <[email protected]>
    Acked-by: Tudor Ambarus <[email protected]>
    Reviewed-by: Esben Haabendal <[email protected]>
    Reviewed-by: Pratyush Yadav <[email protected]>
    Signed-off-by: Pratyush Yadav <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Link: https://lore.kernel.org/r/[email protected]
    [Backported to v6.6 - vastly different due to upstream changes]
    Reviewed-by: Tudor Ambarus <[email protected]>
    Signed-off-by: Linus Walleij <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 3d544942c0010feedc048b048ee0c35d2d921100
Author: David Hildenbrand <[email protected]>
Date:   Fri Oct 11 12:24:45 2024 +0200

    mm: don't install PMD mappings when THPs are disabled by the hw/process/vma

    commit 2b0f922323ccfa76219bcaacd35cd50aeaa13592 upstream.

    We (or rather, readahead logic :) ) might be allocating a THP in the
    pagecache and then try mapping it into a process that explicitly disabled
    THP: we might end up installing PMD mappings.

    This is a problem for s390x KVM, which explicitly remaps all PMD-mapped
    THPs to be PTE-mapped in s390_enable_sie()->thp_split_mm(), before
    starting the VM.

    For example, starting a VM backed on a file system with large folios
    supported makes the VM crash when the VM tries accessing such a mapping
    using KVM.

    Is it also a problem when the HW disabled THP using
    TRANSPARENT_HUGEPAGE_UNSUPPORTED?  At least on x86 this would be the case
    without X86_FEATURE_PSE.

    In the future, we might be able to do better on s390x and only disallow
    PMD mappings -- what s390x and likely TRANSPARENT_HUGEPAGE_UNSUPPORTED
    really wants.  For now, fix it by essentially performing the same check as
    would be done in __thp_vma_allowable_orders() or in shmem code, where this
    works as expected, and disallow PMD mappings, making us fall back to PTE
    mappings.
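
The decision described above can be modeled roughly as follows. This is a simplified userspace sketch, not the mm code: the `demo_` structs and flags are hypothetical stand-ins for the hardware, process and VMA conditions named in the commit title.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_map_level { DEMO_MAP_PTE, DEMO_MAP_PMD };

struct demo_vma {
	bool vma_disabled;	/* THP disabled for this mapping */
	bool process_disabled;	/* THP disabled for the whole process */
};

struct demo_hw {
	bool thp_unsupported;	/* hardware cannot use huge mappings */
};

/* True if any of the three ways of disabling THP applies. */
static bool demo_thp_disabled(const struct demo_hw *hw,
			      const struct demo_vma *vma)
{
	return hw->thp_unsupported || vma->process_disabled ||
	       vma->vma_disabled;
}

/* Use a PMD mapping only for a PMD-sized folio with THP still allowed. */
enum demo_map_level demo_pick_mapping(const struct demo_hw *hw,
				      const struct demo_vma *vma,
				      bool folio_is_pmd_sized)
{
	assert(hw != NULL && vma != NULL);

	if (folio_is_pmd_sized && !demo_thp_disabled(hw, vma))
		return DEMO_MAP_PMD;
	return DEMO_MAP_PTE;	/* fall back to PTE mappings */
}
```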

    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 793917d997df ("mm/readahead: Add large folio readahead")
    Signed-off-by: David Hildenbrand <[email protected]>
    Reported-by: Leo Fu <[email protected]>
    Tested-by: Thomas Huth <[email protected]>
    Cc: Thomas Huth <[email protected]>
    Cc: Matthew Wilcox (Oracle) <[email protected]>
    Cc: Ryan Roberts <[email protected]>
    Cc: Christian Borntraeger <[email protected]>
    Cc: Janosch Frank <[email protected]>
    Cc: Claudio Imbrenda <[email protected]>
    Cc: Hugh Dickins <[email protected]>
    Cc: Kefeng Wang <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: David Hildenbrand <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 02ec4b3bba49e8d3abb25a3feba6875cae12da92
Author: Kefeng Wang <[email protected]>
Date:   Fri Oct 11 12:24:44 2024 +0200

    mm: huge_memory: add vma_thp_disabled() and thp_disabled_by_hw()

    commit 963756aac1f011d904ddd9548ae82286d3a91f96 upstream.

    Patch series "mm: don't install PMD mappings when THPs are disabled by the
    hw/process/vma".

    During testing, it was found that we can get PMD mappings in processes
    where THP (and more precisely, PMD mappings) are supposed to be disabled.
    While it works as expected for anon+shmem, the pagecache is the
    problematic bit.

    For s390 KVM this currently means that a VM backed by a file located on
    a filesystem with large folio support can crash when KVM tries accessing the
    problematic page, because the readahead logic might decide to use a
    PMD-sized THP and faulting it into the page tables will install a PMD
    mapping, something that s390 KVM cannot tolerate.

    This might also be a problem with HW that does not support PMD mappings,
    but I did not try reproducing it.

    Fix it by respecting the ways to disable THPs when deciding whether we can
    install a PMD mapping.  khugepaged should already be taking care of not
    collapsing if THPs are effectively disabled for the hw/process/vma.

    This patch (of 2):

    Add vma_thp_disabled() and thp_disabled_by_hw() helpers to be shared by
    shmem_allowable_huge_orders() and __thp_vma_allowable_orders().

    [[email protected]: rename to vma_thp_disabled(), split out thp_disabled_by_hw() ]
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 793917d997df ("mm/readahead: Add large folio readahead")
    Signed-off-by: Kefeng Wang <[email protected]>
    Signed-off-by: David Hildenbrand <[email protected]>
    Reported-by: Leo Fu <[email protected]>
    Tested-by: Thomas Huth <[email protected]>
    Reviewed-by: Ryan Roberts <[email protected]>
    Cc: Boqiao Fu <[email protected]>
    Cc: Christian Borntraeger <[email protected]>
    Cc: Claudio Imbrenda <[email protected]>
    Cc: Hugh Dickins <[email protected]>
    Cc: Janosch Frank <[email protected]>
    Cc: Matthew Wilcox <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: David Hildenbrand <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit fc621e7a043de346c33bd7ae7e2e0c651d6152ef
Author: Johannes Berg <[email protected]>
Date:   Wed Oct 23 09:17:44 2024 +0200

    wifi: iwlwifi: mvm: fix 6 GHz scan construction

    commit 7245012f0f496162dd95d888ed2ceb5a35170f1a upstream.

    If more than 255 colocated APs exist for the set of all
    APs found during 2.4/5 GHz scanning, then the 6 GHz scan
    construction will loop forever: the loop variable has
    type u8, so it can never reach the AP count, which is
    stored in a u32 variable, when that count is bigger than
    255. Make the loop variable a u32, and also move it into
    the loops to give it a smaller scope.

    Using a u32 there is fine: we limit the number of APs in
    the scan list, and each has a limit on the number of RNR
    entries due to the frame size. With a limit of 1000 scan
    results, a frame size upper bound of 4096 (really it's
    more like ~2300) and a TBTT entry size of at least 11,
    we get an upper bound of ~372k for the number of
    entries, well within the bounds of a u32.
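
The wraparound and the bound can both be checked with a short standalone sketch. The function names are hypothetical, and the u8 loop carries an explicit step cap so that the demonstration itself cannot hang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Walk `count` entries with an 8-bit index, giving up after `max_steps`
 * iterations. Returns true if the loop terminated on its own.
 */
bool demo_u8_loop_terminates(uint32_t count, uint32_t max_steps)
{
	uint32_t steps = 0;

	for (uint8_t i = 0; i < count; i++) {	/* i wraps from 255 to 0 */
		if (++steps > max_steps)
			return false;
	}
	return true;
}

/* The same walk with a 32-bit index; returns the iterations performed. */
uint32_t demo_u32_loop_steps(uint32_t count)
{
	uint32_t steps = 0;

	for (uint32_t i = 0; i < count; i++)
		steps++;
	return steps;
}

/* Upper bound from the text: results * (frame bytes / bytes per entry). */
uint32_t demo_max_entries(uint32_t scan_results, uint32_t frame_size,
			  uint32_t entry_size)
{
	assert(entry_size > 0);
	return scan_results * (frame_size / entry_size);
}
```

With the figures quoted above, 4096 / 11 is 372 entries per frame in integer arithmetic, so 1000 scan results give 372000 entries, far below the u32 maximum of 4294967295.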

    Cc: [email protected]
    Fixes: eae94cf82d74 ("iwlwifi: mvm: add support for 6GHz")
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219375
    Link: https://patch.msgid.link/20241023091744.f4baed5c08a1.I8b417148bbc8c5d11c101e1b8f5bf372e17bf2a7@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit f2f1fa446676c21edb777e6d2bc4fa8f956fab68
Author: Ryusuke Konishi <[email protected]>
Date:   Fri Oct 18 04:33:10 2024 +0900

    nilfs2: fix kernel bug due to missing clearing of checked flag

    commit 41e192ad2779cae0102879612dfe46726e4396aa upstream.

    Syzbot reported that in directory operations after nilfs2 detects
    filesystem corruption and degrades to read-only,
    __block_write_begin_int(), which is called to prepare block writes, may
    fail the BUG_ON check for accesses exceeding the folio/page size,
    triggering a kernel bug.

    This was found to be because the "checked" flag of a page/folio was not
    cleared when it was discarded by nilfs2's own routine, which causes the
    sanity check of directory entries to be skipped when the directory
    page/folio is reloaded.  So, fix that.

    This was necessary when the use of nilfs2's own page discard routine was
    applied to more than just metadata files.

    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 8c26c4e2694a ("nilfs2: fix issue with flush kernel thread after remount in RO mode because of driver's internal error or metadata corruption")
    Signed-off-by: Ryusuke Konishi <[email protected]>
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=d6ca2daf692c7a82f959
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit a53c2d847627b790fb3bd8b00e02c247941b17e0
Author: Zong-Zhe Yang <[email protected]>
Date:   Mon Jun 17 19:52:17 2024 +0800

    wifi: mac80211: fix NULL dereference at band check in starting tx ba session

    commit 021d53a3d87eeb9dbba524ac515651242a2a7e3b upstream.

    In MLD connection, link_data/link_conf are dynamically allocated. They
    don't point to vif->bss_conf. So, there will be no chanreq assigned to
    vif->bss_conf and then the chan will be NULL. Tweak the code to check
    ht_supported/vht_supported/has_he/has_eht on sta deflink.

    Crash log (with rtw89 version under MLO development):
    [ 9890.526087] BUG: kernel NULL pointer dereference, address: 0000000000000000
    [ 9890.526102] #PF: supervisor read access in kernel mode
    [ 9890.526105] #PF: error_code(0x0000) - not-present page
    [ 9890.526109] PGD 0 P4D 0
    [ 9890.526114] Oops: 0000 [#1] PREEMPT SMP PTI
    [ 9890.526119] CPU: 2 PID: 6367 Comm: kworker/u16:2 Kdump: loaded Tainted: G           OE      6.9.0 #1
    [ 9890.526123] Hardware name: LENOVO 2356AD1/2356AD1, BIOS G7ETB3WW (2.73 ) 11/28/2018
    [ 9890.526126] Workqueue: phy2 rtw89_core_ba_work [rtw89_core]
    [ 9890.526203] RIP: 0010:ieee80211_start_tx_ba_session (net/mac80211/agg-tx.c:618 (discriminator 1)) mac80211
    [ 9890.526279] Code: f7 e8 d5 93 3e ea 48 83 c4 28 89 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc 49 8b 84 24 e0 f1 ff ff 48 8b 80 90 1b 00 00 <83> 38 03 0f 84 37 fe ff ff bb ea ff ff ff eb cc 49 8b 84 24 10 f3
    All code
    ========
       0:	f7 e8                	imul   %eax
       2:	d5                   	(bad)
       3:	93                   	xchg   %eax,%ebx
       4:	3e ea                	ds (bad)
       6:	48 83 c4 28          	add    $0x28,%rsp
       a:	89 d8                	mov    %ebx,%eax
       c:	5b                   	pop    %rbx
       d:	41 5c                	pop    %r12
       f:	41 5d                	pop    %r13
      11:	41 5e                	pop    %r14
      13:	41 5f                	pop    %r15
      15:	5d                   	pop    %rbp
      16:	c3                   	retq
      17:	cc                   	int3
      18:	cc                   	int3
      19:	cc                   	int3
      1a:	cc                   	int3
      1b:	49 8b 84 24 e0 f1 ff 	mov    -0xe20(%r12),%rax
      22:	ff
      23:	48 8b 80 90 1b 00 00 	mov    0x1b90(%rax),%rax
      2a:*	83 38 03             	cmpl   $0x3,(%rax)		<-- trapping instruction
      2d:	0f 84 37 fe ff ff    	je     0xfffffffffffffe6a
      33:	bb ea ff ff ff       	mov    $0xffffffea,%ebx
      38:	eb cc                	jmp    0x6
      3a:	49                   	rex.WB
      3b:	8b                   	.byte 0x8b
      3c:	84 24 10             	test   %ah,(%rax,%rdx,1)
      3f:	f3                   	repz

    Code starting with the faulting instruction
    ===========================================
       0:	83 38 03             	cmpl   $0x3,(%rax)
       3:	0f 84 37 fe ff ff    	je     0xfffffffffffffe40
       9:	bb ea ff ff ff       	mov    $0xffffffea,%ebx
       e:	eb cc                	jmp    0xffffffffffffffdc
      10:	49                   	rex.WB
      11:	8b                   	.byte 0x8b
      12:	84 24 10             	test   %ah,(%rax,%rdx,1)
      15:	f3                   	repz
    [ 9890.526285] RSP: 0018:ffffb8db09013d68 EFLAGS: 00010246
    [ 9890.526291] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff9308e0d656c8
    [ 9890.526295] RDX: 0000000000000000 RSI: ffffffffab99460b RDI: ffffffffab9a7685
    [ 9890.526300] RBP: ffffb8db09013db8 R08: 0000000000000000 R09: 0000000000000873
    [ 9890.526304] R10: ffff9308e0d64800 R11: 0000000000000002 R12: ffff9308e5ff6e70
    [ 9890.526308] R13: ffff930952500e20 R14: ffff9309192a8c00 R15: 0000000000000000
    [ 9890.526313] FS:  0000000000000000(0000) GS:ffff930b4e700000(0000) knlGS:0000000000000000
    [ 9890.526316] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 9890.526318] CR2: 0000000000000000 CR3: 0000000391c58005 CR4: 00000000001706f0
    [ 9890.526321] Call Trace:
    [ 9890.526324]  <TASK>
    [ 9890.526327] ? show_regs (arch/x86/kernel/dumpstack.c:479)
    [ 9890.526335] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
    [ 9890.526340] ? page_fault_oops (arch/x86/mm/fault.c:713)
    [ 9890.526347] ? search_module_extables (kernel/module/main.c:3256 (discriminator 3))
    [ 9890.526353] ? ieee80211_start_tx_ba_session (net/mac80211/agg-tx.c:618 (discriminator 1)) mac80211

    Signed-off-by: Zong-Zhe Yang <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Xiangyu Chen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 6a91a5816b289018e0b42a25444c0b4f8c637dca
Author: Pavel Begunkov <[email protected]>
Date:   Wed Apr 10 02:26:54 2024 +0100

    io_uring: always lock __io_cqring_overflow_flush

    commit 8d09a88ef9d3cb7d21d45c39b7b7c31298d23998 upstream.

    Conditional locking is never great, and in the case of
    __io_cqring_overflow_flush(), which is a slow path, it's not justified.
    Don't handle IOPOLL separately, always grab uring_lock for overflow
    flushing.

    Signed-off-by: Pavel Begunkov <[email protected]>
    Link: https://lore.kernel.org/r/162947df299aa12693ac4b305dacedab32ec7976.1712708261.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit e3fb0e6afcc399660770428a35162b4880e2e14e
Author: Haibo Chen <[email protected]>
Date:   Thu Sep 5 17:43:38 2024 +0800

    arm64: dts: imx8ulp: correct the flexspi compatible string

    commit 409dc5196d5b6eb67468a06bf4d2d07d7225a67b upstream.

    The flexspi on imx8ulp only has 16 LUTs, while the imx8mm flexspi has
    32 LUTs, so correct the compatible string here. Otherwise the
    following error occurs:

    [    1.119072] ------------[ cut here ]------------
    [    1.123926] WARNING: CPU: 0 PID: 1 at drivers/spi/spi-nxp-fspi.c:855 nxp_fspi_exec_op+0xb04/0xb64
    [    1.133239] Modules linked in:
    [    1.136448] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-rc6-next-20240902-00001-g131bf9439dd9 #69
    [    1.146821] Hardware name: NXP i.MX8ULP EVK (DT)
    [    1.151647] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [    1.158931] pc : nxp_fspi_exec_op+0xb04/0xb64
    [    1.163496] lr : nxp_fspi_exec_op+0xa34/0xb64
    [    1.168060] sp : ffff80008002b2a0
    [    1.171526] x29: ffff80008002b2d0 x28: 0000000000000000 x27: 0000000000000000
    [    1.179002] x26: ffff2eb645542580 x25: ffff800080610014 x24: ffff800080610000
    [    1.186480] x23: ffff2eb645548080 x22: 0000000000000006 x21: ffff2eb6455425e0
    [    1.193956] x20: 0000000000000000 x19: ffff80008002b5e0 x18: ffffffffffffffff
    [    1.201432] x17: ffff2eb644467508 x16: 0000000000000138 x15: 0000000000000002
    [    1.208907] x14: 0000000000000000 x13: ffff2eb6400d8080 x12: 00000000ffffff00
    [    1.216378] x11: 0000000000000000 x10: ffff2eb6400d8080 x9 : ffff2eb697adca80
    [    1.223850] x8 : ffff2eb697ad3cc0 x7 : 0000000100000000 x6 : 0000000000000001
    [    1.231324] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000000007a6
    [    1.238795] x2 : 0000000000000000 x1 : 00000000000001ce x0 : 00000000ffffff92
    [    1.246267] Call trace:
    [    1.248824]  nxp_fspi_exec_op+0xb04/0xb64
    [    1.253031]  spi_mem_exec_op+0x3a0/0x430
    [    1.257139]  spi_nor_read_id+0x80/0xcc
    [    1.261065]  spi_nor_scan+0x1ec/0xf10
    [    1.264901]  spi_nor_probe+0x108/0x2fc
    [    1.268828]  spi_mem_probe+0x6c/0xbc
    [    1.272574]  spi_probe+0x84/0xe4
    [    1.275958]  really_probe+0xbc/0x29c
    [    1.279713]  __driver_probe_device+0x78/0x12c
    [    1.284277]  driver_probe_device+0xd8/0x15c
    [    1.288660]  __device_attach_driver+0xb8/0x134
    [    1.293316]  bus_for_each_drv+0x88/0xe8
    [    1.297337]  __device_attach+0xa0/0x190
    [    1.301353]  device_initial_probe+0x14/0x20
    [    1.305734]  bus_probe_device+0xac/0xb0
    [    1.309752]  device_add+0x5d0/0x790
    [    1.313408]  __spi_add_device+0x134/0x204
    [    1.317606]  of_register_spi_device+0x3b4/0x590
    [    1.322348]  spi_register_controller+0x47c/0x754
    [    1.327181]  devm_spi_register_controller+0x4c/0xa4
    [    1.332289]  nxp_fspi_probe+0x1cc/0x2b0
    [    1.336307]  platform_probe+0x68/0xc4
    [    1.340145]  really_probe+0xbc/0x29c
    [    1.343893]  __driver_probe_device+0x78/0x12c
    [    1.348457]  driver_probe_device+0xd8/0x15c
    [    1.352838]  __driver_attach+0x90/0x19c
    [    1.356857]  bus_for_each_dev+0x7c/0xdc
    [    1.360877]  driver_attach+0x24/0x30
    [    1.364624]  bus_add_driver+0xe4/0x208
    [    1.368552]  driver_register+0x5c/0x124
    [    1.372573]  __platform_driver_register+0x28/0x34
    [    1.377497]  nxp_fspi_driver_init+0x1c/0x28
    [    1.381888]  do_one_initcall+0x80/0x1c8
    [    1.385908]  kernel_init_freeable+0x1c4/0x28c
    [    1.390472]  kernel_init+0x20/0x1d8
    [    1.394138]  ret_from_fork+0x10/0x20
    [    1.397885] ---[ end trace 0000000000000000 ]---
    [    1.407908] ------------[ cut here ]------------

    Fixes: ef89fd56bdfc ("arm64: dts: imx8ulp: add flexspi node")
    Cc: [email protected]
    Signed-off-by: Haibo Chen <[email protected]>
    Signed-off-by: Shawn Guo <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 1a49b96c51063d38be296a0c1537928a06f02d6e
Author: Gregory Price <[email protected]>
Date:   Fri Oct 25 10:17:24 2024 -0400

    vmscan,migrate: fix page count imbalance on node stats when demoting pages

    [ Upstream commit 35e41024c4c2b02ef8207f61b9004f6956cf037b ]

    When NUMA balancing is enabled with demotion, vmscan will call
    migrate_pages when shrinking LRUs.  migrate_pages will decrement the
    node's isolated page count, leading to an imbalanced count when
    invoked from (MG)LRU code.

    The result is dmesg output like such:

    $ cat /proc/sys/vm/stat_refresh

    [77383.088417] vmstat_refresh: nr_isolated_anon -103212
    [77383.088417] vmstat_refresh: nr_isolated_file -899642

    This negative value may impact compaction and reclaim throttling.

    The following path produces the decrement:

    shrink_folio_list
      demote_folio_list
        migrate_pages
          migrate_pages_batch
            migrate_folio_move
              migrate_folio_done
                mod_node_page_state(-ve) <- decrement

    This path happens for SUCCESSFUL migrations, not failures.  Typically
    callers to migrate_pages are required to handle putback/accounting for
    failures, but this is already handled in the shrink code.

    Instead, when accounting for migrations, do not decrement the count
    when the migration reason is MR_DEMOTION.  As of v6.11, this demotion
    logic is the only source of MR_DEMOTION.

    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 26aa2d199d6f ("mm/migrate: demote pages during reclaim")
    Signed-off-by: Gregory Price <[email protected]>
    Reviewed-by: Yang Shi <[email protected]>
    Reviewed-by: Davidlohr Bueso <[email protected]>
    Reviewed-by: Shakeel Butt <[email protected]>
    Reviewed-by: "Huang, Ying" <[email protected]>
    Reviewed-by: Oscar Salvador <[email protected]>
    Cc: Dave Hansen <[email protected]>
    Cc: Wei Xu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 003d2996964c03dfd34860500428f4cdf1f5879e
Author: Jens Axboe <[email protected]>
Date:   Thu Oct 31 08:05:44 2024 -0600

    io_uring/rw: fix missing NOWAIT check for O_DIRECT start write

    [ Upstream commit 1d60d74e852647255bd8e76f5a22dc42531e4389 ]

    When io_uring starts a write, it'll call kiocb_start_write() to bump the
    super block rwsem, preventing any freezes from happening while that
    write is in-flight. The freeze side will grab that rwsem for writing,
    excluding any new writers from happening and waiting for existing writes
    to finish. But io_uring unconditionally uses kiocb_start_write(), which
    will block if someone is currently attempting to freeze the mount point.
    This causes a deadlock where freeze is waiting for previous writes to
    complete, but the previous writes cannot complete, as the task that is
    supposed to complete them is blocked waiting on starting a new write.
    This results in the following stuck trace showing that dependency with
    the write blocked starting a new write:

    task:fio             state:D stack:0     pid:886   tgid:886   ppid:876
    Call trace:
     __switch_to+0x1d8/0x348
     __schedule+0x8e8/0x2248
     schedule+0x110/0x3f0
     percpu_rwsem_wait+0x1e8/0x3f8
     __percpu_down_read+0xe8/0x500
     io_write+0xbb8/0xff8
     io_issue_sqe+0x10c/0x1020
     io_submit_sqes+0x614/0x2110
     __arm64_sys_io_uring_enter+0x524/0x1038
     invoke_syscall+0x74/0x268
     el0_svc_common.constprop.0+0x160/0x238
     do_el0_svc+0x44/0x60
     el0_svc+0x44/0xb0
     el0t_64_sync_handler+0x118/0x128
     el0t_64_sync+0x168/0x170
    INFO: task fsfreeze:7364 blocked for more than 15 seconds.
          Not tainted 6.12.0-rc5-00063-g76aaf945701c #7963

    with the attempting freezer stuck trying to grab the rwsem:

    task:fsfreeze        state:D stack:0     pid:7364  tgid:7364  ppid:995
    Call trace:
     __switch_to+0x1d8/0x348
     __schedule+0x8e8/0x2248
     schedule+0x110/0x3f0
     percpu_down_write+0x2b0/0x680
     freeze_super+0x248/0x8a8
     do_vfs_ioctl+0x149c/0x1b18
     __arm64_sys_ioctl+0xd0/0x1a0
     invoke_syscall+0x74/0x268
     el0_svc_common.constprop.0+0x160/0x238
     do_el0_svc+0x44/0x60
     el0_svc+0x44/0xb0
     el0t_64_sync_handler+0x118/0x128
     el0t_64_sync+0x168/0x170

    Fix this by having the io_uring side honor IOCB_NOWAIT, and only attempt a
    blocking grab of the super block rwsem if it isn't set. For normal issue
    where IOCB_NOWAIT would always be set, this returns -EAGAIN which will
    have io_uring core issue a blocking attempt of the write. That will in
    turn also get completions run, ensuring forward progress.

    Since freezing requires CAP_SYS_ADMIN in the first place, this isn't
    something that can be triggered by a regular user.

    Cc: [email protected] # 5.10+
    Reported-by: Peter Mann <[email protected]>
    Link: https://lore.kernel.org/io-uring/[email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 70bbe8d0a949413df1bb6532fd6b19fbf0f88feb
Author: Andrey Konovalov <[email protected]>
Date:   Tue Oct 22 18:07:06 2024 +0200

    kasan: remove vmalloc_percpu test

    [ Upstream commit 330d8df81f3673d6fb74550bbc9bb159d81b35f7 ]

    Commit 1a2473f0cbc0 ("kasan: improve vmalloc tests") added the
    vmalloc_percpu KASAN test with the assumption that __alloc_percpu always
    uses vmalloc internally, which is tagged by KASAN.

    However, __alloc_percpu might allocate memory from the first per-CPU
    chunk, which is not allocated via vmalloc().  As a result, the test might
    fail.

    Remove the test until proper KASAN annotations for per-CPU allocations
    are added; tracked in https://bugzilla.kernel.org/show_bug.cgi?id=215019.

    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 1a2473f0cbc0 ("kasan: improve vmalloc tests")
    Signed-off-by: Andrey Konovalov <[email protected]>
    Reported-by: Samuel Holland <[email protected]>
    Link: https://lore.kernel.org/all/[email protected]/
    Reported-by: Sabyrzhan Tasbolatov <[email protected]>
    Link: https://lore.kernel.org/all/CACzwLxiWzNqPBp4C1VkaXZ2wDwvY3yZeetCi1TLGFipKW77drA@mail.gmail.com/
    Cc: Alexander Potapenko <[email protected]>
    Cc: Andrey Ryabinin <[email protected]>
    Cc: Dmitry Vyukov <[email protected]>
    Cc: Marco Elver <[email protected]>
    Cc: Sabyrzhan Tasbolatov <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit c60af16e1d6cc2237d58336546d6adfc067b6b8f
Author: Vitaliy Shevtsov <[email protected]>
Date:   Mon Sep 16 22:41:37 2024 +0500

    nvmet-auth: assign dh_key to NULL after kfree_sensitive

    [ Upstream commit d2f551b1f72b4c508ab9298419f6feadc3b5d791 ]

    ctrl->dh_key might be used across multiple calls to nvmet_setup_dhgroup()
    for the same controller. So it's better to nullify it after release on
    error path in order to avoid double free later in nvmet_destroy_auth().
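
    A userspace sketch of the free-then-NULL idiom, assuming a
    hypothetical struct ctrl_model and free_sensitive() as stand-ins
    for the controller and kfree_sensitive():

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the controller state holding the key. */
struct ctrl_model {
    unsigned char *dh_key;
    size_t dh_keysize;
};

/* Hypothetical analogue of kfree_sensitive(): wipe, then free.
 * A plain memset() may be optimised away by the compiler; the kernel
 * uses an explicit zeroing helper for that reason. */
static void free_sensitive(void *p, size_t len)
{
    if (p) {
        memset(p, 0, len);
        free(p);
    }
}

/* Release the key and clear the pointer so a second call is a no-op. */
void release_dh_key(struct ctrl_model *ctrl)
{
    free_sensitive(ctrl->dh_key, ctrl->dh_keysize);
    ctrl->dh_key = NULL;
    ctrl->dh_keysize = 0;
}
```

    Calling release_dh_key() twice is safe here because the second call
    sees a NULL pointer instead of the already-freed buffer.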

    Found by Linux Verification Center (linuxtesting.org) with Svace.

    Fixes: 7a277c37d352 ("nvmet-auth: Diffie-Hellman key exchange support")
    Cc: [email protected]
    Signed-off-by: Vitaliy Shevtsov <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Reviewed-by: Hannes Reinecke <[email protected]>
    Signed-off-by: Keith Busch <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 4a39320977f9c665faa37efaa8093b8e82dd8c41
Author: Christoffer Sandberg <[email protected]>
Date:   Tue Oct 29 16:16:53 2024 +0100

    ALSA: hda/realtek: Fix headset mic on TUXEDO Stellaris 16 Gen6 mb1

    [ Upstream commit e49370d769e71456db3fbd982e95bab8c69f73e8 ]

    Quirk is needed to enable headset microphone on missing pin 0x19.

    Signed-off-by: Christoffer Sandberg <[email protected]>
    Signed-off-by: Werner Sembach <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit b42adef85aca72b51eab1a812a79913ff5aeb584
Author: Christoffer Sandberg <[email protected]>
Date:   Tue Oct 29 16:16:52 2024 +0100

    ALSA: hda/realtek: Fix headset mic on TUXEDO Gemini 17 Gen3

    [ Upstream commit 0b04fbe886b4274c8e5855011233aaa69fec6e75 ]

    Quirk is needed to enable headset microphone on missing pin 0x19.

    Signed-off-by: Christoffer Sandberg <[email protected]>
    Signed-off-by: Werner Sembach <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 77ddc732416b017180893cbb2356e9f0a414c575
Author: Christoph Hellwig <[email protected]>
Date:   Wed Oct 23 15:37:22 2024 +0200

    xfs: fix finding a last resort AG in xfs_filestream_pick_ag

    [ Upstream commit dc60992ce76fbc2f71c2674f435ff6bde2108028 ]

    When the main loop in xfs_filestream_pick_ag fails to find a suitable
    AG it tries to just pick the online AG.  But the loop for that uses
    args->pag as loop iterator while the later code expects pag to be
    set.  Fix this by reusing the max_pag case for this last resort, and
    also add a check for the impossible case of no AG just to make sure that
    the uninitialized pag doesn't even escape in theory.

    Reported-by: [email protected]
    Signed-off-by: Christoph Hellwig <[email protected]>
    Tested-by: [email protected]
    Fixes: f8f1ed1ab3baba ("xfs: return a referenced perag from filestreams allocator")
    Cc: <[email protected]> # v6.3
    Reviewed-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Carlos Maiolino <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 8e886e44397ba89f6e8da8471386112b4f5b67b7
Author: Matt Johnston <[email protected]>
Date:   Tue Oct 22 18:25:14 2024 +0800

    mctp i2c: handle NULL header address

    [ Upstream commit 01e215975fd80af81b5b79f009d49ddd35976c13 ]

    daddr can be NULL if there is no neighbour table entry present,
    in that case the tx packet should be dropped.

    saddr will usually be set by MCTP core, but check for NULL in case a
    packet is transmitted by a different protocol.

    Fixes: f5b8abf9fc3d ("mctp i2c: MCTP I2C binding driver")
    Cc: [email protected]
    Reported-by: Dung Cao <[email protected]>
    Signed-off-by: Matt Johnston <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/20241022-mctp-i2c-null-dest-v3-1-e929709956c5@codeconstruct.com.au
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 88f97a4b5843ce21c1286e082c02a5fb4d8eb473
Author: Edward Adam Davis <[email protected]>
Date:   Wed Oct 16 19:43:47 2024 +0800

    ocfs2: pass u64 to ocfs2_truncate_inline maybe overflow

    [ Upstream commit bc0a2f3a73fcdac651fca64df39306d1e5ebe3b0 ]

    Syzbot reported a kernel BUG in ocfs2_truncate_inline.  There are two
    reasons for this: first, the parameter value passed is greater than
    ocfs2_max_inline_data_with_xattr, second, the start and end parameters of
    ocfs2_truncate_inline are "unsigned int".

    So, we need to add a sanity check for byte_start and byte_len right before
    ocfs2_truncate_inline() in ocfs2_remove_inode_range(): if they are greater
    than ocfs2_max_inline_data_with_xattr, return -EINVAL.
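
    A sketch of such a check on 64-bit values before they are narrowed.
    check_inline_range() and its max_inline parameter are hypothetical
    and do not reproduce the ocfs2 code:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical range check: reject any [byte_start, byte_start + byte_len)
 * range that does not fit inside max_inline bytes. The subtraction form
 * avoids overflowing byte_start + byte_len. */
int check_inline_range(uint64_t byte_start, uint64_t byte_len,
                       uint64_t max_inline)
{
    if (byte_start > max_inline || byte_len > max_inline - byte_start)
        return -EINVAL;
    return 0;
}
```

    Without the check, narrowing a 64-bit offset such as 0x100000010 to a
    32-bit unsigned value silently yields 0x10, so the callee would see a
    small, plausible-looking offset.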

    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 1afc32b95233 ("ocfs2: Write support for inline data")
    Signed-off-by: Edward Adam Davis <[email protected]>
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=81092778aac03460d6b7
    Reviewed-by: Joseph Qi <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Joseph Qi <[email protected]>
    Cc: Mark Fasheh <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit c117a980185ee3812612e7e453e356a6a4f05305
Author: Sabyrzhan Tasbolatov <[email protected]>
Date:   Wed Oct 16 20:24:07 2024 +0500

    x86/traps: move kmsan check after instrumentation_begin

    [ Upstream commit 1db272864ff250b5e607283eaec819e1186c8e26 ]

    During an x86_64 kernel build with CONFIG_KMSAN, objtool warns as follows:

      AR      built-in.a
      AR      vmlinux.a
      LD      vmlinux.o
    vmlinux.o: warning: objtool: handle_bug+0x4: call to
        kmsan_unpoison_entry_regs() leaves .noinstr.text section
      OBJCOPY modules.builtin.modinfo
      GEN     modules.builtin
      MODPOST Module.symvers
      CC      .vmlinux.export.o

    Moving kmsan_unpoison_entry_regs() _after_ instrumentation_begin() fixes
    the warning.

    decode_bug(regs->ip, &imm) is left before the KMSAN unpoisoning: it
    has a return condition, and if we move it after
    instrumentation_begin() it results in the warning "return with
    instrumentation enabled". Hence, I'm concerned that regs will not be KMSAN
    unpoisoned if `ud_type == BUG_NONE` is true.

    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: ba54d194f8da ("x86/traps: avoid KMSAN bugs originating from handle_bug()")
    Signed-off-by: Sabyrzhan Tasbolatov <[email protected]>
    Reviewed-by: Alexander Potapenko <[email protected]>
    Cc: Borislav Petkov (AMD) <[email protected]>
    Cc: Dave Hansen <[email protected]>
    Cc: Ingo Molnar <[email protected]>
    Cc: Thomas Gleixner <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 86ee1845cbbf52eff6d41ce438d5f7e9ab6f4602
Author: Gatlin Newhouse <[email protected]>
Date:   Wed Jul 24 00:01:55 2024 +0000

    x86/traps: Enable UBSAN traps on x86

    [ Upstream commit 7424fc6b86c8980a87169e005f5cd4438d18efe6 ]

    Currently ARM64 extracts which specific sanitizer has caused a trap via
    encoded data in the trap instruction. Clang on x86 currently encodes the
    same data in the UD1 instruction but x86 handle_bug() and
    is_valid_bugaddr() currently only look at UD2.

    Bring x86 to parity with ARM64, similar to commit 25b84002afb9 ("arm64:
    Support Clang UBSAN trap codes for better reporting"). See the llvm
    links for information about the code generation.

    Enable the reporting of UBSAN sanitizer details on x86 compiled with clang
    when CONFIG_UBSAN_TRAP=y by analysing UD1 and retrieving the type immediate
    which is encoded by the compiler after the UD1.

    [ tglx: Simplified it by moving the printk() into handle_bug() ]

    Signed-off-by: Gatlin Newhouse <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Acked-by: Peter Zijlstra (Intel) <[email protected]>
    Cc: Kees Cook <[email protected]>
    Link: https://lore.kernel.org/all/[email protected]
    Link: https://github.com/llvm/llvm-project/commit/c5978f42ec8e9#diff-bb68d7cd885f41cfc35843998b0f9f534adb60b415f647109e597ce448e92d9f
    Link: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86InstrSystem.td#L27
    Stable-dep-of: 1db272864ff2 ("x86/traps: move kmsan check after instrumentation_begin")
    Signed-off-by: Sasha Levin <[email protected]>

commit b958948ae1cb3e39c48e9f805436fd652103c71e
Author: Matt Fleming <[email protected]>
Date:   Fri Oct 11 13:07:37 2024 +0100

    mm/page_alloc: let GFP_ATOMIC order-0 allocs access highatomic reserves

    [ Upstream commit 281dd25c1a018261a04d1b8bf41a0674000bfe38 ]

    Under memory pressure it's possible for GFP_ATOMIC order-0 allocations to
    fail even though free pages are available in the highatomic reserves.
    GFP_ATOMIC allocations cannot trigger unreserve_highatomic_pageblock()
    since it's only run from reclaim.

    Given that such allocations will pass the watermarks in
    __zone_watermark_unusable_free(), it makes sense to fall back to highatomic
    reserves the same way that ALLOC_OOM can.

    This fixes order-0 page allocation failures observed on Cloudflare's fleet
    when handling network packets:

      kswapd1: page allocation failure: order:0, mode:0x820(GFP_ATOMIC),
      nodemask=(null),cpuset=/,mems_allowed=0-7
      CPU: 10 PID: 696 Comm: kswapd1 Kdump: loaded Tainted: G           O 6.6.43-CUSTOM #1
      Hardware name: MACHINE
      Call Trace:
       <IRQ>
       dump_stack_lvl+0x3c/0x50
       warn_alloc+0x13a/0x1c0
       __alloc_pages_slowpath.constprop.0+0xc9d/0xd10
       __alloc_pages+0x327/0x340
       __napi_alloc_skb+0x16d/0x1f0
       bnxt_rx_page_skb+0x96/0x1b0 [bnxt_en]
       bnxt_rx_pkt+0x201/0x15e0 [bnxt_en]
       __bnxt_poll_work+0x156/0x2b0 [bnxt_en]
       bnxt_poll+0xd9/0x1c0 [bnxt_en]
       __napi_poll+0x2b/0x1b0
       bpf_trampoline_6442524138+0x7d/0x1000
       __napi_poll+0x5/0x1b0
       net_rx_action+0x342/0x740
       handle_softirqs+0xcf/0x2b0
       irq_exit_rcu+0x6c/0x90
       sysvec_apic_timer_interrupt+0x72/0x90
       </IRQ>

    [[email protected]: update comment]
      Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lore.kernel.org/all/CAGis_TWzSu=P7QJmjD58WWiu3zjMTVKSzdOwWE8ORaGytzWJwQ@mail.gmail.com/
    Fixes: 1d91df85f399 ("mm/page_alloc: handle a missing case for memalloc_nocma_{save/restore} APIs")
    Signed-off-by: Matt Fleming <[email protected]>
    Suggested-by: Vlastimil Babka <[email protected]>
    Reviewed-by: Vlastimil Babka <[email protected]>
    Cc: Mel Gorman <[email protected]>
    Cc: Michal Hocko <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 4882a352b5df897c30f9d64fba340a219a6604d0
Author: Alexander Usyskin <[email protected]>
Date:   Tue Oct 15 15:31:57 2024 +0300

    mei: use kvmalloc for read buffer

    [ Upstream commit 4adf613e01bf99e1739f6ff3e162ad5b7d578d1a ]

    The read buffer is allocated according to the max message size reported
    by the firmware, and may reach 64K in systems with a pxp client.
    A contiguous 64K allocation may fail under memory pressure.
    The read buffer is used as in-driver message storage and is not
    required to be contiguous.
    Use kvmalloc to allow the kernel to allocate non-contiguous memory.

    Fixes: 3030dc056459 ("mei: add wrapper for queuing control commands.")
    Cc: stable <[email protected]>
    Reported-by: Rohit Agarwal <[email protected]>
    Closes: https://lore.kernel.org/all/[email protected]/
    Tested-by: Brian Geffon <[email protected]>
    Signed-off-by: Alexander Usyskin <[email protected]>
    Acked-by: Tomas Winkler <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit cb8b81ad3e893a6d18dcdd3754cc2ea2a42c0136
Author: Matthieu Baerts (NGI0) <[email protected]>
Date:   Mon Oct 21 12:25:26 2024 +0200

    mptcp: init: protect sched with rcu_read_lock

    [ Upstream commit 3deb12c788c385e17142ce6ec50f769852fcec65 ]

    Enabling CONFIG_PROVE_RCU_LIST with its dependency CONFIG_RCU_EXPERT
    creates this splat when an MPTCP socket is created:

      =============================
      WARNING: suspicious RCU usage
      6.12.0-rc2+ #11 Not tainted
      -----------------------------
      net/mptcp/sched.c:44 RCU-list traversed in non-reader section!!

      other info that might help us debug this:

      rcu_scheduler_active = 2, debug_locks = 1
      no locks held by mptcp_connect/176.

      stack backtrace:
      CPU: 0 UID: 0 PID: 176 Comm: mptcp_connect Not tainted 6.12.0-rc2+ #11
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      Call Trace:
       <TASK>
       dump_stack_lvl (lib/dump_stack.c:123)
       lockdep_rcu_suspicious (kernel/locking/lockdep.c:6822)
       mptcp_sched_find (net/mptcp/sched.c:44 (discriminator 7))
       mptcp_init_sock (net/mptcp/protocol.c:2867 (discriminator 1))
       ? sock_init_data_uid (arch/x86/include/asm/atomic.h:28)
       inet_create.part.0.constprop.0 (net/ipv4/af_inet.c:386)
       ? __sock_create (include/linux/rcupdate.h:347 (discriminator 1))
       __sock_create (net/socket.c:1576)
       __sys_socket (net/socket.c:1671)
       ? __pfx___sys_socket (net/socket.c:1712)
       ? do_user_addr_fault (arch/x86/mm/fault.c:1419 (discriminator 1))
       __x64_sys_socket (net/socket.c:1728)
       do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1))
       entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)

    That's because when the socket is initialised, rcu_read_lock() is not
    used despite the explicit comment written above the declaration of
    mptcp_sched_find() in sched.c. Adding the missing lock/unlock avoids the
    warning.

    Fixes: 1730b2b2c5a5 ("mptcp: add sched in mptcp_sock")
    Cc: [email protected]
    Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/523
    Reviewed-by: Geliang Tang <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 4f7ffa83fa79dd52efbaef366c850aaaae06a469
Author: Hugh Dickins <[email protected]>
Date:   Sun Oct 27 15:23:23 2024 -0700

    iov_iter: fix copy_page_from_iter_atomic() if KMAP_LOCAL_FORCE_MAP

    [ Upstream commit c749d9b7ebbc5716af7a95f7768634b30d9446ec ]

    generic/077 on x86_32 CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP=y with highmem,
    on huge=always tmpfs, issues a warning and then hangs (interruptibly):

    WARNING: CPU: 5 PID: 3517 at mm/highmem.c:622 kunmap_local_indexed+0x62/0xc9
    CPU: 5 UID: 0 PID: 3517 Comm: cp Not tainted 6.12.0-rc4 #2
    ...
    copy_page_from_iter_atomic+0xa6/0x5ec
    generic_perform_write+0xf6/0x1b4
    shmem_file_write_iter+0x54/0x67

    Fix copy_page_from_iter_atomic() by limiting it in that case
    (include/linux/skbuff.h skb_frag_must_loop() does similar).

    But going forward, perhaps CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP is too
    surprising, has outlived its usefulness, and should just be removed?

    Fixes: 908a1ad89466 ("iov_iter: Handle compound highmem pages in copy_page_from_iter_atomic()")
    Signed-off-by: Hugh Dickins <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Christoph Hellwig <[email protected]>
    Cc: [email protected]
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit ade91f6e9848b370add44d89c976e070ccb492ef
Author: Shawn Wang <[email protected]>
Date:   Fri Oct 25 10:22:08 2024 +0800

    sched/numa: Fix the potential null pointer dereference in task_numa_work()

    [ Upstream commit 9c70b2a33cd2aa6a5a59c5523ef053bd42265209 ]

    When running the stress-ng-vm-segv test, we found a null pointer dereference
    error in task_numa_work(). Here is the backtrace:

      [323676.066985] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
      ......
      [323676.067108] CPU: 35 PID: 2694524 Comm: stress-ng-vm-se
      ......
      [323676.067113] pstate: 23401009 (nzCv daif +PAN -UAO +TCO +DIT +SSBS BTYPE=--)
      [323676.067115] pc : vma_migratable+0x1c/0xd0
      [323676.067122] lr : task_numa_work+0x1ec/0x4e0
      [323676.067127] sp : ffff8000ada73d20
      [323676.067128] x29: ffff8000ada73d20 x28: 0000000000000000 x27: 000000003e89f010
      [323676.067130] x26: 0000000000080000 x25: ffff800081b5c0d8 x24: ffff800081b27000
      [323676.067133] x23: 0000000000010000 x22: 0000000104d18cc0 x21: ffff0009f7158000
      [323676.067135] x20: 0000000000000000 x19: 0000000000000000 x18: ffff8000ada73db8
      [323676.067138] x17: 0001400000000000 x16: ffff800080df40b0 x15: 0000000000000035
      [323676.067140] x14: ffff8000ada73cc8 x13: 1fffe0017cc72001 x12: ffff8000ada73cc8
      [323676.067142] x11: ffff80008001160c x10: ffff000be639000c x9 : ffff8000800f4ba4
      [323676.067145] x8 : ffff000810375000 x7 : ffff8000ada73974 x6 : 0000000000000001
      [323676.067147] x5 : 0068000b33e26707 x4 : 0000000000000001 x3 : ffff0009f7158000
      [323676.067149] x2 : 0000000000000041 x1 : 0000000000004400 x0 : 0000000000000000
      [323676.067152] Call trace:
      [323676.067153]  vma_migratable+0x1c/0xd0
      [323676.067155]  task_numa_work+0x1ec/0x4e0
      [323676.067157]  task_work_run+0x78/0xd8
      [323676.067161]  do_notify_resume+0x1ec/0x290
      [323676.067163]  el0_svc+0x150/0x160
      [323676.067167]  el0t_64_sync_handler+0xf8/0x128
      [323676.067170]  el0t_64_sync+0x17c/0x180
      [323676.067173] Code: d2888001 910003fd f9000bf3 aa0003f3 (f9401000)
      [323676.067177] SMP: stopping secondary CPUs
      [323676.070184] Starting crashdump kernel...

    stress-ng-vm-segv in stress-ng is used to stress test the SIGSEGV error
    handling function of the system, which tries to cause a SIGSEGV error on
    return from unmapping the whole address space of the child process.

    Normally this program will not cause kernel crashes. But before the
    munmap system call returns to user mode, a potential task_numa_work()
    for numa balancing could be added and executed. In this scenario, since the
    child process has no vma after munmap, the vma_next() in task_numa_work()
    will return a null pointer even if the vma iterator restarts from 0.

    Recheck the vma pointer before dereferencing it in task_numa_work().
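
    As a rough userspace illustration (not the kernel code), the recheck can
    be modelled with a hypothetical iterator. struct vma, struct vma_iter,
    iter_next(), iter_restart() and scan_bytes() below are stand-ins invented
    for this sketch. It only shows that a walk restarted over an empty
    address space still yields NULL, so the pointer has to be tested again
    before it is used.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's VMA and VMA iterator. */
struct vma {
    unsigned long start;
    unsigned long end;
    struct vma *next;
};

struct vma_iter {
    struct vma *head;   /* lowest VMA of the address space, NULL if none */
    struct vma *pos;    /* next VMA to hand out */
};

/* Return the next VMA, or NULL when the walk is exhausted. */
static struct vma *iter_next(struct vma_iter *it)
{
    struct vma *v = it->pos;

    if (v)
        it->pos = v->next;
    return v;
}

/* Restart the walk from the lowest address. */
static void iter_restart(struct vma_iter *it)
{
    it->pos = it->head;
}

/* Sum the sizes of all VMAs, restarting once if the walk begins at the end. */
static unsigned long scan_bytes(struct vma_iter *it)
{
    unsigned long total = 0;
    struct vma *v = iter_next(it);

    if (!v) {
        iter_restart(it);
        v = iter_next(it);
    }

    /* The recheck: after a restart the address space may still be empty. */
    for (; v; v = iter_next(it))
        total += v->end - v->start;

    return total;
}
```

    With an empty list (head == NULL) scan_bytes() returns 0 instead of
    dereferencing a null pointer, which is the situation the child process
    is in after it has unmapped its whole address space.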

    Fixes: 214dbc428137 ("sched: convert to vma iterator")
    Signed-off-by: Shawn Wang <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Cc: [email protected] # v6.2+
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

commit 8c9a1ec39c698cbc38f4efa9113185f885137f8b
Author: Dan Williams <[email protected]>
Date:   Tue Oct 22 18:43:40 2024 -0700

    cxl/acpi: Ensure ports ready at cxl_acpi_probe() return

    [ Upstream commit 48f62d38a07d464a499fa834638afcfd2b68f852 ]

    In order to ensure root CXL ports are enabled upon cxl_acpi_probe()
    when the 'cxl_port' driver is built as a module, arrange for the
    module to be pre-loaded or built-in.

    The "Fixes:" but no "Cc: stable" on this patch reflects that the issue
    is merely by inspection since the bug that triggered the discovery of
    this potential problem [1] is fixed by other means. However, a stable
    backport should do no harm.

    Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver")
    Link: http://lore.kernel.org/[email protected] [1]
    Signed-off-by: Dan Williams <[email protected]>
    Tested-by: Gregory Price <[email protected]>
    Reviewed-by: Jonathan Cameron <[email protected]>
    Reviewed-by: Ira Weiny <[email protected]>
    Link: https://patch.msgid.link/172964781969.81806.17276352414854540808.stgit@dwillia2-xfh.jf.intel.com
    Signed-off-by: Ira Weiny <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit a9ed67f39f888bb6e5729112ad45f15d9c5a3ef8
Author: Dan Williams <[email protected]>
Date:   Tue Oct 22 18:43:32 2024 -0700

    cxl/port: Fix cxl_bus_rescan() vs bus_rescan_devices()

    [ Upstream commit 3d6ebf16438de5d712030fefbb4182b46373d677 ]

    It turns out that since its original introduction, pre-2.6.12,
    bus_rescan_devices() has skipped devices that might be in the process of
    attaching to or detaching from their driver. For CXL this behavior is
    unwanted: CXL expects cxl_bus_rescan() to act as a probe barrier.

    That behavior is simple enough to achieve with bus_for_each_dev() paired
    with a call to device_attach(), and it is unclear why bus_rescan_devices()
    took the position of lockless consumption of dev->driver, which is racy.

    The "Fixes:" but no "Cc: stable" on this patch reflects that the issue
    is merely by inspection since the bug that triggered the discovery of
    this potential problem [1] is fixed by other means.  However, a stable
    backport should do no harm.

    Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver")
    Link: http://lore.kernel.org/[email protected] [1]
    Signed-off-by: Dan Williams <[email protected]>
    Tested-by: Gregory Price <[email protected]>
    Reviewed-by: Jonathan Cameron <[email protected]>
    Reviewed-by: Ira Weiny <[email protected]>
    Link: https://patch.msgid.link/172964781104.81806.4277549800082443769.stgit@dwillia2-xfh.jf.intel.com
    Signed-off-by: Ira Weiny <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit d210bc87cc4fdde62f757002530a08c3d109d94a
Author: Chunyan Zhang <[email protected]>
Date:   Tue Oct 8 17:41:39 2024 +0800

    riscv: Remove duplicated GET_RM

    [ Upstream commit 164f66de6bb6ef454893f193c898dc8f1da6d18b ]

    The macro GET_RM is defined twice in this file; one definition can be removed.

    Reviewed-by: Alexandre Ghiti <[email protected]>
    Signed-off-by: Chunyan Zhang <[email protected]>
    Fixes: 956d705dd279 ("riscv: Unaligned load/store handling for M_MODE")
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 6d84e1b2e5ac04511e68bcf5577fc8369e73f4ed
Author: Chunyan Zhang <[email protected]>
Date:   Tue Oct 8 17:41:38 2024 +0800

    riscv: Remove unused GENERATING_ASM_OFFSETS

    [ Upstream commit 46d4e5ac6f2f801f97bcd0ec82365969197dc9b1 ]

    The macro is not used in the current version of the kernel, so it looks
    like it can be removed to avoid a build warning:

    ../arch/riscv/kernel/asm-offsets.c: At top level:
    ../arch/riscv/kernel/asm-offsets.c:7: warning: macro "GENERATING_ASM_OFFSETS" is not used [-Wunused-macros]
        7 | #define GENERATING_ASM_OFFSETS

    Fixes: 9639a44394b9 ("RISC-V: Provide a cleaner raw_smp_processor_id()")
    Cc: [email protected]
    Reviewed-by: Alexandre Ghiti <[email protected]>
    Tested-by: Alexandre Ghiti <[email protected]>
    Signed-off-by: Chunyan Zhang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit a63ba17207c50da91b19150b6cde09d199b34c2c
Author: WangYuli <[email protected]>
Date:   Thu Oct 17 11:20:10 2024 +0800

    riscv: Use '%u' to format the output of 'cpu'

    [ Upstream commit e0872ab72630dada3ae055bfa410bf463ff1d1e0 ]

    'cpu' is an unsigned integer, so its conversion specifier should
    be %u, not %d.
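
    A minimal userspace sketch of the same rule, using snprintf() rather
    than the kernel's print helpers. format_cpu() and the message text are
    assumptions made up for this example, not code from the patch.

```c
#include <stdio.h>

/* 'cpu' is unsigned, so %u is the matching conversion specifier. */
static int format_cpu(char *buf, size_t len, unsigned int cpu)
{
    return snprintf(buf, len, "CPU%u: off", cpu);
}
```

    With %u, a value above INT_MAX such as 3000000000 is printed as
    written. With %d the argument type would not match the specifier, and
    the result for such a value is not something the C standard defines.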

    Suggested-by: Wentao Guan <[email protected]>
    Suggested-by: Maciej W. Rozycki <[email protected]>
    Link: https://lore.kernel.org/all/[email protected]/
    Signed-off-by: WangYuli <[email protected]>
    Reviewed-by: Charlie Jenkins <[email protected]>
    Tested-by: Charlie Jenkins <[email protected]>
    Fixes: f1e58583b9c7 ("RISC-V: Support cpu hotplug")
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 909e71f28e9615410f52fca1b54acfd3d61c61c2
Author: Heinrich Schuchardt <[email protected]>
Date:   Sun Sep 29 16:02:33 2024 +0200

    riscv: efi: Set NX compat flag in PE/COFF header

    [ Upstream commit d41373a4b910961df5a5e3527d7bde6ad45ca438 ]

    The IMAGE_DLLCHARACTERISTICS_NX_COMPAT informs the firmware that the
    EFI binary does not rely on pages that are both executable and
    writable.

    The flag is used by some distro versions of GRUB to decide if the EFI
    binary may be executed.

    As the Linux kernel neither has RWX sections nor needs RWX pages for
    relocation, we should set the flag.

    Cc: Ard Biesheuvel <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Heinrich Schuchardt <[email protected]>
    Reviewed-by: Emil Renner Berthing <[email protected]>
    Fixes: cb7d2dd5612a ("RISC-V: Add PE/COFF header for EFI stub")
    Acked-by: Ard Biesheuvel <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 58e78589ade880330e359587bb50b1474f43aa12
Author: Kailang Yang <[email protected]>
Date:   Fri Oct 18 13:53:24 2024 +0800

    ALSA: hda/realtek: Limit internal Mic boost on Dell platform

    [ Upstream commit 78e7be018784934081afec77f96d49a2483f9188 ]

    Dell wants to limit the internal Mic boost on all Dell platforms.

    Signed-off-by: Kailang Yang <[email protected]>
    Cc: <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit ceec8ad09135c27890cdee5a9bb0bf5f58c23720
Author: Dmitry Torokhov <[email protected]>
Date:   Fri Oct 18 17:17:48 2024 -0700

    Input: edt-ft5x06 - fix regmap leak when probe fails

    [ Upstream commit bffdf9d7e51a7be8eeaac2ccf9e54a5fde01ff65 ]

    The driver neglects to free the instance of I2C regmap constructed at
    the beginning of the edt_ft5x06_ts_probe() method when probe fails.
    Additionally edt_ft5x06_ts_remove() is freeing the regmap too early,
    before the rest of the device resources that are managed by devm are
    released.

    Fix this by installing a custom devm action that will ensure that the
    regmap is released at the right time during normal teardown as well as
    in case of probe failure.

    Note that devm_regmap_init_i2c() could not be used because the driver
    may replace the original regmap with a regmap specific for M06 devices
    in the middle of the probe, and using devm_regmap_init_i2c() would
    result in releasing the M06 regmap too early.
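
    The ordering argument can be sketched in plain userspace C. Everything
    below is a hypothetical model, not the driver or the devres API:
    struct res_list, add_action(), release_all(), struct ts_data,
    release_regmap() and release_irq() are names invented here, and strings
    stand in for regmap pointers. The model assumes that managed actions
    run newest-first, and that the custom action reads the regmap pointer
    at release time instead of capturing it at registration time.

```c
#include <stddef.h>

#define MAX_ACTIONS 8

/* Hypothetical model of a devm-style list: actions run in reverse order. */
struct res_list {
    void (*fn[MAX_ACTIONS])(void *);
    void *arg[MAX_ACTIONS];
    size_t count;
};

/* Stand-in for driver state; the strings stand in for struct regmap *. */
struct ts_data {
    const char *regmap;           /* whichever regmap is current */
    const char *released;         /* which regmap the action released */
    int regmap_live_at_irq_free;  /* was the regmap still there? */
};

static int add_action(struct res_list *rl, void (*fn)(void *), void *arg)
{
    if (rl->count == MAX_ACTIONS)
        return -1;
    rl->fn[rl->count] = fn;
    rl->arg[rl->count] = arg;
    rl->count++;
    return 0;
}

/* Used for probe failure and normal teardown alike, newest action first. */
static void release_all(struct res_list *rl)
{
    while (rl->count > 0) {
        rl->count--;
        rl->fn[rl->count](rl->arg[rl->count]);
    }
}

/* Looks up the regmap when it runs, so a swapped-in regmap is covered. */
static void release_regmap(void *arg)
{
    struct ts_data *ts = arg;

    ts->released = ts->regmap;
    ts->regmap = NULL;
}

/* A resource registered later that may still need the regmap on teardown. */
static void release_irq(void *arg)
{
    struct ts_data *ts = arg;

    ts->regmap_live_at_irq_free = (ts->regmap != NULL);
}
```

    Registering release_regmap() first, swapping ts.regmap afterwards, and
    then registering release_irq() means release_all() tears down the later
    resource while the regmap is still valid and only then releases the
    regmap that is current at that point. The sketch leaves out what
    happens to the original regmap at swap time.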

    Reported-by: Li Zetao <[email protected]>
    Fixes: 9dfd9708ffba ("Input: edt-ft5x06 - convert to use regmap API")
    Cc: [email protected]
    Reviewed-by: Oliver Graute <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Dmitry Torokhov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit c19a0c171d37f86ab7267c638d475321fd9f0b77
Author: Alexandre Ghiti <[email protected]>
Date:   Wed Oct 16 10:36:24 2024 +0200

    riscv: vdso: Prevent the compiler from inserting calls to memset()

    [ Upstream commit bf40167d54d55d4b54d0103713d86a8638fb9290 ]

    The compiler is smart enough to insert a call to memset() in
    riscv_vdso_get_cpus(), which generates a dynamic relocation.

    So prevent this by using -fno-builtin option.

    Fixes: e2c0cdfba7f6 ("RISC-V: User-facing API")
    Cc: [email protected]
    Signed-off-by: Alexandre Ghiti <[email protected]>
    Reviewed-by: Guo Ren <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit e79c1f1c9100b4adc91c6512985db2cc961aafaa
Author: Frank Li <[email protected]>
Date:   Wed Oct 23 16:30:32 2024 -0400

    spi: spi-fsl-dspi: Fix crash when not using GPIO chip select

    [ Upstream commit 25f00a13dccf8e45441265768de46c8bf58e08f6 ]

    Add a check for the return value of spi_get_csgpiod() to avoid passing a NULL
    pointer to gpiod_direction_output(), preventing a crash when GPIO chip
    select is not used.

    Fix the crash below:
    [    4.251960] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
    [    4.260762] Mem abort info:
    [    4.263556]   ESR = 0x0000000096000004
    [    4.267308]   EC = 0x25: DABT (current EL), IL = 32 bits
    [    4.272624]   SET = 0, FnV = 0
    [    4.275681]   EA = 0, S1PTW = 0
    [    4.278822]   FSC = 0x04: level 0 translation fault
    [    4.283704] Data abort info:
    [    4.286583]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
    [    4.292074]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
    [    4.297130]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
    [    4.302445] [0000000000000000] user address but active_mm is swapper
    [    4.308805] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
    [    4.315072] Modules linked in:
    [    4.318124] CPU: 2 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0-rc4-next-20241023-00008-ga20ec42c5fc1 #359
    [    4.328130] Hardware name: LS1046A QDS Board (DT)
    [    4.332832] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [    4.339794] pc : gpiod_direction_output+0x34/0x5c
    [    4.344505] lr : gpiod_direction_output+0x18/0x5c
    [    4.349208] sp : ffff80008003b8f0
    [    4.352517] x29: ffff80008003b8f0 x28: 0000000000000000 x27: ffffc96bcc7e9068
    [    4.359659] x26: ffffc96bcc6e00b0 x25: ffffc96bcc598398 x24: ffff447400132810
    [    4.366800] x23: 0000000000000000 x22: 0000000011e1a300 x21: 0000000000020002
    [    4.373940] x20: 0000000000000000 x19: 0000000000000000 x18: ffffffffffffffff
    [    4.381081] x17: ffff44740016e600 x16: 0000000500000003 x15: 0000000000000007
    [    4.388221] x14: 0000000000989680 x13: 0000000000020000 x12: 000000000000001e
    [    4.395362] x11: 0044b82fa09b5a53 x10: 0000000000000019 x9 : 0000000000000008
    [    4.402502] x8 : 0000000000000002 x7 : 0000000000000007 …
qaz6750 added a commit to qaz6750/linux-longterm that referenced this pull request Nov 30, 2024
commit 9b5aad3a7498c261116a0251fe57f14ba9c4c6cf
Author: Greg Kroah-Hartman <[email protected]>
Date:   Fri Nov 8 16:28:28 2024 +0100

    Linux 6.6.60

    Link: https://lore.kernel.org/r/[email protected]
    Tested-by: SeongJae Park <[email protected]>
    Tested-by: Shuah Khan <[email protected]>
    Tested-by: Linux Kernel Functional Testing <[email protected]>
    Tested-by: Peter Schneider <[email protected]>
    Tested-by: Takeshi Ogasawara <[email protected]>
    Tested-by: Jon Hunter <[email protected]>
    Tested-by: Florian Fainelli <[email protected]>
    Tested-by: Ron Economos <[email protected]>
    Tested-by: Hardik Garg <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit cc082e50375a29596153fc3f1f8fc85ad1b0b5b9
Author: Konstantin Komarov <[email protected]>
Date:   Thu Sep 5 15:03:48 2024 +0300

    fs/ntfs3: Sequential field availability check in mi_enum_attr()

    commit 090f612756a9720ec18b0b130e28be49839d7cb5 upstream.

    The code is slightly reformatted to consistently check field availability
    without duplication.

    Fixes: 556bdf27c2dd ("ntfs3: Add bounds checking to mi_enum_attr()")
    Signed-off-by: Konstantin Komarov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 10c20d79d59cadfe572480d98cec271a89ffb024
Author: Srinivasan Shanmugam <[email protected]>
Date:   Mon May 27 20:15:21 2024 +0530

    drm/amd/display: Add null checks for 'stream' and 'plane' before dereferencing

    commit 15c2990e0f0108b9c3752d7072a97d45d4283aea upstream.

    This commit adds null checks for the 'stream' and 'plane' variables in
    the dcn30_apply_idle_power_optimizations() function. The code at line 922
    already treats these variables as possibly null, but they were used later
    in the code without checking if they were null. This could potentially lead
    to a null pointer dereference, which would cause a crash.

    The null checks ensure that 'stream' and 'plane' are not null before
    they are used, preventing potential crashes.

    Fixes the below static smatch checker:
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:938 dcn30_apply_idle_power_optimizations() error: we previously assumed 'stream' could be null (see line 922)
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:940 dcn30_apply_idle_power_optimizations() error: we previously assumed 'plane' could be null (see line 922)

    Cc: Tom Chung <[email protected]>
    Cc: Nicholas Kazlauskas <[email protected]>
    Cc: Bhawanpreet Lakha <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Hersen Wu <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Aurabindo Pillai <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    [Xiangyu: Modified file path to backport this commit]
    Signed-off-by: Xiangyu Chen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit e979a6a626abf1358a5bb79219eea82ac160d3d3
Author: Peter Ujfalusi <[email protected]>
Date:   Tue Sep 19 13:31:15 2023 +0300

    ASoC: SOF: ipc4-control: Add support for ALSA enum control

    commit 07a866a41982c896dc46476f57d209a200602946 upstream.

    Enum controls use generic param_id and a generic struct where the data
    is passed to the firmware.

    Signed-off-by: Peter Ujfalusi <[email protected]>
    Reviewed-by: Bard Liao <[email protected]>
    Reviewed-by: Pierre-Louis Bossart <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 3facc0417d3d7b3ba5822e74155bcb1267ce62c1
Author: Peter Ujfalusi <[email protected]>
Date:   Tue Sep 19 13:31:14 2023 +0300

    ASoC: SOF: ipc4-control: Add support for ALSA switch control

    commit 4a2fd607b7ca6128ee3532161505da7624197f55 upstream.

    Volume controls with a max value of 1 are switches.
    Switch controls use generic param_id and a generic struct where the data
    is passed to the firmware.

    Signed-off-by: Peter Ujfalusi <[email protected]>
    Reviewed-by: Bard Liao <[email protected]>
    Reviewed-by: Pierre-Louis Bossart <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit f01d8fc623711046e1efee00827bff6d5882cdfd
Author: Peter Ujfalusi <[email protected]>
Date:   Tue Sep 19 13:31:13 2023 +0300

    ASoC: SOF: ipc4-topology: Add definition for generic switch/enum control

    commit 060a07cd9bc69eba2da33ed96b1fa69ead60bab1 upstream.

    Currently IPC4 has no notion of a switch or enum type of control which is
    a generic concept in ALSA.

    The generic support for these control types will be as follows:
    - large config is used to send the channel-value pair array
    - param_id of a SWITCH type is 200
    - param_id of an ENUM type is 201

    Each module that needs to support a switch and/or enum control must handle
    these universal param_ids.
    The message payload is described by struct sof_ipc4_control_msg_payload.

    Signed-off-by: Peter Ujfalusi <[email protected]>
    Reviewed-by: Bard Liao <[email protected]>
    Reviewed-by: Pierre-Louis Bossart <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit d54afaef6570c277070c3cafe1ed73dcdc129e0a
Author: Chuck Lever <[email protected]>
Date:   Tue Sep 19 11:35:15 2023 -0400

    SUNRPC: Remove BUG_ON call sites

    commit 789ce196a31dd13276076762204bee87df893e53 upstream.

    There is no need to take down the whole system for these assertions.

    I'd rather not attempt a heroic save here, as some bug has occurred
    that has left the transport data structures in an unknown state.
    Just warn and then leak the left-over resources.

    Acked-by: Christian Brauner <[email protected]>
    Reviewed-by: NeilBrown <[email protected]>
    Reviewed-by: Jeff Layton <[email protected]>
    Signed-off-by: Chuck Lever <[email protected]>
    Signed-off-by: Dominique Martinet <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 27a58a19bd20a7afe369da2ce6d4ebea70768acd
Author: Michael Walle <[email protected]>
Date:   Fri Jun 21 14:09:29 2024 +0200

    mtd: spi-nor: winbond: fix w25q128 regression

    commit d35df77707bf5ae1221b5ba1c8a88cf4fcdd4901 upstream.

    Commit 83e824a4a595 ("mtd: spi-nor: Correct flags for Winbond w25q128")
    removed the flags for non-SFDP devices. It was assumed that it wasn't in
    use anymore. This wasn't true. Add the no_sfdp_flags as well as the size
    again.

    We add the additional flags for dual and quad read because they have
    been reported to work properly by Hartmut using both older and newer
    versions of this flash, the similar flashes with 64Mbit and 256Mbit
    already have these flags and because it will (luckily) trigger our
    legacy SFDP parsing, so newer versions with SFDP support will still get
    the parameters from the SFDP tables.

    Reported-by: Hartmut Birr <[email protected]>
    Closes: https://lore.kernel.org/r/CALxbwRo_-9CaJmt7r7ELgu+vOcgk=xZcGHobnKf=oT2=u4d4aA@mail.gmail.com/
    Fixes: 83e824a4a595 ("mtd: spi-nor: Correct flags for Winbond w25q128")
    Reviewed-by: Linus Walleij <[email protected]>
    Signed-off-by: Michael Walle <[email protected]>
    Acked-by: Tudor Ambarus <[email protected]>
    Reviewed-by: Esben Haabendal <[email protected]>
    Reviewed-by: Pratyush Yadav <[email protected]>
    Signed-off-by: Pratyush Yadav <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Link: https://lore.kernel.org/r/[email protected]
    [Backported to v6.6 - vastly different due to upstream changes]
    Reviewed-by: Tudor Ambarus <[email protected]>
    Signed-off-by: Linus Walleij <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 3d544942c0010feedc048b048ee0c35d2d921100
Author: David Hildenbrand <[email protected]>
Date:   Fri Oct 11 12:24:45 2024 +0200

    mm: don't install PMD mappings when THPs are disabled by the hw/process/vma

    commit 2b0f922323ccfa76219bcaacd35cd50aeaa13592 upstream.

    We (or rather, readahead logic :) ) might be allocating a THP in the
    pagecache and then try mapping it into a process that explicitly disabled
    THP: we might end up installing PMD mappings.

    This is a problem for s390x KVM, which explicitly remaps all PMD-mapped
    THPs to be PTE-mapped in s390_enable_sie()->thp_split_mm(), before
    starting the VM.

    For example, starting a VM backed on a file system with large folios
    supported makes the VM crash when the VM tries accessing such a mapping
    using KVM.

    Is it also a problem when the HW disabled THP using
    TRANSPARENT_HUGEPAGE_UNSUPPORTED?  At least on x86 this would be the case
    without X86_FEATURE_PSE.

    In the future, we might be able to do better on s390x and only disallow
    PMD mappings -- what s390x and likely TRANSPARENT_HUGEPAGE_UNSUPPORTED
    really wants.  For now, fix it by essentially performing the same check as
    would be done in __thp_vma_allowable_orders() or in shmem code, where this
    works as expected, and disallow PMD mappings, making us fall back to PTE
    mappings.

    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 793917d997df ("mm/readahead: Add large folio readahead")
    Signed-off-by: David Hildenbrand <[email protected]>
    Reported-by: Leo Fu <[email protected]>
    Tested-by: Thomas Huth <[email protected]>
    Cc: Thomas Huth <[email protected]>
    Cc: Matthew Wilcox (Oracle) <[email protected]>
    Cc: Ryan Roberts <[email protected]>
    Cc: Christian Borntraeger <[email protected]>
    Cc: Janosch Frank <[email protected]>
    Cc: Claudio Imbrenda <[email protected]>
    Cc: Hugh Dickins <[email protected]>
    Cc: Kefeng Wang <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: David Hildenbrand <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 02ec4b3bba49e8d3abb25a3feba6875cae12da92
Author: Kefeng Wang <[email protected]>
Date:   Fri Oct 11 12:24:44 2024 +0200

    mm: huge_memory: add vma_thp_disabled() and thp_disabled_by_hw()

    commit 963756aac1f011d904ddd9548ae82286d3a91f96 upstream.

    Patch series "mm: don't install PMD mappings when THPs are disabled by the
    hw/process/vma".

    During testing, it was found that we can get PMD mappings in processes
    where THP (and more precisely, PMD mappings) are supposed to be disabled.
    While it works as expected for anon+shmem, the pagecache is the
    problematic bit.

    For s390 KVM this currently means that a VM backed by a file located on a
    filesystem with large folio support can crash when KVM tries accessing the
    problematic page, because the readahead logic might decide to use a
    PMD-sized THP and faulting it into the page tables will install a PMD
    mapping, something that s390 KVM cannot tolerate.

    This might also be a problem with HW that does not support PMD mappings,
    but I did not try reproducing it.

    Fix it by respecting the ways to disable THPs when deciding whether we can
    install a PMD mapping.  khugepaged should already be taking care of not
    collapsing if THPs are effectively disabled for the hw/process/vma.

    This patch (of 2):

    Add vma_thp_disabled() and thp_disabled_by_hw() helpers to be shared by
    shmem_allowable_huge_orders() and __thp_vma_allowable_orders().

    [[email protected]: rename to vma_thp_disabled(), split out thp_disabled_by_hw() ]
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 793917d997df ("mm/readahead: Add large folio readahead")
    Signed-off-by: Kefeng Wang <[email protected]>
    Signed-off-by: David Hildenbrand <[email protected]>
    Reported-by: Leo Fu <[email protected]>
    Tested-by: Thomas Huth <[email protected]>
    Reviewed-by: Ryan Roberts <[email protected]>
    Cc: Boqiao Fu <[email protected]>
    Cc: Christian Borntraeger <[email protected]>
    Cc: Claudio Imbrenda <[email protected]>
    Cc: Hugh Dickins <[email protected]>
    Cc: Janosch Frank <[email protected]>
    Cc: Matthew Wilcox <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: David Hildenbrand <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit fc621e7a043de346c33bd7ae7e2e0c651d6152ef
Author: Johannes Berg <[email protected]>
Date:   Wed Oct 23 09:17:44 2024 +0200

    wifi: iwlwifi: mvm: fix 6 GHz scan construction

    commit 7245012f0f496162dd95d888ed2ceb5a35170f1a upstream.

    If more than 255 colocated APs exist for the set of all
    APs found during 2.4/5 GHz scanning, then the 6 GHz scan
    construction will loop forever: the loop variable has
    type u8, so it can never reach the count, which is stored
    in a u32 variable, once that count is bigger than 255.
    Also move the loop variable into the loops to give it a smaller scope.

    Using a u32 there is fine, we limit the number of APs in
    the scan list and each has a limit on the number of RNR
    entries due to the frame size. With a limit of 1000 scan
    results, a frame size upper bound of 4096 (really it's
    more like ~2300) and a TBTT entry size of at least 11,
    we get an upper bound for the number of entries of ~372k,
    well within the bounds of a u32.
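
    Both points can be checked with a small userspace sketch. The helper
    names below are made up for illustration and the loop shape
    (a counter compared against a u32 count) is an assumption about the
    driver code, which is not quoted here.

```c
#include <stdint.h>

/*
 * Count the iterations of a "for (i = 0; i < n; i++)" style loop whose
 * counter is a u8. Returns -1 as soon as the counter wraps from 255 back
 * to 0 without having reached n, which in the real loop means it would
 * spin forever.
 */
static long iterations_with_u8(uint32_t n)
{
    uint8_t i = 0;
    long steps = 0;

    while (i < n) {
        i++;
        steps++;
        if (i == 0)     /* wrapped before reaching n */
            return -1;
    }
    return steps;
}

/* The same loop with a u32 counter terminates for any u32 count. */
static long iterations_with_u32(uint32_t n)
{
    long steps = 0;

    for (uint32_t i = 0; i < n; i++)
        steps++;
    return steps;
}

/* Upper bound on entries: scan results times whole entries per frame. */
static uint32_t max_rnr_entries(uint32_t scan_results, uint32_t frame_size,
                                uint32_t entry_size)
{
    return scan_results * (frame_size / entry_size);
}
```

    iterations_with_u8(256) reports the wrap, while iterations_with_u32()
    handles the same count. max_rnr_entries(1000, 4096, 11) evaluates to
    372000, matching the ~372k estimate and far below the u32 maximum of
    4294967295.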

    Cc: [email protected]
    Fixes: eae94cf82d74 ("iwlwifi: mvm: add support for 6GHz")
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219375
    Link: https://patch.msgid.link/20241023091744.f4baed5c08a1.I8b417148bbc8c5d11c101e1b8f5bf372e17bf2a7@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit f2f1fa446676c21edb777e6d2bc4fa8f956fab68
Author: Ryusuke Konishi <[email protected]>
Date:   Fri Oct 18 04:33:10 2024 +0900

    nilfs2: fix kernel bug due to missing clearing of checked flag

    commit 41e192ad2779cae0102879612dfe46726e4396aa upstream.

    Syzbot reported that in directory operations after nilfs2 detects
    filesystem corruption and degrades to read-only,
    __block_write_begin_int(), which is called to prepare block writes, may
    fail the BUG_ON check for accesses exceeding the folio/page size,
    triggering a kernel bug.

    This was found to be because the "checked" flag of a page/folio was not
    cleared when it was discarded by nilfs2's own routine, which causes the
    sanity check of directory entries to be skipped when the directory
    page/folio is reloaded.  So, fix that.

    This was necessary when the use of nilfs2's own page discard routine was
    applied to more than just metadata files.

    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 8c26c4e2694a ("nilfs2: fix issue with flush kernel thread after remount in RO mode because of driver's internal error or metadata corruption")
    Signed-off-by: Ryusuke Konishi <[email protected]>
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=d6ca2daf692c7a82f959
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit a53c2d847627b790fb3bd8b00e02c247941b17e0
Author: Zong-Zhe Yang <[email protected]>
Date:   Mon Jun 17 19:52:17 2024 +0800

    wifi: mac80211: fix NULL dereference at band check in starting tx ba session

    commit 021d53a3d87eeb9dbba524ac515651242a2a7e3b upstream.

    In MLD connection, link_data/link_conf are dynamically allocated. They
    don't point to vif->bss_conf. So, there will be no chanreq assigned to
    vif->bss_conf and then the chan will be NULL. Tweak the code to check
    ht_supported/vht_supported/has_he/has_eht on sta deflink.

    Crash log (with rtw89 version under MLO development):
    [ 9890.526087] BUG: kernel NULL pointer dereference, address: 0000000000000000
    [ 9890.526102] #PF: supervisor read access in kernel mode
    [ 9890.526105] #PF: error_code(0x0000) - not-present page
    [ 9890.526109] PGD 0 P4D 0
    [ 9890.526114] Oops: 0000 [#1] PREEMPT SMP PTI
    [ 9890.526119] CPU: 2 PID: 6367 Comm: kworker/u16:2 Kdump: loaded Tainted: G           OE      6.9.0 #1
    [ 9890.526123] Hardware name: LENOVO 2356AD1/2356AD1, BIOS G7ETB3WW (2.73 ) 11/28/2018
    [ 9890.526126] Workqueue: phy2 rtw89_core_ba_work [rtw89_core]
    [ 9890.526203] RIP: 0010:ieee80211_start_tx_ba_session (net/mac80211/agg-tx.c:618 (discriminator 1)) mac80211
    [ 9890.526279] Code: f7 e8 d5 93 3e ea 48 83 c4 28 89 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc 49 8b 84 24 e0 f1 ff ff 48 8b 80 90 1b 00 00 <83> 38 03 0f 84 37 fe ff ff bb ea ff ff ff eb cc 49 8b 84 24 10 f3
    All code
    ========
       0:	f7 e8                	imul   %eax
       2:	d5                   	(bad)
       3:	93                   	xchg   %eax,%ebx
       4:	3e ea                	ds (bad)
       6:	48 83 c4 28          	add    $0x28,%rsp
       a:	89 d8                	mov    %ebx,%eax
       c:	5b                   	pop    %rbx
       d:	41 5c                	pop    %r12
       f:	41 5d                	pop    %r13
      11:	41 5e                	pop    %r14
      13:	41 5f                	pop    %r15
      15:	5d                   	pop    %rbp
      16:	c3                   	retq
      17:	cc                   	int3
      18:	cc                   	int3
      19:	cc                   	int3
      1a:	cc                   	int3
      1b:	49 8b 84 24 e0 f1 ff 	mov    -0xe20(%r12),%rax
      22:	ff
      23:	48 8b 80 90 1b 00 00 	mov    0x1b90(%rax),%rax
      2a:*	83 38 03             	cmpl   $0x3,(%rax)		<-- trapping instruction
      2d:	0f 84 37 fe ff ff    	je     0xfffffffffffffe6a
      33:	bb ea ff ff ff       	mov    $0xffffffea,%ebx
      38:	eb cc                	jmp    0x6
      3a:	49                   	rex.WB
      3b:	8b                   	.byte 0x8b
      3c:	84 24 10             	test   %ah,(%rax,%rdx,1)
      3f:	f3                   	repz

    Code starting with the faulting instruction
    ===========================================
       0:	83 38 03             	cmpl   $0x3,(%rax)
       3:	0f 84 37 fe ff ff    	je     0xfffffffffffffe40
       9:	bb ea ff ff ff       	mov    $0xffffffea,%ebx
       e:	eb cc                	jmp    0xffffffffffffffdc
      10:	49                   	rex.WB
      11:	8b                   	.byte 0x8b
      12:	84 24 10             	test   %ah,(%rax,%rdx,1)
      15:	f3                   	repz
    [ 9890.526285] RSP: 0018:ffffb8db09013d68 EFLAGS: 00010246
    [ 9890.526291] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff9308e0d656c8
    [ 9890.526295] RDX: 0000000000000000 RSI: ffffffffab99460b RDI: ffffffffab9a7685
    [ 9890.526300] RBP: ffffb8db09013db8 R08: 0000000000000000 R09: 0000000000000873
    [ 9890.526304] R10: ffff9308e0d64800 R11: 0000000000000002 R12: ffff9308e5ff6e70
    [ 9890.526308] R13: ffff930952500e20 R14: ffff9309192a8c00 R15: 0000000000000000
    [ 9890.526313] FS:  0000000000000000(0000) GS:ffff930b4e700000(0000) knlGS:0000000000000000
    [ 9890.526316] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 9890.526318] CR2: 0000000000000000 CR3: 0000000391c58005 CR4: 00000000001706f0
    [ 9890.526321] Call Trace:
    [ 9890.526324]  <TASK>
    [ 9890.526327] ? show_regs (arch/x86/kernel/dumpstack.c:479)
    [ 9890.526335] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
    [ 9890.526340] ? page_fault_oops (arch/x86/mm/fault.c:713)
    [ 9890.526347] ? search_module_extables (kernel/module/main.c:3256 (discriminator 3))
    [ 9890.526353] ? ieee80211_start_tx_ba_session (net/mac80211/agg-tx.c:618 (discriminator 1)) mac80211

    Signed-off-by: Zong-Zhe Yang <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Xiangyu Chen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 6a91a5816b289018e0b42a25444c0b4f8c637dca
Author: Pavel Begunkov <[email protected]>
Date:   Wed Apr 10 02:26:54 2024 +0100

    io_uring: always lock __io_cqring_overflow_flush

    commit 8d09a88ef9d3cb7d21d45c39b7b7c31298d23998 upstream.

    Conditional locking is never great, and in the case of
    __io_cqring_overflow_flush(), which is a slow path, it's not justified.
    Don't handle IOPOLL separately; always grab uring_lock for overflow
    flushing.

    Signed-off-by: Pavel Begunkov <[email protected]>
    Link: https://lore.kernel.org/r/162947df299aa12693ac4b305dacedab32ec7976.1712708261.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit e3fb0e6afcc399660770428a35162b4880e2e14e
Author: Haibo Chen <[email protected]>
Date:   Thu Sep 5 17:43:38 2024 +0800

    arm64: dts: imx8ulp: correct the flexspi compatible string

    commit 409dc5196d5b6eb67468a06bf4d2d07d7225a67b upstream.

    The flexspi on imx8ulp only has 16 LUTs, while the imx8mm flexspi has
    32 LUTs, so correct the compatible string here; otherwise the
    following warning is triggered:

    [    1.119072] ------------[ cut here ]------------
    [    1.123926] WARNING: CPU: 0 PID: 1 at drivers/spi/spi-nxp-fspi.c:855 nxp_fspi_exec_op+0xb04/0xb64
    [    1.133239] Modules linked in:
    [    1.136448] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-rc6-next-20240902-00001-g131bf9439dd9 #69
    [    1.146821] Hardware name: NXP i.MX8ULP EVK (DT)
    [    1.151647] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [    1.158931] pc : nxp_fspi_exec_op+0xb04/0xb64
    [    1.163496] lr : nxp_fspi_exec_op+0xa34/0xb64
    [    1.168060] sp : ffff80008002b2a0
    [    1.171526] x29: ffff80008002b2d0 x28: 0000000000000000 x27: 0000000000000000
    [    1.179002] x26: ffff2eb645542580 x25: ffff800080610014 x24: ffff800080610000
    [    1.186480] x23: ffff2eb645548080 x22: 0000000000000006 x21: ffff2eb6455425e0
    [    1.193956] x20: 0000000000000000 x19: ffff80008002b5e0 x18: ffffffffffffffff
    [    1.201432] x17: ffff2eb644467508 x16: 0000000000000138 x15: 0000000000000002
    [    1.208907] x14: 0000000000000000 x13: ffff2eb6400d8080 x12: 00000000ffffff00
    [    1.216378] x11: 0000000000000000 x10: ffff2eb6400d8080 x9 : ffff2eb697adca80
    [    1.223850] x8 : ffff2eb697ad3cc0 x7 : 0000000100000000 x6 : 0000000000000001
    [    1.231324] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000000007a6
    [    1.238795] x2 : 0000000000000000 x1 : 00000000000001ce x0 : 00000000ffffff92
    [    1.246267] Call trace:
    [    1.248824]  nxp_fspi_exec_op+0xb04/0xb64
    [    1.253031]  spi_mem_exec_op+0x3a0/0x430
    [    1.257139]  spi_nor_read_id+0x80/0xcc
    [    1.261065]  spi_nor_scan+0x1ec/0xf10
    [    1.264901]  spi_nor_probe+0x108/0x2fc
    [    1.268828]  spi_mem_probe+0x6c/0xbc
    [    1.272574]  spi_probe+0x84/0xe4
    [    1.275958]  really_probe+0xbc/0x29c
    [    1.279713]  __driver_probe_device+0x78/0x12c
    [    1.284277]  driver_probe_device+0xd8/0x15c
    [    1.288660]  __device_attach_driver+0xb8/0x134
    [    1.293316]  bus_for_each_drv+0x88/0xe8
    [    1.297337]  __device_attach+0xa0/0x190
    [    1.301353]  device_initial_probe+0x14/0x20
    [    1.305734]  bus_probe_device+0xac/0xb0
    [    1.309752]  device_add+0x5d0/0x790
    [    1.313408]  __spi_add_device+0x134/0x204
    [    1.317606]  of_register_spi_device+0x3b4/0x590
    [    1.322348]  spi_register_controller+0x47c/0x754
    [    1.327181]  devm_spi_register_controller+0x4c/0xa4
    [    1.332289]  nxp_fspi_probe+0x1cc/0x2b0
    [    1.336307]  platform_probe+0x68/0xc4
    [    1.340145]  really_probe+0xbc/0x29c
    [    1.343893]  __driver_probe_device+0x78/0x12c
    [    1.348457]  driver_probe_device+0xd8/0x15c
    [    1.352838]  __driver_attach+0x90/0x19c
    [    1.356857]  bus_for_each_dev+0x7c/0xdc
    [    1.360877]  driver_attach+0x24/0x30
    [    1.364624]  bus_add_driver+0xe4/0x208
    [    1.368552]  driver_register+0x5c/0x124
    [    1.372573]  __platform_driver_register+0x28/0x34
    [    1.377497]  nxp_fspi_driver_init+0x1c/0x28
    [    1.381888]  do_one_initcall+0x80/0x1c8
    [    1.385908]  kernel_init_freeable+0x1c4/0x28c
    [    1.390472]  kernel_init+0x20/0x1d8
    [    1.394138]  ret_from_fork+0x10/0x20
    [    1.397885] ---[ end trace 0000000000000000 ]---
    [    1.407908] ------------[ cut here ]------------

    Fixes: ef89fd56bdfc ("arm64: dts: imx8ulp: add flexspi node")
    Cc: [email protected]
    Signed-off-by: Haibo Chen <[email protected]>
    Signed-off-by: Shawn Guo <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 1a49b96c51063d38be296a0c1537928a06f02d6e
Author: Gregory Price <[email protected]>
Date:   Fri Oct 25 10:17:24 2024 -0400

    vmscan,migrate: fix page count imbalance on node stats when demoting pages

    [ Upstream commit 35e41024c4c2b02ef8207f61b9004f6956cf037b ]

    When NUMA balancing is enabled with demotion, vmscan will call
    migrate_pages when shrinking LRUs.  migrate_pages will decrement the
    node's isolated page count, leading to an imbalanced count when
    invoked from (MG)LRU code.

    The result is dmesg output like such:

    $ cat /proc/sys/vm/stat_refresh

    [77383.088417] vmstat_refresh: nr_isolated_anon -103212
    [77383.088417] vmstat_refresh: nr_isolated_file -899642

    This negative value may impact compaction and reclaim throttling.

    The following path produces the decrement:

    shrink_folio_list
      demote_folio_list
        migrate_pages
          migrate_pages_batch
            migrate_folio_move
              migrate_folio_done
                mod_node_page_state(-ve) <- decrement

    This path happens for SUCCESSFUL migrations, not failures.  Typically
    callers to migrate_pages are required to handle putback/accounting for
    failures, but this is already handled in the shrink code.

    Instead, when accounting for migrations, do not decrement the count
    when the migration reason is MR_DEMOTION.  As of v6.11, this demotion
    logic is the only source of MR_DEMOTION.
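
    The accounting rule described above can be sketched in userspace C.
    This is a minimal model, not the kernel code: every name ending in
    _sketch is hypothetical, and the single counter stands in for the
    node's nr_isolated_anon/nr_isolated_file statistics.

```c
#include <assert.h>

/* Hypothetical stand-ins for two of the kernel's migrate_reason values. */
enum mr_sketch { MR_OTHER_SKETCH, MR_DEMOTION_SKETCH };

/* Hypothetical per-node counter standing in for nr_isolated_anon/file. */
struct node_sketch {
    long nr_isolated;
};

/* Isolation step: pages taken off the LRU bump the isolated count. */
static void isolate_sketch(struct node_sketch *node, long nr)
{
    assert(nr >= 0);
    node->nr_isolated += nr;
}

/*
 * Accounting after a successful migration. For demotion, the reclaim
 * caller already drops the isolated count itself, so decrementing here
 * as well would count the same pages twice.
 */
static void migrate_done_sketch(struct node_sketch *node, long nr,
                                enum mr_sketch reason)
{
    if (reason != MR_DEMOTION_SKETCH)
        node->nr_isolated -= nr;
}

/* Reclaim-side accounting performed by the shrink path in this model. */
static void shrink_account_sketch(struct node_sketch *node, long nr)
{
    node->nr_isolated -= nr;
}
```

    In this model, isolating 10 pages, completing a demotion and then
    running the shrink-side accounting leaves the counter at 0.  Without
    the reason check the same sequence would end at -10, which matches
    the kind of negative value shown in the dmesg output.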

    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 26aa2d199d6f ("mm/migrate: demote pages during reclaim")
    Signed-off-by: Gregory Price <[email protected]>
    Reviewed-by: Yang Shi <[email protected]>
    Reviewed-by: Davidlohr Bueso <[email protected]>
    Reviewed-by: Shakeel Butt <[email protected]>
    Reviewed-by: "Huang, Ying" <[email protected]>
    Reviewed-by: Oscar Salvador <[email protected]>
    Cc: Dave Hansen <[email protected]>
    Cc: Wei Xu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 003d2996964c03dfd34860500428f4cdf1f5879e
Author: Jens Axboe <[email protected]>
Date:   Thu Oct 31 08:05:44 2024 -0600

    io_uring/rw: fix missing NOWAIT check for O_DIRECT start write

    [ Upstream commit 1d60d74e852647255bd8e76f5a22dc42531e4389 ]

    When io_uring starts a write, it'll call kiocb_start_write() to bump the
    super block rwsem, preventing any freezes from happening while that
    write is in-flight. The freeze side will grab that rwsem for writing,
    excluding any new writers from happening and waiting for existing writes
    to finish. But io_uring unconditionally uses kiocb_start_write(), which
    will block if someone is currently attempting to freeze the mount point.
    This causes a deadlock where freeze is waiting for previous writes to
    complete, but the previous writes cannot complete, as the task that is
    supposed to complete them is blocked waiting on starting a new write.
    This results in the following stuck trace showing that dependency with
    the write blocked starting a new write:

    task:fio             state:D stack:0     pid:886   tgid:886   ppid:876
    Call trace:
     __switch_to+0x1d8/0x348
     __schedule+0x8e8/0x2248
     schedule+0x110/0x3f0
     percpu_rwsem_wait+0x1e8/0x3f8
     __percpu_down_read+0xe8/0x500
     io_write+0xbb8/0xff8
     io_issue_sqe+0x10c/0x1020
     io_submit_sqes+0x614/0x2110
     __arm64_sys_io_uring_enter+0x524/0x1038
     invoke_syscall+0x74/0x268
     el0_svc_common.constprop.0+0x160/0x238
     do_el0_svc+0x44/0x60
     el0_svc+0x44/0xb0
     el0t_64_sync_handler+0x118/0x128
     el0t_64_sync+0x168/0x170
    INFO: task fsfreeze:7364 blocked for more than 15 seconds.
          Not tainted 6.12.0-rc5-00063-g76aaf945701c #7963

    with the attempting freezer stuck trying to grab the rwsem:

    task:fsfreeze        state:D stack:0     pid:7364  tgid:7364  ppid:995
    Call trace:
     __switch_to+0x1d8/0x348
     __schedule+0x8e8/0x2248
     schedule+0x110/0x3f0
     percpu_down_write+0x2b0/0x680
     freeze_super+0x248/0x8a8
     do_vfs_ioctl+0x149c/0x1b18
     __arm64_sys_ioctl+0xd0/0x1a0
     invoke_syscall+0x74/0x268
     el0_svc_common.constprop.0+0x160/0x238
     do_el0_svc+0x44/0x60
     el0_svc+0x44/0xb0
     el0t_64_sync_handler+0x118/0x128
     el0t_64_sync+0x168/0x170

    Fix this by having the io_uring side honor IOCB_NOWAIT, and only attempt a
    blocking grab of the super block rwsem if it isn't set. For normal issue
    where IOCB_NOWAIT would always be set, this returns -EAGAIN which will
    have io_uring core issue a blocking attempt of the write. That will in
    turn also get completions run, ensuring forward progress.
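
    The NOWAIT behaviour described above can be illustrated with a small
    userspace model.  This is a sketch under stated assumptions, not the
    io_uring code: the lock is a plain C11 atomic counter rather than the
    super block's percpu rwsem, and all *_sketch names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical freeze lock: state == -1 means a freezer holds it for
 * writing; state == n >= 0 means n writes are currently in flight.
 */
struct freeze_lock_sketch {
    atomic_int state;
};

/* Non-blocking attempt to register one in-flight write. */
static bool try_start_sketch(struct freeze_lock_sketch *l)
{
    int cur = atomic_load(&l->state);

    while (cur >= 0) {
        /* On failure, cur is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&l->state, &cur, cur + 1))
            return true;
    }
    return false; /* a freeze is pending or in progress */
}

/*
 * Start a write. With nowait set, never wait for the freezer: report
 * -EAGAIN so the caller can retry from a context that may block.
 */
static int start_write_sketch(struct freeze_lock_sketch *l, bool nowait)
{
    if (nowait)
        return try_start_sketch(l) ? 0 : -EAGAIN;

    while (!try_start_sketch(l))
        ; /* blocking path: the kernel would sleep here, the model spins */
    return 0;
}

/* Finish a write and drop the in-flight count. */
static void end_write_sketch(struct freeze_lock_sketch *l)
{
    int prev = atomic_fetch_sub(&l->state, 1);

    assert(prev > 0);
    (void)prev;
}
```

    In this model a caller that passes nowait while the lock is held by a
    freezer gets -EAGAIN back immediately instead of waiting, which is the
    property the fix relies on to keep completions running.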

    Since freezing requires CAP_SYS_ADMIN in the first place, this isn't
    something that can be triggered by a regular user.

    Cc: [email protected] # 5.10+
    Reported-by: Peter Mann <[email protected]>
    Link: https://lore.kernel.org/io-uring/[email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 70bbe8d0a949413df1bb6532fd6b19fbf0f88feb
Author: Andrey Konovalov <[email protected]>
Date:   Tue Oct 22 18:07:06 2024 +0200

    kasan: remove vmalloc_percpu test

    [ Upstream commit 330d8df81f3673d6fb74550bbc9bb159d81b35f7 ]

    Commit 1a2473f0cbc0 ("kasan: improve vmalloc tests") added the
    vmalloc_percpu KASAN test with the assumption that __alloc_percpu always
    uses vmalloc internally, which is tagged by KASAN.

    However, __alloc_percpu might allocate memory from the first per-CPU
    chunk, which is not allocated via vmalloc().  As a result, the test might
    fail.

    Remove the test until proper KASAN annotations for per-CPU allocations
    are added; tracked in https://bugzilla.kernel.org/show_bug.cgi?id=215019.

    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 1a2473f0cbc0 ("kasan: improve vmalloc tests")
    Signed-off-by: Andrey Konovalov <[email protected]>
    Reported-by: Samuel Holland <[email protected]>
    Link: https://lore.kernel.org/all/[email protected]/
    Reported-by: Sabyrzhan Tasbolatov <[email protected]>
    Link: https://lore.kernel.org/all/CACzwLxiWzNqPBp4C1VkaXZ2wDwvY3yZeetCi1TLGFipKW77drA@mail.gmail.com/
    Cc: Alexander Potapenko <[email protected]>
    Cc: Andrey Ryabinin <[email protected]>
    Cc: Dmitry Vyukov <[email protected]>
    Cc: Marco Elver <[email protected]>
    Cc: Sabyrzhan Tasbolatov <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit c60af16e1d6cc2237d58336546d6adfc067b6b8f
Author: Vitaliy Shevtsov <[email protected]>
Date:   Mon Sep 16 22:41:37 2024 +0500

    nvmet-auth: assign dh_key to NULL after kfree_sensitive

    [ Upstream commit d2f551b1f72b4c508ab9298419f6feadc3b5d791 ]

    ctrl->dh_key might be used across multiple calls to nvmet_setup_dhgroup()
    for the same controller. So it's better to nullify it after release on
    error path in order to avoid double free later in nvmet_destroy_auth().
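
    The free-then-clear pattern can be sketched in userspace C as follows.
    This is an illustration only: struct ctrl_sketch and every *_sketch
    function are hypothetical stand-ins, free_sensitive_sketch() merely
    imitates the wipe-then-free behaviour of kfree_sensitive(), and the
    fail_keygen flag is an assumption used to force the error path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical controller state holding a DH key across calls. */
struct ctrl_sketch {
    unsigned char *dh_key;
    size_t dh_keysize;
};

/* Stand-in for kfree_sensitive(): wipe, then free; NULL is a no-op. */
static void free_sensitive_sketch(void *p, size_t len)
{
    volatile unsigned char *b = p;

    if (!p)
        return;
    while (len--)
        *b++ = 0;
    free(p);
}

/* Set up a key; fail_keygen simulates a failure after allocation. */
static int setup_dhgroup_sketch(struct ctrl_sketch *ctrl, size_t keysize,
                                bool fail_keygen)
{
    assert(keysize > 0);

    /* Drop any key left over from an earlier call. */
    free_sensitive_sketch(ctrl->dh_key, ctrl->dh_keysize);
    ctrl->dh_key = NULL;
    ctrl->dh_keysize = 0;

    ctrl->dh_key = calloc(1, keysize);
    if (!ctrl->dh_key)
        return -ENOMEM;
    ctrl->dh_keysize = keysize;

    if (fail_keygen) {
        free_sensitive_sketch(ctrl->dh_key, ctrl->dh_keysize);
        ctrl->dh_key = NULL; /* the fix: leave no dangling pointer */
        ctrl->dh_keysize = 0;
        return -EINVAL;
    }
    return 0;
}

/* Teardown: safe even after a failed setup, since dh_key is NULL then. */
static void destroy_auth_sketch(struct ctrl_sketch *ctrl)
{
    free_sensitive_sketch(ctrl->dh_key, ctrl->dh_keysize);
    ctrl->dh_key = NULL;
    ctrl->dh_keysize = 0;
}
```

    If the error path freed the key without clearing the pointer, the
    later teardown call in this model would free the same buffer a second
    time.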

    Found by Linux Verification Center (linuxtesting.org) with Svace.

    Fixes: 7a277c37d352 ("nvmet-auth: Diffie-Hellman key exchange support")
    Cc: [email protected]
    Signed-off-by: Vitaliy Shevtsov <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Reviewed-by: Hannes Reinecke <[email protected]>
    Signed-off-by: Keith Busch <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 4a39320977f9c665faa37efaa8093b8e82dd8c41
Author: Christoffer Sandberg <[email protected]>
Date:   Tue Oct 29 16:16:53 2024 +0100

    ALSA: hda/realtek: Fix headset mic on TUXEDO Stellaris 16 Gen6 mb1

    [ Upstream commit e49370d769e71456db3fbd982e95bab8c69f73e8 ]

    Quirk is needed to enable headset microphone on missing pin 0x19.

    Signed-off-by: Christoffer Sandberg <[email protected]>
    Signed-off-by: Werner Sembach <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit b42adef85aca72b51eab1a812a79913ff5aeb584
Author: Christoffer Sandberg <[email protected]>
Date:   Tue Oct 29 16:16:52 2024 +0100

    ALSA: hda/realtek: Fix headset mic on TUXEDO Gemini 17 Gen3

    [ Upstream commit 0b04fbe886b4274c8e5855011233aaa69fec6e75 ]

    Quirk is needed to enable headset microphone on missing pin 0x19.

    Signed-off-by: Christoffer Sandberg <[email protected]>
    Signed-off-by: Werner Sembach <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 77ddc732416b017180893cbb2356e9f0a414c575
Author: Christoph Hellwig <[email protected]>
Date:   Wed Oct 23 15:37:22 2024 +0200

    xfs: fix finding a last resort AG in xfs_filestream_pick_ag

    [ Upstream commit dc60992ce76fbc2f71c2674f435ff6bde2108028 ]

    When the main loop in xfs_filestream_pick_ag fails to find a suitable
    AG it tries to just pick the online AG.  But the loop for that uses
    args->pag as loop iterator while the later code expects pag to be
    set.  Fix this by reusing the max_pag case for this last resort, and
    also add a check for the impossible case of no AG, just to make sure
    that the uninitialized pag doesn't escape even in theory.

    Reported-by: [email protected]
    Signed-off-by: Christoph Hellwig <[email protected]>
    Tested-by: [email protected]
    Fixes: f8f1ed1ab3baba ("xfs: return a referenced perag from filestreams allocator")
    Cc: <[email protected]> # v6.3
    Reviewed-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Carlos Maiolino <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 8e886e44397ba89f6e8da8471386112b4f5b67b7
Author: Matt Johnston <[email protected]>
Date:   Tue Oct 22 18:25:14 2024 +0800

    mctp i2c: handle NULL header address

    [ Upstream commit 01e215975fd80af81b5b79f009d49ddd35976c13 ]

    daddr can be NULL if there is no neighbour table entry present;
    in that case the tx packet should be dropped.

    saddr will usually be set by MCTP core, but check for NULL in case a
    packet is transmitted by a different protocol.

    Fixes: f5b8abf9fc3d ("mctp i2c: MCTP I2C binding driver")
    Cc: [email protected]
    Reported-by: Dung Cao <[email protected]>
    Signed-off-by: Matt Johnston <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/20241022-mctp-i2c-null-dest-v3-1-e929709956c5@codeconstruct.com.au
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 88f97a4b5843ce21c1286e082c02a5fb4d8eb473
Author: Edward Adam Davis <[email protected]>
Date:   Wed Oct 16 19:43:47 2024 +0800

    ocfs2: pass u64 to ocfs2_truncate_inline maybe overflow

    [ Upstream commit bc0a2f3a73fcdac651fca64df39306d1e5ebe3b0 ]

    Syzbot reported a kernel BUG in ocfs2_truncate_inline.  There are two
    reasons for this: first, the parameter value passed is greater than
    ocfs2_max_inline_data_with_xattr, second, the start and end parameters of
    ocfs2_truncate_inline are "unsigned int".

    So, add a sanity check for byte_start and byte_len right before
    ocfs2_truncate_inline() in ocfs2_remove_inode_range(): if either is
    greater than ocfs2_max_inline_data_with_xattr, return -EINVAL.
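
    The reason for checking before the call can be shown with a short
    userspace sketch.  It follows the wording of this message rather than
    the actual patch: the *_sketch functions are hypothetical, and
    max_inline is an assumed parameter standing in for the value of
    ocfs2_max_inline_data_with_xattr.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/*
 * Validate the 64-bit range values while they are still 64 bits wide.
 * Anything larger than the inline data limit is rejected.
 */
static int check_inline_range_sketch(uint64_t byte_start, uint64_t byte_len,
                                     unsigned int max_inline)
{
    if (byte_start > max_inline || byte_len > max_inline)
        return -EINVAL;
    return 0;
}

/* Narrow to unsigned int; only valid once the check has passed. */
static unsigned int narrow_sketch(uint64_t v)
{
    assert(v <= UINT_MAX);
    return (unsigned int)v;
}
```

    With a 32-bit unsigned int, a value such as 0x100000001 would silently
    become 1 if it were narrowed without the check, so the callee would
    see a small, plausible offset instead of the oversized one.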

    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 1afc32b95233 ("ocfs2: Write support for inline data")
    Signed-off-by: Edward Adam Davis <[email protected]>
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=81092778aac03460d6b7
    Reviewed-by: Joseph Qi <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Joseph Qi <[email protected]>
    Cc: Mark Fasheh <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit c117a980185ee3812612e7e453e356a6a4f05305
Author: Sabyrzhan Tasbolatov <[email protected]>
Date:   Wed Oct 16 20:24:07 2024 +0500

    x86/traps: move kmsan check after instrumentation_begin

    [ Upstream commit 1db272864ff250b5e607283eaec819e1186c8e26 ]

    During an x86_64 kernel build with CONFIG_KMSAN, objtool emits the
    following warning:

      AR      built-in.a
      AR      vmlinux.a
      LD      vmlinux.o
    vmlinux.o: warning: objtool: handle_bug+0x4: call to
        kmsan_unpoison_entry_regs() leaves .noinstr.text section
      OBJCOPY modules.builtin.modinfo
      GEN     modules.builtin
      MODPOST Module.symvers
      CC      .vmlinux.export.o

    Moving kmsan_unpoison_entry_regs() _after_ instrumentation_begin() fixes
    the warning.

    The call to decode_bug(regs->ip, &imm) is left before the KMSAN
    unpoisoning because it has an early-return condition: if it is moved
    after instrumentation_begin(), it results in the warning "return with
    instrumentation enabled".  Hence, I'm concerned that regs will not be
    KMSAN unpoisoned if `ud_type == BUG_NONE` is true.

    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: ba54d194f8da ("x86/traps: avoid KMSAN bugs originating from handle_bug()")
    Signed-off-by: Sabyrzhan Tasbolatov <[email protected]>
    Reviewed-by: Alexander Potapenko <[email protected]>
    Cc: Borislav Petkov (AMD) <[email protected]>
    Cc: Dave Hansen <[email protected]>
    Cc: Ingo Molnar <[email protected]>
    Cc: Thomas Gleixner <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 86ee1845cbbf52eff6d41ce438d5f7e9ab6f4602
Author: Gatlin Newhouse <[email protected]>
Date:   Wed Jul 24 00:01:55 2024 +0000

    x86/traps: Enable UBSAN traps on x86

    [ Upstream commit 7424fc6b86c8980a87169e005f5cd4438d18efe6 ]

    Currently ARM64 extracts which specific sanitizer has caused a trap via
    encoded data in the trap instruction. Clang on x86 currently encodes the
    same data in the UD1 instruction but x86 handle_bug() and
    is_valid_bugaddr() currently only look at UD2.

    Bring x86 to parity with ARM64, similar to commit 25b84002afb9 ("arm64:
    Support Clang UBSAN trap codes for better reporting"). See the llvm
    links for information about the code generation.

    Enable the reporting of UBSAN sanitizer details on x86 compiled with clang
    when CONFIG_UBSAN_TRAP=y by analysing UD1 and retrieving the type immediate
    which is encoded by the compiler after the UD1.

    [ tglx: Simplified it by moving the printk() into handle_bug() ]

    Signed-off-by: Gatlin Newhouse <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Acked-by: Peter Zijlstra (Intel) <[email protected]>
    Cc: Kees Cook <[email protected]>
    Link: https://lore.kernel.org/all/[email protected]
    Link: https://github.com/llvm/llvm-project/commit/c5978f42ec8e9#diff-bb68d7cd885f41cfc35843998b0f9f534adb60b415f647109e597ce448e92d9f
    Link: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86InstrSystem.td#L27
    Stable-dep-of: 1db272864ff2 ("x86/traps: move kmsan check after instrumentation_begin")
    Signed-off-by: Sasha Levin <[email protected]>

commit b958948ae1cb3e39c48e9f805436fd652103c71e
Author: Matt Fleming <[email protected]>
Date:   Fri Oct 11 13:07:37 2024 +0100

    mm/page_alloc: let GFP_ATOMIC order-0 allocs access highatomic reserves

    [ Upstream commit 281dd25c1a018261a04d1b8bf41a0674000bfe38 ]

    Under memory pressure it's possible for GFP_ATOMIC order-0 allocations to
    fail even though free pages are available in the highatomic reserves.
    GFP_ATOMIC allocations cannot trigger unreserve_highatomic_pageblock()
    since it's only run from reclaim.

    Given that such allocations will pass the watermarks in
    __zone_watermark_unusable_free(), it makes sense to fallback to highatomic
    reserves the same way that ALLOC_OOM can.

    This fixes order-0 page allocation failures observed on Cloudflare's fleet
    when handling network packets:

      kswapd1: page allocation failure: order:0, mode:0x820(GFP_ATOMIC),
      nodemask=(null),cpuset=/,mems_allowed=0-7
      CPU: 10 PID: 696 Comm: kswapd1 Kdump: loaded Tainted: G           O 6.6.43-CUSTOM #1
      Hardware name: MACHINE
      Call Trace:
       <IRQ>
       dump_stack_lvl+0x3c/0x50
       warn_alloc+0x13a/0x1c0
       __alloc_pages_slowpath.constprop.0+0xc9d/0xd10
       __alloc_pages+0x327/0x340
       __napi_alloc_skb+0x16d/0x1f0
       bnxt_rx_page_skb+0x96/0x1b0 [bnxt_en]
       bnxt_rx_pkt+0x201/0x15e0 [bnxt_en]
       __bnxt_poll_work+0x156/0x2b0 [bnxt_en]
       bnxt_poll+0xd9/0x1c0 [bnxt_en]
       __napi_poll+0x2b/0x1b0
       bpf_trampoline_6442524138+0x7d/0x1000
       __napi_poll+0x5/0x1b0
       net_rx_action+0x342/0x740
       handle_softirqs+0xcf/0x2b0
       irq_exit_rcu+0x6c/0x90
       sysvec_apic_timer_interrupt+0x72/0x90
       </IRQ>

    [[email protected]: update comment]
      Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lore.kernel.org/all/CAGis_TWzSu=P7QJmjD58WWiu3zjMTVKSzdOwWE8ORaGytzWJwQ@mail.gmail.com/
    Fixes: 1d91df85f399 ("mm/page_alloc: handle a missing case for memalloc_nocma_{save/restore} APIs")
    Signed-off-by: Matt Fleming <[email protected]>
    Suggested-by: Vlastimil Babka <[email protected]>
    Reviewed-by: Vlastimil Babka <[email protected]>
    Cc: Mel Gorman <[email protected]>
    Cc: Michal Hocko <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 4882a352b5df897c30f9d64fba340a219a6604d0
Author: Alexander Usyskin <[email protected]>
Date:   Tue Oct 15 15:31:57 2024 +0300

    mei: use kvmalloc for read buffer

    [ Upstream commit 4adf613e01bf99e1739f6ff3e162ad5b7d578d1a ]

    The read buffer is allocated according to the max message size
    reported by the firmware, which may reach 64K in systems with a pxp
    client.  A contiguous 64K allocation may fail under memory pressure.
    The read buffer is used as in-driver message storage and is not
    required to be contiguous.
    Use kvmalloc to allow the kernel to allocate non-contiguous memory.

    Fixes: 3030dc056459 ("mei: add wrapper for queuing control commands.")
    Cc: stable <[email protected]>
    Reported-by: Rohit Agarwal <[email protected]>
    Closes: https://lore.kernel.org/all/[email protected]/
    Tested-by: Brian Geffon <[email protected]>
    Signed-off-by: Alexander Usyskin <[email protected]>
    Acked-by: Tomas Winkler <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit cb8b81ad3e893a6d18dcdd3754cc2ea2a42c0136
Author: Matthieu Baerts (NGI0) <[email protected]>
Date:   Mon Oct 21 12:25:26 2024 +0200

    mptcp: init: protect sched with rcu_read_lock

    [ Upstream commit 3deb12c788c385e17142ce6ec50f769852fcec65 ]

    Enabling CONFIG_PROVE_RCU_LIST with its dependency CONFIG_RCU_EXPERT
    creates this splat when an MPTCP socket is created:

      =============================
      WARNING: suspicious RCU usage
      6.12.0-rc2+ #11 Not tainted
      -----------------------------
      net/mptcp/sched.c:44 RCU-list traversed in non-reader section!!

      other info that might help us debug this:

      rcu_scheduler_active = 2, debug_locks = 1
      no locks held by mptcp_connect/176.

      stack backtrace:
      CPU: 0 UID: 0 PID: 176 Comm: mptcp_connect Not tainted 6.12.0-rc2+ #11
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      Call Trace:
       <TASK>
       dump_stack_lvl (lib/dump_stack.c:123)
       lockdep_rcu_suspicious (kernel/locking/lockdep.c:6822)
       mptcp_sched_find (net/mptcp/sched.c:44 (discriminator 7))
       mptcp_init_sock (net/mptcp/protocol.c:2867 (discriminator 1))
       ? sock_init_data_uid (arch/x86/include/asm/atomic.h:28)
       inet_create.part.0.constprop.0 (net/ipv4/af_inet.c:386)
       ? __sock_create (include/linux/rcupdate.h:347 (discriminator 1))
       __sock_create (net/socket.c:1576)
       __sys_socket (net/socket.c:1671)
       ? __pfx___sys_socket (net/socket.c:1712)
       ? do_user_addr_fault (arch/x86/mm/fault.c:1419 (discriminator 1))
       __x64_sys_socket (net/socket.c:1728)
       do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1))
       entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)

    That's because when the socket is initialised, rcu_read_lock() is not
    used despite the explicit comment written above the declaration of
    mptcp_sched_find() in sched.c. Adding the missing lock/unlock avoids the
    warning.

    Fixes: 1730b2b2c5a5 ("mptcp: add sched in mptcp_sock")
    Cc: [email protected]
    Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/523
    Reviewed-by: Geliang Tang <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 4f7ffa83fa79dd52efbaef366c850aaaae06a469
Author: Hugh Dickins <[email protected]>
Date:   Sun Oct 27 15:23:23 2024 -0700

    iov_iter: fix copy_page_from_iter_atomic() if KMAP_LOCAL_FORCE_MAP

    [ Upstream commit c749d9b7ebbc5716af7a95f7768634b30d9446ec ]

    generic/077 on x86_32 CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP=y with highmem,
    on huge=always tmpfs, issues a warning and then hangs (interruptibly):

    WARNING: CPU: 5 PID: 3517 at mm/highmem.c:622 kunmap_local_indexed+0x62/0xc9
    CPU: 5 UID: 0 PID: 3517 Comm: cp Not tainted 6.12.0-rc4 #2
    ...
    copy_page_from_iter_atomic+0xa6/0x5ec
    generic_perform_write+0xf6/0x1b4
    shmem_file_write_iter+0x54/0x67

    Fix copy_page_from_iter_atomic() by limiting it in that case
    (include/linux/skbuff.h skb_frag_must_loop() does similar).

    But going forward, perhaps CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP is too
    surprising, has outlived its usefulness, and should just be removed?

    Fixes: 908a1ad89466 ("iov_iter: Handle compound highmem pages in copy_page_from_iter_atomic()")
    Signed-off-by: Hugh Dickins <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Christoph Hellwig <[email protected]>
    Cc: [email protected]
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit ade91f6e9848b370add44d89c976e070ccb492ef
Author: Shawn Wang <[email protected]>
Date:   Fri Oct 25 10:22:08 2024 +0800

    sched/numa: Fix the potential null pointer dereference in task_numa_work()

    [ Upstream commit 9c70b2a33cd2aa6a5a59c5523ef053bd42265209 ]

    When running stress-ng-vm-segv test, we found a null pointer dereference
    error in task_numa_work(). Here is the backtrace:

      [323676.066985] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
      ......
      [323676.067108] CPU: 35 PID: 2694524 Comm: stress-ng-vm-se
      ......
      [323676.067113] pstate: 23401009 (nzCv daif +PAN -UAO +TCO +DIT +SSBS BTYPE=--)
      [323676.067115] pc : vma_migratable+0x1c/0xd0
      [323676.067122] lr : task_numa_work+0x1ec/0x4e0
      [323676.067127] sp : ffff8000ada73d20
      [323676.067128] x29: ffff8000ada73d20 x28: 0000000000000000 x27: 000000003e89f010
      [323676.067130] x26: 0000000000080000 x25: ffff800081b5c0d8 x24: ffff800081b27000
      [323676.067133] x23: 0000000000010000 x22: 0000000104d18cc0 x21: ffff0009f7158000
      [323676.067135] x20: 0000000000000000 x19: 0000000000000000 x18: ffff8000ada73db8
      [323676.067138] x17: 0001400000000000 x16: ffff800080df40b0 x15: 0000000000000035
      [323676.067140] x14: ffff8000ada73cc8 x13: 1fffe0017cc72001 x12: ffff8000ada73cc8
      [323676.067142] x11: ffff80008001160c x10: ffff000be639000c x9 : ffff8000800f4ba4
      [323676.067145] x8 : ffff000810375000 x7 : ffff8000ada73974 x6 : 0000000000000001
      [323676.067147] x5 : 0068000b33e26707 x4 : 0000000000000001 x3 : ffff0009f7158000
      [323676.067149] x2 : 0000000000000041 x1 : 0000000000004400 x0 : 0000000000000000
      [323676.067152] Call trace:
      [323676.067153]  vma_migratable+0x1c/0xd0
      [323676.067155]  task_numa_work+0x1ec/0x4e0
      [323676.067157]  task_work_run+0x78/0xd8
      [323676.067161]  do_notify_resume+0x1ec/0x290
      [323676.067163]  el0_svc+0x150/0x160
      [323676.067167]  el0t_64_sync_handler+0xf8/0x128
      [323676.067170]  el0t_64_sync+0x17c/0x180
      [323676.067173] Code: d2888001 910003fd f9000bf3 aa0003f3 (f9401000)
      [323676.067177] SMP: stopping secondary CPUs
      [323676.070184] Starting crashdump kernel...

    stress-ng-vm-segv in stress-ng is used to stress test the SIGSEGV error
    handling function of the system, which tries to cause a SIGSEGV error on
    return from unmapping the whole address space of the child process.

    Normally this program will not cause kernel crashes. But before the
    munmap system call returns to user mode, a potential task_numa_work()
    for numa balancing could be added and executed. In this scenario, since the
    child process has no vma after munmap, the vma_next() in task_numa_work()
    will return a null pointer even if the vma iterator restarts from 0.

    Recheck the vma pointer before dereferencing it in task_numa_work().
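
A minimal userspace sketch of the pattern described above, not the kernel code itself: a lookup that wraps around to address 0 can still come back empty when there are no ranges at all, so the result has to be rechecked after the restart. All names here (`range_model`, `space_model`, `next_range_model`, `pick_scan_start_model`) are hypothetical stand-ins, and the ranges are assumed to be sorted by address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a VMA and an address space (assumption). */
struct range_model { unsigned long start, end; };
struct space_model { const struct range_model *ranges; size_t count; };

/* Return the first range ending above addr, or NULL if there is none. */
static const struct range_model *
next_range_model(const struct space_model *s, unsigned long addr)
{
	for (size_t i = 0; i < s->count; i++)
		if (s->ranges[i].end > addr)
			return &s->ranges[i];
	return NULL;
}

/* Pick where to resume scanning; returns 0 on success, -1 if nothing is mapped. */
int pick_scan_start_model(const struct space_model *s, unsigned long resume,
			  unsigned long *out)
{
	const struct range_model *r;

	assert(s != NULL && out != NULL);

	r = next_range_model(s, resume);
	if (!r)
		r = next_range_model(s, 0);	/* wrap around to the start */

	/* The restart can still fail when the space is empty: recheck. */
	if (!r)
		return -1;

	*out = r->start;
	return 0;
}
```

With an empty `space_model` the function returns -1 instead of dereferencing a null pointer, which is the behaviour the recheck is meant to guarantee.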

    Fixes: 214dbc428137 ("sched: convert to vma iterator")
    Signed-off-by: Shawn Wang <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Cc: [email protected] # v6.2+
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

commit 8c9a1ec39c698cbc38f4efa9113185f885137f8b
Author: Dan Williams <[email protected]>
Date:   Tue Oct 22 18:43:40 2024 -0700

    cxl/acpi: Ensure ports ready at cxl_acpi_probe() return

    [ Upstream commit 48f62d38a07d464a499fa834638afcfd2b68f852 ]

    In order to ensure root CXL ports are enabled upon cxl_acpi_probe()
    when the 'cxl_port' driver is built as a module, arrange for the
    module to be pre-loaded or built-in.

    The "Fixes:" but no "Cc: stable" on this patch reflects that the issue
    is merely by inspection since the bug that triggered the discovery of
    this potential problem [1] is fixed by other means. However, a stable
    backport should do no harm.

    Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver")
    Link: http://lore.kernel.org/[email protected] [1]
    Signed-off-by: Dan Williams <[email protected]>
    Tested-by: Gregory Price <[email protected]>
    Reviewed-by: Jonathan Cameron <[email protected]>
    Reviewed-by: Ira Weiny <[email protected]>
    Link: https://patch.msgid.link/172964781969.81806.17276352414854540808.stgit@dwillia2-xfh.jf.intel.com
    Signed-off-by: Ira Weiny <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit a9ed67f39f888bb6e5729112ad45f15d9c5a3ef8
Author: Dan Williams <[email protected]>
Date:   Tue Oct 22 18:43:32 2024 -0700

    cxl/port: Fix cxl_bus_rescan() vs bus_rescan_devices()

    [ Upstream commit 3d6ebf16438de5d712030fefbb4182b46373d677 ]

    It turns out since its original introduction, pre-2.6.12,
    bus_rescan_devices() has skipped devices that might be in the process of
    attaching or detaching from their driver. For CXL this behavior is
    unwanted; CXL expects cxl_bus_rescan() to act as a probe barrier.

    That behavior is simple enough to achieve with bus_for_each_dev() paired
    with a call to device_attach(), and it is unclear why bus_rescan_devices()
    took the position of lockless consumption of dev->driver which is racy.

    The "Fixes:" but no "Cc: stable" on this patch reflects that the issue
    is merely by inspection since the bug that triggered the discovery of
    this potential problem [1] is fixed by other means.  However, a stable
    backport should do no harm.

    Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver")
    Link: http://lore.kernel.org/[email protected] [1]
    Signed-off-by: Dan Williams <[email protected]>
    Tested-by: Gregory Price <[email protected]>
    Reviewed-by: Jonathan Cameron <[email protected]>
    Reviewed-by: Ira Weiny <[email protected]>
    Link: https://patch.msgid.link/172964781104.81806.4277549800082443769.stgit@dwillia2-xfh.jf.intel.com
    Signed-off-by: Ira Weiny <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit d210bc87cc4fdde62f757002530a08c3d109d94a
Author: Chunyan Zhang <[email protected]>
Date:   Tue Oct 8 17:41:39 2024 +0800

    riscv: Remove duplicated GET_RM

    [ Upstream commit 164f66de6bb6ef454893f193c898dc8f1da6d18b ]

    The macro GET_RM is defined twice in this file; one definition can be removed.

    Reviewed-by: Alexandre Ghiti <[email protected]>
    Signed-off-by: Chunyan Zhang <[email protected]>
    Fixes: 956d705dd279 ("riscv: Unaligned load/store handling for M_MODE")
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 6d84e1b2e5ac04511e68bcf5577fc8369e73f4ed
Author: Chunyan Zhang <[email protected]>
Date:   Tue Oct 8 17:41:38 2024 +0800

    riscv: Remove unused GENERATING_ASM_OFFSETS

    [ Upstream commit 46d4e5ac6f2f801f97bcd0ec82365969197dc9b1 ]

    The macro is not used in the current version of the kernel, so it can be
    removed to avoid a build warning:

    ../arch/riscv/kernel/asm-offsets.c: At top level:
    ../arch/riscv/kernel/asm-offsets.c:7: warning: macro "GENERATING_ASM_OFFSETS" is not used [-Wunused-macros]
        7 | #define GENERATING_ASM_OFFSETS

    Fixes: 9639a44394b9 ("RISC-V: Provide a cleaner raw_smp_processor_id()")
    Cc: [email protected]
    Reviewed-by: Alexandre Ghiti <[email protected]>
    Tested-by: Alexandre Ghiti <[email protected]>
    Signed-off-by: Chunyan Zhang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit a63ba17207c50da91b19150b6cde09d199b34c2c
Author: WangYuli <[email protected]>
Date:   Thu Oct 17 11:20:10 2024 +0800

    riscv: Use '%u' to format the output of 'cpu'

    [ Upstream commit e0872ab72630dada3ae055bfa410bf463ff1d1e0 ]

    'cpu' is an unsigned integer, so its conversion specifier should
    be %u, not %d.
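
A small userspace sketch of why the specifier matters, using `snprintf()` rather than the kernel's print helpers. The function name `format_cpu_model` and the "cpu" prefix are illustrative assumptions, and the example assumes a 32-bit `unsigned int`.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format an unsigned CPU index. "%u" matches the argument type; passing
 * an unsigned int to "%d" is a type mismatch, and values above INT_MAX
 * would typically be shown as negative numbers.
 */
int format_cpu_model(char *buf, size_t len, unsigned int cpu)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "cpu%u", cpu);
}
```

Calling `format_cpu_model(buf, sizeof(buf), 3000000000u)` fills `buf` with `cpu3000000000`, a value that does not fit in a signed 32-bit `int`.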

    Suggested-by: Wentao Guan <[email protected]>
    Suggested-by: Maciej W. Rozycki <[email protected]>
    Link: https://lore.kernel.org/all/[email protected]/
    Signed-off-by: WangYuli <[email protected]>
    Reviewed-by: Charlie Jenkins <[email protected]>
    Tested-by: Charlie Jenkins <[email protected]>
    Fixes: f1e58583b9c7 ("RISC-V: Support cpu hotplug")
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 909e71f28e9615410f52fca1b54acfd3d61c61c2
Author: Heinrich Schuchardt <[email protected]>
Date:   Sun Sep 29 16:02:33 2024 +0200

    riscv: efi: Set NX compat flag in PE/COFF header

    [ Upstream commit d41373a4b910961df5a5e3527d7bde6ad45ca438 ]

    The IMAGE_DLLCHARACTERISTICS_NX_COMPAT informs the firmware that the
    EFI binary does not rely on pages that are both executable and
    writable.

    The flag is used by some distro versions of GRUB to decide if the EFI
    binary may be executed.

    As the Linux kernel neither has RWX sections nor needs RWX pages for
    relocation we should set the flag.

    Cc: Ard Biesheuvel <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Heinrich Schuchardt <[email protected]>
    Reviewed-by: Emil Renner Berthing <[email protected]>
    Fixes: cb7d2dd5612a ("RISC-V: Add PE/COFF header for EFI stub")
    Acked-by: Ard Biesheuvel <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit 58e78589ade880330e359587bb50b1474f43aa12
Author: Kailang Yang <[email protected]>
Date:   Fri Oct 18 13:53:24 2024 +0800

    ALSA: hda/realtek: Limit internal Mic boost on Dell platform

    [ Upstream commit 78e7be018784934081afec77f96d49a2483f9188 ]

    Dell wants to limit the internal Mic boost on all Dell platforms.

    Signed-off-by: Kailang Yang <[email protected]>
    Cc: <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit ceec8ad09135c27890cdee5a9bb0bf5f58c23720
Author: Dmitry Torokhov <[email protected]>
Date:   Fri Oct 18 17:17:48 2024 -0700

    Input: edt-ft5x06 - fix regmap leak when probe fails

    [ Upstream commit bffdf9d7e51a7be8eeaac2ccf9e54a5fde01ff65 ]

    The driver neglects to free the instance of I2C regmap constructed at
    the beginning of the edt_ft5x06_ts_probe() method when probe fails.
    Additionally edt_ft5x06_ts_remove() is freeing the regmap too early,
    before the rest of the device resources that are managed by devm are
    released.

    Fix this by installing a custom devm action that will ensure that the
    regmap is released at the right time during normal teardown as well as
    in case of probe failure.

    Note that devm_regmap_init_i2c() could not be used because the driver
    may replace the original regmap with a regmap specific for M06 devices
    in the middle of the probe, and using devm_regmap_init_i2c() would
    result in releasing the M06 regmap too early.
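
The ordering argument can be illustrated with a userspace model of a managed cleanup list. This is only a sketch of the idea that actions registered earlier are released later; it is not the devres implementation, and every name in it (`cleanup_stack_model`, `add_action_model`, `release_all_model`, the two release callbacks) is hypothetical. It does not model swapping in the M06-specific regmap.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ACTIONS 8

typedef void (*cleanup_fn)(void *data);

/* Hypothetical model of a per-device list of cleanup actions. */
struct cleanup_stack_model {
	cleanup_fn fn[MAX_ACTIONS];
	void *data[MAX_ACTIONS];
	size_t count;
};

/* Records the order in which the example actions ran. */
static char order_log[MAX_ACTIONS + 1];
static size_t order_len;

static void release_regmap_model(void *data) { (void)data; order_log[order_len++] = 'R'; }
static void release_input_model(void *data)  { (void)data; order_log[order_len++] = 'I'; }

/* Register an action; returns 0 on success, -1 if the list is full. */
int add_action_model(struct cleanup_stack_model *s, cleanup_fn fn, void *data)
{
	assert(s != NULL && fn != NULL);
	if (s->count == MAX_ACTIONS)
		return -1;
	s->fn[s->count] = fn;
	s->data[s->count] = data;
	s->count++;
	return 0;
}

/* Run every action in reverse registration order (last in, first out). */
void release_all_model(struct cleanup_stack_model *s)
{
	assert(s != NULL);
	while (s->count > 0) {
		s->count--;
		s->fn[s->count](s->data[s->count]);
	}
}
```

If the regmap action is registered at the start of probe and the input device action afterwards, `release_all_model()` runs the input device release first and the regmap release last. The same list is unwound whether teardown is normal or follows a failed probe.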

    Reported-by: Li Zetao <[email protected]>
    Fixes: 9dfd9708ffba ("Input: edt-ft5x06 - convert to use regmap API")
    Cc: [email protected]
    Reviewed-by: Oliver Graute <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Dmitry Torokhov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit c19a0c171d37f86ab7267c638d475321fd9f0b77
Author: Alexandre Ghiti <[email protected]>
Date:   Wed Oct 16 10:36:24 2024 +0200

    riscv: vdso: Prevent the compiler from inserting calls to memset()

    [ Upstream commit bf40167d54d55d4b54d0103713d86a8638fb9290 ]

    The compiler is smart enough to insert a call to memset() in
    riscv_vdso_get_cpus(), which generates a dynamic relocation.

    Prevent this by using the -fno-builtin option.

    Fixes: e2c0cdfba7f6 ("RISC-V: User-facing API")
    Cc: [email protected]
    Signed-off-by: Alexandre Ghiti <[email protected]>
    Reviewed-by: Guo Ren <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

commit e79c1f1c9100b4adc91c6512985db2cc961aafaa
Author: Frank Li <[email protected]>
Date:   Wed Oct 23 16:30:32 2024 -0400

    spi: spi-fsl-dspi: Fix crash when not using GPIO chip select

    [ Upstream commit 25f00a13dccf8e45441265768de46c8bf58e08f6 ]

    Add a check for the return value of spi_get_csgpiod() to avoid passing a NULL
    pointer to gpiod_direction_output(), preventing a crash when GPIO chip
    select is not used.
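
A rough userspace sketch of the guard, assuming an accessor that legitimately returns NULL when no GPIO chip select is configured. The types and functions below (`gpio_desc_model`, `spi_dev_model`, `get_csgpiod_model`, `setup_chip_select_model`) are hypothetical models and not the kernel SPI or GPIO API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a GPIO descriptor and an SPI device. */
struct gpio_desc_model { int value; };
struct spi_dev_model {
	struct gpio_desc_model *cs_gpiod;	/* NULL when no GPIO chip select */
};

/* Models an accessor that may return NULL. */
static struct gpio_desc_model *get_csgpiod_model(const struct spi_dev_model *spi)
{
	assert(spi != NULL);
	return spi->cs_gpiod;
}

/* Returns 1 if the GPIO was driven, 0 if there was nothing to drive. */
int setup_chip_select_model(const struct spi_dev_model *spi, int level)
{
	struct gpio_desc_model *desc = get_csgpiod_model(spi);

	/* Check before use: a device without a GPIO chip select is valid. */
	if (!desc)
		return 0;

	desc->value = level;
	return 1;
}
```

A device whose `cs_gpiod` is NULL is handled as "nothing to do" rather than as an error, which mirrors the intent of skipping the GPIO call when chip select is not GPIO based.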

    Fix the crash below:
    [    4.251960] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
    [    4.260762] Mem abort info:
    [    4.263556]   ESR = 0x0000000096000004
    [    4.267308]   EC = 0x25: DABT (current EL), IL = 32 bits
    [    4.272624]   SET = 0, FnV = 0
    [    4.275681]   EA = 0, S1PTW = 0
    [    4.278822]   FSC = 0x04: level 0 translation fault
    [    4.283704] Data abort info:
    [    4.286583]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
    [    4.292074]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
    [    4.297130]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
    [    4.302445] [0000000000000000] user address but active_mm is swapper
    [    4.308805] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
    [    4.315072] Modules linked in:
    [    4.318124] CPU: 2 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0-rc4-next-20241023-00008-ga20ec42c5fc1 #359
    [    4.328130] Hardware name: LS1046A QDS Board (DT)
    [    4.332832] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [    4.339794] pc : gpiod_direction_output+0x34/0x5c
    [    4.344505] lr : gpiod_direction_output+0x18/0x5c
    [    4.349208] sp : ffff80008003b8f0
    [    4.352517] x29: ffff80008003b8f0 x28: 0000000000000000 x27: ffffc96bcc7e9068
    [    4.359659] x26: ffffc96bcc6e00b0 x25: ffffc96bcc598398 x24: ffff447400132810
    [    4.366800] x23: 0000000000000000 x22: 0000000011e1a300 x21: 0000000000020002
    [    4.373940] x20: 0000000000000000 x19: 0000000000000000 x18: ffffffffffffffff
    [    4.381081] x17: ffff44740016e600 x16: 0000000500000003 x15: 0000000000000007
    [    4.388221] x14: 0000000000989680 x13: 0000000000020000 x12: 000000000000001e
    [    4.395362] x11: 0044b82fa09b5a53 x10: 0000000000000019 x9 : 0000000000000008
    [    4.402502] x8 : 0000000000000002 x7 : 0000000000000007 …
github-actions bot pushed a commit to sirdarckcat/linux-1 that referenced this pull request Dec 5, 2024
[ Upstream commit 5bf1557 ]

test_progs uses glibc specific functions backtrace() and
backtrace_symbols_fd() to print backtrace in case of SIGSEGV.

Recent commit (see fixes) updated test_progs.c to define stub versions
of the same functions with attribute "weak" in order to allow linking
test_progs against musl libc. Unfortunately this broke the backtrace
handling for glibc builds.

As it turns out, glibc defines backtrace() and backtrace_symbols_fd()
as weak:

  $ llvm-readelf --symbols /lib64/libc.so.6 \
     | grep -P '( backtrace_symbols_fd| backtrace)$'
  4910: 0000000000126b40   161 FUNC    WEAK   DEFAULT    16 backtrace
  6843: 0000000000126f90   852 FUNC    WEAK   DEFAULT    16 backtrace_symbols_fd

So does test_progs:

 $ llvm-readelf --symbols test_progs \
    | grep -P '( backtrace_symbols_fd| backtrace)$'
  2891: 00000000006ad190    15 FUNC    WEAK   DEFAULT    13 backtrace
 11215: 00000000006ad1a0    41 FUNC    WEAK   DEFAULT    13 backtrace_symbols_fd

In such a situation the dynamic linker is not obliged to favour the glibc
implementation over the one defined in test_progs.

Compiling with the following simple modification to test_progs.c
demonstrates the issue:

  $ git diff
  ...
  \--- a/tools/testing/selftests/bpf/test_progs.c
  \+++ b/tools/testing/selftests/bpf/test_progs.c
  \@@ -1817,6 +1817,7 @@ int main(int argc, char **argv)
          if (err)
                  return err;

  +       *(int *)0xdeadbeef  = 42;
          err = cd_flavor_subdir(argv[0]);
          if (err)
                  return err;

  $ ./test_progs
  [0]: Caught signal #11!
  Stack trace:
  <backtrace not supported>
  Segmentation fault (core dumped)

Resolve this by hiding stub definitions behind __GLIBC__ macro check
instead of using "weak" attribute.
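
A minimal sketch of the macro-guarded approach, assuming that `__GLIBC__` is visible once any libc header has been included (glibc defines it through its own headers). The wrapper `capture_frames()` is a hypothetical helper for illustration; the stub body is an assumption about what a fallback could look like, not a copy of the test_progs.c change.

```c
#include <assert.h>	/* a libc header must come first so __GLIBC__ is defined */
#include <stddef.h>

#ifdef __GLIBC__
#include <execinfo.h>	/* the real backtrace() provided by glibc */
#else
/* Fallback stub, compiled only when the libc is not glibc (e.g. musl). */
static int backtrace(void **buffer, int size)
{
	(void)buffer;
	(void)size;
	return 0;
}
#endif

/* Hypothetical helper: capture up to size return addresses. */
int capture_frames(void **buffer, int size)
{
	assert(buffer != NULL && size > 0);
	return backtrace(buffer, size);
}
```

Because the stub is excluded at compile time on glibc, there is only one definition of `backtrace()` in the final program and symbol resolution order no longer matters.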

Fixes: c9a83e7 ("selftests/bpf: Fix compile if backtrace support missing in libc")
Signed-off-by: Eduard Zingerman <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Tested-by: Tony Ambardar <[email protected]>
Reviewed-by: Tony Ambardar <[email protected]>
Acked-by: Daniel Xu <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
gregkh pushed a commit that referenced this pull request Dec 5, 2024
gregkh pushed a commit that referenced this pull request Dec 14, 2024