HACL* Raw RSA Integration #11

karthikbhargavan · 2023-12-01T08:05:15Z

This PR adds an alternative implementation for the core RSA encryption and decryption functions in crypto/rsa.c. The code is taken from HACL*, but required changes to the source F* code, since it was embedded within RSA-PSS and had to be factored out. The RSA functions are quite (perhaps too) strict on their input and output lengths, so this code needs extensive testing to be sure that it handles all the kinds of inputs the kernel may throw at it.

[ Upstream commit 9c37785 ] Some error paths don't call acpi_put_table() before returning. Branch to the correct place instead of doing some direct return. Fixes: 4d27328 ("tpm_crb: Add support for CRB devices based on Pluton") Signed-off-by: Christophe JAILLET <[email protected]> Acked-by: Matthew Garrett <[email protected]> Reviewed-by: Jarkko Sakkinen <[email protected]> Signed-off-by: Jarkko Sakkinen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 6df373b ] In gfs2_logd(), switch from an open-coded wait loop to wait_event_interruptible_timeout(). Signed-off-by: Andreas Gruenbacher <[email protected]> Stable-dep-of: b74cd55 ("gfs2: low-memory forced flush fixes") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit b74cd55 ] First, function gfs2_ail_flush_reqd checks the SDF_FORCE_AIL_FLUSH flag to determine if an AIL flush should be forced in low-memory situations. However, it also immediately clears the flag, and when called repeatedly as in function gfs2_logd, the flag will be lost. Fix that by pulling the SDF_FORCE_AIL_FLUSH flag check out of gfs2_ail_flush_reqd. Second, function gfs2_writepages sets the SDF_FORCE_AIL_FLUSH flag whether or not enough pages were written. If enough pages could be written, flushing the AIL is unnecessary, though. Third, gfs2_writepages doesn't wake up logd after setting the SDF_FORCE_AIL_FLUSH flag, so it can take a long time for logd to react. It would be preferable to wake up logd, but that hurts the performance of some workloads and we don't quite understand why so far, so don't wake up logd so far. Fixes: b066a4e ("gfs2: forcibly flush ail to relieve memory pressure") Signed-off-by: Andreas Gruenbacher <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit a493208 ] Breaking out early when a match is found leads to an incorrect num_chans value when more than one ipcc mailbox channel is used by the same device. Fixes: e9d50e4 ("mailbox: qcom-ipcc: Dynamic alloc for channel arrangement") Signed-off-by: Jonathan Marek <[email protected]> Signed-off-by: Jassi Brar <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit a3b7039 ] Buffer 'new_argv' is accessed without bound check after accessing with bound check via 'new_argc' index. Fixes: e298f3b ("kconfig: add built-in function support") Co-developed-by: Ivanov Mikhail <[email protected]> Signed-off-by: Konstantin Meskhidze <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 7f33105 ] Commit 97d5f2e ("tools api fs: More thread safety for global filesystem variables") introduces pthread_once, so the libpthread should be added at link time, or we'll meet the following compile error when 'make -C tools/mm': gcc -Wall -Wextra -I../lib/ -o page-types page-types.c ../lib/api/libapi.a ~/linux/tools/lib/api/fs/fs.c:146: undefined reference to `pthread_once' ~/linux/tools/lib/api/fs/fs.c:147: undefined reference to `pthread_once' ~/linux/tools/lib/api/fs/fs.c:148: undefined reference to `pthread_once' ~/linux/tools/lib/api/fs/fs.c:149: undefined reference to `pthread_once' ~/linux/tools/lib/api/fs/fs.c:150: undefined reference to `pthread_once' /usr/bin/ld: ../lib/api/libapi.a(libapi-in.o):~/linux/tools/lib/api/fs/fs.c:151: more undefined references to `pthread_once' follow collect2: error: ld returned 1 exit status make: *** [Makefile:22: page-types] Error 1 Link: https://lkml.kernel.org/r/[email protected] Fixes: 97d5f2e ("tools api fs: More thread safety for global filesystem variables") Signed-off-by: Xie XiuQi <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Matthew Wilcox <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 2e00b8b ] If the device drops into ultra-low-power mode before being placed into normal-power mode as part of ATI being triggered, the device does not assert any interrupts until the ATI routine is restarted two seconds later. Solve this problem by adopting the vendor's recommendation, which calls for the device to be placed into normal-power mode prior to being configured and ATI being triggered. The original implementation followed this sequence, but the order was inadvertently changed as part of the resolution of a separate erratum. Fixes: 1e4189d ("Input: iqs7222 - protect volatile registers") Signed-off-by: Jeff LaBundy <[email protected]> Link: https://lore.kernel.org/r/ZKrpHc2Ji9qR25r2@nixie71 Signed-off-by: Dmitry Torokhov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 7962ef1 ] In 3cb4d5e ("perf trace: Free syscall tp fields in evsel->priv") it only was freeing if strcmp(evsel->tp_format->system, "syscalls") returned zero, while the corresponding initialization of evsel->priv was being performed if it was _not_ zero, i.e. if the tp system wasn't 'syscalls'. Just stop looking for that and free it if evsel->priv was set, which should be equivalent. Also use the pre-existing evsel_trace__delete() function. This resolves these leaks, detected with: $ make EXTRA_CFLAGS="-fsanitize=address" BUILD_BPF_SKEL=1 CORESIGHT=1 O=/tmp/build/perf-tools-next -C tools/perf install-bin ================================================================= ==481565==ERROR: LeakSanitizer: detected memory leaks Direct leak of 40 byte(s) in 1 object(s) allocated from: #0 0x7f7343cba097 in calloc (/lib64/libasan.so.8+0xba097) #1 0x987966 in zalloc (/home/acme/bin/perf+0x987966) #2 0x52f9b9 in evsel_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:307 #3 0x52f9b9 in evsel__syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:333 #4 0x52f9b9 in evsel__init_raw_syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:458 #5 0x52f9b9 in perf_evsel__raw_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:480 #6 0x540e8b in trace__add_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3212 #7 0x540e8b in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3891 #8 0x540e8b in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5156 #9 0x5ef262 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323 #10 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377 #11 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421 gregkh#12 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537 gregkh#13 0x7f7342c4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f) Direct leak of 40 byte(s) in 1 object(s) allocated from: #0 0x7f7343cba097 in calloc (/lib64/libasan.so.8+0xba097) #1 0x987966 in zalloc (/home/acme/bin/perf+0x987966) #2 0x52f9b9 in evsel_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:307 #3 0x52f9b9 in evsel__syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:333 #4 0x52f9b9 in evsel__init_raw_syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:458 #5 0x52f9b9 in perf_evsel__raw_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:480 #6 0x540dd1 in trace__add_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3205 #7 0x540dd1 in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3891 #8 0x540dd1 in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5156 #9 0x5ef262 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323 #10 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377 #11 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421 gregkh#12 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537 gregkh#13 0x7f7342c4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f) SUMMARY: AddressSanitizer: 80 byte(s) leaked in 2 allocation(s). [root@quaco ~]# With this we plug all leaks with "perf trace sleep 1". Fixes: 3cb4d5e ("perf trace: Free syscall tp fields in evsel->priv") Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Riccardo Mancini <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 0323e8f ] Allocate driver data as first resource in the probe function. This way it can be used during allocation of the other resources (instead of assigning these to local variables first and update driver data only when it's allocated). Also as driver data is allocated using a devm function this should happen first to have the order of freeing resources in the error path and the remove function in reverse. Signed-off-by: Uwe Kleine-König <[email protected]> Signed-off-by: Thierry Reding <[email protected]> Stable-dep-of: c116223 ("pwm: atmel-tcb: Fix resource freeing in error path and remove") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit c116223 ] Several resources were not freed in the error path and the remove function. Add the forgotten items. Fixes: 34cbcd7 ("pwm: atmel-tcb: Add sama5d2 support") Fixes: 061f857 ("pwm: atmel-tcb: Switch to new binding") Signed-off-by: Uwe Kleine-König <[email protected]> Reviewed-by: Claudiu Beznea <[email protected]> Signed-off-by: Thierry Reding <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 4c09e20 ] As pointed out by Uwe Kleine-König[1], the changes introduced in commit c1ff7da ("video: backlight: lp855x: Get PWM for PWM mode during probe") caused the PWM state set up by the bootloader to be re-set when the driver is probed. This differs from the behavior from before that patch, where the PWM state would be initialized on the first brightness change. Fix this by moving the PWM state initialization into the PWM control function. Add a new variable, needs_pwm_init, to the device info struct to allow us to check whether we need the initialization, or whether it has already been done. [1] https://lore.kernel.org/lkml/[email protected]/ Fixes: c1ff7da ("video: backlight: lp855x: Get PWM for PWM mode during probe") Signed-off-by: Artur Weber <[email protected]> Reviewed-by: Daniel Thompson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lee Jones <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

…al power state [ Upstream commit fe1328b ] So, let's drop output GPIO direction check and only check GPIO value to set the initial power state. Fixes: 706dc68 ("backlight: gpio: Explicitly set the direction of the GPIO") Signed-off-by: Liu Ying <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Acked-by: Linus Walleij <[email protected]> Acked-by: Bartosz Golaszewski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lee Jones <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit a7a3252 ] Split cases in event_pmu for greater accuracy. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Stable-dep-of: b30d4f0 ("perf parse-events: Additional error reporting") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 77cdd78 ] Migration to improve error reporting as YYABORT cases should carry event parsing errors. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Stable-dep-of: b30d4f0 ("perf parse-events: Additional error reporting") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit b52cb99 ] Add PE_ABORT that will YYNOMEM or YYABORT accordingly. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Stable-dep-of: b30d4f0 ("perf parse-events: Additional error reporting") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit b30d4f0 ] When no events or PMUs match report an error for event_pmu: Before: ``` $ perf stat -e 'asdfasdf' -a sleep 1 Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events ``` After: ``` $ perf stat -e 'asdfasdf' -a sleep 1 event syntax error: 'asdfasdf' \___ Bad event name Unabled to find PMU or event on a PMU of 'asdfasdf' Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events ``` Fixes the inadvertent removal when hybrid parsing was modified. Fixes: 70c90e4 ("perf parse-events: Avoid scanning PMUs before parsing") Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 389fbbe ] Immediately mark NMIs as unmasked in response to #VMGEXIT(NMI complete) instead of setting awaiting_iret_completion and waiting until the *next* VM-Exit to unmask NMIs. The whole point of "NMI complete" is that the guest is responsible for telling the hypervisor when it's safe to inject an NMI, i.e. there's no need to wait. And because there's no IRET to single-step, the next VM-Exit could be a long time coming, i.e. KVM could incorrectly hold an NMI pending for far longer than what is required and expected. Opportunistically fix a stale reference to HF_IRET_MASK. Fixes: 916b54a ("KVM: x86: Move HF_NMI_MASK and HF_IRET_MASK into "struct vcpu_svm"") Fixes: 4444dfe ("KVM: SVM: Add NMI support for an SEV-ES guest") Cc: Tom Lendacky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 687fe7d ] Remove option having i2c client contain raw gpio number instead of proper IRQ number. There are no users of this facility in mainline and it will allow cleaning up the driver code with regard to wakeup handling, etc. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Dmitry Torokhov <[email protected]> Stable-dep-of: cc141c3 ("Input: tca6416-keypad - fix interrupt enable disbalance") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit cc141c3 ] The driver has been switched to use IRQF_NO_AUTOEN, but in the error unwinding and remove paths calls to enable_irq() were left in place, which will lead to an incorrect enable counter value. Fixes: bcd9730 ("Input: move to use request_irq by IRQF_NO_AUTOEN flag") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Dmitry Torokhov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 979e9c9 ] In 616b14b ("perf build: Conditionally define NDEBUG") we started using NDEBUG=1 when DEBUG=1 isn't present, so code that is enclosed with assert() is not called. In dd317df ("perf build: Make binutil libraries opt in") we stopped linking against binutils-devel, for licensing reasons. Recently people asked me why annotation of BPF programs wasn't working, i.e. this: $ perf annotate bpf_prog_5280546344e3f45c_kfree_skb was returning: case SYMBOL_ANNOTATE_ERRNO__NO_LIBOPCODES_FOR_BPF: scnprintf(buf, buflen, "Please link with binutils's libopcode to enable BPF annotation"); This was on a fedora rpm, so its new enough that I had to try to test by rebuilding using BUILD_NONDISTRO=1, only to get it segfaulting on me. This combination made this libopcode function not to be called: assert(bfd_check_format(bfdf, bfd_object)); Changing it to: if (!bfd_check_format(bfdf, bfd_object)) abort(); Made it work, looking at this "check" function made me realize it changes the 'bfdf' internal state, i.e. we better call it. So stop using assert() on it, just call it and abort if it fails. Probably it is better to propagate the error, etc, but it seems it is unlikely to fail from the usage done so far and we really need to stop using libopcodes, so do the quick fix above and move on. With it we have BPF annotation back working when built with BUILD_NONDISTRO=1: ⬢[acme@toolbox perf-tools-next]$ perf annotate --stdio2 bpf_prog_5280546344e3f45c_kfree_skb | head No kallsyms or vmlinux with build-id 939bc71a1a51cdc434e60af93c7e734f7d5c0e7e was found Samples: 12 of event 'cpu-clock:ppp', 4000 Hz, Event count (approx.): 3000000, [percent: local period] bpf_prog_5280546344e3f45c_kfree_skb() bpf_prog_5280546344e3f45c_kfree_skb Percent int kfree_skb(struct trace_event_raw_kfree_skb *args) { nop 33.33 xchg %ax,%ax push %rbp mov %rsp,%rbp sub $0x180,%rsp push %rbx push %r13 ⬢[acme@toolbox perf-tools-next]$ Fixes: 6987561 ("perf annotate: Enable annotation of BPF programs") Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mohamed Mahmoud <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Dave Tucker <[email protected]> Cc: Derek Barbosa <[email protected]> Cc: Song Liu <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

…vm() [ Upstream commit 5df8ecf ] Drop the explicit check on the extended CPUID level in cpu_has_svm(), the kernel's cached CPUID info will leave the entire SVM leaf unset if said leaf is not supported by hardware. Prior to using cached information, the check was needed to avoid false positives due to Intel's rather crazy CPUID behavior of returning the values of the maximum supported leaf if the specified leaf is unsupported. Fixes: 682a810 ("x86/kvm/svm: Simplify cpu_has_svm()") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 8c49c6e ] Commit 3fd7a16 ("perf script: Add 'cgroup' field for output") added support for printing cgroup path in perf script output. It was okay if you didn't want any stacks: $ sudo perf script --comms jpegtran:23f4bf -F comm,tid,cpu,time,cgroup jpegtran:23f4bf 3321915 [013] 404718.587488: /idle.slice/polish.service jpegtran:23f4bf 3321915 [031] 404718.592073: /idle.slice/polish.service With stacks it gets messier as cgroup is printed after the stack: $ perf script --comms jpegtran:23f4bf -F comm,tid,cpu,time,cgroup,ip,sym jpegtran:23f4bf 3321915 [013] 404718.587488: 5c554 compress_output 570d9 jpeg_finish_compress 3476e jpegtran_main 330ee jpegtran::main 326e2 core::ops::function::FnOnce::call_once (inlined) 326e2 std::sys_common::backtrace::__rust_begin_short_backtrace /idle.slice/polish.service jpegtran:23f4bf 3321915 [031] 404718.592073: 8474d jsimd_encode_mcu_AC_first_prepare_sse2.PADDING 55af68e62fff [unknown] /idle.slice/polish.service Let's instead print cgroup on the same line as comm: $ perf script --comms jpegtran:23f4bf -F comm,tid,cpu,time,cgroup,ip,sym jpegtran:23f4bf 3321915 [013] 404718.587488: /idle.slice/polish.service 5c554 compress_output 570d9 jpeg_finish_compress 3476e jpegtran_main 330ee jpegtran::main 326e2 core::ops::function::FnOnce::call_once (inlined) 326e2 std::sys_common::backtrace::__rust_begin_short_backtrace jpegtran:23f4bf 3321915 [031] 404718.592073: /idle.slice/polish.service 8474d jsimd_encode_mcu_AC_first_prepare_sse2.PADDING 55af68e62fff [unknown] Fixes: 3fd7a16 ("perf script: Add 'cgroup' field for output") Signed-off-by: Ivan Babrou <[email protected]> Acked-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit dc7f01f ] For logical OR operator, the actual sample_flags are in the 'groups' list so it needs to check entries in the list instead. Otherwise it would show the following error message. $ sudo perf record -a -e cycles:p --filter 'period > 100 || weight > 0' sleep 1 Error: cycles:p event does not have sample flags 0 failed to set filter "BPF" on event cycles:p with 2 (No such file or directory) Actually it should warn on 'weight' is used without WEIGHT flag. Error: cycles:p event does not have PERF_SAMPLE_WEIGHT Hint: please add -W option to perf record failed to set filter "BPF" on event cycles:p with 2 (No such file or directory) Fixes: 4310551 ("perf bpf filter: Show warning for missing sample flags") Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

…find_symbol_fb() [ Upstream commit 42c6dd9 ] As thread__find_symbol_fb() will end up calling thread__find_map() and it in turn will call these on uninitialized memory: maps__zput(al->maps); map__zput(al->map); thread__zput(al->thread); Fixes: 0dd5041 ("perf addr_location: Add init/exit/copy functions") Reviewed-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Disha Goel <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 82b0a10 ] Add perf_dlfilter_fns.al_cleanup() to do addr_location__exit() on data passed via perf_dlfilter_fns.resolve_address(). Add dlfilter-test-api-v2 to the "dlfilter C API" test to test it. Update documentation, clarifying that data returned by APIs should not be dereferenced after filter_event() and filter_event_early() return. Fixes: 0dd5041 ("perf addr_location: Add init/exit/copy functions") Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

…latform [ Upstream commit 3286f88 ] Update the description for some of the JSON/events for power10 platform. Fixes: 32daa5d ("perf vendor events: Initial JSON/events list for power10 platform") Signed-off-by: Kajol Jain <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Disha Goel <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit e104df9 ] Drop some of the JSON/events for power10 platform due to counter data mismatch. Fixes: 32daa5d ("perf vendor events: Initial JSON/events list for power10 platform") Signed-off-by: Kajol Jain <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Disha Goel <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

…tform [ Upstream commit 4836b9a ] Drop STORES_PER_INST metric event for the power10 platform, as the metric expression of STORES_PER_INST metric event using dropped event PM_ST_FIN. Fixes: 3ca3af7 ("perf vendor events power10: Add metric events JSON file for power10 platform") Signed-off-by: Kajol Jain <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Disha Goel <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

… platform [ Upstream commit 7d473f4 ] Move some of the power10 JSON/events to appropriate files. Fixes: 32daa5d ("perf vendor events: Initial JSON/events list for power10 platform") Signed-off-by: Kajol Jain <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Disha Goel <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit edd65d2 ] Update metric event name for some of the JSON/metric events for power10 platform. Fixes: 3ca3af7 ("perf vendor events power10: Add metric events JSON file for power10 platform") Signed-off-by: Kajol Jain <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Disha Goel <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

commit 9af2efe upstream. The fields in the hist_entry are filled on-demand which means they only have meaningful values when relevant sort keys are used. So if neither of 'dso' nor 'sym' sort keys are used, the map/symbols in the hist entry can be garbage. So it shouldn't access it unconditionally. I got a segfault, when I wanted to see cgroup profiles. $ sudo perf record -a --all-cgroups --synth=cgroup true $ sudo perf report -s cgroup Program received signal SIGSEGV, Segmentation fault. 0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48 48 return RC_CHK_ACCESS(map)->dso; (gdb) bt #0 0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48 gregkh#1 0x00005555557aa39b in map__load (map=0x0) at util/map.c:344 gregkh#2 0x00005555557aa592 in map__find_symbol (map=0x0, addr=140736115941088) at util/map.c:385 gregkh#3 0x00005555557ef000 in hists__findnew_entry (hists=0x555556039d60, entry=0x7fffffffa4c0, al=0x7fffffffa8c0, sample_self=true) at util/hist.c:644 gregkh#4 0x00005555557ef61c in __hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0, block_info=0x0, sample=0x7fffffffaa90, sample_self=true, ops=0x0) at util/hist.c:761 gregkh#5 0x00005555557ef71f in hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0, sample=0x7fffffffaa90, sample_self=true) at util/hist.c:779 gregkh#6 0x00005555557f00fb in iter_add_single_normal_entry (iter=0x7fffffffa900, al=0x7fffffffa8c0) at util/hist.c:1015 gregkh#7 0x00005555557f09a7 in hist_entry_iter__add (iter=0x7fffffffa900, al=0x7fffffffa8c0, max_stack_depth=127, arg=0x7fffffffbce0) at util/hist.c:1260 gregkh#8 0x00005555555ba7ce in process_sample_event (tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at builtin-report.c:334 gregkh#9 0x00005555557b30c8 in evlist__deliver_sample (evlist=0x555556039010, tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at util/session.c:1232 gregkh#10 0x00005555557b32bc in machines__deliver_event (machines=0x5555560388e8, evlist=0x555556039010, event=0x7ffff7c14128, sample=0x7fffffffaa90, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1271 gregkh#11 0x00005555557b3848 in perf_session__deliver_event (session=0x5555560386d0, event=0x7ffff7c14128, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1354 gregkh#12 0x00005555557affaf in ordered_events__deliver_event (oe=0x555556038e60, event=0x555556135aa0) at util/session.c:132 gregkh#13 0x00005555557bb605 in do_flush (oe=0x555556038e60, show_progress=false) at util/ordered-events.c:245 gregkh#14 0x00005555557bb95c in __ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND, timestamp=0) at util/ordered-events.c:324 gregkh#15 0x00005555557bba46 in ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND) at util/ordered-events.c:342 gregkh#16 0x00005555557b1b3b in perf_event__process_finished_round (tool=0x7fffffffbce0, event=0x7ffff7c15bb8, oe=0x555556038e60) at util/session.c:780 gregkh#17 0x00005555557b3b27 in perf_session__process_user_event (session=0x5555560386d0, event=0x7ffff7c15bb8, file_offset=117688, file_path=0x555556038ff0 "perf.data") at util/session.c:1406 As you can see the entry->ms.map was NULL even if he->ms.map has a value. This is because 'sym' sort key is not given, so it cannot assume whether he->ms.sym and entry->ms.sym is the same. I only checked the 'sym' sort key here as it implies 'dso' behavior (so maps are the same). Fixes: ac01c8c ("perf hist: Update hist symbol when updating maps") Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 23dfdb5 upstream. The following kernel trace can be triggered with fstest generic/629 when executed against a filesystem with fast-commit feature enabled: INFO: trying to register non-static key. The code is fine but needs lockdep annotation, or maybe you didn't initialize this object before use? turning off the locking correctness validator. CPU: 0 PID: 866 Comm: mount Not tainted 6.10.0+ gregkh#11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x66/0x90 register_lock_class+0x759/0x7d0 __lock_acquire+0x85/0x2630 ? __find_get_block+0xb4/0x380 lock_acquire+0xd1/0x2d0 ? __ext4_journal_get_write_access+0xd5/0x160 _raw_spin_lock+0x33/0x40 ? __ext4_journal_get_write_access+0xd5/0x160 __ext4_journal_get_write_access+0xd5/0x160 ext4_reserve_inode_write+0x61/0xb0 __ext4_mark_inode_dirty+0x79/0x270 ? ext4_ext_replay_set_iblocks+0x2f8/0x450 ext4_ext_replay_set_iblocks+0x330/0x450 ext4_fc_replay+0x14c8/0x1540 ? jread+0x88/0x2e0 ? rcu_is_watching+0x11/0x40 do_one_pass+0x447/0xd00 jbd2_journal_recover+0x139/0x1b0 jbd2_journal_load+0x96/0x390 ext4_load_and_init_journal+0x253/0xd40 ext4_fill_super+0x2cc6/0x3180 ... In the replay path there's an attempt to lock sbi->s_bdev_wb_lock in function ext4_check_bdev_write_error(). Unfortunately, at this point this spinlock has not been initialized yet. Moving it's initialization to an earlier point in __ext4_fill_super() fixes this splat. Signed-off-by: Luis Henriques (SUSE) <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Theodore Ts'o <[email protected]> Cc: [email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit ac01c8c upstream. AddressSanitizer found a use-after-free bug in the symbol code which manifested as 'perf top' segfaulting. ==1238389==ERROR: AddressSanitizer: heap-use-after-free on address 0x60b00c48844b at pc 0x5650d8035961 bp 0x7f751aaecc90 sp 0x7f751aaecc80 READ of size 1 at 0x60b00c48844b thread T193 #0 0x5650d8035960 in _sort__sym_cmp util/sort.c:310 gregkh#1 0x5650d8043744 in hist_entry__cmp util/hist.c:1286 gregkh#2 0x5650d8043951 in hists__findnew_entry util/hist.c:614 gregkh#3 0x5650d804568f in __hists__add_entry util/hist.c:754 gregkh#4 0x5650d8045bf9 in hists__add_entry util/hist.c:772 gregkh#5 0x5650d8045df1 in iter_add_single_normal_entry util/hist.c:997 gregkh#6 0x5650d8043326 in hist_entry_iter__add util/hist.c:1242 gregkh#7 0x5650d7ceeefe in perf_event__process_sample /home/matt/src/linux/tools/perf/builtin-top.c:845 gregkh#8 0x5650d7ceeefe in deliver_event /home/matt/src/linux/tools/perf/builtin-top.c:1208 gregkh#9 0x5650d7fdb51b in do_flush util/ordered-events.c:245 gregkh#10 0x5650d7fdb51b in __ordered_events__flush util/ordered-events.c:324 gregkh#11 0x5650d7ced743 in process_thread /home/matt/src/linux/tools/perf/builtin-top.c:1120 gregkh#12 0x7f757ef1f133 in start_thread nptl/pthread_create.c:442 gregkh#13 0x7f757ef9f7db in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 When updating hist maps it's also necessary to update the hist symbol reference because the old one gets freed in map__put(). While this bug was probably introduced with 5c24b67 ("perf tools: Replace map->referenced & maps->removed_maps with map->refcnt"), the symbol objects were leaked until c087e94 ("perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL") was merged so the bug was masked. Fixes: c087e94 ("perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL") Reported-by: Yunzhao Li <[email protected]> Signed-off-by: Matt Fleming (Cloudflare) <[email protected]> Cc: Ian Rogers <[email protected]> Cc: [email protected] Cc: Namhyung Kim <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: [email protected] # v5.13+ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 9af2efe upstream. The fields in the hist_entry are filled on-demand which means they only have meaningful values when relevant sort keys are used. So if neither of 'dso' nor 'sym' sort keys are used, the map/symbols in the hist entry can be garbage. So it shouldn't access it unconditionally. I got a segfault, when I wanted to see cgroup profiles. $ sudo perf record -a --all-cgroups --synth=cgroup true $ sudo perf report -s cgroup Program received signal SIGSEGV, Segmentation fault. 0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48 48 return RC_CHK_ACCESS(map)->dso; (gdb) bt #0 0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48 gregkh#1 0x00005555557aa39b in map__load (map=0x0) at util/map.c:344 gregkh#2 0x00005555557aa592 in map__find_symbol (map=0x0, addr=140736115941088) at util/map.c:385 gregkh#3 0x00005555557ef000 in hists__findnew_entry (hists=0x555556039d60, entry=0x7fffffffa4c0, al=0x7fffffffa8c0, sample_self=true) at util/hist.c:644 gregkh#4 0x00005555557ef61c in __hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0, block_info=0x0, sample=0x7fffffffaa90, sample_self=true, ops=0x0) at util/hist.c:761 gregkh#5 0x00005555557ef71f in hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0, sample=0x7fffffffaa90, sample_self=true) at util/hist.c:779 gregkh#6 0x00005555557f00fb in iter_add_single_normal_entry (iter=0x7fffffffa900, al=0x7fffffffa8c0) at util/hist.c:1015 gregkh#7 0x00005555557f09a7 in hist_entry_iter__add (iter=0x7fffffffa900, al=0x7fffffffa8c0, max_stack_depth=127, arg=0x7fffffffbce0) at util/hist.c:1260 gregkh#8 0x00005555555ba7ce in process_sample_event (tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at builtin-report.c:334 gregkh#9 0x00005555557b30c8 in evlist__deliver_sample (evlist=0x555556039010, tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at util/session.c:1232 gregkh#10 0x00005555557b32bc in machines__deliver_event (machines=0x5555560388e8, evlist=0x555556039010, event=0x7ffff7c14128, sample=0x7fffffffaa90, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1271 gregkh#11 0x00005555557b3848 in perf_session__deliver_event (session=0x5555560386d0, event=0x7ffff7c14128, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1354 gregkh#12 0x00005555557affaf in ordered_events__deliver_event (oe=0x555556038e60, event=0x555556135aa0) at util/session.c:132 gregkh#13 0x00005555557bb605 in do_flush (oe=0x555556038e60, show_progress=false) at util/ordered-events.c:245 gregkh#14 0x00005555557bb95c in __ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND, timestamp=0) at util/ordered-events.c:324 gregkh#15 0x00005555557bba46 in ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND) at util/ordered-events.c:342 gregkh#16 0x00005555557b1b3b in perf_event__process_finished_round (tool=0x7fffffffbce0, event=0x7ffff7c15bb8, oe=0x555556038e60) at util/session.c:780 gregkh#17 0x00005555557b3b27 in perf_session__process_user_event (session=0x5555560386d0, event=0x7ffff7c15bb8, file_offset=117688, file_path=0x555556038ff0 "perf.data") at util/session.c:1406 As you can see the entry->ms.map was NULL even if he->ms.map has a value. This is because 'sym' sort key is not given, so it cannot assume whether he->ms.sym and entry->ms.sym is the same. I only checked the 'sym' sort key here as it implies 'dso' behavior (so maps are the same). Fixes: ac01c8c ("perf hist: Update hist symbol when updating maps") Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit ac01c8c upstream. AddressSanitizer found a use-after-free bug in the symbol code which manifested as 'perf top' segfaulting. ==1238389==ERROR: AddressSanitizer: heap-use-after-free on address 0x60b00c48844b at pc 0x5650d8035961 bp 0x7f751aaecc90 sp 0x7f751aaecc80 READ of size 1 at 0x60b00c48844b thread T193 #0 0x5650d8035960 in _sort__sym_cmp util/sort.c:310 gregkh#1 0x5650d8043744 in hist_entry__cmp util/hist.c:1286 gregkh#2 0x5650d8043951 in hists__findnew_entry util/hist.c:614 gregkh#3 0x5650d804568f in __hists__add_entry util/hist.c:754 gregkh#4 0x5650d8045bf9 in hists__add_entry util/hist.c:772 gregkh#5 0x5650d8045df1 in iter_add_single_normal_entry util/hist.c:997 gregkh#6 0x5650d8043326 in hist_entry_iter__add util/hist.c:1242 gregkh#7 0x5650d7ceeefe in perf_event__process_sample /home/matt/src/linux/tools/perf/builtin-top.c:845 gregkh#8 0x5650d7ceeefe in deliver_event /home/matt/src/linux/tools/perf/builtin-top.c:1208 gregkh#9 0x5650d7fdb51b in do_flush util/ordered-events.c:245 gregkh#10 0x5650d7fdb51b in __ordered_events__flush util/ordered-events.c:324 gregkh#11 0x5650d7ced743 in process_thread /home/matt/src/linux/tools/perf/builtin-top.c:1120 gregkh#12 0x7f757ef1f133 in start_thread nptl/pthread_create.c:442 gregkh#13 0x7f757ef9f7db in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 When updating hist maps it's also necessary to update the hist symbol reference because the old one gets freed in map__put(). While this bug was probably introduced with 5c24b67 ("perf tools: Replace map->referenced & maps->removed_maps with map->refcnt"), the symbol objects were leaked until c087e94 ("perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL") was merged so the bug was masked. Fixes: c087e94 ("perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL") Reported-by: Yunzhao Li <[email protected]> Signed-off-by: Matt Fleming (Cloudflare) <[email protected]> Cc: Ian Rogers <[email protected]> Cc: [email protected] Cc: Namhyung Kim <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: [email protected] # v5.13+ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 9af2efe upstream. The fields in the hist_entry are filled on-demand which means they only have meaningful values when relevant sort keys are used. So if neither of 'dso' nor 'sym' sort keys are used, the map/symbols in the hist entry can be garbage. So it shouldn't access it unconditionally. I got a segfault, when I wanted to see cgroup profiles. $ sudo perf record -a --all-cgroups --synth=cgroup true $ sudo perf report -s cgroup Program received signal SIGSEGV, Segmentation fault. 0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48 48 return RC_CHK_ACCESS(map)->dso; (gdb) bt #0 0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48 gregkh#1 0x00005555557aa39b in map__load (map=0x0) at util/map.c:344 gregkh#2 0x00005555557aa592 in map__find_symbol (map=0x0, addr=140736115941088) at util/map.c:385 gregkh#3 0x00005555557ef000 in hists__findnew_entry (hists=0x555556039d60, entry=0x7fffffffa4c0, al=0x7fffffffa8c0, sample_self=true) at util/hist.c:644 gregkh#4 0x00005555557ef61c in __hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0, block_info=0x0, sample=0x7fffffffaa90, sample_self=true, ops=0x0) at util/hist.c:761 gregkh#5 0x00005555557ef71f in hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0, sample=0x7fffffffaa90, sample_self=true) at util/hist.c:779 gregkh#6 0x00005555557f00fb in iter_add_single_normal_entry (iter=0x7fffffffa900, al=0x7fffffffa8c0) at util/hist.c:1015 gregkh#7 0x00005555557f09a7 in hist_entry_iter__add (iter=0x7fffffffa900, al=0x7fffffffa8c0, max_stack_depth=127, arg=0x7fffffffbce0) at util/hist.c:1260 gregkh#8 0x00005555555ba7ce in process_sample_event (tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at builtin-report.c:334 gregkh#9 0x00005555557b30c8 in evlist__deliver_sample (evlist=0x555556039010, tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at util/session.c:1232 gregkh#10 0x00005555557b32bc in machines__deliver_event (machines=0x5555560388e8, evlist=0x555556039010, event=0x7ffff7c14128, sample=0x7fffffffaa90, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1271 gregkh#11 0x00005555557b3848 in perf_session__deliver_event (session=0x5555560386d0, event=0x7ffff7c14128, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1354 gregkh#12 0x00005555557affaf in ordered_events__deliver_event (oe=0x555556038e60, event=0x555556135aa0) at util/session.c:132 gregkh#13 0x00005555557bb605 in do_flush (oe=0x555556038e60, show_progress=false) at util/ordered-events.c:245 gregkh#14 0x00005555557bb95c in __ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND, timestamp=0) at util/ordered-events.c:324 gregkh#15 0x00005555557bba46 in ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND) at util/ordered-events.c:342 gregkh#16 0x00005555557b1b3b in perf_event__process_finished_round (tool=0x7fffffffbce0, event=0x7ffff7c15bb8, oe=0x555556038e60) at util/session.c:780 gregkh#17 0x00005555557b3b27 in perf_session__process_user_event (session=0x5555560386d0, event=0x7ffff7c15bb8, file_offset=117688, file_path=0x555556038ff0 "perf.data") at util/session.c:1406 As you can see the entry->ms.map was NULL even if he->ms.map has a value. This is because 'sym' sort key is not given, so it cannot assume whether he->ms.sym and entry->ms.sym is the same. I only checked the 'sym' sort key here as it implies 'dso' behavior (so maps are the same). Fixes: ac01c8c ("perf hist: Update hist symbol when updating maps") Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit ac01c8c upstream. AddressSanitizer found a use-after-free bug in the symbol code which manifested as 'perf top' segfaulting. ==1238389==ERROR: AddressSanitizer: heap-use-after-free on address 0x60b00c48844b at pc 0x5650d8035961 bp 0x7f751aaecc90 sp 0x7f751aaecc80 READ of size 1 at 0x60b00c48844b thread T193 #0 0x5650d8035960 in _sort__sym_cmp util/sort.c:310 gregkh#1 0x5650d8043744 in hist_entry__cmp util/hist.c:1286 gregkh#2 0x5650d8043951 in hists__findnew_entry util/hist.c:614 gregkh#3 0x5650d804568f in __hists__add_entry util/hist.c:754 gregkh#4 0x5650d8045bf9 in hists__add_entry util/hist.c:772 gregkh#5 0x5650d8045df1 in iter_add_single_normal_entry util/hist.c:997 gregkh#6 0x5650d8043326 in hist_entry_iter__add util/hist.c:1242 gregkh#7 0x5650d7ceeefe in perf_event__process_sample /home/matt/src/linux/tools/perf/builtin-top.c:845 gregkh#8 0x5650d7ceeefe in deliver_event /home/matt/src/linux/tools/perf/builtin-top.c:1208 gregkh#9 0x5650d7fdb51b in do_flush util/ordered-events.c:245 gregkh#10 0x5650d7fdb51b in __ordered_events__flush util/ordered-events.c:324 gregkh#11 0x5650d7ced743 in process_thread /home/matt/src/linux/tools/perf/builtin-top.c:1120 gregkh#12 0x7f757ef1f133 in start_thread nptl/pthread_create.c:442 gregkh#13 0x7f757ef9f7db in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 When updating hist maps it's also necessary to update the hist symbol reference because the old one gets freed in map__put(). While this bug was probably introduced with 5c24b67 ("perf tools: Replace map->referenced & maps->removed_maps with map->refcnt"), the symbol objects were leaked until c087e94 ("perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL") was merged so the bug was masked. Fixes: c087e94 ("perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL") Reported-by: Yunzhao Li <[email protected]> Signed-off-by: Matt Fleming (Cloudflare) <[email protected]> Cc: Ian Rogers <[email protected]> Cc: [email protected] Cc: Namhyung Kim <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: [email protected] # v5.13+ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 9af2efe upstream. The fields in the hist_entry are filled on-demand which means they only have meaningful values when relevant sort keys are used. So if neither of 'dso' nor 'sym' sort keys are used, the map/symbols in the hist entry can be garbage. So it shouldn't access it unconditionally. I got a segfault, when I wanted to see cgroup profiles. $ sudo perf record -a --all-cgroups --synth=cgroup true $ sudo perf report -s cgroup Program received signal SIGSEGV, Segmentation fault. 0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48 48 return RC_CHK_ACCESS(map)->dso; (gdb) bt #0 0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48 gregkh#1 0x00005555557aa39b in map__load (map=0x0) at util/map.c:344 gregkh#2 0x00005555557aa592 in map__find_symbol (map=0x0, addr=140736115941088) at util/map.c:385 gregkh#3 0x00005555557ef000 in hists__findnew_entry (hists=0x555556039d60, entry=0x7fffffffa4c0, al=0x7fffffffa8c0, sample_self=true) at util/hist.c:644 gregkh#4 0x00005555557ef61c in __hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0, block_info=0x0, sample=0x7fffffffaa90, sample_self=true, ops=0x0) at util/hist.c:761 gregkh#5 0x00005555557ef71f in hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0, sample=0x7fffffffaa90, sample_self=true) at util/hist.c:779 gregkh#6 0x00005555557f00fb in iter_add_single_normal_entry (iter=0x7fffffffa900, al=0x7fffffffa8c0) at util/hist.c:1015 gregkh#7 0x00005555557f09a7 in hist_entry_iter__add (iter=0x7fffffffa900, al=0x7fffffffa8c0, max_stack_depth=127, arg=0x7fffffffbce0) at util/hist.c:1260 gregkh#8 0x00005555555ba7ce in process_sample_event (tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at builtin-report.c:334 gregkh#9 0x00005555557b30c8 in evlist__deliver_sample (evlist=0x555556039010, tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at util/session.c:1232 gregkh#10 0x00005555557b32bc in machines__deliver_event (machines=0x5555560388e8, evlist=0x555556039010, event=0x7ffff7c14128, sample=0x7fffffffaa90, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1271 gregkh#11 0x00005555557b3848 in perf_session__deliver_event (session=0x5555560386d0, event=0x7ffff7c14128, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1354 gregkh#12 0x00005555557affaf in ordered_events__deliver_event (oe=0x555556038e60, event=0x555556135aa0) at util/session.c:132 gregkh#13 0x00005555557bb605 in do_flush (oe=0x555556038e60, show_progress=false) at util/ordered-events.c:245 gregkh#14 0x00005555557bb95c in __ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND, timestamp=0) at util/ordered-events.c:324 gregkh#15 0x00005555557bba46 in ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND) at util/ordered-events.c:342 gregkh#16 0x00005555557b1b3b in perf_event__process_finished_round (tool=0x7fffffffbce0, event=0x7ffff7c15bb8, oe=0x555556038e60) at util/session.c:780 gregkh#17 0x00005555557b3b27 in perf_session__process_user_event (session=0x5555560386d0, event=0x7ffff7c15bb8, file_offset=117688, file_path=0x555556038ff0 "perf.data") at util/session.c:1406 As you can see the entry->ms.map was NULL even if he->ms.map has a value. This is because 'sym' sort key is not given, so it cannot assume whether he->ms.sym and entry->ms.sym is the same. I only checked the 'sym' sort key here as it implies 'dso' behavior (so maps are the same). Fixes: ac01c8c ("perf hist: Update hist symbol when updating maps") Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

We're seeing crashes from rq_qos_wake_function that look like this: BUG: unable to handle page fault for address: ffffafe180a40084 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0 Oops: Oops: 0002 [gregkh#1] PREEMPT SMP PTI CPU: 17 UID: 0 PID: 0 Comm: swapper/17 Not tainted 6.12.0-rc3-00013-geca631b8fe80 gregkh#11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40 Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 2c 0b 00 RSP: 0018:ffffafe180580ca0 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffffafe180a3f7a8 RCX: 0000000000000011 RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffafe180a40084 RBP: 0000000000000000 R08: 00000000001e7240 R09: 0000000000000011 R10: 0000000000000028 R11: 0000000000000888 R12: 0000000000000002 R13: ffffafe180a40084 R14: 0000000000000000 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff9aaf1f280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffafe180a40084 CR3: 000000010e428002 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> try_to_wake_up+0x5a/0x6a0 rq_qos_wake_function+0x71/0x80 __wake_up_common+0x75/0xa0 __wake_up+0x36/0x60 scale_up.part.0+0x50/0x110 wb_timer_fn+0x227/0x450 ... So rq_qos_wake_function() calls wake_up_process(data->task), which calls try_to_wake_up(), which faults in raw_spin_lock_irqsave(&p->pi_lock). p comes from data->task, and data comes from the waitqueue entry, which is stored on the waiter's stack in rq_qos_wait(). Analyzing the core dump with drgn, I found that the waiter had already woken up and moved on to a completely unrelated code path, clobbering what was previously data->task. Meanwhile, the waker was passing the clobbered garbage in data->task to wake_up_process(), leading to the crash. What's happening is that in between rq_qos_wake_function() deleting the waitqueue entry and calling wake_up_process(), rq_qos_wait() is finding that it already got a token and returning. The race looks like this: rq_qos_wait() rq_qos_wake_function() ============================================================== prepare_to_wait_exclusive() data->got_token = true; list_del_init(&curr->entry); if (data.got_token) break; finish_wait(&rqw->wait, &data.wq); ^- returns immediately because list_empty_careful(&wq_entry->entry) is true ... return, go do something else ... wake_up_process(data->task) (NO LONGER VALID!)-^ Normally, finish_wait() is supposed to synchronize against the waker. But, as noted above, it is returning immediately because the waitqueue entry has already been removed from the waitqueue. The bug is that rq_qos_wake_function() is accessing the waitqueue entry AFTER deleting it. Note that autoremove_wake_function() wakes the waiter and THEN deletes the waitqueue entry, which is the proper order. Fix it by swapping the order. We also need to use list_del_init_careful() to match the list_empty_careful() in finish_wait(). Fixes: 38cfb5a ("blk-wbt: improve waking of tasks") Cc: [email protected] Signed-off-by: Omar Sandoval <[email protected]> Acked-by: Tejun Heo <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Link: https://lore.kernel.org/r/d3bee2463a67b1ee597211823bf7ad3721c26e41.1729014591.git.osandov@fb.com Signed-off-by: Jens Axboe <[email protected]>

…g the sock [ Upstream commit 3cf7203 ] There is a race condition in vxlan that when deleting a vxlan device during receiving packets, there is a possibility that the sock is released after getting vxlan_sock vs from sk_user_data. Then in later vxlan_ecn_decapsulate(), vxlan_get_sk_family() we will got NULL pointer dereference. e.g. #0 [ffffa25ec6978a38] machine_kexec at ffffffff8c669757 gregkh#1 [ffffa25ec6978a90] __crash_kexec at ffffffff8c7c0a4d gregkh#2 [ffffa25ec6978b58] crash_kexec at ffffffff8c7c1c48 gregkh#3 [ffffa25ec6978b60] oops_end at ffffffff8c627f2b gregkh#4 [ffffa25ec6978b80] page_fault_oops at ffffffff8c678fcb gregkh#5 [ffffa25ec6978bd8] exc_page_fault at ffffffff8d109542 gregkh#6 [ffffa25ec6978c00] asm_exc_page_fault at ffffffff8d200b62 [exception RIP: vxlan_ecn_decapsulate+0x3b] RIP: ffffffffc1014e7b RSP: ffffa25ec6978cb0 RFLAGS: 00010246 RAX: 0000000000000008 RBX: ffff8aa000888000 RCX: 0000000000000000 RDX: 000000000000000e RSI: ffff8a9fc7ab803e RDI: ffff8a9fd1168700 RBP: ffff8a9fc7ab803e R8: 0000000000700000 R9: 00000000000010ae R10: ffff8a9fcb748980 R11: 0000000000000000 R12: ffff8a9fd1168700 R13: ffff8aa000888000 R14: 00000000002a0000 R15: 00000000000010ae ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 gregkh#7 [ffffa25ec6978ce8] vxlan_rcv at ffffffffc10189cd [vxlan] gregkh#8 [ffffa25ec6978d90] udp_queue_rcv_one_skb at ffffffff8cfb6507 gregkh#9 [ffffa25ec6978dc0] udp_unicast_rcv_skb at ffffffff8cfb6e45 gregkh#10 [ffffa25ec6978dc8] __udp4_lib_rcv at ffffffff8cfb8807 gregkh#11 [ffffa25ec6978e20] ip_protocol_deliver_rcu at ffffffff8cf76951 gregkh#12 [ffffa25ec6978e48] ip_local_deliver at ffffffff8cf76bde gregkh#13 [ffffa25ec6978ea0] __netif_receive_skb_one_core at ffffffff8cecde9b gregkh#14 [ffffa25ec6978ec8] process_backlog at ffffffff8cece139 gregkh#15 [ffffa25ec6978f00] __napi_poll at ffffffff8ceced1a gregkh#16 [ffffa25ec6978f28] net_rx_action at ffffffff8cecf1f3 gregkh#17 [ffffa25ec6978fa0] __softirqentry_text_start at ffffffff8d4000ca gregkh#18 [ffffa25ec6978ff0] do_softirq at ffffffff8c6fbdc3 Reproducer: https://github.com/Mellanox/ovs-tests/blob/master/test-ovs-vxlan-remove-tunnel-during-traffic.sh Fix this by waiting for all sk_user_data reader to finish before releasing the sock. Reported-by: Jianlin Shi <[email protected]> Suggested-by: Jakub Sitnicki <[email protected]> Fixes: 6a93cc9 ("udp-tunnel: Add a few more UDP tunnel APIs") Signed-off-by: Hangbin Liu <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

commit e972b08 upstream. We're seeing crashes from rq_qos_wake_function that look like this: BUG: unable to handle page fault for address: ffffafe180a40084 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0 Oops: Oops: 0002 [gregkh#1] PREEMPT SMP PTI CPU: 17 UID: 0 PID: 0 Comm: swapper/17 Not tainted 6.12.0-rc3-00013-geca631b8fe80 gregkh#11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40 Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 2c 0b 00 RSP: 0018:ffffafe180580ca0 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffffafe180a3f7a8 RCX: 0000000000000011 RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffafe180a40084 RBP: 0000000000000000 R08: 00000000001e7240 R09: 0000000000000011 R10: 0000000000000028 R11: 0000000000000888 R12: 0000000000000002 R13: ffffafe180a40084 R14: 0000000000000000 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff9aaf1f280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffafe180a40084 CR3: 000000010e428002 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> try_to_wake_up+0x5a/0x6a0 rq_qos_wake_function+0x71/0x80 __wake_up_common+0x75/0xa0 __wake_up+0x36/0x60 scale_up.part.0+0x50/0x110 wb_timer_fn+0x227/0x450 ... So rq_qos_wake_function() calls wake_up_process(data->task), which calls try_to_wake_up(), which faults in raw_spin_lock_irqsave(&p->pi_lock). p comes from data->task, and data comes from the waitqueue entry, which is stored on the waiter's stack in rq_qos_wait(). Analyzing the core dump with drgn, I found that the waiter had already woken up and moved on to a completely unrelated code path, clobbering what was previously data->task. Meanwhile, the waker was passing the clobbered garbage in data->task to wake_up_process(), leading to the crash. What's happening is that in between rq_qos_wake_function() deleting the waitqueue entry and calling wake_up_process(), rq_qos_wait() is finding that it already got a token and returning. The race looks like this: rq_qos_wait() rq_qos_wake_function() ============================================================== prepare_to_wait_exclusive() data->got_token = true; list_del_init(&curr->entry); if (data.got_token) break; finish_wait(&rqw->wait, &data.wq); ^- returns immediately because list_empty_careful(&wq_entry->entry) is true ... return, go do something else ... wake_up_process(data->task) (NO LONGER VALID!)-^ Normally, finish_wait() is supposed to synchronize against the waker. But, as noted above, it is returning immediately because the waitqueue entry has already been removed from the waitqueue. The bug is that rq_qos_wake_function() is accessing the waitqueue entry AFTER deleting it. Note that autoremove_wake_function() wakes the waiter and THEN deletes the waitqueue entry, which is the proper order. Fix it by swapping the order. We also need to use list_del_init_careful() to match the list_empty_careful() in finish_wait(). Fixes: 38cfb5a ("blk-wbt: improve waking of tasks") Cc: [email protected] Signed-off-by: Omar Sandoval <[email protected]> Acked-by: Tejun Heo <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Link: https://lore.kernel.org/r/d3bee2463a67b1ee597211823bf7ad3721c26e41.1729014591.git.osandov@fb.com Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

Enabling CONFIG_PROVE_RCU_LIST with its dependence CONFIG_RCU_EXPERT creates this splat when an MPTCP socket is created: ============================= WARNING: suspicious RCU usage 6.12.0-rc2+ gregkh#11 Not tainted ----------------------------- net/mptcp/sched.c:44 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 no locks held by mptcp_connect/176. stack backtrace: CPU: 0 UID: 0 PID: 176 Comm: mptcp_connect Not tainted 6.12.0-rc2+ gregkh#11 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) lockdep_rcu_suspicious (kernel/locking/lockdep.c:6822) mptcp_sched_find (net/mptcp/sched.c:44 (discriminator 7)) mptcp_init_sock (net/mptcp/protocol.c:2867 (discriminator 1)) ? sock_init_data_uid (arch/x86/include/asm/atomic.h:28) inet_create.part.0.constprop.0 (net/ipv4/af_inet.c:386) ? __sock_create (include/linux/rcupdate.h:347 (discriminator 1)) __sock_create (net/socket.c:1576) __sys_socket (net/socket.c:1671) ? __pfx___sys_socket (net/socket.c:1712) ? do_user_addr_fault (arch/x86/mm/fault.c:1419 (discriminator 1)) __x64_sys_socket (net/socket.c:1728) do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1)) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) That's because when the socket is initialised, rcu_read_lock() is not used despite the explicit comment written above the declaration of mptcp_sched_find() in sched.c. Adding the missing lock/unlock avoids the warning. Fixes: 1730b2b ("mptcp: add sched in mptcp_sock") Cc: [email protected] Closes: multipath-tcp/mptcp_net-next#523 Reviewed-by: Geliang Tang <[email protected]> Signed-off-by: Matthieu Baerts (NGI0) <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>

Disable strict aliasing, as has been done in the kernel proper for decades (literally since before git history) to fix issues where gcc will optimize away loads in code that looks 100% correct, but is _technically_ undefined behavior, and thus can be thrown away by the compiler. E.g. arm64's vPMU counter access test casts a uint64_t (unsigned long) pointer to a u64 (unsigned long long) pointer when setting PMCR.N via u64p_replace_bits(), which gcc-13 detects and optimizes away, i.e. ignores the result and uses the original PMCR. The issue is most easily observed by making set_pmcr_n() noinline and wrapping the call with printf(), e.g. sans comments, for this code: printf("orig = %lx, next = %lx, want = %lu\n", pmcr_orig, pmcr, pmcr_n); set_pmcr_n(&pmcr, pmcr_n); printf("orig = %lx, next = %lx, want = %lu\n", pmcr_orig, pmcr, pmcr_n); gcc-13 generates: 0000000000401c90 <set_pmcr_n>: 401c90: f9400002 ldr x2, [x0] 401c94: b3751022 bfi x2, x1, gregkh#11, gregkh#5 401c98: f9000002 str x2, [x0] 401c9c: d65f03c0 ret 0000000000402660 <test_create_vpmu_vm_with_pmcr_n>: 402724: aa1403e3 mov x3, x20 402728: aa1503e2 mov x2, x21 40272c: aa1603e0 mov x0, x22 402730: aa1503e1 mov x1, x21 402734: 940060ff bl 41ab30 <_IO_printf> 402738: aa1403e1 mov x1, x20 40273c: 910183e0 add x0, sp, #0x60 402740: 97fffd54 bl 401c90 <set_pmcr_n> 402744: aa1403e3 mov x3, x20 402748: aa1503e2 mov x2, x21 40274c: aa1503e1 mov x1, x21 402750: aa1603e0 mov x0, x22 402754: 940060f7 bl 41ab30 <_IO_printf> with the value stored in [sp + 0x60] ignored by both printf() above and in the test proper, resulting in a false failure due to vcpu_set_reg() simply storing the original value, not the intended value. $ ./vpmu_counter_access Random seed: 0x6b8b4567 orig = 3040, next = 3040, want = 0 orig = 3040, next = 3040, want = 0 ==== Test Assertion Failure ==== aarch64/vpmu_counter_access.c:505: pmcr_n == get_pmcr_n(pmcr) pid=71578 tid=71578 errno=9 - Bad file descriptor 1 0x400673: run_access_test at vpmu_counter_access.c:522 2 (inlined by) main at vpmu_counter_access.c:643 3 0x4132d7: __libc_start_call_main at libc-start.o:0 4 0x413653: __libc_start_main at ??:0 5 0x40106f: _start at ??:0 Failed to update PMCR.N to 0 (received: 6) Somewhat bizarrely, gcc-11 also exhibits the same behavior, but only if set_pmcr_n() is marked noinline, whereas gcc-13 fails even if set_pmcr_n() is inlined in its sole caller. Cc: [email protected] Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116912 Signed-off-by: Sean Christopherson <[email protected]>

[ Upstream commit 3deb12c ] Enabling CONFIG_PROVE_RCU_LIST with its dependence CONFIG_RCU_EXPERT creates this splat when an MPTCP socket is created: ============================= WARNING: suspicious RCU usage 6.12.0-rc2+ #11 Not tainted ----------------------------- net/mptcp/sched.c:44 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 no locks held by mptcp_connect/176. stack backtrace: CPU: 0 UID: 0 PID: 176 Comm: mptcp_connect Not tainted 6.12.0-rc2+ #11 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) lockdep_rcu_suspicious (kernel/locking/lockdep.c:6822) mptcp_sched_find (net/mptcp/sched.c:44 (discriminator 7)) mptcp_init_sock (net/mptcp/protocol.c:2867 (discriminator 1)) ? sock_init_data_uid (arch/x86/include/asm/atomic.h:28) inet_create.part.0.constprop.0 (net/ipv4/af_inet.c:386) ? __sock_create (include/linux/rcupdate.h:347 (discriminator 1)) __sock_create (net/socket.c:1576) __sys_socket (net/socket.c:1671) ? __pfx___sys_socket (net/socket.c:1712) ? do_user_addr_fault (arch/x86/mm/fault.c:1419 (discriminator 1)) __x64_sys_socket (net/socket.c:1728) do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1)) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) That's because when the socket is initialised, rcu_read_lock() is not used despite the explicit comment written above the declaration of mptcp_sched_find() in sched.c. Adding the missing lock/unlock avoids the warning. Fixes: 1730b2b ("mptcp: add sched in mptcp_sock") Cc: [email protected] Closes: multipath-tcp/mptcp_net-next#523 Reviewed-by: Geliang Tang <[email protected]> Signed-off-by: Matthieu Baerts (NGI0) <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

commit e972b08 upstream. We're seeing crashes from rq_qos_wake_function that look like this: BUG: unable to handle page fault for address: ffffafe180a40084 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0 Oops: Oops: 0002 [#1] PREEMPT SMP PTI CPU: 17 UID: 0 PID: 0 Comm: swapper/17 Not tainted 6.12.0-rc3-00013-geca631b8fe80 #11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40 Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 2c 0b 00 RSP: 0018:ffffafe180580ca0 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffffafe180a3f7a8 RCX: 0000000000000011 RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffafe180a40084 RBP: 0000000000000000 R08: 00000000001e7240 R09: 0000000000000011 R10: 0000000000000028 R11: 0000000000000888 R12: 0000000000000002 R13: ffffafe180a40084 R14: 0000000000000000 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff9aaf1f280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffafe180a40084 CR3: 000000010e428002 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> try_to_wake_up+0x5a/0x6a0 rq_qos_wake_function+0x71/0x80 __wake_up_common+0x75/0xa0 __wake_up+0x36/0x60 scale_up.part.0+0x50/0x110 wb_timer_fn+0x227/0x450 ... So rq_qos_wake_function() calls wake_up_process(data->task), which calls try_to_wake_up(), which faults in raw_spin_lock_irqsave(&p->pi_lock). p comes from data->task, and data comes from the waitqueue entry, which is stored on the waiter's stack in rq_qos_wait(). Analyzing the core dump with drgn, I found that the waiter had already woken up and moved on to a completely unrelated code path, clobbering what was previously data->task. Meanwhile, the waker was passing the clobbered garbage in data->task to wake_up_process(), leading to the crash. What's happening is that in between rq_qos_wake_function() deleting the waitqueue entry and calling wake_up_process(), rq_qos_wait() is finding that it already got a token and returning. The race looks like this: rq_qos_wait() rq_qos_wake_function() ============================================================== prepare_to_wait_exclusive() data->got_token = true; list_del_init(&curr->entry); if (data.got_token) break; finish_wait(&rqw->wait, &data.wq); ^- returns immediately because list_empty_careful(&wq_entry->entry) is true ... return, go do something else ... wake_up_process(data->task) (NO LONGER VALID!)-^ Normally, finish_wait() is supposed to synchronize against the waker. But, as noted above, it is returning immediately because the waitqueue entry has already been removed from the waitqueue. The bug is that rq_qos_wake_function() is accessing the waitqueue entry AFTER deleting it. Note that autoremove_wake_function() wakes the waiter and THEN deletes the waitqueue entry, which is the proper order. Fix it by swapping the order. We also need to use list_del_init_careful() to match the list_empty_careful() in finish_wait(). Fixes: 38cfb5a ("blk-wbt: improve waking of tasks") Cc: [email protected] Signed-off-by: Omar Sandoval <[email protected]> Acked-by: Tejun Heo <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Link: https://lore.kernel.org/r/d3bee2463a67b1ee597211823bf7ad3721c26e41.1729014591.git.osandov@fb.com Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 9b5aad3a7498c261116a0251fe57f14ba9c4c6cf Author: Greg Kroah-Hartman <[email protected]> Date: Fri Nov 8 16:28:28 2024 +0100 Linux 6.6.60 Link: https://lore.kernel.org/r/[email protected] Tested-by: SeongJae Park <[email protected]> Tested-by: Shuah Khan <[email protected]> Tested-by: Linux Kernel Functional Testing <[email protected]> Tested-by: Peter Schneider <[email protected]> Tested-by: Takeshi Ogasawara <[email protected]> Tested-by: Jon Hunter <[email protected]> Tested-by: Florian Fainelli <[email protected]> Tested-by: Ron Economos <[email protected]> Tested-by: Hardik Garg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit cc082e50375a29596153fc3f1f8fc85ad1b0b5b9 Author: Konstantin Komarov <[email protected]> Date: Thu Sep 5 15:03:48 2024 +0300 fs/ntfs3: Sequential field availability check in mi_enum_attr() commit 090f612756a9720ec18b0b130e28be49839d7cb5 upstream. The code is slightly reformatted to consistently check field availability without duplication. Fixes: 556bdf27c2dd ("ntfs3: Add bounds checking to mi_enum_attr()") Signed-off-by: Konstantin Komarov <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit 10c20d79d59cadfe572480d98cec271a89ffb024 Author: Srinivasan Shanmugam <[email protected]> Date: Mon May 27 20:15:21 2024 +0530 drm/amd/display: Add null checks for 'stream' and 'plane' before dereferencing commit 15c2990e0f0108b9c3752d7072a97d45d4283aea upstream. This commit adds null checks for the 'stream' and 'plane' variables in the dcn30_apply_idle_power_optimizations function. These variables were previously assumed to be null at line 922, but they were used later in the code without checking if they were null. This could potentially lead to a null pointer dereference, which would cause a crash. The null checks ensure that 'stream' and 'plane' are not null before they are used, preventing potential crashes. Fixes the below static smatch checker: drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:938 dcn30_apply_idle_power_optimizations() error: we previously assumed 'stream' could be null (see line 922) drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:940 dcn30_apply_idle_power_optimizations() error: we previously assumed 'plane' could be null (see line 922) Cc: Tom Chung <[email protected]> Cc: Nicholas Kazlauskas <[email protected]> Cc: Bhawanpreet Lakha <[email protected]> Cc: Rodrigo Siqueira <[email protected]> Cc: Roman Li <[email protected]> Cc: Hersen Wu <[email protected]> Cc: Alex Hung <[email protected]> Cc: Aurabindo Pillai <[email protected]> Cc: Harry Wentland <[email protected]> Signed-off-by: Srinivasan Shanmugam <[email protected]> Reviewed-by: Aurabindo Pillai <[email protected]> Signed-off-by: Alex Deucher <[email protected]> [Xiangyu: Modified file path to backport this commit] Signed-off-by: Xiangyu Chen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit e979a6a626abf1358a5bb79219eea82ac160d3d3 Author: Peter Ujfalusi <[email protected]> Date: Tue Sep 19 13:31:15 2023 +0300 ASoC: SOF: ipc4-control: Add support for ALSA enum control commit 07a866a41982c896dc46476f57d209a200602946 upstream. Enum controls use generic param_id and a generic struct where the data is passed to the firmware. Signed-off-by: Peter Ujfalusi <[email protected]> Reviewed-by: Bard Liao <[email protected]> Reviewed-by: Pierre-Louis Bossart <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit 3facc0417d3d7b3ba5822e74155bcb1267ce62c1 Author: Peter Ujfalusi <[email protected]> Date: Tue Sep 19 13:31:14 2023 +0300 ASoC: SOF: ipc4-control: Add support for ALSA switch control commit 4a2fd607b7ca6128ee3532161505da7624197f55 upstream. Volume controls with a max value of 1 are switches. Switch controls use generic param_id and a generic struct where the data is passed to the firmware. Signed-off-by: Peter Ujfalusi <[email protected]> Reviewed-by: Bard Liao <[email protected]> Reviewed-by: Pierre-Louis Bossart <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit f01d8fc623711046e1efee00827bff6d5882cdfd Author: Peter Ujfalusi <[email protected]> Date: Tue Sep 19 13:31:13 2023 +0300 ASoC: SOF: ipc4-topology: Add definition for generic switch/enum control commit 060a07cd9bc69eba2da33ed96b1fa69ead60bab1 upstream. Currently IPC4 has no notion of a switch or enum type of control which is a generic concept in ALSA. The generic support for these control types will be as follows: - large config is used to send the channel-value par array - param_id of a SWITCH type is 200 - param_id of an ENUM type is 201 Each module need to support a switch or/and enum must handle these universal param_ids. The message payload is described by struct sof_ipc4_control_msg_payload. Signed-off-by: Peter Ujfalusi <[email protected]> Reviewed-by: Bard Liao <[email protected]> Reviewed-by: Pierre-Louis Bossart <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit d54afaef6570c277070c3cafe1ed73dcdc129e0a Author: Chuck Lever <[email protected]> Date: Tue Sep 19 11:35:15 2023 -0400 SUNRPC: Remove BUG_ON call sites commit 789ce196a31dd13276076762204bee87df893e53 upstream. There is no need to take down the whole system for these assertions. I'd rather not attempt a heroic save here, as some bug has occurred that has left the transport data structures in an unknown state. Just warn and then leak the left-over resources. Acked-by: Christian Brauner <[email protected]> Reviewed-by: NeilBrown <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Dominique Martinet <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit 27a58a19bd20a7afe369da2ce6d4ebea70768acd Author: Michael Walle <[email protected]> Date: Fri Jun 21 14:09:29 2024 +0200 mtd: spi-nor: winbond: fix w25q128 regression commit d35df77707bf5ae1221b5ba1c8a88cf4fcdd4901 upstream. Commit 83e824a4a595 ("mtd: spi-nor: Correct flags for Winbond w25q128") removed the flags for non-SFDP devices. It was assumed that it wasn't in use anymore. This wasn't true. Add the no_sfdp_flags as well as the size again. We add the additional flags for dual and quad read because they have been reported to work properly by Hartmut using both older and newer versions of this flash, the similar flashes with 64Mbit and 256Mbit already have these flags and because it will (luckily) trigger our legacy SFDP parsing, so newer versions with SFDP support will still get the parameters from the SFDP tables. Reported-by: Hartmut Birr <[email protected]> Closes: https://lore.kernel.org/r/CALxbwRo_-9CaJmt7r7ELgu+vOcgk=xZcGHobnKf=oT2=u4d4aA@mail.gmail.com/ Fixes: 83e824a4a595 ("mtd: spi-nor: Correct flags for Winbond w25q128") Reviewed-by: Linus Walleij <[email protected]> Signed-off-by: Michael Walle <[email protected]> Acked-by: Tudor Ambarus <[email protected]> Reviewed-by: Esben Haabendal <[email protected]> Reviewed-by: Pratyush Yadav <[email protected]> Signed-off-by: Pratyush Yadav <[email protected]> Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected] [Backported to v6.6 - vastly different due to upstream changes] Reviewed-by: Tudor Ambarus <[email protected]> Signed-off-by: Linus Walleij <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit 3d544942c0010feedc048b048ee0c35d2d921100 Author: David Hildenbrand <[email protected]> Date: Fri Oct 11 12:24:45 2024 +0200 mm: don't install PMD mappings when THPs are disabled by the hw/process/vma commit 2b0f922323ccfa76219bcaacd35cd50aeaa13592 upstream. We (or rather, readahead logic :) ) might be allocating a THP in the pagecache and then try mapping it into a process that explicitly disabled THP: we might end up installing PMD mappings. This is a problem for s390x KVM, which explicitly remaps all PMD-mapped THPs to be PTE-mapped in s390_enable_sie()->thp_split_mm(), before starting the VM. For example, starting a VM backed on a file system with large folios supported makes the VM crash when the VM tries accessing such a mapping using KVM. Is it also a problem when the HW disabled THP using TRANSPARENT_HUGEPAGE_UNSUPPORTED? At least on x86 this would be the case without X86_FEATURE_PSE. In the future, we might be able to do better on s390x and only disallow PMD mappings -- what s390x and likely TRANSPARENT_HUGEPAGE_UNSUPPORTED really wants. For now, fix it by essentially performing the same check as would be done in __thp_vma_allowable_orders() or in shmem code, where this works as expected, and disallow PMD mappings, making us fallback to PTE mappings. Link: https://lkml.kernel.org/r/[email protected] Fixes: 793917d997df ("mm/readahead: Add large folio readahead") Signed-off-by: David Hildenbrand <[email protected]> Reported-by: Leo Fu <[email protected]> Tested-by: Thomas Huth <[email protected]> Cc: Thomas Huth <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Ryan Roberts <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Janosch Frank <[email protected]> Cc: Claudio Imbrenda <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit 02ec4b3bba49e8d3abb25a3feba6875cae12da92 Author: Kefeng Wang <[email protected]> Date: Fri Oct 11 12:24:44 2024 +0200 mm: huge_memory: add vma_thp_disabled() and thp_disabled_by_hw() commit 963756aac1f011d904ddd9548ae82286d3a91f96 upstream. Patch series "mm: don't install PMD mappings when THPs are disabled by the hw/process/vma". During testing, it was found that we can get PMD mappings in processes where THP (and more precisely, PMD mappings) are supposed to be disabled. While it works as expected for anon+shmem, the pagecache is the problematic bit. For s390 KVM this currently means that a VM backed by a file located on filesystem with large folio support can crash when KVM tries accessing the problematic page, because the readahead logic might decide to use a PMD-sized THP and faulting it into the page tables will install a PMD mapping, something that s390 KVM cannot tolerate. This might also be a problem with HW that does not support PMD mappings, but I did not try reproducing it. Fix it by respecting the ways to disable THPs when deciding whether we can install a PMD mapping. khugepaged should already be taking care of not collapsing if THPs are effectively disabled for the hw/process/vma. This patch (of 2): Add vma_thp_disabled() and thp_disabled_by_hw() helpers to be shared by shmem_allowable_huge_orders() and __thp_vma_allowable_orders(). [[email protected]: rename to vma_thp_disabled(), split out thp_disabled_by_hw() ] Link: https://lkml.kernel.org/r/[email protected] Fixes: 793917d997df ("mm/readahead: Add large folio readahead") Signed-off-by: Kefeng Wang <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Reported-by: Leo Fu <[email protected]> Tested-by: Thomas Huth <[email protected]> Reviewed-by: Ryan Roberts <[email protected]> Cc: Boqiao Fu <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Claudio Imbrenda <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Janosch Frank <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit fc621e7a043de346c33bd7ae7e2e0c651d6152ef Author: Johannes Berg <[email protected]> Date: Wed Oct 23 09:17:44 2024 +0200 wifi: iwlwifi: mvm: fix 6 GHz scan construction commit 7245012f0f496162dd95d888ed2ceb5a35170f1a upstream. If more than 255 colocated APs exist for the set of all APs found during 2.4/5 GHz scanning, then the 6 GHz scan construction will loop forever since the loop variable has type u8, which can never reach the number found when that's bigger than 255, and is stored in a u32 variable. Also move it into the loops to have a smaller scope. Using a u32 there is fine, we limit the number of APs in the scan list and each has a limit on the number of RNR entries due to the frame size. With a limit of 1000 scan results, a frame size upper bound of 4096 (really it's more like ~2300) and a TBTT entry size of at least 11, we get an upper bound for the number of ~372k, well in the bounds of a u32. Cc: [email protected] Fixes: eae94cf82d74 ("iwlwifi: mvm: add support for 6GHz") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219375 Link: https://patch.msgid.link/20241023091744.f4baed5c08a1.I8b417148bbc8c5d11c101e1b8f5bf372e17bf2a7@changeid Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit f2f1fa446676c21edb777e6d2bc4fa8f956fab68 Author: Ryusuke Konishi <[email protected]> Date: Fri Oct 18 04:33:10 2024 +0900 nilfs2: fix kernel bug due to missing clearing of checked flag commit 41e192ad2779cae0102879612dfe46726e4396aa upstream. Syzbot reported that in directory operations after nilfs2 detects filesystem corruption and degrades to read-only, __block_write_begin_int(), which is called to prepare block writes, may fail the BUG_ON check for accesses exceeding the folio/page size, triggering a kernel bug. This was found to be because the "checked" flag of a page/folio was not cleared when it was discarded by nilfs2's own routine, which causes the sanity check of directory entries to be skipped when the directory page/folio is reloaded. So, fix that. This was necessary when the use of nilfs2's own page discard routine was applied to more than just metadata files. Link: https://lkml.kernel.org/r/[email protected] Fixes: 8c26c4e2694a ("nilfs2: fix issue with flush kernel thread after remount in RO mode because of driver's internal error or metadata corruption") Signed-off-by: Ryusuke Konishi <[email protected]> Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=d6ca2daf692c7a82f959 Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit a53c2d847627b790fb3bd8b00e02c247941b17e0 Author: Zong-Zhe Yang <[email protected]> Date: Mon Jun 17 19:52:17 2024 +0800 wifi: mac80211: fix NULL dereference at band check in starting tx ba session commit 021d53a3d87eeb9dbba524ac515651242a2a7e3b upstream. In MLD connection, link_data/link_conf are dynamically allocated. They don't point to vif->bss_conf. So, there will be no chanreq assigned to vif->bss_conf and then the chan will be NULL. Tweak the code to check ht_supported/vht_supported/has_he/has_eht on sta deflink. Crash log (with rtw89 version under MLO development): [ 9890.526087] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 9890.526102] #PF: supervisor read access in kernel mode [ 9890.526105] #PF: error_code(0x0000) - not-present page [ 9890.526109] PGD 0 P4D 0 [ 9890.526114] Oops: 0000 [#1] PREEMPT SMP PTI [ 9890.526119] CPU: 2 PID: 6367 Comm: kworker/u16:2 Kdump: loaded Tainted: G OE 6.9.0 #1 [ 9890.526123] Hardware name: LENOVO 2356AD1/2356AD1, BIOS G7ETB3WW (2.73 ) 11/28/2018 [ 9890.526126] Workqueue: phy2 rtw89_core_ba_work [rtw89_core] [ 9890.526203] RIP: 0010:ieee80211_start_tx_ba_session (net/mac80211/agg-tx.c:618 (discriminator 1)) mac80211 [ 9890.526279] Code: f7 e8 d5 93 3e ea 48 83 c4 28 89 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc 49 8b 84 24 e0 f1 ff ff 48 8b 80 90 1b 00 00 <83> 38 03 0f 84 37 fe ff ff bb ea ff ff ff eb cc 49 8b 84 24 10 f3 All code ======== 0: f7 e8 imul %eax 2: d5 (bad) 3: 93 xchg %eax,%ebx 4: 3e ea ds (bad) 6: 48 83 c4 28 add $0x28,%rsp a: 89 d8 mov %ebx,%eax c: 5b pop %rbx d: 41 5c pop %r12 f: 41 5d pop %r13 11: 41 5e pop %r14 13: 41 5f pop %r15 15: 5d pop %rbp 16: c3 retq 17: cc int3 18: cc int3 19: cc int3 1a: cc int3 1b: 49 8b 84 24 e0 f1 ff mov -0xe20(%r12),%rax 22: ff 23: 48 8b 80 90 1b 00 00 mov 0x1b90(%rax),%rax 2a:* 83 38 03 cmpl $0x3,(%rax) <-- trapping instruction 2d: 0f 84 37 fe ff ff je 0xfffffffffffffe6a 33: bb ea ff ff ff mov $0xffffffea,%ebx 38: eb cc jmp 0x6 3a: 49 rex.WB 3b: 8b .byte 0x8b 3c: 84 24 10 test %ah,(%rax,%rdx,1) 3f: f3 repz Code starting with the faulting instruction =========================================== 0: 83 38 03 cmpl $0x3,(%rax) 3: 0f 84 37 fe ff ff je 0xfffffffffffffe40 9: bb ea ff ff ff mov $0xffffffea,%ebx e: eb cc jmp 0xffffffffffffffdc 10: 49 rex.WB 11: 8b .byte 0x8b 12: 84 24 10 test %ah,(%rax,%rdx,1) 15: f3 repz [ 9890.526285] RSP: 0018:ffffb8db09013d68 EFLAGS: 00010246 [ 9890.526291] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff9308e0d656c8 [ 9890.526295] RDX: 0000000000000000 RSI: ffffffffab99460b RDI: ffffffffab9a7685 [ 9890.526300] RBP: ffffb8db09013db8 R08: 0000000000000000 R09: 0000000000000873 [ 9890.526304] R10: ffff9308e0d64800 R11: 0000000000000002 R12: ffff9308e5ff6e70 [ 9890.526308] R13: ffff930952500e20 R14: ffff9309192a8c00 R15: 0000000000000000 [ 9890.526313] FS: 0000000000000000(0000) GS:ffff930b4e700000(0000) knlGS:0000000000000000 [ 9890.526316] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 9890.526318] CR2: 0000000000000000 CR3: 0000000391c58005 CR4: 00000000001706f0 [ 9890.526321] Call Trace: [ 9890.526324] <TASK> [ 9890.526327] ? show_regs (arch/x86/kernel/dumpstack.c:479) [ 9890.526335] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434) [ 9890.526340] ? page_fault_oops (arch/x86/mm/fault.c:713) [ 9890.526347] ? search_module_extables (kernel/module/main.c:3256 (discriminator 3)) [ 9890.526353] ? ieee80211_start_tx_ba_session (net/mac80211/agg-tx.c:618 (discriminator 1)) mac80211 Signed-off-by: Zong-Zhe Yang <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Xiangyu Chen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit 6a91a5816b289018e0b42a25444c0b4f8c637dca Author: Pavel Begunkov <[email protected]> Date: Wed Apr 10 02:26:54 2024 +0100 io_uring: always lock __io_cqring_overflow_flush commit 8d09a88ef9d3cb7d21d45c39b7b7c31298d23998 upstream. Conditional locking is never great, in case of __io_cqring_overflow_flush(), which is a slow path, it's not justified. Don't handle IOPOLL separately, always grab uring_lock for overflow flushing. Signed-off-by: Pavel Begunkov <[email protected]> Link: https://lore.kernel.org/r/162947df299aa12693ac4b305dacedab32ec7976.1712708261.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit e3fb0e6afcc399660770428a35162b4880e2e14e Author: Haibo Chen <[email protected]> Date: Thu Sep 5 17:43:38 2024 +0800 arm64: dts: imx8ulp: correct the flexspi compatible string commit 409dc5196d5b6eb67468a06bf4d2d07d7225a67b upstream. The flexspi on imx8ulp only has 16 LUTs, and imx8mm flexspi has 32 LUTs, so correct the compatible string here, otherwise will meet below error: [ 1.119072] ------------[ cut here ]------------ [ 1.123926] WARNING: CPU: 0 PID: 1 at drivers/spi/spi-nxp-fspi.c:855 nxp_fspi_exec_op+0xb04/0xb64 [ 1.133239] Modules linked in: [ 1.136448] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-rc6-next-20240902-00001-g131bf9439dd9 #69 [ 1.146821] Hardware name: NXP i.MX8ULP EVK (DT) [ 1.151647] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 1.158931] pc : nxp_fspi_exec_op+0xb04/0xb64 [ 1.163496] lr : nxp_fspi_exec_op+0xa34/0xb64 [ 1.168060] sp : ffff80008002b2a0 [ 1.171526] x29: ffff80008002b2d0 x28: 0000000000000000 x27: 0000000000000000 [ 1.179002] x26: ffff2eb645542580 x25: ffff800080610014 x24: ffff800080610000 [ 1.186480] x23: ffff2eb645548080 x22: 0000000000000006 x21: ffff2eb6455425e0 [ 1.193956] x20: 0000000000000000 x19: ffff80008002b5e0 x18: ffffffffffffffff [ 1.201432] x17: ffff2eb644467508 x16: 0000000000000138 x15: 0000000000000002 [ 1.208907] x14: 0000000000000000 x13: ffff2eb6400d8080 x12: 00000000ffffff00 [ 1.216378] x11: 0000000000000000 x10: ffff2eb6400d8080 x9 : ffff2eb697adca80 [ 1.223850] x8 : ffff2eb697ad3cc0 x7 : 0000000100000000 x6 : 0000000000000001 [ 1.231324] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000000007a6 [ 1.238795] x2 : 0000000000000000 x1 : 00000000000001ce x0 : 00000000ffffff92 [ 1.246267] Call trace: [ 1.248824] nxp_fspi_exec_op+0xb04/0xb64 [ 1.253031] spi_mem_exec_op+0x3a0/0x430 [ 1.257139] spi_nor_read_id+0x80/0xcc [ 1.261065] spi_nor_scan+0x1ec/0xf10 [ 1.264901] spi_nor_probe+0x108/0x2fc [ 1.268828] spi_mem_probe+0x6c/0xbc [ 1.272574] spi_probe+0x84/0xe4 [ 1.275958] really_probe+0xbc/0x29c [ 1.279713] __driver_probe_device+0x78/0x12c [ 1.284277] driver_probe_device+0xd8/0x15c [ 1.288660] __device_attach_driver+0xb8/0x134 [ 1.293316] bus_for_each_drv+0x88/0xe8 [ 1.297337] __device_attach+0xa0/0x190 [ 1.301353] device_initial_probe+0x14/0x20 [ 1.305734] bus_probe_device+0xac/0xb0 [ 1.309752] device_add+0x5d0/0x790 [ 1.313408] __spi_add_device+0x134/0x204 [ 1.317606] of_register_spi_device+0x3b4/0x590 [ 1.322348] spi_register_controller+0x47c/0x754 [ 1.327181] devm_spi_register_controller+0x4c/0xa4 [ 1.332289] nxp_fspi_probe+0x1cc/0x2b0 [ 1.336307] platform_probe+0x68/0xc4 [ 1.340145] really_probe+0xbc/0x29c [ 1.343893] __driver_probe_device+0x78/0x12c [ 1.348457] driver_probe_device+0xd8/0x15c [ 1.352838] __driver_attach+0x90/0x19c [ 1.356857] bus_for_each_dev+0x7c/0xdc [ 1.360877] driver_attach+0x24/0x30 [ 1.364624] bus_add_driver+0xe4/0x208 [ 1.368552] driver_register+0x5c/0x124 [ 1.372573] __platform_driver_register+0x28/0x34 [ 1.377497] nxp_fspi_driver_init+0x1c/0x28 [ 1.381888] do_one_initcall+0x80/0x1c8 [ 1.385908] kernel_init_freeable+0x1c4/0x28c [ 1.390472] kernel_init+0x20/0x1d8 [ 1.394138] ret_from_fork+0x10/0x20 [ 1.397885] ---[ end trace 0000000000000000 ]--- [ 1.407908] ------------[ cut here ]------------ Fixes: ef89fd56bdfc ("arm64: dts: imx8ulp: add flexspi node") Cc: [email protected] Signed-off-by: Haibo Chen <[email protected]> Signed-off-by: Shawn Guo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit 1a49b96c51063d38be296a0c1537928a06f02d6e Author: Gregory Price <[email protected]> Date: Fri Oct 25 10:17:24 2024 -0400 vmscan,migrate: fix page count imbalance on node stats when demoting pages [ Upstream commit 35e41024c4c2b02ef8207f61b9004f6956cf037b ] When numa balancing is enabled with demotion, vmscan will call migrate_pages when shrinking LRUs. migrate_pages will decrement the the node's isolated page count, leading to an imbalanced count when invoked from (MG)LRU code. The result is dmesg output like such: $ cat /proc/sys/vm/stat_refresh [77383.088417] vmstat_refresh: nr_isolated_anon -103212 [77383.088417] vmstat_refresh: nr_isolated_file -899642 This negative value may impact compaction and reclaim throttling. The following path produces the decrement: shrink_folio_list demote_folio_list migrate_pages migrate_pages_batch migrate_folio_move migrate_folio_done mod_node_page_state(-ve) <- decrement This path happens for SUCCESSFUL migrations, not failures. Typically callers to migrate_pages are required to handle putback/accounting for failures, but this is already handled in the shrink code. When accounting for migrations, instead do not decrement the count when the migration reason is MR_DEMOTION. As of v6.11, this demotion logic is the only source of MR_DEMOTION. Link: https://lkml.kernel.org/r/[email protected] Fixes: 26aa2d199d6f ("mm/migrate: demote pages during reclaim") Signed-off-by: Gregory Price <[email protected]> Reviewed-by: Yang Shi <[email protected]> Reviewed-by: Davidlohr Bueso <[email protected]> Reviewed-by: Shakeel Butt <[email protected]> Reviewed-by: "Huang, Ying" <[email protected]> Reviewed-by: Oscar Salvador <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Wei Xu <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 003d2996964c03dfd34860500428f4cdf1f5879e Author: Jens Axboe <[email protected]> Date: Thu Oct 31 08:05:44 2024 -0600 io_uring/rw: fix missing NOWAIT check for O_DIRECT start write [ Upstream commit 1d60d74e852647255bd8e76f5a22dc42531e4389 ] When io_uring starts a write, it'll call kiocb_start_write() to bump the super block rwsem, preventing any freezes from happening while that write is in-flight. The freeze side will grab that rwsem for writing, excluding any new writers from happening and waiting for existing writes to finish. But io_uring unconditionally uses kiocb_start_write(), which will block if someone is currently attempting to freeze the mount point. This causes a deadlock where freeze is waiting for previous writes to complete, but the previous writes cannot complete, as the task that is supposed to complete them is blocked waiting on starting a new write. This results in the following stuck trace showing that dependency with the write blocked starting a new write: task:fio state:D stack:0 pid:886 tgid:886 ppid:876 Call trace: __switch_to+0x1d8/0x348 __schedule+0x8e8/0x2248 schedule+0x110/0x3f0 percpu_rwsem_wait+0x1e8/0x3f8 __percpu_down_read+0xe8/0x500 io_write+0xbb8/0xff8 io_issue_sqe+0x10c/0x1020 io_submit_sqes+0x614/0x2110 __arm64_sys_io_uring_enter+0x524/0x1038 invoke_syscall+0x74/0x268 el0_svc_common.constprop.0+0x160/0x238 do_el0_svc+0x44/0x60 el0_svc+0x44/0xb0 el0t_64_sync_handler+0x118/0x128 el0t_64_sync+0x168/0x170 INFO: task fsfreeze:7364 blocked for more than 15 seconds. Not tainted 6.12.0-rc5-00063-g76aaf945701c #7963 with the attempting freezer stuck trying to grab the rwsem: task:fsfreeze state:D stack:0 pid:7364 tgid:7364 ppid:995 Call trace: __switch_to+0x1d8/0x348 __schedule+0x8e8/0x2248 schedule+0x110/0x3f0 percpu_down_write+0x2b0/0x680 freeze_super+0x248/0x8a8 do_vfs_ioctl+0x149c/0x1b18 __arm64_sys_ioctl+0xd0/0x1a0 invoke_syscall+0x74/0x268 el0_svc_common.constprop.0+0x160/0x238 do_el0_svc+0x44/0x60 el0_svc+0x44/0xb0 el0t_64_sync_handler+0x118/0x128 el0t_64_sync+0x168/0x170 Fix this by having the io_uring side honor IOCB_NOWAIT, and only attempt a blocking grab of the super block rwsem if it isn't set. For normal issue where IOCB_NOWAIT would always be set, this returns -EAGAIN which will have io_uring core issue a blocking attempt of the write. That will in turn also get completions run, ensuring forward progress. Since freezing requires CAP_SYS_ADMIN in the first place, this isn't something that can be triggered by a regular user. Cc: [email protected] # 5.10+ Reported-by: Peter Mann <[email protected]> Link: https://lore.kernel.org/io-uring/[email protected] Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 70bbe8d0a949413df1bb6532fd6b19fbf0f88feb Author: Andrey Konovalov <[email protected]> Date: Tue Oct 22 18:07:06 2024 +0200 kasan: remove vmalloc_percpu test [ Upstream commit 330d8df81f3673d6fb74550bbc9bb159d81b35f7 ] Commit 1a2473f0cbc0 ("kasan: improve vmalloc tests") added the vmalloc_percpu KASAN test with the assumption that __alloc_percpu always uses vmalloc internally, which is tagged by KASAN. However, __alloc_percpu might allocate memory from the first per-CPU chunk, which is not allocated via vmalloc(). As a result, the test might fail. Remove the test until proper KASAN annotation for the per-CPU allocated are added; tracked in https://bugzilla.kernel.org/show_bug.cgi?id=215019. Link: https://lkml.kernel.org/r/[email protected] Fixes: 1a2473f0cbc0 ("kasan: improve vmalloc tests") Signed-off-by: Andrey Konovalov <[email protected]> Reported-by: Samuel Holland <[email protected]> Link: https://lore.kernel.org/all/[email protected]/ Reported-by: Sabyrzhan Tasbolatov <[email protected]> Link: https://lore.kernel.org/all/CACzwLxiWzNqPBp4C1VkaXZ2wDwvY3yZeetCi1TLGFipKW77drA@mail.gmail.com/ Cc: Alexander Potapenko <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Marco Elver <[email protected]> Cc: Sabyrzhan Tasbolatov <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit c60af16e1d6cc2237d58336546d6adfc067b6b8f Author: Vitaliy Shevtsov <[email protected]> Date: Mon Sep 16 22:41:37 2024 +0500 nvmet-auth: assign dh_key to NULL after kfree_sensitive [ Upstream commit d2f551b1f72b4c508ab9298419f6feadc3b5d791 ] ctrl->dh_key might be used across multiple calls to nvmet_setup_dhgroup() for the same controller. So it's better to nullify it after release on error path in order to avoid double free later in nvmet_destroy_auth(). Found by Linux Verification Center (linuxtesting.org) with Svace. Fixes: 7a277c37d352 ("nvmet-auth: Diffie-Hellman key exchange support") Cc: [email protected] Signed-off-by: Vitaliy Shevtsov <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Keith Busch <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 4a39320977f9c665faa37efaa8093b8e82dd8c41 Author: Christoffer Sandberg <[email protected]> Date: Tue Oct 29 16:16:53 2024 +0100 ALSA: hda/realtek: Fix headset mic on TUXEDO Stellaris 16 Gen6 mb1 [ Upstream commit e49370d769e71456db3fbd982e95bab8c69f73e8 ] Quirk is needed to enable headset microphone on missing pin 0x19. Signed-off-by: Christoffer Sandberg <[email protected]> Signed-off-by: Werner Sembach <[email protected]> Cc: <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit b42adef85aca72b51eab1a812a79913ff5aeb584 Author: Christoffer Sandberg <[email protected]> Date: Tue Oct 29 16:16:52 2024 +0100 ALSA: hda/realtek: Fix headset mic on TUXEDO Gemini 17 Gen3 [ Upstream commit 0b04fbe886b4274c8e5855011233aaa69fec6e75 ] Quirk is needed to enable headset microphone on missing pin 0x19. Signed-off-by: Christoffer Sandberg <[email protected]> Signed-off-by: Werner Sembach <[email protected]> Cc: <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 77ddc732416b017180893cbb2356e9f0a414c575 Author: Christoph Hellwig <[email protected]> Date: Wed Oct 23 15:37:22 2024 +0200 xfs: fix finding a last resort AG in xfs_filestream_pick_ag [ Upstream commit dc60992ce76fbc2f71c2674f435ff6bde2108028 ] When the main loop in xfs_filestream_pick_ag fails to find a suitable AG it tries to just pick the online AG. But the loop for that uses args->pag as loop iterator while the later code expects pag to be set. Fix this by reusing the max_pag case for this last resort, and also add a check for impossible case of no AG just to make sure that the uninitialized pag doesn't even escape in theory. Reported-by: [email protected] Signed-off-by: Christoph Hellwig <[email protected]> Tested-by: [email protected] Fixes: f8f1ed1ab3baba ("xfs: return a referenced perag from filestreams allocator") Cc: <[email protected]> # v6.3 Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 8e886e44397ba89f6e8da8471386112b4f5b67b7 Author: Matt Johnston <[email protected]> Date: Tue Oct 22 18:25:14 2024 +0800 mctp i2c: handle NULL header address [ Upstream commit 01e215975fd80af81b5b79f009d49ddd35976c13 ] daddr can be NULL if there is no neighbour table entry present, in that case the tx packet should be dropped. saddr will usually be set by MCTP core, but check for NULL in case a packet is transmitted by a different protocol. Fixes: f5b8abf9fc3d ("mctp i2c: MCTP I2C binding driver") Cc: [email protected] Reported-by: Dung Cao <[email protected]> Signed-off-by: Matt Johnston <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/20241022-mctp-i2c-null-dest-v3-1-e929709956c5@codeconstruct.com.au Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 88f97a4b5843ce21c1286e082c02a5fb4d8eb473 Author: Edward Adam Davis <[email protected]> Date: Wed Oct 16 19:43:47 2024 +0800 ocfs2: pass u64 to ocfs2_truncate_inline maybe overflow [ Upstream commit bc0a2f3a73fcdac651fca64df39306d1e5ebe3b0 ] Syzbot reported a kernel BUG in ocfs2_truncate_inline. There are two reasons for this: first, the parameter value passed is greater than ocfs2_max_inline_data_with_xattr, second, the start and end parameters of ocfs2_truncate_inline are "unsigned int". So, we need to add a sanity check for byte_start and byte_len right before ocfs2_truncate_inline() in ocfs2_remove_inode_range(), if they are greater than ocfs2_max_inline_data_with_xattr return -EINVAL. Link: https://lkml.kernel.org/r/[email protected] Fixes: 1afc32b95233 ("ocfs2: Write support for inline data") Signed-off-by: Edward Adam Davis <[email protected]> Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=81092778aac03460d6b7 Reviewed-by: Joseph Qi <[email protected]> Cc: Joel Becker <[email protected]> Cc: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit c117a980185ee3812612e7e453e356a6a4f05305 Author: Sabyrzhan Tasbolatov <[email protected]> Date: Wed Oct 16 20:24:07 2024 +0500 x86/traps: move kmsan check after instrumentation_begin [ Upstream commit 1db272864ff250b5e607283eaec819e1186c8e26 ] During x86_64 kernel build with CONFIG_KMSAN, the objtool warns following: AR built-in.a AR vmlinux.a LD vmlinux.o vmlinux.o: warning: objtool: handle_bug+0x4: call to kmsan_unpoison_entry_regs() leaves .noinstr.text section OBJCOPY modules.builtin.modinfo GEN modules.builtin MODPOST Module.symvers CC .vmlinux.export.o Moving kmsan_unpoison_entry_regs() _after_ instrumentation_begin() fixes the warning. There is decode_bug(regs->ip, &imm) is left before KMSAN unpoisoining, but it has the return condition and if we include it after instrumentation_begin() it results the warning "return with instrumentation enabled", hence, I'm concerned that regs will not be KMSAN unpoisoned if `ud_type == BUG_NONE` is true. Link: https://lkml.kernel.org/r/[email protected] Fixes: ba54d194f8da ("x86/traps: avoid KMSAN bugs originating from handle_bug()") Signed-off-by: Sabyrzhan Tasbolatov <[email protected]> Reviewed-by: Alexander Potapenko <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 86ee1845cbbf52eff6d41ce438d5f7e9ab6f4602 Author: Gatlin Newhouse <[email protected]> Date: Wed Jul 24 00:01:55 2024 +0000 x86/traps: Enable UBSAN traps on x86 [ Upstream commit 7424fc6b86c8980a87169e005f5cd4438d18efe6 ] Currently ARM64 extracts which specific sanitizer has caused a trap via encoded data in the trap instruction. Clang on x86 currently encodes the same data in the UD1 instruction but x86 handle_bug() and is_valid_bugaddr() currently only look at UD2. Bring x86 to parity with ARM64, similar to commit 25b84002afb9 ("arm64: Support Clang UBSAN trap codes for better reporting"). See the llvm links for information about the code generation. Enable the reporting of UBSAN sanitizer details on x86 compiled with clang when CONFIG_UBSAN_TRAP=y by analysing UD1 and retrieving the type immediate which is encoded by the compiler after the UD1. [ tglx: Simplified it by moving the printk() into handle_bug() ] Signed-off-by: Gatlin Newhouse <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Kees Cook <[email protected]> Link: https://lore.kernel.org/all/[email protected] Link: https://github.com/llvm/llvm-project/commit/c5978f42ec8e9#diff-bb68d7cd885f41cfc35843998b0f9f534adb60b415f647109e597ce448e92d9f Link: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86InstrSystem.td#L27 Stable-dep-of: 1db272864ff2 ("x86/traps: move kmsan check after instrumentation_begin") Signed-off-by: Sasha Levin <[email protected]> commit b958948ae1cb3e39c48e9f805436fd652103c71e Author: Matt Fleming <[email protected]> Date: Fri Oct 11 13:07:37 2024 +0100 mm/page_alloc: let GFP_ATOMIC order-0 allocs access highatomic reserves [ Upstream commit 281dd25c1a018261a04d1b8bf41a0674000bfe38 ] Under memory pressure it's possible for GFP_ATOMIC order-0 allocations to fail even though free pages are available in the highatomic reserves. GFP_ATOMIC allocations cannot trigger unreserve_highatomic_pageblock() since it's only run from reclaim. Given that such allocations will pass the watermarks in __zone_watermark_unusable_free(), it makes sense to fallback to highatomic reserves the same way that ALLOC_OOM can. This fixes order-0 page allocation failures observed on Cloudflare's fleet when handling network packets: kswapd1: page allocation failure: order:0, mode:0x820(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0-7 CPU: 10 PID: 696 Comm: kswapd1 Kdump: loaded Tainted: G O 6.6.43-CUSTOM #1 Hardware name: MACHINE Call Trace: <IRQ> dump_stack_lvl+0x3c/0x50 warn_alloc+0x13a/0x1c0 __alloc_pages_slowpath.constprop.0+0xc9d/0xd10 __alloc_pages+0x327/0x340 __napi_alloc_skb+0x16d/0x1f0 bnxt_rx_page_skb+0x96/0x1b0 [bnxt_en] bnxt_rx_pkt+0x201/0x15e0 [bnxt_en] __bnxt_poll_work+0x156/0x2b0 [bnxt_en] bnxt_poll+0xd9/0x1c0 [bnxt_en] __napi_poll+0x2b/0x1b0 bpf_trampoline_6442524138+0x7d/0x1000 __napi_poll+0x5/0x1b0 net_rx_action+0x342/0x740 handle_softirqs+0xcf/0x2b0 irq_exit_rcu+0x6c/0x90 sysvec_apic_timer_interrupt+0x72/0x90 </IRQ> [[email protected]: update comment] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/CAGis_TWzSu=P7QJmjD58WWiu3zjMTVKSzdOwWE8ORaGytzWJwQ@mail.gmail.com/ Fixes: 1d91df85f399 ("mm/page_alloc: handle a missing case for memalloc_nocma_{save/restore} APIs") Signed-off-by: Matt Fleming <[email protected]> Suggested-by: Vlastimil Babka <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 4882a352b5df897c30f9d64fba340a219a6604d0 Author: Alexander Usyskin <[email protected]> Date: Tue Oct 15 15:31:57 2024 +0300 mei: use kvmalloc for read buffer [ Upstream commit 4adf613e01bf99e1739f6ff3e162ad5b7d578d1a ] Read buffer is allocated according to max message size, reported by the firmware and may reach 64K in systems with pxp client. Contiguous 64k allocation may fail under memory pressure. Read buffer is used as in-driver message storage and not required to be contiguous. Use kvmalloc to allow kernel to allocate non-contiguous memory. Fixes: 3030dc056459 ("mei: add wrapper for queuing control commands.") Cc: stable <[email protected]> Reported-by: Rohit Agarwal <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Tested-by: Brian Geffon <[email protected]> Signed-off-by: Alexander Usyskin <[email protected]> Acked-by: Tomas Winkler <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit cb8b81ad3e893a6d18dcdd3754cc2ea2a42c0136 Author: Matthieu Baerts (NGI0) <[email protected]> Date: Mon Oct 21 12:25:26 2024 +0200 mptcp: init: protect sched with rcu_read_lock [ Upstream commit 3deb12c788c385e17142ce6ec50f769852fcec65 ] Enabling CONFIG_PROVE_RCU_LIST with its dependence CONFIG_RCU_EXPERT creates this splat when an MPTCP socket is created: ============================= WARNING: suspicious RCU usage 6.12.0-rc2+ #11 Not tainted ----------------------------- net/mptcp/sched.c:44 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 no locks held by mptcp_connect/176. stack backtrace: CPU: 0 UID: 0 PID: 176 Comm: mptcp_connect Not tainted 6.12.0-rc2+ #11 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) lockdep_rcu_suspicious (kernel/locking/lockdep.c:6822) mptcp_sched_find (net/mptcp/sched.c:44 (discriminator 7)) mptcp_init_sock (net/mptcp/protocol.c:2867 (discriminator 1)) ? sock_init_data_uid (arch/x86/include/asm/atomic.h:28) inet_create.part.0.constprop.0 (net/ipv4/af_inet.c:386) ? __sock_create (include/linux/rcupdate.h:347 (discriminator 1)) __sock_create (net/socket.c:1576) __sys_socket (net/socket.c:1671) ? __pfx___sys_socket (net/socket.c:1712) ? do_user_addr_fault (arch/x86/mm/fault.c:1419 (discriminator 1)) __x64_sys_socket (net/socket.c:1728) do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1)) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) That's because when the socket is initialised, rcu_read_lock() is not used despite the explicit comment written above the declaration of mptcp_sched_find() in sched.c. Adding the missing lock/unlock avoids the warning. Fixes: 1730b2b2c5a5 ("mptcp: add sched in mptcp_sock") Cc: [email protected] Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/523 Reviewed-by: Geliang Tang <[email protected]> Signed-off-by: Matthieu Baerts (NGI0) <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 4f7ffa83fa79dd52efbaef366c850aaaae06a469 Author: Hugh Dickins <[email protected]> Date: Sun Oct 27 15:23:23 2024 -0700 iov_iter: fix copy_page_from_iter_atomic() if KMAP_LOCAL_FORCE_MAP [ Upstream commit c749d9b7ebbc5716af7a95f7768634b30d9446ec ] generic/077 on x86_32 CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP=y with highmem, on huge=always tmpfs, issues a warning and then hangs (interruptibly): WARNING: CPU: 5 PID: 3517 at mm/highmem.c:622 kunmap_local_indexed+0x62/0xc9 CPU: 5 UID: 0 PID: 3517 Comm: cp Not tainted 6.12.0-rc4 #2 ... copy_page_from_iter_atomic+0xa6/0x5ec generic_perform_write+0xf6/0x1b4 shmem_file_write_iter+0x54/0x67 Fix copy_page_from_iter_atomic() by limiting it in that case (include/linux/skbuff.h skb_frag_must_loop() does similar). But going forward, perhaps CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP is too surprising, has outlived its usefulness, and should just be removed? Fixes: 908a1ad89466 ("iov_iter: Handle compound highmem pages in copy_page_from_iter_atomic()") Signed-off-by: Hugh Dickins <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Christoph Hellwig <[email protected]> Cc: [email protected] Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit ade91f6e9848b370add44d89c976e070ccb492ef Author: Shawn Wang <[email protected]> Date: Fri Oct 25 10:22:08 2024 +0800 sched/numa: Fix the potential null pointer dereference in task_numa_work() [ Upstream commit 9c70b2a33cd2aa6a5a59c5523ef053bd42265209 ] When running stress-ng-vm-segv test, we found a null pointer dereference error in task_numa_work(). Here is the backtrace: [323676.066985] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020 ...... [323676.067108] CPU: 35 PID: 2694524 Comm: stress-ng-vm-se ...... [323676.067113] pstate: 23401009 (nzCv daif +PAN -UAO +TCO +DIT +SSBS BTYPE=--) [323676.067115] pc : vma_migratable+0x1c/0xd0 [323676.067122] lr : task_numa_work+0x1ec/0x4e0 [323676.067127] sp : ffff8000ada73d20 [323676.067128] x29: ffff8000ada73d20 x28: 0000000000000000 x27: 000000003e89f010 [323676.067130] x26: 0000000000080000 x25: ffff800081b5c0d8 x24: ffff800081b27000 [323676.067133] x23: 0000000000010000 x22: 0000000104d18cc0 x21: ffff0009f7158000 [323676.067135] x20: 0000000000000000 x19: 0000000000000000 x18: ffff8000ada73db8 [323676.067138] x17: 0001400000000000 x16: ffff800080df40b0 x15: 0000000000000035 [323676.067140] x14: ffff8000ada73cc8 x13: 1fffe0017cc72001 x12: ffff8000ada73cc8 [323676.067142] x11: ffff80008001160c x10: ffff000be639000c x9 : ffff8000800f4ba4 [323676.067145] x8 : ffff000810375000 x7 : ffff8000ada73974 x6 : 0000000000000001 [323676.067147] x5 : 0068000b33e26707 x4 : 0000000000000001 x3 : ffff0009f7158000 [323676.067149] x2 : 0000000000000041 x1 : 0000000000004400 x0 : 0000000000000000 [323676.067152] Call trace: [323676.067153] vma_migratable+0x1c/0xd0 [323676.067155] task_numa_work+0x1ec/0x4e0 [323676.067157] task_work_run+0x78/0xd8 [323676.067161] do_notify_resume+0x1ec/0x290 [323676.067163] el0_svc+0x150/0x160 [323676.067167] el0t_64_sync_handler+0xf8/0x128 [323676.067170] el0t_64_sync+0x17c/0x180 [323676.067173] Code: d2888001 910003fd f9000bf3 aa0003f3 (f9401000) [323676.067177] SMP: stopping secondary CPUs [323676.070184] Starting crashdump kernel... stress-ng-vm-segv in stress-ng is used to stress test the SIGSEGV error handling function of the system, which tries to cause a SIGSEGV error on return from unmapping the whole address space of the child process. Normally this program will not cause kernel crashes. But before the munmap system call returns to user mode, a potential task_numa_work() for numa balancing could be added and executed. In this scenario, since the child process has no vma after munmap, the vma_next() in task_numa_work() will return a null pointer even if the vma iterator restarts from 0. Recheck the vma pointer before dereferencing it in task_numa_work(). Fixes: 214dbc428137 ("sched: convert to vma iterator") Signed-off-by: Shawn Wang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: [email protected] # v6.2+ Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]> commit 8c9a1ec39c698cbc38f4efa9113185f885137f8b Author: Dan Williams <[email protected]> Date: Tue Oct 22 18:43:40 2024 -0700 cxl/acpi: Ensure ports ready at cxl_acpi_probe() return [ Upstream commit 48f62d38a07d464a499fa834638afcfd2b68f852 ] In order to ensure root CXL ports are enabled upon cxl_acpi_probe() when the 'cxl_port' driver is built as a module, arrange for the module to be pre-loaded or built-in. The "Fixes:" but no "Cc: stable" on this patch reflects that the issue is merely by inspection since the bug that triggered the discovery of this potential problem [1] is fixed by other means. However, a stable backport should do no harm. Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver") Link: http://lore.kernel.org/[email protected] [1] Signed-off-by: Dan Williams <[email protected]> Tested-by: Gregory Price <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Link: https://patch.msgid.link/172964781969.81806.17276352414854540808.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit a9ed67f39f888bb6e5729112ad45f15d9c5a3ef8 Author: Dan Williams <[email protected]> Date: Tue Oct 22 18:43:32 2024 -0700 cxl/port: Fix cxl_bus_rescan() vs bus_rescan_devices() [ Upstream commit 3d6ebf16438de5d712030fefbb4182b46373d677 ] It turns out since its original introduction, pre-2.6.12, bus_rescan_devices() has skipped devices that might be in the process of attaching or detaching from their driver. For CXL this behavior is unwanted and expects that cxl_bus_rescan() is a probe barrier. That behavior is simple enough to achieve with bus_for_each_dev() paired with call to device_attach(), and it is unclear why bus_rescan_devices() took the position of lockless consumption of dev->driver which is racy. The "Fixes:" but no "Cc: stable" on this patch reflects that the issue is merely by inspection since the bug that triggered the discovery of this potential problem [1] is fixed by other means. However, a stable backport should do no harm. Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver") Link: http://lore.kernel.org/[email protected] [1] Signed-off-by: Dan Williams <[email protected]> Tested-by: Gregory Price <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Link: https://patch.msgid.link/172964781104.81806.4277549800082443769.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit d210bc87cc4fdde62f757002530a08c3d109d94a Author: Chunyan Zhang <[email protected]> Date: Tue Oct 8 17:41:39 2024 +0800 riscv: Remove duplicated GET_RM [ Upstream commit 164f66de6bb6ef454893f193c898dc8f1da6d18b ] The macro GET_RM defined twice in this file, one can be removed. Reviewed-by: Alexandre Ghiti <[email protected]> Signed-off-by: Chunyan Zhang <[email protected]> Fixes: 956d705dd279 ("riscv: Unaligned load/store handling for M_MODE") Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 6d84e1b2e5ac04511e68bcf5577fc8369e73f4ed Author: Chunyan Zhang <[email protected]> Date: Tue Oct 8 17:41:38 2024 +0800 riscv: Remove unused GENERATING_ASM_OFFSETS [ Upstream commit 46d4e5ac6f2f801f97bcd0ec82365969197dc9b1 ] The macro is not used in the current version of kernel, it looks like can be removed to avoid a build warning: ../arch/riscv/kernel/asm-offsets.c: At top level: ../arch/riscv/kernel/asm-offsets.c:7: warning: macro "GENERATING_ASM_OFFSETS" is not used [-Wunused-macros] 7 | #define GENERATING_ASM_OFFSETS Fixes: 9639a44394b9 ("RISC-V: Provide a cleaner raw_smp_processor_id()") Cc: [email protected] Reviewed-by: Alexandre Ghiti <[email protected]> Tested-by: Alexandre Ghiti <[email protected]> Signed-off-by: Chunyan Zhang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit a63ba17207c50da91b19150b6cde09d199b34c2c Author: WangYuli <[email protected]> Date: Thu Oct 17 11:20:10 2024 +0800 riscv: Use '%u' to format the output of 'cpu' [ Upstream commit e0872ab72630dada3ae055bfa410bf463ff1d1e0 ] 'cpu' is an unsigned integer, so its conversion specifier should be %u, not %d. Suggested-by: Wentao Guan <[email protected]> Suggested-by: Maciej W. Rozycki <[email protected]> Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: WangYuli <[email protected]> Reviewed-by: Charlie Jenkins <[email protected]> Tested-by: Charlie Jenkins <[email protected]> Fixes: f1e58583b9c7 ("RISC-V: Support cpu hotplug") Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 909e71f28e9615410f52fca1b54acfd3d61c61c2 Author: Heinrich Schuchardt <[email protected]> Date: Sun Sep 29 16:02:33 2024 +0200 riscv: efi: Set NX compat flag in PE/COFF header [ Upstream commit d41373a4b910961df5a5e3527d7bde6ad45ca438 ] The IMAGE_DLLCHARACTERISTICS_NX_COMPAT informs the firmware that the EFI binary does not rely on pages that are both executable and writable. The flag is used by some distro versions of GRUB to decide if the EFI binary may be executed. As the Linux kernel neither has RWX sections nor needs RWX pages for relocation we should set the flag. Cc: Ard Biesheuvel <[email protected]> Cc: <[email protected]> Signed-off-by: Heinrich Schuchardt <[email protected]> Reviewed-by: Emil Renner Berthing <[email protected]> Fixes: cb7d2dd5612a ("RISC-V: Add PE/COFF header for EFI stub") Acked-by: Ard Biesheuvel <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 58e78589ade880330e359587bb50b1474f43aa12 Author: Kailang Yang <[email protected]> Date: Fri Oct 18 13:53:24 2024 +0800 ALSA: hda/realtek: Limit internal Mic boost on Dell platform [ Upstream commit 78e7be018784934081afec77f96d49a2483f9188 ] Dell want to limit internal Mic boost on all Dell platform. Signed-off-by: Kailang Yang <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit ceec8ad09135c27890cdee5a9bb0bf5f58c23720 Author: Dmitry Torokhov <[email protected]> Date: Fri Oct 18 17:17:48 2024 -0700 Input: edt-ft5x06 - fix regmap leak when probe fails [ Upstream commit bffdf9d7e51a7be8eeaac2ccf9e54a5fde01ff65 ] The driver neglects to free the instance of I2C regmap constructed at the beginning of the edt_ft5x06_ts_probe() method when probe fails. Additionally edt_ft5x06_ts_remove() is freeing the regmap too early, before the rest of the device resources that are managed by devm are released. Fix this by installing a custom devm action that will ensure that the regmap is released at the right time during normal teardown as well as in case of probe failure. Note that devm_regmap_init_i2c() could not be used because the driver may replace the original regmap with a regmap specific for M06 devices in the middle of the probe, and using devm_regmap_init_i2c() would result in releasing the M06 regmap too early. Reported-by: Li Zetao <[email protected]> Fixes: 9dfd9708ffba ("Input: edt-ft5x06 - convert to use regmap API") Cc: [email protected] Reviewed-by: Oliver Graute <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Dmitry Torokhov <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit c19a0c171d37f86ab7267c638d475321fd9f0b77 Author: Alexandre Ghiti <[email protected]> Date: Wed Oct 16 10:36:24 2024 +0200 riscv: vdso: Prevent the compiler from inserting calls to memset() [ Upstream commit bf40167d54d55d4b54d0103713d86a8638fb9290 ] The compiler is smart enough to insert a call to memset() in riscv_vdso_get_cpus(), which generates a dynamic relocation. So prevent this by using -fno-builtin option. Fixes: e2c0cdfba7f6 ("RISC-V: User-facing API") Cc: [email protected] Signed-off-by: Alexandre Ghiti <[email protected]> Reviewed-by: Guo Ren <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit e79c1f1c9100b4adc91c6512985db2cc961aafaa Author: Frank Li <[email protected]> Date: Wed Oct 23 16:30:32 2024 -0400 spi: spi-fsl-dspi: Fix crash when not using GPIO chip select [ Upstream commit 25f00a13dccf8e45441265768de46c8bf58e08f6 ] Add check for the return value of spi_get_csgpiod() to avoid passing a NULL pointer to gpiod_direction_output(), preventing a crash when GPIO chip select is not used. Fix below crash: [ 4.251960] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 4.260762] Mem abort info: [ 4.263556] ESR = 0x0000000096000004 [ 4.267308] EC = 0x25: DABT (current EL), IL = 32 bits [ 4.272624] SET = 0, FnV = 0 [ 4.275681] EA = 0, S1PTW = 0 [ 4.278822] FSC = 0x04: level 0 translation fault [ 4.283704] Data abort info: [ 4.286583] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [ 4.292074] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 4.297130] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 4.302445] [0000000000000000] user address but active_mm is swapper [ 4.308805] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 4.315072] Modules linked in: [ 4.318124] CPU: 2 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0-rc4-next-20241023-00008-ga20ec42c5fc1 #359 [ 4.328130] Hardware name: LS1046A QDS Board (DT) [ 4.332832] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 4.339794] pc : gpiod_direction_output+0x34/0x5c [ 4.344505] lr : gpiod_direction_output+0x18/0x5c [ 4.349208] sp : ffff80008003b8f0 [ 4.352517] x29: ffff80008003b8f0 x28: 0000000000000000 x27: ffffc96bcc7e9068 [ 4.359659] x26: ffffc96bcc6e00b0 x25: ffffc96bcc598398 x24: ffff447400132810 [ 4.366800] x23: 0000000000000000 x22: 0000000011e1a300 x21: 0000000000020002 [ 4.373940] x20: 0000000000000000 x19: 0000000000000000 x18: ffffffffffffffff [ 4.381081] x17: ffff44740016e600 x16: 0000000500000003 x15: 0000000000000007 [ 4.388221] x14: 0000000000989680 x13: 0000000000020000 x12: 000000000000001e [ 4.395362] x11: 0044b82fa09b5a53 x10: 0000000000000019 x9 : 0000000000000008 [ 4.402502] x8 : 0000000000000002 x7 : 0000000000000007 …

commit cf96b8e upstream. ASan reports a memory leak caused by evlist not being deleted on exit in perf-report, perf-script and perf-data. The problem is caused by evlist->session not being deleted, which is allocated in perf_session__read_header, called in perf_session__new if perf_data is in read mode. In case of write mode, the session->evlist is filled by the caller. This patch solves the problem by calling evlist__delete in perf_session__delete if perf_data is in read mode. Changes in v2: - call evlist__delete from within perf_session__delete v1: https://lore.kernel.org/lkml/[email protected]/ ASan report follows: $ ./perf script report flamegraph ================================================================= ==227640==ERROR: LeakSanitizer: detected memory leaks <SNIP unrelated> Indirect leak of 2704 byte(s) in 1 object(s) allocated from: #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137) gregkh#1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9 gregkh#2 0x7f999e in evlist__new /home/user/linux/tools/perf/util/evlist.c:77:26 gregkh#3 0x8ad938 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3797:20 gregkh#4 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6 gregkh#5 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10 gregkh#6 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12 gregkh#7 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11 gregkh#8 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8 gregkh#9 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2 gregkh#10 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3 gregkh#11 0x7f5260654b74 (/lib64/libc.so.6+0x27b74) Indirect leak of 568 byte(s) in 1 object(s) allocated from: #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137) gregkh#1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9 gregkh#2 0x80ce88 in evsel__new_idx /home/user/linux/tools/perf/util/evsel.c:268:24 gregkh#3 0x8aed93 in evsel__new /home/user/linux/tools/perf/util/evsel.h:210:9 gregkh#4 0x8ae07e in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3853:11 gregkh#5 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6 gregkh#6 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10 gregkh#7 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12 gregkh#8 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11 gregkh#9 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8 gregkh#10 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2 gregkh#11 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3 gregkh#12 0x7f5260654b74 (/lib64/libc.so.6+0x27b74) Indirect leak of 264 byte(s) in 1 object(s) allocated from: #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137) gregkh#1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9 gregkh#2 0xbe3e70 in xyarray__new /home/user/linux/tools/lib/perf/xyarray.c:10:23 gregkh#3 0xbd7754 in perf_evsel__alloc_id /home/user/linux/tools/lib/perf/evsel.c:361:21 gregkh#4 0x8ae201 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3871:7 gregkh#5 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6 gregkh#6 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10 gregkh#7 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12 gregkh#8 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11 gregkh#9 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8 gregkh#10 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2 gregkh#11 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3 gregkh#12 0x7f5260654b74 (/lib64/libc.so.6+0x27b74) Indirect leak of 32 byte(s) in 1 object(s) allocated from: #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137) gregkh#1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9 gregkh#2 0xbd77e0 in perf_evsel__alloc_id /home/user/linux/tools/lib/perf/evsel.c:365:14 gregkh#3 0x8ae201 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3871:7 gregkh#4 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6 gregkh#5 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10 gregkh#6 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12 gregkh#7 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11 gregkh#8 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8 gregkh#9 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2 gregkh#10 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3 gregkh#11 0x7f5260654b74 (/lib64/libc.so.6+0x27b74) Indirect leak of 7 byte(s) in 1 object(s) allocated from: #0 0x4b8207 in strdup (/home/user/linux/tools/perf/perf+0x4b8207) gregkh#1 0x8b4459 in evlist__set_event_name /home/user/linux/tools/perf/util/header.c:2292:16 gregkh#2 0x89d862 in process_event_desc /home/user/linux/tools/perf/util/header.c:2313:3 gregkh#3 0x8af319 in perf_file_section__process /home/user/linux/tools/perf/util/header.c:3651:9 gregkh#4 0x8aa6e9 in perf_header__process_sections /home/user/linux/tools/perf/util/header.c:3427:9 gregkh#5 0x8ae3e7 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3886:2 gregkh#6 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6 gregkh#7 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10 gregkh#8 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12 gregkh#9 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11 gregkh#10 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8 gregkh#11 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2 gregkh#12 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3 gregkh#13 0x7f5260654b74 (/lib64/libc.so.6+0x27b74) SUMMARY: AddressSanitizer: 3728 byte(s) leaked in 7 allocation(s). Signed-off-by: Riccardo Mancini <[email protected]> Acked-by: Ian Rogers <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Cc: [email protected] # 5.10.228 Signed-off-by: Shuai Xue <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

Daniel Machon says: ==================== net: sparx5: add support for lan969x switch device == Description: This series is the second of a multi-part series, that prepares and adds support for the new lan969x switch driver. The upstreaming efforts is split into multiple series (might change a bit as we go along): 1) Prepare the Sparx5 driver for lan969x (merged) --> 2) add support lan969x (same basic features as Sparx5 provides excl. FDMA and VCAP). 3) Add support for lan969x VCAP, FDMA and RGMII == Lan969x in short: The lan969x Ethernet switch family [1] provides a rich set of switching features and port configurations (up to 30 ports) from 10Mbps to 10Gbps, with support for RGMII, SGMII, QSGMII, USGMII, and USXGMII, ideal for industrial & process automation infrastructure applications, transport, grid automation, power substation automation, and ring & intra-ring topologies. The LAN969x family is hardware and software compatible and scalable supporting 46Gbps to 102Gbps switch bandwidths. == Preparing Sparx5 for lan969x: The main preparation work for lan969x has already been merged [1]. After this series is applied, lan969x will have the same functionality as Sparx5, except for VCAP and FDMA support. QoS features that requires the VCAP (e.g. PSFP, port mirroring) will obviously not work until VCAP support is added later. == Patch breakdown: Patch gregkh#1-gregkh#4 do some preparation work for lan969x Patch gregkh#5 adds new registers required by lan969x Patch gregkh#6 adds initial match data for all lan969x targets Patch gregkh#7 defines the lan969x register differences Patch gregkh#8 adds lan969x constants to match data Patch gregkh#9 adds some lan969x ops in bulk Patch gregkh#10 adds PTP function to ops Patch gregkh#11 adds lan969x_calendar.c for calculating the calendar Patch gregkh#12 makes additional use of the is_sparx5() macro to branch out in certain places. Patch gregkh#13 documents lan969x in the dt-bindings Patch gregkh#14 adds lan969x compatible string to sparx5 driver Patch gregkh#15 introduces new concept of per-target features [1] https://lore.kernel.org/netdev/20241004-b4-sparx5-lan969x-switch-driver-v2-0-d3290f581663@microchip.com/ v1: https://lore.kernel.org/20241021-sparx5-lan969x-switch-driver-2-v1-0-c8c49ef21e0f@microchip.com ==================== Link: https://patch.msgid.link/20241024-sparx5-lan969x-switch-driver-2-v2-0-a0b5fae88a0f@microchip.com Signed-off-by: Jakub Kicinski <[email protected]>

This commit provides a watchdog timer that sets a limit of how long a single sub-test could run: - if sub-test runs for 10 seconds, the name of the test is printed (currently the name of the test is printed only after it finishes); - if sub-test runs for 120 seconds, the running thread is terminated with SIGSEGV (to trigger crash_handler() and get a stack trace). Specifically: - the timer is armed on each call to run_one_test(); - re-armed at each call to test__start_subtest(); - is stopped when exiting run_one_test(). Default timeout could be overridden using '-w' or '--watchdog-timeout' options. Value 0 can be used to turn the timer off. Here is an example execution: $ ./ssh-exec.sh ./test_progs -w 5 -t \ send_signal/send_signal_perf_thread_remote,send_signal/send_signal_nmi_thread_remote WATCHDOG: test case send_signal/send_signal_nmi_thread_remote executes for 5 seconds, terminating with SIGSEGV Caught signal gregkh#11! Stack trace: ./test_progs(crash_handler+0x1f)[0x9049ef] /lib64/libc.so.6(+0x40d00)[0x7f1f1184fd00] /lib64/libc.so.6(read+0x4a)[0x7f1f1191cc4a] ./test_progs[0x720dd3] ./test_progs[0x71ef7a] ./test_progs(test_send_signal+0x1db)[0x71edeb] ./test_progs[0x9066c5] ./test_progs(main+0x5ed)[0x9054ad] /lib64/libc.so.6(+0x2a088)[0x7f1f11839088] /lib64/libc.so.6(__libc_start_main+0x8b)[0x7f1f1183914b] ./test_progs(_start+0x25)[0x527385] #292 send_signal:FAIL test_send_signal_common:PASS:reading pipe 0 nsec test_send_signal_common:PASS:reading pipe error: size 0 0 nsec test_send_signal_common:PASS:incorrect result 0 nsec test_send_signal_common:PASS:pipe_write 0 nsec test_send_signal_common:PASS:setpriority 0 nsec Timer is implemented using timer_{create,start} librt API. Internally librt uses pthreads for SIGEV_THREAD timers, so this change adds a background timer thread to the test process. Because of this a few checks in tests 'bpf_iter' and 'iters' need an update to account for an extra thread. For parallelized scenario the watchdog is also created for each worker fork. If one of the workers gets stuck, it would be terminated by a watchdog. In theory, this might lead to a scenario when all worker threads are exhausted, however this should not be a problem for server_main(), as it would exit with some of the tests not run. Signed-off-by: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

commit 5b188cc upstream. Disable strict aliasing, as has been done in the kernel proper for decades (literally since before git history) to fix issues where gcc will optimize away loads in code that looks 100% correct, but is _technically_ undefined behavior, and thus can be thrown away by the compiler. E.g. arm64's vPMU counter access test casts a uint64_t (unsigned long) pointer to a u64 (unsigned long long) pointer when setting PMCR.N via u64p_replace_bits(), which gcc-13 detects and optimizes away, i.e. ignores the result and uses the original PMCR. The issue is most easily observed by making set_pmcr_n() noinline and wrapping the call with printf(), e.g. sans comments, for this code: printf("orig = %lx, next = %lx, want = %lu\n", pmcr_orig, pmcr, pmcr_n); set_pmcr_n(&pmcr, pmcr_n); printf("orig = %lx, next = %lx, want = %lu\n", pmcr_orig, pmcr, pmcr_n); gcc-13 generates: 0000000000401c90 <set_pmcr_n>: 401c90: f9400002 ldr x2, [x0] 401c94: b3751022 bfi x2, x1, #11, #5 401c98: f9000002 str x2, [x0] 401c9c: d65f03c0 ret 0000000000402660 <test_create_vpmu_vm_with_pmcr_n>: 402724: aa1403e3 mov x3, x20 402728: aa1503e2 mov x2, x21 40272c: aa1603e0 mov x0, x22 402730: aa1503e1 mov x1, x21 402734: 940060ff bl 41ab30 <_IO_printf> 402738: aa1403e1 mov x1, x20 40273c: 910183e0 add x0, sp, #0x60 402740: 97fffd54 bl 401c90 <set_pmcr_n> 402744: aa1403e3 mov x3, x20 402748: aa1503e2 mov x2, x21 40274c: aa1503e1 mov x1, x21 402750: aa1603e0 mov x0, x22 402754: 940060f7 bl 41ab30 <_IO_printf> with the value stored in [sp + 0x60] ignored by both printf() above and in the test proper, resulting in a false failure due to vcpu_set_reg() simply storing the original value, not the intended value. $ ./vpmu_counter_access Random seed: 0x6b8b4567 orig = 3040, next = 3040, want = 0 orig = 3040, next = 3040, want = 0 ==== Test Assertion Failure ==== aarch64/vpmu_counter_access.c:505: pmcr_n == get_pmcr_n(pmcr) pid=71578 tid=71578 errno=9 - Bad file descriptor 1 0x400673: run_access_test at vpmu_counter_access.c:522 2 (inlined by) main at vpmu_counter_access.c:643 3 0x4132d7: __libc_start_call_main at libc-start.o:0 4 0x413653: __libc_start_main at ??:0 5 0x40106f: _start at ??:0 Failed to update PMCR.N to 0 (received: 6) Somewhat bizarrely, gcc-11 also exhibits the same behavior, but only if set_pmcr_n() is marked noinline, whereas gcc-13 fails even if set_pmcr_n() is inlined in its sole caller. Cc: [email protected] Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116912 Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 9b5aad3a7498c261116a0251fe57f14ba9c4c6cf Author: Greg Kroah-Hartman <[email protected]> Date: Fri Nov 8 16:28:28 2024 +0100 Linux 6.6.60 Link: https://lore.kernel.org/r/[email protected] Tested-by: SeongJae Park <[email protected]> Tested-by: Shuah Khan <[email protected]> Tested-by: Linux Kernel Functional Testing <[email protected]> Tested-by: Peter Schneider <[email protected]> Tested-by: Takeshi Ogasawara <[email protected]> Tested-by: Jon Hunter <[email protected]> Tested-by: Florian Fainelli <[email protected]> Tested-by: Ron Economos <[email protected]> Tested-by: Hardik Garg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit cc082e50375a29596153fc3f1f8fc85ad1b0b5b9 Author: Konstantin Komarov <[email protected]> Date: Thu Sep 5 15:03:48 2024 +0300 fs/ntfs3: Sequential field availability check in mi_enum_attr() commit 090f612756a9720ec18b0b130e28be49839d7cb5 upstream. The code is slightly reformatted to consistently check field availability without duplication. Fixes: 556bdf27c2dd ("ntfs3: Add bounds checking to mi_enum_attr()") Signed-off-by: Konstantin Komarov <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit 10c20d79d59cadfe572480d98cec271a89ffb024 Author: Srinivasan Shanmugam <[email protected]> Date: Mon May 27 20:15:21 2024 +0530 drm/amd/display: Add null checks for 'stream' and 'plane' before dereferencing commit 15c2990e0f0108b9c3752d7072a97d45d4283aea upstream. This commit adds null checks for the 'stream' and 'plane' variables in the dcn30_apply_idle_power_optimizations function. These variables were previously assumed to be null at line 922, but they were used later in the code without checking if they were null. This could potentially lead to a null pointer dereference, which would cause a crash. The null checks ensure that 'stream' and 'plane' are not null before they are used, preventing potential crashes. Fixes the below static smatch checker: drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:938 dcn30_apply_idle_power_optimizations() error: we previously assumed 'stream' could be null (see line 922) drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:940 dcn30_apply_idle_power_optimizations() error: we previously assumed 'plane' could be null (see line 922) Cc: Tom Chung <[email protected]> Cc: Nicholas Kazlauskas <[email protected]> Cc: Bhawanpreet Lakha <[email protected]> Cc: Rodrigo Siqueira <[email protected]> Cc: Roman Li <[email protected]> Cc: Hersen Wu <[email protected]> Cc: Alex Hung <[email protected]> Cc: Aurabindo Pillai <[email protected]> Cc: Harry Wentland <[email protected]> Signed-off-by: Srinivasan Shanmugam <[email protected]> Reviewed-by: Aurabindo Pillai <[email protected]> Signed-off-by: Alex Deucher <[email protected]> [Xiangyu: Modified file path to backport this commit] Signed-off-by: Xiangyu Chen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit e979a6a626abf1358a5bb79219eea82ac160d3d3 Author: Peter Ujfalusi <[email protected]> Date: Tue Sep 19 13:31:15 2023 +0300 ASoC: SOF: ipc4-control: Add support for ALSA enum control commit 07a866a41982c896dc46476f57d209a200602946 upstream. Enum controls use generic param_id and a generic struct where the data is passed to the firmware. Signed-off-by: Peter Ujfalusi <[email protected]> Reviewed-by: Bard Liao <[email protected]> Reviewed-by: Pierre-Louis Bossart <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit 3facc0417d3d7b3ba5822e74155bcb1267ce62c1 Author: Peter Ujfalusi <[email protected]> Date: Tue Sep 19 13:31:14 2023 +0300 ASoC: SOF: ipc4-control: Add support for ALSA switch control commit 4a2fd607b7ca6128ee3532161505da7624197f55 upstream. Volume controls with a max value of 1 are switches. Switch controls use generic param_id and a generic struct where the data is passed to the firmware. Signed-off-by: Peter Ujfalusi <[email protected]> Reviewed-by: Bard Liao <[email protected]> Reviewed-by: Pierre-Louis Bossart <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit f01d8fc623711046e1efee00827bff6d5882cdfd Author: Peter Ujfalusi <[email protected]> Date: Tue Sep 19 13:31:13 2023 +0300 ASoC: SOF: ipc4-topology: Add definition for generic switch/enum control commit 060a07cd9bc69eba2da33ed96b1fa69ead60bab1 upstream. Currently IPC4 has no notion of a switch or enum type of control which is a generic concept in ALSA. The generic support for these control types will be as follows: - large config is used to send the channel-value par array - param_id of a SWITCH type is 200 - param_id of an ENUM type is 201 Each module need to support a switch or/and enum must handle these universal param_ids. The message payload is described by struct sof_ipc4_control_msg_payload. Signed-off-by: Peter Ujfalusi <[email protected]> Reviewed-by: Bard Liao <[email protected]> Reviewed-by: Pierre-Louis Bossart <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit d54afaef6570c277070c3cafe1ed73dcdc129e0a Author: Chuck Lever <[email protected]> Date: Tue Sep 19 11:35:15 2023 -0400 SUNRPC: Remove BUG_ON call sites commit 789ce196a31dd13276076762204bee87df893e53 upstream. There is no need to take down the whole system for these assertions. I'd rather not attempt a heroic save here, as some bug has occurred that has left the transport data structures in an unknown state. Just warn and then leak the left-over resources. Acked-by: Christian Brauner <[email protected]> Reviewed-by: NeilBrown <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Dominique Martinet <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit 27a58a19bd20a7afe369da2ce6d4ebea70768acd Author: Michael Walle <[email protected]> Date: Fri Jun 21 14:09:29 2024 +0200 mtd: spi-nor: winbond: fix w25q128 regression commit d35df77707bf5ae1221b5ba1c8a88cf4fcdd4901 upstream. Commit 83e824a4a595 ("mtd: spi-nor: Correct flags for Winbond w25q128") removed the flags for non-SFDP devices. It was assumed that it wasn't in use anymore. This wasn't true. Add the no_sfdp_flags as well as the size again. We add the additional flags for dual and quad read because they have been reported to work properly by Hartmut using both older and newer versions of this flash, the similar flashes with 64Mbit and 256Mbit already have these flags and because it will (luckily) trigger our legacy SFDP parsing, so newer versions with SFDP support will still get the parameters from the SFDP tables. Reported-by: Hartmut Birr <[email protected]> Closes: https://lore.kernel.org/r/CALxbwRo_-9CaJmt7r7ELgu+vOcgk=xZcGHobnKf=oT2=u4d4aA@mail.gmail.com/ Fixes: 83e824a4a595 ("mtd: spi-nor: Correct flags for Winbond w25q128") Reviewed-by: Linus Walleij <[email protected]> Signed-off-by: Michael Walle <[email protected]> Acked-by: Tudor Ambarus <[email protected]> Reviewed-by: Esben Haabendal <[email protected]> Reviewed-by: Pratyush Yadav <[email protected]> Signed-off-by: Pratyush Yadav <[email protected]> Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected] [Backported to v6.6 - vastly different due to upstream changes] Reviewed-by: Tudor Ambarus <[email protected]> Signed-off-by: Linus Walleij <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit 3d544942c0010feedc048b048ee0c35d2d921100 Author: David Hildenbrand <[email protected]> Date: Fri Oct 11 12:24:45 2024 +0200 mm: don't install PMD mappings when THPs are disabled by the hw/process/vma commit 2b0f922323ccfa76219bcaacd35cd50aeaa13592 upstream. We (or rather, readahead logic :) ) might be allocating a THP in the pagecache and then try mapping it into a process that explicitly disabled THP: we might end up installing PMD mappings. This is a problem for s390x KVM, which explicitly remaps all PMD-mapped THPs to be PTE-mapped in s390_enable_sie()->thp_split_mm(), before starting the VM. For example, starting a VM backed on a file system with large folios supported makes the VM crash when the VM tries accessing such a mapping using KVM. Is it also a problem when the HW disabled THP using TRANSPARENT_HUGEPAGE_UNSUPPORTED? At least on x86 this would be the case without X86_FEATURE_PSE. In the future, we might be able to do better on s390x and only disallow PMD mappings -- what s390x and likely TRANSPARENT_HUGEPAGE_UNSUPPORTED really wants. For now, fix it by essentially performing the same check as would be done in __thp_vma_allowable_orders() or in shmem code, where this works as expected, and disallow PMD mappings, making us fallback to PTE mappings. Link: https://lkml.kernel.org/r/[email protected] Fixes: 793917d997df ("mm/readahead: Add large folio readahead") Signed-off-by: David Hildenbrand <[email protected]> Reported-by: Leo Fu <[email protected]> Tested-by: Thomas Huth <[email protected]> Cc: Thomas Huth <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Ryan Roberts <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Janosch Frank <[email protected]> Cc: Claudio Imbrenda <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit 02ec4b3bba49e8d3abb25a3feba6875cae12da92 Author: Kefeng Wang <[email protected]> Date: Fri Oct 11 12:24:44 2024 +0200 mm: huge_memory: add vma_thp_disabled() and thp_disabled_by_hw() commit 963756aac1f011d904ddd9548ae82286d3a91f96 upstream. Patch series "mm: don't install PMD mappings when THPs are disabled by the hw/process/vma". During testing, it was found that we can get PMD mappings in processes where THP (and more precisely, PMD mappings) are supposed to be disabled. While it works as expected for anon+shmem, the pagecache is the problematic bit. For s390 KVM this currently means that a VM backed by a file located on filesystem with large folio support can crash when KVM tries accessing the problematic page, because the readahead logic might decide to use a PMD-sized THP and faulting it into the page tables will install a PMD mapping, something that s390 KVM cannot tolerate. This might also be a problem with HW that does not support PMD mappings, but I did not try reproducing it. Fix it by respecting the ways to disable THPs when deciding whether we can install a PMD mapping. khugepaged should already be taking care of not collapsing if THPs are effectively disabled for the hw/process/vma. This patch (of 2): Add vma_thp_disabled() and thp_disabled_by_hw() helpers to be shared by shmem_allowable_huge_orders() and __thp_vma_allowable_orders(). [[email protected]: rename to vma_thp_disabled(), split out thp_disabled_by_hw() ] Link: https://lkml.kernel.org/r/[email protected] Fixes: 793917d997df ("mm/readahead: Add large folio readahead") Signed-off-by: Kefeng Wang <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Reported-by: Leo Fu <[email protected]> Tested-by: Thomas Huth <[email protected]> Reviewed-by: Ryan Roberts <[email protected]> Cc: Boqiao Fu <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Claudio Imbrenda <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Janosch Frank <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit fc621e7a043de346c33bd7ae7e2e0c651d6152ef Author: Johannes Berg <[email protected]> Date: Wed Oct 23 09:17:44 2024 +0200 wifi: iwlwifi: mvm: fix 6 GHz scan construction commit 7245012f0f496162dd95d888ed2ceb5a35170f1a upstream. If more than 255 colocated APs exist for the set of all APs found during 2.4/5 GHz scanning, then the 6 GHz scan construction will loop forever since the loop variable has type u8, which can never reach the number found when that's bigger than 255, and is stored in a u32 variable. Also move it into the loops to have a smaller scope. Using a u32 there is fine, we limit the number of APs in the scan list and each has a limit on the number of RNR entries due to the frame size. With a limit of 1000 scan results, a frame size upper bound of 4096 (really it's more like ~2300) and a TBTT entry size of at least 11, we get an upper bound for the number of ~372k, well in the bounds of a u32. Cc: [email protected] Fixes: eae94cf82d74 ("iwlwifi: mvm: add support for 6GHz") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219375 Link: https://patch.msgid.link/20241023091744.f4baed5c08a1.I8b417148bbc8c5d11c101e1b8f5bf372e17bf2a7@changeid Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit f2f1fa446676c21edb777e6d2bc4fa8f956fab68 Author: Ryusuke Konishi <[email protected]> Date: Fri Oct 18 04:33:10 2024 +0900 nilfs2: fix kernel bug due to missing clearing of checked flag commit 41e192ad2779cae0102879612dfe46726e4396aa upstream. Syzbot reported that in directory operations after nilfs2 detects filesystem corruption and degrades to read-only, __block_write_begin_int(), which is called to prepare block writes, may fail the BUG_ON check for accesses exceeding the folio/page size, triggering a kernel bug. This was found to be because the "checked" flag of a page/folio was not cleared when it was discarded by nilfs2's own routine, which causes the sanity check of directory entries to be skipped when the directory page/folio is reloaded. So, fix that. This was necessary when the use of nilfs2's own page discard routine was applied to more than just metadata files. Link: https://lkml.kernel.org/r/[email protected] Fixes: 8c26c4e2694a ("nilfs2: fix issue with flush kernel thread after remount in RO mode because of driver's internal error or metadata corruption") Signed-off-by: Ryusuke Konishi <[email protected]> Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=d6ca2daf692c7a82f959 Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit a53c2d847627b790fb3bd8b00e02c247941b17e0 Author: Zong-Zhe Yang <[email protected]> Date: Mon Jun 17 19:52:17 2024 +0800 wifi: mac80211: fix NULL dereference at band check in starting tx ba session commit 021d53a3d87eeb9dbba524ac515651242a2a7e3b upstream. In MLD connection, link_data/link_conf are dynamically allocated. They don't point to vif->bss_conf. So, there will be no chanreq assigned to vif->bss_conf and then the chan will be NULL. Tweak the code to check ht_supported/vht_supported/has_he/has_eht on sta deflink. Crash log (with rtw89 version under MLO development): [ 9890.526087] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 9890.526102] #PF: supervisor read access in kernel mode [ 9890.526105] #PF: error_code(0x0000) - not-present page [ 9890.526109] PGD 0 P4D 0 [ 9890.526114] Oops: 0000 [#1] PREEMPT SMP PTI [ 9890.526119] CPU: 2 PID: 6367 Comm: kworker/u16:2 Kdump: loaded Tainted: G OE 6.9.0 #1 [ 9890.526123] Hardware name: LENOVO 2356AD1/2356AD1, BIOS G7ETB3WW (2.73 ) 11/28/2018 [ 9890.526126] Workqueue: phy2 rtw89_core_ba_work [rtw89_core] [ 9890.526203] RIP: 0010:ieee80211_start_tx_ba_session (net/mac80211/agg-tx.c:618 (discriminator 1)) mac80211 [ 9890.526279] Code: f7 e8 d5 93 3e ea 48 83 c4 28 89 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc 49 8b 84 24 e0 f1 ff ff 48 8b 80 90 1b 00 00 <83> 38 03 0f 84 37 fe ff ff bb ea ff ff ff eb cc 49 8b 84 24 10 f3 All code ======== 0: f7 e8 imul %eax 2: d5 (bad) 3: 93 xchg %eax,%ebx 4: 3e ea ds (bad) 6: 48 83 c4 28 add $0x28,%rsp a: 89 d8 mov %ebx,%eax c: 5b pop %rbx d: 41 5c pop %r12 f: 41 5d pop %r13 11: 41 5e pop %r14 13: 41 5f pop %r15 15: 5d pop %rbp 16: c3 retq 17: cc int3 18: cc int3 19: cc int3 1a: cc int3 1b: 49 8b 84 24 e0 f1 ff mov -0xe20(%r12),%rax 22: ff 23: 48 8b 80 90 1b 00 00 mov 0x1b90(%rax),%rax 2a:* 83 38 03 cmpl $0x3,(%rax) <-- trapping instruction 2d: 0f 84 37 fe ff ff je 0xfffffffffffffe6a 33: bb ea ff ff ff mov $0xffffffea,%ebx 38: eb cc jmp 0x6 3a: 49 rex.WB 3b: 8b .byte 0x8b 3c: 84 24 10 test %ah,(%rax,%rdx,1) 3f: f3 repz Code starting with the faulting instruction =========================================== 0: 83 38 03 cmpl $0x3,(%rax) 3: 0f 84 37 fe ff ff je 0xfffffffffffffe40 9: bb ea ff ff ff mov $0xffffffea,%ebx e: eb cc jmp 0xffffffffffffffdc 10: 49 rex.WB 11: 8b .byte 0x8b 12: 84 24 10 test %ah,(%rax,%rdx,1) 15: f3 repz [ 9890.526285] RSP: 0018:ffffb8db09013d68 EFLAGS: 00010246 [ 9890.526291] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff9308e0d656c8 [ 9890.526295] RDX: 0000000000000000 RSI: ffffffffab99460b RDI: ffffffffab9a7685 [ 9890.526300] RBP: ffffb8db09013db8 R08: 0000000000000000 R09: 0000000000000873 [ 9890.526304] R10: ffff9308e0d64800 R11: 0000000000000002 R12: ffff9308e5ff6e70 [ 9890.526308] R13: ffff930952500e20 R14: ffff9309192a8c00 R15: 0000000000000000 [ 9890.526313] FS: 0000000000000000(0000) GS:ffff930b4e700000(0000) knlGS:0000000000000000 [ 9890.526316] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 9890.526318] CR2: 0000000000000000 CR3: 0000000391c58005 CR4: 00000000001706f0 [ 9890.526321] Call Trace: [ 9890.526324] <TASK> [ 9890.526327] ? show_regs (arch/x86/kernel/dumpstack.c:479) [ 9890.526335] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434) [ 9890.526340] ? page_fault_oops (arch/x86/mm/fault.c:713) [ 9890.526347] ? search_module_extables (kernel/module/main.c:3256 (discriminator 3)) [ 9890.526353] ? ieee80211_start_tx_ba_session (net/mac80211/agg-tx.c:618 (discriminator 1)) mac80211 Signed-off-by: Zong-Zhe Yang <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Xiangyu Chen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit 6a91a5816b289018e0b42a25444c0b4f8c637dca Author: Pavel Begunkov <[email protected]> Date: Wed Apr 10 02:26:54 2024 +0100 io_uring: always lock __io_cqring_overflow_flush commit 8d09a88ef9d3cb7d21d45c39b7b7c31298d23998 upstream. Conditional locking is never great, in case of __io_cqring_overflow_flush(), which is a slow path, it's not justified. Don't handle IOPOLL separately, always grab uring_lock for overflow flushing. Signed-off-by: Pavel Begunkov <[email protected]> Link: https://lore.kernel.org/r/162947df299aa12693ac4b305dacedab32ec7976.1712708261.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit e3fb0e6afcc399660770428a35162b4880e2e14e Author: Haibo Chen <[email protected]> Date: Thu Sep 5 17:43:38 2024 +0800 arm64: dts: imx8ulp: correct the flexspi compatible string commit 409dc5196d5b6eb67468a06bf4d2d07d7225a67b upstream. The flexspi on imx8ulp only has 16 LUTs, and imx8mm flexspi has 32 LUTs, so correct the compatible string here, otherwise will meet below error: [ 1.119072] ------------[ cut here ]------------ [ 1.123926] WARNING: CPU: 0 PID: 1 at drivers/spi/spi-nxp-fspi.c:855 nxp_fspi_exec_op+0xb04/0xb64 [ 1.133239] Modules linked in: [ 1.136448] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-rc6-next-20240902-00001-g131bf9439dd9 #69 [ 1.146821] Hardware name: NXP i.MX8ULP EVK (DT) [ 1.151647] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 1.158931] pc : nxp_fspi_exec_op+0xb04/0xb64 [ 1.163496] lr : nxp_fspi_exec_op+0xa34/0xb64 [ 1.168060] sp : ffff80008002b2a0 [ 1.171526] x29: ffff80008002b2d0 x28: 0000000000000000 x27: 0000000000000000 [ 1.179002] x26: ffff2eb645542580 x25: ffff800080610014 x24: ffff800080610000 [ 1.186480] x23: ffff2eb645548080 x22: 0000000000000006 x21: ffff2eb6455425e0 [ 1.193956] x20: 0000000000000000 x19: ffff80008002b5e0 x18: ffffffffffffffff [ 1.201432] x17: ffff2eb644467508 x16: 0000000000000138 x15: 0000000000000002 [ 1.208907] x14: 0000000000000000 x13: ffff2eb6400d8080 x12: 00000000ffffff00 [ 1.216378] x11: 0000000000000000 x10: ffff2eb6400d8080 x9 : ffff2eb697adca80 [ 1.223850] x8 : ffff2eb697ad3cc0 x7 : 0000000100000000 x6 : 0000000000000001 [ 1.231324] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000000007a6 [ 1.238795] x2 : 0000000000000000 x1 : 00000000000001ce x0 : 00000000ffffff92 [ 1.246267] Call trace: [ 1.248824] nxp_fspi_exec_op+0xb04/0xb64 [ 1.253031] spi_mem_exec_op+0x3a0/0x430 [ 1.257139] spi_nor_read_id+0x80/0xcc [ 1.261065] spi_nor_scan+0x1ec/0xf10 [ 1.264901] spi_nor_probe+0x108/0x2fc [ 1.268828] spi_mem_probe+0x6c/0xbc [ 1.272574] spi_probe+0x84/0xe4 [ 1.275958] really_probe+0xbc/0x29c [ 1.279713] __driver_probe_device+0x78/0x12c [ 1.284277] driver_probe_device+0xd8/0x15c [ 1.288660] __device_attach_driver+0xb8/0x134 [ 1.293316] bus_for_each_drv+0x88/0xe8 [ 1.297337] __device_attach+0xa0/0x190 [ 1.301353] device_initial_probe+0x14/0x20 [ 1.305734] bus_probe_device+0xac/0xb0 [ 1.309752] device_add+0x5d0/0x790 [ 1.313408] __spi_add_device+0x134/0x204 [ 1.317606] of_register_spi_device+0x3b4/0x590 [ 1.322348] spi_register_controller+0x47c/0x754 [ 1.327181] devm_spi_register_controller+0x4c/0xa4 [ 1.332289] nxp_fspi_probe+0x1cc/0x2b0 [ 1.336307] platform_probe+0x68/0xc4 [ 1.340145] really_probe+0xbc/0x29c [ 1.343893] __driver_probe_device+0x78/0x12c [ 1.348457] driver_probe_device+0xd8/0x15c [ 1.352838] __driver_attach+0x90/0x19c [ 1.356857] bus_for_each_dev+0x7c/0xdc [ 1.360877] driver_attach+0x24/0x30 [ 1.364624] bus_add_driver+0xe4/0x208 [ 1.368552] driver_register+0x5c/0x124 [ 1.372573] __platform_driver_register+0x28/0x34 [ 1.377497] nxp_fspi_driver_init+0x1c/0x28 [ 1.381888] do_one_initcall+0x80/0x1c8 [ 1.385908] kernel_init_freeable+0x1c4/0x28c [ 1.390472] kernel_init+0x20/0x1d8 [ 1.394138] ret_from_fork+0x10/0x20 [ 1.397885] ---[ end trace 0000000000000000 ]--- [ 1.407908] ------------[ cut here ]------------ Fixes: ef89fd56bdfc ("arm64: dts: imx8ulp: add flexspi node") Cc: [email protected] Signed-off-by: Haibo Chen <[email protected]> Signed-off-by: Shawn Guo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> commit 1a49b96c51063d38be296a0c1537928a06f02d6e Author: Gregory Price <[email protected]> Date: Fri Oct 25 10:17:24 2024 -0400 vmscan,migrate: fix page count imbalance on node stats when demoting pages [ Upstream commit 35e41024c4c2b02ef8207f61b9004f6956cf037b ] When numa balancing is enabled with demotion, vmscan will call migrate_pages when shrinking LRUs. migrate_pages will decrement the the node's isolated page count, leading to an imbalanced count when invoked from (MG)LRU code. The result is dmesg output like such: $ cat /proc/sys/vm/stat_refresh [77383.088417] vmstat_refresh: nr_isolated_anon -103212 [77383.088417] vmstat_refresh: nr_isolated_file -899642 This negative value may impact compaction and reclaim throttling. The following path produces the decrement: shrink_folio_list demote_folio_list migrate_pages migrate_pages_batch migrate_folio_move migrate_folio_done mod_node_page_state(-ve) <- decrement This path happens for SUCCESSFUL migrations, not failures. Typically callers to migrate_pages are required to handle putback/accounting for failures, but this is already handled in the shrink code. When accounting for migrations, instead do not decrement the count when the migration reason is MR_DEMOTION. As of v6.11, this demotion logic is the only source of MR_DEMOTION. Link: https://lkml.kernel.org/r/[email protected] Fixes: 26aa2d199d6f ("mm/migrate: demote pages during reclaim") Signed-off-by: Gregory Price <[email protected]> Reviewed-by: Yang Shi <[email protected]> Reviewed-by: Davidlohr Bueso <[email protected]> Reviewed-by: Shakeel Butt <[email protected]> Reviewed-by: "Huang, Ying" <[email protected]> Reviewed-by: Oscar Salvador <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Wei Xu <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 003d2996964c03dfd34860500428f4cdf1f5879e Author: Jens Axboe <[email protected]> Date: Thu Oct 31 08:05:44 2024 -0600 io_uring/rw: fix missing NOWAIT check for O_DIRECT start write [ Upstream commit 1d60d74e852647255bd8e76f5a22dc42531e4389 ] When io_uring starts a write, it'll call kiocb_start_write() to bump the super block rwsem, preventing any freezes from happening while that write is in-flight. The freeze side will grab that rwsem for writing, excluding any new writers from happening and waiting for existing writes to finish. But io_uring unconditionally uses kiocb_start_write(), which will block if someone is currently attempting to freeze the mount point. This causes a deadlock where freeze is waiting for previous writes to complete, but the previous writes cannot complete, as the task that is supposed to complete them is blocked waiting on starting a new write. This results in the following stuck trace showing that dependency with the write blocked starting a new write: task:fio state:D stack:0 pid:886 tgid:886 ppid:876 Call trace: __switch_to+0x1d8/0x348 __schedule+0x8e8/0x2248 schedule+0x110/0x3f0 percpu_rwsem_wait+0x1e8/0x3f8 __percpu_down_read+0xe8/0x500 io_write+0xbb8/0xff8 io_issue_sqe+0x10c/0x1020 io_submit_sqes+0x614/0x2110 __arm64_sys_io_uring_enter+0x524/0x1038 invoke_syscall+0x74/0x268 el0_svc_common.constprop.0+0x160/0x238 do_el0_svc+0x44/0x60 el0_svc+0x44/0xb0 el0t_64_sync_handler+0x118/0x128 el0t_64_sync+0x168/0x170 INFO: task fsfreeze:7364 blocked for more than 15 seconds. Not tainted 6.12.0-rc5-00063-g76aaf945701c #7963 with the attempting freezer stuck trying to grab the rwsem: task:fsfreeze state:D stack:0 pid:7364 tgid:7364 ppid:995 Call trace: __switch_to+0x1d8/0x348 __schedule+0x8e8/0x2248 schedule+0x110/0x3f0 percpu_down_write+0x2b0/0x680 freeze_super+0x248/0x8a8 do_vfs_ioctl+0x149c/0x1b18 __arm64_sys_ioctl+0xd0/0x1a0 invoke_syscall+0x74/0x268 el0_svc_common.constprop.0+0x160/0x238 do_el0_svc+0x44/0x60 el0_svc+0x44/0xb0 el0t_64_sync_handler+0x118/0x128 el0t_64_sync+0x168/0x170 Fix this by having the io_uring side honor IOCB_NOWAIT, and only attempt a blocking grab of the super block rwsem if it isn't set. For normal issue where IOCB_NOWAIT would always be set, this returns -EAGAIN which will have io_uring core issue a blocking attempt of the write. That will in turn also get completions run, ensuring forward progress. Since freezing requires CAP_SYS_ADMIN in the first place, this isn't something that can be triggered by a regular user. Cc: [email protected] # 5.10+ Reported-by: Peter Mann <[email protected]> Link: https://lore.kernel.org/io-uring/[email protected] Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 70bbe8d0a949413df1bb6532fd6b19fbf0f88feb Author: Andrey Konovalov <[email protected]> Date: Tue Oct 22 18:07:06 2024 +0200 kasan: remove vmalloc_percpu test [ Upstream commit 330d8df81f3673d6fb74550bbc9bb159d81b35f7 ] Commit 1a2473f0cbc0 ("kasan: improve vmalloc tests") added the vmalloc_percpu KASAN test with the assumption that __alloc_percpu always uses vmalloc internally, which is tagged by KASAN. However, __alloc_percpu might allocate memory from the first per-CPU chunk, which is not allocated via vmalloc(). As a result, the test might fail. Remove the test until proper KASAN annotation for the per-CPU allocated are added; tracked in https://bugzilla.kernel.org/show_bug.cgi?id=215019. Link: https://lkml.kernel.org/r/[email protected] Fixes: 1a2473f0cbc0 ("kasan: improve vmalloc tests") Signed-off-by: Andrey Konovalov <[email protected]> Reported-by: Samuel Holland <[email protected]> Link: https://lore.kernel.org/all/[email protected]/ Reported-by: Sabyrzhan Tasbolatov <[email protected]> Link: https://lore.kernel.org/all/CACzwLxiWzNqPBp4C1VkaXZ2wDwvY3yZeetCi1TLGFipKW77drA@mail.gmail.com/ Cc: Alexander Potapenko <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Marco Elver <[email protected]> Cc: Sabyrzhan Tasbolatov <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit c60af16e1d6cc2237d58336546d6adfc067b6b8f Author: Vitaliy Shevtsov <[email protected]> Date: Mon Sep 16 22:41:37 2024 +0500 nvmet-auth: assign dh_key to NULL after kfree_sensitive [ Upstream commit d2f551b1f72b4c508ab9298419f6feadc3b5d791 ] ctrl->dh_key might be used across multiple calls to nvmet_setup_dhgroup() for the same controller. So it's better to nullify it after release on error path in order to avoid double free later in nvmet_destroy_auth(). Found by Linux Verification Center (linuxtesting.org) with Svace. Fixes: 7a277c37d352 ("nvmet-auth: Diffie-Hellman key exchange support") Cc: [email protected] Signed-off-by: Vitaliy Shevtsov <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Keith Busch <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 4a39320977f9c665faa37efaa8093b8e82dd8c41 Author: Christoffer Sandberg <[email protected]> Date: Tue Oct 29 16:16:53 2024 +0100 ALSA: hda/realtek: Fix headset mic on TUXEDO Stellaris 16 Gen6 mb1 [ Upstream commit e49370d769e71456db3fbd982e95bab8c69f73e8 ] Quirk is needed to enable headset microphone on missing pin 0x19. Signed-off-by: Christoffer Sandberg <[email protected]> Signed-off-by: Werner Sembach <[email protected]> Cc: <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit b42adef85aca72b51eab1a812a79913ff5aeb584 Author: Christoffer Sandberg <[email protected]> Date: Tue Oct 29 16:16:52 2024 +0100 ALSA: hda/realtek: Fix headset mic on TUXEDO Gemini 17 Gen3 [ Upstream commit 0b04fbe886b4274c8e5855011233aaa69fec6e75 ] Quirk is needed to enable headset microphone on missing pin 0x19. Signed-off-by: Christoffer Sandberg <[email protected]> Signed-off-by: Werner Sembach <[email protected]> Cc: <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 77ddc732416b017180893cbb2356e9f0a414c575 Author: Christoph Hellwig <[email protected]> Date: Wed Oct 23 15:37:22 2024 +0200 xfs: fix finding a last resort AG in xfs_filestream_pick_ag [ Upstream commit dc60992ce76fbc2f71c2674f435ff6bde2108028 ] When the main loop in xfs_filestream_pick_ag fails to find a suitable AG it tries to just pick the online AG. But the loop for that uses args->pag as loop iterator while the later code expects pag to be set. Fix this by reusing the max_pag case for this last resort, and also add a check for impossible case of no AG just to make sure that the uninitialized pag doesn't even escape in theory. Reported-by: [email protected] Signed-off-by: Christoph Hellwig <[email protected]> Tested-by: [email protected] Fixes: f8f1ed1ab3baba ("xfs: return a referenced perag from filestreams allocator") Cc: <[email protected]> # v6.3 Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 8e886e44397ba89f6e8da8471386112b4f5b67b7 Author: Matt Johnston <[email protected]> Date: Tue Oct 22 18:25:14 2024 +0800 mctp i2c: handle NULL header address [ Upstream commit 01e215975fd80af81b5b79f009d49ddd35976c13 ] daddr can be NULL if there is no neighbour table entry present, in that case the tx packet should be dropped. saddr will usually be set by MCTP core, but check for NULL in case a packet is transmitted by a different protocol. Fixes: f5b8abf9fc3d ("mctp i2c: MCTP I2C binding driver") Cc: [email protected] Reported-by: Dung Cao <[email protected]> Signed-off-by: Matt Johnston <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/20241022-mctp-i2c-null-dest-v3-1-e929709956c5@codeconstruct.com.au Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 88f97a4b5843ce21c1286e082c02a5fb4d8eb473 Author: Edward Adam Davis <[email protected]> Date: Wed Oct 16 19:43:47 2024 +0800 ocfs2: pass u64 to ocfs2_truncate_inline maybe overflow [ Upstream commit bc0a2f3a73fcdac651fca64df39306d1e5ebe3b0 ] Syzbot reported a kernel BUG in ocfs2_truncate_inline. There are two reasons for this: first, the parameter value passed is greater than ocfs2_max_inline_data_with_xattr, second, the start and end parameters of ocfs2_truncate_inline are "unsigned int". So, we need to add a sanity check for byte_start and byte_len right before ocfs2_truncate_inline() in ocfs2_remove_inode_range(), if they are greater than ocfs2_max_inline_data_with_xattr return -EINVAL. Link: https://lkml.kernel.org/r/[email protected] Fixes: 1afc32b95233 ("ocfs2: Write support for inline data") Signed-off-by: Edward Adam Davis <[email protected]> Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=81092778aac03460d6b7 Reviewed-by: Joseph Qi <[email protected]> Cc: Joel Becker <[email protected]> Cc: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit c117a980185ee3812612e7e453e356a6a4f05305 Author: Sabyrzhan Tasbolatov <[email protected]> Date: Wed Oct 16 20:24:07 2024 +0500 x86/traps: move kmsan check after instrumentation_begin [ Upstream commit 1db272864ff250b5e607283eaec819e1186c8e26 ] During x86_64 kernel build with CONFIG_KMSAN, the objtool warns following: AR built-in.a AR vmlinux.a LD vmlinux.o vmlinux.o: warning: objtool: handle_bug+0x4: call to kmsan_unpoison_entry_regs() leaves .noinstr.text section OBJCOPY modules.builtin.modinfo GEN modules.builtin MODPOST Module.symvers CC .vmlinux.export.o Moving kmsan_unpoison_entry_regs() _after_ instrumentation_begin() fixes the warning. There is decode_bug(regs->ip, &imm) is left before KMSAN unpoisoining, but it has the return condition and if we include it after instrumentation_begin() it results the warning "return with instrumentation enabled", hence, I'm concerned that regs will not be KMSAN unpoisoned if `ud_type == BUG_NONE` is true. Link: https://lkml.kernel.org/r/[email protected] Fixes: ba54d194f8da ("x86/traps: avoid KMSAN bugs originating from handle_bug()") Signed-off-by: Sabyrzhan Tasbolatov <[email protected]> Reviewed-by: Alexander Potapenko <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 86ee1845cbbf52eff6d41ce438d5f7e9ab6f4602 Author: Gatlin Newhouse <[email protected]> Date: Wed Jul 24 00:01:55 2024 +0000 x86/traps: Enable UBSAN traps on x86 [ Upstream commit 7424fc6b86c8980a87169e005f5cd4438d18efe6 ] Currently ARM64 extracts which specific sanitizer has caused a trap via encoded data in the trap instruction. Clang on x86 currently encodes the same data in the UD1 instruction but x86 handle_bug() and is_valid_bugaddr() currently only look at UD2. Bring x86 to parity with ARM64, similar to commit 25b84002afb9 ("arm64: Support Clang UBSAN trap codes for better reporting"). See the llvm links for information about the code generation. Enable the reporting of UBSAN sanitizer details on x86 compiled with clang when CONFIG_UBSAN_TRAP=y by analysing UD1 and retrieving the type immediate which is encoded by the compiler after the UD1. [ tglx: Simplified it by moving the printk() into handle_bug() ] Signed-off-by: Gatlin Newhouse <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Kees Cook <[email protected]> Link: https://lore.kernel.org/all/[email protected] Link: https://github.com/llvm/llvm-project/commit/c5978f42ec8e9#diff-bb68d7cd885f41cfc35843998b0f9f534adb60b415f647109e597ce448e92d9f Link: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86InstrSystem.td#L27 Stable-dep-of: 1db272864ff2 ("x86/traps: move kmsan check after instrumentation_begin") Signed-off-by: Sasha Levin <[email protected]> commit b958948ae1cb3e39c48e9f805436fd652103c71e Author: Matt Fleming <[email protected]> Date: Fri Oct 11 13:07:37 2024 +0100 mm/page_alloc: let GFP_ATOMIC order-0 allocs access highatomic reserves [ Upstream commit 281dd25c1a018261a04d1b8bf41a0674000bfe38 ] Under memory pressure it's possible for GFP_ATOMIC order-0 allocations to fail even though free pages are available in the highatomic reserves. GFP_ATOMIC allocations cannot trigger unreserve_highatomic_pageblock() since it's only run from reclaim. Given that such allocations will pass the watermarks in __zone_watermark_unusable_free(), it makes sense to fallback to highatomic reserves the same way that ALLOC_OOM can. This fixes order-0 page allocation failures observed on Cloudflare's fleet when handling network packets: kswapd1: page allocation failure: order:0, mode:0x820(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0-7 CPU: 10 PID: 696 Comm: kswapd1 Kdump: loaded Tainted: G O 6.6.43-CUSTOM #1 Hardware name: MACHINE Call Trace: <IRQ> dump_stack_lvl+0x3c/0x50 warn_alloc+0x13a/0x1c0 __alloc_pages_slowpath.constprop.0+0xc9d/0xd10 __alloc_pages+0x327/0x340 __napi_alloc_skb+0x16d/0x1f0 bnxt_rx_page_skb+0x96/0x1b0 [bnxt_en] bnxt_rx_pkt+0x201/0x15e0 [bnxt_en] __bnxt_poll_work+0x156/0x2b0 [bnxt_en] bnxt_poll+0xd9/0x1c0 [bnxt_en] __napi_poll+0x2b/0x1b0 bpf_trampoline_6442524138+0x7d/0x1000 __napi_poll+0x5/0x1b0 net_rx_action+0x342/0x740 handle_softirqs+0xcf/0x2b0 irq_exit_rcu+0x6c/0x90 sysvec_apic_timer_interrupt+0x72/0x90 </IRQ> [[email protected]: update comment] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/CAGis_TWzSu=P7QJmjD58WWiu3zjMTVKSzdOwWE8ORaGytzWJwQ@mail.gmail.com/ Fixes: 1d91df85f399 ("mm/page_alloc: handle a missing case for memalloc_nocma_{save/restore} APIs") Signed-off-by: Matt Fleming <[email protected]> Suggested-by: Vlastimil Babka <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 4882a352b5df897c30f9d64fba340a219a6604d0 Author: Alexander Usyskin <[email protected]> Date: Tue Oct 15 15:31:57 2024 +0300 mei: use kvmalloc for read buffer [ Upstream commit 4adf613e01bf99e1739f6ff3e162ad5b7d578d1a ] Read buffer is allocated according to max message size, reported by the firmware and may reach 64K in systems with pxp client. Contiguous 64k allocation may fail under memory pressure. Read buffer is used as in-driver message storage and not required to be contiguous. Use kvmalloc to allow kernel to allocate non-contiguous memory. Fixes: 3030dc056459 ("mei: add wrapper for queuing control commands.") Cc: stable <[email protected]> Reported-by: Rohit Agarwal <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Tested-by: Brian Geffon <[email protected]> Signed-off-by: Alexander Usyskin <[email protected]> Acked-by: Tomas Winkler <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit cb8b81ad3e893a6d18dcdd3754cc2ea2a42c0136 Author: Matthieu Baerts (NGI0) <[email protected]> Date: Mon Oct 21 12:25:26 2024 +0200 mptcp: init: protect sched with rcu_read_lock [ Upstream commit 3deb12c788c385e17142ce6ec50f769852fcec65 ] Enabling CONFIG_PROVE_RCU_LIST with its dependence CONFIG_RCU_EXPERT creates this splat when an MPTCP socket is created: ============================= WARNING: suspicious RCU usage 6.12.0-rc2+ #11 Not tainted ----------------------------- net/mptcp/sched.c:44 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 no locks held by mptcp_connect/176. stack backtrace: CPU: 0 UID: 0 PID: 176 Comm: mptcp_connect Not tainted 6.12.0-rc2+ #11 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) lockdep_rcu_suspicious (kernel/locking/lockdep.c:6822) mptcp_sched_find (net/mptcp/sched.c:44 (discriminator 7)) mptcp_init_sock (net/mptcp/protocol.c:2867 (discriminator 1)) ? sock_init_data_uid (arch/x86/include/asm/atomic.h:28) inet_create.part.0.constprop.0 (net/ipv4/af_inet.c:386) ? __sock_create (include/linux/rcupdate.h:347 (discriminator 1)) __sock_create (net/socket.c:1576) __sys_socket (net/socket.c:1671) ? __pfx___sys_socket (net/socket.c:1712) ? do_user_addr_fault (arch/x86/mm/fault.c:1419 (discriminator 1)) __x64_sys_socket (net/socket.c:1728) do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1)) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) That's because when the socket is initialised, rcu_read_lock() is not used despite the explicit comment written above the declaration of mptcp_sched_find() in sched.c. Adding the missing lock/unlock avoids the warning. Fixes: 1730b2b2c5a5 ("mptcp: add sched in mptcp_sock") Cc: [email protected] Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/523 Reviewed-by: Geliang Tang <[email protected]> Signed-off-by: Matthieu Baerts (NGI0) <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 4f7ffa83fa79dd52efbaef366c850aaaae06a469 Author: Hugh Dickins <[email protected]> Date: Sun Oct 27 15:23:23 2024 -0700 iov_iter: fix copy_page_from_iter_atomic() if KMAP_LOCAL_FORCE_MAP [ Upstream commit c749d9b7ebbc5716af7a95f7768634b30d9446ec ] generic/077 on x86_32 CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP=y with highmem, on huge=always tmpfs, issues a warning and then hangs (interruptibly): WARNING: CPU: 5 PID: 3517 at mm/highmem.c:622 kunmap_local_indexed+0x62/0xc9 CPU: 5 UID: 0 PID: 3517 Comm: cp Not tainted 6.12.0-rc4 #2 ... copy_page_from_iter_atomic+0xa6/0x5ec generic_perform_write+0xf6/0x1b4 shmem_file_write_iter+0x54/0x67 Fix copy_page_from_iter_atomic() by limiting it in that case (include/linux/skbuff.h skb_frag_must_loop() does similar). But going forward, perhaps CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP is too surprising, has outlived its usefulness, and should just be removed? Fixes: 908a1ad89466 ("iov_iter: Handle compound highmem pages in copy_page_from_iter_atomic()") Signed-off-by: Hugh Dickins <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Christoph Hellwig <[email protected]> Cc: [email protected] Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit ade91f6e9848b370add44d89c976e070ccb492ef Author: Shawn Wang <[email protected]> Date: Fri Oct 25 10:22:08 2024 +0800 sched/numa: Fix the potential null pointer dereference in task_numa_work() [ Upstream commit 9c70b2a33cd2aa6a5a59c5523ef053bd42265209 ] When running stress-ng-vm-segv test, we found a null pointer dereference error in task_numa_work(). Here is the backtrace: [323676.066985] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020 ...... [323676.067108] CPU: 35 PID: 2694524 Comm: stress-ng-vm-se ...... [323676.067113] pstate: 23401009 (nzCv daif +PAN -UAO +TCO +DIT +SSBS BTYPE=--) [323676.067115] pc : vma_migratable+0x1c/0xd0 [323676.067122] lr : task_numa_work+0x1ec/0x4e0 [323676.067127] sp : ffff8000ada73d20 [323676.067128] x29: ffff8000ada73d20 x28: 0000000000000000 x27: 000000003e89f010 [323676.067130] x26: 0000000000080000 x25: ffff800081b5c0d8 x24: ffff800081b27000 [323676.067133] x23: 0000000000010000 x22: 0000000104d18cc0 x21: ffff0009f7158000 [323676.067135] x20: 0000000000000000 x19: 0000000000000000 x18: ffff8000ada73db8 [323676.067138] x17: 0001400000000000 x16: ffff800080df40b0 x15: 0000000000000035 [323676.067140] x14: ffff8000ada73cc8 x13: 1fffe0017cc72001 x12: ffff8000ada73cc8 [323676.067142] x11: ffff80008001160c x10: ffff000be639000c x9 : ffff8000800f4ba4 [323676.067145] x8 : ffff000810375000 x7 : ffff8000ada73974 x6 : 0000000000000001 [323676.067147] x5 : 0068000b33e26707 x4 : 0000000000000001 x3 : ffff0009f7158000 [323676.067149] x2 : 0000000000000041 x1 : 0000000000004400 x0 : 0000000000000000 [323676.067152] Call trace: [323676.067153] vma_migratable+0x1c/0xd0 [323676.067155] task_numa_work+0x1ec/0x4e0 [323676.067157] task_work_run+0x78/0xd8 [323676.067161] do_notify_resume+0x1ec/0x290 [323676.067163] el0_svc+0x150/0x160 [323676.067167] el0t_64_sync_handler+0xf8/0x128 [323676.067170] el0t_64_sync+0x17c/0x180 [323676.067173] Code: d2888001 910003fd f9000bf3 aa0003f3 (f9401000) [323676.067177] SMP: stopping secondary CPUs [323676.070184] Starting crashdump kernel... stress-ng-vm-segv in stress-ng is used to stress test the SIGSEGV error handling function of the system, which tries to cause a SIGSEGV error on return from unmapping the whole address space of the child process. Normally this program will not cause kernel crashes. But before the munmap system call returns to user mode, a potential task_numa_work() for numa balancing could be added and executed. In this scenario, since the child process has no vma after munmap, the vma_next() in task_numa_work() will return a null pointer even if the vma iterator restarts from 0. Recheck the vma pointer before dereferencing it in task_numa_work(). Fixes: 214dbc428137 ("sched: convert to vma iterator") Signed-off-by: Shawn Wang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: [email protected] # v6.2+ Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]> commit 8c9a1ec39c698cbc38f4efa9113185f885137f8b Author: Dan Williams <[email protected]> Date: Tue Oct 22 18:43:40 2024 -0700 cxl/acpi: Ensure ports ready at cxl_acpi_probe() return [ Upstream commit 48f62d38a07d464a499fa834638afcfd2b68f852 ] In order to ensure root CXL ports are enabled upon cxl_acpi_probe() when the 'cxl_port' driver is built as a module, arrange for the module to be pre-loaded or built-in. The "Fixes:" but no "Cc: stable" on this patch reflects that the issue is merely by inspection since the bug that triggered the discovery of this potential problem [1] is fixed by other means. However, a stable backport should do no harm. Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver") Link: http://lore.kernel.org/[email protected] [1] Signed-off-by: Dan Williams <[email protected]> Tested-by: Gregory Price <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Link: https://patch.msgid.link/172964781969.81806.17276352414854540808.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit a9ed67f39f888bb6e5729112ad45f15d9c5a3ef8 Author: Dan Williams <[email protected]> Date: Tue Oct 22 18:43:32 2024 -0700 cxl/port: Fix cxl_bus_rescan() vs bus_rescan_devices() [ Upstream commit 3d6ebf16438de5d712030fefbb4182b46373d677 ] It turns out since its original introduction, pre-2.6.12, bus_rescan_devices() has skipped devices that might be in the process of attaching or detaching from their driver. For CXL this behavior is unwanted and expects that cxl_bus_rescan() is a probe barrier. That behavior is simple enough to achieve with bus_for_each_dev() paired with call to device_attach(), and it is unclear why bus_rescan_devices() took the position of lockless consumption of dev->driver which is racy. The "Fixes:" but no "Cc: stable" on this patch reflects that the issue is merely by inspection since the bug that triggered the discovery of this potential problem [1] is fixed by other means. However, a stable backport should do no harm. Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver") Link: http://lore.kernel.org/[email protected] [1] Signed-off-by: Dan Williams <[email protected]> Tested-by: Gregory Price <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Link: https://patch.msgid.link/172964781104.81806.4277549800082443769.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit d210bc87cc4fdde62f757002530a08c3d109d94a Author: Chunyan Zhang <[email protected]> Date: Tue Oct 8 17:41:39 2024 +0800 riscv: Remove duplicated GET_RM [ Upstream commit 164f66de6bb6ef454893f193c898dc8f1da6d18b ] The macro GET_RM defined twice in this file, one can be removed. Reviewed-by: Alexandre Ghiti <[email protected]> Signed-off-by: Chunyan Zhang <[email protected]> Fixes: 956d705dd279 ("riscv: Unaligned load/store handling for M_MODE") Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 6d84e1b2e5ac04511e68bcf5577fc8369e73f4ed Author: Chunyan Zhang <[email protected]> Date: Tue Oct 8 17:41:38 2024 +0800 riscv: Remove unused GENERATING_ASM_OFFSETS [ Upstream commit 46d4e5ac6f2f801f97bcd0ec82365969197dc9b1 ] The macro is not used in the current version of kernel, it looks like can be removed to avoid a build warning: ../arch/riscv/kernel/asm-offsets.c: At top level: ../arch/riscv/kernel/asm-offsets.c:7: warning: macro "GENERATING_ASM_OFFSETS" is not used [-Wunused-macros] 7 | #define GENERATING_ASM_OFFSETS Fixes: 9639a44394b9 ("RISC-V: Provide a cleaner raw_smp_processor_id()") Cc: [email protected] Reviewed-by: Alexandre Ghiti <[email protected]> Tested-by: Alexandre Ghiti <[email protected]> Signed-off-by: Chunyan Zhang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit a63ba17207c50da91b19150b6cde09d199b34c2c Author: WangYuli <[email protected]> Date: Thu Oct 17 11:20:10 2024 +0800 riscv: Use '%u' to format the output of 'cpu' [ Upstream commit e0872ab72630dada3ae055bfa410bf463ff1d1e0 ] 'cpu' is an unsigned integer, so its conversion specifier should be %u, not %d. Suggested-by: Wentao Guan <[email protected]> Suggested-by: Maciej W. Rozycki <[email protected]> Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: WangYuli <[email protected]> Reviewed-by: Charlie Jenkins <[email protected]> Tested-by: Charlie Jenkins <[email protected]> Fixes: f1e58583b9c7 ("RISC-V: Support cpu hotplug") Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 909e71f28e9615410f52fca1b54acfd3d61c61c2 Author: Heinrich Schuchardt <[email protected]> Date: Sun Sep 29 16:02:33 2024 +0200 riscv: efi: Set NX compat flag in PE/COFF header [ Upstream commit d41373a4b910961df5a5e3527d7bde6ad45ca438 ] The IMAGE_DLLCHARACTERISTICS_NX_COMPAT informs the firmware that the EFI binary does not rely on pages that are both executable and writable. The flag is used by some distro versions of GRUB to decide if the EFI binary may be executed. As the Linux kernel neither has RWX sections nor needs RWX pages for relocation we should set the flag. Cc: Ard Biesheuvel <[email protected]> Cc: <[email protected]> Signed-off-by: Heinrich Schuchardt <[email protected]> Reviewed-by: Emil Renner Berthing <[email protected]> Fixes: cb7d2dd5612a ("RISC-V: Add PE/COFF header for EFI stub") Acked-by: Ard Biesheuvel <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit 58e78589ade880330e359587bb50b1474f43aa12 Author: Kailang Yang <[email protected]> Date: Fri Oct 18 13:53:24 2024 +0800 ALSA: hda/realtek: Limit internal Mic boost on Dell platform [ Upstream commit 78e7be018784934081afec77f96d49a2483f9188 ] Dell want to limit internal Mic boost on all Dell platform. Signed-off-by: Kailang Yang <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit ceec8ad09135c27890cdee5a9bb0bf5f58c23720 Author: Dmitry Torokhov <[email protected]> Date: Fri Oct 18 17:17:48 2024 -0700 Input: edt-ft5x06 - fix regmap leak when probe fails [ Upstream commit bffdf9d7e51a7be8eeaac2ccf9e54a5fde01ff65 ] The driver neglects to free the instance of I2C regmap constructed at the beginning of the edt_ft5x06_ts_probe() method when probe fails. Additionally edt_ft5x06_ts_remove() is freeing the regmap too early, before the rest of the device resources that are managed by devm are released. Fix this by installing a custom devm action that will ensure that the regmap is released at the right time during normal teardown as well as in case of probe failure. Note that devm_regmap_init_i2c() could not be used because the driver may replace the original regmap with a regmap specific for M06 devices in the middle of the probe, and using devm_regmap_init_i2c() would result in releasing the M06 regmap too early. Reported-by: Li Zetao <[email protected]> Fixes: 9dfd9708ffba ("Input: edt-ft5x06 - convert to use regmap API") Cc: [email protected] Reviewed-by: Oliver Graute <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Dmitry Torokhov <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit c19a0c171d37f86ab7267c638d475321fd9f0b77 Author: Alexandre Ghiti <[email protected]> Date: Wed Oct 16 10:36:24 2024 +0200 riscv: vdso: Prevent the compiler from inserting calls to memset() [ Upstream commit bf40167d54d55d4b54d0103713d86a8638fb9290 ] The compiler is smart enough to insert a call to memset() in riscv_vdso_get_cpus(), which generates a dynamic relocation. So prevent this by using -fno-builtin option. Fixes: e2c0cdfba7f6 ("RISC-V: User-facing API") Cc: [email protected] Signed-off-by: Alexandre Ghiti <[email protected]> Reviewed-by: Guo Ren <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]> Signed-off-by: Sasha Levin <[email protected]> commit e79c1f1c9100b4adc91c6512985db2cc961aafaa Author: Frank Li <[email protected]> Date: Wed Oct 23 16:30:32 2024 -0400 spi: spi-fsl-dspi: Fix crash when not using GPIO chip select [ Upstream commit 25f00a13dccf8e45441265768de46c8bf58e08f6 ] Add check for the return value of spi_get_csgpiod() to avoid passing a NULL pointer to gpiod_direction_output(), preventing a crash when GPIO chip select is not used. Fix below crash: [ 4.251960] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 4.260762] Mem abort info: [ 4.263556] ESR = 0x0000000096000004 [ 4.267308] EC = 0x25: DABT (current EL), IL = 32 bits [ 4.272624] SET = 0, FnV = 0 [ 4.275681] EA = 0, S1PTW = 0 [ 4.278822] FSC = 0x04: level 0 translation fault [ 4.283704] Data abort info: [ 4.286583] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [ 4.292074] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 4.297130] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 4.302445] [0000000000000000] user address but active_mm is swapper [ 4.308805] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 4.315072] Modules linked in: [ 4.318124] CPU: 2 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0-rc4-next-20241023-00008-ga20ec42c5fc1 #359 [ 4.328130] Hardware name: LS1046A QDS Board (DT) [ 4.332832] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 4.339794] pc : gpiod_direction_output+0x34/0x5c [ 4.344505] lr : gpiod_direction_output+0x18/0x5c [ 4.349208] sp : ffff80008003b8f0 [ 4.352517] x29: ffff80008003b8f0 x28: 0000000000000000 x27: ffffc96bcc7e9068 [ 4.359659] x26: ffffc96bcc6e00b0 x25: ffffc96bcc598398 x24: ffff447400132810 [ 4.366800] x23: 0000000000000000 x22: 0000000011e1a300 x21: 0000000000020002 [ 4.373940] x20: 0000000000000000 x19: 0000000000000000 x18: ffffffffffffffff [ 4.381081] x17: ffff44740016e600 x16: 0000000500000003 x15: 0000000000000007 [ 4.388221] x14: 0000000000989680 x13: 0000000000020000 x12: 000000000000001e [ 4.395362] x11: 0044b82fa09b5a53 x10: 0000000000000019 x9 : 0000000000000008 [ 4.402502] x8 : 0000000000000002 x7 : 0000000000000007 …

[ Upstream commit 5bf1557 ] test_progs uses glibc specific functions backtrace() and backtrace_symbols_fd() to print backtrace in case of SIGSEGV. Recent commit (see fixes) updated test_progs.c to define stub versions of the same functions with attriubte "weak" in order to allow linking test_progs against musl libc. Unfortunately this broke the backtrace handling for glibc builds. As it turns out, glibc defines backtrace() and backtrace_symbols_fd() as weak: $ llvm-readelf --symbols /lib64/libc.so.6 \ | grep -P '( backtrace_symbols_fd| backtrace)$' 4910: 0000000000126b40 161 FUNC WEAK DEFAULT 16 backtrace 6843: 0000000000126f90 852 FUNC WEAK DEFAULT 16 backtrace_symbols_fd So does test_progs: $ llvm-readelf --symbols test_progs \ | grep -P '( backtrace_symbols_fd| backtrace)$' 2891: 00000000006ad190 15 FUNC WEAK DEFAULT 13 backtrace 11215: 00000000006ad1a0 41 FUNC WEAK DEFAULT 13 backtrace_symbols_fd In such situation dynamic linker is not obliged to favour glibc implementation over the one defined in test_progs. Compiling with the following simple modification to test_progs.c demonstrates the issue: $ git diff ... \--- a/tools/testing/selftests/bpf/test_progs.c \+++ b/tools/testing/selftests/bpf/test_progs.c \@@ -1817,6 +1817,7 @@ int main(int argc, char **argv) if (err) return err; + *(int *)0xdeadbeef = 42; err = cd_flavor_subdir(argv[0]); if (err) return err; $ ./test_progs [0]: Caught signal gregkh#11! Stack trace: <backtrace not supported> Segmentation fault (core dumped) Resolve this by hiding stub definitions behind __GLIBC__ macro check instead of using "weak" attribute. Fixes: c9a83e7 ("selftests/bpf: Fix compile if backtrace support missing in libc") Signed-off-by: Eduard Zingerman <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Tested-by: Tony Ambardar <[email protected]> Reviewed-by: Tony Ambardar <[email protected]> Acked-by: Daniel Xu <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 5bf1557 ] test_progs uses glibc specific functions backtrace() and backtrace_symbols_fd() to print backtrace in case of SIGSEGV. Recent commit (see fixes) updated test_progs.c to define stub versions of the same functions with attriubte "weak" in order to allow linking test_progs against musl libc. Unfortunately this broke the backtrace handling for glibc builds. As it turns out, glibc defines backtrace() and backtrace_symbols_fd() as weak: $ llvm-readelf --symbols /lib64/libc.so.6 \ | grep -P '( backtrace_symbols_fd| backtrace)$' 4910: 0000000000126b40 161 FUNC WEAK DEFAULT 16 backtrace 6843: 0000000000126f90 852 FUNC WEAK DEFAULT 16 backtrace_symbols_fd So does test_progs: $ llvm-readelf --symbols test_progs \ | grep -P '( backtrace_symbols_fd| backtrace)$' 2891: 00000000006ad190 15 FUNC WEAK DEFAULT 13 backtrace 11215: 00000000006ad1a0 41 FUNC WEAK DEFAULT 13 backtrace_symbols_fd In such situation dynamic linker is not obliged to favour glibc implementation over the one defined in test_progs. Compiling with the following simple modification to test_progs.c demonstrates the issue: $ git diff ... \--- a/tools/testing/selftests/bpf/test_progs.c \+++ b/tools/testing/selftests/bpf/test_progs.c \@@ -1817,6 +1817,7 @@ int main(int argc, char **argv) if (err) return err; + *(int *)0xdeadbeef = 42; err = cd_flavor_subdir(argv[0]); if (err) return err; $ ./test_progs [0]: Caught signal #11! Stack trace: <backtrace not supported> Segmentation fault (core dumped) Resolve this by hiding stub definitions behind __GLIBC__ macro check instead of using "weak" attribute. Fixes: c9a83e7 ("selftests/bpf: Fix compile if backtrace support missing in libc") Signed-off-by: Eduard Zingerman <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Tested-by: Tony Ambardar <[email protected]> Reviewed-by: Tony Ambardar <[email protected]> Acked-by: Daniel Xu <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]>

tititiou36 and others added 30 commits September 19, 2023 12:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

HACL* Raw RSA Integration #11

HACL* Raw RSA Integration #11

karthikbhargavan commented Dec 1, 2023

HACL* Raw RSA Integration #11

HACL* Raw RSA Integration #11

Conversation

karthikbhargavan commented Dec 1, 2023