-
Notifications
You must be signed in to change notification settings - Fork 54.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update cfi_cmdset_0002.c #125
Open
vjiki
wants to merge
1
commit into
torvalds:master
Choose a base branch
from
vjiki:patch-1
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Issue description: ========================== During writing the Linux Kernel image on /dev/mtdX device the burning is failed due to I/O write or erase error: Erasing blocks: 6/12 (50%) While erasing blocks 0x000a0000-0x000c0000 on /dev/mtd1: Input/output error Burning the /persistent/SCXB3ISF.img image on /dev/mtd1 failed. When doing erase, writing or verifying using FLASHCP program, the cfi mtd functions of Linux Kernel MTD driver can returns too early. The function chip_ready() return OK status while the actual writing or erasing is not finished on NOR flash device. It leads to I/O error and fail in writing, erasing or verifying of the data during working with MTD devices. When the issue with writing is reproduced we compared the data of LinuxKERNEL.img and the data which was written on the /dev/mtdX device: dd if=/dev/mtd3 bs=2367008 count=1 > /persistent/dest.buf hexdump /persistent/dest.buf > /persistent/dest.hex hexdump /persistent/LinuxKERNEL.img > /persistent/source.hex diff -purN /persistent/dest.hex /persistent/source.hex And it was observed that the driver didn't write some blocks at all (0xffff instead of real data in some places). Flash driver skip writing or hasn't been finished it during the specific time. I found that in Internet I am not alone in this world, but I didn't find some solution for this problem which doesn't lead to increasing of the time of writing. So I investigate the writing and found that there is no any check that the data is really was written on the Flash. So the proposed solution is to compare the given last byte with written one. We have run a long stubility during several weeks on our side with Switch Board using writing on NOR flash and the problem is not reproduced. FREQUENCY: ========================== 1% The I/O error appears about 1 time during 3-4 hours writing on NOR flash. But it was not possible to determine exactly at what time the issue can happen. SYSTEM IMPACT: ========================== fails to write/erase on NOR flash FIX DESCRIPTION: ========================== Wait until the following checks return positive results during writing/erasing on NOR Flash: 1. read the data/status from the device (checking status of the NOR Flash: busy, timeout or ok(data)) return ok; 2. check the last byte of given buffer data is written in case of WRITE or the last map word is erased in case of ERASE. TEST DESCRIPTION: ========================== Start writing some data on NOR FLASH (mtd device) using flashcp or start erasing using flash_erase.
I would like to propose to add this patch in official linux kernel. Might be this patch will be useful, but of course it will be good to do carefull review of this/ |
pstglia
pushed a commit
to pstglia/linux
that referenced
this pull request
Oct 6, 2014
ARM64 has defined the spinlock for init_mm's context, so need initialize the spinlock structure; otherwise during the suspend flow it will dump the info for spinlock's bad magic warning as below: [ 39.084394] Disabling non-boot CPUs ... [ 39.092871] BUG: spinlock bad magic on CPU#1, swapper/1/0 [ 39.092896] lock: init_mm+0x338/0x3e0, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 [ 39.092907] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G O 3.10.33 torvalds#125 [ 39.092912] Call trace: [ 39.092927] [<ffffffc000087e64>] dump_backtrace+0x0/0x16c [ 39.092934] [<ffffffc000087fe0>] show_stack+0x10/0x1c [ 39.092947] [<ffffffc000765334>] dump_stack+0x1c/0x28 [ 39.092953] [<ffffffc0007653b8>] spin_dump+0x78/0x88 [ 39.092960] [<ffffffc0007653ec>] spin_bug+0x24/0x34 [ 39.092971] [<ffffffc000300a28>] do_raw_spin_lock+0x98/0x17c [ 39.092979] [<ffffffc00076cf08>] _raw_spin_lock_irqsave+0x4c/0x60 [ 39.092990] [<ffffffc000094044>] set_mm_context+0x1c/0x6c [ 39.092996] [<ffffffc0000941c8>] __new_context+0x94/0x10c [ 39.093007] [<ffffffc0000d63d4>] idle_task_exit+0x104/0x1b0 [ 39.093014] [<ffffffc00008d91c>] cpu_die+0x14/0x74 [ 39.093021] [<ffffffc000084f74>] arch_cpu_idle_dead+0x8/0x14 [ 39.093030] [<ffffffc0000e7f18>] cpu_startup_entry+0x1ec/0x258 [ 39.093036] [<ffffffc00008d810>] secondary_start_kernel+0x114/0x124 Signed-off-by: Leo Yan <[email protected]> Acked-by: Will Deacon <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
chrillomat
pushed a commit
to chrillomat/linux
that referenced
this pull request
Oct 6, 2014
torvalds#125 IRQ torvalds#125's status is not constant on different boards, IRQ torvalds#32 is IOMUXC's interrupt which can be triggered manually at anytime, use this irq instead of torvalds#125 to generate interrupt for avoiding CCM enter low power mode by mistake. Signed-off-by: Anson Huang <[email protected]>
@vjiki This is great and good, but Linus doesn't accept pull requests from GitHub. Consult https://github.com/torvalds/linux/blob/master/Documentation/HOWTO instead. |
krzk
pushed a commit
to krzk/linux
that referenced
this pull request
May 2, 2015
…heckpatch-fixes ERROR: code indent should use tabs where possible torvalds#120: FILE: include/linux/capability.h:220: + return true;$ WARNING: please, no spaces at the start of a line torvalds#120: FILE: include/linux/capability.h:220: + return true;$ ERROR: code indent should use tabs where possible torvalds#125: FILE: include/linux/capability.h:225: + return true;$ WARNING: please, no spaces at the start of a line torvalds#125: FILE: include/linux/capability.h:225: + return true;$ ERROR: code indent should use tabs where possible torvalds#129: FILE: include/linux/capability.h:229: + return true;$ WARNING: please, no spaces at the start of a line torvalds#129: FILE: include/linux/capability.h:229: + return true;$ ERROR: code indent should use tabs where possible torvalds#134: FILE: include/linux/capability.h:234: + return true;$ WARNING: please, no spaces at the start of a line torvalds#134: FILE: include/linux/capability.h:234: + return true;$ ERROR: code indent should use tabs where possible torvalds#170: FILE: include/linux/cred.h:79: + return 1;$ WARNING: please, no spaces at the start of a line torvalds#170: FILE: include/linux/cred.h:79: + return 1;$ ERROR: code indent should use tabs where possible torvalds#174: FILE: include/linux/cred.h:83: + return 1;$ WARNING: please, no spaces at the start of a line torvalds#174: FILE: include/linux/cred.h:83: + return 1;$ total: 6 errors, 6 warnings, 310 lines checked NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or scripts/cleanfile ./patches/kernel-conditionally-support-non-root-users-groups-and-capabilities.patch has style problems, please review. If any of these errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Iulia Manda <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
andy-shev
pushed a commit
to andy-shev/linux
that referenced
this pull request
Aug 4, 2015
Use device_is_registered() instad of sysfs flag to determine if we should free sysfs representation of particular LUN. sysfs flag in fsg common determines if luns attributes should be exposed using sysfs. This flag is used when creating and freeing luns. Unfortunately there is no guarantee that this flag will not be changed between creation and removal of particular LUN. Especially because of lun.0 which is created during allocating instance of function. This may lead to resource leak or NULL pointer dereference: [ 62.539925] Unable to handle kernel NULL pointer dereference at virtual address 00000044 [ 62.548014] pgd = ec994000 [ 62.550679] [00000044] *pgd=6d7be831, *pte=00000000, *ppte=00000000 [ 62.556933] Internal error: Oops: 17 [#1] PREEMPT SMP ARM [ 62.562310] Modules linked in: g_mass_storage(+) [ 62.566916] CPU: 2 PID: 613 Comm: insmod Not tainted 4.2.0-rc4-00077-ge29ee91-dirty torvalds#125 [ 62.574984] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [ 62.581061] task: eca56e80 ti: eca76000 task.ti: eca76000 [ 62.586450] PC is at kernfs_find_ns+0x8/0xe8 [ 62.590698] LR is at kernfs_find_and_get_ns+0x30/0x48 [ 62.595732] pc : [<c01277c0>] lr : [<c0127b88>] psr: 40010053 [ 62.595732] sp : eca77c40 ip : eca77c38 fp : 000008c1 [ 62.607187] r10: 00000001 r9 : c0082f38 r8 : ed41ce40 [ 62.612395] r7 : c05c1484 r6 : 00000000 r5 : 00000000 r4 : c0814488 [ 62.618904] r3 : 00000000 r2 : 00000000 r1 : c05c1484 r0 : 00000000 [ 62.625417] Flags: nZcv IRQs on FIQs off Mode SVC_32 ISA ARM Segment user [ 62.632620] Control: 10c5387d Table: 6c99404a DAC: 00000015 [ 62.638348] Process insmod (pid: 613, stack limit = 0xeca76210) [ 62.644251] Stack: (0xeca77c40 to 0xeca78000) [ 62.648594] 7c40: c0814488 00000000 00000000 c05c1484 ed41ce40 c0127b88 00000000 c0824888 [ 62.656753] 7c60: ed41d038 ed41d030 ed41d000 c012af4c 00000000 c0824858 ed41d038 c02e3314 [ 62.664912] 7c80: ed41d030 00000000 ed41ce04 c02d9e8c c070eda8 eca77cb4 000008c1 c058317c [ 62.673071] 7ca0: 000008c1 ed41d030 ed41ce00 ed41ce04 ed41d000 c02da044 ed41cf48 c0375870 [ 62.681230] 7cc0: ed9d3c04 ed9d3c00 ed52df80 bf000940 fffffff0 c03758f4 c03758c0 00000000 [ 62.689389] 7ce0: bf000564 c03614e0 ed9d3c04 bf000194 c0082f38 00000001 00000000 c0000100 [ 62.697548] 7d00: c0814488 c0814488 c086b1dc c05893a8 00000000 ed7e8320 00000000 c0128b88 [ 62.705707] 7d20: ed8a6b40 00000000 00000000 ed410500 ed8a6b40 c0594818 ed7e8320 00000000 [ 62.713867] 7d40: 00000000 c0129f20 00000000 c082c444 ed8a6b40 c012a684 00001000 00000000 [ 62.722026] 7d60: c0594818 c082c444 00000000 00000000 ed52df80 ed52df80 00000000 00000000 [ 62.730185] 7d80: 00000000 00000000 00000001 00000002 ed8e9b70 ed52df80 bf0006d0 00000000 [ 62.738345] 7da0: ed8e9b70 ed410500 ed618340 c036129c ed8c1c00 bf0006d0 c080b158 ed8c1c00 [ 62.746504] 7dc0: bf0006d0 c080b158 ed8c1c08 ed410500 c0082f38 ed618340 000008c1 c03640ac [ 62.754663] 7de0: 00000000 bf0006d0 c082c8dc c080b158 c080b158 c03642d4 00000000 bf003000 [ 62.762822] 7e00: 00000000 c0009784 00000000 00000001 00000000 c05849b0 00000002 ee7ab780 [ 62.770981] 7e20: 00000002 ed4105c0 0000c53e 000000d0 c0808600 eca77e5c 00000004 00000000 [ 62.779140] 7e40: bf000000 c0095680 c08075a0 ee001f00 ed4105c0 c00cadc0 ed52df80 bf000780 [ 62.787300] 7e60: ed4105c0 bf000780 00000001 bf0007c8 c0082f38 ed618340 000008c1 c0083e24 [ 62.795459] 7e80: 00000001 bf000780 00000001 eca77f58 00000001 bf000780 00000001 c00857f4 [ 62.803618] 7ea0: bf00078c 00007fff 00000000 c00835b4 eca77f58 00000000 c0082fac eca77f58 [ 62.811777] 7ec0: f05038c0 0003b008 bf000904 00000000 00000000 bf00078c 6e72656b 00006c65 [ 62.819936] 7ee0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 62.828095] 7f00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 62.836255] 7f20: 00000000 00000000 00000000 00000000 00000000 00000000 00000003 0003b008 [ 62.844414] 7f40: 0000017b c000f5c8 eca76000 00000000 0003b008 c0085df8 f04ef000 0001b8a9 [ 62.852573] 7f60: f0503258 f05030c2 f0509fe8 00000968 00000dc8 00000000 00000000 00000000 [ 62.860732] 7f80: 00000029 0000002a 00000011 00000000 0000000a 00000000 33f6eb00 0003b008 [ 62.868892] 7fa0: bef01cac c000f400 33f6eb00 0003b008 00000003 0003b008 00000000 00000003 [ 62.877051] 7fc0: 33f6eb00 0003b008 bef01cac 0000017b 00000000 0003b008 0000000b 0003b008 [ 62.885210] 7fe0: bef01ae0 bef01ad0 0001dc23 b6e8c162 800b0070 00000003 c0c0c0c0 c0c0c0c0 [ 62.893380] [<c01277c0>] (kernfs_find_ns) from [<c0824888>] (pm_qos_latency_tolerance_attr_group+0x0/0x10) [ 62.903005] Code: e28dd00c e8bd80f0 e92d41f0 e2923000 (e1d0e4b4) [ 62.909115] ---[ end trace 02fb4373ef095c7b ]--- Acked-by: Michal Nazarewicz <[email protected]> Signed-off-by: Krzysztof Opasiak <[email protected]> Signed-off-by: Felipe Balbi <[email protected]>
ddstreet
referenced
this pull request
in ddstreet/linux
Aug 6, 2015
GIT 32f494c98c21b03a78c2305cde6ae1b421db576e commit 32f494c98c21b03a78c2305cde6ae1b421db576e Author: Stephen Rothwell <[email protected]> Date: Wed Aug 5 16:41:48 2015 +1000 crypto: authenc - select CRYPTO_NULL Signed-off-by: Stephen Rothwell <[email protected]> commit aa7c043d9783f538319e77deeae5d90ff5d6907b Author: Richard Guy Briggs <[email protected]> Date: Sat Aug 1 15:41:13 2015 -0400 audit: eliminate unnecessary extra layer of watch parent references The audit watch parent count was imbalanced, adding an unnecessary layer of watch parent references. Decrement the additional parent reference when a watch is reused, already having a reference to the parent. audit_find_parent() gets a reference to the parent, if the parent is already known. This additional parental reference is not needed if the watch is subsequently found by audit_add_to_parent(), and consumed if the watch does not already exist, so we need to put the parent if the watch is found, and do nothing if this new watch is added to the parent. If the parent wasn't already known, it is created with a refcount of 1 and added to the audit_watch_group, then incremented by one to be subsequently consumed by the newly created watch in audit_add_to_parent(). The rule points to the watch, not to the parent, so the rule's refcount gets bumped, not the parent's. See LKML, 2015-07-16 Signed-off-by: Richard Guy Briggs <[email protected]> Signed-off-by: Paul Moore <[email protected]> commit f8259b262bedd5ec71e55de5953464ea86ff69d9 Author: Richard Guy Briggs <[email protected]> Date: Sat Aug 1 15:41:12 2015 -0400 audit: eliminate unnecessary extra layer of watch references The audit watch count was imbalanced, adding an unnecessary layer of watch references. Only add the second reference when it is added to a parent. Signed-off-by: Richard Guy Briggs <[email protected]> Signed-off-by: Paul Moore <[email protected]> commit 6abc8ca19df0078de17dc38340db3002ed489ce7 Author: Tejun Heo <[email protected]> Date: Tue Aug 4 15:20:55 2015 -0400 cgroup: define controller file conventions Traditionally, each cgroup controller implemented whatever interface it wanted leading to interfaces which are widely inconsistent. Examining the requirements of the controllers readily yield that there are only a few control schemes shared among all. Two major controllers already had to implement new interface for the unified hierarchy due to significant structural changes. Let's take the chance to establish common conventions throughout all controllers. This patch defines CGROUP_WEIGHT_MIN/DFL/MAX to be used on all weight based control knobs and documents the conventions that controllers should follow on the unified hierarchy. Except for io.weight knob, all existing unified hierarchy knobs are already compliant. A follow-up patch will update io.weight. v2: Added descriptions of min, low and high knobs. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Johannes Weiner <[email protected]> Cc: Li Zefan <[email protected]> Cc: Peter Zijlstra <[email protected]> commit 1dadafa86a779884f14a6e7a3ddde1a57b0a0a65 Author: Tim Gardner <[email protected]> Date: Tue Aug 4 11:26:04 2015 -0600 workqueue: Make flush_workqueue() available again to non GPL modules Commit 37b1ef31a568fc02e53587620226e5f3c66454c8 ("workqueue: move flush_scheduled_work() to workqueue.h") moved the exported non GPL flush_scheduled_work() from a function to an inline wrapper. Unfortunately, it directly calls flush_workqueue() which is a GPL function. This has the effect of changing the licensing requirement for this function and makes it unavailable to non GPL modules. See commit ad7b1f841f8a54c6d61ff181451f55b68175e15a ("workqueue: Make schedule_work() available again to non GPL modules") for precedent. Signed-off-by: Tim Gardner <[email protected]> Signed-off-by: Tejun Heo <[email protected]> commit b03ba9e314c12b2127243145b5c1f41b2408de62 Author: Sifan Naeem <[email protected]> Date: Wed Jul 29 11:55:26 2015 +0100 spi: img-spfi: fix multiple calls to request gpio spfi_setup may be called many times by the spi framework, but gpio_request_one can only be called once without freeing, repeatedly calling gpio_request_one will cause an error to be thrown, which causes the request to spi_setup to be marked as failed. We can have a per-spi_device flag that indicates whether or not the gpio has been requested. If the gpio has already been requested use gpio_direction_output to set the direction of the gpio. Fixes: 8c2c8c03cdcb ("spi: img-spfi: Control CS lines with GPIO") Signed-off-by: Sifan Naeem <[email protected]> Signed-off-by: Mark Brown <[email protected]> Cc: [email protected] commit d4ea7d86457a8d0ea40ce77bdeda1fc966cc35ec Author: Ian Campbell <[email protected]> Date: Sat Aug 1 18:13:25 2015 +0100 regulator: axp20x: Add module alias This allows the module to be autoloaded. Together with 07949bf9c63c ("cpufreq: dt: allow driver to boot automatically") this is sufficient to allow a modular kernel (such as Debian's) to enable cpufreq on a Cubietruck. Signed-off-by: Ian Campbell <[email protected]> Signed-off-by: Mark Brown <[email protected]> commit 5dbe135a153837ce9367bdfacf7aabfc6fb76f4b Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:51 2015 +0200 usb: gadget: epautoconf: remove ep and desc configuration from ep_matches() As function ep_matches() is used to match endpoint with usb descriptor it's highly unintuitive that it modifies endpoint and descriptor structures fields. This patch moves code configuring ep and desc from ep_matches() to usb_ep_autoconfig_ss(), so now function ep_matches() does nothing more than its name suggests. [ [email protected] : fix build warning ] Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit b58713d53a8f41d57b24c93de0b1c7e9550ba70f Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:50 2015 +0200 usb: gadget: epautoconf: remove pxa quirk from ep_matches() The same effect can be achieved by using capabilities flags, so now we can get rid of handling of hardware specific limitations in generic code. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit b86f33a3a371a4c3aa8dbb2f4125634a4e0d09dc Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:49 2015 +0200 usb: gadget: epautoconf: add endpoint capabilities flags verification Introduce endpoint matching mechanism basing on endpoint capabilities flags. We check if endpoint supports transfer type and direction requested in ep descriptor. Since we have this new endpoint matching mechanism there is no need to have old code guessing endpoint capabilities basing on its name, so we are getting rid of it. Remove also the obsolete comment. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 47bef386511517449e2f24a89a41084af53616f8 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:48 2015 +0200 usb: gadget: atmel_usba_udc: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 916f7ac5dbc312969a90bc35a5f4fcbfc2965d60 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:47 2015 +0200 usb: renesas: gadget: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 8501955e888662ca56775eec2eb804e7bc7fce0d Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:46 2015 +0200 usb: musb: gadget: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit eb4cbc19526d62657b838d6f0b694a000e5b4c81 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:45 2015 +0200 usb: isp1760: udc: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 927d9f77fe3d5f9261eeb465e2b60768e400ffc9 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:44 2015 +0200 usb: gadget: udc-xilinx: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 0648772d51c0ff3949397cb0cbcf2435ee32c550 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:43 2015 +0200 usb: gadget: s3c2410_udc: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit bc1b9f300ae06c64fcd056fb959b3d709ad2ef33 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:42 2015 +0200 usb: gadget: s3c-hsudc: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 0ec8026d7afee625f52631708d84435ea4735da6 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:41 2015 +0200 usb: gadget: r8a66597-udc: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit a180e3da97a323510071b2b5e42b5dc07df239da Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:40 2015 +0200 usb: gadget: pxa27x_udc: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 36411b6b042d350a43fe1e0d3ce78fbda30f4f02 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:39 2015 +0200 usb: gadget: pxa25x_udc: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 85a4ed003b39f70ba478e613a9be2c334f1079e7 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:38 2015 +0200 usb: gadget: pch_udc: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 7d4ba80d3a91222de577b652a8936f935de8b409 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:37 2015 +0200 usb: gadget: omap_udc: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit c23c3c3c3059f3dc47268cd7a28b96b9efdbc1ea Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:36 2015 +0200 usb: gadget: net2280: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit f95aec51da16250841a4254db36f9771446cdbb6 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:35 2015 +0200 usb: gadget: net2272: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 43710a8dba9ae607decdeaf7a56a51dd5b42184e Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:34 2015 +0200 usb: gadget: mv_udc_core: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit c12a30629f8b5fe8c2aba42c3128df702bbc9e83 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:33 2015 +0200 usb: gadget: mv_u3d_core: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 8ddbf94fd5b536b3adf5ffa631c5951718e7301d Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:32 2015 +0200 usb: gadget: m66592-udc: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 4d75c8bd613c5ba99ee02cfe38610c82e7fe8362 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:31 2015 +0200 usb: gadget: lpc32xx_udc: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 892269925991b449ae47dcc0debb324ae451022e Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:30 2015 +0200 usb: gadget: gr_udc: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit b0bf5fbfbd30ffbb9e45169f78412f0596e16412 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:29 2015 +0200 usb: gadget: goku_udc: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 455d11c93582c4167f55af0969b83821450be120 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:28 2015 +0200 usb: gadget: fusb300_udc: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 60a28c63712f190b45e9d8f0ca593c927970fd51 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:27 2015 +0200 usb: gadget: fsl_udc_core: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit e8fc42f6a19af320d7a4718c05c3f249bf5d3151 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:26 2015 +0200 usb: gadget: fsl_qe_udc: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 8d29237a436dc8e2b5c44dd0ca662a680f16deb5 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:25 2015 +0200 usb: gadget: fotg210-udc: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 7a3b8e7098946b44c014f3df0ceef27fb273c142 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:24 2015 +0200 usb: gadget: dummy-hcd: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit b079dd6156a6544e0383642a9ec97d17485aa244 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:23 2015 +0200 usb: gadget: bdc: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 1b0ba527702268992a62227dde4911477f057ca5 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:22 2015 +0200 usb: gadget: bcm63xx_udc: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit b9ed96d7d579416a8fd2e1cef66541bfdad2720f Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:21 2015 +0200 usb: gadget: at91_udc: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 6f02ac5ac9bba538593e4359dd5e83c4d42822fe Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:20 2015 +0200 usb: gadget: amd5536udc: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit a474d3b73ba7f22844e672ac004f01cd9b8be159 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:19 2015 +0200 usb: dwc3: gadget: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 2954522f135246287c36524bc96e9dae8c40f8a9 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:18 2015 +0200 usb: dwc2: gadget: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit a7e3f1410855db6b67ec29a386e74be2df3cc311 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:17 2015 +0200 usb: chipidea: udc: add ep capabilities support Convert endpoint configuration to new capabilities model. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 68b5c947515a252b9e416e419fca4c1382912948 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:16 2015 +0200 staging: emxx_udc: add ep capabilities support Convert endpoint configuration to new capabilities model. Fixed typo in "epc-nulk" to "epc-bulk". Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 80e6e3847f851fc05e63265050115e29e2a50d7e Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:15 2015 +0200 usb: gadget: add endpoint capabilities helper macros Add macros useful while initializing array of endpoint capabilities structures. These macros makes structure initialization more compact to decrease number of code lines and increase readability of code. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 734b5a2addd333829a6d647ee14a3609c7a87c44 Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:14 2015 +0200 usb: gadget: add endpoint capabilities flags Introduce struct usb_ep_caps which contains information about capabilities of usb endpoints - supported transfer types and directions. This structure should be filled by UDC driver for each of its endpoints, and will be used in epautoconf in new ep matching mechanism which will replace ugly guessing of endpoint capabilities basing on its name. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit cc476b42a39d5a66d94f46cade972dcb8ee278df Author: Robert Baldyga <[email protected]> Date: Fri Jul 31 16:00:13 2015 +0200 usb: gadget: encapsulate endpoint claiming mechanism So far it was necessary for usb functions to set ep->driver_data in endpoint obtained from autoconfig to non-null value, to indicate that endpoint is claimed by function (in autoconfig it was checked if endpoint has set this field to non-null value, and if it has, it was assumed that it is claimed). It could cause bugs because if some function doesn't set this field autoconfig could return the same endpoint more than one time. To help to avoid such bugs this patch adds claimed flag to struct usb_ep, and encapsulates endpoint claiming mechanism inside usb_ep_autoconfig_ss() and usb_ep_autoconfig_reset(), so now usb functions don't need to perform any additional actions to mark endpoint obtained from autoconfig as claimed. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 94e5c23d3c52f58f4051810a15aeacc085127ad1 Author: Axel Lin <[email protected]> Date: Tue Aug 4 13:52:22 2015 +0800 spi: pxa2xx: Add terminating entry for pxa2xx_spi_pci_compound_match Signed-off-by: Axel Lin <[email protected]> Acked-by: Jarkko Nikula <[email protected]> Signed-off-by: Mark Brown <[email protected]> commit 89a6356676eba38e498507b5b67dcc07a714a149 Author: Colin Ian King <[email protected]> Date: Fri Jul 31 13:42:29 2015 +0100 spi: spidev: fix inconsistent indenting Fix inconsistent indenting in spidev_open, no functional change. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Mark Brown <[email protected]> commit 0f4315a8f1a73f130bbc5dde134b704ea6dda56c Author: Felipe Balbi <[email protected]> Date: Tue Aug 4 11:02:45 2015 -0500 usb: gadget: f_uac2: fix build warning commit 913e4a90b6f9 ("usb: gadget: f_uac2: finalize wMaxPacketSize according to bandwidth") added a possible build warning when calling min(). In order to fix the warning, we just make sure to call min_t() and tell that its arguments should be u16. Cc: Peter Chen <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 7f352964852ede9e95d18fd3825c5200fb48b3a5 Author: Saurabh Karajgaonkar <[email protected]> Date: Tue Aug 4 14:02:28 2015 +0000 usb: musb: musb_dsps: Simplify return statement Replace redundant variable use in return statement. Signed-off-by: Saurabh Karajgaonkar <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit c5673f5ce4c0faa97df419877bcdebbd76d43151 Author: Saurabh Karajgaonkar <[email protected]> Date: Tue Aug 4 14:02:03 2015 +0000 usb: phy: phy-keystone: Simplify return statement Replace redundant variable use in return statement. Signed-off-by: Saurabh Karajgaonkar <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 4b68b50fd45e2f429da574c74d9a3788a861434f Author: Saurabh Karajgaonkar <[email protected]> Date: Tue Aug 4 14:01:31 2015 +0000 usb: phy: phy-mxs-usb: Simplify return statement Replace redundant variable use in return statement. Signed-off-by: Saurabh Karajgaonkar <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> commit 5406898354ebfb11f49b955fb5e49a62786a542f Author: Mengdong Lin <[email protected]> Date: Tue Aug 4 15:47:35 2015 +0100 ASoC: topology: fix typo in soc_tplg_kcontrol_bind_io() Signed-off-by: Mengdong Lin <[email protected]> Signed-off-by: Liam Girdwood <[email protected]> Signed-off-by: Mark Brown <[email protected]> commit 6d3ec14d703c660c4baf8d726538b5415e23b4fb Author: Lukas Czerner <[email protected]> Date: Tue Aug 4 11:21:52 2015 -0400 jbd2: limit number of reserved credits Currently there is no limitation on number of reserved credits we can ask for. If we ask for more reserved credits than 1/2 of maximum transaction size, or if total number of credits exceeds the maximum transaction size per operation (which is currently only possible with the former) we will spin forever in start_this_handle(). Fix this by adding this limitation at the start of start_this_handle(). This patch also removes the credit limitation 1/2 of maximum transaction size, since we really only want to limit the number of reserved credits. There is not much point to limit the credits if there is still space in the journal. This accidentally also fixes the online resize, where due to the limitation of the journal credits we're unable to grow file systems with 1k block size and size between 16M and 32M. It has been partially fixed by 2c869b262a10ca99cb866d04087d75311587a30c, but not entirely. Thanks Jan Kara for helping me getting the correct fix. Signed-off-by: Lukas Czerner <[email protected]> Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Theodore Ts'o <[email protected]> commit 21caf3a765b0a88f8fedf63b36e5d15683b73fe5 Author: Lorenzo Nava <[email protected]> Date: Thu Jul 2 17:28:03 2015 +0100 ARM: 8398/1: arm DMA: Fix allocation from CMA for coherent DMA This patch allows the use of CMA for DMA coherent memory allocation. At the moment if the input parameter "is_coherent" is set to true the allocation is not made using the CMA, which I think is not the desired behaviour. The patch covers the allocation and free of memory for coherent DMA. Signed-off-by: Lorenzo Nava <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Russell King <[email protected]> commit c0e736629b89ec176ab4dba45943364cbdc135ba Author: Geert Uytterhoeven <[email protected]> Date: Tue Aug 4 15:22:11 2015 +0200 drm/fb-helper: Move drm_fb_helper_force_kernel_mode() inside #ifdef If CONFIG_MAGIC_SYSRQ is not set: drivers/gpu/drm/drm_fb_helper.c:390:13: warning: 'drm_fb_helper_force_kernel_mode' defined but not used [-Wunused-function] static bool drm_fb_helper_force_kernel_mode(void) ^ Move drm_fb_helper_force_kernel_mode() inside the existing #ifdef to fix this. Signed-off-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> commit b996f201b74e52f9e9da47e1377e0b948d5ba25c Author: Geert Uytterhoeven <[email protected]> Date: Tue Aug 4 15:22:10 2015 +0200 drm/fb-helper: Clarify drm_fb_helper_restore_fbdev_mode*() As of commit 5ea1f752ae04be40 ("drm: add drm_fb_helper_restore_fbdev_mode_unlocked()"), drm_fb_helper_restore_fbdev_mode() is no longer public, and drivers should call drm_fb_helper_restore_fbdev_mode_unlocked() from their ->lastclose callbacks instead. Update the documentation to reflect this, and absorb the one liner drm_fb_helper_restore_fbdev_mode() into its single caller. Signed-off-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> commit 714dc94e0d22ef951722f5d6ec316f4a7867f850 Author: Jiri Kosina <[email protected]> Date: Tue Aug 4 15:46:10 2015 +0200 Revert "HID: core/input: Fix accessing freed memory during driver unbind" This reverts commit e19232a20913513fbb4b21ac8f45a4b9a805d6e4. It's broken and is going to be replaced with the one from for-4.2/upstream-fixes-devm-fixed branch. This is for-next branch only revert, both the original commit and the revert will never make it to Linus (instead a fixed version will be sent directly). commit 0621809e37936e7c2b3eac9165cf2aad7f9189eb Author: Krzysztof Kozlowski <[email protected]> Date: Mon Aug 3 14:57:30 2015 +0900 HID: hid-input: Fix accessing freed memory during device disconnect During unbinding the driver was dereferencing a pointer to memory already freed by power_supply_unregister(). Driver was freeing its internal description of battery through pointers stored in power_supply structure. However, because the core owns the power supply instance, after calling power_supply_unregister() this memory is freed and the driver cannot access these members. Fix this by storing the pointer to internal description of battery in a local variable before calling power_supply_unregister(), so the pointer remains valid. Signed-off-by: Krzysztof Kozlowski <[email protected]> Reported-by: H.J. Lu <[email protected]> Fixes: 297d716f6260 ("power_supply: Change ownership from driver to core") Cc: <[email protected]> Reviewed-by: Dmitry Torokhov <[email protected]> Signed-off-by: Jiri Kosina <[email protected]> commit 3f14a63a544374225c17221a5058748360428dc3 Author: Jason Gerecke <[email protected]> Date: Mon Aug 3 10:17:05 2015 -0700 HID: wacom: Remove WACOM_QUIRK_NO_INPUT WACOM_QUIRK_NO_INPUT is a signal to the driver that input devices should not be created for a particular device. This quirk was used by the wireless receiver to prevent any devices from being created during the initial probe (defering it instead until we got a tablet connection event in 'wacom_wireless_work'). This quirk is not necessary now that a device_type is associated with each device. Any input device allocated by 'wacom_allocate_inputs' which is not necessary for a particular device is freed in 'wacom_register_inputs'. In particular, none of the wireless receivers devices have the pen, pad, or touch device types set so the same effect is achieved without the need to be explicit. We now return early in wacom_retrieve_hid_descriptor for wireless devices (to prevent the device_type from being overridden) but since we ignore the HID descriptor for the wireless reciever anyway, this is not an issue. Signed-off-by: Jason Gerecke <[email protected]> Reviewed-by: Benjamin Tissoires <[email protected]> Signed-off-by: Jiri Kosina <[email protected]> commit ccad85cc1ee34509840e5af80a436ceaf0b71edb Author: Jason Gerecke <[email protected]> Date: Mon Aug 3 10:17:04 2015 -0700 HID: wacom: Replace WACOM_QUIRK_MONITOR with WACOM_DEVICETYPE_WL_MONITOR The monitor interface on the wireless receiver is more logically expressed as a type of device instead of a quirk. Signed-off-by: Jason Gerecke <[email protected]> Reviewed-by: Benjamin Tissoires <[email protected]> Signed-off-by: Jiri Kosina <[email protected]> commit 8dc8641e619228153ab0bc609f9f534126e87c08 Author: Jason Gerecke <[email protected]> Date: Mon Aug 3 10:17:03 2015 -0700 HID: wacom: Use calculated pkglen for wireless touch interface Commit 01c846f introduced the 'wacom_compute_pktlen' function which automatically determines the correct value for an interface's pkglen by scanning the HID descriptor. This function returns the correct value for the wireless receiver's touch interface, removing the need for us to set it manually here. Signed-off-by: Jason Gerecke <[email protected]> Reviewed-by: Benjamin Tissoires <[email protected]> Signed-off-by: Jiri Kosina <[email protected]> commit 061357a06c69dfa805a033ae22d668de16191cd4 Author: Geert Uytterhoeven <[email protected]> Date: Tue Aug 4 14:28:17 2015 +0200 ARM: shmobile: R-Mobile: Use CPG/MSTP Clock Domain attach/detach helpers The R-Mobile PM Domain driver manages both power domains and a clock domain. The clock domain part is very similar to the CPG/MSTP Clock Domain, which is used on shmobile SoCs without device power domains, except for the way how clocks suitable for power management are selected: - The former uses the first clock tied to the device through the NULL con_id, which is a relic from the legacy pm_clk_notifier-based method in drivers/sh/pm_runtime.c, - The latter looks for suitable clocks in DT, which is more future-proof. All platforms using this driver are now supported in DT-based ARM multi-platform builds only, hence switch to using the CPG/MSTP Clock Domain helpers. Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Ulf Hansson <[email protected]> Signed-off-by: Simon Horman <[email protected]> commit a1a1fdd83722a30efc705a5032ace2ad20b2ca6d Author: Geert Uytterhoeven <[email protected]> Date: Tue Aug 4 15:25:35 2015 +0200 clk: shmobile: mstp: Consider "zb_clk" suitable for power management Currently the CPG/MSTP Clock Domain code looks for MSTP clocks to power manage a device. Unfortunately, on R-Mobile APE6 (r8a73a4) and SH-Mobile AG5 (sh73a0), the Bus State Controller (BSC) is not power-managed by an MSTP clock, but by a plain CPG clock (zb_clk). Add a special case to handle this, so the clock is properly managed, and devices connected to the BSC work as expected. Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Ulf Hansson <[email protected]> Signed-off-by: Simon Horman <[email protected]> commit 6989c8c4cfa7f7cd406906e1abbe7b073d3d66e0 Author: Geert Uytterhoeven <[email protected]> Date: Tue Aug 4 14:28:15 2015 +0200 drivers: sh: Disable PM runtime for multi-platform ARM with genpd If the default PM Domain using PM_CLK is used for PM runtime, the real Clock Domain cannot be registered from DT later. Hence do not enable it when running a multi-platform kernel with genpd support on R-Car or RZ. The CPG/MSTP Clock Domain driver will take care of PM runtime management of the module clocks. Now most multi-platform ARM shmobile platforms (SH-Mobile, R-Mobile, R-Car, RZ) use DT-based PM Domains to take care of PM runtime management of the module clocks, simplify the platform logic by replacing the explicit SoC checks by a single check for the presence of MSTP clocks in DT. Backwards-compatiblity with old DTs (mainly for R-Car Gen2) is provided by checking for the presence of a "#power-domain-cells" property in DT. The default PM Domain is still needed for: - backwards-compatibility with old DTs that lack PM Domain properties, - the CONFIG_PM=n case, - legacy (non-DT) ARM/shmobile platforms without genpd support (r8a7778, r8a7779), - legacy SuperH. Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Ulf Hansson <[email protected]> Signed-off-by: Simon Horman <[email protected]> commit d7c01f3aba883c57aacc57f884e9081d00d1dec7 Author: Geert Uytterhoeven <[email protected]> Date: Tue Aug 4 14:28:14 2015 +0200 drivers: sh: Disable legacy default PM Domain on emev2 EMMA Mobile EV2 doesn't have MSTP clocks. All its device drivers manage clocks explicitly, without relying on Runtime PM, so it doesn't need the legacy default PM Domain. Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Ulf Hansson <[email protected]> Signed-off-by: Simon Horman <[email protected]> commit ba56d432e1dc98aa5df5930a0707d79e40fd5e39 Author: Geert Uytterhoeven <[email protected]> Date: Tue Aug 4 14:28:13 2015 +0200 ARM: shmobile: r8a7794 dtsi: Add CPG/MSTP Clock Domain Add an appropriate "#power-domain-cells" property to the cpg_clocks device node, to create the CPG/MSTP Clock Domain. Add "power-domains" properties to all device nodes for devices that are part of the CPG/MSTP Clock Domain and can be power-managed through an MSTP clock. This applies to most on-SoC devices, which have a one-to-one mapping from SoC device to DT device node. Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Ulf Hansson <[email protected]> Signed-off-by: Simon Horman <[email protected]> commit 23a7abb77197044b6e708e0d0d31d89bf1e68bf1 Author: Geert Uytterhoeven <[email protected]> Date: Tue Aug 4 14:28:12 2015 +0200 ARM: shmobile: r8a7793 dtsi: Add CPG/MSTP Clock Domain Add an appropriate "#power-domain-cells" property to the cpg_clocks device node, to create the CPG/MSTP Clock Domain. Add "power-domains" properties to all device nodes for devices that are part of the CPG/MSTP Clock Domain and can be power-managed through an MSTP clock. This applies to most on-SoC devices, which have a one-to-one mapping from SoC device to DT device node. Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Ulf Hansson <[email protected]> Signed-off-by: Simon Horman <[email protected]> commit a5ab8bf59f7ec14081e161bad52430e37def1189 Author: Geert Uytterhoeven <[email protected]> Date: Tue Aug 4 14:28:11 2015 +0200 ARM: shmobile: r8a7791 dtsi: Add CPG/MSTP Clock Domain Add an appropriate "#power-domain-cells" property to the cpg_clocks device node, to create the CPG/MSTP Clock Domain. Add "power-domains" properties to all device nodes for devices that are part of the CPG/MSTP Clock Domain and can be power-managed through an MSTP clock. This applies to most on-SoC devices, which have a one-to-one mapping from SoC device to DT device node. Notable exceptions are the "display" and "sound" nodes, which represent multiple SoC devices, each having their own MSTP clocks. Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Ulf Hansson <[email protected]> Signed-off-by: Simon Horman <[email protected]> commit 835a058492b1d539e891af12ce37713b1fb327e9 Author: Geert Uytterhoeven <[email protected]> Date: Tue Aug 4 14:28:10 2015 +0200 ARM: shmobile: r8a7790 dtsi: Add CPG/MSTP Clock Domain Add an appropriate "#power-domain-cells" property to the cpg_clocks device node, to create the CPG/MSTP Clock Domain. Add "power-domains" properties to all device nodes for devices that are part of the CPG/MSTP Clock Domain and can be power-managed through an MSTP clock. This applies to most on-SoC devices, which have a one-to-one mapping from SoC device to DT device node. Notable exceptions are the "display" and "sound" nodes, which represent multiple SoC devices, each having their own MSTP clocks. Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Ulf Hansson <[email protected]> Signed-off-by: Simon Horman <[email protected]> commit 3397481089b3ab931536b26fb65ef4a41eefdb67 Author: Geert Uytterhoeven <[email protected]> Date: Tue Aug 4 14:28:09 2015 +0200 ARM: shmobile: r8a7779 dtsi: Add CPG/MSTP Clock Domain Add an appropriate "#power-domain-cells" property to the cpg_clocks device node, to create the CPG/MSTP Clock Domain. Add "power-domains" properties to all device nodes for devices that are part of the CPG/MSTP Clock Domain and can be power-managed through an MSTP clock. This applies to most on-SoC devices, which have a one-to-one mapping from SoC device to DT device node. Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Ulf Hansson <[email protected]> Signed-off-by: Simon Horman <[email protected]> commit 6d3bba42aef6eb7594b845518a51d5aa094d3502 Author: Geert Uytterhoeven <[email protected]> Date: Tue Aug 4 14:28:08 2015 +0200 ARM: shmobile: r8a7778 dtsi: Add CPG/MSTP Clock Domain Add an appropriate "#power-domain-cells" property to the cpg_clocks device node, to create the CPG/MSTP Clock Domain. Add "power-domains" properties to all device nodes for devices that are part of the CPG/MSTP Clock Domain and can be power-managed through an MSTP clock. This applies to most on-SoC devices, which have a one-to-one mapping from SoC device to DT device node. A notable exception is the "sound" node, which represents multiple SoC devices, each having their own MSTP clocks. Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Ulf Hansson <[email protected]> Signed-off-by: Simon Horman <[email protected]> commit d45696723575a6a7fbb43b754fb89121746b5982 Author: Geert Uytterhoeven <[email protected]> Date: Tue Aug 4 14:28:07 2015 +0200 ARM: shmobile: r7s72100 dtsi: Add CPG/MSTP Clock Domain Add an appropriate "#power-domain-cells" property to the cpg_clocks device node, to create the CPG/MSTP Clock Domain. Add "power-domains" properties to all device nodes for devices that are part of the CPG/MSTP Clock Domain and can be power-managed through an MSTP clock. This applies to most on-SoC devices, which have a one-to-one mapping from SoC device to DT device node. Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Ulf Hansson <[email protected]> Signed-off-by: Simon Horman <[email protected]> commit 584bf98feb6c38875b76d17ed52e89711b4b5577 Author: Geert Uytterhoeven <[email protected]> Date: Tue Aug 4 14:28:06 2015 +0200 clk: shmobile: rz: Add CPG/MSTP Clock Domain support Add Clock Domain support to the RZ Clock Pulse Generator (CPG) driver using the generic PM Domain. This allows to power-manage the module clocks of SoC devices that are part of the CPG/MSTP Clock Domain using Runtime PM, or for system suspend/resume. SoC devices that are part of the CPG/MSTP Clock Domain and can be power-managed through an MSTP clock should be tagged in DT with a proper "power-domains" property. Signed-off-by: Geert Uytterhoeven <[email protected]> Acked-by: Stephen Boyd <[email protected]> Reviewed-by: Ulf Hansson <[email protected]> Signed-off-by: Simon Horman <[email protected]> commit 3602362dcc18fd46a066583fe2a13b57b3df159e Author: Geert Uytterhoeven <[email protected]> Date: Tue Aug 4 14:28:05 2015 +0200 clk: shmobile: rcar-gen2: Add CPG/MSTP Clock Domain support Add Clock Domain support to the R-Car Gen2 Clock Pulse Generator (CPG) driver using the generic PM Domain. This allows to power-manage the module clocks of SoC devices that are part of the CPG/MSTP Clock Domain using Runtime PM, or for system suspend/resume. SoC devices that are part of the CPG/MSTP Clock Domain and can be power-managed through an MSTP clock should be tagged in DT with a proper "power-domains" property. Signed-off-by: Geert Uytterhoeven <[email protected]> Acked-by: Stephen Boyd <[email protected]> Reviewed-by: Ulf Hansson <[email protected]> Signed-off-by: Simon Horman <[email protected]> commit 4aabed77053836418de7065a37414526099b889c Author: Geert Uytterhoeven <[email protected]> Date: Tue Aug 4 14:28:04 2015 +0200 clk: shmobile: r8a7779: Add CPG/MSTP Clock Domain support Add Clock Domain support to the R-Car H1 Clock Pulse Generator (CPG) driver using the generic PM Domain. This allows to power-manage the module clocks of SoC devices that are part of the CPG/MSTP Clock Domain using Runtime PM, or for system suspend/resume. SoC devices that are part of the CPG/MSTP Clock Domain and can be power-managed through an MSTP clock should be tagged in DT with a proper "power-domains" property. Also update the reg property in the DT binding doc example to match the actual dtsi, which uses #address-cells and #size-cells == 1, not 2. Signed-off-by: Geert Uytterhoeven <[email protected]> Acked-by: Stephen Boyd <[email protected]> Reviewed-by: Ulf Hansson <[email protected]> Signed-off-by: Simon Horman <[email protected]> commit aef2b0f2dbc72f79289d74d7c1fefcd25ae11627 Author: Geert Uytterhoeven <[email protected]> Date: Tue Aug 4 14:28:03 2015 +0200 clk: shmobile: r8a7778: Add CPG/MSTP Clock Domain support Add Clock Domain support to the R-Car M1A Clock Pulse Generator (CPG) driver using the generic PM Domain. This allows to power-manage the module clocks of SoC devices that are part of the CPG/MSTP Clock Domain using Runtime PM, or for system suspend/resume. SoC devices that are part of the CPG/MSTP Clock Domain and can be power-managed through an MSTP clock should be tagged in DT with a proper "power-domains" property. Signed-off-by: Geert Uytterhoeven <[email protected]> Acked-by: Stephen Boyd <[email protected]> Reviewed-by: Ulf Hansson <[email protected]> Signed-off-by: Simon Horman <[email protected]> commit 1fcc5dc775d0d79153dda6fee1502ab96344b823 Author: Geert Uytterhoeven <[email protected]> Date: Tue Aug 4 14:28:02 2015 +0200 clk: shmobile: Add CPG/MSTP Clock Domain support Add Clock Domain support to the Clock Pulse Generator (CPG) Module Stop (MSTP) Clocks driver using the generic PM Domain. This allows to power-manage the module clocks of SoC devices that are part of the CPG/MSTP Clock Domain using Runtime PM, or for system suspend/resume. SoC devices that are part of the CPG/MSTP Clock Domain and can be power-managed through an MSTP clock should be tagged in DT with a proper "power-domains" property. The CPG/MSTP Clock Domain code will scan such devices for clocks that are suitable for power-managing the device, by looking for a clock that is compatible with "renesas,cpg-mstp-clocks". Signed-off-by: Geert Uytterhoeven <[email protected]> Acked-by: Laurent Pinchart <[email protected]> Acked-by: Stephen Boyd <[email protected]> Reviewed-by: Ulf Hansson <[email protected]> Reviewed-by: Kevin Hilman <[email protected]> Signed-off-by: Simon Horman <[email protected]> commit a4198fd4b487afc60810f5a12b994721df220022 Author: Herbert Xu <[email protected]> Date: Thu Jul 30 17:53:23 2015 +0800 crypto: testmgr - Reenable authenc tests Now that all implementations of authenc have been converted we can reenable the tests. Signed-off-by: Herbert Xu <[email protected]> commit aeb4c132f33d21f6cf37558a932e66e40dd8982e Author: Herbert Xu <[email protected]> Date: Thu Jul 30 17:53:22 2015 +0800 crypto: talitos - Convert to new AEAD interface This patch converts talitos to the new AEAD interface. IV generation has been removed since it's equivalent to a software implementation. Signed-off-by: Herbert Xu <[email protected]> commit e19ab1211d2848ebd028c824041d8ea249fa3aae Author: Herbert Xu <[email protected]> Date: Thu Jul 30 17:53:20 2015 +0800 crypto: qat - Convert to new AEAD interface This patch converts qat to the new AEAD interface. IV generation has been removed since it's equivalent to a software implementation. Signed-off-by: Herbert Xu <[email protected]> Tested-by: Tadeusz Struk <[email protected]> commit c1359495c8a19fa686aa48512e0290d2f3846273 Author: Herbert Xu <[email protected]> Date: Thu Jul 30 17:53:19 2015 +0800 crypto: picoxcell - Convert to new AEAD interface This patch converts picoxcell to the new AEAD interface. IV generation has been removed since it's equivalent to a software implementation. As picoxcell cannot handle SG lists longer than 16 elements, this patch has made the software fallback mandatory. If an SG list comes in that exceeds the limit, we will simply use the fallback. Signed-off-by: Herbert Xu <[email protected]> commit d7295a8dc965ee0d5b3f9b1eb7f556c2bfa78420 Author: Herbert Xu <[email protected]> Date: Thu Jul 30 17:53:18 2015 +0800 crypto: ixp4xx - Convert to new AEAD interface This patch converts ixp4xx to the new AEAD interface. IV generation has been removed since it's a purely software implementation. Signed-off-by: Herbert Xu <[email protected]> commit 479bcc7c5b9e1cbf3278462d786c37e468d5d404 Author: Herbert Xu <[email protected]> Date: Thu Jul 30 17:53:17 2015 +0800 crypto: caam - Convert authenc to new AEAD interface This patch converts the authenc implementations in caam to the new AEAD interface. The biggest change is that seqiv no longer generates a random IV. Instead the IPsec sequence number is used as the IV. Signed-off-by: Herbert Xu <[email protected]> commit 92d95ba91772279b6ef9c6e09661f67abcf27259 Author: Herbert Xu <[email protected]> Date: Thu Jul 30 17:53:16 2015 +0800 crypto: authenc - Convert to new AEAD interface This patch converts authenc to the new AEAD interface. Signed-off-by: Herbert Xu <[email protected]> commit 7079ce62c0e9bfcca35214105c08a2d00fbea9ee Author: Herbert Xu <[email protected]> Date: Thu Jul 30 17:53:14 2015 +0800 crypto: testmgr - Disable authenc test and convert test vectors This patch disables the authenc tests while the conversion to the new IV calling convention takes place. It also replaces the authenc test vectors with ones that will work with the new IV convention. Signed-off-by: Herbert Xu <[email protected]> commit fdf036507f1fc036d5a06753e9e8b13f46de73e8 Author: Fan Zhang <[email protected]> Date: Wed May 13 10:58:41 2015 +0200 KVM: s390: host STP toleration for VMs If the host has STP enabled, the TOD of the host will be changed during synchronization phases. These are performed during a stop_machine() call. As the guest TOD is based on the host TOD, we have to make sure that: - no VCPU is in the SIE (implicitly guaranteed via stop_machine()) - manual guest TOD calculations are not affected "Epoch" is the guest TOD clock delta to the host TOD clock. We have to adjust that value during the STP synchronization and make sure that code that accesses the epoch won't get interrupted in between (via disabling preemption). Signed-off-by: Fan Zhang <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Christian Borntraeger <[email protected]> Acked-by: Martin Schwidefsky <[email protected]> Signed-off-by: Christian Borntraeger <[email protected]> commit 8578531dea9672fa17cce66d320ed7d7c17c837f Author: Martin Schwidefsky <[email protected]> Date: Mon Aug 3 16:16:40 2015 +0200 s390/vtime: limit MT scaling value updates The MT scaling values are updated on each calll to do_account_vtime. This function is called for each HZ interrupt and for each context switch. Context switch can happen often, the STCCTM instruction on this path is noticable. Limit the updates to once per jiffy. Signed-off-by: Martin Schwidefsky <[email protected]> commit 5a7ff75a0c63222d138d944240146dc49a9624e1 Author: Heiko Carstens <[email protected]> Date: Tue Aug 4 09:15:58 2015 +0200 s390/syscalls: ignore syscalls reachable via sys_socketcall x86 will wire up all syscalls reachable via sys_socketcall. Therefore this will yield a lot of warnings from the checksyscalls.sh scripts on s390 where we currently don't wire them up directly. This might change in the future, but this needs to be done carefully in order to not break anything. For the time being just tell the checksyscalls script to ignore the missing syscalls on s390. Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]> commit a763bc8b656d11b7424cd2696e19efca301d8aa4 Author: Philipp Hachtmann <[email protected]> Date: Fri May 8 17:40:44 2015 +0200 s390/numa: enable support in s390 configs Signed-off-by: Philipp Hachtmann <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]> commit c29a7baf091fc6b2c9e40561030f8c62e6145a19 Author: Michael Holzheu <[email protected]> Date: Thu Mar 6 18:47:21 2014 +0100 s390/numa: add emulation support NUMA emulation (aka fake NUMA) distributes the available memory to nodes without using real topology information about the physical memory of the machine. Splitting the system memory into nodes replicates the memory management structures for each node. Particularly each node has its own "mm locks" and its own "kswapd" task. For large systems, under certain conditions, this results in improved system performance and/or latency based on reduced pressure on the mm locks and the kswapd tasks. NUMA emulation distributes CPUs to nodes while respecting the original machine topology information. This is done by trying to avoid to separate CPUs which reside on the same book or even on the same MC. Because the current Linux scheduler code requires a stable cpu to node mapping, cores are pinned to nodes when the first CPU thread is set online. This patch is based on the initial implementation from Philipp Hachtmann. Signed-off-by: Michael Holzheu <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]> commit 56113f6e6f8d57d3c184544c6421422558a9988e Author: Joe Perches <[email protected]> Date: Mon Aug 3 10:49:29 2015 -0700 ASoC: atmel_ssc_dai: Correct misuse of 0x%<decimal> Correct misuse of 0x%d in logging message. Signed-off-by: Joe Perches <[email protected]> Signed-off-by: Mark Brown <[email protected]> commit 4e491fe7920cb84dd0a2ea79800173ab1802fa22 Author: Dan Carpenter <[email protected]> Date: Tue Aug 4 10:47:23 2015 +0300 extcon: Fix signedness bugs about break error handling Unsigned is never less than zero so this error handling won't work. Fixes: be052cc87745 ('extcon: Fix hang and extcon_get/set_cable_state().') Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Roger Quadros <[email protected]> [cw00.choi: Change the patch title and fix signedness bug of find_cable_index_by_id() ] Signed-off-by: Chanwoo Choi <[email protected]> commit 76bea64c4c8d7fa911eb485c4c2b8583e813331e Author: Aaron Sierra <[email protected]> Date: Mon Aug 3 18:56:21 2015 -0500 crypto: talitos - Remove zero_entry static initializer Compiling the talitos driver with my GCC 4.3.1 e500v2 cross-compiler resulted in a failed build due to the anonymous union/structures introduced in this commit: crypto: talitos - enhanced talitos_desc struct for SEC1 The build error was: drivers/crypto/talitos.h:56: error: unknown field 'len' specified in initializer drivers/crypto/talitos.h:56: warning: missing braces around initializer drivers/crypto/talitos.h:56: warning: (near initialization for 'zero_entry.<anonymous>') drivers/crypto/talitos.h:57: error: unknown field 'j_extent' specified in initializer drivers/crypto/talitos.h:58: error: unknown field 'eptr' specified in initializer drivers/crypto/talitos.h:58: warning: excess elements in struct initializer drivers/crypto/talitos.h:58: warning: (near initialization for 'zero_entry') make[2]: *** [drivers/crypto/talitos.o] Error 1 make[1]: *** [drivers/crypto] Error 2 make: *** [drivers] Error 2 This patch eliminates the errors by relying on the C standard's implicit assignment of zero to static variables. Signed-off-by: Aaron Sierra <[email protected]> Signed-off-by: Herbert Xu <[email protected]> commit f6e45c24f401f3d0e648bfba304c83b64d763559 Author: Stephan Mueller <[email protected]> Date: Mon Aug 3 09:08:05 2015 +0200 crypto: doc - AEAD API conversion The AEAD API changes are now reflected in the crypto API doc book. Signed-off-by: Stephan Mueller <[email protected]> Signed-off-by: Herbert Xu <[email protected]> commit 327cbbabfb77c321fb9f21068c18e6bb951d07a7 Author: Colin Ian King <[email protected]> Date: Mon Aug 3 00:05:03 2015 +0100 crypto: img-hash - fix spelling mistake in dev_err error message Trival change, fix spelling mistake 'aquire' -> 'acquire' in dev_err message. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Herbert Xu <[email protected]> commit f109ff110b0f3e34cea45996734da485a2fdaa42 Author: Hariprasad Shenai <[email protected]> Date: Tue Aug 4 14:36:20 2015 +0530 cxgb4: Update T6 register ranges Signed-off-by: Hariprasad Shenai <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit d86bd29e0b31f30d5d85ab21385b59703ecc6464 Author: Hariprasad Shenai <[email protected]> Date: Tue Aug 4 14:36:19 2015 +0530 cxgb4/cxgb4vf: read the correct bits of PL Who Am I register Read the correct bits of PL Who Am I for the Source PF field which has changed in T6 Signed-off-by: Hariprasad Shenai <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit bf8ebb67dae0a07db7aebe7a65c178ff24d90842 Author: Hariprasad Shenai <[email protected]> Date: Tue Aug 4 14:36:18 2015 +0530 cxgb4: Add support to dump edc bist status Add support to dump edc bist status for ECC data errors Signed-off-by: Hariprasad Shenai <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 5888111cb8f7368304db42787c9495d4b2b82e06 Author: Hariprasad Shenai <[email protected]> Date: Tue Aug 4 14:36:17 2015 +0530 cxgb4: Add debugfs support to dump meminfo Add debug support to dump memory address ranges of various hardware modules of the adapter. Signed-off-by: Hariprasad Shenai <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit c231afa3ccf176490dcafc3666559e7567690e76 Author: Christian Borntraeger <[email protected]> Date: Tue Aug 4 09:33:44 2015 +0200 compiler.h: cast away attributes in WRITE_ONCE magic kernel build bot showed a warning triggered by commit 76695af20c01 ("locking, arch: use WRITE_ONCE()/READ_ONCE() in smp_store_release()/smp_load_acquire()"). Turns out that sparse does not like WRITE_ONCE accessing elements from the (sparse) rcu address space. fs/afs/inode.c:448:9: sparse: incorrect type in initializer (different address spaces) fs/afs/inode.c:448:9: expected struct afs_permits *__val fs/afs/inode.c:448:9: got void [noderef] <asn:4>*<noident> Solution is to force cast away the sparse attributes for the initializer of the union in WRITE_SAME. As this now gets too long, lets split the macro. Signed-off-by: Christian Borntraeger <[email protected]> commit 6b1621c4a15b9110cda7cc2e2eb3e8e1a0c2bcfc Author: Chanwoo Choi <[email protected]> Date: Fri Jul 24 12:58:41 2015 +0900 ARM: EXYNOS: Add exynos3250 compatible to use generic cpufreq driver This patch add exynos3250 compatible string to exynos_cpufreq_matches for supporting generic cpufreq driver on Exynos3250. Signed-off-by: Chanwoo Choi <[email protected]> Acked-by: Kyungmin Park <[email protected]> Reviewed-by: Krzysztof Kozlowski <[email protected]> Reviewed-by: Bartlomiej Zolnierkiewicz <[email protected]> Signed-off-by: Kukjin Kim <[email protected]> commit d4f279f154138d252854a21d34626f1c7ca85b60 Author: Thomas Abraham <[email protected]> Date: Wed Jul 1 15:10:37 2015 +0200 ARM: EXYNOS: switch to using generic cpufreq driver for exynos5250 The new CPU clock type allows the use of generic CPUfreq driver. Switch Exynos5250 to using generic cpufreq driver. Cc: Tomasz Figa <[email protected]> Signed-off-by: Thomas Abraham <[email protected]> [b.zolnierkie: split Exynos5250 support from the original patch] Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]> Reviewed-by: Javier Martinez Canillas <[email protected]> Tested-by: Javier Martinez Canillas <[email protected]> Acked-by: Viresh Kumar <[email protected]> Signed-off-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Kukjin Kim <[email protected]> commit a6affd24f439feddec04bab4d1e3ad6579868367 Author: Robert Shearman <[email protected]> Date: Mon Aug 3 17:50:04 2015 +0100 mpls: Use definition for reserved label checks In multiple locations there are checks for whether the label in hand is a reserved label or not using the arbritray value of 16. Factor this out into a #define for better maintainability and for documentation. Signed-off-by: Robert Shearman <[email protected]> Acked-by: Roopa Prabhu <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 0335f5b500adcc28bc3fd5d8a1e4482c348cff4a Author: Robert Shearman <[email protected]> Date: Mon Aug 3 17:39:21 2015 +0100 ipv4: apply lwtunnel encap for locally-generated packets lwtunnel encap is applied for forwarded packets, but not for locally-generated packets. This is because the output function is not overridden in __mkroute_output, unlike it is in __mkroute_input. The lwtunnel state is correctly set on the rth through the call to rt_set_nexthop, so all that needs to be done is to override the dst output function to be lwtunnel_output if there is lwtunnel state present and it requires output redirection. Signed-off-by: Robert Shearman <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit abf7c1c540f8330fead5d50730d92606dcbe7a7e Author: Robert Shearman <[email protected]> Date: Mon Aug 3 17:39:20 2015 +0100 lwtunnel: set skb protocol and dev In the locally-generated packet path skb->protocol may not be set and this is required for the lwtunnel encap in order to get the lwtstate. This would otherwise have been set by ip_output or ip6_output so set skb->protocol prior to calling the lwtunnel encap function. Additionally set skb->dev in case it is needed further down the transmit path. Signed-off-by: Robert Shearman <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 2475b22526d70234ecfe4a1ff88aed69badefba9 Author: Ross Lagerwall <[email protected]> Date: Mon Aug 3 15:38:03 2015 +0100 xen-netback: Allocate fraglist early to avoid complex rollback Determine if a fraglist is needed in the tx path, and allocate it if necessary before setting up the copy and map operations. Otherwise, undoing the copy and map operations is tricky. This fixes a use-after-free: if allocating the fraglist failed, the copy and map operations that had been set up were still executed, writing over the data area of a freed skb. Signed-off-by: Ross Lagerwall <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 10e2eb878f3ca07ac2f05fa5ca5e6c4c9174a27a Author: Eric Dumazet <[email protected]> Date: Sat Aug 1 12:14:33 2015 +0200 udp: fix dst races with multicast early demux Multicast dst are not cached. They carry DST_NOCACHE. As mentioned in commit f8864972126899 ("ipv4: fix dst race in sk_dst_get()"), these dst need special care before caching them into a socket. Caching them is allowed only if their refcnt was not 0, ie we must use atomic_inc_not_ze…
0day-ci
pushed a commit
to 0day-ci/linux
that referenced
this pull request
Dec 18, 2015
Cc: Mel Gorman <[email protected]> WARNING: line over 80 characters torvalds#124: FILE: mm/memory.c:1107: + /* oom_reaper cannot tear down dirty pages */ WARNING: line over 80 characters torvalds#125: FILE: mm/memory.c:1108: + if (unlikely(details && details->ignore_dirty)) WARNING: line over 80 characters torvalds#245: FILE: mm/oom_kill.c:483: + wait_event_freezable(oom_reaper_wait, (mm = READ_ONCE(mm_to_reap))); WARNING: Missing a blank line after declarations torvalds#245: FILE: mm/oom_kill.c:483: + struct mm_struct *mm; + wait_event_freezable(oom_reaper_wait, (mm = READ_ONCE(mm_to_reap))); WARNING: line over 80 characters torvalds#301: FILE: mm/oom_kill.c:721: + * We cannot use oom_reaper for the mm shared by this process WARNING: line over 80 characters torvalds#302: FILE: mm/oom_kill.c:722: + * because it wouldn't get killed and so the memory might be total: 0 errors, 6 warnings, 228 lines checked ./patches/mm-oom-introduce-oom-reaper.patch has style problems, please review. NOTE: If any of the errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
0day-ci
pushed a commit
to 0day-ci/linux
that referenced
this pull request
Jan 1, 2016
Cc: Mel Gorman <[email protected]> WARNING: line over 80 characters torvalds#124: FILE: mm/memory.c:1107: + /* oom_reaper cannot tear down dirty pages */ WARNING: line over 80 characters torvalds#125: FILE: mm/memory.c:1108: + if (unlikely(details && details->ignore_dirty)) WARNING: line over 80 characters torvalds#245: FILE: mm/oom_kill.c:483: + wait_event_freezable(oom_reaper_wait, (mm = READ_ONCE(mm_to_reap))); WARNING: Missing a blank line after declarations torvalds#245: FILE: mm/oom_kill.c:483: + struct mm_struct *mm; + wait_event_freezable(oom_reaper_wait, (mm = READ_ONCE(mm_to_reap))); WARNING: line over 80 characters torvalds#301: FILE: mm/oom_kill.c:721: + * We cannot use oom_reaper for the mm shared by this process WARNING: line over 80 characters torvalds#302: FILE: mm/oom_kill.c:722: + * because it wouldn't get killed and so the memory might be total: 0 errors, 6 warnings, 228 lines checked ./patches/mm-oom-introduce-oom-reaper.patch has style problems, please review. NOTE: If any of the errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
0day-ci
pushed a commit
to 0day-ci/linux
that referenced
this pull request
Jan 6, 2016
Cc: Mel Gorman <[email protected]> WARNING: line over 80 characters torvalds#124: FILE: mm/memory.c:1107: + /* oom_reaper cannot tear down dirty pages */ WARNING: line over 80 characters torvalds#125: FILE: mm/memory.c:1108: + if (unlikely(details && details->ignore_dirty)) WARNING: line over 80 characters torvalds#245: FILE: mm/oom_kill.c:483: + wait_event_freezable(oom_reaper_wait, (mm = READ_ONCE(mm_to_reap))); WARNING: Missing a blank line after declarations torvalds#245: FILE: mm/oom_kill.c:483: + struct mm_struct *mm; + wait_event_freezable(oom_reaper_wait, (mm = READ_ONCE(mm_to_reap))); WARNING: line over 80 characters torvalds#301: FILE: mm/oom_kill.c:721: + * We cannot use oom_reaper for the mm shared by this process WARNING: line over 80 characters torvalds#302: FILE: mm/oom_kill.c:722: + * because it wouldn't get killed and so the memory might be total: 0 errors, 6 warnings, 228 lines checked ./patches/mm-oom-introduce-oom-reaper.patch has style problems, please review. NOTE: If any of the errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
0day-ci
pushed a commit
to 0day-ci/linux
that referenced
this pull request
Mar 8, 2016
This reverts commit dfd55ad. Commit dfd55ad ("arm64: vmemmap: use virtual projection of linear region") causes this failure on Cavium Thunder systems: EFI stub: Booting Linux Kernel... EFI stub: Using DTB from configuration table EFI stub: Exiting boot services and installing virtual address map... [ 0.000000] Booting Linux on physical CPU 0x0 [ 0.000000] Linux version 4.5.0-rc7-numa+ ([email protected]) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) torvalds#125 SMP PREEMPT Tue Mar 8 09:59:40 PST 2016 [ 0.000000] Boot CPU: AArch64 Processor [431f0a10] [ 0.000000] earlycon: Early serial console at MMIO 0x87e024000000 (options '') [ 0.000000] bootconsole [uart0] enabled [ 0.000000] efi: Getting EFI parameters from FDT: [ 0.000000] EFI v2.40 by Cavium Thunder cn88xx EFI jenkins_weekly_build_6-1-g188619c-dirty Feb 19 2016 13:32:48 [ 0.000000] efi: ACPI=0xfffff000 ACPI 2.0=0xfffff014 SMBIOS 3.0=0x3faa62000 [ 0.000000] Unable to handle kernel paging request at virtual address fffffdff60ff0000 [ 0.000000] pgd = fffffe0000df0000 [ 0.000000] [fffffdff60ff0000] *pgd=00000003ffda0003, *pud=00000003ffda0003, *pmd=00000003ffda0003, *pte=0000000000000000 [ 0.000000] Internal error: Oops: 96000007 [#1] PREEMPT SMP [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.5.0-rc7-numa+ torvalds#125 [ 0.000000] Hardware name: Cavium ThunderX CN88XX board (DT) [ 0.000000] task: fffffe0000b29800 ti: fffffe0000af0000 task.ti: fffffe0000af0000 [ 0.000000] PC is at memmap_init_zone+0xd4/0x11c [ 0.000000] LR is at memmap_init_zone+0xb4/0x11c [ 0.000000] pc : [<fffffe0000a73c94>] lr : [<fffffe0000a73c74>] pstate: 800002c5 [ 0.000000] sp : fffffe0000af3d50 [ 0.000000] x29: fffffe0000af3d50 x28: 0000000000000001 [ 0.000000] x27: fffffe000092c2c0 x26: fffffe0000a90fc0 [ 0.000000] x25: fffffe0000b20000 x24: 0000000000040000 [ 0.000000] x23: 0000000000000000 x22: 4000000000000000 [ 0.000000] x21: 0000000000000001 x20: 00000000ffffffff [ 0.000000] x19: 000000000003fd40 x18: 0000000000000008 [ 0.000000] x17: 0000000400000000 x16: 0000000000000008 [ 0.000000] x15: 0000000000000018 x14: 00000003ffaa0000 [ 0.000000] x13: fffffe0000d6ccc0 x12: 0000000000000080 [ 0.000000] x11: 0000000600000008 x10: 00000003fffd8800 [ 0.000000] x9 : 0000000000000000 x8 : fffffe03febde800 [ 0.000000] x7 : 0000000004fb0000 x6 : fffffe0000d6c0c0 [ 0.000000] x5 : 0000000000001d40 x4 : 0000000000000007 [ 0.000000] x3 : fffffdff60000000 x2 : 0000000000ff0000 [ 0.000000] x1 : 0000000000000001 x0 : 0000000000000001 [ 0.000000] [ 0.000000] Process swapper (pid: 0, stack limit = 0xfffffe0000af0020) [ 0.000000] Stack: (0xfffffe0000af3d50 to 0xfffffe0000af4000) [ 0.000000] 3d40: fffffe0000af3da0 fffffe0000a73f50 [ 0.000000] 3d60: fffffe0000bf8080 000000000002ff40 fffffe0000bf8688 fffffe0000bf7000 . . . [ 0.000000] [<fffffe0000a73c94>] memmap_init_zone+0xd4/0x11c [ 0.000000] [<fffffe0000a73f50>] free_area_init_node+0x274/0x2bc [ 0.000000] [<fffffe0000a34ec0>] bootmem_init+0x158/0x198 [ 0.000000] [<fffffe0000a352d0>] paging_init+0xec/0x1a4 [ 0.000000] [<fffffe0000a32ed0>] setup_arch+0x110/0x5a0 [ 0.000000] [<fffffe0000a30680>] start_kernel+0xa8/0x3dc [ 0.000000] [<fffffe00000811b4>] 0xfffffe00000811b4 [ 0.000000] Code: cb424262 f2dfbfe3 d37ae442 f2ffffe3 (f8636840) [ 0.000000] ---[ end trace cb88537fdc8fa200 ]--- [ 0.000000] Kernel panic - not syncing: Fatal exception [ 0.000000] ---[ end Kernel panic - not syncing: Fatal exception Signed-off-by: David Daney <[email protected]>
madisongh
referenced
this pull request
in Xeralux/linux-fslc
Mar 8, 2016
…irq Freescale#125 IRQ Freescale#125's status is not constant on different boards, IRQ Freescale#32 is IOMUXC's interrupt which can be triggered manually at anytime, use this irq instead of Freescale#125 to generate interrupt for avoiding CCM enter low power mode by mistake. Signed-off-by: Anson Huang <[email protected]>
Noltari
pushed a commit
to Noltari/linux
that referenced
this pull request
Jun 8, 2016
[ Upstream commit 5542f58 ] Use device_is_registered() instad of sysfs flag to determine if we should free sysfs representation of particular LUN. sysfs flag in fsg common determines if luns attributes should be exposed using sysfs. This flag is used when creating and freeing luns. Unfortunately there is no guarantee that this flag will not be changed between creation and removal of particular LUN. Especially because of lun.0 which is created during allocating instance of function. This may lead to resource leak or NULL pointer dereference: [ 62.539925] Unable to handle kernel NULL pointer dereference at virtual address 00000044 [ 62.548014] pgd = ec994000 [ 62.550679] [00000044] *pgd=6d7be831, *pte=00000000, *ppte=00000000 [ 62.556933] Internal error: Oops: 17 [#1] PREEMPT SMP ARM [ 62.562310] Modules linked in: g_mass_storage(+) [ 62.566916] CPU: 2 PID: 613 Comm: insmod Not tainted 4.2.0-rc4-00077-ge29ee91-dirty torvalds#125 [ 62.574984] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [ 62.581061] task: eca56e80 ti: eca76000 task.ti: eca76000 [ 62.586450] PC is at kernfs_find_ns+0x8/0xe8 [ 62.590698] LR is at kernfs_find_and_get_ns+0x30/0x48 [ 62.595732] pc : [<c01277c0>] lr : [<c0127b88>] psr: 40010053 [ 62.595732] sp : eca77c40 ip : eca77c38 fp : 000008c1 [ 62.607187] r10: 00000001 r9 : c0082f38 r8 : ed41ce40 [ 62.612395] r7 : c05c1484 r6 : 00000000 r5 : 00000000 r4 : c0814488 [ 62.618904] r3 : 00000000 r2 : 00000000 r1 : c05c1484 r0 : 00000000 [ 62.625417] Flags: nZcv IRQs on FIQs off Mode SVC_32 ISA ARM Segment user [ 62.632620] Control: 10c5387d Table: 6c99404a DAC: 00000015 [ 62.638348] Process insmod (pid: 613, stack limit = 0xeca76210) [ 62.644251] Stack: (0xeca77c40 to 0xeca78000) [ 62.648594] 7c40: c0814488 00000000 00000000 c05c1484 ed41ce40 c0127b88 00000000 c0824888 [ 62.656753] 7c60: ed41d038 ed41d030 ed41d000 c012af4c 00000000 c0824858 ed41d038 c02e3314 [ 62.664912] 7c80: ed41d030 00000000 ed41ce04 c02d9e8c c070eda8 eca77cb4 000008c1 c058317c [ 62.673071] 7ca0: 000008c1 ed41d030 ed41ce00 ed41ce04 ed41d000 c02da044 ed41cf48 c0375870 [ 62.681230] 7cc0: ed9d3c04 ed9d3c00 ed52df80 bf000940 fffffff0 c03758f4 c03758c0 00000000 [ 62.689389] 7ce0: bf000564 c03614e0 ed9d3c04 bf000194 c0082f38 00000001 00000000 c0000100 [ 62.697548] 7d00: c0814488 c0814488 c086b1dc c05893a8 00000000 ed7e8320 00000000 c0128b88 [ 62.705707] 7d20: ed8a6b40 00000000 00000000 ed410500 ed8a6b40 c0594818 ed7e8320 00000000 [ 62.713867] 7d40: 00000000 c0129f20 00000000 c082c444 ed8a6b40 c012a684 00001000 00000000 [ 62.722026] 7d60: c0594818 c082c444 00000000 00000000 ed52df80 ed52df80 00000000 00000000 [ 62.730185] 7d80: 00000000 00000000 00000001 00000002 ed8e9b70 ed52df80 bf0006d0 00000000 [ 62.738345] 7da0: ed8e9b70 ed410500 ed618340 c036129c ed8c1c00 bf0006d0 c080b158 ed8c1c00 [ 62.746504] 7dc0: bf0006d0 c080b158 ed8c1c08 ed410500 c0082f38 ed618340 000008c1 c03640ac [ 62.754663] 7de0: 00000000 bf0006d0 c082c8dc c080b158 c080b158 c03642d4 00000000 bf003000 [ 62.762822] 7e00: 00000000 c0009784 00000000 00000001 00000000 c05849b0 00000002 ee7ab780 [ 62.770981] 7e20: 00000002 ed4105c0 0000c53e 000000d0 c0808600 eca77e5c 00000004 00000000 [ 62.779140] 7e40: bf000000 c0095680 c08075a0 ee001f00 ed4105c0 c00cadc0 ed52df80 bf000780 [ 62.787300] 7e60: ed4105c0 bf000780 00000001 bf0007c8 c0082f38 ed618340 000008c1 c0083e24 [ 62.795459] 7e80: 00000001 bf000780 00000001 eca77f58 00000001 bf000780 00000001 c00857f4 [ 62.803618] 7ea0: bf00078c 00007fff 00000000 c00835b4 eca77f58 00000000 c0082fac eca77f58 [ 62.811777] 7ec0: f05038c0 0003b008 bf000904 00000000 00000000 bf00078c 6e72656b 00006c65 [ 62.819936] 7ee0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 62.828095] 7f00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 62.836255] 7f20: 00000000 00000000 00000000 00000000 00000000 00000000 00000003 0003b008 [ 62.844414] 7f40: 0000017b c000f5c8 eca76000 00000000 0003b008 c0085df8 f04ef000 0001b8a9 [ 62.852573] 7f60: f0503258 f05030c2 f0509fe8 00000968 00000dc8 00000000 00000000 00000000 [ 62.860732] 7f80: 00000029 0000002a 00000011 00000000 0000000a 00000000 33f6eb00 0003b008 [ 62.868892] 7fa0: bef01cac c000f400 33f6eb00 0003b008 00000003 0003b008 00000000 00000003 [ 62.877051] 7fc0: 33f6eb00 0003b008 bef01cac 0000017b 00000000 0003b008 0000000b 0003b008 [ 62.885210] 7fe0: bef01ae0 bef01ad0 0001dc23 b6e8c162 800b0070 00000003 c0c0c0c0 c0c0c0c0 [ 62.893380] [<c01277c0>] (kernfs_find_ns) from [<c0824888>] (pm_qos_latency_tolerance_attr_group+0x0/0x10) [ 62.903005] Code: e28dd00c e8bd80f0 e92d41f0 e2923000 (e1d0e4b4) [ 62.909115] ---[ end trace 02fb4373ef095c7b ]--- Acked-by: Michal Nazarewicz <[email protected]> Signed-off-by: Krzysztof Opasiak <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
fengguang
pushed a commit
to 0day-ci/linux
that referenced
this pull request
Nov 1, 2016
With recent update to printk, we get console output like below P O 4.9.0-rc2-next-20161028-00003-g23bf57759-dirty torvalds#125 c0000000ef5f4000 0000000000010000 P O (4.9.0-rc2-next-20161028-00003-g23bf57759-dirty) 9000000010009033 < SF ,HV ,EE ,ME ,IR ,DR ,RI ,LE > and more output that is not easy to parse Fix the OOPS output so that it is easier on the eyes again Signed-off-by: Balbir Singh <[email protected]>
laijs
pushed a commit
to laijs/linux
that referenced
this pull request
Feb 13, 2017
lkl: Return thread id from thread_create host op
fengguang
pushed a commit
to 0day-ci/linux
that referenced
this pull request
Apr 27, 2017
For EFI with old_map enabled, Kernel will panic when kaslr is enabled. The root cause is the ident mapping is not built correctly in this case. For nokaslr kernel, PAGE_OFFSET is 0xffff880000000000 which is PGDIR_SIZE aligned. We can borrow the pud table from direct mapping safely. Given a physical address X, we have pud_index(X) == pud_index(__va(X)). However, for kaslr kernel, PAGE_OFFSET is PUD_SIZE aligned. For a given physical address X, pud_index(X) != pud_index(__va(X)). We can't only copy pgd entry from direct mapping to build ident mapping, instead need copy pud entry one by one from direct mapping. So fix it in this patch. The panic message is like below, an emty PUD or a wrong PUD. [ 0.233007] BUG: unable to handle kernel paging request at 000000007febd57e [ 0.233899] IP: 0x7febd57e [ 0.234000] PGD 1025a067 [ 0.234000] PUD 0 [ 0.234000] [ 0.234000] Oops: 0010 [#1] SMP [ 0.234000] Modules linked in: [ 0.234000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.11.0-rc8+ torvalds#125 [ 0.234000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015 [ 0.234000] task: ffffffffafe104c0 task.stack: ffffffffafe00000 [ 0.234000] RIP: 0010:0x7febd57e [ 0.234000] RSP: 0000:ffffffffafe03d98 EFLAGS: 00010086 [ 0.234000] RAX: ffff8c9e3fff9540 RBX: 000000007c4b6000 RCX: 0000000000000480 [ 0.234000] RDX: 0000000000000030 RSI: 0000000000000480 RDI: 000000007febd57e [ 0.234000] RBP: ffffffffafe03e40 R08: 0000000000000001 R09: 000000007c4b6000 [ 0.234000] R10: ffffffffafa71a40 R11: 20786c6c2478303d R12: 0000000000000030 [ 0.234000] R13: 0000000000000246 R14: ffff8c9e3c4198d8 R15: 0000000000000480 [ 0.234000] FS: 0000000000000000(0000) GS:ffff8c9e3fa00000(0000) knlGS:0000000000000000 [ 0.234000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.234000] CR2: 000000007febd57e CR3: 000000000fe09000 CR4: 00000000000406b0 [ 0.234000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 0.234000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 0.234000] Call Trace: [ 0.234000] ? efi_call+0x58/0x90 [ 0.234000] ? printk+0x58/0x6f [ 0.234000] efi_enter_virtual_mode+0x3c5/0x50d [ 0.234000] start_kernel+0x40f/0x4b8 [ 0.234000] ? set_init_arg+0x55/0x55 [ 0.234000] ? early_idt_handler_array+0x120/0x120 [ 0.234000] x86_64_start_reservations+0x24/0x26 [ 0.234000] x86_64_start_kernel+0x14c/0x16f [ 0.234000] start_cpu+0x14/0x14 [ 0.234000] Code: Bad RIP value. [ 0.234000] RIP: 0x7febd57e RSP: ffffffffafe03d98 [ 0.234000] CR2: 000000007febd57e [ 0.234000] ---[ end trace d4ded46ab8ab8ba9 ]--- [ 0.234000] Kernel panic - not syncing: Attempted to kill the idle task! [ 0.234000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! Signed-off-by: Baoquan He <[email protected]> Signed-off-by: Dave Young <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: [email protected] Cc: [email protected]
fengguang
pushed a commit
to 0day-ci/linux
that referenced
this pull request
Apr 27, 2017
For EFI with old_map enabled, Kernel will panic when kaslr is enabled. The root cause is the ident mapping is not built correctly in this case. For nokaslr kernel, PAGE_OFFSET is 0xffff880000000000 which is PGDIR_SIZE aligned. We can borrow the pud table from direct mapping safely. Given a physical address X, we have pud_index(X) == pud_index(__va(X)). However, for kaslr kernel, PAGE_OFFSET is PUD_SIZE aligned. For a given physical address X, pud_index(X) != pud_index(__va(X)). We can't only copy pgd entry from direct mapping to build ident mapping, instead need copy pud entry one by one from direct mapping. So fix it in this patch. The panic message is like below, an emty PUD or a wrong PUD. [ 0.233007] BUG: unable to handle kernel paging request at 000000007febd57e [ 0.233899] IP: 0x7febd57e [ 0.234000] PGD 1025a067 [ 0.234000] PUD 0 [ 0.234000] [ 0.234000] Oops: 0010 [#1] SMP [ 0.234000] Modules linked in: [ 0.234000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.11.0-rc8+ torvalds#125 [ 0.234000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015 [ 0.234000] task: ffffffffafe104c0 task.stack: ffffffffafe00000 [ 0.234000] RIP: 0010:0x7febd57e [ 0.234000] RSP: 0000:ffffffffafe03d98 EFLAGS: 00010086 [ 0.234000] RAX: ffff8c9e3fff9540 RBX: 000000007c4b6000 RCX: 0000000000000480 [ 0.234000] RDX: 0000000000000030 RSI: 0000000000000480 RDI: 000000007febd57e [ 0.234000] RBP: ffffffffafe03e40 R08: 0000000000000001 R09: 000000007c4b6000 [ 0.234000] R10: ffffffffafa71a40 R11: 20786c6c2478303d R12: 0000000000000030 [ 0.234000] R13: 0000000000000246 R14: ffff8c9e3c4198d8 R15: 0000000000000480 [ 0.234000] FS: 0000000000000000(0000) GS:ffff8c9e3fa00000(0000) knlGS:0000000000000000 [ 0.234000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.234000] CR2: 000000007febd57e CR3: 000000000fe09000 CR4: 00000000000406b0 [ 0.234000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 0.234000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 0.234000] Call Trace: [ 0.234000] ? efi_call+0x58/0x90 [ 0.234000] ? printk+0x58/0x6f [ 0.234000] efi_enter_virtual_mode+0x3c5/0x50d [ 0.234000] start_kernel+0x40f/0x4b8 [ 0.234000] ? set_init_arg+0x55/0x55 [ 0.234000] ? early_idt_handler_array+0x120/0x120 [ 0.234000] x86_64_start_reservations+0x24/0x26 [ 0.234000] x86_64_start_kernel+0x14c/0x16f [ 0.234000] start_cpu+0x14/0x14 [ 0.234000] Code: Bad RIP value. [ 0.234000] RIP: 0x7febd57e RSP: ffffffffafe03d98 [ 0.234000] CR2: 000000007febd57e [ 0.234000] ---[ end trace d4ded46ab8ab8ba9 ]--- [ 0.234000] Kernel panic - not syncing: Attempted to kill the idle task! [ 0.234000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! Signed-off-by: Baoquan He <[email protected]> Signed-off-by: Dave Young <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Thomas Garnier <[email protected]> Cc: Kees Cook <[email protected]> Cc: [email protected] Cc: [email protected]
mlyle
pushed a commit
to mlyle/linux
that referenced
this pull request
Oct 20, 2017
ERROR: code indent should use tabs where possible torvalds#124: FILE: kernel/panic.c:604: + clear_warn_once_set,$ WARNING: please, no spaces at the start of a line torvalds#124: FILE: kernel/panic.c:604: + clear_warn_once_set,$ ERROR: code indent should use tabs where possible torvalds#125: FILE: kernel/panic.c:605: +^I^I "%lld\n");$ WARNING: please use device_initcall() or more appropriate function instead of __initcall() (see include/linux/init.h) torvalds#135: FILE: kernel/panic.c:615: +__initcall(register_warn_debugfs); total: 2 errors, 2 warnings, 87 lines checked NOTE: For some of the reported defects, checkpatch may be able to mechanically convert to the typical style using --fix or --fix-inplace. NOTE: Whitespace errors detected. You may wish to use scripts/cleanpatch or scripts/cleanfile ./patches/support-resetting-warn_once.patch has style problems, please review. NOTE: If any of the errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Andi Kleen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Mark Brown <[email protected]>
fengguang
pushed a commit
to 0day-ci/linux
that referenced
this pull request
Oct 29, 2017
ERROR: code indent should use tabs where possible torvalds#124: FILE: kernel/panic.c:604: + clear_warn_once_set,$ WARNING: please, no spaces at the start of a line torvalds#124: FILE: kernel/panic.c:604: + clear_warn_once_set,$ ERROR: code indent should use tabs where possible torvalds#125: FILE: kernel/panic.c:605: +^I^I "%lld\n");$ WARNING: please use device_initcall() or more appropriate function instead of __initcall() (see include/linux/init.h) torvalds#135: FILE: kernel/panic.c:615: +__initcall(register_warn_debugfs); total: 2 errors, 2 warnings, 87 lines checked NOTE: For some of the reported defects, checkpatch may be able to mechanically convert to the typical style using --fix or --fix-inplace. NOTE: Whitespace errors detected. You may wish to use scripts/cleanpatch or scripts/cleanfile ./patches/support-resetting-warn_once.patch has style problems, please review. NOTE: If any of the errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Andi Kleen <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
fengguang
pushed a commit
to 0day-ci/linux
that referenced
this pull request
Nov 8, 2017
ERROR: code indent should use tabs where possible torvalds#124: FILE: kernel/panic.c:604: + clear_warn_once_set,$ WARNING: please, no spaces at the start of a line torvalds#124: FILE: kernel/panic.c:604: + clear_warn_once_set,$ ERROR: code indent should use tabs where possible torvalds#125: FILE: kernel/panic.c:605: +^I^I "%lld\n");$ WARNING: please use device_initcall() or more appropriate function instead of __initcall() (see include/linux/init.h) torvalds#135: FILE: kernel/panic.c:615: +__initcall(register_warn_debugfs); total: 2 errors, 2 warnings, 87 lines checked NOTE: For some of the reported defects, checkpatch may be able to mechanically convert to the typical style using --fix or --fix-inplace. NOTE: Whitespace errors detected. You may wish to use scripts/cleanpatch or scripts/cleanfile ./patches/support-resetting-warn_once.patch has style problems, please review. NOTE: If any of the errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Andi Kleen <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
thierryreding
pushed a commit
to thierryreding/linux
that referenced
this pull request
Nov 13, 2017
ERROR: code indent should use tabs where possible torvalds#124: FILE: kernel/panic.c:604: + clear_warn_once_set,$ WARNING: please, no spaces at the start of a line torvalds#124: FILE: kernel/panic.c:604: + clear_warn_once_set,$ ERROR: code indent should use tabs where possible torvalds#125: FILE: kernel/panic.c:605: +^I^I "%lld\n");$ WARNING: please use device_initcall() or more appropriate function instead of __initcall() (see include/linux/init.h) torvalds#135: FILE: kernel/panic.c:615: +__initcall(register_warn_debugfs); total: 2 errors, 2 warnings, 87 lines checked NOTE: For some of the reported defects, checkpatch may be able to mechanically convert to the typical style using --fix or --fix-inplace. NOTE: Whitespace errors detected. You may wish to use scripts/cleanpatch or scripts/cleanfile ./patches/support-resetting-warn_once.patch has style problems, please review. NOTE: If any of the errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Andi Kleen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Stephen Rothwell <[email protected]>
ddstreet
pushed a commit
to ddstreet/linux
that referenced
this pull request
Nov 15, 2017
ERROR: code indent should use tabs where possible torvalds#124: FILE: kernel/panic.c:604: + clear_warn_once_set,$ WARNING: please, no spaces at the start of a line torvalds#124: FILE: kernel/panic.c:604: + clear_warn_once_set,$ ERROR: code indent should use tabs where possible torvalds#125: FILE: kernel/panic.c:605: +^I^I "%lld\n");$ WARNING: please use device_initcall() or more appropriate function instead of __initcall() (see include/linux/init.h) torvalds#135: FILE: kernel/panic.c:615: +__initcall(register_warn_debugfs); total: 2 errors, 2 warnings, 87 lines checked NOTE: For some of the reported defects, checkpatch may be able to mechanically convert to the typical style using --fix or --fix-inplace. NOTE: Whitespace errors detected. You may wish to use scripts/cleanpatch or scripts/cleanfile ./patches/support-resetting-warn_once.patch has style problems, please review. NOTE: If any of the errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Andi Kleen <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
iaguis
pushed a commit
to kinvolk/linux
that referenced
this pull request
Feb 6, 2018
Rebase patches onto 4.14.8
steils
pushed a commit
to steils/linux
that referenced
this pull request
Apr 12, 2018
…heckpatch-fixes ERROR: code indent should use tabs where possible torvalds#120: FILE: include/linux/capability.h:220: + return true;$ WARNING: please, no spaces at the start of a line torvalds#120: FILE: include/linux/capability.h:220: + return true;$ ERROR: code indent should use tabs where possible torvalds#125: FILE: include/linux/capability.h:225: + return true;$ WARNING: please, no spaces at the start of a line torvalds#125: FILE: include/linux/capability.h:225: + return true;$ ERROR: code indent should use tabs where possible torvalds#129: FILE: include/linux/capability.h:229: + return true;$ WARNING: please, no spaces at the start of a line torvalds#129: FILE: include/linux/capability.h:229: + return true;$ ERROR: code indent should use tabs where possible torvalds#134: FILE: include/linux/capability.h:234: + return true;$ WARNING: please, no spaces at the start of a line torvalds#134: FILE: include/linux/capability.h:234: + return true;$ ERROR: code indent should use tabs where possible torvalds#170: FILE: include/linux/cred.h:79: + return 1;$ WARNING: please, no spaces at the start of a line torvalds#170: FILE: include/linux/cred.h:79: + return 1;$ ERROR: code indent should use tabs where possible torvalds#174: FILE: include/linux/cred.h:83: + return 1;$ WARNING: please, no spaces at the start of a line torvalds#174: FILE: include/linux/cred.h:83: + return 1;$ total: 6 errors, 6 warnings, 310 lines checked NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or scripts/cleanfile ./patches/kernel-conditionally-support-non-root-users-groups-and-capabilities.patch has style problems, please review. If any of these errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Iulia Manda <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
torvalds
pushed a commit
that referenced
this pull request
Jan 15, 2019
It is possible to trigger a NULL pointer dereference by writing an incorrectly formatted string to krpobe_events (trying to create a kretprobe omitting the symbol). Example: echo "r:event_1 " >> /sys/kernel/debug/tracing/kprobe_events That triggers this: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 6 PID: 1757 Comm: bash Not tainted 5.0.0-rc1+ #125 Hardware name: Dell Inc. XPS 13 9370/0F6P3V, BIOS 1.5.1 08/09/2018 RIP: 0010:kstrtoull+0x2/0x20 Code: 28 00 00 00 75 17 48 83 c4 18 5b 41 5c 5d c3 b8 ea ff ff ff eb e1 b8 de ff ff ff eb da e8 d6 36 bb ff 66 0f 1f 44 00 00 31 c0 <80> 3f 2b 55 48 89 e5 0f 94 c0 48 01 c7 e8 5c ff ff ff 5d c3 66 2e RSP: 0018:ffffb5d482e57cb8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff82b12720 RDX: ffffb5d482e57cf8 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffffb5d482e57d70 R08: ffffa0c05e5a7080 R09: ffffa0c05e003980 R10: 0000000000000000 R11: 0000000040000000 R12: ffffa0c04fe87b08 R13: 0000000000000001 R14: 000000000000000b R15: ffffa0c058d749e1 FS: 00007f137c7f7740(0000) GS:ffffa0c05e580000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000497d46004 CR4: 00000000003606e0 Call Trace: ? trace_kprobe_create+0xb6/0x840 ? _cond_resched+0x19/0x40 ? _cond_resched+0x19/0x40 ? __kmalloc+0x62/0x210 ? argv_split+0x8f/0x140 ? trace_kprobe_create+0x840/0x840 ? trace_kprobe_create+0x840/0x840 create_or_delete_trace_kprobe+0x11/0x30 trace_run_command+0x50/0x90 trace_parse_run_command+0xc1/0x160 probes_write+0x10/0x20 __vfs_write+0x3a/0x1b0 ? apparmor_file_permission+0x1a/0x20 ? security_file_permission+0x31/0xf0 ? _cond_resched+0x19/0x40 vfs_write+0xb1/0x1a0 ksys_write+0x55/0xc0 __x64_sys_write+0x1a/0x20 do_syscall_64+0x5a/0x120 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fix by doing the proper argument checks in trace_kprobe_create(). Cc: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Link: http://lkml.kernel.org/r/20190111060113.GA22841@xps-13 Fixes: 6212dd2 ("tracing/kprobes: Use dyn_event framework for kprobe events") Acked-by: Masami Hiramatsu <[email protected]> Signed-off-by: Andrea Righi <[email protected]> Signed-off-by: Masami Hiramatsu <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
fengguang
pushed a commit
to 0day-ci/linux
that referenced
this pull request
Feb 11, 2019
ERROR: code indent should use tabs where possible torvalds#125: FILE: mm/shuffle.h:43: + return false;$ WARNING: please, no spaces at the start of a line torvalds#125: FILE: mm/shuffle.h:43: + return false;$ total: 1 errors, 1 warnings, 96 lines checked NOTE: For some of the reported defects, checkpatch may be able to mechanically convert to the typical style using --fix or --fix-inplace. NOTE: Whitespace errors detected. You may wish to use scripts/cleanpatch or scripts/cleanfile ./patches/mm-maintain-randomization-of-page-free-lists.patch has style problems, please review. NOTE: If any of the errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Dan Williams <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Kees Cook <[email protected]> Cc: Keith Busch <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Robert Elliott <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
rgushchin
pushed a commit
to rgushchin/linux
that referenced
this pull request
Feb 12, 2019
ERROR: code indent should use tabs where possible torvalds#125: FILE: mm/shuffle.h:43: + return false;$ WARNING: please, no spaces at the start of a line torvalds#125: FILE: mm/shuffle.h:43: + return false;$ total: 1 errors, 1 warnings, 96 lines checked NOTE: For some of the reported defects, checkpatch may be able to mechanically convert to the typical style using --fix or --fix-inplace. NOTE: Whitespace errors detected. You may wish to use scripts/cleanpatch or scripts/cleanfile ./patches/mm-maintain-randomization-of-page-free-lists.patch has style problems, please review. NOTE: If any of the errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Dan Williams <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Kees Cook <[email protected]> Cc: Keith Busch <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Robert Elliott <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
krzk
pushed a commit
to krzk/linux
that referenced
this pull request
Feb 21, 2019
ERROR: code indent should use tabs where possible torvalds#125: FILE: mm/shuffle.h:43: + return false;$ WARNING: please, no spaces at the start of a line torvalds#125: FILE: mm/shuffle.h:43: + return false;$ total: 1 errors, 1 warnings, 96 lines checked NOTE: For some of the reported defects, checkpatch may be able to mechanically convert to the typical style using --fix or --fix-inplace. NOTE: Whitespace errors detected. You may wish to use scripts/cleanpatch or scripts/cleanfile ./patches/mm-maintain-randomization-of-page-free-lists.patch has style problems, please review. NOTE: If any of the errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Dan Williams <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Kees Cook <[email protected]> Cc: Keith Busch <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Robert Elliott <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Stephen Rothwell <[email protected]>
roxell
pushed a commit
to roxell/linux
that referenced
this pull request
Jan 17, 2024
Like commit 1cf3bfc ("bpf: Support 64-bit pointers to kfuncs") for s390x, add support for 64-bit pointers to kfuncs for LoongArch. Since the infrastructure is already implemented in BPF core, the only thing need to be done is to override bpf_jit_supports_far_kfunc_call(). Before this change, several test_verifier tests failed: # ./test_verifier | grep # | grep FAIL torvalds#119/p calls: invalid kfunc call: ptr_to_mem to struct with non-scalar FAIL torvalds#120/p calls: invalid kfunc call: ptr_to_mem to struct with nesting depth > 4 FAIL torvalds#121/p calls: invalid kfunc call: ptr_to_mem to struct with FAM FAIL torvalds#122/p calls: invalid kfunc call: reg->type != PTR_TO_CTX FAIL torvalds#123/p calls: invalid kfunc call: void * not allowed in func proto without mem size arg FAIL torvalds#124/p calls: trigger reg2btf_ids[reg->type] for reg->type > __BPF_REG_TYPE_MAX FAIL torvalds#125/p calls: invalid kfunc call: reg->off must be zero when passed to release kfunc FAIL torvalds#126/p calls: invalid kfunc call: don't match first member type when passed to release kfunc FAIL torvalds#127/p calls: invalid kfunc call: PTR_TO_BTF_ID with negative offset FAIL torvalds#128/p calls: invalid kfunc call: PTR_TO_BTF_ID with variable offset FAIL torvalds#129/p calls: invalid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL torvalds#130/p calls: valid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL torvalds#486/p map_kptr: ref: reference state created and released on xchg FAIL This is because the kfuncs in the loaded module are far away from __bpf_call_base: ffff800002009440 t bpf_kfunc_call_test_fail1 [bpf_testmod] 9000000002e128d8 T __bpf_call_base The offset relative to __bpf_call_base does NOT fit in s32, which breaks the assumption in BPF core. Enable bpf_jit_supports_far_kfunc_call() lifts this limit. Note that to reproduce the above result, tools/testing/selftests/bpf/config should be applied, and run the test with JIT enabled, unpriv BPF enabled. With this change, the test_verifier tests now all passed: # ./test_verifier ... Summary: 777 PASSED, 0 SKIPPED, 0 FAILED Tested-by: Tiezhu Yang <[email protected]> Signed-off-by: Hengqi Chen <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
cthbleachbit
pushed a commit
to AOSC-Tracking/linux
that referenced
this pull request
Jan 17, 2024
Like commit 1cf3bfc ("bpf: Support 64-bit pointers to kfuncs") for s390x, add support for 64-bit pointers to kfuncs for LoongArch. Since the infrastructure is already implemented in BPF core, the only thing need to be done is to override bpf_jit_supports_far_kfunc_call(). Before this change, several test_verifier tests failed: # ./test_verifier | grep # | grep FAIL torvalds#119/p calls: invalid kfunc call: ptr_to_mem to struct with non-scalar FAIL torvalds#120/p calls: invalid kfunc call: ptr_to_mem to struct with nesting depth > 4 FAIL torvalds#121/p calls: invalid kfunc call: ptr_to_mem to struct with FAM FAIL torvalds#122/p calls: invalid kfunc call: reg->type != PTR_TO_CTX FAIL torvalds#123/p calls: invalid kfunc call: void * not allowed in func proto without mem size arg FAIL torvalds#124/p calls: trigger reg2btf_ids[reg->type] for reg->type > __BPF_REG_TYPE_MAX FAIL torvalds#125/p calls: invalid kfunc call: reg->off must be zero when passed to release kfunc FAIL torvalds#126/p calls: invalid kfunc call: don't match first member type when passed to release kfunc FAIL torvalds#127/p calls: invalid kfunc call: PTR_TO_BTF_ID with negative offset FAIL torvalds#128/p calls: invalid kfunc call: PTR_TO_BTF_ID with variable offset FAIL torvalds#129/p calls: invalid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL torvalds#130/p calls: valid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL torvalds#486/p map_kptr: ref: reference state created and released on xchg FAIL This is because the kfuncs in the loaded module are far away from __bpf_call_base: ffff800002009440 t bpf_kfunc_call_test_fail1 [bpf_testmod] 9000000002e128d8 T __bpf_call_base The offset relative to __bpf_call_base does NOT fit in s32, which breaks the assumption in BPF core. Enable bpf_jit_supports_far_kfunc_call() lifts this limit. Note that to reproduce the above result, tools/testing/selftests/bpf/config should be applied, and run the test with JIT enabled, unpriv BPF enabled. With this change, the test_verifier tests now all passed: # ./test_verifier ... Summary: 777 PASSED, 0 SKIPPED, 0 FAILED Tested-by: Tiezhu Yang <[email protected]> Signed-off-by: Hengqi Chen <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
intel-lab-lkp
pushed a commit
to intel-lab-lkp/linux
that referenced
this pull request
Jan 18, 2024
Like commit 1cf3bfc ("bpf: Support 64-bit pointers to kfuncs") for s390x, add support for 64-bit pointers to kfuncs for LoongArch. Since the infrastructure is already implemented in BPF core, the only thing need to be done is to override bpf_jit_supports_far_kfunc_call(). Before this change, several test_verifier tests failed: # ./test_verifier | grep # | grep FAIL torvalds#119/p calls: invalid kfunc call: ptr_to_mem to struct with non-scalar FAIL torvalds#120/p calls: invalid kfunc call: ptr_to_mem to struct with nesting depth > 4 FAIL torvalds#121/p calls: invalid kfunc call: ptr_to_mem to struct with FAM FAIL torvalds#122/p calls: invalid kfunc call: reg->type != PTR_TO_CTX FAIL torvalds#123/p calls: invalid kfunc call: void * not allowed in func proto without mem size arg FAIL torvalds#124/p calls: trigger reg2btf_ids[reg->type] for reg->type > __BPF_REG_TYPE_MAX FAIL torvalds#125/p calls: invalid kfunc call: reg->off must be zero when passed to release kfunc FAIL torvalds#126/p calls: invalid kfunc call: don't match first member type when passed to release kfunc FAIL torvalds#127/p calls: invalid kfunc call: PTR_TO_BTF_ID with negative offset FAIL torvalds#128/p calls: invalid kfunc call: PTR_TO_BTF_ID with variable offset FAIL torvalds#129/p calls: invalid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL torvalds#130/p calls: valid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL torvalds#486/p map_kptr: ref: reference state created and released on xchg FAIL This is because the kfuncs in the loaded module are far away from __bpf_call_base: ffff800002009440 t bpf_kfunc_call_test_fail1 [bpf_testmod] 9000000002e128d8 T __bpf_call_base The offset relative to __bpf_call_base does NOT fit in s32, which breaks the assumption in BPF core. Enable bpf_jit_supports_far_kfunc_call() lifts this limit. Note that to reproduce the above result, tools/testing/selftests/bpf/config should be applied, and run the test with JIT enabled, unpriv BPF enabled. With this change, the test_verifier tests now all passed: # ./test_verifier ... Summary: 777 PASSED, 0 SKIPPED, 0 FAILED Tested-by: Tiezhu Yang <[email protected]> Signed-off-by: Hengqi Chen <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
cthbleachbit
pushed a commit
to AOSC-Tracking/linux
that referenced
this pull request
Jan 28, 2024
Like commit 1cf3bfc ("bpf: Support 64-bit pointers to kfuncs") for s390x, add support for 64-bit pointers to kfuncs for LoongArch. Since the infrastructure is already implemented in BPF core, the only thing need to be done is to override bpf_jit_supports_far_kfunc_call(). Before this change, several test_verifier tests failed: # ./test_verifier | grep # | grep FAIL torvalds#119/p calls: invalid kfunc call: ptr_to_mem to struct with non-scalar FAIL torvalds#120/p calls: invalid kfunc call: ptr_to_mem to struct with nesting depth > 4 FAIL torvalds#121/p calls: invalid kfunc call: ptr_to_mem to struct with FAM FAIL torvalds#122/p calls: invalid kfunc call: reg->type != PTR_TO_CTX FAIL torvalds#123/p calls: invalid kfunc call: void * not allowed in func proto without mem size arg FAIL torvalds#124/p calls: trigger reg2btf_ids[reg->type] for reg->type > __BPF_REG_TYPE_MAX FAIL torvalds#125/p calls: invalid kfunc call: reg->off must be zero when passed to release kfunc FAIL torvalds#126/p calls: invalid kfunc call: don't match first member type when passed to release kfunc FAIL torvalds#127/p calls: invalid kfunc call: PTR_TO_BTF_ID with negative offset FAIL torvalds#128/p calls: invalid kfunc call: PTR_TO_BTF_ID with variable offset FAIL torvalds#129/p calls: invalid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL torvalds#130/p calls: valid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL torvalds#486/p map_kptr: ref: reference state created and released on xchg FAIL This is because the kfuncs in the loaded module are far away from __bpf_call_base: ffff800002009440 t bpf_kfunc_call_test_fail1 [bpf_testmod] 9000000002e128d8 T __bpf_call_base The offset relative to __bpf_call_base does NOT fit in s32, which breaks the assumption in BPF core. Enable bpf_jit_supports_far_kfunc_call() lifts this limit. Note that to reproduce the above result, tools/testing/selftests/bpf/config should be applied, and run the test with JIT enabled, unpriv BPF enabled. With this change, the test_verifier tests now all passed: # ./test_verifier ... Summary: 777 PASSED, 0 SKIPPED, 0 FAILED Tested-by: Tiezhu Yang <[email protected]> Signed-off-by: Hengqi Chen <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
shikongzhineng
pushed a commit
to shikongzhineng/linux
that referenced
this pull request
Feb 7, 2024
Like commit 1cf3bfc ("bpf: Support 64-bit pointers to kfuncs") for s390x, add support for 64-bit pointers to kfuncs for LoongArch. Since the infrastructure is already implemented in BPF core, the only thing need to be done is to override bpf_jit_supports_far_kfunc_call(). Before this change, several test_verifier tests failed: # ./test_verifier | grep # | grep FAIL torvalds#119/p calls: invalid kfunc call: ptr_to_mem to struct with non-scalar FAIL torvalds#120/p calls: invalid kfunc call: ptr_to_mem to struct with nesting depth > 4 FAIL torvalds#121/p calls: invalid kfunc call: ptr_to_mem to struct with FAM FAIL torvalds#122/p calls: invalid kfunc call: reg->type != PTR_TO_CTX FAIL torvalds#123/p calls: invalid kfunc call: void * not allowed in func proto without mem size arg FAIL torvalds#124/p calls: trigger reg2btf_ids[reg->type] for reg->type > __BPF_REG_TYPE_MAX FAIL torvalds#125/p calls: invalid kfunc call: reg->off must be zero when passed to release kfunc FAIL torvalds#126/p calls: invalid kfunc call: don't match first member type when passed to release kfunc FAIL torvalds#127/p calls: invalid kfunc call: PTR_TO_BTF_ID with negative offset FAIL torvalds#128/p calls: invalid kfunc call: PTR_TO_BTF_ID with variable offset FAIL torvalds#129/p calls: invalid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL torvalds#130/p calls: valid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL torvalds#486/p map_kptr: ref: reference state created and released on xchg FAIL This is because the kfuncs in the loaded module are far away from __bpf_call_base: ffff800002009440 t bpf_kfunc_call_test_fail1 [bpf_testmod] 9000000002e128d8 T __bpf_call_base The offset relative to __bpf_call_base does NOT fit in s32, which breaks the assumption in BPF core. Enable bpf_jit_supports_far_kfunc_call() lifts this limit. Note that to reproduce the above result, tools/testing/selftests/bpf/config should be applied, and run the test with JIT enabled, unpriv BPF enabled. With this change, the test_verifier tests now all passed: # ./test_verifier ... Summary: 777 PASSED, 0 SKIPPED, 0 FAILED Tested-by: Tiezhu Yang <[email protected]> Signed-off-by: Hengqi Chen <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
cthbleachbit
pushed a commit
to AOSC-Tracking/linux
that referenced
this pull request
Feb 17, 2024
Like commit 1cf3bfc ("bpf: Support 64-bit pointers to kfuncs") for s390x, add support for 64-bit pointers to kfuncs for LoongArch. Since the infrastructure is already implemented in BPF core, the only thing need to be done is to override bpf_jit_supports_far_kfunc_call(). Before this change, several test_verifier tests failed: # ./test_verifier | grep # | grep FAIL torvalds#119/p calls: invalid kfunc call: ptr_to_mem to struct with non-scalar FAIL torvalds#120/p calls: invalid kfunc call: ptr_to_mem to struct with nesting depth > 4 FAIL torvalds#121/p calls: invalid kfunc call: ptr_to_mem to struct with FAM FAIL torvalds#122/p calls: invalid kfunc call: reg->type != PTR_TO_CTX FAIL torvalds#123/p calls: invalid kfunc call: void * not allowed in func proto without mem size arg FAIL torvalds#124/p calls: trigger reg2btf_ids[reg->type] for reg->type > __BPF_REG_TYPE_MAX FAIL torvalds#125/p calls: invalid kfunc call: reg->off must be zero when passed to release kfunc FAIL torvalds#126/p calls: invalid kfunc call: don't match first member type when passed to release kfunc FAIL torvalds#127/p calls: invalid kfunc call: PTR_TO_BTF_ID with negative offset FAIL torvalds#128/p calls: invalid kfunc call: PTR_TO_BTF_ID with variable offset FAIL torvalds#129/p calls: invalid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL torvalds#130/p calls: valid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL torvalds#486/p map_kptr: ref: reference state created and released on xchg FAIL This is because the kfuncs in the loaded module are far away from __bpf_call_base: ffff800002009440 t bpf_kfunc_call_test_fail1 [bpf_testmod] 9000000002e128d8 T __bpf_call_base The offset relative to __bpf_call_base does NOT fit in s32, which breaks the assumption in BPF core. Enable bpf_jit_supports_far_kfunc_call() lifts this limit. Note that to reproduce the above result, tools/testing/selftests/bpf/config should be applied, and run the test with JIT enabled, unpriv BPF enabled. With this change, the test_verifier tests now all passed: # ./test_verifier ... Summary: 777 PASSED, 0 SKIPPED, 0 FAILED Tested-by: Tiezhu Yang <[email protected]> Signed-off-by: Hengqi Chen <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
yetist
pushed a commit
to loongarchlinux/linux
that referenced
this pull request
Feb 29, 2024
Like commit 1cf3bfc ("bpf: Support 64-bit pointers to kfuncs") for s390x, add support for 64-bit pointers to kfuncs for LoongArch. Since the infrastructure is already implemented in BPF core, the only thing need to be done is to override bpf_jit_supports_far_kfunc_call(). Before this change, several test_verifier tests failed: # ./test_verifier | grep # | grep FAIL torvalds#119/p calls: invalid kfunc call: ptr_to_mem to struct with non-scalar FAIL torvalds#120/p calls: invalid kfunc call: ptr_to_mem to struct with nesting depth > 4 FAIL torvalds#121/p calls: invalid kfunc call: ptr_to_mem to struct with FAM FAIL torvalds#122/p calls: invalid kfunc call: reg->type != PTR_TO_CTX FAIL torvalds#123/p calls: invalid kfunc call: void * not allowed in func proto without mem size arg FAIL torvalds#124/p calls: trigger reg2btf_ids[reg->type] for reg->type > __BPF_REG_TYPE_MAX FAIL torvalds#125/p calls: invalid kfunc call: reg->off must be zero when passed to release kfunc FAIL torvalds#126/p calls: invalid kfunc call: don't match first member type when passed to release kfunc FAIL torvalds#127/p calls: invalid kfunc call: PTR_TO_BTF_ID with negative offset FAIL torvalds#128/p calls: invalid kfunc call: PTR_TO_BTF_ID with variable offset FAIL torvalds#129/p calls: invalid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL torvalds#130/p calls: valid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL torvalds#486/p map_kptr: ref: reference state created and released on xchg FAIL This is because the kfuncs in the loaded module are far away from __bpf_call_base: ffff800002009440 t bpf_kfunc_call_test_fail1 [bpf_testmod] 9000000002e128d8 T __bpf_call_base The offset relative to __bpf_call_base does NOT fit in s32, which breaks the assumption in BPF core. Enable bpf_jit_supports_far_kfunc_call() lifts this limit. Note that to reproduce the above result, tools/testing/selftests/bpf/config should be applied, and run the test with JIT enabled, unpriv BPF enabled. With this change, the test_verifier tests now all passed: # ./test_verifier ... Summary: 777 PASSED, 0 SKIPPED, 0 FAILED Tested-by: Tiezhu Yang <[email protected]> Signed-off-by: Hengqi Chen <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
shikongzhineng
pushed a commit
to shikongzhineng/linux
that referenced
this pull request
Mar 17, 2024
Like commit 1cf3bfc ("bpf: Support 64-bit pointers to kfuncs") for s390x, add support for 64-bit pointers to kfuncs for LoongArch. Since the infrastructure is already implemented in BPF core, the only thing need to be done is to override bpf_jit_supports_far_kfunc_call(). Before this change, several test_verifier tests failed: # ./test_verifier | grep # | grep FAIL torvalds#119/p calls: invalid kfunc call: ptr_to_mem to struct with non-scalar FAIL torvalds#120/p calls: invalid kfunc call: ptr_to_mem to struct with nesting depth > 4 FAIL torvalds#121/p calls: invalid kfunc call: ptr_to_mem to struct with FAM FAIL torvalds#122/p calls: invalid kfunc call: reg->type != PTR_TO_CTX FAIL torvalds#123/p calls: invalid kfunc call: void * not allowed in func proto without mem size arg FAIL torvalds#124/p calls: trigger reg2btf_ids[reg->type] for reg->type > __BPF_REG_TYPE_MAX FAIL torvalds#125/p calls: invalid kfunc call: reg->off must be zero when passed to release kfunc FAIL torvalds#126/p calls: invalid kfunc call: don't match first member type when passed to release kfunc FAIL torvalds#127/p calls: invalid kfunc call: PTR_TO_BTF_ID with negative offset FAIL torvalds#128/p calls: invalid kfunc call: PTR_TO_BTF_ID with variable offset FAIL torvalds#129/p calls: invalid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL torvalds#130/p calls: valid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL torvalds#486/p map_kptr: ref: reference state created and released on xchg FAIL This is because the kfuncs in the loaded module are far away from __bpf_call_base: ffff800002009440 t bpf_kfunc_call_test_fail1 [bpf_testmod] 9000000002e128d8 T __bpf_call_base The offset relative to __bpf_call_base does NOT fit in s32, which breaks the assumption in BPF core. Enable bpf_jit_supports_far_kfunc_call() lifts this limit. Note that to reproduce the above result, tools/testing/selftests/bpf/config should be applied, and run the test with JIT enabled, unpriv BPF enabled. With this change, the test_verifier tests now all passed: # ./test_verifier ... Summary: 777 PASSED, 0 SKIPPED, 0 FAILED Tested-by: Tiezhu Yang <[email protected]> Signed-off-by: Hengqi Chen <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
shipujin
pushed a commit
to shipujin/linux
that referenced
this pull request
Jul 24, 2024
Like commit 1cf3bfc ("bpf: Support 64-bit pointers to kfuncs") for s390x, add support for 64-bit pointers to kfuncs for LoongArch. Since the infrastructure is already implemented in BPF core, the only thing need to be done is to override bpf_jit_supports_far_kfunc_call(). Before this change, several test_verifier tests failed: # ./test_verifier | grep # | grep FAIL torvalds#119/p calls: invalid kfunc call: ptr_to_mem to struct with non-scalar FAIL torvalds#120/p calls: invalid kfunc call: ptr_to_mem to struct with nesting depth > 4 FAIL torvalds#121/p calls: invalid kfunc call: ptr_to_mem to struct with FAM FAIL torvalds#122/p calls: invalid kfunc call: reg->type != PTR_TO_CTX FAIL torvalds#123/p calls: invalid kfunc call: void * not allowed in func proto without mem size arg FAIL torvalds#124/p calls: trigger reg2btf_ids[reg->type] for reg->type > __BPF_REG_TYPE_MAX FAIL torvalds#125/p calls: invalid kfunc call: reg->off must be zero when passed to release kfunc FAIL torvalds#126/p calls: invalid kfunc call: don't match first member type when passed to release kfunc FAIL torvalds#127/p calls: invalid kfunc call: PTR_TO_BTF_ID with negative offset FAIL torvalds#128/p calls: invalid kfunc call: PTR_TO_BTF_ID with variable offset FAIL torvalds#129/p calls: invalid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL torvalds#130/p calls: valid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL torvalds#486/p map_kptr: ref: reference state created and released on xchg FAIL This is because the kfuncs in the loaded module are far away from __bpf_call_base: ffff800002009440 t bpf_kfunc_call_test_fail1 [bpf_testmod] 9000000002e128d8 T __bpf_call_base The offset relative to __bpf_call_base does NOT fit in s32, which breaks the assumption in BPF core. Enable bpf_jit_supports_far_kfunc_call() lifts this limit. Note that to reproduce the above result, tools/testing/selftests/bpf/config should be applied, and run the test with JIT enabled, unpriv BPF enabled. With this change, the test_verifier tests now all passed: # ./test_verifier ... Summary: 777 PASSED, 0 SKIPPED, 0 FAILED Tested-by: Tiezhu Yang <[email protected]> Signed-off-by: Hengqi Chen <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
intel-lab-lkp
pushed a commit
to intel-lab-lkp/linux
that referenced
this pull request
Aug 26, 2024
The start_kthread() and stop_thread() code was not always called with the interface_lock held. This means that the kthread variable could be unexpectedly changed causing the kthread_stop() to be called on it when it should not have been, leading to: while true; do rtla timerlat top -u -q & PID=$!; sleep 5; kill -INT $PID; sleep 0.001; kill -TERM $PID; wait $PID; done Causing the following OOPS: Oops: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 5 UID: 0 PID: 885 Comm: timerlatu/5 Not tainted 6.11.0-rc4-test-00002-gbc754cc76d1b-dirty torvalds#125 a533010b71dab205ad2f507188ce8c82203b0254 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:hrtimer_active+0x58/0x300 Code: 48 c1 ee 03 41 54 48 01 d1 48 01 d6 55 53 48 83 ec 20 80 39 00 0f 85 30 02 00 00 49 8b 6f 30 4c 8d 75 10 4c 89 f0 48 c1 e8 03 <0f> b6 3c 10 4c 89 f0 83 e0 07 83 c0 03 40 38 f8 7c 09 40 84 ff 0f RSP: 0018:ffff88811d97f940 EFLAGS: 00010202 RAX: 0000000000000002 RBX: ffff88823c6b5b28 RCX: ffffed10478d6b6b RDX: dffffc0000000000 RSI: ffffed10478d6b6c RDI: ffff88823c6b5b28 RBP: 0000000000000000 R08: ffff88823c6b5b58 R09: ffff88823c6b5b60 R10: ffff88811d97f957 R11: 0000000000000010 R12: 00000000000a801d R13: ffff88810d8b35d8 R14: 0000000000000010 R15: ffff88823c6b5b28 FS: 0000000000000000(0000) GS:ffff88823c680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000561858ad7258 CR3: 000000007729e001 CR4: 0000000000170ef0 Call Trace: <TASK> ? die_addr+0x40/0xa0 ? exc_general_protection+0x154/0x230 ? asm_exc_general_protection+0x26/0x30 ? hrtimer_active+0x58/0x300 ? __pfx_mutex_lock+0x10/0x10 ? __pfx_locks_remove_file+0x10/0x10 hrtimer_cancel+0x15/0x40 timerlat_fd_release+0x8e/0x1f0 ? security_file_release+0x43/0x80 __fput+0x372/0xb10 task_work_run+0x11e/0x1f0 ? _raw_spin_lock+0x85/0xe0 ? __pfx_task_work_run+0x10/0x10 ? poison_slab_object+0x109/0x170 ? do_exit+0x7a0/0x24b0 do_exit+0x7bd/0x24b0 ? __pfx_migrate_enable+0x10/0x10 ? __pfx_do_exit+0x10/0x10 ? __pfx_read_tsc+0x10/0x10 ? ktime_get+0x64/0x140 ? _raw_spin_lock_irq+0x86/0xe0 do_group_exit+0xb0/0x220 get_signal+0x17ba/0x1b50 ? vfs_read+0x179/0xa40 ? timerlat_fd_read+0x30b/0x9d0 ? __pfx_get_signal+0x10/0x10 ? __pfx_timerlat_fd_read+0x10/0x10 arch_do_signal_or_restart+0x8c/0x570 ? __pfx_arch_do_signal_or_restart+0x10/0x10 ? vfs_read+0x179/0xa40 ? ksys_read+0xfe/0x1d0 ? __pfx_ksys_read+0x10/0x10 syscall_exit_to_user_mode+0xbc/0x130 do_syscall_64+0x74/0x110 ? __pfx___rseq_handle_notify_resume+0x10/0x10 ? __pfx_ksys_read+0x10/0x10 ? fpregs_restore_userregs+0xdb/0x1e0 ? fpregs_restore_userregs+0xdb/0x1e0 ? syscall_exit_to_user_mode+0x116/0x130 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 entry_SYSCALL_64_after_hwframe+0x71/0x79 RIP: 0033:0x7ff0070eca9c Code: Unable to access opcode bytes at 0x7ff0070eca72. RSP: 002b:00007ff006dff8c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00007ff0070eca9c RDX: 0000000000000400 RSI: 00007ff006dff9a0 RDI: 0000000000000003 RBP: 00007ff006dffde0 R08: 0000000000000000 R09: 00007ff000000ba0 R10: 00007ff007004b08 R11: 0000000000000246 R12: 0000000000000003 R13: 00007ff006dff9a0 R14: 0000000000000007 R15: 0000000000000008 </TASK> Modules linked in: snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hwdep snd_hda_core ---[ end trace 0000000000000000 ]--- This is because it would mistakenly call kthread_stop() on a user space thread making it "exit" before it actually exits. Add the proper locking around the calls to start_kthread() and stop_kthread(). Link: https://lore.kernel.org/all/CAP4=nvRTH5VxSO3VSDCospWcZagawTMs0L9J_kcKdGSkn7xT_Q@mail.gmail.com/ This was debugged by using the persistent ring buffer: Link: https://lore.kernel.org/all/[email protected]/ Cc: [email protected] Fixes: c8895e2 ("trace/osnoise: Support hotplug operations") Reported-by: Tomas Glozar <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
intel-lab-lkp
pushed a commit
to intel-lab-lkp/linux
that referenced
this pull request
Sep 4, 2024
The start_kthread() and stop_thread() code was not always called with the interface_lock held. This means that the kthread variable could be unexpectedly changed causing the kthread_stop() to be called on it when it should not have been, leading to: while true; do rtla timerlat top -u -q & PID=$!; sleep 5; kill -INT $PID; sleep 0.001; kill -TERM $PID; wait $PID; done Causing the following OOPS: Oops: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 5 UID: 0 PID: 885 Comm: timerlatu/5 Not tainted 6.11.0-rc4-test-00002-gbc754cc76d1b-dirty torvalds#125 a533010b71dab205ad2f507188ce8c82203b0254 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:hrtimer_active+0x58/0x300 Code: 48 c1 ee 03 41 54 48 01 d1 48 01 d6 55 53 48 83 ec 20 80 39 00 0f 85 30 02 00 00 49 8b 6f 30 4c 8d 75 10 4c 89 f0 48 c1 e8 03 <0f> b6 3c 10 4c 89 f0 83 e0 07 83 c0 03 40 38 f8 7c 09 40 84 ff 0f RSP: 0018:ffff88811d97f940 EFLAGS: 00010202 RAX: 0000000000000002 RBX: ffff88823c6b5b28 RCX: ffffed10478d6b6b RDX: dffffc0000000000 RSI: ffffed10478d6b6c RDI: ffff88823c6b5b28 RBP: 0000000000000000 R08: ffff88823c6b5b58 R09: ffff88823c6b5b60 R10: ffff88811d97f957 R11: 0000000000000010 R12: 00000000000a801d R13: ffff88810d8b35d8 R14: 0000000000000010 R15: ffff88823c6b5b28 FS: 0000000000000000(0000) GS:ffff88823c680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000561858ad7258 CR3: 000000007729e001 CR4: 0000000000170ef0 Call Trace: <TASK> ? die_addr+0x40/0xa0 ? exc_general_protection+0x154/0x230 ? asm_exc_general_protection+0x26/0x30 ? hrtimer_active+0x58/0x300 ? __pfx_mutex_lock+0x10/0x10 ? __pfx_locks_remove_file+0x10/0x10 hrtimer_cancel+0x15/0x40 timerlat_fd_release+0x8e/0x1f0 ? security_file_release+0x43/0x80 __fput+0x372/0xb10 task_work_run+0x11e/0x1f0 ? _raw_spin_lock+0x85/0xe0 ? __pfx_task_work_run+0x10/0x10 ? poison_slab_object+0x109/0x170 ? do_exit+0x7a0/0x24b0 do_exit+0x7bd/0x24b0 ? __pfx_migrate_enable+0x10/0x10 ? __pfx_do_exit+0x10/0x10 ? __pfx_read_tsc+0x10/0x10 ? ktime_get+0x64/0x140 ? _raw_spin_lock_irq+0x86/0xe0 do_group_exit+0xb0/0x220 get_signal+0x17ba/0x1b50 ? vfs_read+0x179/0xa40 ? timerlat_fd_read+0x30b/0x9d0 ? __pfx_get_signal+0x10/0x10 ? __pfx_timerlat_fd_read+0x10/0x10 arch_do_signal_or_restart+0x8c/0x570 ? __pfx_arch_do_signal_or_restart+0x10/0x10 ? vfs_read+0x179/0xa40 ? ksys_read+0xfe/0x1d0 ? __pfx_ksys_read+0x10/0x10 syscall_exit_to_user_mode+0xbc/0x130 do_syscall_64+0x74/0x110 ? __pfx___rseq_handle_notify_resume+0x10/0x10 ? __pfx_ksys_read+0x10/0x10 ? fpregs_restore_userregs+0xdb/0x1e0 ? fpregs_restore_userregs+0xdb/0x1e0 ? syscall_exit_to_user_mode+0x116/0x130 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 entry_SYSCALL_64_after_hwframe+0x71/0x79 RIP: 0033:0x7ff0070eca9c Code: Unable to access opcode bytes at 0x7ff0070eca72. RSP: 002b:00007ff006dff8c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00007ff0070eca9c RDX: 0000000000000400 RSI: 00007ff006dff9a0 RDI: 0000000000000003 RBP: 00007ff006dffde0 R08: 0000000000000000 R09: 00007ff000000ba0 R10: 00007ff007004b08 R11: 0000000000000246 R12: 0000000000000003 R13: 00007ff006dff9a0 R14: 0000000000000007 R15: 0000000000000008 </TASK> Modules linked in: snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hwdep snd_hda_core ---[ end trace 0000000000000000 ]--- This is because it would mistakenly call kthread_stop() on a user space thread making it "exit" before it actually exits. Since kthreads are created based on global behavior, use a cpumask to know when kthreads are running and that they need to be shutdown before proceeding to do new work. Link: https://lore.kernel.org/all/[email protected]/ This was debugged by using the persistent ring buffer: Link: https://lore.kernel.org/all/[email protected]/ Note, locking was originally used to fix this, but that proved to cause too many deadlocks to work around: https://lore.kernel.org/linux-trace-kernel/[email protected]/ Cc: [email protected] Fixes: e88ed22 ("tracing/timerlat: Add user-space interface") Reported-by: Tomas Glozar <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
intel-lab-lkp
pushed a commit
to intel-lab-lkp/linux
that referenced
this pull request
Sep 4, 2024
The start_kthread() and stop_thread() code was not always called with the interface_lock held. This means that the kthread variable could be unexpectedly changed causing the kthread_stop() to be called on it when it should not have been, leading to: while true; do rtla timerlat top -u -q & PID=$!; sleep 5; kill -INT $PID; sleep 0.001; kill -TERM $PID; wait $PID; done Causing the following OOPS: Oops: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 5 UID: 0 PID: 885 Comm: timerlatu/5 Not tainted 6.11.0-rc4-test-00002-gbc754cc76d1b-dirty torvalds#125 a533010b71dab205ad2f507188ce8c82203b0254 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:hrtimer_active+0x58/0x300 Code: 48 c1 ee 03 41 54 48 01 d1 48 01 d6 55 53 48 83 ec 20 80 39 00 0f 85 30 02 00 00 49 8b 6f 30 4c 8d 75 10 4c 89 f0 48 c1 e8 03 <0f> b6 3c 10 4c 89 f0 83 e0 07 83 c0 03 40 38 f8 7c 09 40 84 ff 0f RSP: 0018:ffff88811d97f940 EFLAGS: 00010202 RAX: 0000000000000002 RBX: ffff88823c6b5b28 RCX: ffffed10478d6b6b RDX: dffffc0000000000 RSI: ffffed10478d6b6c RDI: ffff88823c6b5b28 RBP: 0000000000000000 R08: ffff88823c6b5b58 R09: ffff88823c6b5b60 R10: ffff88811d97f957 R11: 0000000000000010 R12: 00000000000a801d R13: ffff88810d8b35d8 R14: 0000000000000010 R15: ffff88823c6b5b28 FS: 0000000000000000(0000) GS:ffff88823c680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000561858ad7258 CR3: 000000007729e001 CR4: 0000000000170ef0 Call Trace: <TASK> ? die_addr+0x40/0xa0 ? exc_general_protection+0x154/0x230 ? asm_exc_general_protection+0x26/0x30 ? hrtimer_active+0x58/0x300 ? __pfx_mutex_lock+0x10/0x10 ? __pfx_locks_remove_file+0x10/0x10 hrtimer_cancel+0x15/0x40 timerlat_fd_release+0x8e/0x1f0 ? security_file_release+0x43/0x80 __fput+0x372/0xb10 task_work_run+0x11e/0x1f0 ? _raw_spin_lock+0x85/0xe0 ? __pfx_task_work_run+0x10/0x10 ? poison_slab_object+0x109/0x170 ? do_exit+0x7a0/0x24b0 do_exit+0x7bd/0x24b0 ? __pfx_migrate_enable+0x10/0x10 ? __pfx_do_exit+0x10/0x10 ? __pfx_read_tsc+0x10/0x10 ? ktime_get+0x64/0x140 ? _raw_spin_lock_irq+0x86/0xe0 do_group_exit+0xb0/0x220 get_signal+0x17ba/0x1b50 ? vfs_read+0x179/0xa40 ? timerlat_fd_read+0x30b/0x9d0 ? __pfx_get_signal+0x10/0x10 ? __pfx_timerlat_fd_read+0x10/0x10 arch_do_signal_or_restart+0x8c/0x570 ? __pfx_arch_do_signal_or_restart+0x10/0x10 ? vfs_read+0x179/0xa40 ? ksys_read+0xfe/0x1d0 ? __pfx_ksys_read+0x10/0x10 syscall_exit_to_user_mode+0xbc/0x130 do_syscall_64+0x74/0x110 ? __pfx___rseq_handle_notify_resume+0x10/0x10 ? __pfx_ksys_read+0x10/0x10 ? fpregs_restore_userregs+0xdb/0x1e0 ? fpregs_restore_userregs+0xdb/0x1e0 ? syscall_exit_to_user_mode+0x116/0x130 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 entry_SYSCALL_64_after_hwframe+0x71/0x79 RIP: 0033:0x7ff0070eca9c Code: Unable to access opcode bytes at 0x7ff0070eca72. RSP: 002b:00007ff006dff8c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00007ff0070eca9c RDX: 0000000000000400 RSI: 00007ff006dff9a0 RDI: 0000000000000003 RBP: 00007ff006dffde0 R08: 0000000000000000 R09: 00007ff000000ba0 R10: 00007ff007004b08 R11: 0000000000000246 R12: 0000000000000003 R13: 00007ff006dff9a0 R14: 0000000000000007 R15: 0000000000000008 </TASK> Modules linked in: snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hwdep snd_hda_core ---[ end trace 0000000000000000 ]--- This is because it would mistakenly call kthread_stop() on a user space thread making it "exit" before it actually exits. Since kthreads are created based on global behavior, use a cpumask to know when kthreads are running and that they need to be shutdown before proceeding to do new work. Link: https://lore.kernel.org/all/[email protected]/ This was debugged by using the persistent ring buffer: Link: https://lore.kernel.org/all/[email protected]/ Note, locking was originally used to fix this, but that proved to cause too many deadlocks to work around: https://lore.kernel.org/linux-trace-kernel/[email protected]/ Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: "Luis Claudio R. Goncalves" <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: e88ed22 ("tracing/timerlat: Add user-space interface") Reported-by: Tomas Glozar <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
torvalds
pushed a commit
that referenced
this pull request
Sep 6, 2024
The start_kthread() and stop_thread() code was not always called with the interface_lock held. This means that the kthread variable could be unexpectedly changed causing the kthread_stop() to be called on it when it should not have been, leading to: while true; do rtla timerlat top -u -q & PID=$!; sleep 5; kill -INT $PID; sleep 0.001; kill -TERM $PID; wait $PID; done Causing the following OOPS: Oops: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 5 UID: 0 PID: 885 Comm: timerlatu/5 Not tainted 6.11.0-rc4-test-00002-gbc754cc76d1b-dirty #125 a533010b71dab205ad2f507188ce8c82203b0254 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:hrtimer_active+0x58/0x300 Code: 48 c1 ee 03 41 54 48 01 d1 48 01 d6 55 53 48 83 ec 20 80 39 00 0f 85 30 02 00 00 49 8b 6f 30 4c 8d 75 10 4c 89 f0 48 c1 e8 03 <0f> b6 3c 10 4c 89 f0 83 e0 07 83 c0 03 40 38 f8 7c 09 40 84 ff 0f RSP: 0018:ffff88811d97f940 EFLAGS: 00010202 RAX: 0000000000000002 RBX: ffff88823c6b5b28 RCX: ffffed10478d6b6b RDX: dffffc0000000000 RSI: ffffed10478d6b6c RDI: ffff88823c6b5b28 RBP: 0000000000000000 R08: ffff88823c6b5b58 R09: ffff88823c6b5b60 R10: ffff88811d97f957 R11: 0000000000000010 R12: 00000000000a801d R13: ffff88810d8b35d8 R14: 0000000000000010 R15: ffff88823c6b5b28 FS: 0000000000000000(0000) GS:ffff88823c680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000561858ad7258 CR3: 000000007729e001 CR4: 0000000000170ef0 Call Trace: <TASK> ? die_addr+0x40/0xa0 ? exc_general_protection+0x154/0x230 ? asm_exc_general_protection+0x26/0x30 ? hrtimer_active+0x58/0x300 ? __pfx_mutex_lock+0x10/0x10 ? __pfx_locks_remove_file+0x10/0x10 hrtimer_cancel+0x15/0x40 timerlat_fd_release+0x8e/0x1f0 ? security_file_release+0x43/0x80 __fput+0x372/0xb10 task_work_run+0x11e/0x1f0 ? _raw_spin_lock+0x85/0xe0 ? __pfx_task_work_run+0x10/0x10 ? poison_slab_object+0x109/0x170 ? do_exit+0x7a0/0x24b0 do_exit+0x7bd/0x24b0 ? __pfx_migrate_enable+0x10/0x10 ? __pfx_do_exit+0x10/0x10 ? __pfx_read_tsc+0x10/0x10 ? ktime_get+0x64/0x140 ? _raw_spin_lock_irq+0x86/0xe0 do_group_exit+0xb0/0x220 get_signal+0x17ba/0x1b50 ? vfs_read+0x179/0xa40 ? timerlat_fd_read+0x30b/0x9d0 ? __pfx_get_signal+0x10/0x10 ? __pfx_timerlat_fd_read+0x10/0x10 arch_do_signal_or_restart+0x8c/0x570 ? __pfx_arch_do_signal_or_restart+0x10/0x10 ? vfs_read+0x179/0xa40 ? ksys_read+0xfe/0x1d0 ? __pfx_ksys_read+0x10/0x10 syscall_exit_to_user_mode+0xbc/0x130 do_syscall_64+0x74/0x110 ? __pfx___rseq_handle_notify_resume+0x10/0x10 ? __pfx_ksys_read+0x10/0x10 ? fpregs_restore_userregs+0xdb/0x1e0 ? fpregs_restore_userregs+0xdb/0x1e0 ? syscall_exit_to_user_mode+0x116/0x130 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 entry_SYSCALL_64_after_hwframe+0x71/0x79 RIP: 0033:0x7ff0070eca9c Code: Unable to access opcode bytes at 0x7ff0070eca72. RSP: 002b:00007ff006dff8c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00007ff0070eca9c RDX: 0000000000000400 RSI: 00007ff006dff9a0 RDI: 0000000000000003 RBP: 00007ff006dffde0 R08: 0000000000000000 R09: 00007ff000000ba0 R10: 00007ff007004b08 R11: 0000000000000246 R12: 0000000000000003 R13: 00007ff006dff9a0 R14: 0000000000000007 R15: 0000000000000008 </TASK> Modules linked in: snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hwdep snd_hda_core ---[ end trace 0000000000000000 ]--- This is because it would mistakenly call kthread_stop() on a user space thread making it "exit" before it actually exits. Since kthreads are created based on global behavior, use a cpumask to know when kthreads are running and that they need to be shutdown before proceeding to do new work. Link: https://lore.kernel.org/all/[email protected]/ This was debugged by using the persistent ring buffer: Link: https://lore.kernel.org/all/[email protected]/ Note, locking was originally used to fix this, but that proved to cause too many deadlocks to work around: https://lore.kernel.org/linux-trace-kernel/[email protected]/ Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: "Luis Claudio R. Goncalves" <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: e88ed22 ("tracing/timerlat: Add user-space interface") Reported-by: Tomas Glozar <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
mj22226
pushed a commit
to mj22226/linux
that referenced
this pull request
Sep 9, 2024
commit 177e1cc upstream. The start_kthread() and stop_thread() code was not always called with the interface_lock held. This means that the kthread variable could be unexpectedly changed causing the kthread_stop() to be called on it when it should not have been, leading to: while true; do rtla timerlat top -u -q & PID=$!; sleep 5; kill -INT $PID; sleep 0.001; kill -TERM $PID; wait $PID; done Causing the following OOPS: Oops: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 5 UID: 0 PID: 885 Comm: timerlatu/5 Not tainted 6.11.0-rc4-test-00002-gbc754cc76d1b-dirty torvalds#125 a533010b71dab205ad2f507188ce8c82203b0254 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:hrtimer_active+0x58/0x300 Code: 48 c1 ee 03 41 54 48 01 d1 48 01 d6 55 53 48 83 ec 20 80 39 00 0f 85 30 02 00 00 49 8b 6f 30 4c 8d 75 10 4c 89 f0 48 c1 e8 03 <0f> b6 3c 10 4c 89 f0 83 e0 07 83 c0 03 40 38 f8 7c 09 40 84 ff 0f RSP: 0018:ffff88811d97f940 EFLAGS: 00010202 RAX: 0000000000000002 RBX: ffff88823c6b5b28 RCX: ffffed10478d6b6b RDX: dffffc0000000000 RSI: ffffed10478d6b6c RDI: ffff88823c6b5b28 RBP: 0000000000000000 R08: ffff88823c6b5b58 R09: ffff88823c6b5b60 R10: ffff88811d97f957 R11: 0000000000000010 R12: 00000000000a801d R13: ffff88810d8b35d8 R14: 0000000000000010 R15: ffff88823c6b5b28 FS: 0000000000000000(0000) GS:ffff88823c680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000561858ad7258 CR3: 000000007729e001 CR4: 0000000000170ef0 Call Trace: <TASK> ? die_addr+0x40/0xa0 ? exc_general_protection+0x154/0x230 ? asm_exc_general_protection+0x26/0x30 ? hrtimer_active+0x58/0x300 ? __pfx_mutex_lock+0x10/0x10 ? __pfx_locks_remove_file+0x10/0x10 hrtimer_cancel+0x15/0x40 timerlat_fd_release+0x8e/0x1f0 ? security_file_release+0x43/0x80 __fput+0x372/0xb10 task_work_run+0x11e/0x1f0 ? _raw_spin_lock+0x85/0xe0 ? __pfx_task_work_run+0x10/0x10 ? poison_slab_object+0x109/0x170 ? do_exit+0x7a0/0x24b0 do_exit+0x7bd/0x24b0 ? __pfx_migrate_enable+0x10/0x10 ? __pfx_do_exit+0x10/0x10 ? __pfx_read_tsc+0x10/0x10 ? ktime_get+0x64/0x140 ? _raw_spin_lock_irq+0x86/0xe0 do_group_exit+0xb0/0x220 get_signal+0x17ba/0x1b50 ? vfs_read+0x179/0xa40 ? timerlat_fd_read+0x30b/0x9d0 ? __pfx_get_signal+0x10/0x10 ? __pfx_timerlat_fd_read+0x10/0x10 arch_do_signal_or_restart+0x8c/0x570 ? __pfx_arch_do_signal_or_restart+0x10/0x10 ? vfs_read+0x179/0xa40 ? ksys_read+0xfe/0x1d0 ? __pfx_ksys_read+0x10/0x10 syscall_exit_to_user_mode+0xbc/0x130 do_syscall_64+0x74/0x110 ? __pfx___rseq_handle_notify_resume+0x10/0x10 ? __pfx_ksys_read+0x10/0x10 ? fpregs_restore_userregs+0xdb/0x1e0 ? fpregs_restore_userregs+0xdb/0x1e0 ? syscall_exit_to_user_mode+0x116/0x130 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 entry_SYSCALL_64_after_hwframe+0x71/0x79 RIP: 0033:0x7ff0070eca9c Code: Unable to access opcode bytes at 0x7ff0070eca72. RSP: 002b:00007ff006dff8c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00007ff0070eca9c RDX: 0000000000000400 RSI: 00007ff006dff9a0 RDI: 0000000000000003 RBP: 00007ff006dffde0 R08: 0000000000000000 R09: 00007ff000000ba0 R10: 00007ff007004b08 R11: 0000000000000246 R12: 0000000000000003 R13: 00007ff006dff9a0 R14: 0000000000000007 R15: 0000000000000008 </TASK> Modules linked in: snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hwdep snd_hda_core ---[ end trace 0000000000000000 ]--- This is because it would mistakenly call kthread_stop() on a user space thread making it "exit" before it actually exits. Since kthreads are created based on global behavior, use a cpumask to know when kthreads are running and that they need to be shutdown before proceeding to do new work. Link: https://lore.kernel.org/all/[email protected]/ This was debugged by using the persistent ring buffer: Link: https://lore.kernel.org/all/[email protected]/ Note, locking was originally used to fix this, but that proved to cause too many deadlocks to work around: https://lore.kernel.org/linux-trace-kernel/[email protected]/ Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: "Luis Claudio R. Goncalves" <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: e88ed22 ("tracing/timerlat: Add user-space interface") Reported-by: Tomas Glozar <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
mj22226
pushed a commit
to mj22226/linux
that referenced
this pull request
Sep 10, 2024
commit 177e1cc upstream. The start_kthread() and stop_thread() code was not always called with the interface_lock held. This means that the kthread variable could be unexpectedly changed causing the kthread_stop() to be called on it when it should not have been, leading to: while true; do rtla timerlat top -u -q & PID=$!; sleep 5; kill -INT $PID; sleep 0.001; kill -TERM $PID; wait $PID; done Causing the following OOPS: Oops: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 5 UID: 0 PID: 885 Comm: timerlatu/5 Not tainted 6.11.0-rc4-test-00002-gbc754cc76d1b-dirty torvalds#125 a533010b71dab205ad2f507188ce8c82203b0254 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:hrtimer_active+0x58/0x300 Code: 48 c1 ee 03 41 54 48 01 d1 48 01 d6 55 53 48 83 ec 20 80 39 00 0f 85 30 02 00 00 49 8b 6f 30 4c 8d 75 10 4c 89 f0 48 c1 e8 03 <0f> b6 3c 10 4c 89 f0 83 e0 07 83 c0 03 40 38 f8 7c 09 40 84 ff 0f RSP: 0018:ffff88811d97f940 EFLAGS: 00010202 RAX: 0000000000000002 RBX: ffff88823c6b5b28 RCX: ffffed10478d6b6b RDX: dffffc0000000000 RSI: ffffed10478d6b6c RDI: ffff88823c6b5b28 RBP: 0000000000000000 R08: ffff88823c6b5b58 R09: ffff88823c6b5b60 R10: ffff88811d97f957 R11: 0000000000000010 R12: 00000000000a801d R13: ffff88810d8b35d8 R14: 0000000000000010 R15: ffff88823c6b5b28 FS: 0000000000000000(0000) GS:ffff88823c680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000561858ad7258 CR3: 000000007729e001 CR4: 0000000000170ef0 Call Trace: <TASK> ? die_addr+0x40/0xa0 ? exc_general_protection+0x154/0x230 ? asm_exc_general_protection+0x26/0x30 ? hrtimer_active+0x58/0x300 ? __pfx_mutex_lock+0x10/0x10 ? __pfx_locks_remove_file+0x10/0x10 hrtimer_cancel+0x15/0x40 timerlat_fd_release+0x8e/0x1f0 ? security_file_release+0x43/0x80 __fput+0x372/0xb10 task_work_run+0x11e/0x1f0 ? _raw_spin_lock+0x85/0xe0 ? __pfx_task_work_run+0x10/0x10 ? poison_slab_object+0x109/0x170 ? do_exit+0x7a0/0x24b0 do_exit+0x7bd/0x24b0 ? __pfx_migrate_enable+0x10/0x10 ? __pfx_do_exit+0x10/0x10 ? __pfx_read_tsc+0x10/0x10 ? ktime_get+0x64/0x140 ? _raw_spin_lock_irq+0x86/0xe0 do_group_exit+0xb0/0x220 get_signal+0x17ba/0x1b50 ? vfs_read+0x179/0xa40 ? timerlat_fd_read+0x30b/0x9d0 ? __pfx_get_signal+0x10/0x10 ? __pfx_timerlat_fd_read+0x10/0x10 arch_do_signal_or_restart+0x8c/0x570 ? __pfx_arch_do_signal_or_restart+0x10/0x10 ? vfs_read+0x179/0xa40 ? ksys_read+0xfe/0x1d0 ? __pfx_ksys_read+0x10/0x10 syscall_exit_to_user_mode+0xbc/0x130 do_syscall_64+0x74/0x110 ? __pfx___rseq_handle_notify_resume+0x10/0x10 ? __pfx_ksys_read+0x10/0x10 ? fpregs_restore_userregs+0xdb/0x1e0 ? fpregs_restore_userregs+0xdb/0x1e0 ? syscall_exit_to_user_mode+0x116/0x130 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 entry_SYSCALL_64_after_hwframe+0x71/0x79 RIP: 0033:0x7ff0070eca9c Code: Unable to access opcode bytes at 0x7ff0070eca72. RSP: 002b:00007ff006dff8c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00007ff0070eca9c RDX: 0000000000000400 RSI: 00007ff006dff9a0 RDI: 0000000000000003 RBP: 00007ff006dffde0 R08: 0000000000000000 R09: 00007ff000000ba0 R10: 00007ff007004b08 R11: 0000000000000246 R12: 0000000000000003 R13: 00007ff006dff9a0 R14: 0000000000000007 R15: 0000000000000008 </TASK> Modules linked in: snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hwdep snd_hda_core ---[ end trace 0000000000000000 ]--- This is because it would mistakenly call kthread_stop() on a user space thread making it "exit" before it actually exits. Since kthreads are created based on global behavior, use a cpumask to know when kthreads are running and that they need to be shutdown before proceeding to do new work. Link: https://lore.kernel.org/all/[email protected]/ This was debugged by using the persistent ring buffer: Link: https://lore.kernel.org/all/[email protected]/ Note, locking was originally used to fix this, but that proved to cause too many deadlocks to work around: https://lore.kernel.org/linux-trace-kernel/[email protected]/ Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: "Luis Claudio R. Goncalves" <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: e88ed22 ("tracing/timerlat: Add user-space interface") Reported-by: Tomas Glozar <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
mj22226
pushed a commit
to mj22226/linux
that referenced
this pull request
Sep 10, 2024
commit 177e1cc upstream. The start_kthread() and stop_thread() code was not always called with the interface_lock held. This means that the kthread variable could be unexpectedly changed causing the kthread_stop() to be called on it when it should not have been, leading to: while true; do rtla timerlat top -u -q & PID=$!; sleep 5; kill -INT $PID; sleep 0.001; kill -TERM $PID; wait $PID; done Causing the following OOPS: Oops: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 5 UID: 0 PID: 885 Comm: timerlatu/5 Not tainted 6.11.0-rc4-test-00002-gbc754cc76d1b-dirty torvalds#125 a533010b71dab205ad2f507188ce8c82203b0254 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:hrtimer_active+0x58/0x300 Code: 48 c1 ee 03 41 54 48 01 d1 48 01 d6 55 53 48 83 ec 20 80 39 00 0f 85 30 02 00 00 49 8b 6f 30 4c 8d 75 10 4c 89 f0 48 c1 e8 03 <0f> b6 3c 10 4c 89 f0 83 e0 07 83 c0 03 40 38 f8 7c 09 40 84 ff 0f RSP: 0018:ffff88811d97f940 EFLAGS: 00010202 RAX: 0000000000000002 RBX: ffff88823c6b5b28 RCX: ffffed10478d6b6b RDX: dffffc0000000000 RSI: ffffed10478d6b6c RDI: ffff88823c6b5b28 RBP: 0000000000000000 R08: ffff88823c6b5b58 R09: ffff88823c6b5b60 R10: ffff88811d97f957 R11: 0000000000000010 R12: 00000000000a801d R13: ffff88810d8b35d8 R14: 0000000000000010 R15: ffff88823c6b5b28 FS: 0000000000000000(0000) GS:ffff88823c680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000561858ad7258 CR3: 000000007729e001 CR4: 0000000000170ef0 Call Trace: <TASK> ? die_addr+0x40/0xa0 ? exc_general_protection+0x154/0x230 ? asm_exc_general_protection+0x26/0x30 ? hrtimer_active+0x58/0x300 ? __pfx_mutex_lock+0x10/0x10 ? __pfx_locks_remove_file+0x10/0x10 hrtimer_cancel+0x15/0x40 timerlat_fd_release+0x8e/0x1f0 ? security_file_release+0x43/0x80 __fput+0x372/0xb10 task_work_run+0x11e/0x1f0 ? _raw_spin_lock+0x85/0xe0 ? __pfx_task_work_run+0x10/0x10 ? poison_slab_object+0x109/0x170 ? do_exit+0x7a0/0x24b0 do_exit+0x7bd/0x24b0 ? __pfx_migrate_enable+0x10/0x10 ? __pfx_do_exit+0x10/0x10 ? __pfx_read_tsc+0x10/0x10 ? ktime_get+0x64/0x140 ? _raw_spin_lock_irq+0x86/0xe0 do_group_exit+0xb0/0x220 get_signal+0x17ba/0x1b50 ? vfs_read+0x179/0xa40 ? timerlat_fd_read+0x30b/0x9d0 ? __pfx_get_signal+0x10/0x10 ? __pfx_timerlat_fd_read+0x10/0x10 arch_do_signal_or_restart+0x8c/0x570 ? __pfx_arch_do_signal_or_restart+0x10/0x10 ? vfs_read+0x179/0xa40 ? ksys_read+0xfe/0x1d0 ? __pfx_ksys_read+0x10/0x10 syscall_exit_to_user_mode+0xbc/0x130 do_syscall_64+0x74/0x110 ? __pfx___rseq_handle_notify_resume+0x10/0x10 ? __pfx_ksys_read+0x10/0x10 ? fpregs_restore_userregs+0xdb/0x1e0 ? fpregs_restore_userregs+0xdb/0x1e0 ? syscall_exit_to_user_mode+0x116/0x130 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 entry_SYSCALL_64_after_hwframe+0x71/0x79 RIP: 0033:0x7ff0070eca9c Code: Unable to access opcode bytes at 0x7ff0070eca72. RSP: 002b:00007ff006dff8c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00007ff0070eca9c RDX: 0000000000000400 RSI: 00007ff006dff9a0 RDI: 0000000000000003 RBP: 00007ff006dffde0 R08: 0000000000000000 R09: 00007ff000000ba0 R10: 00007ff007004b08 R11: 0000000000000246 R12: 0000000000000003 R13: 00007ff006dff9a0 R14: 0000000000000007 R15: 0000000000000008 </TASK> Modules linked in: snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hwdep snd_hda_core ---[ end trace 0000000000000000 ]--- This is because it would mistakenly call kthread_stop() on a user space thread making it "exit" before it actually exits. Since kthreads are created based on global behavior, use a cpumask to know when kthreads are running and that they need to be shutdown before proceeding to do new work. Link: https://lore.kernel.org/all/[email protected]/ This was debugged by using the persistent ring buffer: Link: https://lore.kernel.org/all/[email protected]/ Note, locking was originally used to fix this, but that proved to cause too many deadlocks to work around: https://lore.kernel.org/linux-trace-kernel/[email protected]/ Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: "Luis Claudio R. Goncalves" <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: e88ed22 ("tracing/timerlat: Add user-space interface") Reported-by: Tomas Glozar <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
KexyBiscuit
pushed a commit
to AOSC-Tracking/linux
that referenced
this pull request
Sep 10, 2024
commit 177e1cc upstream. The start_kthread() and stop_thread() code was not always called with the interface_lock held. This means that the kthread variable could be unexpectedly changed causing the kthread_stop() to be called on it when it should not have been, leading to: while true; do rtla timerlat top -u -q & PID=$!; sleep 5; kill -INT $PID; sleep 0.001; kill -TERM $PID; wait $PID; done Causing the following OOPS: Oops: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 5 UID: 0 PID: 885 Comm: timerlatu/5 Not tainted 6.11.0-rc4-test-00002-gbc754cc76d1b-dirty torvalds#125 a533010b71dab205ad2f507188ce8c82203b0254 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:hrtimer_active+0x58/0x300 Code: 48 c1 ee 03 41 54 48 01 d1 48 01 d6 55 53 48 83 ec 20 80 39 00 0f 85 30 02 00 00 49 8b 6f 30 4c 8d 75 10 4c 89 f0 48 c1 e8 03 <0f> b6 3c 10 4c 89 f0 83 e0 07 83 c0 03 40 38 f8 7c 09 40 84 ff 0f RSP: 0018:ffff88811d97f940 EFLAGS: 00010202 RAX: 0000000000000002 RBX: ffff88823c6b5b28 RCX: ffffed10478d6b6b RDX: dffffc0000000000 RSI: ffffed10478d6b6c RDI: ffff88823c6b5b28 RBP: 0000000000000000 R08: ffff88823c6b5b58 R09: ffff88823c6b5b60 R10: ffff88811d97f957 R11: 0000000000000010 R12: 00000000000a801d R13: ffff88810d8b35d8 R14: 0000000000000010 R15: ffff88823c6b5b28 FS: 0000000000000000(0000) GS:ffff88823c680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000561858ad7258 CR3: 000000007729e001 CR4: 0000000000170ef0 Call Trace: <TASK> ? die_addr+0x40/0xa0 ? exc_general_protection+0x154/0x230 ? asm_exc_general_protection+0x26/0x30 ? hrtimer_active+0x58/0x300 ? __pfx_mutex_lock+0x10/0x10 ? __pfx_locks_remove_file+0x10/0x10 hrtimer_cancel+0x15/0x40 timerlat_fd_release+0x8e/0x1f0 ? security_file_release+0x43/0x80 __fput+0x372/0xb10 task_work_run+0x11e/0x1f0 ? _raw_spin_lock+0x85/0xe0 ? __pfx_task_work_run+0x10/0x10 ? poison_slab_object+0x109/0x170 ? do_exit+0x7a0/0x24b0 do_exit+0x7bd/0x24b0 ? __pfx_migrate_enable+0x10/0x10 ? __pfx_do_exit+0x10/0x10 ? __pfx_read_tsc+0x10/0x10 ? ktime_get+0x64/0x140 ? _raw_spin_lock_irq+0x86/0xe0 do_group_exit+0xb0/0x220 get_signal+0x17ba/0x1b50 ? vfs_read+0x179/0xa40 ? timerlat_fd_read+0x30b/0x9d0 ? __pfx_get_signal+0x10/0x10 ? __pfx_timerlat_fd_read+0x10/0x10 arch_do_signal_or_restart+0x8c/0x570 ? __pfx_arch_do_signal_or_restart+0x10/0x10 ? vfs_read+0x179/0xa40 ? ksys_read+0xfe/0x1d0 ? __pfx_ksys_read+0x10/0x10 syscall_exit_to_user_mode+0xbc/0x130 do_syscall_64+0x74/0x110 ? __pfx___rseq_handle_notify_resume+0x10/0x10 ? __pfx_ksys_read+0x10/0x10 ? fpregs_restore_userregs+0xdb/0x1e0 ? fpregs_restore_userregs+0xdb/0x1e0 ? syscall_exit_to_user_mode+0x116/0x130 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 entry_SYSCALL_64_after_hwframe+0x71/0x79 RIP: 0033:0x7ff0070eca9c Code: Unable to access opcode bytes at 0x7ff0070eca72. RSP: 002b:00007ff006dff8c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00007ff0070eca9c RDX: 0000000000000400 RSI: 00007ff006dff9a0 RDI: 0000000000000003 RBP: 00007ff006dffde0 R08: 0000000000000000 R09: 00007ff000000ba0 R10: 00007ff007004b08 R11: 0000000000000246 R12: 0000000000000003 R13: 00007ff006dff9a0 R14: 0000000000000007 R15: 0000000000000008 </TASK> Modules linked in: snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hwdep snd_hda_core ---[ end trace 0000000000000000 ]--- This is because it would mistakenly call kthread_stop() on a user space thread making it "exit" before it actually exits. Since kthreads are created based on global behavior, use a cpumask to know when kthreads are running and that they need to be shutdown before proceeding to do new work. Link: https://lore.kernel.org/all/[email protected]/ This was debugged by using the persistent ring buffer: Link: https://lore.kernel.org/all/[email protected]/ Note, locking was originally used to fix this, but that proved to cause too many deadlocks to work around: https://lore.kernel.org/linux-trace-kernel/[email protected]/ Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: "Luis Claudio R. Goncalves" <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: e88ed22 ("tracing/timerlat: Add user-space interface") Reported-by: Tomas Glozar <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
1054009064
pushed a commit
to 1054009064/linux
that referenced
this pull request
Sep 12, 2024
commit 177e1cc upstream. The start_kthread() and stop_thread() code was not always called with the interface_lock held. This means that the kthread variable could be unexpectedly changed causing the kthread_stop() to be called on it when it should not have been, leading to: while true; do rtla timerlat top -u -q & PID=$!; sleep 5; kill -INT $PID; sleep 0.001; kill -TERM $PID; wait $PID; done Causing the following OOPS: Oops: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 5 UID: 0 PID: 885 Comm: timerlatu/5 Not tainted 6.11.0-rc4-test-00002-gbc754cc76d1b-dirty torvalds#125 a533010b71dab205ad2f507188ce8c82203b0254 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:hrtimer_active+0x58/0x300 Code: 48 c1 ee 03 41 54 48 01 d1 48 01 d6 55 53 48 83 ec 20 80 39 00 0f 85 30 02 00 00 49 8b 6f 30 4c 8d 75 10 4c 89 f0 48 c1 e8 03 <0f> b6 3c 10 4c 89 f0 83 e0 07 83 c0 03 40 38 f8 7c 09 40 84 ff 0f RSP: 0018:ffff88811d97f940 EFLAGS: 00010202 RAX: 0000000000000002 RBX: ffff88823c6b5b28 RCX: ffffed10478d6b6b RDX: dffffc0000000000 RSI: ffffed10478d6b6c RDI: ffff88823c6b5b28 RBP: 0000000000000000 R08: ffff88823c6b5b58 R09: ffff88823c6b5b60 R10: ffff88811d97f957 R11: 0000000000000010 R12: 00000000000a801d R13: ffff88810d8b35d8 R14: 0000000000000010 R15: ffff88823c6b5b28 FS: 0000000000000000(0000) GS:ffff88823c680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000561858ad7258 CR3: 000000007729e001 CR4: 0000000000170ef0 Call Trace: <TASK> ? die_addr+0x40/0xa0 ? exc_general_protection+0x154/0x230 ? asm_exc_general_protection+0x26/0x30 ? hrtimer_active+0x58/0x300 ? __pfx_mutex_lock+0x10/0x10 ? __pfx_locks_remove_file+0x10/0x10 hrtimer_cancel+0x15/0x40 timerlat_fd_release+0x8e/0x1f0 ? security_file_release+0x43/0x80 __fput+0x372/0xb10 task_work_run+0x11e/0x1f0 ? _raw_spin_lock+0x85/0xe0 ? __pfx_task_work_run+0x10/0x10 ? poison_slab_object+0x109/0x170 ? do_exit+0x7a0/0x24b0 do_exit+0x7bd/0x24b0 ? __pfx_migrate_enable+0x10/0x10 ? __pfx_do_exit+0x10/0x10 ? __pfx_read_tsc+0x10/0x10 ? ktime_get+0x64/0x140 ? _raw_spin_lock_irq+0x86/0xe0 do_group_exit+0xb0/0x220 get_signal+0x17ba/0x1b50 ? vfs_read+0x179/0xa40 ? timerlat_fd_read+0x30b/0x9d0 ? __pfx_get_signal+0x10/0x10 ? __pfx_timerlat_fd_read+0x10/0x10 arch_do_signal_or_restart+0x8c/0x570 ? __pfx_arch_do_signal_or_restart+0x10/0x10 ? vfs_read+0x179/0xa40 ? ksys_read+0xfe/0x1d0 ? __pfx_ksys_read+0x10/0x10 syscall_exit_to_user_mode+0xbc/0x130 do_syscall_64+0x74/0x110 ? __pfx___rseq_handle_notify_resume+0x10/0x10 ? __pfx_ksys_read+0x10/0x10 ? fpregs_restore_userregs+0xdb/0x1e0 ? fpregs_restore_userregs+0xdb/0x1e0 ? syscall_exit_to_user_mode+0x116/0x130 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 entry_SYSCALL_64_after_hwframe+0x71/0x79 RIP: 0033:0x7ff0070eca9c Code: Unable to access opcode bytes at 0x7ff0070eca72. RSP: 002b:00007ff006dff8c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00007ff0070eca9c RDX: 0000000000000400 RSI: 00007ff006dff9a0 RDI: 0000000000000003 RBP: 00007ff006dffde0 R08: 0000000000000000 R09: 00007ff000000ba0 R10: 00007ff007004b08 R11: 0000000000000246 R12: 0000000000000003 R13: 00007ff006dff9a0 R14: 0000000000000007 R15: 0000000000000008 </TASK> Modules linked in: snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hwdep snd_hda_core ---[ end trace 0000000000000000 ]--- This is because it would mistakenly call kthread_stop() on a user space thread making it "exit" before it actually exits. Since kthreads are created based on global behavior, use a cpumask to know when kthreads are running and that they need to be shutdown before proceeding to do new work. Link: https://lore.kernel.org/all/[email protected]/ This was debugged by using the persistent ring buffer: Link: https://lore.kernel.org/all/[email protected]/ Note, locking was originally used to fix this, but that proved to cause too many deadlocks to work around: https://lore.kernel.org/linux-trace-kernel/[email protected]/ Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: "Luis Claudio R. Goncalves" <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: e88ed22 ("tracing/timerlat: Add user-space interface") Reported-by: Tomas Glozar <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
1054009064
pushed a commit
to 1054009064/linux
that referenced
this pull request
Sep 12, 2024
commit 177e1cc upstream. The start_kthread() and stop_thread() code was not always called with the interface_lock held. This means that the kthread variable could be unexpectedly changed causing the kthread_stop() to be called on it when it should not have been, leading to: while true; do rtla timerlat top -u -q & PID=$!; sleep 5; kill -INT $PID; sleep 0.001; kill -TERM $PID; wait $PID; done Causing the following OOPS: Oops: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 5 UID: 0 PID: 885 Comm: timerlatu/5 Not tainted 6.11.0-rc4-test-00002-gbc754cc76d1b-dirty torvalds#125 a533010b71dab205ad2f507188ce8c82203b0254 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:hrtimer_active+0x58/0x300 Code: 48 c1 ee 03 41 54 48 01 d1 48 01 d6 55 53 48 83 ec 20 80 39 00 0f 85 30 02 00 00 49 8b 6f 30 4c 8d 75 10 4c 89 f0 48 c1 e8 03 <0f> b6 3c 10 4c 89 f0 83 e0 07 83 c0 03 40 38 f8 7c 09 40 84 ff 0f RSP: 0018:ffff88811d97f940 EFLAGS: 00010202 RAX: 0000000000000002 RBX: ffff88823c6b5b28 RCX: ffffed10478d6b6b RDX: dffffc0000000000 RSI: ffffed10478d6b6c RDI: ffff88823c6b5b28 RBP: 0000000000000000 R08: ffff88823c6b5b58 R09: ffff88823c6b5b60 R10: ffff88811d97f957 R11: 0000000000000010 R12: 00000000000a801d R13: ffff88810d8b35d8 R14: 0000000000000010 R15: ffff88823c6b5b28 FS: 0000000000000000(0000) GS:ffff88823c680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000561858ad7258 CR3: 000000007729e001 CR4: 0000000000170ef0 Call Trace: <TASK> ? die_addr+0x40/0xa0 ? exc_general_protection+0x154/0x230 ? asm_exc_general_protection+0x26/0x30 ? hrtimer_active+0x58/0x300 ? __pfx_mutex_lock+0x10/0x10 ? __pfx_locks_remove_file+0x10/0x10 hrtimer_cancel+0x15/0x40 timerlat_fd_release+0x8e/0x1f0 ? security_file_release+0x43/0x80 __fput+0x372/0xb10 task_work_run+0x11e/0x1f0 ? _raw_spin_lock+0x85/0xe0 ? __pfx_task_work_run+0x10/0x10 ? poison_slab_object+0x109/0x170 ? do_exit+0x7a0/0x24b0 do_exit+0x7bd/0x24b0 ? __pfx_migrate_enable+0x10/0x10 ? __pfx_do_exit+0x10/0x10 ? __pfx_read_tsc+0x10/0x10 ? ktime_get+0x64/0x140 ? _raw_spin_lock_irq+0x86/0xe0 do_group_exit+0xb0/0x220 get_signal+0x17ba/0x1b50 ? vfs_read+0x179/0xa40 ? timerlat_fd_read+0x30b/0x9d0 ? __pfx_get_signal+0x10/0x10 ? __pfx_timerlat_fd_read+0x10/0x10 arch_do_signal_or_restart+0x8c/0x570 ? __pfx_arch_do_signal_or_restart+0x10/0x10 ? vfs_read+0x179/0xa40 ? ksys_read+0xfe/0x1d0 ? __pfx_ksys_read+0x10/0x10 syscall_exit_to_user_mode+0xbc/0x130 do_syscall_64+0x74/0x110 ? __pfx___rseq_handle_notify_resume+0x10/0x10 ? __pfx_ksys_read+0x10/0x10 ? fpregs_restore_userregs+0xdb/0x1e0 ? fpregs_restore_userregs+0xdb/0x1e0 ? syscall_exit_to_user_mode+0x116/0x130 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 entry_SYSCALL_64_after_hwframe+0x71/0x79 RIP: 0033:0x7ff0070eca9c Code: Unable to access opcode bytes at 0x7ff0070eca72. RSP: 002b:00007ff006dff8c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00007ff0070eca9c RDX: 0000000000000400 RSI: 00007ff006dff9a0 RDI: 0000000000000003 RBP: 00007ff006dffde0 R08: 0000000000000000 R09: 00007ff000000ba0 R10: 00007ff007004b08 R11: 0000000000000246 R12: 0000000000000003 R13: 00007ff006dff9a0 R14: 0000000000000007 R15: 0000000000000008 </TASK> Modules linked in: snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hwdep snd_hda_core ---[ end trace 0000000000000000 ]--- This is because it would mistakenly call kthread_stop() on a user space thread making it "exit" before it actually exits. Since kthreads are created based on global behavior, use a cpumask to know when kthreads are running and that they need to be shutdown before proceeding to do new work. Link: https://lore.kernel.org/all/[email protected]/ This was debugged by using the persistent ring buffer: Link: https://lore.kernel.org/all/[email protected]/ Note, locking was originally used to fix this, but that proved to cause too many deadlocks to work around: https://lore.kernel.org/linux-trace-kernel/[email protected]/ Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: "Luis Claudio R. Goncalves" <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: e88ed22 ("tracing/timerlat: Add user-space interface") Reported-by: Tomas Glozar <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
yhamamachi
pushed a commit
to yhamamachi/linux-pcie-virtio-net
that referenced
this pull request
Oct 23, 2024
The start_kthread() and stop_thread() code was not always called with the interface_lock held. This means that the kthread variable could be unexpectedly changed causing the kthread_stop() to be called on it when it should not have been, leading to: while true; do rtla timerlat top -u -q & PID=$!; sleep 5; kill -INT $PID; sleep 0.001; kill -TERM $PID; wait $PID; done Causing the following OOPS: Oops: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 5 UID: 0 PID: 885 Comm: timerlatu/5 Not tainted 6.11.0-rc4-test-00002-gbc754cc76d1b-dirty torvalds#125 a533010b71dab205ad2f507188ce8c82203b0254 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:hrtimer_active+0x58/0x300 Code: 48 c1 ee 03 41 54 48 01 d1 48 01 d6 55 53 48 83 ec 20 80 39 00 0f 85 30 02 00 00 49 8b 6f 30 4c 8d 75 10 4c 89 f0 48 c1 e8 03 <0f> b6 3c 10 4c 89 f0 83 e0 07 83 c0 03 40 38 f8 7c 09 40 84 ff 0f RSP: 0018:ffff88811d97f940 EFLAGS: 00010202 RAX: 0000000000000002 RBX: ffff88823c6b5b28 RCX: ffffed10478d6b6b RDX: dffffc0000000000 RSI: ffffed10478d6b6c RDI: ffff88823c6b5b28 RBP: 0000000000000000 R08: ffff88823c6b5b58 R09: ffff88823c6b5b60 R10: ffff88811d97f957 R11: 0000000000000010 R12: 00000000000a801d R13: ffff88810d8b35d8 R14: 0000000000000010 R15: ffff88823c6b5b28 FS: 0000000000000000(0000) GS:ffff88823c680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000561858ad7258 CR3: 000000007729e001 CR4: 0000000000170ef0 Call Trace: <TASK> ? die_addr+0x40/0xa0 ? exc_general_protection+0x154/0x230 ? asm_exc_general_protection+0x26/0x30 ? hrtimer_active+0x58/0x300 ? __pfx_mutex_lock+0x10/0x10 ? __pfx_locks_remove_file+0x10/0x10 hrtimer_cancel+0x15/0x40 timerlat_fd_release+0x8e/0x1f0 ? security_file_release+0x43/0x80 __fput+0x372/0xb10 task_work_run+0x11e/0x1f0 ? _raw_spin_lock+0x85/0xe0 ? __pfx_task_work_run+0x10/0x10 ? poison_slab_object+0x109/0x170 ? do_exit+0x7a0/0x24b0 do_exit+0x7bd/0x24b0 ? __pfx_migrate_enable+0x10/0x10 ? __pfx_do_exit+0x10/0x10 ? __pfx_read_tsc+0x10/0x10 ? ktime_get+0x64/0x140 ? _raw_spin_lock_irq+0x86/0xe0 do_group_exit+0xb0/0x220 get_signal+0x17ba/0x1b50 ? vfs_read+0x179/0xa40 ? timerlat_fd_read+0x30b/0x9d0 ? __pfx_get_signal+0x10/0x10 ? __pfx_timerlat_fd_read+0x10/0x10 arch_do_signal_or_restart+0x8c/0x570 ? __pfx_arch_do_signal_or_restart+0x10/0x10 ? vfs_read+0x179/0xa40 ? ksys_read+0xfe/0x1d0 ? __pfx_ksys_read+0x10/0x10 syscall_exit_to_user_mode+0xbc/0x130 do_syscall_64+0x74/0x110 ? __pfx___rseq_handle_notify_resume+0x10/0x10 ? __pfx_ksys_read+0x10/0x10 ? fpregs_restore_userregs+0xdb/0x1e0 ? fpregs_restore_userregs+0xdb/0x1e0 ? syscall_exit_to_user_mode+0x116/0x130 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 ? do_syscall_64+0x74/0x110 entry_SYSCALL_64_after_hwframe+0x71/0x79 RIP: 0033:0x7ff0070eca9c Code: Unable to access opcode bytes at 0x7ff0070eca72. RSP: 002b:00007ff006dff8c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00007ff0070eca9c RDX: 0000000000000400 RSI: 00007ff006dff9a0 RDI: 0000000000000003 RBP: 00007ff006dffde0 R08: 0000000000000000 R09: 00007ff000000ba0 R10: 00007ff007004b08 R11: 0000000000000246 R12: 0000000000000003 R13: 00007ff006dff9a0 R14: 0000000000000007 R15: 0000000000000008 </TASK> Modules linked in: snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hwdep snd_hda_core ---[ end trace 0000000000000000 ]--- This is because it would mistakenly call kthread_stop() on a user space thread making it "exit" before it actually exits. Since kthreads are created based on global behavior, use a cpumask to know when kthreads are running and that they need to be shutdown before proceeding to do new work. Link: https://lore.kernel.org/all/[email protected]/ This was debugged by using the persistent ring buffer: Link: https://lore.kernel.org/all/[email protected]/ Note, locking was originally used to fix this, but that proved to cause too many deadlocks to work around: https://lore.kernel.org/linux-trace-kernel/[email protected]/ Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: "Luis Claudio R. Goncalves" <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: e88ed22 ("tracing/timerlat: Add user-space interface") Reported-by: Tomas Glozar <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
intel-lab-lkp
pushed a commit
to intel-lab-lkp/linux
that referenced
this pull request
Dec 2, 2024
Element replace (with a socket different from the one stored) may race with socket's close() link popping & unlinking. __sock_map_delete() unconditionally unrefs the (wrong) element: // set map[0] = s0 map_update_elem(map, 0, s0) // drop fd of s0 close(s0) sock_map_close() lock_sock(sk) (s0!) sock_map_remove_links(sk) link = sk_psock_link_pop() sock_map_unlink(sk, link) sock_map_delete_from_link // replace map[0] with s1 map_update_elem(map, 0, s1) sock_map_update_elem (s1!) lock_sock(sk) sock_map_update_common psock = sk_psock(sk) spin_lock(&stab->lock) osk = stab->sks[idx] sock_map_add_link(..., &stab->sks[idx]) sock_map_unref(osk, &stab->sks[idx]) psock = sk_psock(osk) sk_psock_put(sk, psock) if (refcount_dec_and_test(&psock)) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) unlock_sock(sk) __sock_map_delete spin_lock(&stab->lock) sk = *psk // s1 replaced s0; sk == s1 if (!sk_test || sk_test == sk) // sk_test (s0) != sk (s1); no branch sk = xchg(psk, NULL) if (sk) sock_map_unref(sk, psk) // unref s1; sks[idx] will dangle psock = sk_psock(sk) sk_psock_put(sk, psock) if (refcount_dec_and_test()) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) release_sock(sk) Then close(map) enqueues bpf_map_free_deferred, which finally calls sock_map_free(). This results in some refcount_t warnings along with a KASAN splat[1]. Fix __sock_map_delete(), do not allow sock_map_unref() on elements that may have been replaced. [1]: BUG: KASAN: slab-use-after-free in sock_map_free+0x10e/0x330 Write of size 4 at addr ffff88811f5b9100 by task kworker/u64:12/1063 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Not tainted 6.12.0+ torvalds#125 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred Call Trace: <TASK> dump_stack_lvl+0x68/0x90 print_report+0x174/0x4f6 kasan_report+0xb9/0x190 kasan_check_range+0x10f/0x1e0 sock_map_free+0x10e/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> Allocated by task 1202: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 __kasan_slab_alloc+0x85/0x90 kmem_cache_alloc_noprof+0x131/0x450 sk_prot_alloc+0x5b/0x220 sk_alloc+0x2c/0x870 unix_create1+0x88/0x8a0 unix_create+0xc5/0x180 __sock_create+0x241/0x650 __sys_socketpair+0x1ce/0x420 __x64_sys_socketpair+0x92/0x100 do_syscall_64+0x93/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 46: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 kasan_save_free_info+0x37/0x60 __kasan_slab_free+0x4b/0x70 kmem_cache_free+0x1a1/0x590 __sk_destruct+0x388/0x5a0 sk_psock_destroy+0x73e/0xa50 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 The buggy address belongs to the object at ffff88811f5b9080 which belongs to the cache UNIX-STREAM of size 1984 The buggy address is located 128 bytes inside of freed 1984-byte region [ffff88811f5b9080, ffff88811f5b9840) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11f5b8 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 memcg:ffff888127d49401 flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) page_type: f5(slab) raw: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 raw: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 head: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000003 ffffea00047d6e01 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88811f5b9000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88811f5b9080: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff88811f5b9100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88811f5b9180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88811f5b9200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Disabling lock debugging due to kernel taint refcount_t: addition on 0; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:25 refcount_warn_saturate+0xce/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B 6.12.0+ torvalds#125 Tainted: [B]=BAD_PAGE Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xce/0x150 Code: 34 73 eb 03 01 e8 82 53 ad fe 0f 0b eb b1 80 3d 27 73 eb 03 00 75 a8 48 c7 c7 80 bd 95 84 c6 05 17 73 eb 03 01 e8 62 53 ad fe <0f> 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000002 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xce/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xce/0x150 sock_map_free+0x2e5/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 refcount_t: underflow; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:28 refcount_warn_saturate+0xee/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B W 6.12.0+ torvalds#125 Tainted: [B]=BAD_PAGE, [W]=WARN Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xee/0x150 Code: 17 73 eb 03 01 e8 62 53 ad fe 0f 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 f6 72 eb 03 01 e8 42 53 ad fe <0f> 0b e9 6e ff ff ff 80 3d e6 72 eb 03 00 0f 85 61 ff ff ff 48 c7 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000003 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xee/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xee/0x150 sock_map_free+0x2d3/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 Fixes: 604326b ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Michal Luczaj <[email protected]>
intel-lab-lkp
pushed a commit
to intel-lab-lkp/linux
that referenced
this pull request
Dec 2, 2024
Element replace (with a socket different from the one stored) may race with socket's close() link popping & unlinking. __sock_map_delete() unconditionally unrefs the (wrong) element: // set map[0] = s0 map_update_elem(map, 0, s0) // drop fd of s0 close(s0) sock_map_close() lock_sock(sk) (s0!) sock_map_remove_links(sk) link = sk_psock_link_pop() sock_map_unlink(sk, link) sock_map_delete_from_link // replace map[0] with s1 map_update_elem(map, 0, s1) sock_map_update_elem (s1!) lock_sock(sk) sock_map_update_common psock = sk_psock(sk) spin_lock(&stab->lock) osk = stab->sks[idx] sock_map_add_link(..., &stab->sks[idx]) sock_map_unref(osk, &stab->sks[idx]) psock = sk_psock(osk) sk_psock_put(sk, psock) if (refcount_dec_and_test(&psock)) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) unlock_sock(sk) __sock_map_delete spin_lock(&stab->lock) sk = *psk // s1 replaced s0; sk == s1 if (!sk_test || sk_test == sk) // sk_test (s0) != sk (s1); no branch sk = xchg(psk, NULL) if (sk) sock_map_unref(sk, psk) // unref s1; sks[idx] will dangle psock = sk_psock(sk) sk_psock_put(sk, psock) if (refcount_dec_and_test()) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) release_sock(sk) Then close(map) enqueues bpf_map_free_deferred, which finally calls sock_map_free(). This results in some refcount_t warnings along with a KASAN splat[1]. Fix __sock_map_delete(), do not allow sock_map_unref() on elements that may have been replaced. [1]: BUG: KASAN: slab-use-after-free in sock_map_free+0x10e/0x330 Write of size 4 at addr ffff88811f5b9100 by task kworker/u64:12/1063 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Not tainted 6.12.0+ torvalds#125 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred Call Trace: <TASK> dump_stack_lvl+0x68/0x90 print_report+0x174/0x4f6 kasan_report+0xb9/0x190 kasan_check_range+0x10f/0x1e0 sock_map_free+0x10e/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> Allocated by task 1202: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 __kasan_slab_alloc+0x85/0x90 kmem_cache_alloc_noprof+0x131/0x450 sk_prot_alloc+0x5b/0x220 sk_alloc+0x2c/0x870 unix_create1+0x88/0x8a0 unix_create+0xc5/0x180 __sock_create+0x241/0x650 __sys_socketpair+0x1ce/0x420 __x64_sys_socketpair+0x92/0x100 do_syscall_64+0x93/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 46: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 kasan_save_free_info+0x37/0x60 __kasan_slab_free+0x4b/0x70 kmem_cache_free+0x1a1/0x590 __sk_destruct+0x388/0x5a0 sk_psock_destroy+0x73e/0xa50 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 The buggy address belongs to the object at ffff88811f5b9080 which belongs to the cache UNIX-STREAM of size 1984 The buggy address is located 128 bytes inside of freed 1984-byte region [ffff88811f5b9080, ffff88811f5b9840) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11f5b8 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 memcg:ffff888127d49401 flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) page_type: f5(slab) raw: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 raw: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 head: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000003 ffffea00047d6e01 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88811f5b9000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88811f5b9080: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff88811f5b9100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88811f5b9180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88811f5b9200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Disabling lock debugging due to kernel taint refcount_t: addition on 0; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:25 refcount_warn_saturate+0xce/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B 6.12.0+ torvalds#125 Tainted: [B]=BAD_PAGE Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xce/0x150 Code: 34 73 eb 03 01 e8 82 53 ad fe 0f 0b eb b1 80 3d 27 73 eb 03 00 75 a8 48 c7 c7 80 bd 95 84 c6 05 17 73 eb 03 01 e8 62 53 ad fe <0f> 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000002 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xce/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xce/0x150 sock_map_free+0x2e5/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 refcount_t: underflow; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:28 refcount_warn_saturate+0xee/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B W 6.12.0+ torvalds#125 Tainted: [B]=BAD_PAGE, [W]=WARN Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xee/0x150 Code: 17 73 eb 03 01 e8 62 53 ad fe 0f 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 f6 72 eb 03 01 e8 42 53 ad fe <0f> 0b e9 6e ff ff ff 80 3d e6 72 eb 03 00 0f 85 61 ff ff ff 48 c7 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000003 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xee/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xee/0x150 sock_map_free+0x2d3/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 Fixes: 604326b ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Michal Luczaj <[email protected]>
bjoto
referenced
this pull request
in linux-riscv/linux-riscv
Dec 10, 2024
Element replace (with a socket different from the one stored) may race with socket's close() link popping & unlinking. __sock_map_delete() unconditionally unrefs the (wrong) element: // set map[0] = s0 map_update_elem(map, 0, s0) // drop fd of s0 close(s0) sock_map_close() lock_sock(sk) (s0!) sock_map_remove_links(sk) link = sk_psock_link_pop() sock_map_unlink(sk, link) sock_map_delete_from_link // replace map[0] with s1 map_update_elem(map, 0, s1) sock_map_update_elem (s1!) lock_sock(sk) sock_map_update_common psock = sk_psock(sk) spin_lock(&stab->lock) osk = stab->sks[idx] sock_map_add_link(..., &stab->sks[idx]) sock_map_unref(osk, &stab->sks[idx]) psock = sk_psock(osk) sk_psock_put(sk, psock) if (refcount_dec_and_test(&psock)) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) unlock_sock(sk) __sock_map_delete spin_lock(&stab->lock) sk = *psk // s1 replaced s0; sk == s1 if (!sk_test || sk_test == sk) // sk_test (s0) != sk (s1); no branch sk = xchg(psk, NULL) if (sk) sock_map_unref(sk, psk) // unref s1; sks[idx] will dangle psock = sk_psock(sk) sk_psock_put(sk, psock) if (refcount_dec_and_test()) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) release_sock(sk) Then close(map) enqueues bpf_map_free_deferred, which finally calls sock_map_free(). This results in some refcount_t warnings along with a KASAN splat [1]. Fix __sock_map_delete(), do not allow sock_map_unref() on elements that may have been replaced. [1]: BUG: KASAN: slab-use-after-free in sock_map_free+0x10e/0x330 Write of size 4 at addr ffff88811f5b9100 by task kworker/u64:12/1063 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Not tainted 6.12.0+ #125 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred Call Trace: <TASK> dump_stack_lvl+0x68/0x90 print_report+0x174/0x4f6 kasan_report+0xb9/0x190 kasan_check_range+0x10f/0x1e0 sock_map_free+0x10e/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> Allocated by task 1202: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 __kasan_slab_alloc+0x85/0x90 kmem_cache_alloc_noprof+0x131/0x450 sk_prot_alloc+0x5b/0x220 sk_alloc+0x2c/0x870 unix_create1+0x88/0x8a0 unix_create+0xc5/0x180 __sock_create+0x241/0x650 __sys_socketpair+0x1ce/0x420 __x64_sys_socketpair+0x92/0x100 do_syscall_64+0x93/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 46: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 kasan_save_free_info+0x37/0x60 __kasan_slab_free+0x4b/0x70 kmem_cache_free+0x1a1/0x590 __sk_destruct+0x388/0x5a0 sk_psock_destroy+0x73e/0xa50 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 The buggy address belongs to the object at ffff88811f5b9080 which belongs to the cache UNIX-STREAM of size 1984 The buggy address is located 128 bytes inside of freed 1984-byte region [ffff88811f5b9080, ffff88811f5b9840) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11f5b8 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 memcg:ffff888127d49401 flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) page_type: f5(slab) raw: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 raw: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 head: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000003 ffffea00047d6e01 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88811f5b9000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88811f5b9080: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88811f5b9180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88811f5b9200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Disabling lock debugging due to kernel taint refcount_t: addition on 0; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:25 refcount_warn_saturate+0xce/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B 6.12.0+ #125 Tainted: [B]=BAD_PAGE Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xce/0x150 Code: 34 73 eb 03 01 e8 82 53 ad fe 0f 0b eb b1 80 3d 27 73 eb 03 00 75 a8 48 c7 c7 80 bd 95 84 c6 05 17 73 eb 03 01 e8 62 53 ad fe <0f> 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000002 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xce/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xce/0x150 sock_map_free+0x2e5/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 refcount_t: underflow; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:28 refcount_warn_saturate+0xee/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B W 6.12.0+ #125 Tainted: [B]=BAD_PAGE, [W]=WARN Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xee/0x150 Code: 17 73 eb 03 01 e8 62 53 ad fe 0f 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 f6 72 eb 03 01 e8 42 53 ad fe <0f> 0b e9 6e ff ff ff 80 3d e6 72 eb 03 00 0f 85 61 ff ff ff 48 c7 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000003 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xee/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xee/0x150 sock_map_free+0x2d3/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 Fixes: 604326b ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Michal Luczaj <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
mj22226
pushed a commit
to mj22226/linux
that referenced
this pull request
Dec 17, 2024
commit ed1fc5d upstream. Element replace (with a socket different from the one stored) may race with socket's close() link popping & unlinking. __sock_map_delete() unconditionally unrefs the (wrong) element: // set map[0] = s0 map_update_elem(map, 0, s0) // drop fd of s0 close(s0) sock_map_close() lock_sock(sk) (s0!) sock_map_remove_links(sk) link = sk_psock_link_pop() sock_map_unlink(sk, link) sock_map_delete_from_link // replace map[0] with s1 map_update_elem(map, 0, s1) sock_map_update_elem (s1!) lock_sock(sk) sock_map_update_common psock = sk_psock(sk) spin_lock(&stab->lock) osk = stab->sks[idx] sock_map_add_link(..., &stab->sks[idx]) sock_map_unref(osk, &stab->sks[idx]) psock = sk_psock(osk) sk_psock_put(sk, psock) if (refcount_dec_and_test(&psock)) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) unlock_sock(sk) __sock_map_delete spin_lock(&stab->lock) sk = *psk // s1 replaced s0; sk == s1 if (!sk_test || sk_test == sk) // sk_test (s0) != sk (s1); no branch sk = xchg(psk, NULL) if (sk) sock_map_unref(sk, psk) // unref s1; sks[idx] will dangle psock = sk_psock(sk) sk_psock_put(sk, psock) if (refcount_dec_and_test()) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) release_sock(sk) Then close(map) enqueues bpf_map_free_deferred, which finally calls sock_map_free(). This results in some refcount_t warnings along with a KASAN splat [1]. Fix __sock_map_delete(), do not allow sock_map_unref() on elements that may have been replaced. [1]: BUG: KASAN: slab-use-after-free in sock_map_free+0x10e/0x330 Write of size 4 at addr ffff88811f5b9100 by task kworker/u64:12/1063 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Not tainted 6.12.0+ torvalds#125 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred Call Trace: <TASK> dump_stack_lvl+0x68/0x90 print_report+0x174/0x4f6 kasan_report+0xb9/0x190 kasan_check_range+0x10f/0x1e0 sock_map_free+0x10e/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> Allocated by task 1202: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 __kasan_slab_alloc+0x85/0x90 kmem_cache_alloc_noprof+0x131/0x450 sk_prot_alloc+0x5b/0x220 sk_alloc+0x2c/0x870 unix_create1+0x88/0x8a0 unix_create+0xc5/0x180 __sock_create+0x241/0x650 __sys_socketpair+0x1ce/0x420 __x64_sys_socketpair+0x92/0x100 do_syscall_64+0x93/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 46: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 kasan_save_free_info+0x37/0x60 __kasan_slab_free+0x4b/0x70 kmem_cache_free+0x1a1/0x590 __sk_destruct+0x388/0x5a0 sk_psock_destroy+0x73e/0xa50 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 The buggy address belongs to the object at ffff88811f5b9080 which belongs to the cache UNIX-STREAM of size 1984 The buggy address is located 128 bytes inside of freed 1984-byte region [ffff88811f5b9080, ffff88811f5b9840) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11f5b8 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 memcg:ffff888127d49401 flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) page_type: f5(slab) raw: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 raw: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 head: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000003 ffffea00047d6e01 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88811f5b9000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88811f5b9080: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88811f5b9180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88811f5b9200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Disabling lock debugging due to kernel taint refcount_t: addition on 0; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:25 refcount_warn_saturate+0xce/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B 6.12.0+ torvalds#125 Tainted: [B]=BAD_PAGE Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xce/0x150 Code: 34 73 eb 03 01 e8 82 53 ad fe 0f 0b eb b1 80 3d 27 73 eb 03 00 75 a8 48 c7 c7 80 bd 95 84 c6 05 17 73 eb 03 01 e8 62 53 ad fe <0f> 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000002 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xce/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xce/0x150 sock_map_free+0x2e5/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 refcount_t: underflow; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:28 refcount_warn_saturate+0xee/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B W 6.12.0+ torvalds#125 Tainted: [B]=BAD_PAGE, [W]=WARN Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xee/0x150 Code: 17 73 eb 03 01 e8 62 53 ad fe 0f 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 f6 72 eb 03 01 e8 42 53 ad fe <0f> 0b e9 6e ff ff ff 80 3d e6 72 eb 03 00 0f 85 61 ff ff ff 48 c7 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000003 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xee/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xee/0x150 sock_map_free+0x2d3/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 Fixes: 604326b ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Michal Luczaj <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
mj22226
pushed a commit
to mj22226/linux
that referenced
this pull request
Dec 17, 2024
commit ed1fc5d upstream. Element replace (with a socket different from the one stored) may race with socket's close() link popping & unlinking. __sock_map_delete() unconditionally unrefs the (wrong) element: // set map[0] = s0 map_update_elem(map, 0, s0) // drop fd of s0 close(s0) sock_map_close() lock_sock(sk) (s0!) sock_map_remove_links(sk) link = sk_psock_link_pop() sock_map_unlink(sk, link) sock_map_delete_from_link // replace map[0] with s1 map_update_elem(map, 0, s1) sock_map_update_elem (s1!) lock_sock(sk) sock_map_update_common psock = sk_psock(sk) spin_lock(&stab->lock) osk = stab->sks[idx] sock_map_add_link(..., &stab->sks[idx]) sock_map_unref(osk, &stab->sks[idx]) psock = sk_psock(osk) sk_psock_put(sk, psock) if (refcount_dec_and_test(&psock)) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) unlock_sock(sk) __sock_map_delete spin_lock(&stab->lock) sk = *psk // s1 replaced s0; sk == s1 if (!sk_test || sk_test == sk) // sk_test (s0) != sk (s1); no branch sk = xchg(psk, NULL) if (sk) sock_map_unref(sk, psk) // unref s1; sks[idx] will dangle psock = sk_psock(sk) sk_psock_put(sk, psock) if (refcount_dec_and_test()) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) release_sock(sk) Then close(map) enqueues bpf_map_free_deferred, which finally calls sock_map_free(). This results in some refcount_t warnings along with a KASAN splat [1]. Fix __sock_map_delete(), do not allow sock_map_unref() on elements that may have been replaced. [1]: BUG: KASAN: slab-use-after-free in sock_map_free+0x10e/0x330 Write of size 4 at addr ffff88811f5b9100 by task kworker/u64:12/1063 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Not tainted 6.12.0+ torvalds#125 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred Call Trace: <TASK> dump_stack_lvl+0x68/0x90 print_report+0x174/0x4f6 kasan_report+0xb9/0x190 kasan_check_range+0x10f/0x1e0 sock_map_free+0x10e/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> Allocated by task 1202: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 __kasan_slab_alloc+0x85/0x90 kmem_cache_alloc_noprof+0x131/0x450 sk_prot_alloc+0x5b/0x220 sk_alloc+0x2c/0x870 unix_create1+0x88/0x8a0 unix_create+0xc5/0x180 __sock_create+0x241/0x650 __sys_socketpair+0x1ce/0x420 __x64_sys_socketpair+0x92/0x100 do_syscall_64+0x93/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 46: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 kasan_save_free_info+0x37/0x60 __kasan_slab_free+0x4b/0x70 kmem_cache_free+0x1a1/0x590 __sk_destruct+0x388/0x5a0 sk_psock_destroy+0x73e/0xa50 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 The buggy address belongs to the object at ffff88811f5b9080 which belongs to the cache UNIX-STREAM of size 1984 The buggy address is located 128 bytes inside of freed 1984-byte region [ffff88811f5b9080, ffff88811f5b9840) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11f5b8 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 memcg:ffff888127d49401 flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) page_type: f5(slab) raw: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 raw: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 head: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000003 ffffea00047d6e01 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88811f5b9000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88811f5b9080: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88811f5b9180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88811f5b9200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Disabling lock debugging due to kernel taint refcount_t: addition on 0; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:25 refcount_warn_saturate+0xce/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B 6.12.0+ torvalds#125 Tainted: [B]=BAD_PAGE Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xce/0x150 Code: 34 73 eb 03 01 e8 82 53 ad fe 0f 0b eb b1 80 3d 27 73 eb 03 00 75 a8 48 c7 c7 80 bd 95 84 c6 05 17 73 eb 03 01 e8 62 53 ad fe <0f> 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000002 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xce/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xce/0x150 sock_map_free+0x2e5/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 refcount_t: underflow; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:28 refcount_warn_saturate+0xee/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B W 6.12.0+ torvalds#125 Tainted: [B]=BAD_PAGE, [W]=WARN Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xee/0x150 Code: 17 73 eb 03 01 e8 62 53 ad fe 0f 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 f6 72 eb 03 01 e8 42 53 ad fe <0f> 0b e9 6e ff ff ff 80 3d e6 72 eb 03 00 0f 85 61 ff ff ff 48 c7 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000003 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xee/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xee/0x150 sock_map_free+0x2d3/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 Fixes: 604326b ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Michal Luczaj <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
ptr1337
pushed a commit
to CachyOS/linux
that referenced
this pull request
Dec 19, 2024
commit ed1fc5d upstream. Element replace (with a socket different from the one stored) may race with socket's close() link popping & unlinking. __sock_map_delete() unconditionally unrefs the (wrong) element: // set map[0] = s0 map_update_elem(map, 0, s0) // drop fd of s0 close(s0) sock_map_close() lock_sock(sk) (s0!) sock_map_remove_links(sk) link = sk_psock_link_pop() sock_map_unlink(sk, link) sock_map_delete_from_link // replace map[0] with s1 map_update_elem(map, 0, s1) sock_map_update_elem (s1!) lock_sock(sk) sock_map_update_common psock = sk_psock(sk) spin_lock(&stab->lock) osk = stab->sks[idx] sock_map_add_link(..., &stab->sks[idx]) sock_map_unref(osk, &stab->sks[idx]) psock = sk_psock(osk) sk_psock_put(sk, psock) if (refcount_dec_and_test(&psock)) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) unlock_sock(sk) __sock_map_delete spin_lock(&stab->lock) sk = *psk // s1 replaced s0; sk == s1 if (!sk_test || sk_test == sk) // sk_test (s0) != sk (s1); no branch sk = xchg(psk, NULL) if (sk) sock_map_unref(sk, psk) // unref s1; sks[idx] will dangle psock = sk_psock(sk) sk_psock_put(sk, psock) if (refcount_dec_and_test()) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) release_sock(sk) Then close(map) enqueues bpf_map_free_deferred, which finally calls sock_map_free(). This results in some refcount_t warnings along with a KASAN splat [1]. Fix __sock_map_delete(), do not allow sock_map_unref() on elements that may have been replaced. [1]: BUG: KASAN: slab-use-after-free in sock_map_free+0x10e/0x330 Write of size 4 at addr ffff88811f5b9100 by task kworker/u64:12/1063 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Not tainted 6.12.0+ torvalds#125 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred Call Trace: <TASK> dump_stack_lvl+0x68/0x90 print_report+0x174/0x4f6 kasan_report+0xb9/0x190 kasan_check_range+0x10f/0x1e0 sock_map_free+0x10e/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> Allocated by task 1202: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 __kasan_slab_alloc+0x85/0x90 kmem_cache_alloc_noprof+0x131/0x450 sk_prot_alloc+0x5b/0x220 sk_alloc+0x2c/0x870 unix_create1+0x88/0x8a0 unix_create+0xc5/0x180 __sock_create+0x241/0x650 __sys_socketpair+0x1ce/0x420 __x64_sys_socketpair+0x92/0x100 do_syscall_64+0x93/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 46: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 kasan_save_free_info+0x37/0x60 __kasan_slab_free+0x4b/0x70 kmem_cache_free+0x1a1/0x590 __sk_destruct+0x388/0x5a0 sk_psock_destroy+0x73e/0xa50 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 The buggy address belongs to the object at ffff88811f5b9080 which belongs to the cache UNIX-STREAM of size 1984 The buggy address is located 128 bytes inside of freed 1984-byte region [ffff88811f5b9080, ffff88811f5b9840) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11f5b8 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 memcg:ffff888127d49401 flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) page_type: f5(slab) raw: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 raw: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 head: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000003 ffffea00047d6e01 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88811f5b9000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88811f5b9080: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88811f5b9180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88811f5b9200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Disabling lock debugging due to kernel taint refcount_t: addition on 0; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:25 refcount_warn_saturate+0xce/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B 6.12.0+ torvalds#125 Tainted: [B]=BAD_PAGE Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xce/0x150 Code: 34 73 eb 03 01 e8 82 53 ad fe 0f 0b eb b1 80 3d 27 73 eb 03 00 75 a8 48 c7 c7 80 bd 95 84 c6 05 17 73 eb 03 01 e8 62 53 ad fe <0f> 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000002 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xce/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xce/0x150 sock_map_free+0x2e5/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 refcount_t: underflow; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:28 refcount_warn_saturate+0xee/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B W 6.12.0+ torvalds#125 Tainted: [B]=BAD_PAGE, [W]=WARN Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xee/0x150 Code: 17 73 eb 03 01 e8 62 53 ad fe 0f 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 f6 72 eb 03 01 e8 42 53 ad fe <0f> 0b e9 6e ff ff ff 80 3d e6 72 eb 03 00 0f 85 61 ff ff ff 48 c7 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000003 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xee/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xee/0x150 sock_map_free+0x2d3/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 Fixes: 604326b ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Michal Luczaj <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
mj22226
pushed a commit
to mj22226/linux
that referenced
this pull request
Dec 20, 2024
commit ed1fc5d upstream. Element replace (with a socket different from the one stored) may race with socket's close() link popping & unlinking. __sock_map_delete() unconditionally unrefs the (wrong) element: // set map[0] = s0 map_update_elem(map, 0, s0) // drop fd of s0 close(s0) sock_map_close() lock_sock(sk) (s0!) sock_map_remove_links(sk) link = sk_psock_link_pop() sock_map_unlink(sk, link) sock_map_delete_from_link // replace map[0] with s1 map_update_elem(map, 0, s1) sock_map_update_elem (s1!) lock_sock(sk) sock_map_update_common psock = sk_psock(sk) spin_lock(&stab->lock) osk = stab->sks[idx] sock_map_add_link(..., &stab->sks[idx]) sock_map_unref(osk, &stab->sks[idx]) psock = sk_psock(osk) sk_psock_put(sk, psock) if (refcount_dec_and_test(&psock)) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) unlock_sock(sk) __sock_map_delete spin_lock(&stab->lock) sk = *psk // s1 replaced s0; sk == s1 if (!sk_test || sk_test == sk) // sk_test (s0) != sk (s1); no branch sk = xchg(psk, NULL) if (sk) sock_map_unref(sk, psk) // unref s1; sks[idx] will dangle psock = sk_psock(sk) sk_psock_put(sk, psock) if (refcount_dec_and_test()) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) release_sock(sk) Then close(map) enqueues bpf_map_free_deferred, which finally calls sock_map_free(). This results in some refcount_t warnings along with a KASAN splat [1]. Fix __sock_map_delete(), do not allow sock_map_unref() on elements that may have been replaced. [1]: BUG: KASAN: slab-use-after-free in sock_map_free+0x10e/0x330 Write of size 4 at addr ffff88811f5b9100 by task kworker/u64:12/1063 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Not tainted 6.12.0+ torvalds#125 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred Call Trace: <TASK> dump_stack_lvl+0x68/0x90 print_report+0x174/0x4f6 kasan_report+0xb9/0x190 kasan_check_range+0x10f/0x1e0 sock_map_free+0x10e/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> Allocated by task 1202: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 __kasan_slab_alloc+0x85/0x90 kmem_cache_alloc_noprof+0x131/0x450 sk_prot_alloc+0x5b/0x220 sk_alloc+0x2c/0x870 unix_create1+0x88/0x8a0 unix_create+0xc5/0x180 __sock_create+0x241/0x650 __sys_socketpair+0x1ce/0x420 __x64_sys_socketpair+0x92/0x100 do_syscall_64+0x93/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 46: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 kasan_save_free_info+0x37/0x60 __kasan_slab_free+0x4b/0x70 kmem_cache_free+0x1a1/0x590 __sk_destruct+0x388/0x5a0 sk_psock_destroy+0x73e/0xa50 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 The buggy address belongs to the object at ffff88811f5b9080 which belongs to the cache UNIX-STREAM of size 1984 The buggy address is located 128 bytes inside of freed 1984-byte region [ffff88811f5b9080, ffff88811f5b9840) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11f5b8 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 memcg:ffff888127d49401 flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) page_type: f5(slab) raw: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 raw: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 head: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000003 ffffea00047d6e01 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88811f5b9000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88811f5b9080: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88811f5b9180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88811f5b9200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Disabling lock debugging due to kernel taint refcount_t: addition on 0; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:25 refcount_warn_saturate+0xce/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B 6.12.0+ torvalds#125 Tainted: [B]=BAD_PAGE Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xce/0x150 Code: 34 73 eb 03 01 e8 82 53 ad fe 0f 0b eb b1 80 3d 27 73 eb 03 00 75 a8 48 c7 c7 80 bd 95 84 c6 05 17 73 eb 03 01 e8 62 53 ad fe <0f> 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000002 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xce/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xce/0x150 sock_map_free+0x2e5/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 refcount_t: underflow; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:28 refcount_warn_saturate+0xee/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B W 6.12.0+ torvalds#125 Tainted: [B]=BAD_PAGE, [W]=WARN Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xee/0x150 Code: 17 73 eb 03 01 e8 62 53 ad fe 0f 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 f6 72 eb 03 01 e8 42 53 ad fe <0f> 0b e9 6e ff ff ff 80 3d e6 72 eb 03 00 0f 85 61 ff ff ff 48 c7 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000003 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xee/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xee/0x150 sock_map_free+0x2d3/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 Fixes: 604326b ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Michal Luczaj <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
mj22226
pushed a commit
to mj22226/linux
that referenced
this pull request
Jan 10, 2025
commit ed1fc5d upstream. Element replace (with a socket different from the one stored) may race with socket's close() link popping & unlinking. __sock_map_delete() unconditionally unrefs the (wrong) element: // set map[0] = s0 map_update_elem(map, 0, s0) // drop fd of s0 close(s0) sock_map_close() lock_sock(sk) (s0!) sock_map_remove_links(sk) link = sk_psock_link_pop() sock_map_unlink(sk, link) sock_map_delete_from_link // replace map[0] with s1 map_update_elem(map, 0, s1) sock_map_update_elem (s1!) lock_sock(sk) sock_map_update_common psock = sk_psock(sk) spin_lock(&stab->lock) osk = stab->sks[idx] sock_map_add_link(..., &stab->sks[idx]) sock_map_unref(osk, &stab->sks[idx]) psock = sk_psock(osk) sk_psock_put(sk, psock) if (refcount_dec_and_test(&psock)) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) unlock_sock(sk) __sock_map_delete spin_lock(&stab->lock) sk = *psk // s1 replaced s0; sk == s1 if (!sk_test || sk_test == sk) // sk_test (s0) != sk (s1); no branch sk = xchg(psk, NULL) if (sk) sock_map_unref(sk, psk) // unref s1; sks[idx] will dangle psock = sk_psock(sk) sk_psock_put(sk, psock) if (refcount_dec_and_test()) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) release_sock(sk) Then close(map) enqueues bpf_map_free_deferred, which finally calls sock_map_free(). This results in some refcount_t warnings along with a KASAN splat [1]. Fix __sock_map_delete(), do not allow sock_map_unref() on elements that may have been replaced. [1]: BUG: KASAN: slab-use-after-free in sock_map_free+0x10e/0x330 Write of size 4 at addr ffff88811f5b9100 by task kworker/u64:12/1063 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Not tainted 6.12.0+ torvalds#125 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred Call Trace: <TASK> dump_stack_lvl+0x68/0x90 print_report+0x174/0x4f6 kasan_report+0xb9/0x190 kasan_check_range+0x10f/0x1e0 sock_map_free+0x10e/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> Allocated by task 1202: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 __kasan_slab_alloc+0x85/0x90 kmem_cache_alloc_noprof+0x131/0x450 sk_prot_alloc+0x5b/0x220 sk_alloc+0x2c/0x870 unix_create1+0x88/0x8a0 unix_create+0xc5/0x180 __sock_create+0x241/0x650 __sys_socketpair+0x1ce/0x420 __x64_sys_socketpair+0x92/0x100 do_syscall_64+0x93/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 46: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 kasan_save_free_info+0x37/0x60 __kasan_slab_free+0x4b/0x70 kmem_cache_free+0x1a1/0x590 __sk_destruct+0x388/0x5a0 sk_psock_destroy+0x73e/0xa50 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 The buggy address belongs to the object at ffff88811f5b9080 which belongs to the cache UNIX-STREAM of size 1984 The buggy address is located 128 bytes inside of freed 1984-byte region [ffff88811f5b9080, ffff88811f5b9840) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11f5b8 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 memcg:ffff888127d49401 flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) page_type: f5(slab) raw: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 raw: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 head: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000003 ffffea00047d6e01 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88811f5b9000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88811f5b9080: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88811f5b9180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88811f5b9200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Disabling lock debugging due to kernel taint refcount_t: addition on 0; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:25 refcount_warn_saturate+0xce/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B 6.12.0+ torvalds#125 Tainted: [B]=BAD_PAGE Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xce/0x150 Code: 34 73 eb 03 01 e8 82 53 ad fe 0f 0b eb b1 80 3d 27 73 eb 03 00 75 a8 48 c7 c7 80 bd 95 84 c6 05 17 73 eb 03 01 e8 62 53 ad fe <0f> 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000002 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xce/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xce/0x150 sock_map_free+0x2e5/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 refcount_t: underflow; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:28 refcount_warn_saturate+0xee/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B W 6.12.0+ torvalds#125 Tainted: [B]=BAD_PAGE, [W]=WARN Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xee/0x150 Code: 17 73 eb 03 01 e8 62 53 ad fe 0f 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 f6 72 eb 03 01 e8 42 53 ad fe <0f> 0b e9 6e ff ff ff 80 3d e6 72 eb 03 00 0f 85 61 ff ff ff 48 c7 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000003 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xee/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xee/0x150 sock_map_free+0x2d3/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 Fixes: 604326b ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Michal Luczaj <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Alva Lan <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
mj22226
pushed a commit
to mj22226/linux
that referenced
this pull request
Jan 15, 2025
commit ed1fc5d upstream. Element replace (with a socket different from the one stored) may race with socket's close() link popping & unlinking. __sock_map_delete() unconditionally unrefs the (wrong) element: // set map[0] = s0 map_update_elem(map, 0, s0) // drop fd of s0 close(s0) sock_map_close() lock_sock(sk) (s0!) sock_map_remove_links(sk) link = sk_psock_link_pop() sock_map_unlink(sk, link) sock_map_delete_from_link // replace map[0] with s1 map_update_elem(map, 0, s1) sock_map_update_elem (s1!) lock_sock(sk) sock_map_update_common psock = sk_psock(sk) spin_lock(&stab->lock) osk = stab->sks[idx] sock_map_add_link(..., &stab->sks[idx]) sock_map_unref(osk, &stab->sks[idx]) psock = sk_psock(osk) sk_psock_put(sk, psock) if (refcount_dec_and_test(&psock)) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) unlock_sock(sk) __sock_map_delete spin_lock(&stab->lock) sk = *psk // s1 replaced s0; sk == s1 if (!sk_test || sk_test == sk) // sk_test (s0) != sk (s1); no branch sk = xchg(psk, NULL) if (sk) sock_map_unref(sk, psk) // unref s1; sks[idx] will dangle psock = sk_psock(sk) sk_psock_put(sk, psock) if (refcount_dec_and_test()) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) release_sock(sk) Then close(map) enqueues bpf_map_free_deferred, which finally calls sock_map_free(). This results in some refcount_t warnings along with a KASAN splat [1]. Fix __sock_map_delete(), do not allow sock_map_unref() on elements that may have been replaced. [1]: BUG: KASAN: slab-use-after-free in sock_map_free+0x10e/0x330 Write of size 4 at addr ffff88811f5b9100 by task kworker/u64:12/1063 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Not tainted 6.12.0+ torvalds#125 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred Call Trace: <TASK> dump_stack_lvl+0x68/0x90 print_report+0x174/0x4f6 kasan_report+0xb9/0x190 kasan_check_range+0x10f/0x1e0 sock_map_free+0x10e/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> Allocated by task 1202: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 __kasan_slab_alloc+0x85/0x90 kmem_cache_alloc_noprof+0x131/0x450 sk_prot_alloc+0x5b/0x220 sk_alloc+0x2c/0x870 unix_create1+0x88/0x8a0 unix_create+0xc5/0x180 __sock_create+0x241/0x650 __sys_socketpair+0x1ce/0x420 __x64_sys_socketpair+0x92/0x100 do_syscall_64+0x93/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 46: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 kasan_save_free_info+0x37/0x60 __kasan_slab_free+0x4b/0x70 kmem_cache_free+0x1a1/0x590 __sk_destruct+0x388/0x5a0 sk_psock_destroy+0x73e/0xa50 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 The buggy address belongs to the object at ffff88811f5b9080 which belongs to the cache UNIX-STREAM of size 1984 The buggy address is located 128 bytes inside of freed 1984-byte region [ffff88811f5b9080, ffff88811f5b9840) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11f5b8 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 memcg:ffff888127d49401 flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) page_type: f5(slab) raw: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 raw: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 head: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000003 ffffea00047d6e01 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88811f5b9000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88811f5b9080: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88811f5b9180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88811f5b9200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Disabling lock debugging due to kernel taint refcount_t: addition on 0; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:25 refcount_warn_saturate+0xce/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B 6.12.0+ torvalds#125 Tainted: [B]=BAD_PAGE Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xce/0x150 Code: 34 73 eb 03 01 e8 82 53 ad fe 0f 0b eb b1 80 3d 27 73 eb 03 00 75 a8 48 c7 c7 80 bd 95 84 c6 05 17 73 eb 03 01 e8 62 53 ad fe <0f> 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000002 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xce/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xce/0x150 sock_map_free+0x2e5/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 refcount_t: underflow; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:28 refcount_warn_saturate+0xee/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B W 6.12.0+ torvalds#125 Tainted: [B]=BAD_PAGE, [W]=WARN Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xee/0x150 Code: 17 73 eb 03 01 e8 62 53 ad fe 0f 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 f6 72 eb 03 01 e8 42 53 ad fe <0f> 0b e9 6e ff ff ff 80 3d e6 72 eb 03 00 0f 85 61 ff ff ff 48 c7 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000003 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xee/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xee/0x150 sock_map_free+0x2d3/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 Fixes: 604326b ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Michal Luczaj <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Alva Lan <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
staging-kernelci-org
pushed a commit
to kernelci/linux
that referenced
this pull request
Jan 18, 2025
commit ed1fc5d upstream. Element replace (with a socket different from the one stored) may race with socket's close() link popping & unlinking. __sock_map_delete() unconditionally unrefs the (wrong) element: // set map[0] = s0 map_update_elem(map, 0, s0) // drop fd of s0 close(s0) sock_map_close() lock_sock(sk) (s0!) sock_map_remove_links(sk) link = sk_psock_link_pop() sock_map_unlink(sk, link) sock_map_delete_from_link // replace map[0] with s1 map_update_elem(map, 0, s1) sock_map_update_elem (s1!) lock_sock(sk) sock_map_update_common psock = sk_psock(sk) spin_lock(&stab->lock) osk = stab->sks[idx] sock_map_add_link(..., &stab->sks[idx]) sock_map_unref(osk, &stab->sks[idx]) psock = sk_psock(osk) sk_psock_put(sk, psock) if (refcount_dec_and_test(&psock)) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) unlock_sock(sk) __sock_map_delete spin_lock(&stab->lock) sk = *psk // s1 replaced s0; sk == s1 if (!sk_test || sk_test == sk) // sk_test (s0) != sk (s1); no branch sk = xchg(psk, NULL) if (sk) sock_map_unref(sk, psk) // unref s1; sks[idx] will dangle psock = sk_psock(sk) sk_psock_put(sk, psock) if (refcount_dec_and_test()) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) release_sock(sk) Then close(map) enqueues bpf_map_free_deferred, which finally calls sock_map_free(). This results in some refcount_t warnings along with a KASAN splat [1]. Fix __sock_map_delete(), do not allow sock_map_unref() on elements that may have been replaced. [1]: BUG: KASAN: slab-use-after-free in sock_map_free+0x10e/0x330 Write of size 4 at addr ffff88811f5b9100 by task kworker/u64:12/1063 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Not tainted 6.12.0+ torvalds#125 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred Call Trace: <TASK> dump_stack_lvl+0x68/0x90 print_report+0x174/0x4f6 kasan_report+0xb9/0x190 kasan_check_range+0x10f/0x1e0 sock_map_free+0x10e/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> Allocated by task 1202: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 __kasan_slab_alloc+0x85/0x90 kmem_cache_alloc_noprof+0x131/0x450 sk_prot_alloc+0x5b/0x220 sk_alloc+0x2c/0x870 unix_create1+0x88/0x8a0 unix_create+0xc5/0x180 __sock_create+0x241/0x650 __sys_socketpair+0x1ce/0x420 __x64_sys_socketpair+0x92/0x100 do_syscall_64+0x93/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 46: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 kasan_save_free_info+0x37/0x60 __kasan_slab_free+0x4b/0x70 kmem_cache_free+0x1a1/0x590 __sk_destruct+0x388/0x5a0 sk_psock_destroy+0x73e/0xa50 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 The buggy address belongs to the object at ffff88811f5b9080 which belongs to the cache UNIX-STREAM of size 1984 The buggy address is located 128 bytes inside of freed 1984-byte region [ffff88811f5b9080, ffff88811f5b9840) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11f5b8 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 memcg:ffff888127d49401 flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) page_type: f5(slab) raw: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 raw: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 head: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000003 ffffea00047d6e01 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88811f5b9000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88811f5b9080: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88811f5b9180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88811f5b9200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Disabling lock debugging due to kernel taint refcount_t: addition on 0; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:25 refcount_warn_saturate+0xce/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B 6.12.0+ torvalds#125 Tainted: [B]=BAD_PAGE Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xce/0x150 Code: 34 73 eb 03 01 e8 82 53 ad fe 0f 0b eb b1 80 3d 27 73 eb 03 00 75 a8 48 c7 c7 80 bd 95 84 c6 05 17 73 eb 03 01 e8 62 53 ad fe <0f> 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000002 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xce/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xce/0x150 sock_map_free+0x2e5/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 refcount_t: underflow; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:28 refcount_warn_saturate+0xee/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B W 6.12.0+ torvalds#125 Tainted: [B]=BAD_PAGE, [W]=WARN Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xee/0x150 Code: 17 73 eb 03 01 e8 62 53 ad fe 0f 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 f6 72 eb 03 01 e8 42 53 ad fe <0f> 0b e9 6e ff ff ff 80 3d e6 72 eb 03 00 0f 85 61 ff ff ff 48 c7 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000003 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xee/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xee/0x150 sock_map_free+0x2d3/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 Fixes: 604326b ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Michal Luczaj <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Alva Lan <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Issue description:
During writing the Linux Kernel image on /dev/mtdX device the burning is failed due to I/O write or erase error:
Erasing blocks: 6/12 (50%)
While erasing blocks 0x000a0000-0x000c0000 on /dev/mtd1: Input/output error
Burning the /persistent/ISF.img image on /dev/mtd1 failed.
When doing erase, writing or verifying using FLASHCP program, the cfi mtd functions of Linux Kernel MTD driver can returns too early. The function chip_ready() return OK status while the actual writing or erasing is not finished on NOR flash device. It leads to I/O error and fail in writing, erasing or verifying of the data during working with MTD devices.
When the issue with writing is reproduced we compared the data of LinuxKERNEL.img and the data which was written on the /dev/mtdX device:
dd if=/dev/mtd3 bs=2367008 count=1 > /persistent/dest.buf
hexdump /persistent/dest.buf > /persistent/dest.hex
hexdump /persistent/LinuxKERNEL.img > /persistent/source.hex
diff -purN /persistent/dest.hex /persistent/source.hex
And it was observed that the driver didn't write some blocks at all (0xffff instead of real data in some places). Flash driver skip writing or hasn't been finished it during the specific time.
I found that in Internet I am not alone in this world, but I didn't find some solution for this problem which doesn't lead to increasing of the time of writing. So I investigate the writing and found that there is no any check that the data is really was written on the Flash. So the proposed solution is to compare the given last byte with written one. We have run a long stubility during several weeks on our side with Switch Board using writing on NOR flash and the problem is not reproduced.
FREQUENCY:
1%
The I/O error appears about 1 time during 3-4 hours writing on NOR flash. But it was not possible to determine exactly at what time the issue can happen.
SYSTEM IMPACT:
fails to write/erase on NOR flash
FIX DESCRIPTION:
Wait until the following checks return positive results during writing/erasing on NOR Flash:
TEST DESCRIPTION:
Start writing some data on NOR FLASH (mtd device) using flashcp or start erasing using flash_erase.