Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add rust cfg file to kbuild #23

Closed
wants to merge 6 commits into from
Closed

Conversation

kloenk
Copy link
Member

@kloenk kloenk commented Oct 10, 2020

Fixes #22

Contains #28

Signed-off-by: Finn Behrens <[email protected]>
@kloenk
Copy link
Member Author

kloenk commented Oct 10, 2020

I created a file containing --cfg arguments via the kbuild c script.
rustc can take this file if we give it via @. We just have to figure out the absolute (or relative path from where rustc is invoked) to that file an append it to the rustc arguments

@kloenk
Copy link
Member Author

kloenk commented Oct 10, 2020

@ojeda
The ci does seem to fail because of other failures (404?) maybe someone can restart the job if we are lucky this helps?

@ojeda
Copy link
Member

ojeda commented Oct 10, 2020

Yeah, something odd is going on in Azure (re-ran them now). Let me invite you to the organization so that you can re-run them too.

@kloenk kloenk added this to the First PR milestone Oct 10, 2020
@kloenk kloenk mentioned this pull request Oct 11, 2020
@kloenk
Copy link
Member Author

kloenk commented Oct 17, 2020

This somehow expanded to get it to build in general, so could take some time

@kloenk kloenk force-pushed the rust_cfg branch 2 times, most recently from 7c76515 to cbe5dcf Compare October 17, 2020 16:24
Co-authored-by: Matthias Beyer <[email protected]>
Signed-off-by: Finn Behrens <[email protected]>
@kloenk
Copy link
Member Author

kloenk commented Oct 18, 2020

I somehow destroyed the symbols. Not yet sure how.

@kloenk
Copy link
Member Author

kloenk commented Oct 25, 2020

Seems like that for some reasons make does not pass --cfg=MODULE to rustc, not sure why.

Does somebody with more makefile expertise want to look at it?

This is currenlty hacky

Signed-off-by: Finn Behrens <[email protected]>
@kloenk kloenk marked this pull request as ready for review October 29, 2020 21:30
@kloenk
Copy link
Member Author

kloenk commented Oct 29, 2020

Seems like it works now.

@kloenk
Copy link
Member Author

kloenk commented Oct 29, 2020

Now builtin fails

@ojeda
Copy link
Member

ojeda commented Oct 29, 2020

Now builtin fails

It might be the #20 issue that re-surfaces again here (I haven't taken a look at solving that panic yet, sorry).

@kloenk
Copy link
Member Author

kloenk commented Oct 30, 2020

I think it's something else. The panic from #20 prints output from rust. here it panics before printing module information.

It seems like this is the problem (unfortunately I don't know yet how to debug the kerne, but will try to learn)

[    0.567597] BUG: kernel NULL pointer dereference, address: 0000000000000004 

@kloenk
Copy link
Member Author

kloenk commented Nov 2, 2020

Sadly I cannot build the kernel with debug symbols, which make it harder to debug. Not sure why it fails, maybe some broken ld or so.

@kloenk
Copy link
Member Author

kloenk commented Nov 3, 2020

This also failes in v5.10-rc2.

Do you @ojeda know on which list or bugzilla I should report that bug? linux-kbuild or linux-kernel?

@ojeda
Copy link
Member

ojeda commented Nov 3, 2020

This also failes in v5.10-rc2.

Do you @ojeda know on which list or bugzilla I should report that bug? linux-kbuild or linux-kernel?

We would need to reproduce it without all our other code first, i.e. in the mainline kernel.

Sadly I cannot build the kernel with debug symbols, which make it harder to debug. Not sure why it fails, maybe some broken ld or so.

What do you mean? ld gives an error?

@kloenk
Copy link
Member Author

kloenk commented Nov 3, 2020

We would need to reproduce it without all our other code first, i.e. in the mainline kernel.

Took a clean v5.10-rc2 and build from there.

What do you mean? ld gives an error?

I see this error. Not sure if my ld is broken, or what the problem is.
This only occurred for me when building with DEBUG y.

  CHK     include/generated/compile.h
  LD [M]  fs/efivarfs/efivarfs.o
/nix/store/p792j5f44l3f0xi7ai5jllwnxqwnka88-binutils-2.31.1/bin/ld: fs/efivarfs/inode.o: unable to initialize decompress status for section .debug_loc
/nix/store/p792j5f44l3f0xi7ai5jllwnxqwnka88-binutils-2.31.1/bin/ld: fs/efivarfs/inode.o: unable to initialize decompress status for section .debug_loc
/nix/store/p792j5f44l3f0xi7ai5jllwnxqwnka88-binutils-2.31.1/bin/ld: fs/efivarfs/inode.o: unable to initialize decompress status for section .debug_loc
/nix/store/p792j5f44l3f0xi7ai5jllwnxqwnka88-binutils-2.31.1/bin/ld: fs/efivarfs/inode.o: unable to initialize decompress status for section .debug_loc
fs/efivarfs/inode.o: file not recognized: file format not recognized
make[3]: *** [../scripts/Makefile.build:434: fs/efivarfs/efivarfs.o] Error 1
make[2]: *** [../scripts/Makefile.build:500: fs/efivarfs] Error 2
make[1]: *** [/persist/data/kloenk/proj/linux/Makefile:1799: fs] Error 2
make[1]: *** Waiting for unfinished jobs....
make[1]: Leaving directory '/persist/data/kloenk/proj/linux/build'
make: *** [Makefile:185: __sub-make] Error 2

(The /nix/store path is just where ld is in my path)

@ojeda
Copy link
Member

ojeda commented Nov 3, 2020

Took a clean v5.10-rc2 and build from there.

Ah, so this is another thing? To clarify, we have 3 issues now then?

I see this error. Not sure if my ld is broken, or what the problem is.
This only occurred for me when building with DEBUG y.

/nix/store/p792j5f44l3f0xi7ai5jllwnxqwnka88-binutils-2.31.1/bin/ld: fs/efivarfs/inode.o: unable to initialize decompress status for section .debug_loc

That looks like https://wiki.gentoo.org/wiki/Binutils_2.32_upgrade_notes/elfutils_0.175:_unable_to_initialize_decompress_status_for_section_.debug_info (and similar reports). Can you please try with a newer binutils? (or with LLVM -- if you have version 10.0.1 at least, try passing LLVM=1 to make)

@kloenk
Copy link
Member Author

kloenk commented Nov 4, 2020

Took a clean v5.10-rc2 and build from there.

Ah, so this is another thing? To clarify, we have 3 issues now then?

Yes, I see three errors.

I see this error. Not sure if my ld is broken, or what the problem is.
This only occurred for me when building with DEBUG y.

/nix/store/p792j5f44l3f0xi7ai5jllwnxqwnka88-binutils-2.31.1/bin/ld: fs/efivarfs/inode.o: unable to initialize decompress status for section .debug_loc

That looks like https://wiki.gentoo.org/wiki/Binutils_2.32_upgrade_notes/elfutils_0.175:_unable_to_initialize_decompress_status_for_section_.debug_info (and similar reports). Can you please try with a newer binutils? (or with LLVM -- if you have version 10.0.1 at least, try passing LLVM=1 to make)

I'm using LLVM 11. Will try later with LLVM=1, currently only used CC=clang. Newer binutils was not available the last time I checked. Will also look later.

@ojeda
Copy link
Member

ojeda commented Nov 4, 2020

I'm using LLVM 11. Will try later with LLVM=1, currently only used CC=clang. Newer binutils was not available the last time I checked. Will also look later.

Thanks! Yeah, CC=clang is only for the C compiler. LLVM=1 is a shorthand for CC=clang plus the rest of the toolchain, including lld etc.

@kloenk
Copy link
Member Author

kloenk commented Nov 19, 2020

I don't have enough time at hand to learn kernel debugging. Has anyone time to debug this error? or do we want to disable the builtins test for now, and say that it cannot be a builtin kernel module and fix it later?

@ojeda
Copy link
Member

ojeda commented Nov 27, 2020

Thanks for trying! We could have been lucky. I will take a look at this one then.

@kloenk
Copy link
Member Author

kloenk commented Nov 27, 2020

I saw a pagefault. The stacktrace contains something about unit, but I saw no errors regarding do_one_initcall

…nux#20 panic)

Some kernel configs were panicking at the Rust example driver
initialization. The `__MOD` static value had a bogus value,
which meant that trying to initialize it was dropping the object
that was, supposedly, there.

The memory corruption happened during rootfs unpacking,
which explains why it only happened in some configs (like in CI)
and why it also didn't happen if there was an early error
during unpacking.

That memory corruption, in turn, was caused because the
`__MOD` symbol was being placed after the end of the
kernel reserve. That happened due to the kernel's linker
script not supporting unique sections per symbol for dead
code data elimination -- yet. Some arches do, but until
we can rely on that, we need to disable their generation
in rustc's side for the moment.

Since we discussed to have the target spec on our side,
and since `-Z function-sections=false` was added just
a month ago, I went with the spec route.

Other symbols were being placed in unexpected places,
which should be fixed now too.

Signed-off-by: Miguel Ojeda <[email protected]>
@kloenk
Copy link
Member Author

kloenk commented Nov 27, 2020

pushed, so you have my current local state.

scripts/ld-version.sh Outdated Show resolved Hide resolved
gawk Outdated Show resolved Hide resolved
Signed-off-by: Finn Behrens <[email protected]>
@ojeda
Copy link
Member

ojeda commented Nov 27, 2020

pushed, so you have my current local state.

Thanks a lot -- I was about to ask how to reproduce it. So these changes should be enough?

@kloenk
Copy link
Member Author

kloenk commented Nov 27, 2020

Yes, I only have those awk patches added to my local version, so I can build it under nixos

@ojeda
Copy link
Member

ojeda commented Nov 28, 2020

OK, this one was easy. It is not a problem with the current HEAD. Instead, things got broken here because we didn't have the "naked" CONFIG_Xs I mentioned in #22, so the module! macro was not seeing CONFIG_HAVE_ARCH_PREL32_RELOCATIONS and therefore it was breaking the initcall table.

I took the liberty to reduce the differences w.r.t. the current HEAD, so I am sending a PR (#32) with you as author, but already having this fixed. Same for the bindgen proposal part, since you already had it done (b91aff3). With those two things out of the way, then the only differences that remain here are your O=... fixes.

ojeda added a commit that referenced this pull request Nov 28, 2020
At the moment, any mistake passing the kernel configuration
for conditional compilation means we break things silently,
like initcall handling which implies panics or memory
corruption [1].

Let's prevent that from happening again.

[1] #23

Signed-off-by: Miguel Ojeda <[email protected]>
@kloenk
Copy link
Member Author

kloenk commented Nov 28, 2020

Will reopen as new pr, with only the O= thing

@kloenk kloenk closed this Nov 28, 2020
This was referenced Nov 28, 2020
ojeda pushed a commit that referenced this pull request Feb 28, 2022
When bringing down the netdevice or system shutdown, a panic can be
triggered while accessing the sysfs path because the device is already
removed.

    [  755.549084] mlx5_core 0000:12:00.1: Shutdown was called
    [  756.404455] mlx5_core 0000:12:00.0: Shutdown was called
    ...
    [  757.937260] BUG: unable to handle kernel NULL pointer dereference at           (null)
    [  758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280

    crash> bt
    ...
    PID: 12649  TASK: ffff8924108f2100  CPU: 1   COMMAND: "amsd"
    ...
     #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778
        [exception RIP: dma_pool_alloc+0x1ab]
        RIP: ffffffff8ee11acb  RSP: ffff89240e1a3968  RFLAGS: 00010046
        RAX: 0000000000000246  RBX: ffff89243d874100  RCX: 0000000000001000
        RDX: 0000000000000000  RSI: 0000000000000246  RDI: ffff89243d874090
        RBP: ffff89240e1a39c0   R8: 000000000001f080   R9: ffff8905ffc03c00
        R10: ffffffffc04680d4  R11: ffffffff8edde9fd  R12: 00000000000080d0
        R13: ffff89243d874090  R14: ffff89243d874080  R15: 0000000000000000
        ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
    #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core]
    #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core]
    #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core]
    #13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core]
    #14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core]
    #15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core]
    #16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core]
    #17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46
    #18 [ffff89240e1a3d48] speed_show at ffffffff8f277208
    #19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3
    #20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf
    #21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596
    #22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10
    #23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5
    #24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff
    #25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f
    #26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92

    crash> net_device.state ffff89443b0c0000
      state = 0x5  (__LINK_STATE_START| __LINK_STATE_NOCARRIER)

To prevent this scenario, we also make sure that the netdevice is present.

Signed-off-by: suresh kumar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
ojeda pushed a commit that referenced this pull request May 4, 2022
We enabled UBSAN in the ubuntu kernel, and the cs35l41 driver triggers
a warning calltrace like below:

cs35l41-hda i2c-CSC3551:00-cs35l41-hda.0: bitoffset= 8, word_offset=23, bit_sum mod 32=0, otp_map[i].size = 24
cs35l41-hda i2c-CSC3551:00-cs35l41-hda.0: bitoffset= 0, word_offset=24, bit_sum mod 32=24, otp_map[i].size = 0
================================================================================
UBSAN: shift-out-of-bounds in linux-kernel-src/sound/soc/codecs/cs35l41-lib.c:836:8
shift exponent 64 is too large for 64-bit type 'long unsigned int'
CPU: 10 PID: 595 Comm: systemd-udevd Not tainted 5.15.0-23-generic #23
Hardware name: LENOVO \x02MFG_IN_GO/\x02MFG_IN_GO, BIOS N3GET19W (1.00 ) 03/11/2022
Call Trace:
 <TASK>
 show_stack+0x52/0x58
 dump_stack_lvl+0x4a/0x5f
 dump_stack+0x10/0x12
 ubsan_epilogue+0x9/0x45
 __ubsan_handle_shift_out_of_bounds.cold+0x61/0xef
 ? regmap_unlock_mutex+0xe/0x10
 cs35l41_otp_unpack.cold+0x1c6/0x2b2 [snd_soc_cs35l41_lib]
 cs35l41_hda_probe+0x24f/0x33a [snd_hda_scodec_cs35l41]
 cs35l41_hda_i2c_probe+0x65/0x90 [snd_hda_scodec_cs35l41_i2c]

When both bitoffset and otp_map[i].size are 0, the line 836 will
result in GENMASK(-1, 0), this triggers the shift-out-of-bounds
calltrace.

Here add a checking, if both bitoffset and otp_map[i].size are 0,
do not run GENMASK() and directly set otp_val to 0, this will not
bring any function change on the driver but could avoid the calltrace.

Signed-off-by: Hui Wang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
wedsonaf pushed a commit to wedsonaf/linux that referenced this pull request Oct 5, 2022
After modifying the QP to the Error state, all RX WR would be completed
with WC in IB_WC_WR_FLUSH_ERR status. Current implementation does not
wait for it is done, but destroy the QP and free the link group directly.
So there is a risk that accessing the freed memory in tasklet context.

Here is a crash example:

 BUG: unable to handle page fault for address: ffffffff8f220860
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD f7300e067 P4D f7300e067 PUD f7300f063 PMD 8c4e45063 PTE 800ffff08c9df060
 Oops: 0002 [Rust-for-Linux#1] SMP PTI
 CPU: 1 PID: 0 Comm: swapper/1 Kdump: loaded Tainted: G S         OE     5.10.0-0607+ Rust-for-Linux#23
 Hardware name: Inspur NF5280M4/YZMB-00689-101, BIOS 4.1.20 07/09/2018
 RIP: 0010:native_queued_spin_lock_slowpath+0x176/0x1b0
 Code: f3 90 48 8b 32 48 85 f6 74 f6 eb d5 c1 ee 12 83 e0 03 83 ee 01 48 c1 e0 05 48 63 f6 48 05 00 c8 02 00 48 03 04 f5 00 09 98 8e <48> 89 10 8b 42 08 85 c0 75 09 f3 90 8b 42 08 85 c0 74 f7 48 8b 32
 RSP: 0018:ffffb3b6c001ebd8 EFLAGS: 00010086
 RAX: ffffffff8f220860 RBX: 0000000000000246 RCX: 0000000000080000
 RDX: ffff91db1f86c800 RSI: 000000000000173c RDI: ffff91db62bace00
 RBP: ffff91db62bacc00 R08: 0000000000000000 R09: c00000010000028b
 R10: 0000000000055198 R11: ffffb3b6c001ea58 R12: ffff91db80e05010
 R13: 000000000000000a R14: 0000000000000006 R15: 0000000000000040
 FS:  0000000000000000(0000) GS:ffff91db1f840000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffffffff8f220860 CR3: 00000001f9580004 CR4: 00000000003706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  <IRQ>
  _raw_spin_lock_irqsave+0x30/0x40
  mlx5_ib_poll_cq+0x4c/0xc50 [mlx5_ib]
  smc_wr_rx_tasklet_fn+0x56/0xa0 [smc]
  tasklet_action_common.isra.21+0x66/0x100
  __do_softirq+0xd5/0x29c
  asm_call_irq_on_stack+0x12/0x20
  </IRQ>
  do_softirq_own_stack+0x37/0x40
  irq_exit_rcu+0x9d/0xa0
  sysvec_call_function_single+0x34/0x80
  asm_sysvec_call_function_single+0x12/0x20

Fixes: bd4ad57 ("smc: initialize IB transport incl. PD, MR, QP, CQ, event, WR")
Signed-off-by: Yacan Liu <[email protected]>
Reviewed-by: Tony Lu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Darksonn pushed a commit to Darksonn/linux that referenced this pull request Oct 19, 2022
Fix port I/O string accessors such as `insb', `outsb', etc. which use
the physical PCI port I/O address rather than the corresponding memory
mapping to get at the requested location, which in turn breaks at least
accesses made by our parport driver to a PCIe parallel port such as:

PCI parallel port detected: 1415:c118, I/O at 0x1000(0x1008), IRQ 20
parport0: PC-style at 0x1000 (0x1008), irq 20, using FIFO [PCSPP,TRISTATE,COMPAT,EPP,ECP]

causing a memory access fault:

Unable to handle kernel access to user memory without uaccess routines at virtual address 0000000000001008
Oops [Rust-for-Linux#1]
Modules linked in:
CPU: 1 PID: 350 Comm: cat Not tainted 6.0.0-rc2-00283-g10d4879f9ef0-dirty Rust-for-Linux#23
Hardware name: SiFive HiFive Unmatched A00 (DT)
epc : parport_pc_fifo_write_block_pio+0x266/0x416
 ra : parport_pc_fifo_write_block_pio+0xb4/0x416
epc : ffffffff80542c3e ra : ffffffff80542a8c sp : ffffffd88899fc60
 gp : ffffffff80fa2700 tp : ffffffd882b1e900 t0 : ffffffd883d0b000
 t1 : ffffffffff000002 t2 : 4646393043330a38 s0 : ffffffd88899fcf0
 s1 : 0000000000001000 a0 : 0000000000000010 a1 : 0000000000000000
 a2 : ffffffd883d0a010 a3 : 0000000000000023 a4 : 00000000ffff8fbb
 a5 : ffffffd883d0a001 a6 : 0000000100000000 a7 : ffffffc800000000
 s2 : ffffffffff000002 s3 : ffffffff80d28880 s4 : ffffffff80fa1f50
 s5 : 0000000000001008 s6 : 0000000000000008 s7 : ffffffd883d0a000
 s8 : 0004000000000000 s9 : ffffffff80dc1d80 s10: ffffffd8807e4000
 s11: 0000000000000000 t3 : 00000000000000ff t4 : 393044410a303930
 t5 : 0000000000001000 t6 : 0000000000040000
status: 0000000200000120 badaddr: 0000000000001008 cause: 000000000000000f
[<ffffffff80543212>] parport_pc_compat_write_block_pio+0xfe/0x200
[<ffffffff8053bbc0>] parport_write+0x46/0xf8
[<ffffffff8050530e>] lp_write+0x158/0x2d2
[<ffffffff80185716>] vfs_write+0x8e/0x2c2
[<ffffffff80185a74>] ksys_write+0x52/0xc2
[<ffffffff80185af2>] sys_write+0xe/0x16
[<ffffffff80003770>] ret_from_syscall+0x0/0x2
---[ end trace 0000000000000000 ]---

For simplicity address the problem by adding PCI_IOBASE to the physical
address requested in the respective wrapper macros only, observing that
the raw accessors such as `__insb', `__outsb', etc. are not supposed to
be used other than by said macros.  Remove the cast to `long' that is no
longer needed on `addr' now that it is used as an offset from PCI_IOBASE
and add parentheses around `addr' needed for predictable evaluation in
macro expansion.  No need to make said adjustments in separate changes
given that current code is gravely broken and does not ever work.

Signed-off-by: Maciej W. Rozycki <[email protected]>
Fixes: fab957c ("RISC-V: Atomic and Locking Code")
Cc: [email protected] # v4.15+
Reviewed-by: Arnd Bergmann <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
ojeda pushed a commit that referenced this pull request Apr 6, 2023
The UAF bug occurred because we were putting DFS root sessions in
cifs_umount() while DFS cache refresher was being executed.

Make DFS root sessions have same lifetime as DFS tcons so we can avoid
the use-after-free bug is DFS cache refresher and other places that
require IPCs to get new DFS referrals on.  Also, get rid of mount
group handling in DFS cache as we no longer need it.

This fixes below use-after-free bug catched by KASAN

[ 379.946955] BUG: KASAN: use-after-free in __refresh_tcon.isra.0+0x10b/0xc10 [cifs]
[ 379.947642] Read of size 8 at addr ffff888018f57030 by task kworker/u4:3/56
[ 379.948096]
[ 379.948208] CPU: 0 PID: 56 Comm: kworker/u4:3 Not tainted 6.2.0-rc7-lku #23
[ 379.948661] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.16.0-0-gd239552-rebuilt.opensuse.org 04/01/2014
[ 379.949368] Workqueue: cifs-dfscache refresh_cache_worker [cifs]
[ 379.949942] Call Trace:
[ 379.950113] <TASK>
[ 379.950260] dump_stack_lvl+0x50/0x67
[ 379.950510] print_report+0x16a/0x48e
[ 379.950759] ? __virt_addr_valid+0xd8/0x160
[ 379.951040] ? __phys_addr+0x41/0x80
[ 379.951285] kasan_report+0xdb/0x110
[ 379.951533] ? __refresh_tcon.isra.0+0x10b/0xc10 [cifs]
[ 379.952056] ? __refresh_tcon.isra.0+0x10b/0xc10 [cifs]
[ 379.952585] __refresh_tcon.isra.0+0x10b/0xc10 [cifs]
[ 379.953096] ? __pfx___refresh_tcon.isra.0+0x10/0x10 [cifs]
[ 379.953637] ? __pfx___mutex_lock+0x10/0x10
[ 379.953915] ? lock_release+0xb6/0x720
[ 379.954167] ? __pfx_lock_acquire+0x10/0x10
[ 379.954443] ? refresh_cache_worker+0x34e/0x6d0 [cifs]
[ 379.954960] ? __pfx_wb_workfn+0x10/0x10
[ 379.955239] refresh_cache_worker+0x4ad/0x6d0 [cifs]
[ 379.955755] ? __pfx_refresh_cache_worker+0x10/0x10 [cifs]
[ 379.956323] ? __pfx_lock_acquired+0x10/0x10
[ 379.956615] ? read_word_at_a_time+0xe/0x20
[ 379.956898] ? lockdep_hardirqs_on_prepare+0x12/0x220
[ 379.957235] process_one_work+0x535/0x990
[ 379.957509] ? __pfx_process_one_work+0x10/0x10
[ 379.957812] ? lock_acquired+0xb7/0x5f0
[ 379.958069] ? __list_add_valid+0x37/0xd0
[ 379.958341] ? __list_add_valid+0x37/0xd0
[ 379.958611] worker_thread+0x8e/0x630
[ 379.958861] ? __pfx_worker_thread+0x10/0x10
[ 379.959148] kthread+0x17d/0x1b0
[ 379.959369] ? __pfx_kthread+0x10/0x10
[ 379.959630] ret_from_fork+0x2c/0x50
[ 379.959879] </TASK>

Signed-off-by: Paulo Alcantara (SUSE) <[email protected]>
Cc: [email protected] # 6.2
Signed-off-by: Steve French <[email protected]>
ojeda pushed a commit that referenced this pull request Sep 12, 2023
Syzbot reported a null-ptr-deref of sqd->thread inside
io_sqpoll_wq_cpu_affinity.  It turns out the sqd->thread can go away
from under us during io_uring_register, in case the process gets a
fatal signal during io_uring_register.

It is not particularly hard to hit the race, and while I am not sure
this is the exact case hit by syzbot, it solves it.  Finally, checking
->thread is enough to close the race because we locked sqd while
"parking" the thread, thus preventing it from going away.

I reproduced it fairly consistently with a program that does:

int main(void) {
  ...
  io_uring_queue_init(RING_LEN, &ring1, IORING_SETUP_SQPOLL);
  while (1) {
    io_uring_register_iowq_aff(ring, 1, &mask);
  }
}

Executed in a loop with timeout to trigger SIGTERM:
  while true; do timeout 1 /a.out ; done

This will hit the following BUG() in very few attempts.

BUG: kernel NULL pointer dereference, address: 00000000000007a8
PGD 800000010e949067 P4D 800000010e949067 PUD 10e46e067 PMD 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 15715 Comm: dead-sqpoll Not tainted 6.5.0-rc7-next-20230825-g193296236fa0-dirty #23
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:io_sqpoll_wq_cpu_affinity+0x27/0x70
Code: 90 90 90 0f 1f 44 00 00 55 53 48 8b 9f 98 03 00 00 48 85 db 74 4f
48 89 df 48 89 f5 e8 e2 f8 ff ff 48 8b 43 38 48 85 c0 74 22 <48> 8b b8
a8 07 00 00 48 89 ee e8 ba b1 00 00 48 89 df 89 c5 e8 70
RSP: 0018:ffffb04040ea7e70 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff93c010749e40 RCX: 0000000000000001
RDX: 0000000000000000 RSI: ffffffffa7653331 RDI: 00000000ffffffff
RBP: ffffb04040ea7eb8 R08: 0000000000000000 R09: c0000000ffffdfff
R10: ffff93c01141b600 R11: ffffb04040ea7d18 R12: ffff93c00ea74840
R13: 0000000000000011 R14: 0000000000000000 R15: ffff93c00ea74800
FS:  00007fb7c276ab80(0000) GS:ffff93c36f200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000007a8 CR3: 0000000111634003 CR4: 0000000000370ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 ? __die_body+0x1a/0x60
 ? page_fault_oops+0x154/0x440
 ? do_user_addr_fault+0x174/0x7b0
 ? exc_page_fault+0x63/0x140
 ? asm_exc_page_fault+0x22/0x30
 ? io_sqpoll_wq_cpu_affinity+0x27/0x70
 __io_register_iowq_aff+0x2b/0x60
 __io_uring_register+0x614/0xa70
 __x64_sys_io_uring_register+0xaa/0x1a0
 do_syscall_64+0x3a/0x90
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
RIP: 0033:0x7fb7c226fec9
Code: 2e 00 b8 ca 00 00 00 0f 05 eb a5 66 0f 1f 44 00 00 48 89 f8 48 89
f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
f0 ff ff 73 01 c3 48 8b 0d 97 7f 2d 00 f7 d8 64 89 01 48
RSP: 002b:00007ffe2c0674f8 EFLAGS: 00000246 ORIG_RAX: 00000000000001ab
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb7c226fec9
RDX: 00007ffe2c067530 RSI: 0000000000000011 RDI: 0000000000000003
RBP: 00007ffe2c0675d0 R08: 00007ffe2c067550 R09: 00007ffe2c067550
R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffe2c067750 R14: 0000000000000000 R15: 0000000000000000
 </TASK>
Modules linked in:
CR2: 00000000000007a8
---[ end trace 0000000000000000 ]---

Reported-by: [email protected]
Fixes: ebdfefc ("io_uring/sqpoll: fix io-wq affinity when IORING_SETUP_SQPOLL is used")
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
fbq pushed a commit that referenced this pull request Sep 25, 2023
The following processes run into a deadlock. CPU 41 was waiting for CPU 29
to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29
was hung by that spinlock with IRQs disabled.

  PID: 17360    TASK: ffff95c1090c5c40  CPU: 41  COMMAND: "mrdiagd"
  !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0
  !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0
  !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0
   # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0
   # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0
   # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0
   # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0
   # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0
   # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0
   # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0
   #10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0
   #11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0
   #12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0
   #13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0
   #14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0
   #15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0
   #16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0
   #17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0
   #18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0
   #19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0
   #20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0
   #21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0
   #22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0
   #23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0
   #24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0
   #25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0
   #26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0
   #27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0

  PID: 17355    TASK: ffff95c1090c3d80  CPU: 29  COMMAND: "mrdiagd"
  !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0
  !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0
   # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0
   # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0
   # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0
   # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0
   # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0
   # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0
   # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0
   # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0
   #10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0
   #11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0
   #12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0
   #13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0
   #14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0
   #15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0
   #16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0
   #17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0

The lock is used to synchronize different sysfs operations, it doesn't
protect any resource that will be touched by an interrupt. Consequently
it's not required to disable IRQs. Replace the spinlock with a mutex to fix
the deadlock.

Signed-off-by: Junxiao Bi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Mike Christie <[email protected]>
Cc: [email protected]
Signed-off-by: Martin K. Petersen <[email protected]>
fbq pushed a commit that referenced this pull request Dec 28, 2023
Andrii Nakryiko says:

====================
BPF register bounds logic and testing improvements

This patch set adds a big set of manual and auto-generated test cases
validating BPF verifier's register bounds tracking and deduction logic. See
details in the last patch.

We start with building a tester that validates existing <range> vs <scalar>
verifier logic for range bounds. To make all this work, BPF verifier's logic
needed a bunch of improvements to handle some cases that previously were not
covered. This had no implications as to correctness of verifier logic, but it
was incomplete enough to cause significant disagreements with alternative
implementation of register bounds logic that tests in this patch set
implement. So we need BPF verifier logic improvements to make all the tests
pass. This is what we do in patches #3 through #9.

The end goal of this work, though, is to extend BPF verifier range state
tracking such as to allow to derive new range bounds when comparing non-const
registers. There is some more investigative work required to investigate and
fix existing potential issues with range tracking as part of ALU/ALU64
operations, so <range> x <range> part of v5 patch set ([0]) is dropped until
these issues are sorted out.

For now, we include preparatory refactorings and clean ups, that set up BPF
verifier code base to extend the logic to <range> vs <range> logic in
subsequent patch set. Patches #10-#16 perform preliminary refactorings without
functionally changing anything. But they do clean up check_cond_jmp_op() logic
and generalize a bunch of other pieces in is_branch_taken() logic.

  [0] https://patchwork.kernel.org/project/netdevbpf/list/?series=797178&state=*

v5->v6:
  - dropped <range> vs <range> patches (original patches #18 through #23) to
    add more register range sanity checks and fix preexisting issues;
  - comments improvements, addressing other feedback on first 17 patches
    (Eduard, Alexei);
v4->v5:
  - added entirety of verifier reg bounds tracking changes, now handling
    <range> vs <range> cases (Alexei);
  - added way more comments trying to explain why deductions added are
    correct, hopefully they are useful and clarify things a bit (Daniel,
    Shung-Hsi);
  - added two preliminary selftests fixes necessary for RELEASE=1 build to
    work again, it keeps breaking.
v3->v4:
  - improvements to reg_bounds tester (progress report, split 32-bit and
    64-bit ranges, fix various verbosity output issues, etc);
v2->v3:
  - fix a subtle little-endianness assumption inside parge_reg_state() (CI);
v1->v2:
  - fix compilation when building selftests with llvm-16 toolchain (CI).
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Avoid using a bunch of cfg flags and use of hardcoded auto.conf path
2 participants