
Xu4-4.14 kernel memory leak (FCoE VN2VN) #360

Open
ardje opened this issue Jul 23, 2018 · 11 comments

Comments


ardje commented Jul 23, 2018

Hi mdrjr,
Just a heads up: I've had this leak since 4.14.30 or thereabouts. I think 4.14.22 was stable, though I'm not sure.

So the problem: the kernel was leaking memory:
https://plus.google.com/u/0/+ArdvanBreemen/posts/cpLBpnizyLv

Even the latest kernel, 4.14.55, seems to be leaking.
So I decided to build with DEBUG_KMEMLEAK, and here is my "surprise":
kmemleak.txt


ardje commented Jul 23, 2018

So there seem to be two leak points:

    [<c08085e0>] __napi_alloc_skb+0x90/0x120
    [<c06366c0>] r8152_poll+0x308/0xf90
    [<c081f77c>] net_rx_action+0x2c0/0x484

And:

    [<bf3382ec>] fcoe_ctlr_vn_add+0x3c/0x1b4 [libfcoe]
    [<bf338bb8>] fcoe_ctlr_vn_recv+0x754/0xb2c [libfcoe]
    [<bf33a400>] fcoe_ctlr_recv_work+0xb94/0x17f0 [libfcoe]

They might be related. The Xu4 has a load of macvlans on VLANs, plus FCoE on a VLAN.
The FCoE has no active partitions, but per spec the FCoE drives are exported, and hence the kernel needs to keep track of the FCoE multicasts. It should automagically create a VN2VN connection/session to any other FCoE node.
Since FCoE is not used on this Odroid, I can block the FCoE VLAN and see whether the lack of FCoE announcements stops the VN2VN leak, and the r8152_poll leak.
It's possible the skbs should have been processed in the soft IRQ or the FCoE worker thread and then freed; see the sketch below.
To be clear: FCoE worked fine on 3.10.92.
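
To illustrate that theory, a minimal sketch, assuming the usual kernel rule that a receive handler consumes the skb it is handed. This is not the actual libfcoe code; the function name and the error path are invented:

    /*
     * Minimal sketch, NOT the actual libfcoe code: the function name and
     * the unhandled-subcode path are invented. The point: the receive
     * handler owns the skb, so every exit path must kfree_skb() it. One
     * missed path leaks one skb per received VN2VN frame, which matches
     * a leak that only grows while peers are announcing on the VLAN.
     */
    #include <linux/if_ether.h>
    #include <linux/skbuff.h>

    static void fip_vn_recv_sketch(struct sk_buff *skb)
    {
            if (!pskb_may_pull(skb, sizeof(struct ethhdr)))
                    goto drop;              /* malformed frame */

            /* ... parse FIP descriptors, update the rport table ... */

            if (skb->len == 0)              /* stand-in for an unhandled subcode */
                    return;                 /* BUG: the skb leaks on this path */

    drop:
            kfree_skb(skb);                 /* correct: consume the skb */
    }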


ardje commented Jul 23, 2018

And a side note, mdrjr: I do not assume you are going to fix it ;-). I just need a place to document the bug.


ardje commented Jul 23, 2018

Shutting down the FCoE VLAN (the XU4's FCoE setup is unchanged; the switch just doesn't pass the VN2VN multicasts) reveals no new significant leaks for now.

ardje changed the title from "Xu4-4.14 kernel memory leak" to "Xu4-4.14 kernel memory leak (FCoE VN2VN)" on Jul 23, 2018

ardje commented Jul 26, 2018

I tried it with a PC and it also got a memory leak. That's good, because it means we can fix something.
Anyway, I'm hijacking this ticket for the FCoE bug. If it is solved, I can order myself some HC1s ;-).
Attached: PC kmemleak
kmemleak-antec.txt
dmesg.txt


ardje commented Aug 6, 2018

Kmemleak after patches from Johannes:
ardje@747bf04

dmesg-2018-08-06.txt
kmemleak-2018-08-06.txt


ardje commented Aug 7, 2018

Sorry mdrjr, there is an issue on GitHub with moving issues :-). But having working FCoE is a good feature for the HC1 and HC2.

Memory graphs:
PC with 4.14:
memory-year
memory-week
Idle Xu4 with 4.14:
odroid7-week.png
Idle Xu4 with 4.9:
odroid6-week.png
Production Xu4 with 4.4:
odroid4-week.png
Production Xu4 with 4.14; notice the moment when I started turning off my Steam machine due to heat.
odroid5-3months.png
Now the year graph with a piece of the 3.10 kernel (notice how collectd was not that important to me :-) ).
odroid5-year.png

The gap in the collectd graph on Sunday is another issue (my rrdcached dying due to an OOM).
The munin graph gaps are the moments the PC was turned off.
Notice the memory leak on the XU4 growing to 150 MB/day when the PC is turned on, and slowing down when it is turned off.


ardje commented Aug 8, 2018

Leaving a PC with 4.14 (patched) and a Steam machine with 4.16 (not patched) running results in kmemleaks from the setup chatter.
After turning off FCoE on the Steam machine, another memleak occurs.
kmemleak-2018-08-08.txt
I've filtered large skbs and beacons out of the kernel log.
kernlog-2018-08-08.txt


ardje commented Aug 9, 2018

Now I turned off the Steam machine and turned on the PC, which does a scan almost every minute.
Except for a single memleak, nothing happened for 30 minutes.
Then I turned on my Steam machine again, and it keeps on adding rports:

Aug  8 10:53:15 localhost kernel: [   14.843972] host10: fip: vn_add rport 00dd50 new state 0
Aug  8 10:53:15 localhost kernel: [   14.856235] host10: fip: vn_add rport 00dd50 old state 0
Aug  8 10:53:15 localhost kernel: [   14.868415] host10: fip: vn_add rport 0004e0 new state 0
Aug  8 10:53:15 localhost kernel: [   14.880589] host10: fip: vn_add rport 0004e0 old state 0
Aug  8 10:53:15 localhost kernel: [   14.892846] host10: fip: vn_add rport 006837 new state 0
Aug  8 10:53:15 localhost kernel: [   14.905107] host10: fip: vn_add rport 006837 old state 0
Aug  8 10:53:15 localhost kernel: [   14.917275] host10: fip: vn_add rport 0004e0 old state 0
Aug  8 10:53:15 localhost kernel: [   14.929451] host10: fip: vn_add rport 0004e0 old state 0
Aug  8 10:53:15 localhost kernel: [   14.941631] host10: fip: vn_add rport 000550 new state 0
Aug  8 10:53:15 localhost kernel: [   14.953797] host10: fip: vn_add rport 000550 old state 0
Aug  8 11:33:50 localhost kernel: [ 2452.571392] host10: fip: vn_add rport 00c76e new state 0
Aug  8 11:33:50 localhost kernel: [ 2452.582605] host10: fip: vn_add rport 00c76e old state 0
Aug  8 11:34:09 localhost kernel: [ 2470.863225] host10: fip: vn_add rport 00c76e old state 4
Aug  8 11:34:09 localhost kernel: [ 2470.874463] host10: fip: vn_add rport 00c76e old state 4
Aug  8 11:34:33 localhost kernel: [ 2495.438842] host10: fip: vn_add rport 00c76e old state 4
Aug  8 11:34:33 localhost kernel: [ 2495.450120] host10: fip: vn_add rport 00c76e old state 4
Aug  8 11:34:58 localhost kernel: [ 2520.014676] host10: fip: vn_add rport 00c76e old state 4
Aug  8 11:34:58 localhost kernel: [ 2520.026034] host10: fip: vn_add rport 00c76e old state 4

Also, the kmemleaks are back.
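
If I had to guess at the pattern behind the repeated vn_add messages, here is a hypothetical sketch (the struct and function are invented; this is not libfc's actual rport code): if the add path allocates even though an rport for that port_id already exists (the "old state" lines above), every announcement from a live peer leaks one allocation.

    #include <linux/list.h>
    #include <linux/slab.h>
    #include <linux/types.h>

    /* Hypothetical sketch, not libfc code: if the lookup below were
     * missing (or its result discarded), each "vn_add rport ... old
     * state" event would allocate a fresh object and leak the old one. */
    struct vn_rport {
            u32 port_id;
            int state;
            struct list_head list;
    };

    static struct vn_rport *vn_add_rport(struct list_head *rports, u32 port_id)
    {
            struct vn_rport *rp;

            list_for_each_entry(rp, rports, list)
                    if (rp->port_id == port_id)
                            return rp;      /* reuse the existing entry */

            rp = kzalloc(sizeof(*rp), GFP_KERNEL); /* only for truly new ports */
            if (rp) {
                    rp->port_id = port_id;
                    list_add(&rp->list, rports);
            }
            return rp;
    }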


ardje commented Aug 9, 2018

Summary of the past logs:
Working logins:
Xu4 4.9:
04e0-fcoe-log.txt
ss4000e 3.7:
6837-fcoe-log.txt
Steam machine 4.16:
c76e-fcoe-log.txt

EDIT: Pasted the wrong kernlog and kmemleak; see two comments further down.


ardje commented Aug 9, 2018


ardje commented Aug 9, 2018

Wrong files, but:

root@antec:~/logs# grep "vn_add rport 00c76e\|kmemleak" 2018-08-08-kern.log|cut -d\  -f9-|uniq -c
      1   2.577320] kmemleak: Kernel memory leak detector initialized
      1   2.577350] kmemleak: Automatic memory scanning thread started
      1 136.452894] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
      1 host10: fip: vn_add rport 00c76e new state 0
      1 host10: fip: vn_add rport 00c76e old state 0
      8 host10: fip: vn_add rport 00c76e old state 4
      1 kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
     50 host10: fip: vn_add rport 00c76e old state 4
      1 kmemleak: 4 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
      2 host10: fip: vn_add rport 00c76e old state 4
      1 kmemleak: 47 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
     50 host10: fip: vn_add rport 00c76e old state 4
      1 kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
     50 host10: fip: vn_add rport 00c76e old state 4
      1 kmemleak: 47 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
     52 host10: fip: vn_add rport 00c76e old state 4
      1 kmemleak: 50 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
     50 host10: fip: vn_add rport 00c76e old state 4
      1 kmemleak: 47 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
     50 host10: fip: vn_add rport 00c76e old state 4
      1 kmemleak: 55 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
     52 host10: fip: vn_add rport 00c76e old state 4
      1 kmemleak: 46 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
     50 host10: fip: vn_add rport 00c76e old state 4
      1 kmemleak: 46 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
     36 host10: fip: vn_add rport 00c76e old state 4
      1 kmemleak: 50 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
      1 kmemleak: 36 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
      1 kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
      1 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

And now the real kernel log and kmemleak:
2018-08-08-kmemleak.txt
2018-08-08-kern.log

paralin pushed a commit to paralin/linux that referenced this issue Mar 11, 2022
commit 6d1e6bc upstream.

_get_table_maxdiv() tries to access the "clk_div_table" array, defined in phy-j721e-wiz.c, out of bounds. Add a sentinel entry to prevent
the following global-out-of-bounds error, reported with KASAN enabled.

[    9.552392] BUG: KASAN: global-out-of-bounds in _get_maxdiv+0xc0/0x148
[    9.558948] Read of size 4 at addr ffff8000095b25a4 by task kworker/u4:1/38
[    9.565926]
[    9.567441] CPU: 1 PID: 38 Comm: kworker/u4:1 Not tainted 5.16.0-116492-gdaadb3bd0e8d-dirty hardkernel#360
[    9.576242] Hardware name: Texas Instruments J721e EVM (DT)
[    9.581832] Workqueue: events_unbound deferred_probe_work_func
[    9.587708] Call trace:
[    9.590174]  dump_backtrace+0x20c/0x218
[    9.594038]  show_stack+0x18/0x68
[    9.597375]  dump_stack_lvl+0x9c/0xd8
[    9.601062]  print_address_description.constprop.0+0x78/0x334
[    9.606830]  kasan_report+0x1f0/0x260
[    9.610517]  __asan_load4+0x9c/0xd8
[    9.614030]  _get_maxdiv+0xc0/0x148
[    9.617540]  divider_determine_rate+0x88/0x488
[    9.622005]  divider_round_rate_parent+0xc8/0x124
[    9.626729]  wiz_clk_div_round_rate+0x54/0x68
[    9.631113]  clk_core_determine_round_nolock+0x124/0x158
[    9.636448]  clk_core_round_rate_nolock+0x68/0x138
[    9.641260]  clk_core_set_rate_nolock+0x268/0x3a8
[    9.645987]  clk_set_rate+0x50/0xa8
[    9.649499]  cdns_sierra_phy_init+0x88/0x248
[    9.653794]  phy_init+0x98/0x108
[    9.657046]  cdns_pcie_enable_phy+0xa0/0x170
[    9.661340]  cdns_pcie_init_phy+0x250/0x2b0
[    9.665546]  j721e_pcie_probe+0x4b8/0x798
[    9.669579]  platform_probe+0x8c/0x108
[    9.673350]  really_probe+0x114/0x630
[    9.677037]  __driver_probe_device+0x18c/0x220
[    9.681505]  driver_probe_device+0xac/0x150
[    9.685712]  __device_attach_driver+0xec/0x170
[    9.690178]  bus_for_each_drv+0xf0/0x158
[    9.694124]  __device_attach+0x184/0x210
[    9.698070]  device_initial_probe+0x14/0x20
[    9.702277]  bus_probe_device+0xec/0x100
[    9.706223]  deferred_probe_work_func+0x124/0x180
[    9.710951]  process_one_work+0x4b0/0xbc0
[    9.714983]  worker_thread+0x74/0x5d0
[    9.718668]  kthread+0x214/0x230
[    9.721919]  ret_from_fork+0x10/0x20
[    9.725520]
[    9.727032] The buggy address belongs to the variable:
[    9.732183]  clk_div_table+0x24/0x440

Fixes: 091876c ("phy: ti: j721e-wiz: Add support for WIZ module present in TI J721E SoC")
Cc: [email protected] # v5.10+
Signed-off-by: Kishon Vijay Abraham I <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vinod Koul <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
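
For context on the sentinel this commit adds: the common clk divider code walks a clk_div_table until it reaches an entry whose .div is 0, so a table without that terminator gets read past its end. A minimal sketch of the convention (the divider values are made up, not the actual phy-j721e-wiz.c table):

    #include <linux/clk-provider.h>

    /* The clk divider framework stops iterating at .div == 0, so the
     * empty last entry is what terminates the walk. Values invented. */
    static const struct clk_div_table wiz_div_table_sketch[] = {
            { .val = 0, .div = 1 },
            { .val = 1, .div = 2 },
            { .val = 2, .div = 4 },
            { /* sentinel */ },
    };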
mdrjr pushed a commit that referenced this issue Dec 23, 2024
…ode.

[ Upstream commit d5c367e ]

Creating large files while checkpoint is disabled until the filesystem
runs out of space, then deleting them, remounting to re-enable
checkpoint, and finally unmounting the filesystem triggers the
f2fs_bug_on below:

------------[ cut here ]------------
kernel BUG at fs/f2fs/inode.c:896!
CPU: 2 UID: 0 PID: 1286 Comm: umount Not tainted 6.11.0-rc7-dirty #360
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
RIP: 0010:f2fs_evict_inode+0x58c/0x610
Call Trace:
 __die_body+0x15/0x60
 die+0x33/0x50
 do_trap+0x10a/0x120
 f2fs_evict_inode+0x58c/0x610
 do_error_trap+0x60/0x80
 f2fs_evict_inode+0x58c/0x610
 exc_invalid_op+0x53/0x60
 f2fs_evict_inode+0x58c/0x610
 asm_exc_invalid_op+0x16/0x20
 f2fs_evict_inode+0x58c/0x610
 evict+0x101/0x260
 dispose_list+0x30/0x50
 evict_inodes+0x140/0x190
 generic_shutdown_super+0x2f/0x150
 kill_block_super+0x11/0x40
 kill_f2fs_super+0x7d/0x140
 deactivate_locked_super+0x2a/0x70
 cleanup_mnt+0xb3/0x140
 task_work_run+0x61/0x90

The root cause: creating large files while checkpoint is disabled
leaves too few free segments, so writing back the root inode fails in
f2fs_enable_checkpoint. When the filesystem is unmounted after
checkpoint is re-enabled, the root inode is still dirty in
f2fs_evict_inode, which triggers the BUG_ON. The steps to reproduce
are as follows:

dd if=/dev/zero of=f2fs.img bs=1M count=55
mount f2fs.img f2fs_dir -o checkpoint=disable:10%
dd if=/dev/zero of=big bs=1M count=50
sync
rm big
mount -o remount,checkpoint=enable f2fs_dir
umount f2fs_dir

Let's redirty the inode when there are no free segments while
checkpoint is disabled.

Signed-off-by: Qi Han <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
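
A hedged sketch of the idea in that last paragraph, not the actual f2fs patch (no_free_segments() is a hypothetical stand-in for f2fs's real free-segment check): rather than letting a failed writeback silently drop the inode's dirty state, re-mark the inode dirty so eviction does not later hit the BUG_ON on a dirty inode.

    #include <linux/errno.h>
    #include <linux/fs.h>

    /* Hypothetical helper: stands in for a real free-segment check. */
    static bool no_free_segments(struct super_block *sb)
    {
            return false;
    }

    /* Sketch only, not the real f2fs change: if writeback cannot proceed
     * because there are no free segments, redirty the inode so its dirty
     * state is preserved instead of being lost. */
    static int write_inode_sketch(struct inode *inode)
    {
            if (no_free_segments(inode->i_sb)) {
                    mark_inode_dirty(inode);        /* redirty rather than drop */
                    return -ENOSPC;
            }
            /* ... normal inode writeback ... */
            return 0;
    }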