ramips-mt7621/mediatek-filogic: MT7915e - wifi crashes randomly #3154
Comments
This bug is also confirmed for the D-Link DAP-X1860 a1 running gluon v2023.1.1: A warning message with
|
My cronjob basically seems to work, but let's have a look at your graph. We can see a crash at ~23:15, and the node continued to work at ~00:50. Until 23:17 everything is fine. Sun May 26 23:18:04 shows the first timeout message from timer. So here's the problem for the workaround.
|
Maybe there needs to be a few seconds between the cat and unloading the kernel module? |
I'll have a look at this idea. I have modified my script to add an increasing delay after the cat. I have no hope that the sleep will help.
|
As it is the See attached logread from a Cudy WR3000 filogic device, on which I could reproduce that restarting the firmware did not help. As additional info: the error below occurred 1 minute after I rebooted its mesh partner, which now also meshed on 5 GHz after the reboot. Maybe this helps to trace the bug in the future.
|
Yeah, I think so. I'll make them reboot now.
You can have a look at the logfile.
|
Short update on my reboot cronjob. If you look at my log (it's remote, not the node's logread), you can see the little gap here.
|
This package reboots the device if the mt7915-firmware hangs on ramips-mt7621 and mediatek-filogic. It's meant as a hotfix for the mcu timeout issue: freifunk-gluon/gluon#3154
Using #3370 (OpenWrt 24.10, to be exact 1bbea11) on NWA55AXE (ramips-mt7621), I think there is an improvement. The log now looks as follows:
Full dmesg of two nodes at two locations running into it: https://gist.github.com/herbetom/cc708b05398e2a521361ae79490f5cfc I'm not seeing clients afterwards, so the requested hardware restart presumably fails somehow. But it should be an improvement nonetheless and something that can hopefully be improved upon. |
Yep, nbd improved everything so that downtime is only 10 seconds or so if the MCU crashes or requires a recovery. I already saw the issue you just experienced and reported it to nbd last month. Last time I checked, he didn't have the time to look further into it. |
Bug report
What is the problem?
Some devices in ramips-mt7621 (e.g. COVR-X1860, Multy WSM20, NWA50AX, NWA55AXE) are occasionally affected by openwrt/openwrt#11931,
but most of the time the devices work well for weeks.
An example of this issue can be seen here:
https://grafana.ffac.rocks/d/000000002/node?orgId=1&var-node=0c0e76cf3bca&from=1704723079634&to=1704773255949
https://grafana.ffac.rocks/d/000000002/node?orgId=1&var-node=0c0e76cf2add&from=1699421650829&to=1701080735177
https://stats.ffmuc.net/d/hRIn3dRWk/mesh-nodes?orgId=1&var-nodeid=b8eca3e24c3f&from=1700938800000&to=1701032400000
The airtime is reported as 0 which is wrong.
An error log can be found here: openwrt/openwrt#11931 (comment)
End users cannot connect to this broken wifi, but can still see it.
It seems that some devices are affected more often than others.
The DAP-X1860, which has the same chip but is already supported in v22.03, never had this issue.
What is the expected behaviour?
Wifi should work reliably on devices using the MT7915e driver.
Gluon Version:
This has been seen with Wifi6 devices using the MT7915e wifi chip since OpenWrt 23.05 (gluon v2023.2.x).
I have not seen this behavior on a device on v22.03 or gluon v2023.1.x yet.
I can confirm this bug still exists on the tag v2023.2
Site Configuration:
FFAC, FFMUC
Workarounds
If this happens and you have SSH access through the WAN/VPN, you can run:
rmmod mt7915e && modprobe mt7915e && wifi
This restarts the wifi driver and fixes this issue temporarily.
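For nodes without someone watching them, the manual workaround above could be wrapped in a small check that a cronjob runs. The following is only a rough sketch under stated assumptions, not the script used in this thread: the function names, the APPLY switch and the log pattern in PATTERN are made up for illustration, and the pattern has to be adjusted to whatever the timeout line looks like in your own logread output. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch: reload the mt7915e driver when the log shows an MCU timeout.
# ASSUMPTION: PATTERN is an illustrative guess at the timeout line;
# adjust it to the message your own logread/dmesg actually shows.
PATTERN="${PATTERN:-Message .* timeout}"

# Returns 0 if the log text on stdin contains a line matching PATTERN.
mcu_timeout_seen() {
    grep -q -e "$PATTERN"
}

# Runs the workaround from this issue. Unless APPLY=1 is set (a
# hypothetical switch for this sketch), it only prints the command.
reload_wifi() {
    if [ "${APPLY:-0}" = "1" ]; then
        rmmod mt7915e && modprobe mt7915e && wifi
    else
        echo "would run: rmmod mt7915e && modprobe mt7915e && wifi"
    fi
}

# Reads log text on stdin and reloads the driver if a timeout is found.
check_and_reload() {
    if mcu_timeout_seen; then
        reload_wifi
    else
        echo "no timeout found"
    fi
}
```

On a node this could be fed from the system log, e.g. `logread | check_and_reload`, with APPLY=1 set once the pattern has been verified. Note that logread prints the whole ring buffer, so an old timeout line would trigger the reload again on every run; a real script would have to remember which lines it has already handled. As reported in the comments, reloading the driver did not always help, so this is a stopgap at best.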