Broadcasts failing on ember after migration #22453
Comments
Any chance you can downgrade to 7.4.1 and see if you still have those problems on the pi? |
Same problem with SLZB-06M. But I don't have a Raspberry Pi 4; the host is an x86 machine running Unraid, with zigbee2mqtt in Docker. |
Grouping the mentioned broadcasting issues here, guys (#22445, #22398). I cannot reproduce this with my Dongle-E. I've tried various firmware, various ways to migrate from |
adapter: ember May need to add 'rtscts' below adapter setting. |
Two things. First: I recently installed https://www.zigbee2mqtt.io/devices/ZFP-1A-CH.html#siglis-zfp-1a-ch, which I think is not a very common router. Swiss market only and most likely not very popular. Initially I had problems with it. Also, shortly after I installed it, my second Dongle-E that I use as a router had to re-pair, and this was one of the first devices in my 2yo network that I never had any problems with. Second: Shortly before my router dongle failed I set the reporting interval of every lamp to 1-3 seconds because I didn't see the lamp status change quickly enough (or at all) when pressing a HW button like the switches mentioned above. After the dongle failed I reverted this to 1-30 s and have had no problems since. But I did the reverting before I saw the error in the logs. Also I have to say: I haven't noticed bigger problems or misbehavior. I just saw the error in the logs. The only real problem I have is that sometimes (not reproducible) some IKEA bulbs start in maximum dimmed mode even though at least one of them is never dimmed manually. |
As the Dongle-E works with a Docker image in an x86 environment, I'm guessing there is no issue with the Zigbee dongle itself. So, focusing on specific configs, here is what comes to mind as possibly different from a regular installation:
everything else is quite standard in my opinion. |
Nothing special over here. Had 1.36 running with SLZB-06M on Zigbee FW 20231030. Everything was running OK with adapter: ezsp. Did the following steps:
So currently I'm in a state that my network is running, but I can't add any new devices. Is there any more info we can provide? |
Oh I should have mentioned that I am running HAOS in a VM on Synology DSM 7.2. Interference should not be a problem as my dongle is in a USB 2 port with a 2 m extension cable. |
My setup is HAOS running on an ODROID M1 with 8 GB RAM and a 512 GB SSD. |
Exactly the same behavior. Plus the problem that no new devices can be paired with ember. But with ezsp I can add devices. In my case especially all my routers get disconnected. |
I do have 4 mmwave presence sensors. Maybe these devices have an influence. |
Sorry, posted my follow-up on the wrong ticket... These are the messages I see when I startup Zigbee2MQTT. Maybe they are related. [2024-05-05 11:00:43] info: z2m: Logging to console, file (filename: log.log) [2024-05-05 11:00:51] info: z2m: Zigbee: disabling joining new devices. Whenever I try to start the pairing process, I see these messages: [2024-05-05 11:03:28] info: z2m: Zigbee: allowing new devices to join. |
Ah yes, and I wasn't aware it is related... I have a SLZB-06M as coordinator (groundfloor) and a Sonoff Dongle-E flashed as router (first floor). Yesterday evening my Sonoff router got disconnected. It is while trying to pair it again that I found out I couldn't pair any devices. I have a very small zigbee network (more a test setup here), so I have no other routers, only end devices. |
I already have nearly 70 devices... |
Here at home, HA is a small setup (12 devices) I use mainly for testing. But in our vacation home, everything is controlled by HA and we have 51 Zigbee and 33 ESPHome devices. In this second setup, I also have the same SLZB-06M coordinator, but still on the older 20231030 firmware, where the adapter is still defined as 'adapter: ezsp'. Since I upgraded to 1.37, I haven't been able to pair any new devices there either, due to another error: "zh:controller:greenpower: Received undefined command from '0'" And that setup is not a test setup :-( |
In the development-branch channel. The similarity we both have is the same coordinator (I am at the dev Firmware right now). But maybe you can rather rule out the cause if you only have 12 devices in your setup. |
Very very simple configuration here. HAOS on qemu VM in low end x86-64 QNAP nas, resources 2 cpu+2 GB ram as suggested by HAOS setup guide. Back to the setup, I can report two setups:
Anyway, I see from other posts that the error is happening with a variety of devices, and if I look for another common factor, all the networks showing the error have -> a coordinator <-, which again puts the spotlight on the coordinator. I see that @Nerivec is not able to reproduce the issue and, needless to say, Nerivec is also working with a coordinator, which should rule out the coordinator itself (unless there is some elusive common factor in the coordinator hardware). Maybe a good starting point for you would be to constrain the system to a low-resource/slow host or a VM with limited resources, to see what happens with Z2M's handling of the coordinator. Another hint may be found in the first post from @julien-billaud: "I've tried the exact same configuration on a regular x86 computer running debian (using the same zigbee dongle) and didn't face any issue which seems to be a linked with the Raspberry pi 4". |
OK, because my setup is a small setup mainly for test, I did the following steps:
[12:01:03] INFO: Preparing to start...
[2024-05-05 12:01:40] info: z2m: Zigbee: allowing new devices to join.
[2024-05-05 12:02:19] info: z2m: Removing device '0x00158d0008083d2a' (block: false, force: true)
[12:06:41] INFO: Preparing to start...
[2024-05-05 12:07:40] info: z2m: Zigbee: allowing new devices to join. so pairing is working and I didn't get the broadcast error now, not while starting up and not while pairing. So starting over with zigbee2mqtt solved it for me, but that is not possible for everyone I think :-) |
No, not completely... after approx. 5 minutes, pairing was again not possible. No errors, but the connection / interview didn't start. Tried to restart z2m and reboot the coordinator; nothing helps. Downgraded the coordinator to the 20231030 FW (EZSP12) and switched back to "adapter: ezsp", and I still get the "error: zh:controller:greenpower: Received undefined command from '0' " messages, but pairing is possible again. Will see in about 10 minutes... |
I do also have one Sonoff TRVZB. And I also started fresh with a new zigbee2mqtt config and just the coordinator, and even at start the pairing/broadcast issue appeared immediately. I don't think that it is an issue with the Raspberry Pi, as I am using an x86 machine running a zigbee2mqtt container (Docker). I also observed that a coordinator reset sometimes helped. @Nerivec recommended doing a hard reset of my device (that includes pushing the physical reset button). This also helped me once, starting without any issues, but after restarting again, I again suffered from those errors. |
Just to have a better understanding: what CPU/RAM does your x86 machine have? What OS is it running? Is it on bare metal or in a virtualization environment like Proxmox or a VM of some other sort? I agree Docker containers are less demanding, but performance is then limited by the host, so it would be useful to know what kind of host is running your Docker and how loaded your x86 system is. |
It is an Intel® Core™ i3-9100 system with 64 GB ECC RAM. |
I have a low-resource VM that mimics the specs of an average PI 4 to run tests on stuff that I know affect performance. No issue there either. No failed broadcast without any device, nor with devices, and successfully paired & re-paired a dozen devices since it's been running for a couple of hours. But just in case, you can try giving it some breathing room with the advanced setting:
advanced:
  adapter_delay: 20
Default/min is 5, max is 60 (milliseconds). Note that at 60, you are likely to experience some delays when triggering devices rapidly. PS: I created an issue in the firmware repo for the SLZB-06M and the failing config IDs. May or may not be related to the ensuing troubles, but we need to get to the bottom of it nonetheless. darkxst/silabs-firmware-builder#90 |
Added the adapter_delay option, no joy: [2024-05-05 14:42:54] error: zh:ember: Delivery of BROADCAST failed for "65532" [apsFrame={"profileId":0,"clusterId":54,"sourceEndpoint":0,"destinationEndpoint":0,"options":256,"groupId":0,"sequence":170} messageTag=255] at startup of z2m. |
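For anyone comparing logs, the error line above can be decoded with a few lines of Python. This is only a sketch: the regular expression is inferred from the single log line quoted in this thread, not from any documented log format, and `decode_broadcast_failure` and the two lookup tables are names made up for illustration. The meanings themselves come from the Zigbee specification: 65532 is 0xFFFC (all routers and the coordinator), 65533 is 0xFFFD, and cluster 54 (0x0036) on profile 0 is the ZDO Mgmt_Permit_Joining_req, which would fit the pairing symptoms reported here.

```python
import json
import re

# Well-known Zigbee broadcast addresses (from the Zigbee specification).
BROADCAST_ADDRESSES = {
    0xFFFC: "all routers and coordinator",
    0xFFFD: "all devices with receiver on when idle",
    0xFFFF: "all devices",
}

# ZDO (profile 0) cluster seen in this thread; extend as needed.
ZDO_CLUSTERS = {0x0036: "Mgmt_Permit_Joining_req"}

# Pattern inferred from the quoted log line; an assumption, not an official format.
LINE_RE = re.compile(
    r'Delivery of BROADCAST failed for "(?P<dest>\d+)" '
    r"\[apsFrame=(?P<frame>\{.*?\}) messageTag=(?P<tag>\d+)\]"
)


def decode_broadcast_failure(line):
    """Return a dict describing a failed broadcast, or None if the line does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    dest = int(match.group("dest"))
    frame = json.loads(match.group("frame"))
    cluster = frame["clusterId"]
    # Only ZDO frames (profile 0) are looked up in the ZDO table.
    cluster_meaning = "unknown"
    if frame["profileId"] == 0:
        cluster_meaning = ZDO_CLUSTERS.get(cluster, "unknown")
    return {
        "destination": f"0x{dest:04X}",
        "destination_meaning": BROADCAST_ADDRESSES.get(dest, "unknown"),
        "cluster": f"0x{cluster:04X}",
        "cluster_meaning": cluster_meaning,
        "message_tag": int(match.group("tag")),
    }
```

Feeding it the line from the comment above gives destination 0xFFFC and cluster 0x0036, so the failing broadcast at startup looks like the permit-join request sent to routers.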
I've been doing a little more testing and figured out "what was wrong". To conclude, it seems like the ember driver is for some reason a little bit more sensitive (I know that using the dongle without an extension cord isn't ideal). |
Can't be my problem. USB2 Port with 2m extension cable. |
After rebooting HA today my zigbee devices became unresponsive and "Delivery of BROADCAST failed" errors kept popping up. |
At this point it is pretty evident that ember is not ready for release, |
For my "production" environment, I'm switching back to EZSP until the ember problems are fixed.
Maybe in a next release you could set a higher value for the timer to call the logEzspDeprecated() method? A warning every hour is a bit overkill I think. Once a day or only on startup? |
In short, in my env: unfortunately, nothing has changed with the new firmware. It still fails to start sometimes, and restarting the Docker container after it has failed makes it work. This container probably has fewer issues booting with Maybe there can be an environment variable to let it try 3 times before giving up? At least then we can ignore this until you win at hide and seek? |
My setup with the Dongle-E, which had been really solid for quite some time, has become close to unusable. I am getting error after error, and switching back and forth between ezsp and ember has produced errors along the lines of "duplicate device detected for 12345, kicking from network". Unfortunately these logs have been overwritten already, so I don't know the exact wording. It is incomprehensible to me that it is apparently not possible to determine which devices exactly are meant. Why not use the friendly name? |
@voc0der What error do you mean is preventing start? This broadcast delivery failure shouldn't prevent "start". @supaeasy You can reset the zigbee2mqtt installation completely (config + database + adapter network). As long as you give devices the same names after you re-pair them (so they end up with the same HA entity ID), HA should link back to the previous history. I did that a few months back without problem (assuming nothing's changed in HA logic since). I need someone that can consistently reproduce this |
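The "try 3 times before giving up" idea could look roughly like the wrapper below. To be clear, this is not an existing Zigbee2MQTT option or environment variable; `start_with_retries` and the `start` callable are hypothetical names, and the sketch only assumes that a failed start shows up as an exception.

```python
import time


def start_with_retries(start, attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Call start() until it succeeds or the attempts run out.

    start is a hypothetical callable that raises an exception when startup fails.
    Returns the number of the attempt that succeeded.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            start()
            return attempt
        except Exception as error:
            if attempt == attempts:
                raise  # out of attempts: surface the last failure
            print(f"start attempt {attempt}/{attempts} failed: {error}; retrying")
            sleep(delay_seconds)
```

The `sleep` parameter is only there so the delay can be swapped out in tests. In a container setup, the restart policy of the container runtime may achieve much the same effect without any code.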
Moved to other thread. |
@Nerivec unfortunately my current schedule doesn't allow for intense testing. Just for reference: all the Problems started after I made the following changes:
Unfortunately I made all of these changes in a very short time frame so it is impossible to tell which of these broke my network. Also I have a second Dongle-e with router Firmware and even though it seems to be working throughout these errors journey I have the feeling that only a power cycle of this device as well as resetting and re pairing the Zigfred solve my problems - at least temporarily. Had to do this twice since April now. Could the Dongle Router be too stressed? Are there known issues with Zigfred devices? Symptoms:
Problem is that these issues seem to come creeping in, it is not all of a sudden everything broken.
Setup: Zigbee devices:
ZBT-CCTLight-M3500107: 6
lumi.weather: 5
TRADFRI bulb GU10 WS 345lm: 5
TRADFRI control outlet: 4
TRADFRIbulbE27WSglobeopal1055lm: 4
WB01: 3
lumi.vibration.aq1: 2
TRADFRIbulbE14WScandleopal470lm: 2
TRADFRI motion sensor: 2
lumi.sensor_wleak.aq1: 2
TS0044: 1
S26R2ZB: 1
DONGLE-E_R: 1
Remote Control N2: 1
TS0049: 1
TRADFRI on/off switch: 1
lumi.airmonitor.acn01: 1
zigfred plus: 1 |
Very short intervals (reporting/availability/etc) and unstable devices/network definitely don't go well together. You may want to try reverting these to defaults one at a time, see if things start improving. Also, if you think something might be wrong with that Zigfred device, try removing it from Z2M, see if things improve after a few hours (give time for the network to adjust). A single bad device can make a whole mess on the network because of how Zigbee works... I don't know Zigfred at all however, so, can't say much about them. |
Okay, thank you. I have made the following changes now and hope this works out:
Everything seems to work fine for now. Will post updates if it goes down the drain again. The OTA thing, however, only changed that one update for an IKEA remote was found; nonetheless it cannot be applied because One more question though: Is there another way to ensure on/off states are sent to the network immediately? I mean I absolutely don't need reports every x seconds, just state changes that are reliable. I set the reporting interval to these short settings because when I trigger the on/off state via the zigfred (or any other switch), the correct power state of a lamp does not show up directly in my HA dashboard. So I can never be sure lights are on/off when I am away. If I trigger the state via HA, the status is displayed correctly, immediately and reliably. |
Lots of IKEA stuff is picky with OTA, need to keep them awake until they start updating, otherwise it fails (docs device picked at random, but procedure similar).
Not sure why on/off would not be reported instantly. Do you have a screenshot of the reporting tab for your Zigfred (or a device that does this)? |
Here is my log detailing broadcast failure, with the following details below: |
Got the same broadcast 65533/65532 error every 3'30", after ember migration with Z2M 1.38.0 and ncp-uart-hw-v7.4.1.0-zbdonglee-115200.gbl or ncp-uart-hw-v7.4.3.0-zbdonglee-115200.gbl |
@Nerivec well, I found out: "Minimum reporting change (reportable_change)" is poorly translated in German; I use the localized version of Z2M. It is translated as "Minimale Report-Änderungen", which is not really comprehensible (it translates back to "Minimal change of reports", referring to the reports instead of the change that triggers them). It should be "Minimale Veränderung um Berichte auszulösen". I am quite sure I didn't mess with this setting before, but it was set to 0. I have now reverted the reporting intervals to 60-3600 and set Min. Reporting change to 1. Now everything is working as it should, except for one group of bulbs that has somewhat lazy reports (some 5-30 s delay), but I can live with that, and my debug log does not seem as flooded anymore. Will see if this is sustainable now. It does "feel" better. |
I have two instances of z2m, both on SLZB-06M. The main one with 40 devices is on ezsp, no problems. The secondary one is on ember with 10 devices, and I can run all the tests you want if you tell me what to do. On the ember instance, since migrating at version 1.37, I keep receiving this error: [2024-07-02 21:49:53] error: zh:ember:uart:ash: Received ERROR from NCP, with code=ERROR_EXCEEDED_MAXIMUM_ACK_TIMEOUT_COUNT. In both version 1.37 and 1.38, the error appeared only upon restarting the Docker container and even prevented the instance from starting. However, I resolved it by shutting down Docker and replacing the state.json file (sometimes the database and configuration need replacing too) with one from a few hours earlier (I have a continuous incremental backup always active), at which point it would restart and work perfectly. With version 1.39, which I have been testing for a few hours, the problem occurs every 10-15 minutes, often resolving itself. But after a few times, I am forced to shut down the container and replace state.json with an older file. |
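To illustrate why a reportable change of 0 combined with very short intervals can flood the network, here is a simplified model of how the three reporting settings interact for a numeric attribute. It is a sketch of the general attribute-reporting idea, not code from Z2M or from any device firmware; `should_report` and its parameter names are made up for illustration, and special cases in the real rules (and discrete attributes such as on/off, which have no reportable change) are ignored.

```python
def should_report(elapsed_s, last_reported, current,
                  min_interval_s, max_interval_s, reportable_change):
    """Simplified model: decide whether a device would send a report now."""
    if elapsed_s >= max_interval_s:
        return True  # periodic report, even if nothing changed
    if elapsed_s < min_interval_s:
        return False  # too soon since the last report
    # Between min and max: report only if the value moved far enough.
    changed = current != last_reported
    return changed and abs(current - last_reported) >= reportable_change
```

In this model, with intervals of 60-3600 and a reportable change of 0, any tiny fluctuation triggers a report as soon as the minimum interval has passed; with a reportable change of 1, the value has to move by at least 1 first.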
To all the victims of ember ))) |
Are you sure? 7.3.1 seems like a major step back to me (although I have no Idea whatsoever what to expect from the different versions because I can't seem to find any description or change log). |
Do you need stability, or pretty numbers in the firmware? ))) |
I don't have any problems with |
It may depend on the hardware. |
HASS OS on RPi 4 So please, don't obsolete ezsp yet |
Standalone installation 1.39.0 commit: 0326926 with Sonoff E dongle fw 7.4.3.0 build 0 has been working fine on a Raspberry 4. After upgrading to Raspi 5, it started throwing errors 'zh:ember: Delivery of BROADCAST failed for "65533"...' The only change I made to the installation is apt upgrade for the following packages: Edit: switched back to Pi 4 and configuration to ember; problems are gone. So the errors were either random or due to some hardware issue on the Pi 5. |
Jumping into this, adding my experience regarding SLZB-06M and ember:
z2m is running in a dedicated LXC on my Proxmox server. Every time the LXC with z2m restarts, z2m does not come up, throwing this error:
z2m never comes up, regardless of how long I wait. A simple systemctl restart zigbee2mqtt.service fixes the problem every time.
Let me know if I can help with additional information :) |
Same configuration, same issue |
Just wanted to chime in for awareness. I run two different coordinators via docker containers. One coordinator is SLZB-06M via LAN and one is SLZB-07 via USB. The issue is only manifesting on the SLZB-07 via USB. The SLZB-06M has not had any issues. Switching to the ezsp adapter solves the issue for SLZB-07. I'm happy to provide logs or debug info as well. |
I got exactly the same setup / issue, it's rather annoying since I have weekly backups running and the LXC never comes back up after it cause of the same errors you provided, manual intervention via systemctl restart is required for it to resume working. |
Hello, here is my feedback so far, in case it is useful. I had the same issue. I've seen messages about USB power and moved the dongle to another port; nothing changed. I stopped the zigbee2mqtt container and moved my configuration aside to "have a fresh install". I don't know what the profileID values are, or whether they are pushed back by Home Assistant :( |
Unfortunately I have the exact same issue. Also running Home Assistant as a VM. |
I want to save you time from going in the wrong direction, this is not related to Home Assistant, it is internal by the |
What happened?
While I have not faced any issues for more than a year with the Sonoff Dongle-E + ezsp driver, I've tried to change the driver to ember, but nothing is working (tried multiple times): sometimes I lose all the devices, sometimes they are still there but impossible to interact with, and pairing never works. (For now I have returned to the ezsp driver.)
I'm not seeing many errors in the log (only the broadcast error reported in #22445).
I've tried the exact same configuration on a regular x86 computer running Debian (using the same Zigbee dongle) and didn't face any issue, so this seems to be linked to the Raspberry Pi 4.
What did you expect to happen?
No response
How to reproduce it (minimal and precise)
Switch from the ezsp to the ember driver
Zigbee2MQTT version
1.37.0
Adapter firmware version
7.4.2.0 build 0
Adapter
Sonoff dongle-e
Setup
Raspberry pi 4 using docker image
Debug log
No response