Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Z2M aborts on socket disconnection (UZG-01) #23236

Open
habitats-tech opened this issue Jul 2, 2024 · 4 comments
Open

Z2M aborts on socket disconnection (UZG-01) #23236

habitats-tech opened this issue Jul 2, 2024 · 4 comments
Labels
problem Something isn't working

Comments

@habitats-tech
Copy link

What happened?

We need to put some effort to improve Z2M error handling. I say we as I put myself on the frontline to assist in whatever capacity I can. I am a strong believer of using PoE Ethernet coordinators, so my focus is on that front.

The following are findings based on the following environment:

  1. UZG-01 always running on Ethernet with latest stable XZG firmware (at present 20240612/ESP32/MCU, 20240316/CC2652P7/Zigbee)
  2. Mosquitto under Debian 12 running in LXC under Proxmox (Linux - not Docker - install)
  3. Z2M under Debian 12 running in LXC under Proxmox (Linux - not Docker - install), always running latest stable version

High performance hardware used throughout: CPU, RAM, disk, network

The biggest issue is how Z2M handles recovery from XZG socket disconnection, at present no recovery, which I believe should be fixed as a priority.

What did you expect to happen?

Z2M exiting running state on socket disconnection. I believe Z2M should try to recover indefinitely, or through a setting we are in control of.

How to reproduce it (minimal and precise)

I run Z2M using npm start with debug log. I monitor real time: log to console and to a file. The Zigbee network has around 15 devices connected all within a radius of 2m from the coordinator, of which ~33% are routers.

The fastest way to reproduce is to reboot the UZG-01 or just disconnect from Ethernet for a seconds.

image

image

Zigbee2MQTT version

1.38.0 and 1.39.0

Adapter firmware version

20240316

Adapter

UZG-01 on Ethernet with latest XZG FW

Setup

x86-64 Debian 12 based system container (LXC) under Proxmox 8.x (no Docker)

Debug log

[2024-07-02 13:53:19] debug: zh:zstack:unpi:parser: <-- [254,30,68,129,0,0,0,3,0,122,1,1,0,102,0,246,137,218,0,0,10,8,155,10,11,240,72,3,0,3,0,0,122,29,38]
[2024-07-02 13:53:19] debug: zh:zstack:unpi:parser: --- parseNext [254,30,68,129,0,0,0,3,0,122,1,1,0,102,0,246,137,218,0,0,10,8,155,10,11,240,72,3,0,3,0,0,122,29,38]
[2024-07-02 13:53:19] debug: zh:zstack:unpi:parser: --> parsed 30 - 2 - 4 - 129 - [0,0,0,3,0,122,1,1,0,102,0,246,137,218,0,0,10,8,155,10,11,240,72,3,0,3,0,0,122,29] - 38
[2024-07-02 13:53:19] debug: zh:zstack:znp: AREQ: <-- AF - incomingMsg - {"groupid":0,"clusterid":768,"srcaddr":31232,"srcendpoint":1,"dstendpoint":1,"wasbroadcast":0,"linkquality":102,"securityuse":0,"timestamp":14322166,"transseqnumber":0,"len":10,"data":{"type":"Buffer","data":[8,155,10,11,240,72,3,0,3,0]}}
[2024-07-02 13:53:19] debug: zh:controller: Failed to parse frame: Error: Read for '3' not available
[2024-07-02 13:53:19] debug: zh:controller: Received payload: clusterID=768, address=31232, groupID=0, endpoint=1, destinationEndpoint=1, wasBroadcast=false, linkQuality=102, frame=undefined
[2024-07-02 13:53:19] debug: zh:zstack:unpi:parser: --- parseNext []
[2024-07-02 13:53:19] debug: z2m: Received Zigbee message from 'ZB SGZB12W300Z-A0 DL24', type 'raw', cluster 'lightingColorCtrl', data '{"data":[8,155,10,11,240,72,3,0,3,0],"type":"Buffer"}' from endpoint 1 with groupID 0
[2024-07-02 13:53:19] debug: z2m: No converter available for 'TS0502B' with cluster 'lightingColorCtrl' and type 'raw' and data '{"data":[8,155,10,11,240,72,3,0,3,0],"type":"Buffer"}'
[2024-07-02 13:53:19] debug: zh:zstack:unpi:parser: <-- [254,30,68,129,0,0,0,3,255,87,1,1,0,142,0,117,188,218,0,0,10,8,46,10,10,240,72,3,0,3,3,255,87,29,207]
[2024-07-02 13:53:19] debug: zh:zstack:unpi:parser: --- parseNext [254,30,68,129,0,0,0,3,255,87,1,1,0,142,0,117,188,218,0,0,10,8,46,10,10,240,72,3,0,3,3,255,87,29,207]
[2024-07-02 13:53:19] debug: zh:zstack:unpi:parser: --> parsed 30 - 2 - 4 - 129 - [0,0,0,3,255,87,1,1,0,142,0,117,188,218,0,0,10,8,46,10,10,240,72,3,0,3,3,255,87,29] - 207
[2024-07-02 13:53:19] debug: zh:zstack:znp: AREQ: <-- AF - incomingMsg - {"groupid":0,"clusterid":768,"srcaddr":22527,"srcendpoint":1,"dstendpoint":1,"wasbroadcast":0,"linkquality":142,"securityuse":0,"timestamp":14335093,"transseqnumber":0,"len":10,"data":{"type":"Buffer","data":[8,46,10,10,240,72,3,0,3,3]}}
[2024-07-02 13:53:19] debug: zh:controller: Failed to parse frame: Error: Read for '3' not available
[2024-07-02 13:53:19] debug: zh:controller: Received payload: clusterID=768, address=22527, groupID=0, endpoint=1, destinationEndpoint=1, wasBroadcast=false, linkQuality=142, frame=undefined
[2024-07-02 13:53:19] debug: zh:zstack:unpi:parser: --- parseNext []
[2024-07-02 13:53:19] debug: z2m: Received Zigbee message from 'ZB SGZB12W300Z-A0 DL15', type 'raw', cluster 'lightingColorCtrl', data '{"data":[8,46,10,10,240,72,3,0,3,3],"type":"Buffer"}' from endpoint 1 with groupID 0
[2024-07-02 13:53:19] debug: z2m: No converter available for 'TS0502B' with cluster 'lightingColorCtrl' and type 'raw' and data '{"data":[8,46,10,10,240,72,3,0,3,3],"type":"Buffer"}'
[2024-07-02 13:53:34] info: zh:zstack:znp: Socket error
[2024-07-02 13:53:34] info: zh:zstack:znp: Port closed
[2024-07-02 13:53:34] debug: zh:controller: Adapter disconnected
[2024-07-02 13:53:34] info: zh:zstack:znp: closing
[2024-07-02 13:53:34] error: z2m: Adapter disconnected, stopping
[2024-07-02 13:53:34] debug: z2m: Saving state to file /opt/zigbee2mqtt/data/state.json
[2024-07-02 13:53:34] info: z2m:mqtt: MQTT publish: topic 'zigbee2mqtt/bridge/state', payload 'offline'
[2024-07-02 13:53:34] info: z2m: Disconnecting from MQTT server
[2024-07-02 13:53:34] info: z2m: Stopping zigbee-herdsman...
[2024-07-02 13:53:34] debug: zh:controller:database: Writing database to '/opt/zigbee2mqtt/data/database.db'
[2024-07-02 13:53:34] info: z2m: Stopped zigbee-herdsman
[2024-07-02 13:53:34] info: z2m: Stopped Zigbee2MQTT

@habitats-tech habitats-tech added the problem Something isn't working label Jul 2, 2024
@Koenkk
Copy link
Owner

Koenkk commented Jul 2, 2024

Z2M 1.39.0 introduces the watchdog which can be used for this.

@Dinth
Copy link

Dinth commented Jul 3, 2024

@habitats-tech if you're going to test the watchdog, could you report back how usable Z2M is? Im already reverting back to 1.38.0, but after an update to 1.39.0 my Z2M and UZG-1 both needed restart every 20 minutes or so.

@habitats-tech
Copy link
Author

For anyone who might attempt to test this: the system will run even if you have typos in the command line. Ensure there are no typos in the command line.

Z2M_WATCHDOG=0.5,0.5,0.5,1,1,1 npm start -> exit with error after 1st attempt
Z2M_WATCHDOG=1,1,1,1,1,1 npm start -> exit with error after 1st attempt
Z2M_WATCHDOG=1,2,3 npm start -> exit with error after 1st attempt

Failed after 1st attempt. I have seen the error in red in other instances.

[2024-07-04 05:46:44] info: z2m:mqtt: MQTT publish: topic 'zigbee2mqtt/ZB SGZB12W300Z-A0 DL24', payload '{"brightness":255,"color":{"h":null,"hue":null},"color_mode":"xy","color_temp":153,"do_not_disturb":null,"last_seen":"2024-07-04T05:46:44+04:00","linkquality":123,"state":"OFF"}'
[2024-07-04 05:47:09] info: zh:zstack:znp: Socket error
[2024-07-04 05:47:09] info: zh:zstack:znp: Port closed
[2024-07-04 05:47:09] debug: zh:controller: Adapter disconnected
[2024-07-04 05:47:09] info: zh:zstack:znp: closing
[2024-07-04 05:47:09] error: z2m: Adapter disconnected, stopping
[2024-07-04 05:47:09] debug: z2m: Saving state to file /opt/zigbee2mqtt/data/state.json
[2024-07-04 05:47:09] info: z2m:mqtt: MQTT publish: topic 'zigbee2mqtt/bridge/state', payload 'offline'
[2024-07-04 05:47:09] info: z2m: Disconnecting from MQTT server
[2024-07-04 05:47:09] info: z2m: Stopping zigbee-herdsman...
[2024-07-04 05:47:09] debug: zh:controller:database: Writing database to '/opt/zigbee2mqtt/data/database.db'
[2024-07-04 05:47:09] info: z2m: Stopped zigbee-herdsman
[2024-07-04 05:47:09] info: z2m: Stopped Zigbee2MQTT
WATCHDOG: Waiting 0.5min before next start try.
Starting Zigbee2MQTT with watchdog (30000,30000,30000,60000,60000,60000).
[2024-07-04 05:47:39] info: z2m: Logging to console, file (filename: log.log)
[2024-07-04 05:47:39] debug: z2m: Loaded state from file /opt/zigbee2mqtt/data/state.json
[2024-07-04 05:47:39] info: z2m: Starting Zigbee2MQTT version 1.39.0 (commit #0326926)
[2024-07-04 05:47:39] info: z2m: Starting zigbee-herdsman (0.50.1)
[2024-07-04 05:47:39] debug: z2m: Using zigbee-herdsman with settings: '"{"network":{"panID":5943,"extendedPanID":[88,178,103,219,169,189,77,71],"channelList":[25],"networkKey":"HIDDEN"},"databasePath":"/opt/zigbee2mqtt/data/database.db","databaseBackupPath":"/opt/zigbee2mqtt/data/database.db.backup","backupPath":"/opt/zigbee2mqtt/data/coordinator_backup.json","serialPort":{"baudRate":115200,"path":"tcp://192.168.0.228:6638"},"adapter":{"concurrent":null,"delay":null,"disableLED":false}}"'
[2024-07-04 05:47:40] debug: zh:controller: Starting with options '{"network":{"networkKeyDistribute":false,"networkKey":"HIDDEN","panID":5943,"extendedPanID":[88,178,103,219,169,189,77,71],"channelList":[25]},"serialPort":{"baudRate":115200,"path":"tcp://192.168.0.228:6638"},"databasePath":"/opt/zigbee2mqtt/data/database.db","databaseBackupPath":"/opt/zigbee2mqtt/data/database.db.backup","backupPath":"/opt/zigbee2mqtt/data/coordinator_backup.json","adapter":{"disableLED":false,"concurrent":null,"delay":null}}'
[2024-07-04 05:47:40] info: zh:zstack:znp: Opening TCP socket with 192.168.0.228:6638
[2024-07-04 05:47:43] info: zh:zstack:znp: Socket error
[2024-07-04 05:47:43] error: z2m: Error while starting zigbee-herdsman
[2024-07-04 05:47:43] error: z2m: Failed to start zigbee
[2024-07-04 05:47:43] error: z2m: Check https://www.zigbee2mqtt.io/guide/installation/20_zigbee2mqtt-fails-to-start.html for possible solutions
[2024-07-04 05:47:43] error: z2m: Exiting...
[2024-07-04 05:47:43] error: z2m: Error: Error while opening socket
at Socket. (/opt/zigbee2mqtt/node_modules/zigbee-herdsman/src/adapter/z-stack/znp/znp.ts:191:24)
at Socket.emit (node:events:531:35)
at emitErrorNT (node:internal/streams/destroy:169:8)
at emitErrorCloseNT (node:internal/streams/destroy:128:3)
at processTicksAndRejections (node:internal/process/task_queues:82:21)

/opt/zigbee2mqtt/node_modules/winston/node_modules/readable-stream/lib/_stream_writable.js:264
var er = new ERR_STREAM_WRITE_AFTER_END();
^
Error: write after end
at writeAfterEnd (/opt/zigbee2mqtt/node_modules/winston/node_modules/readable-stream/lib/_stream_writable.js:264:12)
at DerivedLogger.Writable.write (/opt/zigbee2mqtt/node_modules/winston/node_modules/readable-stream/lib/_stream_writable.js:300:21)
at DerivedLogger.log (/opt/zigbee2mqtt/node_modules/winston/lib/winston/logger.js:231:12)
at Logger.log (/opt/zigbee2mqtt/lib/util/logger.ts:188:25)
at Logger.info (/opt/zigbee2mqtt/lib/util/logger.ts:201:14)
at Znp.onPortClose (/opt/zigbee2mqtt/node_modules/zigbee-herdsman/src/adapter/z-stack/znp/znp.ts:113:16)
at Object.onceWrapper (node:events:634:26)
at Socket.emit (node:events:519:28)
at TCP. (node:net:338:12)

image

@habitats-tech
Copy link
Author

@habitats-tech if you're going to test the watchdog, could you report back how usable Z2M is? Im already reverting back to 1.38.0, but after an update to 1.39.0 my Z2M and UZG-1 both needed restart every 20 minutes or so.

I have found 1.39.0 more stable. To my observations the Z2M version number has nothing to do with disconnections. However, so far I have failed to find a pattern. I have the exact config in 2 different locations, one does not fail at all, the other disconnects every 100s and automatically reconnects every 10s.

In one of the installations I had frequent disconnections, and after plugging and unplugging the UZG several times trying different things it stopped disconnecting. I have failed to find why.

Still in the process of investigating.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
problem Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants