
Cyberpower stale commands #91

Closed
mattlward opened this issue Feb 11, 2021 · 47 comments

@mattlward

Problem/Motivation

Container file system is not complete. After the upgrade to 0.6.1 my UPS started going stale. I opened Portainer and consoled into the NUT container to edit /etc/nut/ups.conf and add pollinterval = 15, but neither vi nor nano is in the container.

I did adjust my config this way and am waiting to see if it works.

users:
  - username: upsmonmaster
    password: H8NzOjV30g471PF6TZtr
    instcmds:
      - all
    actions: []
devices:
  - name: DR_UPS
    alias: drups
    driver: usbhid-ups
    port: auto
    pollinterval: 15
    config:
      - vendorid = 0764*
mode: netserver
shutdown_host: 'false'
log_level: debug
shutdown_hassio: 'false'

I can restart NUT without complaints at this point.

I suspect that the editors went away with the Linux change in 0.6.0.

Expected behavior

I expect my ups to stay online.

Actual behavior

Data goes stale after a few hours. This is a CyberPower UPS, and they are known for needing a short pollinterval.

I may need to set DEADTIME 25, which normally lives in /etc/nut/upsmon.conf.
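
(For reference only, since the addon generates its config files itself: in a stock NUT install that is a single directive in /etc/nut/upsmon.conf, with the value in seconds.)

DEADTIME 25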

System Health

version: core-2021.2.3
installation_type: Home Assistant Supervised
dev: false
hassio: true
docker: true
virtualenv: false
python_version: 3.8.7
os_name: Linux
os_version: 4.19.0-13-amd64
arch: x86_64
timezone: America/Chicago


GitHub API: ok
Github API Calls Remaining: 5000
Installed Version: 1.11.2
Stage: running
Available Repositories: 745
Installed Repositories: 10


host_os: Debian GNU/Linux 10 (buster)
update_channel: stable
supervisor_version: supervisor-2021.02.6
docker_version: 20.10.2
disk_total: 218.1 GB
disk_used: 10.0 GB
healthy: true
supported: true
supervisor_api: ok
version_api: ok
installed_addons: Backup Hassio to Google Drive (1.7.2), Dropbox Sync (1.3.0), Duck DNS (1.12.5), FTP (4.0.0), File editor (5.2.0), Log Viewer (0.9.1), RPC Shutdown (2.2), WireGuard (0.5.0), Mosquitto broker (5.1), SSH & Web Terminal (8.0.1), Samba share (9.3.0), TasmoAdmin (0.14.0), motionEye (0.11.0), AdGuard Home (3.0.0), Portainer (1.4.0), Glances (0.11.0), Check Home Assistant configuration (3.6.0), Network UPS Tools (0.6.1), DHCP server (1.2)


dashboards: 1
resources: 3
views: 16
mode: storage

Steps to reproduce

(How can someone else make/see it happen)

Proposed changes

(If you have a proposed change, workaround or fix,
describe the rationale behind it)

@sinclairpaul
Member

The addons aren't designed for manually editing files. I would recommend looking at the upsd_maxage option as well; it can be set in the addon config.

@mattlward
Author

Paul, I just noticed that in the docs. Is this a proper config? It does load.

users:
  - username: upsmonmaster
    password: H8NzOjV30g471PF6TZtr
    instcmds:
      - all
    actions: []
devices:
  - name: DR_UPS
    alias: drups
    driver: usbhid-ups
    port: auto
    pollinterval: 15
    upsd_maxage: 25
    config:
      - vendorid = 0764*
mode: netserver
shutdown_host: 'false'
log_level: debug
shutdown_hassio: 'false'

@sinclairpaul
Member

upsd_maxage: 25 should be outside the device (i.e. at the same level as devices/mode etc).

Also shutdown_hassio is not valid anymore and can be removed.

@mattlward
Author

Thanks for the info. New config:

users:
  - username: upsmonmaster
    password: H8NzOjV30g471PF6TZtr
    instcmds:
      - all
    actions: []
devices:
  - name: DR_UPS
    alias: drups
    driver: usbhid-ups
    port: auto
    pollinterval: 15
    config:
      - vendorid = 0764*
mode: netserver
shutdown_host: 'false'
upsd_maxage: 25
log_level: debug

I will update if it stays online for 3 to 4 hours. Since the update I have only made it about 2 hours. I am surprised that directly connected hardware is less tolerant than the other units I have connected remotely to RPis; those just never fail.

@mattlward
Author

Does log_level: debug place undue load on anything? It really does provide a lot of info.

@sinclairpaul
Copy link
Member

You can remove log_level: debug; it is rarely that helpful. :)

Please let us know how you get on.

@mattlward
Author

Do I need to worry about the cruft in the /etc/nut config files that was brought forward and now can't be removed? I verified that /etc/nut/ups.conf still has the basic parameters that I have been running forever. Or is that being added to the file by the config in the addon?

@sinclairpaul
Member

The addon creates the config on startup; I'm not sure what you are referring to.

@sinclairpaul
Member

sinclairpaul commented Feb 11, 2021

Actually looking further, you would likely need:

  - name: DR_UPS
    driver: usbhid-ups
    port: auto
    config:
      - vendorid = 0764*
      - alias = drups
      - pollinterval = 15

Assuming those are valid driver options.
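
As far as the generated file goes, each entry under config should just end up as an extra line in that UPS's section of the generated /etc/nut/ups.conf, so the result would look roughly like this (a sketch only, with the same caveat about which options the driver accepts):

[DR_UPS]
  driver = usbhid-ups
  port = auto
  vendorid = 0764*
  alias = drups
  pollinterval = 15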

@mattlward
Author

This config did not work...

users:
  - username: upsmonmaster
    password: H8NzOjV30g471PF6TZtr
    instcmds:
      - all
    actions: []
devices:
  - name: DR_UPS
    driver: usbhid-ups
    port: auto
    config:
      - vendorid = 0764
      - alias = drups
      - pollinterval = 15
mode: netserver
shutdown_host: 'false'
upsd_maxage: 25

Log snippet around the NUT restart:

0.007333	User [email protected] logged into UPS [DR_UPS]
   0.001957	Logged into UPS DR_UPS@localhost
   0.002118	Poll UPS [DR_UPS@localhost] failed - Driver not connected
{"message": "Event nut.ups_event fired."}Network UPS Tools upsmon 2.7.4
   5.002540	Poll UPS [DR_UPS@localhost] failed - Driver not connected
{"message": "Event nut.ups_event fired."}Network UPS Tools upsmon 2.7.4
  10.003294	Poll UPS [DR_UPS@localhost] failed - Driver not connected
  15.003649	Poll UPS [DR_UPS@localhost] failed - Driver not connected

My config from above does not seem to process the pollinterval properly; the UPS becomes stale after about 1 hour and 45 minutes. Restarting NUT restores service.

@sinclairpaul
Member

sinclairpaul commented Feb 11, 2021

I personally have a CyberPower UPS; it has been up and running for the past 5 hours or so of testing with the following:

devices:
  - name: Cyberpower
    driver: usbhid-ups
    port: auto
    config: []
mode: netserver
shutdown_host: 'false'
list_usb_devices: true
upsd_maxage: 25

Looking at the NUT docs, the default poll interval is 2 seconds.

@mattlward
Author

I just stripped mine down to this:

devices:
  - name: DR_UPS
    alias: drups
    driver: usbhid-ups
    port: auto
    config:
      - vendorid = 0764
mode: netserver
shutdown_host: 'false'
list_usb_devices: true
upsd_maxage: 20

Maybe I just made it too complex or too smart. I am more accustomed to running on a Pi and having to configure it by hand.

Will report back.

@mattlward
Author

Well crap, no data already. It is reporting stale data in the log. Will try a system restart; I don't expect a change.

@ricarva

ricarva commented Feb 12, 2021

Same problem here: CyberPower UPS was working properly with the Addon, but now is throwing up "stale data" log messages and the sensors become unavailable.

It's unclear what the trigger is. A manual restart of the Addon seems to get things going, albeit temporarily.

I first noticed this more than two days ago, so it was either brought about by HA 2021.2 or by the NUT Addon's v0.5.0.

Let me know what I can do to provide any more useful information.

@mattlward
Author

@ricarva This morning I fell back to my last stable config; it was version 0.5.0. I did a partial restore a few moments ago. I have noticed that just unplugging and replugging the USB restored service.

I am waiting to see if this test allows the unit to remain up.

@mattlward
Author

I have exceeded 2 hours of good connection after reverting to my 0.5.0 snapshot. That is up longer than I managed on 0.6.0 or 0.6.1, but the long-term stability test will just take time.

@ricarva

ricarva commented Feb 12, 2021

@mattlward Thanks for the heads-up.

I'm still on 0.6.1 and HA 2021.2, but I can confirm your finding that unplugging/replugging the USB restores the service. Let's see for how long.

@sinclairpaul Any idea of where the issue may lie?

Thanks for the help and insights.

@mattlward
Author

mattlward commented Feb 12, 2021

Under 0.6.1 the service was restored for less than 2 hours, no different from restarting NUT. I am on HA 2021.2.3.

I am running on a Lenovo M73, so I have multiple USB controllers; changing to a different controller gave the same short-duration fix.

@mattlward
Author

@ricarva, did you edit the /etc/nut *.conf files under older versions? I did and I know the container gets rebuilt on upgrade, but my ups.conf and upsd.conf files still seemed to contain my old data when changing from 0.5.0 to 0.6.0.

It could be that it only appears that way, if when the system builds those files it reads the data from the addon config file.

@ricarva

ricarva commented Feb 12, 2021

I'm running on an RPi 3B, and my last few attempts saw NUT last less than an hour.

@mattlward

I did not edit the *.conf files.

@mattlward
Author

mattlward commented Feb 12, 2021

Not sure if this means anything... This is a graph of my free memory; the spike was the last time 0.6.1 died and I fell back to 0.5.0. My system has always appeared to have a memory leak, but free memory never really goes below around 5600 and the system never becomes unstable. If not for version changes, I would expect wonderful uptimes; I have seen 16 weeks between reboots.

[image: graph of free memory]

@assices

assices commented Feb 12, 2021

> Same problem here: CyberPower UPS was working properly with the Addon, but now is throwing up "stale data" log messages and the sensors become unavailable.
>
> It's unclear what the trigger is. A manual restart of the Addon seems to get things going, albeit temporarily.
>
> I first noticed this more than two days ago, so it was either brought about by HA 2021.2 or by the NUT Addon's v0.5.0.
>
> Let me know what I can do to provide any more useful information.

Same issue on my side. It began yesterday.
My UPS is a CyberPower VALUE600EILCD, connected by USB to a Raspberry Pi 4 (Hassio).

@sinclairpaul
Member

I just pushed an update to the edge repo, which you are welcome to test; it allows configuring the deadtime parameter. I have been running fine for at least 5 hours with it (although I got that yesterday), with the following config:

devices:
  - name: Cyberpower
    driver: usbhid-ups
    port: auto
    config:
      - pollinterval = 15
mode: netserver
shutdown_host: 'false'
upsd_maxage: 25
upsmon_deadtime: 25

Can I also suggest that when you save the config, you take a quick look at the Supervisor log, as it will report any issues with it. I will continue to test over the next day or so.
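
(For anyone unsure where to find it: the Supervisor panel's System tab shows the Supervisor log, and any warnings about the addon options should appear there right after you hit save.)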

@mattlward
Author

I will try to convert to the edge repo this evening or in the morning.

Thanks for your work and help.

@ricarva

ricarva commented Feb 12, 2021

@sinclairpaul, thanks for the quick turnaround on a possible fix.

I do wonder: do you have an idea of what made the issue manifest when it wasn't a problem in the past?

@mattlward
Author

mattlward commented Feb 12, 2021

@sinclairpaul, now on the latest edge version with the following config:

devices:
  - name: DR_UPS
    alias: drups
    driver: usbhid-ups
    port: auto
    config:
      - vendorid = 0764*
      - pollinterval = 15
mode: netserver
shutdown_host: 'false'
list_usb_devices: true
upsd_maxage: 25
upsmon_deadtime: 25

@mattlward

This comment has been minimized.

@sinclairpaul
Member

> At one time I could send them as a switch using sshpass to send commands into the NUT container via a docker exec. But that no longer works because I have lost system shell access in order to stay healthy and supported.

You can docker exec all you want; however, I think this is getting a little off topic for this issue and is likely better asked on the forums or Discord.

@mattlward
Author

Understood. Just throwing it out there.

@sinclairpaul
Member

So currently I am at 8 hours without an issue; any other updates?

@mattlward
Author

I am at 3:45 on the edge build, still looking good.

@ricarva

ricarva commented Feb 13, 2021

@sinclairpaul still the question holds: why would deadtime need to be set now, when it wasn't a problem in the past?

@sinclairpaul
Member

As mine has been running all night without an issue, I will release and close this out.

> still the question holds: why would deadtime need to be set now, when it wasn't a problem in the past?

Deadtime was already set before; it can now be adjusted. The repo has a changelog, and after spending ~20 hours of my own time on the addon this week, I'm not really going to look any further into it 😉.

It might have been the Debian change or the HA hardware layer change, but I can't do anything about either.

@sinclairpaul
Member

Hopefully fixed with v0.6.2, closing out for now.

@sinclairpaul changed the title from "Stale data and incomplete shell in docker container" to "Cyberpower stale commands" on Feb 14, 2021
@sblantipodi

@sinclairpaul 0.6.2 doesn't solve it here. Can you reopen the issue, please?

@garyak

garyak commented Feb 21, 2021

I'm also seeing stale data errors with v0.6.2.

@sinclairpaul
Member

I'm sorry folks, it is likely your configuration. I have been running for over a week with no issues, and based on the other comments I would suggest it works. To clarify, my config is:

devices:
  - name: Cyberpower
    driver: usbhid-ups
    port: auto
    config:
      - pollinterval = 15
mode: netserver
shutdown_host: 'false'
list_usb_devices: true
upsd_maxage: 25
upsmon_deadtime: 25
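
For anyone mapping these options back to the stock NUT files (a rough sketch only; the addon writes these files itself on startup), upsd_maxage and upsmon_deadtime should correspond to:

MAXAGE 25     # in upsd.conf
DEADTIME 25   # in upsmon.conf

and the pollinterval = 15 entry lands in the [Cyberpower] section of ups.conf.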

@garyak

garyak commented Feb 21, 2021

Alright, I'll duplicate your config, as it should work for me, and let you know how it goes.

@sblantipodi

sblantipodi commented Feb 22, 2021

Same here, testing Paul's config...

@ricarva

ricarva commented Feb 22, 2021

My input on the configuration piece: what made the setup stable for me was setting the poll interval.

The Maxage and Deadtime params, by themselves, were not enough.

Cheers,

@geiseri

geiseri commented Feb 22, 2021

I can confirm here that since the 0.6.2 update AND the upsmon_deadtime change it has worked perfectly for the last few days. Looking at my history, I think this might have been a regression/change/feature with the Supervisor, since I only noticed this after my last update. Either way, it now works like a charm!

@Hyrules

Hyrules commented Feb 22, 2021

I'm having this issue. Here is my config:

devices:
  - name: BR1500G
    driver: usbhid-ups
    port: auto
    config:
      - pollinterval = 15
mode: netserver
shutdown_host: 'false'
upsd_maxage: 25
upsmon_deadtime: 25

Let's see if this fixes it. I had already set the maxage and deadtime without success. I'm on 0.6.2 as well.

@geiseri
Copy link

geiseri commented Feb 22, 2021

I am not sure if it matters, but I have the serial in there because I have multiple UPSes attached.

devices:
  - name: ups_1
    driver: usbhid-ups
    port: auto
    config:
      - serial = "CXEJP2003238"
      - pollinterval = 15
  - name: ups_2
    driver: usbhid-ups
    port: auto
    config:
      - serial = "CTHGO2007041"
      - pollinterval = 15
mode: netserver
shutdown_host: 'false'
log_level: debug
list_usb_devices: true
upsd_maxage: 25
upsmon_deadtime: 25
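
(If it helps anyone else: serial is one of the standard usbhid-ups matching options, alongside vendorid and productid, so each named section should bind to the right physical unit even when several identical UPSes are plugged in.)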

@sblantipodi

Something broke in the recent HA Core; the problem started after the last HA Core update.

@Hyrules

Hyrules commented Feb 23, 2021

In my case it's still working at the moment. The solution was to add pollinterval = 15 in the device -> config option, so my config is working.

@garyak

garyak commented Feb 23, 2021

A couple days now without error. Thanks @sinclairpaul.

@sinclairpaul
Member

Thanks for all the feedback, please feel free to open new issues.

@hassio-addons hassio-addons locked as resolved and limited conversation to collaborators Feb 23, 2021