Bug: Restarting tenderduty while an alert is active causes that alert to get stuck forever and never produce any new alerts (Critical) #65

Open
upnodedev opened this issue May 13, 2023 · 0 comments


Problem

Restarting tenderduty while an alert is active causes that alert to get stuck forever, and no new alerts are ever produced.

image

image

The screenshots above clearly show that the condition behind "xxx has missed 10 blocks on mocha" has already been resolved. However, tenderduty still shows the alert as unresolved, and neither PagerDuty nor Telegram receives a resolve event.

Moreover, tenderduty never produces the "missed 10 blocks" alert again afterwards.

Impact

The alert stays stuck forever and no new alert is ever triggered. This puts our validator node at risk of being jailed and slashed. (Critical)

Steps to reproduce

  1. Run both tenderduty and the validator normally
  2. Stop the validator and wait for tenderduty to trigger the alert "ALERT: - xxx has missed xx blocks on xxx"
  3. Stop tenderduty
  4. Restart tenderduty
  5. Restart the validator
  6. Notice that the alert doesn't get resolved
  7. Repeat these steps and notice that a new alert never gets sent
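
For reference, the behavior expected from the steps above can be sketched as a small state model. This is a hypothetical illustration only, not tenderduty's actual code: the `alertTracker` type, the `restore` and `observe` functions, and the threshold of 3 are made up for the example, and it assumes an alert resolves as soon as the validator signs a block again. The point is that a tracker restored with an active alert should still emit a resolve event on recovery and a fresh alert on the next run of missed blocks.

```go
package main

import "fmt"

// alertTracker is a hypothetical model, not a tenderduty type.
type alertTracker struct {
	threshold int  // consecutive missed blocks that trigger an alert
	missed    int  // consecutive missed blocks observed since (re)start
	active    bool // true while an alert has been sent and not yet resolved
}

// restore rebuilds a tracker after a restart from a persisted "active" flag.
// The missed counter starts at zero because it was only held in memory.
func restore(threshold int, persistedActive bool) *alertTracker {
	return &alertTracker{threshold: threshold, active: persistedActive}
}

// observe records one block and returns the event to send, if any:
// "alert", "resolve", or "" when nothing changes.
func (t *alertTracker) observe(signed bool) string {
	if signed {
		t.missed = 0
		if t.active {
			// Recovery must clear the restored alert, even though this
			// process never saw the misses that raised it.
			t.active = false
			return "resolve"
		}
		return ""
	}
	t.missed++
	if !t.active && t.missed >= t.threshold {
		t.active = true
		return "alert"
	}
	return ""
}

func main() {
	// tenderduty restarted while an alert was active (steps 3 and 4).
	t := restore(3, true)
	// Validator still down, then back up (step 5), then down again (step 7).
	for _, signed := range []bool{false, true, false, false, false} {
		fmt.Printf("signed=%v event=%q\n", signed, t.observe(signed))
	}
}
```

In this model the second observation yields "resolve" and the fifth yields "alert", which is what steps 6 and 7 expect but do not see.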