node_timex_sync_status metric indicates that NTP is not syncing #1183

Closed
errm opened this issue Oct 26, 2020 · 2 comments · Fixed by #1184

errm commented Oct 26, 2020

Image I'm using:

bottlerocket-aws-k8s-1.18-x86_64-v1.0.2-ddeb03c8 - ami-00e1ed6a3ebe06fd0

What I expected to happen:

We are running the Prometheus node exporter as a Kubernetes DaemonSet.

We expected the node_timex_sync_status metric to be 1 if NTP is working properly.

What actually happened:

The NodeClockNotSynchronising alert on our Prometheus instance fires because node_timex_sync_status is 0: https://github.com/prometheus/node_exporter/blob/v1.0.1/docs/node-mixin/alerts/alerts.libsonnet#L237-L249

It seems like the clock on our instances is set correctly (but this doesn't prove that NTP is working correctly)

There are no logs indicating a problem from either the node exporter or chronyd on the host.

It's hard to debug this issue as there is no chronyc available when using the admin container to debug the host!

How to reproduce the problem:

Run the prometheus node exporter e.g. https://github.com/prometheus-operator/kube-prometheus/blob/v0.6.0/manifests/node-exporter-daemonset.yaml

And check the node_timex_sync_status metric
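To check the value outside of Prometheus, the node exporter's text output can be parsed directly. The following is a minimal sketch, not part of the original report: the `read_metric` helper is hypothetical, and the `SAMPLE` text is an illustrative stand-in for what the exporter's /metrics endpoint might return.

```python
def read_metric(exposition_text, name):
    """Return the first sample value for `name`, or None if it is absent.

    Handles only simple sample lines of the form
    "<name>[{labels}] <value> [timestamp]" with no spaces inside labels,
    which is enough for an unlabelled gauge like node_timex_sync_status.
    """
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        metric, _, rest = line.partition(" ")
        base = metric.split("{", 1)[0]
        if base == name and rest:
            return float(rest.split()[0])
    return None


# Illustrative sample only; real output would come from the exporter.
SAMPLE = """\
# TYPE node_timex_sync_status gauge
node_timex_sync_status 0
"""

value = read_metric(SAMPLE, "node_timex_sync_status")
print("synchronised" if value == 1 else "NOT synchronised")
```

In practice the text would be fetched from the exporter's metrics endpoint on the node rather than from a hard-coded string.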

errm commented Oct 26, 2020

I discovered that chrony doesn't set the timex sync flag unless the rtcsync option is set.
See: https://github.com/prometheus/node_exporter/blob/v1.0.0-rc.0/docs/TIME.md#timex-collector
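For context, here is a rough sketch of how the kernel's timex state relates to the metric. The constants `STA_UNSYNC` (0x0040) and `TIME_ERROR` (5) come from the Linux timex headers. The mapping in `sync_status` is my assumption about what the timex collector does, based on the linked TIME.md, and the function names are hypothetical.

```python
STA_UNSYNC = 0x0040  # timex status bit: clock is unsynchronised
TIME_ERROR = 5       # adjtimex(2) return value: clock not synchronised


def unsync_flag_set(status):
    """True if the STA_UNSYNC bit is set in a timex status word."""
    return bool(status & STA_UNSYNC)


def sync_status(adjtimex_return):
    """Assumed metric mapping: 0 when the kernel reports TIME_ERROR, else 1."""
    return 0 if adjtimex_return == TIME_ERROR else 1


# Example status words (illustrative values, not taken from a real host).
without_rtcsync = 0x2041  # includes STA_UNSYNC
with_rtcsync = 0x2001     # STA_UNSYNC cleared

print(unsync_flag_set(without_rtcsync), unsync_flag_set(with_rtcsync))
print(sync_status(TIME_ERROR), sync_status(0))
```

Under this reading, a chronyd without rtcsync keeps the clock accurate but leaves the unsync flag set, so the metric stays at 0 even though the time is correct.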

errm added a commit to errm/bottlerocket that referenced this issue Oct 26, 2020
This directive enables kernel synchronisation (every 11 minutes) of the real-time clock.

It also causes chrony to update the timex sync flag, which is commonly
used to monitor clock synchronisation, e.g. via the prometheus node
exporter.

Fixes bottlerocket-os#1183

shamaloy commented Sep 7, 2023

For ntpd: run the ntpd daemon without the -x option.

Edit the ntpd configuration file and remove the -x option

vi /etc/sysconfig/ntpd

Restart ntpd.service

systemctl restart ntpd.service

For chronyd: add the rtcsync option to /etc/chrony.conf if it is not already there (it is included in the default configuration), then restart chronyd.service.

systemctl restart chronyd.service
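Before editing the file, it may help to confirm whether rtcsync is actually active rather than commented out. This is a minimal sketch: `has_directive` is a hypothetical helper, the sample config is illustrative, and it assumes chrony treats lines starting with `#`, `%`, `;` or `!` as comments and matches directive names case-insensitively.

```python
COMMENT_PREFIXES = ("#", "%", ";", "!")  # assumed chrony.conf comment markers


def has_directive(conf_text, directive):
    """True if `directive` appears as an active (uncommented) line."""
    wanted = directive.lower()
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(COMMENT_PREFIXES):
            continue  # blank line or comment
        if line.split()[0].lower() == wanted:
            return True
    return False


# Illustrative config text; on a real host, read /etc/chrony.conf instead.
SAMPLE_CONF = """\
pool 2.pool.ntp.org iburst
driftfile /var/lib/chrony/drift
#rtcsync
"""

if not has_directive(SAMPLE_CONF, "rtcsync"):
    print("rtcsync is not enabled; add it and restart chronyd.service")
```

On a real host the text would be read with `open("/etc/chrony.conf").read()`, assuming the file lives at the path given above.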
