node_timex_sync_status metric indicates that NTP is not syncing #1183
Comments
I discovered that chrony doesn't set the timex sync flag unless the rtcsync option is set.
This directive enables kernel synchronisation (every 11 minutes) of the real-time clock. It also causes chrony to update the timex sync flag, which is commonly used to monitor clock synchronisation, e.g. via the prometheus node exporter.
Fixes bottlerocket-os#1183
For ntpd: run the ntpd daemon without the -x option. Edit the ntpd configuration file (vi /etc/sysconfig/ntpd) and remove the -x option, then restart the service: systemctl restart ntpd.service
For chronyd: add the rtcsync option to /etc/chrony.conf if it is not already there (it is included in the default configuration), then restart the service: systemctl restart chronyd.service
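To make the chronyd advice above concrete, here is a minimal Python sketch that checks whether a chrony configuration contains an active rtcsync directive. The helper name `has_rtcsync` and the sample configuration are hypothetical and only for illustration; the sketch assumes that lines starting with `#`, `%`, `;` or `!` are comments and that directive names are matched case-insensitively.

```python
# Hypothetical helper for illustration; not part of chrony or Bottlerocket.

# Assumption: chrony treats lines starting with these characters as comments.
COMMENT_PREFIXES = ("#", "%", ";", "!")


def has_rtcsync(conf_text: str) -> bool:
    """Return True if a non-comment line's first word is the rtcsync directive."""
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(COMMENT_PREFIXES):
            continue  # skip blank lines and comments
        # Assumption: directive names are case-insensitive.
        if line.split()[0].lower() == "rtcsync":
            return True
    return False


# Illustrative configuration text, not the Bottlerocket default.
sample = """\
# example chrony.conf
pool pool.example.org iburst
driftfile /var/lib/chrony/drift
"""

print(has_rtcsync(sample))                # False
print(has_rtcsync(sample + "rtcsync\n"))  # True
```

On a real host the text would come from reading /etc/chrony.conf; the check only inspects the file and says nothing about whether chronyd has been restarted since the option was added.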
Image I'm using:
bottlerocket-aws-k8s-1.18-x86_64-v1.0.2-ddeb03c8 - ami-00e1ed6a3ebe06fd0
What I expected to happen:
We are running the Prometheus node exporter as a Kubernetes DaemonSet.
We expected the node_timex_sync_status metric to be 1 if NTP is working properly.
What actually happened:
The NodeClockNotSynchronising alert on our prometheus instance fires: https://github.com/prometheus/node_exporter/blob/v1.0.1/docs/node-mixin/alerts/alerts.libsonnet#L237-L249 because node_timex_sync_status is 0.
It seems like the clock on our instances is set correctly (but this doesn't prove that NTP is working correctly).
There are no logs indicating a problem either on the node exporter, or on chronyd on the host.
It's hard to debug this issue as there is no chronyc available when using the admin container to debug the host!
How to reproduce the problem:
Run the prometheus node exporter e.g. https://github.com/prometheus-operator/kube-prometheus/blob/v0.6.0/manifests/node-exporter-daemonset.yaml
And check the node_timex_sync_status metric.
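As a rough way to carry out the check in the reproduction steps, the following Python sketch pulls the node_timex_sync_status value out of metrics text in the Prometheus exposition format. The function name `timex_sync_status` and the sample text are hypothetical; the sketch assumes the metric is exposed without labels, as a single `name value` line.

```python
from typing import Optional

METRIC_NAME = "node_timex_sync_status"


def timex_sync_status(metrics_text: str) -> Optional[float]:
    """Return the value of node_timex_sync_status, or None if it is absent."""
    for raw_line in metrics_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.split()
        # Assumption: the metric carries no labels, so the name is the first token.
        if len(parts) >= 2 and parts[0] == METRIC_NAME:
            return float(parts[1])
    return None


# Illustrative scrape output, not captured from a real node.
sample = """\
# TYPE node_timex_sync_status gauge
node_timex_sync_status 0
"""

status = timex_sync_status(sample)
if status is None:
    print("metric not found")
elif status == 1:
    print("kernel reports the clock as synchronised")
else:
    print("kernel reports the clock as NOT synchronised")
```

In practice the text would be fetched from the exporter's /metrics endpoint (port 9100 by default). A value of 0 here matches the symptom described in this issue.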