Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tweak NTP error reporting #2125

Merged
merged 18 commits into from
Jan 12, 2025
Merged

Tweak NTP error reporting #2125

merged 18 commits into from
Jan 12, 2025

Conversation

DL6ER
Copy link
Member

@DL6ER DL6ER commented Nov 30, 2024

What does this implement/fix?

This PR has three changes regarding the NTP implementation:

  1. If there was a timeout, we log the used IP address in addition to the host name. This may be useful to determine if only specific servers cause timeouts (as suggested by @deHakkelaar)

  2. Do not issue double warnings when receiving NTP replies fails, e.g. skip the first of these two:

    2024-11-28 05:28:35.590 CET [25437/T25447] WARNING: Could not recv() in reply() (/app/src/ntp/client.c:223): Resource temporarily unavailable
    2024-11-28 05:28:35.590 CET [25437/T25447] ERROR: Failed to receive data from NTP server europe.pool.ntp.org: Timeout
    
  3. In case the upstream server sends a "kiss code", we interpret and show it (e.g. "access denied" when authentication is required or "rate exceeded" when querying too often). Before, we simply logged

    Received NTP reply has invalid version, ignoring
    

    in such a case.


Related issue or feature (if applicable): N/A

Pull request in docs with documentation (if applicable): N/A


By submitting this pull request, I confirm the following:

  1. I have read and understood the contributors guide, as well as this entire template. I understand which branch to base my commits and Pull Requests against.
  2. I have commented my proposed changes within the code.
  3. I am willing to help maintain this change if there are issues with it later.
  4. It is compatible with the EUPL 1.2 license
  5. I have squashed any insignificant commits. (git rebase)

Checklist:

  • The code change is tested and works locally.
  • I based my code and PRs against the repositories developmental branch.
  • I signed off all commits. Pi-hole enforces the DCO for all contributions
  • I signed all my commits. Pi-hole requires signatures to verify authorship
  • I have read the above and my PR is ready for review.

@DL6ER DL6ER changed the title Tweak/ntp errors Tweak NTP error reporting Nov 30, 2024
@pralor-bot
Copy link

This pull request has been mentioned on Pi-hole Userspace. There might be relevant details there:

https://discourse.pi-hole.net/t/no-valid-ntp-replies-received-check-server-and-network-connectivity/74237/12

@DL6ER DL6ER requested a review from a team December 6, 2024 18:40
@DL6ER DL6ER marked this pull request as ready for review December 6, 2024 18:40
@pralor-bot
Copy link

This pull request has been mentioned on Pi-hole Userspace. There might be relevant details there:

https://discourse.pi-hole.net/t/no-valid-ntp-replies-received-check-server-and-network-connectivity/74237/28

@sawsanders
Copy link

sawsanders commented Dec 16, 2024

Working well for me. No issues in two days of testing!

EDIT: See below, spoke too soon.

@yubiuser
Copy link
Member

There is an reported crash on with this branch
https://discourse.pi-hole.net/t/pi-hole-crashed-shortly-after-upgrade/74553

@DL6ER
Copy link
Member Author

DL6ER commented Dec 16, 2024

This crash happened somewhere in the DNS resolver thread, not code that is even remotely touched by this PR. So this may be 100% coincidence.

@DL6ER
Copy link
Member Author

DL6ER commented Dec 17, 2024

Nonetheless, I went through the resolver code line-by-line and added quite a few extra checks wherever possible. I don't think there was an issue but now everything should be even better because we are not believing in large enough buffers but actually check that before working on them.

…ERROR message from showing up in the logs

Signed-off-by: DL6ER <[email protected]>
…c was sufficiently far off (> 0.1 s) to postpone the start of the embedded NTP server

Signed-off-by: DL6ER <[email protected]>
…pile-time constant) and enlarge retry interval to 10 minutes (or ntp.sync.interval if smaller)

Signed-off-by: DL6ER <[email protected]>
src/ntp/client.c Outdated Show resolved Hide resolved
@DL6ER DL6ER requested a review from yubiuser December 22, 2024 08:16
@sawsanders
Copy link

This branch has been working very well for me. I haven't seen any issues since the last commits.

Copy link

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Copy link

Conflicts have been resolved.

@yubiuser
Copy link
Member

This is nice, it restarts itself when the time was off too much on startup

2025-01-11 22:21:45.002 CET [196707/T196708] DEBUG_NTP: Successfully opened RTC at "/dev/rtc0"
2025-01-11 22:21:45.003 CET [196707/T196708] INFO: RTC time set to 2025-01-11 21:21:45
2025-01-11 22:21:45.003 CET [196707/T196708] INFO: System time was updated by 2185.4 seconds
2025-01-11 22:21:45.003 CET [196707/T196708] INFO: Restarting FTL: System time updated
2025-01-11 22:21:45.003 CET [196707/T196708] DEBUG_NTP: Local time is too inaccurate, retrying before launching NTP server
2025-01-11 22:21:45.006 CET [196707M] DEBUG_ANY: Received SIGTERM
2025-01-11 22:21:45.007 CET [196707M] INFO: Asked to terminate by "/usr/bin/pihole-FTL -f" (PID 196707, user pihole UID 999)
2025-01-11 22:21:45.008 CET [196707M] DEBUG_ANY: Sending SIGUSR6 to dnsmasq to stop DNS service
2025-01-11 22:21:45.008 CET [196707M] DEBUG_ANY: dnsmasq received signal 41

Copy link
Member

@yubiuser yubiuser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we prevent NTP server start at all if NTP sync is disabled?

2025-01-11 21:45:07.784 CET [196845M] INFO: NTP sync is disabled
2025-01-11 21:45:07.784 CET [196845/T196846] INFO: NTP server listening on 0.0.0.0:123 (IPv4)
2025-01-11 21:45:07.785 CET [196845/T196847] INFO: NTP server listening on :::123 (IPv6)

@DL6ER
Copy link
Member Author

DL6ER commented Jan 12, 2025

Seems like an artificial limitation to me. If they go and disable NTP sync, they can also disable NTP offering. Maybe they have a local GPS attached to the device (I had one once) so don't need to sync upstream but can still offer precise timing for the network.

Copy link
Member

@yubiuser yubiuser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the delay in NTP server startup can be log level info

src/dnsmasq/helper.c Outdated Show resolved Hide resolved
src/ntp/client.c Outdated Show resolved Hide resolved
src/ntp/client.c Show resolved Hide resolved
DL6ER and others added 2 commits January 12, 2025 13:23
Co-authored-by: yubiuser <[email protected]>
Signed-off-by: Dominik <[email protected]>
Co-authored-by: yubiuser <[email protected]>
Signed-off-by: Dominik <[email protected]>
src/ntp/client.c Outdated Show resolved Hide resolved
Signed-off-by: yubiuser <[email protected]>
Copy link
Member

@yubiuser yubiuser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2025-01-12 17:55:11.992 CET [214697/T214698] INFO: Received 8/8 valid NTP replies from de.pool.ntp.org
2025-01-12 17:55:11.993 CET [214697/T214698] INFO: Time offset: 2.922946e+05 ms (excluded 0 outliers)
2025-01-12 17:55:11.993 CET [214697/T214698] INFO: Round-trip delay: 2.545059e+01 ms (excluded 0 outliers)
2025-01-12 18:00:05.000 CET [214697/T214698] INFO: Current RTC time is 2025-01-12 16:55:12
2025-01-12 18:00:05.000 CET [214697/T214698] INFO: RTC time set to 2025-01-12 17:00:05
2025-01-12 18:00:05.001 CET [214697/T214698] INFO: Local time is too inaccurate, retrying in 600 seconds before launching NTP server
2025-01-12 18:10:09.323 CET [214697/T214698] INFO: Received 8/8 valid NTP replies from de.pool.ntp.org
2025-01-12 18:10:09.324 CET [214697/T214698] INFO: Time offset: 1.023293e+00 ms (excluded 1 outliers)
2025-01-12 18:10:09.325 CET [214697/T214698] INFO: Round-trip delay: 2.449826e+01 ms (excluded 1 outliers)
2025-01-12 18:10:10.000 CET [214697/T214698] INFO: Current RTC time is 2025-01-12 17:10:10
2025-01-12 18:10:10.003 CET [214697/T214855] INFO: NTP server listening on 0.0.0.0:123 (IPv4)
2025-01-12 18:10:10.003 CET [214697/T214856] INFO: NTP server listening on :::123 (IPv6)
2025-01-12 19:10:10.329 CET [214697/T214698] INFO: Received 8/8 valid NTP replies from de.pool.ntp.org

@DL6ER DL6ER merged commit bed065c into development Jan 12, 2025
19 checks passed
@DL6ER DL6ER deleted the tweak/ntp_errors branch January 12, 2025 19:17
@pralor-bot
Copy link

This pull request has been mentioned on Pi-hole Userspace. There might be relevant details there:

https://discourse.pi-hole.net/t/shm-errors-in-ftl-log-for-v6-development/75131/3

@PromoFaux PromoFaux mentioned this pull request Feb 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants