possible issues with testing/metrics for uptime? #141

Closed
jamiew0w opened this issue Aug 24, 2021 · 4 comments

Comments

@jamiew0w
Contributor

hello @CorralPeltzer

i hope you're keeping well. i check in on newtrackon every now and again to check how uptime looks for my trackers (compared to other trackers) and i noticed the uptime for basically all trackers is a lot shorter than it used to be.

for example, there's one tracker at the moment (hosted on amazon) that has 4 months uptime, then next highest is 11 days.

about a year ago it was normal to see trackers with several months uptime. for my own tracker, i'm not seeing any issues locally or with my own monitoring so i'm curious why it's showing 93% availability.

could there be an issue with how newtrackon is polling trackers?

@CorralPeltzer
Owner

Hi! The uptime of a tracker in newTrackon can only be as good as the stability of the connection between the server that hosts newTrackon and the tracker. https://newtrackon.com/ is running on a cheap VPS at Hetzner, which might not be the most stable network. It seems there were some network issues 22 days ago, but there are many trackers above 99% uptime.

Since your tracker is UDP, a 93% uptime would mean 7% of UDP packets are being dropped somewhere between Hetzner and your network, BuyVM. AFAIK, there are no widespread reports of packet loss in Hetzner. #80 should also improve the uptime in this case.
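A rough back-of-the-envelope sketch of how a check failure rate relates to per-packet loss. It assumes a check fails when any one of its packets is lost, that losses are independent, and that nothing is retried. The default of four packets per check (connect and announce, each a request plus a response, as in the BEP 15 UDP tracker protocol) is an assumption for illustration, not a confirmed detail of how newTrackon checks trackers.

```python
def check_failure_rate(packet_loss: float, packets_per_check: int = 4) -> float:
    """Probability that a check fails, given independent per-packet loss."""
    # The check succeeds only if every packet arrives.
    return 1.0 - (1.0 - packet_loss) ** packets_per_check


def packet_loss_for(failure_rate: float, packets_per_check: int = 4) -> float:
    """Per-packet loss that would produce the observed check failure rate."""
    # Inverse of check_failure_rate under the same assumptions.
    return 1.0 - (1.0 - failure_rate) ** (1.0 / packets_per_check)


if __name__ == "__main__":
    observed_failures = 0.07  # 93% uptime reported for the tracker
    for n in (1, 4):
        loss = packet_loss_for(observed_failures, n)
        print(f"{n} packet(s) per check: about {loss:.1%} per-packet loss")
```

With one packet per check the two rates are identical (7%). With four packets per check, 7% failed checks would correspond to roughly 1.8% per-packet loss under these assumptions.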

I'd recommend monitoring your tracker from several locations, either by running your own newTrackon instances or with other tools.

@jamiew0w
Contributor Author

Hi @CorralPeltzer, sorry for the late reply.

Thanks for the extra information, that's useful. I'll see if I can narrow it down. Unfortunately, I haven't been able to reproduce it in my own testing. Would you happen to have any examples of the errors newtrackon encountered when testing my tracker?

Appreciated and thanks for your time!

@CorralPeltzer
Owner

Hi! All the failed checks from newTrackon to tracker.leech.ie in September (32 checks) have been because of UDP timeouts.
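To try to reproduce a UDP timeout from another vantage point, a minimal probe could look like the sketch below. It sends only the BEP 15 connect request and waits for the reply. This is not newTrackon's actual check: the function names, the 10 second timeout, and the choice to stop after the connect step are all assumptions made for illustration.

```python
import random
import socket
import struct

PROTOCOL_ID = 0x41727101980  # magic constant defined by BEP 15
ACTION_CONNECT = 0


def build_connect_request(transaction_id: int) -> bytes:
    """16-byte connect request: protocol id, action, transaction id."""
    return struct.pack(">QII", PROTOCOL_ID, ACTION_CONNECT, transaction_id)


def parse_connect_response(data: bytes, transaction_id: int) -> int:
    """Validate a connect response and return its connection id."""
    if len(data) < 16:
        raise ValueError("response too short")
    action, tid, connection_id = struct.unpack(">IIQ", data[:16])
    if action != ACTION_CONNECT or tid != transaction_id:
        raise ValueError("unexpected action or transaction id")
    return connection_id


def probe(host: str, port: int, timeout: float = 10.0) -> str:
    """Return "ok", "timeout", or "error: ..." for one connect attempt."""
    tid = random.getrandbits(32)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_connect_request(tid), (host, port))
            data, _ = sock.recvfrom(2048)
            parse_connect_response(data, tid)
            return "ok"
        except socket.timeout:  # must come before OSError, its parent class
            return "timeout"
        except (OSError, ValueError) as exc:
            return f"error: {exc}"
```

Calling `probe(host, port)` in a loop from a few different networks and logging the timestamps of each `"timeout"` would show whether the losses line up with the failed checks. The host and port are whatever the tracker actually listens on.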

I have also created an ICMP test in DataDog from all available locations for tracker.leech.ie. More info at https://docs.datadoghq.com/synthetics/api_tests/icmp_tests/. This is the last 7 days:

[image: DataDog ICMP test results for tracker.leech.ie, last 7 days]

4 ICMP packets are sent from each location every 30 seconds, and a check counts as downtime when there's any packet loss or latency above 1 second. Most downtime events only come from one or two regions, but there are still some events of widespread loss from all regions, which points to packet loss on your side.
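The rule above can be sketched roughly as follows. This is only an illustration of the logic, not DataDog's implementation: the data layout and location names are hypothetical, and treating "latency above 1 second" as applying to each individual packet is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

MAX_LATENCY_S = 1.0  # latency threshold from the test configuration


@dataclass
class ProbeResult:
    # One entry per ICMP packet sent (4 per run); None means the packet was lost.
    rtts: List[Optional[float]]


def is_down(result: ProbeResult) -> bool:
    """Down if any packet was lost or any round trip exceeded the threshold."""
    return any(rtt is None or rtt > MAX_LATENCY_S for rtt in result.rtts)


def classify(results: Dict[str, ProbeResult]) -> str:
    """Summarise one 30-second round across all probing locations."""
    down = [loc for loc, res in results.items() if is_down(res)]
    if not down:
        return "up"
    if len(down) == len(results):
        return "widespread"  # every location saw loss: likely the tracker side
    return "regional"  # only some locations: likely a path-specific problem


if __name__ == "__main__":
    round_results = {
        "location-a": ProbeResult([0.021, 0.020, 0.022, 0.021]),
        "location-b": ProbeResult([0.110, None, 0.108, 0.112]),
    }
    print(classify(round_results))  # one location lost a packet
```

Separating "regional" from "widespread" rounds is what makes it possible to tell a problem on one network path apart from loss close to the tracker itself.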

The last ones happened on Sep 19, 14:06 UTC and on Sep 18, 04:39 UTC.

This is just ICMP loss, so we cannot reliably extrapolate it to UDP loss. Assuming Hetzner also has some congestion, the best way to improve your uptime would be implementing #80.

@jamiew0w
Contributor Author

fantastic, thank you very much!
