-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add probe_icmp_reply_ttl_total #694
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All codepaths here will need manual testing on both Linux and Windows.
f5d07a1
to
28c5059
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks better, this will all need manual testing as we've been caught out before.
New push:
|
That push didn't seem to change anything, can you try again? |
6e8cf46
to
70673e8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I take it you've manually tested the various code paths?
I tested both a working and not working case for IPv4 (raw and not), IPv6 and dns. I forced 'dontFragment' flag modifying the code as I could not pass it using URL variables. Do I need to test anything else? The raw ping from windows failed as v4RawConn.WriteTo returns "write ip 0.0.0.0->8.8.8.8: sendmsg: not implemented on windows/amd64". That's the only code path that still calls the same functions as I only removed the v4conn wrapper. It looks like it is expected: I guess RawConn does not have any use in Windows as both {Read/Write}(Batch,To,From} are not implemented yet. BTW, the timeout for non-existing targets is quite long (duration_seconds=119.500175799, about 2min) for both protocols and both systems. At ts=2020-10-01T17:35:09.042Z deadline is 2020-10-01T14:37:08.541967114-03:00. It's quite a long time for an ICMP answer. |
New push:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forced 'dontFragment' flag modifying the code as I could not pass it using URL variables.
It comes from the config file.
Do I need to test anything else?
The unprivileged sockets should also be tested.
BTW, the timeout for non-existing targets is quite long
It used to be much shorter, however that made it difficult to test slow targets from the browser.
New Push:
|
prober/icmp.go
Outdated
requestType icmp.Type | ||
replyType icmp.Type | ||
icmpConn *icmp.PacketConn | ||
v4RawConn *ipv4.RawConn | ||
hopLimit float64 = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The = 0 is redundant, however it'd be cleaner to set it to -1 here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be either -1 or 0. I just need to set it to -1 if SetControlMessage fails or 0 if it works. As I don't use the "else" when I check SetControlMessage , I opted to use 0 by default and set it to -1 inside the existing block.
What would be the better option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A separate bool would be cleaner I think, rather than overloading this variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was the first approach. Updated.
prober/icmp.go
Outdated
if dstIPAddr.IP.To4() == nil { | ||
var cm *ipv6.ControlMessage | ||
n, cm, peer, err = icmpConn.IPv6PacketConn().ReadFrom(rb) | ||
if cm != nil && hopLimit >= 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the check on hopLimit here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hopLimit==-1 would tell me if SetControlMessage failed. In this cade, TTL/HopLimit info might exist (when there is any OOB data) but it is trash. It would be much easier if hopLimit == 0 wasn't a valid state.
var n int | ||
var peer net.Addr | ||
var err error | ||
var hopLimit float64 = -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still messy, as there's now two hasHopLimit indicators doing different things. It'd be better to name the bool what is actually, namely whether the CM was set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The real issue is that go does not tell me if hopLimit is missing or zero. I opened an issue golang/go#41820.
I renamed the bool to ipv6HopLimitFlagSet and now I use it only for IPv6. If golang fixes the issue, it can be removed in the future to adapt to whatever go uses to inform hopLimit is missing.
Another solution is to assume that hopLimit will never be 0 in a echo-reply. I assumed that hop limit could be zero when it is using the last permitted hop. However, I was wrong as a packet with hopLimit==1 will not get forwarded. The IPv6 echo reply will only be 0 if the target OS is explicitly setting the answer to be 0 (maybe a way to avoid its traffic to be routed?). As it is a very very special case, we might ignore it.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably best not to make assumptions about the TTL in a tool like this, as if that weird situation does happen you wouldn't want to lead users down the wrong path. It's probably best to use the flag for both protocols.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Let's play safe. I reintroduced the bool check for both.
|
IPv6 hop limit (IPv4 TTL) tells how many remaining hops the reply datagram has. When the target uses a well known initial hop limit, it can be used to count the number of hops from the target to the prober and detect routing problems. Signed-off-by: Luiz Angelo Daros de Luca <[email protected]>
Thanks! |
Thank YOU for Prometheus! |
IPv4 TTL (IPv6 hoplimit) identifies the number of routers used by
an arriving packet. It can be used to detect routing problems.
Signed-off-by: Luiz Angelo Daros de Luca [email protected]