// We do not delete IPs from nodeIPLatencyMap as part of the Node delete event handler
// to avoid consistency issues and because it would not be sufficient to avoid stale entries completely.
// This means that we have to periodically invoke DeleteStaleNodeIPs to avoid stale entries in the map.
m.latencyStore.DeleteStaleNodeIPs()
m.report()
I believe this is not ideal, because when outputting the NodeLatencyStats, the values for the lastRecvTime and lastSendTime
fields can be a bit confusing / misleading:
We are "always" going to have lastRecvTime < lastSendTime, because we always update NodeLatencyStats right after sending a new probe (before the response has had a chance to be received). Ideally, especially with very low inter-Node latency like we have here (a few ms), we would most of the time observe timestamps which are very close to each other / identical. This can be achieved by giving the NodeLatencyMonitor enough time to receive / process the response before calling m.report().
Another advantage of decoupling the sending of probes from the latency reporting would be the ability to enforce a minimum time interval between two consecutive reports. At the moment it is possible for someone to set pingIntervalSeconds to 1s (minimum supported value in the NodeLatencyMonitor CRD). In turn, this would cause m.report() to be invoked every second. That may be a bit too frequent for a monitoring tool, especially for a large cluster. So we could consider enforcing a minimum interval of 10s (even though that would mean that values of pingIntervalSeconds under 10s are not very useful).
@antoninbas Kindly check: I have raised a PR regarding this issue. I have tried to decouple the PingTicker and the ReportTicker. Your guidance on this would be highly appreciated.
At the moment, the NodeLatencyMonitor in the Agent reports latency measurements immediately after sending ICMP probes:
antrea/pkg/agent/monitortool/monitor.go
Lines 444 to 451 in 1907856