hubble: Populate traffic direction for trace and drop events #11062
Conversation
test-me-please
// We need to access the connection tracking result from the `Reason`
// field to invert the direction for reply packets. The `Reason` field
// is populated for TRACE_TO_{LXC,HOST,STACK,PROXY} events.
// TRACE_TO_PROXY events happen for both ingress and egress flows,
> TRACE_TO_PROXY events happen for both ingress and egress flows

So do `TraceToLxc`, `TraceToStack`, and `TraceToHost` flows, depending on whether the endpoint, remote peer, or host initiates the connection or is responding to it. `decodeIsReply()` below appears to rely on connection-tracking state to determine whether the traffic is a reply or not, so this seems to be more to do with whether CT runs on the particular path through the datapath or not.

I'd expect that for us to decide to send something to the proxy we'd need CT state, so we should be able to provide the same visibility, though it's possible that for historic reasons the trace messages don't always contain this information.

I guess this brings us to, basically: it looks like these observation points reliably provide the information today, and while other observation points might be able to provide the info, they may or may not be reliably providing it with the current implementation. If this PR is improving the ability to detect flow direction, then great. I wouldn't be surprised if we need to come back and revisit this for other observation points, though (for instance, we're missing stuff like `ToOverlay`, which is a common deployment setup but will handle traffic from local endpoints OR the local host, so there's no guarantee that we run the connection tracker).
Thanks a lot for the feedback @joestringer! Much appreciated.

> TRACE_TO_PROXY events happen for both ingress and egress flows
>
> So do `TraceToLxc`, `TraceToStack`, `TraceToHost` flows depending on whether the endpoint, remote peer or host initiates the connection or is responding to the connection.

You are, of course, absolutely correct. I somehow completely overlooked the obvious fact that I also have to look at the source/destination of the traffic to determine its direction. The code in this PR is not sufficient to determine the traffic direction 🤦♂️ And yes, I should also be able to support `ToProxy`, since it reliably contains the CT result. I'm putting this PR back into draft mode.

FWIW: the reason why `ToOverlay` is absent from the list is that it doesn't seem to actually populate the CT reason in the `send_trace_notify` call (second-to-last argument):
Lines 138 to 139 in ae5588c:

```c
send_trace_notify(ctx, TRACE_TO_OVERLAY, seclabel, 0, 0, ENCAP_IFINDEX,
		  0, monitor);
```
Okay, I have reworked the code to actually infer the traffic direction. It now compares the endpoint ID behind the actual source of the IP packet with the endpoint ID of the trace event to determine the traffic direction.

Note that I still consider the traffic direction information best-effort. I have added support for `DropNotify` as well, with the (in reality not always valid) assumption that we don't need connection tracking there, since for dropped packets there is (probably) no ongoing connection. However, since we don't have the connection tracking state available for drop events anyway, there is not much we can do otherwise.
> It is now comparing the endpoint ID behind the actual source of the IP packet with the endpoint ID of the trace event to determine the traffic direction.

I don't quite follow this part. A packet can be coming from or going to arbitrary peers, but that doesn't tell you whether the connection was initiated from there or somewhere else. That's why we bring in the connection tracker, which stores state that tells you, based upon a 5-tuple lookup, the originally-seen packet 5-tuple (and hence the direction when compared to the current tuple).

> However, since we don't have the connection tracking state available for drop events anyway, there is not much we can do otherwise.

On average, for drops we can roughly assume they're forward direction, because we're dropping them and it's hard to sustain a communication channel if one of the directions is dropped :) ...but yeah, in corner cases like a policy transition from allow -> deny, or other weird cases like a Cilium restart (where the proxy connection gets disrupted), drops can get tricky to classify with 100% confidence.
Looking at the implementation I think we're on the same page on these points.
Awesome, thanks for the review and input @joestringer!

You have made me think about some more cases that I hadn't considered. For example, I'm currently only comparing the source address of the packet. I wonder if I should check for `tn.Source == dstEP` as well (and assign traffic direction `unknown` if neither source nor destination is the endpoint at which the event occurred). I also have not considered whether NAT has an impact on correctness here. Maybe the proper way would be to just extend the trace points in the datapath with a direction flag, instead of trying to re-implement the logic here.

For now, I think I'm going to leave it as is. I might revisit the approach in the future, but I currently don't want to allocate too many cycles to a best-effort feature.
Signed-off-by: Sebastian Wicki <[email protected]>
Where possible, populate the `TrafficDirection` field from the trace and drop notifications. This is done by comparing the `Source` endpoint field of the trace/drop event with the source address of the captured IP packet.

For drop notifications, we assume there are no ongoing connections to which this packet belongs and just compare the source address. For trace notifications, however, we do assume that there might be an ongoing connection. Therefore, we rely on the connection tracking state to invert the traffic direction for reply packets. For trace observation points without connection tracking, we do not assign any traffic direction.

Signed-off-by: Sebastian Wicki <[email protected]>
force-pushed from f60216c to fa7f33e
test-me-please
Pulling in the datapath team for a quick review as well to confirm that.
/cc @michi-covalent