-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NETOBSERV-1255 Adding Raw Packet export capability as Packet Capture Agent(PCA) #113
Conversation
@msherif1234, for awareness --- the idea behind this PR is to become the basis for both PANO (of course) but also for other Packet level network observability requirements, such as "selective" packet mirroring - functionality such as network tracing etc. So generalization and customization of the content are key ( hope this make sense to you ) |
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #113 +/- ##
==========================================
- Coverage 39.56% 32.16% -7.41%
==========================================
Files 31 36 +5
Lines 2391 2960 +569
==========================================
+ Hits 946 952 +6
- Misses 1388 1951 +563
Partials 57 57
Flags with carried forward coverage won't be shown. Click here to find out more.
☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took initial look
I see ur created a parallel agent instead of extending the current one after u check the new config knob can u explain the reasoning ?
The current agent works at the level of flows and flow metrics. With PANO, we wish to use a 'packet' as the unit of communication. The struct elements of flow metrics are no longer required. Hope this helps. |
yeah I am aware of that as u know I went through the same trouble when I prototyped TCPlife span hook but was able to refactor the current agent so it can handle both probably something we need to poll the team on @jotak @eranra @praveingk wdyt ? will be great if we somehow abstract support to perf map as we have other use cases as well that wanted to use it like the one I mentioned above and more will come and will be nice to share this so all hooks need to generate perf event will reuse |
/ok-to-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shach33 can you also remove the temp1.pcap
file which I assume got pushed along with your commits.
pkg/flow/packet_record.go
Outdated
} | ||
|
||
// ReadPacketFrom reads a PacketRecord from a binary source, in LittleEndian order | ||
func ReadRawPacketFrom(reader io.Reader) (*PacketRecord, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you rename this to what it does ReadRawPacketFromBuffer
or just ReadRawPacket
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to ReadRawPacket
I belive yes we can extract out common code and enable one or both functionalities in the agents rather than extending the code which can create code duplication. |
Performed first level refactoring. |
This PR needs a code review. Waiting for @shach33 to commit final version before providing the review |
.gitignore
Outdated
@@ -4,3 +4,5 @@ | |||
bin/ | |||
e2e-logs | |||
ebpf-agent.tar | |||
*.o |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is unclear to me why we would ignore .o
files, I think we shouldn't?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe its a good practice to not include object files in commits. We cant get their diffs, they always have merge problems. But if thats how we do things. I will revert.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
if err != nil { | ||
logrus.WithError(err).Fatal("can't instantiate NetObserv eBPF Agent") | ||
} | ||
if config.EnablePCA { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so this means the agent is started either for flow logs or for PCA, but cannot do both at the same time, right?
Do you see a technical limitation for doing so, or is that just a choice and we could change that any time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. The flows and packets agent are disparate and only one is active at a time. The metric of transfer in flows agent is a flow struct, while with PCA is a packet. We could merge them but that would add multiple unnecessary if checks at every step.
If the team deems this to be a better way to go, I can later create another PR with the merge. As of now, the code is cleanly separated between packet capture and flows agent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think if we are going to keep agent either PCA or Flow logs we should specify that make a single environment like AGENT_TYPE
which can be PCA
or FlowLogs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can discuss this in our calls and make a decision then. In this PR, I would suggest to let it stay the way it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good
@shach33 if I understand correctly your perf measurements, the current ebpf agent (without your PR) consumes twice as much memory than with your PR with PCA disabled? (1=~280MB vs 2=~140MB). This would be good news but is quite unexpected, how can we explain that? Or it could be a wrong measurement? |
For info, I did some tests on my cluster for memory usage: there's indeed an increase with this feature (even when disabled) , I guess essentially due to the map. On my cluster, that's around 10MB increase per pod. Since it is constant, the more loaded is the agent, the more negligible is this overhead. |
Kernel does not support |
bpf/maps_definition.h
Outdated
//PerfEvent Array for Packet Payloads | ||
struct { | ||
__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY); | ||
__type(key, int); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls use u32
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
__type(key, __u32); | ||
__type(value, __u32); | ||
} packet_record SEC(".maps"); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BPF_F_NO_PREALLOC
not avaliable for all map types
bpf/pca.h
Outdated
// Set flag's upper 32 bits with the size of the paylaod and the bpf_perf_event_output will | ||
// attach the specified amount of bytes from packet to the perf event | ||
// https://github.com/xdp-project/xdp-tutorial/tree/9b25f0a039179aca1f66cba5492744d9f09662c1/tracing04-xdp-tcpdump | ||
flags |= (__u64)packetSize << 32; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't there a limit of how much hdr we can send with perf event ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems there is no enforced limit by the kernel and if userspace can't hold it the event will be dropped
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the size is too big, the call to bpf_perf_event_output
will fail and return -EFAULT.
meta.if_index = skb->ifindex; | ||
meta.pkt_len = packetSize; | ||
meta.timestamp = current_time; | ||
if (bpf_perf_event_output(skb, &packet_record, flags, &meta, sizeof(meta))){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the point of checking the return code if either way u return ok ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was an oversight. Fixed it.
} | ||
sourcePort = udp_header->source; | ||
destPort = udp_header->dest; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not right , we can't assume if its not TCP its UDP, we have SCTP, ICMPv4 and ICMPv6 protocols support
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated. Only checks for tcp/udp, for anything else, we return.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sctp will work very similar to ycp and udp unless we see no use case for it ? which isn't true because svc can use sctp protocol taht is why we added it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added SCTP. ICMP packets dont really have port associated to them. So, it would require an update to the filter specifications, which I can do as part of an extension, but may be not with the current PR.
bpf/pca.h
Outdated
} | ||
|
||
if (validate_pca_filter(ip->protocol, (void *)ip + sizeof(*ip), data_end )){ | ||
return attach_packet_payload(data, data_end, skb, current_time); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: u don't use current_time here so u can read it in attach_packet_payload and remove it from here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Fixed.
bpf/pca.h
Outdated
return export_packet_payload(skb); | ||
} | ||
|
||
#endif /* __PCA_H__ */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit pls add line to the end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
bpf/pca.h
Outdated
} | ||
|
||
// Only IPv4 and IPv6 packets captured | ||
if (eth->h_proto != bpf_htons(ETH_P_IP) && (eth->h_proto != bpf_htons(ETH_P_IPV6))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be more efficient instruction wise to do
u16 ethType = bpf_ntohs(eth->h_proto);
if (ethType != ETH_P_IP && ethType != ETH_P_IPV6) {
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
canceler() | ||
}() | ||
return ctx | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did u add this ? for standalone testing ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was to reuse the same code block for terminating both netobserv and PCA.
cmd/netobserv-ebpf-agent.go
Outdated
} | ||
if config.EnablePCA { | ||
if config.PCAFilters == "" { | ||
logrus.Fatal("[PCA] NetObserv eBPF Agent instantiated without filters to identify packets. Please set environment variable PCA_FILTER.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't we treat that as capture everything instead of error ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we treat this for capture everything? That might be extremely resource intensive. This was a design decision we made. This could be updated if the team deems this behavior correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wanted to share one of the PM customers asked use case is to do capture all (just learnt this now) cc'd @jpinsonneau
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 it seems to be a use case now 🙃
Would it work if we capture everything expect our own netobserv traffic ? Let's say with unlimited resources
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to collect and export all packets if the filter (enviroment variable PCA_FILTER) is not set at all.
However if the filter is wrongly set, incorrect format/ non-supported protocols etc, then the agent doesnt export anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shach33 : here's a suggestion as discussed on slack. This is to minimize the memory allocation when PCA isn't used.
pkg/ebpf/tracer.go
Outdated
@@ -117,12 +124,19 @@ func NewFlowFetcher(cfg *FlowFetcherConfig) (*FlowFetcher, error) { | |||
return nil, fmt.Errorf("rewriting BPF constants definition: %w", err) | |||
} | |||
|
|||
//Deleting specs for PCA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
//Deleting specs for PCA | |
//Deleting specs for PCA | |
// Always set pcaRecordsMap to the minimum in FlowFetcher - PCA and Flow Fetcher are mutually exclusive. | |
spec.Maps[pcaRecordsMap].MaxEntries = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure Joel.
if (packetSize < skb->len){ | ||
packetSize = skb->len; | ||
if (bpf_skb_pull_data(skb, skb->len)){ | ||
return TC_ACT_UNSPEC; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we come to this case for TCP DNS have u tried to capture DNS over TCP and measured the CPU usage ? u can simply do dig google.om +tcp
to do DNS over TCP
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure what is happening here, but when I experiment, the requests/response generated via dig google.com +tcp
are also UDP packets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no that is indeed tcp dns I have used it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then, I am able to capture those.
bpf/pca.h
Outdated
u32 packetSize = (__u32)(data_end-data); | ||
|
||
// Record the current time. | ||
__u64 current_time = bpf_ktime_get_ns(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we don't use __types
in the agent code so probably need to stick to the same style
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
if (packetSize < skb->len){ | ||
packetSize = skb->len; | ||
if (bpf_skb_pull_data(skb, skb->len)){ | ||
return TC_ACT_UNSPEC; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no that is indeed tcp dns I have used it
bpf/pca.h
Outdated
// Set flag's upper 32 bits with the size of the paylaod and the bpf_perf_event_output will | ||
// attach the specified amount of bytes from packet to the perf event | ||
// https://github.com/xdp-project/xdp-tutorial/tree/9b25f0a039179aca1f66cba5492744d9f09662c1/tracing04-xdp-tcpdump | ||
flags |= (__u64)packetSize << 32; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems there is no enforced limit by the kernel and if userspace can't hold it the event will be dropped
} | ||
sourcePort = udp_header->source; | ||
destPort = udp_header->dest; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sctp will work very similar to ycp and udp unless we see no use case for it ? which isn't true because svc can use sctp protocol taht is why we added it
pkg/agent/PacketsAgent.go
Outdated
io.Closer | ||
Register(iface ifaces.Interface) error | ||
|
||
LookupAndDeleteMap() map[int][]*byte |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you aren't using this interface ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am. This is the one being used to delete map.
getLen := make([]byte, 4) | ||
packetTimestamp := make([]byte, 8) | ||
// Read IfIndex and discard it: To be used in other usecases | ||
_ = binary.Read(reader, binary.LittleEndian, make([]byte, 4)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a way to show interface name with the collected flow, u already have if_index in thr perf hdr
iface, err := net.InterfaceByIndex(int(eventHdr.IfId))
ifcace.Name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PCA eventually only transmits a packet stream. How do you think we should include the interface information?
logrus.WithField("packets", len(evictingEntries)). | ||
Debug("evicting packets from userspace accounter after reaching cache max length") | ||
c.evict(evictingEntries, out) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do u need the above logs won't some of this will flood the logs ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a debug log. But removed it for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @shach33 , LGTM on my side, I'm leaving it in @msherif1234 / @dushyantbehl hands! :)
destPort = sctp_header->dest; | ||
} | ||
else | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Please add {}
here with else
to keep the code structure same across all C code.
bpf/pca.h
Outdated
} | ||
else | ||
return false; | ||
if (sourcePort == bpf_htons(pca_port) || destPort == bpf_htons(pca_port)){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shach33 nit: you can possibly do bpf_htons(pca_port)
once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added 2 minor things for @shach33.
Let me know when they are done and I can add a lgtm
here.
Leaving for @msherif1234 for taking a final view
@shach33 can we just rebase this ? then we are good to merge 😸 |
@shach33 why the changes in main is showing up in ur PR as new diffs ? |
@msherif1234 This is fixed now. |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: msherif1234 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
This PR extends the agent to export raw packets through the network. We call it Packet Capture Agent(PCA).
This functionality is turned off by default. I have added 2 new environments to enable this.
set ENABLE_PCA=true
to activate packet cpature.set PCA_FILTER=[ipproto,port number] to indicate the filter for packet capture, e.g.
export PCA_FILTER=tcp,22
The PCA_FILTER defaults to udp DNS(udp,53). If ENABLE_PCA is true, but filters are not specified, by default it captures DNS packets.
The captured packets are written to a socket in PCAP format.