From 902e3e3fdc417818d69e89f3da86cdacca3d4248 Mon Sep 17 00:00:00 2001 From: Maximilian Moehl Date: Wed, 10 Sep 2025 08:12:02 +0200 Subject: [PATCH 1/3] rfc: add initial draft of pcap-cf Co-Authored-By: Claude --- toc/rfc/rfc-draft-pcap-cf.md | 147 +++++++++++++++++++++++++++++++++++ 1 file changed, 147 insertions(+) create mode 100644 toc/rfc/rfc-draft-pcap-cf.md diff --git a/toc/rfc/rfc-draft-pcap-cf.md b/toc/rfc/rfc-draft-pcap-cf.md new file mode 100644 index 000000000..029141228 --- /dev/null +++ b/toc/rfc/rfc-draft-pcap-cf.md @@ -0,0 +1,147 @@ +# Meta +[meta]: #meta +- Name: Integrate pcap feature for Cloud Foundry applications +- Start Date: 2025-09-10 +- Author(s): @domdom82 @maxmoehl @peanball @ameowlia @mariash +- Status: Draft +- RFC Pull Request: [community#1309](https://github.com/cloudfoundry/community/issues/1309) + +## Summary + +Add a feature to Cloud Foundry that provides application developers with the +ability to perform packet capture (tcpdump) operations within their +application containers for debugging purposes. This RFC complements +[RFC-0019 (pcap-bosh)](rfc-0019-pcap-bosh.md) by extending packet capture +capabilities from BOSH-level infrastructure debugging to application-level +debugging. + +The main goal is to allow app developers to capture network traffic from +their applications when troubleshooting issues that require elevated +privileges, while maintaining security through operator controls and a +reduced-functionality custom tool. + +## Problem + +Application developers frequently need to perform in-depth troubleshooting +of their applications when deployed via Cloud Foundry buildpacks. Currently, +there is no possibility for app developers to perform privileged debugging +actions such as packet captures (tcpdump) within their application +containers. 
+ +When troubleshooting applications, it is common to require elevated +privileges for operations such as: +* Dumping network packets to analyze traffic patterns +* Investigating connectivity issues between application components +* Debugging network-related performance problems + +While [RFC-0019 (pcap-bosh)](rfc-0019-pcap-bosh.md) addresses packet capture +at the BOSH infrastructure level for operators, there remains a gap for +application developers who need similar capabilities scoped to their +individual applications. + +The challenge is providing this functionality while maintaining the security +model of Cloud Foundry, where applications run in isolated, unprivileged +containers. + +## Proposal: Custom Packet Capturing Tool + +This RFC proposes implementing a reduced-functionality packet capturing tool +written in Go using the `gopacket` library. This custom tool would be +injected into application containers via Diego, similar to how `diego-ssh` +currently provides SSH access. + +### Key Features + +**Reduced Attack Surface**: The custom tool supports only essential packet +capture operations: +* Specifying a network interface +* Applying packet filters (pcap-filter format) +* Setting snapshot length (snaplen) +* No support for arbitrary code execution or complex operations + +**Injection Mechanism**: The tool would be injected via Diego using a mechanism +similar to `diego-ssh`, ensuring: +* Consistent deployment across all application instances +* Proper integration with Cloud Foundry's security model +* Ability to control access through feature flags + +**Operator Controls**: +* Platform-wide feature flag to enable/disable packet capture functionality +* Per-application feature flag support for granular control +* Integration with Cloud Foundry's existing permission model + +**Security Model**: +* Tool runs with necessary capabilities (`CAP_NET_RAW` and `CAP_NET_ADMIN`) + but limited functionality +* No shell access or ability to execute arbitrary code 
+* Scoped to the application's network namespace + +### Usage Example + +```bash +# Capture HTTP traffic for myapp +cf pcap myapp --interface eth0 --filter "tcp port 80" --snaplen 1500 + +# Capture specific instance with custom filter +cf pcap myapp --instance 1 --filter "host database.example.com" +``` + +## Other Options Considered + +#### Option 1: `sudo` Access + +Adding a platform switch to make the `vcap` user a sudoer was considered but +discarded due to: +* Significant security risks of providing full root access +* Potential for privilege escalation beyond intended packet capture use +* Inconsistency with Cloud Foundry's security-first design principles + +#### Option 2: `setcap` on `tcpdump` Binary + +Setting capabilities on the existing `tcpdump` binary was considered but +discarded because: +* `tcpdump` offers functionality to run arbitrary code in its security + context +* Demonstrated security vulnerabilities allowing code execution +* Difficulty in hiding behind feature flags due to unconditional + deployment +* Broader attack surface compared to a purpose-built tool + +### Relationship to Existing Solutions + +This RFC complements [RFC-0019 (pcap-bosh)](rfc-0019-pcap-bosh.md) by +addressing different use cases: + +* **pcap-bosh**: Infrastructure-level debugging for operators across BOSH + deployments +* **pcap-cf**: Application-level debugging for developers within CF + applications + +## Implementation Considerations + +#### diego-release: Custom Packet Capturing Tool + +A new package will be added to diego-release which implements packet capturing +in go through the `gopacket` library. The resulting binary will be included in +the various lifecycle archives that are added to the final app container and +the necessary capabilities (`CAP_NET_RAW` and `CAP_NET_ADMIN`) will be assigned +to the executable via file capabilities. This allows regular users to gain those +capabilities when executing the binary. 
+ +#### CF CLI: New `pcap` command + +Similar to the `bosh pcap` command a `cf pcap` command will be added. Like its +predecessor it will connect to the desired instances via SSH and execute the new +packet capturing tool and stream back the captured packets via stdout. If there +are multiple streams, the CLI will merge them and write them out to a single +file in the pcap format. + +## References + +* Prior discussions: + * https://github.com/cloudfoundry/diego-release/issues/1023 + * https://cloudfoundry.slack.com/archives/C0DEQSW9W/p1744034414856129?thread_ts=1744034414.856129&cid=C0DEQSW9W + * https://cloudfoundry.slack.com/archives/C033RE5D6/p1744299390390049 +* https://github.com/cloudfoundry/pcap-release +* https://github.com/cloudfoundry/community/blob/main/toc/rfc/rfc-0019-pcap-bosh.md +* https://github.com/gopacket/gopacket From 3fa4a929cf006b93667dfb0392e8e17a9910b77a Mon Sep 17 00:00:00 2001 From: Maximilian Moehl Date: Wed, 10 Sep 2025 14:06:48 +0200 Subject: [PATCH 2/3] first round of review feedback --- toc/rfc/rfc-draft-pcap-cf.md | 86 ++++++++++++++---------------------- 1 file changed, 34 insertions(+), 52 deletions(-) diff --git a/toc/rfc/rfc-draft-pcap-cf.md b/toc/rfc/rfc-draft-pcap-cf.md index 029141228..09ebd5261 100644 --- a/toc/rfc/rfc-draft-pcap-cf.md +++ b/toc/rfc/rfc-draft-pcap-cf.md @@ -28,12 +28,6 @@ there is no possibility for app developers to perform privileged debugging actions such as packet captures (tcpdump) within their application containers. 
-When troubleshooting applications, it is common to require elevated -privileges for operations such as: -* Dumping network packets to analyze traffic patterns -* Investigating connectivity issues between application components -* Debugging network-related performance problems - While [RFC-0019 (pcap-bosh)](rfc-0019-pcap-bosh.md) addresses packet capture at the BOSH infrastructure level for operators, there remains a gap for application developers who need similar capabilities scoped to their @@ -46,45 +40,29 @@ containers. ## Proposal: Custom Packet Capturing Tool This RFC proposes implementing a reduced-functionality packet capturing tool -written in Go using the `gopacket` library. This custom tool would be -injected into application containers via Diego, similar to how `diego-ssh` +written in Go using the [gopacket][gopacket] library. This custom tool would be +injected into application containers via Diego, similar to how [diego-ssh][diego-ssh] currently provides SSH access. ### Key Features -**Reduced Attack Surface**: The custom tool supports only essential packet -capture operations: +The tool reduces the attack surface to a minimum by only exposing the options +which are stricly necessary to capture packets, namely: * Specifying a network interface * Applying packet filters (pcap-filter format) * Setting snapshot length (snaplen) -* No support for arbitrary code execution or complex operations - -**Injection Mechanism**: The tool would be injected via Diego using a mechanism -similar to `diego-ssh`, ensuring: -* Consistent deployment across all application instances -* Proper integration with Cloud Foundry's security model -* Ability to control access through feature flags - -**Operator Controls**: -* Platform-wide feature flag to enable/disable packet capture functionality -* Per-application feature flag support for granular control -* Integration with Cloud Foundry's existing permission model -**Security Model**: -* Tool runs with necessary capabilities 
(`CAP_NET_RAW` and `CAP_NET_ADMIN`) - but limited functionality -* No shell access or ability to execute arbitrary code -* Scoped to the application's network namespace +The tool would be injected via Diego using a mechanism similar to +[diego-ssh][diego-ssh]. This ensures that the tool is available in all apps +regardless of the deployment method and it follows established patterns that +are already known by maintainers. -### Usage Example - -```bash -# Capture HTTP traffic for myapp -cf pcap myapp --interface eth0 --filter "tcp port 80" --snaplen 1500 +The tool has the necessary file capabilities assigned (`CAP_NET_RAW` and +`CAP_NET_ADMIN`) to be able to capture packets. It remains in the namespaces of +the application to ensure a vulnerability in the tool does not enable a +container escape. -# Capture specific instance with custom filter -cf pcap myapp --instance 1 --filter "host database.example.com" -``` +The tool is only accessible if SSH is enabled for the application. ## Other Options Considered @@ -92,7 +70,7 @@ cf pcap myapp --instance 1 --filter "host database.example.com" Adding a platform switch to make the `vcap` user a sudoer was considered but discarded due to: -* Significant security risks of providing full root access +* Significant security risks of providing full root access in the app container. 
* Potential for privilege escalation beyond intended packet capture use * Inconsistency with Cloud Foundry's security-first design principles @@ -102,33 +80,24 @@ Setting capabilities on the existing `tcpdump` binary was considered but discarded because: * `tcpdump` offers functionality to run arbitrary code in its security context -* Demonstrated security vulnerabilities allowing code execution -* Difficulty in hiding behind feature flags due to unconditional - deployment * Broader attack surface compared to a purpose-built tool -### Relationship to Existing Solutions - -This RFC complements [RFC-0019 (pcap-bosh)](rfc-0019-pcap-bosh.md) by -addressing different use cases: - -* **pcap-bosh**: Infrastructure-level debugging for operators across BOSH - deployments -* **pcap-cf**: Application-level debugging for developers within CF - applications - ## Implementation Considerations -#### diego-release: Custom Packet Capturing Tool +Based on the proposed solution, the following sections detail the changes +which have to be made to the individual components of Cloud Foundry to implement +it. + +### diego-release: Custom Packet Capturing Tool A new package will be added to diego-release which implements packet capturing -in go through the `gopacket` library. The resulting binary will be included in +in Go through the [gopacket][gopacket] library. The resulting binary will be included in the various lifecycle archives that are added to the final app container and the necessary capabilities (`CAP_NET_RAW` and `CAP_NET_ADMIN`) will be assigned to the executable via file capabilities. This allows regular users to gain those capabilities when executing the binary. -#### CF CLI: New `pcap` command +### CF CLI: New `pcap` command Similar to the `bosh pcap` command a `cf pcap` command will be added. 
Like its predecessor it will connect to the desired instances via SSH and execute the new @@ -136,6 +105,16 @@ packet capturing tool and stream back the captured packets via stdout. If there are multiple streams, the CLI will merge them and write them out to a single file in the pcap format. +Usage example: + +```bash +# Capture HTTP traffic for myapp +cf pcap myapp --interface eth0 --filter "tcp port 80" --snaplen 1500 + +# Capture specific instance with custom filter +cf pcap myapp --instance 1 --filter "host database.example.com" +``` + ## References * Prior discussions: @@ -145,3 +124,6 @@ file in the pcap format. * https://github.com/cloudfoundry/pcap-release * https://github.com/cloudfoundry/community/blob/main/toc/rfc/rfc-0019-pcap-bosh.md * https://github.com/gopacket/gopacket + +[gopacket]: https://github.com/gopacket/gopacket +[diego-ssh]: https://github.com/cloudfoundry/diego-ssh From 463d35419c84df8e2282bf2acdea0226f4535aea Mon Sep 17 00:00:00 2001 From: Maximilian Moehl Date: Wed, 17 Sep 2025 08:09:30 +0200 Subject: [PATCH 3/3] second round of review feedback --- toc/rfc/rfc-draft-pcap-cf.md | 48 ++++++++++++++++++++++-------------- 1 file changed, 30 insertions(+), 18 deletions(-) diff --git a/toc/rfc/rfc-draft-pcap-cf.md b/toc/rfc/rfc-draft-pcap-cf.md index 09ebd5261..f9e939dea 100644 --- a/toc/rfc/rfc-draft-pcap-cf.md +++ b/toc/rfc/rfc-draft-pcap-cf.md @@ -2,7 +2,7 @@ [meta]: #meta - Name: Integrate pcap feature for Cloud Foundry applications - Start Date: 2025-09-10 -- Author(s): @domdom82 @maxmoehl @peanball @ameowlia @mariash +- Author(s): @maxmoehl @peanball - Status: Draft - RFC Pull Request: [community#1309](https://github.com/cloudfoundry/community/issues/1309) @@ -23,15 +23,17 @@ reduced-functionality custom tool. ## Problem Application developers frequently need to perform in-depth troubleshooting -of their applications when deployed via Cloud Foundry buildpacks. 
Currently, -there is no possibility for app developers to perform privileged debugging -actions such as packet captures (tcpdump) within their application -containers. - -While [RFC-0019 (pcap-bosh)](rfc-0019-pcap-bosh.md) addresses packet capture -at the BOSH infrastructure level for operators, there remains a gap for -application developers who need similar capabilities scoped to their -individual applications. +of their applications when deployed via Cloud Foundry buildpacks. Network analysis +tasks, such as packet captures (tcpdump), connectivity checks and performance +checks, require elevated privileges as the captured data may be sensitive. +Currently, there is no possibility for app developers to perform any privileged +actions within their application containers, which also excludes such network +analysis tasks. + +[RFC-0019 (pcap-bosh)](rfc-0019-pcap-bosh.md) addresses packet capture +at the BOSH infrastructure level for platform operators. Application developers +and operators, an equally important group of users, need similar capabilities, +scoped to their individual applications. The challenge is providing this functionality while maintaining the security model of Cloud Foundry, where applications run in isolated, unprivileged @@ -47,7 +49,7 @@ currently provides SSH access. ### Key Features The tool reduces the attack surface to a minimum by only exposing the options -which are stricly necessary to capture packets, namely: +which are strictly necessary to capture packets, namely: * Specifying a network interface * Applying packet filters (pcap-filter format) * Setting snapshot length (snaplen) @@ -72,7 +74,11 @@ Adding a platform switch to make the `vcap` user a sudoer was considered but discarded due to: * Significant security risks of providing full root access in the app container. 
* Potential for privilege escalation beyond intended packet capture use -* Inconsistency with Cloud Foundry's security-first design principles +* Inconsistency with Cloud Foundry's [security-first design principles][cf-sec] + +Even if sudo were configured in a way that the vcap user is +only allowed to perform very specific tasks (like running tcpdump), this would still +suffer from potential privilege escalation, as explained in the next section. #### Option 2: `setcap` on `tcpdump` Binary @@ -105,14 +111,19 @@ packet capturing tool and stream back the captured packets via stdout. If there are multiple streams, the CLI will merge them and write them out to a single file in the pcap format. -Usage example: +Example usage and output: ```bash -# Capture HTTP traffic for myapp -cf pcap myapp --interface eth0 --filter "tcp port 80" --snaplen 1500 - -# Capture specific instance with custom filter -cf pcap myapp --instance 1 --filter "host database.example.com" +$ cf pcap myapp --output capture.pcap --interface eth0 --filter "tcp port 80" --snaplen 1500 +Starting capture on all instances. +Capturing, press ^C to stop... +^C +Saved capture to 'capture.pcap'. +$ cf pcap myapp -o capture.pcap -i 1 -f "host database.example.com" +Starting capture on instances: 1. +Capturing, press ^C to stop... +^C +Saved capture to 'capture.pcap'. ``` ## References @@ -127,3 +138,4 @@ cf pcap myapp --instance 1 --filter "host database.example.com" [gopacket]: https://github.com/gopacket/gopacket [diego-ssh]: https://github.com/cloudfoundry/diego-ssh +[cf-sec]: https://docs.cloudfoundry.org/concepts/security.html