Tcpdump for Everyone: Changes to diego-release for the proposed pcap-release #703

@a18e

Recently we proposed pcap-release as an easy way for CF application developers and landscape operators to capture network traffic for their apps and/or their BOSH VMs. See issue cloudfoundry/cf-deployment#980 for a more detailed description of pcap-release.

For the use case of capturing traffic from CF apps, we would need to implement some features in diego-release and would like to get your feedback on our proposed solution.

The following diagram shows how we plan to capture app network traffic via the pcap-agent on the app-container. The captured traffic is then sent via the pcap-api to the cf-CLI on the client machine:

(Diagram: single_instance_stream_to_client_pcapagent_on_container)

Our proposed solution would work similarly to the cf app-ssh process:

  • A cf-CLI plugin that implements commands to enable and perform tcpdumps on specific apps or app instances, optionally passing a packet filter as a parameter (e.g. for a specific source address) (see app-ssh commands)
  • pcap-api (analogous to ssh-proxy for app-ssh) acts as the endpoint for the cf-CLI and passes requests on to the pcap-agent on the app-containers. pcap-api is also responsible for user authentication.
  • pcap-agent (analogous to diego-sshd for app-ssh) runs on the container and acts as a wrapper around libpcap to capture network traffic
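
To make the division of responsibilities concrete, here is a minimal Go sketch of the request flow between the three components. It is not taken from pcap-release: every name in it (`CaptureRequest`, `Agent`, `StubAgent`, `API`, `Handle`) is hypothetical, the authorization check is a plain callback, and the stub agent only returns a description instead of capturing packets via libpcap.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// All names in this file are hypothetical and only illustrate the proposed flow.

var (
	ErrUnauthorized = errors.New("user may not capture traffic for this app")
	ErrUnknownApp   = errors.New("no pcap-agent known for this app")
)

// CaptureRequest is what the cf-CLI plugin would send to pcap-api.
type CaptureRequest struct {
	AppGUID   string
	Instances []int  // app instance indices; empty means all instances
	Filter    string // packet filter, e.g. "src host 10.0.0.1"
}

// Agent stands in for the pcap-agent running on one app-container.
type Agent interface {
	Capture(filter string) (string, error)
}

// StubAgent only reports what it would capture (assumption: the real agent
// would wrap libpcap and stream packets instead).
type StubAgent struct{ Index int }

func (a StubAgent) Capture(filter string) (string, error) {
	return fmt.Sprintf("instance %d: capturing with filter %q", a.Index, filter), nil
}

// API stands in for pcap-api: it authenticates the caller and forwards the
// request to the agents of the selected app instances.
type API struct {
	Agents    map[string]map[int]Agent // app GUID -> instance index -> agent
	Authorize func(token, appGUID string) bool
}

// Handle checks authorization first, then forwards the request to each
// selected instance in ascending index order.
func (api *API) Handle(token string, req CaptureRequest) ([]string, error) {
	if api.Authorize == nil || !api.Authorize(token, req.AppGUID) {
		return nil, ErrUnauthorized
	}
	agents, ok := api.Agents[req.AppGUID]
	if !ok {
		return nil, ErrUnknownApp
	}
	// Copy the requested indices so the caller's slice is never modified.
	indices := append([]int(nil), req.Instances...)
	if len(indices) == 0 {
		for i := range agents {
			indices = append(indices, i)
		}
	}
	sort.Ints(indices)

	var results []string
	for _, i := range indices {
		agent, ok := agents[i]
		if !ok {
			return nil, fmt.Errorf("app %s has no instance %d", req.AppGUID, i)
		}
		out, err := agent.Capture(req.Filter)
		if err != nil {
			return nil, fmt.Errorf("instance %d: %w", i, err)
		}
		results = append(results, out)
	}
	return results, nil
}
```

A caller would build an `API` with one `StubAgent` per app instance and invoke `Handle` with a token and a `CaptureRequest`. The point of the sketch is the ordering: authorization happens in the API layer before any agent is contacted, and the agent itself only needs to understand the packet filter.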

We have already successfully executed a spike/PoC in which we modified cloud-controller and diego-release on one of our dev-landscapes to globally enable pcap-agent, i.e. to run the agent on every app-container in the landscape:

  • We added a new package “pcap-agent” to diego-release which builds the pcap-agent from source
    (Note: For the final release, we're planning to use a submodule, see below)
  • The pcap-agent binary is then packaged into the buildpack_app_lifecycle and docker_app_lifecycle (alongside diego-sshd), which are extracted on every app-container

With these small changes we were able to perform a tcpdump on an app-container via the pcap-agent from any landscape-internal VM.

(Our issue on the required changes to the cloud-controller: cloudfoundry/cloud_controller_ng#3193)

While we directly included the pcap-agent source code in the diego-release src-directory for the PoC, we’re planning to use a submodule in the future. We will extract the src/pcap folder in the current pcap-release into a separate repository, which will serve as the diego-release submodule.

Before we move further, we would like to get your feedback, especially for the following questions:

  • Do you see any roadblocks or complexities we might have missed?
  • Is not having a Windows pcap-agent an issue?
  • Is it OK to include the pcap-agent-binaries in buildpack_app_lifecycle?
  • Do you agree with keeping the pcap-agent source code in a separate repository and including it here as a submodule?
  • How do we approach having our own go.mod file vs. the one in the diego-release/src/code.cloudfoundry.org folder?
