[system-probe] Start of dynamic instrumentation system probe module #16034

Merged
merged 9 commits into from
Apr 14, 2023


grantseltzer

What does this PR do?

This introduces the scaffolding for the new dynamic instrumentation system-probe module, which the dynamic-instrumentation product will use for system-level tracing of user applications. It belongs in the system-probe because it will soon need elevated capabilities (CAP_SYS_ADMIN, CAP_BPF) to load and attach BPF programs and to inspect container filesystems.
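
As a rough illustration of the capability requirement mentioned above (not code from this PR), a process can check its effective capabilities by parsing the `CapEff` line of `/proc/self/status`. The helper names `parseCapEff` and `hasCap` are hypothetical; the bit numbers come from `linux/capability.h`.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// Capability bit numbers as defined in linux/capability.h.
const (
	capSysAdmin = 21
	capBPF      = 39 // only exists on Linux 5.8 and later
)

// parseCapEff extracts the effective capability mask from the contents of
// /proc/<pid>/status, where it appears as a hex value on the "CapEff:" line.
func parseCapEff(status string) (uint64, error) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "CapEff:") {
			raw := strings.TrimSpace(strings.TrimPrefix(line, "CapEff:"))
			return strconv.ParseUint(raw, 16, 64)
		}
	}
	return 0, fmt.Errorf("CapEff line not found")
}

// hasCap reports whether the given capability bit is set in mask.
func hasCap(mask uint64, bit uint) bool {
	return mask&(uint64(1)<<bit) != 0
}

func main() {
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Println("cannot read /proc/self/status (non-Linux host?):", err)
		return
	}
	mask, err := parseCapEff(string(data))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("CAP_SYS_ADMIN:", hasCap(mask, capSysAdmin))
	fmt.Println("CAP_BPF:", hasCap(mask, capBPF))
}
```

Run unprivileged, both checks will normally report false; run with sudo, they should report true on a kernel that supports CAP_BPF.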

Motivation

This PR only creates the module in code and gives it the scaffolding for reading a static configuration file. The goal is to split the large amount of code that will eventually live here into smaller PRs.

Additional Notes

Possible Drawbacks / Trade-offs

Describe how to test/QA your changes

  • Build the system probe: inv system-probe.build
  • Run the system probe: sudo bin/system-probe/system-probe

You should see that the module is disabled:

2023-03-08 13:37:33 PST | SYS-PROBE | INFO | (cmd/system-probe/api/module/loader.go:50 in Register) | module dynamic_instrumentation disabled
2023-03-08 13:37:33 PST | SYS-PROBE | INFO | (cmd/system-probe/app/run.go:123 in run) | system probe successfully started

To run it with the module enabled, set DD_DYNAMIC_INSTRUMENTATION_ENABLED=1 or add the following section to your system-probe config:

####################################
## Dynamic Instrumentation Config ##
####################################

dynamic_instrumentation:
  ## @param enabled - boolean - optional - default: false
  ## @env DD_DYNAMIC_INSTRUMENTATION_ENABLED - boolean - optional - default: false
  ## Set to true to enable the User Tracer Module of the System Probe
  #
  enabled: true
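
To make the two ways of enabling the module concrete, here is a minimal standalone sketch of how the flag could be resolved. It is not the Agent's config code: `resolveEnabled` is a hypothetical helper, and the precedence (environment variable, then config file value, then the default of false) is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

const envEnabled = "DD_DYNAMIC_INSTRUMENTATION_ENABLED"

// resolveEnabled returns the effective dynamic_instrumentation.enabled value.
// Assumed precedence: environment variable, then config file, then false.
// fileValue is nil when the key is absent from the config file.
func resolveEnabled(fileValue *bool, lookupEnv func(string) (string, bool)) bool {
	if raw, ok := lookupEnv(envEnabled); ok {
		// ParseBool accepts "1", "true", "0", "false" and a few variants.
		if v, err := strconv.ParseBool(raw); err == nil {
			return v
		}
	}
	if fileValue != nil {
		return *fileValue
	}
	return false
}

func main() {
	// No config file value here, so only the environment can enable it.
	fmt.Println("dynamic instrumentation enabled:", resolveEnabled(nil, os.LookupEnv))
}
```

Passing the lookup function as a parameter keeps the helper testable without touching the real process environment.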

Reviewer's Checklist

  • If known, an appropriate milestone has been selected; otherwise the Triage milestone is set.
  • Use the major_change label if your change has a major impact on the code base, impacts multiple teams, or changes important well-established internals of the Agent. This label will be used during QA to make sure each team pays extra attention to the changed behavior. For any customer-facing change, use a releasenote.
  • A release note has been added or the changelog/no-changelog label has been applied.
  • Changed code has automated tests for its functionality.
  • Adequate QA/testing plan information is provided if the qa/skip-qa label is not applied.
  • At least one team/.. label has been applied, indicating the team(s) that should QA this change.
  • If applicable, docs team has been notified or an issue has been opened on the documentation repo.
  • If applicable, the need-change/operator and need-change/helm labels have been applied.
  • If applicable, the k8s/<min-version> label, indicating the lowest Kubernetes version compatible with this feature, has been applied.
  • If applicable, the config template has been updated.


@sgnn7 sgnn7 left a comment


A few nits. I'll preemptively approve since they're easy fixes.

@@ -247,6 +247,7 @@
/pkg/config/remote/service/meta/ @DataDog/remote-config @DataDog/software-integrity-and-trust
/pkg/diagnose/ @Datadog/container-integrations
/pkg/diagnose/connectivity/ @DataDog/agent-shared-components
/pkg/dynamicinstrumentation/ @DataDog/debugger

The second column is misaligned here.


type Module struct{}

func NewModule(config *Config) (*Module, error) {

The linter errors are right: the other file has an exclusion build tag, but this file has no build tag, so on non-Linux platforms both files will be built, causing a name collision. I'd suggest naming this file module_linux to make it clear that it only works on that platform.
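
To illustrate the collision described above, here is a small sketch that evaluates build constraints with the standard `go/build/constraint` package. The file names and constraint lines are hypothetical stand-ins, not the actual files in this PR, and `builtOn` is a helper invented for the example.

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// builtOn reports whether a file carrying the given //go:build line would be
// compiled for goos. An empty line means the file has no constraint at all.
func builtOn(buildLine, goos string) bool {
	if buildLine == "" {
		return true // unconstrained files are built on every platform
	}
	expr, err := constraint.Parse(buildLine)
	if err != nil {
		return false
	}
	// Only the OS tag is considered satisfied in this simplified model.
	return expr.Eval(func(tag string) bool { return tag == goos })
}

func main() {
	// Hypothetical file names and constraints, for illustration only.
	files := []struct{ name, buildLine string }{
		{"module.go", ""},                       // no build constraint
		{"module_stub.go", "//go:build !linux"}, // exclusion tag
	}
	for _, goos := range []string{"linux", "windows"} {
		for _, f := range files {
			fmt.Printf("%s on %s: built=%v\n", f.name, goos, builtOn(f.buildLine, goos))
		}
	}
}
```

On a non-Linux target both files are selected, so any identifier declared in both collides. Renaming the unconstrained file with a `_linux.go` suffix gives it an implicit Linux-only constraint, which is the fix suggested in the comment.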

import (
"github.com/DataDog/datadog-agent/cmd/system-probe/api/module"
"github.com/DataDog/datadog-agent/pkg/ebpf"
"github.com/DataDog/datadog-agent/pkg/security/config"

The linter is correct here: the Windows implementation doesn't use the "github.com/DataDog/datadog-agent/pkg/security/config" package.

@@ -131,6 +133,9 @@ func InitSystemProbeConfig(cfg Config) {
cfg.BindEnvAndSetDefault(join(spNS, "zypper_repos_dir"), suffixHostEtc(defaultZypperReposDirSuffix), "DD_ZYPPER_REPOS_DIR")
cfg.BindEnvAndSetDefault(join(spNS, "attach_kprobes_with_kprobe_events_abi"), false, "DD_ATTACH_KPROBES_WITH_KPROBE_EVENTS_ABI")

// User Tracer
cfg.BindEnvAndSetDefault(join(diNS, "enabled"), false, "DD_DYNAMIC_INSTRUMENTATION_ENABLED")

It is recommended to add the feature to:

  • pkg/metadata/inventories/README.md
  • pkg/metadata/inventories/inventories.go

and to add unit tests for the configuration:

  • pkg/network/config/config_test.go

For reference, here is an example PR from my team: #14620


@grantseltzer grantseltzer Mar 9, 2023


The metadata/inventories package seems to track specific features rather than whether a whole module is enabled. For example, USM lists the individual network protocols it supports rather than USM as a whole. Are you sure it makes sense to have something like feature_dynamic_instrumentation_enabled? I'm not familiar with inventories, so I can't say. I can certainly add it now and then add specific features as they are introduced.

Similarly, dynamic instrumentation might not belong under pkg/network/config since it's not a network feature.


Regarding the configuration test: you don't need to add it to pkg/network/config/config_test.go; add a similar test to the appropriate config_test.go file instead.
Another example, from a different package: https://github.com/DataDog/datadog-agent/blob/main/pkg/config/config_test.go


Sounds good. I've added it to the metadata/inventories package and added a test to cmd/system-probe/config/config_linux_test.go


@brycekahle brycekahle left a comment


Minor suggestions, but nothing blocking.

@grantseltzer grantseltzer added the [deprecated] qa/skip-qa and changelog/no-changelog labels Mar 9, 2023
@grantseltzer grantseltzer added this to the 7.45.0 milestone Mar 9, 2023
@grantseltzer

@brycekahle Is there anything else I can do to help this get merged?

@grantseltzer grantseltzer force-pushed the grantseltzer/user-tracer-scaffolding branch from dc3668d to d8adf12 Compare March 28, 2023 17:20
@github-actions

⚠️🚨 Warning, this pull request increases the binary size of serverless extension by 4096 bytes. Each MB of binary size increase means about 10ms of additional cold start time, so this pull request would increase cold start time by 0ms.

New dependencies added

We suggest you consider adding the !serverless build tag to remove any new dependencies not needed in the serverless extension.

If you have questions, we are happy to help, come visit us in the #serverless slack channel and provide a link to this comment.

@guyarb

guyarb commented Mar 29, 2023

@grantseltzer It seems you have never logged in to CircleCI, so the relevant jobs didn't run. Those jobs are required for merging.

@grantseltzer grantseltzer force-pushed the grantseltzer/user-tracer-scaffolding branch from d8adf12 to 838ec17 Compare March 29, 2023 14:24
@grantseltzer grantseltzer added the team/dynamic-instrumentation Dynamic Instrumentation label Mar 29, 2023
@grantseltzer grantseltzer force-pushed the grantseltzer/user-tracer-scaffolding branch 3 times, most recently from 60d736f to e889753 Compare April 7, 2023 13:40
@grantseltzer grantseltzer force-pushed the grantseltzer/user-tracer-scaffolding branch from e889753 to fd236e4 Compare April 11, 2023 14:55
@grantseltzer grantseltzer merged commit fd191ab into main Apr 14, 2023
@grantseltzer grantseltzer deleted the grantseltzer/user-tracer-scaffolding branch April 14, 2023 13:55
Labels
changelog/no-changelog, [deprecated] qa/skip-qa, team/dynamic-instrumentation

4 participants