Skip to content

Datadog Integration (#3407)#3629

Open
natemollica-nm wants to merge 1 commit intorelease/1.1.xfrom
backport/natemollica-nm/datadog-integration/manual-cherry-pick-1-1-x
Open

Datadog Integration (#3407)#3629
natemollica-nm wants to merge 1 commit intorelease/1.1.xfrom
backport/natemollica-nm/datadog-integration/manual-cherry-pick-1-1-x

Conversation

@natemollica-nm
Copy link
Copy Markdown
Contributor

Backport

This PR is manually generated from #3407 to be assessed for back porting due to the failure of automated back porting via label backport/1.1.x.

The below text is copied from the body of the original PR.


Changes proposed in this PR

PR Resubmitted from Branch vice Fork for Automation Testing

  • Expose several metrics specific features on consul-k8s to include:
  • Introduce a means to ease the integration and operation of integrating with Datadog Agent metrics collection via fail-safe helm override value parameters. Overrides are intended to allow operators to easily configure 1 of the 3 following methods of monitoring Consul with Datadog on Kubernetes:
    • DogStatsD via one of either "UDP" or "UDS" transport protocols
    • OpenMetrics via Datadogs Autodiscovery feature to scrape the /v1/agent/metrics?format=prometheus endpoint
    • Datadog + Consul Integration Feature standard checks:
      • Serf events and member flaps
      • The Raft protocol
      • DNS performance
      • API Endpoints Health Checks:
        • /v1/agent/metrics?format=prometheus
        • /v1/agent/self
        • /v1/status/leader
        • /v1/status/peers
        • /v1/catalog/services
        • /v1/health/service
        • /v1/health/state/any
        • /v1/coordinate/datacenters
        • /v1/coordinate/nodes
  • Introduces server-acl-init token creation for OpenMetrics and Datadog Consul Integration check methods allowing default minimal acl token permission generation for Datadog agent usage as necessary.

How I've tested this PR

  • New ACL Token Testing as outline in the CONTRIBUTING.md steps.
  • Deployment and testing of local consul-dev (main) and consul-k8s-control-plane-dev (datadog-integration branch) images on k3d test cluster for each scenario. Test repository here.
  • Verification of helm templating for new value overrides added as instructed in CONTRIBUTING.md steps. bats ./charts/consul/test/unit --jobs 8 - ran successfully for all tests.

How I expect reviewers to test this PR

  • Assess the need for additional unit testing creation and verification.
  • If possible:
    • Reach out with any question/concerns or reasons for PR push-back.
    • Verification of fail-safe interlocks between the 3 methods of integration mentioned above.
    • Verification of ACL policy implementation.

Checklist


Overview of commits

* datadog-integration: updated consul-server agent telemetry-config.json with dd specific items as well as additional missing VM based options, unit tests, dd unix socket integration, dd agent acl token generation, deployment override failsafes

* datadog-integration: updated consul-server agent telemetry-config.json with dd specific items as well as additional missing VM based options, unit tests, dd unix socket integration, dd agent acl token generation | final initial-push

* changelog entry update

* datadog-integration: updated consul-server agent server.config (enable_debug) and telemetry.config update | enable_debug to server.config

* curt pr review changes (minus extraConfig templating verification changes)

* global.metrics.AgentMetrics -> global.metrics.enableAgentMetrics

* dogstatsd and otlp mutually exclusive verification checks

* breaking changes now incorporated into consul.validateExtraConfig helper template function as precheck

* extraConfig hash updates post merge conflict update

* fix helpers.tpl consul.extraConfig from merge --> /consul/tmp/extra-config/extra-from-values.json | add labels to rolebinding for datadog secrets

* update changelog .txt to match new PR number

* updated server-statefulset.yaml to correct ad.datadoghq.com/consul.logs annotation to valid single quote string

* fix helpers.tpl consul.extraConfig from merge --> /consul/tmp/extra-config/extra-from-values.json | add labels to rolebinding for datadog secrets

* fix helpers.tpl consul.extraConfig from merge --> /consul/tmp/extra-config/extra-from-values.json | add labels to rolebinding for datadog secrets

* update UDP dogstatsdPort behavior to exclude including a port value if using a kube service address (as determined by user overrides)

* update _helpers.tpl consul.ValidateDatadogConfiguration func to account for using 'https' as protocol => should fail

* update server-statefulset.yaml to exclude prometheus.io annotations if enabling datadog openmetrics method for consul server metrics scrape. conflict present with http vs https that breaks openemtrics scrape on consul

* update server-statefulset.yaml to exclude prometheus.io annotations if enabling datadog openmetrics method for consul server metrics scrape. conflict present with http vs https that breaks openemtrics scrape on consul

* correct otlp protocol helpers.tpl check to lower-case the protocol to match the open-telemetry-deployment.yaml behavior

* fix server-acl-init command_test.go for datadog token policy - datacenter should have been dc1

* add in server-statefulset bats test for extraConfig validation testing
@natemollica-nm natemollica-nm added the pr/no-backport signals that a PR will not contain a backport label label Feb 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr/no-backport signals that a PR will not contain a backport label

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants