[connector/datadog] Update README for accuracy (#35121)
**Description:**

The current description of the Datadog connector implies that it is only
useful in the presence of sampling. However, its use is actually
required to see trace-emitting services and their statistics in Datadog
APM. This PR rewords the README to reflect that more clearly.

I also fixed some indentation issues in the provided example.

**Link to tracking Issue:** No tracking issue on Github. Internal Jira
issue: OTEL-1776

---------

Co-authored-by: Pablo Baeyens <[email protected]>
jade-guiton-dd and mx-psi authored Sep 11, 2024
1 parent 5cd3cd0 commit a1a77a5
Showing 1 changed file with 14 additions and 53 deletions: `connector/datadogconnector/README.md`

## Description

The Datadog Connector is a connector component that derives APM statistics, in the form of metrics, from service traces, for display in the Datadog APM product. This component is *required* for trace-emitting services and their statistics to appear in Datadog APM.

The Datadog connector can also forward the traces it receives into another traces pipeline. Notably, if you plan to sample your traces with the [tailsamplingprocessor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/tailsamplingprocessor#tail-sampling-processor) or the [probabilisticsamplerprocessor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/probabilisticsamplerprocessor), you should place the Datadog connector upstream of the sampler so that the metrics are computed before sampling and remain accurate. An example is given below.

## Usage

To use the Datadog Connector, add it to your pipelines as in the following example, which samples traces with the [probabilisticsamplerprocessor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/probabilisticsamplerprocessor). The Datadog Connector computes APM Stats on all spans that it sees:

```yaml
# ...
processors:
  probabilistic_sampler:
    sampling_percentage: 20

connectors:
  # add the "datadog" connector definition and further configurations
  datadog/connector:

exporters:
  datadog:
    api:
      key: ${env:DD_API_KEY}

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [datadog/connector]

    traces/2: # this pipeline uses sampling
      receivers: [datadog/connector]
      processors: [batch, probabilistic_sampler]
      exporters: [datadog]

    metrics:
      receivers: [datadog/connector]
      processors: [batch]
      exporters: [datadog]
```
In this example configuration, incoming traces are received through OTLP, and processed by the Datadog connector in the `traces` pipeline. The traces are then forwarded to the `traces/2` pipeline, where a sample of them is exported to Datadog. In parallel, the APM stats computed from the full stream of traces are sent to the `metrics` pipeline, where they are exported to Datadog as well.
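To confirm that the connector is actually producing APM stats metrics, one option is to temporarily tap the `metrics` pipeline with the Collector's standard `debug` exporter. This sketch is an illustration, not part of the commit; it assumes your Collector distribution includes the `debug` exporter:

```yaml
exporters:
  datadog:
    api:
      key: ${env:DD_API_KEY}
  # prints exported metrics to the Collector's console output
  debug:
    verbosity: detailed

service:
  pipelines:
    metrics:
      receivers: [datadog/connector]
      processors: [batch]
      exporters: [datadog, debug]
```

Remove the `debug` exporter from the pipeline once you have confirmed that the stats metrics are flowing.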

## Configurations

