Skip to content
Merged
Show file tree
Hide file tree
Changes from 8 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions pkg/proto/datadog/trace/stats.proto
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,10 @@ message ClientStatsPayload {
string git_commit_sha = 13;
// The image tag is obtained from a container's set of tags.
string image_tag = 14;
// The process tags hash is used as a key for agent stats agregation.
uint64 process_tags_hash = 15;
// The process tags contains a list of tags that are specific to the process.
string process_tags = 16;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not have a repeated string field with split tags, same as container tags? This way you don't have to split the tags when computing the hash.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was also going to suggest using camelCase for consistency but noticed we already have a mix of cases :(

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we're propagating this payload to multiple intakes, to keep the same normalisation I preferred to not touch what's provided by the tracing library and normalize in the backend
Tracer library will have it's own normalisation at the source

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This way you don't have to split the tags when computing the hash.

I think this can actually be a pretty big deal when it comes to performance, so we should really think twice about not doing this and be very intentional about it. It's probably fine, but it would be nice to understand what perf penalty we might be eating here with this decision.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What performance cost do you expect? This should be actually the most efficient

  1. tracers normalise once their process tags in their lifetime
  2. trace-agent is a passthrough on them
  3. backends have a second normalization to cover further renormalization

Copy link
Member

@truthbk truthbk Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, maybe I'm missing something but from this comment:

This way you don't have to split the tags when computing the hash.

The associated cost I'm alluding to is related to that precisely, the need to do string allocations due to the string split, and then calculating the hash on that split here: https://github.com/DataDog/datadog-agent/pull/35746/files#diff-0b51a96e21823e23c7778a95700237481bac225576bba65aaded24d2b170a50dR103-R108.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that split would still be needed if we stored this as an array of string here, the process tags are received through a http header (string to split)

}

// ClientStatsBucket is a time bucket containing aggregated stats.
Expand Down
145 changes: 84 additions & 61 deletions pkg/proto/pbgo/trace/stats.pb.go

Large diffs are not rendered by default.

66 changes: 58 additions & 8 deletions pkg/proto/pbgo/trace/stats_gen.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

72 changes: 72 additions & 0 deletions pkg/proto/pbgo/trace/stats_vtproto.pb.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion pkg/trace/agent/agent.go
Original file line number Diff line number Diff line change
Expand Up @@ -313,7 +313,7 @@ func (a *Agent) Process(p *api.Payload) {
defer a.Timing.Since("datadog.trace_agent.internal.process_payload_ms", now)
ts := p.Source
sampledChunks := new(writer.SampledChunks)
statsInput := stats.NewStatsInput(len(p.TracerPayload.Chunks), p.TracerPayload.ContainerID, p.ClientComputedStats)
statsInput := stats.NewStatsInput(len(p.TracerPayload.Chunks), p.TracerPayload.ContainerID, p.ClientComputedStats, p.ProcessTags)

p.TracerPayload.Env = traceutil.NormalizeTagValue(p.TracerPayload.Env)

Expand Down
22 changes: 13 additions & 9 deletions pkg/trace/agent/agent_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -2450,9 +2450,11 @@ func TestConvertStats(t *testing.T) {
name: "containerID feature enabled, no fargate",
features: "enable_cid_stats",
in: &pb.ClientStatsPayload{
Hostname: "tracer_hots",
Env: "tracer_env",
Version: "code_version",
Hostname: "tracer_hots",
Env: "tracer_env",
Version: "code_version",
ProcessTags: "binary_name:bin",
ProcessTagsHash: 123456789,
Stats: []*pb.ClientStatsBucket{
{
Start: 1,
Expand Down Expand Up @@ -2487,12 +2489,14 @@ func TestConvertStats(t *testing.T) {
tracerVersion: "v1",
containerID: "abc123",
out: &pb.ClientStatsPayload{
Hostname: "tracer_hots",
Env: "tracer_env",
Version: "code_version",
Lang: "java",
TracerVersion: "v1",
ContainerID: "abc123",
Hostname: "tracer_hots",
Env: "tracer_env",
Version: "code_version",
Lang: "java",
TracerVersion: "v1",
ContainerID: "abc123",
ProcessTags: "binary_name:bin",
ProcessTagsHash: 123456789,
Stats: []*pb.ClientStatsBucket{
{
Start: 1,
Expand Down
13 changes: 13 additions & 0 deletions pkg/trace/api/api.go
Original file line number Diff line number Diff line change
Expand Up @@ -440,6 +440,7 @@ const (
// tagContainersTags specifies the name of the tag which holds key/value
// pairs representing information about the container (Docker, EC2, etc).
tagContainersTags = "_dd.tags.container"
tagProcessTags = "_dd.tags.process"
)

// TagStats returns the stats and tags coinciding with the information found in header.
Expand Down Expand Up @@ -657,13 +658,21 @@ func (r *HTTPReceiver) handleTraces(v Version, w http.ResponseWriter, req *http.
}
tp.Tags[tagContainersTags] = ctags
}
ptags := getProcessTagsFromHeader(req.Header)
if ptags != "" {
if tp.Tags == nil {
tp.Tags = make(map[string]string)
}
tp.Tags[tagProcessTags] = ptags
}

payload := &Payload{
Source: ts,
TracerPayload: tp,
ClientComputedTopLevel: isHeaderTrue(header.ComputedTopLevel, req.Header.Get(header.ComputedTopLevel)),
ClientComputedStats: isHeaderTrue(header.ComputedStats, req.Header.Get(header.ComputedStats)),
ClientDroppedP0s: droppedTracesFromHeader(req.Header, ts),
ProcessTags: ptags,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see other comment, this data is already present within the TracerPayload passes here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Played with it, but in the end it's a lot cleaner to keep the processTags as a dedicated field available for the stats concentrator

}
r.out <- payload
}
Expand Down Expand Up @@ -700,6 +709,10 @@ func droppedTracesFromHeader(h http.Header, ts *info.TagStats) int64 {
return dropped
}

func getProcessTagsFromHeader(h http.Header) string {
return h.Get(header.ProcessTags)
}

// handleServices handle a request with a list of several services
func (r *HTTPReceiver) handleServices(_ Version, w http.ResponseWriter, _ *http.Request) {
httpOK(w)
Expand Down
3 changes: 3 additions & 0 deletions pkg/trace/api/internal/header/headers.go
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,9 @@ const (
// with the number of traces contained in the payload.
TraceCount = "X-Datadog-Trace-Count"

// ProcessTags is a list that contains process tags split by a ','.
ProcessTags = "X-Datadog-Process-Tags"

// ContainerID specifies the name of the header which contains the ID of the
// container where the request originated.
// Deprecated in favor of Datadog-Entity-ID.
Expand Down
3 changes: 3 additions & 0 deletions pkg/trace/api/payload.go
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,9 @@ type Payload struct {

// ClientDroppedP0s specifies the number of P0 traces chunks dropped by the client.
ClientDroppedP0s int64

// ProcessTags is a list of tags describing an instrumented process.
ProcessTags string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Process Tags are already passed within the TracerPayload.Tags["_dd.tags.process"], why have this dedicated field with the same data?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's used to pass it to stats without doing a map access, agreed though it makes a dupe, I'll move stats to pick it from this field

}

// Chunks returns chunks in TracerPayload
Expand Down
20 changes: 14 additions & 6 deletions pkg/trace/stats/aggregation.go
Original file line number Diff line number Diff line change
Expand Up @@ -46,12 +46,13 @@ type BucketsAggregationKey struct {

// PayloadAggregationKey specifies the key by which a payload is aggregated.
type PayloadAggregationKey struct {
Env string
Hostname string
Version string
ContainerID string
GitCommitSha string
ImageTag string
Env string
Hostname string
Version string
ContainerID string
GitCommitSha string
ImageTag string
ProcessTagsHash uint64
}

func getStatusCode(meta map[string]string, metrics map[string]float64) uint32 {
Expand Down Expand Up @@ -99,6 +100,13 @@ func NewAggregationFromSpan(s *StatSpan, origin string, aggKey PayloadAggregatio
return agg
}

func processTagsHash(processTags string) uint64 {
if processTags == "" {
return 0
}
return peerTagsHash(strings.Split(processTags, ","))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we rename peerTagsHash to be more general now that we're using it in both places?

}

func peerTagsHash(tags []string) uint64 {
if len(tags) == 0 {
return 0
Expand Down
Loading