support process tags in apm stats and spans #35746
Changes from 8 commits
667e41e
5f045d3
b41db28
9af131a
cf5a19b
bb7fef0
0185eb9
9241958
78b9321
4371103
7e553bb
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -440,6 +440,7 @@ const ( | |
| // tagContainersTags specifies the name of the tag which holds key/value | ||
| // pairs representing information about the container (Docker, EC2, etc). | ||
| tagContainersTags = "_dd.tags.container" | ||
| tagProcessTags = "_dd.tags.process" | ||
| ) | ||
|
|
||
| // TagStats returns the stats and tags coinciding with the information found in header. | ||
|
|
@@ -657,13 +658,21 @@ func (r *HTTPReceiver) handleTraces(v Version, w http.ResponseWriter, req *http. | |
| } | ||
| tp.Tags[tagContainersTags] = ctags | ||
| } | ||
| ptags := getProcessTagsFromHeader(req.Header) | ||
| if ptags != "" { | ||
| if tp.Tags == nil { | ||
| tp.Tags = make(map[string]string) | ||
| } | ||
| tp.Tags[tagProcessTags] = ptags | ||
| } | ||
|
|
||
| payload := &Payload{ | ||
| Source: ts, | ||
| TracerPayload: tp, | ||
| ClientComputedTopLevel: isHeaderTrue(header.ComputedTopLevel, req.Header.Get(header.ComputedTopLevel)), | ||
| ClientComputedStats: isHeaderTrue(header.ComputedStats, req.Header.Get(header.ComputedStats)), | ||
| ClientDroppedP0s: droppedTracesFromHeader(req.Header, ts), | ||
| ProcessTags: ptags, | ||
|
Contributor
see other comment, this data is already present within the
Member
Author
Played with it, but in the end it's a lot cleaner to keep the processTags as a dedicated field available for the stats concentrator |
||
| } | ||
| r.out <- payload | ||
| } | ||
|
|
@@ -700,6 +709,10 @@ func droppedTracesFromHeader(h http.Header, ts *info.TagStats) int64 { | |
| return dropped | ||
| } | ||
|
|
||
| func getProcessTagsFromHeader(h http.Header) string { | ||
| return h.Get(header.ProcessTags) | ||
| } | ||
|
|
||
| // handleServices handle a request with a list of several services | ||
| func (r *HTTPReceiver) handleServices(_ Version, w http.ResponseWriter, _ *http.Request) { | ||
| httpOK(w) | ||
|
|
||
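The receiver change above reads the process tags from a request header, stores the raw string under `_dd.tags.process`, and creates the tags map only when needed. A minimal standalone sketch of that pattern follows. The header name `X-Datadog-Process-Tags` and the helper `attachProcessTags` are assumptions made for illustration; the diff only shows the constant `header.ProcessTags`, not its value.

```go
package main

import (
	"fmt"
	"net/http"
)

const (
	// processTagsHeader is a hypothetical header name used only in this sketch.
	processTagsHeader = "X-Datadog-Process-Tags"
	// tagProcessTags matches the tag key added in the diff.
	tagProcessTags = "_dd.tags.process"
)

// attachProcessTags copies the raw header value into tags, allocating the map
// lazily, and returns the (possibly new) map together with the raw value.
func attachProcessTags(h http.Header, tags map[string]string) (map[string]string, string) {
	ptags := h.Get(processTagsHeader)
	if ptags == "" {
		// No header: leave the map untouched (it may stay nil).
		return tags, ""
	}
	if tags == nil {
		tags = make(map[string]string)
	}
	tags[tagProcessTags] = ptags
	return tags, ptags
}

func main() {
	h := http.Header{}
	// The tag values below are made up for the example.
	h.Set(processTagsHeader, "entrypoint.name:server,entrypoint.workdir:app")
	tags, raw := attachProcessTags(h, nil)
	fmt.Println(tags[tagProcessTags] == raw)
}
```

Returning the raw string alongside the map mirrors the PR's choice to keep a dedicated `ProcessTags` field on the payload, so later stages can use the value without a map lookup.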
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -29,6 +29,9 @@ type Payload struct { | |
|
|
||
| // ClientDroppedP0s specifies the number of P0 traces chunks dropped by the client. | ||
| ClientDroppedP0s int64 | ||
|
|
||
| // ProcessTags is a list of tags describing an instrumented process. | ||
| ProcessTags string | ||
|
Contributor
Process Tags are already passed within the
Member
Author
It's used to pass the tags to stats without doing a map access. Agreed that it creates a duplicate, though; I'll move stats to pick it up from this field. |
||
| } | ||
|
|
||
| // Chunks returns chunks in TracerPayload | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -46,12 +46,13 @@ type BucketsAggregationKey struct { | |
|
|
||
| // PayloadAggregationKey specifies the key by which a payload is aggregated. | ||
| type PayloadAggregationKey struct { | ||
| - Env string | ||
| - Hostname string | ||
| - Version string | ||
| - ContainerID string | ||
| - GitCommitSha string | ||
| - ImageTag string | ||
| + Env string | ||
| + Hostname string | ||
| + Version string | ||
| + ContainerID string | ||
| + GitCommitSha string | ||
| + ImageTag string | ||
| + ProcessTagsHash uint64 | ||
| } | ||
|
|
||
| func getStatusCode(meta map[string]string, metrics map[string]float64) uint32 { | ||
|
|
@@ -99,6 +100,13 @@ func NewAggregationFromSpan(s *StatSpan, origin string, aggKey PayloadAggregatio | |
| return agg | ||
| } | ||
|
|
||
| func processTagsHash(processTags string) uint64 { | ||
| if processTags == "" { | ||
| return 0 | ||
| } | ||
| return peerTagsHash(strings.Split(processTags, ",")) | ||
|
||
| } | ||
|
|
||
| func peerTagsHash(tags []string) uint64 { | ||
| if len(tags) == 0 { | ||
| return 0 | ||
|
|
||
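The hunks above add a `ProcessTagsHash` to the payload aggregation key and compute it by splitting the comma-separated string and hashing the parts. The sketch below shows one way such a hash can be made independent of tag order, and how the hash separates buckets when the key is used in a map. It is an illustration only: `aggKey`, `hashTagList` and `hashProcessTags` are hypothetical names, and the sort-then-FNV-1a scheme is an assumption, not a statement of how `peerTagsHash` is implemented.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
	"strings"
)

// aggKey is a hypothetical, trimmed-down stand-in for PayloadAggregationKey.
type aggKey struct {
	Env             string
	ProcessTagsHash uint64
}

// hashTagList hashes a sorted copy of tags with FNV-1a, writing a zero byte
// between tags so that ["ab", "c"] and ["a", "bc"] do not collide trivially.
func hashTagList(tags []string) uint64 {
	if len(tags) == 0 {
		return 0
	}
	sorted := append([]string(nil), tags...) // do not reorder the caller's slice
	sort.Strings(sorted)
	h := fnv.New64a()
	for i, t := range sorted {
		if i > 0 {
			h.Write([]byte{0})
		}
		h.Write([]byte(t))
	}
	return h.Sum64()
}

// hashProcessTags mirrors the shape of processTagsHash in the diff:
// an empty string maps to 0, anything else is split on commas and hashed.
func hashProcessTags(raw string) uint64 {
	if raw == "" {
		return 0
	}
	return hashTagList(strings.Split(raw, ","))
}

func main() {
	counts := map[aggKey]int{}
	// The same two tags in a different order should land in one bucket;
	// the empty string gets its own bucket with hash 0.
	for _, raw := range []string{"a:1,b:2", "b:2,a:1", ""} {
		counts[aggKey{Env: "prod", ProcessTagsHash: hashProcessTags(raw)}]++
	}
	fmt.Println("distinct aggregation keys:", len(counts))
}
```

Storing a fixed-size hash rather than the full string keeps the key comparable and cheap to use as a map key, at the cost of a theoretical collision risk.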
why not have a repeated string field with split tags, same as container tags? This way you don't have to split the tags when computing the hash.
I was also going to suggest using camelCase for consistency but noticed we already have a mix of cases :(
Since we're propagating this payload to multiple intakes, I preferred to keep the same normalisation everywhere by not touching what the tracing library provides and normalizing in the backend. The tracer library will have its own normalisation at the source.
I think this can actually be a pretty big deal when it comes to performance, so we should really think twice about not doing this and be very intentional about it. It's probably fine, but it would be nice to understand what perf penalty we might be eating here with this decision.
What performance cost do you expect? This should be actually the most efficient
So, maybe I'm missing something but from this comment:
The associated cost I'm alluding to is related to that precisely, the need to do string allocations due to the string split, and then calculating the hash on that split here: https://github.com/DataDog/datadog-agent/pull/35746/files#diff-0b51a96e21823e23c7778a95700237481bac225576bba65aaded24d2b170a50dR103-R108.
That split would still be needed if we stored this as an array of strings here: the process tags are received through an HTTP header (a single string that has to be split).
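One way to put a number on the cost discussed in this thread is to count allocations per split directly. The sketch below uses `testing.AllocsPerRun` from the standard library; the helper `splitAllocs` and the sample tag string are made up for illustration and are not part of the PR. In Go the substrings returned by `strings.Split` share the backing memory of the input, so the expectation (to be confirmed by running it) is that the cost is the result slice rather than one allocation per tag.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink keeps the split result alive so the compiler cannot drop the call.
var sink []string

// splitAllocs reports the average number of heap allocations made by one
// strings.Split call on raw, as measured by testing.AllocsPerRun.
func splitAllocs(raw string) float64 {
	return testing.AllocsPerRun(1000, func() {
		sink = strings.Split(raw, ",")
	})
}

func main() {
	// Hypothetical header value; real process tags will differ.
	raw := "entrypoint.name:server,entrypoint.workdir:app,entrypoint.type:jar"
	fmt.Printf("allocations per split: %.0f\n", splitAllocs(raw))
}
```

A fuller answer to the performance question would be a benchmark of the whole hash path on realistic header sizes, which this sketch does not attempt.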