Remove http.XX_content_length* semantic attributes #2028

iNikem · 2021-10-18T14:20:41Z

Http semantic convention defines 4 attributes which are very different from all others. Namely http.request_content_length, http.request_content_length_uncompressed, http.response_content_length and http.response_content_length_uncompressed. I think this information should be captured as metrics, not as span attributes. I cannot imagine a use-case where anybody would want to search/filter by this attribute or use it in any other "dimension"-like way. This is clearly a metric.

The text was updated successfully, but these errors were encountered:

Oberon00 · 2021-10-18T14:23:08Z

How about filtering for requests that exceed a certain size? Or requests without body (length=0)?

iNikem · 2021-10-18T14:31:48Z

Ok, true. Still, wouldn't it better served by metrics?

pyohannes · 2021-10-18T18:17:41Z

How about filtering for requests that exceed a certain size? Or requests without body (length=0)?

Also, having those attributes on spans will provide more help in troubleshooting or reproducing problems. You might get answers to those questions:

Why do request sizes exceed a certain limit?
Why do response sizes exceed a certain limit?
Why are request bodies missing?

You are much more likely to get answers to those questions from traces than from metrics.

lmolkova · 2021-11-06T04:58:50Z

assuming metrics are correlated to traces with exemplars, would it be possible to answer all these questions just not for a particular request? I.e. if you have problems with big requests, you'd see an exemplar of it on metrics for request/response size and there will be no need to stamp it on every request.

Oberon00 · 2021-11-06T13:08:39Z

@lmolkova Probably true, but the same could be said for http.status_code: "If you have a problem with status 500 requests, you'd see an examplar of it on metrics for status codes".

lmolkova · 2021-11-07T21:51:41Z

@Oberon00 I see your point, I'm not sure status code is the same as request size. Status code indicates a failure and you'd generally want to know about it. Size might be the cause of failure or big latency and I think only the correlation between size and status/latency is interesting. Size alone is rarely important.
This correlation seems to be better expressed with metrics.

anuraaga · 2021-11-08T02:59:20Z

I've always understood there is a (unspoken?) goal to allow spans to be converted to metrics fairly robustly, and is the motivation for work on propagating sampling probabilities among others. So I always assumed the attributes on a span would always be a superset of metrics - though this may not be true.

lmolkova · 2021-11-08T04:52:31Z

I guess it boils down to the stage at which metrics are reported: 1) separately, 2) in instrumentation API or 3) from spans in post-processing, e.g. collector. I think p3 leads to different challenges related to sampling and filtering and metrics reconstruction may not work great with all samplers or make them very complicated, as they start having a state.
Adaptive sampling algorithms that keep rate of reported traces (e.g. 5 traces per sec) are complex enough which leads to inaccuracy in calculations, especially with bursts of traffic.
So p3 seems like an viable, but imprecise solution suitable for some cases. It's great where p1 or p2 are not available.
Assuming we can report metrics from instrumentation API before (and regardless of) starting a span, we'd still have all attributes, but stamp a subset of them on spans and others go to metrics.
I agree this issue depends on overall metrics collection approach.

anuraaga · 2021-11-08T05:00:42Z

#1769 might be a reference on the sampling algorithm issue - the impression I got is that a hard rate limiting approach isn't appropriate in OTel for the reasons you mention, making it impossible to compute metrics from spans, so an approximate rate limiting approach is used instead. I personally don't need to compute metrics from spans and don't think instrumentation should take the potentially imprecise approach vs directly collecting metrics, but it appears to be important in general to keep the door open for span-based metrics. I guess removing attributes like these content length ones could close the door on it.

anuraaga · 2021-11-08T05:07:48Z

Ah - I just remembered that I had initially added these attributes to the spec 😅 I think my only motivation was to close a gap with the X-Ray data model, which includes them too

https://docs.aws.amazon.com/xray/latest/devguide/xray-api-segmentdocuments.html#api-segmentdocuments-http

If these attributes were removed, X-Ray would probably have to use the relatively new response.header.content_length which should be fine.

lmolkova · 2022-04-20T00:58:17Z

Came across it again in scope of #2469. Here's how these attributes are populated: #2114 (comment) in Java, .NET, JS, Python and Go.

http.XX_content_length is populated by some Java instrumentations, all Go instrumentations, and all JS instrumentations (.NET and Python don't populate them).
Only JS populates http.XX_content_length_uncompressed conditionally depending on Content-Encoding == identity.
AWS XRay seems to only need response content-length, which is "content_length – integer indicating the length of the response body in bytes." - presumably raw data length
HTTP clients don't necessarily decompress and instrumentation doesn't really know when decompression happens (e.g. when stream is read which can happen after the span ends)

So one thing I'd like to do is remove *_uncompressed attributes.

Then content_length: I'm trying to understand when it doesn't match Content-Length header. One case I found is where Chunked encoding is used, which requires Content-Length to not be set, but then it's up to network stream implementation to read it from the socket until EOF, so HTTP instrumentation won't be able to know about content length anyway.

Unless I'm missing something, we can remove all 4 attributes and let users/distros opt-in into http.*.header.content_length

@anuraaga lmk if you still think it'd work for AWS.

@Oberon00 @iNikem @pyohannes please share if you have any objections.

anuraaga · 2022-04-20T01:22:43Z

I think that should work for X-Ray.

As an anecdote, I have found good value from response length and uncompressed length in metrics when using the Armeria server library which exports these natively. It's true that black box instrumentation will generally not be able to handle the compression, but native instrumentation potentially could. Just food for thought on whether OTel semantic conventions are supposed to apply even when they would be hard to support in instrumentation libraries but could be in frameworks directly.

Either way I would remove now and potentially add back only in metrics if needed later to avoid slowing down semantic convention stability.

Oberon00 · 2022-04-20T08:06:40Z

HTTP clients don't necessarily decompress and instrumentation doesn't really know when decompression happens (e.g. when stream is read which can happen after the span ends)
So one thing I'd like to do is remove *_uncompressed attributes.

Would the opposite situation also be plausible, i.e. HTTP clients that always decompress and give no possibility to access the raw length?
Maybe it would be best to remove these altogether and only rely on the generic header conventions to report any content-length header?

lmolkova · 2022-04-20T17:14:25Z

Would the opposite situation also be plausible, i.e. HTTP clients that always decompress and give no possibility to access the raw length?

I believe HTTP clients always know the raw length (at least from Content-Length header, which is raw payload size), but if we define our attributes as something smarter, then it's up to the instrumentation author to interpret it.

Maybe it would be best to remove these altogether and only rely on the generic header conventions to report any content-length header?

Agreed and, also to @anuraaga 's point, - we probably would find metrics around content length very useful. And when it happens, we might need to return http.*.header.content_length back or require to add header attributes by default

…2028

lmolkova · 2022-07-27T19:04:34Z

after some back and force, I kept http.request_content_length and http.response_content_length because they are useful and enable metrics-over traces scenario.

But given how unreliable uncompressed length population is and that OTel HTTP instrumentations don't usually populate it, I removed http.response_content_length_uncompressed and http.request_content_length_uncompressed. in #2469 .

Since http request and response size metrics are defined now, I believe the only open question here is if we want to have a special name for attributes (http.request|response_content_length) or we can reuse http.*.header.content_length.

since Content-Length header is not always present, but some specialized instrumentation can still populate it and it's nice to have a generic name for it, I suggest keeping http.request|response_content_length.

@iNikem thoughts? do you feel we can close thin one?

iNikem added the spec:trace Related to the specification/trace directory label Oct 18, 2021

github-actions bot assigned SergeyKanzhelev Oct 18, 2021

arminru added the area:semantic-conventions Related to semantic conventions label Oct 18, 2021

lmolkova mentioned this issue Nov 10, 2021

Optional vs required attributes in HTTP conventions and beyond #2114

Closed

tedsuo moved this to V1 - Required for stable in Spec: HTTP Mar 29, 2022

tedsuo added this to Spec: HTTP Mar 29, 2022

lmolkova mentioned this issue Apr 5, 2022

Remove alternative attribute sets from HTTP semantic conventions #2469

Merged

denisivan0v mentioned this issue Apr 19, 2022

Agree on the scope for HTTP semantic conventions v1.0 #2499

Closed

4 tasks

lmolkova added a commit to lmolkova/opentelemetry-specification that referenced this issue Apr 26, 2022

remove uncomressed content length after discussion in open-telemetry#…

e2d9a3d

…2028

lmolkova added a commit to lmolkova/opentelemetry-specification that referenced this issue Apr 29, 2022

remove uncomressed content length after discussion in open-telemetry#…

c493b29

…2028

lmolkova added a commit to lmolkova/opentelemetry-specification that referenced this issue Jun 1, 2022

remove uncomressed content length after discussion in open-telemetry#…

9c1a3c1

…2028

lmolkova moved this from V1 - Required for stable to Done in Spec: HTTP Sep 19, 2022

SergeyKanzhelev removed their assignment Feb 18, 2023

This was referenced Apr 27, 2023

BREAKING: Introduce common url.* attributes, and improve use of namespacing under http.* #3355

Merged

Messaging: remove source to better align with ECS #3450

Closed

reyang closed this as completed in #3355 May 9, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove http.XX_content_length* semantic attributes #2028

Remove http.XX_content_length* semantic attributes #2028

iNikem commented Oct 18, 2021

Oberon00 commented Oct 18, 2021

iNikem commented Oct 18, 2021

pyohannes commented Oct 18, 2021

lmolkova commented Nov 6, 2021

Oberon00 commented Nov 6, 2021

lmolkova commented Nov 7, 2021 •

edited

Loading

anuraaga commented Nov 8, 2021

lmolkova commented Nov 8, 2021

anuraaga commented Nov 8, 2021

anuraaga commented Nov 8, 2021

lmolkova commented Apr 20, 2022 •

edited

Loading

anuraaga commented Apr 20, 2022

Oberon00 commented Apr 20, 2022 •

edited

Loading

lmolkova commented Apr 20, 2022

lmolkova commented Jul 27, 2022

Remove http.XX_content_length* semantic attributes #2028

Remove http.XX_content_length* semantic attributes #2028

Comments

iNikem commented Oct 18, 2021

Oberon00 commented Oct 18, 2021

iNikem commented Oct 18, 2021

pyohannes commented Oct 18, 2021

lmolkova commented Nov 6, 2021

Oberon00 commented Nov 6, 2021

lmolkova commented Nov 7, 2021 • edited Loading

anuraaga commented Nov 8, 2021

lmolkova commented Nov 8, 2021

anuraaga commented Nov 8, 2021

anuraaga commented Nov 8, 2021

lmolkova commented Apr 20, 2022 • edited Loading

anuraaga commented Apr 20, 2022

Oberon00 commented Apr 20, 2022 • edited Loading

lmolkova commented Apr 20, 2022

lmolkova commented Jul 27, 2022

lmolkova commented Nov 7, 2021 •

edited

Loading

lmolkova commented Apr 20, 2022 •

edited

Loading

Oberon00 commented Apr 20, 2022 •

edited

Loading