Fix ASGIGetter keys to fetch from actual carrier headers #1435

Merged
merged 14 commits into from
Jan 12, 2023
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -14,6 +14,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- `opentelemetry-instrumentation-aws-lambda` Adds an option to configure `disable_aws_context_propagation` by
environment variable: `OTEL_LAMBDA_DISABLE_AWS_CONTEXT_PROPAGATION`
([#1507](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/1507))
- `opentelemetry-instrumentation-asgi` Fix keys() in class ASGIGetter to correctly fetch values from carrier headers.
([#1435](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/1435))


## Version 1.15.0/0.36b0 (2022-12-10)
@@ -260,7 +260,8 @@ def get(
        return decoded

    def keys(self, carrier: dict) -> typing.List[str]:
        return [_key.decode("utf8") for (_key, _value) in carrier]
        headers = carrier.get("headers") or []
Contributor

Just curious: if "headers" is not in carrier.keys, shouldn't this method return [_key.decode("utf8") for (_key, _value) in carrier]? As modified, this method returns an empty list if "headers" is not in carrier.keys.

Contributor Author

Considering the .get() method above returns header-specific fields and its description mentions "header name in scope", I assumed here we'd only want to return headers rather than the whole carrier, which in this case is the ASGI scope. I'm not too sure fetching all values from the scope is correct, though. There's a bunch of ASGI-specific info there, and I wouldn't expect to see (nor inject) any extra info from OpenTelemetry.

Now that you mention it though, I guess some people may be relying on this behaviour now, so maybe we should join both lists? Or would we treat this as a breaking change instead?
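To make the trade-off concrete, here is a minimal sketch of what "join both lists" would produce. The scope below is hypothetical: the key names follow the ASGI HTTP scope, but every value is invented for illustration, and `joined` is not something the PR implements.

```python
# Hypothetical ASGI HTTP scope; key names follow the ASGI spec, values are invented.
scope = {
    "type": "http",
    "asgi": {"version": "3.0"},
    "http_version": "1.1",
    "method": "GET",
    "path": "/items",
    "query_string": b"",
    "client": ("127.0.0.1", 50000),
    "server": ("127.0.0.1", 8000),
    "headers": [
        (b"host", b"example.org"),
        (b"traceparent", b"00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01"),
    ],
}

# Header names only, decoded the same way the changed keys() does.
header_names = [name.decode("utf8") for name, _value in scope.get("headers") or []]

# The "join both lists" idea: header names followed by the top-level scope keys.
joined = header_names + [key for key in scope if key not in header_names]

print(header_names)
print(joined)
```

The joined list mixes propagation-relevant names such as `traceparent` with ASGI bookkeeping such as `method` and `client`, which is the noise this comment is concerned about.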

@torqataNate torqataNate Dec 17, 2022

I would argue that this should be keyed to headers, but we do need to be conscious of other formats. For example, looking through most propagators, the common use is for the carrier to contain only header keys. The most relevant part of the spec docs is here:
https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/context/api-propagators.md#:~:text=In%20order%20to%20increase%20compatibility%2C
The carrier of propagated data on both the client (injector) and server (extractor) side is usually an HTTP request. In order to increase compatibility, the key/value pairs MUST only consist of US-ASCII characters that make up valid HTTP header fields as per [RFC 7230](https://tools.ietf.org/html/rfc7230#section-3.2).

The HTTP request will have other keys with it, such as client, method, query params, etc., which we don't care about; we only need the headers. Only the TextMapPropagator section states that the carrier is usually an HTTP request, although supplying the HTTP payload as the carrier does seem to be the standard. The spec's section on propagators doesn't actually specify what's in a carrier: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/context/api-propagators.md#carrier

In the branch I was going to open a pull request from, I searched for the headers key first and iterated over that only, and if it did not exist, I used the carrier payload itself, as it should then be a header.

The argument could be made that we need to look through the entire carrier payload for headers, as the format of carrier has not been specified. But it feels burdensome for this method to have to deal with that.

Lastly, the above pull request would also fail on line 264 when a header isn't a bytes type, so it is still broken. I had a pull request I was going to submit, but I'm not a contributor. How would you like me to submit the proposed change?
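The branch described in this comment is not shown in the thread, so the following is only a guess at its shape: a sketch that prefers `carrier["headers"]`, falls back to treating the carrier itself as the header collection, and tolerates header names that are already `str`. The names `keys_with_fallback` and `_to_text` are hypothetical and are not part of the instrumentation.

```python
import typing


def _to_text(key: typing.Union[bytes, str]) -> str:
    # ASGI header names are byte strings; other carriers may already hold str.
    return key.decode("utf8") if isinstance(key, bytes) else key


def keys_with_fallback(carrier) -> typing.List[str]:
    """Hypothetical variant: use carrier["headers"] if present, else the carrier itself."""
    if isinstance(carrier, dict):
        if "headers" in carrier:
            pairs = carrier["headers"] or []
        else:
            # No "headers" entry: assume the dict maps header names to values.
            pairs = carrier.items()
    else:
        # Assume an iterable of (name, value) pairs.
        pairs = carrier
    return [_to_text(key) for key, _value in pairs]


print(keys_with_fallback({"headers": [(b"test-key", b"val")]}))
print(keys_with_fallback({"traceparent": "00-abc-def-01"}))
```

One consequence of the fallback is that a real ASGI scope with no `headers` entry would yield names like `type` and `path`, which is the behaviour the earlier comments question.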

Contributor Author

As far as this processor is concerned, it has been working until this change. The previous behaviour was not correct either, but at least it would not crash. The changes in this PR directly address the main issue, which is the crash caused by iterating over the wrong value, by fetching the correct value the same way we do in .get(). I would argue that any further changes in behaviour are out of scope and probably require a separate discussion and investigation. In any case, that is something for the maintainers to decide. CC @ocelotl @srikanthccv
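For readers following along, the crash can be reproduced without the instrumentation installed. The sketch below copies the two one-line bodies from the diff into standalone functions; the function names and the sample scope are assumptions made for illustration.

```python
import typing

# Trimmed-down ASGI HTTP scope; key names follow the ASGI spec, values are invented.
scope = {
    "type": "http",
    "method": "GET",
    "path": "/",
    "headers": [(b"traceparent", b"00-abc-def-01"), (b"host", b"example.org")],
}


def keys_before(carrier: dict) -> typing.List[str]:
    # Pre-change body: iterating a dict yields its string keys, so unpacking
    # a key such as "type" into two names raises ValueError.
    return [_key.decode("utf8") for (_key, _value) in carrier]


def keys_after(carrier: dict) -> typing.List[str]:
    # Post-change body: read the header pairs the same way .get() does.
    headers = carrier.get("headers") or []
    return [_key.decode("utf8") for (_key, _value) in headers]


try:
    keys_before(scope)
    crashed = False
except ValueError:
    crashed = True

print(crashed)
print(keys_after(scope))
```

An empty carrier does not trigger the failure in either version, since there is nothing to unpack.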

Member

I think this is a correct fix for both the crash and the behaviour, because the web framework instrumentation may receive carriers of different types, but their goal is to inject/extract the trace context information into the HTTP header, which is used for propagation. And since the ASGI specification says headers is Iterable[[byte string, byte string]], I think it's safe to assume keys will be byte strings, so it's not broken.
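As an illustration of relying on byte-string header pairs, here is a sketch of a case-insensitive lookup over `scope["headers"]`. `get_header` is a hypothetical helper written for this note; it mirrors what the tests in this PR expect from `.get()` (a list of decoded values, or `None` when nothing matches) but is not the library's implementation.

```python
import typing


def get_header(scope: dict, name: str) -> typing.Optional[typing.List[str]]:
    """Hypothetical helper: case-insensitive lookup over ASGI byte-string headers."""
    wanted = name.lower()
    values = [
        value.decode("utf8")
        for key, value in scope.get("headers") or []
        if key.decode("utf8").lower() == wanted
    ]
    # Return None rather than an empty list when no header matches.
    return values or None


print(get_header({"headers": [(b"Test-Key", b"val")]}, "test-key"))
print(get_header({}, "test"))
```

If a server ever passed `str` header names, the `.decode` call would raise `AttributeError`, which is the failure mode raised earlier in the thread.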

        return [_key.decode("utf8") for (_key, _value) in headers]


asgi_getter = ASGIGetter()
@@ -18,12 +18,18 @@


class TestASGIGetter(TestCase):
    def test_get_none(self):
    def test_get_none_empty_carrier(self):
        getter = ASGIGetter()
        carrier = {}
        val = getter.get(carrier, "test")
        self.assertIsNone(val)

    def test_get_none_empty_headers(self):
        getter = ASGIGetter()
        carrier = {"headers": []}
        val = getter.get(carrier, "test")
        self.assertIsNone(val)

    def test_get_(self):
        getter = ASGIGetter()
        carrier = {"headers": [(b"test-key", b"val")]}
@@ -44,7 +50,22 @@ def test_get_(self):
            "Should be case insensitive",
        )

    def test_keys(self):
    def test_keys_empty_carrier(self):
        getter = ASGIGetter()
        keys = getter.keys({})
        self.assertEqual(keys, [])

    def test_keys_empty_headers(self):
        getter = ASGIGetter()
        keys = getter.keys({"headers": []})
        self.assertEqual(keys, [])

    def test_keys(self):
        getter = ASGIGetter()
        carrier = {"headers": [(b"test-key", b"val")]}
        expected_val = ["test-key"]
        self.assertEqual(
            getter.keys(carrier),
            expected_val,
            "Should be equal",
        )