Add semantic conventions for instrumenting AWS Lambda. #1442
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,96 @@ | ||
# Instrumenting AWS Lambda | ||
|
||
**Status**: [Experimental](../../../document-status.md) | ||
|
||
This document defines how to apply semantic conventions when instrumenting an AWS Lambda request handler. AWS | ||
Lambda largely follows the conventions for [FaaS](../faas.md) while [HTTP](../http.md) conventions are also | ||
applicable when handlers are for HTTP requests. | ||
|
||
There are a variety of triggers for Lambda functions, and this document will grow over time to cover all the | ||
use cases. | ||
|
||
## All triggers | ||
|
||
For all events, a span with kind `SERVER` MUST be created corresponding to the function invocation unless stated | ||
otherwise below. Unless stated otherwise below, the name of the span MUST be set to the function name from the | ||
Lambda `Context`. | ||
|
||
The following attributes SHOULD be set. | ||
|
||
- [`faas.execution`](../faas.md) - The value of the AWS Request ID, which is always available through an accessor on the Lambda `Context` | ||
- [`faas.id`](../../../resource/semantic_conventions/faas.md) - The value of the invocation ARN for the function, which is always available through an accessor on the Lambda `Context` | ||
- [`cloud.account.id`](../../../resource/semantic_conventions/cloud.md) - In some languages, this is available as an accessor on the Lambda `Context`. Otherwise, it can be parsed from the value of `faas.id` as the fifth item when splitting on `:` | ||
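
As a rough illustration of the attribute list above, the following Python sketch builds the three attributes from values that the Python Lambda `Context` exposes as `aws_request_id` and `invoked_function_arn`. The helper names `account_id_from_arn` and `common_attributes` are hypothetical and not part of any SDK.

```python
def account_id_from_arn(arn: str) -> str:
    """Return the account ID embedded in an ARN (hypothetical helper)."""
    parts = arn.split(":")
    # arn:aws:lambda:<region>:<account-id>:function:<name>
    # The account ID is the fifth item when splitting on ":".
    if len(parts) < 5 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {arn!r}")
    return parts[4]


def common_attributes(aws_request_id: str, invoked_function_arn: str) -> dict:
    """Build the attributes that apply to every trigger type."""
    return {
        "faas.execution": aws_request_id,
        "faas.id": invoked_function_arn,
        "cloud.account.id": account_id_from_arn(invoked_function_arn),
    }
```

In a handler this could be called as `common_attributes(context.aws_request_id, context.invoked_function_arn)`.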
|
||
### Determining the parent of a span | ||
|
||
The parent of the span MUST be determined by considering both the environment and any headers or attributes | ||
available from the event. | ||
|
||
If the `_X_AMZN_TRACE_ID` environment variable is set, it SHOULD be parsed into an OpenTelemetry `Context` using | ||
> It seems to me that the application owner should be able to decide this. Ideally, the X-Ray propagator (or a specialized LambdaXrayPropagator) would be written in such a way that it checks the environment variable itself. The instrumentation should not decide this.

> The user can currently decide by enabling or disabling XRay, so I think this reflects that choice well. If we had another option in the instrumentation, there would be two settings related to XRay, which I think is just an extra setting. It reminds me that in a follow-up to the SDK PR I need to describe using the AWS propagator directly on the HTTP calls, as we do in Java. It's the only recognized format for the next years, so there is no point in an option, at least until that changes. I think it's similar for Lambda.
||
the [AWS X-Ray Propagator](../../../context/api-propagators.md). If the resulting `Context` is sampled, then this | ||
`Context` is the parent of the function span. The environment variable will be set and the `Context` will be | ||
sampled only if AWS X-Ray has been enabled for the Lambda function. A user can disable AWS X-Ray for the function | ||
if this propagation is not desired. | ||
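
The following is a minimal Python sketch of the environment check described above, written without the OpenTelemetry API so that it stays self-contained. It assumes the X-Ray tracing header format `Root=1-<8 hex>-<24 hex>;Parent=<16 hex>;Sampled=<0|1>`, and it assumes that the trace ID is formed by joining the two hex parts of `Root`. The names `XRayParent`, `parse_xray_header` and `parent_from_environment` are hypothetical. A real instrumentation would delegate the parsing to the AWS X-Ray Propagator instead.

```python
import os
import re
from typing import NamedTuple, Optional


class XRayParent(NamedTuple):
    trace_id: str  # 32 lowercase hex characters
    span_id: str   # 16 lowercase hex characters
    sampled: bool


_ROOT_RE = re.compile(r"^1-([0-9a-fA-F]{8})-([0-9a-fA-F]{24})$")
_PARENT_RE = re.compile(r"^[0-9a-fA-F]{16}$")


def parse_xray_header(header: str) -> Optional[XRayParent]:
    """Parse an X-Ray tracing header; return None if it is malformed."""
    fields = {}
    for part in header.split(";"):
        key, sep, value = part.strip().partition("=")
        if sep:
            fields[key] = value
    root = _ROOT_RE.match(fields.get("Root", ""))
    parent = fields.get("Parent", "")
    if root is None or not _PARENT_RE.match(parent):
        return None
    return XRayParent(
        trace_id=(root.group(1) + root.group(2)).lower(),
        span_id=parent.lower(),
        sampled=fields.get("Sampled") == "1",
    )


def parent_from_environment(environ=os.environ) -> Optional[XRayParent]:
    """Return the parent from _X_AMZN_TRACE_ID, but only if it is sampled."""
    header = environ.get("_X_AMZN_TRACE_ID")
    if not header:
        return None
    parent = parse_xray_header(header)
    if parent is not None and parent.sampled:
        return parent
    return None
```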
|
||
Otherwise, for an API Gateway Proxy Request, the user's configured propagators should be applied to the HTTP | ||
headers of the request to extract a `Context`. | ||
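
The order of precedence described in this section could be sketched in Python as follows. The function `determine_parent` and its callable parameters are hypothetical stand-ins: `extract_xray` plays the role of the AWS X-Ray Propagator, `is_sampled` inspects the extracted context, and `extract_http` plays the role of the user's configured propagators. The `headers` key is taken from the API Gateway proxy request event.

```python
from typing import Callable, Mapping, Optional, TypeVar

C = TypeVar("C")  # whatever type represents an extracted context


def determine_parent(
    environ: Mapping[str, str],
    event: dict,
    extract_xray: Callable[[str], Optional[C]],
    is_sampled: Callable[[C], bool],
    extract_http: Callable[[dict], Optional[C]],
) -> Optional[C]:
    """Pick the parent context for the function invocation span."""
    # 1. Prefer the X-Ray context from the environment, if it is sampled.
    header = environ.get("_X_AMZN_TRACE_ID")
    if header:
        ctx = extract_xray(header)
        if ctx is not None and is_sampled(ctx):
            return ctx
    # 2. Otherwise fall back to the HTTP headers of a proxy request event.
    headers = event.get("headers")
    if isinstance(headers, dict):
        return extract_http(headers)
    # 3. No parent could be determined; the span starts a new trace.
    return None
```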
|
||
## API Gateway | ||
|
||
API Gateway allows a user to trigger a Lambda function in response to HTTP requests. It can be configured to be | ||
a pure proxy, where the information about the original HTTP request is passed to the Lambda function, or as a | ||
configuration for a REST API, in which case only a deserialized body payload is available. When API Gateway is | ||
configured to proxy to the Lambda function, the instrumented request handler has access to all the information | ||
about the HTTP request in the form of an API Gateway Proxy Request Event. | ||
|
||
The Lambda span name and the [`http.route` span attribute](../http.md) SHOULD be set to the `Resource` from the | ||
proxy request event instead of the function name. The `Resource` corresponds to the user-configured HTTP route. | ||
|
||
[`faas.trigger`](../faas.md) MUST be set to `http`. [HTTP attributes](../http.md) SHOULD be set based on the | ||
available information in the proxy request event. | ||
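
A minimal Python sketch of the API Gateway rules, assuming the proxy request event is a `dict` with the `resource` and `httpMethod` keys. The function name `api_gateway_span_data`, the `http.method` attribute key, and the fallback to the function name when `resource` is missing are assumptions made for illustration.

```python
from typing import Dict, Tuple


def api_gateway_span_data(event: dict, function_name: str) -> Tuple[str, Dict[str, str]]:
    """Return (span name, attributes) for an API Gateway proxy request event."""
    attributes = {"faas.trigger": "http"}
    route = event.get("resource")
    if route:
        # The configured route names the span instead of the function name.
        attributes["http.route"] = route
        name = route
    else:
        # Assumption: keep the default span name if no route is available.
        name = function_name
    method = event.get("httpMethod")
    if method:
        attributes["http.method"] = method
    return name, attributes
```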
|
||
## SQS | ||
|
||
SQS is a message queue that triggers a Lambda function with a batch of messages, so we consider processing both | ||
of the batch and of each individual message. The function invocation span MUST correspond to the SQS event, which | ||
is the batch of messages. For each message, an additional span SHOULD be created to correspond with the handling | ||
of the SQS message. Because handling of a message will be inside user business logic, not the Lambda framework, | ||
automatic instrumentation mechanisms without code change will often not be able to instrument the processing of | ||
the individual messages. | ||
|
||
The span kind for both spans MUST be `CONSUMER`. | ||
|
||
### SQS Event | ||
|
||
For the SQS event span, if all the messages in the event have the same event source, the name of the span MUST | ||
be `<event source> process`. If there are multiple sources in the batch, the name MUST be | ||
`multiple_sources process`. The parent MUST be the `SERVER` span corresponding to the function invocation. | ||
|
||
For every message in the event, the message's system attributes (not message attributes, which are provided by | ||
the user) SHOULD be checked for the key `AWSTraceHeader`. If it is present, an OpenTelemetry `Context` SHOULD be | ||
parsed from the value of the attribute using the [AWS X-Ray Propagator](../../../context/api-propagators.md) and | ||
added as a link to the span. This means the span may have as many links as messages in the batch. | ||
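
To illustrate the distinction between system attributes and message attributes, here is a small Python sketch that collects the raw `AWSTraceHeader` values from an SQS event. It assumes the Lambda SQS event shape in which each entry of `Records` carries system attributes under `attributes` and user-provided attributes under `messageAttributes`. The function name `trace_headers_for_links` is hypothetical. Each returned header would then be parsed with the AWS X-Ray Propagator and added as a link.

```python
from typing import List


def trace_headers_for_links(sqs_event: dict) -> List[str]:
    """Collect one AWSTraceHeader value per message that carries it."""
    headers = []
    for record in sqs_event.get("Records", []):
        # Only system attributes are consulted; user-provided
        # messageAttributes are deliberately ignored.
        system_attributes = record.get("attributes") or {}
        header = system_attributes.get("AWSTraceHeader")
        if header:
            headers.append(header)
    return headers
```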
|
||
[`faas.trigger`](../faas.md) MUST be set to `pubsub`. | ||
[`messaging.operation`](../messaging.md) MUST be set to `process`. | ||
[`messaging.system`](../messaging.md) MUST be set to `AmazonSQS`. | ||
|
||
### SQS Message | ||
|
||
For the SQS message span, the name MUST be `<event source> process`. The parent MUST be the `CONSUMER` span | ||
corresponding to the SQS event. The message's system attributes (not message attributes, which are provided by | ||
the user) SHOULD be checked for the key `AWSTraceHeader`. If it is present, an OpenTelemetry `Context` SHOULD be | ||
parsed from the value of the attribute using the [AWS X-Ray Propagator](../../../context/api-propagators.md) and | ||
added as a link to the span. | ||
|
||
[`faas.trigger`](../faas.md) MUST be set to `pubsub`. | ||
[`messaging.operation`](../messaging.md) MUST be set to `process`. | ||
[`messaging.system`](../messaging.md) MUST be set to `AmazonSQS`. | ||
|
||
Other [Messaging attributes](../messaging.md) SHOULD be set based on the available information in the SQS message | ||
event. | ||
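
The name, kind and required attributes of the per-message span could be assembled as in the Python sketch below. It assumes that `<event source>` refers to the `eventSource` field of the SQS record (for example `aws:sqs`), which the text does not state explicitly. The function name `sqs_message_span` is hypothetical.

```python
from typing import Dict, Tuple


def sqs_message_span(record: dict) -> Tuple[str, str, Dict[str, str]]:
    """Return (span name, span kind, attributes) for a single SQS message."""
    # Assumption: "<event source>" is the record's eventSource field.
    source = record.get("eventSource", "")
    name = f"{source} process"
    attributes = {
        "faas.trigger": "pubsub",
        "messaging.operation": "process",
        "messaging.system": "AmazonSQS",
    }
    return name, "CONSUMER", attributes
```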
|
||
Note that `AWSTraceHeader` is the only supported mechanism for propagating `Context` for SQS to prevent conflicts | ||
> Supported by whom?

> I think it should be possible to use any OpenTelemetry-compatible vendor with SQS without paying for X-Ray. I'd like it more if we say something to the effect that "Instrumentations SHOULD default to using AWSTraceHeader with AWS X-Ray Propagator, but SHOULD be configurable to use any other header and propagator". If that makes sense at all. I.e., I don't think we should indirectly "force" users to pay for X-Ray in our semantic conventions, even if AWS has implemented special features to make it nicer to use than other solutions.

> Also related: #1442 (comment)

> I meant supported by our instrumentation. I am going to try adding some text with some of this information to allay the concerns, which are fair but I wouldn't be worried :)

> I support @anuraaga here. With the first approach to AWS propagation (for SQS producer - consumer) I used the "standard" HTTP-headers mechanism, creating an SQS message attribute (upon produce) and extracting the parent from it (upon consume). While this approach worked in the SQS - SQS scenario, it would fail with more complex (but commonly used) scenarios such as S3 - SQS or S3 - SNS - SQS.
> Therefore I had to switch back to using the AWS feature, which guarantees that if the AWS Trace header is set during an AWS SDK request, the value will be maintained and returned at the end of the chain (consume, as an SQS system attribute). There was simply no other way to do it (i.e. without relying on AWS features). To sum up, the approach that came out of discussions and code reviews was:
> Frankly, at the beginning it didn't feel right to implement propagation enforcing the AWS trace format, but I realised that it's just relying on how a closed system works, just as we do with instrumentation libraries, in order to support as many use cases as possible.

> Is this comment addressed @arminru ?
||
with other sources. Notably, message attributes (user-provided, not system) are not supported - the linked contexts | ||
are always expected to have been sent as HTTP headers of the `SQS.SendMessage` request that the message originated | ||
from. This is a function of AWS SDK instrumentation, not Lambda instrumentation. |
> This document doesn't mention its relationship to the FaaS spec, https://github.com/open-telemetry/opentelemetry-specification/blob/main/semantic_conventions/trace/faas.yaml. Is it extending it or overriding it?

> Doesn't the first paragraph clarify this, including an actual link to faas? Let me know if something's unclear.