feat(tracer): implement AWS payload tagging for request/response #10642

Merged
bouwkast merged 56 commits into main from steven/aws-payload-tagging
Nov 12, 2024

Conversation

bouwkast

@bouwkast bouwkast commented Sep 11, 2024

Overview

This pull request adds the ability to expand AWS request/response payloads as span tags.
This matches our lambda offerings and provides useful information to developers when debugging communication between various AWS services.
This is based on the AWS Payload Tagging RFC and this implementation in dd-trace-node and this implementation in dd-trace-java.

This feature is disabled by default.

When activated this will produce span tags such as:

 "aws.request.body.PublishBatchRequestEntries.0.Id": "1",
 "aws.request.body.PublishBatchRequestEntries.0.Message": "ironmaiden",
 "aws.request.body.PublishBatchRequestEntries.1.Id": "2",
 "aws.request.body.PublishBatchRequestEntries.1.Message": "megadeth",
 "aws.response.body.HTTPStatusCode": "200"
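
To illustrate how a nested payload can turn into dotted tag keys like the ones above, here is a minimal sketch. `flatten_payload` is a hypothetical helper written for this example, not the function added by this PR, and it ignores redaction and the configured limits.

```python
from typing import Any, Dict


def flatten_payload(payload: Any, prefix: str) -> Dict[str, str]:
    """Flatten nested dicts and lists into dotted tag keys (hypothetical helper)."""
    tags: Dict[str, str] = {}

    def walk(node: Any, path: str) -> None:
        if isinstance(node, dict):
            for key, value in node.items():
                walk(value, f"{path}.{key}")
        elif isinstance(node, (list, tuple)):
            # List items are addressed by their index, e.g. "...Entries.0.Id"
            for index, value in enumerate(node):
                walk(value, f"{path}.{index}")
        else:
            # Leaf values are stored as strings, as span tags are
            tags[path] = str(node)

    walk(payload, prefix)
    return tags


request = {
    "PublishBatchRequestEntries": [
        {"Id": "1", "Message": "ironmaiden"},
        {"Id": "2", "Message": "megadeth"},
    ]
}
tags = flatten_payload(request, "aws.request.body")
for key, value in tags.items():
    print(f'"{key}": "{value}"')
```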

Configuration

There are five new configuration options:

  • DD_TRACE_CLOUD_REQUEST_PAYLOAD_TAGGING:
    • "" by default to indicate that AWS request payload expansion is disabled for requests.
    • "all" to define that AWS request payload expansion is enabled for requests using the default JSONPaths for redaction logic.
    • a comma-separated list of user-supplied JSONPaths to define that AWS request payload expansion is enabled for requests using the default JSONPaths and the user-supplied JSONPaths for redaction logic.
  • DD_TRACE_CLOUD_RESPONSE_PAYLOAD_TAGGING:
    • "" by default to indicate that AWS response payload expansion is disabled for responses.
    • "all" to define that AWS response payload expansion is enabled for responses using the default JSONPaths for redaction logic.
    • a comma-separated list of user-supplied JSONPaths to define that AWS response payload expansion is enabled for responses using the default JSONPaths and the user-supplied JSONPaths for redaction logic.
  • DD_TRACE_CLOUD_PAYLOAD_TAGGING_MAX_DEPTH (not defined in RFC but done to match NodeJS):
    • sets the depth after which we stop creating tags from a payload
    • defaults to a value of 10
  • DD_TRACE_CLOUD_PAYLOAD_TAGGING_MAX_TAGS (to match Java implementation)
    • sets the maximum number of tags allowed to be expanded
    • defaults to a value of 758
  • DD_TRACE_CLOUD_PAYLOAD_TAGGING_SERVICES (to match Java implementation)
    • a comma-separated list of supported AWS services
    • defaults to s3,sns,sqs,kinesis,eventbridge
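
As a rough illustration of the three states described for the request/response options (disabled, "all", or a list of extra JSONPaths), the sketch below interprets the raw setting values. `parse_tagging_setting` and the sample JSONPaths are assumptions made for this example; the splitting on plain commas is also an assumption and would not cope with JSONPaths that themselves contain commas.

```python
from typing import Dict, List, Optional, Tuple


def parse_tagging_setting(raw: Optional[str]) -> Tuple[bool, List[str]]:
    """Return (enabled, user_jsonpaths) for one setting (hypothetical helper)."""
    value = (raw or "").strip()
    if not value:
        # "" (the default): payload expansion is disabled
        return False, []
    if value == "all":
        # Enabled, redacting with the default JSONPaths only
        return True, []
    # Enabled, redacting with the defaults plus these user-supplied JSONPaths
    return True, [part.strip() for part in value.split(",") if part.strip()]


# Stand-in for os.environ so the example has no side effects
env: Dict[str, str] = {
    "DD_TRACE_CLOUD_REQUEST_PAYLOAD_TAGGING": "$.Metadata.UserId,$.Attributes.Token",
    "DD_TRACE_CLOUD_RESPONSE_PAYLOAD_TAGGING": "all",
}

request_setting = parse_tagging_setting(env.get("DD_TRACE_CLOUD_REQUEST_PAYLOAD_TAGGING"))
response_setting = parse_tagging_setting(env.get("DD_TRACE_CLOUD_RESPONSE_PAYLOAD_TAGGING"))

# Defaults taken from the list above
max_depth = int(env.get("DD_TRACE_CLOUD_PAYLOAD_TAGGING_MAX_DEPTH", "10"))
max_tags = int(env.get("DD_TRACE_CLOUD_PAYLOAD_TAGGING_MAX_TAGS", "758"))
services = env.get(
    "DD_TRACE_CLOUD_PAYLOAD_TAGGING_SERVICES", "s3,sns,sqs,kinesis,eventbridge"
).split(",")

print(request_setting, response_setting, max_depth, max_tags, services)
```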

Other

  • jsonpath-ng has been vendored
  • ply has been vendored (v3.11) (dependency of jsonpath-ng)

Checklist

  • PR author has checked that all the criteria below are met
  • The PR description includes an overview of the change
  • The PR description articulates the motivation for the change
  • The change includes tests OR the PR description describes a testing strategy
  • The PR description notes risks associated with the change, if any
  • Newly-added code is easy to change
  • The change follows the library release note guidelines
  • The change includes or references documentation updates if necessary
  • Backport labels are set (if applicable)

Reviewer Checklist

  • Reviewer has checked that all the criteria below are met
  • Title is accurate
  • All changes are related to the pull request's stated goal
  • Avoids breaking API changes
  • Testing strategy adequately addresses listed risks
  • Newly-added code is easy to change
  • Release note makes sense to a user of the library
  • If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy


github-actions bot commented Sep 11, 2024

CODEOWNERS have been resolved as:

ddtrace/_trace/utils_botocore/aws_payload_tagging.py                    @DataDog/apm-sdk-api-python
ddtrace/vendor/jsonpath_ng/__init__.py                                  @DataDog/apm-core-python
ddtrace/vendor/jsonpath_ng/exceptions.py                                @DataDog/apm-core-python
ddtrace/vendor/jsonpath_ng/jsonpath.py                                  @DataDog/apm-core-python
ddtrace/vendor/jsonpath_ng/lexer.py                                     @DataDog/apm-core-python
ddtrace/vendor/jsonpath_ng/parser.py                                    @DataDog/apm-core-python
ddtrace/vendor/ply/__init__.py                                          @DataDog/apm-core-python
ddtrace/vendor/ply/lex.py                                               @DataDog/apm-core-python
ddtrace/vendor/ply/yacc.py                                              @DataDog/apm-core-python
releasenotes/notes/add-aws-payload-tagging-d01f0033c7e1f5c0.yaml        @DataDog/apm-python
tests/snapshots/tests.contrib.botocore.test.BotocoreTest.test_aws_payload_tagging_eventbridge.json  @DataDog/apm-python
tests/snapshots/tests.contrib.botocore.test.BotocoreTest.test_aws_payload_tagging_kinesis.json  @DataDog/apm-python
tests/snapshots/tests.contrib.botocore.test.BotocoreTest.test_aws_payload_tagging_s3.json  @DataDog/apm-python
tests/snapshots/tests.contrib.botocore.test.BotocoreTest.test_aws_payload_tagging_s3_invalid_config.json  @DataDog/apm-python
tests/snapshots/tests.contrib.botocore.test.BotocoreTest.test_aws_payload_tagging_s3_valid_config.json  @DataDog/apm-python
tests/snapshots/tests.contrib.botocore.test.BotocoreTest.test_aws_payload_tagging_sns.json  @DataDog/apm-python
tests/snapshots/tests.contrib.botocore.test.BotocoreTest.test_aws_payload_tagging_sns_valid_config.json  @DataDog/apm-python
tests/snapshots/tests.contrib.botocore.test.BotocoreTest.test_aws_payload_tagging_sqs.json  @DataDog/apm-python
ddtrace/_trace/utils_botocore/span_tags.py                              @DataDog/apm-sdk-api-python
ddtrace/contrib/internal/botocore/patch.py                              @DataDog/apm-core-python @DataDog/apm-idm-python
ddtrace/vendor/__init__.py                                              @DataDog/apm-core-python
docs/configuration.rst                                                  @DataDog/python-guild
docs/spelling_wordlist.txt                                              @DataDog/python-guild
tests/appsec/integrations/test_flask_entrypoint_iast_patches.py         @DataDog/asm-python
tests/conftest.py                                                       @DataDog/apm-core-python
tests/contrib/botocore/test.py                                          @DataDog/apm-core-python @DataDog/apm-idm-python


pr-commenter bot commented Sep 11, 2024

Benchmarks

Benchmark execution time: 2024-11-12 20:45:03

Comparing candidate commit c6c5551 in PR branch steven/aws-payload-tagging with baseline commit 9e7acd0 in branch main.

Found 0 performance improvements and 16 performance regressions! Performance is the same for 372 metrics, 2 unstable metrics.

scenario:sethttpmeta-all-disabled

  • 🟥 max_rss_usage [+3.664MB; +3.837MB] or [+12.538%; +13.131%]

scenario:sethttpmeta-all-enabled

  • 🟥 max_rss_usage [+3.602MB; +3.778MB] or [+12.302%; +12.901%]

scenario:sethttpmeta-collectipvariant_exists

  • 🟥 max_rss_usage [+3.684MB; +3.847MB] or [+12.611%; +13.166%]

scenario:sethttpmeta-no-collectipvariant

  • 🟥 max_rss_usage [+3.806MB; +3.970MB] or [+13.026%; +13.589%]

scenario:sethttpmeta-no-useragentvariant

  • 🟥 max_rss_usage [+3.662MB; +3.831MB] or [+12.525%; +13.103%]

scenario:sethttpmeta-obfuscation-no-query

  • 🟥 max_rss_usage [+3.790MB; +3.958MB] or [+12.962%; +13.539%]

scenario:sethttpmeta-obfuscation-regular-case-explicit-query

  • 🟥 max_rss_usage [+3.713MB; +3.886MB] or [+12.722%; +13.316%]

scenario:sethttpmeta-obfuscation-regular-case-implicit-query

  • 🟥 max_rss_usage [+3.653MB; +3.822MB] or [+12.497%; +13.074%]

scenario:sethttpmeta-obfuscation-send-querystring-disabled

  • 🟥 max_rss_usage [+3.834MB; +4.001MB] or [+13.171%; +13.744%]

scenario:sethttpmeta-obfuscation-worst-case-explicit-query

  • 🟥 max_rss_usage [+3.667MB; +3.843MB] or [+12.527%; +13.128%]

scenario:sethttpmeta-obfuscation-worst-case-implicit-query

  • 🟥 max_rss_usage [+3.765MB; +3.958MB] or [+12.916%; +13.577%]

scenario:sethttpmeta-useragentvariant_exists_1

  • 🟥 max_rss_usage [+3.731MB; +3.873MB] or [+12.788%; +13.273%]

scenario:sethttpmeta-useragentvariant_exists_2

  • 🟥 max_rss_usage [+3.527MB; +3.701MB] or [+12.057%; +12.649%]

scenario:sethttpmeta-useragentvariant_exists_3

  • 🟥 max_rss_usage [+3.530MB; +3.703MB] or [+12.067%; +12.659%]

scenario:sethttpmeta-useragentvariant_not_exists_1

  • 🟥 max_rss_usage [+3.818MB; +3.994MB] or [+13.109%; +13.713%]

scenario:sethttpmeta-useragentvariant_not_exists_2

  • 🟥 max_rss_usage [+4.150MB; +4.326MB] or [+14.415%; +15.025%]
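
Because each entry reports the same increase in both megabytes and percent, the implied baseline can be estimated by dividing one by the other. The sketch below does that for three of the scenarios, using the lower bounds copied from the report. Pairing the two lower bounds is an approximation, and the output does not by itself identify the cause of the increase.

```python
# (absolute increase in MB, relative increase in %) lower bounds from the report above
reported = {
    "sethttpmeta-all-disabled": (3.664, 12.538),
    "sethttpmeta-all-enabled": (3.602, 12.302),
    "sethttpmeta-useragentvariant_not_exists_2": (4.150, 14.415),
}

baselines = {}
for scenario, (delta_mb, delta_pct) in reported.items():
    # delta_mb = baseline * delta_pct / 100, so baseline = delta_mb / (delta_pct / 100)
    baselines[scenario] = delta_mb / (delta_pct / 100.0)
    print(f"{scenario}: implied baseline of about {baselines[scenario]:.1f} MB")
```

The implied baselines all land in the same narrow range, which suggests a roughly constant absolute memory cost across scenarios rather than one that scales with the workload.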

@bouwkast bouwkast force-pushed the steven/aws-payload-tagging branch from 8bdb59a to eea1069 Compare October 30, 2024 16:05
Mapping from Java implementation

@mcculls mcculls left a comment

Approving from an API perspective


@erikayasuda erikayasuda left a comment

LGTM 🎉

Seems that environment leaks to other tests
@bouwkast bouwkast enabled auto-merge (squash) November 8, 2024 17:03
@P403n1x87

I'm slightly concerned by the new vendored libraries and by the potential performance of this solution. I'd love to see some benchmarking/profiling to get an idea of what the CPU overhead looks like, if possible 🙏
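
One way to get a first impression of the CPU cost is a micro-benchmark with the standard library's `timeit`. The sketch below times a stand-in flattening function on a synthetic payload; `flatten` and the payload shape are assumptions for illustration, so the numbers say nothing about the actual ddtrace code path, which also runs JSONPath-based redaction.

```python
import timeit
from typing import Any, Dict


def flatten(node: Any, path: str, out: Dict[str, str]) -> None:
    """Stand-in for payload expansion (hypothetical, not the ddtrace code)."""
    if isinstance(node, dict):
        for key, value in node.items():
            flatten(value, f"{path}.{key}", out)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            flatten(value, f"{path}.{index}", out)
    else:
        out[path] = str(node)


# Synthetic payload: 100 batch entries with two fields each
payload = {"Entries": [{"Id": str(i), "Message": "x" * 16} for i in range(100)]}


def run() -> Dict[str, str]:
    out: Dict[str, str] = {}
    flatten(payload, "aws.request.body", out)
    return out


runs = 200
seconds = timeit.timeit(run, number=runs)
print(f"{seconds / runs * 1e6:.1f} us per expansion of {len(run())} tags")
```

For a fuller picture, the same `run` function could be profiled with `cProfile` to see where the time goes.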

@bouwkast bouwkast requested a review from a team as a code owner November 12, 2024 16:56
@bouwkast bouwkast merged commit c32f043 into main Nov 12, 2024
545 checks passed
@bouwkast bouwkast deleted the steven/aws-payload-tagging branch November 12, 2024 21:35
quinna-h pushed a commit that referenced this pull request Nov 13, 2024

## Overview

This pull request adds the ability to expand AWS request/response
payloads as span tags.
This matches our lambda offerings and provides useful information to
developers when debugging communication between various AWS services.
This is based on the AWS Payload Tagging RFC and this implementation in
[dd-trace-node](DataDog/dd-trace-js#4309) and
this implementation in
[dd-trace-java](DataDog/dd-trace-java#7312).

This feature is _disabled_ by default.

When activated this will produce span tags such as:

```
 "aws.request.body.PublishBatchRequestEntries.0.Id": "1",
 "aws.request.body.PublishBatchRequestEntries.0.Message": "ironmaiden",
 "aws.request.body.PublishBatchRequestEntries.1.Id": "2",
 "aws.request.body.PublishBatchRequestEntries.1.Message": "megadeth",
 "aws.response.body.HTTPStatusCode": "200"
```

## Configuration

There are five new configuration options:

- `DD_TRACE_CLOUD_REQUEST_PAYLOAD_TAGGING`:
  - `""` by default to indicate that AWS request payload expansion is
    **disabled** for _requests_.
  - `"all"` to define that AWS request payload expansion is **enabled**
    for _requests_ using the default `JSONPath`s for redaction logic.
  - a comma-separated list of user-supplied `JSONPath`s to define that AWS
    request payload expansion is **enabled** for _requests_ using the
    default `JSONPath`s and the user-supplied `JSONPath`s for redaction
    logic.
- `DD_TRACE_CLOUD_RESPONSE_PAYLOAD_TAGGING`:
  - `""` by default to indicate that AWS response payload expansion is
    **disabled** for _responses_.
  - `"all"` to define that AWS response payload expansion is **enabled**
    for _responses_ using the default `JSONPath`s for redaction logic.
  - a comma-separated list of user-supplied `JSONPath`s to define that AWS
    response payload expansion is **enabled** for _responses_ using the
    default `JSONPath`s and the user-supplied `JSONPath`s for redaction
    logic.
- `DD_TRACE_CLOUD_PAYLOAD_TAGGING_MAX_DEPTH` (not defined in RFC but
done to match NodeJS):
  - sets the depth after which we stop creating tags from a payload
  - defaults to a value of `10`
- `DD_TRACE_CLOUD_PAYLOAD_TAGGING_MAX_TAGS` (to match Java
implementation)
  - sets the maximum number of tags allowed to be expanded
  - defaults to a value of `758`
- `DD_TRACE_CLOUD_PAYLOAD_TAGGING_SERVICES` (to match Java
implementation)
  - a comma-separated list of supported AWS services
  - defaults to `s3,sns,sqs,kinesis,eventbridge`
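
To show how the limits and redaction paths described above could interact, here is a minimal standard-library sketch. Everything in it is an assumption made for illustration: the `expand` helper, the `"redacted"` placeholder, the choice to drop content below the maximum depth, and the restriction to plain `$.a.b` paths (the PR itself relies on the vendored `jsonpath-ng` for real JSONPath matching).

```python
from typing import Any, Dict, Set

REDACTED = "redacted"  # assumed placeholder value


def expand(
    payload: Any,
    prefix: str,
    redact_paths: Set[str],
    max_depth: int = 10,
    max_tags: int = 758,
) -> Dict[str, str]:
    """Expand a payload into tags with redaction and limits (hypothetical helper)."""
    tags: Dict[str, str] = {}

    def walk(node: Any, tag_key: str, json_path: str, depth: int) -> None:
        if len(tags) >= max_tags:
            return  # tag budget exhausted
        if json_path in redact_paths:
            tags[tag_key] = REDACTED  # hide the value and anything beneath it
            return
        if isinstance(node, (dict, list)):
            if depth >= max_depth:
                return  # assumption: content below the depth limit is dropped
            items = node.items() if isinstance(node, dict) else enumerate(node)
            for key, value in items:
                walk(value, f"{tag_key}.{key}", f"{json_path}.{key}", depth + 1)
        else:
            tags[tag_key] = str(node)

    walk(payload, prefix, "$", 0)
    return tags


services = "s3,sns,sqs,kinesis,eventbridge".split(",")
payload = {
    "Message": "hello",
    "Attributes": {"Token": "s3cr3t", "Nested": {"Deep": {"Deeper": 1}}},
}

result: Dict[str, str] = {}
if "sns" in services:  # only expand payloads for listed services
    result = expand(payload, "aws.request.body", {"$.Attributes.Token"}, max_depth=3)
print(result)
```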

## Other

- [`jsonpath-ng` has been
vendored](https://github.com/h2non/jsonpath-ng/blob/master/jsonpath_ng/jsonpath.py)
- [`ply` has been vendored (v3.11) (dependency of
`jsonpath-ng`)](https://github.com/dabeaz/ply/releases/tag/3.11)

## Checklist
- [x] PR author has checked that all the criteria below are met
  - The PR description includes an overview of the change
  - The PR description articulates the motivation for the change
  - The change includes tests OR the PR description describes a testing
    strategy
  - The PR description notes risks associated with the change, if any
  - Newly-added code is easy to change
  - The change follows the [library release note
    guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
  - The change includes or references documentation updates if necessary
  - Backport labels are set (if
    [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [x] Reviewer has checked that all the criteria below are met
  - Title is accurate
  - All changes are related to the pull request's stated goal
  - Avoids breaking
    [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
    changes
  - Testing strategy adequately addresses listed risks
  - Newly-added code is easy to change
  - Release note makes sense to a user of the library
  - If necessary, author has acknowledged and discussed the performance
    implications of this PR as reported in the benchmarks PR comment
  - Backport labels are set in a manner that is consistent with the
    [release branch maintenance
    policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

---------

Co-authored-by: erikayasuda <[email protected]>