Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AWS Payload Tagging (deprecated) #7312

Closed
wants to merge 38 commits into from
Closed

Conversation

ygree
Copy link
Contributor

@ygree ygree commented Jul 12, 2024

What Does This Do

Adds functionality to capture AWS Json response/request payload and convert it to span tags while applying expansion and redaction defined rules.

Motivation

Having the ability to see data that was passed into an HTTPS payload from one service to the other.
Help customers (especially those who are using serverless architecture) reproduce and resolve bugs in their serverless compute code or configuration.

Additional Notes

Jira ticket: AIDM-174

NodeJS: DataDog/dd-trace-js#4309
Python: DataDog/dd-trace-py#10642

NOTE: This only works for services that use the JSON protocol, which depends on the service and the AWS SDK version. There is an alternative hybrid approach in this PR that provides better overall visibility and performance.

@ygree ygree added tag: do not merge Do not merge changes inst: aws sdk AWS SDK instrumentation labels Jul 12, 2024
@pr-commenter
Copy link

pr-commenter bot commented Jul 12, 2024

Benchmarks

Startup

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master ygree/aws-payload-tagging
git_commit_date 1729007930 1729012174
git_commit_sha a1c2f48 3105869
release_version 1.41.0-SNAPSHOT~a1c2f48c91 1.41.0-SNAPSHOT~3105869721
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1729014616 1729014616
ci_job_id 672970178 672970178
ci_pipeline_id 46645882 46645882
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
module Agent Agent
parent None None
variant iast iast

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 54 metrics, 9 unstable metrics.

Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.41.0-SNAPSHOT~3105869721, baseline=1.41.0-SNAPSHOT~a1c2f48c91

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.066 s) : 0, 1066371
Total [baseline] (10.33 s) : 0, 10329665
Agent [candidate] (1.069 s) : 0, 1069332
Total [candidate] (10.564 s) : 0, 10563938
section appsec
Agent [baseline] (1.205 s) : 0, 1205481
Total [baseline] (10.546 s) : 0, 10545728
Agent [candidate] (1.205 s) : 0, 1204786
Total [candidate] (10.571 s) : 0, 10570503
section iast
Agent [baseline] (1.201 s) : 0, 1200562
Total [baseline] (10.911 s) : 0, 10911481
Agent [candidate] (1.194 s) : 0, 1194413
Total [candidate] (10.851 s) : 0, 10850800
section profiling
Agent [baseline] (1.262 s) : 0, 1262360
Total [baseline] (10.61 s) : 0, 10609808
Agent [candidate] (1.274 s) : 0, 1273667
Total [candidate] (10.67 s) : 0, 10670252
Loading
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.066 s -
Agent appsec 1.205 s 139.109 ms (13.0%)
Agent iast 1.201 s 134.191 ms (12.6%)
Agent profiling 1.262 s 195.988 ms (18.4%)
Total tracing 10.33 s -
Total appsec 10.546 s 216.064 ms (2.1%)
Total iast 10.911 s 581.816 ms (5.6%)
Total profiling 10.61 s 280.144 ms (2.7%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.069 s -
Agent appsec 1.205 s 135.454 ms (12.7%)
Agent iast 1.194 s 125.081 ms (11.7%)
Agent profiling 1.274 s 204.335 ms (19.1%)
Total tracing 10.564 s -
Total appsec 10.571 s 6.565 ms (0.1%)
Total iast 10.851 s 286.863 ms (2.7%)
Total profiling 10.67 s 106.314 ms (1.0%)
gantt
    title petclinic - break down per module: candidate=1.41.0-SNAPSHOT~3105869721, baseline=1.41.0-SNAPSHOT~a1c2f48c91

    dateFormat X
    axisFormat %s
section tracing
BytebuddyAgent [baseline] (681.203 ms) : 0, 681203
BytebuddyAgent [candidate] (681.096 ms) : 0, 681096
GlobalTracer [baseline] (309.794 ms) : 0, 309794
GlobalTracer [candidate] (312.741 ms) : 0, 312741
AppSec [baseline] (53.62 ms) : 0, 53620
AppSec [candidate] (53.758 ms) : 0, 53758
Remote Config [baseline] (662.244 µs) : 0, 662
Remote Config [candidate] (655.605 µs) : 0, 656
Telemetry [baseline] (7.514 ms) : 0, 7514
Telemetry [candidate] (7.461 ms) : 0, 7461
section appsec
BytebuddyAgent [baseline] (701.46 ms) : 0, 701460
BytebuddyAgent [candidate] (699.204 ms) : 0, 699204
GlobalTracer [baseline] (308.182 ms) : 0, 308182
GlobalTracer [candidate] (310.587 ms) : 0, 310587
AppSec [baseline] (162.911 ms) : 0, 162911
AppSec [candidate] (162.272 ms) : 0, 162272
Remote Config [baseline] (633.12 µs) : 0, 633
Remote Config [candidate] (629.622 µs) : 0, 630
Telemetry [baseline] (8.859 ms) : 0, 8859
Telemetry [candidate] (8.431 ms) : 0, 8431
IAST [baseline] (19.474 ms) : 0, 19474
IAST [candidate] (20.052 ms) : 0, 20052
section iast
BytebuddyAgent [baseline] (800.52 ms) : 0, 800520
BytebuddyAgent [candidate] (794.302 ms) : 0, 794302
GlobalTracer [baseline] (299.768 ms) : 0, 299768
GlobalTracer [candidate] (300.897 ms) : 0, 300897
AppSec [baseline] (52.763 ms) : 0, 52763
AppSec [candidate] (55.341 ms) : 0, 55341
Remote Config [baseline] (598.424 µs) : 0, 598
Remote Config [candidate] (595.011 µs) : 0, 595
Telemetry [baseline] (7.037 ms) : 0, 7037
Telemetry [candidate] (6.96 ms) : 0, 6960
IAST [baseline] (26.136 ms) : 0, 26136
IAST [candidate] (22.73 ms) : 0, 22730
section profiling
BytebuddyAgent [baseline] (674.082 ms) : 0, 674082
BytebuddyAgent [candidate] (678.972 ms) : 0, 678972
GlobalTracer [baseline] (391.654 ms) : 0, 391654
GlobalTracer [candidate] (397.505 ms) : 0, 397505
AppSec [baseline] (54.514 ms) : 0, 54514
AppSec [candidate] (54.232 ms) : 0, 54232
Remote Config [baseline] (647.907 µs) : 0, 648
Remote Config [candidate] (645.932 µs) : 0, 646
Telemetry [baseline] (7.325 ms) : 0, 7325
Telemetry [candidate] (7.424 ms) : 0, 7424
ProfilingAgent [baseline] (95.653 ms) : 0, 95653
ProfilingAgent [candidate] (95.923 ms) : 0, 95923
Profiling [baseline] (95.677 ms) : 0, 95677
Profiling [candidate] (95.947 ms) : 0, 95947
Loading
Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.41.0-SNAPSHOT~3105869721, baseline=1.41.0-SNAPSHOT~a1c2f48c91

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.073 s) : 0, 1073004
Total [baseline] (8.534 s) : 0, 8533973
Agent [candidate] (1.076 s) : 0, 1076231
Total [candidate] (8.545 s) : 0, 8545466
section iast
Agent [baseline] (1.198 s) : 0, 1197941
Total [baseline] (9.066 s) : 0, 9066036
Agent [candidate] (1.213 s) : 0, 1213365
Total [candidate] (9.105 s) : 0, 9105342
section iast_HARDCODED_SECRET_DISABLED
Agent [baseline] (1.198 s) : 0, 1198284
Total [baseline] (9.074 s) : 0, 9074436
Agent [candidate] (1.198 s) : 0, 1197690
Total [candidate] (9.111 s) : 0, 9110795
section iast_TELEMETRY_OFF
Agent [baseline] (1.19 s) : 0, 1189842
Total [baseline] (9.055 s) : 0, 9055436
Agent [candidate] (1.193 s) : 0, 1193456
Total [candidate] (9.057 s) : 0, 9056868
Loading
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.073 s -
Agent iast 1.198 s 124.937 ms (11.6%)
Agent iast_HARDCODED_SECRET_DISABLED 1.198 s 125.28 ms (11.7%)
Agent iast_TELEMETRY_OFF 1.19 s 116.838 ms (10.9%)
Total tracing 8.534 s -
Total iast 9.066 s 532.063 ms (6.2%)
Total iast_HARDCODED_SECRET_DISABLED 9.074 s 540.463 ms (6.3%)
Total iast_TELEMETRY_OFF 9.055 s 521.464 ms (6.1%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.076 s -
Agent iast 1.213 s 137.134 ms (12.7%)
Agent iast_HARDCODED_SECRET_DISABLED 1.198 s 121.459 ms (11.3%)
Agent iast_TELEMETRY_OFF 1.193 s 117.225 ms (10.9%)
Total tracing 8.545 s -
Total iast 9.105 s 559.875 ms (6.6%)
Total iast_HARDCODED_SECRET_DISABLED 9.111 s 565.329 ms (6.6%)
Total iast_TELEMETRY_OFF 9.057 s 511.401 ms (6.0%)
gantt
    title insecure-bank - break down per module: candidate=1.41.0-SNAPSHOT~3105869721, baseline=1.41.0-SNAPSHOT~a1c2f48c91

    dateFormat X
    axisFormat %s
section tracing
BytebuddyAgent [baseline] (686.114 ms) : 0, 686114
BytebuddyAgent [candidate] (685.711 ms) : 0, 685711
GlobalTracer [baseline] (311.232 ms) : 0, 311232
GlobalTracer [candidate] (314.574 ms) : 0, 314574
AppSec [baseline] (53.779 ms) : 0, 53779
AppSec [candidate] (53.978 ms) : 0, 53978
Remote Config [baseline] (664.167 µs) : 0, 664
Remote Config [candidate] (668.475 µs) : 0, 668
Telemetry [baseline] (7.467 ms) : 0, 7467
Telemetry [candidate] (7.531 ms) : 0, 7531
section iast
BytebuddyAgent [baseline] (800.183 ms) : 0, 800183
BytebuddyAgent [candidate] (807.113 ms) : 0, 807113
GlobalTracer [baseline] (298.089 ms) : 0, 298089
GlobalTracer [candidate] (305.279 ms) : 0, 305279
AppSec [baseline] (57.179 ms) : 0, 57179
AppSec [candidate] (56.364 ms) : 0, 56364
Remote Config [baseline] (595.413 µs) : 0, 595
Remote Config [candidate] (600.938 µs) : 0, 601
Telemetry [baseline] (7.083 ms) : 0, 7083
Telemetry [candidate] (7.123 ms) : 0, 7123
IAST [baseline] (21.112 ms) : 0, 21112
IAST [candidate] (23.042 ms) : 0, 23042
section iast_HARDCODED_SECRET_DISABLED
BytebuddyAgent [baseline] (798.15 ms) : 0, 798150
BytebuddyAgent [candidate] (796.025 ms) : 0, 796025
GlobalTracer [baseline] (299.66 ms) : 0, 299660
GlobalTracer [candidate] (302.075 ms) : 0, 302075
AppSec [baseline] (55.861 ms) : 0, 55861
AppSec [candidate] (56.146 ms) : 0, 56146
Remote Config [baseline] (623.258 µs) : 0, 623
Remote Config [candidate] (613.003 µs) : 0, 613
Telemetry [baseline] (7.209 ms) : 0, 7209
Telemetry [candidate] (7.087 ms) : 0, 7087
IAST [baseline] (23.07 ms) : 0, 23070
IAST [candidate] (22.069 ms) : 0, 22069
section iast_TELEMETRY_OFF
BytebuddyAgent [baseline] (792.561 ms) : 0, 792561
BytebuddyAgent [candidate] (792.879 ms) : 0, 792879
GlobalTracer [baseline] (298.374 ms) : 0, 298374
GlobalTracer [candidate] (301.744 ms) : 0, 301744
AppSec [baseline] (53.091 ms) : 0, 53091
AppSec [candidate] (54.587 ms) : 0, 54587
Remote Config [baseline] (616.686 µs) : 0, 617
Remote Config [candidate] (604.938 µs) : 0, 605
Telemetry [baseline] (6.832 ms) : 0, 6832
Telemetry [candidate] (6.946 ms) : 0, 6946
IAST [baseline] (24.717 ms) : 0, 24717
IAST [candidate] (23.054 ms) : 0, 23054
Loading

Load

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
end_time 2024-10-15T17:20:33 2024-10-15T17:27:26
git_branch master ygree/aws-payload-tagging
git_commit_date 1729007930 1729012174
git_commit_sha a1c2f48 3105869
release_version 1.41.0-SNAPSHOT~a1c2f48c91 1.41.0-SNAPSHOT~3105869721
start_time 2024-10-15T17:20:20 2024-10-15T17:27:13
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1729013593 1729013593
ci_job_id 672970179 672970179
ci_pipeline_id 46645882 46645882
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
variant iast iast

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 17 unstable metrics.

Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.41.0-SNAPSHOT~3105869721, baseline=1.41.0-SNAPSHOT~a1c2f48c91
    dateFormat X
    axisFormat %s
section baseline
no_agent (378.624 µs) : 359, 399
.   : milestone, 379,
iast (485.683 µs) : 464, 507
.   : milestone, 486,
iast_FULL (559.462 µs) : 538, 581
.   : milestone, 559,
iast_GLOBAL (514.273 µs) : 492, 536
.   : milestone, 514,
iast_HARDCODED_SECRET_DISABLED (492.157 µs) : 471, 514
.   : milestone, 492,
iast_INACTIVE (454.69 µs) : 433, 476
.   : milestone, 455,
iast_TELEMETRY_OFF (471.992 µs) : 451, 493
.   : milestone, 472,
tracing (451.796 µs) : 431, 473
.   : milestone, 452,
section candidate
no_agent (373.415 µs) : 354, 393
.   : milestone, 373,
iast (487.067 µs) : 465, 509
.   : milestone, 487,
iast_FULL (554.443 µs) : 533, 576
.   : milestone, 554,
iast_GLOBAL (502.628 µs) : 481, 524
.   : milestone, 503,
iast_HARDCODED_SECRET_DISABLED (488.318 µs) : 466, 510
.   : milestone, 488,
iast_INACTIVE (451.749 µs) : 431, 473
.   : milestone, 452,
iast_TELEMETRY_OFF (474.415 µs) : 453, 496
.   : milestone, 474,
tracing (447.354 µs) : 426, 468
.   : milestone, 447,
Loading
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 378.624 µs [358.652 µs, 398.597 µs] -
iast 485.683 µs [464.387 µs, 506.98 µs] 107.059 µs (28.3%)
iast_FULL 559.462 µs [537.862 µs, 581.063 µs] 180.838 µs (47.8%)
iast_GLOBAL 514.273 µs [492.261 µs, 536.285 µs] 135.649 µs (35.8%)
iast_HARDCODED_SECRET_DISABLED 492.157 µs [470.588 µs, 513.726 µs] 113.533 µs (30.0%)
iast_INACTIVE 454.69 µs [433.103 µs, 476.277 µs] 76.066 µs (20.1%)
iast_TELEMETRY_OFF 471.992 µs [451.007 µs, 492.976 µs] 93.367 µs (24.7%)
tracing 451.796 µs [430.529 µs, 473.063 µs] 73.171 µs (19.3%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 373.415 µs [353.827 µs, 393.004 µs] -
iast 487.067 µs [465.389 µs, 508.745 µs] 113.651 µs (30.4%)
iast_FULL 554.443 µs [533.23 µs, 575.655 µs] 181.027 µs (48.5%)
iast_GLOBAL 502.628 µs [481.209 µs, 524.047 µs] 129.213 µs (34.6%)
iast_HARDCODED_SECRET_DISABLED 488.318 µs [466.499 µs, 510.136 µs] 114.902 µs (30.8%)
iast_INACTIVE 451.749 µs [430.856 µs, 472.642 µs] 78.334 µs (21.0%)
iast_TELEMETRY_OFF 474.415 µs [453.301 µs, 495.53 µs] 101.0 µs (27.0%)
tracing 447.354 µs [426.486 µs, 468.222 µs] 73.939 µs (19.8%)
Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.41.0-SNAPSHOT~3105869721, baseline=1.41.0-SNAPSHOT~a1c2f48c91
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.34 ms) : 1321, 1359
.   : milestone, 1340,
appsec (1.721 ms) : 1697, 1744
.   : milestone, 1721,
appsec_no_iast (1.728 ms) : 1705, 1751
.   : milestone, 1728,
iast (1.489 ms) : 1466, 1512
.   : milestone, 1489,
profiling (1.51 ms) : 1485, 1536
.   : milestone, 1510,
tracing (1.469 ms) : 1445, 1493
.   : milestone, 1469,
section candidate
no_agent (1.349 ms) : 1329, 1369
.   : milestone, 1349,
appsec (1.721 ms) : 1697, 1745
.   : milestone, 1721,
appsec_no_iast (1.711 ms) : 1686, 1735
.   : milestone, 1711,
iast (1.485 ms) : 1462, 1507
.   : milestone, 1485,
profiling (1.485 ms) : 1460, 1509
.   : milestone, 1485,
tracing (1.477 ms) : 1453, 1501
.   : milestone, 1477,
Loading
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.34 ms [1.321 ms, 1.359 ms] -
appsec 1.721 ms [1.697 ms, 1.744 ms] 380.972 µs (28.4%)
appsec_no_iast 1.728 ms [1.705 ms, 1.751 ms] 388.145 µs (29.0%)
iast 1.489 ms [1.466 ms, 1.512 ms] 148.976 µs (11.1%)
profiling 1.51 ms [1.485 ms, 1.536 ms] 170.76 µs (12.7%)
tracing 1.469 ms [1.445 ms, 1.493 ms] 129.279 µs (9.6%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.349 ms [1.329 ms, 1.369 ms] -
appsec 1.721 ms [1.697 ms, 1.745 ms] 371.895 µs (27.6%)
appsec_no_iast 1.711 ms [1.686 ms, 1.735 ms] 361.452 µs (26.8%)
iast 1.485 ms [1.462 ms, 1.507 ms] 135.419 µs (10.0%)
profiling 1.485 ms [1.46 ms, 1.509 ms] 135.598 µs (10.1%)
tracing 1.477 ms [1.453 ms, 1.501 ms] 127.819 µs (9.5%)

Dacapo

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master ygree/aws-payload-tagging
git_commit_date 1729007930 1729012174
git_commit_sha a1c2f48 3105869
release_version 1.41.0-SNAPSHOT~a1c2f48c91 1.41.0-SNAPSHOT~3105869721
See matching parameters
Baseline Candidate
application biojava biojava
ci_job_date 1729014123 1729014123
ci_job_id 672970180 672970180
ci_pipeline_id 46645882 46645882
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
variant appsec appsec

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 12 metrics, 0 unstable metrics.

Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.41.0-SNAPSHOT~3105869721, baseline=1.41.0-SNAPSHOT~a1c2f48c91
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.246 s) : 15246000, 15246000
.   : milestone, 15246000,
appsec (15.15 s) : 15150000, 15150000
.   : milestone, 15150000,
iast (19.25 s) : 19250000, 19250000
.   : milestone, 19250000,
iast_GLOBAL (18.097 s) : 18097000, 18097000
.   : milestone, 18097000,
profiling (15.749 s) : 15749000, 15749000
.   : milestone, 15749000,
tracing (15.233 s) : 15233000, 15233000
.   : milestone, 15233000,
section candidate
no_agent (15.744 s) : 15744000, 15744000
.   : milestone, 15744000,
appsec (15.016 s) : 15016000, 15016000
.   : milestone, 15016000,
iast (18.741 s) : 18741000, 18741000
.   : milestone, 18741000,
iast_GLOBAL (18.172 s) : 18172000, 18172000
.   : milestone, 18172000,
profiling (15.711 s) : 15711000, 15711000
.   : milestone, 15711000,
tracing (15.151 s) : 15151000, 15151000
.   : milestone, 15151000,
Loading
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.246 s [15.246 s, 15.246 s] -
appsec 15.15 s [15.15 s, 15.15 s] -96.0 ms (-0.6%)
iast 19.25 s [19.25 s, 19.25 s] 4.004 s (26.3%)
iast_GLOBAL 18.097 s [18.097 s, 18.097 s] 2.851 s (18.7%)
profiling 15.749 s [15.749 s, 15.749 s] 503.0 ms (3.3%)
tracing 15.233 s [15.233 s, 15.233 s] -13.0 ms (-0.1%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.744 s [15.744 s, 15.744 s] -
appsec 15.016 s [15.016 s, 15.016 s] -728.0 ms (-4.6%)
iast 18.741 s [18.741 s, 18.741 s] 2.997 s (19.0%)
iast_GLOBAL 18.172 s [18.172 s, 18.172 s] 2.428 s (15.4%)
profiling 15.711 s [15.711 s, 15.711 s] -33.0 ms (-0.2%)
tracing 15.151 s [15.151 s, 15.151 s] -593.0 ms (-3.8%)
Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.41.0-SNAPSHOT~3105869721, baseline=1.41.0-SNAPSHOT~a1c2f48c91
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.469 ms) : 1458, 1481
.   : milestone, 1469,
appsec (2.338 ms) : 2297, 2379
.   : milestone, 2338,
iast (2.079 ms) : 2027, 2131
.   : milestone, 2079,
iast_GLOBAL (2.13 ms) : 2077, 2182
.   : milestone, 2130,
profiling (1.943 ms) : 1902, 1984
.   : milestone, 1943,
tracing (1.919 ms) : 1880, 1958
.   : milestone, 1919,
section candidate
no_agent (1.466 ms) : 1455, 1478
.   : milestone, 1466,
appsec (2.33 ms) : 2289, 2370
.   : milestone, 2330,
iast (2.07 ms) : 2019, 2121
.   : milestone, 2070,
iast_GLOBAL (2.108 ms) : 2056, 2159
.   : milestone, 2108,
profiling (1.942 ms) : 1901, 1983
.   : milestone, 1942,
tracing (1.919 ms) : 1879, 1958
.   : milestone, 1919,
Loading
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.469 ms [1.458 ms, 1.481 ms] -
appsec 2.338 ms [2.297 ms, 2.379 ms] 869.145 µs (59.2%)
iast 2.079 ms [2.027 ms, 2.131 ms] 610.179 µs (41.5%)
iast_GLOBAL 2.13 ms [2.077 ms, 2.182 ms] 660.644 µs (45.0%)
profiling 1.943 ms [1.902 ms, 1.984 ms] 474.068 µs (32.3%)
tracing 1.919 ms [1.88 ms, 1.958 ms] 449.643 µs (30.6%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.466 ms [1.455 ms, 1.478 ms] -
appsec 2.33 ms [2.289 ms, 2.37 ms] 863.338 µs (58.9%)
iast 2.07 ms [2.019 ms, 2.121 ms] 603.955 µs (41.2%)
iast_GLOBAL 2.108 ms [2.056 ms, 2.159 ms] 641.44 µs (43.7%)
profiling 1.942 ms [1.901 ms, 1.983 ms] 475.723 µs (32.4%)
tracing 1.919 ms [1.879 ms, 1.958 ms] 452.425 µs (30.9%)

@ygree ygree force-pushed the ygree/aws-payload-tagging branch from c9a799d to 1e180f3 Compare July 12, 2024 19:09
gradle/libs.versions.toml Outdated Show resolved Hide resolved
@ygree ygree force-pushed the ygree/aws-payload-tagging branch 6 times, most recently from 83f1bef to 7252b43 Compare September 11, 2024 04:50
@ygree ygree force-pushed the ygree/aws-payload-tagging branch 2 times, most recently from 37eed2b to bdb7eaf Compare September 12, 2024 00:36
@ygree ygree changed the title [PoC] AWS Payload observability AWS Payload Observability Sep 24, 2024
@ygree ygree force-pushed the ygree/aws-payload-tagging branch from 22921eb to 15a3056 Compare October 2, 2024 03:14
ResponseBodyStreamWrapper wrapper = (ResponseBodyStreamWrapper) body;
ByteArrayInputStream bodyStream = wrapper.asByteArrayInputStream();
if (bodyStream != null) {
AgentTracer.get().addTagsFromResponseBody(span, bodyStream, "aws.response.body");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've some concerns regarding rebuffering the response and parse part of it during the advice. Apart from holding additional memory (and risk to corrupt it) and adding processing to parse the json we actually assume that the encoding would be plain json while aws sdk is supposed to support other transport encoding like cbor and xml. This will not work in those cases.

The sdk v2 has a POJO structure in the response that offers an object traversal facility that's independent from its representation. In particular each SdkResponse is an instance of SdkPojo who as a method sdkFields returning a list of SdkFields holding a reference to the location and its value. Those objects are already available when intercepting afterExecution so we can just use them without adding extra efforts.

To debug I put this method on the interceptor:

void printSdkFields(String root, SdkPojo pojo) {
    for (SdkField<?> f : pojo.sdkFields()) {
      Object value = f.getValueOrDefault(pojo);
      String key = root + "." + f.locationName();
      if (value instanceof Collection) {
        int i = 0;
        for (Object v : (Collection<?>) value) {
          if (v instanceof SdkPojo) {
            printSdkFields(key + "[" + i + "]", (SdkPojo) v);
          } else {
            System.err.println(key + "[" + i + "]: " + v);
          }
          i++;
        }
      } else if (value instanceof SdkPojo) {
        printSdkFields(key, (SdkPojo) value);
      } else {
        System.err.println(key + ": " + value);
      }
    }
  }

I call this from afterExecution like this printSdkFields("$", context.response()); and the output I have is quite good
Example for kinesis:

$.Records[0].SequenceNumber: 21269319989652663814458848515492872193
$.Records[0].ApproximateArrivalTimestamp: +56726-05-21T03:34:19Z
$.Records[0].Data: SdkBytes(bytes=0x5f3c646174613e5f30)
$.Records[0].PartitionKey: partitionKey
$.Records[0].EncryptionType: null
$.Records[1].SequenceNumber: 21269319989652663814458848515492872193
$.Records[1].ApproximateArrivalTimestamp: +56726-05-21T03:50:59Z
$.Records[1].Data: SdkBytes(bytes=0x5f3c646174613e5f30)
$.Records[1].PartitionKey: partitionKey
$.Records[1].EncryptionType: null
$.NextShardIterator: AAA
$.MillisBehindLatest: 2100

I think this approach can save some time and also support other body encoding other than json

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the RFC, the idea was to generalize to any service (not just AWS) and be able to convert request and response body to tags for observability purposes.

I totally agree with your concerns about overhead, so I tried to limit parsing as much as possible by

  • Avoiding materializing JSON and instead using event-based parsing to traverse in one go.
  • Buffering only payloads that start with '{'.
  • Having an upper limit of 256Kb for the buffer.
  • Disabled by default.
  • Limit the number of services it's enabled for (now only SQS, ApiGateway and EventBridge).

Also, so far we are going to release it internally for the Cloud Observability team and there is no intention to release it to GA at all.

Regarding your comment about services not using JSON, this is exactly what came up in testing I did recently. The format used depends not only on the service, but even on the minor version of the SDK. For example, SNS uses Http Form Params and XML, some earlier versions of the SDK for SQS also use XML instead of JSON. And as you mentioned, Kinesis uses CBOR.

I appreciate your efforts to experiment with an alternative approach and provide a very detailed idea of how to implement it. I think we should have a specialized version for AWS SDK to use the provided POJOs as you suggested. That way we can optimize drastically for AWS SDK. And at the same time expand the visibility to AWS services that don't use JSON.

Thank you!

@ygree ygree self-assigned this Oct 3, 2024
@ygree ygree removed the tag: do not merge Do not merge changes label Oct 5, 2024
@ygree ygree requested a review from amarziali October 15, 2024 19:31
Copy link
Collaborator

@amarziali amarziali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are improvements we can do specifically for AWS that would not require any json parsing that we should consider. Also, the json parsing can be do lazily. Since the feature is experimental and not enabled by default, I would not block from approval. However we should consider to implement such improvements.

if (config.isCloudRequestPayloadTaggingEnabled()
&& config.isCloudPayloadTaggingEnabledFor(awsServiceName)) {
InputStream body = requestBody.get().contentStreamProvider().newStream();
AgentTracer.get().addTagsFromRequestBody(span, body, "aws.request.body");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theoretically we could have done it lazily in the agent thread (as tag postprocessor) so it would not affect the same thread of the aws request (i.e. only attach the buffer to the span and process on a TagPostProcessor). It would also have avoided the fact to expose those methods on the agent tracer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you're right. I just wasn't sure how much we wanted to optimize it at all. Now I think we should go that extra mile to make it work without keeping buffers but using SdkPojos with all necessary parsing (still needed for inner strings/bytes that contain JSON) done outside of the main thread.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Started rework in a separate PR #7811

@ygree ygree marked this pull request as draft October 17, 2024 00:05
@ygree ygree added the tag: do not merge Do not merge changes label Oct 18, 2024
@ygree ygree mentioned this pull request Oct 29, 2024
5 tasks
@ygree ygree changed the title AWS Payload Tagging AWS Payload Tagging (deprecated) Oct 29, 2024
@ygree ygree marked this pull request as ready for review October 29, 2024 17:45
@ygree
Copy link
Contributor Author

ygree commented Nov 1, 2024

Closing this in favor #7811

@ygree ygree closed this Nov 1, 2024
bouwkast added a commit to DataDog/dd-trace-py that referenced this pull request Nov 12, 2024
)

## Overview

This pull request adds the ability to expand AWS request/response
payloads as span tags.
This matches our lambda offerings and provides useful information to
developers when debugging communication between various AWS services.
This is based on the AWS Payload Tagging RFC and this implementation in
[dd-trace-node](DataDog/dd-trace-js#4309) and
this implementation in
[dd-trace-java](DataDog/dd-trace-java#7312).

This feature is _disabled_ by default.

When activated this will produce span tags such as:

```
 "aws.request.body.PublishBatchRequestEntries.0.Id": "1",
 "aws.request.body.PublishBatchRequestEntries.0.Message": "ironmaiden",
 "aws.request.body.PublishBatchRequestEntries.1.Id": "2",
 "aws.request.body.PublishBatchRequestEntries.1.Message": "megadeth"
 "aws.response.body.HTTPStatusCode": "200",
```

## Configuration

There are five new configuration options:

- `DD_TRACE_CLOUD_REQUEST_PAYLOAD_TAGGING`:
- `""` by default to indicate that AWS request payload expansion is
**disabled** for _requests_.
- `"all"` to define that AWS request payload expansion is **enabled**
for _requests_ using the default `JSONPath`s for redaction logic.
- a comma-separated list of user-supplied `JSONPath`s to define that AWS
request payload expansion is **enabled** for _requests_ using the
default `JSONPath`s and the user-supplied `JSONPath`s for redaction
logic.
- `DD_TRACE_CLOUD_RESPONSE_PAYLOAD_TAGGING`:
- `""` by default to indicate that AWS response payload expansion is
**disabled** for _responses_.
- `"all"` to define that AWS response payload expansion is **enabled**
for _responses_ using the default `JSONPath`s for redaction logic.
- a comma-separated list of user-supplied `JSONPath`s to define that AWS
request payload expansion is **enabled** for _responses_ using the
default `JSONPath`s and the user-supplied `JSONPath`s for redaction
logic.
- `DD_TRACE_CLOUD_PAYLOAD_TAGGING_MAX_DEPTH` (not defined in RFC but
done to match NodeJS):
  - sets the depth after which we stop creating tags from a payload
  - defaults to a value of `10`
- `DD_TRACE_CLOUD_PAYLOAD_TAGGING_MAX_TAGS` (to match Java
implementation)
  - sets the maximum number of tags allowed to be expanded
  - defaults to a value of `758`
- `DD_TRACE_CLOUD_PAYLOAD_TAGGING_SERVICES` (to match Java
implementation)
  - a comma-separated list of supported AWS services
  - defaults to ` s3,sns,sqs,kinesis,eventbridge`

## Other

- [`jsonpath-ng` has been
vendored](https://github.com/h2non/jsonpath-ng/blob/master/jsonpath_ng/jsonpath.py)
- [`ply` has been vendored (v3.11) (dependency of
`jsonpath-ng`)](https://github.com/dabeaz/ply/releases/tag/3.11)

## Checklist
- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing
strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [x] Reviewer has checked that all the criteria below are met 
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

---------

Co-authored-by: erikayasuda <[email protected]>
quinna-h pushed a commit to DataDog/dd-trace-py that referenced this pull request Nov 13, 2024
)

## Overview

This pull request adds the ability to expand AWS request/response
payloads as span tags.
This matches our lambda offerings and provides useful information to
developers when debugging communication between various AWS services.
This is based on the AWS Payload Tagging RFC and this implementation in
[dd-trace-node](DataDog/dd-trace-js#4309) and
this implementation in
[dd-trace-java](DataDog/dd-trace-java#7312).

This feature is _disabled_ by default.

When activated this will produce span tags such as:

```
 "aws.request.body.PublishBatchRequestEntries.0.Id": "1",
 "aws.request.body.PublishBatchRequestEntries.0.Message": "ironmaiden",
 "aws.request.body.PublishBatchRequestEntries.1.Id": "2",
 "aws.request.body.PublishBatchRequestEntries.1.Message": "megadeth"
 "aws.response.body.HTTPStatusCode": "200",
```

## Configuration

There are five new configuration options:

- `DD_TRACE_CLOUD_REQUEST_PAYLOAD_TAGGING`:
- `""` by default to indicate that AWS request payload expansion is
**disabled** for _requests_.
- `"all"` to define that AWS request payload expansion is **enabled**
for _requests_ using the default `JSONPath`s for redaction logic.
- a comma-separated list of user-supplied `JSONPath`s to define that AWS
request payload expansion is **enabled** for _requests_ using the
default `JSONPath`s and the user-supplied `JSONPath`s for redaction
logic.
- `DD_TRACE_CLOUD_RESPONSE_PAYLOAD_TAGGING`:
- `""` by default to indicate that AWS response payload expansion is
**disabled** for _responses_.
- `"all"` to define that AWS response payload expansion is **enabled**
for _responses_ using the default `JSONPath`s for redaction logic.
- a comma-separated list of user-supplied `JSONPath`s to define that AWS
request payload expansion is **enabled** for _responses_ using the
default `JSONPath`s and the user-supplied `JSONPath`s for redaction
logic.
- `DD_TRACE_CLOUD_PAYLOAD_TAGGING_MAX_DEPTH` (not defined in RFC but
done to match NodeJS):
  - sets the depth after which we stop creating tags from a payload
  - defaults to a value of `10`
- `DD_TRACE_CLOUD_PAYLOAD_TAGGING_MAX_TAGS` (to match Java
implementation)
  - sets the maximum number of tags allowed to be expanded
  - defaults to a value of `758`
- `DD_TRACE_CLOUD_PAYLOAD_TAGGING_SERVICES` (to match Java
implementation)
  - a comma-separated list of supported AWS services
  - defaults to ` s3,sns,sqs,kinesis,eventbridge`

## Other

- [`jsonpath-ng` has been
vendored](https://github.com/h2non/jsonpath-ng/blob/master/jsonpath_ng/jsonpath.py)
- [`ply` has been vendored (v3.11) (dependency of
`jsonpath-ng`)](https://github.com/dabeaz/ply/releases/tag/3.11)

## Checklist
- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing
strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [x] Reviewer has checked that all the criteria below are met 
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

---------

Co-authored-by: erikayasuda <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
inst: aws sdk AWS SDK instrumentation tag: do not merge Do not merge changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants