Reduce PendingTrace Lock Contention #9932

dougqh · 2025-11-11T20:18:43Z

What Does This Do

Aims to reduce lock contention in PendingTrace by only attempting partialFlush when a span has just been added to PendingTrace.

Prior to this change, we would also attempt a partialFlush after a scope/context close, but closing a scope cannot cause us to cross the partialFlush threshold.

The theory is that this will reduce lock contention with virtual threads.
The concern is that virtual threads often only restore context without creating a span.
That can lead the virtual thread to attempt a partialFlush, which requires taking the PendingTrace lock.
If the PendingTrace lock cannot be acquired, then the virtual thread will be unmounted from its carrier thread.
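
To make the gating concrete, here is a rough sketch of the idea. It is a simplified stand-in, not the actual PendingTrace code: the class name, the `onSpanFinished` / `onScopeClosed` / `maybePartialFlush` methods, and the threshold field are all assumptions made for illustration, and root-span handling is left out entirely.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Simplified stand-in for PendingTrace; not the real dd-trace-java class. */
final class PendingTraceSketch {
  // Hypothetical threshold; the real tracer takes its value from configuration.
  private final int partialFlushMinSpans;
  private final AtomicInteger finishedSpans = new AtomicInteger();
  // Counts how many times the lock-protected flush path was entered.
  private int flushAttempts;

  PendingTraceSketch(int partialFlushMinSpans) {
    this.partialFlushMinSpans = partialFlushMinSpans;
  }

  /** A span finished and was added to the trace. */
  void onSpanFinished() {
    finishedSpans.incrementAndGet();
    maybePartialFlush(true);
  }

  /** A scope/continuation was closed; no span was added. */
  void onScopeClosed() {
    maybePartialFlush(false);
  }

  private void maybePartialFlush(boolean addedSpan) {
    // Closing a scope cannot raise the span count, so only a newly added span
    // can carry the trace across the threshold. Skipping the check otherwise
    // keeps scope-only threads away from the lock.
    if (addedSpan && finishedSpans.get() >= partialFlushMinSpans) {
      partialFlush();
    }
  }

  private synchronized void partialFlush() {
    flushAttempts++;
    // Re-check under the lock, since another thread may have flushed already.
    if (finishedSpans.get() >= partialFlushMinSpans) {
      finishedSpans.set(0); // stands in for writing the finished spans
    }
  }

  synchronized int flushAttempts() {
    return flushAttempts;
  }
}
```

In this model, a thread that only closes scopes never reaches the synchronized method, which is the behavior the change is aiming for.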

Motivation

Report of high overhead and lock contention when using virtual threads

Additional Notes

Contributor Checklist

Format the title according to the contribution guidelines
Assign the type: and (comp: or inst:) labels in addition to any useful labels
Don't use close, fix or any linking keywords when referencing an issue.
Use solves instead, and assign the PR milestone to the issue
Update the CODEOWNERS file on source file addition, move, or deletion
Update the public documentation in case of new configuration flag or behavior

Jira ticket: [PROJ-IDENT]

github-actions · 2025-11-11T20:18:53Z

Hi! 👋 Thanks for your pull request! 🎉

To help us review it, please make sure to:

Add at least one type, and one component or instrumentation label to the pull request

If you need help, please check our contributing guidelines.

dougqh · 2025-11-11T20:21:14Z

dd-trace-core/src/main/java/datadog/trace/core/PendingTrace.java

  }

-  private PublishState decrementRefAndMaybeWrite(boolean isRootSpan) {
+  private PublishState decrementRefAndMaybeWrite(boolean isRootSpan, boolean addedSpan) {


Right now, I'm curious what others think of this potential change.
I'm intending to write a microbenchmark to see if I can verify that this change is profitable.
I also think I can write a test that verifies the PendingTrace behavior by using a custom writer.

This looks to me like a clever trick. It changes the write dynamic a bit: the next chance to write is when a new span is added, or when the root span is finished (plus the other queueing states). I believe this is good. I've seen some instrumentations, like aerospike, that explicitly cancel the "continuation", but I don't think this is an issue.

I'm still not sure if this addresses the reported issue.
However, it does cut my macrobenchmark's execution time by 2-3%. Given that my macrobenchmark uses @Trace annotations, which are rather heavy, I suspect the gains might be larger with typical auto-instrumentation.

datadog-datadog-prod-us1 · 2025-11-11T20:32:34Z

🎯 Code Coverage
• Patch Coverage: 100.00%
• Total Coverage: 63.18% (+3.59%)

This comment will be updated automatically if new data arrives.

Commit SHA: f1fcdc0

mcculls

I think it's a good optimization.

Looking forward to seeing the microbenchmark results, I suspect it will show a positive improvement when there are a lot of context migrations.

pr-commenter · 2025-11-11T21:02:45Z

Benchmarks

Startup

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	dougqh/pending-trace-contention-reduction
git_commit_date	1763665690	1763666268
git_commit_sha	`53aa83c`	`f1fcdc0`
release_version	1.57.0-SNAPSHOT~53aa83cb56	1.56.0-SNAPSHOT~f1fcdc0d78

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1763668349	1763668349
ci_job_id	1248773085	1248773085
ci_pipeline_id	83451040	83451040
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-0-uwa22qm0 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-0-uwa22qm0 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module	Agent	Agent
parent	None	None

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 60 metrics, 5 unstable metrics.

Startup time reports for insecure-bank

gantt
    title insecure-bank - global startup overhead: candidate=1.56.0-SNAPSHOT~f1fcdc0d78, baseline=1.57.0-SNAPSHOT~53aa83cb56

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.109 s) : 0, 1108951
Total [baseline] (8.876 s) : 0, 8875728
Agent [candidate] (1.11 s) : 0, 1110429
Total [candidate] (8.837 s) : 0, 8836847
section iast
Agent [baseline] (1.237 s) : 0, 1237217
Total [baseline] (9.542 s) : 0, 9541891
Agent [candidate] (1.251 s) : 0, 1250935
Total [candidate] (9.594 s) : 0, 9593815

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.109 s	-
Agent	iast	1.237 s	128.266 ms (11.6%)
Total	tracing	8.876 s	-
Total	iast	9.542 s	666.163 ms (7.5%)

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.11 s	-
Agent	iast	1.251 s	140.506 ms (12.7%)
Total	tracing	8.837 s	-
Total	iast	9.594 s	756.967 ms (8.6%)

gantt
    title insecure-bank - break down per module: candidate=1.56.0-SNAPSHOT~f1fcdc0d78, baseline=1.57.0-SNAPSHOT~53aa83cb56

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.479 ms) : 0, 1479
crashtracking [candidate] (1.45 ms) : 0, 1450
BytebuddyAgent [baseline] (712.493 ms) : 0, 712493
BytebuddyAgent [candidate] (713.996 ms) : 0, 713996
GlobalTracer [baseline] (251.02 ms) : 0, 251020
GlobalTracer [candidate] (251.104 ms) : 0, 251104
AppSec [baseline] (32.472 ms) : 0, 32472
AppSec [candidate] (32.411 ms) : 0, 32411
Debugger [baseline] (63.708 ms) : 0, 63708
Debugger [candidate] (63.762 ms) : 0, 63762
Remote Config [baseline] (629.965 µs) : 0, 630
Remote Config [candidate] (637.73 µs) : 0, 638
Telemetry [baseline] (8.371 ms) : 0, 8371
Telemetry [candidate] (8.343 ms) : 0, 8343
Flare Poller [baseline] (3.741 ms) : 0, 3741
Flare Poller [candidate] (3.707 ms) : 0, 3707
section iast
crashtracking [baseline] (1.447 ms) : 0, 1447
crashtracking [candidate] (1.459 ms) : 0, 1459
BytebuddyAgent [baseline] (830.857 ms) : 0, 830857
BytebuddyAgent [candidate] (841.792 ms) : 0, 841792
GlobalTracer [baseline] (237.377 ms) : 0, 237377
GlobalTracer [candidate] (238.998 ms) : 0, 238998
AppSec [baseline] (33.693 ms) : 0, 33693
AppSec [candidate] (32.996 ms) : 0, 32996
Debugger [baseline] (59.734 ms) : 0, 59734
Debugger [candidate] (60.5 ms) : 0, 60500
Remote Config [baseline] (542.15 µs) : 0, 542
Remote Config [candidate] (545.307 µs) : 0, 545
Telemetry [baseline] (7.571 ms) : 0, 7571
Telemetry [candidate] (7.614 ms) : 0, 7614
Flare Poller [baseline] (3.491 ms) : 0, 3491
Flare Poller [candidate] (3.502 ms) : 0, 3502
IAST [baseline] (27.656 ms) : 0, 27656
IAST [candidate] (28.383 ms) : 0, 28383

Startup time reports for petclinic

gantt
    title petclinic - global startup overhead: candidate=1.56.0-SNAPSHOT~f1fcdc0d78, baseline=1.57.0-SNAPSHOT~53aa83cb56

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.102 s) : 0, 1102102
Total [baseline] (10.809 s) : 0, 10808762
Agent [candidate] (1.105 s) : 0, 1104536
Total [candidate] (10.799 s) : 0, 10798548
section appsec
Agent [baseline] (1.285 s) : 0, 1285217
Total [baseline] (11.103 s) : 0, 11103268
Agent [candidate] (1.283 s) : 0, 1282869
Total [candidate] (11.165 s) : 0, 11164525
section iast
Agent [baseline] (1.242 s) : 0, 1241717
Total [baseline] (11.243 s) : 0, 11243402
Agent [candidate] (1.246 s) : 0, 1246196
Total [candidate] (11.314 s) : 0, 11313722
section profiling
Agent [baseline] (1.24 s) : 0, 1239603
Total [baseline] (11.199 s) : 0, 11198942
Agent [candidate] (1.235 s) : 0, 1235438
Total [candidate] (11.18 s) : 0, 11180137

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.102 s	-
Agent	appsec	1.285 s	183.115 ms (16.6%)
Agent	iast	1.242 s	139.615 ms (12.7%)
Agent	profiling	1.24 s	137.501 ms (12.5%)
Total	tracing	10.809 s	-
Total	appsec	11.103 s	294.506 ms (2.7%)
Total	iast	11.243 s	434.64 ms (4.0%)
Total	profiling	11.199 s	390.18 ms (3.6%)

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.105 s	-
Agent	appsec	1.283 s	178.333 ms (16.1%)
Agent	iast	1.246 s	141.66 ms (12.8%)
Agent	profiling	1.235 s	130.902 ms (11.9%)
Total	tracing	10.799 s	-
Total	appsec	11.165 s	365.977 ms (3.4%)
Total	iast	11.314 s	515.174 ms (4.8%)
Total	profiling	11.18 s	381.589 ms (3.5%)

gantt
    title petclinic - break down per module: candidate=1.56.0-SNAPSHOT~f1fcdc0d78, baseline=1.57.0-SNAPSHOT~53aa83cb56

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.447 ms) : 0, 1447
crashtracking [candidate] (1.457 ms) : 0, 1457
BytebuddyAgent [baseline] (707.943 ms) : 0, 707943
BytebuddyAgent [candidate] (709.213 ms) : 0, 709213
GlobalTracer [baseline] (249.181 ms) : 0, 249181
GlobalTracer [candidate] (249.773 ms) : 0, 249773
AppSec [baseline] (32.062 ms) : 0, 32062
AppSec [candidate] (32.15 ms) : 0, 32150
Debugger [baseline] (63.895 ms) : 0, 63895
Debugger [candidate] (64.363 ms) : 0, 64363
Remote Config [baseline] (642.997 µs) : 0, 643
Remote Config [candidate] (645.992 µs) : 0, 646
Telemetry [baseline] (8.312 ms) : 0, 8312
Telemetry [candidate] (8.26 ms) : 0, 8260
Flare Poller [baseline] (3.749 ms) : 0, 3749
Flare Poller [candidate] (3.688 ms) : 0, 3688
section appsec
crashtracking [baseline] (1.472 ms) : 0, 1472
crashtracking [candidate] (1.452 ms) : 0, 1452
BytebuddyAgent [baseline] (733.206 ms) : 0, 733206
BytebuddyAgent [candidate] (732.779 ms) : 0, 732779
GlobalTracer [baseline] (240.863 ms) : 0, 240863
GlobalTracer [candidate] (240.533 ms) : 0, 240533
AppSec [baseline] (175.12 ms) : 0, 175120
AppSec [candidate] (173.885 ms) : 0, 173885
Debugger [baseline] (61.806 ms) : 0, 61806
Debugger [candidate] (61.642 ms) : 0, 61642
Remote Config [baseline] (672.474 µs) : 0, 672
Remote Config [candidate] (673.269 µs) : 0, 673
Telemetry [baseline] (8.185 ms) : 0, 8185
Telemetry [candidate] (8.32 ms) : 0, 8320
Flare Poller [baseline] (3.993 ms) : 0, 3993
Flare Poller [candidate] (3.948 ms) : 0, 3948
IAST [baseline] (24.9 ms) : 0, 24900
IAST [candidate] (24.629 ms) : 0, 24629
section iast
crashtracking [baseline] (1.454 ms) : 0, 1454
crashtracking [candidate] (1.444 ms) : 0, 1444
BytebuddyAgent [baseline] (833.351 ms) : 0, 833351
BytebuddyAgent [candidate] (836.908 ms) : 0, 836908
GlobalTracer [baseline] (238.153 ms) : 0, 238153
GlobalTracer [candidate] (238.176 ms) : 0, 238176
AppSec [baseline] (30.323 ms) : 0, 30323
AppSec [candidate] (33.963 ms) : 0, 33963
Debugger [baseline] (60.729 ms) : 0, 60729
Debugger [candidate] (61.265 ms) : 0, 61265
Remote Config [baseline] (544.195 µs) : 0, 544
Remote Config [candidate] (553.171 µs) : 0, 553
Telemetry [baseline] (7.611 ms) : 0, 7611
Telemetry [candidate] (7.685 ms) : 0, 7685
Flare Poller [baseline] (3.462 ms) : 0, 3462
Flare Poller [candidate] (3.542 ms) : 0, 3542
IAST [baseline] (31.237 ms) : 0, 31237
IAST [candidate] (27.694 ms) : 0, 27694
section profiling
ProfilingAgent [baseline] (97.877 ms) : 0, 97877
ProfilingAgent [candidate] (96.653 ms) : 0, 96653
crashtracking [baseline] (1.439 ms) : 0, 1439
crashtracking [candidate] (1.429 ms) : 0, 1429
BytebuddyAgent [baseline] (737.217 ms) : 0, 737217
BytebuddyAgent [candidate] (736.838 ms) : 0, 736838
GlobalTracer [baseline] (224.443 ms) : 0, 224443
GlobalTracer [candidate] (223.215 ms) : 0, 223215
AppSec [baseline] (32.61 ms) : 0, 32610
AppSec [candidate] (32.191 ms) : 0, 32191
Debugger [baseline] (63.628 ms) : 0, 63628
Debugger [candidate] (63.027 ms) : 0, 63027
Remote Config [baseline] (660.251 µs) : 0, 660
Remote Config [candidate] (644.345 µs) : 0, 644
Telemetry [baseline] (8.053 ms) : 0, 8053
Telemetry [candidate] (7.945 ms) : 0, 7945
Flare Poller [baseline] (3.848 ms) : 0, 3848
Flare Poller [candidate] (3.765 ms) : 0, 3765
Profiling [baseline] (98.471 ms) : 0, 98471
Profiling [candidate] (97.232 ms) : 0, 97232

Load

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	dougqh/pending-trace-contention-reduction
git_commit_date	1763665690	1763666268
git_commit_sha	`53aa83c`	`f1fcdc0`
release_version	1.57.0-SNAPSHOT~53aa83cb56	1.56.0-SNAPSHOT~f1fcdc0d78

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1763668710	1763668710
ci_job_id	1248773089	1248773089
ci_pipeline_id	83451040	83451040
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-0-q6zckk9f 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-0-q6zckk9f 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 2 performance regressions! Performance is the same for 17 metrics, 17 unstable metrics.

scenario	Δ mean agg_http_req_duration_p50	Δ mean agg_http_req_duration_p95	Δ mean throughput	candidate mean agg_http_req_duration_p50	candidate mean agg_http_req_duration_p95	candidate mean throughput	baseline mean agg_http_req_duration_p50	baseline mean agg_http_req_duration_p95	baseline mean throughput
scenario:load:petclinic:profiling:high_load	worse [+0.557ms; +1.825ms] or [+2.816%; +9.223%]	unsure [+0.535ms; +2.616ms] or [+1.700%; +8.310%]	unstable [-37.916op/s; +11.103op/s] or [-16.264%; +4.763%]	20.978ms	33.056ms	219.719op/s	19.787ms	31.480ms	233.125op/s
scenario:load:petclinic:appsec:high_load	worse [+0.732ms; +1.325ms] or [+3.997%; +7.233%]	unsure [+318.548µs; +1657.235µs] or [+1.066%; +5.546%]	unstable [-37.342op/s; +13.404op/s] or [-14.976%; +5.376%]	19.349ms	30.870ms	237.375op/s	18.321ms	29.882ms	249.344op/s

Request duration reports for petclinic

gantt
    title petclinic - request duration [CI 0.99] : candidate=1.56.0-SNAPSHOT~f1fcdc0d78, baseline=1.57.0-SNAPSHOT~53aa83cb56
    dateFormat X
    axisFormat %s
section baseline
no_agent (18.268 ms) : 18080, 18456
.   : milestone, 18268,
appsec (18.719 ms) : 18529, 18909
.   : milestone, 18719,
code_origins (18.073 ms) : 17891, 18256
.   : milestone, 18073,
iast (17.83 ms) : 17653, 18006
.   : milestone, 17830,
profiling (20.025 ms) : 19816, 20234
.   : milestone, 20025,
tracing (17.68 ms) : 17503, 17857
.   : milestone, 17680,
section candidate
no_agent (17.383 ms) : 17207, 17558
.   : milestone, 17383,
appsec (19.666 ms) : 19466, 19866
.   : milestone, 19666,
code_origins (17.583 ms) : 17408, 17759
.   : milestone, 17583,
iast (17.556 ms) : 17380, 17733
.   : milestone, 17556,
profiling (21.256 ms) : 21041, 21471
.   : milestone, 21256,
tracing (17.805 ms) : 17627, 17983
.   : milestone, 17805,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	18.268 ms [18.08 ms, 18.456 ms]	-
appsec	18.719 ms [18.529 ms, 18.909 ms]	450.739 µs (2.5%)
code_origins	18.073 ms [17.891 ms, 18.256 ms]	-194.614 µs (-1.1%)
iast	17.83 ms [17.653 ms, 18.006 ms]	-438.266 µs (-2.4%)
profiling	20.025 ms [19.816 ms, 20.234 ms]	1.757 ms (9.6%)
tracing	17.68 ms [17.503 ms, 17.857 ms]	-588.051 µs (-3.2%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	17.383 ms [17.207 ms, 17.558 ms]	-
appsec	19.666 ms [19.466 ms, 19.866 ms]	2.283 ms (13.1%)
code_origins	17.583 ms [17.408 ms, 17.759 ms]	200.852 µs (1.2%)
iast	17.556 ms [17.38 ms, 17.733 ms]	173.773 µs (1.0%)
profiling	21.256 ms [21.041 ms, 21.471 ms]	3.873 ms (22.3%)
tracing	17.805 ms [17.627 ms, 17.983 ms]	422.251 µs (2.4%)

Request duration reports for insecure-bank

gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.56.0-SNAPSHOT~f1fcdc0d78, baseline=1.57.0-SNAPSHOT~53aa83cb56
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.262 ms) : 1249, 1275
.   : milestone, 1262,
iast (3.256 ms) : 3212, 3301
.   : milestone, 3256,
iast_FULL (5.618 ms) : 5563, 5673
.   : milestone, 5618,
iast_GLOBAL (3.542 ms) : 3489, 3595
.   : milestone, 3542,
profiling (2.074 ms) : 2056, 2092
.   : milestone, 2074,
tracing (1.826 ms) : 1811, 1841
.   : milestone, 1826,
section candidate
no_agent (1.186 ms) : 1175, 1197
.   : milestone, 1186,
iast (3.284 ms) : 3237, 3331
.   : milestone, 3284,
iast_FULL (5.8 ms) : 5743, 5858
.   : milestone, 5800,
iast_GLOBAL (3.568 ms) : 3516, 3619
.   : milestone, 3568,
profiling (2.006 ms) : 1989, 2023
.   : milestone, 2006,
tracing (1.81 ms) : 1795, 1825
.   : milestone, 1810,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	1.262 ms [1.249 ms, 1.275 ms]	-
iast	3.256 ms [3.212 ms, 3.301 ms]	1.994 ms (158.0%)
iast_FULL	5.618 ms [5.563 ms, 5.673 ms]	4.356 ms (345.1%)
iast_GLOBAL	3.542 ms [3.489 ms, 3.595 ms]	2.28 ms (180.6%)
profiling	2.074 ms [2.056 ms, 2.092 ms]	811.458 µs (64.3%)
tracing	1.826 ms [1.811 ms, 1.841 ms]	563.616 µs (44.7%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	1.186 ms [1.175 ms, 1.197 ms]	-
iast	3.284 ms [3.237 ms, 3.331 ms]	2.098 ms (176.9%)
iast_FULL	5.8 ms [5.743 ms, 5.858 ms]	4.614 ms (388.9%)
iast_GLOBAL	3.568 ms [3.516 ms, 3.619 ms]	2.381 ms (200.7%)
profiling	2.006 ms [1.989 ms, 2.023 ms]	819.76 µs (69.1%)
tracing	1.81 ms [1.795 ms, 1.825 ms]	623.901 µs (52.6%)

Dacapo

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	dougqh/pending-trace-contention-reduction
git_commit_date	1763665690	1763666268
git_commit_sha	`53aa83c`	`f1fcdc0`
release_version	1.57.0-SNAPSHOT~53aa83cb56	1.56.0-SNAPSHOT~f1fcdc0d78

See matching parameters

	Baseline	Candidate
application	biojava	biojava
ci_job_date	1763668362	1763668362
ci_job_id	1248773092	1248773092
ci_pipeline_id	83451040	83451040
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-0-pwt21l14 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-0-pwt21l14 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 10 metrics, 2 unstable metrics.

Execution time for tomcat

gantt
    title tomcat - execution time [CI 0.99] : candidate=1.56.0-SNAPSHOT~f1fcdc0d78, baseline=1.57.0-SNAPSHOT~53aa83cb56
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.483 ms) : 1472, 1495
.   : milestone, 1483,
appsec (3.691 ms) : 3475, 3907
.   : milestone, 3691,
iast (2.239 ms) : 2174, 2305
.   : milestone, 2239,
iast_GLOBAL (2.279 ms) : 2213, 2345
.   : milestone, 2279,
profiling (2.091 ms) : 2038, 2145
.   : milestone, 2091,
tracing (2.06 ms) : 2008, 2111
.   : milestone, 2060,
section candidate
no_agent (1.482 ms) : 1471, 1494
.   : milestone, 1482,
appsec (3.66 ms) : 3444, 3876
.   : milestone, 3660,
iast (2.222 ms) : 2157, 2287
.   : milestone, 2222,
iast_GLOBAL (2.269 ms) : 2204, 2335
.   : milestone, 2269,
profiling (2.506 ms) : 2345, 2666
.   : milestone, 2506,
tracing (2.065 ms) : 2014, 2116
.   : milestone, 2065,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.483 ms [1.472 ms, 1.495 ms]	-
appsec	3.691 ms [3.475 ms, 3.907 ms]	2.208 ms (148.8%)
iast	2.239 ms [2.174 ms, 2.305 ms]	756.199 µs (51.0%)
iast_GLOBAL	2.279 ms [2.213 ms, 2.345 ms]	795.804 µs (53.7%)
profiling	2.091 ms [2.038 ms, 2.145 ms]	608.188 µs (41.0%)
tracing	2.06 ms [2.008 ms, 2.111 ms]	576.387 µs (38.9%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.482 ms [1.471 ms, 1.494 ms]	-
appsec	3.66 ms [3.444 ms, 3.876 ms]	2.178 ms (147.0%)
iast	2.222 ms [2.157 ms, 2.287 ms]	739.816 µs (49.9%)
iast_GLOBAL	2.269 ms [2.204 ms, 2.335 ms]	787.158 µs (53.1%)
profiling	2.506 ms [2.345 ms, 2.666 ms]	1.024 ms (69.1%)
tracing	2.065 ms [2.014 ms, 2.116 ms]	582.914 µs (39.3%)

Execution time for biojava

gantt
    title biojava - execution time [CI 0.99] : candidate=1.56.0-SNAPSHOT~f1fcdc0d78, baseline=1.57.0-SNAPSHOT~53aa83cb56
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.019 s) : 15019000, 15019000
.   : milestone, 15019000,
appsec (14.985 s) : 14985000, 14985000
.   : milestone, 14985000,
iast (18.562 s) : 18562000, 18562000
.   : milestone, 18562000,
iast_GLOBAL (18.083 s) : 18083000, 18083000
.   : milestone, 18083000,
profiling (14.768 s) : 14768000, 14768000
.   : milestone, 14768000,
tracing (14.841 s) : 14841000, 14841000
.   : milestone, 14841000,
section candidate
no_agent (15.774 s) : 15774000, 15774000
.   : milestone, 15774000,
appsec (14.761 s) : 14761000, 14761000
.   : milestone, 14761000,
iast (18.349 s) : 18349000, 18349000
.   : milestone, 18349000,
iast_GLOBAL (17.956 s) : 17956000, 17956000
.   : milestone, 17956000,
profiling (14.608 s) : 14608000, 14608000
.   : milestone, 14608000,
tracing (15.077 s) : 15077000, 15077000
.   : milestone, 15077000,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	15.019 s [15.019 s, 15.019 s]	-
appsec	14.985 s [14.985 s, 14.985 s]	-34.0 ms (-0.2%)
iast	18.562 s [18.562 s, 18.562 s]	3.543 s (23.6%)
iast_GLOBAL	18.083 s [18.083 s, 18.083 s]	3.064 s (20.4%)
profiling	14.768 s [14.768 s, 14.768 s]	-251.0 ms (-1.7%)
tracing	14.841 s [14.841 s, 14.841 s]	-178.0 ms (-1.2%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	15.774 s [15.774 s, 15.774 s]	-
appsec	14.761 s [14.761 s, 14.761 s]	-1.013 s (-6.4%)
iast	18.349 s [18.349 s, 18.349 s]	2.575 s (16.3%)
iast_GLOBAL	17.956 s [17.956 s, 17.956 s]	2.182 s (13.8%)
profiling	14.608 s [14.608 s, 14.608 s]	-1.166 s (-7.4%)
tracing	15.077 s [15.077 s, 15.077 s]	-697.0 ms (-4.4%)

bric3 · 2025-11-12T16:08:00Z

dd-trace-core/src/main/java/datadog/trace/core/PendingTrace.java

  }

-  private PublishState decrementRefAndMaybeWrite(boolean isRootSpan) {
+  private PublishState decrementRefAndMaybeWrite(boolean isRootSpan, boolean addedSpan) {


nitpick: Maybe rename addedSpan to allowPartialWrite

I was thinking of this more as indicating what action had just been performed, so more like an event.
And then letting decrementRefAndMaybeWrite use the "event" information to determine which checks to perform.

Maybe there will be another situation where we'll handle finish span vs scope close differently, although probably not.
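
For illustration only, the "event" framing could look roughly like the sketch below. The PR itself passes a boolean named addedSpan; the `Event` enum, its constants, and `shouldAttemptPartialFlush` are hypothetical names, not part of the tracer.

```java
/** Illustrative only: models the "what just happened" event as an enum. */
final class PublishEventSketch {
  /** Hypothetical event type; the PR uses a boolean (addedSpan) instead. */
  enum Event {
    SPAN_FINISHED,
    SCOPE_CLOSED
  }

  /** Decides which checks to run based on the action that was just performed. */
  static boolean shouldAttemptPartialFlush(Event event, int finishedSpans, int threshold) {
    switch (event) {
      case SPAN_FINISHED:
        // A span was just added, so the threshold may have been crossed.
        return finishedSpans >= threshold;
      case SCOPE_CLOSED:
        // No span was added, so the threshold cannot have been crossed by this call.
        return false;
      default:
        throw new AssertionError("unknown event: " + event);
    }
  }
}
```

An enum would leave room for handling a future event differently, at the cost of a little more ceremony than the boolean.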

bric3 · 2025-11-12T17:07:36Z

dd-trace-core/src/main/java/datadog/trace/core/PendingTrace.java

  }

-  private PublishState decrementRefAndMaybeWrite(boolean isRootSpan) {
+  private PublishState decrementRefAndMaybeWrite(boolean isRootSpan, boolean addedSpan) {


This looks to me like a clever trick. It changes the write dynamic a bit: the next chance to write is when a new span is added, or when the root span is finished (plus the other queueing states). I believe this is good. I've seen some instrumentations, like aerospike, that explicitly cancel the "continuation", but I don't think this is an issue.

AlexeyKuznetsov-DD · 2025-11-12T20:30:59Z

dd-trace-core/src/main/java/datadog/trace/core/PendingTrace.java

+      // DQH - We only trigger a partial flush, when a span has just been added
+      // This prevents a bunch of threads which are only performing scope/context operations
+      // from all fighting to perform the partialFlush after the threshold is crossed.
+
+      // This is an important optimization for virtual threads where a continuation might
+      // be created even though no span is created.  In that situation, virtual threads
+      // can end up fighting to perform the partialFlush.  And even trying to perform a
+      // partialFlush requires taking the PendingTrace lock which can lead to unmounting
+      // the virtual thread from its carrier thread.


I don't know the whole picture, but from my experience: what if we did not fight over the flush at all?
Maybe we could refactor the logic so that every thread interested in flushing just sets a flag to true, and a background thread checks it and periodically flushes the data?
Does that make sense at all?

Maybe a counter would be a better solution than a boolean flag. It would also be useful to know how many times a flush was requested.

Yes, I'm inclined to agree. This was mostly intended as a quick fix / experiment to see if we could improve the reported issue. In my macrobenchmark, I did see a 2% throughput improvement but haven't yet replicated the stall that was reported.

I do like the idea of flipping a boolean. I also don't like that we're taking a long-held lock in the application's critical path, so there's definitely still a lot of room for improvement.
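
A minimal sketch of the counter variant discussed above, under the assumption that exactly one background thread performs the flush. None of this exists in the tracer today: `FlushRequestSketch`, `requestFlush`, and `drain` are made-up names, and the flush action is passed in as a plain `Runnable`.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative only: application threads request a flush, one background thread performs it. */
final class FlushRequestSketch {
  private final AtomicLong requested = new AtomicLong();
  private final AtomicLong handled = new AtomicLong();
  private final Runnable flushAction;

  FlushRequestSketch(Runnable flushAction) {
    this.flushAction = flushAction;
  }

  /** Called on application threads: a single atomic increment, no lock taken. */
  void requestFlush() {
    requested.incrementAndGet();
  }

  /**
   * Called periodically by the single background thread.
   * Runs the flush once if any requests arrived since the last call and
   * returns how many requests were coalesced into that flush.
   */
  long drain() {
    long seen = requested.get();
    long pending = seen - handled.get();
    if (pending > 0) {
      flushAction.run();
      handled.set(seen);
    }
    return pending;
  }

  /** Total number of flush requests so far, useful as a diagnostic. */
  long totalRequested() {
    return requested.get();
  }
}
```

The background side could be driven by something like a `ScheduledExecutorService` calling `drain()` at a fixed delay. Requests that arrive while a flush is running are picked up on the next call, so nothing is lost, only delayed.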

dougqh · 2025-11-13T20:24:14Z

I think it's a good optimization.

Looking forward to seeing the microbenchmark results, I suspect it will show a positive improvement when there are a lot of context migrations.

I did a quick stand-alone macrobenchmark. The macrobenchmark shows a modest but consistent 2% reduction in execution time.

dougqh added 2 commits November 11, 2025 15:10

Attempting to reduce contention for virtual threads

f59041e

spotless

34a6e6b

dougqh requested a review from a team as a code owner November 11, 2025 20:18

dougqh requested a review from smola November 11, 2025 20:18

dougqh added the comp: core and tag: performance labels Nov 11, 2025

dougqh commented Nov 11, 2025

mcculls approved these changes Nov 11, 2025

mcculls added the type: enhancement label Nov 11, 2025

bric3 approved these changes Nov 12, 2025

AlexeyKuznetsov-DD reviewed Nov 12, 2025

Merge branch 'master' into dougqh/pending-trace-contention-reduction

e61397a

dougqh enabled auto-merge (squash) November 20, 2025 18:55

Merge branch 'master' into dougqh/pending-trace-contention-reduction

f1fcdc0

dougqh merged commit f3f8d86 into master Nov 20, 2025
538 checks passed

dougqh deleted the dougqh/pending-trace-contention-reduction branch November 20, 2025 20:19

github-actions bot added this to the 1.57.0 milestone Nov 20, 2025

amarziali pushed a commit that referenced this pull request Nov 21, 2025

Reduce PendingTrace Lock Contention (#9932)

9fb8f43

* Attempting to reduce contention for virtual threads * spotless
