Conversation

@dougqh
Contributor

@dougqh dougqh commented Nov 11, 2025

What Does This Do

Aims to reduce lock contention in PendingTrace by only attempting partialFlush when a span has just been added to PendingTrace.

Prior to this change, we would also attempt a partialFlush after a scope/context close, but closing a scope cannot cause us to cross the partialFlush threshold.

The theory is that this will reduce our lock contention with virtual threads.
The concern is that virtual threads are often only restoring context, but then not creating a span.
That can lead the virtual thread to attempt a partialFlush which requires taking the PendingTrace lock.
If the PendingTrace lock cannot be acquired, then the virtual thread will be unmounted from its carrier thread.
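
To make the gating idea concrete, here is a minimal sketch, not the tracer's actual code. `ToyPendingTrace`, `Outcome`, `registerRef`, `finishSpan`, and `closeScope` are hypothetical names invented for illustration, and the `isRootSpan` handling from the real method is left out. It assumes finished spans go into a lock-free queue and that only a flush takes the lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Toy model of a pending trace. All names here are hypothetical. */
final class ToyPendingTrace {
  enum Outcome { PENDING, PARTIAL_FLUSH, WRITTEN }

  private final int partialFlushThreshold;
  private final AtomicInteger pendingRefs = new AtomicInteger();
  private final AtomicInteger finishedCount = new AtomicInteger();
  private final ConcurrentLinkedQueue<String> finishedSpans = new ConcurrentLinkedQueue<>();
  private final ReentrantLock lock = new ReentrantLock();
  private final List<List<String>> written = new ArrayList<>(); // guarded by lock
  private final AtomicInteger lockAcquisitions = new AtomicInteger();

  ToyPendingTrace(int partialFlushThreshold) {
    this.partialFlushThreshold = partialFlushThreshold;
  }

  /** A span was started or a scope/continuation was captured. */
  void registerRef() {
    pendingRefs.incrementAndGet();
  }

  /** Finishing a span adds it to the trace, so the threshold may now be crossed. */
  Outcome finishSpan(String name) {
    finishedSpans.add(name);
    finishedCount.incrementAndGet();
    return decrementRefAndMaybeWrite(true);
  }

  /** Closing a scope adds no span, so it cannot cross the threshold. */
  Outcome closeScope() {
    return decrementRefAndMaybeWrite(false);
  }

  private Outcome decrementRefAndMaybeWrite(boolean addedSpan) {
    if (pendingRefs.decrementAndGet() == 0) {
      flush(); // last reference released: write whatever is left
      return Outcome.WRITTEN;
    }
    // Only a caller that just added a span checks the threshold and takes the lock.
    if (addedSpan && finishedCount.get() >= partialFlushThreshold) {
      flush();
      return Outcome.PARTIAL_FLUSH;
    }
    return Outcome.PENDING;
  }

  private void flush() {
    lock.lock();
    try {
      lockAcquisitions.incrementAndGet();
      List<String> batch = new ArrayList<>();
      for (String s; (s = finishedSpans.poll()) != null; ) {
        batch.add(s);
        finishedCount.decrementAndGet();
      }
      if (!batch.isEmpty()) {
        written.add(batch);
      }
    } finally {
      lock.unlock();
    }
  }

  int lockAcquisitions() {
    return lockAcquisitions.get();
  }

  List<List<String>> written() {
    lock.lock();
    try {
      return new ArrayList<>(written);
    } finally {
      lock.unlock();
    }
  }

  public static void main(String[] args) {
    ToyPendingTrace trace = new ToyPendingTrace(2);
    for (int i = 0; i < 5; i++) {
      trace.registerRef(); // 3 spans + 2 captured scopes
    }
    System.out.println(trace.finishSpan("a"));
    System.out.println(trace.closeScope());
    System.out.println(trace.finishSpan("b"));
    System.out.println(trace.closeScope());
    System.out.println(trace.finishSpan("root"));
    System.out.println("lock acquisitions: " + trace.lockAcquisitions());
  }
}
```

In this toy model, `closeScope` never touches the lock unless it releases the last reference. That is the property the change relies on for threads that only restore context.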

Motivation

Report of high overhead and lock contention when using virtual threads

Additional Notes

Contributor Checklist

Jira ticket: [PROJ-IDENT]

@dougqh dougqh requested a review from a team as a code owner November 11, 2025 20:18
@dougqh dougqh requested a review from smola November 11, 2025 20:18
@github-actions
Contributor

github-actions bot commented Nov 11, 2025

Hi! 👋 Thanks for your pull request! 🎉

To help us review it, please make sure to:

  • Add at least one type, and one component or instrumentation label to the pull request

If you need help, please check our contributing guidelines.

@dougqh dougqh added the comp: core (Tracer core) and tag: performance (Performance related changes) labels Nov 11, 2025
  }

- private PublishState decrementRefAndMaybeWrite(boolean isRootSpan) {
+ private PublishState decrementRefAndMaybeWrite(boolean isRootSpan, boolean addedSpan) {
Contributor Author

Right now, I'm curious what others think of this potential change.
I'm intending to write a microbenchmark to see if I can verify that this change is profitable.
I also think I can write a test that verifies the PendingTrace behavior by using a custom writer.

Contributor

This looks to me like a clever trick. It changes the write dynamic a bit: the next chance to write is when a new span is added, or when the root span is finished (and the other queueing states). I believe this is good. I've seen some instrumentations, like aerospike, that explicitly cancel the "continuation", but I don't think this is an issue.

Contributor Author

I'm still not sure if this addresses the reported issue.
However, it does cut my macrobenchmark's execution time by 2-3%. Given that my macrobenchmark uses @Trace annotations, which are rather heavy, I suspect the gains might be larger with typical auto-instrumentation.

@datadog-datadog-prod-us1
Contributor

datadog-datadog-prod-us1 bot commented Nov 11, 2025

🎯 Code Coverage
Patch Coverage: 100.00%
Total Coverage: 63.18% (+3.59%)

🔗 Commit SHA: f1fcdc0

Contributor

@mcculls mcculls left a comment

I think it's a good optimization.

Looking forward to seeing the microbenchmark results, I suspect it will show a positive improvement when there are a lot of context migrations.

@mcculls mcculls added the type: enhancement (Enhancements and improvements) label Nov 11, 2025
@pr-commenter

pr-commenter bot commented Nov 11, 2025

Benchmarks

Startup

Parameters

Parameter Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master dougqh/pending-trace-contention-reduction
git_commit_date 1763665690 1763666268
git_commit_sha 53aa83c f1fcdc0
release_version 1.57.0-SNAPSHOT~53aa83cb56 1.56.0-SNAPSHOT~f1fcdc0d78
See matching parameters
Parameter Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1763668349 1763668349
ci_job_id 1248773085 1248773085
ci_pipeline_id 83451040 83451040
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-0-uwa22qm0 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-0-uwa22qm0 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module Agent Agent
parent None None

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 60 metrics, 5 unstable metrics.

Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.56.0-SNAPSHOT~f1fcdc0d78, baseline=1.57.0-SNAPSHOT~53aa83cb56

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.109 s) : 0, 1108951
Total [baseline] (8.876 s) : 0, 8875728
Agent [candidate] (1.11 s) : 0, 1110429
Total [candidate] (8.837 s) : 0, 8836847
section iast
Agent [baseline] (1.237 s) : 0, 1237217
Total [baseline] (9.542 s) : 0, 9541891
Agent [candidate] (1.251 s) : 0, 1250935
Total [candidate] (9.594 s) : 0, 9593815
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.109 s -
Agent iast 1.237 s 128.266 ms (11.6%)
Total tracing 8.876 s -
Total iast 9.542 s 666.163 ms (7.5%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.11 s -
Agent iast 1.251 s 140.506 ms (12.7%)
Total tracing 8.837 s -
Total iast 9.594 s 756.967 ms (8.6%)
gantt
    title insecure-bank - break down per module: candidate=1.56.0-SNAPSHOT~f1fcdc0d78, baseline=1.57.0-SNAPSHOT~53aa83cb56

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.479 ms) : 0, 1479
crashtracking [candidate] (1.45 ms) : 0, 1450
BytebuddyAgent [baseline] (712.493 ms) : 0, 712493
BytebuddyAgent [candidate] (713.996 ms) : 0, 713996
GlobalTracer [baseline] (251.02 ms) : 0, 251020
GlobalTracer [candidate] (251.104 ms) : 0, 251104
AppSec [baseline] (32.472 ms) : 0, 32472
AppSec [candidate] (32.411 ms) : 0, 32411
Debugger [baseline] (63.708 ms) : 0, 63708
Debugger [candidate] (63.762 ms) : 0, 63762
Remote Config [baseline] (629.965 µs) : 0, 630
Remote Config [candidate] (637.73 µs) : 0, 638
Telemetry [baseline] (8.371 ms) : 0, 8371
Telemetry [candidate] (8.343 ms) : 0, 8343
Flare Poller [baseline] (3.741 ms) : 0, 3741
Flare Poller [candidate] (3.707 ms) : 0, 3707
section iast
crashtracking [baseline] (1.447 ms) : 0, 1447
crashtracking [candidate] (1.459 ms) : 0, 1459
BytebuddyAgent [baseline] (830.857 ms) : 0, 830857
BytebuddyAgent [candidate] (841.792 ms) : 0, 841792
GlobalTracer [baseline] (237.377 ms) : 0, 237377
GlobalTracer [candidate] (238.998 ms) : 0, 238998
AppSec [baseline] (33.693 ms) : 0, 33693
AppSec [candidate] (32.996 ms) : 0, 32996
Debugger [baseline] (59.734 ms) : 0, 59734
Debugger [candidate] (60.5 ms) : 0, 60500
Remote Config [baseline] (542.15 µs) : 0, 542
Remote Config [candidate] (545.307 µs) : 0, 545
Telemetry [baseline] (7.571 ms) : 0, 7571
Telemetry [candidate] (7.614 ms) : 0, 7614
Flare Poller [baseline] (3.491 ms) : 0, 3491
Flare Poller [candidate] (3.502 ms) : 0, 3502
IAST [baseline] (27.656 ms) : 0, 27656
IAST [candidate] (28.383 ms) : 0, 28383
Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.56.0-SNAPSHOT~f1fcdc0d78, baseline=1.57.0-SNAPSHOT~53aa83cb56

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.102 s) : 0, 1102102
Total [baseline] (10.809 s) : 0, 10808762
Agent [candidate] (1.105 s) : 0, 1104536
Total [candidate] (10.799 s) : 0, 10798548
section appsec
Agent [baseline] (1.285 s) : 0, 1285217
Total [baseline] (11.103 s) : 0, 11103268
Agent [candidate] (1.283 s) : 0, 1282869
Total [candidate] (11.165 s) : 0, 11164525
section iast
Agent [baseline] (1.242 s) : 0, 1241717
Total [baseline] (11.243 s) : 0, 11243402
Agent [candidate] (1.246 s) : 0, 1246196
Total [candidate] (11.314 s) : 0, 11313722
section profiling
Agent [baseline] (1.24 s) : 0, 1239603
Total [baseline] (11.199 s) : 0, 11198942
Agent [candidate] (1.235 s) : 0, 1235438
Total [candidate] (11.18 s) : 0, 11180137
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.102 s -
Agent appsec 1.285 s 183.115 ms (16.6%)
Agent iast 1.242 s 139.615 ms (12.7%)
Agent profiling 1.24 s 137.501 ms (12.5%)
Total tracing 10.809 s -
Total appsec 11.103 s 294.506 ms (2.7%)
Total iast 11.243 s 434.64 ms (4.0%)
Total profiling 11.199 s 390.18 ms (3.6%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.105 s -
Agent appsec 1.283 s 178.333 ms (16.1%)
Agent iast 1.246 s 141.66 ms (12.8%)
Agent profiling 1.235 s 130.902 ms (11.9%)
Total tracing 10.799 s -
Total appsec 11.165 s 365.977 ms (3.4%)
Total iast 11.314 s 515.174 ms (4.8%)
Total profiling 11.18 s 381.589 ms (3.5%)
gantt
    title petclinic - break down per module: candidate=1.56.0-SNAPSHOT~f1fcdc0d78, baseline=1.57.0-SNAPSHOT~53aa83cb56

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.447 ms) : 0, 1447
crashtracking [candidate] (1.457 ms) : 0, 1457
BytebuddyAgent [baseline] (707.943 ms) : 0, 707943
BytebuddyAgent [candidate] (709.213 ms) : 0, 709213
GlobalTracer [baseline] (249.181 ms) : 0, 249181
GlobalTracer [candidate] (249.773 ms) : 0, 249773
AppSec [baseline] (32.062 ms) : 0, 32062
AppSec [candidate] (32.15 ms) : 0, 32150
Debugger [baseline] (63.895 ms) : 0, 63895
Debugger [candidate] (64.363 ms) : 0, 64363
Remote Config [baseline] (642.997 µs) : 0, 643
Remote Config [candidate] (645.992 µs) : 0, 646
Telemetry [baseline] (8.312 ms) : 0, 8312
Telemetry [candidate] (8.26 ms) : 0, 8260
Flare Poller [baseline] (3.749 ms) : 0, 3749
Flare Poller [candidate] (3.688 ms) : 0, 3688
section appsec
crashtracking [baseline] (1.472 ms) : 0, 1472
crashtracking [candidate] (1.452 ms) : 0, 1452
BytebuddyAgent [baseline] (733.206 ms) : 0, 733206
BytebuddyAgent [candidate] (732.779 ms) : 0, 732779
GlobalTracer [baseline] (240.863 ms) : 0, 240863
GlobalTracer [candidate] (240.533 ms) : 0, 240533
AppSec [baseline] (175.12 ms) : 0, 175120
AppSec [candidate] (173.885 ms) : 0, 173885
Debugger [baseline] (61.806 ms) : 0, 61806
Debugger [candidate] (61.642 ms) : 0, 61642
Remote Config [baseline] (672.474 µs) : 0, 672
Remote Config [candidate] (673.269 µs) : 0, 673
Telemetry [baseline] (8.185 ms) : 0, 8185
Telemetry [candidate] (8.32 ms) : 0, 8320
Flare Poller [baseline] (3.993 ms) : 0, 3993
Flare Poller [candidate] (3.948 ms) : 0, 3948
IAST [baseline] (24.9 ms) : 0, 24900
IAST [candidate] (24.629 ms) : 0, 24629
section iast
crashtracking [baseline] (1.454 ms) : 0, 1454
crashtracking [candidate] (1.444 ms) : 0, 1444
BytebuddyAgent [baseline] (833.351 ms) : 0, 833351
BytebuddyAgent [candidate] (836.908 ms) : 0, 836908
GlobalTracer [baseline] (238.153 ms) : 0, 238153
GlobalTracer [candidate] (238.176 ms) : 0, 238176
AppSec [baseline] (30.323 ms) : 0, 30323
AppSec [candidate] (33.963 ms) : 0, 33963
Debugger [baseline] (60.729 ms) : 0, 60729
Debugger [candidate] (61.265 ms) : 0, 61265
Remote Config [baseline] (544.195 µs) : 0, 544
Remote Config [candidate] (553.171 µs) : 0, 553
Telemetry [baseline] (7.611 ms) : 0, 7611
Telemetry [candidate] (7.685 ms) : 0, 7685
Flare Poller [baseline] (3.462 ms) : 0, 3462
Flare Poller [candidate] (3.542 ms) : 0, 3542
IAST [baseline] (31.237 ms) : 0, 31237
IAST [candidate] (27.694 ms) : 0, 27694
section profiling
ProfilingAgent [baseline] (97.877 ms) : 0, 97877
ProfilingAgent [candidate] (96.653 ms) : 0, 96653
crashtracking [baseline] (1.439 ms) : 0, 1439
crashtracking [candidate] (1.429 ms) : 0, 1429
BytebuddyAgent [baseline] (737.217 ms) : 0, 737217
BytebuddyAgent [candidate] (736.838 ms) : 0, 736838
GlobalTracer [baseline] (224.443 ms) : 0, 224443
GlobalTracer [candidate] (223.215 ms) : 0, 223215
AppSec [baseline] (32.61 ms) : 0, 32610
AppSec [candidate] (32.191 ms) : 0, 32191
Debugger [baseline] (63.628 ms) : 0, 63628
Debugger [candidate] (63.027 ms) : 0, 63027
Remote Config [baseline] (660.251 µs) : 0, 660
Remote Config [candidate] (644.345 µs) : 0, 644
Telemetry [baseline] (8.053 ms) : 0, 8053
Telemetry [candidate] (7.945 ms) : 0, 7945
Flare Poller [baseline] (3.848 ms) : 0, 3848
Flare Poller [candidate] (3.765 ms) : 0, 3765
Profiling [baseline] (98.471 ms) : 0, 98471
Profiling [candidate] (97.232 ms) : 0, 97232

Load

Parameters

Parameter Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master dougqh/pending-trace-contention-reduction
git_commit_date 1763665690 1763666268
git_commit_sha 53aa83c f1fcdc0
release_version 1.57.0-SNAPSHOT~53aa83cb56 1.56.0-SNAPSHOT~f1fcdc0d78
See matching parameters
Parameter Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1763668710 1763668710
ci_job_id 1248773089 1248773089
ci_pipeline_id 83451040 83451040
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-0-q6zckk9f 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-0-q6zckk9f 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 2 performance regressions! Performance is the same for 17 metrics, 17 unstable metrics.

scenario:load:petclinic:profiling:high_load
  • Δ mean agg_http_req_duration_p50: worse, [+0.557ms; +1.825ms] or [+2.816%; +9.223%]
  • Δ mean agg_http_req_duration_p95: unsure, [+0.535ms; +2.616ms] or [+1.700%; +8.310%]
  • Δ mean throughput: unstable, [-37.916op/s; +11.103op/s] or [-16.264%; +4.763%]
  • candidate mean: agg_http_req_duration_p50 20.978ms, agg_http_req_duration_p95 33.056ms, throughput 219.719op/s
  • baseline mean: agg_http_req_duration_p50 19.787ms, agg_http_req_duration_p95 31.480ms, throughput 233.125op/s
scenario:load:petclinic:appsec:high_load
  • Δ mean agg_http_req_duration_p50: worse, [+0.732ms; +1.325ms] or [+3.997%; +7.233%]
  • Δ mean agg_http_req_duration_p95: unsure, [+318.548µs; +1657.235µs] or [+1.066%; +5.546%]
  • Δ mean throughput: unstable, [-37.342op/s; +13.404op/s] or [-14.976%; +5.376%]
  • candidate mean: agg_http_req_duration_p50 19.349ms, agg_http_req_duration_p95 30.870ms, throughput 237.375op/s
  • baseline mean: agg_http_req_duration_p50 18.321ms, agg_http_req_duration_p95 29.882ms, throughput 249.344op/s
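
As a sanity check on how to read these rows, the small sketch below recomputes the point difference of the two reported p50 means for the profiling scenario and expresses it relative to the baseline. `DeltaCheck` is a hypothetical helper. That the report's percentages are relative to the baseline mean is an assumption, though it appears consistent with the printed intervals.

```java
/** Hypothetical helper for reading the benchmark summary; not part of the report tooling. */
final class DeltaCheck {
  /** Absolute difference of means, candidate minus baseline. */
  static double delta(double candidate, double baseline) {
    return candidate - baseline;
  }

  /** Difference relative to the baseline mean, in percent (assumed convention). */
  static double relativePercent(double candidate, double baseline) {
    return 100.0 * (candidate - baseline) / baseline;
  }

  public static void main(String[] args) {
    double baselineP50Ms = 19.787;  // baseline mean agg_http_req_duration_p50
    double candidateP50Ms = 20.978; // candidate mean agg_http_req_duration_p50
    System.out.printf(
        "delta = %+.3f ms (%+.3f%%)%n",
        delta(candidateP50Ms, baselineP50Ms),
        relativePercent(candidateP50Ms, baselineP50Ms));
  }
}
```

The point difference falls inside the reported interval for that scenario. The interval itself comes from the benchmark tooling and cannot be reproduced from the two means alone.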
Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.56.0-SNAPSHOT~f1fcdc0d78, baseline=1.57.0-SNAPSHOT~53aa83cb56
    dateFormat X
    axisFormat %s
section baseline
no_agent (18.268 ms) : 18080, 18456
.   : milestone, 18268,
appsec (18.719 ms) : 18529, 18909
.   : milestone, 18719,
code_origins (18.073 ms) : 17891, 18256
.   : milestone, 18073,
iast (17.83 ms) : 17653, 18006
.   : milestone, 17830,
profiling (20.025 ms) : 19816, 20234
.   : milestone, 20025,
tracing (17.68 ms) : 17503, 17857
.   : milestone, 17680,
section candidate
no_agent (17.383 ms) : 17207, 17558
.   : milestone, 17383,
appsec (19.666 ms) : 19466, 19866
.   : milestone, 19666,
code_origins (17.583 ms) : 17408, 17759
.   : milestone, 17583,
iast (17.556 ms) : 17380, 17733
.   : milestone, 17556,
profiling (21.256 ms) : 21041, 21471
.   : milestone, 21256,
tracing (17.805 ms) : 17627, 17983
.   : milestone, 17805,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 18.268 ms [18.08 ms, 18.456 ms] -
appsec 18.719 ms [18.529 ms, 18.909 ms] 450.739 µs (2.5%)
code_origins 18.073 ms [17.891 ms, 18.256 ms] -194.614 µs (-1.1%)
iast 17.83 ms [17.653 ms, 18.006 ms] -438.266 µs (-2.4%)
profiling 20.025 ms [19.816 ms, 20.234 ms] 1.757 ms (9.6%)
tracing 17.68 ms [17.503 ms, 17.857 ms] -588.051 µs (-3.2%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 17.383 ms [17.207 ms, 17.558 ms] -
appsec 19.666 ms [19.466 ms, 19.866 ms] 2.283 ms (13.1%)
code_origins 17.583 ms [17.408 ms, 17.759 ms] 200.852 µs (1.2%)
iast 17.556 ms [17.38 ms, 17.733 ms] 173.773 µs (1.0%)
profiling 21.256 ms [21.041 ms, 21.471 ms] 3.873 ms (22.3%)
tracing 17.805 ms [17.627 ms, 17.983 ms] 422.251 µs (2.4%)
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.56.0-SNAPSHOT~f1fcdc0d78, baseline=1.57.0-SNAPSHOT~53aa83cb56
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.262 ms) : 1249, 1275
.   : milestone, 1262,
iast (3.256 ms) : 3212, 3301
.   : milestone, 3256,
iast_FULL (5.618 ms) : 5563, 5673
.   : milestone, 5618,
iast_GLOBAL (3.542 ms) : 3489, 3595
.   : milestone, 3542,
profiling (2.074 ms) : 2056, 2092
.   : milestone, 2074,
tracing (1.826 ms) : 1811, 1841
.   : milestone, 1826,
section candidate
no_agent (1.186 ms) : 1175, 1197
.   : milestone, 1186,
iast (3.284 ms) : 3237, 3331
.   : milestone, 3284,
iast_FULL (5.8 ms) : 5743, 5858
.   : milestone, 5800,
iast_GLOBAL (3.568 ms) : 3516, 3619
.   : milestone, 3568,
profiling (2.006 ms) : 1989, 2023
.   : milestone, 2006,
tracing (1.81 ms) : 1795, 1825
.   : milestone, 1810,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.262 ms [1.249 ms, 1.275 ms] -
iast 3.256 ms [3.212 ms, 3.301 ms] 1.994 ms (158.0%)
iast_FULL 5.618 ms [5.563 ms, 5.673 ms] 4.356 ms (345.1%)
iast_GLOBAL 3.542 ms [3.489 ms, 3.595 ms] 2.28 ms (180.6%)
profiling 2.074 ms [2.056 ms, 2.092 ms] 811.458 µs (64.3%)
tracing 1.826 ms [1.811 ms, 1.841 ms] 563.616 µs (44.7%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.186 ms [1.175 ms, 1.197 ms] -
iast 3.284 ms [3.237 ms, 3.331 ms] 2.098 ms (176.9%)
iast_FULL 5.8 ms [5.743 ms, 5.858 ms] 4.614 ms (388.9%)
iast_GLOBAL 3.568 ms [3.516 ms, 3.619 ms] 2.381 ms (200.7%)
profiling 2.006 ms [1.989 ms, 2.023 ms] 819.76 µs (69.1%)
tracing 1.81 ms [1.795 ms, 1.825 ms] 623.901 µs (52.6%)

Dacapo

Parameters

Parameter Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master dougqh/pending-trace-contention-reduction
git_commit_date 1763665690 1763666268
git_commit_sha 53aa83c f1fcdc0
release_version 1.57.0-SNAPSHOT~53aa83cb56 1.56.0-SNAPSHOT~f1fcdc0d78
See matching parameters
Parameter Baseline Candidate
application biojava biojava
ci_job_date 1763668362 1763668362
ci_job_id 1248773092 1248773092
ci_pipeline_id 83451040 83451040
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-0-pwt21l14 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-0-pwt21l14 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 10 metrics, 2 unstable metrics.

Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.56.0-SNAPSHOT~f1fcdc0d78, baseline=1.57.0-SNAPSHOT~53aa83cb56
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.483 ms) : 1472, 1495
.   : milestone, 1483,
appsec (3.691 ms) : 3475, 3907
.   : milestone, 3691,
iast (2.239 ms) : 2174, 2305
.   : milestone, 2239,
iast_GLOBAL (2.279 ms) : 2213, 2345
.   : milestone, 2279,
profiling (2.091 ms) : 2038, 2145
.   : milestone, 2091,
tracing (2.06 ms) : 2008, 2111
.   : milestone, 2060,
section candidate
no_agent (1.482 ms) : 1471, 1494
.   : milestone, 1482,
appsec (3.66 ms) : 3444, 3876
.   : milestone, 3660,
iast (2.222 ms) : 2157, 2287
.   : milestone, 2222,
iast_GLOBAL (2.269 ms) : 2204, 2335
.   : milestone, 2269,
profiling (2.506 ms) : 2345, 2666
.   : milestone, 2506,
tracing (2.065 ms) : 2014, 2116
.   : milestone, 2065,
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.483 ms [1.472 ms, 1.495 ms] -
appsec 3.691 ms [3.475 ms, 3.907 ms] 2.208 ms (148.8%)
iast 2.239 ms [2.174 ms, 2.305 ms] 756.199 µs (51.0%)
iast_GLOBAL 2.279 ms [2.213 ms, 2.345 ms] 795.804 µs (53.7%)
profiling 2.091 ms [2.038 ms, 2.145 ms] 608.188 µs (41.0%)
tracing 2.06 ms [2.008 ms, 2.111 ms] 576.387 µs (38.9%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.482 ms [1.471 ms, 1.494 ms] -
appsec 3.66 ms [3.444 ms, 3.876 ms] 2.178 ms (147.0%)
iast 2.222 ms [2.157 ms, 2.287 ms] 739.816 µs (49.9%)
iast_GLOBAL 2.269 ms [2.204 ms, 2.335 ms] 787.158 µs (53.1%)
profiling 2.506 ms [2.345 ms, 2.666 ms] 1.024 ms (69.1%)
tracing 2.065 ms [2.014 ms, 2.116 ms] 582.914 µs (39.3%)
Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.56.0-SNAPSHOT~f1fcdc0d78, baseline=1.57.0-SNAPSHOT~53aa83cb56
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.019 s) : 15019000, 15019000
.   : milestone, 15019000,
appsec (14.985 s) : 14985000, 14985000
.   : milestone, 14985000,
iast (18.562 s) : 18562000, 18562000
.   : milestone, 18562000,
iast_GLOBAL (18.083 s) : 18083000, 18083000
.   : milestone, 18083000,
profiling (14.768 s) : 14768000, 14768000
.   : milestone, 14768000,
tracing (14.841 s) : 14841000, 14841000
.   : milestone, 14841000,
section candidate
no_agent (15.774 s) : 15774000, 15774000
.   : milestone, 15774000,
appsec (14.761 s) : 14761000, 14761000
.   : milestone, 14761000,
iast (18.349 s) : 18349000, 18349000
.   : milestone, 18349000,
iast_GLOBAL (17.956 s) : 17956000, 17956000
.   : milestone, 17956000,
profiling (14.608 s) : 14608000, 14608000
.   : milestone, 14608000,
tracing (15.077 s) : 15077000, 15077000
.   : milestone, 15077000,
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.019 s [15.019 s, 15.019 s] -
appsec 14.985 s [14.985 s, 14.985 s] -34.0 ms (-0.2%)
iast 18.562 s [18.562 s, 18.562 s] 3.543 s (23.6%)
iast_GLOBAL 18.083 s [18.083 s, 18.083 s] 3.064 s (20.4%)
profiling 14.768 s [14.768 s, 14.768 s] -251.0 ms (-1.7%)
tracing 14.841 s [14.841 s, 14.841 s] -178.0 ms (-1.2%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.774 s [15.774 s, 15.774 s] -
appsec 14.761 s [14.761 s, 14.761 s] -1.013 s (-6.4%)
iast 18.349 s [18.349 s, 18.349 s] 2.575 s (16.3%)
iast_GLOBAL 17.956 s [17.956 s, 17.956 s] 2.182 s (13.8%)
profiling 14.608 s [14.608 s, 14.608 s] -1.166 s (-7.4%)
tracing 15.077 s [15.077 s, 15.077 s] -697.0 ms (-4.4%)

  }

- private PublishState decrementRefAndMaybeWrite(boolean isRootSpan) {
+ private PublishState decrementRefAndMaybeWrite(boolean isRootSpan, boolean addedSpan) {
Contributor

nitpick: Maybe rename addedSpan to allowPartialWrite

Contributor Author

I was thinking of this more as indicating what action had just been performed, so more like an event.
And then letting decrementRefAndMaybeWrite use the "event" information to determine which checks to perform.

Maybe there will be another situation where we'll handle finish span vs. scope close differently, although probably not.

Comment on lines +289 to +297
// DQH - We only trigger a partial flush, when a span has just been added
// This prevents a bunch of threads which are only performing scope/context operations
// from all fighting to perform the partialFlush after the threshold is crossed.

// This is an important optimization for virtual threads where a continuation might
// be created even though no span is created. In that situation, virtual threads
// can end up fighting to perform the partialFlush. And even trying to perform a
// partialFlush requires taking the PendingTrace lock which can lead to unmounting
// the virtual thread from its carrier thread.
Contributor

I do not know the whole picture, but speaking from experience: what if we did not fight for the flush at all?
Maybe we could refactor the logic so that every thread interested in flushing just sets a flag to true, and a background thread checks it and periodically flushes the data?
Does that make sense at all?

Contributor

Maybe a counter would be a better solution than a boolean flag. It would also be useful to know how many times a flush was requested.
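
A rough sketch of the suggestion above, for discussion only. It is not what this PR implements, and `FlushRequestSketch` and all of its members are hypothetical names. Application threads only bump a counter; a single background thread periodically drains the counter and performs at most one flush per pass.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: flush requests are counted, a background thread does the flushing. */
final class FlushRequestSketch {
  private final AtomicInteger pendingRequests = new AtomicInteger();
  private final AtomicLong totalRequests = new AtomicLong();
  private final Runnable flushAction;

  FlushRequestSketch(Runnable flushAction) {
    this.flushAction = flushAction;
  }

  /** Called from application threads: lock-free, never blocks. */
  void requestFlush() {
    totalRequests.incrementAndGet();
    pendingRequests.incrementAndGet();
  }

  /** Called by the background thread; returns how many requests this pass coalesced. */
  int drainAndFlush() {
    int coalesced = pendingRequests.getAndSet(0);
    if (coalesced > 0) {
      flushAction.run(); // one flush serves all requests seen so far
    }
    return coalesced;
  }

  /** How many times a flush was requested overall. */
  long totalRequests() {
    return totalRequests.get();
  }

  /** Starts a daemon thread that checks for requests at a fixed delay. */
  ScheduledExecutorService startBackground(long periodMillis) {
    ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor(
            r -> {
              Thread t = new Thread(r, "flush-sketch");
              t.setDaemon(true);
              return t;
            });
    executor.scheduleWithFixedDelay(
        this::drainAndFlush, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    return executor;
  }

  public static void main(String[] args) {
    FlushRequestSketch sketch = new FlushRequestSketch(() -> System.out.println("flush"));
    sketch.requestFlush();
    sketch.requestFlush();
    sketch.requestFlush();
    System.out.println("coalesced: " + sketch.drainAndFlush());
    System.out.println("total requests: " + sketch.totalRequests());
  }
}
```

The trade-off in this shape is latency: a partial flush happens up to one period after it was requested, in exchange for keeping the lock off the application's critical path.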

Contributor Author

Yes, I'm inclined to agree. This was mostly intended as a quick fix / experiment to see if we could improve the reported issue. In my macrobenchmark, I did see a 2% throughput improvement but haven't yet replicated the stall that was reported.

I do like the idea of flipping a boolean. I also don't like that we're taking a long-held lock in the application's critical path, so there's definitely still a lot of room for improvement.

@dougqh
Contributor Author

dougqh commented Nov 13, 2025

> I think it's a good optimization.
>
> Looking forward to seeing the microbenchmark results, I suspect it will show a positive improvement when there are a lot of context migrations.

I did a quick stand-alone macrobenchmark. The macrobenchmark shows a modest but consistent 2% reduction in execution time.

@dougqh dougqh enabled auto-merge (squash) November 20, 2025 18:55
@dougqh dougqh merged commit f3f8d86 into master Nov 20, 2025
538 checks passed
@dougqh dougqh deleted the dougqh/pending-trace-contention-reduction branch November 20, 2025 20:19
@github-actions github-actions bot added this to the 1.57.0 milestone Nov 20, 2025
amarziali pushed a commit that referenced this pull request Nov 21, 2025
* Attempting to reduce contention for virtual threads

* spotless

Labels

comp: core (Tracer core), tag: performance (Performance related changes), type: enhancement (Enhancements and improvements)

5 participants