@p-datadog
Member

@p-datadog p-datadog commented Nov 20, 2025

What does this PR do?
Changes telemetry so that events are also sent from forked child processes.

Motivation:
Dynamic Instrumentation / Live Debugger require telemetry app-heartbeat events to render the UI properly. In forking web servers these events should be sent from the forked children, and they are presently missing for most customers.
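
For illustration only, here is a minimal sketch of how a component could notice that it is running in a forked child and attribute heartbeats to the child. `ForkAwareHeartbeat`, `pid_source` and `sent` are hypothetical names made up for this example; this is not the dd-trace-rb implementation, which sends events through a background worker.

```ruby
# Hypothetical sketch: ForkAwareHeartbeat is illustrative, not a dd-trace-rb class.
class ForkAwareHeartbeat
  # Recorded [event, pid] pairs; a real worker would send these over the network.
  attr_reader :sent

  # pid_source is injectable so the behavior can be exercised without forking.
  def initialize(pid_source: -> { Process.pid })
    @pid_source = pid_source
    @owner_pid = pid_source.call
    @sent = []
  end

  # True when the current process is not the one that created this object.
  def forked?
    @pid_source.call != @owner_pid
  end

  # Records a heartbeat for the current process, adopting the child's pid
  # the first time it is called after a fork.
  def heartbeat
    @owner_pid = @pid_source.call if forked?
    @sent << [:app_heartbeat, @owner_pid]
  end
end
```

Comparing the current pid with the pid captured at construction time is one common way to detect a fork. Ruby only carries the forking thread over into the child, so a background thread started in the parent has to be started again there.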

Change log entry
Yes: fix Live Debugger / Dynamic Instrumentation UI for forking web servers

Additional Notes:
Telemetry has some special logic to deal with app-started/configuration-change events, because dd-trace-rb creates the component tree multiple times while system tests expect a single set of events.

This PR extends that logic so that forked children switch back from configuration-change events to app-started. Each forked child is reported as a brand new process.
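
As a rough sketch of the behavior described above (not the actual code in this PR), the choice of initial event can be modeled as a small state machine. `InitialEventSelector`, `next_initial_event` and `after_fork` are hypothetical names used only for this example, and the symbols stand in for the real event classes.

```ruby
# Hypothetical sketch; names are illustrative and not part of dd-trace-rb's API.
class InitialEventSelector
  def initialize
    @app_started_sent = false
  end

  # Returns the kind of initial event to emit for a newly built component tree.
  def next_initial_event
    if @app_started_sent
      # The component tree was rebuilt in the same process:
      # report the configuration as a change, not as a new application start.
      :app_client_configuration_change
    else
      @app_started_sent = true
      :app_started
    end
  end

  # Intended to run in a forked child: the child is reported as a brand new
  # process, so its next initial event is app-started again.
  def after_fork
    @app_started_sent = false
  end
end

selector = InitialEventSelector.new
selector.next_initial_event # => :app_started
selector.next_initial_event # => :app_client_configuration_change
selector.after_fork
selector.next_initial_event # => :app_started
```

Keeping the "already sent" flag per process is what lets the parent report a single app-started event while each forked child still reports one of its own.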

How to test the change?

Tests added

@github-actions github-actions bot added core Involves Datadog core libraries profiling Involves Datadog profiling labels Nov 20, 2025
@github-actions

github-actions bot commented Nov 20, 2025

Thank you for updating Change log entry section 👏

Visited at: 2025-11-20 18:37:32 UTC

@github-actions

github-actions bot commented Nov 20, 2025

Typing analysis

Note: Ignored files are excluded from the next sections.

steep:ignore comments

This PR introduces 1 steep:ignore comment.

steep:ignore comments (+1-0)
Introduced:
lib/datadog/core/telemetry/worker.rb:284

Untyped methods

This PR introduces 1 untyped method and 5 partially typed methods, and clears 1 untyped method and 5 partially typed methods. It increases the percentage of typed methods from 54.48% to 54.62% (+0.14%).

Untyped methods (+1-1)
Introduced:
sig/datadog/core/telemetry/worker.rbs:61
└── def buffer_klass: () -> untyped
Cleared:
sig/datadog/core/telemetry/worker.rbs:60
└── def buffer_klass: () -> untyped
Partially typed methods (+5-5)
Introduced:
sig/datadog/core/telemetry/event/app_started.rbs:19
└── def configuration: (untyped settings, Core::Configuration::AgentSettings agent_settings) -> Array[Hash[Symbol, untyped]]
sig/datadog/core/telemetry/event/app_started.rbs:23
└── def conf_value: (String name, untyped value, Integer seq_id, String origin) -> Hash[Symbol, untyped]
sig/datadog/core/telemetry/event/app_started.rbs:27
└── def install_signature: (untyped settings) -> Hash[Symbol, Object]
sig/datadog/core/telemetry/event/app_started.rbs:29
└── def get_telemetry_origin: (untyped settings, String config_path) -> String
sig/datadog/core/telemetry/event/synth_app_client_configuration_change.rbs:8
└── def payload: () -> { ?products: untyped, configuration: untyped, ?install_signature: untyped }
Cleared:
sig/datadog/core/telemetry/event/app_started.rbs:17
└── def configuration: (untyped settings, Core::Configuration::AgentSettings agent_settings) -> Array[Hash[Symbol, untyped]]
sig/datadog/core/telemetry/event/app_started.rbs:21
└── def conf_value: (String name, untyped value, Integer seq_id, String origin) -> Hash[Symbol, untyped]
sig/datadog/core/telemetry/event/app_started.rbs:25
└── def install_signature: (untyped settings) -> Hash[Symbol, Object]
sig/datadog/core/telemetry/event/app_started.rbs:27
└── def get_telemetry_origin: (untyped settings, String config_path) -> String
sig/datadog/core/telemetry/event/synth_app_client_configuration_change.rbs:8
└── def payload: () -> { configuration: untyped }

If you believe a method or an attribute is rightfully untyped or partially typed, you can add # untyped:accept to the end of the line to remove it from the stats.

@p-datadog p-datadog force-pushed the telemetry-in-children branch from 9176330 to 1e15731 Compare November 20, 2025 18:02
@p-datadog p-datadog marked this pull request as ready for review November 20, 2025 18:37
@p-datadog p-datadog requested review from a team as code owners November 20, 2025 18:37
@p-datadog p-datadog requested a review from mabdinur November 20, 2025 18:37
@datadog-datadog-prod-us1
Contributor

datadog-datadog-prod-us1 bot commented Nov 20, 2025

✅ Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage
Patch Coverage: 84.80%
Total Coverage: 95.15% (-0.03%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 114b87e

@pr-commenter

pr-commenter bot commented Nov 20, 2025

Benchmarks

Benchmark execution time: 2025-11-26 02:06:27

Comparing candidate commit 114b87e in PR branch telemetry-in-children with baseline commit ae94edc in branch master.

Found 1 performance improvement and 0 performance regressions! Performance is the same for 43 metrics, 2 unstable metrics.

scenario:profiling - intern_all 1000 repeated strings

  • 🟩 throughput [+2583.423op/s; +2657.976op/s] or [+10.698%; +11.006%]

@mabdinur mabdinur requested a review from khanayan123 November 24, 2025 14:41
p added 3 commits November 24, 2025 10:58
* master:
  [APMAPI-1774] Fix stable config segfault during error handling (#5073)
  Disable Ruby 3.5 preview1 testing
  [🤖] Update Latest Dependency: https://github.com/DataDog/dd-trace-rb/actions/runs/19603081078
  [🤖] Update System Tests: https://github.com/DataDog/dd-trace-rb/actions/runs/19603099930
  [NO-TICKET] Workaround profiling benchmark flakiness on Ruby 2.6
  DI: make a test method longer to avoid flakiness (#5069)
Member

@marcotc marcotc left a comment

I left one comment regarding internal changes (refactoring), but the functionality looks sound!

@p-datadog
Member Author

I added a new documentation file that contains all of the info I think is relevant to telemetry development (initial event handling, fork handling, event submission prior to worker start).

@p-datadog p-datadog merged commit 18e70e4 into master Nov 27, 2025
544 checks passed
@p-datadog p-datadog deleted the telemetry-in-children branch November 27, 2025 16:53
@github-actions github-actions bot added this to the 2.23.0 milestone Nov 27, 2025
Strech added a commit that referenced this pull request Dec 10, 2025
p-datadog added a commit to p-datadog/dd-trace-rb that referenced this pull request Dec 16, 2025
p-datadog pushed a commit that referenced this pull request Dec 19, 2025
* telemetry-fork-2:
  Move flushing to core worker modules to fix open feature worker tests
  fix test, add another note
  fix worker race and add tests
  Telemetry: send events in forked children (#5074)
p-datadog pushed a commit that referenced this pull request Dec 20, 2025
…ate', 'u/fix-ruby-warnings', 'u/di-repeated-adds', 'u/di-probe-addition', 'u/rc-diagnostics', 'u/di-probe-removal-integration-test', 'u/transport-api-version', 'u/di-duration-flake-3' and 'u/telemetry-fork-2' into base

* u/fix-quoting:
  fix quoting

* u/after-fork-test-state:
  standard
  fix process discovery spec relying on global state
  fix global state dependency in crashtracking test

* u/fix-ruby-warnings:
  fix ruby warnings when accessing undefined instance variables

* u/di-repeated-adds:
  DEBUG-3499 DI: do not instrument when there is already an installed probe with the same id

* u/di-probe-addition:
  type
  fix ruby warning
  type
  standard
  DI: fix accounting when intrumenting upon class definition, add instrumentation leak detector

* u/rc-diagnostics:
  permit version to be missing
  note on custom
  RC: add diagnostics for invalid values

* u/di-probe-removal-integration-test:
  mark as di test
  fix test with RC changes backed out
  DI: rework remote config interface to use changes

* u/transport-api-version:
  type
  type
  Transports: remove api_version

* u/di-duration-flake-3:
  set DI test duration upper bound to 1000 seconds

* u/telemetry-fork-2:
  metrics fix
  expect_in_fork debugging
  fix ruby warnings when accessing undefined instance variables
  explain
  Telemetry: send events in forked children (#5074)
  investigating flaky test
  add assertions
  skip on jruby 9.2
  enable skipped tests potentially reproducing the race
  switch to keyword arg
  Move flushing to core worker modules to fix open feature worker tests
  fix test, add another note
  fix worker race and add tests
p-datadog pushed a commit that referenced this pull request Dec 29, 2025
* commit '10f07a270af2a866cf0b290c4f26b30f0d81a509':
  do everything from at fork monkey patch
  debugging helper
  expect_in_fork debugging
  metrics fix
  Create a helper for platform restriction for forking tests
  explain
  fix ruby warnings when accessing undefined instance variables
  Telemetry: send events in forked children (#5074)
  standard
  standard
  standard
  lets say this is no longer known flaky
  improve jruby exclusion in file descriptor leakage check
  retest
p-datadog pushed a commit that referenced this pull request Dec 29, 2025
* commit '1aa50c469ca15228f2633acc07cde92ee5c027cb':
  update tests
  fix double definition
  do everything from at fork monkey patch
  expect_in_fork debugging
  metrics fix
  Create a helper for platform restriction for forking tests
  explain
  fix ruby warnings when accessing undefined instance variables
  Telemetry: send events in forked children (#5074)
p-datadog pushed a commit that referenced this pull request Dec 30, 2025
…ece2f2598aec3d6944db49eec07d89799ab6'; commit 'ddb2e5ecc4be3c00b3937ee80e89b396abcc69d3' into base

* commit 'bc0dc00cba9d265a980d078b112f957dcaa1372c':
  debug-4548 Increase number of iterations for flakiness

* commit '2727ece2f2598aec3d6944db49eec07d89799ab6':
  improve diagnostics of "leaked" file descriptors for jruby

* commit 'ddb2e5ecc4be3c00b3937ee80e89b396abcc69d3':
  forking platform only
  standard
  specify host explicitly for ci
  standard
  remove debug
  standard
  steep
  remove debug
  update tests
  fix double definition
  do everything from at fork monkey patch
  expect_in_fork debugging
  metrics fix
  Create a helper for platform restriction for forking tests
  explain
  fix ruby warnings when accessing undefined instance variables
  Telemetry: send events in forked children (#5074)
Labels

core Involves Datadog core libraries profiling Involves Datadog profiling

5 participants