[UserEvents] Add end-to-end runtime test #121316

mdh1418 · 2025-11-03T19:40:13Z

With user_events support added in #115265, this PR adds a test for a basic end-to-end user_events scenario.

Alternative testing approaches considered

Existing EventPipe runtime tests

Existing EventPipe tests under src/tests/tracing/eventpipe are incompatible with testing the user_events scenario due to:

Starting EventPipeSessions through DiagnosticClient ❌
DiagnosticClient cannot send the IPC command that starts a user_events-based EventPipe session, because the command requires the user_events_data file descriptor to be sent using SCM_RIGHTS (see https://github.com/dotnet/diagnostics/blob/main/documentation/design-docs/ipc-protocol.md#passing_file_descriptor).
Using an EventPipeEventSource to validate events streamed through EventPipe ❌
user_events-based EventPipe sessions do not stream events. Instead, events are written to configured TraceFS tracepoints, and currently only RecordTrace from https://github.com/microsoft/one-collect/ is capable of generating .nettrace traces from user_events tracepoints.

Native EventPipe Unit Tests

There are Mono native EventPipe tests under src/mono/mono/eventpipe/test that are not hooked up to CI. These unit tests are built by linking the shared EventPipe interface library against Mono's EventPipe runtime shims and are run with Mono's test runner. Moving them into the standard runtime test structure would require a larger investment: either migrating EventPipe from runtime shims to an OS PAL source shared by coreclr/nativeaot/mono (see #118874 (comment)), or building an EventPipe shared library specifically for the runtime test using a runtime-agnostic shim.
Because the existing Mono unit tests don't currently test IPC commands, and there is no existing runtime infrastructure to read events from tracepoints, testing the user_events scenario would require even more work on top of updating the Mono native EventPipe unit tests.

End-to-End Testing Added

A low-cost approach to testing the .NET runtime's user_events functionality leverages RecordTrace from https://github.com/microsoft/one-collect/, which is already capable of starting user_events-based EventPipe sessions and generating .nettrace files. (Note: dotnet-trace wraps RecordTrace.)
This adds an external dependency, so RecordTrace failures can fail the end-to-end test. However, user_events was initially added with the intent to depend on RecordTrace for the end-to-end scenario, and there is no other way to functionally test a user_events-based EventPipe session.

Approach

Start Tracee app
Start tracing with RecordTrace + dotnet-common profile script
Stop RecordTrace (triggers .nettrace generation) and Tracee app
Validate the .nettrace for particular events from Tracee app

Dependencies:

CI runs the runtime test in an environment that supports user_events
CI runs the runtime test with permissions to access user_events_data
Microsoft.OneCollect.RecordTrace (transitively resolved through a dotnet diagnostics public feed)
Microsoft.Diagnostics.Tracing.TraceEvent 3.1.24+ (to read NetTrace V6)

Copilot

Pull Request Overview

This PR adds a new test for UserEvents tracing on Linux that validates the runtime's ability to emit trace events through the user_events subsystem. The test uses the Microsoft.OneCollect.RecordTrace tool to capture events from a tracee process and validates that GC events were properly recorded.

Key changes include:

Addition of a new test infrastructure for UserEvents tracing
Upgrade of TraceEvent library from version 3.1.16 to 3.1.28
Implementation of multi-process test orchestration with native signal handling

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

File	Description
src/tests/tracing/eventpipe/userevents/usereventstracee.cs	Implements tracee process that generates GC events for validation
src/tests/tracing/eventpipe/userevents/userevents.csproj	Project configuration including NuGet package references and build targets
src/tests/tracing/eventpipe/userevents/userevents.cs	Main test orchestration: spawns processes, collects traces, validates events
src/tests/tracing/eventpipe/userevents/dotnet-common.script	Configuration script for record-trace tool specifying provider and flags
eng/Versions.props	Updates TraceEvent package version to 3.1.28

jkotas · 2025-11-03T21:00:06Z

a basic end-to-end user_events scenario

I like this approach.

noahfalk · 2025-11-03T23:46:19Z

src/tests/tracing/eventpipe/userevents/userevents.cs

+        }
+
+        public static int TestEntryPoint()
+        {


We should add checks for:

process is elevated
OS is Linux
user_events are supported

It's likely that at some point this test will be run in the wrong environment, and the logs should make it trivial to diagnose.

Maybe we should not even build the test on non-Linux platforms. The CLRTestTargetUnsupported msbuild property could be used to exclude a test on specific platforms.

CLRTestTargetUnsupported is set in the csproj, so it should hopefully prevent this test from running on platforms other than linux-x64/linux-arm64. Then again, I think more logic is needed to check for Alpine.

Added checks for geteuid and for whether /sys/kernel/tracing/user_events_data exists.
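
As a rough illustration of the three checks requested in this thread, the following is a minimal sketch, not the code that was added to userevents.cs. The class and method names (UserEventsEnvironment, DescribeMissingRequirement) are hypothetical. The sketch assumes .NET 8 or later and uses Environment.IsPrivilegedProcess, which on Unix reports whether the effective user ID is 0, in place of a direct geteuid P/Invoke.

```csharp
#nullable enable
using System;
using System.IO;

public static class UserEventsEnvironment
{
    // Default tracefs location of the user_events data file.
    public const string UserEventsDataPath = "/sys/kernel/tracing/user_events_data";

    // Returns null when all requirements appear to be met; otherwise returns a
    // message naming the first failed requirement so the test log is easy to read.
    public static string? DescribeMissingRequirement(string userEventsDataPath = UserEventsDataPath)
    {
        if (!OperatingSystem.IsLinux())
        {
            return "user_events is only available on Linux.";
        }

        // Assumption: elevated means effective UID 0, which is what
        // Environment.IsPrivilegedProcess reports on Unix.
        if (!Environment.IsPrivilegedProcess)
        {
            return "The test must run elevated (effective UID 0).";
        }

        if (!File.Exists(userEventsDataPath))
        {
            return $"'{userEventsDataPath}' was not found; the kernel may not support user_events.";
        }

        return null;
    }
}
```

A test entry point could log the returned message and exit early when it is non-null, which keeps a wrong-environment run easy to tell apart from a genuine failure.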

noahfalk · 2025-11-03T23:52:51Z

src/tests/tracing/eventpipe/userevents/userevents.cs

+
+        private static bool ValidateTraceeEvents(string traceFilePath)
+        {
+            string etlxPath = TraceLog.CreateFromEventPipeDataFile(traceFilePath);


You can parse the .nettrace file directly using EventPipeEventSource in TraceEvent. This avoids creating a 2nd file that the test also needs to clean up.

When I last tried it, it didn't work for some reason (pilot error). I've now switched to using EventPipeEventSource. Right now the events show up as "unknown" with an id. Maybe it's because I'm using the Dynamic parser? I'll look into TraceEvent more closely.

noahfalk · 2025-11-03T23:57:52Z

src/tests/tracing/eventpipe/userevents/userevents.cs

+            recordTraceStartInfo.RedirectStandardError = true;
+
+            using Process traceeProcess = Process.Start(traceeStartInfo);
+            using Process recordTraceProcess = Process.Start(recordTraceStartInfo);


To ensure the tracer observes the tracee, we should start the tracer process first.

Switched. Originally I was wondering if we should pass --pid {traceePid}, but process isolation should prevent this test from tracing another runtime test and mistaking its events for the tracee's if the tracee process crashes. But I guess even in that case... it would show that user_events were collected.

noahfalk · 2025-11-04T00:00:46Z

src/tests/tracing/eventpipe/userevents/usereventstracee.cs

+        public static void Run()
+        {
+            long startTimestamp = Stopwatch.GetTimestamp();
+            long targetTicks = Stopwatch.Frequency * 10; // 10s


How about 1 second instead? Ideally we want tests to run quickly whenever possible.
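
To make the suggestion concrete, here is a minimal sketch of a tracee loop whose duration is a parameter, so it can be set to 1 second rather than 10. The names TraceeLoop and Run(TimeSpan) are hypothetical, and the use of GC.Collect() to produce GC events is an assumption; the PR's actual tracee may generate events differently.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class TraceeLoop
{
    // Requests garbage collections until `duration` has elapsed and returns
    // the number of collections that were requested.
    public static int Run(TimeSpan duration)
    {
        long startTimestamp = Stopwatch.GetTimestamp();
        long targetTicks = (long)(Stopwatch.Frequency * duration.TotalSeconds);
        int collections = 0;

        while (Stopwatch.GetTimestamp() - startTimestamp < targetTicks)
        {
            // Assumption: an induced GC is enough to emit GC start/stop events.
            GC.Collect();
            collections++;

            // Short pause so the loop does not spin the CPU.
            Thread.Sleep(50);
        }

        return collections;
    }
}
```

Calling TraceeLoop.Run(TimeSpan.FromSeconds(1)) matches the shorter duration proposed above.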

lateralusX · 2025-11-04T09:04:23Z

src/tests/tracing/eventpipe/userevents/userevents.cs

+            bool startEventFound = false;
+            bool stopEventFound = false;
+
+            source.AllEvents += (TraceEvent e) =>


Is the plan to add more tests here that check for metadata/fields, rundown events, callstacks, etc.? Or should we add some basic verification to this test, or plan to extend the existing EventPipe tests to also work over UserEvents?

This was mainly to add basic verification that the end-to-end runtime side works (from accepting the IPC message to writing to the tracepoints). Since user_events is built on EventPipe, my initial thought is that duplicating the existing EventPipe tests for user_events wouldn't add anything. I think we can add more tests later on, but I'm not sure what coverage is good and reasonable. I'm not even sure yet whether our CI machines have user_events, or whether they run with elevated privileges, so this was mainly to see if we can get a basic E2E test going.

We should definitely not duplicate, but maybe look at extending the existing EventPipe tests to run over user_events with additional validation logic. I agree that is something we could look at later.

So, if this is mainly a smoke test, then maybe we should make sure we at least hit the things we know are handled in the one-collect library. During that work we hit a number of things that needed special attention, like activity IDs, custom metadata, and potential stack traces. Right now, this only tests one runtime start/stop event fired under very unique circumstances. Maybe we should do a short multi-threading scenario as well, to make sure we won't hit any races in the code path unique to user_events?

mdh1418 · 2025-11-26T05:58:00Z

It looks like the .NET runtime events aren't being captured in the .nettrace because a session isn't actually being started.
On Helix machines, the diagnostic port is created under the temp directory of Helix's provisioned environment, which is of the form /datadisks/disk1/work/<workID>/t/. RecordTrace currently scans only /tmp/ for these diagnostic ports. I'm planning to add a config value for EventPipe/user_events debugging that enables more stress logs, to better diagnose whether the point of failure is on the runtime side or external.
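
To help confirm this kind of mismatch from a test log, the sketch below lists diagnostic port sockets in a set of candidate directories. It is a hypothetical diagnostic helper, not RecordTrace's discovery logic. It assumes the runtime's documented socket naming of dotnet-diagnostic-{pid}-{key}-socket and assumes the Helix temp directory is exposed through the TMPDIR environment variable; DiagnosticPortFinder and its members are illustrative names.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.IO;

public static class DiagnosticPortFinder
{
    // Directories where a diagnostic port may live: TMPDIR if set, then /tmp.
    public static IEnumerable<string> CandidateDirectories()
    {
        string? tmpDir = Environment.GetEnvironmentVariable("TMPDIR");
        if (!string.IsNullOrEmpty(tmpDir))
        {
            yield return tmpDir;
        }

        yield return "/tmp";
    }

    // Returns every entry matching the diagnostic port naming pattern in the
    // given directories. Directories that do not exist are skipped.
    public static IReadOnlyList<string> FindPorts(IEnumerable<string> directories)
    {
        var found = new List<string>();
        foreach (string directory in directories)
        {
            if (!Directory.Exists(directory))
            {
                continue;
            }

            found.AddRange(Directory.EnumerateFiles(directory, "dotnet-diagnostic-*-socket"));
        }

        return found;
    }
}
```

Logging DiagnosticPortFinder.FindPorts(DiagnosticPortFinder.CandidateDirectories()) before starting the tracer would show whether the tracee's port sits outside /tmp/.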

[UserEvents] Add end-to-end runtime test

47231e1

mdh1418 requested review from Copilot, hoyosjs, jkoritzinsky, jkotas, lateralusX, noahfalk and steveisok November 3, 2025 19:40

github-actions bot added the area-Tracing-coreclr label Nov 3, 2025

dotnet-policy-service bot assigned mdh1418 Nov 3, 2025

mdh1418 requested a review from akoeplinger November 3, 2025 19:42

Copilot AI reviewed Nov 3, 2025

jkotas reviewed Nov 3, 2025

Merge remote-tracking branch 'upstream/main' into user_events_functio…

963efe1

…nal_runtime_test

jkotas reviewed Nov 3, 2025

mdh1418 added 4 commits November 3, 2025 20:31

Cleanup test props

e712df6

Add ProcessStartInfo and using keyword

2b246be

Fix process exit detection

a7afeb9

Retain nettrace test asset

c41916a

noahfalk reviewed Nov 4, 2025

lateralusX reviewed Nov 4, 2025

mdh1418 added 2 commits November 7, 2025 02:57

Address feedback

963af3f

Fix RecordTrace package resolving

005cc94

This was referenced Nov 12, 2025

[browser] Wasm.Build.Tests timeout - Timed out after 10s waiting for 'WASM EXIT' #116697

Open

Unable to pull image from mcr.microsoft.com #117164

Open

mdh1418 force-pushed the user_events_functional_runtime_test branch from edcad95 to d81ac77 Compare November 12, 2025 18:24

Attempt to copy userevents assets for CI

9009189

mdh1418 force-pushed the user_events_functional_runtime_test branch 5 times, most recently from bb6923e to b1d4234 Compare November 13, 2025 15:31

Enable Helix tests to run record-trace

7395930

mdh1418 force-pushed the user_events_functional_runtime_test branch from b1d4234 to 7395930 Compare November 13, 2025 17:43

This was referenced Nov 13, 2025

browser-wasm linux Release LibraryTests queues timing out #117974

Open

[android] Android.Device_Emulator.JIT.Test failing on emulators with CoreCLR #112633

Open

Timeout in HostFactoryResolverTests.NoSpecialEntryPointPatternCanRunInParallel #114704

Open

Add requirements check and upload trace

10b7e42

build-analysis bot mentioned this pull request Nov 18, 2025

SIGSEGV in System.Net.Quic.Functional.Tests #121575

Open

Debugging event emission

8c28d62

mdh1418 force-pushed the user_events_functional_runtime_test branch from 0286358 to 8c28d62 Compare November 18, 2025 18:35

mdh1418 added 3 commits November 25, 2025 16:57

Add EP log macros

0d7bf28

Add stress logs to user_events setup and write paths

31eac0f

Set logging env for tracee

55071a8

mdh1418 requested review from MichalStrehovsky and vitek-karas as code owners November 25, 2025 20:46

mdh1418 added 6 commits November 26, 2025 01:06

[TEMP] disable all other pipelines

9bc0e94

Add server_loop logs

5103de3

Merge remote-tracking branch 'upstream/main' into user_events_functio…

6fe61d2

…nal_runtime_test

comment out another fixup

c7b93e6

Lets print the port too

391858d

try to limit runtime tests

cd48fdf
