Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nimbler tracing #9147

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions Documentation/guides/native-tracing.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
# Tracing Android for .NET applications with built-in capabilities
11 changes: 11 additions & 0 deletions Documentation/workflow/SystemProperties.md
Original file line number Diff line number Diff line change
Expand Up @@ -171,6 +171,17 @@ categories:
method execution timing which is written to a file named
`methods.txt`. `timing=bare` should be used in preference to this
category.
* `native-tracing`
Enable built-in tracing capabilities, using default settings.
Tracing settings can be tuned using the [debug.mono.native-tracing](#debugmononative-tracing)
property.

### debug.mono.native-tracing

[Full documentation](../guides/native-tracing.md)

Available options:


#### Timing events format

Expand Down
103 changes: 103 additions & 0 deletions TODO.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,103 @@
# Introduction

We have a rudimentary call tracing system in place which can record native, Java and managed (sort of)
stack traces when necessary. However, this system was quickly slapped up together in order to try
to find out why marshal methods break Blazor apps. The system doesn't optimized collection and delivery,
making it heavy (e.g. it relies on logcat to log large stack traces) and hard to read the results.

This branch is an attempt to fix the above downsides and implement a generic (as far as we are concerned -
not for arbitrary use in other products) framework to store timed events in a thread-safe manner and with
as little work at runtime at possible, but with options to do more when necessary.

# Design goals and notes

## Collection

Records are stored in pre-allocated buffers, one per thread. Buffer is allocated at the time when thread
attaches to the runtime, its pointer is stored in TLS as well as in some central structure for further
processing. At runtime, the pointer from TLS is used to store events, thus enable lockless operation.

The collection process can be started at the following points:

* startup of the application
* after an initial delay
* by a p/invoke at the app discretion
* by an external signal/intent (signal might be faster)

The collection process can be stopped at the following points:

* exit from the application
* after a designated delay from the start
* by a p/invoke at the app discretion
* by an external signal/intent

`Buffer` is used loosely here, the collected data may be called in a linked list or some other form of
container.

Each trace point may indicate that it wants to store one or more stack traces (native, Java and managed)

### Native call traces

If call stack trace is to be collected, gather:

1. Shared library name or address, if name not available (we might be able to get away with using just
the address, if we can gleam the address from memory map post-mortem)
2. entry point address

### Java call traces

Those will require some JNI work as it might not be possible to map stack frame addresses to Java methods
post-mortem.

### Managed call traces

Might be heavy, may require collecting all the info at run time.

## Delivery

At the point where collection is stopped, the results are dumped to a location on device, into a single
structured file and optionally compressed (`lz4` probably)

The dumped data should contain enough information to identify native frames, most likely the process
memory map(s).

Events collected from threads are stored in separate "streams" in the output file, no ordering or processing
is done at this point.

## Time stamping

Each event is time-stamped using the highest resolution clock available. No ordering of events is attempted
at run time.

## Multi-threaded collection

No locks should be used, pointer to buffer stored in thread's TLS and in some central structure for use when
dumping.

## Managed call tracing

While MonoVM runtime can let us know when every method is called, we don't want to employ this technique here
because it would require filtering out the methods we're not interested in at run time, and that means string
allocation, comparison - too much time wasted. Instead we will have a system in place that makes the whole
tracing nimbler.

### Assembly and type tracing

Each trace record must identify the assembly the method is in, the class token and the method token. Assemblies are
not identified by their MVVID, but rather by an ordinal number assigned to them at build time and stored somewhere
in `obj/` for future reference. Each instrumented method has hard-coded class and method tokens within that
assembly.

### Filters

We'll provide a way to provide a filter to instrument only the desired methods/classes. Filters apply to assemblies
after linking but **before** AOT since we must instrument the methods before AOT processes them. Three kinds
of targets:

* `Methods in types`.
A regex might be a good idea here.
* `Full class name`.
By default all type methods are instrumented, should be possible to exclude some (regex)
* `Marshal method wrappers`.
They are **not** covered by the above two kinds, have to be enabled explicitly. By default all are instrumented,
possible to filter with a regex. Both inclusion and exclusion must be supported.
10 changes: 10 additions & 0 deletions src/Xamarin.Android.Build.Tasks/ManifestOverlays/Tracing.xml
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android" xmlns:tools="http://schemas.android.com/tools">
<application>
<receiver android:name="mono.android.app.StopTracingAndDumpData" tools:node="replace" android:exported="true">
<intent-filter>
<action android:name="mono.android.app.STOP_TRACING_AND_DUMP_DATA"/>
</intent-filter>
</receiver>
</application>
</manifest>
Original file line number Diff line number Diff line change
Expand Up @@ -160,6 +160,9 @@
<None Include="ManifestOverlays\Timing.xml">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
<None Include="ManifestOverlays\Tracing.xml">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
</ItemGroup>
<ItemGroup>
<_SharedRuntimeAssemblies Include="@(MonoProfileAssembly->'$(_SharedRuntimeBuildPath)v1.0\%(Identity)')" />
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -1524,6 +1524,10 @@ because xbuild doesn't support framework reference assemblies.
Include="$(MSBuildThisFileDirectory)\ManifestOverlays\Timing.xml"
Condition=" '$(_AndroidFastTiming)' == 'True' "
/>
<AndroidManifestOverlay
Include="$(MSBuildThisFileDirectory)\ManifestOverlays\Tracing.xml"
Condition=" '$(_AndroidEnableNativeStackTracing)' == 'True' "
/>
</ItemGroup>
<ManifestMerger
ToolPath="$(JavaToolPath)"
Expand Down
1 change: 1 addition & 0 deletions src/java-runtime/java/mono/android/Runtime.java
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@ public static native void initInternal (
public static native void destroyContexts (int[] contextIDs);
public static native void propagateUncaughtException (Thread javaThread, Throwable javaException);
public static native void dumpTimingData ();
public static native void stopTracingAndDumpData ();
}

final class XamarinUncaughtExceptionHandler implements Thread.UncaughtExceptionHandler {
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
package mono.android.app;

public class StopTracingAndDumpData extends android.content.BroadcastReceiver {
@Override
public void onReceive (android.content.Context context, android.content.Intent intent) {
mono.android.Runtime.stopTracingAndDumpData ();
}
}
8 changes: 8 additions & 0 deletions src/native/monodroid/mono_android_Runtime.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 7 additions & 1 deletion src/native/monodroid/monodroid-glue-internal.hh
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
#include "cpp-util.hh"
#include "xxhash.hh"
#include "monodroid-dl.hh"
#include "tracing.hh"

#include <mono/utils/mono-counters.h>
#include <mono/metadata/profiler.h>
Expand Down Expand Up @@ -209,8 +210,9 @@ namespace xamarin::android::internal

void gather_bundled_assemblies (jstring_array_wrapper &runtimeApks, size_t *out_user_assemblies_count, bool have_split_apks);
static bool should_register_file (const char *filename);
void set_trace_options ();
void set_mono_jit_trace_options ();
void set_profile_options ();
void initialize_native_tracing ();

void log_jit_event (MonoMethod *method, const char *event_name);
static void jit_begin (MonoProfiler *prof, MonoMethod *method);
Expand Down Expand Up @@ -255,6 +257,10 @@ namespace xamarin::android::internal
timing_period jit_time;
FILE *jit_log;
MonoProfilerHandle profiler_handle;
TracingAutoStartMode tracing_auto_start_mode = TracingAutoStartMode::None;
TracingAutoStopMode tracing_auto_stop_mode = TracingAutoStopMode::None;
size_t tracing_start_delay_ms = 0;
size_t tracing_stop_delay_ms = 0;

/*
* If set, monodroid will spin in a loop until the debugger breaks the wait by
Expand Down
114 changes: 112 additions & 2 deletions src/native/monodroid/monodroid-glue.cc
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
#include <array>
#include <charconv>
#include <cctype>
#include <cerrno>
#include <cstdarg>
Expand Down Expand Up @@ -1040,7 +1041,7 @@ MonodroidRuntime::set_debug_env_vars (void)
#endif /* DEBUG */

inline void
MonodroidRuntime::set_trace_options (void)
MonodroidRuntime::set_mono_jit_trace_options (void)
{
dynamic_local_string<PROPERTY_VALUE_BUFFER_LEN> value;
if (AndroidSystem::monodroid_get_system_property (SharedConstants::DEBUG_MONO_TRACE_PROPERTY, value) == 0)
Expand All @@ -1049,6 +1050,108 @@ MonodroidRuntime::set_trace_options (void)
mono_jit_set_trace_options (value.get ());
}

inline void
MonodroidRuntime::initialize_native_tracing ()
{
if (!Logger::native_tracing_enabled ()) [[likely]] {
return;
}

dynamic_local_string<PROPERTY_VALUE_BUFFER_LEN> value;
if (AndroidSystem::monodroid_get_system_property (SharedConstants::DEBUG_MONO_NATIVE_TRACING, value) == 0 || value.empty ()) {
tracing_auto_start_mode = TracingAutoStartMode::Startup;
tracing_auto_stop_mode = TracingAutoStopMode::DelayFromStart;
tracing_stop_delay_ms = TracingConstants::DEFAULT_STOP_DELAY_MS;
return;
}

constexpr std::string_view param_start_mode_startup { "start-mode=startup" };
constexpr std::string_view param_start_mode_delay { "start-mode=delay" };
constexpr std::string_view param_start_mode_justinit { "start-mode=just-init" };

constexpr std::string_view param_stop_mode_delay { "stop-mode=delay" };
constexpr std::string_view param_stop_mode_absolute_delay { "stop-mode=absolute-delay" };

constexpr std::string_view param_start_delay { "start-delay=" };
constexpr std::string_view param_stop_delay { "stop-delay=" };

string_segment param;
while (value.next_token (',', param)) {
if (param.equal (param_start_mode_startup)) {
tracing_auto_start_mode = TracingAutoStartMode::Startup;
continue;
}

if (param.equal (param_start_mode_delay)) {
tracing_auto_start_mode = TracingAutoStartMode::Delay;
tracing_start_delay_ms = TracingConstants::DEFAULT_START_DELAY_MS;
continue;
}

if (param.equal (param_start_mode_justinit)) {
tracing_auto_start_mode = TracingAutoStartMode::JustInit;
continue;
}

if (param.equal (param_stop_mode_delay)) {
tracing_auto_stop_mode = TracingAutoStopMode::DelayFromStart;
tracing_stop_delay_ms = TracingConstants::DEFAULT_STOP_DELAY_MS;
continue;
}

if (param.equal (param_stop_mode_absolute_delay)) {
tracing_auto_stop_mode = TracingAutoStopMode::AbsoluteDelay;
tracing_stop_delay_ms = TracingConstants::DEFAULT_STOP_DELAY_MS;
continue;
}

auto convert_delay = [](string_segment const& s, size_t start, size_t default_value) {
if (s.length () <= start) {
log_warn (
LOG_DEFAULT,
"Expected value in tracing setting '%s', using the default value of %zums",
s.start (),
default_value
);
return default_value;
}

size_t ret{};
auto [ptr, errorc] = std::from_chars (s.start () + start, s.start () + s.length (), ret);
if (errorc == std::errc ()) {
return ret;
}

if (errorc == std::errc::invalid_argument) {
log_warn (
LOG_DEFAULT,
"Tracing setting value is not a decimal integer: %s. Using the default value of %zums",
s.start (),
default_value
);
} else if (errorc == std::errc::result_out_of_range) {
log_warn (
LOG_DEFAULT,
"Tracing setting value exceeds the maximum allowed one (%zu): %s. Using the default value of %zums",
std::numeric_limits<size_t>::max (),
s.start (),
default_value
);
}

return default_value;
};

if (param.starts_with (param_start_delay)) {
tracing_start_delay_ms = convert_delay (param, param_start_delay.length () + 1, TracingConstants::DEFAULT_START_DELAY_MS);
}

if (param.starts_with (param_stop_delay)) {
tracing_stop_delay_ms = convert_delay (param, param_stop_delay.length () + 1, TracingConstants::DEFAULT_STOP_DELAY_MS);
}
}
}

inline void
MonodroidRuntime::set_profile_options ()
{
Expand Down Expand Up @@ -1428,6 +1531,7 @@ MonodroidRuntime::Java_mono_android_Runtime_initInternal (JNIEnv *env, jclass kl
mono_trace_set_level_string (mono_log_level.get ());
}

initialize_native_tracing ();
setup_mono_tracing (mono_log_mask, have_log_assembly, have_log_gc);
install_logging_handlers ();

Expand All @@ -1442,7 +1546,7 @@ MonodroidRuntime::Java_mono_android_Runtime_initInternal (JNIEnv *env, jclass kl

set_profile_options ();

set_trace_options ();
set_mono_jit_trace_options ();

#if defined (DEBUG)
debug.start_debugging_and_profiling ();
Expand Down Expand Up @@ -1614,6 +1718,12 @@ JNICALL Java_mono_android_Runtime_dumpTimingData ([[maybe_unused]] JNIEnv *env,
internal_timing->dump ();
}

JNIEXPORT void
JNICALL Java_mono_android_Runtime_stopTracingAndDumpData ([[maybe_unused]] JNIEnv *env, [[maybe_unused]] jclass klass)
{
// TODO: implement
}

JNIEXPORT void
JNICALL Java_mono_android_Runtime_register (JNIEnv *env, [[maybe_unused]] jclass klass, jstring managedType, jclass nativeClass, jstring methods)
{
Expand Down
5 changes: 5 additions & 0 deletions src/native/runtime-base/logger.cc
Original file line number Diff line number Diff line change
Expand Up @@ -205,6 +205,11 @@ Logger::init_logging_categories (char*& mono_log_mask, char*& mono_log_level) no
continue;
}

if (param.starts_with ("native-tracing")) {
_native_tracing_enabled = true;
continue;
}

if (param.starts_with ("timing=fast-bare")) {
log_categories |= LOG_TIMING;
_log_timing_categories |= LogTimingCategories::FastBare;
Expand Down
Loading
Loading