[SDK] Better control of threads executed by opentelemetry-cpp #3175
Codecov Report

Coverage diff, main vs. #3175:

| Metric   | main   | #3175  | +/-    |
|----------|--------|--------|--------|
| Coverage | 88.16% | 87.72% | -0.44% |
| Files    | 198    | 198    |        |
| Lines    | 6224   | 6258   | +34    |
| Hits     | 5487   | 5489   | +2     |
| Misses   | 737    | 769    | +32    |
Do you think we can implement a worker pool to share threads among multiple exporters and processors? Since most of them simply wait for a timeout, there is no need to create a separate thread for each component.
For some applications, running in a multi threaded environment, it is critical to control how a thread executes in finer details. For example, I have:
When the exporter uses CURL to post to the endpoint, it makes a TCP/IP connection. (a), (b) and (c) need to use different TCP/IP stacks, or different named networks, to connect to their respective endpoints. This is achieved by calling setns().
Having dedicated threads (as is currently the case) makes this easier. If the execution of each exporter were multiplexed onto the same worker thread, the context would need to change for each exporter, calling setns() many more times and introducing more complexity to counter the effect of a pooled worker thread. A pool would not help here; it would get in the way.
setns() is just an example; there can be other use cases, like binding to a CPU. Overall, the need is for the main application to inject arbitrary code into the opentelemetry-cpp execution code path, for opentelemetry threads.
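To make the argument above concrete, here is a minimal single-threaded simulation, not SDK code. `ContextSwitcher` is a hypothetical stand-in for an expensive per-thread context change such as setns(), and both function names are invented for this illustration. It only counts how often the context has to change under each threading model.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a per-thread context change such as setns().
struct ContextSwitcher {
  std::string current;  // context the simulated thread is currently in
  int switches = 0;     // number of times the context actually changed

  void Enter(const std::string &ctx) {
    if (ctx != current) {
      current = ctx;
      ++switches;
    }
  }
};

// Dedicated-thread model: each exporter owns one thread and enters its
// context once at thread start; later exports find the context already set.
int SwitchesWithDedicatedThreads(const std::vector<std::string> &contexts,
                                 int exports_per_exporter) {
  int total = 0;
  for (const auto &ctx : contexts) {
    ContextSwitcher thread_state;  // one simulated thread per exporter
    for (int i = 0; i < exports_per_exporter; ++i) {
      thread_state.Enter(ctx);  // only the first call changes anything
    }
    total += thread_state.switches;
  }
  return total;
}

// Shared-worker model: one thread runs every exporter in round-robin order,
// so the context must be changed before each export.
int SwitchesWithSharedWorker(const std::vector<std::string> &contexts,
                             int exports_per_exporter) {
  ContextSwitcher worker;  // the single simulated pool thread
  for (int i = 0; i < exports_per_exporter; ++i) {
    for (const auto &ctx : contexts) {
      worker.Enter(ctx);
    }
  }
  return worker.switches;
}
```

With three distinct contexts and four exports each, the dedicated model changes context three times in total, while the shared worker changes it on every export.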
Could you please also add this instrumentation for the OTLP file exporters?
Sure, implemented. Every place where opentelemetry-cpp starts a new |
An example of using this feature, to set thread names to the operating system on Linux:
In order, threads are:
|
LGTM and thanks.
virtual void BeforeWait() {}
virtual void AfterWait() {}
virtual void BeforeLoad() {}
virtual void AfterLoad() {}
Wondering if there's a way to avoid virtual calls for these methods to reduce their associated cost. One potential approach is to allow users to provide a callback that the SDK invokes during thread lifecycle events. For example:
class ThreadInstrumentation {
public:
  using Callback = std::function<bool(ThreadOperation, const ThreadContext &)>;
  explicit ThreadInstrumentation(Callback callback = DefaultCallback)
      : callback_(std::move(callback)) {}
  // Invoked by the SDK at each thread lifecycle event.
  bool OnOperation(ThreadOperation operation, const ThreadContext &context) {
    return callback_(operation, context);
  }
private:
  Callback callback_;
};
ThreadContext could carry information like thread_id, or be omitted if unnecessary. This approach lets the SDK call OnOperation internally while allowing the callback to define behavior for specific stages.
Not a blocker, as I don't see these virtual methods being called in the hot path, but is this something we need to consider?
Most of the time, the instrumentation will be absent (nullptr), so that:
if (worker_thread_instrumentation_ != nullptr)
{
worker_thread_instrumentation_->OnXXX();
}
will not even enter the if block, so it is definitely not in the hot path.
When some instrumentation is provided, the virtual function will land in user code, or in the noop default implementation.
I expect a virtual function and a std::function callback to be equivalent in terms of overhead: the call will be dynamic (through a function pointer), not static with the target known at compile time. A virtual method, a std::function callback, and a plain old C function pointer are just different coding styles; they all boil down to the same thing (a function pointer).
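A side-by-side sketch of the two styles compared above, assuming a simplified, hypothetical `ThreadInstrumentation` with only the wait hooks (the real interface may differ). `CountingInstrumentation`, `WaitStepVirtual` and `WaitStepCallback` are names invented for this illustration. In both styles the absent case is a single pointer or emptiness check, and the present case is one indirect call.

```cpp
#include <cassert>
#include <functional>

// Hypothetical, simplified stand-in for the interface discussed in this thread.
class ThreadInstrumentation {
public:
  virtual ~ThreadInstrumentation() = default;
  virtual void BeforeWait() {}
  virtual void AfterWait() {}
};

// Example user subclass: counts how many times the hook fired.
class CountingInstrumentation : public ThreadInstrumentation {
public:
  void BeforeWait() override { ++before_wait_calls; }
  int before_wait_calls = 0;
};

// Virtual-method style: nothing happens when no instrumentation is set,
// otherwise one indirect call through the vtable.
inline void WaitStepVirtual(ThreadInstrumentation *instrumentation) {
  if (instrumentation != nullptr) {
    instrumentation->BeforeWait();
  }
}

// Callback style: nothing happens when the std::function is empty,
// otherwise one indirect call through the stored target.
inline void WaitStepCallback(const std::function<void()> &on_before_wait) {
  if (on_before_wait) {
    on_before_wait();
  }
}
```

Either way, the SDK side pays for one branch when instrumentation is absent, which is the common case described above.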
About passing thread context:
The subclass of ThreadInstrumentation is free to hold state, and can even keep per thread instance state in thread local storage if it really wants to, for complex cases.
This pushes complexity to the user (and disclosure, this is how I use it).
For calls that are paired, we could help a bit to propagate state like this:
void *opaque_wait_context = nullptr;
if (worker_thread_instrumentation_ != nullptr)
{
opaque_wait_context = worker_thread_instrumentation_->BeforeWait();
}
// do something
if (worker_thread_instrumentation_ != nullptr)
{
worker_thread_instrumentation_->AfterWait(opaque_wait_context);
}
and likewise for OnStart/OnEnd, and BeforeLoad/AfterLoad.
With this opaque pointer, the user code is free to use it (or not), and it avoids complexity with thread local storage.
I can implement this if you agree with it, or we can revise it later, let me know what you prefer.
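For illustration, this is how user code might consume such an opaque pointer. It is a sketch of the proposal only: the `void *`-returning `BeforeWait()` and the `AfterWait(void *)` signature are the hypothetical variant suggested above, not the interface in this PR, and `WaitTimer` and `RunOneWait` are names invented here.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical variant of the interface, following the proposal above:
// BeforeWait() returns an opaque pointer that is handed back to AfterWait().
class ThreadInstrumentation {
public:
  virtual ~ThreadInstrumentation() = default;
  virtual void *BeforeWait() { return nullptr; }
  virtual void AfterWait(void * /* opaque_context */) {}
};

// Example user subclass: measures time spent waiting without using
// thread local storage, by carrying the start time in the opaque pointer.
class WaitTimer : public ThreadInstrumentation {
public:
  using Clock = std::chrono::steady_clock;

  void *BeforeWait() override { return new Clock::time_point(Clock::now()); }

  void AfterWait(void *opaque_context) override {
    auto *start = static_cast<Clock::time_point *>(opaque_context);
    if (start == nullptr) {
      return;
    }
    total_wait += Clock::now() - *start;
    ++completed_waits;
    delete start;  // the user code owns whatever it returned from BeforeWait()
  }

  Clock::duration total_wait{0};
  int completed_waits = 0;
};

// SDK-side pattern from the comment above, reduced to one wait cycle.
inline void RunOneWait(ThreadInstrumentation *instrumentation) {
  void *opaque_wait_context = nullptr;
  if (instrumentation != nullptr) {
    opaque_wait_context = instrumentation->BeforeWait();
  }
  // ... the worker thread would wait here ...
  if (instrumentation != nullptr) {
    instrumentation->AfterWait(opaque_wait_context);
  }
}
```

The default implementation returns nullptr and ignores the argument, so subclasses that do not need paired state are unaffected.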
Fixes #3174
Changes
This feature provides a way for applications, when configuring the SDK and exporters,
to participate in the execution path of internal opentelemetry-cpp threads.
The opentelemetry-cpp library provides the following:
- ThreadInstrumentation interface,
- BatchSpanProcessorRuntimeOptions
- PeriodicExportingMetricReaderRuntimeOptions
- BatchLogRecordProcessorRuntimeOptions
- OtlpHttpExporterRuntimeOptions
- OtlpHttpMetricExporterRuntimeOptions
- OtlpHttpLogRecordExporterRuntimeOptions
- ThreadInstrumentation parameters, to optionally configure the CURL HttpClient
- OtlpFileExporterRuntimeOptions
- OtlpFileMetricExporterRuntimeOptions
- OtlpFileLogRecordExporterRuntimeOptions
- OtlpFileClientRuntimeOptions
Using the optional runtime options structures, an application can subclass the ThreadInstrumentation interface, and be notified of specific events of interest during the execution of an internal opentelemetry-cpp thread.
This allows an application to call, for example:
- pthread_setaffinity_np(), see related #1822 (Add BatchSpanProcessor option to set CPU affinity on the worker thread)
- setns(), to control the network namespace used by HTTP CURL connections
- pthread_setname_np(), for better observability from the operating system
See the documentation for ThreadInstrumentation for details.
A new example program, example_otlp_instrumented_http, shows how to use the feature, and add application logic in the thread execution code path.
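As a rough sketch of the pthread_setname_np() use case, assuming Linux with glibc: the `ThreadInstrumentation` class below is a local stand-in that only mirrors the OnStart/OnEnd hook names mentioned in this PR, not the real SDK header, and `NamedThreadInstrumentation` and `CurrentThreadName` are hypothetical helpers.

```cpp
#include <pthread.h>  // pthread_setname_np / pthread_getname_np (glibc extensions)

#include <cassert>
#include <string>

// Local stand-in mirroring the hook names discussed in this PR.
class ThreadInstrumentation {
public:
  virtual ~ThreadInstrumentation() = default;
  virtual void OnStart() {}
  virtual void OnEnd() {}
};

// Example user subclass: names the calling thread when it starts.
class NamedThreadInstrumentation : public ThreadInstrumentation {
public:
  // Linux limits thread names to 15 characters plus the terminating NUL,
  // so longer names are truncated here.
  explicit NamedThreadInstrumentation(const std::string &name)
      : name_(name.substr(0, 15)) {}

  // Expected to run on the thread being instrumented.
  void OnStart() override { pthread_setname_np(pthread_self(), name_.c_str()); }

private:
  std::string name_;
};

// Helper for inspection: reads back the name of the calling thread.
inline std::string CurrentThreadName() {
  char buffer[16] = {0};
  pthread_getname_np(pthread_self(), buffer, sizeof(buffer));
  return std::string(buffer);
}
```

An instance would be passed to the SDK through one of the runtime options structures listed above; the exact field names are not shown in this description, so that wiring is left out.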
Note that this feature is experimental, protected by a WITH_THREAD_INSTRUMENTATION_PREVIEW flag in CMake.
Various runtime options structures, as well as the thread instrumentation interface, may change without notice before this feature is declared stable.
For significant contributions please make sure you have completed the following items:
- CHANGELOG.md updated for non-trivial changes