Stabilize Logger.Enabled #4208
Question to @open-telemetry/technical-committee: Do we want to stabilize the Logger.Enabled API sooner than we stabilize the spec defining how the SDK implements it? Or do we want to stabilize Enabled for the API and SDK at the same time?
Same for Metrics too: https://github.com/open-telemetry/opentelemetry-specification/pull/4219/files#r1767789558
The lack of stabilization of `Logger.Enabled` is currently the only known blocker for stabilizing the OTel Go Logs. From the OTel Go perspective, the SDK support can be experimental. See: https://pkg.go.dev/go.opentelemetry.io/otel/sdk/log/internal/x.
@open-telemetry/technical-committee, are you able to revalidate whether the issues listed as blockers are still seen as blockers, or whether they can be addressed after stabilization of Logger.Enabled in the Logs Bridge API? Personally, I think the main blocker is to have at least 3 prototypes of the API in different languages.
To clarify the process: we expect 3 prototypes in 3 different languages that can be used by end users, so that they can try the feature, provide feedback, and submit bugs and issues about it. This is a necessary process before the spec section is marked "Stable". From this perspective a PR does not count as a prototype, since it is not easily usable by end users. A PR is fine for proposing new experimental features and demonstrating how they would work, but it is not enough for stabilizing the spec.
@pellared you either need to find a way to have unstable APIs in Go or wait until other languages implement the prototypes. Either way, the ability to have unstable APIs is very valuable, and this is likely to come up again as OTel evolves and we keep adding new experimental APIs to existing signals. -- As a side note: I encourage using maturity levels between "Development" and "Stable" to signal an increasing level of confidence in the capability (both in the spec and the SDK). For example, if we have 1-2 prototypes then we can move the maturity level of the feature from "Development" to "Alpha" or "Beta" to signal it is moving closer to the "Stable" state.
OK. So we need 3 different languages to have it released as an experimental API. Here is how the experimental Logger.Enabled API is currently defined in 3 languages:
I will do my best to work on this with others to move this forward (as we have inconsistencies).
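For orientation, here is a rough sketch in Go of the general shape this operation takes. The type and field names below are illustrative stand-ins rather than the exact published API of any SIG, since the accepted parameters are one of the inconsistencies mentioned above.

```go
package logapisketch

import "context"

// Severity mirrors the OTel log severity number range (1..24).
type Severity int

// EnabledParameters carries the arguments a caller may pass to Enabled.
// Severity is the common denominator across the existing prototypes;
// some prototypes accept additional parameters.
type EnabledParameters struct {
	Severity Severity
}

// Logger is a trimmed-down view of the Logs API surface discussed here.
// Enabled reports whether a record with the given parameters would be
// processed, so a bridge can avoid building the record at all.
type Logger interface {
	Enabled(ctx context.Context, param EnabledParameters) bool
	// Emit(ctx context.Context, record Record) // omitted for brevity
}
```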
All major log bridges need it, so it does not even make sense to stabilize the rest, as the Logs API would not be usable in the Go ecosystem. From #3917:
We then need to wait for other languages to add it.
I am not sure if we can use
We can bring as many levels from OTEP 0232 to the spec as we believe is useful. I started with 3 but we can bring more if we feel there is value. I personally think it can be valuable to have more granularity between Development (the most immature) and Stable (the most mature). It is an important signal, and having just a binary value for it is, I think, not nuanced enough. Stabilization is a process, often a long one at OTel. As you move along that process it is important to indicate the progress by updating the level labels.
FWIW this is a change in policy. Many features have been stabilized that relied on "Go implementations" which were just PRs. I'm not sure it is fair to make this change in policy in such an ad hoc manner.
@MrAlias my post is a result of a discussion by a few TCs while triaging this issue, so it is not an official policy change yet. I tried finding the current policy but couldn't. This document does not seem to have an opinion about what criteria must be met before spec stabilization. If anyone is aware of where we state how many prototypes are needed, please post the link. If the policy does not exist in written form, or we need to modify it, I will create an issue so that we can discuss and formalize it. Let's keep this issue open for now so that we can apply consistent rules after we clarify what the rules are.
I am gonna move this back to the TC inbox.
@jack-berg, I have a question about what is required regarding the SDK implementation specification/design to unblock the stabilization of `Logger.Enabled`.
From our OTel Go experience, it is better to stabilize the API first and then gather feedback for months before stabilizing anything in the SDK. In OTel Go, all our logging bridges use `Logger.Enabled`. The stabilization of
I also want to call out that we already have 4 working prototypes of `Logger.Enabled`. They have slight differences in the accepted parameters, and I think this needs to be sorted out. @open-telemetry/cpp-maintainers, @open-telemetry/php-maintainers, @open-telemetry/rust-maintainers, I need your help here.
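To make the bridge use case concrete, here is a minimal sketch of how a log bridge typically consumes this operation. All names below (Bridge, Logger, EnabledParameters) are hypothetical stand-ins, not the concrete API of any of the prototypes.

```go
package bridgesketch

import "context"

// Illustrative stand-ins for the Logs API types.
type Severity int

type EnabledParameters struct{ Severity Severity }

type Logger interface {
	Enabled(ctx context.Context, param EnabledParameters) bool
	Emit(ctx context.Context, severity Severity, body string)
}

// Bridge forwards records from some logging library into the Logs API.
type Bridge struct{ logger Logger }

// LogLevelEnabled is the kind of check most logging libraries expose;
// the bridge can answer it directly from Logger.Enabled.
func (b *Bridge) LogLevelEnabled(ctx context.Context, sev Severity) bool {
	return b.logger.Enabled(ctx, EnabledParameters{Severity: sev})
}

// Handle skips the (potentially expensive) record construction entirely
// when nothing downstream wants the record.
func (b *Bridge) Handle(ctx context.Context, sev Severity, msg string) {
	if !b.LogLevelEnabled(ctx, sev) {
		return
	}
	b.logger.Emit(ctx, sev, msg)
}
```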
Honestly, I've been struggling to keep up with / track all the open issues / PRs related to this. The current state of the operation is that it is defined in the API without any corresponding SDK behavior. Your OTEP #4290 tries to establish this corresponding SDK behavior. I agree with some of the content in there, and in particular, think that adding
Five - I originally proposed this experimental method with an accompanying Java prototype 🙂
👍
PHP's implementation of `Logger.Enabled` has not caught up with the latest spec yet. When we do catch up, I'll follow the latest spec (although I might wait for this effort to complete first).
I do not see them as no-brainers. I am personally not convinced that these are good proposals. Regarding Regarding I find adding opt-in
I've said something to this effect in other comments, but extending LogRecordProcessor with `Enabled`: We could do this. Some users will benefit from it, preferring to do pipeline-style work in the SDK vs. the collector. But most users will have a single batch log record processor paired with the OTLP exporter. And these users will want / expect easy ways to configure which logs make it into their pipeline. Filtering logs by severity is table stakes for any log system. For a system like OpenTelemetry that prioritizes correlation across signals, filtering logs based on whether the active span is sampled is obvious low-hanging fruit. So if we need some mechanism for filtering logs by severity and trace context, the next question is where that configuration mechanism lives. Options:
I'm opposed to bringing proper pipeline support to SDKs (it's currently possible but you have to jump through hoops) because it's a large implementation burden that has to be paid 11 times (once for each language implementation), and it duplicates the capabilities of the general-purpose and extremely powerful collector. From a prioritization standpoint, asking resource-constrained language maintainers to implement better pipeline tooling doesn't seem like a good use of time right now, given all the other project objectives - especially semconv and stable instrumentation. Not bringing proper pipeline support to SDKs means the users who want to do collector-style things in SDKs have to jump through more hoops with worse ergonomics. This isn't ideal, but tradeoffs. If the community decides that bringing proper pipeline support to SDKs is important, I do think it's important to solve it holistically, and look at how the concepts apply to traces and metrics as well as logs.
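As an illustration of the severity-filtering mechanism discussed above (not a prescription for where the configuration should live), here is a minimal sketch of a minimum-severity wrapper around a processor-style component. It is similar in spirit to the minsev processor in OTel Go Contrib mentioned later in this thread, but the types below are illustrative stand-ins, not its actual API.

```go
package minsevsketch

import "context"

type Severity int

// Processor is a cut-down stand-in for a log record processor that also
// supports the opt-in Enabled operation.
type Processor interface {
	Enabled(ctx context.Context, severity Severity) bool
	OnEmit(ctx context.Context, severity Severity, body string)
}

// MinSeverity drops records below Min. It answers Enabled so bridges can
// skip work early, and it also filters in OnEmit because callers are not
// required to call Enabled first.
type MinSeverity struct {
	Min  Severity
	Next Processor
}

func (p MinSeverity) Enabled(ctx context.Context, severity Severity) bool {
	return severity >= p.Min && p.Next.Enabled(ctx, severity)
}

func (p MinSeverity) OnEmit(ctx context.Context, severity Severity, body string) {
	if severity < p.Min {
		return
	}
	p.Next.OnEmit(ctx, severity, body)
}
```

Whether such a threshold belongs on a processor wrapper, in SDK configuration, or stays in the collector is exactly the open question raised above.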
👍
Fixes #4363
Towards #4208 (uses the Severity Level passed via `Logger.Enabled`)
Towards stabilization of the OpenTelemetry Go Logs API and SDK.

## Use cases

Below are some use cases where the new functionality can be used:

1. Bridge features like `LogLevelEnabled` in log bridge/appender implementations. This is needed for **all** (but one) currently supported log bridges in OTel Go Contrib.
2. Configure a minimum log severity level for a certain log processor.
3. Filter out log and event records when they are inside a span that has been sampled out (the span is valid and has a sampled flag of `false`).
4. **Efficiently** support high-performance logging destinations like [Linux user_events](https://docs.kernel.org/trace/user_events.html) and [ETW (Event Tracing for Windows)](https://learn.microsoft.com/windows/win32/etw/about-event-tracing).
5. Bridge the Logs API to a language-specific logging library (the other way than usual).

## Changes

Add an `Enabled` opt-in operation to the `LogRecordProcessor`.

I created an OTEP first, which was great for having a lot of discussions and evaluations of different proposals:

- #4290

Most importantly, from #4290 (comment):

> Among the Go SIG we evaluated a few times an alternative that provides a new "filter" abstraction decoupled from the "processor". However, we faced more issues than benefits going this route (some of this is described here, but there were more issues: open-telemetry/opentelemetry-go#5825 (comment)). With the current opt-in `Processor.Enabled` we have faced fewer issues so far.
> We also do not want to replicate all features from the logging libraries. If someone prefers the log4j (or other) filter design, they can always use a bridge and use log4j for filtering. An `Enabled` callback hook is the simplest design (yet very flexible), which makes it easy to implement in the SDKs. This design is inspired by the design of the two most popular Go structured logging libraries: https://pkg.go.dev/log/slog (standard library) and https://pkg.go.dev/go.uber.org/zap.
>
> It is worth adding that the Rust design is similar and it also has an `Enabled` hook. See #4363 (comment). Basically, we want to add something like https://docs.rs/log/latest/log/trait.Log.html#tymethod.enabled to the `LogRecordProcessor` and allow users to implement `Enabled` in a way that meets their requirements.
>
> I also want to call out, from https://github.com/open-telemetry/opentelemetry-specification/blob/main/oteps/0265-event-vision.md#open-questions:
>
> > How to support routing logs from the Logs API to a language-specific logging library
>
> To support this we would need a log record processor which bridges the Logs API calls to the given logging library. For such a case we would need an `Enabled` hook in `Processor` to efficiently bridge `Logger.Enabled` calls. A filterer design would not satisfy such a use case.

I decided to name the new operation `Enabled` because:

1. this name is already used in logging libraries in many languages: #4439 (comment)
2. it matches the name of the API call (for all trace, metrics, and logs APIs).

I also considered `OnEnabled`, to follow the same pattern as `Emit` and `OnEmit`. However, we already have `ForceFlush` and `Shutdown`, which do not follow this pattern, so I preferred to keep the simple `Enabled` name. For `OnEmit` I could also imagine `OnEmitted` (or `OnEmitting`) which does something after (or just before, like `OnEnding` in `SpanProcessor`) `OnEmit` has been called on all registered processors.



Yet, I do not imagine something similar for `Enabled`, as calling `Enabled` should not have any side effects. Therefore, I decided to name it `Enabled`.

I want to highlight that a processor cannot assume `Enabled` was called before `OnEmit`, for the following reasons:

1. **Backward compatibility** – Existing processors may already perform filtering without relying on `Enabled`. For example: [Add Advanced Processing to Logs Supplementary Guidelines #4407](#4407).
2. **Self-sufficiency of `OnEmit`** – Since `Enabled` is optional, `OnEmit` should be able to handle filtering independently. A processor filtering events should do so in `OnEmit`, not just in `Enabled`.
3. **Greater flexibility** – Some processors, such as the ETW processor, don't benefit from redundant filtering. ETW already filters out events internally, making an additional check unnecessary.
4. **Performance considerations** – Calling `Enabled` from `OnEmit` introduces overhead, as it requires converting `OnEmit` parameters to match `Enabled`'s expected input.
5. **Avoiding fragile assumptions** – Enforcing constraints that the compiler cannot validate increases the risk of introducing bugs.

This feature is already implemented in OpenTelemetry Go:

- open-telemetry/opentelemetry-go#6317

We have one processor in Contrib which takes advantage of this functionality:

- https://pkg.go.dev/go.opentelemetry.io/contrib/processors/minsev

This feature (however with some differences) is also available in OTel Rust; #4363 (comment):

> OTel Rust also has this capability. Here's an example where it is leveraged to improve performance by dropping unwanted logs early. https://github.com/open-telemetry/opentelemetry-rust/blob/88cae2cf7d0ff54a042d281a0df20f096d18bf82/opentelemetry-appender-tracing/benches/logs.rs#L78-L85

---------

Co-authored-by: Tyler Yahn <[email protected]>
Co-authored-by: Sam Xie <[email protected]>
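As a companion to use case 3 above (dropping records emitted inside an unsampled span), here is a minimal sketch. The Processor shape is again an illustrative stand-in, while the trace-context check uses the real go.opentelemetry.io/otel/trace helpers. It also illustrates the point that OnEmit must filter on its own, since a caller may never call Enabled.

```go
package unsampledsketch

import (
	"context"

	"go.opentelemetry.io/otel/trace"
)

// Processor is a cut-down, illustrative stand-in for a log record
// processor implementing the opt-in Enabled operation.
type Processor interface {
	Enabled(ctx context.Context) bool
	OnEmit(ctx context.Context, body string)
}

// DropUnsampled discards records emitted inside a span that has been
// sampled out (the span is valid and its sampled flag is false).
type DropUnsampled struct{ Next Processor }

func inUnsampledSpan(ctx context.Context) bool {
	sc := trace.SpanContextFromContext(ctx)
	return sc.IsValid() && !sc.IsSampled()
}

func (p DropUnsampled) Enabled(ctx context.Context) bool {
	return !inUnsampledSpan(ctx) && p.Next.Enabled(ctx)
}

// OnEmit filters independently: a processor cannot assume Enabled was
// called before OnEmit.
func (p DropUnsampled) OnEmit(ctx context.Context, body string) {
	if inUnsampledSpan(ctx) {
		return
	}
	p.Next.OnEmit(ctx, body)
}
```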
According to the spec compliance matrix, it is implemented in 3 languages:
I saw that @open-telemetry/technical-committee, are there any required steps to mark this as stable?
Did a quick peek at the Go, Rust, C++, PHP implementations:
Personally, I'm comfortable stabilizing `Logger.Enabled`.
OTel Rust added more parameters than the spec has. The extra parameters are compile-time generated/static in most logging libraries in Rust and are readily available. So they won't cost extra, but they allow more advanced filtering.
@jack-berg, thanks. I think it would be beneficial to have approval from at least one maintainer per language listed above. If nobody objects, I can create a PR tomorrow.
Yes. OTel C++'s logging API was meant for end-user usage too, even before the spec made that relaxation recently.
According to #4208 (comment), PHP should also be fine. @brettmc, am I correct?
@pellared correct, that won't be an issue for PHP.
Stabilize Logger.Enabled API
Blockers: