[Prototype] Pipeline Components - Extension Support #2141
- Add extensions as a dedicated section in PipelineConfig (sibling to nodes)
- Introduce ExtensionWrapper (Local/Shared), ExtensionRegistry, sealed ExtensionTrait, and extension_traits! macro
- Add BearerTokenProvider trait for authentication extensions
- Implement AzureIdentityAuthExtension (managed identity + dev credentials)
- Refactor Azure Monitor exporter to consume auth via extension registry
- Pass ExtensionRegistry to receiver/exporter start() signatures
- Generate EXTENSION_FACTORIES distributed slice via engine-macros
- Reject extension URNs placed in the nodes section with a clear error
- Add extension-system.md documentation
Codecov Report
❌ Patch coverage is ...
@@ Coverage Diff @@
## main #2141 +/- ##
==========================================
- Coverage 87.44% 87.22% -0.23%
==========================================
Files 558 566 +8
Lines 185764 186872 +1108
==========================================
+ Hits 162447 163003 +556
- Misses 22791 23343 +552
Partials 526 526
Based on utpilla's insight in open-telemetry#2113 that extensions never touch pipeline data.
jmacd left a comment:
I reviewed extension-system.md. Looks good!
I think we can add more detail and flexibility in the future. We should focus on the configuration model, especially threads-vs-cores sharing questions, and the minimum required for our Azure auth core to serve as an extension for both azmon and parquet+object_store exporters.
As a nice-to-have, I think we should consider adding a core component that does some extremely-basic form of the extension we're adding, like a basicauth extension (receivers), like a headersetter extension (exporters).
> Processors do not receive the registry (they don't need cross-cutting capabilities directly).
nit: Eventually, we will find ways to use processor extensions. In the Collector, we find cross cutting concerns like memory limiters and persistent key/value stores.
> data-path components initialize.
>
> 2. **PData-free.** Extensions are completely decoupled from the pipeline data type (`PData`). They receive their own `ExtensionControlMsg` messages
Note: the admin component is in a similar position, needing to be decoupled from the PData type of the engine, yet interoperating with the engine.
# Change Summary

This PR adds a design proposal describing the extension system for the **OTel Dataflow Engine**. The document introduces a capability-based extension architecture allowing receivers, processors, and exporters to access non-pdata functionality through well-defined capability interfaces maintained in the engine core.

The proposal covers:

* core concepts such as **capabilities**, **extension providers**, and **extension instances**
* integration of extensions into the **existing configuration model**
* the **user experience** for declaring extensions and binding capabilities
* the **developer experience** for implementing extension providers
* the **runtime architecture** for resolving and instantiating extensions
* the **execution models** supported by extensions (local vs shared)
* comparison with the **Go Collector extension model**
* a **phased evolution plan** (native extensions → hierarchical placement → WASM extensions)
* implementation recommendations for building **high-performance extensions aligned with the engine's thread-per-core design**

The goal of this document is to provide maintainers with a clear architectural proposal to review before implementing the extension system.

## What issue does this PR close?

* Related to #2267, #2230, #2141, #2113

## How are these changes tested?

This PR introduces **documentation only** and does not modify runtime code.

## Are there any user-facing changes?

Yes. This proposal describes a **future extension system** that will introduce new configuration capabilities such as:

* an `extensions` section in pipeline configurations
* a `capabilities` section in node definitions

These changes are not implemented yet but outline the intended user-facing configuration model for extensions.

---------

Co-authored-by: Joshua MacDonald <jmacd@users.noreply.github.com>
Pipeline Components - Extension Support Proposal
Extensions are pipeline components alongside receivers, processors, and exporters -- but they do not participate in data-path connections (no PData channels, control messages only).
Extensions can be implemented as local (`!Send` futures, single-threaded `LocalSet`) or shared (`Send` futures, multi-threaded), giving extension authors flexibility with some caveats based on the `Send`-only extension traits in this design.
Extensions can optionally implement extension traits (e.g., `BearerTokenProvider`) defined in the engine crate to expose capabilities to other components, or opt out and serve purely as background tasks.
Extension traits require `Send + Clone + 'static`. If an extension publishes any trait, the concrete type implementing that trait must satisfy these bounds -- meaning shared state must be managed via `Arc` (or similar) by the extension author.
The extension's lifecycle method (`start`) takes `Box<Self>` by move -- the extension instance is consumed, not cloned. What is cloned is the extension struct itself during `extension_traits()`: the macro clones `self` into each `TraitRegistration`, and those clones are inserted into the `ExtensionRegistry`. After trait collection, the original extension is consumed by `start()`. Each consumer (receiver/exporter) receives a clone of the registry, and calling `registry.get::<dyn Trait>(name)` returns a fresh clone of the stored trait object.
Extensions are configured as a sibling to `nodes` in the pipeline YAML (dedicated `extensions:` section), not inside `nodes`.
Extension trait types are sealed -- new trait types can only be added inside the engine crate; external crates can implement existing traits but cannot define new ones.
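The sealing described above can be pictured with the usual private-supertrait pattern. This is only a minimal sketch under assumptions: the `engine` module stands in for the engine crate, and `sealed::Sealed`, `assert_registered`, `StaticToken`, and the `token` method are hypothetical names, not the PR's actual code. Only `ExtensionTrait` and `BearerTokenProvider` are names taken from the description.

```rust
// Stand-in for the engine crate (hypothetical layout).
mod engine {
    // Private module: code outside `engine` cannot name `Sealed`, so it
    // cannot satisfy the supertrait bound of `ExtensionTrait`.
    mod sealed {
        pub trait Sealed {}
    }

    /// Marker for trait-object types that may be published as capabilities.
    pub trait ExtensionTrait: sealed::Sealed {}

    /// Capability trait defined inside the engine (method is an assumption).
    pub trait BearerTokenProvider: Send {
        fn token(&self) -> String;
    }

    // Only code inside `engine` can write these two impls.
    impl sealed::Sealed for dyn BearerTokenProvider {}
    impl ExtensionTrait for dyn BearerTokenProvider {}

    /// Compiles only for trait types the engine has sealed.
    pub fn assert_registered<T: ExtensionTrait + ?Sized>() {}
}

/// An "external" implementation of an existing capability trait.
#[derive(Clone)]
pub struct StaticToken(String);

impl engine::BearerTokenProvider for StaticToken {
    fn token(&self) -> String {
        self.0.clone()
    }
}
```

Outside `engine`, implementing `BearerTokenProvider` for a new type works, but declaring a new trait and marking its trait object as an `ExtensionTrait` does not compile, because `sealed::Sealed` is unreachable.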
Extensions are started first, before exporters, processors, and receivers, so their capabilities are available at component initialization.
The extension registry is passed via the `start` methods of exporters and receivers. It can be added to `process` as well if needed. I considered putting it into the `EffectHandler`, but that didn't feel like the right place for it.
Created a separate control message channel for extensions, based on the changes made in @utpilla's PR #2141.
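To make the clone semantics concrete, here is a deliberately simplified sketch of a registry that hands out fresh clones of stored trait objects and is itself cloned per consumer. It is an illustration under assumptions, not the PR's implementation: the real `get` is generic over the trait type, whereas this version stores a single trait, and `StaticAuth`, `clone_box`, `rotate`, `exporter_start`, and the `"azure_auth"` name are all hypothetical.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Capability trait; the methods are assumptions made for this sketch.
pub trait BearerTokenProvider: Send {
    fn token(&self) -> String;
    /// Lets the registry return a fresh boxed clone of the provider.
    fn clone_box(&self) -> Box<dyn BearerTokenProvider>;
}

/// Hypothetical extension: shared state sits behind `Arc`, so every clone
/// observes the same token.
#[derive(Clone)]
pub struct StaticAuth {
    token: Arc<Mutex<String>>,
}

impl StaticAuth {
    pub fn new(token: &str) -> Self {
        Self { token: Arc::new(Mutex::new(token.to_string())) }
    }
    pub fn rotate(&self, token: &str) {
        *self.token.lock().unwrap() = token.to_string();
    }
}

impl BearerTokenProvider for StaticAuth {
    fn token(&self) -> String {
        self.token.lock().unwrap().clone()
    }
    fn clone_box(&self) -> Box<dyn BearerTokenProvider> {
        Box::new(self.clone())
    }
}

/// Simplified registry keyed by extension name (single trait only).
#[derive(Default)]
pub struct ExtensionRegistry {
    providers: HashMap<String, Box<dyn BearerTokenProvider>>,
}

impl Clone for ExtensionRegistry {
    fn clone(&self) -> Self {
        let providers = self
            .providers
            .iter()
            .map(|(name, p)| (name.clone(), p.clone_box()))
            .collect();
        Self { providers }
    }
}

impl ExtensionRegistry {
    pub fn register(&mut self, name: &str, p: Box<dyn BearerTokenProvider>) {
        self.providers.insert(name.to_string(), p);
    }
    /// Returns a fresh clone of the stored trait object, if present.
    pub fn get(&self, name: &str) -> Option<Box<dyn BearerTokenProvider>> {
        self.providers.get(name).map(|p| p.clone_box())
    }
}

/// Hypothetical exporter `start`: it owns its own clone of the registry.
pub fn exporter_start(registry: ExtensionRegistry) -> Option<String> {
    registry.get("azure_auth").map(|p| p.token())
}
```

Because the state lives behind `Arc`, a token rotated through the original `StaticAuth` value is visible through every clone handed out by any copy of the registry, which is why the design asks extension authors to manage shared state that way.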
Alternatives Considered
`Arc`-based registry -- true single instance with `Arc` clones, but enforces a `Sync` boundary on all extensions.
`Rc`-based registry -- only works for local components. It might be acceptable to say that shared components don't have extension support, but this limits flexibility.
`Clone` requirement.
Why Local and Shared Variants
Local and shared extension variants exist for two reasons:
`Send` (because the registry is `Send`), the `start()` async body of a local extension can use `!Send` types (`Rc`, `RefCell`, `LocalSet` spawning, etc.). This means a local extension that publishes `Send` traits can still use non-thread-safe optimizations inside its own event loop, even though its struct fields must be `Send + Clone` for trait registration.
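A rough sketch of that split, assuming a hypothetical `LocalCounterExt`: the struct's fields stay `Send + Clone` so a clone can be published, while the body of `start` is free to use `Rc` and `RefCell`. The real lifecycle method is async and runs on a `LocalSet`; a plain synchronous function is used here only to keep the example dependency-free.

```rust
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::{Arc, Mutex};

/// Hypothetical local extension. Fields are `Send + Clone`, so a clone can
/// be registered before `start` consumes the original.
#[derive(Clone)]
pub struct LocalCounterExt {
    published: Arc<Mutex<u64>>,
}

impl LocalCounterExt {
    pub fn new() -> Self {
        Self { published: Arc::new(Mutex::new(0)) }
    }

    /// Read side, as a consumer holding a clone would see it.
    pub fn value(&self) -> u64 {
        *self.published.lock().unwrap()
    }

    /// Stand-in for the lifecycle method: takes `Box<Self>` by move.
    /// (Synchronous here; the proposal's `start` is async.)
    pub fn start(self: Box<Self>, ticks: u64) {
        // `Rc<RefCell<_>>` is `!Send`, but it never leaves this body.
        let scratch = Rc::new(RefCell::new(0u64));
        for _ in 0..ticks {
            *scratch.borrow_mut() += 1;
        }
        // Publish through the thread-safe field shared with the clones.
        *self.published.lock().unwrap() = *scratch.borrow();
    }
}

/// Compile-time check of the bounds the proposal places on published types.
pub fn assert_send_clone<T: Send + Clone + 'static>(_: &T) {}
```

A caller would clone the extension for registration, then hand the boxed original to `start`; the clone keeps observing whatever the running extension publishes.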