diff --git a/docs/extension-system-architecture.md b/docs/extension-system-architecture.md
new file mode 100644
index 0000000000..3f766f99ec
--- /dev/null
+++ b/docs/extension-system-architecture.md
@@ -0,0 +1,879 @@
+# Extension System Architecture
+
+## Overview
+
+This document describes the architecture of the extension
+system in the OTAP dataflow engine -- what extensions are,
+how they integrate into the pipeline lifecycle, and how to
+implement new ones.
+
+## What Are Extensions?
+
+Extensions are standalone pipeline components that provide
+**shared, cross-cutting capabilities** -- such as authentication,
+service discovery, or health checking -- to data-path nodes
+(receivers, exporters). They are configured as siblings to
+`nodes`, not as nodes themselves, and they never touch pipeline
+data directly.
+
+### Why Do We Need Them?
+
+Before extensions, cross-cutting concerns like authentication
+were embedded directly inside individual exporters. This led to:
+
+- **Duplication** -- every exporter needing auth carried its own
+  credential management and token refresh loop.
+- **Tight coupling** -- credential-specific dependencies (e.g.,
+  `azure_identity`) leaked into exporter crates even when unused.
+- **No sharing** -- multiple exporters targeting the same tenant
+  each acquired and refreshed tokens independently.
+
+Extensions solve this by extracting shared capabilities into
+named, independently-running components. An extension can
+optionally expose well-defined traits through a type-safe
+registry for data-path nodes to look up by name, or it can
+simply run as a pure background task (e.g., certificate
+rotation, service discovery refresh) without publishing
+any capabilities at all. Either
+way: no direct dependencies between nodes, no duplicated logic,
+no wasted resources.
+
+## Architecture Overview
+
+```text
++-----------------------------------------------+
+|                Pipeline Engine                |
+|                                               |
+|  +-------------+  +-------------+             |
+|  | Extension A |  | Extension B | ...         |
+|  | (auth)      |  | (background)|             |
+|  | Arc         |  +-------------+             |
+|  +------+------+                              |
+|         | extension_capabilities!() macro     |
+|         | clones self per trait, producing    |
+|         | Vec<CapabilityRegistration>         |
+|         v                                     |
+|  +----------------------+                     |
+|  | CapabilityRegistry   |  (built once)       |
+|  | stores cloned trait  |                     |
+|  | objects by name      |                     |
+|  +--+---------------+---+                     |
+|     | clone()       | clone()                 |
+|     v               v                         |
+|  +----------+  +----------+                   |
+|  | Receiver |  | Exporter |                   |
+|  | (own     |  | (own     |                   |
+|  |  registry|  |  registry|                   |
+|  |  clone)  |  |  clone)  |                   |
+|  +----------+  +----------+                   |
+|                                               |
+|  get() returns a cloned Box<dyn Trait>;       |
+|  all clones share state via Arc inside the    |
+|  extension -- the registry itself holds       |
+|  type-erased cloneable trait objects          |
++-----------------------------------------------+
+```
+
+### Key Design Decisions
+
+1. **Extensions start first, shut down last.** The engine
+   spawns extensions before any data-path nodes so their
+   capabilities are available at initialization. At shutdown,
+   extensions terminate only after all data-path nodes have
+   drained -- ensuring capabilities like auth tokens remain
+   available during final flushes. Extension instances are
+   scoped to a single pipeline -- they are not shared across
+   pipelines.
+
+2. **PData-free.** Extensions are completely decoupled from
+   the pipeline data type (`PData`). They receive their own
+   `ExtensionControlMsg` messages (shutdown, config
+   updates, telemetry collection) through a dedicated
+   control channel and never process pipeline data directly.
+
+3. **Separate control channel.** Extensions use
+   `ExtensionControlSender` / `ExtensionControlMsg` instead
+   of the pipeline's `PipelineCtrlMsgSender`. This
+   prevents extensions from holding clones of the pipeline
+   control channel sender, which would block the channel
+   from closing and prevent graceful shutdown.
+
+4. **Local/Shared split.** Like receivers and exporters,
+   extensions have both local (`!Send` futures) and shared
+   (`Send` futures) variants. Local extensions run on the
+   single-threaded `LocalSet`; shared extensions can be
+   spawned on multi-threaded runtimes. `ExtensionWrapper`
+   abstracts over both variants.
+
+5. **Registry-based lookup.** The `CapabilityRegistry` is
+   passed to receiver, processor, and exporter factories
+   at construction time -- not at `start()`. This means
+   capabilities are resolved during pipeline build, catching
+   missing extensions early. The API and naming are
+   node-agnostic -- all node types receive the registry
+   through the same factory parameter.
+
+6. **Optional capability publishing.** Extensions that expose
+   capabilities override `extension_capabilities()` to register
+   capability implementations in the registry. Extensions that are
+   pure background tasks simply use the default (empty)
+   implementation and never appear in the registry.
+
+### State Management: Clone + Arc
+
+Extensions use the **Clone + Arc** pattern for shared state
+management. This is the same convention used by widely-adopted
+Rust libraries including:
+
+- **[axum](https://docs.rs/axum/latest/axum/extract/struct.State.html#shared-mutable-state)**
+  (25k stars, by the tokio team) -- documents this pattern
+  explicitly:
+
+  > *"As state is global within a Router you can't directly
+  > get a mutable reference to the state. The most basic
+  > solution is to use an `Arc<Mutex<_>>`. Which kind of
+  > mutex you need depends on your use case. See the tokio
+  > docs for more details."*
+
+- **tonic** (gRPC framework, by the tokio team) -- services
+  must be `Clone`; shared state is wrapped in `Arc`.
+- **tower** (`Service` trait) -- clone-per-request model;
+  documentation recommends `Arc` for shared state.
+- **reqwest::Client**, **kube::Client**, **AWS SDK clients**
+  -- all use `Clone` wrapping an `Arc`.
+
+In our extension system, the extension struct is `Clone`.
+When cloned into the capability registry and distributed to
+consumers, all clones share the same underlying state via
+`Arc`. No per-clone copies are made of shared resources
+like credentials or broadcast channels.
+
+**Example -- shared state via `Arc`:**
+
+```rust
+#[derive(Clone)]
+pub struct MyExtension {
+    // Shared across all clones -- Arc is the sharing primitive
+    credential: Arc<dyn TokenCredential>,
+    token_sender: Arc<watch::Sender<Option<BearerToken>>>,
+
+    // Cheap to clone -- small owned values
+    scope: String,
+    method: AuthMethod,
+}
+```
+
+**Interior mutability without `Mutex`:**
+
+For hot-path operations, `Arc<Mutex<T>>` is not the only
+option. The following `Send`-compatible primitives provide
+lock-free interior mutability:
+
+| Primitive | Cost | Use case |
+|---|---|---|
+| `AtomicU64` | ~1ns | Counters, flags |
+| `ArcSwap` | ~2ns read | Swapping configs, token caches |
+| `watch::Sender` | ~3ns read | Push-based state updates |
+| `DashMap` | ~10-15ns | Concurrent hash maps |
+
+These are preferred over `Mutex` for extension state that
+is read frequently on the hot path. `Mutex` (~15ns) is
+acceptable for state accessed infrequently (e.g., config
+updates).
+
+**Why `Send` only, not `Rc`/`RefCell`:**
+
+Capability traits must be `Send` so they can be stored in
+the registry and distributed to any consumer -- including
+shared (`Send`) receivers and exporters. This rules out
+`Rc` and `RefCell` in the extension struct. However, the
+performance cost of using `Arc` and atomics instead of
+`Rc` and `RefCell` is minimal.
+
+## Core Types
+
+### Extension Lifecycle Trait
+
+The lifecycle contract every extension implements. Two
+variants exist -- local and shared -- mirroring the pattern
+used by receivers and exporters.
+
+**Local** (`engine/src/local/extension.rs`):
+
+```rust
+#[async_trait(?Send)]
+pub trait Extension {
+    async fn start(
+        self: Box<Self>,
+        ctrl_chan: ControlChannel,
+        effect_handler: EffectHandler,
+    ) -> Result;
+
+    fn extension_capabilities(&self)
+        -> Vec<CapabilityRegistration>
+    {
+        Vec::new()
+    }
+}
+```
+
+**Shared** (`engine/src/shared/extension.rs`):
+
+```rust
+#[async_trait]
+pub trait Extension: Send {
+    async fn start(
+        self: Box<Self>,
+        ctrl_chan: ControlChannel,
+        effect_handler: EffectHandler,
+    ) -> Result;
+
+    fn extension_capabilities(&self)
+        -> Vec<CapabilityRegistration>
+    {
+        Vec::new()
+    }
+}
+```
+
+Key points:
+
+- **Not generic over `PData`.** Unlike receivers,
+  processors, and exporters, extensions never touch pipeline
+  data. This is the fundamental difference.
+- **`start()` takes ownership** via `Box<Self>`, moving the
+  extension into its own task. After this, the engine can
+  only reach it through the control channel.
+- **`ControlChannel`** wraps a receiver for
+  `ExtensionControlMsg` (shutdown, config
+  updates, telemetry collection). No pipeline data ever
+  flows through it.
+- **`EffectHandler`** provides node identity and metrics
+  reporting. Extensions manage their own timers directly
+  (e.g., `tokio::time`) rather than through the engine's
+  timer infrastructure.
+- **`extension_capabilities()`** defaults to empty. Extensions
+  that publish capabilities override it (typically via the
+  `extension_capabilities!` macro) to return a
+  `Vec<CapabilityRegistration>`. During pipeline build, the
+  engine calls this method on each extension and inserts
+  the returned registrations into the `CapabilityRegistry`
+  under the extension's configured name. Pure background
+  tasks leave the default and never appear in the
+  registry.
+- The only difference between local and shared is the
+  `Send` bound: local uses `#[async_trait(?Send)]` (futures
+  can be `!Send`), shared uses `#[async_trait]` (futures
+  must be `Send`). This allows `!Send` optimizations in a
+  local extension's own code paths, as long as they do not
+  cross into the capability traits it implements (which
+  must be `Send`).
+
+### ExtensionWrapper
+
+Engine-internal adapter (`engine/src/extension.rs`) that
+wraps a local or shared extension into a single type the
+engine can manage uniformly. It is a **non-generic** enum:
+
+```rust
+pub enum ExtensionWrapper {
+    Local { /* local::Extension impl + channels */ },
+    Shared { /* shared::Extension impl + channels */ },
+}
+```
+
+Each variant holds:
+
+- The boxed extension instance
+- A `ControlSender` / `ControlReceiver` pair for
+  `ExtensionControlMsg`
+- The extension's `NodeId`, user config, and runtime config
+- An optional `NodeTelemetryGuard` for lifecycle cleanup
+
+Responsibilities:
+
+- **Construction** -- `ExtensionWrapper::local()` and
+  `::shared()` create the control channel and box the
+  extension.
+- **Trait registration** -- `register_capabilities()` calls
+  the extension's `extension_capabilities()` and inserts the
+  results into the `CapabilityRegistry` under the
+  extension's name.
+- **Control sender** -- `extension_control_sender()`
+  produces an `ExtensionControlSender` that the engine
+  stores separately for shutdown orchestration.
+- **Start** -- `start()` takes ownership, constructs the
+  appropriate `ControlChannel` and `EffectHandler`, and
+  calls the extension's `start()` method. No
+  `PipelineCtrlMsgSender` is passed -- extensions are
+  fully PData-free.
+- **Telemetry** -- implements `TelemetryWrapped` for
+  control-channel metrics and node telemetry guards.
+
+`ExtensionWrapper` does **not** implement `Node` or
+`Controllable` -- extensions are not data-path nodes.
+
+### ExtensionControlMsg
+
+Defined in `engine/src/control.rs`. A PData-free subset of
+`NodeControlMsg` -- extensions never process pipeline data,
+so they have no `Ack`, `Nack`, or `DelayedData` variants.
+
+```rust
+#[derive(Debug, Clone)]
+pub enum ExtensionControlMsg {
+    Config { config: serde_json::Value },
+    CollectTelemetry {
+        metrics_reporter: MetricsReporter,
+    },
+    Shutdown { deadline: Instant, reason: String },
+}
+```
+
+Each variant:
+
+- **`Config`** -- notifies the extension of a configuration
+  change (hot reload).
+- **`CollectTelemetry`** -- asks the extension to flush its
+  local metrics into the provided `MetricsReporter`.
+- **`Shutdown`** -- requests graceful shutdown with a
+  deadline and human-readable reason.
+
+Extensions manage their own periodic timers directly via
+`tokio::time` rather than receiving timer ticks from
+the engine. This keeps the control channel lean and
+avoids unnecessary wakeups.
+
+These messages flow through a dedicated channel per
+extension (created by `ExtensionWrapper`), kept separate
+from the pipeline's `PipelineCtrlMsgSender` to avoid
+blocking graceful shutdown (see Key Design Decision #3).
+
+`ExtensionControlSender` wraps the sender side of this
+channel and is stored by the engine's
+`PipelineCtrlMsgManager` for shutdown orchestration.
+
+### CapabilityRegistry and CapabilityRegistration
+
+Defined in `engine/src/extension/registry.rs`.
+
+**`CapabilityRegistration`** is a self-contained record produced
+by the `extension_capabilities!` macro. Each registration carries:
+
+- A cloned copy of the concrete extension value
+  (type-erased via `Box`)
+- A monomorphised `coerce` function pointer that knows how
+  to clone the concrete value and wrap it as
+  `Box<dyn Trait>`
+- The `TypeId` of `Box<dyn Trait>` for lookup
+
+**`CapabilityRegistry`** stores these registrations and
+serves lookups. It is `Clone + Send + Default`.
+
+```rust
+// Keyed by (extension_name, TypeId::of::<Box<dyn Trait>>())
+#[derive(Default, Clone)]
+pub struct CapabilityRegistry {
+    handles: HashMap<(String, TypeId), RegistryEntry>,
+}
+```
+
+The lookup API:
+
+```rust
+// Consumer retrieves a trait object by name:
+let provider: Box<dyn BearerTokenProvider> = registry
+    .get::<dyn BearerTokenProvider>("azure_auth")?;
+```
+
+How it works end-to-end:
+
+1. The `extension_capabilities!` macro clones the extension
+   instance once per trait, pairs each clone with a
+   monomorphised `coerce` fn, and returns
+   `Vec<CapabilityRegistration>`.
+2. During pipeline build, the engine calls
+   `ExtensionWrapper::register_capabilities()`, which calls
+   `extension_capabilities()` on the extension and inserts the
+   registrations into the registry under the extension's
+   configured name.
+3. The registry is passed to each receiver/exporter/processor
+   factory as the last parameter of `create()`. Components
+   resolve their capabilities at construction time.
+4. `get::<dyn Trait>(name)` looks up the entry by
+   `(name, TypeId::of::<Box<dyn Trait>>())`, invokes the
+   stored `coerce` fn to produce a fresh
+   `Box<dyn Trait>`, and returns it.
+
+A single extension can implement multiple capabilities,
+exposing different interfaces through granular traits:
+
+```rust
+extension_capabilities!(BearerTokenProvider, HealthCheck);
+```
+
+This is useful for extensibility and version management --
+an extension can implement both `TraitA` and `TraitAv2`
+simultaneously, letting consumers migrate at their own
+pace while the extension supports both versions.
+
+Error discrimination:
+
+- **`NotFound`** -- no extension registered under that name.
+- **`TraitNotImplemented`** -- extension exists but doesn't
+  expose the requested trait.
+
+### Sealed Capabilities and the `extension_capabilities!` Macro
+
+**Sealed capabilities** -- The `ExtensionCapability` marker trait
+(`engine/src/extension/registry.rs`) restricts which capability
+types can be stored in the registry. It uses a sealed
+pattern:
+
+```rust
+pub(crate) mod private {
+    pub trait Sealed {}
+}
+pub trait ExtensionCapability: private::Sealed {}
+```
+
+Each extension trait file self-registers:
+
+```rust
+// In bearer_token_provider.rs:
+impl private::Sealed for dyn BearerTokenProvider {}
+impl ExtensionCapability for dyn BearerTokenProvider {}
+```
+
+Because `Sealed` is `pub(crate)`, external crates can
+*implement* existing extension capabilities but cannot define
+new capability types -- keeping the set of extension capabilities
+well-defined and documented within the engine crate.
+
+**`extension_capabilities!` macro** -- A convenience macro that
+extension writers use inside their `impl Extension` block
+to wire up capability registration:
+
+```rust
+#[async_trait(?Send)]
+impl Extension for MyExtension {
+    extension_capabilities!(BearerTokenProvider);
+
+    async fn start(...) { ... }
+}
+```
+
+The macro handles the boilerplate that would otherwise
+be error-prone:
+
+- Verifies at **compile time** that each listed trait
+  implements `ExtensionCapability` (sealed), catching attempts
+  to register unsupported capabilities.
+- Verifies the concrete type implements each listed trait
+  plus `Clone + Send + 'static`.
+- Creates monomorphised `coerce` function pointers for
+  type-safe downcasting -- these are the `fn` pointers
+  stored in `CapabilityRegistration` that the registry uses
+  to produce `Box<dyn Trait>` on lookup.
+
+Without the macro, extension writers would need to
+manually construct `CapabilityRegistration` values with the
+correct `TypeId` and coerce functions -- a process that
+is both tedious and easy to get wrong.
+
+The macro's `Clone` requirement is intentional -- it
+signals to extension developers that their type will be
+cloned during registration (and again on each registry
+`get()` call). This encourages holding internal state
+behind `Arc` so that clones are cheap (just a reference
+count bump) and all clones observe the same underlying
+state.
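
The registration mechanics described above (a type-erased value, a monomorphised `coerce` function pointer, and a `(name, TypeId)` key) can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the engine's implementation: `Greeter`, `MyExt`, `Registration`, `Registry`, and `insert_greeter` are hypothetical names, and the real `extension_capabilities!` macro and `CapabilityRegistry` may differ in detail.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

// Hypothetical capability trait standing in for e.g. BearerTokenProvider.
trait Greeter: Send {
    fn greet(&self) -> String;
}

// Hypothetical extension: Clone, with its shared state behind Arc.
#[derive(Clone)]
struct MyExt {
    calls: Arc<AtomicU64>,
}

impl Greeter for MyExt {
    fn greet(&self) -> String {
        let n = self.calls.fetch_add(1, Ordering::Relaxed) + 1;
        format!("hello #{n}")
    }
}

// One registration: the type-erased concrete value plus a fn pointer that
// knows how to clone it and wrap the clone as Box<dyn Trait>.
struct Registration {
    value: Box<dyn Any + Send>,
    coerce: fn(&(dyn Any + Send)) -> Box<dyn Any>,
}

// Monomorphised per concrete type E; returns a Box<dyn Greeter> erased as Any.
fn coerce_greeter<E: Greeter + Clone + 'static>(v: &(dyn Any + Send)) -> Box<dyn Any> {
    let concrete = v.downcast_ref::<E>().expect("type recorded at registration");
    let boxed: Box<dyn Greeter> = Box::new(concrete.clone());
    Box::new(boxed)
}

#[derive(Default)]
struct Registry {
    entries: HashMap<(String, TypeId), Registration>,
}

impl Registry {
    // What a registration macro would generate for one listed trait.
    fn insert_greeter<E: Greeter + Clone + 'static>(&mut self, name: &str, ext: &E) {
        let key = (name.to_string(), TypeId::of::<Box<dyn Greeter>>());
        let reg = Registration {
            value: Box::new(ext.clone()),
            coerce: coerce_greeter::<E>,
        };
        self.entries.insert(key, reg);
    }

    // Look up by (name, TypeId of Box<T>) and produce a fresh boxed clone.
    fn get<T: ?Sized + 'static>(&self, name: &str) -> Option<Box<T>> {
        let key = (name.to_string(), TypeId::of::<Box<T>>());
        let reg = self.entries.get(&key)?;
        let erased = (reg.coerce)(&*reg.value);
        erased.downcast::<Box<T>>().ok().map(|b| *b)
    }
}

fn main() {
    let ext = MyExt { calls: Arc::new(AtomicU64::new(0)) };
    let mut registry = Registry::default();
    registry.insert_greeter("auth", &ext);

    // Two independent lookups yield two boxes that share one counter.
    let a = registry.get::<dyn Greeter>("auth").expect("registered");
    let b = registry.get::<dyn Greeter>("auth").expect("registered");
    println!("{} / {}", a.greet(), b.greet());
    assert_eq!(ext.calls.load(Ordering::Relaxed), 2);
    assert!(registry.get::<dyn Greeter>("missing").is_none());
}
```

Each `get()` call clones the stored value, so the cost of a lookup is the cost of `Clone` on the extension struct. With all shared state behind `Arc`, that is a handful of reference-count increments.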
+
+#### Design Alternative: `Arc<dyn Trait>` vs Boxed Clone
+
+An alternative design would have the registry store
+`Arc<dyn Trait>` directly, giving true single-instance
+sharing via reference-count increments. However, an
+`Arc<T>` is only `Send` when `T` is also `Sync` -- which
+conflicts with the engine's architecture where neither
+local nor shared components require `Sync`. By storing
+boxed clones with a `Send`-only requirement, the registry
+works naturally with both local and shared components.
+Extension authors get the same cheap-clone semantics in
+practice by wrapping their internal state in `Arc`, but
+without imposing `Sync` at the trait boundary.
+
+#### Why Extension Capabilities Are `Send`-Only
+
+Extension capabilities (e.g., `BearerTokenProvider`) require
+`Send` but not `Sync`. There is no `!Send` variant of
+extension traits -- unlike the `Extension` lifecycle trait
+which has local/shared variants. This simplifies
+extension implementation: a single trait implementation
+works for both local and shared consumers.
+
+Supporting additional boundary types is possible but
+adds complexity at multiple levels:
+
+- **`Send + Sync`** could be supported by adding an
+  `Arc`-based storage bucket to the existing registry
+  (no separate registry needed). But it introduces a
+  decision point for every new trait: `Send`-only or
+  `Send + Sync`?
+- **`!Send`** cannot coexist in the same registry -- any
+  `!Send` value (e.g., `Rc<T>`) poisons the registry's
+  `Send` bound, making it unusable by shared components.
+  A separate local-only registry or a split view would
+  be required.
+- **Extension writers** would need to reason about which
+  boundary their trait belongs to, likely requiring
+  different macros or marker types to select the right
+  storage path.
+
+`Send`-only avoids all of this: one storage mechanism,
+one macro, one mental model -- and it covers all current
+use cases.
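
The `Send`-only boundary can be checked at compile time. The sketch below is illustrative only: `Counter`, `SharedCounter`, `assert_send`, and `bump_on_worker` are hypothetical names, not engine types. It shows one trait implementation serving both a same-thread consumer and a consumer that moves its handle to another thread, with neither requiring `Sync` on the trait object.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical Send-only capability trait (no Sync bound).
trait Counter: Send {
    fn bump(&self) -> u64;
}

// Hypothetical extension struct; shared state lives behind Arc.
#[derive(Clone)]
struct SharedCounter {
    hits: Arc<AtomicU64>,
}

impl Counter for SharedCounter {
    fn bump(&self) -> u64 {
        self.hits.fetch_add(1, Ordering::Relaxed) + 1
    }
}

// Compiles only if T is Send; used as a static assertion.
fn assert_send<T: Send + ?Sized>() {}

// A "shared" consumer: moves its own boxed handle onto a worker thread.
fn bump_on_worker(handle: Box<dyn Counter>) -> u64 {
    thread::spawn(move || handle.bump())
        .join()
        .expect("worker panicked")
}

fn main() {
    // The boxed trait object is Send because the trait requires Send.
    assert_send::<Box<dyn Counter>>();
    // assert_send::<std::rc::Rc<u64>>(); // would not compile: Rc is !Send

    let ext = SharedCounter { hits: Arc::new(AtomicU64::new(0)) };
    let local: Box<dyn Counter> = Box::new(ext.clone());
    let shared: Box<dyn Counter> = Box::new(ext.clone());

    assert_eq!(local.bump(), 1); // used on the current thread
    assert_eq!(bump_on_worker(shared), 2); // moved to another thread
    assert_eq!(ext.hits.load(Ordering::Relaxed), 2);
}
```

Each consumer owns its own box, so nothing is ever shared by reference across threads; that is why `Sync` is not needed at the trait boundary.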
+
+#### Returning `Sync` Values from `Send`-Only Traits
+
+Some consumers need `Send + Sync` values -- for example,
+tonic interceptors must be `Clone + Send + Sync`. The
+current design handles this without requiring `Sync` on
+the trait object itself: the extension trait stays
+`Send`-only, but its methods can return `Send + Sync`
+values:
+
+```rust
+#[async_trait]
+pub trait InterceptorProvider: Send {
+    fn interceptor(&self)
+        -> Arc;
+}
+```
+
+The registry stores `Box<dyn InterceptorProvider>`
+(`Send`, not `Sync`). The consumer calls
+`.interceptor()` and gets back an
+`Arc` that it can share
+across threads freely. The `Sync` requirement stays on
+the returned value, not on the trait object or the
+extension struct.
+
+In practice the extension writer simply holds the
+interceptor in an `Arc` field (which they already need
+for cheap clones), so the implementation is trivial and
+adds no friction.
+
+### BearerTokenProvider
+
+The first concrete extension trait, defined in
+`engine/src/extension/bearer_token_provider.rs`. It
+provides authentication tokens to consumers:
+
+```rust
+#[async_trait]
+pub trait BearerTokenProvider: Send {
+    async fn get_token(&self)
+        -> Result;
+
+    fn subscribe_token_refresh(&self)
+        -> watch::Receiver<Option<BearerToken>>;
+}
+```
+
+- **`get_token()`** returns a `BearerToken` containing
+  a `Secret`-wrapped token value and a UNIX-timestamp
+  expiry. `Secret` redacts the value in `Debug` output
+  to prevent accidental credential leakage in logs.
+- **`subscribe_token_refresh()`** returns a
+  `tokio::sync::watch::Receiver` for reactive
+  notification when tokens are refreshed -- consumers
+  can update HTTP headers in a `tokio::select!` branch
+  without polling.
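
The redaction behaviour described for `Secret` can be illustrated with a hand-written `Debug` impl. This is a sketch under assumptions: the types below (`Secret`, `BearerTokenSketch`) and their methods (`expose`, `authorization_header`, `is_expired`) are hypothetical stand-ins, and the engine's actual `Secret` and `BearerToken` types may have a different shape and different output text.

```rust
use std::fmt;

// Hypothetical wrapper: holds a credential and hides it from Debug output.
struct Secret(String);

impl Secret {
    fn new(value: impl Into<String>) -> Self {
        Secret(value.into())
    }

    // Explicit accessor, so reads of the raw value are easy to audit.
    fn expose(&self) -> &str {
        &self.0
    }
}

impl fmt::Debug for Secret {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Never print the inner value.
        f.write_str("Secret([REDACTED])")
    }
}

// Hypothetical token type: a secret value plus a UNIX-timestamp expiry.
#[derive(Debug)]
struct BearerTokenSketch {
    token: Secret,
    expires_on: i64,
}

impl BearerTokenSketch {
    fn new(token: impl Into<String>, expires_on: i64) -> Self {
        Self { token: Secret::new(token), expires_on }
    }

    // Builds the value of an HTTP Authorization header.
    fn authorization_header(&self) -> String {
        format!("Bearer {}", self.token.expose())
    }

    fn is_expired(&self, now_unix: i64) -> bool {
        now_unix >= self.expires_on
    }
}

fn main() {
    let token = BearerTokenSketch::new("s3cr3t", 1_700_000_000);
    let rendered = format!("{token:?}");
    // The derived Debug on the outer struct delegates to Secret's impl,
    // so the raw value never reaches the log line.
    assert!(!rendered.contains("s3cr3t"));
    assert!(rendered.contains("[REDACTED]"));
    assert_eq!(token.authorization_header(), "Bearer s3cr3t");
    assert!(!token.is_expired(1_600_000_000));
    println!("{rendered}");
}
```

Because the outer struct only derives `Debug`, any struct that embeds the token inherits the redaction without extra work.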
+
+This trait demonstrates the typical extension capability
+pattern:
+
+- `Send`-only (no `Sync` required)
+- Self-registers as a sealed `ExtensionCapability` via the
+  two-line `impl Sealed` / `impl ExtensionCapability` pattern
+- Consumers look it up by name:
+  `registry.get::<dyn BearerTokenProvider>("auth")`
+
+### Adding a New Extension Capability
+
+This walkthrough uses `BearerTokenProvider` as the real example.
+
+**1. Define the trait** in a new file under
+`engine/src/extension/`
+(`bearer_token_provider.rs`):
+
+```rust
+use async_trait::async_trait;
+
+#[async_trait]
+pub trait BearerTokenProvider: Send {
+    async fn get_token(&self)
+        -> Result;
+
+    fn subscribe_token_refresh(&self)
+        -> watch::Receiver<Option<BearerToken>>;
+}
+```
+
+**2. Seal it** in the same file -- these two lines
+register the trait for use with the registry:
+
+```rust
+impl super::registry::private::Sealed
+    for dyn BearerTokenProvider {}
+impl super::registry::ExtensionCapability
+    for dyn BearerTokenProvider {}
+```
+
+**3. Export the module** in `engine/src/extension.rs`:
+
+```rust
+pub mod bearer_token_provider;
+```
+
+That's it for the engine side. The capability is now usable
+in extension implementations and registry lookups.
+
+### Implementing an Extension
+
+This walkthrough uses the Azure Identity Auth Extension
+(`contrib-nodes/src/extensions/
+azure_identity_auth_extension/`) as the real example.
+
+**1. Define a `Clone` struct** with shared state behind
+`Arc`:
+
+```rust
+#[derive(Clone)]
+pub struct AzureIdentityAuthExtension {
+    credential: Arc<dyn TokenCredential>,
+    scope: String,
+    method: AuthMethod,
+    token_sender:
+        Arc<watch::Sender<Option<BearerToken>>>,
+}
+```
+
+All shared state is behind `Arc` -- cloning is cheap and all
+clones observe the same token broadcast channel.
+
+**2. Implement the extension trait** on the struct:
+
+```rust
+#[async_trait]
+impl BearerTokenProvider
+    for AzureIdentityAuthExtension
+{
+    async fn get_token(&self)
+        -> Result
+    {
+        let access_token =
+            self.get_token_with_retry().await?;
+        Ok(BearerToken::new(
+            access_token.token.secret().to_string(),
+            access_token.expires_on.unix_timestamp(),
+        ))
+    }
+
+    fn subscribe_token_refresh(&self)
+        -> watch::Receiver<Option<BearerToken>>
+    {
+        self.token_sender.subscribe()
+    }
+}
+```
+
+**3. Implement the `Extension` lifecycle trait** with
+`extension_capabilities!` to wire up registration:
+
+```rust
+#[async_trait(?Send)]
+impl Extension for AzureIdentityAuthExtension {
+    extension_capabilities!(BearerTokenProvider);
+
+    async fn start(
+        self: Box<Self>,
+        mut ctrl_chan: ControlChannel,
+        effect_handler: EffectHandler,
+    ) -> Result {
+        // proactive token refresh loop via
+        // tokio::select!, broadcasting new tokens
+        // to all subscribers via watch::Sender
+    }
+}
+```
+
+**4. Register the factory** via `distributed_slice` so
+the engine discovers it automatically:
+
+```rust
+#[distributed_slice(OTAP_EXTENSION_FACTORIES)]
+pub static AZURE_IDENTITY_AUTH_EXTENSION:
+    ExtensionFactory = ExtensionFactory {
+    name: AZURE_IDENTITY_AUTH_EXTENSION_URN,
+    create: |_ctx, node, node_config, ext_config| {
+        let cfg: Config = serde_json::from_value(
+            node_config.config.clone(),
+        )?;
+        cfg.validate()?;
+        let ext =
+            AzureIdentityAuthExtension::new(cfg)?;
+        Ok(ExtensionWrapper::local(
+            ext, node, node_config, ext_config,
+        ))
+    },
+    validate_config:
+        validate_typed_config::<Config>,
+};
+```
+
+### Using an Extension
+
+**1.
Configure it** in the pipeline YAML -- extensions
+are siblings to `nodes`, not inside them:
+
+```yaml
+groups:
+  default:
+    pipelines:
+      main:
+        extensions:
+          azure-auth:
+            type: "urn:microsoft:extension:azure_identity_auth"
+            config:
+              method: "managedidentity"
+              client_id: "your-client-id"
+              scope: "https://monitor.azure.com/.default"
+
+        nodes:
+          azure-monitor-exporter:
+            type: "urn:microsoft:exporter:azure_monitor"
+            config:
+              auth: "azure-auth"  # reference by name
+```
+
+The extension supports two auth methods:
+
+- **`managed_identity`** -- system or user-assigned
+  managed identity (production).
+- **`development`** -- Azure CLI / Developer CLI
+  credentials (local development).
+
+**2. Look up in the factory** and subscribe to token
+refreshes in `start()`:
+
+```rust
+// In the factory create() closure:
+let auth = capability_registry
+    .get::<dyn BearerTokenProvider>(
+        &cfg.auth,
+    )?;
+// Pass auth to the exporter constructor:
+let exporter = AzureMonitorExporter::new(
+    pipeline_ctx, cfg, auth,
+)?;
+
+// In start(), subscribe to the stored auth field:
+let mut token_rx =
+    self.auth.subscribe_token_refresh();
+token_rx.wait_for(|t| t.is_some()).await?;
+
+// In the event loop:
+tokio::select! {
+    _ = token_rx.changed() => {
+        if let Some(token) =
+            token_rx.borrow_and_update().as_ref()
+        {
+            client_pool.update_auth(
+                bearer_header(token),
+            );
+        }
+    }
+    // ... other branches
+}
+```
+
+The exporter's config holds the extension name as a
+string. The factory receives the `CapabilityRegistry`
+and resolves the auth extension at construction time.
+The exporter stores the resulting `Box<dyn BearerTokenProvider>`
+as a field, using it directly in `start()` without
+any registry lookup.
+
+This pattern eliminated ~380 lines of duplicated auth
+code from the Azure Monitor exporter, replacing it with
+a ~10-line registry lookup and reactive subscription.
+
+## Pipeline Lifecycle
+
+How extensions integrate into the pipeline's build,
+start, steady-state, and shutdown phases.
+
+```text
+1. Config parsing
+   +- Extensions parsed from the `extensions`
+   |  section (sibling to `nodes`)
+   +- NodeKind::Extension recognized in node_urn
+   +- Placing an extension URN in `nodes` is
+      rejected with ExtensionInNodesSection error
+
+2. Pipeline build (PipelineFactory)
+   +- Create extensions FIRST from the
+   |  `extensions` section
+   +- register_capabilities() -- collect
+   |  CapabilityRegistration from each extension,
+   |  insert into CapabilityRegistry
+   +- Create data-path nodes (receivers,
+   |  processors, exporters) -- each factory
+   |  receives &CapabilityRegistry as last param
+   +- Telemetry setup (channel metrics, node
+      telemetry guards)
+
+3. Pipeline start (RuntimePipeline::run)
+   +- Spawn extension tasks FIRST
+   +- Spawn exporter tasks
+   +- Spawn processor tasks
+   +- Spawn receiver tasks
+
+4. Steady state
+   +- Extensions run their event loops (e.g.,
+   |  token refresh)
+   +- Data-path components use registry lookups
+   |  as needed
+   +- ExtensionControlMsg flows normally
+      (config, telemetry)
+
+5. Shutdown
+   +- Data-path nodes receive Shutdown and drain
+   +- Pipeline control channel closes after all
+   |  data-path nodes finish
+   +- PipelineCtrlMsgManager::shutdown_extensions()
+   |  sends ExtensionControlMsg::Shutdown to all
+   |  extensions with a 5-second deadline
+   +- Extensions terminate AFTER data-path is
+      fully drained
+```
+
+**Why start-first?** Extensions provide capabilities
+that data-path nodes depend on during initialization.
+If an exporter needs a token at startup, the extension
+must already be running and ready.
+
+**Why shutdown-last?** Extensions provide capabilities
+that data-path nodes depend on during graceful shutdown.
+If exporters are flushing final batches, they may still
+need valid credentials. Shutting down extensions first
+would cause those final exports to fail.
+
+**Why separate control senders?** Extension control
+senders (`Vec<ExtensionControlSender>`) are stored
+separately from data-path `ControlSenders`.
+This is because extensions use `ExtensionControlMsg` +(PData-free) rather than `NodeControlMsg`, and +keeping them separate ensures the pipeline control +channel can close naturally when all data-path senders +are dropped -- without extensions holding it open. diff --git a/rust/otap-dataflow/Cargo.toml b/rust/otap-dataflow/Cargo.toml index 97ae9ddda3..b797f37d64 100644 --- a/rust/otap-dataflow/Cargo.toml +++ b/rust/otap-dataflow/Cargo.toml @@ -221,6 +221,9 @@ experimental-tls = ["otap-df-otap/experimental-tls", "dep:rustls"] contrib-exporters = ["otap-df-contrib-nodes/contrib-exporters"] geneva-exporter = ["otap-df-contrib-nodes/geneva-exporter"] azure-monitor-exporter = ["otap-df-contrib-nodes/azure-monitor-exporter"] +# Contrib extensions (opt-in) - now in contrib-nodes +contrib-extensions = ["otap-df-contrib-nodes/contrib-extensions"] +azure-identity-auth-extension = ["otap-df-contrib-nodes/azure-identity-auth-extension"] # Contrib processors (opt-in) - now in contrib-nodes contrib-processors = ["otap-df-contrib-nodes/contrib-processors"] condense-attributes-processor = ["otap-df-contrib-nodes/condense-attributes-processor"] @@ -284,6 +287,14 @@ module_name_repetitions = "allow" broken_intra_doc_links = "deny" missing_crate_level_docs = "deny" +# Optimize third-party dependencies even in debug builds. +# Without this, pure-Rust implementations of compression (miniz_oxide/flate2), +# serialization (serde_json), and protobuf decoding (prost) run at opt-level 0, +# which can be 100-500x slower than optimized code — making debug builds +# unusable for any real workload testing. 
+[profile.dev.package."*"] +opt-level = 2 + [profile.release] debug = "line-tables-only" # minimum required for profiling diff --git a/rust/otap-dataflow/crates/config/src/node.rs b/rust/otap-dataflow/crates/config/src/node.rs index 3e96a83d80..0864419709 100644 --- a/rust/otap-dataflow/crates/config/src/node.rs +++ b/rust/otap-dataflow/crates/config/src/node.rs @@ -88,6 +88,8 @@ pub enum NodeKind { Processor, /// A sink of signals Exporter, + /// A provider of shared capabilities (e.g., auth, service discovery). + Extension, // ToDo(LQ) : Add more node kinds as needed. // A connector between two pipelines @@ -102,6 +104,7 @@ impl From for Cow<'static, str> { NodeKind::Receiver => "receiver".into(), NodeKind::Processor => "processor".into(), NodeKind::Exporter => "exporter".into(), + NodeKind::Extension => "extension".into(), NodeKind::ProcessorChain => "processor_chain".into(), } } diff --git a/rust/otap-dataflow/crates/config/src/node_urn.rs b/rust/otap-dataflow/crates/config/src/node_urn.rs index ed39f870aa..aac39739f1 100644 --- a/rust/otap-dataflow/crates/config/src/node_urn.rs +++ b/rust/otap-dataflow/crates/config/src/node_urn.rs @@ -208,6 +208,7 @@ const fn kind_suffix(expected_kind: NodeKind) -> &'static str { NodeKind::Receiver => "receiver", NodeKind::Processor | NodeKind::ProcessorChain => "processor", NodeKind::Exporter => "exporter", + NodeKind::Extension => "extension", } } @@ -228,9 +229,12 @@ fn parse_kind(raw: &str, kind: &str) -> Result { "receiver" => Ok(NodeKind::Receiver), "processor" => Ok(NodeKind::Processor), "exporter" => Ok(NodeKind::Exporter), + "extension" => Ok(NodeKind::Extension), _ => Err(invalid_plugin_urn( raw, - format!("expected kind `receiver`, `processor`, or `exporter`, found `{kind}`"), + format!( + "expected kind `receiver`, `processor`, `exporter`, or `extension`, found `{kind}`" + ), )), } } diff --git a/rust/otap-dataflow/crates/config/src/pipeline.rs b/rust/otap-dataflow/crates/config/src/pipeline.rs index 
c6efedd3a9..3736b523ff 100644 --- a/rust/otap-dataflow/crates/config/src/pipeline.rs +++ b/rust/otap-dataflow/crates/config/src/pipeline.rs @@ -41,14 +41,26 @@ pub struct PipelineConfig { #[serde(default, skip_serializing_if = "Option::is_none")] policies: Option, - /// All nodes in this pipeline, keyed by node ID. + /// All data-path nodes in this pipeline, keyed by node ID. + /// + /// This includes receivers, processors, and exporters — but NOT extensions. + /// Extensions are configured in the sibling `extensions` section. #[serde(default)] nodes: PipelineNodes, + /// Pipeline extensions, keyed by extension ID. + /// + /// Extensions are long-lived components that run alongside the pipeline and + /// expose functionality (e.g., authentication, service discovery) to other + /// components. Unlike nodes, extensions do NOT participate in data-path + /// connections. + #[serde(default, skip_serializing_if = "PipelineNodes::is_empty")] + extensions: PipelineNodes, + /// Explicit graph connections between nodes. /// /// When provided, these connections are used as the authoritative topology for - /// the main pipeline graph. + /// the main pipeline graph. Extensions are not part of connections. #[serde(default, skip_serializing_if = "Vec::is_empty")] connections: Vec, } @@ -474,17 +486,28 @@ impl PipelineConfig { self.policies.as_ref() } - /// Returns a reference to the main pipeline nodes. + /// Returns a reference to the main pipeline nodes (receivers, processors, exporters). #[must_use] pub const fn nodes(&self) -> &PipelineNodes { &self.nodes } - /// Returns an iterator visiting all nodes in the pipeline. + /// Returns a reference to the pipeline extensions. + #[must_use] + pub const fn extensions(&self) -> &PipelineNodes { + &self.extensions + } + + /// Returns an iterator visiting all data-path nodes in the pipeline. pub fn node_iter(&self) -> impl Iterator)> { self.nodes.iter() } + /// Returns an iterator visiting all extension nodes in the pipeline. 
+    pub fn extension_iter(&self) -> impl Iterator)> {
+        self.extensions.iter()
+    }
+
     /// Returns true if the pipeline graph is defined with top-level connections.
     #[must_use]
     pub fn has_connections(&self) -> bool {
@@ -496,11 +519,16 @@ impl PipelineConfig {
         self.connections.iter()
     }
 
-    /// Creates a consuming iterator over the nodes in the pipeline.
+    /// Creates a consuming iterator over the data-path nodes in the pipeline.
     pub fn node_into_iter(self) -> impl Iterator)> {
         self.nodes.into_iter()
     }
 
+    /// Creates a consuming iterator over the extensions in the pipeline.
+    pub fn extension_into_iter(self) -> impl Iterator)> {
+        self.extensions.into_iter()
+    }
+
     /// Remove unconnected nodes from the main pipeline graph and return removed node descriptors.
     ///
     /// Connectivity is defined by top-level `connections`:
@@ -526,6 +554,8 @@ impl PipelineConfig {
                     !has_incoming || !has_outgoing
                 }
                 NodeKind::Exporter => !has_incoming,
+                // Extensions are in a separate section and never appear in `nodes`.
+                NodeKind::Extension => false,
             };
 
             if should_remove {
@@ -590,17 +620,20 @@ impl PipelineConfig {
             r#type: PipelineType::Otap,
             policies,
             nodes,
+            extensions: PipelineNodes::default(),
             connections,
         }
     }
 
-    /// Normalize plugin URNs for pipeline nodes.
+    /// Normalize plugin URNs for pipeline nodes and extensions.
     fn canonicalize_plugin_urns(
         &mut self,
         pipeline_group_id: &PipelineGroupId,
         pipeline_id: &PipelineId,
     ) -> Result<(), Error> {
         self.nodes
+            .canonicalize_plugin_urns(pipeline_group_id, pipeline_id)?;
+        self.extensions
             .canonicalize_plugin_urns(pipeline_group_id, pipeline_id)
     }
 
@@ -878,6 +911,7 @@ fn prune_connection(
 pub struct PipelineConfigBuilder {
     description: Option,
     nodes: HashMap,
+    extensions: HashMap,
     duplicate_nodes: Vec,
     pending_connections: Vec,
 }
@@ -896,6 +930,7 @@ impl PipelineConfigBuilder {
         Self {
             description: None,
             nodes: HashMap::new(),
+            extensions: HashMap::new(),
             duplicate_nodes: Vec::new(),
             pending_connections: Vec::new(),
         }
@@ -966,6 +1001,33 @@ impl PipelineConfigBuilder {
         self.add_node(id, node_type, config)
     }
 
+    /// Add an extension (configured as a sibling to nodes, not as a node).
+    pub fn add_extension, U: Into>(
+        mut self,
+        id: S,
+        node_type: U,
+        config: Option,
+    ) -> Self {
+        let id = id.into();
+        let node_type = node_type.into();
+        if self.extensions.contains_key(&id) || self.nodes.contains_key(&id) {
+            self.duplicate_nodes.push(id.clone());
+        } else {
+            _ = self.extensions.insert(
+                id.clone(),
+                NodeUserConfig {
+                    r#type: node_type,
+                    description: None,
+                    entity: None,
+                    outputs: Vec::new(),
+                    default_output: None,
+                    config: config.unwrap_or(Value::Null),
+                },
+            );
+        }
+        self
+    }
+
     /// Connects a source node output port to one or more target nodes
     /// with a given dispatch policy.
     pub fn connect(
@@ -1166,6 +1228,11 @@ impl PipelineConfigBuilder {
                 .into_iter()
                 .map(|(id, node)| (id, Arc::new(node)))
                 .collect(),
+            extensions: self
+                .extensions
+                .into_iter()
+                .map(|(id, node)| (id, Arc::new(node)))
+                .collect(),
             connections: built_connections,
             policies: None,
             r#type: pipeline_type,
diff --git a/rust/otap-dataflow/crates/contrib-nodes/Cargo.toml b/rust/otap-dataflow/crates/contrib-nodes/Cargo.toml
index 5feadea6fe..dc4563a9b8 100644
--- a/rust/otap-dataflow/crates/contrib-nodes/Cargo.toml
+++ b/rust/otap-dataflow/crates/contrib-nodes/Cargo.toml
@@ -54,6 +54,14 @@ contrib-exporters = [
     "geneva-exporter",
     "azure-monitor-exporter",
 ]
+contrib-extensions = [
+    "azure-identity-auth-extension",
+]
+azure-identity-auth-extension = [
+    "dep:azure_identity",
+    "dep:azure_core",
+    "dep:rand",
+]
 geneva-exporter = [
     "dep:geneva-uploader",
     "dep:opentelemetry-proto",
diff --git a/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/auth.rs b/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/auth.rs
deleted file mode 100644
index 62d3170001..0000000000
--- a/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/auth.rs
+++ /dev/null
@@ -1,368 +0,0 @@
-// Copyright The OpenTelemetry Authors
-// SPDX-License-Identifier: Apache-2.0
-
-use azure_core::credentials::{AccessToken, TokenCredential};
-use azure_identity::{
-    DeveloperToolsCredential, DeveloperToolsCredentialOptions, ManagedIdentityCredential,
-    ManagedIdentityCredentialOptions, UserAssignedId,
-};
-use otap_df_telemetry::{otel_debug, otel_warn};
-use std::sync::Arc;
-
-use super::Error;
-use super::config::{AuthConfig, AuthMethod};
-use super::metrics::AzureMonitorExporterMetricsRc;
-
-/// Minimum delay between token refresh retry attempts in seconds.
-const MIN_RETRY_DELAY_SECS: f64 = 5.0;
-/// Maximum delay between token refresh retry attempts in seconds.
-const MAX_RETRY_DELAY_SECS: f64 = 30.0; -/// Maximum jitter percentage (±10%) to add to retry delays. -const MAX_RETRY_JITTER_RATIO: f64 = 0.10; - -#[derive(Clone, Debug)] -// TODO - Consolidate with crates/otap/src/{cloud_auth,object_store)/azure.rs -pub struct Auth { - credential: Arc, - scope: String, - metrics: AzureMonitorExporterMetricsRc, -} - -impl Auth { - pub fn new( - auth_config: &AuthConfig, - metrics: AzureMonitorExporterMetricsRc, - ) -> Result { - let credential = Self::create_credential(auth_config)?; - - Ok(Self { - credential, - scope: auth_config.scope.clone(), - metrics, - }) - } - - #[cfg(test)] - pub fn from_credential( - credential: Arc, - scope: String, - metrics: AzureMonitorExporterMetricsRc, - ) -> Self { - Self { - credential, - scope, - metrics, - } - } - - async fn get_token_internal(&self) -> Result { - let token_response = self - .credential - .get_token( - &[&self.scope], - Some(azure_core::credentials::TokenRequestOptions::default()), - ) - .await - .map_err(Error::token_acquisition)?; - - Ok(token_response) - } - - pub async fn get_token(&mut self) -> Result { - let mut attempt = 0_i32; - let start = tokio::time::Instant::now(); - loop { - attempt += 1; - - match self.get_token_internal().await { - Ok(token) => { - otel_debug!("azure_monitor_exporter.auth.get_token_succeeded", expires_on = %token.expires_on); - let mut m = self.metrics.borrow_mut(); - m.add_auth_success_latency(start.elapsed().as_millis() as f64); - return Ok(token); - } - Err(e) => { - otel_warn!("azure_monitor_exporter.auth.get_token_failed", attempt = attempt, error = %e); - self.metrics.borrow_mut().add_auth_failure(); - } - } - - // Calculate exponential backoff: 5s, 10s, 20s, 30s (capped) - let base_delay_secs = MIN_RETRY_DELAY_SECS * 2.0_f64.powi(attempt - 1); - let capped_delay_secs = base_delay_secs.min(MAX_RETRY_DELAY_SECS); - - // Add jitter: random value between -10% and +10% of the delay - let jitter_range = capped_delay_secs * 
MAX_RETRY_JITTER_RATIO; - let jitter = if jitter_range > 0.0 { - let random_factor = rand::random::() * 2.0 - 1.0; - random_factor * jitter_range - } else { - 0.0 - }; - - let delay_secs = (capped_delay_secs + jitter).max(1.0); - let delay = tokio::time::Duration::from_secs_f64(delay_secs); - - otel_warn!( - "azure_monitor_exporter.auth.retry_scheduled", - delay_secs = %delay_secs - ); - tokio::time::sleep(delay).await; - } - } - - fn create_credential(auth_config: &AuthConfig) -> Result, Error> { - match auth_config.method { - AuthMethod::ManagedIdentity => { - let mut options = ManagedIdentityCredentialOptions::default(); - - if let Some(client_id) = &auth_config.client_id { - options.user_assigned_id = Some(UserAssignedId::ClientId(client_id.clone())); - } - - Ok(ManagedIdentityCredential::new(Some(options)) - .map_err(|e| Error::create_credential(AuthMethod::ManagedIdentity, e))?) - } - AuthMethod::Development => Ok(DeveloperToolsCredential::new(Some( - DeveloperToolsCredentialOptions::default(), - )) - .map_err(|e| Error::create_credential(AuthMethod::Development, e))?), - } - } -} - -#[cfg(test)] -mod tests { - use super::super::metrics::{AzureMonitorExporterMetrics, AzureMonitorExporterMetricsTracker}; - use super::*; - use azure_core::credentials::TokenRequestOptions; - use azure_core::time::OffsetDateTime; - use otap_df_telemetry::registry::TelemetryRegistryHandle; - use otap_df_telemetry::testing::EmptyAttributes; - use std::cell::RefCell; - use std::rc::Rc; - use std::sync::atomic::{AtomicUsize, Ordering}; - - fn create_test_metrics() -> AzureMonitorExporterMetricsRc { - let registry = TelemetryRegistryHandle::new(); - let metric_set = - registry.register_metric_set::(EmptyAttributes()); - Rc::new(RefCell::new(AzureMonitorExporterMetricsTracker::new( - metric_set, - ))) - } - - #[derive(Debug)] - struct MockCredential { - token: String, - expires_in: azure_core::time::Duration, - call_count: Arc, - } - - fn make_mock_credential( - token: &str, - 
expires_in: azure_core::time::Duration, - call_count: Arc, - ) -> Arc { - let cred: Arc = Arc::new(MockCredential { - token: token.to_string(), - expires_in, - call_count, - }); - cred - } - - #[async_trait::async_trait] - impl TokenCredential for MockCredential { - async fn get_token( - &self, - _scopes: &[&str], - _options: Option>, - ) -> azure_core::Result { - let _ = self.call_count.fetch_add(1, Ordering::SeqCst); - - Ok(AccessToken { - token: self.token.clone().into(), - expires_on: OffsetDateTime::now_utc() + self.expires_in, - }) - } - } - - // ==================== Construction Tests ==================== - - #[tokio::test] - async fn test_from_credential_creates_auth() { - let credential = make_mock_credential( - "test_token", - azure_core::time::Duration::minutes(60), - Arc::new(AtomicUsize::new(0)), - ); - - let auth = - Auth::from_credential(credential, "test_scope".to_string(), create_test_metrics()); - assert_eq!(auth.scope, "test_scope"); - } - - #[tokio::test] - async fn test_new_with_managed_identity_user_assigned() { - let auth_config = AuthConfig { - method: AuthMethod::ManagedIdentity, - client_id: Some("test-client-id".to_string()), - scope: "https://test.scope".to_string(), - }; - - let auth = Auth::new(&auth_config, create_test_metrics()); - assert!(auth.is_ok()); - let auth = auth.unwrap(); - assert_eq!(auth.scope, "https://test.scope"); - } - - #[tokio::test] - async fn test_new_with_managed_identity_system_assigned() { - let auth_config = AuthConfig { - method: AuthMethod::ManagedIdentity, - client_id: None, - scope: "https://test.scope".to_string(), - }; - - let auth = Auth::new(&auth_config, create_test_metrics()); - assert!(auth.is_ok()); - } - - #[tokio::test] - async fn test_new_with_development_auth() { - let auth_config = AuthConfig { - method: AuthMethod::Development, - client_id: None, - scope: "https://test.scope".to_string(), - }; - - // May fail if Azure CLI not installed - both outcomes are valid - let result = 
Auth::new(&auth_config, create_test_metrics()); - match result { - Ok(auth) => assert_eq!(auth.scope, "https://test.scope"), - Err(Error::Auth { - kind: super::super::error::AuthErrorKind::CreateCredential { method }, - .. - }) => { - assert_eq!(method, AuthMethod::Development); - } - Err(err) => panic!("Unexpected error type: {:?}", err), - } - } - - // ==================== Token Fetching Tests ==================== - - #[tokio::test] - async fn test_get_token_internal_returns_valid_token() { - let call_count = Arc::new(AtomicUsize::new(0)); - let credential = make_mock_credential( - "test_token", - azure_core::time::Duration::minutes(60), - call_count.clone(), - ); - - let auth = Auth::from_credential(credential, "scope".to_string(), create_test_metrics()); - - let token = auth.get_token_internal().await.unwrap(); - assert_eq!(token.token.secret(), "test_token"); - assert_eq!(call_count.load(Ordering::SeqCst), 1); - } - - #[tokio::test] - async fn test_get_token_internal_calls_credential_each_time() { - let call_count = Arc::new(AtomicUsize::new(0)); - let credential = make_mock_credential( - "test_token", - azure_core::time::Duration::minutes(60), - call_count.clone(), - ); - - let auth = Auth::from_credential(credential, "scope".to_string(), create_test_metrics()); - - // Each call to get_token_internal should call the credential - let _ = auth.get_token_internal().await.unwrap(); - assert_eq!(call_count.load(Ordering::SeqCst), 1); - - let _ = auth.get_token_internal().await.unwrap(); - assert_eq!(call_count.load(Ordering::SeqCst), 2); - - let _ = auth.get_token_internal().await.unwrap(); - assert_eq!(call_count.load(Ordering::SeqCst), 3); - } - - #[tokio::test] - async fn test_get_token_internal_returns_cloned_tokens() { - let credential = make_mock_credential( - "test_token", - azure_core::time::Duration::minutes(60), - Arc::new(AtomicUsize::new(0)), - ); - - let auth = Auth::from_credential(credential, "scope".to_string(), create_test_metrics()); - - let 
token1 = auth.get_token_internal().await.unwrap(); - let token2 = auth.get_token_internal().await.unwrap(); - - // Same value from both calls - assert_eq!(token1.token.secret(), token2.token.secret()); - } - - // ==================== Error Handling Tests ==================== - - #[tokio::test] - async fn test_get_token_internal_propagates_credential_error() { - #[derive(Debug)] - struct FailingCredential; - - #[async_trait::async_trait] - impl TokenCredential for FailingCredential { - async fn get_token( - &self, - _scopes: &[&str], - _options: Option>, - ) -> azure_core::Result { - Err(azure_core::error::Error::new( - azure_core::error::ErrorKind::Credential, - "Mock credential failure", - )) - } - } - - let cred = FailingCredential; - let credential: Arc = Arc::new(cred); - let auth = Auth::from_credential(credential, "scope".to_string(), create_test_metrics()); - - let result = auth.get_token_internal().await; - assert!(result.is_err()); - match result.unwrap_err() { - Error::Auth { - kind: super::super::error::AuthErrorKind::TokenAcquisition, - .. 
- } => {} - err => panic!("Expected Auth token acquisition error, got: {:?}", err), - } - } - - // ==================== Clone Behavior Tests ==================== - - #[tokio::test] - async fn test_cloned_auth_shares_credential() { - let call_count = Arc::new(AtomicUsize::new(0)); - let credential = make_mock_credential( - "test_token", - azure_core::time::Duration::minutes(60), - call_count.clone(), - ); - - let auth1 = Auth::from_credential(credential, "scope".to_string(), create_test_metrics()); - let auth2 = auth1.clone(); - - // Both auth instances share the same credential - let _ = auth1.get_token_internal().await.unwrap(); - assert_eq!(call_count.load(Ordering::SeqCst), 1); - - let _ = auth2.get_token_internal().await.unwrap(); - assert_eq!(call_count.load(Ordering::SeqCst), 2); - } -} diff --git a/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/config.rs b/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/config.rs index 74d3b51efc..b27195e8eb 100644 --- a/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/config.rs +++ b/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/config.rs @@ -13,70 +13,11 @@ pub struct Config { /// API configuration for Azure Monitor pub api: ApiConfig, - /// Authentication configuration + /// Name of the authentication extension to use for token acquisition. + /// This should match the name of an Azure Identity Auth Extension configured + /// in the pipeline. 
#[serde(default)] - pub auth: AuthConfig, -} - -/// Authentication method for Azure -#[derive(Debug, Deserialize, Clone, PartialEq, Default)] -#[serde(rename_all = "lowercase")] -pub enum AuthMethod { - /// Use Managed Identity (system or user-assigned with client_id) - #[serde(alias = "msi", alias = "managed_identity")] - #[default] - ManagedIdentity, - - /// Use developer tools (Azure CLI, Azure Developer CLI) - #[serde(alias = "dev", alias = "developer", alias = "cli")] - Development, -} - -/// Authentication configuration for Azure -#[derive(Debug, Deserialize, Clone)] -pub struct AuthConfig { - /// Authentication method to use - #[serde(default)] - pub method: AuthMethod, - - /// Client ID for user-assigned managed identity (optional) - /// Only used when method is ManagedIdentity - /// If not provided with ManagedIdentity, system-assigned identity will be used - pub client_id: Option, - - /// OAuth scope for token acquisition (defaults to "https://monitor.azure.com/.default") - #[serde(default = "default_scope")] - pub scope: String, -} - -impl AuthConfig { - /// Returns a human-readable name for the configured authentication method. 
- pub fn auth_method_name(&self) -> &'static str { - match self.method { - AuthMethod::ManagedIdentity => { - if self.client_id.is_some() { - "user_assigned_managed_identity" - } else { - "system_assigned_managed_identity" - } - } - AuthMethod::Development => "developer_tools", - } - } -} - -impl Default for AuthConfig { - fn default() -> Self { - Self { - method: AuthMethod::default(), - client_id: None, - scope: default_scope(), - } - } -} - -fn default_scope() -> String { - "https://monitor.azure.com/.default".to_string() + pub auth: String, } /// API configuration for connecting to Azure Monitor @@ -115,13 +56,6 @@ pub struct SchemaConfig { impl Config { /// Validate the configuration pub fn validate(&self) -> Result<(), Error> { - // Validate auth configuration - if self.auth.scope.is_empty() { - return Err(Error::Config( - "Invalid configuration: auth scope must be non-empty".to_string(), - )); - } - // Validate API configuration if self.api.dcr_endpoint.is_empty() { return Err(Error::Config( @@ -210,11 +144,7 @@ mod tests { dcr: "mydcr".to_string(), schema: SchemaConfig::default(), }, - auth: AuthConfig { - scope: "https://monitor.azure.com/.default".to_string(), - client_id: Some("myclientid".to_string()), - method: AuthMethod::ManagedIdentity, - }, + auth: "azure_identity_auth".to_string(), }; assert!(config.validate().is_ok()); @@ -229,7 +159,7 @@ mod tests { dcr: "".to_string(), schema: SchemaConfig::default(), }, - auth: AuthConfig::default(), + auth: String::new(), }; let result = config.validate(); @@ -257,7 +187,7 @@ mod tests { ]), }, }, - auth: AuthConfig::default(), + auth: "azure_identity_auth".to_string(), }; let result = config.validate(); @@ -297,7 +227,7 @@ mod tests { ]), }, }, - auth: AuthConfig::default(), + auth: "azure_identity_auth".to_string(), }; let result = config.validate(); @@ -326,7 +256,7 @@ mod tests { )]), }, }, - auth: AuthConfig::default(), + auth: "azure_identity_auth".to_string(), }; let result = config.validate(); diff 
--git a/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/error.rs b/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/error.rs index 936c363985..a054451f74 100644 --- a/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/error.rs +++ b/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/error.rs @@ -1,7 +1,6 @@ // Copyright The OpenTelemetry Authors // SPDX-License-Identifier: Apache-2.0 -use super::config::AuthMethod; use http::StatusCode; use http::header::InvalidHeaderValue; @@ -124,7 +123,7 @@ pub enum Error { /// Failed to create auth handler. #[error("Failed to create auth handler")] - AuthHandlerCreation(#[source] Box), + AuthHandlerCreation(#[source] Box), /// Client pool initialization failed. #[error("Client pool initialization failed")] @@ -146,15 +145,9 @@ pub enum Error { }, } -/// Authentication error classification. +/// Authentication error classification for HTTP-level auth errors. #[derive(Debug, Clone)] pub enum AuthErrorKind { - /// Failed to create credential (during setup). - CreateCredential { method: AuthMethod }, - /// Failed to acquire token. - TokenAcquisition, - /// Token refresh failed during retry. - TokenRefresh, /// Server returned 401. Unauthorized, /// Server returned 403. @@ -164,9 +157,6 @@ pub enum AuthErrorKind { impl std::fmt::Display for AuthErrorKind { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { match self { - Self::CreateCredential { method } => write!(f, "create credential: {method:?}"), - Self::TokenAcquisition => write!(f, "token acquisition"), - Self::TokenRefresh => write!(f, "token refresh"), Self::Unauthorized => write!(f, "unauthorized"), Self::Forbidden => write!(f, "forbidden"), } @@ -242,26 +232,6 @@ impl Error { Self::Network { kind, source } } - /// Creates a credential creation error. 
- #[must_use] - pub fn create_credential(method: AuthMethod, source: azure_core::error::Error) -> Self { - Self::Auth { - kind: AuthErrorKind::CreateCredential { method }, - source: Some(source), - body: None, - } - } - - /// Creates a token acquisition error. - #[must_use] - pub fn token_acquisition(source: azure_core::error::Error) -> Self { - Self::Auth { - kind: AuthErrorKind::TokenAcquisition, - source: Some(source), - body: None, - } - } - /// Creates an unauthorized (401) error. #[must_use] pub fn unauthorized(body: String) -> Self { @@ -328,34 +298,6 @@ mod tests { // ==================== Auth Error Tests ==================== - #[test] - fn test_auth_create_credential_message() { - let azure_error = azure_core::error::Error::with_message( - azure_core::error::ErrorKind::Credential, - "managed identity not available", - ); - let error = Error::create_credential(AuthMethod::ManagedIdentity, azure_error); - assert_eq!( - error.to_string(), - "Auth error (create credential: ManagedIdentity): managed identity not available" - ); - assert!(error.source().is_some()); - } - - #[test] - fn test_auth_token_acquisition_message() { - let azure_error = azure_core::error::Error::with_message( - azure_core::error::ErrorKind::Credential, - "token expired", - ); - let error = Error::token_acquisition(azure_error); - assert_eq!( - error.to_string(), - "Auth error (token acquisition): token expired" - ); - assert!(error.source().is_some()); - } - #[test] fn test_auth_unauthorized_message() { let error = Error::unauthorized("invalid token".to_string()); @@ -459,13 +401,6 @@ mod tests { } .is_retryable() ); - assert!( - !Error::token_acquisition(azure_core::error::Error::with_message( - azure_core::error::ErrorKind::Credential, - "test" - )) - .is_retryable() - ); } // ==================== Display Tests ==================== @@ -481,18 +416,6 @@ mod tests { #[test] fn test_auth_error_kind_display() { - assert_eq!( - AuthErrorKind::CreateCredential { - method: 
AuthMethod::ManagedIdentity - } - .to_string(), - "create credential: ManagedIdentity" - ); - assert_eq!( - AuthErrorKind::TokenAcquisition.to_string(), - "token acquisition" - ); - assert_eq!(AuthErrorKind::TokenRefresh.to_string(), "token refresh"); assert_eq!(AuthErrorKind::Unauthorized.to_string(), "unauthorized"); assert_eq!(AuthErrorKind::Forbidden.to_string(), "forbidden"); } diff --git a/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/exporter.rs b/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/exporter.rs index 74863ef7f9..f56ef5aa49 100644 --- a/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/exporter.rs +++ b/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/exporter.rs @@ -2,13 +2,13 @@ // SPDX-License-Identifier: Apache-2.0 use async_trait::async_trait; -use azure_core::credentials::AccessToken; use otap_df_channel::error::RecvError; use otap_df_config::SignalType; use otap_df_engine::ConsumerEffectHandlerExtension; use otap_df_engine::context::PipelineContext; use otap_df_engine::control::{AckMsg, NackMsg, NodeControlMsg}; use otap_df_engine::error::Error as EngineError; +use otap_df_engine::extension::bearer_token_provider::BearerTokenProvider; use otap_df_engine::local::exporter::{EffectHandler, Exporter}; use otap_df_engine::message::{Message, MessageChannel}; use otap_df_engine::terminal_state::TerminalState; @@ -18,7 +18,6 @@ use otap_df_pdata::views::otlp::bytes::logs::RawLogsData; use otap_df_pdata::{OtapArrowRecords, OtapPayload}; use otap_df_pdata_views::views::logs::LogsDataView; -use super::auth::Auth; use super::client::LogsIngestionClientPool; use super::config::Config; use super::error::Error; @@ -40,16 +39,11 @@ use std::rc::Rc; const MAX_IN_FLIGHT_EXPORTS: usize = 16; const PERIODIC_EXPORT_INTERVAL: u64 = 3; const HEARTBEAT_INTERVAL_SECONDS: u64 = 60; -/// Minimum interval between token refresh attempts (10 seconds). 
-const MIN_TOKEN_REFRESH_INTERVAL_SECS: u64 = 10; -/// Buffer time before token expiry to trigger a refresh. -/// Azure Identity SDK caches tokens internally and won't issue a new token -/// until ~5 minutes before expiry, so we schedule refresh at 295 seconds before expiry. -const TOKEN_EXPIRY_BUFFER_SECS: u64 = 295; /// Azure Monitor exporter. pub struct AzureMonitorExporter { config: Config, + auth: Box, transformer: Transformer, gzip_batcher: GzipBatcher, state: AzureMonitorExporterState, @@ -62,7 +56,11 @@ pub struct AzureMonitorExporter { impl AzureMonitorExporter { /// Build a new exporter from configuration. - pub fn new(pipeline_ctx: PipelineContext, config: Config) -> Result { + pub fn new( + pipeline_ctx: PipelineContext, + config: Config, + auth: Box, + ) -> Result { // Validate configuration config .validate() @@ -85,6 +83,7 @@ impl AzureMonitorExporter { Ok(Self { config, + auth, transformer, gzip_batcher, state: AzureMonitorExporterState::new(), @@ -356,25 +355,6 @@ impl AzureMonitorExporter { Ok(()) } - #[inline] - fn get_next_token_refresh(token: AccessToken) -> tokio::time::Instant { - let now = azure_core::time::OffsetDateTime::now_utc(); - let duration_remaining = if token.expires_on > now { - (token.expires_on - now).unsigned_abs() - } else { - std::time::Duration::ZERO - }; - - let token_valid_until = tokio::time::Instant::now() + duration_remaining; - let next_token_refresh = - token_valid_until - tokio::time::Duration::from_secs(TOKEN_EXPIRY_BUFFER_SECS); - std::cmp::max( - next_token_refresh, - tokio::time::Instant::now() - + tokio::time::Duration::from_secs(MIN_TOKEN_REFRESH_INTERVAL_SECS), - ) - } - async fn handle_message( &mut self, effect_handler: &EffectHandler, @@ -468,16 +448,13 @@ impl Exporter for AzureMonitorExporter { endpoint = self.config.api.dcr_endpoint.as_str(), stream = self.config.api.stream_name.as_str(), dcr = self.config.api.dcr.as_str(), - auth_method = self.config.auth.auth_method_name() + auth_extension = 
self.config.auth.as_str() ); let mut msg_id = 0; - let mut auth = Auth::new(&self.config.auth, self.metrics.clone()).map_err(|e| { - let error = Error::AuthHandlerCreation(Box::new(e)); - EngineError::InternalError { - message: error.to_string(), - } - })?; + + // Subscribe to token refresh events from the auth extension (resolved at factory time) + let mut token_rx = self.auth.subscribe_token_refresh(); self.client_pool .initialize(&self.config.api) @@ -489,6 +466,27 @@ impl Exporter for AzureMonitorExporter { } })?; + // Wait for the initial token — blocks until the auth extension provides one + otel_info!("azure_monitor_exporter.auth.waiting_for_initial_token"); + let _ = + token_rx + .wait_for(|t| t.is_some()) + .await + .map_err(|_| EngineError::InternalError { + message: "Auth extension closed before providing a token".to_string(), + })?; + + // Set the initial token on the client pool and heartbeat + if let Some(token) = token_rx.borrow().as_ref() { + let header = HeaderValue::from_str(&format!("Bearer {}", token.token.secret())) + .map_err(|e| EngineError::InternalError { + message: format!("Failed to create auth header: {e:?}"), + })?; + self.client_pool.update_auth(header.clone()); + self.heartbeat.update_auth(header); + otel_info!("azure_monitor_exporter.auth.initial_token_set"); + } + // Start periodic telemetry collection and retain the cancel handle for graceful shutdown let telemetry_timer_cancel_handle = effect_handler .start_periodic_telemetry(std::time::Duration::from_secs(1)) @@ -497,7 +495,6 @@ impl Exporter for AzureMonitorExporter { message: format!("Failed to start telemetry timer: {e}"), })?; - let mut next_token_refresh = tokio::time::Instant::now(); let mut next_periodic_export = tokio::time::Instant::now() + tokio::time::Duration::from_secs(PERIODIC_EXPORT_INTERVAL); let mut next_heartbeat_send = tokio::time::Instant::now(); @@ -509,37 +506,18 @@ impl Exporter for AzureMonitorExporter { tokio::select! 
{ biased; - _ = tokio::time::sleep_until(next_token_refresh) => { - match auth.get_token().await { - Ok(access_token) => { - match HeaderValue::from_str(&format!("Bearer {}", access_token.token.secret())) { - Ok(header) => { - self.client_pool.update_auth(header.clone()); - self.heartbeat.update_auth(header.clone()); - - // Schedule next token refresh - next_token_refresh = Self::get_next_token_refresh(access_token); - - let refresh_in = next_token_refresh.saturating_duration_since(tokio::time::Instant::now()); - let total_secs = refresh_in.as_secs(); - let hours = total_secs / 3600; - let minutes = (total_secs % 3600) / 60; - let seconds = total_secs % 60; - - otel_info!("azure_monitor_exporter.auth.token_refresh", refresh_in = format!("{}h {}m {}s", hours, minutes, seconds)); - } - Err(e) => { - otel_error!("azure_monitor_exporter.auth.header_creation_failed", error = ?e); - // Retry every 10 seconds - next_token_refresh = tokio::time::Instant::now() + tokio::time::Duration::from_secs(10); - } + // React to token refresh events from the auth extension + _ = token_rx.changed() => { + if let Some(token) = token_rx.borrow_and_update().as_ref() { + match HeaderValue::from_str(&format!("Bearer {}", token.token.secret())) { + Ok(header) => { + self.client_pool.update_auth(header.clone()); + self.heartbeat.update_auth(header); + otel_info!("azure_monitor_exporter.auth.token_refreshed"); + } + Err(e) => { + otel_error!("azure_monitor_exporter.auth.header_creation_failed", error = ?e); } - - } - Err(e) => { - otel_error!("azure_monitor_exporter.auth.token_refresh_failed", error = ?e); - // Retry every 10 seconds - next_token_refresh = tokio::time::Instant::now() + tokio::time::Duration::from_secs(10); } } } @@ -622,12 +600,12 @@ impl Exporter for AzureMonitorExporter { #[cfg(test)] mod tests { - use super::super::config::{ApiConfig, AuthConfig, SchemaConfig}; + use super::super::config::{ApiConfig, SchemaConfig}; use super::*; - use azure_core::time::OffsetDateTime; use 
bytes::Bytes; use http::StatusCode; use otap_df_engine::context::{ControllerContext, PipelineContext}; + use otap_df_engine::extension::bearer_token_provider::BearerToken; use otap_df_engine::local::exporter::EffectHandler; use otap_df_engine::node::NodeId; use otap_df_otap::pdata::Context; @@ -635,6 +613,35 @@ mod tests { use otap_df_telemetry::reporter::MetricsReporter; use std::collections::HashMap; + /// A no-op BearerTokenProvider for unit tests that don't exercise auth. + struct MockTokenProvider { + token_tx: tokio::sync::watch::Sender>, + } + + impl MockTokenProvider { + fn new() -> Self { + let (token_tx, _) = tokio::sync::watch::channel(None); + Self { token_tx } + } + } + + #[async_trait] + impl BearerTokenProvider for MockTokenProvider { + async fn get_token( + &self, + ) -> Result { + Err("mock: no token available".into()) + } + + fn subscribe_token_refresh(&self) -> tokio::sync::watch::Receiver> { + self.token_tx.subscribe() + } + } + + fn create_mock_auth() -> Box { + Box::new(MockTokenProvider::new()) + } + fn create_test_pipeline_ctx() -> PipelineContext { let registry = TelemetryRegistryHandle::new(); let controller = ControllerContext::new(registry); @@ -653,7 +660,7 @@ mod tests { log_record_mapping: HashMap::new(), }, }, - auth: AuthConfig::default(), + auth: "azure_identity_auth".to_string(), } } @@ -661,39 +668,15 @@ mod tests { fn test_new_validates_config() { let config = create_test_config(); let pipeline_ctx = create_test_pipeline_ctx(); - let _ = AzureMonitorExporter::new(pipeline_ctx, config).unwrap(); - } - - #[test] - fn test_get_next_token_refresh_logic() { - let now = OffsetDateTime::now_utc(); - let expires_on = now + azure_core::time::Duration::seconds(3600); - - let token = AccessToken { - token: "secret".into(), - expires_on, - }; - - let refresh_at = AzureMonitorExporter::get_next_token_refresh(token); - let duration_until_refresh = refresh_at.duration_since(tokio::time::Instant::now()); - - // Should be 3600 - 295 = 3305 
seconds before refresh - // Allow some delta for execution time - let expected = 3305.0; - let actual = duration_until_refresh.as_secs_f64(); - assert!( - (actual - expected).abs() < 5.0, - "Expected ~{}, got {}", - expected, - actual - ); + let _ = AzureMonitorExporter::new(pipeline_ctx, config, create_mock_auth()).unwrap(); } #[tokio::test] async fn test_handle_export_success() { let config = create_test_config(); let pipeline_ctx = create_test_pipeline_ctx(); - let mut exporter = AzureMonitorExporter::new(pipeline_ctx, config).unwrap(); + let mut exporter = + AzureMonitorExporter::new(pipeline_ctx, config, create_mock_auth()).unwrap(); let (_, reporter) = MetricsReporter::create_new_and_receiver(10); let node_id = NodeId { @@ -739,7 +722,8 @@ mod tests { async fn test_handle_export_failure() { let config = create_test_config(); let pipeline_ctx = create_test_pipeline_ctx(); - let mut exporter = AzureMonitorExporter::new(pipeline_ctx, config).unwrap(); + let mut exporter = + AzureMonitorExporter::new(pipeline_ctx, config, create_mock_auth()).unwrap(); let (_, reporter) = MetricsReporter::create_new_and_receiver(10); let node_id = NodeId { diff --git a/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/mod.rs b/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/mod.rs index 0cddc2ce43..220e47ba6a 100644 --- a/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/mod.rs +++ b/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/mod.rs @@ -18,7 +18,6 @@ use std::sync::Arc; use otap_df_otap::OTAP_EXPORTER_FACTORIES; use otap_df_otap::pdata::OtapPdata; -mod auth; mod client; mod config; mod error; @@ -48,28 +47,39 @@ pub const AZURE_MONITOR_EXPORTER_URN: &str = "urn:microsoft:exporter:azure_monit #[distributed_slice(OTAP_EXPORTER_FACTORIES)] pub static AZURE_MONITOR_EXPORTER: ExporterFactory = ExporterFactory { name: AZURE_MONITOR_EXPORTER_URN, - create: |pipeline_ctx: 
PipelineContext, - node: NodeId, - node_config: Arc, - exporter_config: &ExporterConfig| { - // Deserialize user config JSON into typed Config - let cfg: Config = serde_json::from_value(node_config.config.clone()).map_err(|e| { - otap_df_config::error::Error::InvalidUserConfig { - error: e.to_string(), - } - })?; - - Ok(ExporterWrapper::local( - AzureMonitorExporter::new(pipeline_ctx, cfg).map_err(|e| { + create: + |pipeline_ctx: PipelineContext, + node: NodeId, + node_config: Arc, + exporter_config: &ExporterConfig, + capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| { + // Deserialize user config JSON into typed Config + let cfg: Config = serde_json::from_value(node_config.config.clone()).map_err(|e| { otap_df_config::error::Error::InvalidUserConfig { error: e.to_string(), } - })?, - node, - node_config, - exporter_config, - )) - }, + })?; + + // Resolve the auth extension at factory time + let auth = capability_registry + .get::( + &cfg.auth, + ) + .map_err(|e| otap_df_config::error::Error::InvalidUserConfig { + error: format!("auth extension '{}' not found: {e}", cfg.auth), + })?; + + Ok(ExporterWrapper::local( + AzureMonitorExporter::new(pipeline_ctx, cfg, auth).map_err(|e| { + otap_df_config::error::Error::InvalidUserConfig { + error: e.to_string(), + } + })?, + node, + node_config, + exporter_config, + )) + }, wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED, validate_config: otap_df_config::validation::validate_typed_config::, }; diff --git a/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/otlp-ame.yaml b/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/otlp-ame.yaml index b8f3dac8c5..9389305723 100644 --- a/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/otlp-ame.yaml +++ b/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/otlp-ame.yaml @@ -24,6 +24,23 @@ groups: default: pipelines: 
main: + extensions: + azure-auth: + type: "urn:microsoft:extension:azure_identity_auth" + config: + # Use managed identity for AKS workloads + method: "managedidentity" # or "msi" + + # For user-assigned managed identity (common in AKS) + # Get this from your AKS managed identity configuration + client_id: "YOUR-USER-ASSIGNED-IDENTITY-CLIENT-ID" + + # For system-assigned managed identity (less common in AKS) + # Comment out the client_id field above + + # OAuth scope for Azure Monitor + scope: "https://monitor.azure.com/.default" + nodes: otlp-receiver: type: "urn:otel:receiver:otlp" @@ -70,20 +87,8 @@ groups: "exception.message": "ExceptionMessage" "user.id": "UserId" - # Authentication configuration for AKS - auth: - # Use managed identity for AKS workloads - method: "managedidentity" # or "msi" - - # For user-assigned managed identity (common in AKS) - # Get this from your AKS managed identity configuration - client_id: "YOUR-USER-ASSIGNED-IDENTITY-CLIENT-ID" - - # For system-assigned managed identity (less common in AKS) - # Comment out the client_id field above - - # OAuth scope for Azure Monitor - scope: "https://monitor.azure.com/.default" + # Reference the auth extension by name + auth: "azure-auth" connections: - from: otlp-receiver diff --git a/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/transformer.rs b/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/transformer.rs index 5c5ab2f9f2..0d87a87bb1 100644 --- a/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/transformer.rs +++ b/rust/otap-dataflow/crates/contrib-nodes/src/exporters/azure_monitor_exporter/transformer.rs @@ -411,7 +411,7 @@ mod tests { } fn create_test_config() -> Config { - use super::super::config::{ApiConfig, AuthConfig, SchemaConfig}; + use super::super::config::{ApiConfig, SchemaConfig}; Config { api: ApiConfig { @@ -431,7 +431,7 @@ mod tests { ]), }, }, - auth: AuthConfig::default(), + auth: 
"azure_identity_auth".to_string(), } } @@ -719,7 +719,7 @@ mod tests { #[test] fn test_empty_schema_mappings() { - use super::super::config::{ApiConfig, AuthConfig, SchemaConfig}; + use super::super::config::{ApiConfig, SchemaConfig}; let config = Config { api: ApiConfig { @@ -732,7 +732,7 @@ mod tests { log_record_mapping: HashMap::new(), }, }, - auth: AuthConfig::default(), + auth: "azure_identity_auth".to_string(), }; let transformer = Transformer::new(&config, create_test_metrics()); diff --git a/rust/otap-dataflow/crates/contrib-nodes/src/exporters/geneva_exporter/mod.rs b/rust/otap-dataflow/crates/contrib-nodes/src/exporters/geneva_exporter/mod.rs index b98baf9746..7ff4c3220e 100644 --- a/rust/otap-dataflow/crates/contrib-nodes/src/exporters/geneva_exporter/mod.rs +++ b/rust/otap-dataflow/crates/contrib-nodes/src/exporters/geneva_exporter/mod.rs @@ -480,17 +480,19 @@ impl GenevaExporter { #[distributed_slice(OTAP_EXPORTER_FACTORIES)] pub static GENEVA_EXPORTER: ExporterFactory = ExporterFactory { name: GENEVA_EXPORTER_URN, - create: |pipeline: PipelineContext, - node: NodeId, - node_config: Arc, - exporter_config: &ExporterConfig| { - Ok(ExporterWrapper::local( - GenevaExporter::from_config(pipeline, &node_config.config)?, - node, - node_config, - exporter_config, - )) - }, + create: + |pipeline: PipelineContext, + node: NodeId, + node_config: Arc, + exporter_config: &ExporterConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| { + Ok(ExporterWrapper::local( + GenevaExporter::from_config(pipeline, &node_config.config)?, + node, + node_config, + exporter_config, + )) + }, wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED, validate_config: otap_df_config::validation::validate_typed_config::, }; diff --git a/rust/otap-dataflow/crates/contrib-nodes/src/extensions/azure_identity_auth_extension/config.rs b/rust/otap-dataflow/crates/contrib-nodes/src/extensions/azure_identity_auth_extension/config.rs 
new file mode 100644 index 0000000000..ba1d6f961a --- /dev/null +++ b/rust/otap-dataflow/crates/contrib-nodes/src/extensions/azure_identity_auth_extension/config.rs @@ -0,0 +1,155 @@ +// Copyright The OpenTelemetry Authors +// SPDX-License-Identifier: Apache-2.0 + +//! Configuration types for the Azure Identity Auth Extension. + +use serde::Deserialize; + +use super::Error; + +/// Authentication method for Azure. +#[derive(Debug, Deserialize, Clone, PartialEq, Default)] +#[serde(rename_all = "lowercase")] +pub enum AuthMethod { + /// Use Managed Identity (system or user-assigned with client_id). + #[serde(alias = "msi", alias = "managed_identity")] + #[default] + ManagedIdentity, + + /// Use developer tools (Azure CLI, Azure Developer CLI). + #[serde(alias = "dev", alias = "developer", alias = "cli")] + Development, +} + +impl std::fmt::Display for AuthMethod { + fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { + match self { + AuthMethod::ManagedIdentity => write!(f, "managed_identity"), + AuthMethod::Development => write!(f, "development"), + } + } +} + +/// Configuration for the Azure Identity Auth Extension. +#[derive(Debug, Deserialize, Clone)] +#[serde(deny_unknown_fields)] +pub struct Config { + /// Authentication method to use. + #[serde(default)] + pub method: AuthMethod, + + /// Client ID for user-assigned managed identity (optional). + /// Only used when method is ManagedIdentity. + /// If not provided with ManagedIdentity, system-assigned identity will be used. + pub client_id: Option, + + /// OAuth scope for token acquisition. + /// Defaults to "https://management.azure.com/.default" for general Azure management. + #[serde(default = "default_scope")] + pub scope: String, +} + +impl Default for Config { + fn default() -> Self { + Self { + method: AuthMethod::default(), + client_id: None, + scope: default_scope(), + } + } +} + +impl Config { + /// Validate the configuration. 
+ pub fn validate(&self) -> Result<(), Error> { + if self.scope.is_empty() { + return Err(Error::Config("OAuth scope cannot be empty".to_string())); + } + + Ok(()) + } +} + +fn default_scope() -> String { + "https://management.azure.com/.default".to_string() +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_default_config() { + let config = Config::default(); + assert_eq!(config.method, AuthMethod::ManagedIdentity); + assert!(config.client_id.is_none()); + assert_eq!(config.scope, "https://management.azure.com/.default"); + } + + #[test] + fn test_auth_method_display() { + assert_eq!( + format!("{}", AuthMethod::ManagedIdentity), + "managed_identity" + ); + assert_eq!(format!("{}", AuthMethod::Development), "development"); + } + + #[test] + fn test_config_validation_empty_scope() { + let config = Config { + method: AuthMethod::ManagedIdentity, + client_id: None, + scope: "".to_string(), + }; + let result = config.validate(); + assert!(result.is_err()); + } + + #[test] + fn test_config_validation_valid() { + let config = Config::default(); + let result = config.validate(); + assert!(result.is_ok()); + } + + #[test] + fn test_deserialize_managed_identity_system_assigned() { + let json = r#"{ + "method": "managed_identity", + "scope": "https://monitor.azure.com/.default" + }"#; + let config: Config = serde_json::from_str(json).unwrap(); + assert_eq!(config.method, AuthMethod::ManagedIdentity); + assert!(config.client_id.is_none()); + } + + #[test] + fn test_deserialize_managed_identity_user_assigned() { + let json = r#"{ + "method": "msi", + "client_id": "12345-abcde", + "scope": "https://monitor.azure.com/.default" + }"#; + let config: Config = serde_json::from_str(json).unwrap(); + assert_eq!(config.method, AuthMethod::ManagedIdentity); + assert_eq!(config.client_id, Some("12345-abcde".to_string())); + } + + #[test] + fn test_deserialize_development() { + let json = r#"{ + "method": "development" + }"#; + let config: Config = 
serde_json::from_str(json).unwrap(); + assert_eq!(config.method, AuthMethod::Development); + } + + #[test] + fn test_deserialize_with_defaults() { + let json = r#"{}"#; + let config: Config = serde_json::from_str(json).unwrap(); + assert_eq!(config.method, AuthMethod::ManagedIdentity); + assert_eq!(config.scope, "https://management.azure.com/.default"); + } +} diff --git a/rust/otap-dataflow/crates/contrib-nodes/src/extensions/azure_identity_auth_extension/error.rs b/rust/otap-dataflow/crates/contrib-nodes/src/extensions/azure_identity_auth_extension/error.rs new file mode 100644 index 0000000000..f1315cb16e --- /dev/null +++ b/rust/otap-dataflow/crates/contrib-nodes/src/extensions/azure_identity_auth_extension/error.rs @@ -0,0 +1,133 @@ +// Copyright The OpenTelemetry Authors +// SPDX-License-Identifier: Apache-2.0 + +//! Error types for the Azure Identity Auth Extension. + +use super::config::AuthMethod; + +/// Error definitions for Azure Identity Auth Extension. +#[derive(thiserror::Error, Debug)] +pub enum Error { + // ==================== Configuration Errors ==================== + /// Error during configuration of a component. + #[error("Configuration error: {0}")] + Config(String), + + // ==================== Authentication Errors ==================== + /// Authentication/authorization error. + #[error("Auth error ({kind})")] + Auth { + /// The kind of authentication error. + kind: AuthErrorKind, + /// The underlying Azure error, if any. + #[source] + source: Option, + }, + + // ==================== Internal Errors ==================== + /// Shutdown requested. + #[error("Shutdown requested: {reason}")] + Shutdown { + /// The reason for shutdown. + reason: String, + }, +} + +/// Specific authentication error variants. +#[derive(Debug, Clone, PartialEq)] +pub enum AuthErrorKind { + /// Failed to create the credential provider. + CreateCredential { + /// The authentication method that failed. + method: AuthMethod, + }, + + /// Failed to acquire a token. 
+ TokenAcquisition, + + /// Token has expired and refresh failed. + TokenExpired, +} + +impl std::fmt::Display for AuthErrorKind { + fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { + match self { + AuthErrorKind::CreateCredential { method } => { + write!(f, "failed to create credential for method: {}", method) + } + AuthErrorKind::TokenAcquisition => write!(f, "failed to acquire token"), + AuthErrorKind::TokenExpired => write!(f, "token expired and refresh failed"), + } + } +} + +impl Error { + /// Creates a new credential creation error. + #[must_use] + pub fn create_credential(method: AuthMethod, source: azure_core::error::Error) -> Self { + Error::Auth { + kind: AuthErrorKind::CreateCredential { method }, + source: Some(source), + } + } + + /// Creates a new token acquisition error. + #[must_use] + pub fn token_acquisition(source: azure_core::error::Error) -> Self { + Error::Auth { + kind: AuthErrorKind::TokenAcquisition, + source: Some(source), + } + } + + /// Creates a new token expired error. + #[must_use] + pub fn token_expired() -> Self { + Error::Auth { + kind: AuthErrorKind::TokenExpired, + source: None, + } + } + + /// Creates a new shutdown error. 
+ #[must_use] + pub fn shutdown(reason: impl Into) -> Self { + Error::Shutdown { + reason: reason.into(), + } + } +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_config_error_display() { + let err = Error::Config("test error".to_string()); + assert_eq!(format!("{}", err), "Configuration error: test error"); + } + + #[test] + fn test_auth_error_kind_display() { + let kind = AuthErrorKind::CreateCredential { + method: AuthMethod::ManagedIdentity, + }; + assert!(format!("{}", kind).contains("managed_identity")); + + let kind = AuthErrorKind::TokenAcquisition; + assert_eq!(format!("{}", kind), "failed to acquire token"); + + let kind = AuthErrorKind::TokenExpired; + assert_eq!(format!("{}", kind), "token expired and refresh failed"); + } + + #[test] + fn test_shutdown_error() { + let err = Error::shutdown("test reason"); + match err { + Error::Shutdown { reason } => assert_eq!(reason, "test reason"), + _ => panic!("Expected Shutdown error"), + } + } +} diff --git a/rust/otap-dataflow/crates/contrib-nodes/src/extensions/azure_identity_auth_extension/extension.rs b/rust/otap-dataflow/crates/contrib-nodes/src/extensions/azure_identity_auth_extension/extension.rs new file mode 100644 index 0000000000..f4935e1cda --- /dev/null +++ b/rust/otap-dataflow/crates/contrib-nodes/src/extensions/azure_identity_auth_extension/extension.rs @@ -0,0 +1,659 @@ +// Copyright The OpenTelemetry Authors +// SPDX-License-Identifier: Apache-2.0 + +//! Azure Identity Auth Extension implementation. +//! +//! This extension provides Azure authentication services to the pipeline. +//! It manages Azure credentials and provides token acquisition capabilities +//! to consumers (e.g., exporters) via the [`BearerTokenProvider`] trait. +//! +//! # Architecture +//! +//! `AzureIdentityAuthExtension` is a single `Clone` struct that serves both +//! as the pipeline extension (implementing [`Extension`] and driving the token +//! 
refresh loop) and as the registry service (implementing [`BearerTokenProvider`]). +//! +//! Consumers retrieve it from the extension +//! registry via `registry.get::<dyn BearerTokenProvider>("name")`. +//! +//! State is shared through `Arc`: +//! - `Arc<dyn TokenCredential>`: the Azure credential provider +//! - `Arc<watch::Sender<Option<BearerToken>>>`: token broadcast channel + +use async_trait::async_trait; +use azure_core::credentials::{AccessToken, TokenCredential}; +use azure_identity::{ + DeveloperToolsCredential, DeveloperToolsCredentialOptions, ManagedIdentityCredential, + ManagedIdentityCredentialOptions, UserAssignedId, +}; +use otap_df_engine::extension::bearer_token_provider::{BearerToken, BearerTokenProvider}; +use otap_df_telemetry::{otel_debug, otel_error, otel_info, otel_warn}; +use std::sync::Arc; +use tokio::sync::watch; + +use otap_df_engine::control::ExtensionControlMsg; +use otap_df_engine::error::Error as EngineError; +use otap_df_engine::local::extension::{ControlChannel, EffectHandler, Extension}; +use otap_df_engine::terminal_state::TerminalState; + +use super::config::{AuthMethod, Config}; +use super::error::Error; + +/// Minimum delay between token refresh retry attempts in seconds. +const MIN_RETRY_DELAY_SECS: f64 = 5.0; +/// Maximum delay between token refresh retry attempts in seconds. +const MAX_RETRY_DELAY_SECS: f64 = 30.0; +/// Maximum jitter percentage (±10%) to add to retry delays. +const MAX_RETRY_JITTER_RATIO: f64 = 0.10; + +/// Buffer time before token expiry to trigger refresh (in seconds). +/// Tokens will be refreshed ~5 minutes before they expire. +const TOKEN_EXPIRY_BUFFER_SECS: u64 = 299; +/// Minimum interval between token refresh attempts (in seconds). +const MIN_TOKEN_REFRESH_INTERVAL_SECS: u64 = 10; +/// Retry interval when token refresh fails (in seconds). +const TOKEN_REFRESH_RETRY_SECS: u64 = 10; + +/// Azure Identity Auth Extension. 
+/// +/// This is a single `Clone` struct that serves as both the pipeline extension +/// (implementing [`Extension`] to drive the token refresh loop) and the registry +/// service (implementing [`BearerTokenProvider`]). +/// +/// Consumers retrieve this via `registry.get::<dyn BearerTokenProvider>("name")`. +/// Cheap to clone: all state is behind `Arc`. +#[derive(Clone)] +pub struct AzureIdentityAuthExtension { + /// The configured name of this extension instance (from YAML config key). + name: String, + /// The Azure credential provider. + credential: Arc<dyn TokenCredential>, + /// Human-readable description of the credential type created. + credential_type: &'static str, + /// The OAuth scope for token acquisition. + scope: String, + /// The authentication method being used. + method: AuthMethod, + /// Optional client ID for user-assigned managed identity. + client_id: Option<String>, + /// Sender for broadcasting / subscribing to token refresh events. + token_sender: Arc<watch::Sender<Option<BearerToken>>>, +} + +impl AzureIdentityAuthExtension { + /// Creates a new Azure Identity Auth Extension. + pub fn new(name: String, config: Config) -> Result<Self, Error> { + let (credential, credential_type) = Self::create_credential(&config)?; + let (token_sender, _) = watch::channel(None); + let token_sender = Arc::new(token_sender); + + Ok(Self { + name, + credential, + credential_type, + scope: config.scope, + method: config.method, + client_id: config.client_id, + token_sender, + }) + } + + /// Creates a credential provider based on the configuration. + /// + /// Returns the credential and a human-readable description of the credential type. 
+ fn create_credential( + config: &Config, + ) -> Result<(Arc<dyn TokenCredential>, &'static str), Error> { + match config.method { + AuthMethod::ManagedIdentity => { + let mut options = ManagedIdentityCredentialOptions::default(); + + let credential_type = if let Some(client_id) = &config.client_id { + options.user_assigned_id = Some(UserAssignedId::ClientId(client_id.clone())); + "user_assigned_managed_identity" + } else { + "system_assigned_managed_identity" + }; + + Ok(( + ManagedIdentityCredential::new(Some(options)) + .map_err(|e| Error::create_credential(AuthMethod::ManagedIdentity, e))?, + credential_type, + )) + } + AuthMethod::Development => Ok(( + DeveloperToolsCredential::new(Some(DeveloperToolsCredentialOptions::default())) + .map_err(|e| Error::create_credential(AuthMethod::Development, e))?, + "developer_tools", + )), + } + } + + /// Gets a token directly from the credential provider. + async fn get_token_internal(&self) -> Result<AccessToken, Error> { + self.credential + .get_token( + &[&self.scope], + Some(azure_core::credentials::TokenRequestOptions::default()), + ) + .await + .map_err(Error::token_acquisition) + } + + /// Gets a token with retry logic and exponential backoff. 
+ async fn get_token_with_retry(&self) -> Result<AccessToken, Error> { + let mut attempt = 0_i32; + loop { + attempt += 1; + + match self.get_token_internal().await { + Ok(token) => { + otel_debug!( + "azure_identity_auth.get_token_succeeded", + expires_on = %token.expires_on + ); + return Ok(token); + } + Err(e) => { + otel_warn!( + "azure_identity_auth.get_token_failed", + attempt = attempt, + error = %e + ); + } + } + + // Calculate exponential backoff: 5s, 10s, 20s, 30s (capped) + let base_delay_secs = MIN_RETRY_DELAY_SECS * 2.0_f64.powi(attempt - 1); + let capped_delay_secs = base_delay_secs.min(MAX_RETRY_DELAY_SECS); + + // Add jitter: random value between -10% and +10% of the delay + let jitter_range = capped_delay_secs * MAX_RETRY_JITTER_RATIO; + let jitter = if jitter_range > 0.0 { + let random_factor = rand::random::<f64>() * 2.0 - 1.0; + random_factor * jitter_range + } else { + 0.0 + }; + + let delay_secs = (capped_delay_secs + jitter).max(1.0); + let delay = tokio::time::Duration::from_secs_f64(delay_secs); + + otel_warn!( + "azure_identity_auth.retry_scheduled", + delay_secs = %delay_secs + ); + tokio::time::sleep(delay).await; + } + } + + /// Calculates when the next token refresh should occur. + fn get_next_token_refresh(token: &BearerToken) -> tokio::time::Instant { + let now_secs = std::time::SystemTime::now() + .duration_since(std::time::UNIX_EPOCH) + .map(|d| d.as_secs() as i64) + .unwrap_or(0); + + let duration_remaining = if token.expires_on > now_secs { + std::time::Duration::from_secs((token.expires_on - now_secs) as u64) + } else { + std::time::Duration::ZERO + }; + + let token_valid_until = tokio::time::Instant::now() + duration_remaining; + let next_token_refresh = + token_valid_until - tokio::time::Duration::from_secs(TOKEN_EXPIRY_BUFFER_SECS); + std::cmp::max( + next_token_refresh, + tokio::time::Instant::now() + + tokio::time::Duration::from_secs(MIN_TOKEN_REFRESH_INTERVAL_SECS), + ) + } + + /// Returns the authentication method being used. 
+ #[must_use] + pub fn method(&self) -> &AuthMethod { + &self.method + } + + /// Returns the OAuth scope. + #[must_use] + pub fn scope(&self) -> &str { + &self.scope + } +} + +#[async_trait] +impl BearerTokenProvider for AzureIdentityAuthExtension { + async fn get_token(&self) -> Result { + let access_token = self.get_token_with_retry().await?; + + Ok(BearerToken::new( + access_token.token.secret().to_string(), + access_token.expires_on.unix_timestamp(), + )) + } + + fn subscribe_token_refresh(&self) -> watch::Receiver> { + self.token_sender.subscribe() + } +} + +#[async_trait(?Send)] +impl Extension for AzureIdentityAuthExtension { + otap_df_engine::extension_capabilities!(BearerTokenProvider); + + async fn start( + self: Box, + mut ctrl_chan: ControlChannel, + _: EffectHandler, + ) -> Result { + otel_info!( + "azure_identity_auth.start", + name = self.name.as_str(), + credential_type = self.credential_type, + scope = self.scope.as_str(), + client_id = self.client_id.as_deref().unwrap_or("none"), + ); + + // Fetch initial token immediately + let mut next_token_refresh = tokio::time::Instant::now(); + + // Main event loop — extensions handle control messages and proactive token refresh + loop { + tokio::select! 
{ + biased; + + // Proactive token refresh — keeps Azure Identity's internal cache warm + _ = tokio::time::sleep_until(next_token_refresh) => { + match self.get_token_with_retry().await { + Ok(access_token) => { + let bearer_token = BearerToken::new( + access_token.token.secret().to_string(), + access_token.expires_on.unix_timestamp(), + ); + + // Broadcast the new token to all subscribers + let _ = self.token_sender.send(Some(bearer_token.clone())); + + // Schedule next refresh + next_token_refresh = Self::get_next_token_refresh(&bearer_token); + + let refresh_in = next_token_refresh.saturating_duration_since(tokio::time::Instant::now()); + let total_secs = refresh_in.as_secs(); + let hours = total_secs / 3600; + let minutes = (total_secs % 3600) / 60; + let seconds = total_secs % 60; + + otel_info!( + "azure_identity_auth.token_refreshed", + refresh_in = format!("{}h {}m {}s", hours, minutes, seconds) + ); + } + Err(e) => { + otel_error!( + "azure_identity_auth.token_refresh_loop_failed", + error = ?e, + retry_secs = TOKEN_REFRESH_RETRY_SECS + ); + // Retry after a short delay + next_token_refresh = tokio::time::Instant::now() + + tokio::time::Duration::from_secs(TOKEN_REFRESH_RETRY_SECS); + } + } + } + + // Handle control messages + msg = ctrl_chan.recv() => { + match msg? { + ExtensionControlMsg::Shutdown { reason, .. } => { + otel_info!( + "azure_identity_auth.shutdown", + reason = %reason + ); + break; + } + ExtensionControlMsg::Config { config } => { + otel_info!( + "azure_identity_auth.config_update", + config = ?config + ); + } + ExtensionControlMsg::CollectTelemetry { .. 
} => { + // Telemetry collection handled by pipeline metrics + } + } + } + } + } + + Ok(TerminalState::default()) + } +} + +#[cfg(test)] +mod tests { + use super::*; + use azure_core::credentials::TokenRequestOptions; + use azure_core::time::OffsetDateTime; + use std::sync::atomic::{AtomicUsize, Ordering}; + + #[derive(Debug)] + struct MockCredential { + token: String, + expires_in: azure_core::time::Duration, + call_count: Arc, + } + + fn make_mock_credential( + token: &str, + expires_in: azure_core::time::Duration, + call_count: Arc, + ) -> Arc { + Arc::new(MockCredential { + token: token.to_string(), + expires_in, + call_count, + }) + } + + #[async_trait::async_trait] + impl TokenCredential for MockCredential { + async fn get_token( + &self, + _scopes: &[&str], + _options: Option>, + ) -> azure_core::Result { + let _ = self.call_count.fetch_add(1, Ordering::SeqCst); + + Ok(AccessToken { + token: self.token.clone().into(), + expires_on: OffsetDateTime::now_utc() + self.expires_in, + }) + } + } + + /// Creates a test extension from a mock credential. 
+ fn make_test_extension( + credential: Arc, + scope: &str, + ) -> AzureIdentityAuthExtension { + let (token_sender, _) = watch::channel(None); + AzureIdentityAuthExtension { + name: "test".to_string(), + credential, + credential_type: "mock", + scope: scope.to_string(), + method: AuthMethod::ManagedIdentity, + client_id: None, + token_sender: Arc::new(token_sender), + } + } + + // ==================== Construction Tests ==================== + + #[tokio::test] + async fn test_new_with_managed_identity_system_assigned() { + let config = Config { + method: AuthMethod::ManagedIdentity, + client_id: None, + scope: "https://test.scope".to_string(), + }; + + let result = AzureIdentityAuthExtension::new("test".to_string(), config); + assert!(result.is_ok()); + let ext = result.unwrap(); + assert_eq!(ext.scope(), "https://test.scope"); + } + + #[tokio::test] + async fn test_new_with_managed_identity_user_assigned() { + let config = Config { + method: AuthMethod::ManagedIdentity, + client_id: Some("test-client-id".to_string()), + scope: "https://test.scope".to_string(), + }; + + let result = AzureIdentityAuthExtension::new("test".to_string(), config); + assert!(result.is_ok()); + } + + #[tokio::test] + async fn test_new_with_development_auth() { + let config = Config { + method: AuthMethod::Development, + client_id: None, + scope: "https://test.scope".to_string(), + }; + + // May fail if Azure CLI not installed — both outcomes are valid + let result = AzureIdentityAuthExtension::new("test".to_string(), config); + match result { + Ok(ext) => assert_eq!(ext.scope(), "https://test.scope"), + Err(Error::Auth { + kind: super::super::error::AuthErrorKind::CreateCredential { method }, + .. 
+ }) => { + assert_eq!(method, AuthMethod::Development); + } + Err(err) => panic!("Unexpected error type: {:?}", err), + } + } + + // ==================== Token Fetching Tests ==================== + + #[tokio::test] + async fn test_get_token_internal_returns_valid_token() { + let call_count = Arc::new(AtomicUsize::new(0)); + let credential = make_mock_credential( + "test_token", + azure_core::time::Duration::minutes(60), + call_count.clone(), + ); + + let service = make_test_extension(credential, "scope"); + + let token = service.get_token_internal().await.unwrap(); + assert_eq!(token.token.secret(), "test_token"); + assert_eq!(call_count.load(Ordering::SeqCst), 1); + } + + #[tokio::test] + async fn test_get_token_internal_calls_credential_each_time() { + let call_count = Arc::new(AtomicUsize::new(0)); + let credential = make_mock_credential( + "test_token", + azure_core::time::Duration::minutes(60), + call_count.clone(), + ); + + let service = make_test_extension(credential, "scope"); + + let _ = service.get_token_internal().await.unwrap(); + assert_eq!(call_count.load(Ordering::SeqCst), 1); + + let _ = service.get_token_internal().await.unwrap(); + assert_eq!(call_count.load(Ordering::SeqCst), 2); + + let _ = service.get_token_internal().await.unwrap(); + assert_eq!(call_count.load(Ordering::SeqCst), 3); + } + + // ==================== BearerTokenProvider Trait Tests ==================== + + #[tokio::test] + async fn test_bearer_token_provider_get_token() { + let call_count = Arc::new(AtomicUsize::new(0)); + let credential = make_mock_credential( + "bearer_test_token", + azure_core::time::Duration::minutes(60), + call_count.clone(), + ); + + let service = make_test_extension(credential, "scope"); + + // Use the BearerTokenProvider trait method + let token: BearerToken = BearerTokenProvider::get_token(&service).await.unwrap(); + assert_eq!(token.token.secret(), "bearer_test_token"); + assert!(token.expires_on > 0); + assert_eq!(call_count.load(Ordering::SeqCst), 
1); + } + + #[tokio::test] + async fn test_bearer_token_provider_subscribe_token_refresh() { + let credential = make_mock_credential( + "test_token", + azure_core::time::Duration::minutes(60), + Arc::new(AtomicUsize::new(0)), + ); + + let (token_sender, _) = watch::channel(None); + let token_sender = Arc::new(token_sender); + let service = AzureIdentityAuthExtension { + name: "test".to_string(), + credential, + credential_type: "mock", + scope: "scope".to_string(), + method: AuthMethod::ManagedIdentity, + client_id: None, + token_sender: token_sender.clone(), + }; + + // Get a subscriber + let mut rx = BearerTokenProvider::subscribe_token_refresh(&service); + + // Initially should be None + assert!(rx.borrow().is_none()); + + // Simulate token broadcast (as the extension would do) + let new_token = BearerToken::new("refreshed_token".to_string(), 12345); + let _ = token_sender.send(Some(new_token)); + + // Subscriber should receive the update + rx.changed().await.unwrap(); + let received = rx.borrow(); + assert!(received.is_some()); + let received_token = received.as_ref().unwrap(); + assert_eq!(received_token.token.secret(), "refreshed_token"); + assert_eq!(received_token.expires_on, 12345); + } + + #[tokio::test] + async fn test_multiple_subscribers_receive_token_updates() { + let credential = make_mock_credential( + "test_token", + azure_core::time::Duration::minutes(60), + Arc::new(AtomicUsize::new(0)), + ); + + let (token_sender, _) = watch::channel(None); + let token_sender = Arc::new(token_sender); + let service = AzureIdentityAuthExtension { + name: "test".to_string(), + credential, + credential_type: "mock", + scope: "scope".to_string(), + method: AuthMethod::ManagedIdentity, + client_id: None, + token_sender: token_sender.clone(), + }; + + // Create multiple subscribers + let mut rx1 = BearerTokenProvider::subscribe_token_refresh(&service); + let mut rx2 = BearerTokenProvider::subscribe_token_refresh(&service); + + // Broadcast a token + let token = 
BearerToken::new("broadcast_token".to_string(), 99999); + let _ = token_sender.send(Some(token)); + + // Both subscribers should receive the update + rx1.changed().await.unwrap(); + rx2.changed().await.unwrap(); + + assert_eq!( + rx1.borrow().as_ref().unwrap().token.secret(), + "broadcast_token" + ); + assert_eq!( + rx2.borrow().as_ref().unwrap().token.secret(), + "broadcast_token" + ); + } + + // ==================== Token Refresh Scheduling Tests ==================== + + #[test] + fn test_get_next_token_refresh_schedules_before_expiry() { + let now_secs = std::time::SystemTime::now() + .duration_since(std::time::UNIX_EPOCH) + .unwrap() + .as_secs() as i64; + + let token = BearerToken::new("test".to_string(), now_secs + 600); + let next_refresh = AzureIdentityAuthExtension::get_next_token_refresh(&token); + + let now = tokio::time::Instant::now(); + let min_expected = now + tokio::time::Duration::from_secs(MIN_TOKEN_REFRESH_INTERVAL_SECS); + + assert!(next_refresh >= min_expected); + } + + #[test] + fn test_get_next_token_refresh_respects_minimum_interval() { + let now_secs = std::time::SystemTime::now() + .duration_since(std::time::UNIX_EPOCH) + .unwrap() + .as_secs() as i64; + + let token = BearerToken::new("test".to_string(), now_secs + 5); + let next_refresh = AzureIdentityAuthExtension::get_next_token_refresh(&token); + + let now = tokio::time::Instant::now(); + let min_expected = + now + tokio::time::Duration::from_secs(MIN_TOKEN_REFRESH_INTERVAL_SECS - 1); + + assert!(next_refresh >= min_expected); + } + + #[test] + fn test_get_next_token_refresh_handles_expired_token() { + let now_secs = std::time::SystemTime::now() + .duration_since(std::time::UNIX_EPOCH) + .unwrap() + .as_secs() as i64; + + let token = BearerToken::new("test".to_string(), now_secs - 100); + let next_refresh = AzureIdentityAuthExtension::get_next_token_refresh(&token); + + let now = tokio::time::Instant::now(); + let min_expected = + now + 
tokio::time::Duration::from_secs(MIN_TOKEN_REFRESH_INTERVAL_SECS - 1); + + assert!(next_refresh >= min_expected); + } + + #[test] + fn test_get_next_token_refresh_long_lived_token() { + let now_secs = std::time::SystemTime::now() + .duration_since(std::time::UNIX_EPOCH) + .unwrap() + .as_secs() as i64; + + let token = BearerToken::new("test".to_string(), now_secs + 3600); + let next_refresh = AzureIdentityAuthExtension::get_next_token_refresh(&token); + + let now = tokio::time::Instant::now(); + let expected_approx = + now + tokio::time::Duration::from_secs(3600 - TOKEN_EXPIRY_BUFFER_SECS); + + let tolerance = tokio::time::Duration::from_secs(2); + assert!(next_refresh >= expected_approx - tolerance); + assert!(next_refresh <= expected_approx + tolerance); + } + + #[test] + fn test_provider_is_send() { + fn assert_send() {} + assert_send::(); + } +} diff --git a/rust/otap-dataflow/crates/contrib-nodes/src/extensions/azure_identity_auth_extension/mod.rs b/rust/otap-dataflow/crates/contrib-nodes/src/extensions/azure_identity_auth_extension/mod.rs new file mode 100644 index 0000000000..baf677f9d9 --- /dev/null +++ b/rust/otap-dataflow/crates/contrib-nodes/src/extensions/azure_identity_auth_extension/mod.rs @@ -0,0 +1,121 @@ +// Copyright The OpenTelemetry Authors +// SPDX-License-Identifier: Apache-2.0 + +//! Azure Identity Auth Extension for OTAP. +//! +//! Provides Azure authentication services to the pipeline using Azure Identity. +//! This extension manages token acquisition and refresh, making credentials +//! available to other components (e.g., exporters) that need Azure authentication. +//! +//! # Features +//! +//! - Managed Identity authentication (system or user-assigned) +//! - Developer tools authentication (Azure CLI, Azure Developer CLI) +//! - Automatic token refresh with exponential backoff retry +//! - Shared credential access across pipeline components +//! +//! # Usage +//! +//! Configure the extension in the pipeline configuration: +//! +//! 
```yaml
+//! extensions:
+//!   azure_auth:
+//!     type: "urn:microsoft:extension:azure_identity_auth"
+//!     config:
+//!       method: managed_identity
+//!       scope: "https://monitor.azure.com/.default"
+//! ```
+//!
+//! Consumers retrieve the extension by name from the registry:
+//!
+//! ```ignore
+//! let provider: Box<dyn BearerTokenProvider> = extension_registry
+//!     .get::<dyn BearerTokenProvider>("azure_auth")?;
+//! let mut token_rx = provider.subscribe_token_refresh();
+//! ```
+
+use linkme::distributed_slice;
+use otap_df_config::node::NodeUserConfig;
+use otap_df_engine::ExtensionFactory;
+use otap_df_engine::config::ExtensionConfig;
+use otap_df_engine::context::PipelineContext;
+use otap_df_engine::extension::ExtensionWrapper;
+use otap_df_engine::node::NodeId;
+use std::sync::Arc;
+
+use otap_df_otap::OTAP_EXTENSION_FACTORIES;
+
+mod config;
+mod error;
+mod extension;
+
+pub use config::{AuthMethod, Config};
+pub use error::Error;
+pub use extension::AzureIdentityAuthExtension;
+
+/// URN identifying the Azure Identity Auth Extension in configuration pipelines.
+pub const AZURE_IDENTITY_AUTH_EXTENSION_URN: &str = "urn:microsoft:extension:azure_identity_auth";
+
+/// Register Azure Identity Auth Extension with the OTAP extension factory.
+///
+/// Uses the `distributed_slice` macro for automatic discovery by the dataflow engine.
+#[allow(unsafe_code)] +#[distributed_slice(OTAP_EXTENSION_FACTORIES)] +pub static AZURE_IDENTITY_AUTH_EXTENSION: ExtensionFactory = ExtensionFactory { + name: AZURE_IDENTITY_AUTH_EXTENSION_URN, + create: |_: PipelineContext, + node: NodeId, + node_config: Arc, + extension_config: &ExtensionConfig| { + // Deserialize user config JSON into typed Config + let cfg: Config = serde_json::from_value(node_config.config.clone()).map_err(|e| { + otap_df_config::error::Error::InvalidUserConfig { + error: e.to_string(), + } + })?; + + // Validate the configuration + cfg.validate() + .map_err(|e| otap_df_config::error::Error::InvalidUserConfig { + error: e.to_string(), + })?; + + // Create the extension + let extension = + AzureIdentityAuthExtension::new(node.name.to_string(), cfg).map_err(|e| { + otap_df_config::error::Error::InvalidUserConfig { + error: e.to_string(), + } + })?; + + Ok(ExtensionWrapper::local( + extension, + node, + node_config, + extension_config, + )) + }, + validate_config: otap_df_config::validation::validate_typed_config::, +}; + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_extension_urn() { + assert_eq!( + AZURE_IDENTITY_AUTH_EXTENSION_URN, + "urn:microsoft:extension:azure_identity_auth" + ); + } + + #[test] + fn test_factory_name_matches_urn() { + assert_eq!( + AZURE_IDENTITY_AUTH_EXTENSION.name, + AZURE_IDENTITY_AUTH_EXTENSION_URN + ); + } +} diff --git a/rust/otap-dataflow/crates/contrib-nodes/src/extensions/mod.rs b/rust/otap-dataflow/crates/contrib-nodes/src/extensions/mod.rs new file mode 100644 index 0000000000..5646ad619f --- /dev/null +++ b/rust/otap-dataflow/crates/contrib-nodes/src/extensions/mod.rs @@ -0,0 +1,6 @@ +// Copyright The OpenTelemetry Authors +// SPDX-License-Identifier: Apache-2.0 + +/// Azure Identity Auth Extension +#[cfg(feature = "azure-identity-auth-extension")] +pub mod azure_identity_auth_extension; diff --git a/rust/otap-dataflow/crates/contrib-nodes/src/lib.rs 
b/rust/otap-dataflow/crates/contrib-nodes/src/lib.rs index 2b5cb59c14..cb29ecf7d6 100644 --- a/rust/otap-dataflow/crates/contrib-nodes/src/lib.rs +++ b/rust/otap-dataflow/crates/contrib-nodes/src/lib.rs @@ -1,7 +1,10 @@ // Copyright The OpenTelemetry Authors // SPDX-License-Identifier: Apache-2.0 -//! Implementation of the Contrib nodes (receiver, exporter, processor). +//! Implementation of the Contrib nodes (receiver, exporter, processor, extension). + +/// Extension implementations for contrib nodes. +pub mod extensions; /// Exporter implementations for contrib nodes. pub mod exporters; diff --git a/rust/otap-dataflow/crates/engine-macros/src/lib.rs b/rust/otap-dataflow/crates/engine-macros/src/lib.rs index dcad1a7ef3..53bd6f5adf 100644 --- a/rust/otap-dataflow/crates/engine-macros/src/lib.rs +++ b/rust/otap-dataflow/crates/engine-macros/src/lib.rs @@ -75,6 +75,7 @@ pub fn pipeline_factory(args: TokenStream, input: TokenStream) -> TokenStream { let receiver_factories_name = quote::format_ident!("{}_RECEIVER_FACTORIES", prefix); let processor_factories_name = quote::format_ident!("{}_PROCESSOR_FACTORIES", prefix); let exporter_factories_name = quote::format_ident!("{}_EXPORTER_FACTORIES", prefix); + let extension_factories_name = quote::format_ident!("{}_EXTENSION_FACTORIES", prefix); let get_receiver_factory_map_name = quote::format_ident!( "get_{}_receiver_factory_map", prefix.to_string().to_lowercase() @@ -87,6 +88,10 @@ pub fn pipeline_factory(args: TokenStream, input: TokenStream) -> TokenStream { "get_{}_exporter_factory_map", prefix.to_string().to_lowercase() ); + let get_extension_factory_map_name = quote::format_ident!( + "get_{}_extension_factory_map", + prefix.to_string().to_lowercase() + ); let output = quote! { /// A slice of receiver factories. 
@@ -101,14 +106,19 @@ pub fn pipeline_factory(args: TokenStream, input: TokenStream) -> TokenStream { #[::otap_df_engine::distributed_slice] pub static #exporter_factories_name: [::otap_df_engine::ExporterFactory<#pdata_type>] = [..]; + /// A slice of extension factories (PData-free). + #[::otap_df_engine::distributed_slice] + pub static #extension_factories_name: [::otap_df_engine::ExtensionFactory] = [..]; + /// The factory registry instance. #registry_vis static #registry_name: std::sync::LazyLock> = std::sync::LazyLock::new(|| { // Reference build_registry to avoid unused import warning, even though we don't call it let _ = build_factory::<#pdata_type>; - PipelineFactory::new( + PipelineFactory::with_extensions( &#receiver_factories_name, &#processor_factories_name, &#exporter_factories_name, + &#extension_factories_name, ) }); @@ -126,6 +136,11 @@ pub fn pipeline_factory(args: TokenStream, input: TokenStream) -> TokenStream { pub fn #get_exporter_factory_map_name() -> &'static std::collections::HashMap<&'static str, ::otap_df_engine::ExporterFactory<#pdata_type>> { #registry_name.get_exporter_factory_map() } + + /// Gets the extension factory map, initializing it if necessary. 
+ pub fn #get_extension_factory_map_name() -> &'static std::collections::HashMap<&'static str, ::otap_df_engine::ExtensionFactory> { + #registry_name.get_extension_factory_map() + } }; output.into() diff --git a/rust/otap-dataflow/crates/engine/src/channel_mode.rs b/rust/otap-dataflow/crates/engine/src/channel_mode.rs index dc0d40aecd..72044dbe0a 100644 --- a/rust/otap-dataflow/crates/engine/src/channel_mode.rs +++ b/rust/otap-dataflow/crates/engine/src/channel_mode.rs @@ -20,7 +20,6 @@ use crate::channel_metrics::{ ChannelSenderMetrics, control_channel_id, }; use crate::context::PipelineContext; -use crate::control::NodeControlMsg; use crate::entity_context::current_node_telemetry_handle; use crate::local::message::{LocalReceiver, LocalSender}; use crate::shared::message::{SharedReceiver, SharedSender}; @@ -171,24 +170,24 @@ impl ChannelMode for SharedMode { } } -/// Generic helper used by receiver, processor, and exporter wrappers. +/// Generic helper used by receiver, processor, exporter, and extension wrappers. /// It keeps local and shared wiring identical while still emitting mode-specific code. /// +/// The `Msg` parameter is the control-message type carried by the channel. Data-plane +/// nodes use `NodeControlMsg`, while extensions use `ExtensionControlMsg`. +/// /// The logic first attempts to unwrap the inner MPSC channel so metrics can be attached. /// If the channel is already wrapped, it preserves the existing wrapper to avoid double /// instrumentation. 
-pub(crate) fn wrap_control_channel_metrics( +pub(crate) fn wrap_control_channel_metrics( node_id: &crate::node::NodeId, pipeline_ctx: &PipelineContext, channel_metrics: &mut ChannelMetricsRegistry, channel_metrics_enabled: bool, capacity: u64, - control_sender: M::ControlSender>, - control_receiver: M::ControlReceiver>, -) -> ( - M::ControlSender>, - M::ControlReceiver>, -) + control_sender: M::ControlSender, + control_receiver: M::ControlReceiver, +) -> (M::ControlSender, M::ControlReceiver) where M: ChannelMode, { diff --git a/rust/otap-dataflow/crates/engine/src/config.rs b/rust/otap-dataflow/crates/engine/src/config.rs index 41097d9e58..9c61df9da9 100644 --- a/rust/otap-dataflow/crates/engine/src/config.rs +++ b/rust/otap-dataflow/crates/engine/src/config.rs @@ -64,6 +64,17 @@ pub struct ExporterConfig { pub input_pdata_channel: PdataChannelConfig, } +/// Runtime configuration for an extension. +/// +/// Extensions only have a control channel — they do not process pipeline data. +#[derive(Clone, Debug)] +pub struct ExtensionConfig { + /// Name of the extension. + pub name: NodeId, + /// Configuration for control channel. + pub control_channel: ControlChannelConfig, +} + impl ReceiverConfig { /// Creates a new receiver configuration with default channel capacities. pub fn new(name: T) -> Self @@ -172,3 +183,28 @@ impl ExporterConfig { } } } + +impl ExtensionConfig { + /// Creates a new extension configuration with default channel capacities. + #[must_use] + pub fn new(name: T) -> Self + where + T: Into, + { + Self::with_control_channel_capacity(name, DEFAULT_CONTROL_CHANNEL_CAPACITY) + } + + /// Creates a new extension configuration with explicit control channel capacity. 
+ #[must_use] + pub fn with_control_channel_capacity(name: T, control_channel_capacity: usize) -> Self + where + T: Into, + { + ExtensionConfig { + name: name.into(), + control_channel: ControlChannelConfig { + capacity: control_channel_capacity, + }, + } + } +} diff --git a/rust/otap-dataflow/crates/engine/src/control.rs b/rust/otap-dataflow/crates/engine/src/control.rs index 29481092d9..57545052b8 100644 --- a/rust/otap-dataflow/crates/engine/src/control.rs +++ b/rust/otap-dataflow/crates/engine/src/control.rs @@ -185,6 +185,41 @@ impl NackMsg { } } +/// Control messages sent to extensions. +/// +/// This is a PData-free subset of [`NodeControlMsg`] — extensions never process +/// pipeline data, so they have no `Ack`, `Nack`, or `DelayedData` variants. +#[derive(Debug, Clone)] +pub enum ExtensionControlMsg { + /// Notifies the extension of a configuration change. + Config { + /// The new configuration as a JSON value. + config: serde_json::Value, + }, + + /// Asks the extension to collect/flush its local telemetry metrics. + CollectTelemetry { + /// Metrics reporter used to collect telemetry metrics. + metrics_reporter: MetricsReporter, + }, + + /// Requests a graceful shutdown. + Shutdown { + /// Deadline for shutdown. + deadline: Instant, + /// Human-readable reason for the shutdown. + reason: String, + }, +} + +impl ExtensionControlMsg { + /// Returns `true` if this control message is a shutdown request. + #[must_use] + pub const fn is_shutdown(&self) -> bool { + matches!(self, ExtensionControlMsg::Shutdown { .. }) + } +} + /// Control messages sent by the pipeline engine to nodes to manage their behavior, /// configuration, and lifecycle. #[derive(Debug, Clone)] @@ -551,6 +586,27 @@ where } } +/// A control sender for a single extension. +/// +/// Stored separately from [`ControlSenders`] because extensions use +/// [`ExtensionControlMsg`] (PData-free) rather than [`NodeControlMsg`]. 
+pub struct ExtensionControlSender { + /// Unique identifier of the extension. + pub(crate) node_id: NodeId, + /// The control message sender for the extension. + pub(crate) sender: Sender, +} + +impl ExtensionControlSender { + /// Sends a control message to the extension, awaiting until the message is sent. + pub async fn send( + &self, + msg: ExtensionControlMsg, + ) -> Result<(), SendError> { + self.sender.send(msg).await + } +} + #[cfg(test)] mod tests { use super::*; diff --git a/rust/otap-dataflow/crates/engine/src/error.rs b/rust/otap-dataflow/crates/engine/src/error.rs index 3bf6768241..044e1270d4 100644 --- a/rust/otap-dataflow/crates/engine/src/error.rs +++ b/rust/otap-dataflow/crates/engine/src/error.rs @@ -323,6 +323,27 @@ pub enum Error { plugin_urn: NodeUrn, }, + /// The specified extension already exists in the pipeline. + #[error("The extension `{extension}` already exists")] + ExtensionAlreadyExists { + /// The name of the extension that already exists. + extension: NodeId, + }, + + /// Unknown extension plugin. + #[error("Unknown extension plugin `{plugin_urn}`")] + UnknownExtension { + /// The name of the unknown extension plugin. + plugin_urn: NodeUrn, + }, + + /// An extension was placed in the `nodes` section instead of `extensions`. + #[error("Extension `{node}` was placed in `nodes` but belongs in the `extensions` section")] + ExtensionInNodesSection { + /// The node name that was misconfigured. + node: NodeName, + }, + /// Unknown node. #[error("Unknown node `{node}`")] UnknownNode { @@ -471,6 +492,9 @@ impl Error { Error::UnknownReceiver { .. } => "UnknownReceiver", Error::UnsupportedNodeKind { .. } => "UnsupportedNodeKind", Error::InvalidNodeWiring { .. } => "InvalidNodeWiring", + Error::ExtensionAlreadyExists { .. } => "ExtensionAlreadyExists", + Error::UnknownExtension { .. } => "UnknownExtension", + Error::ExtensionInNodesSection { .. 
} => "ExtensionInNodesSection", } .to_owned() } diff --git a/rust/otap-dataflow/crates/engine/src/exporter.rs b/rust/otap-dataflow/crates/engine/src/exporter.rs index ab1578406a..ca1368db52 100644 --- a/rust/otap-dataflow/crates/engine/src/exporter.rs +++ b/rust/otap-dataflow/crates/engine/src/exporter.rs @@ -209,7 +209,7 @@ impl ExporterWrapper { .. } => { let (control_sender, control_receiver) = - wrap_control_channel_metrics::( + wrap_control_channel_metrics::>( &node_id, pipeline_ctx, channel_metrics, @@ -242,7 +242,7 @@ impl ExporterWrapper { .. } => { let (control_sender, control_receiver) = - wrap_control_channel_metrics::( + wrap_control_channel_metrics::>( &node_id, pipeline_ctx, channel_metrics, diff --git a/rust/otap-dataflow/crates/engine/src/extension.rs b/rust/otap-dataflow/crates/engine/src/extension.rs new file mode 100644 index 0000000000..5aaa9aaf50 --- /dev/null +++ b/rust/otap-dataflow/crates/engine/src/extension.rs @@ -0,0 +1,483 @@ +// Copyright The OpenTelemetry Authors +// SPDX-License-Identifier: Apache-2.0 + +//! Extension wrapper providing a unified interface over local (`!Send`) and +//! shared (`Send`) extension implementations. +//! +//! Extensions are PData-free — they never process pipeline data, only control +//! messages. This module wraps local and shared variants into a single +//! `ExtensionWrapper` that the engine can start and manage. +//! +//! For the extension lifecycle traits, see [`local::extension`](crate::local::extension) +//! and [`shared::extension`](crate::shared::extension). +//! +//! For the registry and sealed trait infrastructure, see +//! [`registry`](registry). +//! +//! For built-in extension traits, see +//! [`bearer_token_provider`](bearer_token_provider). + +pub mod registry; + +/// Extension traits that components can implement to expose capabilities. 
+pub mod bearer_token_provider; + +use crate::channel_metrics::ChannelMetricsRegistry; +use crate::channel_mode::{LocalMode, SharedMode, wrap_control_channel_metrics}; +use crate::config::ExtensionConfig; +use crate::context::PipelineContext; +use crate::control::ExtensionControlMsg; +use crate::entity_context::NodeTelemetryGuard; +use crate::local::extension as local; +use crate::local::message::{LocalReceiver, LocalSender}; +use crate::node::NodeId; +use crate::shared::extension as shared; +use crate::shared::message::{SharedReceiver, SharedSender}; +use crate::terminal_state::TerminalState; +use otap_df_channel::mpsc; +use otap_df_config::node::NodeUserConfig; +use otap_df_telemetry::reporter::MetricsReporter; +use std::sync::Arc; + +/// A wrapper for the extension that allows for both `Send` and `!Send` implementations. +/// +/// Extensions are NOT generic over PData — they operate exclusively on +/// [`ExtensionControlMsg`], keeping the extension system entirely decoupled +/// from the data-plane type. +pub enum ExtensionWrapper { + /// An extension with a `!Send` implementation. + Local { + /// Index identifier for the node. + node_id: NodeId, + /// The user configuration for the node. + user_config: Arc, + /// The runtime configuration for the extension. + runtime_config: ExtensionConfig, + /// The extension instance. + extension: Box, + /// A sender for control messages. + control_sender: LocalSender, + /// A receiver for control messages. + control_receiver: Option>, + /// Telemetry guard for node lifecycle cleanup. + telemetry: Option, + }, + /// An extension with a `Send` implementation. + Shared { + /// Index identifier for the node. + node_id: NodeId, + /// The user configuration for the node. + user_config: Arc, + /// The runtime configuration for the extension. + runtime_config: ExtensionConfig, + /// The extension instance. + extension: Box, + /// A sender for control messages. + control_sender: SharedSender, + /// A receiver for control messages. 
+ control_receiver: Option>, + /// Telemetry guard for node lifecycle cleanup. + telemetry: Option, + }, +} + +impl ExtensionWrapper { + /// Creates a new local `ExtensionWrapper` with the given extension and configuration (!Send + /// implementation). + pub fn local( + extension: E, + node_id: NodeId, + user_config: Arc, + config: &ExtensionConfig, + ) -> Self + where + E: local::Extension + 'static, + { + let (control_sender, control_receiver) = + mpsc::Channel::new(config.control_channel.capacity); + + ExtensionWrapper::Local { + node_id, + user_config, + runtime_config: config.clone(), + extension: Box::new(extension), + control_sender: LocalSender::mpsc(control_sender), + control_receiver: Some(LocalReceiver::mpsc(control_receiver)), + telemetry: None, + } + } + + /// Creates a new shared `ExtensionWrapper` with the given extension and configuration (Send + /// implementation). + pub fn shared( + extension: E, + node_id: NodeId, + user_config: Arc, + config: &ExtensionConfig, + ) -> Self + where + E: shared::Extension + 'static, + { + let (control_sender, control_receiver) = + tokio::sync::mpsc::channel(config.control_channel.capacity); + + ExtensionWrapper::Shared { + node_id, + user_config, + runtime_config: config.clone(), + extension: Box::new(extension), + control_sender: SharedSender::mpsc(control_sender), + control_receiver: Some(SharedReceiver::mpsc(control_receiver)), + telemetry: None, + } + } + + /// Returns whether this extension uses a shared (Send) implementation. + #[must_use] + pub fn is_shared(&self) -> bool { + match self { + ExtensionWrapper::Local { .. } => false, + ExtensionWrapper::Shared { .. } => true, + } + } + + /// Returns the node ID of this extension. + #[must_use] + pub fn node_id(&self) -> NodeId { + match self { + ExtensionWrapper::Local { node_id, .. } => node_id.clone(), + ExtensionWrapper::Shared { node_id, .. } => node_id.clone(), + } + } + + /// Returns the user configuration for this extension. 
+ #[must_use] + pub fn user_config(&self) -> Arc { + match self { + ExtensionWrapper::Local { user_config, .. } => user_config.clone(), + ExtensionWrapper::Shared { user_config, .. } => user_config.clone(), + } + } + + /// Collects the extension's trait registrations and inserts them into + /// the registry under the given name. + /// + /// Called by the engine during pipeline build. + pub fn register_traits(&self, registry: &mut registry::CapabilityRegistry, name: &str) { + let registrations = match self { + ExtensionWrapper::Local { extension, .. } => extension.extension_capabilities(), + ExtensionWrapper::Shared { extension, .. } => extension.extension_capabilities(), + }; + registry.register_all(name, registrations); + } + + pub(crate) fn with_node_telemetry_guard(self, guard: NodeTelemetryGuard) -> Self { + match self { + ExtensionWrapper::Local { + node_id, + user_config, + runtime_config, + extension, + control_sender, + control_receiver, + .. + } => ExtensionWrapper::Local { + node_id, + user_config, + runtime_config, + extension, + control_sender, + control_receiver, + telemetry: Some(guard), + }, + ExtensionWrapper::Shared { + node_id, + user_config, + runtime_config, + extension, + control_sender, + control_receiver, + .. + } => ExtensionWrapper::Shared { + node_id, + user_config, + runtime_config, + extension, + control_sender, + control_receiver, + telemetry: Some(guard), + }, + } + } + + pub(crate) const fn take_telemetry_guard(&mut self) -> Option { + match self { + ExtensionWrapper::Local { telemetry, .. } => telemetry.take(), + ExtensionWrapper::Shared { telemetry, .. } => telemetry.take(), + } + } + + pub(crate) fn with_control_channel_metrics( + self, + pipeline_ctx: &PipelineContext, + channel_metrics: &mut ChannelMetricsRegistry, + channel_metrics_enabled: bool, + ) -> Self { + match self { + ExtensionWrapper::Local { + node_id, + runtime_config, + control_sender, + control_receiver, + user_config, + extension, + telemetry, + .. 
+ } => { + let control_receiver = control_receiver.expect("control_receiver already taken"); + + let (control_sender, control_receiver) = + wrap_control_channel_metrics::( + &node_id, + pipeline_ctx, + channel_metrics, + channel_metrics_enabled, + runtime_config.control_channel.capacity as u64, + control_sender, + control_receiver, + ); + + ExtensionWrapper::Local { + node_id, + user_config, + runtime_config, + extension, + control_sender, + control_receiver: Some(control_receiver), + telemetry, + } + } + ExtensionWrapper::Shared { + node_id, + runtime_config, + control_sender, + control_receiver, + user_config, + extension, + telemetry, + .. + } => { + let control_receiver = control_receiver.expect("control_receiver already taken"); + + let (control_sender, control_receiver) = + wrap_control_channel_metrics::( + &node_id, + pipeline_ctx, + channel_metrics, + channel_metrics_enabled, + runtime_config.control_channel.capacity as u64, + control_sender, + control_receiver, + ); + + ExtensionWrapper::Shared { + node_id, + user_config, + runtime_config, + extension, + control_sender, + control_receiver: Some(control_receiver), + telemetry, + } + } + } + } + + /// Returns an `ExtensionControlSender` for sending control messages to this extension. + pub(crate) fn extension_control_sender(&self) -> crate::control::ExtensionControlSender { + match self { + ExtensionWrapper::Local { + node_id, + control_sender, + .. + } => crate::control::ExtensionControlSender { + node_id: node_id.clone(), + sender: crate::message::Sender::Local(control_sender.clone()), + }, + ExtensionWrapper::Shared { + node_id, + control_sender, + .. + } => crate::control::ExtensionControlSender { + node_id: node_id.clone(), + sender: crate::message::Sender::Shared(control_sender.clone()), + }, + } + } + + /// Starts the extension and begins its operation. 
+ /// + /// Extensions do NOT receive a `PipelineCtrlMsgSender` — they are fully + /// PData-free and manage their own timers directly via `tokio::time`. + pub async fn start( + self, + metrics_reporter: MetricsReporter, + ) -> Result { + match self { + ExtensionWrapper::Local { + node_id, + extension, + control_receiver, + .. + } => { + let effect_handler = local::EffectHandler::new(node_id, metrics_reporter); + + let control_receiver = + control_receiver.expect("control_receiver missing from ExtensionWrapper"); + + let ctrl_chan = local::ControlChannel::new(control_receiver); + extension.start(ctrl_chan, effect_handler).await + } + ExtensionWrapper::Shared { + node_id, + extension, + control_receiver, + .. + } => { + let effect_handler = shared::EffectHandler::new(node_id, metrics_reporter); + + let control_receiver = + control_receiver.expect("control_receiver missing from ExtensionWrapper"); + + let ctrl_chan = shared::ControlChannel::new(control_receiver); + extension.start(ctrl_chan, effect_handler).await + } + } + } +} + +// ── TelemetryWrapped impl ─────────────────────────────────────────────────── + +impl crate::TelemetryWrapped for ExtensionWrapper { + fn with_control_channel_metrics( + self, + pipeline_ctx: &PipelineContext, + channel_metrics: &mut ChannelMetricsRegistry, + channel_metrics_enabled: bool, + ) -> Self { + ExtensionWrapper::with_control_channel_metrics( + self, + pipeline_ctx, + channel_metrics, + channel_metrics_enabled, + ) + } + + fn with_node_telemetry_guard(self, guard: NodeTelemetryGuard) -> Self { + ExtensionWrapper::with_node_telemetry_guard(self, guard) + } +} + +#[cfg(test)] +mod tests { + use super::*; + use crate::control::ExtensionControlMsg; + use crate::testing::{CtrlMsgCounters, test_node}; + use async_trait::async_trait; + use otap_df_config::node::NodeUserConfig; + use serde_json::Value; + + #[derive(Clone)] + struct TestExtension { + counter: CtrlMsgCounters, + } + + impl TestExtension { + fn new(counter: 
CtrlMsgCounters) -> Self { + TestExtension { counter } + } + } + + #[async_trait(?Send)] + impl local::Extension for TestExtension { + async fn start( + self: Box, + mut ctrl_chan: local::ControlChannel, + _effect_handler: local::EffectHandler, + ) -> Result { + loop { + match ctrl_chan.recv().await? { + ExtensionControlMsg::Config { .. } => { + self.counter.increment_config(); + } + ExtensionControlMsg::Shutdown { .. } => { + self.counter.increment_shutdown(); + break; + } + ExtensionControlMsg::CollectTelemetry { .. } => {} + } + } + Ok(TerminalState::default()) + } + } + + #[test] + fn test_extension_wrapper_local_creation() { + let counter = CtrlMsgCounters::new(); + let extension = TestExtension::new(counter); + let node_id = test_node("test_extension"); + let user_config = Arc::new(NodeUserConfig::with_user_config( + "urn:otap:extension:test".into(), + Value::Null, + )); + let config = ExtensionConfig::new("test_extension"); + + let wrapper = ExtensionWrapper::local(extension, node_id, user_config, &config); + + assert!(!wrapper.is_shared()); + } + + #[test] + fn test_extension_wrapper_shared_creation() { + let counter = CtrlMsgCounters::new(); + let node_id = test_node("test_extension_shared"); + let user_config = Arc::new(NodeUserConfig::with_user_config( + "urn:otap:extension:test".into(), + Value::Null, + )); + let config = ExtensionConfig::new("test_extension_shared"); + + let shared_ext = SharedTestExtension::new(counter); + let wrapper = ExtensionWrapper::shared(shared_ext, node_id, user_config, &config); + + assert!(wrapper.is_shared()); + } + + #[derive(Clone)] + struct SharedTestExtension { + counter: CtrlMsgCounters, + } + + impl SharedTestExtension { + fn new(counter: CtrlMsgCounters) -> Self { + SharedTestExtension { counter } + } + } + + #[async_trait] + impl shared::Extension for SharedTestExtension { + async fn start( + self: Box, + mut ctrl_chan: shared::ControlChannel, + _effect_handler: shared::EffectHandler, + ) -> Result { + loop { + if 
let ExtensionControlMsg::Shutdown { .. } = ctrl_chan.recv().await? {
+                    self.counter.increment_shutdown();
+                    break;
+                }
+            }
+            Ok(TerminalState::default())
+        }
+    }
+}
diff --git a/rust/otap-dataflow/crates/engine/src/extension/bearer_token_provider.rs b/rust/otap-dataflow/crates/engine/src/extension/bearer_token_provider.rs
new file mode 100644
index 0000000000..f1fc1da4d7
--- /dev/null
+++ b/rust/otap-dataflow/crates/engine/src/extension/bearer_token_provider.rs
@@ -0,0 +1,167 @@
+// Copyright The OpenTelemetry Authors
+// SPDX-License-Identifier: Apache-2.0
+
+//! Token provider extension trait.
+//!
+//! This module also contains the sealed-trait impls that register
+//! `dyn BearerTokenProvider` as a valid [`ExtensionCapability`](super::registry::ExtensionCapability).
+
+use async_trait::async_trait;
+use std::borrow::Cow;
+
+// ── Sealed ExtensionCapability registration ──────────────────────────────────────
+//
+// Every extension trait file must add these two impls so the type can be
+// stored in the CapabilityRegistry. Copy this block when adding a new
+// extension trait.
+impl super::registry::private::Sealed for dyn BearerTokenProvider {}
+impl super::registry::ExtensionCapability for dyn BearerTokenProvider {}
+
+/// Represents a secret value that should not be exposed in logs or debug output.
+///
+/// The [`Debug`] implementation will not print the actual secret value.
+#[derive(Clone, Eq)]
+pub struct Secret(Cow<'static, str>);
+
+impl Secret {
+    /// Creates a new `Secret`.
+    #[must_use]
+    pub fn new<T>(value: T) -> Self
+    where
+        T: Into<Cow<'static, str>>,
+    {
+        Self(value.into())
+    }
+
+    /// Returns the secret value.
+ #[must_use] + pub fn secret(&self) -> &str { + &self.0 + } +} + +impl PartialEq for Secret { + fn eq(&self, other: &Self) -> bool { + self.secret() == other.secret() + } +} + +impl From<String> for Secret { + fn from(value: String) -> Self { + Self::new(value) + } +} + +impl From<&'static str> for Secret { + fn from(value: &'static str) -> Self { + Self::new(value) + } +} + +impl std::fmt::Debug for Secret { + fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { + f.write_str("Secret") + } +} + +/// Represents a bearer token with its expiration time. +/// +/// The token value is wrapped in [`Secret`] to prevent accidental exposure +/// in logs or debug output. +#[derive(Debug, Clone)] +pub struct BearerToken { + /// The token value. + pub token: Secret, + + /// The expiration time as a UNIX timestamp (seconds since epoch). + pub expires_on: i64, +} + +impl BearerToken { + /// Creates a new bearer token. + #[must_use] + pub fn new<T>(token: T, expires_on: i64) -> Self + where + T: Into<Secret>, + { + Self { + token: token.into(), + expires_on, + } + } +} + +/// A trait for components that can provide bearer authentication tokens. +/// +/// Extensions implementing this trait can be looked up by other components +/// (e.g., exporters) to obtain tokens for authentication. +/// +/// # Thread Safety +/// +/// - The returned future is `Send` for use with async runtimes like tokio +/// - The error type is `Send + Sync` for safe propagation across threads +/// +/// # Subscribing to Token Refresh Events +/// +/// Use [`subscribe_token_refresh`](BearerTokenProvider::subscribe_token_refresh) to receive notifications when +/// tokens are refreshed. This is useful for updating HTTP headers or other +/// authentication state without polling. 
+/// +/// # Implementing This Trait +/// +/// External crates can implement this trait on their extension types: +/// +/// ```ignore +/// use async_trait::async_trait; +/// use otap_df_engine::extension::bearer_token_provider::{BearerToken, BearerTokenProvider}; +/// use otap_df_engine::extension::registry::Error; +/// +/// struct MyAuthExtension { /* ... */ } +/// +/// #[async_trait] +/// impl BearerTokenProvider for MyAuthExtension { +/// async fn get_token(&self) -> Result<BearerToken, Error> { +/// // ... acquire token ... +/// Ok(BearerToken { token: "...".into(), expires_on: 0 }) +/// } +/// +/// fn subscribe_token_refresh(&self) -> tokio::sync::watch::Receiver<Option<BearerToken>> { +/// self.token_sender.subscribe() +/// } +/// } +/// ``` +#[async_trait] +pub trait BearerTokenProvider: Send { + /// Returns an authentication token. + /// + /// # Errors + /// + /// Returns an error if the token cannot be obtained. + async fn get_token(&self) -> Result<BearerToken, super::registry::Error>; + + /// Subscribes to token refresh events. + /// + /// Returns a new receiver that will be notified whenever the token + /// is refreshed. Each call creates an independent subscription. + /// The receiver always contains the latest token value (or `None` + /// if no token has been acquired yet). + /// + /// # Example + /// + /// ```ignore + /// let auth = extension_registry.get::<dyn BearerTokenProvider>("auth")?; + /// let mut token_rx = auth.subscribe_token_refresh(); + /// + /// loop { + /// tokio::select! { + /// _ = token_rx.changed() => { + /// if let Some(token) = token_rx.borrow().as_ref() { + /// // Update headers, etc. + /// } + /// } + /// // ... 
other branches + /// } + /// } + /// ``` + fn subscribe_token_refresh(&self) -> tokio::sync::watch::Receiver<Option<BearerToken>>; +} diff --git a/rust/otap-dataflow/crates/engine/src/extension/registry.rs b/rust/otap-dataflow/crates/engine/src/extension/registry.rs new file mode 100644 index 0000000000..48227cecb2 --- /dev/null +++ b/rust/otap-dataflow/crates/engine/src/extension/registry.rs @@ -0,0 +1,587 @@ +// Copyright The OpenTelemetry Authors +// SPDX-License-Identifier: Apache-2.0 + +//! Extension registry for storing and retrieving extension trait implementations by name. +//! +//! The registry stores `Box<dyn CloneAnySend>` for type-erased storage and produces +//! `Box<dyn Trait>` for trait-based lookups. It is `Clone` and `Send`; cloning +//! deep-copies each stored extension (which is cheap when the extension itself +//! wraps shared state in `Arc`). +//! +//! Extensions that publish traits override +//! [`Extension::extension_capabilities`](crate::local::extension::Extension::extension_capabilities), +//! using the [`extension_capabilities!`] macro to declare their trait implementations. +//! The engine calls `extension_capabilities()` during pipeline build and inserts the +//! results into the registry. +//! +//! # Extension writer contract +//! +//! Extension structs that publish traits must be `Clone + Send + 'static`. +//! Shared mutable state (e.g. credentials, token senders) should be held behind +//! `Arc` so that independent clones still observe the same state. +//! +//! Extensions that don't publish any traits (pure background tasks) have no +//! `Clone` requirement. +//! +//! # Example +//! +//! ```ignore +//! // In the Extension impl: +//! fn extension_capabilities(&self) -> Vec<CapabilityRegistration> { +//! extension_capabilities!(self => BearerTokenProvider) +//! } +//! +//! // A consumer retrieves an owned trait object: +//! let provider: Box<dyn BearerTokenProvider> = registry +//! .get::<dyn BearerTokenProvider>("azure_auth")?; +//! provider.get_token().await?; +//! 
``` + +use std::any::{Any, TypeId}; +use std::collections::HashMap; + +// ── Sealed trait infrastructure ───────────────────────────────────────────── + +// Sealed module — `pub(crate)` so extension trait files in `extension/` can +// add `impl Sealed` for their own `dyn Trait` types, while external crates +// cannot. +pub(crate) mod private { + /// Sealing trait — prevents external crates from implementing + /// [`ExtensionCapability`](super::ExtensionCapability). + pub trait Sealed {} +} + +/// Marker trait for extension trait types that can be stored in the +/// [`CapabilityRegistry`]. +/// +/// This trait is **sealed** — it can only be implemented inside this crate. +/// Each extension trait file in `extension/` adds its own `impl Sealed` + +/// `impl ExtensionCapability` pair (see +/// [`bearer_token_provider`](super::bearer_token_provider) for the pattern). +pub trait ExtensionCapability: private::Sealed {} + +/// Error type for extension trait operations. +/// +/// Thread-safe error type compatible with any `thiserror`-derived error. +pub type Error = Box; + +// ── CloneAnySend helper trait ──────────────────────────────────────────────── + +/// Internal trait for type-erased, cloneable, `Send` storage. +/// +/// Each concrete `T: Clone + Send + 'static` gets a blanket implementation. +/// `Box` implements `Clone` via `clone_box()`. +pub(crate) trait CloneAnySend: Send { + /// Deep-clone into a new boxed trait object. + fn clone_box(&self) -> Box; + /// Access the concrete value as `&dyn Any` for downcasting. + fn as_any_ref(&self) -> &dyn Any; +} + +impl CloneAnySend for T { + fn clone_box(&self) -> Box { + Box::new(self.clone()) + } + fn as_any_ref(&self) -> &dyn Any { + self + } +} + +impl Clone for Box { + fn clone(&self) -> Self { + // Explicit double-deref so method resolution dispatches through the + // vtable of `dyn CloneAnySend` (→ concrete type), NOT through the + // blanket `CloneAnySend for Box` which would recurse. 
+ (**self).clone_box() + } +} + +// ── RegistryEntry ──────────────────────────────────────────────────────────── + +/// A single entry in the registry: a cloneable concrete value plus a coerce +/// function that knows how to produce `Box` (containing a +/// `Box`) from a `&dyn Any` reference pointing at the concrete type. +/// +/// The `coerce` function pointer is monomorphised at registration time (inside +/// the [`extension_capabilities!`] macro) and is `Copy`, so the entry is +/// cheaply cloneable. +struct RegistryEntry { + /// The concrete extension value, type-erased but cloneable. + value: Box, + /// Clones the concrete value out of `&dyn Any` and wraps it as + /// `Box>` erased to `Box`. + coerce: fn(&dyn Any) -> Box, +} + +impl Clone for RegistryEntry { + fn clone(&self) -> Self { + Self { + value: self.value.clone(), + coerce: self.coerce, + } + } +} + +// ── CapabilityRegistration ──────────────────────────────────────────────────────── + +/// A self-contained registration for one trait that an extension implements. +/// +/// Produced by the [`extension_capabilities!`] macro. Each registration carries: +/// - A cloned copy of the concrete extension value (type-erased) +/// - A monomorphised `coerce` function pointer for producing `Box` +/// - The `TypeId` of `Box` for registry lookup +/// +/// The extension writer just returns `Vec` from +/// [`Extension::extension_capabilities`](crate::local::extension::Extension::extension_capabilities); +/// the engine inserts them into the [`CapabilityRegistry`] by name. +pub struct CapabilityRegistration { + /// `TypeId` of `Box` — used as registry lookup key. + trait_id: TypeId, + /// The concrete extension value, type-erased but cloneable. + value: Box, + /// Monomorphised fn: given `&dyn Any` pointing at the concrete extension + /// type, clone it, wrap in `Box`, and return as + /// `Box`. + coerce: fn(&dyn Any) -> Box, +} + +impl CapabilityRegistration { + /// Creates a new trait registration. 
+ /// + /// This is intended for use by the [`extension_capabilities!`] macro — not for + /// direct use by extension writers. + #[doc(hidden)] + pub fn new( + trait_id: TypeId, + value: impl Clone + Send + 'static, + coerce: fn(&dyn Any) -> Box, + ) -> Self { + Self { + trait_id, + value: Box::new(value), + coerce, + } + } +} + +// ── Public types ───────────────────────────────────────────────────────────── + +/// Error when retrieving an extension trait. +#[derive(Debug)] +pub enum ExtensionError { + /// Extension not found by name. + NotFound { + /// The name of the extension that was not found. + name: String, + }, + /// Extension found but doesn't implement the requested trait. + TraitNotImplemented { + /// The name of the extension. + name: String, + /// The expected trait name. + expected: &'static str, + }, +} + +impl std::fmt::Display for ExtensionError { + fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { + match self { + ExtensionError::NotFound { name } => { + write!(f, "extension '{}' not found", name) + } + ExtensionError::TraitNotImplemented { name, expected } => { + write!( + f, + "extension '{}' does not implement trait {}", + name, expected + ) + } + } + } +} + +impl std::error::Error for ExtensionError {} + +// ── CapabilityRegistry ──────────────────────────────────────────────────────── + +/// Registry for extension trait implementations. +/// +/// Extensions register themselves here during pipeline build so other components +/// can look them up by name and retrieve `Box` references. +/// +/// The registry is `Clone` and `Send`. Cloning deep-copies each stored +/// extension value (cheap when the extension wraps shared state in `Arc`). +/// Each `get` call returns a freshly-cloned `Box`. +#[derive(Default, Clone)] +pub struct CapabilityRegistry { + /// `(extension_name, TypeId::of::>())` → `RegistryEntry` + handles: HashMap<(String, TypeId), RegistryEntry>, +} + +impl CapabilityRegistry { + /// Create a new empty registry. 
+ #[must_use] + pub fn new() -> Self { + Self { + handles: HashMap::new(), + } + } + + /// Insert pre-built trait registrations for an extension. + /// + /// Each [`CapabilityRegistration`] carries a cloned value and coerce function. + /// This method inserts them into the registry keyed by `(name, trait_id)`. + /// + /// Called by the engine during pipeline build — not intended for direct use + /// by extension writers. + pub(crate) fn register_all(&mut self, name: &str, registrations: Vec) { + for reg in registrations { + let entry = RegistryEntry { + value: reg.value, + coerce: reg.coerce, + }; + let _ = self.handles.insert((name.to_string(), reg.trait_id), entry); + } + } + + /// Get an owned clone of a trait implementation by extension name. + /// + /// Returns `Box` — a fresh clone produced from the stored + /// extension value. The clone shares any `Arc`-wrapped state with the + /// original and with other clones. + /// + /// # Type Parameters + /// + /// * `T` - The trait type (e.g., `dyn BearerTokenProvider`). + /// + /// # Errors + /// + /// Returns `ExtensionError::NotFound` if no extension with that name exists. + /// Returns `ExtensionError::TraitNotImplemented` if the extension doesn't expose that trait. + /// + /// # Example + /// + /// ```ignore + /// let provider: Box = registry + /// .get::("azure_auth")?; + /// provider.get_token().await?; + /// ``` + pub fn get(&self, name: &str) -> Result, ExtensionError> { + let key = (name.to_string(), TypeId::of::>()); + let entry = self.handles.get(&key).ok_or_else(|| { + // Distinguish "extension not found" from "trait not implemented" + let has_any = self.handles.keys().any(|(n, _)| n == name); + if has_any { + ExtensionError::TraitNotImplemented { + name: name.to_string(), + expected: std::any::type_name::(), + } + } else { + ExtensionError::NotFound { + name: name.to_string(), + } + } + })?; + + // Coerce produces Box that is actually Box>. 
+ // Explicit deref (*entry.value) ensures we dispatch through the vtable + // of `dyn CloneAnySend` to reach the concrete type, not the blanket + // impl on `Box` itself. + let erased = (entry.coerce)((*entry.value).as_any_ref()); + let double_boxed = erased + .downcast::>() + .expect("TypeId matched but downcast failed — this is a bug"); + + Ok(*double_boxed) + } + + /// Check if an extension exists by name. + #[must_use] + pub fn contains(&self, name: &str) -> bool { + self.handles.keys().any(|(n, _)| n == name) + } + + /// Returns the number of registered extensions (unique names). + #[must_use] + pub fn len(&self) -> usize { + self.handles + .keys() + .map(|(n, _)| n) + .collect::>() + .len() + } + + /// Returns true if no extensions are registered. + #[must_use] + pub fn is_empty(&self) -> bool { + self.handles.is_empty() + } + + /// Returns an iterator over unique extension names. + pub fn names(&self) -> impl Iterator { + self.handles + .keys() + .map(|(n, _)| n) + .collect::>() + .into_iter() + } +} + +impl std::fmt::Debug for CapabilityRegistry { + fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { + let names: Vec<&String> = self.names().collect(); + f.debug_struct("CapabilityRegistry") + .field("extensions", &names) + .finish() + } +} + +/// Macro to declare which extension traits a concrete type implements. +/// +/// Has two forms: +/// +/// ## Convenience form (inside `impl Extension` block) +/// +/// Expands to a complete +/// [`Extension::extension_capabilities`](crate::local::extension::Extension::extension_capabilities) +/// method definition. Place it directly inside an `impl Extension` block: +/// +/// ```ignore +/// #[async_trait(?Send)] +/// impl Extension for MyExtension { +/// otap_df_engine::extension_capabilities!(BearerTokenProvider, SomeOtherTrait); +/// +/// async fn start(...) { ... 
} +/// } +/// ``` +/// +/// ## Explicit form (returns `Vec`) +/// +/// Returns `Vec` — self-contained registrations each carrying +/// a cloned copy of `self` and a monomorphised coerce function pointer. The +/// extension writer returns this from +/// [`Extension::extension_capabilities`](crate::local::extension::Extension::extension_capabilities); +/// the engine inserts the registrations into the [`CapabilityRegistry`] by name. +/// +/// ```ignore +/// fn extension_capabilities(&self) -> Vec { +/// extension_capabilities!(self => BearerTokenProvider) +/// } +/// ``` +/// +/// # Type Safety +/// +/// The macro verifies at compile time that: +/// - Each listed trait implements [`ExtensionCapability`] (sealed) +/// - The concrete type implements each listed trait plus `Clone + Send + 'static` +#[macro_export] +macro_rules! extension_capabilities { + // Explicit form: `extension_capabilities!(self => Trait1, Trait2)` + ($self:expr => $($trait:ident),* $(,)?) => {{ + let mut __regs: Vec<$crate::extension::registry::CapabilityRegistration> = Vec::new(); + $( + { + // Compile-time: ensure the trait is a sealed ExtensionCapability. + const _: fn() = || { + fn assert_extension_capability() {} + assert_extension_capability::(); + }; + + // Generic coerce fn — monomorphised for concrete T by the call + // to `__make_reg` below. + fn __coerce( + any: &dyn std::any::Any, + ) -> Box { + let concrete = any + .downcast_ref::() + .expect("registry entry type mismatch — this is a bug"); + let cloned = concrete.clone(); + let trait_obj: Box = Box::new(cloned); + Box::new(trait_obj) as Box + } + + // Generic helper whose T is inferred from $self. 
+ fn __make_reg( + instance: &T, + ) -> $crate::extension::registry::CapabilityRegistration { + $crate::extension::registry::CapabilityRegistration::new( + std::any::TypeId::of::>(), + instance.clone(), + __coerce::, + ) + } + + __regs.push(__make_reg($self)); + } + )* + __regs + }}; + // Convenience form: `extension_capabilities!(Trait1, Trait2)` + // Expands to a full method definition inside an `impl Extension` block. + ($($trait:ident),* $(,)?) => { + fn extension_capabilities(&self) -> Vec<$crate::extension::registry::CapabilityRegistration> { + $crate::extension_capabilities!(self => $($trait),*) + } + }; +} + +#[cfg(test)] +mod tests { + use super::*; + use crate::extension::bearer_token_provider::BearerToken; + use crate::extension::bearer_token_provider::BearerTokenProvider; + use tokio::sync::watch; + + #[derive(Clone)] + struct TestTokenProvider { + token: String, + } + + #[async_trait::async_trait] + impl BearerTokenProvider for TestTokenProvider { + async fn get_token(&self) -> Result { + Ok(BearerToken::new(self.token.clone(), 0)) + } + + fn subscribe_token_refresh(&self) -> watch::Receiver> { + let (tx, rx) = watch::channel(None); + drop(tx); + rx + } + } + + /// Helper: register a TestTokenProvider with the given name. 
+ fn register_provider(registry: &mut CapabilityRegistry, name: &str, token: &str) { + let instance = TestTokenProvider { + token: token.to_string(), + }; + let regs = crate::extension_capabilities!(&instance => BearerTokenProvider); + registry.register_all(name, regs); + } + + #[test] + fn test_register_and_get() { + let mut registry = CapabilityRegistry::new(); + register_provider(&mut registry, "test_ext", "test_token"); + + let result: Result, _> = + registry.get::("test_ext"); + assert!(result.is_ok()); + } + + #[test] + fn test_get_returns_independent_clones() { + let mut registry = CapabilityRegistry::new(); + register_provider(&mut registry, "ext", "shared_test"); + + let a: Box = + registry.get::("ext").unwrap(); + let b: Box = + registry.get::("ext").unwrap(); + + // Both are independent clones (different pointers) + assert!(!std::ptr::eq( + &*a as *const dyn BearerTokenProvider, + &*b as *const dyn BearerTokenProvider, + )); + } + + #[test] + fn test_registry_clone_produces_deep_copy() { + let mut registry = CapabilityRegistry::new(); + register_provider(&mut registry, "ext", "clone_test"); + + let cloned = registry.clone(); + + let from_original: Box = + registry.get::("ext").unwrap(); + let from_clone: Box = + cloned.get::("ext").unwrap(); + + // Deep copy — different pointers + assert!(!std::ptr::eq( + &*from_original as *const dyn BearerTokenProvider, + &*from_clone as *const dyn BearerTokenProvider, + )); + } + + #[test] + fn test_not_found() { + let registry = CapabilityRegistry::new(); + let result = registry.get::("missing"); + assert!(matches!(result, Err(ExtensionError::NotFound { .. 
}))); + } + + #[test] + fn test_extension_error_display() { + let not_found = ExtensionError::NotFound { + name: "missing_ext".to_string(), + }; + let display = format!("{}", not_found); + assert!(display.contains("missing_ext")); + assert!(display.contains("not found")); + + let not_impl = ExtensionError::TraitNotImplemented { + name: "my_ext".to_string(), + expected: "BearerTokenProvider", + }; + let display = format!("{}", not_impl); + assert!(display.contains("my_ext")); + assert!(display.contains("BearerTokenProvider")); + } + + #[test] + fn test_registry_debug() { + let mut registry = CapabilityRegistry::new(); + register_provider(&mut registry, "test_ext", "test"); + + let debug_str = format!("{:?}", registry); + assert!(debug_str.contains("CapabilityRegistry")); + assert!(debug_str.contains("test_ext")); + } + + #[test] + fn test_contains_and_len() { + let mut registry = CapabilityRegistry::new(); + assert!(registry.is_empty()); + assert_eq!(registry.len(), 0); + + register_provider(&mut registry, "ext", "test"); + assert!(registry.contains("ext")); + assert!(!registry.contains("missing")); + assert_eq!(registry.len(), 1); + } + + #[tokio::test] + async fn test_get_extension_actually_works() { + let mut registry = CapabilityRegistry::new(); + register_provider(&mut registry, "auth", "real_token"); + + let provider: Box = + registry.get::("auth").unwrap(); + let token = provider.get_token().await.unwrap(); + assert_eq!(token.token.secret(), "real_token"); + } + + #[test] + fn test_multiple_extensions_same_trait() { + let mut registry = CapabilityRegistry::new(); + register_provider(&mut registry, "azure_prod", "prod_token"); + register_provider(&mut registry, "azure_staging", "staging_token"); + + assert_eq!(registry.len(), 2); + + let _p1 = registry + .get::("azure_prod") + .unwrap(); + let _p2 = registry + .get::("azure_staging") + .unwrap(); + } + + #[test] + fn test_registry_is_send() { + fn assert_send() {} + assert_send::(); + } +} diff --git 
a/rust/otap-dataflow/crates/engine/src/lib.rs b/rust/otap-dataflow/crates/engine/src/lib.rs index ad5233d80d..0934f5b959 100644 --- a/rust/otap-dataflow/crates/engine/src/lib.rs +++ b/rust/otap-dataflow/crates/engine/src/lib.rs @@ -9,12 +9,14 @@ use crate::{ CHANNEL_MODE_LOCAL, CHANNEL_MODE_SHARED, CHANNEL_TYPE_MPMC, CHANNEL_TYPE_MPSC, ChannelMetricsRegistry, ChannelReceiverMetrics, ChannelSenderMetrics, }, - config::{ExporterConfig, ProcessorConfig, ReceiverConfig}, + config::{ExporterConfig, ExtensionConfig, ProcessorConfig, ReceiverConfig}, control::{AckMsg, CallData, NackMsg}, effect_handler::SourceTagging, entity_context::{NodeTelemetryGuard, NodeTelemetryHandle, with_node_telemetry_handle}, error::{Error, TypedError}, exporter::ExporterWrapper, + extension::ExtensionWrapper, + extension::registry::CapabilityRegistry, local::message::{LocalReceiver, LocalSender}, message::{Receiver, Sender}, node::{Node, NodeDefs, NodeId, NodeName, NodeType}, @@ -49,6 +51,7 @@ use std::{ pub mod error; pub mod exporter; +pub mod extension; pub mod message; pub mod processor; pub mod receiver; @@ -92,6 +95,7 @@ pub struct ReceiverFactory { node: NodeId, node_config: Arc, receiver_config: &ReceiverConfig, + capability_registry: &CapabilityRegistry, ) -> Result, otap_df_config::error::Error>, /// Optional wiring constraints enforced during pipeline build. pub wiring_contract: wiring_contract::WiringContract, @@ -131,6 +135,7 @@ pub struct ProcessorFactory { node: NodeId, node_config: Arc, processor_config: &ProcessorConfig, + capability_registry: &CapabilityRegistry, ) -> Result, otap_df_config::error::Error>, /// Optional wiring constraints enforced during pipeline build. pub wiring_contract: wiring_contract::WiringContract, @@ -162,7 +167,7 @@ impl NamedFactory for ProcessorFactory { /// A factory for creating exporter. pub struct ExporterFactory { - /// The name of the receiver. + /// The name of the exporter. 
pub name: &'static str, /// A function that creates a new exporter instance. pub create: fn( @@ -170,6 +175,7 @@ pub struct ExporterFactory { node: NodeId, node_config: Arc, exporter_config: &ExporterConfig, + capability_registry: &CapabilityRegistry, ) -> Result, otap_df_config::error::Error>, /// Optional wiring constraints enforced during pipeline build. pub wiring_contract: wiring_contract::WiringContract, @@ -199,6 +205,45 @@ impl NamedFactory for ExporterFactory { } } +/// A factory for creating extensions. +/// +/// Extension factories are NOT generic over PData — extensions never process +/// pipeline data. This makes them fully decoupled from the data-plane type. +pub struct ExtensionFactory { + /// The name of the extension. + pub name: &'static str, + /// A function that creates a new extension instance. + pub create: fn( + pipeline: PipelineContext, + node: NodeId, + node_config: Arc, + extension_config: &ExtensionConfig, + ) -> Result, + /// Validates the node-specific config statically, without creating the component. + /// + /// Use [`otap_df_config::validation::validate_typed_config`] for components with a + /// typed `Config` struct, or [`otap_df_config::validation::no_config`] for components + /// that accept no user configuration. + pub validate_config: fn(config: &serde_json::Value) -> Result<(), otap_df_config::error::Error>, +} + +// Note: We don't use `#[derive(Clone)]` here for consistency with other factories. +impl Clone for ExtensionFactory { + fn clone(&self) -> Self { + ExtensionFactory { + name: self.name, + create: self.create, + validate_config: self.validate_config, + } + } +} + +impl NamedFactory for ExtensionFactory { + fn name(&self) -> &'static str { + self.name + } +} + /// Returns a map of factory names to factory instances. pub fn get_factory_map( factory_map: &'static OnceLock>, @@ -421,15 +466,18 @@ pub const fn build_factory() -> PipelineFactory { /// A pipeline factory. 
/// -/// This factory contains a registry of all the micro-factories for receivers, processors, and -/// exporters, as well as the logic for creating pipelines based on a given configuration. +/// This factory contains a registry of all the micro-factories for receivers, processors, +/// exporters, and extensions, as well as the logic for creating pipelines based on a given +/// configuration. pub struct PipelineFactory { receiver_factory_map: OnceLock>>, processor_factory_map: OnceLock>>, exporter_factory_map: OnceLock>>, + extension_factory_map: OnceLock>, receiver_factories: &'static [ReceiverFactory], processor_factories: &'static [ProcessorFactory], exporter_factories: &'static [ExporterFactory], + extension_factories: &'static [ExtensionFactory], } impl PipelineFactory { @@ -444,9 +492,31 @@ impl PipelineFactory { receiver_factory_map: OnceLock::new(), processor_factory_map: OnceLock::new(), exporter_factory_map: OnceLock::new(), + extension_factory_map: OnceLock::new(), + receiver_factories, + processor_factories, + exporter_factories, + extension_factories: &[], + } + } + + /// Creates a new factory registry with the given factory slices, including extensions. + #[must_use] + pub const fn with_extensions( + receiver_factories: &'static [ReceiverFactory], + processor_factories: &'static [ProcessorFactory], + exporter_factories: &'static [ExporterFactory], + extension_factories: &'static [ExtensionFactory], + ) -> Self { + Self { + receiver_factory_map: OnceLock::new(), + processor_factory_map: OnceLock::new(), + exporter_factory_map: OnceLock::new(), + extension_factory_map: OnceLock::new(), receiver_factories, processor_factories, exporter_factories, + extension_factories, } } @@ -480,16 +550,28 @@ impl PipelineFactory { }) } + /// Gets the extension factory map, initializing it if necessary. 
+ pub fn get_extension_factory_map(&self) -> &HashMap<&'static str, ExtensionFactory> { + self.extension_factory_map.get_or_init(|| { + self.extension_factories + .iter() + .map(|f| (f.name(), f.clone())) + .collect::>() + }) + } + /// Builds a runtime pipeline from the given pipeline configuration. /// /// Main phases: - /// 1) Create runtime nodes and register telemetry. - /// 2) Plan hyper edge wiring: resolve destinations, pick channel type (shared/local, + /// 1) Create runtime nodes (receivers, processors, exporters) and extensions, + /// then register telemetry. + /// 2) Build the extension registry from trait registrations. + /// 3) Plan hyper-edge wiring: resolve destinations, pick channel type (shared/local, /// MPSC/MPMC), create channel endpoints, and register channel metrics. - /// 3) Apply wiring: attach senders to source ports and receivers to destination nodes, + /// 4) Apply wiring: attach senders to source ports and receivers to destination nodes, /// then publish collected channel metrics on the pipeline. 
/// - /// [config] -> [nodes] -> [hyper-edges] -> [wiring plan] -> [pipeline] + /// [config] -> [nodes + extensions] -> [extension registry] -> [hyper-edges] -> [wiring plan] -> [pipeline] /// /// The `internal_telemetry` settings are injected into any receiver with the /// `INTERNAL_TELEMETRY_RECEIVER_URN` plugin URN, enabling it to consume logs @@ -502,9 +584,6 @@ impl PipelineFactory { telemetry_policy: TelemetryPolicy, internal_telemetry: Option, ) -> Result, Error> { - let mut receivers = Vec::new(); - let mut processors = Vec::new(); - let mut exporters = Vec::new(); let mut build_state = BuildState::new(); let pipeline_group_id = pipeline_ctx.pipeline_group_id(); @@ -559,6 +638,7 @@ impl PipelineFactory { let mut receiver_count = 0usize; let mut processor_count = 0usize; let mut exporter_count = 0usize; + let mut extension_count = 0usize; let mut node_ids: HashMap = HashMap::new(); for (name, node_config) in config.node_iter() { @@ -578,6 +658,9 @@ impl PipelineFactory { exporter_count += 1; (NodeType::Exporter, pn) } + otap_df_config::node::NodeKind::Extension => { + return Err(Error::ExtensionInNodesSection { node: name.clone() }); + } otap_df_config::node::NodeKind::ProcessorChain => { return Err(Error::UnsupportedNodeKind { kind: "ProcessorChain".into(), @@ -588,6 +671,14 @@ impl PipelineFactory { let _ = node_ids.insert(name.clone(), node_id); } + // Allocate node IDs for extensions from the dedicated `extensions` section. + for (name, _ext_config) in config.extension_iter() { + let pn = PipeNode::new(extension_count); + extension_count += 1; + let node_id = build_state.next_node_id(name.clone(), NodeType::Extension, pn)?; + let _ = node_ids.insert(name.clone(), node_id); + } + let node_names: NodeNameIndex = Arc::new( node_ids .iter() @@ -596,9 +687,50 @@ impl PipelineFactory { ); pipeline_ctx.set_node_names(node_names); - // Second pass: create runtime nodes. 
Node IDs were pre-assigned above, + // Second pass: create extension runtime nodes FIRST so the capability + // registry is available when data-path nodes are created. + let mut extensions = Vec::new(); + for (name, node_config) in config.extension_iter() { + let node_id = node_ids.get(name).expect("allocated in first pass").clone(); + let base_ctx = pipeline_ctx.with_node_context( + name.clone(), + node_config.r#type.clone(), + otap_df_config::node::NodeKind::Extension, + node_config.identity_attributes(), + ); + let wrapper = self.build_node_wrapper( + &mut build_state, + &base_ctx, + NodeType::Extension, + node_id.clone(), + channel_metrics_enabled, + || { + self.create_extension( + &base_ctx, + node_id.clone(), + node_config.clone(), + channel_capacity_policy.control.node, + ) + }, + )?; + extensions.push(wrapper); + } + + // Build capability registry from extension trait registrations. + let mut extension_registry = CapabilityRegistry::new(); + for ext in &extensions { + let name = ext.node_id().name.as_ref().to_string(); + ext.register_traits(&mut extension_registry, &name); + } + + // Third pass: create data-path runtime nodes. Node IDs were pre-assigned above, // so we look them up from `node_ids` instead of calling `next_node_id`. + // The capability_registry is passed to each factory so components can resolve + // capabilities at construction time. // ToDo(LQ): Collect all errors instead of failing fast to provide better feedback. 
+ let mut receivers = Vec::new(); + let mut processors = Vec::new(); + let mut exporters = Vec::new(); for (name, node_config) in config.node_iter() { let node_kind = node_config.kind(); let node_id = node_ids.get(name).expect("allocated in first pass").clone(); @@ -633,6 +765,7 @@ impl PipelineFactory { node_config.clone(), channel_capacity_policy.control.node, channel_capacity_policy.pdata, + &extension_registry, ) }, )?; @@ -652,6 +785,7 @@ impl PipelineFactory { node_config.clone(), channel_capacity_policy.control.node, channel_capacity_policy.pdata, + &extension_registry, ) }, )?; @@ -671,11 +805,16 @@ impl PipelineFactory { node_config.clone(), channel_capacity_policy.control.node, channel_capacity_policy.pdata, + &extension_registry, ) }, )?; exporters.push(wrapper); } + otap_df_config::node::NodeKind::Extension => { + // Rejected in first pass — extensions must be in the `extensions` section. + unreachable!("rejected in first pass"); + } otap_df_config::node::NodeKind::ProcessorChain => { // ToDo(LQ): Implement processor chain optimization to eliminate intermediary channels. unreachable!("rejected in first pass"); @@ -694,6 +833,7 @@ impl PipelineFactory { receivers, processors, exporters, + extensions, nodes, telemetry_policy, ); @@ -772,6 +912,11 @@ impl PipelineFactory { kind: "ProcessorChain".into(), }); } + otap_df_config::node::NodeKind::Extension => { + // Extensions are in the `extensions` section and don't participate + // in pdata wiring; skip if somehow encountered. 
+ continue; + } }; _ = contracts_by_node.insert(node_name.as_ref().to_string().into(), contract); @@ -1309,6 +1454,7 @@ impl PipelineFactory { node_config: Arc, control_channel_capacity: usize, pdata_channel_capacity: usize, + capability_registry: &CapabilityRegistry, ) -> Result, Error> { let pipeline_group_id = pipeline_ctx.pipeline_group_id(); let pipeline_id = pipeline_ctx.pipeline_id(); @@ -1348,6 +1494,7 @@ impl PipelineFactory { node_id.clone(), node_config, &runtime_config, + capability_registry, ) .map_err(|e| Error::ConfigError(Box::new(e)))?; @@ -1370,6 +1517,7 @@ impl PipelineFactory { node_config: Arc, control_channel_capacity: usize, pdata_channel_capacity: usize, + capability_registry: &CapabilityRegistry, ) -> Result, Error> { let pipeline_group_id = pipeline_ctx.pipeline_group_id(); let pipeline_id = pipeline_ctx.pipeline_id(); @@ -1409,6 +1557,7 @@ impl PipelineFactory { node_id.clone(), node_config.clone(), &processor_config, + capability_registry, ) .map_err(|e| Error::ConfigError(Box::new(e)))?; @@ -1431,6 +1580,7 @@ impl PipelineFactory { node_config: Arc, control_channel_capacity: usize, pdata_channel_capacity: usize, + capability_registry: &CapabilityRegistry, ) -> Result, Error> { let pipeline_group_id = pipeline_ctx.pipeline_group_id(); let pipeline_id = pipeline_ctx.pipeline_id(); @@ -1470,6 +1620,7 @@ impl PipelineFactory { node_id.clone(), node_config, &exporter_config, + capability_registry, ) .map_err(|e| Error::ConfigError(Box::new(e)))?; @@ -1483,6 +1634,63 @@ impl PipelineFactory { Ok(exporter) } + + /// Creates an extension node. 
+ fn create_extension( + &self, + pipeline_ctx: &PipelineContext, + node_id: NodeId, + node_config: Arc, + control_channel_capacity: usize, + ) -> Result { + let pipeline_group_id = pipeline_ctx.pipeline_group_id(); + let pipeline_id = pipeline_ctx.pipeline_id(); + let core_id = pipeline_ctx.core_id(); + let name = node_id.name.clone(); + + otel_debug!( + "extension.create.start", + pipeline_group_id = pipeline_group_id.as_ref(), + pipeline_id = pipeline_id.as_ref(), + core_id = core_id, + node_id = name.as_ref(), + ); + + // Validate plugin URN structure during registration + let normalized = otap_df_config::node_urn::validate_plugin_urn( + node_config.r#type.as_ref(), + otap_df_config::node::NodeKind::Extension, + ) + .map_err(|e| Error::ConfigError(Box::new(e)))?; + + let factory = self + .get_extension_factory_map() + .get(normalized.as_str()) + .ok_or(Error::UnknownExtension { + plugin_urn: normalized, + })?; + let extension_config = + ExtensionConfig::with_control_channel_capacity(name.clone(), control_channel_capacity); + let create = factory.create; + + let extension = create( + (*pipeline_ctx).clone(), + node_id.clone(), + node_config, + &extension_config, + ) + .map_err(|e| Error::ConfigError(Box::new(e)))?; + + otel_debug!( + "extension.create.complete", + pipeline_group_id = pipeline_group_id.as_ref(), + pipeline_id = pipeline_id.as_ref(), + core_id = core_id, + node_id = name.as_ref(), + ); + + Ok(extension) + } } trait TelemetryWrapped: Sized { @@ -1598,6 +1806,7 @@ impl BuildState { NodeType::Receiver => Error::ReceiverAlreadyExists { receiver: node_id }, NodeType::Processor => Error::ProcessorAlreadyExists { processor: node_id }, NodeType::Exporter => Error::ExporterAlreadyExists { exporter: node_id }, + NodeType::Extension => Error::ExtensionAlreadyExists { extension: node_id }, }); } @@ -1631,7 +1840,9 @@ impl BuildState { let registration = self.registration(name)?; match registration.node_type { NodeType::Processor | NodeType::Exporter => 
Ok(registration.node_id.clone()), - NodeType::Receiver => Err(Error::UnknownNode { node: name.clone() }), + NodeType::Receiver | NodeType::Extension => { + Err(Error::UnknownNode { node: name.clone() }) + } } } } diff --git a/rust/otap-dataflow/crates/engine/src/local/extension.rs b/rust/otap-dataflow/crates/engine/src/local/extension.rs new file mode 100644 index 0000000000..8a2086afcb --- /dev/null +++ b/rust/otap-dataflow/crates/engine/src/local/extension.rs @@ -0,0 +1,192 @@ +// Copyright The OpenTelemetry Authors +// SPDX-License-Identifier: Apache-2.0 + +//! Trait and structures used to implement local extensions (!Send). +//! +//! An extension is a long-lived component that runs alongside the pipeline and +//! exposes functionality (e.g., authentication, service discovery) to other +//! components through the [`CapabilityRegistry`](crate::extension::registry::CapabilityRegistry). +//! +//! Unlike receivers, processors, and exporters, extensions: +//! - Do NOT process pipeline data (PData) +//! - Do NOT have input/output pdata channels +//! - Only receive control messages (shutdown, timer ticks, config updates) +//! +//! # Thread Safety +//! +//! This implementation is designed to be used in a single-threaded environment. +//! The `Extension` trait does not require the `Send` bound on the returned future, +//! allowing for the use of non-thread-safe types. +//! +//! # Scalability +//! +//! To ensure scalability, the pipeline engine will start multiple instances of the same pipeline +//! in parallel on different cores, each with its own extension instance. 
+ +use crate::control::ExtensionControlMsg; +use crate::error::Error; +use crate::extension::registry::CapabilityRegistration; +use crate::local::message::LocalReceiver; +use crate::node::NodeId; +use crate::terminal_state::TerminalState; +use async_trait::async_trait; +use otap_df_channel::error::RecvError; +use otap_df_telemetry::reporter::MetricsReporter; + +/// A trait for pipeline extensions (!Send definition). +/// +/// Extensions are long-lived components that run alongside the pipeline and +/// expose functionality (e.g., authentication, service discovery) to other +/// components through the [`CapabilityRegistry`](crate::extension::registry::CapabilityRegistry). +/// +/// Unlike receivers, processors, and exporters, extensions are NOT generic over +/// PData — they never process pipeline data. +/// +/// # Example +/// +/// ```ignore +/// use async_trait::async_trait; +/// use otap_df_engine::local::extension::{Extension, EffectHandler, ControlChannel}; +/// use otap_df_engine::control::ExtensionControlMsg; +/// use otap_df_engine::terminal_state::TerminalState; +/// use otap_df_engine::error::Error; +/// +/// struct MyAuthExtension { /* ... */ } +/// +/// #[async_trait(?Send)] +/// impl Extension for MyAuthExtension { +/// async fn start( +/// self: Box, +/// mut ctrl_chan: ControlChannel, +/// effect_handler: EffectHandler, +/// ) -> Result { +/// loop { +/// match ctrl_chan.recv().await? { +/// ExtensionControlMsg::Shutdown { .. } => break, +/// _ => {} +/// } +/// } +/// Ok(TerminalState::default()) +/// } +/// } +/// ``` +#[async_trait(?Send)] +pub trait Extension { + /// Starts the extension. + /// + /// The pipeline engine calls this to start the extension in a dedicated task. + /// Extensions are started BEFORE receivers, processors, and exporters so that + /// their capabilities are available when data-path components initialize. + /// + /// The extension is taken as `Box` so the method takes ownership once + /// `start` is called. 
This lets it move into an independent task, after which + /// the pipeline can only reach it through the control-message channel. + /// + /// # Parameters + /// + /// - `ctrl_chan`: A channel to receive control messages. Extensions only + /// receive [`ExtensionControlMsg`] — never PData. + /// - `effect_handler`: A handler to perform side effects such as + /// info logging. + /// + /// # Errors + /// + /// Returns an [`Error`] if an unrecoverable error occurs. + async fn start( + self: Box, + ctrl_chan: ControlChannel, + effect_handler: EffectHandler, + ) -> Result; + + /// Returns extension trait registrations for this extension. + /// + /// Override this method to publish traits that other pipeline components can + /// consume via `registry.get::(name)`. The default + /// implementation returns an empty vec — suitable for pure background-task + /// extensions that do not expose any traits. + /// + /// Inside the override, use the [`extension_capabilities!`](crate::extension_capabilities!) macro: + /// + /// ```ignore + /// fn extension_capabilities(&self) -> Vec { + /// extension_capabilities!(self => BearerTokenProvider) + /// } + /// ``` + fn extension_capabilities(&self) -> Vec { + Vec::new() + } +} + +/// A channel for receiving control messages for local extensions. +/// +/// Extensions only receive control messages (shutdown, timer ticks, config updates). +/// They do not process pipeline data (PData). +/// +/// Unlike the shared variant, there is no draining/deadline logic. A `Shutdown` +/// message is returned immediately when received. Extensions shut down last +/// (after data-plane nodes), so there is typically nothing left to drain. +pub struct ControlChannel { + control_rx: Option>, +} + +impl ControlChannel { + /// Creates a new `ControlChannel` with the given control receiver. 
+ #[must_use] + pub const fn new(control_rx: LocalReceiver) -> Self { + ControlChannel { + control_rx: Some(control_rx), + } + } + + /// Asynchronously receives the next control message. + /// + /// # Errors + /// + /// Returns a [`RecvError`] if the channel is closed. + pub async fn recv(&mut self) -> Result { + let rx = self.control_rx.as_mut().ok_or(RecvError::Closed)?; + rx.recv().await + } +} + +/// A `!Send` implementation of the EffectHandler for extensions. +/// +/// Provides extensions with the ability to: +/// - Print info messages +/// - Access node identity +/// +/// Extensions manage their own timers directly via `tokio::time` rather than +/// through the engine's timer infrastructure, keeping the extension system +/// fully PData-free. +#[derive(Clone)] +pub struct EffectHandler { + node_id: NodeId, + #[allow(dead_code)] + metrics_reporter: MetricsReporter, +} + +impl EffectHandler { + /// Creates a new local (!Send) `EffectHandler` for the given extension node. + #[must_use] + pub const fn new(node_id: NodeId, metrics_reporter: MetricsReporter) -> Self { + EffectHandler { + node_id, + metrics_reporter, + } + } + + /// Returns the id of the extension associated with this handler. + #[must_use] + pub fn extension_id(&self) -> NodeId { + self.node_id.clone() + } + + /// Print an info message to stdout. + pub async fn info(&self, message: &str) { + use tokio::io::{AsyncWriteExt, stdout}; + let mut out = stdout(); + let _ = out.write_all(message.as_bytes()).await; + let _ = out.write_all(b"\n").await; + let _ = out.flush().await; + } +} diff --git a/rust/otap-dataflow/crates/engine/src/local/mod.rs b/rust/otap-dataflow/crates/engine/src/local/mod.rs index b4bb3e477d..5298d4a9e2 100644 --- a/rust/otap-dataflow/crates/engine/src/local/mod.rs +++ b/rust/otap-dataflow/crates/engine/src/local/mod.rs @@ -1,9 +1,11 @@ // Copyright The OpenTelemetry Authors // SPDX-License-Identifier: Apache-2.0 -//! 
Traits and structs defining the local (!Send) version of receivers, processors, and exporters. +//! Traits and structs defining the local (!Send) version of receivers, processors, exporters, +//! and extensions. pub mod exporter; +pub mod extension; pub mod message; pub mod processor; pub mod receiver; diff --git a/rust/otap-dataflow/crates/engine/src/node.rs b/rust/otap-dataflow/crates/engine/src/node.rs index ff9f460b04..25b49eab81 100644 --- a/rust/otap-dataflow/crates/engine/src/node.rs +++ b/rust/otap-dataflow/crates/engine/src/node.rs @@ -59,6 +59,8 @@ pub enum NodeType { Processor, /// Represents a node that exports data to an external destination. Exporter, + /// Represents a node that provides shared capabilities to other components. + Extension, } /// Trait for nodes that can send pdata to a specific port. diff --git a/rust/otap-dataflow/crates/engine/src/pipeline_ctrl.rs b/rust/otap-dataflow/crates/engine/src/pipeline_ctrl.rs index ee1230c381..33de1a6f28 100644 --- a/rust/otap-dataflow/crates/engine/src/pipeline_ctrl.rs +++ b/rust/otap-dataflow/crates/engine/src/pipeline_ctrl.rs @@ -16,7 +16,8 @@ use crate::context::PipelineContext; use crate::control::RouteData; use crate::control::UnwindData; use crate::control::{ - AckMsg, ControlSenders, NackMsg, NodeControlMsg, PipelineControlMsg, PipelineCtrlMsgReceiver, + AckMsg, ControlSenders, ExtensionControlMsg, ExtensionControlSender, NackMsg, NodeControlMsg, + PipelineControlMsg, PipelineCtrlMsgReceiver, }; use crate::error::Error; use crate::pipeline_metrics::PipelineMetricsMonitor; @@ -216,8 +217,11 @@ pub struct PipelineCtrlMsgManager { pipeline_context: PipelineContext, /// Receives control messages from nodes (e.g., start/cancel timer). pipeline_ctrl_msg_receiver: PipelineCtrlMsgReceiver, - /// Allows sending control messages back to nodes. + /// Allows sending control messages back to data-plane nodes. control_senders: ControlSenders, + /// Extension control senders for shutdown-last. 
+ /// Extensions are shut down AFTER data-plane nodes have drained. + extension_control_senders: Vec, /// Repeating timers for generic TimerTick. tick_timers: TimerSet, /// Repeating timers for telemetry collection (CollectTelemetry). @@ -252,6 +256,7 @@ impl PipelineCtrlMsgManager { pipeline_context: PipelineContext, pipeline_ctrl_msg_receiver: PipelineCtrlMsgReceiver, control_senders: ControlSenders, + extension_control_senders: Vec, event_reporter: ObservedEventReporter, metrics_reporter: MetricsReporter, telemetry_policy: TelemetryPolicy, @@ -263,6 +268,7 @@ impl PipelineCtrlMsgManager { pipeline_context, pipeline_ctrl_msg_receiver, control_senders, + extension_control_senders, tick_timers: TimerSet::new(), telemetry_timers: TimerSet::new(), delayed_data: BinaryHeap::new(), @@ -558,6 +564,10 @@ impl PipelineCtrlMsgManager { } } + // Shutdown-last: send shutdown to extensions AFTER data-plane nodes + // have drained and the main loop has exited. + self.shutdown_extensions().await; + // Final metrics flush on shutdown. if self.telemetry.channel_metrics >= MetricLevel::Normal { let _ = self.report_node_metrics(); @@ -627,6 +637,28 @@ impl PipelineCtrlMsgManager { Ok(()) } + /// Sends `ExtensionControlMsg::Shutdown` to all extensions. + /// + /// Called after the data-plane draining loop exits, ensuring extensions + /// remain available to data-plane nodes throughout their shutdown sequence. + async fn shutdown_extensions(&self) { + let deadline = Instant::now() + Duration::from_secs(5); + let reason = "pipeline shutdown complete".to_string(); + for ext_sender in &self.extension_control_senders { + let msg = ExtensionControlMsg::Shutdown { + deadline, + reason: reason.clone(), + }; + if let Err(e) = ext_sender.send(msg).await { + otel_warn!( + "extension.shutdown.send_failed", + node_id = ext_sender.node_id.name.as_ref(), + error = ?e + ); + } + } + } + /// Non-blocking send: try to deliver immediately, buffer on backpressure. 
/// /// Previously this method was `async` and fell back to `sender.send(msg).await` @@ -863,6 +895,7 @@ mod tests { pipeline_context, pipeline_rx, control_senders, + Vec::new(), observed_state_store.reporter(SendPolicy::default()), metrics_reporter, TelemetryPolicy::default(), @@ -1308,6 +1341,7 @@ mod tests { pipeline_context, pipeline_rx, ControlSenders::new(), + Vec::new(), observed_state_store.reporter(SendPolicy::default()), metrics_reporter, TelemetryPolicy::default(), @@ -2241,6 +2275,7 @@ mod tests { pipeline_context, pipeline_rx, control_senders, + Vec::new(), observed_state_store.reporter(SendPolicy::default()), metrics_reporter, telemetry_policy, @@ -2939,6 +2974,7 @@ mod tests { pipeline_context, pipeline_rx, control_senders, + Vec::new(), observed_state_store.reporter(SendPolicy::default()), metrics_reporter, TelemetryPolicy::default(), diff --git a/rust/otap-dataflow/crates/engine/src/processor.rs b/rust/otap-dataflow/crates/engine/src/processor.rs index 12668e8e1e..43c4109ab4 100644 --- a/rust/otap-dataflow/crates/engine/src/processor.rs +++ b/rust/otap-dataflow/crates/engine/src/processor.rs @@ -250,7 +250,7 @@ impl ProcessorWrapper { source_tag, } => { let (control_sender, control_receiver) = - wrap_control_channel_metrics::( + wrap_control_channel_metrics::>( &node_id, pipeline_ctx, channel_metrics, @@ -286,7 +286,7 @@ impl ProcessorWrapper { source_tag, } => { let (control_sender, control_receiver) = - wrap_control_channel_metrics::( + wrap_control_channel_metrics::>( &node_id, pipeline_ctx, channel_metrics, diff --git a/rust/otap-dataflow/crates/engine/src/receiver.rs b/rust/otap-dataflow/crates/engine/src/receiver.rs index c79f9373c4..af8fb45b1d 100644 --- a/rust/otap-dataflow/crates/engine/src/receiver.rs +++ b/rust/otap-dataflow/crates/engine/src/receiver.rs @@ -234,7 +234,7 @@ impl ReceiverWrapper { .. 
} => { let (control_sender, control_receiver) = - wrap_control_channel_metrics::( + wrap_control_channel_metrics::>( &node_id, pipeline_ctx, channel_metrics, @@ -271,7 +271,7 @@ impl ReceiverWrapper { .. } => { let (control_sender, control_receiver) = - wrap_control_channel_metrics::( + wrap_control_channel_metrics::>( &node_id, pipeline_ctx, channel_metrics, diff --git a/rust/otap-dataflow/crates/engine/src/runtime_pipeline.rs b/rust/otap-dataflow/crates/engine/src/runtime_pipeline.rs index 003b5de170..c02410f8ae 100644 --- a/rust/otap-dataflow/crates/engine/src/runtime_pipeline.rs +++ b/rust/otap-dataflow/crates/engine/src/runtime_pipeline.rs @@ -9,10 +9,12 @@ use crate::Unwindable; use crate::channel_metrics::{ChannelMetricsHandle, ConsumedMetrics, ProducedMetrics}; use crate::context::PipelineContext; use crate::control::{ - ControlSenders, Controllable, NodeControlMsg, PipelineCtrlMsgReceiver, PipelineCtrlMsgSender, + ControlSenders, Controllable, ExtensionControlSender, NodeControlMsg, PipelineCtrlMsgReceiver, + PipelineCtrlMsgSender, }; use crate::entity_context::{NodeTaskContext, NodeTelemetryHandle, instrument_with_node_context}; use crate::error::{Error, TypedError}; +use crate::extension::ExtensionWrapper; use crate::node::{Node, NodeDefs, NodeId, NodeType, NodeWithPDataReceiver, NodeWithPDataSender}; use crate::pipeline_ctrl::{NodeMetricHandles, PipelineCtrlMsgManager}; use crate::terminal_state::TerminalState; @@ -90,8 +92,10 @@ pub struct RuntimePipeline { processors: Vec>, /// A map node id to exporter runtime node. exporters: Vec>, + /// Extension runtime nodes (PData-free). + extensions: Vec, - /// A precomputed map of all node IDs to their Node trait objects (? @@@) for efficient access + /// A precomputed map of all node IDs to their node definitions for efficient access. /// Indexed by NodeIndex nodes: NodeDefs, /// Channel metrics handles collected during build. 
@@ -127,6 +131,7 @@ impl RuntimePipeline { receivers: Vec>, processors: Vec>, exporters: Vec>, + extensions: Vec, nodes: NodeDefs, telemetry_policy: TelemetryPolicy, ) -> Self { @@ -135,6 +140,7 @@ impl RuntimePipeline { receivers, processors, exporters, + extensions, nodes, channel_metrics: Default::default(), telemetry_policy, @@ -148,7 +154,7 @@ impl RuntimePipeline { /// Returns the number of nodes in the pipeline. #[must_use] pub const fn node_count(&self) -> usize { - self.receivers.len() + self.processors.len() + self.exporters.len() + self.receivers.len() + self.processors.len() + self.exporters.len() + self.extensions.len() } /// Returns a reference to the pipeline configuration. @@ -177,6 +183,7 @@ impl RuntimePipeli receivers, processors, exporters, + extensions, nodes: _nodes, channel_metrics, telemetry_policy, @@ -194,8 +201,45 @@ impl RuntimePipeli // ToDo create an optimized version of FuturesUnordered that can be used for !Send, !Sync tasks let mut futures = FuturesUnordered::new(); let mut control_senders = ControlSenders::default(); + let mut extension_control_senders: Vec = Vec::new(); let mut node_metric_entries: Vec<(usize, NodeMetricHandles)> = Vec::new(); + // Spawn extension tasks first so they are ready before other components. + // Extensions do NOT register in control_senders (they use ExtensionControlMsg, + // not NodeControlMsg). They do NOT receive pipeline_ctrl_msg_tx. + // Instead, their control senders are tracked separately for shutdown-last. 
+ for extension in extensions { + let mut extension = extension; + extension_control_senders.push(extension.extension_control_sender()); + let telemetry_guard = extension.take_telemetry_guard(); + let node_entity_key = telemetry_guard.as_ref().map(|t| t.entity_key()); + let telemetry_handle = telemetry_guard.as_ref().map(|t| t.handle()); + let effect_metrics_reporter = metrics_reporter.clone(); + let final_metrics_reporter = metrics_reporter.clone(); + let fut = async move { + let result = extension + .start(effect_metrics_reporter) + .await + .map(|terminal_state| { + report_terminal_metrics(&final_metrics_reporter, terminal_state); + }); + drop(telemetry_guard); + result + }; + if let Some(handle) = telemetry_handle { + let input_key = handle.input_channel_key(); + let output_keys = handle.output_channel_keys(); + let node_ctx = + NodeTaskContext::new(node_entity_key, Some(handle), input_key, output_keys); + futures.push(local_tasks.spawn_local(instrument_with_node_context(node_ctx, fut))); + } else if let Some(key) = node_entity_key { + let node_ctx = NodeTaskContext::new(Some(key), None, None, Vec::new()); + futures.push(local_tasks.spawn_local(instrument_with_node_context(node_ctx, fut))); + } else { + futures.push(local_tasks.spawn_local(fut)); + } + } + // Spawn node tasks and register their control senders, scoping telemetry where available. for exporter in exporters { let mut exporter = exporter; @@ -347,6 +391,7 @@ impl RuntimePipeli pipeline_context, pipeline_ctrl_msg_rx, control_senders, + extension_control_senders, event_reporter, metrics_reporter, telemetry_policy, @@ -391,7 +436,9 @@ impl RuntimePipeli } impl RuntimePipeline { - /// Gets a reference to any node by its ID as a Node trait object + /// Gets a reference to any node by its ID as a Node trait object. + /// + /// Returns `None` for extensions — they do not implement `Node`. 
#[must_use] pub fn get_node(&self, node_id: usize) -> Option<&dyn Node> { let ndef = self.nodes.get(node_id)?; @@ -409,6 +456,8 @@ impl RuntimePipeline { .exporters .get(ndef.inner.index) .map(|e| e as &dyn Node), + // Extensions are PData-free and don't implement Node. + NodeType::Extension => None, } } @@ -430,6 +479,7 @@ impl RuntimePipeline { .get_mut(ndef.inner.index) .map(|p| p as &mut dyn NodeWithPDataSender), NodeType::Exporter => None, + NodeType::Extension => None, } } @@ -451,10 +501,15 @@ impl RuntimePipeline { .exporters .get_mut(ndef.inner.index) .map(|e| e as &mut dyn NodeWithPDataReceiver), + NodeType::Extension => None, } } /// Sends a node control message to the specified node. + /// + /// Extensions cannot receive `NodeControlMsg` — they use + /// `ExtensionControlMsg` instead. Attempting to send to an extension + /// returns an error. pub async fn send_node_control_message( &self, node_id: &NodeId, @@ -483,6 +538,15 @@ impl RuntimePipeline { .send_control_msg(ctrl_msg) .await } + NodeType::Extension => { + // Extensions use ExtensionControlMsg, not NodeControlMsg. + return Err(TypedError::Error(Error::InternalError { + message: format!( + "cannot send NodeControlMsg to extension {:?}; use ExtensionControlMsg", + node_id + ), + })); + } } .map_err(|e| TypedError::NodeControlMsgSendError { node_id: node_id.index, diff --git a/rust/otap-dataflow/crates/engine/src/shared/extension.rs b/rust/otap-dataflow/crates/engine/src/shared/extension.rs new file mode 100644 index 0000000000..66fc1085df --- /dev/null +++ b/rust/otap-dataflow/crates/engine/src/shared/extension.rs @@ -0,0 +1,222 @@ +// Copyright The OpenTelemetry Authors +// SPDX-License-Identifier: Apache-2.0 + +//! Trait and structures used to implement shared extensions (Send bound). +//! +//! An extension is a long-lived component that runs alongside the pipeline and +//! exposes functionality (e.g., authentication, service discovery) to other +//! 
components through the [`CapabilityRegistry`](crate::extension::registry::CapabilityRegistry). +//! +//! Unlike receivers, processors, and exporters, extensions: +//! - Do NOT process pipeline data (PData) +//! - Do NOT have input/output pdata channels +//! - Only receive control messages (shutdown, timer ticks, config updates) +//! +//! # Thread Safety +//! +//! This implementation is designed for use in both single-threaded and multi-threaded environments. +//! The `Extension` trait requires the `Send` bound, enabling the use of thread-safe types. +//! +//! # Scalability +//! +//! To ensure scalability, the pipeline engine will start multiple instances of the same pipeline +//! in parallel on different cores, each with its own extension instance. + +use crate::control::ExtensionControlMsg; +use crate::error::Error; +use crate::extension::registry::CapabilityRegistration; +use crate::node::NodeId; +use crate::shared::message::SharedReceiver; +use crate::terminal_state::TerminalState; +use async_trait::async_trait; +use otap_df_channel::error::RecvError; +use otap_df_telemetry::reporter::MetricsReporter; +use std::pin::Pin; +use std::time::Instant; +use tokio::time::{Sleep, sleep_until}; + +/// A trait for pipeline extensions (Send definition). +/// +/// Extensions are long-lived components that run alongside the pipeline and +/// expose functionality (e.g., authentication, service discovery) to other +/// components through the [`CapabilityRegistry`](crate::extension::registry::CapabilityRegistry). +/// +/// Unlike receivers, processors, and exporters, extensions are NOT generic over +/// PData — they never process pipeline data. +#[async_trait] +pub trait Extension: Send { + /// Starts the extension. + /// + /// The pipeline engine calls this to start the extension in a dedicated task. + /// Extensions are started BEFORE receivers, processors, and exporters so that + /// their capabilities are available when data-path components initialize. 
+ /// + /// The extension is taken as `Box` so the method takes ownership once + /// `start` is called. This lets it move into an independent task, after which + /// the pipeline can only reach it through the control-message channel. + /// + /// Unlike the local variant, the returned future is `Send` (via `#[async_trait]`), + /// enabling use in multi-threaded runtime contexts. + /// + /// # Parameters + /// + /// - `ctrl_chan`: A channel to receive control messages. Extensions do not + /// receive PData messages — only control messages (shutdown, timer, config). + /// - `effect_handler`: A handler to perform side effects such as + /// info logging. + /// + /// # Errors + /// + /// Returns an [`Error`] if an unrecoverable error occurs. + async fn start( + self: Box, + ctrl_chan: ControlChannel, + effect_handler: EffectHandler, + ) -> Result; + + /// Returns extension trait registrations for this extension. + /// + /// Override this method to publish traits that other pipeline components can + /// consume via `registry.get::(name)`. The default + /// implementation returns an empty vec — suitable for pure background-task + /// extensions that do not expose any traits. + fn extension_capabilities(&self) -> Vec { + Vec::new() + } +} + +/// A channel for receiving control messages for shared extensions. +/// +/// Extensions only receive control messages (shutdown, timer ticks, config updates). +/// They do not process pipeline data (PData). +/// +/// When a `Shutdown` message arrives with a future deadline, the channel waits +/// until the deadline expires, then returns the `Shutdown`. No further messages +/// are delivered during this grace period. +pub struct ControlChannel { + control_rx: Option>, + /// Once a Shutdown is seen, this is set to `Some(instant)` at which point + /// no more messages will be accepted. + shutting_down_deadline: Option, + /// Holds the Shutdown message until after we've finished draining. 
+ pending_shutdown: Option, +} + +impl ControlChannel { + /// Creates a new `ControlChannel` with the given control receiver. + #[must_use] + pub const fn new(control_rx: SharedReceiver) -> Self { + ControlChannel { + control_rx: Some(control_rx), + shutting_down_deadline: None, + pending_shutdown: None, + } + } + + /// Asynchronously receives the next control message to process. + /// + /// # Errors + /// + /// Returns a [`RecvError`] if the channel is closed. + pub async fn recv(&mut self) -> Result { + let mut sleep_until_deadline: Option>> = None; + + loop { + if self.control_rx.is_none() { + return Err(RecvError::Closed); + } + + // Draining mode: Shutdown pending + if let Some(dl) = self.shutting_down_deadline { + if Instant::now() >= dl { + let shutdown = self + .pending_shutdown + .take() + .expect("pending_shutdown must exist"); + self.shutdown(); + return Ok(shutdown); + } + + if sleep_until_deadline.is_none() { + sleep_until_deadline = Some(Box::pin(sleep_until(dl.into()))); + } + + tokio::select! { + biased; + _ = sleep_until_deadline.as_mut().expect("sleep_until_deadline must exist") => { + let shutdown = self.pending_shutdown + .take() + .expect("pending_shutdown must exist"); + self.shutdown(); + return Ok(shutdown); + } + } + } + + // Normal mode: no shutdown yet + tokio::select! 
{ + biased; + ctrl = self.control_rx.as_mut().expect("control_rx must exist").recv() => match ctrl { + Ok(ExtensionControlMsg::Shutdown { deadline, reason }) => { + if deadline.duration_since(Instant::now()).is_zero() { + self.shutdown(); + return Ok(ExtensionControlMsg::Shutdown { deadline, reason }); + } + self.shutting_down_deadline = Some(deadline); + self.pending_shutdown = Some(ExtensionControlMsg::Shutdown { deadline, reason }); + continue; + } + Ok(msg) => return Ok(msg), + Err(e) => return Err(e), + }, + } + } + } + + fn shutdown(&mut self) { + self.shutting_down_deadline = None; + drop(self.control_rx.take().expect("control_rx must exist")); + } +} + +/// A `Send` implementation of the EffectHandler for extensions. +/// +/// Provides extensions with the ability to: +/// - Print info messages +/// - Access node identity +/// +/// Extensions manage their own timers directly via `tokio::time` rather than +/// through the engine's timer infrastructure, keeping the extension system +/// fully PData-free. +#[derive(Clone)] +pub struct EffectHandler { + node_id: NodeId, + #[allow(dead_code)] + metrics_reporter: MetricsReporter, +} + +impl EffectHandler { + /// Creates a new shared (Send) `EffectHandler` for the given extension node. + #[must_use] + pub const fn new(node_id: NodeId, metrics_reporter: MetricsReporter) -> Self { + EffectHandler { + node_id, + metrics_reporter, + } + } + + /// Returns the id of the extension associated with this handler. + #[must_use] + pub fn extension_id(&self) -> NodeId { + self.node_id.clone() + } + + /// Print an info message to stdout. 
+ pub async fn info(&self, message: &str) { + use tokio::io::{AsyncWriteExt, stdout}; + let mut out = stdout(); + let _ = out.write_all(message.as_bytes()).await; + let _ = out.write_all(b"\n").await; + let _ = out.flush().await; + } +} diff --git a/rust/otap-dataflow/crates/engine/src/shared/mod.rs b/rust/otap-dataflow/crates/engine/src/shared/mod.rs index 3f6f246f76..0e9d066850 100644 --- a/rust/otap-dataflow/crates/engine/src/shared/mod.rs +++ b/rust/otap-dataflow/crates/engine/src/shared/mod.rs @@ -1,9 +1,11 @@ // Copyright The OpenTelemetry Authors // SPDX-License-Identifier: Apache-2.0 -//! Traits and structs defining the shared (Send) version of receivers, processors, and exporters. +//! Traits and structs defining the shared (Send) version of receivers, processors, exporters, +//! and extensions. pub mod exporter; +pub mod extension; pub mod message; pub mod processor; pub mod receiver; diff --git a/rust/otap-dataflow/crates/engine/src/testing/exporter.rs b/rust/otap-dataflow/crates/engine/src/testing/exporter.rs index 087221908f..4a772847ec 100644 --- a/rust/otap-dataflow/crates/engine/src/testing/exporter.rs +++ b/rust/otap-dataflow/crates/engine/src/testing/exporter.rs @@ -15,6 +15,7 @@ use crate::control::{ }; use crate::error::Error; use crate::exporter::ExporterWrapper; +use crate::extension::registry::CapabilityRegistry; use crate::local::message::{LocalReceiver, LocalSender}; use crate::message::{Receiver, Sender}; use crate::node::NodeWithPDataReceiver; @@ -378,5 +379,11 @@ pub fn create_exporter_from_factory( node_config.config = config; let exporter_config = ExporterConfig::new("test_exporter"); - (factory.create)(pipeline_ctx, node, Arc::new(node_config), &exporter_config) + (factory.create)( + pipeline_ctx, + node, + Arc::new(node_config), + &exporter_config, + &CapabilityRegistry::new(), + ) } diff --git a/rust/otap-dataflow/crates/otap/src/attributes_processor.rs b/rust/otap-dataflow/crates/otap/src/attributes_processor.rs index 
4a7cc013d6..865974a821 100644 --- a/rust/otap-dataflow/crates/otap/src/attributes_processor.rs +++ b/rust/otap-dataflow/crates/otap/src/attributes_processor.rs @@ -429,6 +429,7 @@ pub fn create_attributes_processor( node: NodeId, node_config: Arc, processor_config: &ProcessorConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry, ) -> Result, ConfigError> { let mut proc = AttributesProcessor::from_config(&node_config.config)?; proc.metrics = Some(pipeline_ctx.register_metrics::()); @@ -449,8 +450,9 @@ pub static ATTRIBUTES_PROCESSOR_FACTORY: otap_df_engine::ProcessorFactory, - proc_cfg: &ProcessorConfig| { - create_attributes_processor(pipeline_ctx, node, node_config, proc_cfg) + proc_cfg: &ProcessorConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| { + create_attributes_processor(pipeline_ctx, node, node_config, proc_cfg, _capability_registry) }, wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED, validate_config: otap_df_config::validation::validate_typed_config::, @@ -633,9 +635,14 @@ mod tests { let rt: TestRuntime = TestRuntime::new(); let mut node_config = NodeUserConfig::new_processor_config(ATTRIBUTES_PROCESSOR_URN); node_config.config = cfg; - let proc = - create_attributes_processor(pipeline_ctx, node, Arc::new(node_config), rt.config()) - .expect("create processor"); + let proc = create_attributes_processor( + pipeline_ctx, + node, + Arc::new(node_config), + rt.config(), + &otap_df_engine::extension::registry::CapabilityRegistry::new(), + ) + .expect("create processor"); let phase = rt.set_processor(proc); phase @@ -715,9 +722,14 @@ mod tests { let rt: TestRuntime = TestRuntime::new(); let mut node_config = NodeUserConfig::new_processor_config(ATTRIBUTES_PROCESSOR_URN); node_config.config = cfg; - let proc = - create_attributes_processor(pipeline_ctx, node, Arc::new(node_config), rt.config()) - .expect("create processor"); + let proc = 
create_attributes_processor( + pipeline_ctx, + node, + Arc::new(node_config), + rt.config(), + &otap_df_engine::extension::registry::CapabilityRegistry::new(), + ) + .expect("create processor"); let phase = rt.set_processor(proc); phase @@ -802,9 +814,14 @@ mod tests { let rt: TestRuntime = TestRuntime::new(); let mut node_config = NodeUserConfig::new_processor_config(ATTRIBUTES_PROCESSOR_URN); node_config.config = cfg; - let proc = - create_attributes_processor(pipeline_ctx, node, Arc::new(node_config), rt.config()) - .expect("create processor"); + let proc = create_attributes_processor( + pipeline_ctx, + node, + Arc::new(node_config), + rt.config(), + &otap_df_engine::extension::registry::CapabilityRegistry::new(), + ) + .expect("create processor"); let phase = rt.set_processor(proc); phase @@ -883,9 +900,14 @@ mod tests { let rt: TestRuntime = TestRuntime::new(); let mut node_config = NodeUserConfig::new_processor_config(ATTRIBUTES_PROCESSOR_URN); node_config.config = cfg; - let proc = - create_attributes_processor(pipeline_ctx, node, Arc::new(node_config), rt.config()) - .expect("create processor"); + let proc = create_attributes_processor( + pipeline_ctx, + node, + Arc::new(node_config), + rt.config(), + &otap_df_engine::extension::registry::CapabilityRegistry::new(), + ) + .expect("create processor"); let phase = rt.set_processor(proc); phase @@ -953,9 +975,14 @@ mod tests { let rt: TestRuntime = TestRuntime::new(); let mut node_config = NodeUserConfig::new_processor_config(ATTRIBUTES_PROCESSOR_URN); node_config.config = cfg; - let proc = - create_attributes_processor(pipeline_ctx, node, Arc::new(node_config), rt.config()) - .expect("create processor"); + let proc = create_attributes_processor( + pipeline_ctx, + node, + Arc::new(node_config), + rt.config(), + &otap_df_engine::extension::registry::CapabilityRegistry::new(), + ) + .expect("create processor"); let phase = rt.set_processor(proc); phase @@ -1021,9 +1048,14 @@ mod tests { let rt: TestRuntime = 
TestRuntime::new(); let mut node_config = NodeUserConfig::new_processor_config(ATTRIBUTES_PROCESSOR_URN); node_config.config = cfg; - let proc = - create_attributes_processor(pipeline_ctx, node, Arc::new(node_config), rt.config()) - .expect("create processor"); + let proc = create_attributes_processor( + pipeline_ctx, + node, + Arc::new(node_config), + rt.config(), + &otap_df_engine::extension::registry::CapabilityRegistry::new(), + ) + .expect("create processor"); let phase = rt.set_processor(proc); phase @@ -1092,9 +1124,14 @@ mod tests { let rt: TestRuntime = TestRuntime::new(); let mut node_config = NodeUserConfig::new_processor_config(ATTRIBUTES_PROCESSOR_URN); node_config.config = cfg; - let proc = - create_attributes_processor(pipeline_ctx, node, Arc::new(node_config), rt.config()) - .expect("create processor"); + let proc = create_attributes_processor( + pipeline_ctx, + node, + Arc::new(node_config), + rt.config(), + &otap_df_engine::extension::registry::CapabilityRegistry::new(), + ) + .expect("create processor"); let phase = rt.set_processor(proc); phase @@ -1176,9 +1213,14 @@ mod tests { let rt: TestRuntime = TestRuntime::new(); let mut node_config = NodeUserConfig::new_processor_config(ATTRIBUTES_PROCESSOR_URN); node_config.config = cfg; - let proc = - create_attributes_processor(pipeline_ctx, node, Arc::new(node_config), rt.config()) - .expect("create processor"); + let proc = create_attributes_processor( + pipeline_ctx, + node, + Arc::new(node_config), + rt.config(), + &otap_df_engine::extension::registry::CapabilityRegistry::new(), + ) + .expect("create processor"); let phase = rt.set_processor(proc); phase @@ -1246,9 +1288,14 @@ mod tests { let rt: TestRuntime = TestRuntime::new(); let mut node_config = NodeUserConfig::new_processor_config(ATTRIBUTES_PROCESSOR_URN); node_config.config = cfg; - let proc = - create_attributes_processor(pipeline_ctx, node, Arc::new(node_config), rt.config()) - .expect("create processor"); + let proc = 
create_attributes_processor( + pipeline_ctx, + node, + Arc::new(node_config), + rt.config(), + &otap_df_engine::extension::registry::CapabilityRegistry::new(), + ) + .expect("create processor"); let phase = rt.set_processor(proc); phase @@ -1322,9 +1369,14 @@ mod tests { let rt: TestRuntime = TestRuntime::new(); let mut node_config = NodeUserConfig::new_processor_config(ATTRIBUTES_PROCESSOR_URN); node_config.config = cfg; - let proc = - create_attributes_processor(pipeline_ctx, node, Arc::new(node_config), rt.config()) - .expect("create processor"); + let proc = create_attributes_processor( + pipeline_ctx, + node, + Arc::new(node_config), + rt.config(), + &otap_df_engine::extension::registry::CapabilityRegistry::new(), + ) + .expect("create processor"); let phase = rt.set_processor(proc); phase @@ -1402,9 +1454,14 @@ mod tests { let rt: TestRuntime = TestRuntime::new(); let mut node_config = NodeUserConfig::new_processor_config(ATTRIBUTES_PROCESSOR_URN); node_config.config = cfg; - let proc = - create_attributes_processor(pipeline_ctx, node, Arc::new(node_config), rt.config()) - .expect("create processor"); + let proc = create_attributes_processor( + pipeline_ctx, + node, + Arc::new(node_config), + rt.config(), + &otap_df_engine::extension::registry::CapabilityRegistry::new(), + ) + .expect("create processor"); let phase = rt.set_processor(proc); phase @@ -1481,9 +1538,14 @@ mod tests { let rt: TestRuntime = TestRuntime::new(); let mut node_config = NodeUserConfig::new_processor_config(ATTRIBUTES_PROCESSOR_URN); node_config.config = cfg; - let proc = - create_attributes_processor(pipeline_ctx, node, Arc::new(node_config), rt.config()) - .expect("create processor"); + let proc = create_attributes_processor( + pipeline_ctx, + node, + Arc::new(node_config), + rt.config(), + &otap_df_engine::extension::registry::CapabilityRegistry::new(), + ) + .expect("create processor"); let phase = rt.set_processor(proc); phase @@ -1551,8 +1613,14 @@ mod telemetry_tests { 
node_cfg.config = cfg; let proc_cfg = ProcessorConfig::new("attr_proc"); let node = test_node(proc_cfg.name.clone()); - let proc = create_attributes_processor(pipeline_ctx, node, Arc::new(node_cfg), &proc_cfg) - .expect("create processor"); + let proc = create_attributes_processor( + pipeline_ctx, + node, + Arc::new(node_cfg), + &proc_cfg, + &otap_df_engine::extension::registry::CapabilityRegistry::new(), + ) + .expect("create processor"); // 4) Build a minimal OTLP logs request that has a signal-level attribute 'a' let input_bytes = { diff --git a/rust/otap-dataflow/crates/otap/src/batch_processor.rs b/rust/otap-dataflow/crates/otap/src/batch_processor.rs index 270fa31b30..b1e477f7af 100644 --- a/rust/otap-dataflow/crates/otap/src/batch_processor.rs +++ b/rust/otap-dataflow/crates/otap/src/batch_processor.rs @@ -1034,6 +1034,7 @@ pub fn create_otap_batch_processor( node: NodeId, node_config: Arc, processor_config: &ProcessorConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry, ) -> Result, ConfigError> { let metrics = pipeline_ctx.register_metrics::(); let proc = BatchProcessor::build_from_json(&node_config.config, metrics)?; @@ -1317,8 +1318,9 @@ pub static OTAP_BATCH_PROCESSOR_FACTORY: otap_df_engine::ProcessorFactory, - proc_cfg: &ProcessorConfig| { - create_otap_batch_processor(pipeline_ctx, node, node_config, proc_cfg) + proc_cfg: &ProcessorConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| { + create_otap_batch_processor(pipeline_ctx, node, node_config, proc_cfg, _capability_registry) }, wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED, validate_config: otap_df_config::validation::validate_typed_config::, @@ -1393,9 +1395,14 @@ mod tests { let mut node_config = NodeUserConfig::new_processor_config(OTAP_BATCH_PROCESSOR_URN); node_config.config = cfg; let proc_config = ProcessorConfig::new("batch"); - let proc = - create_otap_batch_processor(pipeline_ctx, 
node, Arc::new(node_config), &proc_config) - .expect("create processor"); + let proc = create_otap_batch_processor( + pipeline_ctx, + node, + Arc::new(node_config), + &proc_config, + &otap_df_engine::extension::registry::CapabilityRegistry::new(), + ) + .expect("create processor"); let phase = rt.set_processor(proc); @@ -1481,8 +1488,14 @@ mod tests { // Create processor via factory and ensure the provided NodeUserConfig is preserved let proc_cfg = ProcessorConfig::new("batch"); let node = test_node(proc_cfg.name.clone()); - let wrapper = create_otap_batch_processor(pipeline_ctx, node, nuc.clone(), &proc_cfg) - .expect("factory should succeed"); + let wrapper = create_otap_batch_processor( + pipeline_ctx, + node, + nuc.clone(), + &proc_cfg, + &otap_df_engine::extension::registry::CapabilityRegistry::new(), + ) + .expect("factory should succeed"); let uc = wrapper.user_config(); assert!(uc.outputs.iter().any(|port| port.as_ref() == "main_output")); diff --git a/rust/otap-dataflow/crates/otap/src/console_exporter.rs b/rust/otap-dataflow/crates/otap/src/console_exporter.rs index 921e6fdb60..1b0215f18c 100644 --- a/rust/otap-dataflow/crates/otap/src/console_exporter.rs +++ b/rust/otap-dataflow/crates/otap/src/console_exporter.rs @@ -76,21 +76,23 @@ impl ConsoleExporter { #[distributed_slice(OTAP_EXPORTER_FACTORIES)] pub static CONSOLE_EXPORTER: ExporterFactory = ExporterFactory { name: CONSOLE_EXPORTER_URN, - create: |_pipeline: PipelineContext, - node: NodeId, - node_config: Arc, - exporter_config: &ExporterConfig| { - let config: ConsoleExporterConfig = serde_json::from_value(node_config.config.clone()) - .map_err(|e| ConfigError::InvalidUserConfig { - error: format!("Failed to parse console exporter config: {}", e), - })?; - Ok(ExporterWrapper::local( - ConsoleExporter::new(config), - node, - node_config, - exporter_config, - )) - }, + create: + |_pipeline: PipelineContext, + node: NodeId, + node_config: Arc, + exporter_config: &ExporterConfig, + 
_capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| { + let config: ConsoleExporterConfig = serde_json::from_value(node_config.config.clone()) + .map_err(|e| ConfigError::InvalidUserConfig { + error: format!("Failed to parse console exporter config: {}", e), + })?; + Ok(ExporterWrapper::local( + ConsoleExporter::new(config), + node, + node_config, + exporter_config, + )) + }, wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED, validate_config: otap_df_config::validation::validate_typed_config::, }; diff --git a/rust/otap-dataflow/crates/otap/src/content_router.rs b/rust/otap-dataflow/crates/otap/src/content_router.rs index 3a91b033a5..00ca9074de 100644 --- a/rust/otap-dataflow/crates/otap/src/content_router.rs +++ b/rust/otap-dataflow/crates/otap/src/content_router.rs @@ -630,6 +630,7 @@ pub fn create_content_router( node: NodeId, node_config: Arc, processor_config: &ProcessorConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry, ) -> Result, ConfigError> { let router_config: ContentRouterConfig = serde_json::from_value(node_config.config.clone()) .map_err(|e| ConfigError::InvalidUserConfig { @@ -654,20 +655,24 @@ pub static CONTENT_ROUTER_FACTORY: ProcessorFactory = ProcessorFactor name: CONTENT_ROUTER_URN, wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED, validate_config: otap_df_config::validation::validate_typed_config::, - create: |pipeline: PipelineContext, - node: NodeId, - node_config: Arc, - proc_cfg: &ProcessorConfig| { - let router_config: ContentRouterConfig = serde_json::from_value(node_config.config.clone()) - .map_err(|e| ConfigError::InvalidUserConfig { - error: format!("Failed to parse ContentRouter configuration: {e}"), - })?; - router_config.validate(&node_config.outputs)?; - - let router = ContentRouter::with_pipeline_ctx(pipeline, router_config); - - Ok(ProcessorWrapper::local(router, node, node_config, proc_cfg)) - }, + 
create: + |pipeline: PipelineContext, + node: NodeId, + node_config: Arc, + proc_cfg: &ProcessorConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| { + let router_config: ContentRouterConfig = + serde_json::from_value(node_config.config.clone()).map_err(|e| { + ConfigError::InvalidUserConfig { + error: format!("Failed to parse ContentRouter configuration: {e}"), + } + })?; + router_config.validate(&node_config.outputs)?; + + let router = ContentRouter::with_pipeline_ctx(pipeline, router_config); + + Ok(ProcessorWrapper::local(router, node, node_config, proc_cfg)) + }, }; #[cfg(test)] @@ -1006,6 +1011,7 @@ mod tests { test_node(processor_config.name.clone()), Arc::new(node_config), &processor_config, + &otap_df_engine::extension::registry::CapabilityRegistry::new(), ); assert!(result.is_ok()); } @@ -1025,6 +1031,7 @@ mod tests { test_node(processor_config.name.clone()), Arc::new(node_config), &processor_config, + &otap_df_engine::extension::registry::CapabilityRegistry::new(), ); assert!(result.is_err()); } @@ -1039,6 +1046,7 @@ mod tests { test_node(processor_config.name.clone()), Arc::new(node_config), &processor_config, + &otap_df_engine::extension::registry::CapabilityRegistry::new(), ); assert!(result.is_err()); } diff --git a/rust/otap-dataflow/crates/otap/src/debug_processor.rs b/rust/otap-dataflow/crates/otap/src/debug_processor.rs index ef31d2187b..67d36a91f9 100644 --- a/rust/otap-dataflow/crates/otap/src/debug_processor.rs +++ b/rust/otap-dataflow/crates/otap/src/debug_processor.rs @@ -68,6 +68,7 @@ pub fn create_debug_processor( node: NodeId, node_config: Arc, processor_config: &ProcessorConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry, ) -> Result, ConfigError> { Ok(ProcessorWrapper::local( DebugProcessor::from_config(pipeline_ctx, &node_config.config)?, @@ -86,8 +87,9 @@ pub static DEBUG_PROCESSOR_FACTORY: otap_df_engine::ProcessorFactory create: |pipeline_ctx: 
PipelineContext, node: NodeId, node_config: Arc, - proc_cfg: &ProcessorConfig| { - create_debug_processor(pipeline_ctx, node, node_config, proc_cfg) + proc_cfg: &ProcessorConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| { + create_debug_processor(pipeline_ctx, node, node_config, proc_cfg, _capability_registry) }, wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED, validate_config: otap_df_config::validation::validate_typed_config::, diff --git a/rust/otap-dataflow/crates/otap/src/durable_buffer_processor/mod.rs b/rust/otap-dataflow/crates/otap/src/durable_buffer_processor/mod.rs index d58f4ef21c..cd288c9801 100644 --- a/rust/otap-dataflow/crates/otap/src/durable_buffer_processor/mod.rs +++ b/rust/otap-dataflow/crates/otap/src/durable_buffer_processor/mod.rs @@ -1808,6 +1808,7 @@ pub fn create_durable_buffer( node: NodeId, node_config: Arc, processor_config: &ProcessorConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry, ) -> Result, ConfigError> { let config: DurableBufferConfig = serde_json::from_value(node_config.config.clone()).map_err(|e| { diff --git a/rust/otap-dataflow/crates/otap/src/error_exporter.rs b/rust/otap-dataflow/crates/otap/src/error_exporter.rs index 5fe16cc46a..7a08712a6c 100644 --- a/rust/otap-dataflow/crates/otap/src/error_exporter.rs +++ b/rust/otap-dataflow/crates/otap/src/error_exporter.rs @@ -49,6 +49,7 @@ impl ErrorExporter { node: NodeId, node_config: Arc, exporter_config: &ExporterConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry, ) -> Result, otap_df_config::error::Error> { let config: ErrorExporterConfig = serde_json::from_value(node_config.config.clone()) .map_err(|e| otap_df_config::error::Error::InvalidUserConfig { diff --git a/rust/otap-dataflow/crates/otap/src/fake_data_generator.rs b/rust/otap-dataflow/crates/otap/src/fake_data_generator.rs index e70024f795..40c6660828 100644 --- 
a/rust/otap-dataflow/crates/otap/src/fake_data_generator.rs +++ b/rust/otap-dataflow/crates/otap/src/fake_data_generator.rs @@ -63,17 +63,19 @@ pub struct FakeGeneratorReceiver { #[distributed_slice(OTAP_RECEIVER_FACTORIES)] pub static OTAP_FAKE_DATA_GENERATOR: ReceiverFactory = ReceiverFactory { name: OTAP_FAKE_DATA_GENERATOR_URN, - create: |pipeline: PipelineContext, - node: NodeId, - node_config: Arc, - receiver_config: &ReceiverConfig| { - Ok(ReceiverWrapper::local( - FakeGeneratorReceiver::from_config(pipeline, &node_config.config)?, - node, - node_config, - receiver_config, - )) - }, + create: + |pipeline: PipelineContext, + node: NodeId, + node_config: Arc, + receiver_config: &ReceiverConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| { + Ok(ReceiverWrapper::local( + FakeGeneratorReceiver::from_config(pipeline, &node_config.config)?, + node, + node_config, + receiver_config, + )) + }, wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED, validate_config: otap_df_config::validation::validate_typed_config::, }; diff --git a/rust/otap-dataflow/crates/otap/src/fanout_processor.rs b/rust/otap-dataflow/crates/otap/src/fanout_processor.rs index 81d4b058fd..760f50cb77 100644 --- a/rust/otap-dataflow/crates/otap/src/fanout_processor.rs +++ b/rust/otap-dataflow/crates/otap/src/fanout_processor.rs @@ -1137,6 +1137,7 @@ pub fn create_fanout_processor( node: NodeId, node_config: Arc, processor_config: &ProcessorConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry, ) -> Result, ConfigError> { let fanout = FanoutProcessor::from_config(pipeline_ctx.clone(), &node_config, &node_config.config)?; @@ -1153,12 +1154,20 @@ pub fn create_fanout_processor( #[distributed_slice(OTAP_PROCESSOR_FACTORIES)] pub static FANOUT_PROCESSOR_FACTORY: ProcessorFactory = ProcessorFactory { name: FANOUT_PROCESSOR_URN, - create: |pipeline_ctx: PipelineContext, - node: NodeId, - node_config: Arc, - 
proc_cfg: &ProcessorConfig| { - create_fanout_processor(pipeline_ctx, node, node_config, proc_cfg) - }, + create: + |pipeline_ctx: PipelineContext, + node: NodeId, + node_config: Arc, + proc_cfg: &ProcessorConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| { + create_fanout_processor( + pipeline_ctx, + node, + node_config, + proc_cfg, + _capability_registry, + ) + }, wiring_contract: otap_df_engine::wiring_contract::WiringContract { output_fanout: otap_df_engine::wiring_contract::OutputFanoutRule::AtMostPerOutput(1), }, diff --git a/rust/otap-dataflow/crates/otap/src/filter_processor.rs b/rust/otap-dataflow/crates/otap/src/filter_processor.rs index f4f095513b..9b253a5d6a 100644 --- a/rust/otap-dataflow/crates/otap/src/filter_processor.rs +++ b/rust/otap-dataflow/crates/otap/src/filter_processor.rs @@ -49,6 +49,7 @@ pub fn create_filter_processor( node: NodeId, node_config: Arc, processor_config: &ProcessorConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry, ) -> Result, ConfigError> { Ok(ProcessorWrapper::local( FilterProcessor::from_config(pipeline_ctx, &node_config.config)?, @@ -67,8 +68,9 @@ pub static FILTER_PROCESSOR_FACTORY: otap_df_engine::ProcessorFactory create: |pipeline_ctx: PipelineContext, node: NodeId, node_config: Arc, - proc_cfg: &ProcessorConfig| { - create_filter_processor(pipeline_ctx, node, node_config, proc_cfg) + proc_cfg: &ProcessorConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| { + create_filter_processor(pipeline_ctx, node, node_config, proc_cfg, _capability_registry) }, wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED, validate_config: otap_df_config::validation::validate_typed_config::, diff --git a/rust/otap-dataflow/crates/otap/src/internal_telemetry_receiver.rs b/rust/otap-dataflow/crates/otap/src/internal_telemetry_receiver.rs index 24d7096f33..221fb420ae 100644 --- 
a/rust/otap-dataflow/crates/otap/src/internal_telemetry_receiver.rs +++ b/rust/otap-dataflow/crates/otap/src/internal_telemetry_receiver.rs @@ -52,27 +52,29 @@ pub struct InternalTelemetryReceiver { #[distributed_slice(OTAP_RECEIVER_FACTORIES)] pub static INTERNAL_TELEMETRY_RECEIVER: ReceiverFactory = ReceiverFactory { name: INTERNAL_TELEMETRY_RECEIVER_URN, - create: |mut pipeline: PipelineContext, - node: NodeId, - node_config: Arc, - receiver_config: &ReceiverConfig| { - // Get internal telemetry settings from the pipeline context - let internal_telemetry = pipeline.take_internal_telemetry().ok_or_else(|| { + create: + |mut pipeline: PipelineContext, + node: NodeId, + node_config: Arc, + receiver_config: &ReceiverConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| { + // Get internal telemetry settings from the pipeline context + let internal_telemetry = pipeline.take_internal_telemetry().ok_or_else(|| { otap_df_config::error::Error::InvalidUserConfig { error: "InternalTelemetryReceiver requires internal telemetry settings in pipeline context".to_owned(), } })?; - Ok(ReceiverWrapper::local( - InternalTelemetryReceiver::new_with_telemetry( - InternalTelemetryReceiver::parse_config(&node_config.config)?, - internal_telemetry, - ), - node, - node_config, - receiver_config, - )) - }, + Ok(ReceiverWrapper::local( + InternalTelemetryReceiver::new_with_telemetry( + InternalTelemetryReceiver::parse_config(&node_config.config)?, + internal_telemetry, + ), + node, + node_config, + receiver_config, + )) + }, wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED, validate_config: otap_df_config::validation::validate_typed_config::, }; diff --git a/rust/otap-dataflow/crates/otap/src/noop_exporter.rs b/rust/otap-dataflow/crates/otap/src/noop_exporter.rs index a0c0e8d85b..6bc2b4c007 100644 --- a/rust/otap-dataflow/crates/otap/src/noop_exporter.rs +++ b/rust/otap-dataflow/crates/otap/src/noop_exporter.rs @@ -29,17 
+29,19 @@ pub struct NoopExporter; #[distributed_slice(OTAP_EXPORTER_FACTORIES)] pub static NOOP_EXPORTER: ExporterFactory = ExporterFactory { name: NOOP_EXPORTER_URN, - create: |_pipeline: PipelineContext, - node: NodeId, - node_config: Arc, - exporter_config: &ExporterConfig| { - Ok(ExporterWrapper::local( - NoopExporter {}, - node, - node_config, - exporter_config, - )) - }, + create: + |_pipeline: PipelineContext, + node: NodeId, + node_config: Arc, + exporter_config: &ExporterConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| { + Ok(ExporterWrapper::local( + NoopExporter {}, + node, + node_config, + exporter_config, + )) + }, wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED, validate_config: otap_df_config::validation::no_config, }; diff --git a/rust/otap-dataflow/crates/otap/src/otap_exporter.rs b/rust/otap-dataflow/crates/otap/src/otap_exporter.rs index 0ddd07329b..7e094401ac 100644 --- a/rust/otap-dataflow/crates/otap/src/otap_exporter.rs +++ b/rust/otap-dataflow/crates/otap/src/otap_exporter.rs @@ -63,17 +63,19 @@ pub struct OTAPExporter { #[distributed_slice(OTAP_EXPORTER_FACTORIES)] pub static OTAP_EXPORTER: ExporterFactory = ExporterFactory { name: OTAP_EXPORTER_URN, - create: |pipeline: PipelineContext, - node: NodeId, - node_config: Arc, - exporter_config: &ExporterConfig| { - Ok(ExporterWrapper::local( - OTAPExporter::from_config(pipeline, &node_config.config)?, - node, - node_config, - exporter_config, - )) - }, + create: + |pipeline: PipelineContext, + node: NodeId, + node_config: Arc, + exporter_config: &ExporterConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| { + Ok(ExporterWrapper::local( + OTAPExporter::from_config(pipeline, &node_config.config)?, + node, + node_config, + exporter_config, + )) + }, wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED, validate_config: 
otap_df_config::validation::validate_typed_config::, }; diff --git a/rust/otap-dataflow/crates/otap/src/otap_receiver.rs b/rust/otap-dataflow/crates/otap/src/otap_receiver.rs index 9825cbe543..e128682936 100644 --- a/rust/otap-dataflow/crates/otap/src/otap_receiver.rs +++ b/rust/otap-dataflow/crates/otap/src/otap_receiver.rs @@ -117,17 +117,19 @@ pub struct OTAPReceiver { #[distributed_slice(OTAP_RECEIVER_FACTORIES)] pub static OTAP_RECEIVER: ReceiverFactory = ReceiverFactory { name: OTAP_RECEIVER_URN, - create: |pipeline: PipelineContext, - node: NodeId, - node_config: Arc, - receiver_config: &ReceiverConfig| { - Ok(ReceiverWrapper::shared( - OTAPReceiver::from_config(pipeline, &node_config.config)?, - node, - node_config, - receiver_config, - )) - }, + create: + |pipeline: PipelineContext, + node: NodeId, + node_config: Arc, + receiver_config: &ReceiverConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| { + Ok(ReceiverWrapper::shared( + OTAPReceiver::from_config(pipeline, &node_config.config)?, + node, + node_config, + receiver_config, + )) + }, wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED, validate_config: otap_df_config::validation::validate_typed_config::, }; diff --git a/rust/otap-dataflow/crates/otap/src/otlp_exporter.rs b/rust/otap-dataflow/crates/otap/src/otlp_exporter.rs index 4ce6928548..bd27809f78 100644 --- a/rust/otap-dataflow/crates/otap/src/otlp_exporter.rs +++ b/rust/otap-dataflow/crates/otap/src/otlp_exporter.rs @@ -77,17 +77,19 @@ pub struct OTLPExporter { #[distributed_slice(OTAP_EXPORTER_FACTORIES)] pub static OTLP_EXPORTER: ExporterFactory = ExporterFactory { name: OTLP_EXPORTER_URN, - create: |pipeline: PipelineContext, - node: NodeId, - node_config: Arc, - exporter_config: &ExporterConfig| { - Ok(ExporterWrapper::local( - OTLPExporter::from_config(pipeline, &node_config.config)?, - node, - node_config, - exporter_config, - )) - }, + create: + |pipeline: PipelineContext, 
+ node: NodeId, + node_config: Arc, + exporter_config: &ExporterConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| { + Ok(ExporterWrapper::local( + OTLPExporter::from_config(pipeline, &node_config.config)?, + node, + node_config, + exporter_config, + )) + }, wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED, validate_config: otap_df_config::validation::validate_typed_config::, }; diff --git a/rust/otap-dataflow/crates/otap/src/otlp_http_exporter/mod.rs b/rust/otap-dataflow/crates/otap/src/otlp_http_exporter/mod.rs index ed5200b94b..488303e7a5 100644 --- a/rust/otap-dataflow/crates/otap/src/otlp_http_exporter/mod.rs +++ b/rust/otap-dataflow/crates/otap/src/otlp_http_exporter/mod.rs @@ -93,6 +93,7 @@ fn factory_create( node: NodeId, node_config: Arc, exporter_config: &ExporterConfig, + _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry, ) -> Result, ConfigError> { Ok(ExporterWrapper::local( OtlpHttpExporter::from_config(pipeline, &node_config.config)?, diff --git a/rust/otap-dataflow/crates/otap/src/otlp_receiver.rs b/rust/otap-dataflow/crates/otap/src/otlp_receiver.rs index 89e626ed9e..25854d9533 100644 --- a/rust/otap-dataflow/crates/otap/src/otlp_receiver.rs +++ b/rust/otap-dataflow/crates/otap/src/otlp_receiver.rs @@ -193,20 +193,22 @@ pub struct OTLPReceiver { #[distributed_slice(OTAP_RECEIVER_FACTORIES)] pub static OTLP_RECEIVER: ReceiverFactory = ReceiverFactory { name: OTLP_RECEIVER_URN, - create: |pipeline: PipelineContext, - node: NodeId, - node_config: Arc, - receiver_config: &ReceiverConfig| { - let mut receiver = OTLPReceiver::from_config(pipeline, &node_config.config)?; - receiver.tune_max_concurrent_requests(receiver_config.output_pdata_channel.capacity); - - Ok(ReceiverWrapper::shared( - receiver, - node, - node_config, - receiver_config, - )) - }, + create: + |pipeline: PipelineContext, + node: NodeId, + node_config: Arc, + receiver_config: 
&ReceiverConfig,
+         _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| {
+            let mut receiver = OTLPReceiver::from_config(pipeline, &node_config.config)?;
+            receiver.tune_max_concurrent_requests(receiver_config.output_pdata_channel.capacity);
+
+            Ok(ReceiverWrapper::shared(
+                receiver,
+                node,
+                node_config,
+                receiver_config,
+            ))
+        },
     wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED,
     validate_config: otap_df_config::validation::validate_typed_config::,
 };
diff --git a/rust/otap-dataflow/crates/otap/src/parquet_exporter.rs b/rust/otap-dataflow/crates/otap/src/parquet_exporter.rs
index 5dfe539b97..ca971e61d9 100644
--- a/rust/otap-dataflow/crates/otap/src/parquet_exporter.rs
+++ b/rust/otap-dataflow/crates/otap/src/parquet_exporter.rs
@@ -75,17 +75,19 @@ pub struct ParquetExporter {
 #[distributed_slice(OTAP_EXPORTER_FACTORIES)]
 pub static PARQUET_EXPORTER: ExporterFactory = ExporterFactory {
     name: PARQUET_EXPORTER_URN,
-    create: |pipeline: PipelineContext,
-             node: NodeId,
-             node_config: Arc,
-             exporter_config: &ExporterConfig| {
-        Ok(ExporterWrapper::local(
-            ParquetExporter::from_config(pipeline, &node_config.config)?,
-            node,
-            node_config,
-            exporter_config,
-        ))
-    },
+    create:
+        |pipeline: PipelineContext,
+         node: NodeId,
+         node_config: Arc,
+         exporter_config: &ExporterConfig,
+         _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| {
+            Ok(ExporterWrapper::local(
+                ParquetExporter::from_config(pipeline, &node_config.config)?,
+                node,
+                node_config,
+                exporter_config,
+            ))
+        },
     wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED,
     validate_config: otap_df_config::validation::validate_typed_config::,
 };
diff --git a/rust/otap-dataflow/crates/otap/src/perf_exporter/exporter.rs b/rust/otap-dataflow/crates/otap/src/perf_exporter/exporter.rs
index ce9d7cac00..1a79040c48 100644
--- a/rust/otap-dataflow/crates/otap/src/perf_exporter/exporter.rs
+++ b/rust/otap-dataflow/crates/otap/src/perf_exporter/exporter.rs
@@ -63,17 +63,19 @@ pub struct PerfExporter {
 #[distributed_slice(OTAP_EXPORTER_FACTORIES)]
 pub static PERF_EXPORTER: ExporterFactory = ExporterFactory {
     name: OTAP_PERF_EXPORTER_URN,
-    create: |pipeline: PipelineContext,
-             node: NodeId,
-             node_config: Arc,
-             exporter_config: &ExporterConfig| {
-        Ok(ExporterWrapper::local(
-            PerfExporter::from_config(pipeline, &node_config.config)?,
-            node,
-            node_config,
-            exporter_config,
-        ))
-    },
+    create:
+        |pipeline: PipelineContext,
+         node: NodeId,
+         node_config: Arc,
+         exporter_config: &ExporterConfig,
+         _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| {
+            Ok(ExporterWrapper::local(
+                PerfExporter::from_config(pipeline, &node_config.config)?,
+                node,
+                node_config,
+                exporter_config,
+            ))
+        },
     wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED,
     validate_config: otap_df_config::validation::validate_typed_config::,
 };
diff --git a/rust/otap-dataflow/crates/otap/src/retry_processor.rs b/rust/otap-dataflow/crates/otap/src/retry_processor.rs
index da3609f670..726e75cb61 100644
--- a/rust/otap-dataflow/crates/otap/src/retry_processor.rs
+++ b/rust/otap-dataflow/crates/otap/src/retry_processor.rs
@@ -350,6 +350,7 @@ pub fn create_retry_processor(
     node: NodeId,
     node_config: Arc,
     processor_config: &ProcessorConfig,
+    _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry,
 ) -> Result, ConfigError> {
     let config: RetryConfig = serde_json::from_value(node_config.config.clone()).map_err(|e| {
         ConfigError::InvalidUserConfig {
@@ -855,6 +856,7 @@ mod test {
             node,
             Arc::new(node_config),
             rt.config(),
+            &otap_df_engine::extension::registry::CapabilityRegistry::new(),
         )
         .expect("create processor");
 
diff --git a/rust/otap-dataflow/crates/otap/src/signal_type_router.rs b/rust/otap-dataflow/crates/otap/src/signal_type_router.rs
index 6b04a0984a..5d21ec462d 100644
--- a/rust/otap-dataflow/crates/otap/src/signal_type_router.rs
+++ b/rust/otap-dataflow/crates/otap/src/signal_type_router.rs
@@ -234,6 +234,7 @@ pub fn create_signal_type_router(
     node: NodeId,
     node_config: Arc,
     processor_config: &ProcessorConfig,
+    _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry,
 ) -> Result, ConfigError> {
     // Deserialize the (currently empty) router configuration
     let router_config: SignalTypeRouterConfig = serde_json::from_value(node_config.config.clone())
@@ -260,23 +261,25 @@ pub fn create_signal_type_router(
 #[distributed_slice(OTAP_PROCESSOR_FACTORIES)]
 pub static SIGNAL_TYPE_ROUTER_FACTORY: ProcessorFactory = ProcessorFactory {
     name: SIGNAL_TYPE_ROUTER_URN,
-    create: |pipeline: PipelineContext,
-             node: NodeId,
-             node_config: Arc,
-             proc_cfg: &ProcessorConfig| {
-        // Deserialize the (currently empty) router configuration
-        let router_config: SignalTypeRouterConfig =
-            serde_json::from_value(node_config.config.clone()).map_err(|e| {
-                ConfigError::InvalidUserConfig {
-                    error: format!("Failed to parse SignalTypeRouter configuration: {e}"),
-                }
-            })?;
+    create:
+        |pipeline: PipelineContext,
+         node: NodeId,
+         node_config: Arc,
+         proc_cfg: &ProcessorConfig,
+         _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| {
+            // Deserialize the (currently empty) router configuration
+            let router_config: SignalTypeRouterConfig =
+                serde_json::from_value(node_config.config.clone()).map_err(|e| {
+                    ConfigError::InvalidUserConfig {
+                        error: format!("Failed to parse SignalTypeRouter configuration: {e}"),
+                    }
+                })?;
 
-        // Create the router with metrics registered via PipelineContext
-        let router = SignalTypeRouter::with_pipeline_ctx(pipeline, router_config);
+            // Create the router with metrics registered via PipelineContext
+            let router = SignalTypeRouter::with_pipeline_ctx(pipeline, router_config);
 
-        Ok(ProcessorWrapper::local(router, node, node_config, proc_cfg))
-    },
+            Ok(ProcessorWrapper::local(router, node, node_config, proc_cfg))
+        },
     wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED,
     validate_config: otap_df_config::validation::validate_typed_config::,
 };
@@ -304,6 +307,7 @@ mod tests {
             test_node(processor_config.name.clone()),
             Arc::new(node_config),
             &processor_config,
+            &otap_df_engine::extension::registry::CapabilityRegistry::new(),
         );
         assert!(result.is_ok());
     }
@@ -319,6 +323,7 @@ mod tests {
             test_node(processor_config.name.clone()),
             Arc::new(node_config),
             &processor_config,
+            &otap_df_engine::extension::registry::CapabilityRegistry::new(),
         );
         assert!(result.is_err());
     }
diff --git a/rust/otap-dataflow/crates/otap/src/syslog_cef_receiver.rs b/rust/otap-dataflow/crates/otap/src/syslog_cef_receiver.rs
index a184b93700..b3fdc5f705 100644
--- a/rust/otap-dataflow/crates/otap/src/syslog_cef_receiver.rs
+++ b/rust/otap-dataflow/crates/otap/src/syslog_cef_receiver.rs
@@ -201,17 +201,19 @@ impl SyslogCefReceiver {
 #[distributed_slice(OTAP_RECEIVER_FACTORIES)]
 pub static SYSLOG_CEF_RECEIVER: ReceiverFactory = ReceiverFactory {
     name: SYSLOG_CEF_RECEIVER_URN,
-    create: |pipeline: PipelineContext,
-             node: NodeId,
-             node_config: Arc,
-             receiver_config: &ReceiverConfig| {
-        Ok(ReceiverWrapper::local(
-            SyslogCefReceiver::from_config(pipeline, &node_config.config)?,
-            node,
-            node_config,
-            receiver_config,
-        ))
-    },
+    create:
+        |pipeline: PipelineContext,
+         node: NodeId,
+         node_config: Arc,
+         receiver_config: &ReceiverConfig,
+         _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| {
+            Ok(ReceiverWrapper::local(
+                SyslogCefReceiver::from_config(pipeline, &node_config.config)?,
+                node,
+                node_config,
+                receiver_config,
+            ))
+        },
     wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED,
     validate_config: otap_df_config::validation::validate_typed_config::,
 };
diff --git a/rust/otap-dataflow/crates/otap/src/topic_exporter.rs b/rust/otap-dataflow/crates/otap/src/topic_exporter.rs
index f0e597534f..deabbb05bd 100644
--- a/rust/otap-dataflow/crates/otap/src/topic_exporter.rs
+++ b/rust/otap-dataflow/crates/otap/src/topic_exporter.rs
@@ -53,18 +53,20 @@ pub struct TopicExporter {
 #[distributed_slice(OTAP_EXPORTER_FACTORIES)]
 pub static TOPIC_EXPORTER: ExporterFactory = ExporterFactory {
     name: TOPIC_EXPORTER_URN,
-    create: |_pipeline: PipelineContext,
-             node: NodeId,
-             node_config: Arc,
-             exporter_config: &ExporterConfig| {
-        let config = TopicExporter::parse_config(&node_config.config)?;
-        Ok(ExporterWrapper::local(
-            TopicExporter { config },
-            node,
-            node_config,
-            exporter_config,
-        ))
-    },
+    create:
+        |_pipeline: PipelineContext,
+         node: NodeId,
+         node_config: Arc,
+         exporter_config: &ExporterConfig,
+         _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| {
+            let config = TopicExporter::parse_config(&node_config.config)?;
+            Ok(ExporterWrapper::local(
+                TopicExporter { config },
+                node,
+                node_config,
+                exporter_config,
+            ))
+        },
     wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED,
     validate_config: |config| TopicExporter::parse_config(config).map(|_| ()),
 };
diff --git a/rust/otap-dataflow/crates/otap/src/topic_receiver.rs b/rust/otap-dataflow/crates/otap/src/topic_receiver.rs
index 8948d14b6f..c5b8efb05f 100644
--- a/rust/otap-dataflow/crates/otap/src/topic_receiver.rs
+++ b/rust/otap-dataflow/crates/otap/src/topic_receiver.rs
@@ -70,18 +70,20 @@ pub struct TopicReceiver {
 #[distributed_slice(OTAP_RECEIVER_FACTORIES)]
 pub static TOPIC_RECEIVER: ReceiverFactory = ReceiverFactory {
     name: TOPIC_RECEIVER_URN,
-    create: |_pipeline: PipelineContext,
-             node: NodeId,
-             node_config: Arc,
-             receiver_config: &ReceiverConfig| {
-        let config = TopicReceiver::parse_config(&node_config.config)?;
-        Ok(ReceiverWrapper::local(
-            TopicReceiver { config },
-            node,
-            node_config,
-            receiver_config,
-        ))
-    },
+    create:
+        |_pipeline: PipelineContext,
+         node: NodeId,
+         node_config: Arc,
+         receiver_config: &ReceiverConfig,
+         _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| {
+            let config = TopicReceiver::parse_config(&node_config.config)?;
+            Ok(ReceiverWrapper::local(
+                TopicReceiver { config },
+                node,
+                node_config,
+                receiver_config,
+            ))
+        },
     wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED,
     validate_config: |config| TopicReceiver::parse_config(config).map(|_| ()),
 };
diff --git a/rust/otap-dataflow/crates/otap/src/transform_processor.rs b/rust/otap-dataflow/crates/otap/src/transform_processor.rs
index a9beae744a..a65e9a0872 100644
--- a/rust/otap-dataflow/crates/otap/src/transform_processor.rs
+++ b/rust/otap-dataflow/crates/otap/src/transform_processor.rs
@@ -321,6 +321,7 @@ fn create_transform_processor(
     node_id: NodeId,
     user_config: Arc,
     processor_config: &ProcessorConfig,
+    _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry,
 ) -> Result, ConfigError> {
     let processor = TransformProcessor::from_config(&pipeline_ctx, &user_config.config)?;
     Ok(ProcessorWrapper::local(
@@ -533,6 +534,7 @@ mod test {
             node_id,
             Arc::new(node_config),
             runtime.config(),
+            &otap_df_engine::extension::registry::CapabilityRegistry::new(),
         )
     }
 
@@ -1037,6 +1039,7 @@ mod test {
             node_id,
             Arc::new(node_config),
             runtime.config(),
+            &otap_df_engine::extension::registry::CapabilityRegistry::new(),
         )
         .expect("created processor");
 
diff --git a/rust/otap-dataflow/crates/otap/tests/common/counting_exporter.rs b/rust/otap-dataflow/crates/otap/tests/common/counting_exporter.rs
index 8357049b49..861d3c7bd8 100644
--- a/rust/otap-dataflow/crates/otap/tests/common/counting_exporter.rs
+++ b/rust/otap-dataflow/crates/otap/tests/common/counting_exporter.rs
@@ -63,23 +63,25 @@ struct CountingExporter {
 #[distributed_slice(OTAP_EXPORTER_FACTORIES)]
 static COUNTING_EXPORTER: ExporterFactory = ExporterFactory {
     name: COUNTING_EXPORTER_URN,
-    create: |_pipeline: PipelineContext,
-             node: NodeId,
-             node_config: Arc,
-             exporter_config: &ExporterConfig| {
-        // Look up counter by ID from node config
-        let counter_id = node_config
-            .config
-            .get("counter_id")
-            .and_then(|v| v.as_str());
-        let counter = counter_id.and_then(get_counter);
-        Ok(ExporterWrapper::local(
-            CountingExporter { counter },
-            node,
-            node_config,
-            exporter_config,
-        ))
-    },
+    create:
+        |_pipeline: PipelineContext,
+         node: NodeId,
+         node_config: Arc,
+         exporter_config: &ExporterConfig,
+         _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| {
+            // Look up counter by ID from node config
+            let counter_id = node_config
+                .config
+                .get("counter_id")
+                .and_then(|v| v.as_str());
+            let counter = counter_id.and_then(get_counter);
+            Ok(ExporterWrapper::local(
+                CountingExporter { counter },
+                node,
+                node_config,
+                exporter_config,
+            ))
+        },
     wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED,
     validate_config: |_| Ok(()),
 };
diff --git a/rust/otap-dataflow/crates/otap/tests/common/flaky_exporter.rs b/rust/otap-dataflow/crates/otap/tests/common/flaky_exporter.rs
index 8b4ea8f082..3dc5425f7b 100644
--- a/rust/otap-dataflow/crates/otap/tests/common/flaky_exporter.rs
+++ b/rust/otap-dataflow/crates/otap/tests/common/flaky_exporter.rs
@@ -140,29 +140,31 @@ struct FlakyExporter {
 #[distributed_slice(OTAP_EXPORTER_FACTORIES)]
 static FLAKY_EXPORTER: ExporterFactory = ExporterFactory {
     name: FLAKY_EXPORTER_URN,
-    create: |_pipeline: PipelineContext,
-             node: NodeId,
-             node_config: Arc,
-             exporter_config: &ExporterConfig| {
-        // Look up state by ID from node config
-        let flaky_id = node_config.config.get("flaky_id").and_then(|v| v.as_str());
-        let (counter, should_ack, nack_count, permanent_nack, permanent_nack_count) = flaky_id
-            .and_then(get_state)
-            .map(|(c, a, n, p, pc)| (Some(c), Some(a), Some(n), Some(p), Some(pc)))
-            .unwrap_or((None, None, None, None, None));
-        Ok(ExporterWrapper::local(
-            FlakyExporter {
-                counter,
-                should_ack,
-                nack_count,
-                permanent_nack,
-                permanent_nack_count,
-            },
-            node,
-            node_config,
-            exporter_config,
-        ))
-    },
+    create:
+        |_pipeline: PipelineContext,
+         node: NodeId,
+         node_config: Arc,
+         exporter_config: &ExporterConfig,
+         _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| {
+            // Look up state by ID from node config
+            let flaky_id = node_config.config.get("flaky_id").and_then(|v| v.as_str());
+            let (counter, should_ack, nack_count, permanent_nack, permanent_nack_count) = flaky_id
+                .and_then(get_state)
+                .map(|(c, a, n, p, pc)| (Some(c), Some(a), Some(n), Some(p), Some(pc)))
+                .unwrap_or((None, None, None, None, None));
+            Ok(ExporterWrapper::local(
+                FlakyExporter {
+                    counter,
+                    should_ack,
+                    nack_count,
+                    permanent_nack,
+                    permanent_nack_count,
+                },
+                node,
+                node_config,
+                exporter_config,
+            ))
+        },
     wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED,
     validate_config: |_| Ok(()),
 };
diff --git a/rust/otap-dataflow/crates/validation/src/fanout_processor.rs b/rust/otap-dataflow/crates/validation/src/fanout_processor.rs
index 9e763e0732..fdbe81892f 100644
--- a/rust/otap-dataflow/crates/validation/src/fanout_processor.rs
+++ b/rust/otap-dataflow/crates/validation/src/fanout_processor.rs
@@ -41,18 +41,20 @@ struct FanoutProcessor {
 /// Distributed-slice factory that registers the fanout processor with the engine.
 pub static FANOUT_PROCESSOR_FACTORY: ProcessorFactory = ProcessorFactory {
     name: FANOUT_PROCESSOR_URN,
-    create: |pipeline_ctx: PipelineContext,
-             node: NodeId,
-             node_config: Arc,
-             processor_config: &ProcessorConfig| {
-        let metrics = pipeline_ctx.register_metrics::();
-        Ok(ProcessorWrapper::local(
-            FanoutProcessor { metrics },
-            node,
-            node_config,
-            processor_config,
-        ))
-    },
+    create:
+        |pipeline_ctx: PipelineContext,
+         node: NodeId,
+         node_config: Arc,
+         processor_config: &ProcessorConfig,
+         _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| {
+            let metrics = pipeline_ctx.register_metrics::();
+            Ok(ProcessorWrapper::local(
+                FanoutProcessor { metrics },
+                node,
+                node_config,
+                processor_config,
+            ))
+        },
     wiring_contract: otap_df_engine::wiring_contract::WiringContract {
         output_fanout: otap_df_engine::wiring_contract::OutputFanoutRule::AtMostPerOutput(1),
     },
diff --git a/rust/otap-dataflow/crates/validation/src/validation_exporter.rs b/rust/otap-dataflow/crates/validation/src/validation_exporter.rs
index faf94c67b9..c2baeeadcd 100644
--- a/rust/otap-dataflow/crates/validation/src/validation_exporter.rs
+++ b/rust/otap-dataflow/crates/validation/src/validation_exporter.rs
@@ -76,17 +76,19 @@ pub struct ValidationExporter {
 /// Distributed-slice factory that registers the validation exporter with the engine.
 pub static VALIDATION_EXPORTER_FACTORY: ExporterFactory = ExporterFactory {
     name: VALIDATION_EXPORTER_URN,
-    create: |pipeline_ctx: PipelineContext,
-             node: NodeId,
-             node_config: Arc,
-             exporter_config: &ExporterConfig| {
-        Ok(ExporterWrapper::local(
-            ValidationExporter::from_config(pipeline_ctx, &node_config.config)?,
-            node,
-            node_config,
-            exporter_config,
-        ))
-    },
+    create:
+        |pipeline_ctx: PipelineContext,
+         node: NodeId,
+         node_config: Arc,
+         exporter_config: &ExporterConfig,
+         _capability_registry: &otap_df_engine::extension::registry::CapabilityRegistry| {
+            Ok(ExporterWrapper::local(
+                ValidationExporter::from_config(pipeline_ctx, &node_config.config)?,
+                node,
+                node_config,
+                exporter_config,
+            ))
+        },
     wiring_contract: otap_df_engine::wiring_contract::WiringContract::UNRESTRICTED,
     validate_config: otap_df_config::validation::validate_typed_config::,
 };
diff --git a/rust/otap-dataflow/scripts/validate-configs.sh b/rust/otap-dataflow/scripts/validate-configs.sh
index 3d16350453..56b86bdd0b 100755
--- a/rust/otap-dataflow/scripts/validate-configs.sh
+++ b/rust/otap-dataflow/scripts/validate-configs.sh
@@ -26,7 +26,7 @@ else
     # Note: --all-features cannot be used because jemalloc and mimalloc are
     # mutually exclusive (compile_error! in non-test builds).
     cargo build \
-        --features azure,aws,experimental-tls,contrib-exporters,contrib-processors,recordset-kql-processor,azure-monitor-exporter,geneva-exporter,condense-attributes-processor,resource-validator-processor \
+        --features azure,aws,experimental-tls,contrib-exporters,contrib-processors,recordset-kql-processor,azure-monitor-exporter,azure-identity-auth-extension,geneva-exporter,condense-attributes-processor,resource-validator-processor \
         --manifest-path "$PROJECT_DIR/Cargo.toml"
     BINARY="$PROJECT_DIR/target/debug/df_engine"
 fi
diff --git a/rust/otap-dataflow/src/main.rs b/rust/otap-dataflow/src/main.rs
index abddc234e3..d4254074e1 100644
--- a/rust/otap-dataflow/src/main.rs
+++ b/rust/otap-dataflow/src/main.rs
@@ -195,6 +195,17 @@ fn validate_pipeline_components(
                 .get_exporter_factory_map()
                 .get(urn_str)
                 .map(|f| f.validate_config),
+            NodeKind::Extension => {
+                return Err(std::io::Error::other(format!(
+                    "Extension `{}` was placed in `nodes` but belongs in the `extensions` section \
+                     (pipeline_group={} pipeline={} node={})",
+                    urn_str,
+                    pipeline_group_id.as_ref(),
+                    pipeline_id.as_ref(),
+                    node_id.as_ref()
+                ))
+                .into());
+            }
         };
 
         match validate_config_fn {
@@ -203,6 +214,7 @@ fn validate_pipeline_components(
                     NodeKind::Receiver => "receiver",
                     NodeKind::Processor | NodeKind::ProcessorChain => "processor",
                     NodeKind::Exporter => "exporter",
+                    NodeKind::Extension => unreachable!("rejected above"),
                 };
                 return Err(std::io::Error::other(format!(
                     "Unknown {} component `{}` in pipeline_group={} pipeline={} node={}",
@@ -229,6 +241,40 @@ fn validate_pipeline_components(
         }
     }
 
+    // Validate extensions from the dedicated `extensions` section.
+    for (ext_id, ext_cfg) in pipeline_cfg.extension_iter() {
+        let urn_str = ext_cfg.r#type.as_str();
+        let validate_config_fn = OTAP_PIPELINE_FACTORY
+            .get_extension_factory_map()
+            .get(urn_str)
+            .map(|f| f.validate_config);
+
+        match validate_config_fn {
+            None => {
+                return Err(std::io::Error::other(format!(
+                    "Unknown extension component `{}` in pipeline_group={} pipeline={} extension={}",
+                    urn_str,
+                    pipeline_group_id.as_ref(),
+                    pipeline_id.as_ref(),
+                    ext_id.as_ref()
+                ))
+                .into());
+            }
+            Some(validate_fn) => {
+                validate_fn(&ext_cfg.config).map_err(|e| {
+                    std::io::Error::other(format!(
+                        "Invalid config for extension `{}` in pipeline_group={} pipeline={} extension={}: {}",
+                        urn_str,
+                        pipeline_group_id.as_ref(),
+                        pipeline_id.as_ref(),
+                        ext_id.as_ref(),
+                        e
+                    ))
+                })?;
+            }
+        }
+    }
+
     Ok(())
 }