From 81972e3122aac21b0fc1f61708c0aab2271128ca Mon Sep 17 00:00:00 2001 From: Ben Johnson Date: Mon, 23 Aug 2021 15:37:08 -0400 Subject: [PATCH 01/19] Kick off the specification Signed-off-by: Ben Johnson --- docs/specs/component.md | 65 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 65 insertions(+) create mode 100644 docs/specs/component.md diff --git a/docs/specs/component.md b/docs/specs/component.md new file mode 100644 index 0000000000000..73349d13f44e3 --- /dev/null +++ b/docs/specs/component.md @@ -0,0 +1,65 @@ +# Component Specification + +This document specifies Vector Component behavior (source, transforms, and +sinks) for the development of Vector. + +The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, +“SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be +interpreted as described in [RFC 2119]. + + + +1. [Introduction](#introduction) +1. [How to read this document](#how-to-read-this-document) +1. [Instrumentation](#instrumentation) + 1. [Events](#events) + 1. [EventRecevied](#eventrecevied) + + + +## Introduction + +Vector is a highly flexible observability data pipeline due to its directed +acyclic graph processing model. Each node in the graph is a Vector Component, +and in order to meet our [high user experience expectations] each Component must +adhere to a common set of behaviorial rules. This document aims to clearly +outline these rules to guide new component development and ongoing maintenance. + +## How to read this document + +This document is written from the broad perspective of a Vector component. +Unless otherwise stated, a section applies to all component types, although, +most sections will be broken along component lines for easy adherence. + +## Instrumentation + +### Events + +Vector implements an [event driven pattern ([RFC 2064]) for internal +instrumentation. This section lists all required and optional events that a +component MUST emit. 
+ +There is leeway in the implementation of these events: + +* Events MAY be augmented with additional component-specific context. For + example, the `socket` source adds `mode` attribute as additional context. +* The naming of the event MAY be component specific. For example, + `SocketEventReceived` since the `socket` source adds additional context. + +#### EventRecevied + +ALL components MUST emit an `EventReceived` event immediately after receiving +a Vector event with the following telemetry: + +* Metrics + * MUST increment the `events_in_total` counter by 1. + * SHOULD increment the `bytes_in_total` counter by the total number of bytes. + * For sources, the total bytes coming off the wire. + * For everything else, the event's JSON byte size representation. +* Logging + * MUST log a `Received one event.` message at the `trace` level with no rate + limiting. + +[high user experience expectations]: https://github.com/timberio/vector/blob/master/docs/USER_EXPERIENCE_DESIGN.md +[RFC 2064]: https://github.com/timberio/vector/blob/master/rfcs/2020-03-17-2064-event-driven-observability.md +[RFC 2119]: https://datatracker.ietf.org/doc/html/rfc2119 From 0dae96be39f13aa4bbae7296e82d985361353e66 Mon Sep 17 00:00:00 2001 From: Ben Johnson Date: Tue, 24 Aug 2021 13:27:08 -0400 Subject: [PATCH 02/19] Add a new BytesReceived event Signed-off-by: Ben Johnson --- docs/specs/component.md | 31 ++++++++++++++++++++++++------- 1 file changed, 24 insertions(+), 7 deletions(-) diff --git a/docs/specs/component.md b/docs/specs/component.md index 73349d13f44e3..814ba5359fcd0 100644 --- a/docs/specs/component.md +++ b/docs/specs/component.md @@ -13,6 +13,7 @@ interpreted as described in [RFC 2119]. 1. [How to read this document](#how-to-read-this-document) 1. [Instrumentation](#instrumentation) 1. [Events](#events) + 1. [BytesReceived](#bytesreceived) 1. [EventRecevied](#eventrecevied) @@ -35,7 +36,7 @@ most sections will be broken along component lines for easy adherence. 
### Events -Vector implements an [event driven pattern ([RFC 2064]) for internal +Vector implements an event driven pattern ([RFC 2064]) for internal instrumentation. This section lists all required and optional events that a component MUST emit. @@ -45,19 +46,35 @@ There is leeway in the implementation of these events: example, the `socket` source adds `mode` attribute as additional context. * The naming of the event MAY be component specific. For example, `SocketEventReceived` since the `socket` source adds additional context. +* Components MAY emit events for batches of Vector events for performance + reasons, but the resulting metrics state MUST be equivalent to emitting + individual events. For example, emitting the `EventsReceived` event for 10 + events MUST increment the `events_in_total` by 10. + +#### BytesReceived + +*Sources* MUST emit a `BytesReceived` event immediately after receiving bytes +from the upstream source, before the creation of a Vector event. The following +telemetry MUST be included: + +* Metrics + * MUST increment the `bytes_in_total` counter by the number of bytes + received. +* Logging + * MUST log a `Bytes received.` message at the `trace` level with no rate + limiting. #### EventRecevied -ALL components MUST emit an `EventReceived` event immediately after receiving -a Vector event with the following telemetry: +*Components* MUST emit an `EventReceived` event immediately after receiving or +creating a Vector event. * Metrics * MUST increment the `events_in_total` counter by 1. - * SHOULD increment the `bytes_in_total` counter by the total number of bytes. - * For sources, the total bytes coming off the wire. - * For everything else, the event's JSON byte size representation. + * SHOULD increment the `event_bytes_in_total` counter by the event's byte + size in JSON representation. 
* Logging - * MUST log a `Received one event.` message at the `trace` level with no rate + * MUST log a `Event received.` message at the `trace` level with no rate limiting. [high user experience expectations]: https://github.com/timberio/vector/blob/master/docs/USER_EXPERIENCE_DESIGN.md From b64fbef840d4ca65bbb1d5d3c94099046f31cc35 Mon Sep 17 00:00:00 2001 From: Ben Johnson Date: Tue, 24 Aug 2021 13:32:40 -0400 Subject: [PATCH 03/19] Better language for event implementation leeway Signed-off-by: Ben Johnson --- docs/specs/component.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/docs/specs/component.md b/docs/specs/component.md index 814ba5359fcd0..7e013ec918ffd 100644 --- a/docs/specs/component.md +++ b/docs/specs/component.md @@ -43,11 +43,12 @@ component MUST emit. There is leeway in the implementation of these events: * Events MAY be augmented with additional component-specific context. For - example, the `socket` source adds `mode` attribute as additional context. -* The naming of the event MAY be component specific. For example, - `SocketEventReceived` since the `socket` source adds additional context. + example, the `socket` source adds a `mode` attribute as additional context. +* The naming of the events MAY deviate to satisfy implementation. For example, + the `socket` source may rename the `EventRecevied` event to + `SocketEventReceived` to add additional socket specific context. * Components MAY emit events for batches of Vector events for performance - reasons, but the resulting metrics state MUST be equivalent to emitting + reasons, but the resulting telemetry state MUST be equivalent to emitting individual events. For example, emitting the `EventsReceived` event for 10 events MUST increment the `events_in_total` by 10. @@ -71,7 +72,7 @@ creating a Vector event. * Metrics * MUST increment the `events_in_total` counter by 1. 
- * SHOULD increment the `event_bytes_in_total` counter by the event's byte + * MUST increment the `event_bytes_in_total` counter by the event's byte size in JSON representation. * Logging * MUST log a `Event received.` message at the `trace` level with no rate From a484f33766840e8d5910d3393ce170f8c8f4de21 Mon Sep 17 00:00:00 2001 From: Ben Johnson Date: Wed, 25 Aug 2021 12:02:23 -0400 Subject: [PATCH 04/19] Checkpoint Signed-off-by: Ben Johnson --- docs/specs/component.md | 63 ++++++++++++++++++++++++++++++++++++----- 1 file changed, 56 insertions(+), 7 deletions(-) diff --git a/docs/specs/component.md b/docs/specs/component.md index 7e013ec918ffd..d8a9ce3ab9db0 100644 --- a/docs/specs/component.md +++ b/docs/specs/component.md @@ -10,11 +10,16 @@ interpreted as described in [RFC 2119]. 1. [Introduction](#introduction) +1. [Scope](#scope) 1. [How to read this document](#how-to-read-this-document) 1. [Instrumentation](#instrumentation) + 1. [Batching](#batching) 1. [Events](#events) 1. [BytesReceived](#bytesreceived) - 1. [EventRecevied](#eventrecevied) + 1. [EventsRecevied](#eventsrecevied) + 1. [EventsProcessed](#eventsprocessed) + 1. [EventsSent](#eventssent) + 1. [Error](#error) @@ -26,19 +31,37 @@ and in order to meet our [high user experience expectations] each Component must adhere to a common set of behaviorial rules. This document aims to clearly outline these rules to guide new component development and ongoing maintenance. +## Scope + +This specification addresses direct component concerns + +TODO: limit this document to direct component-level code and not supporting +infrastructure. + ## How to read this document This document is written from the broad perspective of a Vector component. -Unless otherwise stated, a section applies to all component types, although, -most sections will be broken along component lines for easy adherence. +Unless otherwise stated, a section applies to all component types (sources, +transforms, and sinks). 
## Instrumentation +Vector components MUST be instrumented for optimal observability and monitoring. +This is required to drive various interfaces that Vector users depend on to +manage Vector installations in mission critical production environments. + +### Batching + +For performance reasons, components SHOULD instrument batches of Vector events +as opposed to individual Vector events. [Pull request #8383] demonstrated +meaningful performance improvements as a result of this strategy. + ### Events Vector implements an event driven pattern ([RFC 2064]) for internal instrumentation. This section lists all required and optional events that a -component MUST emit. +component MUST emit. It is expected that components will emit custom events +beyond those listed here that reflect component specific behavior. There is leeway in the implementation of these events: @@ -61,14 +84,29 @@ telemetry MUST be included: * Metrics * MUST increment the `bytes_in_total` counter by the number of bytes received. + * If received over the HTTP then the `http_path` tag must be set. * Logging * MUST log a `Bytes received.` message at the `trace` level with no rate limiting. -#### EventRecevied +#### EventsRecevied + +*All components* MUST emit an `EventsReceived` event immediately after creating +or receiving one or more Vector events. + +* Metrics + * MUST increment the `events_in_total` counter by the number of events + received. + * MUST increment the `event_bytes_in_total` counter by the cumulative byte + size of the events in JSON representation. +* Logging + * MUST log a `{count} events received.` message at the `trace` level with no + rate limiting. + +#### EventsProcessed -*Components* MUST emit an `EventReceived` event immediately after receiving or -creating a Vector event. +*All components* MUST emit an `EventsProcessed` event processing an event, +before the event is encoded and sent downstream. * Metrics * MUST increment the `events_in_total` counter by 1. 
@@ -78,6 +116,17 @@ creating a Vector event. * MUST log a `Event received.` message at the `trace` level with no rate limiting. +#### EventsSent + +*All components* MUST emit an `EventsSent` event processing an event, +before the event is encoded and sent downstream. + + +#### Error + + + [high user experience expectations]: https://github.com/timberio/vector/blob/master/docs/USER_EXPERIENCE_DESIGN.md +[Pull request #8383]: https://github.com/timberio/vector/pull/8383/ [RFC 2064]: https://github.com/timberio/vector/blob/master/rfcs/2020-03-17-2064-event-driven-observability.md [RFC 2119]: https://datatracker.ietf.org/doc/html/rfc2119 From 065831ef44ba9a834a2675be964a6f47582bc943 Mon Sep 17 00:00:00 2001 From: Ben Johnson Date: Wed, 25 Aug 2021 13:43:21 -0400 Subject: [PATCH 05/19] Final touches for specification foundation Signed-off-by: Ben Johnson --- docs/specs/component.md | 135 +++++++++++++++++++++++++++++++--------- 1 file changed, 106 insertions(+), 29 deletions(-) diff --git a/docs/specs/component.md b/docs/specs/component.md index d8a9ce3ab9db0..23685131d2a43 100644 --- a/docs/specs/component.md +++ b/docs/specs/component.md @@ -12,13 +12,17 @@ interpreted as described in [RFC 2119]. 1. [Introduction](#introduction) 1. [Scope](#scope) 1. [How to read this document](#how-to-read-this-document) +1. [Configuration](#configuration) + 1. [Options](#options) + 1. [`address`](#address) + 1. [`endpoint(s)`](#endpoints) 1. [Instrumentation](#instrumentation) 1. [Batching](#batching) 1. [Events](#events) 1. [BytesReceived](#bytesreceived) 1. [EventsRecevied](#eventsrecevied) - 1. [EventsProcessed](#eventsprocessed) 1. [EventsSent](#eventssent) + 1. [BytesSent](#bytessent) 1. [Error](#error) @@ -44,6 +48,20 @@ This document is written from the broad perspective of a Vector component. Unless otherwise stated, a section applies to all component types (sources, transforms, and sinks). 
+## Configuration + +### Options + +#### `address` + +When a component binds to an address, it should expose an `address` option that +takes a `string` representing a single address. + +#### `endpoint(s)` + +When a component sends data to a downstream target, it should expose an +`endpoint(s)` option that takes a `string` representing one or more endpoints + ## Instrumentation Vector components MUST be instrumented for optimal observability and monitoring. @@ -78,53 +96,112 @@ There is leeway in the implementation of these events: #### BytesReceived *Sources* MUST emit a `BytesReceived` event immediately after receiving bytes -from the upstream source, before the creation of a Vector event. The following -telemetry MUST be included: - +from the upstream source and before the creation of a Vector event. + +* Properties + * `byte_size` + * For UDP, TCP, and Unix protocols, the total number of bytes received from + the socket excluding the delimiter. + * For HTTP-based protocols, the total number of bytes in the HTTP body, as + represented by the `Content-Length` header. + * For files, the total number of bytes read from the file excluding the + delimiter. + * `protocol` - The protocol used to send the bytes (i.e., `tcp`, `udp`, + `unix`, `http`, `https`, `file`, etc.) + * `address` - If relevant, the bound address that the bytes were received + from. For HTTP, this MUST be the host and path only, excluding the query + string. + * `path` - If relevant, the HTTP path, excluding query strings. + * `socket` - If relevant, the socket number that bytes were received from. + * `remote_address` - If relevant, the remote IP address of the upstream + client. + * `file` - If relevant, the absolute path of the file. * Metrics - * MUST increment the `bytes_in_total` counter by the number of bytes - received. - * If received over the HTTP then the `http_path` tag must be set. 
+ * MUST increment the `received_bytes_total` counter by the defined value with + the defined properties as metric tags. * Logging - * MUST log a `Bytes received.` message at the `trace` level with no rate - limiting. + * MUST log a `{byte_size} bytes received.` message at the `trace` level with + the defined properties as structured data. It MUST NOT be rate limited. #### EventsRecevied *All components* MUST emit an `EventsReceived` event immediately after creating or receiving one or more Vector events. +* Properties + * `quantity` - The quantity of Vector events. + * `byte_size` - The cumulative byte size of all events in JSON representation. * Metrics - * MUST increment the `events_in_total` counter by the number of events - received. - * MUST increment the `event_bytes_in_total` counter by the cumulative byte - size of the events in JSON representation. + * MUST increment the `received_events_total` counter by the defined `quantity` + property with the other properties as metric tags. + * MUST increment the `received_event_bytes_total` counter by the defined + `byte_size` property with the other properties as metric tags. * Logging - * MUST log a `{count} events received.` message at the `trace` level with no - rate limiting. + * MUST log a `{quantity} events received.` message at the `trace` level with + the defined properties as structured data. It MUST NOT be rate limited. -#### EventsProcessed +#### EventsSent -*All components* MUST emit an `EventsProcessed` event processing an event, -before the event is encoded and sent downstream. +*All components* MUST emit an `EventsSent` event immediately before sending the +event down stream. This should happen before any transmission preparation, such +as encoding. +* Properties + * `quantity` - The quantity of Vector events. + * `byte_size` - The cumulative byte size of all events in JSON representation. * Metrics - * MUST increment the `events_in_total` counter by 1. 
- * MUST increment the `event_bytes_in_total` counter by the event's byte - size in JSON representation. + * MUST increment the `sent_events_total` counter by the defined value with the + defined properties as metric tags. + * MUST increment the `sent_event_bytes_total` counter by the event's byte size + in JSON representation. * Logging - * MUST log a `Event received.` message at the `trace` level with no rate - limiting. - -#### EventsSent - -*All components* MUST emit an `EventsSent` event processing an event, -before the event is encoded and sent downstream. - + * MUST log a `{quantity} events sent.` message at the `trace` level with the + defined properties as structured data. It MUST NOT be rate limited. + +#### BytesSent + +*Sinks* MUST emit a `BytesSent` event immediately after sending bytes to the +downstream target regardless if the transmission was successful or not. + +* Properties + * `byte_size` + * For UDP, TCP, and Unix protocols, the total number of bytes placed on the + socket excluding the delimiter. + * For HTTP-based protocols, the total number of bytes in the HTTP body, as + represented by the `Content-Length` header. + * For files, the total number of bytes written to the file excluding the + delimiter. + * `protocol` - The protocol used to send the bytes (i.e., `tcp`, `udp`, + `unix`, `http`, `http`, `file`, etc.) + * `endpoint` - If relevant, the endpoint that the bytes were sent to. For + HTTP, this MUST be the host and path only, excluding the query string. + * `file` - If relevant, the absolute path of the file. +* Metrics + * MUST increment the `bytes_in_total` counter by the defined value with the + defined properties as metric tags. +* Logging + * MUST log a `{byte_size} bytes received.` message at the `trace` level with + the defined properties as structured data. It MUST NOT be rate limited. #### Error +*All components* MUST emit error events when an error occurs, and errors MUST be +named with an `Error` suffix. 
For example, the `socket` source emits a +`SocketReceiveError` representing any error that occurs while receiving data off +of the socket. +This specification does list a standard set of errors that components must +implement since errors are specific to the component. + +* Properties + * `error` - The string representation of the error. + * `stage` - The stage at which the error occured. MUST be one of `receiving`, + `processing`, `sending`. +* Metrics + * MUST increment the `errors_total` counter by 1 with the defined properties + as metric tags. + * MUST increment the `events_discarded_total` counter by the number of Vector + events discarded if the error resulted in discarding (dropping) events. [high user experience expectations]: https://github.com/timberio/vector/blob/master/docs/USER_EXPERIENCE_DESIGN.md [Pull request #8383]: https://github.com/timberio/vector/pull/8383/ From 2ffec1590b09ff1c23b09bd16d924081eaee6d10 Mon Sep 17 00:00:00 2001 From: Ben Johnson Date: Wed, 25 Aug 2021 13:48:41 -0400 Subject: [PATCH 06/19] Error logs Signed-off-by: Ben Johnson --- docs/specs/component.md | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/docs/specs/component.md b/docs/specs/component.md index 23685131d2a43..6487ace047fda 100644 --- a/docs/specs/component.md +++ b/docs/specs/component.md @@ -119,7 +119,7 @@ from the upstream source and before the creation of a Vector event. * Metrics * MUST increment the `received_bytes_total` counter by the defined value with the defined properties as metric tags. -* Logging +* Logs * MUST log a `{byte_size} bytes received.` message at the `trace` level with the defined properties as structured data. It MUST NOT be rate limited. @@ -136,7 +136,7 @@ or receiving one or more Vector events. property with the other properties as metric tags. * MUST increment the `received_event_bytes_total` counter by the defined `byte_size` property with the other properties as metric tags. 
-* Logging +* Logs * MUST log a `{quantity} events received.` message at the `trace` level with the defined properties as structured data. It MUST NOT be rate limited. @@ -154,7 +154,7 @@ as encoding. defined properties as metric tags. * MUST increment the `sent_event_bytes_total` counter by the event's byte size in JSON representation. -* Logging +* Logs * MUST log a `{quantity} events sent.` message at the `trace` level with the defined properties as structured data. It MUST NOT be rate limited. @@ -179,7 +179,7 @@ downstream target regardless if the transmission was successful or not. * Metrics * MUST increment the `bytes_in_total` counter by the defined value with the defined properties as metric tags. -* Logging +* Logs * MUST log a `{byte_size} bytes received.` message at the `trace` level with the defined properties as structured data. It MUST NOT be rate limited. @@ -202,6 +202,10 @@ implement since errors are specific to the component. as metric tags. * MUST increment the `events_discarded_total` counter by the number of Vector events discarded if the error resulted in discarding (dropping) events. +* Logs + * MUST log a `{stage} error: {error}` message at the `error` level with the + defined properties as structured data. It SHOULD be rate limited to 10 + seconds. [high user experience expectations]: https://github.com/timberio/vector/blob/master/docs/USER_EXPERIENCE_DESIGN.md [Pull request #8383]: https://github.com/timberio/vector/pull/8383/ From 56a8daa6f8487eb7eaafa30cabd62f89491e060e Mon Sep 17 00:00:00 2001 From: Ben Johnson Date: Wed, 25 Aug 2021 13:51:27 -0400 Subject: [PATCH 07/19] Comma separated Signed-off-by: Ben Johnson --- docs/specs/component.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/specs/component.md b/docs/specs/component.md index 6487ace047fda..b6bfaee0696b0 100644 --- a/docs/specs/component.md +++ b/docs/specs/component.md @@ -60,7 +60,8 @@ takes a `string` representing a single address. 
#### `endpoint(s)` When a component sends data to a downstream target, it should expose an -`endpoint(s)` option that takes a `string` representing one or more endpoints +`endpoint(s)` option that takes a `string` representing one or more comma +separated endpoints. ## Instrumentation From ba8ef42f50f9aeb1e5aae68dadb39a88abca0904 Mon Sep 17 00:00:00 2001 From: Ben Johnson Date: Wed, 25 Aug 2021 13:53:19 -0400 Subject: [PATCH 08/19] Finish scope section Signed-off-by: Ben Johnson --- docs/specs/component.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/specs/component.md b/docs/specs/component.md index b6bfaee0696b0..94e833d3210ef 100644 --- a/docs/specs/component.md +++ b/docs/specs/component.md @@ -37,10 +37,10 @@ outline these rules to guide new component development and ongoing maintenance. ## Scope -This specification addresses direct component concerns - -TODO: limit this document to direct component-level code and not supporting -infrastructure. +This specification addresses _direct_ component development and does not cover +aspects that components inherit "for free". For example, this specification does +not cover global context, such as `component_id`, that all components receive in +their telemetry by nature of being a Vector component. ## How to read this document From 7ec8d8b53b08a53568bf444c840c1dfa9dbaa7dd Mon Sep 17 00:00:00 2001 From: Ben Johnson Date: Wed, 25 Aug 2021 13:56:33 -0400 Subject: [PATCH 09/19] Rename metric Signed-off-by: Ben Johnson --- docs/specs/component.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/specs/component.md b/docs/specs/component.md index 94e833d3210ef..91bbd76ac1378 100644 --- a/docs/specs/component.md +++ b/docs/specs/component.md @@ -201,7 +201,7 @@ implement since errors are specific to the component. * Metrics * MUST increment the `errors_total` counter by 1 with the defined properties as metric tags. 
- * MUST increment the `events_discarded_total` counter by the number of Vector + * MUST increment the `discarded_events_total` counter by the number of Vector events discarded if the error resulted in discarding (dropping) events. * Logs * MUST log a `{stage} error: {error}` message at the `error` level with the From 55d9f85cb2cea4233a166ec5d21e51aa0fa23671 Mon Sep 17 00:00:00 2001 From: Bruce Guenter Date: Wed, 25 Aug 2021 13:39:43 -0600 Subject: [PATCH 10/19] Update endpoint option wording Signed-off-by: Bruce Guenter --- docs/specs/component.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/docs/specs/component.md b/docs/specs/component.md index 91bbd76ac1378..51f4c71c955f6 100644 --- a/docs/specs/component.md +++ b/docs/specs/component.md @@ -59,9 +59,10 @@ takes a `string` representing a single address. #### `endpoint(s)` -When a component sends data to a downstream target, it should expose an -`endpoint(s)` option that takes a `string` representing one or more comma -separated endpoints. +When a component sends data to a downstream target, it MUST expose +either an `endpoint` option that takes a `string` representing a single +endpoint, or an `endpoints` option that takes an array of strings +representing multiple endpoints. ## Instrumentation From a669d0612165782fbd4591671ff60616efe453e8 Mon Sep 17 00:00:00 2001 From: Ben Johnson Date: Wed, 25 Aug 2021 19:00:11 -0400 Subject: [PATCH 11/19] Update docs/specs/component.md Co-authored-by: Jesse Szwedko Signed-off-by: Bruce Guenter --- docs/specs/component.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/specs/component.md b/docs/specs/component.md index 51f4c71c955f6..053a494be08c2 100644 --- a/docs/specs/component.md +++ b/docs/specs/component.md @@ -131,7 +131,7 @@ from the upstream source and before the creation of a Vector event. or receiving one or more Vector events. * Properties - * `quantity` - The quantity of Vector events. 
+ * `count` - The count of Vector events. * `byte_size` - The cumulative byte size of all events in JSON representation. * Metrics * MUST increment the `received_events_total` counter by the defined `quantity` From 953f15230c316029de93068f82aa8a77ceefd2c1 Mon Sep 17 00:00:00 2001 From: Bruce Guenter Date: Wed, 25 Aug 2021 16:26:44 -0600 Subject: [PATCH 12/19] Wording tweaks Signed-off-by: Bruce Guenter --- docs/specs/component.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/specs/component.md b/docs/specs/component.md index 053a494be08c2..fb7ee6aedd2cb 100644 --- a/docs/specs/component.md +++ b/docs/specs/component.md @@ -54,7 +54,7 @@ transforms, and sinks). #### `address` -When a component binds to an address, it should expose an `address` option that +When a component binds to an address, it SHOULD expose an `address` option that takes a `string` representing a single address. #### `endpoint(s)` @@ -80,7 +80,7 @@ meaningful performance improvements as a result of this strategy. Vector implements an event driven pattern ([RFC 2064]) for internal instrumentation. This section lists all required and optional events that a -component MUST emit. It is expected that components will emit custom events +component must emit. It is expected that components will emit custom events beyond those listed here that reflect component specific behavior. There is leeway in the implementation of these events: @@ -93,7 +93,7 @@ There is leeway in the implementation of these events: * Components MAY emit events for batches of Vector events for performance reasons, but the resulting telemetry state MUST be equivalent to emitting individual events. For example, emitting the `EventsReceived` event for 10 - events MUST increment the `events_in_total` by 10. + events MUST increment the `events_in_total` counter by 10. #### BytesReceived @@ -110,8 +110,8 @@ from the upstream source and before the creation of a Vector event. delimiter. 
* `protocol` - The protocol used to send the bytes (i.e., `tcp`, `udp`, `unix`, `http`, `https`, `file`, etc.) - * `address` - If relevant, the bound address that the bytes were received - from. For HTTP, this MUST be the host and path only, excluding the query + * `address` - If relevant, the local address that the bytes were received + on. For HTTP, this MUST be the host and path only, excluding the query string. * `path` - If relevant, the HTTP path, excluding query strings. * `socket` - If relevant, the socket number that bytes were received from. From 8f0a35e7a887e554e927c386073126b4e010951a Mon Sep 17 00:00:00 2001 From: Bruce Guenter Date: Wed, 25 Aug 2021 16:35:22 -0600 Subject: [PATCH 13/19] Address feedback Signed-off-by: Bruce Guenter --- docs/specs/component.md | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-) diff --git a/docs/specs/component.md b/docs/specs/component.md index fb7ee6aedd2cb..b209ee04428ce 100644 --- a/docs/specs/component.md +++ b/docs/specs/component.md @@ -110,16 +110,10 @@ from the upstream source and before the creation of a Vector event. delimiter. * `protocol` - The protocol used to send the bytes (i.e., `tcp`, `udp`, `unix`, `http`, `https`, `file`, etc.) - * `address` - If relevant, the local address that the bytes were received - on. For HTTP, this MUST be the host and path only, excluding the query - string. * `path` - If relevant, the HTTP path, excluding query strings. * `socket` - If relevant, the socket number that bytes were received from. - * `remote_address` - If relevant, the remote IP address of the upstream - client. - * `file` - If relevant, the absolute path of the file. * Metrics - * MUST increment the `received_bytes_total` counter by the defined value with + * MUST increment the `bytes_in_total` counter by the defined value with the defined properties as metric tags. 
* Logs * MUST log a `{byte_size} bytes received.` message at the `trace` level with From 352e810bf4efb001cc7d319c5302fd7a4c7c138e Mon Sep 17 00:00:00 2001 From: Bruce Guenter Date: Wed, 25 Aug 2021 17:38:19 -0600 Subject: [PATCH 14/19] Address feedback Signed-off-by: Bruce Guenter --- docs/specs/component.md | 36 +++++++++++++++++++----------------- 1 file changed, 19 insertions(+), 17 deletions(-) diff --git a/docs/specs/component.md b/docs/specs/component.md index b209ee04428ce..a0b76aa572c67 100644 --- a/docs/specs/component.md +++ b/docs/specs/component.md @@ -116,8 +116,8 @@ from the upstream source and before the creation of a Vector event. * MUST increment the `bytes_in_total` counter by the defined value with the defined properties as metric tags. * Logs - * MUST log a `{byte_size} bytes received.` message at the `trace` level with - the defined properties as structured data. It MUST NOT be rate limited. + * MUST log a `Bytes received.` message at the `trace` level with the + defined properties as key-value pairs. It MUST NOT be rate limited. #### EventsRecevied @@ -133,8 +133,8 @@ or receiving one or more Vector events. * MUST increment the `received_event_bytes_total` counter by the defined `byte_size` property with the other properties as metric tags. * Logs - * MUST log a `{quantity} events received.` message at the `trace` level with - the defined properties as structured data. It MUST NOT be rate limited. + * MUST log a `Events received.` message at the `trace` level with the + defined properties as key-value pairs. It MUST NOT be rate limited. #### EventsSent @@ -143,16 +143,16 @@ event down stream. This should happen before any transmission preparation, such as encoding. * Properties - * `quantity` - The quantity of Vector events. - * `byte_size` - The cumulative byte size of all events in JSON representation. + * `count` - The count of Vector events. + * `byte_size` - The cumulative in-memory byte size of all events sent. 
 * Metrics
   * MUST increment the `sent_events_total` counter by the defined value with the
     defined properties as metric tags.
   * MUST increment the `sent_event_bytes_total` counter by the event's byte
     size in JSON representation.
 * Logs
-  * MUST log a `{quantity} events sent.` message at the `trace` level with the
-    defined properties as structured data. It MUST NOT be rate limited.
+  * MUST log a `Events sent.` message at the `trace` level with the
+    defined properties as key-value pairs. It MUST NOT be rate limited.
 
 #### BytesSent
 
@@ -173,11 +173,11 @@ downstream target regardless if the transmission was successful or not.
     HTTP, this MUST be the host and path only, excluding the query string.
   * `file` - If relevant, the absolute path of the file.
 * Metrics
-  * MUST increment the `bytes_in_total` counter by the defined value with the
+  * MUST increment the `bytes_out_total` counter by the defined value with the
     defined properties as metric tags.
 * Logs
-  * MUST log a `{byte_size} bytes received.` message at the `trace` level with
-    the defined properties as structured data. It MUST NOT be rate limited.
+  * MUST log a `Bytes received.` message at the `trace` level with the
+    defined properties as key-value pairs. It MUST NOT be rate limited.
 
 #### Error
 
@@ -190,18 +190,20 @@ This specification does list a standard set of errors that components must
 implement since errors are specific to the component.
 
 * Properties
-  * `error` - The string representation of the error.
-  * `stage` - The stage at which the error occured. MUST be one of `receiving`,
-    `processing`, `sending`.
+  * `error` - The specifics of the error condition, such as system error code, etc.
+  * `stage` - The stage at which the error occurred. MUST be one of
+    `receiving`, `processing`, or `sending`. This MAY be omitted from
+    being represented explicitly in the event data when the error may
+    only happen at one stage, but MUST be included in the emitted logs
+    and metrics as if it were present.
 * Metrics
   * MUST increment the `errors_total` counter by 1 with the defined properties
     as metric tags.
   * MUST increment the `discarded_events_total` counter by the number of Vector
     events discarded if the error resulted in discarding (dropping) events.
 * Logs
-  * MUST log a `{stage} error: {error}` message at the `error` level with the
-    defined properties as structured data. It SHOULD be rate limited to 10
-    seconds.
+  * MUST log a message at the `error` level with the defined properties
+    as key-value pairs. It SHOULD be rate limited to 10 seconds.
 
 [high user experience expectations]: https://github.com/timberio/vector/blob/master/docs/USER_EXPERIENCE_DESIGN.md
 [Pull request #8383]: https://github.com/timberio/vector/pull/8383/

From 6619e7e8930d89ccccfcf292cf1c189e5eb24837 Mon Sep 17 00:00:00 2001
From: Bruce Guenter
Date: Wed, 25 Aug 2021 17:45:45 -0600
Subject: [PATCH 15/19] Fix the other byte_size

Signed-off-by: Bruce Guenter
---
 docs/specs/component.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/specs/component.md b/docs/specs/component.md
index a0b76aa572c67..4e038b2ab8902 100644
--- a/docs/specs/component.md
+++ b/docs/specs/component.md
@@ -126,7 +126,7 @@ or receiving one or more Vector events.
 
 * Properties
   * `count` - The count of Vector events.
-  * `byte_size` - The cumulative byte size of all events in JSON representation.
+  * `byte_size` - The cumulative in-memory byte size of all events received.
 * Metrics
   * MUST increment the `received_events_total` counter by the defined `quantity`
     property with the other properties as metric tags.
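
Reviewer note: the `EventsReceived` requirements touched by the patches above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Telemetry`, `EventsReceived`, and `emit_events_received` are hypothetical names invented for this note and are not Vector's internal API. Only the counter names, the `count` and `byte_size` properties, and the `Events received.` message come from the spec text.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for a metrics and logging backend.
/// It is NOT Vector's real instrumentation API.
#[derive(Default)]
struct Telemetry {
    counters: HashMap<&'static str, u64>,
    trace_logs: Vec<String>,
}

/// Hypothetical event type carrying the two properties named in the spec.
struct EventsReceived {
    /// The count of Vector events.
    count: u64,
    /// The cumulative in-memory byte size of all events received.
    byte_size: u64,
}

impl Telemetry {
    /// Add `by` to the named counter, creating it at zero if absent.
    fn increment(&mut self, name: &'static str, by: u64) {
        *self.counters.entry(name).or_insert(0) += by;
    }

    /// Read a counter, treating a missing counter as zero.
    fn counter(&self, name: &str) -> u64 {
        self.counters.get(name).copied().unwrap_or(0)
    }

    /// Apply the telemetry the spec asks for when events are received:
    /// bump both counters and record a trace-level message that carries
    /// the properties as key-value pairs (never rate limited).
    fn emit_events_received(&mut self, event: &EventsReceived) {
        self.increment("received_events_total", event.count);
        self.increment("received_event_bytes_total", event.byte_size);
        self.trace_logs.push(format!(
            "Events received. count={} byte_size={}",
            event.count, event.byte_size
        ));
    }
}
```

Emitting one event with `count: 10` leaves `received_events_total` in the same state as emitting ten events with `count: 1`, which is the batching equivalence the spec requires. The number of trace log lines differs between the two, so the equivalence here is about counter state only.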

From 421c30c148e01fdecd44ebea4cece7856ee61d11 Mon Sep 17 00:00:00 2001
From: Bruce Guenter
Date: Wed, 25 Aug 2021 18:28:40 -0600
Subject: [PATCH 16/19] Fix counter naming pattern to `DESCRIPTOR_UNITS_total`

Signed-off-by: Bruce Guenter
---
 docs/specs/component.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/specs/component.md b/docs/specs/component.md
index 4e038b2ab8902..69482fe1f8706 100644
--- a/docs/specs/component.md
+++ b/docs/specs/component.md
@@ -93,7 +93,7 @@ There is leeway in the implementation of these events:
 * Components MAY emit events for batches of Vector events for performance
   reasons, but the resulting telemetry state MUST be equivalent to emitting
   individual events. For example, emitting the `EventsReceived` event for 10
-  events MUST increment the `events_in_total` counter by 10.
+  events MUST increment the `received_events_total` counter by 10.
 
 #### BytesReceived
 
@@ -113,7 +113,7 @@ from the upstream source and before the creation of a Vector event.
   * `path` - If relevant, the HTTP path, excluding query strings.
   * `socket` - If relevant, the socket number that bytes were received from.
 * Metrics
-  * MUST increment the `bytes_in_total` counter by the defined value with
+  * MUST increment the `received_bytes_total` counter by the defined value with
     the defined properties as metric tags.
 * Logs
   * MUST log a `Bytes received.` message at the `trace` level with the
@@ -173,7 +173,7 @@ downstream target regardless if the transmission was successful or not.
     HTTP, this MUST be the host and path only, excluding the query string.
   * `file` - If relevant, the absolute path of the file.
 * Metrics
-  * MUST increment the `bytes_out_total` counter by the defined value with the
+  * MUST increment the `sent_bytes_total` counter by the defined value with the
     defined properties as metric tags.
 * Logs
   * MUST log a `Bytes received.` message at the `trace` level with the

From 7111bec80949776cd83c6be77e578e5aa3a44168 Mon Sep 17 00:00:00 2001
From: Bruce Guenter
Date: Thu, 26 Aug 2021 14:54:03 -0600
Subject: [PATCH 17/19] Change endpoint wording and http_path

Signed-off-by: Bruce Guenter
---
 docs/specs/component.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/specs/component.md b/docs/specs/component.md
index 69482fe1f8706..39c2571e5c69b 100644
--- a/docs/specs/component.md
+++ b/docs/specs/component.md
@@ -59,9 +59,9 @@ takes a `string` representing a single address.
 
 #### `endpoint(s)`
 
-When a component sends data to a downstream target, it MUST expose
-either an `endpoint` option that takes a `string` representing a single
-endpoint, or an `endpoints` option that takes an array of strings
+When a component makes a connection to a downstream target, it MUST
+expose either an `endpoint` option that takes a `string` representing a
+single endpoint, or an `endpoints` option that takes an array of strings
 representing multiple endpoints.
 
 ## Instrumentation
@@ -110,7 +110,7 @@ from the upstream source and before the creation of a Vector event.
     delimiter.
   * `protocol` - The protocol used to send the bytes (i.e., `tcp`, `udp`,
     `unix`, `http`, `https`, `file`, etc.)
-  * `path` - If relevant, the HTTP path, excluding query strings.
+  * `http_path` - If relevant, the HTTP path, excluding query strings.
   * `socket` - If relevant, the socket number that bytes were received from.
 * Metrics
   * MUST increment the `received_bytes_total` counter by the defined value with

From 312a3d65d0da94c1e414e791883bcb64b264bf2d Mon Sep 17 00:00:00 2001
From: Ben Johnson
Date: Thu, 26 Aug 2021 17:24:26 -0400
Subject: [PATCH 18/19] Update docs/specs/component.md

---
 docs/specs/component.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/specs/component.md b/docs/specs/component.md
index 39c2571e5c69b..db027b53e8061 100644
--- a/docs/specs/component.md
+++ b/docs/specs/component.md
@@ -186,7 +186,7 @@ named with an `Error` suffix. For example, the `socket` source emits a
 `SocketReceiveError` representing any error that occurs while receiving data off
 of the socket.
 
-This specification does list a standard set of errors that components must
+This specification does not list a standard set of errors that components must
 implement since errors are specific to the component.
 
 * Properties

From 8318f1272a0fd35b9d38ade29f196d0a864ce92e Mon Sep 17 00:00:00 2001
From: Bruce Guenter
Date: Fri, 27 Aug 2021 15:05:56 -0600
Subject: [PATCH 19/19] Remove the `address` configuration specification.

The discussion over this option was blocking moving forward on the rest
of this specification, which is more important than nailing down every
configuration option.

Signed-off-by: Bruce Guenter
---
 docs/specs/component.md | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/docs/specs/component.md b/docs/specs/component.md
index db027b53e8061..e3187b731cb68 100644
--- a/docs/specs/component.md
+++ b/docs/specs/component.md
@@ -52,11 +52,6 @@ transforms, and sinks).
 
 ### Options
 
-#### `address`
-
-When a component binds to an address, it SHOULD expose an `address` option that
-takes a `string` representing a single address.
-
 #### `endpoint(s)`
 
 When a component makes a connection to a downstream target, it MUST