From 8c6d33f6c4883cd6124f8a511e7a611b81356229 Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Thu, 6 Mar 2025 13:57:57 -0500 Subject: [PATCH 01/28] Create initial Entities data model specification. --- specification/entities/README.md | 27 ++++ specification/entities/data-model.md | 183 +++++++++++++++++++++++++++ specification/resource/README.md | 26 ++++ specification/resource/data-model.md | 62 +++++++++ 4 files changed, 298 insertions(+) create mode 100644 specification/entities/README.md create mode 100644 specification/entities/data-model.md create mode 100644 specification/resource/data-model.md diff --git a/specification/entities/README.md b/specification/entities/README.md new file mode 100644 index 00000000000..bd07077ea3f --- /dev/null +++ b/specification/entities/README.md @@ -0,0 +1,27 @@ + + +# Entities + +
+ Table of Contents + + + +- [Overview](#overview) +- [Specifications](#specifications) + + + +
+ +## Overview + +Entity represents an object of interest associated with produced telemetry: traces, metrics, logs, etc. + +## Specifications + +- [Data Model](./data-model.md) \ No newline at end of file diff --git a/specification/entities/data-model.md b/specification/entities/data-model.md new file mode 100644 index 00000000000..92693b93a4c --- /dev/null +++ b/specification/entities/data-model.md @@ -0,0 +1,183 @@ +# Entity Data Model + +**Status**: [Development](../document-status.md) + + +
+Table of Contents + + + +- [Minimally Sufficient Id](#minimally-sufficient-id) +- [Examples of Entities](#examples-of-entities) + + + +
+ +Entity represents an object of interest associated with produced telemetry: +traces, metrics or logs. + +For example, telemetry produced using OpenTelemetry SDK is normally associated with +a `service` entity. Similarly, OpenTelemetry defines system metrics for a `host`. The `host` is the +entity we want to associate metrics with in this case. + +Entities may be also associated with produced telemetry indirectly. +For example a service that produces +telemetry is also related with a process in which the service runs, so we say that +the Service entity is related to the `process` entity. The process normally also runs +on a host, so we say that the `process` entity is related to the `host` entity. + +> Note: How entities are associated will be refined in future specification work. + +The data model below defines a logical model for an entity (irrespective of the physical +format and encoding of how entity data is recorded). + + + + + + + + + + + + + + + + + + + + + + +
+<table>
+  <tr>
+    <td><strong>Field</strong></td>
+    <td><strong>Type</strong></td>
+    <td><strong>Description</strong></td>
+  </tr>
+  <tr>
+    <td>Type</td>
+    <td>string</td>
+    <td>Defines the type of the entity. MUST NOT change during the
+lifetime of the entity. For example: "service" or "host". This field is
+required and MUST NOT be empty for valid entities.</td>
+  </tr>
+  <tr>
+    <td>Id</td>
+    <td>map&lt;string, attribute&gt;</td>
+    <td><p>Attributes that identify the entity.</p>
+<p>MUST NOT change during the lifetime of the entity. The Id must contain
+at least one attribute.</p>
+<p>Follows OpenTelemetry common
+attribute definition. SHOULD follow OpenTelemetry semantic
+conventions for attributes.</p></td>
+  </tr>
+  <tr>
+    <td>Description</td>
+    <td>map&lt;string, any&gt;</td>
+    <td><p>Descriptive (non-identifying) attributes of the entity.</p>
+<p>MAY change over the lifetime of the entity. MAY be empty. These
+attributes are not part of the entity's identity.</p>
+<p>Follows any
+value definition in the OpenTelemetry spec - it can be a scalar value,
+byte array, an array or map of values. Arbitrarily deep nesting of values
+for arrays and maps is allowed.</p>
+<p>SHOULD follow OpenTelemetry semantic
+conventions for attributes.</p></td>
+  </tr>
+</table>
+ +### Minimally Sufficient Id + +Commonly, a number of attributes of an entity are readily available for the telemetry +producer to compose an Id from. Of the available attributes the entity Id should +include the minimal set of attributes that is sufficient for uniquely identifying +that entity. For example +a Process on a host can be uniquely identified by (`process.pid`,`process.start_time`) +attributes. Adding for example `process.executable.name` attribute to the Id is +unnecessary and violates the Minimally Sufficient Id rule. + +### Examples of Entities + +_This section is non-normative and is present only for the purposes of demonstrating +the data model._ + +Here are examples of entities, the typical identifying attributes they +have and some examples of non-identifying attributes that may be +associated with the entity. + +*Note: These examples MAY diverge from semantic conventions.* + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
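+<table>
+  <tr>
+    <td><strong>Entity</strong></td>
+    <td><strong>Entity Type</strong></td>
+    <td><strong>Identifying Attributes</strong></td>
+    <td><strong>Non-identifying Attributes</strong></td>
+  </tr>
+  <tr>
+    <td>Service</td>
+    <td>"service"</td>
+    <td>service.name (required)<br>
+service.instance.id<br>
+service.namespace</td>
+    <td>service.version</td>
+  </tr>
+  <tr>
+    <td>Host</td>
+    <td>"host"</td>
+    <td>host.id</td>
+    <td>host.name<br>
+host.type<br>
+host.image.id<br>
+host.image.name</td>
+  </tr>
+  <tr>
+    <td>K8s Pod</td>
+    <td>"k8s.pod"</td>
+    <td>k8s.pod.uid (required)<br>
+k8s.cluster.name</td>
+    <td>Any pod labels</td>
+  </tr>
+  <tr>
+    <td>K8s Pod Container</td>
+    <td>"container"</td>
+    <td>k8s.pod.uid (required)<br>
+k8s.cluster.name<br>
+container.name</td>
+    <td>Any container labels</td>
+  </tr>
+</table>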
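
The logical model added in `specification/entities/data-model.md` can be sketched in Python as below. This is an illustration only and is not part of the patch: the `Entity` class and the `identity()` and `same_entity()` helpers are hypothetical names, not an OpenTelemetry API, and the PID and start time are made-up values. The sketch assumes that two observations describe the same entity when their type and identifying attributes match, whatever their descriptive attributes say.

```python
from dataclasses import dataclass, field
from typing import Any, Mapping, Tuple


@dataclass(frozen=True)
class Entity:
    """Hypothetical in-memory form of the logical entity model."""

    type: str                    # e.g. "service" or "host"
    id: Mapping[str, Any]        # identifying attributes
    description: Mapping[str, Any] = field(default_factory=dict)  # non-identifying

    def __post_init__(self) -> None:
        # The Type field is required and must not be empty.
        if not self.type:
            raise ValueError("entity type is required and must not be empty")
        # The Id must contain at least one attribute.
        if not self.id:
            raise ValueError("entity id must contain at least one attribute")

    def identity(self) -> Tuple[str, Tuple[Tuple[str, Any], ...]]:
        # Descriptive attributes are deliberately left out of the identity.
        # Sorting by key makes the result independent of insertion order.
        return (self.type, tuple(sorted(self.id.items())))


def same_entity(a: Entity, b: Entity) -> bool:
    """True when both values carry the same type and identifying attributes."""
    return a.identity() == b.identity()


START = "2025-03-06T13:57:57Z"  # made-up start time for the example

# One process, observed once from inside the process and once from outside.
from_sdk = Entity(
    "process",
    {"process.pid": 4242, "process.start_time": START},
    {"process.executable.name": "java"},  # descriptive, so not part of the Id
)
from_collector = Entity(
    "process",
    {"process.start_time": START, "process.pid": 4242},
)

print(same_entity(from_sdk, from_collector))
```

Keeping `process.executable.name` in `description` rather than in `id` mirrors the Minimally Sufficient Id rule: the (`process.pid`, `process.start_time`) pair is already enough to tell processes on a host apart.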
diff --git a/specification/resource/README.md b/specification/resource/README.md index 26d8f4eca97..0f311e58577 100644 --- a/specification/resource/README.md +++ b/specification/resource/README.md @@ -5,3 +5,29 @@ path_base_for_github_subdir: ---> # Resource + +
+ Table of Contents + + + +- [Overview](#overview) +- [Specifications](#specifications) + + + +
+ +## Overview + +A Resource is an immutable representation of the entity producing telemetry. +Within OpenTelemetry, all signals are associated with a Resource, enabling +contextual correlation of data from the same source. For example, if I see +a high latency in a span I should be able to check the metrics for the +same entity that produced that span during the time when the latency was +observed. + +## Specifications + +- [Data Model](./data-model.md) +- [Resource SDK](./sdk.md) \ No newline at end of file diff --git a/specification/resource/data-model.md b/specification/resource/data-model.md new file mode 100644 index 00000000000..7feeae1901e --- /dev/null +++ b/specification/resource/data-model.md @@ -0,0 +1,62 @@ +# Resource Data Model + +**Status**: [Development](../document-status.md) + + +
+Table of Contents + + + +- [Identity](#identity) + * [Navigation](#navigation) + * [Telescoping](#telescoping) + + + +
+ +A Resource is an immutable representation of the entity producing telemetry as Attributes. For example, a process producing telemetry that is running in a container on Kubernetes has a Pod name, it is in a namespace and possibly is part of a Deployment which also has a name. All three of these attributes can be included in the Resource. Note that there are certain "standard attributes" that have prescribed meanings. + +A resource is composed of [`Entity`](../entities/README.md) and raw attributes. + +Resource provides two important aspects for observability: + +- It MUST *identify* an entity that is producing telemetry. +- It SHOULD allow users to determine *where* that entity resides within their infrastructure. + + +## Identity + +Most resources are a composition of `Entity`. `Entity` is described [here](../entities/data-model.md), and includes its own notion of identity. The identity of a resource is the set +of entities contained within it. Two resources are considered different if one +contains an entity not found in the other. + +Some resources include raw attributes in addition to Entities. Raw attributes are +considered identifying on a resource. That is, if the key-value pairs of +raw attributes are different, then you can assume the resource is different. + +### Navigation + +Implicit in the design of Resource and attributes is ensuring users are able to navigate their infrastructure, tools, Uis, etc. to find the *same* entity that telemetry is reporting against. For example, in the definition above, we see a few components listed for one entity: + +- A process +- A container +- A kubernetes pod name +- A namespace +- A deployment + +By including identifying attributes of each of these, we can help users navigate through their `kubectl` or kubernetes UIs to find the specific process generating telemetry. This is as important as being able to uniquely identify one process from another. + +> Aside: Observability signals SHOULD be actionable. 
Knowing a process is struggling is not as useful as > being able to scale up a deployment to take load off the struggling process. + + +If the only thing important to Resource was identity, we could simply use UUIDs. + +### Telescoping + +Within OpenTelemetry, we want to give users the flexibility to decide what information needs to be sent *with* observability signals and what information can be later joined. We call this "telescoping identity" where users can decide how *small* or *large* the size of an OpenTelemetry resource will be on the wire (and correspondingly, how large data points are when stored, depending on storage solution). + +For example, in the extreme, OpenTelemetry could synthesize a UUID for every system which produces telemetry. All identifying attributes for Resource and Entity could be sent via a side channel with known relationships to this UUID. While this would optimise the runtime generation and sending of telemetry, it comes at the cost of downstream storage systems needing to join data back together either at ingestion time or query time. For high performance use cases, e.g. alerting, these joins can be expensive. + +In practice, users control Resource identity via the configuration of Resource Detection within SDKs and the collector. Users wishing for minimal identity will limit their resource detection just to a `service.instance.id`, for example. Some users highly customize resource detection with many concepts being appended. From 14b0d1aecafc0659cbd32e4e2616114a838f907c Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Thu, 6 Mar 2025 14:05:32 -0500 Subject: [PATCH 02/28] Add changelog with PR number. --- CHANGELOG.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index d5aa8bbab53..e166d8a090b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -28,6 +28,9 @@ release. 
### Resource +- Add Datamodel for Entities + ([#4442](https://github.com/open-telemetry/opentelemetry-specification/pull/4442)) + ### Profiles ### OpenTelemetry Protocol From 08049317fe1f2bbacbc0e1c1f58ad5b232c9b400 Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Thu, 6 Mar 2025 14:08:41 -0500 Subject: [PATCH 03/28] Markdownlint. --- specification/entities/README.md | 2 +- specification/entities/data-model.md | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/specification/entities/README.md b/specification/entities/README.md index bd07077ea3f..fd7e9876426 100644 --- a/specification/entities/README.md +++ b/specification/entities/README.md @@ -24,4 +24,4 @@ Entity represents an object of interest associated with produced telemetry: trac ## Specifications -- [Data Model](./data-model.md) \ No newline at end of file +- [Data Model](./data-model.md) diff --git a/specification/entities/data-model.md b/specification/entities/data-model.md index 92693b93a4c..6585c5f9420 100644 --- a/specification/entities/data-model.md +++ b/specification/entities/data-model.md @@ -92,7 +92,7 @@ conventions for attributes. -### Minimally Sufficient Id +## Minimally Sufficient Id Commonly, a number of attributes of an entity are readily available for the telemetry producer to compose an Id from. Of the available attributes the entity Id should @@ -102,7 +102,7 @@ a Process on a host can be uniquely identified by (`process.pid`,`process.start_ attributes. Adding for example `process.executable.name` attribute to the Id is unnecessary and violates the Minimally Sufficient Id rule. -### Examples of Entities +## Examples of Entities _This section is non-normative and is present only for the purposes of demonstrating the data model._ From 02b1dc567f514f5e958a0a75949a348ca9e2b5f4 Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Thu, 6 Mar 2025 14:11:11 -0500 Subject: [PATCH 04/28] Fix more markdownlint. 
--- specification/entities/data-model.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/specification/entities/data-model.md b/specification/entities/data-model.md index 6585c5f9420..73335f06d91 100644 --- a/specification/entities/data-model.md +++ b/specification/entities/data-model.md @@ -2,7 +2,6 @@ **Status**: [Development](../document-status.md) -
Table of Contents @@ -111,7 +110,7 @@ Here are examples of entities, the typical identifying attributes they have and some examples of non-identifying attributes that may be associated with the entity. -*Note: These examples MAY diverge from semantic conventions.* +_Note: These examples MAY diverge from semantic conventions._ From fc3e90a11c462d48f83e3d066c45f10f8b21708c Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Thu, 6 Mar 2025 14:16:50 -0500 Subject: [PATCH 05/28] Fix lint issue. --- specification/resource/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specification/resource/README.md b/specification/resource/README.md index 0f311e58577..ff469168e3a 100644 --- a/specification/resource/README.md +++ b/specification/resource/README.md @@ -30,4 +30,4 @@ observed. ## Specifications - [Data Model](./data-model.md) -- [Resource SDK](./sdk.md) \ No newline at end of file +- [Resource SDK](./sdk.md) From 80769a02c5e1b570f373f3a5694b34e252ca851c Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Thu, 6 Mar 2025 14:39:52 -0500 Subject: [PATCH 06/28] Fix lint issues. --- specification/resource/data-model.md | 3 --- 1 file changed, 3 deletions(-) diff --git a/specification/resource/data-model.md b/specification/resource/data-model.md index 7feeae1901e..986588f16f7 100644 --- a/specification/resource/data-model.md +++ b/specification/resource/data-model.md @@ -2,7 +2,6 @@ **Status**: [Development](../document-status.md) -
Table of Contents @@ -25,7 +24,6 @@ Resource provides two important aspects for observability: - It MUST *identify* an entity that is producing telemetry. - It SHOULD allow users to determine *where* that entity resides within their infrastructure. - ## Identity Most resources are a composition of `Entity`. `Entity` is described [here](../entities/data-model.md), and includes its own notion of identity. The identity of a resource is the set @@ -50,7 +48,6 @@ By including identifying attributes of each of these, we can help users navigate > Aside: Observability signals SHOULD be actionable. Knowing a process is struggling is not as useful as > being able to scale up a deployment to take load off the struggling process. - If the only thing important to Resource was identity, we could simply use UUIDs. ### Telescoping From c8adaf0cbc93b91f352d6ca1071589b290a440c0 Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Thu, 13 Mar 2025 08:47:28 -0700 Subject: [PATCH 07/28] Apply suggestions from code review Co-authored-by: Daniel Dyla --- specification/resource/data-model.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/specification/resource/data-model.md b/specification/resource/data-model.md index 986588f16f7..9abc4818fcb 100644 --- a/specification/resource/data-model.md +++ b/specification/resource/data-model.md @@ -17,7 +17,7 @@ A Resource is an immutable representation of the entity producing telemetry as Attributes. For example, a process producing telemetry that is running in a container on Kubernetes has a Pod name, it is in a namespace and possibly is part of a Deployment which also has a name. All three of these attributes can be included in the Resource. Note that there are certain "standard attributes" that have prescribed meanings. -A resource is composed of [`Entity`](../entities/README.md) and raw attributes. 
+A resource is composed of 0 or more [`Entities`](../entities/README.md) and 0 or more attributes not associated with any entity. Resource provides two important aspects for observability: @@ -36,7 +36,7 @@ raw attributes are different, then you can assume the resource is different. ### Navigation -Implicit in the design of Resource and attributes is ensuring users are able to navigate their infrastructure, tools, Uis, etc. to find the *same* entity that telemetry is reporting against. For example, in the definition above, we see a few components listed for one entity: +Implicit in the design of Resource and attributes is ensuring users are able to navigate their infrastructure, tools, UIs, etc. to find the *same* entity that telemetry is reporting against. For example, in the definition above, we see a few components listed for one entity: - A process - A container From 1f10be2503783cdbb1b1e5a6250d4732fb152273 Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Thu, 13 Mar 2025 08:47:53 -0700 Subject: [PATCH 08/28] Update specification/entities/data-model.md Co-authored-by: Daniel Dyla --- specification/entities/data-model.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specification/entities/data-model.md b/specification/entities/data-model.md index 73335f06d91..d20ca580bc4 100644 --- a/specification/entities/data-model.md +++ b/specification/entities/data-model.md @@ -24,7 +24,7 @@ entity we want to associate metrics with in this case. Entities may be also associated with produced telemetry indirectly. For example a service that produces telemetry is also related with a process in which the service runs, so we say that -the Service entity is related to the `process` entity. The process normally also runs +the `service` entity is related to the `process` entity. The process normally also runs on a host, so we say that the `process` entity is related to the `host` entity. 
> Note: How entities are associated will be refined in future specification work. From 1d43246f39df52bba7fc71d4f82e799501785c73 Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Thu, 13 Mar 2025 09:00:27 -0700 Subject: [PATCH 09/28] Address missing repeatability language for id. --- specification/entities/data-model.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specification/entities/data-model.md b/specification/entities/data-model.md index d20ca580bc4..a7c49511f10 100644 --- a/specification/entities/data-model.md +++ b/specification/entities/data-model.md @@ -27,7 +27,7 @@ telemetry is also related with a process in which the service runs, so we say th the `service` entity is related to the `process` entity. The process normally also runs on a host, so we say that the `process` entity is related to the `host` entity. -> Note: How entities are associated will be refined in future specification work. +> Note: Entity relationship modelling will be refined in future specification work. The data model below defines a logical model for an entity (irrespective of the physical format and encoding of how entity data is recorded). From 912e572dc323b419f0d6d3ec879abbda68981631 Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Tue, 18 Mar 2025 10:11:18 -0700 Subject: [PATCH 10/28] Add cached but not saved vscode changes. --- specification/entities/data-model.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/specification/entities/data-model.md b/specification/entities/data-model.md index a7c49511f10..c6edd96681e 100644 --- a/specification/entities/data-model.md +++ b/specification/entities/data-model.md @@ -101,6 +101,12 @@ a Process on a host can be uniquely identified by (`process.pid`,`process.start_ attributes. Adding for example `process.executable.name` attribute to the Id is unnecessary and violates the Minimally Sufficient Id rule. 
+## Repeatable Id + +The identifying attributes for entity SHOULD be values that can be repeatably obtained by observers of that entity. For example, a `process` entity SHOULD have the same id (and be recongized as the same process), regardless of whether the id was generated from the process itself, via SDK, by an OpenTelemetry Collector running on the same host, or by some other system describing the process. + +> Aside: There are many ways to accomplish repeatable identifying attributes across multiple observers. While many succesful systems rely on pushing down identity from a central registry or knowledge store, OpenTelemetry must support all possible scenarios. + ## Examples of Entities _This section is non-normative and is present only for the purposes of demonstrating From 9865527708b767a1d8922b5e5b863bebca139f17 Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Tue, 18 Mar 2025 10:13:15 -0700 Subject: [PATCH 11/28] Fix typos. --- specification/entities/data-model.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/specification/entities/data-model.md b/specification/entities/data-model.md index c6edd96681e..74ca5afbbe0 100644 --- a/specification/entities/data-model.md +++ b/specification/entities/data-model.md @@ -103,9 +103,9 @@ unnecessary and violates the Minimally Sufficient Id rule. ## Repeatable Id -The identifying attributes for entity SHOULD be values that can be repeatably obtained by observers of that entity. For example, a `process` entity SHOULD have the same id (and be recongized as the same process), regardless of whether the id was generated from the process itself, via SDK, by an OpenTelemetry Collector running on the same host, or by some other system describing the process. +The identifying attributes for entity SHOULD be values that can be repeatably obtained by observers of that entity. 
For example, a `process` entity SHOULD have the same id (and be recognized as the same process), regardless of whether the id was generated from the process itself, via SDK, by an OpenTelemetry Collector running on the same host, or by some other system describing the process. -> Aside: There are many ways to accomplish repeatable identifying attributes across multiple observers. While many succesful systems rely on pushing down identity from a central registry or knowledge store, OpenTelemetry must support all possible scenarios. +> Aside: There are many ways to accomplish repeatable identifying attributes across multiple observers. While many successful systems rely on pushing down identity from a central registry or knowledge store, OpenTelemetry must support all possible scenarios. ## Examples of Entities From 3b112cfade7cf6cb19a7014c372cfd2b6a0236d6 Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Tue, 18 Mar 2025 10:16:06 -0700 Subject: [PATCH 12/28] Apply suggestions from code review Co-authored-by: Nathan L Smith --- specification/entities/README.md | 4 ++-- specification/entities/data-model.md | 4 ++-- specification/resource/data-model.md | 4 ++-- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/specification/entities/README.md b/specification/entities/README.md index fd7e9876426..2ab56826667 100644 --- a/specification/entities/README.md +++ b/specification/entities/README.md @@ -7,7 +7,7 @@ path_base_for_github_subdir: # Entities
- Table of Contents + Table of Contents @@ -20,7 +20,7 @@ path_base_for_github_subdir: ## Overview -Entity represents an object of interest associated with produced telemetry: traces, metrics, logs, etc. +Entity represents an object of interest associated with produced telemetry: traces, metrics, logs, profiles etc. ## Specifications diff --git a/specification/entities/data-model.md b/specification/entities/data-model.md index 74ca5afbbe0..d96fe27ae7f 100644 --- a/specification/entities/data-model.md +++ b/specification/entities/data-model.md @@ -15,9 +15,9 @@
Entity represents an object of interest associated with produced telemetry: -traces, metrics or logs. +traces, metrics, profiles, or logs. -For example, telemetry produced using OpenTelemetry SDK is normally associated with +For example, telemetry produced using an OpenTelemetry SDK is normally associated with a `service` entity. Similarly, OpenTelemetry defines system metrics for a `host`. The `host` is the entity we want to associate metrics with in this case. diff --git a/specification/resource/data-model.md b/specification/resource/data-model.md index 9abc4818fcb..1b43da4c2bb 100644 --- a/specification/resource/data-model.md +++ b/specification/resource/data-model.md @@ -44,9 +44,9 @@ Implicit in the design of Resource and attributes is ensuring users are able to - A namespace - A deployment -By including identifying attributes of each of these, we can help users navigate through their `kubectl` or kubernetes UIs to find the specific process generating telemetry. This is as important as being able to uniquely identify one process from another. +By including identifying attributes of each of these, we can help users navigate through their `kubectl` or Kubernetes UIs to find the specific process generating telemetry. This is as important as being able to uniquely identify one process from another. -> Aside: Observability signals SHOULD be actionable. Knowing a process is struggling is not as useful as > being able to scale up a deployment to take load off the struggling process. +> Aside: Observability signals SHOULD be actionable. Knowing a process is struggling is not as useful as being able to scale up a deployment to take load off the struggling process. If the only thing important to Resource was identity, we could simply use UUIDs. 
From 45173f31ce9ed6578b483fbab109c669c32ff38a Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Tue, 18 Mar 2025 10:17:02 -0700 Subject: [PATCH 13/28] Update specification/entities/data-model.md Co-authored-by: Nathan L Smith --- specification/entities/data-model.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specification/entities/data-model.md b/specification/entities/data-model.md index d96fe27ae7f..d062cf47c07 100644 --- a/specification/entities/data-model.md +++ b/specification/entities/data-model.md @@ -23,7 +23,7 @@ entity we want to associate metrics with in this case. Entities may be also associated with produced telemetry indirectly. For example a service that produces -telemetry is also related with a process in which the service runs, so we say that +telemetry is also related to a process in which the service runs, so we say that the `service` entity is related to the `process` entity. The process normally also runs on a host, so we say that the `process` entity is related to the `host` entity. From a2d6288ec89fdfc52d8d16498e95af17336eac0e Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Tue, 18 Mar 2025 10:39:56 -0700 Subject: [PATCH 14/28] Update toc. --- specification/entities/data-model.md | 1 + 1 file changed, 1 insertion(+) diff --git a/specification/entities/data-model.md b/specification/entities/data-model.md index d062cf47c07..821c6b04415 100644 --- a/specification/entities/data-model.md +++ b/specification/entities/data-model.md @@ -8,6 +8,7 @@ - [Minimally Sufficient Id](#minimally-sufficient-id) +- [Repeatable Id](#repeatable-id) - [Examples of Entities](#examples-of-entities) From 2cfecbf06f8011def63189d673e35f4f9f3fa5f7 Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Tue, 18 Mar 2025 11:18:34 -0700 Subject: [PATCH 15/28] reword a poorly worded sentence. 
--- specification/resource/data-model.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specification/resource/data-model.md b/specification/resource/data-model.md index 1b43da4c2bb..44d9f8fef07 100644 --- a/specification/resource/data-model.md +++ b/specification/resource/data-model.md @@ -15,7 +15,7 @@
-A Resource is an immutable representation of the entity producing telemetry as Attributes. For example, a process producing telemetry that is running in a container on Kubernetes has a Pod name, it is in a namespace and possibly is part of a Deployment which also has a name. All three of these attributes can be included in the Resource. Note that there are certain "standard attributes" that have prescribed meanings. +A Resource is an immutable representation of the entity producing telemetry as Attributes. For example, You could have a process producing telemetry that is running in a container on Kubernetes, which is associated to a Pod running on a Node that is a VM but also is in a namespace and possibly is part of a Deployment. Resource could have attributes to denote information about the Container, the Pod, the Node, the VM or the Deployment. All of these help identify what produced the telemetry. Note that there are certain "standard attributes" that have prescribed meanings. A resource is composed of 0 or more [`Entities`](../entities/README.md) and 0 or more attributes not associated with any entity. From 9ee7c4514f165d3646db40fe172ccd9ea46ea31d Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Fri, 4 Apr 2025 09:54:26 -0400 Subject: [PATCH 16/28] Enforce 80 character limit on markdown lines. --- specification/entities/data-model.md | 35 ++++++++++------ specification/resource/data-model.md | 60 +++++++++++++++++++++------- 2 files changed, 69 insertions(+), 26 deletions(-) diff --git a/specification/entities/data-model.md b/specification/entities/data-model.md index 821c6b04415..7b383c3cf97 100644 --- a/specification/entities/data-model.md +++ b/specification/entities/data-model.md @@ -18,20 +18,23 @@ Entity represents an object of interest associated with produced telemetry: traces, metrics, profiles, or logs. -For example, telemetry produced using an OpenTelemetry SDK is normally associated with -a `service` entity. 
Similarly, OpenTelemetry defines system metrics for a `host`. The `host` is the -entity we want to associate metrics with in this case. +For example, telemetry produced using an OpenTelemetry SDK is normally +associated with a `service` entity. Similarly, OpenTelemetry defines system +metrics for a `host`. The `host` is the entity we want to associate metrics with +in this case. Entities may be also associated with produced telemetry indirectly. For example a service that produces telemetry is also related to a process in which the service runs, so we say that -the `service` entity is related to the `process` entity. The process normally also runs -on a host, so we say that the `process` entity is related to the `host` entity. +the `service` entity is related to the `process` entity. The process normally +also runs on a host, so we say that the `process` entity is related to the +`host` entity. -> Note: Entity relationship modelling will be refined in future specification work. +> Note: Entity relationship modelling will be refined in future specification +> work. -The data model below defines a logical model for an entity (irrespective of the physical -format and encoding of how entity data is recorded). +The data model below defines a logical model for an entity (irrespective of the +physical format and encoding of how entity data is recorded).
@@ -104,14 +107,22 @@ unnecessary and violates the Minimally Sufficient Id rule. ## Repeatable Id -The identifying attributes for entity SHOULD be values that can be repeatably obtained by observers of that entity. For example, a `process` entity SHOULD have the same id (and be recognized as the same process), regardless of whether the id was generated from the process itself, via SDK, by an OpenTelemetry Collector running on the same host, or by some other system describing the process. +The identifying attributes for entity SHOULD be values that can be repeatably +obtained by observers of that entity. For example, a `process` entity SHOULD +have the same id (and be recognized as the same process), regardless of whether +the id was generated from the process itself, via SDK, by an OpenTelemetry +Collector running on the same host, or by some other system describing the +process. -> Aside: There are many ways to accomplish repeatable identifying attributes across multiple observers. While many successful systems rely on pushing down identity from a central registry or knowledge store, OpenTelemetry must support all possible scenarios. +> Aside: There are many ways to accomplish repeatable identifying attributes +> across multiple observers. While many successful systems rely on pushing down +> identity from a central registry or knowledge store, OpenTelemetry must +> support all possible scenarios. 
## Examples of Entities -_This section is non-normative and is present only for the purposes of demonstrating -the data model._ +_This section is non-normative and is present only for the purposes of +demonstrating the data model._ Here are examples of entities, the typical identifying attributes they have and some examples of non-identifying attributes that may be diff --git a/specification/resource/data-model.md b/specification/resource/data-model.md index 44d9f8fef07..53c7679ba84 100644 --- a/specification/resource/data-model.md +++ b/specification/resource/data-model.md @@ -15,9 +15,17 @@ -A Resource is an immutable representation of the entity producing telemetry as Attributes. For example, You could have a process producing telemetry that is running in a container on Kubernetes, which is associated to a Pod running on a Node that is a VM but also is in a namespace and possibly is part of a Deployment. Resource could have attributes to denote information about the Container, the Pod, the Node, the VM or the Deployment. All of these help identify what produced the telemetry. Note that there are certain "standard attributes" that have prescribed meanings. - -A resource is composed of 0 or more [`Entities`](../entities/README.md) and 0 or more attributes not associated with any entity. +A Resource is an immutable representation of the entity producing telemetry as +Attributes. For example, You could have a process producing telemetry that is +running in a container on Kubernetes, which is associated to a Pod running on a +Node that is a VM but also is in a namespace and possibly is part of a +Deployment. Resource could have attributes to denote information about the +Container, the Pod, the Node, the VM or the Deployment. All of these help +identify what produced the telemetry. Note that there are certain "standard +attributes" that have prescribed meanings. 
+ +A resource is composed of 0 or more [`Entities`](../entities/README.md) and 0 +or more attributes not associated with any entity. Resource provides two important aspects for observability: @@ -26,9 +34,11 @@ Resource provides two important aspects for observability: ## Identity -Most resources are a composition of `Entity`. `Entity` is described [here](../entities/data-model.md), and includes its own notion of identity. The identity of a resource is the set -of entities contained within it. Two resources are considered different if one -contains an entity not found in the other. +Most resources are a composition of `Entity`. `Entity` is described +[here](../entities/data-model.md), and includes its own notion of identity. +The identity of a resource is the set of entities contained within it. Two +resources are considered different if one contains an entity not found in the +other. Some resources include raw attributes in addition to Entities. Raw attributes are considered identifying on a resource. That is, if the key-value pairs of @@ -36,7 +46,10 @@ raw attributes are different, then you can assume the resource is different. ### Navigation -Implicit in the design of Resource and attributes is ensuring users are able to navigate their infrastructure, tools, UIs, etc. to find the *same* entity that telemetry is reporting against. For example, in the definition above, we see a few components listed for one entity: +Implicit in the design of Resource and attributes is ensuring users are able to +navigate their infrastructure, tools, UIs, etc. to find the *same* entity that +telemetry is reporting against. 
For example, in the definition above, we see a +few components listed for one entity: - A process - A container @@ -44,16 +57,35 @@ Implicit in the design of Resource and attributes is ensuring users are able to - A namespace - A deployment -By including identifying attributes of each of these, we can help users navigate through their `kubectl` or Kubernetes UIs to find the specific process generating telemetry. This is as important as being able to uniquely identify one process from another. +By including identifying attributes of each of these, we can help users navigate +through their `kubectl` or Kubernetes UIs to find the specific process +generating telemetry. This is as important as being able to uniquely identify +one process from another. -> Aside: Observability signals SHOULD be actionable. Knowing a process is struggling is not as useful as being able to scale up a deployment to take load off the struggling process. +> Aside: Observability signals SHOULD be actionable. Knowing a process is +> struggling is not as useful as being able to scale up a deployment to take +> load off the struggling process. If the only thing important to Resource was identity, we could simply use UUIDs. ### Telescoping -Within OpenTelemetry, we want to give users the flexibility to decide what information needs to be sent *with* observability signals and what information can be later joined. We call this "telescoping identity" where users can decide how *small* or *large* the size of an OpenTelemetry resource will be on the wire (and correspondingly, how large data points are when stored, depending on storage solution). - -For example, in the extreme, OpenTelemery could synthesize a UUID for every system which produces telemetry. All identifying attributes for Resource and Entity could be sent via a side channel with known relationships to this UUID. 
While this would optimise the runtime generation and sending of telemetry, it comes at the cost of downstream storage systems needing to join data back together either at ingestion time or query time. For high performance use cases, e.g. alerting, these joins can be expensive. - -In practice, users control Resource identity via the configuration of Resource Detection within SDKs and the collector. Users wishing for minimal identity will limit their resource detection just to a `service.instance.id`, for example. Some users highly customize resource detection with many concepts being appended. +Within OpenTelemetry, we want to give users the flexibility to decide what +information needs to be sent *with* observability signals and what information +can be later joined. We call this "telescoping identity" where users can decide +how *small* or *large* the size of an OpenTelemetry resource will be on the wire +(and correspondingly, how large data points are when stored, depending on +storage solution). + +For example, in the extreme, OpenTelemetry could synthesize a UUID for every +system which produces telemetry. All identifying attributes for Resource and +Entity could be sent via a side channel with known relationships to this UUID. +While this would optimise the runtime generation and sending of telemetry, it +comes at the cost of downstream storage systems needing to join data back +together either at ingestion time or query time. For high performance use cases, +e.g. alerting, these joins can be expensive. + +In practice, users control Resource identity via the configuration of Resource +Detection within SDKs and the collector. Users wishing for minimal identity will +limit their resource detection just to a `service.instance.id`, for example. +Some users highly customize resource detection with many concepts being appended. 
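
As a non-normative illustration of the Identity and Telescoping text above, here is a minimal Python sketch. It assumes that an entity is identified by its type together with its identifying attributes, and that the identity of a Resource is the set of its entity identities plus any raw attributes. The names `Entity`, `Resource`, and `identity()` are hypothetical: they are not defined by this specification or by any OpenTelemetry SDK. Attribute values are limited to hashable scalars to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple, Union

# Simplification for this sketch: only scalar attribute values.
AttributeValue = Union[str, bool, int, float]
Attributes = Dict[str, AttributeValue]


@dataclass(frozen=True)
class Entity:
    """Hypothetical in-memory form of an entity."""
    type: str                 # e.g. "k8s.pod"
    id: Attributes            # identifying attributes
    description: Attributes = field(default_factory=dict)  # descriptive attributes

    def identity(self):
        # Descriptive attributes are deliberately left out of identity.
        return (self.type, frozenset(self.id.items()))


@dataclass(frozen=True)
class Resource:
    """Hypothetical Resource: a set of entities plus raw attributes."""
    entities: Tuple[Entity, ...] = ()
    attributes: Attributes = field(default_factory=dict)  # raw attributes

    def identity(self):
        # Entity order does not matter; raw attributes count as identifying.
        return (
            frozenset(e.identity() for e in self.entities),
            frozenset(self.attributes.items()),
        )


service = Entity(
    "service.instance",
    {"service.instance.id": "627cc493", "service.name": "cart"},
    {"service.version": "1.4.2"},
)
pod = Entity("k8s.pod", {"k8s.pod.uid": "a1b2"}, {"k8s.pod.name": "cart-7d9f"})

# "Telescoped" down to the service instance only, versus a larger resource.
minimal = Resource(entities=(service,))
expanded = Resource(entities=(service, pod))
```

Under these assumptions, `minimal` and `expanded` are different resources because one contains an entity the other lacks, while changing only a descriptive attribute such as `k8s.pod.name` leaves an entity's identity unchanged.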
From ae7e933e9edf0884087e39bbaf216e90cb27b9e0 Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Fri, 4 Apr 2025 09:59:08 -0400 Subject: [PATCH 17/28] Address some comments. --- specification/entities/data-model.md | 8 ++++---- specification/resource/README.md | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/specification/entities/data-model.md b/specification/entities/data-model.md index 7b383c3cf97..de334f3982a 100644 --- a/specification/entities/data-model.md +++ b/specification/entities/data-model.md @@ -95,7 +95,7 @@ conventions for attributes.
-## Minimally Sufficient Id +## Minimally Sufficient Identity Commonly, a number of attributes of an entity are readily available for the telemetry producer to compose an Id from. Of the available attributes the entity Id should @@ -105,12 +105,12 @@ a Process on a host can be uniquely identified by (`process.pid`,`process.start_ attributes. Adding for example `process.executable.name` attribute to the Id is unnecessary and violates the Minimally Sufficient Id rule. -## Repeatable Id +## Repeatable Identity The identifying attributes for entity SHOULD be values that can be repeatably obtained by observers of that entity. For example, a `process` entity SHOULD -have the same id (and be recognized as the same process), regardless of whether -the id was generated from the process itself, via SDK, by an OpenTelemetry +have the same identity (and be recognized as the same process), regardless of whether +the identity was generated from the process itself, via SDK, by an OpenTelemetry Collector running on the same host, or by some other system describing the process. diff --git a/specification/resource/README.md b/specification/resource/README.md index ff469168e3a..54f48b445f6 100644 --- a/specification/resource/README.md +++ b/specification/resource/README.md @@ -7,7 +7,7 @@ path_base_for_github_subdir: # Resource
- Table of Contents + Table of Contents From ef20cc83d01fdc4a80ea84bdddea3562bfb41cef Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Fri, 4 Apr 2025 10:22:25 -0400 Subject: [PATCH 18/28] Clean up the specification examples. --- specification/entities/data-model.md | 92 +++++++++++++++------------- 1 file changed, 51 insertions(+), 41 deletions(-) diff --git a/specification/entities/data-model.md b/specification/entities/data-model.md index de334f3982a..65ac072e390 100644 --- a/specification/entities/data-model.md +++ b/specification/entities/data-model.md @@ -58,7 +58,7 @@ required and MUST not be empty for valid entities. Id - map<string, attribute> + map<string, standard attribute value> Attributes that identify the entity.

@@ -66,7 +66,7 @@ MUST not change during the lifetime of the entity. The Id must contain at least one attribute.

Follows OpenTelemetry common +href="../../specification/common/README.md#standard-attribute">Standard attribute definition. SHOULD follow OpenTelemetry semantic conventions for attributes. @@ -100,19 +100,18 @@ conventions for attributes. Commonly, a number of attributes of an entity are readily available for the telemetry producer to compose an Id from. Of the available attributes the entity Id should include the minimal set of attributes that is sufficient for uniquely identifying -that entity. For example -a Process on a host can be uniquely identified by (`process.pid`,`process.start_time`) -attributes. Adding for example `process.executable.name` attribute to the Id is -unnecessary and violates the Minimally Sufficient Id rule. +that entity. For example a Process on a host can be uniquely identified by +(`process.pid`,`process.start_time`) attributes. Adding for example `process.executable.name` attribute to the Id is unnecessary and violates the +Minimally Sufficient Identity rule. ## Repeatable Identity The identifying attributes for entity SHOULD be values that can be repeatably obtained by observers of that entity. For example, a `process` entity SHOULD have the same identity (and be recognized as the same process), regardless of whether -the identity was generated from the process itself, via SDK, by an OpenTelemetry -Collector running on the same host, or by some other system describing the -process. +the identity was generated from the process itself, e.g. via SDK, or by an +OpenTelemetry Collector running on the same host, or by some other system +describing the process. > Aside: There are many ways to accomplish repeatable identifying attributes > across multiple observers. 
While many successful systems rely on pushing down @@ -125,7 +124,7 @@ _This section is non-normative and is present only for the purposes of demonstrating the data model._ Here are examples of entities, the typical identifying attributes they -have and some examples of non-identifying attributes that may be +have and some examples of descriptive attributes that may be associated with the entity. _Note: These examples MAY diverge from semantic conventions._ @@ -138,63 +137,74 @@ _Note: These examples MAY diverge from semantic conventions._ Identifying Attributes - Non-identifying Attributes + Descriptive Attributes - Service + Container - "service" +

container
- service.name (required) -

-service.instance.id -

-service.namespace + container.id - service.version + container.image.id
+ container.image.name
+ container.image.tag.{key}
+ container.label.{key}
+ container.name
+ container.runtime
+ oci.manifest.digest
+ container.command
Host - "host" +

host
host.id - host.name -

-host.type -

-host.image.id -

-host.image.name + host.arch
+ host.name
+ host.type
+ host.image.id
+ host.image.name
+ host.image.version
- K8s Pod + Kubernetes Node - "k8s.pod" +

k8s.node
- k8s.pod.uid (required) -

-k8s.cluster.name + k8s.node.uid - Any pod labels + k8s.node.name - K8s Pod Container + Kubernetes Pod - "container" +

k8s.pod
- k8s.pod.uid (required) -

-k8s.cluster.name -

-container.name + k8s.pod.uid + + k8s.pod.name
+ k8s.pod.label.{key}
+ k8s.pod.annotation.{key}
+ + + + Service Instance + +

service.instance
- Any container labels + service.instance.id
+ service.name
+ service.namespace + + service.version From 15c85fc6a783d24eb7ecc26e267e0951470b1a29 Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Fri, 4 Apr 2025 10:24:54 -0400 Subject: [PATCH 19/28] Another english cleanup. --- specification/resource/README.md | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/specification/resource/README.md b/specification/resource/README.md index 54f48b445f6..f2160215f7a 100644 --- a/specification/resource/README.md +++ b/specification/resource/README.md @@ -22,10 +22,9 @@ path_base_for_github_subdir: A Resource is an immutable representation of the entity producing telemetry. Within OpenTelemetry, all signals are associated with a Resource, enabling -contextual correlation of data from the same source. For Example, if I see -a high latency in a span I should be able to check the metrics for the -same entity that produced that Span during the time when the latency was -observed. +contextual correlation of data from the same source. For example, if I see +a high latency in a span I need to check the metrics for the same entity that +produced that Span during the time when the latency was observed. ## Specifications From c46fa82cd63dc2606c65c8ed07a3357391d3d5e9 Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Fri, 4 Apr 2025 10:26:28 -0400 Subject: [PATCH 20/28] Another language nit cleaned up. --- specification/entities/data-model.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/specification/entities/data-model.md b/specification/entities/data-model.md index 65ac072e390..651eeb2808b 100644 --- a/specification/entities/data-model.md +++ b/specification/entities/data-model.md @@ -84,8 +84,7 @@ attributes are not part of entity's identity.

Follows any -value definition in the OpenTelemetry spec - it can be a scalar value, -byte array, an array or map of values. Arbitrary deep nesting of values +value definition in the OpenTelemetry spec. Arbitrary deep nesting of values for arrays and maps is allowed.

SHOULD follow OpenTelemetry Date: Fri, 4 Apr 2025 10:37:36 -0400 Subject: [PATCH 21/28] Add a better transition statement. --- specification/resource/data-model.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/specification/resource/data-model.md b/specification/resource/data-model.md index 53c7679ba84..9e485e881cb 100644 --- a/specification/resource/data-model.md +++ b/specification/resource/data-model.md @@ -67,6 +67,12 @@ one process from another. > load off the struggling process. If the only thing important to Resource was identity, we could simply use UUIDs. +However, this would rely on some other, easily accessible, system to provide +human-friendly understanding for these UUIDs. OpenTelemetry provides a model +where a full UUID-only solution could be chosen, but defaults to a *blended* +approach, where resource provides both Identity and Navigation. + +This leads to the next concept: Telescoping identity to the needs of a system. ### Telescoping From 9a8618225d0c4933723a567be162249a88d55bdd Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Fri, 4 Apr 2025 12:02:41 -0400 Subject: [PATCH 22/28] Regenerate toc. --- specification/entities/data-model.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/specification/entities/data-model.md b/specification/entities/data-model.md index 651eeb2808b..2a8a8c88384 100644 --- a/specification/entities/data-model.md +++ b/specification/entities/data-model.md @@ -7,8 +7,8 @@ -- [Minimally Sufficient Id](#minimally-sufficient-id) -- [Repeatable Id](#repeatable-id) +- [Minimally Sufficient Identity](#minimally-sufficient-identity) +- [Repeatable Identity](#repeatable-identity) - [Examples of Entities](#examples-of-entities) From 62a8848f3a68f8e43c5eed1b3b8b459a107e5bcc Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Fri, 4 Apr 2025 12:06:09 -0400 Subject: [PATCH 23/28] Fix last nit comment. 
--- specification/resource/data-model.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specification/resource/data-model.md b/specification/resource/data-model.md index 9e485e881cb..6280ec75bfb 100644 --- a/specification/resource/data-model.md +++ b/specification/resource/data-model.md @@ -49,7 +49,7 @@ raw attributes are different, then you can assume the resource is different. Implicit in the design of Resource and attributes is ensuring users are able to navigate their infrastructure, tools, UIs, etc. to find the *same* entity that telemetry is reporting against. For example, in the definition above, we see a -few components listed for one entity: +few entities listed for one Resource: - A process - A container From 14742d8f13ef2706807cad5d8f772cfee8dbd690 Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Tue, 8 Apr 2025 08:18:02 -0400 Subject: [PATCH 24/28] Update layout from feedback. --- specification/resource/README.md | 72 ++++++++++++++++++- specification/resource/data-model.md | 103 +++++++++++---------------- 2 files changed, 111 insertions(+), 64 deletions(-) diff --git a/specification/resource/README.md b/specification/resource/README.md index f2160215f7a..08862eccde1 100644 --- a/specification/resource/README.md +++ b/specification/resource/README.md @@ -12,6 +12,9 @@ path_base_for_github_subdir: - [Overview](#overview) + * [Identity](#identity) + * [Navigation](#navigation) + * [Telescoping](#telescoping) - [Specifications](#specifications) @@ -26,7 +29,74 @@ contextual correlation of data from the same source. For example, if I see a high latency in a span I need to check the metrics for the same entity that produced that Span during the time when the latency was observed. +Resource provides two important aspects for observability: + +- It MUST identify an entity that is producing telemetry. +- It SHOULD allow users to determine where that entity resides within their infrastructure. 
+ +### Identity + +Resource provides a natural way to understand "what" produced an effect and +evaluate other signals of that same source. This is done through attaching the +same set of identifying attributes on all telemetry produced in an +OpenTelemetry SDK. + +Resource identity provides a natural pivot point for observability signals, a +key type of correlation in OpenTelemetry. + +### Navigation + +Implicit in the design of Resource and attributes is ensuring users are able to +navigate their infrastructure, tools, UIs, etc. to find the *same* entity that +telemetry is reporting against. For example, in the definition above, we see a +few entities listed for one Resource: + +- A process +- A container +- A kubernetes pod name +- A namespace +- A deployment + +By including identifying attributes of each of these, we can help users navigate +through their `kubectl` or Kubernetes UIs to find the specific process +generating telemetry. This is as important as being able to uniquely identify +one process from another. + +> Aside: Observability signals SHOULD be actionable. Knowing a process is +> struggling is not as useful as being able to scale up a deployment to take +> load off the struggling process. + +If the only thing important to Resource was identity, we could simply use UUIDs. +However, this would rely on some other, easily accessible, system to provide +human-friendly understanding for these UUIDs. OpenTelemetry provides a model +where a full UUID-only solution could be chosen, but defaults to a *blended* +approach, where resource provides both Identity and Navigation. + +This leads to the next concept: Telescoping identity to the needs of a system. + +### Telescoping + +Within OpenTelemetry, we want to give users the flexibility to decide what +information needs to be sent *with* observability signals and what information +can be later joined. 
We call this "telescoping identity" where users can decide +how *small* or *large* the size of an OpenTelemetry resource will be on the wire +(and correspondingly, how large data points are when stored, depending on +storage solution). + +For example, in the extreme, OpenTelemetry could synthesize a UUID for every +system which produces telemetry. All identifying attributes for Resource and +Entity could be sent via a side channel with known relationships to this UUID. +While this would optimise the runtime generation and sending of telemetry, it +comes at the cost of downstream storage systems needing to join data back +together either at ingestion time or query time. For high performance use cases, +e.g. alerting, these joins can be expensive. + +In practice, users control Resource identity via the configuration of Resource +Detection within SDKs and the collector. Users wishing for minimal identity will +limit their resource detection just to a `service.instance.id`, for example. +Some users highly customize resource detection with many concepts being appended. + ## Specifications - [Data Model](./data-model.md) -- [Resource SDK](./sdk.md) +- [Resource SDK](./sdk.md) \ No newline at end of file diff --git a/specification/resource/data-model.md b/specification/resource/data-model.md index 6280ec75bfb..28f12e6e771 100644 --- a/specification/resource/data-model.md +++ b/specification/resource/data-model.md @@ -8,8 +8,6 @@ - [Identity](#identity) - * [Navigation](#navigation) - * [Telescoping](#telescoping) @@ -27,71 +25,50 @@ attributes" that have prescribed meanings. A resource is composed of 0 or more [`Entities`](../entities/README.md) and 0 or more attributes not associated with any entity. -Resource provides two important aspects for observability: - -- It MUST *identify* an entity that is producing telemetry. -- It SHOULD allow users to determine *where* that entity resides within their infrastructure. 
+The data model below defines a logical model for a Resource (irrespective of the physical format and encoding of how resource data is recorded). + + + + + + + + + + + + + + + + + +
Field + Type + Description +
Entities + set<Entity> + Defines the set of Entities associated with this resource. +

Entity is defined + here +

Attributes + map<string, standard attribute value> + Additional Attributes that identify the resource. +

+MUST not change during the lifetime of the resource. +

+Follows OpenTelemetry Standard +attribute definition. +

## Identity -Most resources are a composition of `Entity`. `Entity` is described -[here](../entities/data-model.md), and includes its own notion of identity. -The identity of a resource is the set of entities contained within it. Two -resources are considered different if one contains an entity not found in the -other. +Most resources are a composition of [`Entity`](../entities/data-model.md). +Entity includes its own notion of identity. The identity of a resource is +the set of entities contained within it. Two resources are considered +different if one contains an entity not found in the other. Some resources include raw attributes in addition to Entities. Raw attributes are considered identifying on a resource. That is, if the key-value pairs of raw attributes are different, then you can assume the resource is different. - -### Navigation - -Implicit in the design of Resource and attributes is ensuring users are able to -navigate their infrastructure, tools, UIs, etc. to find the *same* entity that -telemetry is reporting against. For example, in the definition above, we see a -few entities listed for one Resource: - -- A process -- A container -- A kubernetes pod name -- A namespace -- A deployment - -By including identifying attributes of each of these, we can help users navigate -through their `kubectl` or Kubernetes UIs to find the specific process -generating telemetry. This is as important as being able to uniquely identify -one process from another. - -> Aside: Observability signals SHOULD be actionable. Knowing a process is -> struggling is not as useful as being able to scale up a deployment to take -> load off the struggling process. - -If the only thing important to Resource was identity, we could simply use UUIDs. -However, this would rely on some other, easily accessible, system to provide -human-friendly understanding for these UUIDs. 
OpenTelemetry provides a model -where a full UUID-only solution could be chosen, but defaults to a *blended* -approach, where resource provides both Identity and Navigation. - -This leads to the next concept: Telescoping identity to the needs of a system. - -### Telescoping - -Within OpenTelemetry, we want to give users the flexibility to decide what -information needs to be sent *with* observability signals and what information -can be later joined. We call this "telescoping identity" where users can decide -how *small* or *large* the size of an OpenTelemetry resource will be on the wire -(and correspondingly, how large data points are when stored, depending on -storage solution). - -For example, in the extreme, OpenTelemery could synthesize a UUID for every -system which produces telemetry. All identifying attributes for Resource and -Entity could be sent via a side channel with known relationships to this UUID. -While this would optimise the runtime generation and sending of telemetry, it -comes at the cost of downstream storage systems needing to join data back -together either at ingestion time or query time. For high performance use cases, -e.g. alerting, these joins can be expensive. - -In practice, users control Resource identity via the configuration of Resource -Detection within SDKs and the collector. Users wishing for minimal identity will -limit their resource detection just to a `service.instance.id`, for example. -Some users highly customize resource detection with many concepts being appended. From 4422a24c8eb1e3363cee9664b2442dbc0c5777ee Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Tue, 8 Apr 2025 08:32:28 -0400 Subject: [PATCH 25/28] Fix lint issue. 
--- specification/resource/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specification/resource/README.md b/specification/resource/README.md index 08862eccde1..33f39a267d5 100644 --- a/specification/resource/README.md +++ b/specification/resource/README.md @@ -99,4 +99,4 @@ Some users highly customize resource detection with many concepts being appended ## Specifications - [Data Model](./data-model.md) -- [Resource SDK](./sdk.md) \ No newline at end of file +- [Resource SDK](./sdk.md) From 316d170730ba7b3db7f6df6080e296a52397b45e Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Tue, 8 Apr 2025 20:03:40 -0400 Subject: [PATCH 26/28] Remove references to Resource being immutable on the data model. --- specification/entities/README.md | 3 ++- specification/resource/README.md | 2 +- specification/resource/data-model.md | 4 ++-- 3 files changed, 5 insertions(+), 4 deletions(-) diff --git a/specification/entities/README.md b/specification/entities/README.md index 2ab56826667..be0fc3b2a7c 100644 --- a/specification/entities/README.md +++ b/specification/entities/README.md @@ -20,7 +20,8 @@ path_base_for_github_subdir: ## Overview -Entity represents an object of interest associated with produced telemetry: traces, metrics, logs, profiles etc. +Entity represents an object of interest associated with produced telemetry: +traces, metrics, logs, profiles etc. ## Specifications diff --git a/specification/resource/README.md b/specification/resource/README.md index 33f39a267d5..df1de1fc7c0 100644 --- a/specification/resource/README.md +++ b/specification/resource/README.md @@ -23,7 +23,7 @@ path_base_for_github_subdir: ## Overview -A Resource is an immutable representation of the entity producing telemetry. +A Resource is a representation of the entity producing telemetry. Within OpenTelemetry, all signals are associated with a Resource, enabling contextual correlation of data from the same source. 
For example, if I see a high latency in a span I need to check the metrics for the same entity that diff --git a/specification/resource/data-model.md b/specification/resource/data-model.md index 28f12e6e771..dae35033d0d 100644 --- a/specification/resource/data-model.md +++ b/specification/resource/data-model.md @@ -13,8 +13,8 @@

-A Resource is an immutable representation of the entity producing telemetry as -Attributes. For example, You could have a process producing telemetry that is +A Resource is a representation of the entity producing telemetry as Attributes. +For example, you could have a process producing telemetry that is running in a container on Kubernetes, which is associated to a Pod running on a Node that is a VM but also is in a namespace and possibly is part of a Deployment. Resource could have attributes to denote information about the From b39bd07ad1f2a6cf995ca16a73c3828193af8c98 Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Wed, 9 Apr 2025 14:23:33 -0400 Subject: [PATCH 27/28] Fix up bad reference. --- specification/resource/README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/specification/resource/README.md b/specification/resource/README.md index df1de1fc7c0..84f552eb770 100644 --- a/specification/resource/README.md +++ b/specification/resource/README.md @@ -48,8 +48,8 @@ key type of correlation in OpenTelemetry. Implicit in the design of Resource and attributes is ensuring users are able to navigate their infrastructure, tools, UIs, etc. to find the *same* entity that -telemetry is reporting against. For example, in the definition above, we see a -few entities listed for one Resource: +telemetry is reporting against. 
For example, in practice we cloud see Resource +including more than one entity, like: - A process - A container From 67e7e5d51a75e216ccc8625d3903a68e329ffd21 Mon Sep 17 00:00:00 2001 From: Josh Suereth Date: Wed, 9 Apr 2025 20:29:15 -0400 Subject: [PATCH 28/28] Update specification/resource/README.md Co-authored-by: Christophe Kamphaus --- specification/resource/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specification/resource/README.md b/specification/resource/README.md index 84f552eb770..2a2b8cce7d1 100644 --- a/specification/resource/README.md +++ b/specification/resource/README.md @@ -48,7 +48,7 @@ key type of correlation in OpenTelemetry. Implicit in the design of Resource and attributes is ensuring users are able to navigate their infrastructure, tools, UIs, etc. to find the *same* entity that -telemetry is reporting against. For example, in practice we cloud see Resource +telemetry is reporting against. For example, in practice we could see Resource including more than one entity, like: - A process