diff --git a/CHANGELOG.md b/CHANGELOG.md index f4eaf6997a3..75c9b5f18e1 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -48,6 +48,9 @@ release. ### Entities +- Define merge algorithm. + ([4768](https://github.com/open-telemetry/opentelemetry-specification/pull/4768)) + ### Common ### OpenTelemetry Protocol diff --git a/specification/entities/data-model.md b/specification/entities/data-model.md index da23026517e..9bbb4e95be0 100644 --- a/specification/entities/data-model.md +++ b/specification/entities/data-model.md @@ -18,6 +18,7 @@ weight: 2 - [Resource and Entities](#resource-and-entities) * [Attribute Referencing Model](#attribute-referencing-model) * [Placement of Shared Descriptive Attributes](#placement-of-shared-descriptive-attributes) +- [Merging of Entities](#merging-of-entities) - [Examples of Entities](#examples-of-entities) @@ -151,6 +152,44 @@ different values, then **only** the `k8s.node` entity can reference this key Other entities (e.g., `k8s.cluster`) can report this attribute in a separate telemetry channel (e.g., entity events) where full ownership context is known. +## Merging of Entities + +Entities MAY be merged if and only if their types are the same, their +identity attributes are exactly the same AND their schema_url is the same. +This means both Entities MUST have the same identity attribute keys and +for each key, the values of the key MUST be the same. + +Here's an example algorithm that will check compatibility: + +``` +can_merge(current_entity, new_entity) { + current_entity.type == new_entity.type && + current_entity.schema_url == new_entity.schema_url && + has_same_attributes(current_entity.identity, new_entity.identity) +} +``` + +When merging entities, all attributes in description are merged together, with +one entity acting as "primary" where any conflicting attribute values will be +chosen from the "primary" entity. 
+ +Here's an example algorithm that will merge: + +``` +merge(current_entity, new_entity) { + if can_merge(current_entity, new_entity) { + for attribute in new_entity.description { + // New entity descriptions take precedence. + current_entity.description.insert(attribute) + } + } +} +``` + +Note: If Entities have different `schema_url`s, they SHOULD be converted to the +same schema version (if possible) before attempting a merge. The merge algorithm +defined here assumes the entities are already at the same schema version. + ## Examples of Entities _This section is non-normative and is present only for the purposes of diff --git a/specification/resource/data-model.md b/specification/resource/data-model.md index 5fb5bebfec4..f77bafcef5c 100644 --- a/specification/resource/data-model.md +++ b/specification/resource/data-model.md @@ -13,6 +13,8 @@ weight: 2 - [Identity](#identity) +- [Merging Resources](#merging-resources) + * [Merging Entities into a Resource](#merging-entities-into-a-resource) @@ -49,3 +51,166 @@ different if one contains an entity not found in the other. Some resources include raw attributes in addition to Entities. Raw attributes are considered identifying on a resource. That is, if the key-value pairs of raw attributes are different, then you can assume the resource is different. + +## Merging Resources + +Note: The current SDK specification outlines a [merge algorithm](sdk.md#merge). +This specification updates the algorithm to be compliant with entities. This +section will replace that section upon stabilization of entities. SDKs SHOULD +NOT update their merge algorithm until full Entity SDK support is provided. + +Merging resources is an action of joining together the context of observation. +That is, we can look at the resource context for a signal and *expand* that +context to include more details (see +[telescoping identity](README.md#telescoping)). 
As such, a merge SHOULD preserve +any identity that already existed on a Resource while adding in new identifying +information or descriptive attributes. + +### Merging Entities into a Resource + +We define the following algorithm for merging entities into an existing +resource. + +- Construct a set of existing entities on the resource, `E`. + - For each entity, `new_entity`, in priority order (highest first), + do one of the following: + - If an entity `e` exists in `E` with the same entity type as `new_entity`: + - Perform an [Entity DataModel Merge](../entities/data-model.md#merging-of-entities) with `e` and `new_entity`. + - Note: If unable to merge `e` and `new_entity`, then no change is made. + - Otherwise, add the entity `new_entity` to set `E`. +- Update the Resource to use the set of entities `E`. + - If all entities within `E` have the same `schema_url`, set the + Resource's `schema_url` to match. + - Otherwise, set the Resource's `schema_url` to blank. + - Remove any attribute from `Attributes` which exists in either the + description or identity of an entity in `E`. +- Solve for resource flattening issues (see + [Attribute Referencing Model](../entities/data-model.md#attribute-referencing-model)). + - If, for all entities, there are no overlapping attribute keys, then nothing + is needed. + - If there is a conflict where two entities use the same attribute key, then + remove the lower priority entity from the Resource. + +**Note**: Priority of entity merging is generally chosen implicitly by user +configuration, e.g. the order of Resource Detectors configured for an SDK +implicitly creates an order of priority for merging entities. + +#### Examples + +*These examples demonstrate how conflicts are resolved during a merge.* + +##### Example 1: Entity replaces loose attribute + +This example shows a conflict between loose attributes and attributes that belong to an entity. Here, when the entity is added, it removes the previous loose attributes that share its keys.
+ +**Initial Resource:** + +- Entities: *None* +- Attributes: + - `host.name`: `"old-name"` + - `env`: `"prod"` + +**Entities to Merge (by priority):** + +1. `host` + - type: `"host"` + - identity: + - `host.id`: `"H1"` + - description: + - `host.name`: `"new-name"` +2. `service` + - type: `"service"` + - identity: + - `service.name`: `"my-svc"` + +**Resulting Resource:** + +- Entities: + - `host` + - type: `"host"` + - identity: + - `host.id`: `"H1"` + - description: + - `host.name`: `"new-name"` + - `service` + - type: `"service"` + - identity: + - `service.name`: `"my-svc"` +- Attributes: + - `env`: `"prod"` + +##### Example 2: Loose attribute replaces entity attribute + +This example shows a conflict between loose attributes and attributes that belong to an entity. Here, when the loose attribute is added, the entity must be removed because of the conflict. + +**Initial Resource:** + +- Entities: + - `host` + - type: `"host"` + - identity: + - `host.id`: `"H1"` + - description: + - `host.name`: `"detected-name"` + - `process` + - type: `"process"` + - identity: + - `process.pid`: `12345` +- Attributes: *None* + +**Resource to Merge:** + +- Entities: *None* +- Attributes: + - `host.id`: `"h2"` + - `env`: `"prod"` + +**Resulting Resource:** + +- Entities: + - `process` + - type: `"process"` + - identity: + - `process.pid`: `12345` +- Attributes: + - `host.id`: `"h2"` + - `env`: `"prod"` + +##### Example 3: Identity & Attribute Conflicts + +This example rejects an entity of the same type with a different identity, and drops a lower priority entity because of an attribute key conflict. + +**Initial Resource:** + +- Entities: + - `host` + - type: `"host"` + - identity: + - `host.id`: `"H1"` + - description: + - `env`: `"prod"` +- Attributes: *None* + +**Entities to Merge (by priority):** + +1. `host` + - type: `"host"` + - identity: + - `host.id`: `"H2"` +2. 
`service` + - type: `"service"` + - identity: + - `service.name`: `"S1"` + - description: + - `env`: `"dev"` + +**Resulting Resource:** + +- Entities: + - `host` + - type: `"host"` + - identity: + - `host.id`: `"H1"` + - description: + - `env`: `"prod"` +- Attributes: *None*
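
The merge steps added in this diff can be sketched in Python as follows. This is a non-normative illustration and not part of the specification. The `Entity` and `Resource` classes, the function names, and the in-place mutation are assumptions made for the sketch. It also assumes that entities already on the Resource rank above newly merged ones during conflict resolution, which is inferred from Example 3 rather than stated by the algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    # Hypothetical in-memory shape; the spec does not define these classes.
    type: str
    identity: dict
    description: dict = field(default_factory=dict)
    schema_url: str = ""

    def keys(self):
        """All attribute keys this entity references (identity and description)."""
        return set(self.identity) | set(self.description)


@dataclass
class Resource:
    entities: list = field(default_factory=list)    # kept in priority order
    attributes: dict = field(default_factory=dict)  # loose (raw) attributes
    schema_url: str = ""


def can_merge(current, new):
    """Same type, same schema_url, and exactly the same identity attributes."""
    return (current.type == new.type
            and current.schema_url == new.schema_url
            and current.identity == new.identity)


def merge_entity(current, new):
    """Merge descriptions in place; values from `new` win on conflict."""
    if can_merge(current, new):
        current.description.update(new.description)


def merge_entities_into_resource(resource, new_entities):
    """Merge `new_entities` (highest priority first) into `resource` in place."""
    # Step 1: build the set E, starting from the existing entities.
    entities = list(resource.entities)
    for new in new_entities:
        existing = next((e for e in entities if e.type == new.type), None)
        if existing is None:
            entities.append(new)
        else:
            merge_entity(existing, new)  # no change if the merge is not allowed

    # Step 2: update schema_url and drop loose attributes owned by an entity.
    # Assumption: with no entities at all, the schema_url is also set to blank.
    urls = {e.schema_url for e in entities}
    resource.schema_url = urls.pop() if len(urls) == 1 else ""
    owned = set().union(*(e.keys() for e in entities))
    resource.attributes = {k: v for k, v in resource.attributes.items()
                           if k not in owned}

    # Step 3: on an attribute key conflict, drop the lower priority entity.
    kept, seen = [], set()
    for entity in entities:
        if seen.isdisjoint(entity.keys()):
            kept.append(entity)
            seen |= entity.keys()
    resource.entities = kept
    return resource


# Example 1 from the diff: the `host` entity replaces the loose `host.name`.
example1 = merge_entities_into_resource(
    Resource(attributes={"host.name": "old-name", "env": "prod"}),
    [Entity("host", {"host.id": "H1"}, {"host.name": "new-name"}),
     Entity("service", {"service.name": "my-svc"})],
)

# Example 3 from the diff: `host` H2 is rejected and `service` is dropped.
example3 = merge_entities_into_resource(
    Resource(entities=[Entity("host", {"host.id": "H1"}, {"env": "prod"})]),
    [Entity("host", {"host.id": "H2"}),
     Entity("service", {"service.name": "S1"}, {"env": "dev"})],
)
```

The sketch follows the step order of the algorithm literally, so the `schema_url` and loose attributes are updated before conflicting entities are dropped. Example 2, which merges loose attributes from another Resource, is not covered because the algorithm in this diff only describes merging entities.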