Status | |
---|---|
Stability | beta: metrics |
Distributions | contrib, k8s |
Issues | |
Code Owners | @dmitryax |
The metrics transform processor can be used to rename metrics, and to add, rename or delete label keys and values. It can also be used to perform scaling and aggregations on metrics across labels or label values. The complete list of supported operations that can be applied to one or more metrics is provided in the table below.
ℹ️ This processor only supports renames/aggregations within a batch of metrics. It does not do any aggregation across batches, so it is not suitable for aggregating metrics from multiple sources (e.g. multiple nodes or clients).
Operation | Example (based on metric `system.cpu.usage`) |
---|---|
Rename metrics | Rename to `system.cpu.usage_time` |
Add labels | Add new label `identifier` with value `1` to all points |
Rename label keys | Rename label `state` to `cpu_state` |
Rename label values | For label `state`, rename value `idle` to `-` |
Delete data points | Delete all points where label `state` has value `idle` |
Toggle data type | Change from `int` data points to `double` data points |
Scale value | Multiply values by `1000` to convert from seconds to milliseconds |
Aggregate across label sets | Retain only the label `state`, average all points with the same value for this label |
Aggregate across label values | For label `state`, sum points where the value is `user` or `system` into `used = user + system` |
In addition to the above:

- Operations can be applied to one or more metrics using a `strict` or `regexp` filter
- The `action` property allows metrics to be:
  - Updated in-place (`update`)
  - Copied and updates applied to the copy (`insert`)
  - Combined into a newly inserted metric that is generated by combining all data points from the set of matching metrics into a single metric (`combine`); the original matching metrics are also removed
- When renaming metrics, capturing groups from the `regexp` filter will be expanded
- When adding or updating a label value, `{{version}}` will be replaced with this collector's version number
Configuration is specified through a list of transformations and operations. Transformations and operations will be applied to all metrics in order so that later transformations or operations may reference the result of previous transformations or operations.
processors:
  metricstransform:
    # transforms is a list of transformations with each element transforming a metric selected by metric name
    transforms:
      # SPECIFY WHICH METRIC(S) TO MATCH
      # include specifies the metric name used to determine which metric(s) to operate on
      - include: <metric_name>
        # match_type specifies whether the include name should be used as a strict match or regexp match, default = strict
        match_type: {strict, regexp}
        # experimental_match_labels specifies the label set against which the metric filter will work. If experimental_match_labels is specified, transforms will only be applied to those metrics which
        # have the provided metric label values. This works for both strict and regexp match_type. This is an experimental feature.
        experimental_match_labels: {<label1>: <label_value1>, <label2>: <label_value2>}
        # SPECIFY THE ACTION TO TAKE ON THE MATCHED METRIC(S)
        # action specifies if the operations (specified below) are performed on metrics in place (update), on an inserted clone (insert), or on a new combined metric (combine)
        action: {update, insert, combine}
        # SPECIFY HOW TO TRANSFORM THE METRIC GENERATED AS A RESULT OF APPLYING THE ABOVE ACTION
        # new_name specifies the updated name of the metric; if action is insert or combine, new_name is required
        new_name: <new_metric_name_inserted>
        # aggregation_type defines how combined data points will be aggregated; if action is combine, aggregation_type is required
        aggregation_type: {sum, mean, min, max, count, median}
        # submatch_case specifies the case that should be used when adding label values based on regexp submatches when performing a combine action; leave blank to use the submatch value as is
        submatch_case: {lower, upper}
        # operations contain a list of operations that will be performed on the resulting metric(s)
        operations:
          # action defines the type of operation that will be performed, see examples below for more details
          - action: {add_label, update_label, delete_label_value, toggle_scalar_data_type, experimental_scale_value, aggregate_labels, aggregate_label_values}
            # label specifies the label to operate on
            label: <label>
            # new_label specifies the updated name of the label; if action is add_label, new_label is required
            new_label: <new_label>
            # aggregated_values contains a list of label values that will be aggregated; if action is aggregate_label_values, aggregated_values is required
            aggregated_values: [values...]
            # new_value specifies the updated name of the label value; if action is add_label or aggregate_label_values, new_value is required
            new_value: <new_value>
            # label_value specifies the label value for which points should be deleted; if action is delete_label_value, label_value is required
            label_value: <label_value>
            # label_set contains a list of labels that will remain after aggregation; if action is aggregate_labels, label_set is required
            label_set: [labels...]
            # aggregation_type defines how data points will be aggregated; if action is aggregate_labels or aggregate_label_values, aggregation_type is required
            aggregation_type: {sum, mean, min, max, count, median}
            # experimental_scale specifies the scalar to apply to values. Scaling exponential histograms inherently involves some loss of accuracy.
            experimental_scale: <scalar>
            # value_actions contain a list of operations that will be performed on the selected label
            value_actions:
              # value specifies the value to operate on
              - value: <current_label_value>
                # new_value specifies the updated value
                new_value: <new_label_value>
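Because transformations are applied in order, a later transformation can match the name produced by an earlier one. The following is a minimal sketch of a complete collector configuration that illustrates this. The `hostmetrics` receiver, the `debug` exporter, and the pipeline wiring are assumptions chosen for illustration and are not described in this document; the metric name follows the examples used here and may differ from what a given receiver actually emits.

```yaml
receivers:
  # assumption: any metrics receiver could be used here
  hostmetrics:
    scrapers:
      cpu:

processors:
  metricstransform:
    transforms:
      # step 1: rename the metric in place
      - include: system.cpu.usage
        action: update
        new_name: system.cpu.usage_time
      # step 2: runs after step 1, so it matches the new name
      - include: system.cpu.usage_time
        action: update
        operations:
          # operations within a transform are also applied in order
          - action: update_label
            label: state
            new_label: cpu_state
          - action: add_label
            new_label: version
            new_value: opentelemetry collector {{version}}

exporters:
  # assumption: an exporter that prints telemetry for inspection
  debug:

service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      processors: [metricstransform]
      exporters: [debug]
```

If step 2 were listed before step 1, its strict `include` would not match anything, because `system.cpu.usage_time` would not exist yet at that point.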
# create host.cpu.utilization from host.cpu.usage
include: host.cpu.usage
action: insert
new_name: host.cpu.utilization
operations:
...
# create host.cpu.utilization from host.cpu.usage where we have metric label "container=my_container"
include: host.cpu.usage
action: insert
new_name: host.cpu.utilization
match_type: strict
experimental_match_labels: {"container": "my_container"}
operations:
...
# create host.cpu.utilization from host.cpu.usage where we have metric label pod with non-empty values
include: host.cpu.usage
action: insert
new_name: host.cpu.utilization
match_type: regexp
experimental_match_labels: {"pod": "(.|\\s)*\\S(.|\\s)*"}
operations:
...
# rename system.cpu.usage to system.cpu.usage_time
include: system.cpu.usage
action: update
new_name: system.cpu.usage_time
# rename all system.cpu metrics to system.processor.*.stat
# use a double dollar sign ($$) instead of a single $, because $ is treated as a special character
# wrap the group name/number with braces, as in $${1}
include: ^system\.cpu\.(.*)$$
match_type: regexp
action: update
new_name: system.processor.$${1}.stat
# for system.cpu.usage, add label `version` with value `opentelemetry collector vX.Y.Z` to all points
include: system.cpu.usage
action: update
operations:
  - action: add_label
    new_label: version
    new_value: opentelemetry collector {{version}}
# for all system metrics, add label `version` with value `opentelemetry collector vX.Y.Z` to all points
include: ^system\.
match_type: regexp
action: update
operations:
  - action: add_label
    new_label: version
    new_value: opentelemetry collector {{version}}
# for system.cpu.usage, rename the label state to cpu_state
include: system.cpu.usage
action: update
operations:
  - action: update_label
    label: state
    new_label: cpu_state
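# for all system.cpu metrics, rename the label state to cpu_state
include: ^system\.cpu\.
# match_type must be regexp here; with the default (strict) the pattern would be compared as a literal name
match_type: regexp
action: update
operations:
  - action: update_label
    label: state
    new_label: cpu_state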
# rename the label value slab_reclaimable to sreclaimable, slab_unreclaimable to sunreclaimable
include: system.memory.usage
action: update
operations:
  - action: update_label
    label: state
    value_actions:
      - value: slab_reclaimable
        new_value: sreclaimable
      - value: slab_unreclaimable
        new_value: sunreclaimable
# delete all data points where the label 'state' has the value 'idle'
include: system.cpu.usage
action: update
operations:
  - action: delete_label_value
    label: state
    label_value: idle
# toggle the datatype of cpu usage from int (the default) to double
include: system.cpu.usage
action: update
operations:
  - action: toggle_scalar_data_type
# scale CPU usage from seconds to milliseconds
include: system.cpu.usage
action: update
operations:
  - action: experimental_scale_value
    experimental_scale: 1000
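Scaling can also be combined with the `insert` action so that the original metric is kept and the scaled values are written to a copy. The following is a sketch under that assumption; the name `system.cpu.usage_ms` is hypothetical and chosen only for illustration.

```yaml
# keep system.cpu.usage in seconds and insert a copy expressed in milliseconds
include: system.cpu.usage
action: insert
# hypothetical name for the inserted copy; new_name is required for insert
new_name: system.cpu.usage_ms
operations:
  - action: experimental_scale_value
    experimental_scale: 1000
```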
# aggregate away all labels except `state` using summation
include: system.cpu.usage
action: update
operations:
  - action: aggregate_labels
    label_set: [ state ]
    aggregation_type: sum
NOTE: Only the `sum` aggregation function is supported for histogram and exponential histogram datatypes.
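For non-histogram metrics, other aggregation types can be used in the same way. As a sketch, the "Aggregate across label sets" row of the operations table (retain only `state` and average the points that share a value) might be written as follows:

```yaml
# retain only the `state` label and average all points that share the same state value
include: system.cpu.usage
action: update
operations:
  - action: aggregate_labels
    label_set: [ state ]
    aggregation_type: mean
```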
# aggregate data points with state label value slab_reclaimable & slab_unreclaimable using summation into slab
include: system.memory.usage
action: update
operations:
  - action: aggregate_label_values
    label: state
    aggregated_values: [ slab_reclaimable, slab_unreclaimable ]
    new_value: slab
    aggregation_type: sum
NOTE: Only the `sum` aggregation function is supported for histogram and exponential histogram datatypes.
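The same operation covers the "Aggregate across label values" row of the operations table. A sketch of that case, summing the `user` and `system` values of the `state` label into a single `used` value:

```yaml
# for label `state`, sum points with value user or system into used = user + system
include: system.cpu.usage
action: update
operations:
  - action: aggregate_label_values
    label: state
    aggregated_values: [ user, system ]
    new_value: used
    aggregation_type: sum
```

Points whose `state` value is not listed in `aggregated_values` are not part of this sum.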
# convert a set of metrics for each http_method into a single metric with an http_method label, i.e.
#
# Web Service (*)/Total Delete Requests     iis.requests{http_method=delete}
# Web Service (*)/Total Get Requests     >  iis.requests{http_method=get}
# Web Service (*)/Total Post Requests       iis.requests{http_method=post}
include: ^Web Service \(\*\)/Total (?P<http_method>.*) Requests$$
match_type: regexp
action: combine
new_name: iis.requests
submatch_case: lower
operations:
...
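The configuration reference states that `aggregation_type` is required when `action` is `combine`. A sketch of the combine transform with that field set explicitly is shown below; choosing `sum` is an assumption made for illustration, and the pattern uses the `$$` escaping described in the other regexp examples.

```yaml
# combine the per-method IIS request counters into iis.requests with an http_method label
include: ^Web Service \(\*\)/Total (?P<http_method>.*) Requests$$
match_type: regexp
action: combine
new_name: iis.requests
# assumption: sum is used here; see the configuration reference for the other aggregation types
aggregation_type: sum
# lower-case the captured submatch before using it as the label value
submatch_case: lower
```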
# Group metrics from a single ResourceMetrics and report them as multiple ResourceMetrics.
#
# ex: Consider pod and container metrics collected from Kubernetes. Both sets of metrics are recorded under one ResourceMetrics;
# applying this transformation will result in two separate ResourceMetrics with the corresponding resource labels in the resource headers
#
# use a double dollar sign ($$) instead of a single $, because $ is treated as a special character
- include: ^k8s\.pod\.(.*)$$
  match_type: regexp
  action: group
  group_resource_labels: {"resource.type": "k8s.pod", "source": "kubelet"}
- include: ^container\.(.*)$$
  match_type: regexp
  action: group
  group_resource_labels: {"resource.type": "container", "source": "kubelet"}
Metric Transform Processor vs. Attributes Processor for Metrics
Regarding metric support, these two processors have overlapping functionality. Both can make simple modifications to metric attribute key-value pairs. As a general rule, the attributes processor has more attribute-related functionality, while the metrics transform processor can do much more data manipulation. The attributes processor is preferred when only the shared functionality is needed, as it natively uses the official OpenTelemetry data model. However, if the metrics transform processor is already in use or its extra functionality is necessary, there's no need to migrate away from it.
Shared functionality
- Add attributes
- Update values of attributes
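As an illustration of the shared functionality, the sketch below adds the same key-value pair with each processor. The `attributes` block reflects the attributes processor's `actions` configuration as commonly documented and should be treated as an assumption here; unlike the metrics transform entry, it is not restricted to a single metric in this sketch.

```yaml
processors:
  # metrics transform processor: add label `identifier` with value 1 to all points of system.cpu.usage
  metricstransform:
    transforms:
      - include: system.cpu.usage
        action: update
        operations:
          - action: add_label
            new_label: identifier
            new_value: "1"
  # attributes processor (assumed configuration): insert the same key-value pair
  attributes:
    actions:
      - key: identifier
        value: "1"
        action: insert
```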
Attributes processor specific functionality
- delete
- hash
- extract
Metrics transform processor specific functionality
- Rename metrics
- Delete data points
- Toggle data type
- Scale value
- Aggregate across label sets
- Aggregate across label values