-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding Metrics family config #2121
Conversation
Metric family metadata consists of information relevant to all metrics with the same name (but with different labels). For example, name, description, and aggregate labels. The metric family metadata is found in two data structures: the _value_map that holds all registered metrics and _metadata that contains only reported metrics. This patch makes the _metadata hold a reference to the data in the _value_map instead of copying it, which serves two purposes: 1. Minor performance gain (memory and computation). 2. It will now be possible for a change in _value_map to be reflected in _metadata. Signed-off-by: Amnon Heiman <[email protected]>
8881bdc
to
c8dff12
Compare
@nyh can you take a look? |
Can you please explain what does the value of "aggregate_label" do? What does it mean for it to be "lb" or "ll","aa"? Also, please explain if these aggregations sum all the values for the same metric - including values for different shards (which are summed to get a per-node value), or does it happen on a per-shard basis and we are still left with shard labels even after the aggregation? |
include/seastar/core/metrics_api.hh
Outdated
@@ -333,7 +333,7 @@ using metric_metadata_fifo = std::deque<metric_info>; | |||
* each of the metric. | |||
*/ | |||
struct metric_family_metadata { | |||
metric_family_info mf; | |||
metric_family_info& mf; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment saying that this holds a pointer into _value_map.
It's important to know what is the lifetime of this object.
* \brief metrics_family_config allow changing metrics family configuration | ||
* | ||
* Allow changing metrics family configuration | ||
* Supports changing the aggregation labels. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what this sentence is saying - "changing the aggregation labels"? Are you changing labels? Or the aggregation of values with different labels?
* | ||
* name - optional exact metric name | ||
* regex_name - if set, all the metrics name that match the regular expression | ||
* aggregate_labels - The labels to aggregate the metrics by. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't understand (see my comment on the commit message too) what aggregate_labels="x","y" means:
Does it mean you aggregate (sum?) the value with label x and label y and... where do you put the sum?
Or maybe it means that for each of x and y, you aggregate many different values that might have different other labels (does this include different shards?) and the result is one value for x and one value for y?
Or does it means something else?
Please explain this more clearly, because I don't have any intuitive grasp of what the concepts of "metric families" or "label aggregation" means.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After reading your test in the last patch, I think I understand better: It turns out labels have a name and a value - it's not that we have a metric with a label x - we can have one metric with the label "x=dog" and another one with "x=cat". So apparently aggregation will sum these two metrics and create one metric without (?) the label x at all. I still don't understand, though, what happens with other labels: What if we have the following four metrics:
some_metric[x=dog, z=hello] = 1
some_metric[x=cat, z=hello] = 2
some_metric[x=dog, z=hi] = 3
some_metric[x=cat, z=yo] = 4
How do we aggregate it over "x"? Do we sum all of those metrics and drop all labels? Or will we have three metrics for the three different "z" label values (i.e., just the first two metrics are summed)? This makes sense, but is it documented somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, just for context, we already support aggregate by label, but currently, you have to configure it when you register the metrics, and it cannot be changed.
Aggregation works by dropping one or more labels and summing over all metrics that now have the same name and label.
Using your example, let's drop x
some_metric[ z=hello] = 1
some_metric[ z=hello] = 2
some_metric[ z=hi] = 3
some_metric[ z=yo] = 4
Now we sum and get:
some_metric[ z=hello] = 3
some_metric[ z=hi] = 3
some_metric[ z=yo] = 4
This is not new, just it's run-time configuration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I must have been sleeping under a rock when you added it.
That being said, all of this should be documented somewhere. I also found the names of the fields and their explanations to be confusing. It says "aggregate_labels - The labels to aggregate the metrics by.". In the above example, we are not aggregating "by x" - in fact in some sense it's the opposite - we're dropping x, and aggregating the metrics by the other labels.
include/seastar/core/metrics_api.hh
Outdated
|
||
/* | ||
* \brief return the current metrics_family_config | ||
* This function returns a vector of the current metrics relabel configs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy-pasto mentions "relabel".
@@ -130,7 +130,20 @@ future<metric_relabeling_result> set_relabel_configs(const std::vector<relabel_c | |||
const std::vector<relabel_config>& get_relabel_configs() { | |||
return impl::get_local_impl()->get_relabel_configs(); | |||
} | |||
void impl::impl::update_aggregate(metric_family_info& mf) const noexcept { | |||
for (const auto& fc : _metrics_family_configs) { | |||
if (fc.name == mf.name || fc.regex_name.match(mf.name)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you need to check regex_name.empty() here? Or match() is correct (and efficient enough) in the empty case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assumed that it was efficient enough and the update_aggregate is a rare operation
I saw you already figured things out, but I'll add a context. The main usage is dropping the shard label and reporting a metric per node, but it can also be used to aggregate over any other label (more info on the Prometheus enhancements #1120 ). While we can modify configuration that is per metric instance (e.g. name and labels) using the metrics_relabel, we don't have the option to change in run-time the aggregate labels, because this is a feature that belongs to the metric family. This is what the series is about. Changing in run-time the metric family configuration. |
All of this should be documented somewhere (not in a discussion in this PR). I see I figured out the right meaning of this stuff after I finished reviewing the entire series including the test, but I don't think users should read tests to guess what an API does. By the way, I see you have classes |
You are right about the documentation, for the user-facing part (the relabel and metrics family config) there's already an open issue around that |
About the naming, the plural is intended because it could be multiple metrics, multiple families |
c8dff12
to
de58ead
Compare
The plural of "metric family" is "metric families", not "metrics famly". |
Ok, then we can have this PR not solve this problem - just please say that and add a reference to that issue. |
You are right that it requires documentation, but it does not add functionality, it adds an option to modify the current functionality in run-time. I've opened #2140 |
This patch adds support for metric family config. A metric family config is a configuration that relates to all the metrics with the same name (but with different labels). The configuration would allow changing the labels aggregation for metrics. Besides the configuration class, which is similar to metrics_relabel_config, two enhancements were added to the relabel_config_regex: 1. empty() which check if it's empty or not. 2. match() that checks if the regex is nonempty and matches a string without returning the matched group. Signed-off-by: Amnon Heiman <[email protected]>
Family config is a configuration that relates to all metrics with the same name but with different labels values. set_metric_family_configs allows changing that configuration during run time. Specifically, change the label aggregation based on a metric name. The following is an example for setting the aggregate labels for the metric test_gauge_1 and all metrics matching the regex test_gauge1.*: std::vector<sm::metric_family_config> fc(2); fc[0].name = "test_gauge_1"; fc[0].aggregate_labels = { "lb" }; fc[1].regex_name = "test_gauge1.*"; fc[1].aggregate_labels = { "ll", "aa" }; sm::set_metric_family_configs(fc); Signed-off-by: Amnon Heiman <[email protected]> include/seastar/core/metrics_api.hh
This patch adds a test for the family config API. it register metrics and change their aggregate labels. It check that both metrics that create before and after the call to set_metric_family_configs works as expected. Signed-off-by: Amnon Heiman <[email protected]>
de58ead
to
9a80dce
Compare
This series adds metrics family config. Family config is a configuration that relates to all metrics with the same name but with different labels. set_metrics_family_configs allows changing that configuration during run time. Specifically, change the label aggregation based on a metric name. Using multiple independent labels on the same metric is a common practice. A known drawback is that this cartesian product makes it easy to discover in real-time that the number of metrics explodes. While metrics relabel config allows dropping some of those metrics, it could be useful to report an aggregated metric. The following is an example for setting the aggregate labels for the metric test_gauge_1 and all metrics matching the regex test_gauge1.* ``` std::vector<sm::metrics_family_config> fc(2); fc[0].name = "test_gauge_1"; fc[0].aggregate_labels = { "lb" }; fc[1].regex_name = "test_gauge1.*"; fc[1].aggregate_labels = { "ll", "aa" }; sm::set_metrics_family_configs(fc); ``` An alternative to scylladb#1966 Closes scylladb#2121 * github.com:scylladb/seastar: tests/unit/metrics_test.cc: Test metrics family config metrics: Adds metric_family_config support core/relabel_config.hh: support metric family config metrics_api.hh: Remove duplication of metric family metadata
This series adds metrics family config.
Family config is a configuration that relates to all metrics with the same name but with different labels.
set_metrics_family_configs allows changing that configuration during run time. Specifically, change the label aggregation based on a metric name.
Using multiple independent labels on the same metric is a common practice. A known drawback is that this cartesian product makes it easy to discover in real-time that the number of metrics explodes.
While metrics relabel config allows dropping some of those metrics, it could be useful to report an aggregated metric.
The following is an example for setting the aggregate labels for the metric test_gauge_1 and all metrics matching the regex test_gauge1.*
An alternative to #1966