Skip to content

stats: add ability to view histogram buckets to the /stats endpoint#19586

Merged
ggreenway merged 43 commits intoenvoyproxy:mainfrom
VillePihlava:histogram-buckets-stats-endpoint
Mar 2, 2022
Merged

stats: add ability to view histogram buckets to the /stats endpoint#19586
ggreenway merged 43 commits intoenvoyproxy:mainfrom
VillePihlava:histogram-buckets-stats-endpoint

Conversation

@VillePihlava
Copy link
Contributor

@VillePihlava VillePihlava commented Jan 18, 2022

Commit Message:

Add ability to view histogram buckets to the /stats endpoint through the histogram_buckets query parameter.

Additional Description:

Using /stats?histogram_buckets=cumulative or /stats?histogram_buckets=disjoint will change the output of histograms to a bucket summary with cumulative or disjoint buckets. Can be used with format=json, usedonly, or filter.

Risk Level: Low
Testing: Unit testing, manual testing
Docs Changes: Added documentation about histogram_buckets query parameter.
Release Notes: Added
Platform Specific Features: N/A
Fixes #19378

Signed-off-by: Ville Pihlava <ville.pihlava@intel.com>
@lizan
Copy link
Member

lizan commented Jan 19, 2022

/assign-from @envoyproxy/first-pass-reviewers

@repokitteh-read-only
Copy link

@envoyproxy/first-pass-reviewers assignee is @mathetake

🐱

Caused by: a #19586 (comment) was created by @lizan.

see: more, trace.

@jmarantz jmarantz self-assigned this Jan 19, 2022
Full-string matching can be specified with begin- and end-line anchors. (i.e.
``/stats?filter=^server.concurrency$``)

.. http:get:: /stats?histogram_buckets
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a draft PR #19546 which will hopefully be ready soon, where I introduce a 'type=' which makes me wonder about exactly how best to control this.

First -- is there any reason not to always be in histogram_buckets mode? Is there any reason you'd want to see the current mode?

The new PR introduces a '&type=' query-param which has possible values (Counters, Gauges, Histograms, TextReadouts, or All). I'm unsure whether this would best fit as a new category, or as a boolean property which only makes sense if you are displaying histograms.

Also, what are your thoughts about how this parameter works with format=json or format=prometheus? Or does this only make sense for text?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First -- is there any reason not to always be in histogram_buckets mode? Is there any reason you'd want to see the current mode?

I don't have enough experience on this to give a good opinion, but I didn't find a use for the current mode.

The new PR introduces a '&type=' query-param which has possible values (Counters, Gauges, Histograms, TextReadouts, or All). I'm unsure whether this would best fit as a new category, or as a boolean property which only makes sense if you are displaying histograms.

I think a boolean property would suit this best.

Also, what are your thoughts about how this parameter works with format=json or format=prometheus? Or does this only make sense for text?

From what I've understood, the current way format=prometheus displays histogram buckets cumulatively is the correct way for prometheus, so it seems like changing this would be unnecessary.

Using format=json outputs the same quantile summary data available from the plain text endpoint for histograms. If the quantile summary is replaced by the histogram_buckets output in the plain text endpoint, I think it should be replaced here too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So...can you add a test for the JSON mode as well then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll work on a JSON mode next. Also something that came to mind is that should there be a query parameter or an option of seeing the original overlapping (or cumulative) histogram buckets? It should be easy to implement if there is any use for it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to have a query-param for the new mode. The absence of that query-params means you want the old mode. Am I missing something?

Copy link
Contributor

@jmarantz jmarantz Jan 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see -- there are really 3 choices. Can you just have one query-param with 3 options then (default value being 'none'), rather than bools? I think that'll look better in the UI I've got pending -- see image in description of #18670

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, one query-param with 2 options sounds much better. I think something like this would work, although there might be a better word for nonoverlapping:
/stats?histogram_buckets=cumulative
/stats?histogram_buckets=nonoverlapping

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh...a better word for nonoverlapping might be "disjoint"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I'll start using this

previous_computed_interval_bucket = current_computed_interval_bucket;
previous_computed_cumulative_bucket = current_computed_cumulative_bucket;
}
return absl::StrJoin(bucket_summary, " ");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we should have this API return the array, and have a separate layer that joins it as strings.

For example, if I have a richer HTML display of histograms I might like to be able to render these values graphically.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moreover, I think it'd be great to return a structured result that's an array of triples, and do the formatting in the stats handler (e.g. json-population vs emitting strings manually)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll start looking into this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added nonoverlappingComputedBuckets() to HistogramStatistics which can be used from the ParentHistogram. It returns an array with the desired values as output. Does this accomplish what you meant? I changed the nonoverlappingBucketSummary() to use the new method. Should I also move the summary to the stats handler? Originally I wrote it here because it is similar to the existing bucketSummary().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is better but I still think the formatting of this could be removed from the histogram and go into the stats handler class.

E.g. one thing I was thinking (in a follow-up after an in-progress PR) is to have HTML-mode actually render the histograms graphically. So the more we do via structured API, and the less we commit to string representations inside the Histogram class, the more flexible we are about how to present the data in the stats handler.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved computeDisjointBucketSummary() to the stats handler.

@mathetake mathetake removed their assignment Jan 19, 2022
@jmarantz
Copy link
Contributor

/wait

Signed-off-by: Ville Pihlava <ville.pihlava@intel.com>
Full-string matching can be specified with begin- and end-line anchors. (i.e.
``/stats?filter=^server.concurrency$``)

.. http:get:: /stats?histogram_buckets
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So...can you add a test for the JSON mode as well then?

* Returns nonoverlapping version of computedBuckets(). This vector is
* guaranteed to be the same length as supportedBuckets().
*/
virtual const std::vector<uint64_t> nonoverlappingComputedBuckets() const PURE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 nits :)

  1. https://clang.llvm.org/extra/clang-tidy/checks/readability-avoid-const-params-in-decls.html -- you can just return the vector without making it const in the declaration -- that's a no-op
  2. rename new function to computeNonOverlappingBuckets(). The function is doing the computation and returning the result, not returning an already-existing result, so I think it's better to start the function name with the verb.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry that was the wrong clang link; it's this one that's applicable: https://clang.llvm.org/extra/clang-tidy/checks/readability-const-return-type.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

/**
* Returns the bucket summary representation with nonoverlapping buckets.
*/
virtual const std::string nonoverlappingBucketSummary() const PURE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

computeNonOverlappingBucketSummary()

remove the const prefix -- also do this for bucketSummary() which shouldn't have the const prefix either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Removed the const prefix from quantileSummary() too

Signed-off-by: Ville Pihlava <ville.pihlava@intel.com>
Copy link
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is looking much better - -still a few comments remain.

Thanks!

std::vector<uint64_t> buckets;
buckets.reserve(computed_buckets_.size());
uint64_t previous_computed_bucket = 0;
for (size_t i = 0; i < computed_buckets_.size(); ++i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit; for (uint64_t computed_bucket : computed_buckets_)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

interval_statistics_.computeDisjointBuckets();
const std::vector<uint64_t> disjoint_cumulative_buckets =
cumulative_statistics_.computeDisjointBuckets();
bucket_summary.reserve(supported_buckets.size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Paranoia nit: ASSERT here that the 3 array lengths are the same, and make the loop go to their min value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created ASSERT and looped to min value

previous_computed_interval_bucket = current_computed_interval_bucket;
previous_computed_cumulative_bucket = current_computed_cumulative_bucket;
}
return absl::StrJoin(bucket_summary, " ");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is better but I still think the formatting of this could be removed from the histogram and go into the stats handler class.

E.g. one thing I was thinking (in a follow-up after an in-progress PR) is to have HTML-mode actually render the histograms graphically. So the more we do via structured API, and the less we commit to string representations inside the Histogram class, the more flexible we are about how to present the data in the stats handler.

@jmarantz
Copy link
Contributor

/wait

Signed-off-by: Ville Pihlava <ville.pihlava@intel.com>
…ectors.

Signed-off-by: Ville Pihlava <ville.pihlava@intel.com>
@jmarantz
Copy link
Contributor

jmarantz commented Feb 4, 2022

@VillePihlava are you going to push this forward any further? I felt like we were converging.

…reate tests, and other small changes.

Signed-off-by: Ville Pihlava <ville.pihlava@intel.com>
Signed-off-by: Ville Pihlava <ville.pihlava@intel.com>
Signed-off-by: Ville Pihlava <ville.pihlava@intel.com>
…uckets-stats-endpoint

Signed-off-by: Ville Pihlava <ville.pihlava@intel.com>
Signed-off-by: Ville Pihlava <ville.pihlava@intel.com>
Signed-off-by: Ville Pihlava <ville.pihlava@intel.com>
@VillePihlava
Copy link
Contributor Author

@jmarantz Sorry for the delay! Had to work on something else for a while. I'll start looking into the JSON output next.

@jmarantz
Copy link
Contributor

jmarantz commented Feb 7, 2022

/wait

@VillePihlava
Copy link
Contributor Author

@jmarantz ValueUtil::numberValue converts all uint64_t values to doubles. I think this is unnecessary for histogram values, but it was also previously in use with counters and gauges (which according to my understanding are also of type uint64_t).

for (const auto& stat : all_stats) {
ProtobufWkt::Struct stat_obj;
auto* stat_obj_fields = stat_obj.mutable_fields();
(*stat_obj_fields)["name"] = ValueUtil::stringValue(stat.first);
(*stat_obj_fields)["value"] = ValueUtil::numberValue(stat.second);
stats_array.push_back(ValueUtil::structValue(stat_obj));
}

I'm pretty new to using anything related to protocol buffers, so it would be nice to know if this is ok. I asked this in a previous comment but wasn't sure if it was already answered.

@jmarantz
Copy link
Contributor

jmarantz commented Feb 22, 2022

Yeah I think the whole JSON serialization machinery is really heavyweight and needs a re-think. But this seems orthogonal to your PR assuming you are not changing it. Maybe another follow-up to avoid pointless conversions and protobuf overhead would be warranted?

For this PR my suggestion is to add comments into the code for your observation and maybe open an issue.

Signed-off-by: Ville Pihlava <ville.pihlava@intel.com>
@VillePihlava
Copy link
Contributor Author

Thanks for the reply! I added some comments and will open up an issue.

@VillePihlava
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Check envoy-presubmit didn't fail.

🐱

Caused by: a #19586 (comment) was created by @VillePihlava.

see: more, trace.

Signed-off-by: Ville Pihlava <ville.pihlava@intel.com>
@VillePihlava
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Check envoy-presubmit didn't fail.

🐱

Caused by: a #19586 (comment) was created by @VillePihlava.

see: more, trace.

@VillePihlava
Copy link
Contributor Author

Found that an issue exists already, so I won't be creating a duplicate: #10411

Signed-off-by: Ville Pihlava <ville.pihlava@intel.com>
@jmarantz
Copy link
Contributor

Needs main merge

@jmarantz
Copy link
Contributor

@ggreenway it probably makes sense for you to be the senior maintainer for this one as @snowp is not available atm.

Signed-off-by: Ville Pihlava <ville.pihlava@intel.com>
Signed-off-by: Ville Pihlava <ville.pihlava@intel.com>
@VillePihlava
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #19586 (comment) was created by @VillePihlava.

see: more, trace.

@rojkov
Copy link
Member

rojkov commented Mar 1, 2022

@ggreenway gentle ping

Copy link
Member

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great!

/wait

Signed-off-by: Ville Pihlava <ville.pihlava@intel.com>
Signed-off-by: Ville Pihlava <ville.pihlava@intel.com>
@ggreenway ggreenway merged commit 848cfe7 into envoyproxy:main Mar 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add detail about histogram buckets to the /stats endpoint

7 participants