-
Notifications
You must be signed in to change notification settings - Fork 5.5k
stats: add ability to view histogram buckets to the /stats endpoint #19586
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 2 commits
96ca9f8
cc874ef
bdf901f
0b90dc3
cea16a5
1c5b193
d00801c
c010782
e0d0700
3a7e6c9
00c963c
2927e3d
acb92d5
d2dbe87
7fd1f16
37a57b0
fd8c4bd
c6e0bc0
1e3613d
7401db0
7e03e9b
ba359e4
ba6a6dc
5ba7520
dbf7b27
e392cad
7ec7a8f
1723109
7645c3a
5dbca5d
6e8d867
505aed2
741cc7b
85138ae
842d5ce
3ff6c45
47ca897
9d62d48
b18d0f3
be1b95b
c4369be
3691e2a
c296df9
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -69,6 +69,12 @@ class HistogramStatistics { | |
| */ | ||
| virtual const std::vector<uint64_t>& computedBuckets() const PURE; | ||
|
|
||
| /** | ||
| * Returns nonoverlapping version of computedBuckets(). This vector is | ||
| * guaranteed to be the same length as supportedBuckets(). | ||
| */ | ||
| virtual const std::vector<uint64_t> nonoverlappingComputedBuckets() const PURE; | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 2 nits :)
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. sorry that was the wrong clang link; it's this one that's applicable: https://clang.llvm.org/extra/clang-tidy/checks/readability-const-return-type.html
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed |
||
|
|
||
| /** | ||
| * Returns number of values during the period. This number may be an approximation | ||
| * of the number of samples in the histogram, it is not guaranteed that this will be | ||
|
|
@@ -161,6 +167,11 @@ class ParentHistogram : public Histogram { | |
| * Returns the bucket summary representation. | ||
| */ | ||
| virtual const std::string bucketSummary() const PURE; | ||
|
|
||
| /** | ||
| * Returns the bucket summary representation with nonoverlapping buckets. | ||
| */ | ||
| virtual const std::string nonoverlappingBucketSummary() const PURE; | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
remove the
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed. Removed the |
||
| }; | ||
|
|
||
| using ParentHistogramSharedPtr = RefcountPtr<ParentHistogram>; | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -31,6 +31,18 @@ const std::vector<double>& HistogramStatisticsImpl::supportedQuantiles() const { | |
| {0, 0.25, 0.5, 0.75, 0.90, 0.95, 0.99, 0.995, 0.999, 1}); | ||
| } | ||
|
|
||
| const std::vector<uint64_t> HistogramStatisticsImpl::nonoverlappingComputedBuckets() const { | ||
| std::vector<uint64_t> buckets; | ||
| buckets.reserve(computed_buckets_.size()); | ||
| uint64_t previous_computed_bucket = 0; | ||
| for (size_t i = 0; i < computed_buckets_.size(); ++i) { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit;
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed |
||
| uint64_t current_computed_bucket = computed_buckets_[i]; | ||
| buckets.push_back(current_computed_bucket - previous_computed_bucket); | ||
| previous_computed_bucket = current_computed_bucket; | ||
| } | ||
| return buckets; | ||
| } | ||
|
|
||
| std::string HistogramStatisticsImpl::quantileSummary() const { | ||
| std::vector<std::string> summary; | ||
| const std::vector<double>& supported_quantiles = supportedQuantiles(); | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -940,6 +940,26 @@ const std::string ParentHistogramImpl::bucketSummary() const { | |
| } | ||
| } | ||
|
|
||
| const std::string ParentHistogramImpl::nonoverlappingBucketSummary() const { | ||
| if (used()) { | ||
| std::vector<std::string> bucket_summary; | ||
| ConstSupportedBuckets& supported_buckets = interval_statistics_.supportedBuckets(); | ||
| const std::vector<uint64_t> nonoverlapping_interval_buckets = | ||
| interval_statistics_.nonoverlappingComputedBuckets(); | ||
| const std::vector<uint64_t> nonoverlapping_cumulative_buckets = | ||
| cumulative_statistics_.nonoverlappingComputedBuckets(); | ||
| bucket_summary.reserve(supported_buckets.size()); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Paranoia nit: ASSERT here that the 3 array lengths are the same, and make the loop go to their min value?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Created ASSERT and looped to min value |
||
| for (size_t i = 0; i < supported_buckets.size(); ++i) { | ||
| bucket_summary.push_back(fmt::format("B{:g}({},{})", supported_buckets[i], | ||
| nonoverlapping_interval_buckets[i], | ||
| nonoverlapping_cumulative_buckets[i])); | ||
| } | ||
| return absl::StrJoin(bucket_summary, " "); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I feel like we should have this API return the array, and have a separate layer that joins it as strings. For example, if I have a richer HTML display of histograms I might like to be able to render these values graphically.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Moreover, I think it'd be great to return a structured result that's an array of triples, and do the formatting in the stats handler (e.g. json-population vs emitting strings manually)
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll start looking into this
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I added nonoverlappingComputedBuckets() to HistogramStatistics which can be used from the ParentHistogram. It returns an array with the desired values as output. Does this accomplish what you meant? I changed the nonoverlappingBucketSummary() to use the new method. Should I also move the summary to the stats handler? Originally I wrote it here because it is similar to the existing bucketSummary().
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is better but I still think the formatting of this could be removed from the histogram and go into the stats handler class. E.g. one thing I was thinking (in a follow-up after an in-progress PR) is to have HTML-mode actually render the histograms graphically. So the more we do via structured API, and the less we commit to string representations inside the Histogram class, the more flexible we are about how to present the data in the stats handler.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I moved |
||
| } else { | ||
| return std::string("No recorded values"); | ||
| } | ||
| } | ||
|
|
||
| void ParentHistogramImpl::addTlsHistogram(const TlsHistogramSharedPtr& hist_ptr) { | ||
| Thread::LockGuard lock(merge_lock_); | ||
| tls_histograms_.emplace_back(hist_ptr); | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a draft PR #19546 which will hopefully be ready soon, where I introduce a 'type=' which makes me wonder about exactly how best to control this.
First -- is there any reason not to always be in histogram_buckets mode? Is there any reason you'd want to see the current mode?
The new PR introduces a '&type=' query-param which has possible values (Counters, Gauges, Histograms, TextReadouts, or All). I'm unsure whether this would best fit as a new category, or as a boolean property which only makes sense if you are displaying histograms.
Also, what are your thoughts about how this parameter works with format=json or format=prometheus? Or does this only make sense for text?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have enough experience on this to give a good opinion, but I didn't find a use for the current mode.
I think a boolean property would suit this best.
From what I've understood, the current way format=prometheus displays histogram buckets cumulatively is the correct way for prometheus, so it seems like changing this would be unnecessary.
Using format=json outputs the same quantile summary data available from the plain text endpoint for histograms. If the quantile summary is replaced by the histogram_buckets output in the plain text endpoint, I think it should be replaced here too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So...can you add a test for the JSON mode as well then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll work on a JSON mode next. Also something that came to mind is that should there be a query parameter or an option of seeing the original overlapping (or cumulative) histogram buckets? It should be easy to implement if there is any use for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine to have a query-param for the new mode. The absence of that query-params means you want the old mode. Am I missing something?
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see -- there are really 3 choices. Can you just have one query-param with 3 options then (default value being 'none'), rather than bools? I think that'll look better in the UI I've got pending -- see image in description of #18670
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, one query-param with 2 options sounds much better. I think something like this would work, although there might be a better word for nonoverlapping:
/stats?histogram_buckets=cumulative/stats?histogram_buckets=nonoverlappingThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh...a better word for nonoverlapping might be "disjoint"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! I'll start using this