Add stats to record latency metrics for each response code in NH by qqustc · Pull Request #381 · envoyproxy/nighthawk

qqustc · 2020-06-22T22:49:35Z

Add the following latency histograms in benchmark_client_impl:

Name | Type | Description
-----| ----- | ----------------
benchmark_http_client.latency_1xx | SinkableHdrStatistic | Latency (in nanoseconds) histogram of requests with code 1xx
benchmark_http_client.latency_2xx | SinkableHdrStatistic | Latency (in nanoseconds) histogram of requests with code 2xx
benchmark_http_client.latency_3xx | SinkableHdrStatistic | Latency (in nanoseconds) histogram of requests with code 3xx
benchmark_http_client.latency_4xx | SinkableHdrStatistic | Latency (in nanoseconds) histogram of requests with code 4xx
benchmark_http_client.latency_5xx | SinkableHdrStatistic | Latency (in nanoseconds) histogram of requests with code 5xx
benchmark_http_client.latency_xxx | SinkableHdrStatistic | Latency (in nanoseconds) histogram of requests with code <100 or >=600

Linked Issues: #344

Testing: unit tests
Docs Changes: Add docs/root/statistics.md

Signed-off-by: Qin Qin qqin@google.com

…metrics in NH (#1) * Create statistics.md * Update benchmark_client_impl.cc * Update benchmark_client_impl.h * Update stream_decoder.cc * Update stream_decoder.h * Update BUILD * Update benchmark_http_client_test.cc * Update test_integration_basics.py * Update stream_decoder_test.cc * local commit 1 Signed-off-by: qqustc <qqin@google.com> Commit from master Signed-off-by: qqustc <qqustc@gmail.com>

Signed-off-by: qqustc <qqin@google.com> Signed-off-by: qqustc <qqustc@gmail.com>

Signed-off-by: qqustc <qqin@google.com>

oschaaf · 2020-06-24T16:32:07Z

Could the abrupt socket termination complaints in the longer-running CI tasks be related to a subprocess being killed because of the closing/re-opening of the PR here? /retest

qqustc · 2020-06-24T16:36:41Z


I don't think so. When I closed and reopened the PR, there were no CI tasks running. The CI tasks only started to run after I made the last commit.

oschaaf · 2020-06-24T16:37:23Z

Well apparently /retest doesn’t work for some reason, I’ll respin the jobs manually then.
The ci tasks failed after a couple of minutes, I have never seen that.

oschaaf · 2020-06-24T16:39:03Z

(In instances where they flake as observed in other cases, they have been running much longer)

oschaaf · 2020-06-24T16:42:09Z

CircleCI doesn’t allow me to restart the jobs in its ui, those options are grayed out.

oschaaf

Thanks, posting a round of comments and requesting feedback on an idea in there.


oschaaf · 2020-06-24T21:14:30Z

source/client/benchmark_client_impl.h

  COUNTER(pool_overflow)                                                                           \
-  COUNTER(pool_connection_failure)
+  COUNTER(pool_connection_failure)                                                                 \
+  COUNTER(total_req_sent)                                                                          \


How is this different from upstream_rq_total tracked by Envoy?

It seems upstream_rq_total is per cluster (I'm not sure what cluster means in Nighthawk), while total_req_sent here is per worker.
If there is no difference between those two counters from the user's point of view, we should reuse upstream_rq_total.

We set up a single cluster per worker today, so I think effectively the counters are tracking the same thing.

Thanks Otto for the explanation! Decided not to add total_req_sent since it is equivalent to upstream_rq_total.

oschaaf · 2020-06-24T21:17:47Z

source/client/benchmark_client_impl.cc

+  if (response_code > 199 && response_code <= 299) {
+    benchmark_client_stats_.latency_on_success_req_.recordValue(latency_us);
+  } else {
+    benchmark_client_stats_.latency_on_error_req_.recordValue(latency_us);


I wonder if this is a one size fits all. This might often suffice, but could there be a case where one would like to control the status code aggregation more? E.g. bucketize 4xx and 5xx? One might be a missing file, handled by the webserver itself, whereas the other could be an application crashing out, both would have their own latency characteristics.

Agree that we should add separate latency metrics for each code if it provides more value.
Is it better to have all (latency_1xx, latency_2xx, latency_3xx, latency_4xx, latency_5xx, latency_xxx) for uniformity rather than just adding (latency_4xx, latency_5xx)?

The previous reasoning was to wait until Envoy supports metrics with key-value labels and then add a single latency metric with the response code as a label, and for now to add only two histograms to save memory (if that is a concern).

That would be consistent with the counters we track too (and actually make them sort of redundant?).

Obviously out of scope here, but still, I'm kind of wondering if we could/should somehow attempt to generalise the concept of grouping metrics based on user-provided configuration. If we could do that today, the initial grouping proposed here would be a nice conservative default. So, thinking about it, maybe it is providing the means to change the default that I was concerned about here, and not so much the specific default. Maybe we shouldn't be discussing that here in this PR, but elsewhere :-) I'll defer to you for the specific default we choose here.

Thanks Otto. Added latency statistics latency_1xx, latency_2xx, latency_3xx, latency_4xx, latency_5xx, latency_xxx.
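To illustrate the grouping settled on in this thread, here is a minimal sketch of routing a latency sample to one of six buckets by status class. `LatencyByStatusClass`, `bucketFor`, and `nameFor` are hypothetical names for illustration and are not the code in this PR, which records into Nighthawk statistics rather than plain vectors.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-class histograms discussed above.
// Index 0 is the catch-all "xxx" bucket; indices 1..5 hold 1xx..5xx.
class LatencyByStatusClass {
public:
  // Maps an HTTP status code to a bucket index (0 for codes < 100 or >= 600).
  static std::size_t bucketFor(uint32_t response_code) {
    if (response_code >= 100 && response_code < 600) {
      return response_code / 100;
    }
    return 0;
  }

  // Returns a name such as "latency_2xx", or "latency_xxx" for bucket 0.
  static std::string nameFor(std::size_t bucket) {
    if (bucket == 0) {
      return std::string("latency_xxx");
    }
    return "latency_" + std::to_string(bucket) + "xx";
  }

  // Stores one latency sample in the bucket matching the response code.
  void record(uint32_t response_code, uint64_t latency_ns) {
    samples_[bucketFor(response_code)].push_back(latency_ns);
  }

  // Number of samples recorded in the given bucket.
  std::size_t count(std::size_t bucket) const { return samples_.at(bucket).size(); }

private:
  std::array<std::vector<uint64_t>, 6> samples_;
};
```

Keeping all six buckets, rather than only 4xx and 5xx, keeps the latency metrics aligned one to one with the existing http_1xx through http_xxx counters.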

oschaaf · 2020-06-24T21:18:46Z

source/client/benchmark_client_impl.h

-  COUNTER(pool_connection_failure)
+  COUNTER(pool_connection_failure)                                                                 \
+  COUNTER(total_req_sent)                                                                          \
+  HISTOGRAM(latency_on_success_req, Microseconds)                                                  \


Maybe include http_status in the name, as that's what is being used to define success/error?


oschaaf · 2020-06-24T21:24:05Z

source/client/stream_decoder.cc

+                                      time_source_.monotonicTime() - request_start_)
+                                      .count();
+      decoder_completion_callback_.exportLatency(stream_info_.response_code_.value(), latency_us);
+    }


Would a log line add value here in an else clause? That might need to guard against high frequency logging, but there might be something special going on when we'd hit that?

Good point. Will add a sparse log.
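One way a sparse log could be guarded against high-frequency output is to emit only on the 1st, 2nd, 4th, 8th, ... occurrence. The sketch below is an assumption about what "sparse" might mean here; `SparseLogger` is a hypothetical helper, not the logging facility actually used in the PR.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <utility>

// Hypothetical helper: forwards only the 1st, 2nd, 4th, 8th, ... message.
class SparseLogger {
public:
  explicit SparseLogger(std::function<void(const std::string&)> sink)
      : sink_(std::move(sink)) {}

  // Returns true if the message was emitted to the sink.
  bool log(const std::string& message) {
    ++calls_;
    // A power of two has exactly one bit set, so n & (n - 1) is zero.
    const bool emit = (calls_ & (calls_ - 1)) == 0;
    if (emit) {
      sink_(message);
    }
    return emit;
  }

  // Total number of log() calls, emitted or not.
  uint64_t calls() const { return calls_; }

private:
  std::function<void(const std::string&)> sink_;
  uint64_t calls_{0};
};
```

With this scheme the number of emitted lines grows logarithmically with the number of occurrences, so a persistent condition stays visible without flooding the output.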


mum4k

Partial review (documentation only for now).

mum4k · 2020-06-24T21:35:26Z

docs/root/statistics.md

+-----| ----- | ----------------
+total_req_sent | Counter | Total number of requests sent from Nighthawk
+http_xxx | Counter | Total number of response with code xxx
+stream_resets | Counter | Total number of sream reset


typo: stream

mum4k · 2020-06-24T21:36:18Z

docs/root/statistics.md

+Name | Type | Description
+-----| ----- | ----------------
+total_req_sent | Counter | Total number of requests sent from Nighthawk
+http_xxx | Counter | Total number of response with code xxx


Would it be more useful to list the counters explicitly? How many are we planning to export, is it too many?

If we do list them explicitly, users can use this as a reference when looking for the complete list of metrics.

Updated to list the counters explicitly. The plan is to convert all counters with different codes into a single streamz in customized Sinks (similar to what is in Monarch Sink).

mum4k · 2020-06-24T21:37:44Z

docs/root/statistics.md

+# Nighthawk Statistics
+
+## Background
+Currently Nighthawk only outputs metrics at the end of a test run. There are no metrics streamed during a test run. In order to collect all the useful Nighthawk metrics, we plan to instrument Nighthawk with the functionality to stream its metrics.


Not sure if the future tense is a good choice here. We will end up implementing this at some point and this document will still talk about it as something we only plan to work on.

How would you feel about writing this document as documentation for metrics that are already supported (as in already done)? If you want, we could add a TODO or a note that this is work in progress. We can remove that note once done.

Thanks Jakub for the suggestion! Agreed that it is better to write this document as documentation for metrics that are already supported.

Updated the doc to document metrics that are already supported and added a note saying that the work to stream its metrics is in progress.

mum4k · 2020-06-24T21:38:45Z

docs/root/statistics.md

+In Envoy, the stat [Store](https://github.com/envoyproxy/envoy/blob/74530c92cfa3682b49b540fddf2aba40ac10c68e/include/envoy/stats/store.h#L29) is a singleton and provides a simple interface by which the rest of the code can obtain handles to scopes, counters, gauges, and histograms. Envoy counters and gauges are periodically (configured at ~5 sec interval) flushed to the sinks. Note that currently histogram values are sent directly to the sinks. A stat [Sink](https://github.com/envoyproxy/envoy/blob/74530c92cfa3682b49b540fddf2aba40ac10c68e/include/envoy/stats/sink.h#L48) is an interface that takes generic stat data and translates it into a backend-specific wire format. Currently Envoy supports the TCP and UDP [statsd](https://github.com/b/statsd_spec) protocol (implemented in [statsd.h](https://github.com/envoyproxy/envoy/blob/master/source/extensions/stat_sinks/common/statsd/statsd.h)). Users can create their own Sink subclass to translate Envoy metrics into backend-specific format.
+
+Envoy metrics can be defined using a macro, e.g.
+```


Do we want to add syntax highlighting here, i.e. "cc" instead of just ""?

Thanks @mum4k for the note. This is really useful to know. Done.

mum4k · 2020-06-24T21:42:32Z

docs/root/statistics.md

+
+
+## Metrics Export in Nighthawk
+Currently a single Nighthawk can run with multiple workers. In the future, Nighthawk will be extended to be able to run multiple instances together. Since each Nighthawk worker sends requests independently, we decide to export per worker level metrics since it provides several advantages over global level metrics (aggregated across all workers). Notice that *there are existing Nighthawk metrics already defined at per worker level* ([example](https://github.com/envoyproxy/nighthawk/blob/bc72a52efdc489beaa0844b34f136e03394bd355/source/client/benchmark_client_impl.cc#L61)).


possible typo: "we decided".

mum4k · 2020-06-24T21:42:48Z

docs/root/statistics.md

@@ -0,0 +1,63 @@
+# Nighthawk Statistics


Can we format this document so that each line is at most 80 characters?

Signed-off-by: qqustc <qqin@google.com>

…Statistic Signed-off-by: Qin Qin <qqin@google.com> Signed-off-by: qqustc <qqustc@gmail.com> Signed-off-by: Qin Qin <qqin@google.com>

Signed-off-by: Qin Qin <qqin@google.com>

mum4k

I note this PR is adding a Submodule called "nighthawk". Is that intentional and can you help me understand what it is for?

mum4k · 2020-07-17T17:22:02Z

source/client/benchmark_client_impl.h

 };

+// For histogram metrics, Nighthawk Statistic is used instead of Envoy Histogram.
+struct BenchmarkClientStatistic {


What is the relationship between struct BenchmarkClientStats and struct BenchmarkClientStatistic? They have very similar names, yet don't look directly related.

Can we improve the readability by either or both:

choosing better name for the new struct, one that would clearly distinguish it from the existing one.

adding a comment here and above struct BenchmarkClientStats explaining the difference.

Added a comment explaining the difference.

For counter metrics, Nighthawk uses Envoy Counter directly. For histogram metrics, Nighthawk uses its own Statistic instead of Envoy Histogram. Here BenchmarkClientStats contains only counters while BenchmarkClientStatistic contains only histograms.

I think the comment helps here, but not elsewhere in the code where this exists. Could we rename these BenchmarkClientCounters and BenchmarkClientHistograms? Or something else that distinguishes their purposes in the names themselves.

Thanks Nate. Renamed BenchmarkClientStats to BenchmarkClientCounters to avoid confusion, while keeping BenchmarkClientStatistic unchanged since some of its members are StreamingStatistic, which is not exactly a histogram.

Makes sense to me. Thanks for the effort here!

mum4k · 2020-07-17T17:26:00Z

source/client/benchmark_client_impl.h

  StatisticPtr response_statistic_;
  StatisticPtr response_header_size_statistic_;
  StatisticPtr response_body_size_statistic_;
+  StatisticPtr latency_1xx_statistic_;


Why do we list these here individually? Can we reuse the struct BenchmarkClientStatistic defined above?

Yes, changed to reuse BenchmarkClientStatistic struct here.

mum4k · 2020-07-17T17:29:09Z

source/client/factories_impl.cc

    Envoy::Upstream::ClusterManagerPtr& cluster_manager,
    Envoy::Tracing::HttpTracerSharedPtr& http_tracer, absl::string_view cluster_name,
-    RequestSource& request_generator) const {
+    const int sink_stat_prefix, RequestSource& request_generator) const {


We probably don't need to mark an int argument const: an int is passed by value, so the function only receives a copy and the qualifier has no effect on the caller.
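As a small illustration of why the qualifier is redundant in a signature: top-level const on a by-value parameter is not part of the function type, so a declaration without it and a definition with it refer to the same function. `nextWorkerId` is a hypothetical function used only for this sketch.

```cpp
#include <type_traits>

// Declaration without const and definition with const name the same function:
// top-level const on a by-value parameter is not part of the function type.
int nextWorkerId(int worker_id);

int nextWorkerId(const int worker_id) {
  // const only prevents this body from modifying its own copy.
  return worker_id + 1;
}

static_assert(std::is_same<void(int), void(const int)>::value,
              "top-level const on a by-value parameter does not change the signature");
```

The caller's variable is never affected either way, which is why the qualifier adds nothing for readers of the header.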

mum4k · 2020-07-17T17:37:20Z

source/client/factories_impl.cc

  // NullStatistic).
  // TODO(#292): Create options and have the StatisticFactory consider those when instantiating
  // statistics.
+  BenchmarkClientStatistic statistic;


Would it make sense to define a constructor for this struct right next to it and just call the constructor here?

Thanks for the suggestion. Added a constructor for the struct. Yes, this makes the code simpler.

mum4k · 2020-07-17T17:38:16Z

source/client/factories_impl.h

                            Envoy::Upstream::ClusterManagerPtr& cluster_manager,
                            Envoy::Tracing::HttpTracerSharedPtr& http_tracer,
-                            absl::string_view cluster_name,
+                            absl::string_view cluster_name, const int sink_stat_prefix,


Comment about unnecessary const qualifier next to an int argument applies here as well.

mum4k · 2020-07-17T17:40:45Z

test/benchmark_http_client_test.cc

              {{":scheme", "http"}, {":method", "GET"}, {":path", "/"}, {":host", "localhost"}}));
      return std::make_unique<RequestImpl>(header);
    };
+    statistic_.connect_statistic = std::make_unique<StreamingStatistic>();


Comment about having a constructor next to the struct itself applies here too. Assuming that the differences between the two call sites can be parameterized.

mum4k · 2020-07-17T17:42:20Z

test/benchmark_http_client_test.cc

+  EXPECT_EQ(10, client_->statistics()["benchmark_http_client.latency_2xx"]->count());
+}
+
+TEST_F(BenchmarkClientHttpTest, ExportSuccessLatency) {


I note the test coverage for the latency metrics.

How are we testing population and propagation of the other fields on struct BenchmarkClientStatistic?

The tests on connect_statistic and response_statistic are already in test case EnableLatencyMeasurement. Added test for response_header_size_statistic and response_body_size_statistic there too.

Signed-off-by: Qin Qin <qqin@google.com>

qqustc · 2020-07-21T14:00:13Z

I note this PR is adding a Submodule called "nighthawk". Is that intentional and can you help me understand what it is for?

Thanks Jakub. This was not intentional and it has been deleted.

mum4k · 2020-07-21T19:42:38Z

Thanks for implementing the changes @qqustc.

@dubious90 please review and assign back to me once done.
@oschaaf please take another look as there were significant changes after your review.

oschaaf

It looks like we're now always instantiating SinkableHdrStatistic instances from the factory, but there's no sink config involved. I'm curious: how will that behave? Do we need to consider adding options in this PR to instantiate SinkableXXXStatistic or just XXXStatistic before merging this? E.g. when an option like --sink <address:port> is given?

qqustc · 2020-07-21T20:46:18Z

It looks like we're now always instantiating SinkableHdrStatistic instances from the factory, but there's no sink config involved. I'm curious: how will that behave? Do we need to consider adding options in this PR to instantiate SinkableXXXStatistic or just XXXStatistic before merging this? E.g. when an option like --sink <address:port> is given?

Thanks Otto. When there is no sink configured, SinkableHdrStatistic behaves the same as HdrStatistic (deliverHistogramToSinks in SinkableHdrStatistic::recordValue does nothing). So we don't need to add an option in this PR to decide whether to instantiate SinkableXXXStatistic or just XXXStatistic.

oschaaf · 2020-07-21T20:48:00Z

Thanks Otto. When there is no sink configured, SinkableHdrStatistic behaves the same as HdrStatistic (deliverHistogramToSinks in SinkableHdrStatistic::recordValue does nothing). So we don't need to add an option in this PR to decide whether to instantiate SinkableXXXStatistic or just XXXStatistic.

Great, thanks for the explanation.
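The behavior described in this exchange can be sketched as follows: a "sinkable" statistic records locally and then forwards to whatever sinks are registered, so with no sinks it is indistinguishable from the plain one. `BasicStatistic`, `SinkableStatistic`, and `addSink` are hypothetical names and do not reflect the actual Nighthawk or Envoy interfaces.

```cpp
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical base statistic that only counts and sums recorded values.
class BasicStatistic {
public:
  virtual ~BasicStatistic() = default;

  virtual void recordValue(uint64_t value) {
    ++count_;
    sum_ += value;
  }

  uint64_t count() const { return count_; }
  uint64_t sum() const { return sum_; }

private:
  uint64_t count_{0};
  uint64_t sum_{0};
};

// Hypothetical "sinkable" variant: records locally, then forwards to sinks.
class SinkableStatistic : public BasicStatistic {
public:
  using Sink = std::function<void(uint64_t)>;

  void addSink(Sink sink) { sinks_.push_back(std::move(sink)); }

  void recordValue(uint64_t value) override {
    BasicStatistic::recordValue(value);
    // With no sinks registered this loop does not execute at all.
    for (const auto& sink : sinks_) {
      sink(value);
    }
  }

private:
  std::vector<Sink> sinks_;
};
```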

dubious90 · 2020-07-22T16:04:34Z

docs/root/statistics.md

+benchmark_http_client.latency_5xx | HdrStatistic | Latency (in Nanosecond) histogram of request with code 5xx	
+benchmark_http_client.latency_xxx | HdrStatistic | Latency (in Nanosecond) histogram of request with code <100 or >=600
+benchmark_http_client.queue_to_connect | HdrStatistic | Histogram of request connection time	(in Nanosecond)
+benchmark_http_client.request_to_response | HdrStatistic | Latency (in Nanosecond) histogram of all requests (include requests with stream reset or pool failure)


"histogram of all requests" feels like the wrong clarification here. All of the histograms deal with all of the requests, right? Would this be better as "histogram of full request time"?

All other latency_*** histograms only collect values for requests that successfully returned a response with a code, while the request_to_response histogram also includes requests with a stream reset or pool failure.
Updated to: Latency (in Nanosecond) histogram including requests with stream reset or pool failure

Understood. Thanks for the clarification.

dubious90 · 2020-07-22T16:08:20Z

docs/root/statistics.md

+benchmark_http_client.latency_xxx | HdrStatistic | Latency (in Nanosecond) histogram of request with code <100 or >=600
+benchmark_http_client.queue_to_connect | HdrStatistic | Histogram of request connection time	(in Nanosecond)
+benchmark_http_client.request_to_response | HdrStatistic | Latency (in Nanosecond) histogram of all requests (include requests with stream reset or pool failure)
+benchmark_http_client.response_header_size | StreamingStatistic | Statistic of response header size (min, max, mean, pstdev values in bytes)


It's not clear to me reading this what these statistics are going to look like. If HdrStatistic is a histogram, what is StreamingStatistic? If it's also a histogram, how does it differ from HdrStatistic?

Both StreamingStatistic and HdrStatistic are implementations of NH Statistic. The difference is that HdrStatistic is a "real" histogram (it stores the distribution and provides detailed percentile values), while StreamingStatistic is not exactly a histogram (it only provides min, max, mean, and pstdev values and has no percentile values).

As shown in this table, different NH metrics choose different implementations of NH Statistic.
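To make the distinction concrete, here is a minimal sketch of a streaming summary that keeps only running aggregates and therefore cannot answer percentile queries. It uses Welford's online algorithm, which is an assumption chosen for this example; `RunningSummary` is a hypothetical class and is not Nighthawk's StreamingStatistic.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical streaming summary: constant memory, no stored samples,
// so min/max/mean/pstdev are available but percentiles are not.
class RunningSummary {
public:
  void addValue(double value) {
    ++count_;
    min_ = std::min(min_, value);
    max_ = std::max(max_, value);
    // Welford's update: adjust the mean, then accumulate squared deviations.
    const double delta = value - mean_;
    mean_ += delta / static_cast<double>(count_);
    m2_ += delta * (value - mean_);
  }

  uint64_t count() const { return count_; }
  double min() const { return min_; }
  double max() const { return max_; }
  double mean() const { return mean_; }

  // Population standard deviation; 0 when no values were recorded.
  double pstdev() const {
    return count_ == 0 ? 0.0 : std::sqrt(m2_ / static_cast<double>(count_));
  }

private:
  uint64_t count_{0};
  double min_{std::numeric_limits<double>::infinity()};
  double max_{-std::numeric_limits<double>::infinity()};
  double mean_{0.0};
  double m2_{0.0};
};
```

A histogram-backed statistic trades more memory for the ability to report percentiles, which is why latency metrics in the table use the histogram type while size metrics use the streaming one.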

dubious90 · 2020-07-22T16:11:22Z

docs/root/statistics.md

+In Envoy, the stat
+[Store](https://github.com/envoyproxy/envoy/blob/74530c92cfa3682b49b540fddf2aba40ac10c68e/include/envoy/stats/store.h#L29)
+is a singleton and provides a simple interface by which the rest of the code can
+obtain handles to scopes, counters, gauges, and histograms. Envoy counters and


it sticks out that this is undefined, while counters, gauges, and histograms are defined above.

Store is not a type of metric, so I decided not to list its definition above with the other metrics. Since it is a (common) Envoy concept and its definition is not closely relevant to the NH Statistics doc, only a link to the code definition is provided here.

Sorry I was unclear here. I was actually referring to scopes. I realize that scopes are probably not metrics themselves, but I'm not sure what they are in this context, and unlike store, we don't have a link to envoy documentation.

Added a code link to scopes too.

dubious90 · 2020-07-22T16:14:57Z

docs/root/statistics.md

+Users can create their own Sink subclass to translate Envoy metrics into
+backend-specific format.	
+
+Envoy metrics can be defined using a macro, e.g.	


This example is helpful, but also opens up other questions. We're defining metrics here, but they're just names. How are they calculated? Can you add a second code snippet below it, explaining how to collect stats?

looks great. thank you
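For readers without the doc at hand, the general idea of defining metrics by macro and then collecting them can be sketched with the X-macro pattern below. Everything here is a self-contained stand-in: `Counter`, `Store`, `ALL_EXAMPLE_STATS`, `EXAMPLE_DECLARE_COUNTER`, and `EXAMPLE_INIT_COUNTER` are hypothetical names, not Envoy's actual types or macros.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for a counter handle.
struct Counter {
  uint64_t value{0};
  void inc() { ++value; }
};

// Hypothetical stand-in for a stats store: hands out named counters.
class Store {
public:
  // std::map keeps references to elements valid across later insertions.
  Counter& counter(const std::string& name) { return counters_[name]; }

private:
  std::map<std::string, Counter> counters_;
};

// The list of metrics is written once...
#define ALL_EXAMPLE_STATS(COUNTER)                                             \
  COUNTER(http_2xx)                                                            \
  COUNTER(http_5xx)

// ...and expanded twice: once into struct members, once into initializers.
#define EXAMPLE_DECLARE_COUNTER(NAME) Counter& NAME##_;
#define EXAMPLE_INIT_COUNTER(NAME) store.counter("example." #NAME),

struct ExampleStats {
  ALL_EXAMPLE_STATS(EXAMPLE_DECLARE_COUNTER)
};

// Binds each struct member to the counter with the matching name in the store.
ExampleStats makeExampleStats(Store& store) {
  return ExampleStats{ALL_EXAMPLE_STATS(EXAMPLE_INIT_COUNTER)};
}
```

Collecting a value is then a plain member call such as `stats.http_2xx_.inc();`, and the store sees the update under the name `example.http_2xx`.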

dubious90 · 2020-07-22T16:17:10Z

docs/root/statistics.md

+Currently Envoy metrics don't support key-value map. As a result, for metrics to
+be broken down by certain dimensions, we need to define a separate metric for
+each dimension. For example, currently Nighthawk defines separate
+[counters](https://github.com/envoyproxy/nighthawk/blob/master/source/client/benchmark_client_impl.h#L35-L40)


nit: I think your intention would be clearer if you changed what you're bracketing:
currently Nighthawk defines [separate counters]

dubious90 · 2020-07-22T16:31:24Z

source/client/benchmark_client_impl.cc

 namespace Nighthawk {
 namespace Client {

+BenchmarkClientStatistic::BenchmarkClientStatistic(BenchmarkClientStatistic& statistic)


I'm probably misunderstanding, but this appears to be a copy constructor that moves all of the underlying fields, which seems unideal.

Thanks Nate for noticing this. Changed to a move constructor.
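The difference raised here can be sketched as follows: a constructor taking a non-const lvalue reference looks like a copy but silently empties its argument, whereas a move constructor takes an rvalue reference and forces the caller to write std::move. `Recorder` and `StatisticBundle` are hypothetical types for illustration, not the struct in this PR.

```cpp
#include <memory>
#include <utility>

// Hypothetical payload owned by the bundle.
struct Recorder {
  int samples{0};
};

struct StatisticBundle {
  StatisticBundle(std::unique_ptr<Recorder> connect, std::unique_ptr<Recorder> response)
      : connect_statistic(std::move(connect)), response_statistic(std::move(response)) {}

  // Move constructor: takes an rvalue reference, so a caller must opt in
  // with std::move and the transfer of ownership is visible at the call site.
  StatisticBundle(StatisticBundle&& other) noexcept
      : connect_statistic(std::move(other.connect_statistic)),
        response_statistic(std::move(other.response_statistic)) {}

  // Copying is disabled explicitly; unique_ptr members cannot be copied anyway.
  StatisticBundle(const StatisticBundle&) = delete;

  std::unique_ptr<Recorder> connect_statistic;
  std::unique_ptr<Recorder> response_statistic;
};
```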

dubious90 · 2020-07-22T17:31:10Z

source/client/benchmark_client_impl.cc


+void BenchmarkClientHttpImpl::exportLatency(const uint32_t response_code,
+                                            const uint64_t latency_ns) {
+  if (response_code > 99 && response_code <= 199) {


optional: it seems very odd to me to notate this expression this way. It's more conventional to do:
response_code >= 100 && response_code < 200

ah, I was following the code here.

nighthawk/source/client/benchmark_client_impl.cc

Lines 140 to 152 in c57811f

  if (status > 99 && status <= 199) {
    benchmark_client_stats_.http_1xx_.inc();
  } else if (status > 199 && status <= 299) {
    benchmark_client_stats_.http_2xx_.inc();
  } else if (status > 299 && status <= 399) {
    benchmark_client_stats_.http_3xx_.inc();
  } else if (status > 399 && status <= 499) {
    benchmark_client_stats_.http_4xx_.inc();
  } else if (status > 499 && status <= 599) {
    benchmark_client_stats_.http_5xx_.inc();
  } else {
    benchmark_client_stats_.http_xxx_.inc();
  }
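For integer status codes the two notations select exactly the same values, so the choice is purely about readability. A minimal sketch that checks this exhaustively over a range; `is1xxExisting`, `is1xxConventional`, and `formsAgree` are hypothetical helper names.

```cpp
#include <cstdint>

// Form used in the existing code.
constexpr bool is1xxExisting(uint32_t status) { return status > 99 && status <= 199; }

// Half-open form suggested in the review.
constexpr bool is1xxConventional(uint32_t status) { return status >= 100 && status < 200; }

// Returns true when both forms agree for every status in [0, limit).
bool formsAgree(uint32_t limit) {
  for (uint32_t status = 0; status < limit; ++status) {
    if (is1xxExisting(status) != is1xxConventional(status)) {
      return false;
    }
  }
  return true;
}
```

The half-open form has the practical advantage that adjacent ranges share a boundary value (200 ends one range and starts the next), which makes gaps or overlaps easier to spot.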

dubious90 · 2020-07-22T17:32:36Z

source/client/benchmark_client_impl.h

  COUNTER(pool_overflow)                                                                           \
  COUNTER(pool_connection_failure)

+// For counter metrics, Nighthawk use Envoy Counter directly. For histogram metrics, Nighthawk use


nit: grammar - Nighthawk uses

dubious90 · 2020-07-22T17:34:29Z

source/client/benchmark_client_impl.h

 };

+// For histogram metrics, Nighthawk Statistic is used instead of Envoy Histogram.
+struct BenchmarkClientStatistic {


I think the comment helps here, but not elsewhere in the code where this exists. Could we rename these BenchmarkClientCounters and BenchmarkClientHistograms? Or something else that distinguishes their purposes in the names themselves.

dubious90 · 2020-07-22T17:36:31Z

source/client/factories_impl.cc

    Envoy::Upstream::ClusterManagerPtr& cluster_manager,
    Envoy::Tracing::HttpTracerSharedPtr& http_tracer, absl::string_view cluster_name,
-    RequestSource& request_generator) const {
+    int sink_stat_prefix, RequestSource& request_generator) const {


it's possible this is also true elsewhere in this PR, but is it desirable for us to only allow an int as the sink_stat_prefix? In the future, isn't it possible that we'd want to use a prefix that isn't just the worker number?

If not, then this really isn't just a prefix. It has the specific meaning of being a worker number.

Discussed offline. Decided to rename sink_stat_prefix to worker_id to make it explicit that it is used to populate the worker_id field in SinkableHdrStatistic.

nighthawk/source/client/factories_impl.cc

Lines 45 to 50 in d8f4d31

      std::make_unique<SinkableHdrStatistic>(scope, worker_id),
      std::make_unique<SinkableHdrStatistic>(scope, worker_id),
      std::make_unique<SinkableHdrStatistic>(scope, worker_id),
      std::make_unique<SinkableHdrStatistic>(scope, worker_id),
      std::make_unique<SinkableHdrStatistic>(scope, worker_id),
      std::make_unique<SinkableHdrStatistic>(scope, worker_id));

cc @oschaaf

Signed-off-by: Qin Qin <qqin@google.com>

qqustc · 2020-07-22T21:53:15Z

Thanks @dubious90 for the review! The PR is ready for review again.

dubious90

LGTM, mumak to approve

dubious90 · 2020-07-22T23:05:57Z

docs/root/statistics.md

+In Envoy, the stat
+[Store](https://github.com/envoyproxy/envoy/blob/74530c92cfa3682b49b540fddf2aba40ac10c68e/include/envoy/stats/store.h#L29)
+is a singleton and provides a simple interface by which the rest of the code can
+obtain handles to scopes, counters, gauges, and histograms. Envoy counters and


Sorry I was unclear here. I was actually referring to scopes. I realize that scopes are probably not metrics themselves, but I'm not sure what they are in this context, and unlike store, we don't have a link to envoy documentation.

Signed-off-by: Qin Qin <qqin@google.com>

qqustc · 2020-07-23T13:56:18Z

/retest

repokitteh-read-only · 2020-07-23T13:56:22Z

🔨 rebuilding ci/circleci: test_gcc (failed build)

🐱

Caused by: a #381 (comment) was created by @qqustc.

see: more, trace.

qqustc force-pushed the master branch from dc7c7f2 to e5d13ea Compare June 22, 2020 23:13

qqustc marked this pull request as ready for review June 22, 2020 23:14

qqustc closed this Jun 23, 2020

qqustc reopened this Jun 23, 2020

qqustc changed the title ~~Add counter/histogram to record total number of requests and latency metrics in NH~~ [Draft]Add counter/histogram to record total number of requests and latency metrics in NH Jun 23, 2020

Update stream_decoder.cc

37ef0db

Signed-off-by: qqustc <qqin@google.com> Signed-off-by: qqustc <qqustc@gmail.com>

qqustc force-pushed the master branch from f8998a2 to 37ef0db Compare June 23, 2020 18:18

qqustc closed this Jun 24, 2020

qqustc reopened this Jun 24, 2020

Fix format source/client/stream_decoder.cc

ecc9f1a

Signed-off-by: qqustc <qqin@google.com>

oschaaf mentioned this pull request Jun 24, 2020

Ignore: trigger ci for 381 #385

Closed

qqustc requested a review from oschaaf June 24, 2020 19:20

qqustc changed the title ~~[Draft]Add counter/histogram to record total number of requests and latency metrics in NH~~ Add counter/histogram to record total number of requests and latency metrics in NH Jun 24, 2020

qqustc added P1 waiting-for-review A PR waiting for a review. labels Jun 24, 2020

qqustc assigned mum4k Jun 24, 2020

oschaaf requested changes Jun 24, 2020


mum4k requested changes Jun 25, 2020


mum4k added waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. and removed waiting-for-review A PR waiting for a review. labels Jun 25, 2020

qqustc and others added 4 commits June 30, 2020 17:04

Merge remote-tracking branch 'upstream/master'

dd611a9

Signed-off-by: qqustc <qqin@google.com>

Add new NH statistic class CircllhistStatistic and SinkableCircllhist…

af1ea54

…Statistic Signed-off-by: Qin Qin <qqin@google.com> Signed-off-by: qqustc <qqustc@gmail.com> Signed-off-by: Qin Qin <qqin@google.com>

Update statistic_impl.h

43d7ee2

Signed-off-by: Qin Qin <qqin@google.com>

Update statistic_impl.cc

77249b9

Signed-off-by: Qin Qin <qqin@google.com>

mum4k requested changes Jul 17, 2020


mum4k added waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. and removed waiting-for-review A PR waiting for a review. labels Jul 17, 2020

qqustc changed the title ~~Add counter/histogram to record total number of requests and latency metrics in NH~~ Add stats to record latency metrics for each response code in NH Jul 20, 2020

qqustc added 3 commits July 20, 2020 14:30

add constructor to stat struct

d56991f

Signed-off-by: Qin Qin <qqin@google.com>

delete module

cc7ecc3

Signed-off-by: Qin Qin <qqin@google.com>

rerun CI

8d92933

Signed-off-by: Qin Qin <qqin@google.com>

qqustc added waiting-for-review A PR waiting for a review. and removed waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. labels Jul 21, 2020

qqustc requested a review from mum4k July 21, 2020 14:00

mum4k requested a review from dubious90 July 21, 2020 19:41

mum4k removed their assignment Jul 21, 2020

oschaaf reviewed Jul 21, 2020


oschaaf previously approved these changes Jul 21, 2020


dubious90 suggested changes Jul 22, 2020


change sink_stat_prefix to worker_id

d8f4d31

Signed-off-by: Qin Qin <qqin@google.com>

qqustc dismissed oschaaf’s stale review via d8f4d31 July 22, 2020 21:47

dubious90 reviewed Jul 22, 2020


dubious90 assigned mum4k Jul 22, 2020

fix clang

5c36515

Signed-off-by: Qin Qin <qqin@google.com>

mum4k approved these changes Jul 23, 2020


mum4k merged commit 58c5fb6 into envoyproxy:master Jul 23, 2020

qqustc deleted the pr3 branch July 23, 2020 15:25



