Skip to content
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
Show all changes
49 commits
Select commit Hold shift + click to select a range
e5d13ea
Add counter/histogram to record total number of requests and latency …
qqustc Jun 22, 2020
37ef0db
Update stream_decoder.cc
qqustc Jun 23, 2020
ecc9f1a
Fix format source/client/stream_decoder.cc
qqustc Jun 24, 2020
dd611a9
Merge remote-tracking branch 'upstream/master'
qqustc Jun 30, 2020
af1ea54
Add new NH statistic class CircllhistStatistic and SinkableCircllhist…
qqustc Jul 1, 2020
43d7ee2
Update statistic_impl.h
qqustc Jul 6, 2020
77249b9
Update statistic_impl.cc
qqustc Jul 6, 2020
bdecd69
Update statistic_test.cc
qqustc Jul 6, 2020
2c14561
Update circllhist_proto_json.gold
qqustc Jul 6, 2020
97ccc8a
Update statistic_impl.h
qqustc Jul 6, 2020
6876450
Update statistic_test.cc
qqustc Jul 6, 2020
8fd0a05
Refactor SinkableCircllhistStatistic
qqustc Jul 6, 2020
cd6b29e
fix clang_tidy
qqustc Jul 7, 2020
684fe0b
Update statistic_test.cc
qqustc Jul 7, 2020
dc655a4
replace ASSERT with RELEASE_ASSERT
qqustc Jul 7, 2020
fc298c5
update statistic_test.cc
qqustc Jul 7, 2020
e1356ec
update statistic_test.cc
qqustc Jul 8, 2020
a9d4d5d
update statistic_impl.h and change coverage threshold
qqustc Jul 8, 2020
e4d6672
format
qqustc Jul 8, 2020
db78892
update statistic_impl
qqustc Jul 10, 2020
8903b60
update statistic_impl.cc
qqustc Jul 10, 2020
69f270d
Merge branch 'pr1' of https://github.com/qqustc/nighthawk into master
qqustc Jul 13, 2020
0fb2d12
update latency histogram to use nh statistics
qqustc Jul 13, 2020
4ce94f2
delete stat.doc
qqustc Jul 14, 2020
0f4fc95
add latency stat
qqustc Jul 14, 2020
190299c
fix include
qqustc Jul 14, 2020
4ef322d
fix stream_decoder.cc
qqustc Jul 14, 2020
66e1e89
fix factories_test.cc
qqustc Jul 14, 2020
a0cba74
format file
qqustc Jul 14, 2020
ccc055f
fix format
qqustc Jul 14, 2020
eaf6b84
Merge branch 'master' into pr3
qqustc Jul 14, 2020
d6e2a47
Create statistics.md
qqustc Jul 15, 2020
2e3762b
update doc
qqustc Jul 15, 2020
a712eb2
Merge branch 'master' into pr3
qqustc Jul 15, 2020
a71dad8
update stat
qqustc Jul 15, 2020
be4533c
fix mock_benchmark_client_factory.h
qqustc Jul 15, 2020
f8d5696
fix tests
qqustc Jul 16, 2020
aee69ae
Merge branch 'master' into pr3
qqustc Jul 16, 2020
6b36a64
fix format
qqustc Jul 16, 2020
302807d
fix test
qqustc Jul 16, 2020
49e708e
fix format
qqustc Jul 16, 2020
60e46cd
rerun CI
qqustc Jul 16, 2020
c801782
rerun CI
qqustc Jul 16, 2020
9255197
rerun CI
qqustc Jul 16, 2020
d56991f
add constructor to stat struct
qqustc Jul 20, 2020
cc7ecc3
delete module
qqustc Jul 20, 2020
8d92933
rerun CI
qqustc Jul 20, 2020
d8f4d31
change sink_stat_prefix to worker_id
qqustc Jul 22, 2020
5c36515
fix clang
qqustc Jul 22, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
63 changes: 63 additions & 0 deletions docs/root/statistics.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
# Nighthawk Statistics

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we format this document so that each line is at most 80 characters?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


## Background
Currently Nighthawk only outputs metrics at the end of a test run. There are no metrics streamed during a test run. In order to collect all the useful Nighthawk metrics, we plan to instrument Nighthawk with the functionality to stream its metrics.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if the future tense is a good choice here. We will end up implementing this at some point and this document will still talk about it as something we only plan to work on.

How would you feel about writing this document as a documentation for metrics that are already supported (as in already done) and if you want we could add a TODO or a note that this is work in progress. We can remove that note once done.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Jakub for the suggestion! Agreed that it is better to write this document as a documentation for metrics that are already supported.

Updated the doc to document metrics that are already supported and added a note saying The work to stream its metrics is in progress.



## Statistics in BenchMarkClient
Per worker Statistics are defined in [benchmark_client_impl.h](https://github.com/envoyproxy/nighthawk/blob/master/source/client/benchmark_client_impl.h)

Name | Type | Description
-----| ----- | ----------------
total_req_sent | Counter | Total number of requests sent from Nighthawk
http_xxx | Counter | Total number of response with code xxx

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be more useful to list the counters explicitly? How many are we planning to export, is it too many?

If we do list them explicitly, users can use this as a reference when looking for the complete list of metrics.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to list the counters explicitly. The plan is to convert all counters with different codes into a single streamz in customized Sinks (similar to what is in Monarch Sink).

stream_resets | Counter | Total number of sream reset

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: stream

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

pool_overflow | Counter | Total number of times connection pool overflowed
pool_connection_failure | Counter | Total number of times pool connection failed
latency_on_success_req | Histogram | Latency (in Microseconds) histogram of successful request with code 2xx
latency_on_error_req | Histogram | Latency (in Microseconds) histogram of error request with code other than 2xx

## Envoy Metrics Model
Since Nighthawk is built on top of [Envoy](https://github.com/envoyproxy/envoy) and similar metrics have been exported from Envoy, it is natural to follow the example in Envoy. Furthermore *Envoy typed metrics are already being used in Nighthawk* ([example](https://github.com/envoyproxy/nighthawk/blob/master/source/client/benchmark_client_impl.h#L33-L46)).


Envoy has 3 types of metrics
- Counters: Unsigned integers (can only increase) represent how many times an event happens, e.g. total number of requests.
- Gauges: Unsigned integers (can increase or decrease), e.g. number of active connections.
- Histograms: Unsigned integers that will yield summarized percentile values. E.g. latency distributions.


In Envoy, the stat [Store](https://github.com/envoyproxy/envoy/blob/74530c92cfa3682b49b540fddf2aba40ac10c68e/include/envoy/stats/store.h#L29) is a singleton and provides a simple interface by which the rest of the code can obtain handles to scopes, counters, gauges, and histograms. Envoy counters and gauges are periodically (configured at ~5 sec interval) flushed to the sinks. Note that currently histogram values are sent directly to the sinks. A stat [Sink](https://github.com/envoyproxy/envoy/blob/74530c92cfa3682b49b540fddf2aba40ac10c68e/include/envoy/stats/sink.h#L48) is an interface that takes generic stat data and translates it into a backend-specific wire format. Currently Envoy supports the TCP and UDP [statsd](https://github.com/b/statsd_spec) protocol (implemented in [statsd.h](https://github.com/envoyproxy/envoy/blob/master/source/extensions/stat_sinks/common/statsd/statsd.h)). Users can create their own Sink subclass to translate Envoy metrics into backend-specific format.

Envoy metrics can be defined using a macro, e.g.
```

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to add syntax highlighting here, i.e. "cc" instead of just ""?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mum4k for the note. This is really useful to know. Done.

#define ALL_CLUSTER_STATS(COUNTER, GAUGE, HISTOGRAM)
COUNTER(upstream_cx_total)
GAUGE(upstream_cx_active, NeverImport)
HISTOGRAM(upstream_cx_length, Milliseconds)
struct ClusterStats {
ALL_CLUSTER_STATS(GENERATE_COUNTER_STRUCT, GENERATE_GAUGE_STRUCT, GENERATE_HISTOGRAM_STRUCT)
};
```

## Envoy Metrics Limitation
Currently Envoy metrics don't support key-value map. As a result, for metrics to be broken down by certain dimensions, we need to define a separate metric for each dimension. For example, currently Nighthawk defines separate [counters](https://github.com/envoyproxy/nighthawk/blob/master/source/client/benchmark_client_impl.h#L35-L40) to monitor the number of responses with corresponding response code.

Due to this limitation, for the latency metric we define 2 metrics: *latency_on_success_req* (to capture latency of successful requests with 2xx) and *latency_on_error_req* (to capture latency of error requests with code other than 2xx).


## Envoy Metrics Flush
Envoy uses a flush timer to periodically flush metrics into stat sinks ([here](https://github.com/envoyproxy/envoy/blob/74530c92cfa3682b49b540fddf2aba40ac10c68e/source/server/server.cc#L479-L480)) at a configured interval (default to 5 sec). For every metric flush, Envoy will call the function [flushMetricsToSinks()](https://github.com/envoyproxy/envoy/blob/74530c92cfa3682b49b540fddf2aba40ac10c68e/source/server/server.cc#L175) to create a metric snapshot from Envoy stat store and flush the snapshot to all sinks through `sink->flush(snapshot)`.


## Metrics Export in Nighthawk
Currently a single Nighthawk can run with multiple workers. In the future, Nighthawk will be extended to be able to run multiple instances together. Since each Nighthawk worker sends requests independently, we decide to export per worker level metrics since it provides several advantages over global level metrics (aggregated across all workers). Notice that *there are existing Nighthawk metrics already defined at per worker level* ([example](https://github.com/envoyproxy/nighthawk/blob/bc72a52efdc489beaa0844b34f136e03394bd355/source/client/benchmark_client_impl.cc#L61)).

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possible typo: "we decided".

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

- Per worker level metrics provide information about the performance of each worker which will be hidden by the global level metrics.
- Keep the workers independent which makes it easier/efficient to scale up to multiple Nighthawks with large numbers of workers.

Metrics can be defined at per worker level using [Scope](https://github.com/envoyproxy/envoy/blob/e9c2c8c4a0141c9634316e8283f98f412d0dd207/include/envoy/stats/scope.h#L35) ( e.g. `cluster.<worker_id>.total_request_sent`). The dynamic portions of metric (e.g. `worker_id`) can be embedded into the metric name. A [TagSpecifier](https://github.com/envoyproxy/envoy/blob/7a652daf35d7d4a6a6bad5a010fe65947ee4411a/api/envoy/config/metrics/v3/stats.proto#L182) can be specified in the bootstrap configuration, which will transform dynamic portions into tags. When per worker level metrics are exported from Nighthawk, multiple per worker level metrics can be converted into a single metric with a `worker_id` label in the stat Sink if the corresponding backend metric supports key-value map.

## Reference
- [Nighthawk: architecture and key concepts](https://github.com/envoyproxy/nighthawk/blob/master/docs/root/overview.md)
- [Envoy Stats System](https://github.com/envoyproxy/envoy/blob/master/source/docs/stats.md)
- [Envoy Stats blog](https://blog.envoyproxy.io/envoy-stats-b65c7f363342)
15 changes: 13 additions & 2 deletions source/client/benchmark_client_impl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -58,7 +58,8 @@ BenchmarkClientHttpImpl::BenchmarkClientHttpImpl(
response_statistic_(std::move(response_statistic)),
response_header_size_statistic_(std::move(response_header_size_statistic)),
response_body_size_statistic_(std::move(response_body_size_statistic)), use_h2_(use_h2),
benchmark_client_stats_({ALL_BENCHMARK_CLIENT_STATS(POOL_COUNTER(*scope_))}),
benchmark_client_stats_(
{ALL_BENCHMARK_CLIENT_STATS(POOL_COUNTER(*scope_), POOL_HISTOGRAM(*scope_))}),
cluster_manager_(cluster_manager), http_tracer_(http_tracer),
cluster_name_(std::string(cluster_name)), request_generator_(std::move(request_generator)),
provide_resource_backpressure_(provide_resource_backpressure) {
Expand Down Expand Up @@ -123,6 +124,7 @@ bool BenchmarkClientHttpImpl::tryStartRequest(CompletionCallback caller_completi
*response_body_size_statistic_, request->header(), shouldMeasureLatencies(), content_length,
generator_, http_tracer_);
requests_initiated_++;
benchmark_client_stats_.total_req_sent_.inc();
pool_ptr->newStream(*stream_decoder, *stream_decoder);
return true;
}
Expand Down Expand Up @@ -168,5 +170,14 @@ void BenchmarkClientHttpImpl::onPoolFailure(Envoy::Http::ConnectionPool::PoolFai
}
}

void BenchmarkClientHttpImpl::exportLatency(const uint32_t response_code,
const uint64_t latency_us) {
if (response_code > 199 && response_code <= 299) {
benchmark_client_stats_.latency_on_success_req_.recordValue(latency_us);
} else {
benchmark_client_stats_.latency_on_error_req_.recordValue(latency_us);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this is a one size fits all. This might often suffice, but could there be a case where one would like to control the status code aggregation more? E.g. bucketize 4xx and 5xx? One might be a missing file, handled by the webserver itself, whereas the other could be an application crashing out, both would have their own latency characteristics.

@qqustc qqustc Jun 25, 2020

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that we should add separate latency metrics for each code if it provides more value.
Is it better to have all (latency_1xx, latency_2xx, latency_3xx, latency_4xx, latency_5xx, latency_xxx) for uniformity rather than just adding (latency_4xx, latency_5xx)?

The previous reasoning was to wait after Envoy support metrics with key-value label to add a single latency with different codes as label and currently only add 2 to save memory (if that is a concern).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be consistent with the counters we track too (and actually make them sort of redundant?).

Obviously out of scope here, but still, I'm kind of wondering if we could/should somehow attempt to generalise the concept of grouping metrics based on user-provided configuration. If we could do that today, the initial grouping proposed here would be a nice conservative default. So, thinking about it, maybe it is providing the means to change the default that I was concerned about here, and not so much the specific default. Maybe we shouldn't be discussing that here in this PR, but elsewhere :-) I'll defer to you for the specific default we choose here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Otto. Added latency statistics latency_1xx, latency_2xx, latency_3xx, latency_4xx, latency_5xx, latency_xxx.

}
Comment thread
oschaaf marked this conversation as resolved.
}

} // namespace Client
} // namespace Nighthawk
} // namespace Nighthawk
12 changes: 8 additions & 4 deletions source/client/benchmark_client_impl.h
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ using namespace std::chrono_literals;

using namespace Envoy; // We need this because of macro expectations.

#define ALL_BENCHMARK_CLIENT_STATS(COUNTER) \
#define ALL_BENCHMARK_CLIENT_STATS(COUNTER, HISTOGRAM) \
Comment thread
oschaaf marked this conversation as resolved.
Outdated
COUNTER(stream_resets) \
COUNTER(http_1xx) \
COUNTER(http_2xx) \
Expand All @@ -39,10 +39,13 @@ using namespace Envoy; // We need this because of macro expectations.
COUNTER(http_5xx) \
COUNTER(http_xxx) \
COUNTER(pool_overflow) \
COUNTER(pool_connection_failure)
COUNTER(pool_connection_failure) \
COUNTER(total_req_sent) \

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this different from upstream_rq_total tracked by Envoy?

@qqustc qqustc Jun 25, 2020

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seams upstream_rq_total is per cluster (I'm not sure what cluster means in Nighthawk) and here total_req_sent is per worker.
If there is no difference between those 2 counters to the user, we should reuse upstream_rq_total.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We set up a single cluster per worker today, so I think effectively the counters are tracking the same thing.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Otto for the explanation! Decided not to add total_req_sent since it is equivalent to upstream_rq_total .

HISTOGRAM(latency_on_success_req, Microseconds) \

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe include http_status in the name, as that's what is being used to define success/error?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obsolete.

HISTOGRAM(latency_on_error_req, Microseconds)
Comment thread
oschaaf marked this conversation as resolved.
Outdated

struct BenchmarkClientStats {
ALL_BENCHMARK_CLIENT_STATS(GENERATE_COUNTER_STRUCT)
ALL_BENCHMARK_CLIENT_STATS(GENERATE_COUNTER_STRUCT, GENERATE_HISTOGRAM_STRUCT)
};

class Http1PoolImpl : public Envoy::Http::Http1::ProdConnPoolImpl {
Expand Down Expand Up @@ -104,6 +107,7 @@ class BenchmarkClientHttpImpl : public BenchmarkClient,
// StreamDecoderCompletionCallback
void onComplete(bool success, const Envoy::Http::ResponseHeaderMap& headers) override;
void onPoolFailure(Envoy::Http::ConnectionPool::PoolFailureReason reason) override;
void exportLatency(const uint32_t response_code, const uint64_t latency_us) override;

// Helpers
Envoy::Http::ConnectionPool::Instance* pool() {
Expand Down Expand Up @@ -142,4 +146,4 @@ class BenchmarkClientHttpImpl : public BenchmarkClient,
};

} // namespace Client
} // namespace Nighthawk
} // namespace Nighthawk
7 changes: 7 additions & 0 deletions source/client/stream_decoder.cc
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,13 @@ void StreamDecoder::onComplete(bool success) {
ASSERT(!success || complete_);
if (success && measure_latencies_) {
latency_statistic_.addValue((time_source_.monotonicTime() - request_start_).count());
// At this point StreamDecoder::decodeHeaders() should have been called.
if (stream_info_.response_code_.has_value()) {
const uint64_t latency_us = std::chrono::duration_cast<std::chrono::microseconds>(
time_source_.monotonicTime() - request_start_)
.count();
decoder_completion_callback_.exportLatency(stream_info_.response_code_.value(), latency_us);
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would a log line add value here in an else clause? That might need to guard against high frequency logging, but there might be something special going on when we'd hit that?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Will add a sparse log.

}
upstream_timing_.onLastUpstreamRxByteReceived(time_source_);
response_body_sizes_statistic_.addValue(stream_info_.bytesSent());
Expand Down
1 change: 1 addition & 0 deletions source/client/stream_decoder.h
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ class StreamDecoderCompletionCallback {
virtual ~StreamDecoderCompletionCallback() = default;
virtual void onComplete(bool success, const Envoy::Http::ResponseHeaderMap& headers) PURE;
virtual void onPoolFailure(Envoy::Http::ConnectionPool::PoolFailureReason reason) PURE;
virtual void exportLatency(const uint32_t response_code, const uint64_t latency_us) PURE;
};

// TODO(oschaaf): create a StreamDecoderPool?
Expand Down
2 changes: 1 addition & 1 deletion test/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -18,9 +18,9 @@ envoy_cc_test(
"//test/test_common:environment_lib",
"@envoy//source/common/http:header_map_lib_with_external_headers",
"@envoy//source/common/network:utility_lib_with_external_headers",
"@envoy//source/common/stats:isolated_store_lib_with_external_headers",
"@envoy//source/exe:process_wide_lib_with_external_headers",
"@envoy//test/mocks/runtime:runtime_mocks",
"@envoy//test/mocks/stats:stats_mocks",
"@envoy//test/mocks/stream_info:stream_info_mocks",
"@envoy//test/mocks/thread_local:thread_local_mocks",
"@envoy//test/mocks/upstream:upstream_mocks",
Expand Down
50 changes: 41 additions & 9 deletions test/benchmark_http_client_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -3,10 +3,10 @@
#include "external/envoy/source/common/http/header_map_impl.h"
#include "external/envoy/source/common/network/utility.h"
#include "external/envoy/source/common/runtime/runtime_impl.h"
#include "external/envoy/source/common/stats/isolated_store_impl.h"
#include "external/envoy/source/exe/process_wide.h"
#include "external/envoy/test/mocks/common.h"
#include "external/envoy/test/mocks/runtime/mocks.h"
#include "external/envoy/test/mocks/stats/mocks.h"
#include "external/envoy/test/mocks/stream_info/mocks.h"
#include "external/envoy/test/mocks/thread_local/mocks.h"
#include "external/envoy/test/mocks/upstream/mocks.h"
Expand Down Expand Up @@ -124,7 +124,7 @@ class BenchmarkClientHttpTest : public Test {

void setupBenchmarkClient() {
client_ = std::make_unique<Client::BenchmarkClientHttpImpl>(
*api_, *dispatcher_, store_, std::make_unique<StreamingStatistic>(),
*api_, *dispatcher_, mock_store_, std::make_unique<StreamingStatistic>(),
std::make_unique<StreamingStatistic>(), std::make_unique<StreamingStatistic>(),
std::make_unique<StreamingStatistic>(), false, cluster_manager_, http_tracer_, "benchmark",
request_generator_, true);
Expand All @@ -143,7 +143,8 @@ class BenchmarkClientHttpTest : public Test {
}

Envoy::Event::TestRealTimeSystem time_system_;
Envoy::Stats::IsolatedStoreImpl store_;
// deliverHistogramToSinks() is not implemented in IsolatedStoreImpl so test with a mock store.
Envoy::Stats::MockIsolatedStatsStore mock_store_;
Envoy::Api::ApiPtr api_;
Envoy::Event::DispatcherPtr dispatcher_;
Envoy::Runtime::RandomGeneratorImpl generator_;
Expand All @@ -166,45 +167,76 @@ TEST_F(BenchmarkClientHttpTest, BasicTestH1200) {
response_code_ = "200";
testBasicFunctionality(2, 3, 10);
EXPECT_EQ(5, getCounter("http_2xx"));
EXPECT_EQ(5, getCounter("total_req_sent"));
}

TEST_F(BenchmarkClientHttpTest, BasicTestH1300) {
response_code_ = "300";
testBasicFunctionality(0, 11, 10);
EXPECT_EQ(10, getCounter("http_3xx"));
EXPECT_EQ(10, getCounter("total_req_sent"));
}

TEST_F(BenchmarkClientHttpTest, BasicTestH1404) {
response_code_ = "404";
testBasicFunctionality(0, 1, 10);
EXPECT_EQ(1, getCounter("http_4xx"));
EXPECT_EQ(1, getCounter("total_req_sent"));
}

TEST_F(BenchmarkClientHttpTest, WeirdStatus) {
response_code_ = "601";
testBasicFunctionality(0, 1, 10);
EXPECT_EQ(1, getCounter("http_xxx"));
EXPECT_EQ(1, getCounter("total_req_sent"));
}

TEST_F(BenchmarkClientHttpTest, EnableLatencyMeasurement) {
setupBenchmarkClient();
EXPECT_EQ(false, client_->shouldMeasureLatencies());
EXPECT_CALL(mock_store_, deliverHistogramToSinks(_, _)).Times(0);
testBasicFunctionality(10, 1, 10);
EXPECT_EQ(0, client_->statistics()["benchmark_http_client.queue_to_connect"]->count());
EXPECT_EQ(0, client_->statistics()["benchmark_http_client.request_to_response"]->count());
EXPECT_EQ(10, getCounter("total_req_sent"));
client_->setShouldMeasureLatencies(true);
EXPECT_CALL(mock_store_, deliverHistogramToSinks(_, _)).Times(10);
testBasicFunctionality(10, 1, 10);
EXPECT_EQ(10, client_->statistics()["benchmark_http_client.queue_to_connect"]->count());
EXPECT_EQ(10, client_->statistics()["benchmark_http_client.request_to_response"]->count());
EXPECT_EQ(20, getCounter("total_req_sent"));
}

TEST_F(BenchmarkClientHttpTest, ExportSuccessLatency) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I note the test coverage for the latency metrics.

How are we testing population and propagation of the other fields on struct BenchmarkClientStatistic?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests on connect_statistic and response_statistic are already in test case EnableLatencyMeasurement. Added test for response_header_size_statistic and response_body_size_statistic there too.

setupBenchmarkClient();
uint64_t latency = 10;
EXPECT_CALL(mock_store_, deliverHistogramToSinks(Property(&Envoy::Stats::Metric::name,
"benchmark.latency_on_success_req"),
latency))
.Times(1);
EXPECT_CALL(mock_store_,
deliverHistogramToSinks(
Property(&Envoy::Stats::Metric::name, "benchmark.latency_on_error_req"), latency))
.Times(0);
client_->exportLatency(/*response_code=*/200, /*latency_us=*/latency);
}

TEST_F(BenchmarkClientHttpTest, ExportErrorLatency) {
setupBenchmarkClient();
uint64_t latency = 10;
EXPECT_CALL(mock_store_, deliverHistogramToSinks(Property(&Envoy::Stats::Metric::name,
"benchmark.latency_on_success_req"),
latency))
.Times(0);
EXPECT_CALL(mock_store_,
deliverHistogramToSinks(
Property(&Envoy::Stats::Metric::name, "benchmark.latency_on_error_req"), latency))
.Times(1);
client_->exportLatency(/*response_code=*/500, /*latency_us=*/latency);
}

TEST_F(BenchmarkClientHttpTest, StatusTrackingInOnComplete) {
auto store = std::make_unique<Envoy::Stats::IsolatedStoreImpl>();
client_ = std::make_unique<Client::BenchmarkClientHttpImpl>(
*api_, *dispatcher_, *store, std::make_unique<StreamingStatistic>(),
std::make_unique<StreamingStatistic>(), std::make_unique<StreamingStatistic>(),
std::make_unique<StreamingStatistic>(), false, cluster_manager_, http_tracer_, "foo",
request_generator_, true);
setupBenchmarkClient();
Envoy::Http::ResponseHeaderMapPtr header = Envoy::Http::ResponseHeaderMapImpl::create();

header->setStatus(1);
Expand Down
12 changes: 8 additions & 4 deletions test/integration/test_integration_basics.py
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@ def test_http_h1(http_test_server_fixture):
assertCounterEqual(counters, "upstream_rq_pending_total", 1)
assertCounterEqual(counters, "upstream_rq_total", 25)
assertCounterEqual(counters, "default.total_match_count", 1)
assertCounterEqual(counters, "benchmark.total_req_sent", 25)

global_histograms = http_test_server_fixture.getNighthawkGlobalHistogramsbyIdFromJson(parsed_json)
assertEqual(int(global_histograms["benchmark_http_client.response_body_size"]["count"]), 25)
Expand All @@ -47,7 +48,7 @@ def test_http_h1(http_test_server_fixture):
assertEqual(int(global_histograms["benchmark_http_client.response_body_size"]["raw_pstdev"]), 0)
assertEqual(int(global_histograms["benchmark_http_client.response_header_size"]["raw_pstdev"]), 0)

assertEqual(len(counters), 12)
assertEqual(len(counters), 13)


def mini_stress_test(fixture, args):
Expand Down Expand Up @@ -179,7 +180,8 @@ def test_http_h2(http_test_server_fixture):
assertCounterEqual(counters, "upstream_rq_pending_total", 1)
assertCounterEqual(counters, "upstream_rq_total", 25)
assertCounterEqual(counters, "default.total_match_count", 1)
assertEqual(len(counters), 12)
assertCounterEqual(counters, "benchmark.total_req_sent", 25)
assertEqual(len(counters), 13)


def test_http_concurrency(http_test_server_fixture):
Expand Down Expand Up @@ -224,7 +226,8 @@ def test_https_h1(https_test_server_fixture):
assertCounterEqual(counters, "ssl.sigalgs.rsa_pss_rsae_sha256", 1)
assertCounterEqual(counters, "ssl.versions.TLSv1.2", 1)
assertCounterEqual(counters, "default.total_match_count", 1)
assertEqual(len(counters), 17)
assertCounterEqual(counters, "benchmark.total_req_sent", 25)
assertEqual(len(counters), 18)

server_stats = https_test_server_fixture.getTestServerStatisticsJson()
assertEqual(
Expand Down Expand Up @@ -258,7 +261,8 @@ def test_https_h2(https_test_server_fixture):
assertCounterEqual(counters, "ssl.sigalgs.rsa_pss_rsae_sha256", 1)
assertCounterEqual(counters, "ssl.versions.TLSv1.2", 1)
assertCounterEqual(counters, "default.total_match_count", 1)
assertEqual(len(counters), 17)
assertCounterEqual(counters, "benchmark.total_req_sent", 25)
assertEqual(len(counters), 18)


def test_https_h2_multiple_connections(https_test_server_fixture):
Expand Down
Loading