Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
94 commits
Select commit Hold shift + click to select a range
8ea442d
Merge pull request #5 from envoyproxy/master
eric846 Jun 1, 2020
5ac755a
Merge pull request #6 from envoyproxy/master
eric846 Jun 28, 2020
b8c25a5
Merge pull request #7 from envoyproxy/master
eric846 Jul 7, 2020
1c19c68
initial commit
eric846 Jul 9, 2020
7050686
fix comments
eric846 Jul 9, 2020
0776563
fix format
eric846 Jul 9, 2020
16fd8f6
rename adaptive_rps to adaptive_load
eric846 Jul 10, 2020
c383010
add field_selector in example
eric846 Jul 10, 2020
6e1a483
fix example comment
eric846 Jul 10, 2020
4ef1140
fix format
eric846 Jul 10, 2020
4111bf4
add support for fault injection headers
eric846 Jul 10, 2020
871a959
replace linear and binary search with exponential search
eric846 Jul 10, 2020
1fd77c1
add InputVariableSetter mechanism
eric846 Jul 11, 2020
edc36b2
add input variable setter to build file
eric846 Jul 11, 2020
4d0364e
fix syntax errors
eric846 Jul 11, 2020
aed6d94
rename samples/adaptive_rps
eric846 Jul 11, 2020
d9ae87d
improve comments, change step controller initial value from int64 to …
eric846 Jul 12, 2020
a05a6f5
add proto validation rules, fix comments, make rps the default input_…
eric846 Jul 13, 2020
8cd4d21
fix comment wording
eric846 Jul 13, 2020
d814a96
simplify protos, add defaults, specify required or optional
eric846 Jul 14, 2020
5f5a885
add missing newline
eric846 Jul 14, 2020
7e20a78
Kick CI
eric846 Jul 14, 2020
9048267
simplify protos
eric846 Jul 15, 2020
306c0ec
fix format
eric846 Jul 15, 2020
d33f543
fix some optional field comments and rules
eric846 Jul 15, 2020
442cca9
Merge pull request #10 from envoyproxy/master
eric846 Jul 16, 2020
677b783
add Nighthawk status field in BenchmarkResult as nested nighthawk.cli…
eric846 Jul 19, 2020
cefb366
switch to standard Envoy plugin config proto, add prefix to internal …
eric846 Jul 22, 2020
f3684df
Merge remote-tracking branch 'upstream/master' into adaptive-rps-protos2
eric846 Jul 22, 2020
5463051
create headers
eric846 Jul 22, 2020
46e0e25
fix format
eric846 Jul 22, 2020
f634642
use docstring format
eric846 Jul 22, 2020
3c39faa
fix typos in comments
eric846 Jul 23, 2020
b9c8f2b
split build target, get rid of ostream, change InputValueSetter to us…
eric846 Jul 24, 2020
5fc4db4
remove nested namespace, remove redundant _include in target names
eric846 Jul 26, 2020
64e7852
merge from upstream
eric846 Jul 29, 2020
12807f1
Merge remote-tracking branch 'upstream/master' into adaptive-rps-headers
eric846 Jul 29, 2020
e8e960f
merge from upstream
eric846 Aug 27, 2020
7a5cc6d
initial commit: MetricsEvaluator library
eric846 Aug 27, 2020
2090763
add class comment
eric846 Aug 27, 2020
c8dee61
fix format
eric846 Aug 27, 2020
b1e8ea8
fix comments
eric846 Aug 27, 2020
f0595f7
remove unused includes, try to fix strange clang-tidy-only compilatio…
eric846 Aug 27, 2020
6306b4e
Merge remote-tracking branch 'upstream/master' into master2
eric846 Aug 27, 2020
1ece783
Merge remote-tracking branch 'upstream/master' into master2
eric846 Aug 28, 2020
ee1bf99
Merge branch 'master2' into adaptive-rps-metric-evaluation
eric846 Aug 28, 2020
d61db72
fix clang-tidy
eric846 Aug 28, 2020
283965f
fix clang-tidy: move some includes to impl
eric846 Aug 28, 2020
ea9f562
change ExtractMetricSpecs output parameters to returned pair
eric846 Aug 29, 2020
70705e9
Merge remote-tracking branch 'upstream/master' into master2
eric846 Aug 31, 2020
e576bc1
Merge remote-tracking branch 'upstream/master' into master2
eric846 Sep 1, 2020
1fca528
Merge remote-tracking branch 'upstream/master' into master2
eric846 Sep 3, 2020
f663975
rename unit tests, fix compile error
eric846 Sep 3, 2020
f367120
make ExtractMetricSpecs return const containers
eric846 Sep 3, 2020
93f42dc
Merge branch 'master2' into adaptive-rps-metric-evaluation
eric846 Sep 3, 2020
d2e502f
remove bazelrc
eric846 Sep 3, 2020
8eae526
initial commit - session spec proto helper and FakeMetricsPlugin helper
eric846 Sep 3, 2020
a10b625
rename SetDefaults
eric846 Sep 3, 2020
f2dccb2
fix clang-tidy
eric846 Sep 3, 2020
ed32856
Merge remote-tracking branch 'upstream/master' into master2
eric846 Sep 3, 2020
eecf00d
Merge remote-tracking branch 'upstream/master' into master2
eric846 Sep 8, 2020
788fa07
Merge branch 'master2' into adaptive-rps-metric-evaluation
eric846 Sep 8, 2020
ce68373
Merge branch 'master2' into adaptive-rps-spec-validation
eric846 Sep 8, 2020
34b81da
return vector of pairs from ExtractMetricSpecs
eric846 Sep 8, 2020
07493ae
remove accidental comment
eric846 Sep 8, 2020
de71ab5
Merge branch 'adaptive-rps-spec-validation' into adaptive-rps-main-lo…
eric846 Sep 8, 2020
5362193
initial commit, not passing
eric846 Sep 9, 2020
edc82db
improve comments
eric846 Sep 9, 2020
bc820f1
tests passing
eric846 Sep 10, 2020
6e5699b
Merge branch 'adaptive-rps-metric-evaluation' into adaptive-rps-main-…
eric846 Sep 10, 2020
13179fb
Merge remote-tracking branch 'upstream/master' into master2
eric846 Sep 10, 2020
3da5916
merge from upstream
eric846 Sep 10, 2020
3eb89e6
Merge remote-tracking branch 'upstream/master' into master2
eric846 Sep 11, 2020
41b3541
merge from upstream
eric846 Sep 11, 2020
f8d96dc
fix comments, remove unused include
eric846 Sep 11, 2020
1c8e98c
fix spec defaults test
eric846 Sep 11, 2020
070c04d
add const to adaptive load helper interface methods, clean up deps an…
eric846 Sep 11, 2020
1110cf2
add mocks for adaptive load helpers
eric846 Sep 11, 2020
fae60ad
Merge remote-tracking branch 'upstream/master' into master2
eric846 Sep 12, 2020
12852c6
Merge branch 'master2' into adaptive-rps-mocks
eric846 Sep 12, 2020
80578fa
fix typo
eric846 Sep 12, 2020
50bc113
add comments to mock classes
eric846 Sep 14, 2020
d0c1e0b
use real spec proto helper
eric846 Sep 14, 2020
9fec200
Merge remote-tracking branch 'upstream/master' into master2
eric846 Sep 14, 2020
27dbf6f
Merge branch 'master2' into adaptive-rps-main-loop-short
eric846 Sep 14, 2020
d22ca48
fix comments
eric846 Sep 14, 2020
5918b25
handle chrono-duration in more standard way, remove using namespace
eric846 Sep 15, 2020
1882f2d
fix asan
eric846 Sep 15, 2020
9d1cf53
Merge remote-tracking branch 'upstream/master' into master2
eric846 Sep 15, 2020
63f5c5e
Merge branch 'master2' into adaptive-rps-main-loop-short
eric846 Sep 15, 2020
170909f
improve comments, simplify main loop by moving UpdateAndRecompute int…
eric846 Sep 19, 2020
cfad844
Merge remote-tracking branch 'upstream/master' into master2
eric846 Sep 19, 2020
cf669b2
Merge branch 'master2' into adaptive-rps-main-loop-short
eric846 Sep 19, 2020
4d009ae
add more error logging
eric846 Sep 19, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
51 changes: 30 additions & 21 deletions include/nighthawk/adaptive_load/adaptive_load_controller.h
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
#pragma once

#include "envoy/common/time.h"
#include "envoy/common/pure.h"

#include "external/envoy/source/common/common/statusor.h"

Expand All @@ -10,26 +10,35 @@
namespace Nighthawk {

/**
* Performs an adaptive load session consisting of the Adjusting Stage and the
* Testing Stage. Adjusting Stage: Runs a series of short benchmarks, checks metrics according to
* MetricSpecs, and adjusts load up or down based on the result; returns an error if convergence is
* not detected before the deadline in the spec. Load adjustments and convergence detection are
* computed by a StepController plugin. Metric values are obtained through MetricsPlugins. Testing
* Stage: When the optimal load is found, runs one long benchmark to validate it.
*
* @param nighthawk_service_stub A Nighthawk Service gRPC stub.
* @param spec A proto that defines all aspects of the adaptive load session, including metrics,
* threshold, duration of adjusting stage benchmarks, and underlying Nighthawk traffic parameters.
* @param time_source An abstraction of the system clock. Normally, just construct an
* Envoy::Event::RealTimeSystem and pass it. NO_CHECK_FORMAT(real_time). If calling from an
* Envoy-based process, there may be an existing TimeSource or TimeSystem to use. If calling
* from a test, pass a fake TimeSource.
*
* @return StatusOr<AdaptiveLoadSessionOutput> A proto logging the result of all traffic attempted
* and all corresponding metric values and scores, or an overall error status if the session failed.
* Contains the main loop of the adaptive load controller. Consults a StepController for load
* decisions, interacts with Nighthawk Service and MetricsPlugins.
*/
absl::StatusOr<nighthawk::adaptive_load::AdaptiveLoadSessionOutput> PerformAdaptiveLoadSession(
nighthawk::client::NighthawkService::StubInterface* nighthawk_service_stub,
const nighthawk::adaptive_load::AdaptiveLoadSessionSpec& spec, Envoy::TimeSource& time_source);
class AdaptiveLoadController {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be renamed to interface now, since we now a pure methods.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The convention for Nighthawk seems to be that everything under include should be pure but no existing interfaces have Interface in their name.

public:
virtual ~AdaptiveLoadController() = default;
/**
* Performs an adaptive load session consisting of the Adjusting Stage and the
* Testing Stage.
*
* Adjusting Stage: Runs a series of short benchmarks, checks metrics according to MetricSpecs,
* and adjusts load up or down based on the result. Returns an error if convergence is not
* detected before the deadline in the spec. Load adjustments and convergence detection are
* computed by a StepController plugin. Metric values are obtained through MetricsPlugins.
*
* Testing Stage: When the optimal load is found, runs one long benchmark to validate it.
*
* @param nighthawk_service_stub A Nighthawk Service gRPC stub.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Maybe stub is more appropriate as a member in the class than an argument to the method. I don't think we expect it to change on a multiple call scenario.

Copy link
Copy Markdown
Contributor Author

@eric846 eric846 Sep 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class is intended to be stateless helpers and it's only a class to make mocking easier. We will have a chance to design a more convenient client that might be stateful when we merge RemoteProcessImpl and NighthawkServiceClientImpl.

* @param spec A proto that defines all aspects of the adaptive load session, including metrics,
* threshold, duration of adjusting stage benchmarks, and underlying Nighthawk traffic parameters.
*
* @return StatusOr<AdaptiveLoadSessionOutput> A proto logging the result of all traffic attempted
* and all corresponding metric values and scores, or an overall error status if the session
* failed.
*/
virtual absl::StatusOr<nighthawk::adaptive_load::AdaptiveLoadSessionOutput>
PerformAdaptiveLoadSession(
nighthawk::client::NighthawkService::StubInterface* nighthawk_service_stub,
const nighthawk::adaptive_load::AdaptiveLoadSessionSpec& spec) PURE;
};

} // namespace Nighthawk
24 changes: 24 additions & 0 deletions source/adaptive_load/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,30 @@ licenses(["notice"]) # Apache 2

envoy_package()

envoy_cc_library(
name = "adaptive_load_controller_impl",
srcs = [
"adaptive_load_controller_impl.cc",
],
hdrs = [
"adaptive_load_controller_impl.h",
],
repository = "@envoy",
visibility = ["//visibility:public"],
deps = [
":metrics_plugin_impl",
":plugin_loader",
"//api/adaptive_load:adaptive_load_proto_cc_proto",
"//api/client:base_cc_proto",
"//include/nighthawk/adaptive_load:adaptive_load_controller",
"//include/nighthawk/adaptive_load:metrics_evaluator",
"//include/nighthawk/adaptive_load:scoring_function",
"//include/nighthawk/adaptive_load:session_spec_proto_helper",
"//include/nighthawk/adaptive_load:step_controller",
"//include/nighthawk/common:nighthawk_service_client",
],
)

envoy_cc_library(
name = "config_validator_impl",
srcs = [
Expand Down
205 changes: 205 additions & 0 deletions source/adaptive_load/adaptive_load_controller_impl.cc
Original file line number Diff line number Diff line change
@@ -0,0 +1,205 @@
#include "adaptive_load/adaptive_load_controller_impl.h"

#include <chrono>

#include "envoy/common/exception.h"
#include "envoy/config/core/v3/base.pb.h"

#include "nighthawk/adaptive_load/adaptive_load_controller.h"
#include "nighthawk/adaptive_load/metrics_plugin.h"
#include "nighthawk/adaptive_load/scoring_function.h"
#include "nighthawk/adaptive_load/step_controller.h"

#include "external/envoy/source/common/common/logger.h"
#include "external/envoy/source/common/common/statusor.h"
#include "external/envoy/source/common/protobuf/protobuf.h"

#include "api/adaptive_load/adaptive_load.pb.h"
#include "api/adaptive_load/benchmark_result.pb.h"
#include "api/adaptive_load/metric_spec.pb.h"
#include "api/client/options.pb.h"
#include "api/client/output.pb.h"
#include "api/client/service.grpc.pb.h"

#include "absl/container/flat_hash_map.h"
#include "absl/status/status.h"
#include "absl/strings/str_format.h"
#include "absl/strings/str_join.h"
#include "adaptive_load/metrics_plugin_impl.h"
#include "adaptive_load/plugin_loader.h"

namespace Nighthawk {

namespace {

using nighthawk::adaptive_load::AdaptiveLoadSessionOutput;
using nighthawk::adaptive_load::AdaptiveLoadSessionSpec;
using nighthawk::adaptive_load::BenchmarkResult;
using nighthawk::adaptive_load::MetricEvaluation;
using nighthawk::adaptive_load::MetricSpec;
using nighthawk::adaptive_load::MetricSpecWithThreshold;
using nighthawk::adaptive_load::ThresholdSpec;

/**
* Loads and initializes MetricsPlugins requested in the session spec. Assumes the spec has already
* been validated; crashes the process otherwise.
*
* @param spec Adaptive load session spec proto that has already been validated.
*
* @return Map from MetricsPlugin names to initialized plugins, to be used in the course of a single
* adaptive load session based on the session spec.
*/
absl::flat_hash_map<std::string, MetricsPluginPtr>
LoadMetricsPlugins(const AdaptiveLoadSessionSpec& spec) {
absl::flat_hash_map<std::string, MetricsPluginPtr> name_to_custom_metrics_plugin_map;
for (const envoy::config::core::v3::TypedExtensionConfig& config :
spec.metrics_plugin_configs()) {
absl::StatusOr<MetricsPluginPtr> metrics_plugin_or = LoadMetricsPlugin(config);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: usually it's better to perform the load right after validations and pass in the metric plugins to avoid the assert. No action needed if it's too much of code refactoring.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, it's a bit awkward for historical reasons -- filed #547

RELEASE_ASSERT(
metrics_plugin_or.ok(),
absl::StrCat(
"MetricsPlugin loading error should have been caught during input validation: ",
metrics_plugin_or.status().message()));
name_to_custom_metrics_plugin_map[config.name()] = std::move(metrics_plugin_or.value());
}
return name_to_custom_metrics_plugin_map;
}

/**
* Loads and initializes a StepController plugin requested in the session spec. Assumes
* the spec has already been validated; crashes the process otherwise.
*
* @param spec Adaptive load session spec proto that has already been validated.
*
* @return unique_ptr<StepController> Initialized StepController plugin.
*/
StepControllerPtr LoadStepControllerPluginFromSpec(const AdaptiveLoadSessionSpec& spec) {
absl::StatusOr<StepControllerPtr> step_controller_or =
LoadStepControllerPlugin(spec.step_controller_config(), spec.nighthawk_traffic_template());
RELEASE_ASSERT(
step_controller_or.ok(),
absl::StrCat(
"StepController plugin loading error should have been caught during input validation: ",
step_controller_or.status().message()));
return std::move(step_controller_or.value());
}

} // namespace

AdaptiveLoadControllerImpl::AdaptiveLoadControllerImpl(
const NighthawkServiceClient& nighthawk_service_client,
const MetricsEvaluator& metrics_evaluator,
const AdaptiveLoadSessionSpecProtoHelper& session_spec_proto_helper,
Envoy::TimeSource& time_source)
: nighthawk_service_client_{nighthawk_service_client}, metrics_evaluator_{metrics_evaluator},
session_spec_proto_helper_{session_spec_proto_helper}, time_source_{time_source} {}

absl::StatusOr<BenchmarkResult> AdaptiveLoadControllerImpl::PerformAndAnalyzeNighthawkBenchmark(
nighthawk::client::NighthawkService::StubInterface* nighthawk_service_stub,
const AdaptiveLoadSessionSpec& spec,
const absl::flat_hash_map<std::string, MetricsPluginPtr>& name_to_custom_plugin_map,
StepController& step_controller, Envoy::ProtobufWkt::Duration duration) {
absl::StatusOr<nighthawk::client::CommandLineOptions> command_line_options_or =
step_controller.GetCurrentCommandLineOptions();
if (!command_line_options_or.ok()) {
ENVOY_LOG_MISC(error, "Error constructing Nighthawk input: {}: {}",
command_line_options_or.status().code(),
command_line_options_or.status().message());
return command_line_options_or.status();
}
nighthawk::client::CommandLineOptions command_line_options = command_line_options_or.value();
// Overwrite the duration in the traffic template with the specified duration of the adjusting
// or testing stage.
*command_line_options.mutable_duration() = std::move(duration);

ENVOY_LOG_MISC(info, "Sending load: {}", command_line_options.DebugString());
absl::StatusOr<nighthawk::client::ExecutionResponse> nighthawk_response_or =
nighthawk_service_client_.PerformNighthawkBenchmark(nighthawk_service_stub,
command_line_options);
if (!nighthawk_response_or.ok()) {
ENVOY_LOG_MISC(error, "Nighthawk Service error: {}: {}", nighthawk_response_or.status().code(),
nighthawk_response_or.status().message());
return nighthawk_response_or.status();
}
nighthawk::client::ExecutionResponse nighthawk_response = nighthawk_response_or.value();

absl::StatusOr<BenchmarkResult> benchmark_result_or =
metrics_evaluator_.AnalyzeNighthawkBenchmark(nighthawk_response, spec,
name_to_custom_plugin_map);
if (!benchmark_result_or.ok()) {
ENVOY_LOG_MISC(error, "Benchmark scoring error: {}: {}", benchmark_result_or.status().code(),
benchmark_result_or.status().message());
return benchmark_result_or.status();
}
BenchmarkResult benchmark_result = benchmark_result_or.value();
for (const MetricEvaluation& evaluation : benchmark_result.metric_evaluations()) {
ENVOY_LOG_MISC(info, "Evaluation: {}", evaluation.DebugString());
}
step_controller.UpdateAndRecompute(benchmark_result);
return benchmark_result;
}

absl::StatusOr<AdaptiveLoadSessionOutput> AdaptiveLoadControllerImpl::PerformAdaptiveLoadSession(
nighthawk::client::NighthawkService::StubInterface* nighthawk_service_stub,
const AdaptiveLoadSessionSpec& input_spec) {
AdaptiveLoadSessionSpec spec = session_spec_proto_helper_.SetSessionSpecDefaults(input_spec);
absl::Status validation_status = session_spec_proto_helper_.CheckSessionSpec(spec);
if (!validation_status.ok()) {
ENVOY_LOG_MISC(error, "Validation failed: {}", validation_status.message());
return validation_status;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we log reasons of failures.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}
absl::flat_hash_map<std::string, MetricsPluginPtr> name_to_custom_metrics_plugin_map =
LoadMetricsPlugins(spec);
StepControllerPtr step_controller = LoadStepControllerPluginFromSpec(spec);
AdaptiveLoadSessionOutput output;

// Threshold specs are reproduced in the output proto for convenience.
for (const nighthawk::adaptive_load::MetricSpecWithThreshold& threshold :
spec.metric_thresholds()) {
*output.mutable_metric_thresholds()->Add() = threshold;
}

// Perform adjusting stage:
Envoy::MonotonicTime start_time = time_source_.monotonicTime();
std::string doom_reason;
do {
absl::StatusOr<BenchmarkResult> result_or = PerformAndAnalyzeNighthawkBenchmark(
nighthawk_service_stub, spec, name_to_custom_metrics_plugin_map, *step_controller,
spec.measuring_period());
if (!result_or.ok()) {
return result_or.status();
}
BenchmarkResult result = result_or.value();
*output.mutable_adjusting_stage_results()->Add() = result;

const std::chrono::nanoseconds time_limit_ns(
Envoy::Protobuf::util::TimeUtil::DurationToNanoseconds(spec.convergence_deadline()));
const auto elapsed_time_ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
time_source_.monotonicTime() - start_time);
if (elapsed_time_ns > time_limit_ns) {
std::string message = absl::StrFormat("Failed to converge before deadline of %.2f seconds.",
time_limit_ns.count() / 1e9);
ENVOY_LOG_MISC(error, message);
return absl::DeadlineExceededError(message);
}
} while (!step_controller->IsConverged() && !step_controller->IsDoomed(doom_reason));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually while loops are more readable than do/while. Is there a reason not to use a simple while loop here?

Copy link
Copy Markdown
Contributor Author

@eric846 eric846 Sep 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had discussed this, I think the motivation for do/while is that IsConverged and IsDoomed don't have meaningful answers until the loop body has executed.


if (step_controller->IsDoomed(doom_reason)) {
std::string message =
absl::StrCat("Step controller determined that it can never converge: ", doom_reason);
ENVOY_LOG_MISC(error, message);
return absl::AbortedError(message);
}

// Perform testing stage:
absl::StatusOr<BenchmarkResult> result_or = PerformAndAnalyzeNighthawkBenchmark(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you comment on why calling this again after complete.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done (this is the testing stage)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW we have #539 to consider making the testing stage an optional thing, decided by the StepController, so this special call might go away.

nighthawk_service_stub, spec, name_to_custom_metrics_plugin_map, *step_controller,
spec.testing_stage_duration());
if (!result_or.ok()) {
return result_or.status();
}
*output.mutable_testing_stage_result() = result_or.value();
return output;
}

} // namespace Nighthawk
83 changes: 83 additions & 0 deletions source/adaptive_load/adaptive_load_controller_impl.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,83 @@
#include "envoy/common/time.h"

#include "nighthawk/adaptive_load/adaptive_load_controller.h"
#include "nighthawk/adaptive_load/metrics_evaluator.h"
#include "nighthawk/adaptive_load/session_spec_proto_helper.h"
#include "nighthawk/adaptive_load/step_controller.h"
#include "nighthawk/common/nighthawk_service_client.h"

namespace Nighthawk {

class AdaptiveLoadControllerImpl : public AdaptiveLoadController {
public:
/**
* Constructs an implementation of the adaptive load controller main loop that relies on logic in
* several helper objects. Through helpers, it performs Nighthawk Service benchmarks, obtains
* metrics from MetricsPlugins, scores the results, and consults a StepController plugin to
* determine the next load and detect convergence. All plugins are specified through the
* AdaptiveLoadSessionSpec proto.
*
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment on the thread safety? Can you run concurrent sessions?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

* This class is thread-safe, but Nighthawk Service itself does not support multiple simultaneous
* benchmarks.
*
* Usage:
*
* AdaptiveLoadControllerImpl controller(
* NighthawkServiceClientImpl(),
* MetricsEvaluatorImpl(),
* AdaptiveLoadSessionSpecProtoHelperImpl(),
* Envoy::Event::RealTimeSystem()); // NO_CHECK_FORMAT(real_time))
* AdaptiveLoadSessionSpec spec;
* // (set spec fields here)
* StatusOr<AdaptiveLoadSessionOutput> output =
* controller.PerformAdaptiveLoadSession(&nighthawk_service_stub, spec);
*
* @param nighthawk_service_client A helper that executes Nighthawk Service benchmarks given a
* gRPC stub.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason behing seperating the client from the stub?

Copy link
Copy Markdown
Contributor Author

@eric846 eric846 Sep 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for historical reasons: We had a helper that made it easier to call the stub, then moved the helper into its own class because it's only easy to mock out methods when they are in a class.

I'm not sure what will happen with this, but we may add a more convenient Nighthawk Service client wrapper object that contains the stub, and then take this wrapper instead. Or the stub might be the best thing for the public API to take.

* @param metrics_evaluator A helper that obtains metrics from MetricsPlugins and Nighthawk
* Service responses, then scores them.
* @param session_spec_proto_helper A helper that sets default values and performs validation in
* an AdaptiveLoadSessionSpec proto.
* @param time_source An abstraction of the system clock. Normally, just construct an
* Envoy::Event::RealTimeSystem and pass it. NO_CHECK_FORMAT(real_time). If calling from an
* Envoy-based process, there may be an existing TimeSource or TimeSystem to use. If calling
* from a test, pass a fake TimeSource.
*/
AdaptiveLoadControllerImpl(const NighthawkServiceClient& nighthawk_service_client,
const MetricsEvaluator& metrics_evaluator,
const AdaptiveLoadSessionSpecProtoHelper& session_spec_proto_helper,
Envoy::TimeSource& time_source);

absl::StatusOr<nighthawk::adaptive_load::AdaptiveLoadSessionOutput> PerformAdaptiveLoadSession(
nighthawk::client::NighthawkService::StubInterface* nighthawk_service_stub,
const nighthawk::adaptive_load::AdaptiveLoadSessionSpec& spec) override;

private:
/**
* Gets the current load from the StepController, performs a benchmark via a Nighthawk Service,
* hands the result off for analysis, and reports the scores back to the StepController.
*
* @param nighthawk_service_stub Nighthawk Service gRPC stub.
* @param spec Proto describing the overall adaptive load session.
* @param name_to_custom_plugin_map Common map from plugin names to MetricsPlugins loaded and
* initialized once at the beginning of the session and passed to all calls of this function.
* @param step_controller The active StepController specified in the session spec proto.
* @param duration The duration of the benchmark to insert into the traffic template, different
* between adjusting and testing stages.
*
* @return BenchmarkResult Proto containing either an error status or raw Nighthawk Service
* results, metric values, and metric scores.
*/
absl::StatusOr<nighthawk::adaptive_load::BenchmarkResult> PerformAndAnalyzeNighthawkBenchmark(
nighthawk::client::NighthawkService::StubInterface* nighthawk_service_stub,
const nighthawk::adaptive_load::AdaptiveLoadSessionSpec& spec,
const absl::flat_hash_map<std::string, MetricsPluginPtr>& name_to_custom_plugin_map,
StepController& step_controller, Envoy::ProtobufWkt::Duration duration);

const NighthawkServiceClient& nighthawk_service_client_;
const MetricsEvaluator& metrics_evaluator_;
const AdaptiveLoadSessionSpecProtoHelper& session_spec_proto_helper_;
Envoy::TimeSource& time_source_;
};

} // namespace Nighthawk
Loading