-
Notifications
You must be signed in to change notification settings - Fork 93
Introduce Phases #219
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce Phases #219
Changes from 31 commits
a76a706
4a348eb
7883d59
e458ea5
e7b65c1
db1c468
731736b
79bdf33
ebf82a3
658bd48
bff40ac
d12d0d5
b5b5fd2
fb85732
3ea8055
ca49ab5
9528d92
281a4e3
8a268fe
34f649b
b31f03e
cdb10e5
3298929
62914e4
07fa643
c32ba0e
0fd8271
4dcb33b
efd8a51
128b633
c927a6e
be3f183
0bd9cfd
b81f4c5
0ab72fd
49c9344
fc998a2
29872a7
e8300c9
f3df58c
f3f3271
ba832c3
b75803e
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -37,9 +37,9 @@ class SequencerFactory { | |
| public: | ||
| virtual ~SequencerFactory() = default; | ||
| virtual SequencerPtr create(Envoy::TimeSource& time_source, Envoy::Event::Dispatcher& dispatcher, | ||
| Envoy::MonotonicTime start_time, BenchmarkClient& benchmark_client, | ||
| TerminationPredicate& termination_predicate, | ||
| Envoy::Stats::Scope& scope) const PURE; | ||
| BenchmarkClient& benchmark_client, | ||
| TerminationPredicatePtr&& termination_predicate, | ||
| Envoy::Stats::Scope& scope, const bool warmup) const PURE; | ||
| }; | ||
|
|
||
| class StoreFactory { | ||
|
|
@@ -72,8 +72,8 @@ class RequestSourceFactory { | |
| class TerminationPredicateFactory { | ||
| public: | ||
| virtual ~TerminationPredicateFactory() = default; | ||
| virtual TerminationPredicatePtr create(Envoy::TimeSource& time_source, Envoy::Stats::Scope& scope, | ||
| Envoy::MonotonicTime time) const PURE; | ||
| virtual TerminationPredicatePtr create(Envoy::TimeSource& time_source, | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (optional) If you don't mind, can we document this interface since we are touching this? Optional, since this is unrelated to this PR.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe I should just go over all the interfaces, and document everything that isn't in one go in a separate PR? I'd be happy to.
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Assuming you don't mind doing that, that would be well worth the time and would help us when reading the code base.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Filed #268 |
||
| Envoy::Stats::Scope& scope) const PURE; | ||
| }; | ||
|
|
||
| } // namespace Nighthawk | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,50 @@ | ||
|
|
||
| #pragma once | ||
|
|
||
| #include <memory> | ||
|
|
||
| #include "envoy/common/pure.h" | ||
|
|
||
| #include "nighthawk/common/sequencer.h" | ||
|
|
||
| namespace Nighthawk { | ||
|
|
||
| /** | ||
| * Phase represents a distinct phase of a benchmmark execution, like warmup and cooldown. | ||
| * A phase is associated to a sequencer, which in turn can be associated to separate termination | ||
| * and failure predicates as well as its own rate limiter policy. The end of a phase also represents | ||
| * a natural boundary for reporting a snapshot of stats and latencies associated to the phase. | ||
| * High level, a worker statically configure a vector of phases, and will transfer the hot | ||
| * connection pool when transitioning between them. At this time, nothing is stopping us from | ||
| * dynamically injecting phases later, be it via grpc calls and/or live CLI input. | ||
| */ | ||
| class Phase { | ||
| public: | ||
| virtual ~Phase() = default; | ||
|
|
||
| /** | ||
| * @return absl::string_view Contains the id of the phase. Should be unique but that is not | ||
| * enforced at this time so take care. | ||
| */ | ||
| virtual absl::string_view id() const PURE; | ||
|
|
||
| /** | ||
| * @return Sequencer& Sequencer associated to this phase. | ||
| */ | ||
| virtual Sequencer& sequencer() const PURE; | ||
|
|
||
| /** | ||
| * @return bool Indicates if latencies should be tracked for this phase. | ||
| */ | ||
| virtual bool measureLatencies() const PURE; | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (nit) The current name sounds like a verb (a command) and could be misread as a request to start measuring latencies. It helps to include the word "should" in boolean functions that are technically asking a question. How about:
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the convention is that
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Renamed to |
||
|
|
||
| /** | ||
| * Runs the sequencer associated to this phase and blocks until completion, which means this phase | ||
| * has ended as well. | ||
| */ | ||
| virtual void run() const PURE; | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Doesn't the phase need to communicate any result status / error code, etc? If there already is a way for us to examine the result, can we document it?
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm mulling this over; we can query the stats and have this decoupled, but we could probably also return the last termination predicate result here.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added a doc comment here |
||
| }; | ||
|
|
||
| using PhasePtr = std::unique_ptr<Phase>; | ||
|
|
||
| } // namespace Nighthawk | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,13 +1,18 @@ | ||
| #include "client/client_worker_impl.h" | ||
|
|
||
| #include <vector> | ||
|
|
||
| #include "external/envoy/source/common/stats/symbol_table_impl.h" | ||
|
|
||
| #include "common/request_source_impl.h" | ||
| #include "common/phase_impl.h" | ||
| #include "common/termination_predicate_impl.h" | ||
| #include "common/utility.h" | ||
|
|
||
| namespace Nighthawk { | ||
| namespace Client { | ||
|
|
||
| using namespace std::chrono_literals; | ||
|
|
||
| ClientWorkerImpl::ClientWorkerImpl(Envoy::Api::Api& api, Envoy::ThreadLocal::Instance& tls, | ||
| Envoy::Upstream::ClusterManagerPtr& cluster_manager, | ||
| const BenchmarkClientFactory& benchmark_client_factory, | ||
|
|
@@ -17,34 +22,55 @@ ClientWorkerImpl::ClientWorkerImpl(Envoy::Api::Api& api, Envoy::ThreadLocal::Ins | |
| Envoy::Stats::Store& store, const int worker_number, | ||
| const Envoy::MonotonicTime starting_time, | ||
| Envoy::Tracing::HttpTracerPtr& http_tracer) | ||
| : WorkerImpl(api, tls, store), worker_scope_(store_.createScope("cluster.")), | ||
| : WorkerImpl(api, tls, store), termination_predicate_factory_(termination_predicate_factory), | ||
| sequencer_factory_(sequencer_factory), worker_scope_(store_.createScope("cluster.")), | ||
| worker_number_scope_(worker_scope_->createScope(fmt::format("{}.", worker_number))), | ||
| worker_number_(worker_number), starting_time_(starting_time), http_tracer_(http_tracer), | ||
| request_generator_(request_generator_factory.create()), | ||
| benchmark_client_(benchmark_client_factory.create( | ||
| api, *dispatcher_, *worker_number_scope_, cluster_manager, http_tracer_, | ||
| fmt::format("{}", worker_number), *request_generator_)), | ||
| termination_predicate_( | ||
| termination_predicate_factory.create(time_source_, *worker_number_scope_, starting_time)), | ||
| sequencer_(sequencer_factory.create(time_source_, *dispatcher_, starting_time, | ||
| *benchmark_client_, *termination_predicate_, | ||
| *worker_number_scope_)) {} | ||
| fmt::format("{}", worker_number), *request_generator_)) {} | ||
|
|
||
| void ClientWorkerImpl::simpleWarmup() { | ||
| ENVOY_LOG(debug, "> worker {}: warmup start.", worker_number_); | ||
| if (benchmark_client_->tryStartRequest([this](bool, bool) { dispatcher_->exit(); })) { | ||
| dispatcher_->run(Envoy::Event::Dispatcher::RunType::RunUntilExit); | ||
| } else { | ||
| ENVOY_LOG(warn, "> worker {}: failed to initiate warmup request.", worker_number_); | ||
| } | ||
| ENVOY_LOG(debug, "> worker {}: warmup done.", worker_number_); | ||
| // We add a short warmup phase, which ends when either the first successful response | ||
| // is observed or two seconds have passed, whichever comes first. These warmup conditions are | ||
| // registered on top of the original configured predicates. | ||
| // TODO(oschaaf): allow configuration of this phase once the ramping rate limiters land, | ||
| // or a generic configuration of multiple phases (id, duration, termination predicates, rate | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (for the future) Instinctively I lean towards generic configuration of multiple phases, rather than built-in phase types, but I haven't given it as much thought as you did. Generic configuration for phases will allow users to utilize phases in ways that we didn't account for. On the other hand the warmup phase is very likely going to be fairly common. Maybe we could come up with providing some sort of templating for command line / service arguments that we could ship with the final version of phases. I.e. instead of building the warmup phase (or others) into the code we could bake it into config templates. WDYT?
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My instinct is the same; however because of the intermediate step performed here we're applying the generic phases concept in a very specific way to ensure we're doing a no-op. Hopefully we won't have hard coded build-in phases when we are done; I love the config templates idea.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Backed some of this out: we retain the old warmup call, and have a single |
||
| // limiting options, etc). | ||
| TerminationPredicatePtr warmup_predicates = | ||
| termination_predicate_factory_.create(time_source_, *worker_number_scope_); | ||
| warmup_predicates | ||
| ->appendToChain(std::make_unique<StatsCounterAbsoluteThresholdTerminationPredicateImpl>( | ||
| worker_number_scope_->counter("benchmark.http_2xx"), 0, | ||
| TerminationPredicate::Status::TERMINATE)) | ||
| .appendToChain(std::make_unique<DurationTerminationPredicateImpl>(time_source_, 2s)); | ||
| phases_.emplace_back(std::make_unique<PhaseImpl>( | ||
| "warmup", | ||
| sequencer_factory_.create(time_source_, *dispatcher_, *benchmark_client_, | ||
| std::move(warmup_predicates), *worker_number_scope_, true), | ||
| false)); | ||
| } | ||
|
|
||
| void ClientWorkerImpl::work() { | ||
| simpleWarmup(); | ||
| benchmark_client_->setMeasureLatencies(true); | ||
| sequencer_->start(); | ||
| sequencer_->waitForCompletion(); | ||
|
|
||
| phases_.emplace_back(std::make_unique<PhaseImpl>( | ||
| "main", | ||
| sequencer_factory_.create( | ||
| time_source_, *dispatcher_, *benchmark_client_, | ||
| termination_predicate_factory_.create(time_source_, *worker_number_scope_), | ||
| *worker_number_scope_, false), | ||
| true)); | ||
|
|
||
| for (auto& phase : phases_) { | ||
| benchmark_client_->setMeasureLatencies(phase->measureLatencies()); | ||
| phase->run(); | ||
| if (worker_number_scope_->counter("sequencer.failed_terminations").value() != 0) { | ||
| break; | ||
| } | ||
| } | ||
|
|
||
| // Save a final snapshot of the worker-specific counter accumulations before | ||
| // we exit the thread. | ||
| for (const auto& stat : store_.counters()) { | ||
|
|
@@ -69,7 +95,8 @@ void ClientWorkerImpl::shutdownThread() { benchmark_client_->terminate(); } | |
| StatisticPtrMap ClientWorkerImpl::statistics() const { | ||
| StatisticPtrMap statistics; | ||
| StatisticPtrMap s1 = benchmark_client_->statistics(); | ||
| StatisticPtrMap s2 = sequencer_->statistics(); | ||
| auto& sequencer = phases_.back()->sequencer(); | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we prefer to mention the Sequencer type explicitly (instead of auto) unless it is too unwieldy?
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||
| StatisticPtrMap s2 = sequencer.statistics(); | ||
| statistics.insert(s1.begin(), s1.end()); | ||
| statistics.insert(s2.begin(), s2.end()); | ||
| return statistics; | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,5 +1,7 @@ | ||
| #pragma once | ||
|
|
||
| #include <vector> | ||
|
|
||
| #include "envoy/api/api.h" | ||
| #include "envoy/event/dispatcher.h" | ||
| #include "envoy/stats/store.h" | ||
|
|
@@ -9,6 +11,7 @@ | |
| #include "nighthawk/client/benchmark_client.h" | ||
| #include "nighthawk/client/client_worker.h" | ||
| #include "nighthawk/client/factories.h" | ||
| #include "nighthawk/common/phase.h" | ||
| #include "nighthawk/common/request_source.h" | ||
| #include "nighthawk/common/sequencer.h" | ||
| #include "nighthawk/common/termination_predicate.h" | ||
|
|
@@ -34,23 +37,26 @@ class ClientWorkerImpl : public WorkerImpl, virtual public ClientWorker { | |
| const std::map<std::string, uint64_t>& thread_local_counter_values() override { | ||
| return thread_local_counter_values_; | ||
| } | ||
| const Sequencer& sequencer() const override { return *sequencer_; } | ||
|
|
||
| const std::vector<PhasePtr>& phases() const override { return phases_; } | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we document the new public method?
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This one is now renamed. Doc comments related to this reside in the interface class, does that suffice? |
||
|
|
||
| void shutdownThread() override; | ||
|
|
||
| protected: | ||
| void work() override; | ||
|
|
||
| private: | ||
| void simpleWarmup(); | ||
| const TerminationPredicateFactory& termination_predicate_factory_; | ||
| const SequencerFactory& sequencer_factory_; | ||
| Envoy::Stats::ScopePtr worker_scope_; | ||
| Envoy::Stats::ScopePtr worker_number_scope_; | ||
| const int worker_number_; | ||
| const Envoy::MonotonicTime starting_time_; | ||
| Envoy::Tracing::HttpTracerPtr& http_tracer_; | ||
| RequestSourcePtr request_generator_; | ||
| BenchmarkClientPtr benchmark_client_; | ||
| TerminationPredicatePtr termination_predicate_; | ||
| const SequencerPtr sequencer_; | ||
| std::vector<PhasePtr> phases_; | ||
| Envoy::LocalInfo::LocalInfoPtr local_info_; | ||
| std::map<std::string, uint64_t> thread_local_counter_values_; | ||
| }; | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -47,16 +47,15 @@ SequencerFactoryImpl::SequencerFactoryImpl(const Options& options) | |
|
|
||
| SequencerPtr SequencerFactoryImpl::create(Envoy::TimeSource& time_source, | ||
| Envoy::Event::Dispatcher& dispatcher, | ||
| Envoy::MonotonicTime start_time, | ||
| BenchmarkClient& benchmark_client, | ||
| TerminationPredicate& termination_predicate, | ||
| Envoy::Stats::Scope& scope) const { | ||
| TerminationPredicatePtr&& termination_predicate, | ||
| Envoy::Stats::Scope& scope, const bool warmup) const { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. is there a way this can be refactored to not require a special set of extra code for warmup? Or to rename this variable to be more general purpose?
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My goal is to wire in the phase concept as a no-op, with test coverage; my idea is that in a followup we can optionize this as generic as we want to, eg allow arbitrary phase names and configs.. so one step at a time |
||
| StatisticFactoryImpl statistic_factory(options_); | ||
| RateLimiterPtr rate_limiter = | ||
| std::make_unique<LinearRateLimiter>(time_source, Frequency(options_.requestsPerSecond())); | ||
| Frequency frequency(warmup ? 1 : options_.requestsPerSecond()); | ||
| RateLimiterPtr rate_limiter = std::make_unique<LinearRateLimiter>(time_source, frequency); | ||
| const uint64_t burst_size = options_.burstSize(); | ||
|
|
||
| if (burst_size) { | ||
| if (!warmup && burst_size) { | ||
| rate_limiter = std::make_unique<BurstingRateLimiter>(std::move(rate_limiter), burst_size); | ||
| } | ||
|
|
||
|
|
@@ -71,9 +70,9 @@ SequencerPtr SequencerFactoryImpl::create(Envoy::TimeSource& time_source, | |
| return benchmark_client.tryStartRequest(std::move(f)); | ||
| }; | ||
| return std::make_unique<SequencerImpl>( | ||
| platform_util_, dispatcher, time_source, start_time, std::move(rate_limiter), | ||
| sequencer_target, statistic_factory.create(), statistic_factory.create(), | ||
| options_.sequencerIdleStrategy(), termination_predicate, scope); | ||
| platform_util_, dispatcher, time_source, std::move(rate_limiter), sequencer_target, | ||
| statistic_factory.create(), statistic_factory.create(), options_.sequencerIdleStrategy(), | ||
| std::move(termination_predicate), scope); | ||
| } | ||
|
|
||
| StoreFactoryImpl::StoreFactoryImpl(const Options& options) : OptionBasedFactoryImpl(options) {} | ||
|
|
@@ -144,11 +143,10 @@ RequestSourcePtr RequestSourceFactoryImpl::create() const { | |
| TerminationPredicateFactoryImpl::TerminationPredicateFactoryImpl(const Options& options) | ||
| : OptionBasedFactoryImpl(options) {} | ||
|
|
||
| TerminationPredicatePtr | ||
| TerminationPredicateFactoryImpl::create(Envoy::TimeSource& time_source, Envoy::Stats::Scope& scope, | ||
| const Envoy::MonotonicTime start) const { | ||
| TerminationPredicatePtr TerminationPredicateFactoryImpl::create(Envoy::TimeSource& time_source, | ||
| Envoy::Stats::Scope& scope) const { | ||
| TerminationPredicatePtr duration_predicate = | ||
| std::make_unique<DurationTerminationPredicateImpl>(time_source, start, options_.duration()); | ||
| std::make_unique<DurationTerminationPredicateImpl>(time_source, options_.duration()); | ||
| TerminationPredicate* current_predicate = duration_predicate.get(); | ||
| current_predicate = linkConfiguredPredicates(*current_predicate, options_.failurePredicates(), | ||
| TerminationPredicate::Status::FAIL, scope); | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -399,7 +399,7 @@ bool ProcessImpl::run(OutputCollector& collector) { | |
| int i = 0; | ||
| std::chrono::nanoseconds total_execution_duration = 0ns; | ||
| for (auto& worker : workers_) { | ||
| auto sequencer_execution_duration = worker->sequencer().executionDuration(); | ||
| auto sequencer_execution_duration = worker->phases().back()->sequencer().executionDuration(); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you explain why it's safe to pick just the last phase for this? Why don't we need the duration of all of the phases?
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. See above; this is safe now because of some implicit knowledge abo this first step to land phases internally; in a follow up this has to change |
||
| // We don't write per-worker results if we only have a single worker, because the global results | ||
| // will be precisely the same. | ||
| if (workers_.size() > 1) { | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,18 @@ | ||
| #include "common/phase_impl.h" | ||
|
|
||
| namespace Nighthawk { | ||
|
|
||
| absl::string_view PhaseImpl::id() const { return id_; } | ||
|
|
||
| Sequencer& PhaseImpl::sequencer() const { return *sequencer_; } | ||
|
|
||
| bool PhaseImpl::measureLatencies() const { return measure_latencies_; } | ||
|
|
||
| void PhaseImpl::run() const { | ||
| ENVOY_LOG(trace, "starting '{}' phase", id_); | ||
| sequencer().start(); | ||
| sequencer().waitForCompletion(); | ||
| ENVOY_LOG(trace, "finished '{}' phase", id_); | ||
| } | ||
|
|
||
| } // namespace Nighthawk |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to our discussion in one of the previous PRs, how about we use an enum instead of the boolean flag argument? Looks like this boolean is used in a couple of interfaces that plumb the phases through. We might be able to reuse the enum in multiple locations.
This however makes me wonder how the end result is going to look like. If we go down the path of dynamically configured phases, what are we going to replace the boolean here with? Do we need a class carrying information about phases?
With that said - is this something we should leave to the next PR? Or would it be more beneficial to come up with a good structure now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to iterate here. My goal with this is to land a decent chunk of the changes required to implement phases as a no-op, as an intermediate step to make review more fun.
Later on we'll use them generically, we can also iterate on this in here in this PR here if you prefer. Or we can go for shipping this, and iterate on a second step in another one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I eliminated the warmup bool arg, to minimize changes that won't have future value as per earlier discussion.