add resource monitor framework#3848
Conversation
to be used by the overload manager (issue envoyproxy#373) Signed-off-by: Elisha Ziskind <eziskind@google.com>
Signed-off-by: Elisha Ziskind <eziskind@google.com>
htuch
left a comment
There was a problem hiding this comment.
This looks really solid, LGTM modulo comments.
| option go_package = "v2alpha"; | ||
|
|
||
| message FixedHeapConfig { | ||
| // Limit of the Envoy process heap size. This is used to calculate heap memory pressure which |
There was a problem hiding this comment.
Wondering if there is a more push-button way to approach this, by having Envoy infer maximum memory available in the container? We have had some recent discussions with OpenCensus on how to do memory utilization determination and the main takeaway is its hard to do cross platform in a consistent way. OTOH, freeing users from having to make this determination seems a net win if there's some reasonable approaches (e.g. can tcmalloc tell us anything?).
There was a problem hiding this comment.
Unfortunately, I couldn't find a way to dynamically get the heap limit from tcmalloc. I expect most users will want to define their own memory monitor which knows how to query the envoy container for this info and perhaps use it in combination with this or on its own.
There was a problem hiding this comment.
FWIW I agree with @eziskind on this that it will be easier to have the operator provide the limit. How that limit is obtained/enforced is going to be system specific.
|
|
||
| // Callback for handling updated usage information for this resource. | ||
| // Exactly one of 'usage' or 'error' will be non-null. | ||
| typedef std::function<void(const ResourceUsage* usage, const EnvoyException* error)> UpdateCb; |
There was a problem hiding this comment.
I think it would make sense to register both an onSuccess and onFailure callback, following the same patterns as the async clients, e.g. https://github.com/envoyproxy/envoy/blob/master/include/envoy/http/async_client.h.
| * @return Event::Dispatcher& the main thread's dispatcher. This dispatcher should be used | ||
| * for all singleton processing. | ||
| */ | ||
| virtual Event::Dispatcher& dispatcher() PURE; |
There was a problem hiding this comment.
I would just inline this and skip the context unless you anticipate this growing significantly.
| ResourceMonitorFactoryContext& context) PURE; | ||
|
|
||
| /** | ||
| * @return std::string the identifgitying name for a particular implementation of a resource |
| const Protobuf::Message& config, | ||
| Server::Configuration::ResourceMonitorFactoryContext& /*unused_context*/) { | ||
| return std::make_unique<FixedHeapMonitor>( | ||
| MessageUtil::downcastAndValidate< |
There was a problem hiding this comment.
To avoid the cast/validation boilerplate, some other extension mechanisms have a templated adapter to the extension interface. Search for "FromProtoTyped" to see examples of this. Not necessarily something needed in this PR, but could be good to add a TODO at least.
| } | ||
|
|
||
| void FixedHeapMonitor::updateResourceUsage(const Server::ResourceMonitor::UpdateCb& completionCb) { | ||
| size_t physical = Memory::Stats::totalCurrentlyReserved(); |
| TEST(FixedHeapMonitorTest, SaneUsage) { | ||
| envoy::config::resource_monitor::fixed_heap::v2alpha::FixedHeapConfig config; | ||
| config.set_max_heap_size_bytes(std::numeric_limits<uint64_t>::max()); | ||
| std::unique_ptr<FixedHeapMonitor> monitor(new FixedHeapMonitor(config)); |
There was a problem hiding this comment.
Could FixedHeapMonitor receive its memory query functions as a mockable object in its constructor? This would allow you to validate the actual math in the utilization function as well with mockable inputs.
lizan
left a comment
There was a problem hiding this comment.
You will need add a field in bootstrap.proto to actually include resource monitors in config.
| void FixedHeapMonitor::updateResourceUsage(const Server::ResourceMonitor::UpdateCb& completionCb) { | ||
| size_t physical = Memory::Stats::totalCurrentlyReserved(); | ||
| size_t unmapped = Memory::Stats::totalPageHeapUnmapped(); | ||
| size_t used = std::max<size_t>(physical - unmapped, 0); |
There was a problem hiding this comment.
size_t is unsigned, so I think this max does nothing?
Signed-off-by: Elisha Ziskind <eziskind@google.com>
htuch
left a comment
There was a problem hiding this comment.
LGTM modulo nits. @mattklein123 @lizan can you take another pass?
| name = "resource_monitor_interface", | ||
| hdrs = ["resource_monitor.h"], | ||
| deps = [ | ||
| "//source/common/protobuf", |
There was a problem hiding this comment.
I think this dependency isn't needed (based on header analysis).
| void FixedHeapMonitor::updateResourceUsage(Server::ResourceMonitor::Callbacks& callbacks) { | ||
| const size_t physical = stats_->reservedHeapBytes(); | ||
| const size_t unmapped = stats_->unmappedHeapBytes(); | ||
| const size_t used = physical - unmapped; |
There was a problem hiding this comment.
Maybe ASSERT(physical >= unmapped); to sanity check here.
|
|
||
| load( | ||
| "//bazel:envoy_build_system.bzl", | ||
| "envoy_cc_mock", |
| virtual ~MemoryStatsReader() {} | ||
|
|
||
| // Memory reserved for the process by the heap. | ||
| virtual uint64_t reservedHeapBytes(); |
There was a problem hiding this comment.
The general style in Envoy for this is to declare pure interfaces for everything and then specialize as real vs. mock. I think what you have here is fine though, so no objections, just sharing an idiom that seems fairly pervasive.
Signed-off-by: Elisha Ziskind <eziskind@google.com>
mattklein123
left a comment
There was a problem hiding this comment.
Looks great from a high level skim. Few small comments. Very excited to see this being added!
| option go_package = "v2alpha"; | ||
|
|
||
| message FixedHeapConfig { | ||
| // Limit of the Envoy process heap size. This is used to calculate heap memory pressure which |
There was a problem hiding this comment.
FWIW I agree with @eziskind on this that it will be easier to have the operator provide the limit. How that limit is obtained/enforced is going to be system specific.
| * Called when the request for updated resource usage succeeds. | ||
| * @param usage the updated resource usage | ||
| */ | ||
| virtual void onSuccess(const ResourceUsage& usage) PURE; |
There was a problem hiding this comment.
FWIW, given the size of ResourceUsage I would probably pass by value everywhere, unless you see it getting larger in the future?
There was a problem hiding this comment.
We may want to add the actual usage and limit values - useful at least for reporting as stats. So it may grow a little, but probably not much. In any event, I'd expect resource usages won't be polled so frequently that passing by reference or value will make any noticeable difference.
| * Create a particular resource monitor implementation. | ||
| * @param config const ProtoBuf::Message& supplies the config for the resource monitor | ||
| * implementation. | ||
| * @param dispatcher Event::Dispatcher& the main thread's dispatcher. This is used by |
There was a problem hiding this comment.
Optional comment: I would expect that in the future we might need to add other things here beyond the dispatcher, such as cluster manager, TLS, etc. It might be better to just create some type of Factory interface like we do for the other types of plugins? That way it's easier to add additional members to pass in the future without having to change the interface everywhere.
There was a problem hiding this comment.
Ok, adding back the ResourceMonitorFactoryContext I had originally... :)
Signed-off-by: Elisha Ziskind <eziskind@google.com>
Signed-off-by: Elisha Ziskind <eziskind@google.com>
Add an extensible resource monitor framework for monitoring resource "pressures" (usage/limit). This will be used by the overload manager to implement downstream circuit breaking (issue #373 - see design doc linked from there).
Risk Level: low (not yet used in envoy main)
Signed-off-by: Elisha Ziskind eziskind@google.com