Introduction
This proposal presents a technical solution that, under the specific conditions discussed in the following sections, can guarantee that heavy-load execution requests won’t be executed twice.
Requirements and scope
Requirements and Assumptions
We are assuming here that only one instance of GeoServer will be responsible for the process scheduling and execution.
In other words, we assume that the current architecture does not include a cluster of GeoServer instances, but only one instance at a time, while a cluster of processing machines can be used to offload the processing tasks, as per the image below.
Technical Solution
The technical proposal is to allow the remote processing endpoints to decide whether or not they can execute a certain process. Through the process configurations, we already know exactly which processes can be managed by each endpoint, as well as their possible inputs and input types.
The idea is to assign each processing node a capacity (an integer from 1 to 100), set by the system administrator, which represents a measure of the processing resources on the node; this value can be updated at runtime to refine the estimate.
We would also have each process declare its weight, an integer from 1 to 100 that is set by the developer and represents a rough measure of the processing resources needed by a standard execution of the process.
Each process will also be able to adjust its weight dynamically, based on the inputs of the execution request, to account for different inputs.
On a processing node we also account for:
Blacklisted processes. If a blacklisted process is running, no other process will be scheduled until the blacklisted process finishes.
Load Average. If the average CPU and memory load over the last XXX minutes on a node stays above a configurable threshold then no other process can be run on that node.
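The two node-level gates above can be sketched as follows. This is only an illustration: the `LoadMonitor` class, its sample window and the way samples are fed in are assumptions, not part of the proposal, and the XXX-minute window is represented here by a fixed number of samples.

```python
from collections import deque


class LoadMonitor:
    """Hypothetical helper: keeps the most recent load samples of a node."""

    def __init__(self, window_size, load_threshold):
        # window_size stands in for "the last XXX minutes" (one sample per tick)
        self.samples = deque(maxlen=window_size)
        self.load_threshold = load_threshold

    def add_sample(self, load_percent):
        # load_percent: combined CPU/memory load in the range [0, 100]
        self.samples.append(load_percent)

    def load_average(self):
        if not self.samples:
            return 0.0
        return sum(self.samples) / len(self.samples)

    def accepts_new_process(self, black_listed_running_processes):
        # gate 1: a running blacklisted process blocks any new process
        if black_listed_running_processes:
            return False
        # gate 2: the load average must not be above the threshold
        return self.load_average() <= self.load_threshold


monitor = LoadMonitor(window_size=3, load_threshold=80)
for sample in (70, 90, 95):
    monitor.add_sample(sample)
print(monitor.load_average())            # 85.0
print(monitor.accepts_new_process([]))   # False, since 85.0 > 80
```

Because the deque has a fixed maximum length, old samples drop out automatically, which keeps the average tied to the most recent window only.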
When GeoServer, upon a WPS Execute request, searches for a suitable processing node to execute a certain process, it will ask all available nodes, in round-robin order, whether they can take care of the execution; each node will compare its residual capacity with the weight of the request and decide whether it can execute it.
The residual capacity will be kept updated on the local node as the difference between the initial capacity and the sum of the weights of the processes running on that node. If the residual capacity is greater than or equal to the weight required to run the new process (possibly adjusted to take into account the inputs of the specific request), the new process will run. Otherwise, we will check the next processing node.
If at a certain point in time no processing node has enough residual capacity to run a WPS process, GeoServer will return an exception to the requestor.
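A minimal sketch of the GeoServer-side dispatch loop described above, written in Python to match the rest of this proposal. The `ProcessingNode` class, its `try_execute` method, the `Dispatcher` class and the `NoCapacityError` exception are hypothetical names; the real exchange with the remote endpoints is not shown.

```python
class NoCapacityError(Exception):
    """Hypothetical exception returned to the requestor."""


class ProcessingNode:
    """Hypothetical stand-in for a remote processing endpoint."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.load = 0  # sum of the weights of the running processes

    def try_execute(self, request_weight):
        residual_capacity = self.capacity - self.load
        if residual_capacity <= 0 or residual_capacity < request_weight:
            return False  # the node refuses, the caller tries the next one
        self.load += request_weight
        return True


class Dispatcher:
    def __init__(self, nodes):
        self.nodes = nodes
        self.next_index = 0  # where the next round-robin scan starts

    def dispatch(self, request_weight):
        count = len(self.nodes)
        for offset in range(count):
            index = (self.next_index + offset) % count
            node = self.nodes[index]
            if node.try_execute(request_weight):
                self.next_index = (index + 1) % count
                return node.name
        raise NoCapacityError("no processing node has enough residual capacity")


dispatcher = Dispatcher([ProcessingNode("node-a", 50), ProcessingNode("node-b", 30)])
print(dispatcher.dispatch(40))  # node-a (residual capacity 50 >= 40)
print(dispatcher.dispatch(20))  # node-b (scan starts at node-b, 30 >= 20)
print(dispatcher.dispatch(10))  # node-a (residual capacity 10 >= 10)
```

Starting each scan just after the node that accepted the previous request is one possible reading of "round-robin"; it spreads requests across nodes instead of always filling the first one.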
From an implementation point of view, we will add a “template class” that describes a process weight and allows a user to assign, statically or dynamically, a coefficient in order to tune the capacity estimations.
Implementation Details
ProcessWeight template class
A template class “ProcessWeight” is used to describe the process weights. The weights are estimations that will impact the remote processing machine residual capacity.
class ProcessWeight:
    process_id = ""
    weight = 1          # integer in the range [1, 100]
    coefficient = 1.0

    # ability to customize the process load on a per-request basis
    def request_weight(self, exec_request):
        # this is the default implementation; it ignores exec_request
        return self.coefficient * self.weight
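As an illustration of the per-request customization, a subclass could scale the weight with the size of an input. Everything here beyond `ProcessWeight` and `request_weight` is an assumption: the `ResampleWeight` name, the dictionary shape of `exec_request` and the `pixels` key are made up for the example, and `ProcessWeight` is restated in runnable form so that the block stands alone.

```python
class ProcessWeight:
    process_id = ""
    weight = 1          # integer in the range [1, 100]
    coefficient = 1.0

    def request_weight(self, exec_request):
        # default implementation: the request is ignored
        return self.coefficient * self.weight


class ResampleWeight(ProcessWeight):
    """Hypothetical process whose cost grows with the raster size."""

    process_id = "resample"
    weight = 20

    def request_weight(self, exec_request):
        base = super().request_weight(exec_request)
        # assumed request shape: {"pixels": <number of pixels to process>}
        megapixels = exec_request.get("pixels", 0) / 1_000_000
        # add one unit of weight per 10 megapixels, capped at 100
        return min(100, base + megapixels / 10)


pw = ResampleWeight()
print(pw.request_weight({"pixels": 50_000_000}))  # 25.0
print(pw.request_weight({}))                      # 20.0
```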
Global variables on the remote processing machine
Note that service_processor.py, the main daemon running on each remote processing machine and orchestrating the execution of the remote processes, keeps the following global variables synchronized.
# total capacity of this node, in the range [0, 100]; it can be updated at
# runtime by editing the node configuration
capacity = 100  # example value
# current load on this node; it is updated when a process starts or ends
load = 0
# cpu_usage and available_mem over the last XXX minutes, in the range [0, 100]
load_average = 0  # example value
# configurable threshold, in the range [0, 100]
load_threshold = 100  # example value
# a list of names the processing machine checks before starting a new process
black_listed_running_processes = [...]
Execution pseudo-logic on each processing machine
def execute(execute_request):
    global load  # the node-wide load is updated below
    # if the load average is above the threshold we cannot run another process
    # and we should answer with a proper message
    if load_average > load_threshold:
        return -1
    # if at least one blacklisted process is running there is no available
    # capacity to run another process and we should answer with a proper message
    if len(black_listed_running_processes) > 0:
        return -1
    # compute the residual capacity
    residual_capacity = capacity - load
    if residual_capacity <= 0:
        return -1  # no residual capacity left, no execution
    # we have capacity, is it enough?
    request_load = PCk.request_weight(execute_request)
    if residual_capacity >= request_load:
        # ok we can execute: update the load, execute and return
        load += request_load
        return PCk.execute(execute_request)
    # residual capacity is not enough; skip to the next remote endpoint
    return -1
As stated above, this pseudocode runs in a synchronized block, since processes can die while we try to run new ones.
The service processor daemon already provides for recovery both from process exceptions, whenever a remote process throws an error for any reason, and from potential deadlocks: configuration variables already allow the administrator to kill a process that does not send any feedback within a certain amount of time.
In both cases, the overall load will be updated accordingly, freeing resources for further executions.
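One way the synchronized bookkeeping could look, sketched with a `threading.Lock`. The `NodeLoad` class and its method names are hypothetical; the point is only that the admission check and the load update happen atomically, and that the weight is released whether the process ends normally or raises an error.

```python
import threading


class NodeLoad:
    """Hypothetical holder of a node's capacity and current load."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.load = 0
        self._lock = threading.Lock()

    def try_reserve(self, request_load):
        # check and update under the same lock, so two concurrent requests
        # cannot both claim the same residual capacity
        with self._lock:
            residual_capacity = self.capacity - self.load
            if residual_capacity <= 0 or residual_capacity < request_load:
                return False
            self.load += request_load
            return True

    def release(self, request_load):
        with self._lock:
            self.load = max(0, self.load - request_load)

    def run(self, request_load, process):
        if not self.try_reserve(request_load):
            return -1  # same convention as the pseudocode: no execution
        try:
            return process()
        finally:
            # runs on normal completion and on exceptions alike
            self.release(request_load)


node = NodeLoad(capacity=50)
print(node.run(20, lambda: "done"))   # done
print(node.load)                      # 0, the weight was released
print(node.run(60, lambda: "never"))  # -1, not enough residual capacity
```

A process killed by the administrator after a feedback timeout would need the same `release` call from whatever code performs the kill; that path is not modelled here.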
Some additional configuration details
It will be possible, from the service configuration, to instantiate concrete implementations of the “ProcessWeight” class through one of the following methods:
If we do not need to redefine the request_weight(self, exec_request) logic, we can simply ask the Service Processor to instantiate concrete classes from service_config.properties by defining only the weight and coefficient values.
process_weight = {weight : 20, coefficient: 1.0}
For more complex cases, where we might want to redefine request_weight(self, exec_request) dynamically, according to the exec_request parameters, we can ask the Service Processor to instantiate concrete classes through the “introspection” mechanism by defining the class path:
process_weight = "my_service.my_process.MyProcessCapacity"
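A rough sketch of how such a dotted class path might be resolved, using the standard `importlib` module. The `load_process_weight` helper is hypothetical, and the example resolves a standard-library class only so that the block runs without the `my_service` package.

```python
import importlib


def load_process_weight(class_path):
    """Hypothetical helper: import 'package.module.ClassName' and instantiate it."""
    module_name, _, class_name = class_path.rpartition(".")
    if not module_name:
        raise ValueError("expected a dotted path such as 'package.module.ClassName'")
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)
    return cls()


# In the service this would be "my_service.my_process.MyProcessCapacity";
# a standard-library class is used here so that the example is runnable.
instance = load_process_weight("collections.OrderedDict")
print(type(instance).__name__)  # OrderedDict
```

In a real deployment the loader would probably also verify that the resolved class derives from `ProcessWeight` before instantiating it.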