This package contains the Job-Runner Worker, which is responsible for executing the scheduled jobs managed by the Job-Runner.
Requirements (depending on your distro, the naming might be a bit different):
python-dev
build-essential
libevent-dev
Then you should be able to install this package with
pip install job-runner-worker
.
If you want to install this package in development mode, clone this repository
and then execute python setup.py develop
. In the latter, you might want
to install the testing requirements by executing
pip install -r test-requirements.txt
.
See the getting started section in the Job-Runner documentation ( in the job-runner repo) for setting up the whole project.
Example with required settings:
[job_runner_worker] api_base_url=http://domain.of.job.runner/ api_key=worker1 secret=verysecret script_temp_path=/tmp ws_server_hostname=domain.of.websocket.server broadcaster_server_hostname=domain.of.broadcast.server
api_base_url
- The base URL which will be used to access the API. This should start with
http://
orhttps://
. api_key
- Public-key to access the API.
secret
- Private-key to access the API.
concurrent_jobs
- The number of jobs to run concurrently. Default:
4
. log_level
The log level. Default:
'info'
. Valid options are:debug
info
warning
error
max_log_bytes
- The maximum number of bytes of the log that is sent back to the API. This
is to avoid
413 Request Entity Too Large
errors. If the log will be larger than this value, 20% of the allowed size will be taken from the top of the log, the remaining 80% will be taken from the bottom. Everything in between will be truncated. Default:819200
(800kb). ws_server_hostname
- The hostname of the WebSocket Server.
ws_server_port
- The port of the WebSocket Server. Default:
5555
. script_temp_path
- The path where the scripts that are being executed through the Job-Runner
are temporarily stored. Default:
'/tmp'
. broadcaster_server_hostname
- The hostname of the queue broadcaster server.
broadcaster_server_port
- The port of the queue broadcaster server. Default:
5556
. reconnect_after_inactivity
- Seconds after which the subscriber is re-connecting to the publisher
when no data has been received. Default:
300
. This is useful when you are loadbalancing the publisher and it keeps the TCP connection open on the front-end, when the connection on the back-end has been closed. Because of this ZMQ doesn't detect that it is not connected anymore and jobs get stuck.
For starting the worker, you can use the job_runner_worker
command:
usage: job_runner_worker [-h] [--config-path CONFIG_PATH] Job Runner worker (v2.1.0) optional arguments: -h, --help show this help message and exit --config-path CONFIG_PATH absolute path to config file (default: CONFIG_PATH env variable)
- Rollback retry on 4xx errors. Instead, recover when an unexpected error
occurs in the
execute_run
,enqueue_actions
, orkill_run
. This will recover from when a run was claimed by two workers (e.g. in the case when it was sent to worker a, which doesn't respond directly, then it was sent to worker b which claims it after which a claims it too).
- Make sure a shebang does exist on scripts to be run. Use shlex to make Popen safer.
- Retry request 5x when the response is in the 4xx range before raising an exception.
- On ping response, send back the version of the worker and the number of
concurrent jobs. This version requires that you have
job-runner>=3.4.0
running.
- Update error message when job does not start to be more verbose and specific.
- Fix the case where in case of an exception, the run was marked as completed but not started.
- Make sure to only cleanup runs that are assigned to the worker. This version
is dependent on
job-runner>=3.0.1
.
- Make the worker compatible with the new worker-pool structure.
IMPORTANT: This version is dependent on
job-runner>=2.0.0
! - Change
SETTINGS_PATH
environment variable toCONFIG_PATH
for better naming consistency. - Make sure that when a run already has log, it is updated (before it would hang on the database integrity error).
- Make the worker crash early instead of hanging on errors happening before the actual job starts, to give the user a visible cue that something went wrong.
- The worker will now terminate gracefully when receiving the
TERM
signal. This means that all pending jobs will be completed, but that it will not accept any new jobs. After finishing the last pending job, the worker will terminate.
- Set
reconnect_after_inactivity
default to 10 minutes. This is 2 x theJOB_RUNNER_WORKER_PING_INTERVAL
default setting in Job-Runner.
- Implement handler for
ping
action.
- Add and implement
reconnect_after_inactivity
setting.
- Run script by finding their shebang without the x bit being needed.
- Handle separate run log-output resource. This requires Job-Runner >= v1.3.0.
- Fix killing job-runs. Where v1.0.5 was killing children processes, it did not kill children of children, ... This should kill the full tree of child-processes.
- Freeze requests library version, since 1.0.0 contains backwards compatible changes.
- Fix killing job-runs. When the process had sub-processes, only the parent process was killed and the worker was waiting for the child-processes to complete.
- Add config variable
max_log_bytes
to limit the amount of logdata that will be send back to the API (to avoid413 Request Entity Too Large
errors).
- Send
pid
back to the REST API when a job has been started. - Kill a job-run when a
kill
action is received.
- Make sure that the API exactly matches.
- Make the timezones send to the REST API timezone aware.
- Deployar related changes.
- Fix encoding issue when writing the file.
- Refactor to make the worker compatible with the 0.7 version of the
job-runner
package. - Make it consume runs from the queue broadcaster instead of hitting the REST interface every x seconds.
- Add retry on error to recover from temporary REST interface errors.
- Merge fixes v0.5.1 and v0.5.2 into v0.6.x version.
- Refactor to make use of separate WebSocket Server.
- Make temporary path for scripts configurable.
- Disable SSL certificate validation.
- Initial release.