diff --git a/doc/rtd/explanation/failure_states.rst b/doc/rtd/explanation/failure_states.rst
new file mode 100644
index 000000000000..c2e1156eb76b
--- /dev/null
+++ b/doc/rtd/explanation/failure_states.rst
@@ -0,0 +1,138 @@
+.. _failure_states:
+
+Failure states: recoverable errors vs non-recoverable errors
+============================================================
+
+Critical failure
+----------------
+If cloud-init is unable to complete, error messages are visible in the
+output of ``cloud-init status --format json`` within the ``errors`` key
+nested under the per-stage keys: ``init-local``, ``init``,
+``modules-config``, ``modules-final``.
+
+Recoverable failure
+-------------------
+If cloud-init is able to complete but something goes wrong along the way,
+the service exits with code 2, and error messages are visible in the
+output of ``cloud-init status --format json`` under the top-level
+``recoverable_errors`` key as well as within the ``recoverable_errors``
+key nested under the per-stage keys: ``init-local``, ``init``,
+``modules-config``, ``modules-final``.
+
+Implementation
+==============
+
+Cloud-init error codes
+----------------------
+- 0: success
+- 1: unrecoverable error
+- 2: recoverable error
+
+If cloud-init exits with exit code 1, cloud-init experienced a critical
+failure and was unable to recover. In this case, something is likely
+seriously wrong with the system, or cloud-init has hit a serious bug.
+
+If cloud-init exits with exit code 2, cloud-init was able to complete
+gracefully; however, something went wrong and the user should investigate.
+
+Reported state
+--------------
+Cloud-init can report its internal state via the ``status --format json``
+subcommand under the ``extended_status`` key.
+
+.. code-block:: shell-session
+
+    $ cloud-init status --format json | jq .extended_status
+    "degraded done"
+
+The possible states are:
+
+.. code-block:: shell-session
+
+    "not running"
+    "running"
+    "done"
+    "error"
+    "degraded done"
+    "degraded running"
+    "disabled"
+
+Exported errors: Aggregated errors
+----------------------------------
+When a recoverable error occurs, the internal cloud-init state
+information is made visible under a top-level aggregate key,
+``recoverable_errors``, with errors grouped by log level:
+
+.. code-block:: shell-session
+
+    $ cloud-init status --format json | jq .recoverable_errors
+    {
+      "WARNING": [
+        "Failed at merging in cloud config part from p-01: empty cloud config",
+        "No template found in /etc/cloud/templates for template source.deb822",
+        "No template found in /etc/cloud/templates for template sources.list",
+        "No template found, not rendering /etc/apt/soures.list.d/ubuntu.source"
+      ]
+    }
+
+See :ref:`Appendix A <states_appendix_a>` for the list of possible error levels.
+
+Exported errors: Per-stage errors
+---------------------------------
+The keys ``errors`` and ``recoverable_errors`` are also exported for each
+stage to allow attribution of recoverable and non-recoverable errors
+to their source.
+
+.. code-block:: shell-session
+
+    $ cloud-init status --format json | jq .init.recoverable_errors
+    {
+      "WARNING": [
+        "Failed at merging in cloud config part from p-001: empty cloud config"
+      ]
+    }
+
+Note: Only cloud-init stages which have completed are listed in the
+output of ``cloud-init status --format json``.
+
+See :ref:`Appendix B <states_appendix_b>` for the list of cloud-init stages.
+
+Limitations of internal errors
+==============================
+- Exported recoverable errors represent logged messages, which are not
+  guaranteed to be stable between releases. Do not rely on the exact
+  contents of the ``errors`` and ``recoverable_errors`` keys remaining
+  unchanged.
+- Exported errors and recoverable errors may occur at different stages
+  since users may reorder configuration modules to run at different
+  stages via ``cloud.cfg``.
+
+Appendices
+==========
+
+.. _states_appendix_a:
+
+Appendix A: Error levels
+------------------------
+Reported recoverable error messages are grouped by the level at which
+they are logged. The complete list of levels is:
+
+.. code-block:: shell-session
+
+    WARNING
+    DEPRECATED
+    ERROR
+    CRITICAL
+
+.. _states_appendix_b:
+
+Appendix B: Stages of cloud-init
+--------------------------------
+The JSON representation of cloud-init stages (in run order) is:
+
+.. code-block:: shell-session
+
+    "init-local"
+    "init"
+    "modules-config"
+    "modules-final"
diff --git a/doc/rtd/explanation/index.rst b/doc/rtd/explanation/index.rst
index 754318c11dea..f1a10a43eb16 100644
--- a/doc/rtd/explanation/index.rst
+++ b/doc/rtd/explanation/index.rst
@@ -20,3 +20,4 @@ knowledge and become better at using and configuring ``cloud-init``.
    security.rst
    analyze.rst
    kernel-cmdline.rst
+   failure_states.rst