Add cloud-init degraded state and machine readable degraded errors to cloud-init status #4500
2711654 to bafd9d0
Don't forget about |
What do you think about presenting more information in the
doesn't really tell me anything useful except that something went wrong in each of the stages. At this point I'd have to either jump into the logs or know to run |
Sure, I can add that.
Just noticed this was buggy and was printing all stages if there was an error in any stage. I'll fix that.
I can add this to |
I'm not sure I understand the structure of --json output:
Why is there a Also, this is probably more relevant to one of the previous PRs, but in playing around with how different types of errors are handled, I noticed that not all warnings are being handled. The easiest way to demonstrate is to add a |
As for why it is under
Yikes. I'll take a look, thanks for the heads up. That will have to be a follow-up PR. [1] Aside: Personally I think we could blow away this whole versioning in json thing we've got going on in |
But presumably Per your aside, I was going to comment something similar but thought it not relevant enough to this PR. I have a few thoughts:
Instead be
If we don't want to version things, and then later need to introduce a breaking change, we can always version at that time, though it adds some ugliness. E.g.:
The current approach doesn't really buy us anything and just makes the output more cumbersome to work with. I'm fine either way, but I agree that we're unlikely to change the key types and can instead always add more keys. The problem is this has been in the wild for a few releases now, so it'd be a breaking change to change it now. It's probably worth doing, but if we do go ahead with it, we should also look at changing |
Awesome! Access to the spec is restricted. |
cd4bd0f to f415f23
Thanks for the discussion on this @TheRealFalcon. Per our out-of-band discussion, I've gotten rid of the versioned I also dropped the
@cjp256: I've moved the content of the spec to discourse: https://discourse.ubuntu.com/t/spec-improve-error-and-warning-visibility/39765 |
Looks really good now, but I have a few more comments.
It looks like the --long output isn't distinguishing between recoverable errors and hard errors. E.g., when I provide an invalid network-config:
root@me:~# cloud-init status --long
status: error
extended_status: error
boot_status_code: enabled-by-generator
last_update: Fri, 27 Oct 2023 15:19:35 +0000
detail:
DataSourceLXD
recoverable_errors:
WARNING:
- Failed to rename devices: Failed to apply network config names: Unknown network config version: None
- failed stage init-local
In --long, the network config error appears only as a recoverable error, but in --format json it appears as both a hard error and a recoverable error.
root@me:~# cloud-init status --format json
{
"boot_status_code": "enabled-by-generator",
"datasource": "lxd",
"detail": "DataSourceLXD",
"errors": [
"Unknown network config version: None"
],
"extended_status": "error",
"init": {
"errors": [],
"finished": 1698419940.0766742,
"recoverable_errors": {
"WARNING": [
"Failed to rename devices: Failed to apply network config names: Unknown network config version: None"
]
},
"start": 1698419939.4522545
},
"init-local": {
"errors": [
"Unknown network config version: None"
],
"finished": 1698419819.0041544,
"recoverable_errors": {
"WARNING": [
"failed stage init-local"
]
},
"start": 1698419818.9596815
},
"last_update": "Fri, 27 Oct 2023 15:19:35 +0000",
"modules-config": {
"errors": [],
"finished": 1698419975.6157393,
"recoverable_errors": {},
"start": 1698419975.4693048
},
"modules-final": {
"errors": [],
"finished": 1698419975.9615448,
"recoverable_errors": {},
"start": 1698419975.8828495
},
"recoverable_errors": {
"WARNING": [
"Failed to rename devices: Failed to apply network config names: Unknown network config version: None",
"failed stage init-local"
]
},
"stage": null,
"status": "error"
}
If there's not an easy way to distinguish them, we should at least print the hard errors in the --long output.
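To make the distinction concrete, here is a minimal sketch of how a consumer of the JSON above could separate hard errors from recoverable ones per stage. It is not cloud-init's implementation: the helper name `summarize_stage_errors` and the trimmed `SAMPLE` dict are illustrative assumptions, and only the key names (`errors`, `recoverable_errors`, and the four stage names) are taken from the output shown.

```python
# Stage keys as they appear in the `cloud-init status --format json` output above.
STAGES = ("init-local", "init", "modules-config", "modules-final")


def summarize_stage_errors(status):
    """Return {stage: {"hard": [...], "recoverable": {level: [...]}}}.

    Hypothetical helper: stages missing from the status dict are skipped,
    and stages with no errors of either kind are left out of the result.
    """
    summary = {}
    for stage in STAGES:
        data = status.get(stage)
        if not isinstance(data, dict):
            continue
        hard = list(data.get("errors", []))
        recoverable = {
            level: list(msgs)
            for level, msgs in data.get("recoverable_errors", {}).items()
            if msgs
        }
        if hard or recoverable:
            summary[stage] = {"hard": hard, "recoverable": recoverable}
    return summary


# Trimmed copy of the JSON quoted above (timestamps and unrelated keys omitted).
SAMPLE = {
    "errors": ["Unknown network config version: None"],
    "init": {
        "errors": [],
        "recoverable_errors": {
            "WARNING": [
                "Failed to rename devices: Failed to apply network config "
                "names: Unknown network config version: None"
            ]
        },
    },
    "init-local": {
        "errors": ["Unknown network config version: None"],
        "recoverable_errors": {"WARNING": ["failed stage init-local"]},
    },
    "modules-config": {"errors": [], "recoverable_errors": {}},
    "modules-final": {"errors": [], "recoverable_errors": {}},
    "stage": None,
    "status": "error",
}

for stage, found in summarize_stage_errors(SAMPLE).items():
    for msg in found["hard"]:
        print(f"{stage}: error: {msg}")
    for level, msgs in found["recoverable"].items():
        for msg in msgs:
            print(f"{stage}: {level}: {msg}")
```

Printing hard errors under their own label, as the loop does, is one possible way for a tabular view to keep the two classes apart.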
Also, I think we should add to the spec how we're distinguishing between hard error vs recoverable error. I couldn't tell until I checked the source that it comes down to service failure.
For integration tests, I would like to see an integration test that intentionally generates a few warnings and verifies that they show up in the status output.
For unit tests, I see that test_status.py has been updated to reflect the new output, but we don't have any tests that generate any kind of errors/warnings and ensure the output is correct. I think we should add a test that does this.
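As a rough illustration of the kind of check such a test could make, the sketch below verifies that the top-level `recoverable_errors` key agrees with the per-stage keys. It works on a plain dict shaped like the JSON output above rather than on cloud-init's own test fixtures, which are not shown in this thread; `merge_recoverable_errors`, `aggregate_matches_stages`, and `SAMPLE` are hypothetical names.

```python
from collections import defaultdict

# Stage keys as they appear in the JSON status output.
STAGES = ("init-local", "init", "modules-config", "modules-final")


def merge_recoverable_errors(status):
    """Merge per-stage recoverable_errors into {level: sorted unique messages}."""
    merged = defaultdict(set)
    for stage in STAGES:
        stage_errors = status.get(stage, {}).get("recoverable_errors", {})
        for level, messages in stage_errors.items():
            merged[level].update(messages)
    return {level: sorted(msgs) for level, msgs in merged.items() if msgs}


def aggregate_matches_stages(status):
    """True when the top-level aggregate equals the merged per-stage errors."""
    top = {
        level: sorted(set(msgs))
        for level, msgs in status.get("recoverable_errors", {}).items()
        if msgs
    }
    return top == merge_recoverable_errors(status)


# Reduced sample modelled on the JSON output quoted earlier in this review.
SAMPLE = {
    "init-local": {
        "errors": ["Unknown network config version: None"],
        "recoverable_errors": {"WARNING": ["failed stage init-local"]},
    },
    "init": {
        "errors": [],
        "recoverable_errors": {"WARNING": ["Failed to rename devices"]},
    },
    "recoverable_errors": {
        "WARNING": ["Failed to rename devices", "failed stage init-local"]
    },
}


def test_aggregate_matches_stages():
    # The aggregate key should contain exactly the per-stage warnings.
    assert aggregate_matches_stages(SAMPLE)
    # Dropping the aggregate must be detected as a mismatch.
    broken = dict(SAMPLE, recoverable_errors={})
    assert not aggregate_matches_stages(broken)
```

The comparison ignores message order on purpose, since the aggregate in the output above is not listed in stage run order.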
cloudinit/cmd/status.py

if args.format == "tabular":
    prefix = "\n" if args.wait else ""
    print(f"{prefix}status: {details.status.value}")

    # For backwards compability, don't report degraded status here,
Nit:
- # For backwards compability, don't report degraded status here,
+ # For backwards compatibility, don't report degraded status here,
)

status_json = client.execute("cloud-init status --format json").stdout
assert json.loads(status_json)["v1"]["init"]["recoverable_errors"].get(
I don't think this is right anymore. The status command doesn't output v1 anymore, correct?
Correct, thank you.
@TheRealFalcon Thanks for the review.
Hrm, you're right. I guess I can add both to
Happy to add that to the spec. I'm planning on basing the followup documentation PR on the commit message that implements this feature. Do you think the content there is sufficient regarding the difference between cloud-init failure and recoverable error, or would you prefer to see more/different content?
+1 Will do agreed. I didn't want to generate new tests until we were happy with how this looked to avoid unnecessary rework. |
See the new current state and future state sections under the Implementation section. |
Those are great, thanks!
I don't see what distinguishes a recoverable error from a non-recoverable one. I think if you make the same distinction that you made in the spec, that will work. |
Thanks for the feedback, I'll add that to the commit message when I'm ready to merge. Also, I think I've addressed the remaining concerns in ab35812. Better unittest and integration test coverage, fixed the spelling nit, and included both Ready for re-review. |
ab35812 to 321126a
Summary
=======

This commit updates `cloud-init status` to include:

1. A new exit code (2)
2. Additional running states, exported under a new key "extended_status"
3. External representation of all internal errors:
   - aggregate recoverable errors
   - per-stage recoverable errors
   - per-stage non-recoverable errors (aggregate key already exists)

Current state: recoverable errors vs non-recoverable errors
===========================================================

critical failure
----------------

If cloud-init is unable to complete, the service returns with exit code 1,
and error messages are visible in the log files and in the output of
`cloud-init status --format json` under the top level 'errors' key.

recoverable failure
-------------------

In the case that cloud-init is able to complete yet something goes awry,
the service returns with exit code 0 and messages are visible in the log
files.

Future state: recoverable errors vs non-recoverable errors
==========================================================

critical failure
----------------

If cloud-init is unable to complete, error messages will now additionally
be visible in the output of `cloud-init status --format json` within the
'errors' key nested under the stage-level keys: 'init-local', 'init',
'modules-config', 'modules-final'.

recoverable failure
-------------------

In the case that cloud-init is able to complete yet something goes awry,
the service will now return with exit code 2, and error messages will be
visible in the output of `cloud-init status --format json` under the top
level 'recoverable_errors' key as well as within the 'recoverable_errors'
key nested under the stage-level keys: 'init-local', 'init',
'modules-config', 'modules-final'.

Implementation
==============

Cloud-init error codes
----------------------

0 - success
1 - unrecoverable error
2 - recoverable error (new)

This new exit code indicates recoverable errors. If cloud-init exits with
exit code (2), cloud-init was able to complete gracefully; however,
something went wrong and the user should investigate.

Additional states
-----------------

For backwards compatibility, the output of `cloud-init status` remains
unchanged. A new key 'extended_status' is included in the output:

    $ cloud-init status --format json | jq .status
    "done"
    $ cloud-init status --format json | jq .extended_status
    "degraded done"

See Appendix A for the list of possible states.

Exported errors: Aggregated errors
----------------------------------

When a recoverable error occurs, the internal cloud-init state information
is made visible under a top level aggregate key 'recoverable_errors' with
errors grouped by error level:

    $ cloud-init status --format json | jq .recoverable_errors
    {
      "WARNING": [
        "Failed at merging in cloud config part from part-001: empty cloud config",
        "No template found in /etc/cloud/templates for template named sources.list.ubuntu.deb822",
        "No template found in /etc/cloud/templates for template named sources.list",
        "No template found, not rendering /etc/apt/sources.list.d/ubuntu.sources"
      ]
    }

See Appendix B for the list of possible error levels.

Exported errors: Per-stage errors
---------------------------------

The keys 'errors' and 'recoverable_errors' are also exported for each
stage to allow attribution of recoverable and non-recoverable errors to
their source.

    $ cloud-init status --format json | jq .init.recoverable_errors
    {
      "WARNING": [
        "Failed at merging in cloud config part from part-001: empty cloud config"
      ]
    }

Note: Only cloud-init stages which have completed are listed in the output
of `cloud-init status --format json`.

See Appendix C for the list of possible cloud-init stages.

Limitations of internal errors
==============================

- Exported recoverable errors represent logged messages, which are not
  guaranteed to be stable between releases. The contents of the 'errors'
  and 'recoverable_errors' keys are not guaranteed to have stable output!
- Exported errors and recoverable errors may occur at different stages
  since users may reorder configuration modules to run at different
  stages via cloud.cfg.

Appendices
==========

Appendix A: Extended states
---------------------------

"not running"
"running"
"done"
"error"
"degraded done"
"degraded running"
"disabled"

Appendix B: Error levels
------------------------

Reported recoverable error messages are grouped by the level at which
they are logged. Complete list of levels:

WARNING
DEPRECATED
ERROR
CRITICAL

Appendix C: Stages of cloud-init
--------------------------------

The json representation of cloud-init stages (in run order) is:

"init-local"
"init"
"modules-config"
"modules-final"

This commit implements design specification US057[1].

[1] https://discourse.ubuntu.com/t/spec-improve-error-and-warning-visibility/39765
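The exit codes and extended states described in the commit message above could be consumed by a caller roughly as follows. This is a sketch under stated assumptions, not part of the commit: the helper names (`describe_exit_code`, `is_degraded`, `query_status`) are hypothetical, the exit-code meanings are copied from the list above, and it assumes the `cloud-init` binary is on PATH when `query_status` is called.

```python
import json
import subprocess

# Meanings copied from the "Cloud-init error codes" list in the commit message.
EXIT_MEANINGS = {
    0: "success",
    1: "unrecoverable error",
    2: "recoverable error",
}


def describe_exit_code(code):
    """Map a `cloud-init status` exit code to its documented meaning."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def is_degraded(extended_status):
    """True for the "degraded done" and "degraded running" extended states."""
    return extended_status.startswith("degraded")


def query_status():
    """Run `cloud-init status --format json`.

    Returns (meaning of the exit code, parsed JSON or None if unparsable).
    Raises FileNotFoundError if cloud-init is not installed.
    """
    proc = subprocess.run(
        ["cloud-init", "status", "--format", "json"],
        capture_output=True,
        text=True,
        check=False,  # exit codes 1 and 2 are expected outcomes, not exceptions
    )
    try:
        payload = json.loads(proc.stdout)
    except json.JSONDecodeError:
        payload = None
    return describe_exit_code(proc.returncode), payload
```

A script that previously treated any non-zero exit as fatal may want to handle exit code 2 separately, since it signals that cloud-init completed but something needs investigation.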
Commit f780cf9 removed the modules-init key from status.json v1 key. Don't use it as example test data.
This key had more meaning to a developer than to a user. Replace with "recoverable_errors", and align internal variable names with external user UI for code legibility.
If different meaning for duplicate keys is required, then a v2 can be added. Drop versioning scheme and duplicate keys to reduce unnecessary verbosity. BREAKING CHANGE: cloud-init status --json output
The detail in this key is duplicate, and changing the value of this key during error condition is neither obvious nor documented. Make this key behave the same regardless of error condition. BREAKING CHANGE: status.json
- New page and content describing debugging for users
- New page and content documenting cloud-init's status
- New page and content documenting cloud-init's exported errors
- New page and content documenting cloud-init's failure states
- New page and content documenting how to re-run cloud-init
- New content documenting how to validate user-data
- New content documenting how to use cloud-init with libvirt

Documents canonicalGH-4500
Fixes canonicalGH-4608
(Rebase merge)
Proposed Commit Message
See individual commit messages for details.
Additional Context
Design spec
TODO