Need prototype reporting of persistent sled faults #1366

During investigation of #1364, Josh brought up the general point of fault reporting. See [this comment thread](https://github.com/oxidecomputer/omicron/issues/1364#issuecomment-1176967377) for context. This issue tracks adding some prototype or preliminary reporting of persistent faults on a sled. In that particular issue, a failure to delete an OPTE port means that the sled cannot be used further, at least for hosting that particular guest instance. We'd like a simple way to track that fact, ideally in CockroachDB, and use that knowledge in Nexus to direct instances (or Oxide services, potentially) to other sleds.
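To make the idea concrete, here is a minimal sketch of what such tracking could look like. It is not Nexus code. `SledId`, `SledFault`, `FaultStore`, and the port name `opte0` are all hypothetical, and an in-memory map stands in for whatever table would eventually live in CockroachDB. The only fault kind taken from this issue is the failed OPTE port deletion.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical sled identifier; a real system would likely use a UUID.
type SledId = u64;

/// Hypothetical fault kinds. Only the OPTE case comes from this issue.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum SledFault {
    /// Deleting an OPTE port failed, as seen in #1364.
    OptePortDeleteFailed { port_name: String },
}

/// In-memory stand-in (an assumption) for a fault table in CockroachDB.
#[derive(Debug, Default)]
struct FaultStore {
    faults: BTreeMap<SledId, BTreeSet<SledFault>>,
}

impl FaultStore {
    /// Record a persistent fault against a sled.
    /// Reporting the same fault twice has no further effect.
    fn report(&mut self, sled: SledId, fault: SledFault) {
        self.faults.entry(sled).or_default().insert(fault);
    }

    /// True if at least one fault has been recorded for the sled.
    fn is_faulted(&self, sled: SledId) -> bool {
        self.faults.get(&sled).map_or(false, |f| !f.is_empty())
    }

    /// Filter candidate sleds down to those with no recorded faults,
    /// preserving the order of the input.
    fn eligible_sleds(&self, candidates: &[SledId]) -> Vec<SledId> {
        candidates
            .iter()
            .copied()
            .filter(|sled| !self.is_faulted(*sled))
            .collect()
    }
}

fn main() {
    let mut store = FaultStore::default();
    // "opte0" is a made-up port name for illustration.
    store.report(
        2,
        SledFault::OptePortDeleteFailed { port_name: "opte0".to_string() },
    );

    // Sled 2 is skipped when choosing where to place an instance.
    let eligible = store.eligible_sleds(&[1, 2, 3]);
    println!("eligible sleds: {:?}", eligible); // prints "eligible sleds: [1, 3]"
}
```

This sketch treats any recorded fault as disqualifying the whole sled. A finer-grained version could scope a fault to a single guest instance, which matches the "at least for hosting that particular guest instance" caveat above.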

cc @jclulow
