remove hero image and fix nav (#130)
pd-reg-braithwaite authored Nov 1, 2022
1 parent a689938 commit 784b16e
Showing 1 changed file with 9 additions and 11 deletions.
20 changes: 9 additions & 11 deletions docs/during/external_communication_guidelines.md
@@ -1,13 +1,11 @@
---
cover: assets/img/covers/whos_on-call.png
description: Information on how to manage external communications
hero: assets/img/headers/who_oncall.png
hero_alt_text: External Communication Guidelines
---

## External Communication Guidelines
Information on how to manage external communications during an incident. See our [role descriptions](../before/different_roles/) for information about who is responsible for external communications.

### When to communicate publicly
## When to communicate publicly

Before you decide to communicate an incident, it’s important to have an agreed-upon set of criteria for when a major incident is communicated. False alarms and short-lived issues can sometimes kick off incident calls, so knowing when communication is appropriate will help your customers avoid widespread panic. This can be tied to your organization’s definition of [what an incident is](https://response.pagerduty.com/before/what_is_an_incident/), and/or your [severity levels](https://response.pagerduty.com/before/severity_levels/).

@@ -21,25 +19,25 @@ You might consider the following criteria as well:

We also recommend coming up with a set of templates for different stages of an incident, including options for the communications below as well as special situations (long-running incidents, small or limited customer impact, incidents opened with immediate resolution, etc.)

### How to communicate
## How to communicate

#### Initial communication:
### Initial communication:

The first communication should indicate that an incident is under investigation. The goal here is to avoid a customer experiencing symptoms of the incident, checking status pages or Twitter accounts, and not seeing awareness of the issue from the business.

- The decision and posting of the initial communication should happen within 5 minutes of kicking off the incident call.
- These messages should be entirely templated for ease of action.
- These messages can be minimal in revealing scope which might not be known yet, but should indicate that scope will be coming soon.
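
As a rough illustration of the "entirely templated" point above, here is a minimal Python sketch of an initial-communication template. The wording of the message, the `$service` placeholder, and the `render_initial_message` helper are all hypothetical and not part of the guidelines; the sketch only shows that a responder fills in a single value and that the message promises scope details later rather than stating them.

```python
from string import Template

# Hypothetical wording: the guidelines only say the initial message should be
# templated and should signal that scope details will follow.
INITIAL_TEMPLATE = Template(
    "We are investigating an issue affecting $service. "
    "We will share more details on scope and impact shortly."
)


def render_initial_message(service: str) -> str:
    """Fill the template with the name of the affected service."""
    return INITIAL_TEMPLATE.substitute(service=service)


print(render_initial_message("our API"))
```

Keeping the template to one placeholder is a design choice for speed: there is little to decide within the 5-minute window.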

#### Second communication: Initial Scoping of Impact
### Second communication: Initial Scoping of Impact

This is a message that should be delivered within 5 minutes of the first communication, once some scope of impact is known. This post should outline:

- Customer impact
- An update of which components and/or functionality are impacted
- Which regions are affected.

#### Updates
### Updates

Depending on the length of the incident, periodic updates will be necessary. These updates should be delivered **at least** every 20 minutes from the scoping update during the first two hours of an incident. After two hours, you may choose to update with reduced frequency and shift to a long incident communication model (see below). Regardless of expected frequency, when the degree of impact has meaningfully changed, updates should be posted. These updates should:

@@ -49,15 +47,15 @@ Depending on the length of the incident, periodic updates will be necessary. The

Customers with special contracts around their Customer Support or Customer Success, such as a customer on a Premium Support plan, should also receive communication of impact delivered individually, whether through a Customer Liaison or their account team.

#### Long Incidents
### Long Incidents

Incidents longer than two hours should be considered a long incident, and have different communication procedures as a result. When we know an incident will be extended, customer expectations have to be set appropriately, and customer notification fatigue due to content-less updates should be avoided. When in doubt, notify at the frequency which keeps updates meaningful.

- Don’t determine this within the first hour of an incident.
- For incidents where we know recovery will be long-running, indicate this in an update as soon as it is known.
- If planning to reduce update frequency, continue to provide expectations of when the next update will be posted.

#### Resolution
### Resolution

Your final communication should be posted when full recovery of the incident has been confirmed by the Incident Commander. This update should include:

@@ -67,6 +65,6 @@ Your final communication should be posted when full recovery of the incident has

Once this is posted, continue to follow the steps for [After an Incident](https://response.pagerduty.com/after/after_an_incident/) and the [Postmortem Process](https://response.pagerduty.com/after/post_mortem_process/).

### Quick Reference
## Quick Reference

![Quick reference rubric for external communications spanning from initial investigation communication to resolution.](../assets/img/misc/decision-tree.png)
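
The timings described in the sections above can also be sketched as a small Python helper. This is an illustration under stated assumptions, not part of the guidelines: the function name `planned_schedule` is hypothetical, each communication is assumed to go out exactly at its deadline, and the "first two hours" are assumed to be measured from the start of the incident call.

```python
from datetime import datetime, timedelta

INITIAL_DEADLINE = timedelta(minutes=5)   # after the incident call starts
SCOPING_DEADLINE = timedelta(minutes=5)   # after the initial communication
UPDATE_INTERVAL = timedelta(minutes=20)   # "at least" this often
LONG_INCIDENT_AFTER = timedelta(hours=2)  # long-incident model may apply after this


def planned_schedule(call_started: datetime) -> dict:
    """Return latest posting times, assuming every post lands on its deadline."""
    initial_due = call_started + INITIAL_DEADLINE
    scoping_due = initial_due + SCOPING_DEADLINE
    long_incident_at = call_started + LONG_INCIDENT_AFTER

    # Periodic updates run every 20 minutes from the scoping update
    # until the two-hour mark (an assumption: measured from call start).
    updates = []
    next_update = scoping_due + UPDATE_INTERVAL
    while next_update <= long_incident_at:
        updates.append(next_update)
        next_update += UPDATE_INTERVAL

    return {
        "initial_due": initial_due,
        "scoping_due": scoping_due,
        "updates_due": updates,
        "long_incident_at": long_incident_at,
    }


schedule = planned_schedule(datetime(2022, 11, 1, 12, 0))
for due in schedule["updates_due"]:
    print(due.strftime("%H:%M"))
```

These are upper bounds only. An update is still expected whenever the degree of impact meaningfully changes, regardless of where the schedule stands.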
