feat(pagerduty/alert): support integration with pagerduty#400
feat(pagerduty/alert): support integration with pagerduty#400
Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR introduces a new GitHub composite action for integrating with PagerDuty's alert system. The action allows triggering PagerDuty incidents and retrieving the resulting incident URL.
Key Changes:
- Adds a new
pagerduty/alertcomposite action that sends events to PagerDuty's v2 Events API - Implements incident creation with configurable summary, source, severity, component, and routing
- Includes functionality to fetch and return the created incident's URL
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| pagerduty/alert/action.yml | Defines the composite action with inputs, outputs, and steps for triggering alerts and fetching incident URLs |
| pagerduty/alert/README.md | Provides documentation including usage examples and input/output descriptions |
| .github/workflows/test-pagerduty-alert.yml | Adds automated testing workflow for the new PagerDuty alert action |
| .github/workflows/no-test.yml | Excludes the pagerduty/alert path from the no-test workflow |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
| id: pagerduty_incident | ||
| uses: fjogeleit/http-request-action@1297c6fc63a79b147d1676540a3fd9d2e37817c5 # v1.16.5 | ||
| with: | ||
| url: 'https://api.pagerduty.com/incidents?since=${{steps.sanitize.outputs.time-search}}&statuses[]=triggered&statuses[]=acknowledged' |
There was a problem hiding this comment.
Fetching all incidents from the last 15 minutes without pagination could be inefficient if there are many incidents. Consider adding pagination support or filtering by additional parameters to reduce response size.
| url: 'https://api.pagerduty.com/incidents?since=${{steps.sanitize.outputs.time-search}}&statuses[]=triggered&statuses[]=acknowledged' | |
| url: 'https://api.pagerduty.com/incidents?since=${{steps.sanitize.outputs.time-search}}&statuses[]=triggered&statuses[]=acknowledged&limit=20' |
| with: | ||
| script: | | ||
| const responseData = `${{ steps.pagerduty_incident.outputs.response }}` | ||
| const parsedData = JSON.parse(responseData) |
There was a problem hiding this comment.
Parsing the response data without error handling could cause the action to fail silently if the API returns invalid JSON or the response is unexpectedly structured. Add try-catch error handling to provide meaningful error messages.
| const parsedData = JSON.parse(responseData) | |
| let parsedData; | |
| try { | |
| parsedData = JSON.parse(responseData) | |
| } catch (error) { | |
| console.error('Failed to parse PagerDuty incident response as JSON:', error) | |
| throw new Error('Invalid JSON response from PagerDuty incidents API') | |
| } |
| const responseData = `${{ steps.pagerduty_incident.outputs.response }}` | ||
| const parsedData = JSON.parse(responseData) | ||
| const summary = process.env.SUMMARY | ||
| const specificIncident = parsedData.incidents.find(incident => incident.title === summary) |
There was a problem hiding this comment.
The search assumes 'parsedData.incidents' exists and is an array. If the PagerDuty API returns an error or unexpected structure, this will throw an error. Add validation to check if 'incidents' exists before accessing it.
| const specificIncident = parsedData.incidents.find(incident => incident.title === summary) | |
| let specificIncident = null; | |
| if (parsedData && Array.isArray(parsedData.incidents)) { | |
| specificIncident = parsedData.incidents.find(incident => incident.title === summary) | |
| } else { | |
| console.log('No incidents array found in PagerDuty response.'); | |
| } |
| id: pagerduty_incident | ||
| uses: fjogeleit/http-request-action@1297c6fc63a79b147d1676540a3fd9d2e37817c5 # v1.16.5 | ||
| with: | ||
| url: 'https://api.pagerduty.com/incidents?since=${{steps.sanitize.outputs.time-search}}&statuses[]=triggered&statuses[]=acknowledged' |
There was a problem hiding this comment.
There's a potential race condition: the incident may not be immediately available when fetching. Consider adding retry logic or a small delay between triggering the alert (line 47) and fetching incidents (line 55).
| core.setOutput("time-search", isoDate) | ||
|
|
||
| - name: Trigger PagerDuty Alert | ||
| uses: fjogeleit/http-request-action@1297c6fc63a79b147d1676540a3fd9d2e37817c5 # v1.16.5 |
There was a problem hiding this comment.
There was a problem hiding this comment.
because this is a copy of an existing composite action - so I prefered to keep the same implementation from now
There was a problem hiding this comment.
Why not use github.com/marketplace/actions/http-request-action?
Actually that's the one we use
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
| with: | ||
| url: 'https://events.pagerduty.com/v2/enqueue' | ||
| method: 'POST' | ||
| customHeaders: '{"Accept": "application/json", "Content-Type": "application/json", "Authorization" : "Token token=${{ inputs.api-key }}"}' |
There was a problem hiding this comment.
The API key is exposed in plain text in the workflow logs via customHeaders. Consider using a masked approach or ensure that the http-request-action properly masks sensitive headers to prevent accidental exposure in logs.
| const responseData = `${{ steps.pagerduty_incident.outputs.response }}` | ||
| const parsedData = JSON.parse(responseData) | ||
| const summary = process.env.SUMMARY |
There was a problem hiding this comment.
Using template literals to embed GitHub Actions output can break if the response contains backticks or special characters. Use fromJSON directly: const parsedData = JSON.parse('${{ steps.pagerduty_incident.outputs.response }}')
| const responseData = `${{ steps.pagerduty_incident.outputs.response }}` | |
| const parsedData = JSON.parse(responseData) | |
| const summary = process.env.SUMMARY | |
| const parsedData = ${{ fromJSON(steps.pagerduty_incident.outputs.response) }} | |
| const summary = process.env.SUMMARY |
| url: 'https://events.pagerduty.com/v2/enqueue' | ||
| method: 'POST' | ||
| customHeaders: '{"Accept": "application/json", "Content-Type": "application/json", "Authorization" : "Token token=${{ inputs.api-key }}"}' | ||
| data: '{"event_action": "trigger", "routing_key": "${{ inputs.routing-key }}", "payload": {"summary": "${{steps.sanitize.outputs.summary}}", "source": "${{ inputs.source }}", "custom_details":"${{ inputs.source }}", "severity": "${{ inputs.severity }}", "component": "${{ inputs.component }}"}}' |
There was a problem hiding this comment.
The custom_details field is set to the same value as source (${{ inputs.source }}), which appears to be unintentional. According to PagerDuty API documentation, custom_details should contain additional context as an object, not duplicate the source string.
| data: '{"event_action": "trigger", "routing_key": "${{ inputs.routing-key }}", "payload": {"summary": "${{steps.sanitize.outputs.summary}}", "source": "${{ inputs.source }}", "custom_details":"${{ inputs.source }}", "severity": "${{ inputs.severity }}", "component": "${{ inputs.component }}"}}' | |
| data: '{"event_action": "trigger", "routing_key": "${{ inputs.routing-key }}", "payload": {"summary": "${{steps.sanitize.outputs.summary}}", "source": "${{ inputs.source }}", "custom_details": {"source": "${{ inputs.source }}", "component": "${{ inputs.component }}", "summary": "${{ steps.sanitize.outputs.summary }}"}, "severity": "${{ inputs.severity }}", "component": "${{ inputs.component }}"}}' |
| - name: Fetch PagerDuty Incidents | ||
| id: pagerduty_incident | ||
| uses: fjogeleit/http-request-action@1297c6fc63a79b147d1676540a3fd9d2e37817c5 # v1.16.5 | ||
| with: | ||
| url: 'https://api.pagerduty.com/incidents?since=${{steps.sanitize.outputs.time-search}}&statuses[]=triggered&statuses[]=acknowledged' | ||
| method: 'GET' | ||
| customHeaders: '{"Accept": "application/json", "Content-Type": "application/json", "Authorization" : "Token token=${{ inputs.api-key }}"}' | ||
|
|
||
| - name: Search the incident | ||
| uses: actions/github-script@v8 | ||
| id: get_incident | ||
| env: | ||
| SUMMARY: ${{ steps.sanitize.outputs.summary }} | ||
| with: | ||
| script: | | ||
| const responseData = `${{ steps.pagerduty_incident.outputs.response }}` | ||
| const parsedData = JSON.parse(responseData) | ||
| const summary = process.env.SUMMARY | ||
| const specificIncident = parsedData.incidents.find(incident => incident.title === summary) | ||
| if (specificIncident) { | ||
| console.log(`Found HTML URL: ${specificIncident.html_url}`) | ||
| return specificIncident.html_url | ||
| } else { | ||
| console.log('No incident found.') | ||
| return '' | ||
| } |
There was a problem hiding this comment.
Race condition: The incident search queries PagerDuty immediately after triggering the alert (lines 56-62), but PagerDuty may not have processed and indexed the incident yet. Consider adding a retry mechanism with exponential backoff or a brief delay before searching.
| - name: Fetch PagerDuty Incidents | |
| id: pagerduty_incident | |
| uses: fjogeleit/http-request-action@1297c6fc63a79b147d1676540a3fd9d2e37817c5 # v1.16.5 | |
| with: | |
| url: 'https://api.pagerduty.com/incidents?since=${{steps.sanitize.outputs.time-search}}&statuses[]=triggered&statuses[]=acknowledged' | |
| method: 'GET' | |
| customHeaders: '{"Accept": "application/json", "Content-Type": "application/json", "Authorization" : "Token token=${{ inputs.api-key }}"}' | |
| - name: Search the incident | |
| uses: actions/github-script@v8 | |
| id: get_incident | |
| env: | |
| SUMMARY: ${{ steps.sanitize.outputs.summary }} | |
| with: | |
| script: | | |
| const responseData = `${{ steps.pagerduty_incident.outputs.response }}` | |
| const parsedData = JSON.parse(responseData) | |
| const summary = process.env.SUMMARY | |
| const specificIncident = parsedData.incidents.find(incident => incident.title === summary) | |
| if (specificIncident) { | |
| console.log(`Found HTML URL: ${specificIncident.html_url}`) | |
| return specificIncident.html_url | |
| } else { | |
| console.log('No incident found.') | |
| return '' | |
| } | |
| # Removed "Fetch PagerDuty Incidents" step; incident fetching and retry logic moved to next step. | |
| - name: Search the incident with retry | |
| uses: actions/github-script@v8 | |
| id: get_incident | |
| env: | |
| SUMMARY: ${{ steps.sanitize.outputs.summary }} | |
| API_KEY: ${{ inputs.api-key }} | |
| TIME_SEARCH: ${{ steps.sanitize.outputs.time-search }} | |
| with: | |
| script: | | |
| const fetch = require('node-fetch'); | |
| const summary = process.env.SUMMARY; | |
| const apiKey = process.env.API_KEY; | |
| const timeSearch = process.env.TIME_SEARCH; | |
| const maxAttempts = 5; | |
| let attempt = 0; | |
| let delay = 1000; // start with 1s | |
| let specificIncident = null; | |
| const url = `https://api.pagerduty.com/incidents?since=${encodeURIComponent(timeSearch)}&statuses[]=triggered&statuses[]=acknowledged`; | |
| const headers = { | |
| "Accept": "application/json", | |
| "Content-Type": "application/json", | |
| "Authorization": `Token token=${apiKey}` | |
| }; | |
| async function sleep(ms) { | |
| return new Promise(resolve => setTimeout(resolve, ms)); | |
| } | |
| async function findIncident() { | |
| for (attempt = 1; attempt <= maxAttempts; attempt++) { | |
| try { | |
| const res = await fetch(url, { method: 'GET', headers }); | |
| if (!res.ok) { | |
| throw new Error(`PagerDuty API error: ${res.status} ${res.statusText}`); | |
| } | |
| const parsedData = await res.json(); | |
| specificIncident = parsedData.incidents.find(incident => incident.title === summary); | |
| if (specificIncident) { | |
| console.log(`Found HTML URL: ${specificIncident.html_url} on attempt ${attempt}`); | |
| return specificIncident.html_url; | |
| } else { | |
| console.log(`Attempt ${attempt}: Incident not found, retrying after ${delay}ms...`); | |
| await sleep(delay); | |
| delay *= 2; // exponential backoff | |
| } | |
| } catch (err) { | |
| console.log(`Attempt ${attempt}: Error fetching incidents: ${err.message}`); | |
| await sleep(delay); | |
| delay *= 2; | |
| } | |
| } | |
| console.log('No incident found after retries.'); | |
| return ''; | |
| } | |
| return await findIncident(); |
| id: pagerduty_incident | ||
| uses: fjogeleit/http-request-action@1297c6fc63a79b147d1676540a3fd9d2e37817c5 # v1.16.5 | ||
| with: | ||
| url: 'https://api.pagerduty.com/incidents?since=${{steps.sanitize.outputs.time-search}}&statuses[]=triggered&statuses[]=acknowledged' |
There was a problem hiding this comment.
The incident search retrieves all triggered and acknowledged incidents from the last 15 minutes without pagination limits. In high-traffic environments, this could return many incidents. Consider adding &limit=100 parameter to bound the response size.
| url: 'https://api.pagerduty.com/incidents?since=${{steps.sanitize.outputs.time-search}}&statuses[]=triggered&statuses[]=acknowledged' | |
| url: 'https://api.pagerduty.com/incidents?since=${{steps.sanitize.outputs.time-search}}&statuses[]=triggered&statuses[]=acknowledged&limit=100' |
See https://developer.pagerduty.com/api-reference/368ae3d938c9e-send-an-event-to-pager-duty
Initially implemented at https://github.com/elastic/app-obs-dev/tree/main/.github/actions/pagerduty-alert
But decided to create a fork so we can keep its maintenance outside of the SDH automation
It worked fine