Merged
18 changes: 10 additions & 8 deletions .github/workflows/gh-aw-pr-buildkite-detective.md
@@ -60,7 +60,7 @@ on:
bots:
- "${{ inputs.allowed-bot-users }}"
concurrency:
- group: pr-buildkite-detective-${{ github.event.workflow_run.id }}
+ group: pr-buildkite-detective-${{ github.run_id }}
cancel-in-progress: false
permissions:
actions: read
@@ -99,9 +99,10 @@ Analyze failed Buildkite CI builds for pull requests in ${{ github.repository }}
## Context

- **Repository**: ${{ github.repository }}
- - **Workflow Run ID**: ${{ github.event.workflow_run.id }}
- - **Conclusion**: ${{ github.event.workflow_run.conclusion }}
- - **Commit SHA**: ${{ github.event.workflow_run.head_sha }}
+ - **Event Name**: ${{ github.event_name }}
+ - **Event ID**: ${{ github.event_name == 'status' && github.event.id || github.event.check_run.id }}
+ - **Failure State**: ${{ github.event_name == 'status' && github.event.state || github.event.check_run.conclusion }}
+ - **Commit SHA**: ${{ github.event_name == 'status' && github.event.sha || github.event.check_run.head_sha }}
Comment on lines +102 to +105
⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

GitHub Actions expressions syntax operators && || ternary official documentation

💡 Result:

  • Official operator list (GitHub Actions expressions): GitHub documents these operators: !, comparison ops (< <= > >= == !=), logical && (and) and || (or), plus grouping/index/property (( ), [ ], .). [1]
  • Ternary (cond ? a : b) support: GitHub Actions expression syntax does not include a ternary operator in the official operator list. [1] A long-running feature request explicitly asks for ternary/conditional support, indicating it’s not currently available. [2]
  • Common “ternary-like” workaround (uses documented operators): ${{ condition && 'trueValue' || 'falseValue' }} (works as long as 'trueValue' is truthy). [1]

Sources
[1] GitHub Docs — Evaluate expressions in workflows and actions (Operators table) (docs.github.com)
[2] GitHub actions/runner issue #409: Conditional operator or function for expression syntax (github.com)


Reconsider the proposed alternative—ternary operators are not supported in GitHub Actions expressions.

The chained && and || operators in these expressions are officially documented and the standard workaround pattern for conditional logic in GitHub Actions. However, your underlying concerns about edge cases are valid:

  1. Undefined property access: On a 'check_run' event, github.event.id does not exist. GitHub Actions evaluates a missing property to an empty string instead of raising an error, so a wrong or absent field fails silently.
  2. Falsy value handling: If github.event.id exists but is falsy (0, empty string), the expression will incorrectly fall back to github.event.check_run.id.

The ternary operator (? :) is not supported in GitHub Actions expressions—this is a long-running open feature request. Instead, consider:

  • Use conditional job-level if statements to process each event type separately
  • Restructure the workflow to only trigger on a single event type, avoiding the dual-path logic
  • Add explicit property existence checks if GitHub Actions provides them
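
The first option above can be sketched as follows: each event type gets its own step guarded by `if:`, so every field is read only from the payload that defines it and no `&&`/`||` fallback is needed. This is an illustrative sketch, not code from this PR. The job name, step names, and the `FAILED_SHA`/`FAILURE_STATE` variable names are assumptions, and because the file under review is an agent prompt that interpolates expressions into text, the pattern may need adapting there.

```yaml
# Illustrative sketch only; job, step, and variable names are hypothetical.
on:
  status:
  check_run:
    types: [completed]

jobs:
  resolve-event:
    runs-on: ubuntu-latest
    steps:
      # Runs only for commit status events, where sha and state are top-level fields.
      - name: Read fields from a status event
        if: github.event_name == 'status'
        env:
          EVENT_SHA: ${{ github.event.sha }}
          EVENT_STATE: ${{ github.event.state }}
        run: |
          echo "FAILED_SHA=${EVENT_SHA}" >> "$GITHUB_ENV"
          echo "FAILURE_STATE=${EVENT_STATE}" >> "$GITHUB_ENV"

      # Runs only for check_run events, which nest their fields under check_run.
      - name: Read fields from a check_run event
        if: github.event_name == 'check_run'
        env:
          EVENT_SHA: ${{ github.event.check_run.head_sha }}
          EVENT_STATE: ${{ github.event.check_run.conclusion }}
        run: |
          echo "FAILED_SHA=${EVENT_SHA}" >> "$GITHUB_ENV"
          echo "FAILURE_STATE=${EVENT_STATE}" >> "$GITHUB_ENV"

      # Later steps read the normalized values regardless of the trigger.
      - name: Use the resolved values
        run: echo "Commit ${FAILED_SHA} reported ${FAILURE_STATE}"
```

Exactly one of the two guarded steps runs per event, so later steps can rely on the normalized variables without knowing which trigger fired.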
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/gh-aw-pr-buildkite-detective.md around lines 102 - 105,
The chained &&/|| expressions using github.event_name, github.event.id and
github.event.check_run.* are fragile because GitHub Actions expressions lack a
ternary operator and can mis-evaluate when properties are undefined or falsy;
refactor to remove the dual-path inline logic by handling each event type in
separate conditional steps or jobs (use job/step-level if on github.event_name
== 'status' and github.event_name == 'check_run') or create distinct workflows
for each trigger so you can directly reference github.event.id,
github.event.state, github.event.sha for status events and
github.event.check_run.id, github.event.check_run.conclusion,
github.event.check_run.head_sha for check_run events instead of the current
fallback expressions.

- **Buildkite Organization**: ${{ inputs.buildkite-org }}

## Constraints
@@ -133,15 +134,16 @@ Classify each failure to guide your investigation:
### Step 1: Gather Context

1. Call `generate_agents_md` to get the repository's coding guidelines and conventions. If this fails, continue without it.
- 2. Identify the PRs associated with the workflow run using `github.event.workflow_run.pull_requests`. If there are none, call `noop` with message "No pull request associated with workflow run; nothing to do" and stop.
- 3. For each PR, call `pull_request_read` with method `get` to capture the author, branches, and fork status.
+ 2. Resolve the failed commit SHA from the triggering event (`github.event.sha` for `status`, `github.event.check_run.head_sha` for `check_run`).
+ 3. Call `list_pull_requests` for the repository (open PRs), then call `pull_request_read` with method `get` on candidates and keep PRs where `head.sha` matches the failed commit SHA. If none match, call `noop` with message "No pull request associated with failed commit status; nothing to do" and stop.
+ 4. For each matching PR, keep author, branches, and fork status for downstream analysis.
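
For comparison, the SHA-to-PR matching described in the new Step 1 can also be done outside the agent with GitHub's "list pull requests associated with a commit" REST endpoint. The sketch below is an assumption about how that might look as a plain workflow step. It is not part of this PR, and the job and step names are hypothetical.

```yaml
# Illustrative sketch only; the job and step names are hypothetical.
jobs:
  find-prs:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: read
    steps:
      - name: List open PRs whose head is the failed commit
        env:
          GH_TOKEN: ${{ github.token }}
          REPO: ${{ github.repository }}
          # Only one of these is set for a given event; the other is empty.
          FAILED_SHA: ${{ github.event.sha || github.event.check_run.head_sha }}
        run: |
          # The endpoint returns every PR that contains the commit, so keep
          # only open PRs whose current head is exactly the failed SHA.
          gh api "repos/${REPO}/commits/${FAILED_SHA}/pulls" \
            | jq --arg sha "$FAILED_SHA" \
                '[.[] | select(.state == "open" and .head.sha == $sha) | .number]'
```

An empty array from the final filter corresponds to the `noop` case in the prompt.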

### Step 2: Find the Buildkite Build

- > **If Buildkite MCP is unavailable** (connection error, 401, timeout): The build failure may come from GitHub Actions CI, not Buildkite. Fall back to analyzing the GitHub Actions workflow run directly — use `web-fetch` to retrieve the run page, `bash` to call `gh run view`, or the GitHub API to read job logs. Proceed to Step 3 using whatever evidence is available and note in your comment that Buildkite data was unavailable.
+ > **If Buildkite MCP is unavailable** (connection error, 401, timeout): The build failure may come from GitHub checks/status contexts outside Buildkite. Fall back to analyzing the failing status/check context directly — use the GitHub API (`pull_request_read` status endpoints), `web-fetch`, or `bash` with `gh` to inspect related checks/jobs. Proceed to Step 3 using whatever evidence is available and note in your comment that Buildkite data was unavailable.

1. **Resolve the pipeline**: If `${{ inputs.buildkite-pipeline }}` is provided, use it. Otherwise, call `list_pipelines` for organization `${{ inputs.buildkite-org }}` and find the pipeline whose slug matches the repository name (extract the repo name from `${{ github.repository }}`). If multiple pipelines match, prefer an exact slug match.
- 2. **Find the failed build**: Call `list_builds` for the resolved pipeline, filtering by commit SHA `${{ github.event.workflow_run.head_sha }}`. If no match by SHA, use the PR's head branch (from the `pull_request_read` response in Step 1) to filter builds and select the most recent failed one.
+ 2. **Find the failed build**: Call `list_builds` for the resolved pipeline, filtering by the failed commit SHA resolved in Step 1. If no match by SHA, use the PR's head branch (from the `pull_request_read` response in Step 1) to filter builds and select the most recent failed one.
3. **Collect failure evidence**:
- Call `get_build` for the matched build to get overall status and job list.
- For each **failed** job:
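
The "find the failed build" lookup in Step 2 goes through the Buildkite MCP `list_builds` tool. As a rough equivalent for debugging outside the agent, the Buildkite REST API accepts `commit` and `state` filters when listing a pipeline's builds. The sketch below assumes a secret named `BUILDKITE_API_TOKEN` and a pipeline slug `my-pipeline`. Both are hypothetical and do not come from this PR.

```yaml
# Illustrative sketch only; the secret name, job name, and pipeline slug are hypothetical.
jobs:
  find-build:
    runs-on: ubuntu-latest
    steps:
      - name: List failed Buildkite builds for the commit
        env:
          BUILDKITE_API_TOKEN: ${{ secrets.BUILDKITE_API_TOKEN }}
          BUILDKITE_ORG: elastic
          BUILDKITE_PIPELINE: my-pipeline
          FAILED_SHA: ${{ github.event.sha || github.event.check_run.head_sha }}
        run: |
          # --get turns the --data-urlencode pairs into query parameters.
          # The commit filter expects the full SHA, not an abbreviated one.
          curl --fail --silent --show-error --get \
            -H "Authorization: Bearer ${BUILDKITE_API_TOKEN}" \
            --data-urlencode "commit=${FAILED_SHA}" \
            --data-urlencode "state=failed" \
            "https://api.buildkite.com/v2/organizations/${BUILDKITE_ORG}/pipelines/${BUILDKITE_PIPELINE}/builds" \
            | jq '[.[] | {number, state, web_url}]'
```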
8 changes: 4 additions & 4 deletions .github/workflows/trigger-pr-buildkite-detective.yml
@@ -2,8 +2,8 @@
# Edit gh-agent-workflows/pr-buildkite-detective/example.yml and run 'make compile' to regenerate.
name: Trigger PR Buildkite Detective
on:
- workflow_run:
- workflows: ["CI", "Build", "Test"]
+ status:
+ check_run:
types: [completed]

permissions:
@@ -16,8 +16,8 @@ permissions:
jobs:
run:
if: >-
- github.event.workflow_run.conclusion == 'failure' &&
- toJSON(github.event.workflow_run.pull_requests) != '[]'
+ (github.event_name == 'status' && github.event.state == 'failure') ||
+ (github.event_name == 'check_run' && github.event.check_run.conclusion == 'failure')
Comment on lines +19 to +20
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n .github/workflows/trigger-pr-buildkite-detective.yml

Repository: elastic/ai-github-actions

Length of output: 1206


🏁 Script executed:

# Find and read the downstream workflow
fd 'gh-aw-pr-buildkite-detective.lock.yml' --exec head -100 {}

echo "---- Search for Buildkite origin guards in all related files ----"
rg -n "context|app\.slug|buildkite" .github/workflows/ gh-agent-workflows/ | head -50

Repository: elastic/ai-github-actions

Length of output: 6752


🏁 Script executed:

# Find the downstream workflow
find . -name "gh-aw-pr-buildkite-detective.lock.yml" -type f -exec cat -n {} \;

echo "---- Source example.yml ----"
cat -n gh-agent-workflows/pr-buildkite-detective/example.yml

Repository: elastic/ai-github-actions

Length of output: 50381


🏁 Script executed:

# Search for the pre_activation job and Buildkite origin checks
grep -n "pre_activation\|buildkite\|app\.slug\|status\.context" .github/workflows/gh-aw-pr-buildkite-detective.lock.yml | head -20

# Also check the source configuration
grep -rn "buildkite\|app\.slug\|status\.context" gh-agent-workflows/pr-buildkite-detective/

Repository: elastic/ai-github-actions

Length of output: 12962


🏁 Script executed:

# Search for pre_activation job definition in trigger-pr-buildkite-detective.yml
sed -n '/^  pre_activation:/,/^  [a-z]/p' .github/workflows/trigger-pr-buildkite-detective.yml

# If not found, check if it's in a different location
grep -A 50 "pre_activation:" .github/workflows/trigger-pr-buildkite-detective.yml

Repository: elastic/ai-github-actions

Length of output: 51


🏁 Script executed:

# Read the full trigger workflow to see if there's a pre_activation job
cat .github/workflows/trigger-pr-buildkite-detective.yml

Repository: elastic/ai-github-actions

Length of output: 1017


🏁 Script executed:

# Extract the pre_activation job from the reusable workflow
grep -n "^  pre_activation:" .github/workflows/gh-aw-pr-buildkite-detective.lock.yml

# Get more context around it
sed -n '/^  pre_activation:/,/^  [a-zA-Z]/p' .github/workflows/gh-aw-pr-buildkite-detective.lock.yml | head -100

Repository: elastic/ai-github-actions

Length of output: 1085


Scope failure filter to Buildkite-originated events.

At lines 19-20, this condition matches all failing status/check_run events, not just Buildkite. The downstream pre_activation job filters only by user role, not event origin. Add Buildkite-source guards to prevent false-positive detective runs from unrelated CI checks.

Suggested condition update
     if: >-
-      (github.event_name == 'status' && github.event.state == 'failure') ||
-      (github.event_name == 'check_run' && github.event.check_run.conclusion == 'failure')
+      (github.event_name == 'status' &&
+       github.event.state == 'failure' &&
+       startsWith(github.event.context, 'buildkite/')) ||
+      (github.event_name == 'check_run' &&
+       github.event.check_run.conclusion == 'failure' &&
+       github.event.check_run.app.slug == 'buildkite')
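
Before relying on the guard values in the suggestion, it may help to confirm what real events in the repository carry. The throwaway workflow below is a sketch for that purpose only. Its workflow, job, and step names are hypothetical, and the `buildkite/` context prefix and `buildkite` app slug remain assumptions until checked against its output.

```yaml
# Illustrative sketch only; the workflow, job, and step names are hypothetical.
name: Inspect failure origin
on:
  status:
  check_run:
    types: [completed]

jobs:
  inspect:
    runs-on: ubuntu-latest
    steps:
      - name: Print the fields the guard would test
        env:
          EVENT_NAME: ${{ github.event_name }}
          STATUS_CONTEXT: ${{ github.event.context }}
          CHECK_APP_SLUG: ${{ github.event.check_run.app.slug }}
        run: |
          # A field that does not exist for the current event arrives empty,
          # so the shell default makes the gap visible in the log.
          echo "event:          ${EVENT_NAME}"
          echo "status context: ${STATUS_CONTEXT:-<not a status event>}"
          echo "check app slug: ${CHECK_APP_SLUG:-<not a check_run event>}"
```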
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/trigger-pr-buildkite-detective.yml around lines 19 - 20,
The existing workflow condition matches any failing status/check_run events
(github.event_name == 'status' && github.event.state == 'failure') ||
(github.event_name == 'check_run' && github.event.check_run.conclusion ==
'failure'); tighten it to only trigger for Buildkite-originated events by adding
a Buildkite origin guard to each branch of the boolean expression, matching the
suggested condition: require startsWith(github.event.context, 'buildkite/') for
status events and github.event.check_run.app.slug == 'buildkite' for check_run
events, so the condition becomes (github.event_name == 'status' &&
github.event.state == 'failure' && startsWith(github.event.context,
'buildkite/')) || (github.event_name == 'check_run' &&
github.event.check_run.conclusion == 'failure' &&
github.event.check_run.app.slug == 'buildkite'); confirm the context prefix and
app slug against a real event payload, then update the conditional in the
workflow to include this guard.

uses: ./.github/workflows/gh-aw-pr-buildkite-detective.lock.yml
with:
buildkite-org: "elastic"
5 changes: 3 additions & 2 deletions gh-agent-workflows/pr-buildkite-detective/README.md
@@ -4,7 +4,7 @@ Analyze failed PR checks backed by Buildkite and report findings (read-only).

## How it works

- Triggered automatically when a CI workflow fails on a PR. Looks up the related Buildkite build via MCP, analyzes failed jobs/logs/annotations, and posts a comment with root cause and recommended fixes. Read-only — never pushes changes.
+ Triggered automatically when a commit status or check run reports a failure. Looks up the related Buildkite build via MCP, analyzes failed jobs/logs/annotations, and posts a comment with root cause and recommended fixes. Read-only — never pushes changes.

## Quick Install

@@ -20,7 +20,8 @@ See [example.yml](example.yml) for the full workflow file.

| Event | Types | Condition |
| --- | --- | --- |
- | `workflow_run` | `completed` | CI workflow failed and the run is associated with a PR |
+ | `status` | N/A | Commit status changed to `failure` |
+ | `check_run` | `completed` | Check run completed with conclusion `failure` |

## Inputs

8 changes: 4 additions & 4 deletions gh-agent-workflows/pr-buildkite-detective/example.yml
@@ -1,7 +1,7 @@
name: PR Buildkite Detective
on:
- workflow_run:
- workflows: ["CI", "Build", "Test"]
+ status:
+ check_run:
types: [completed]

permissions:
@@ -14,8 +14,8 @@ permissions:
jobs:
run:
if: >-
- github.event.workflow_run.conclusion == 'failure' &&
- toJSON(github.event.workflow_run.pull_requests) != '[]'
+ (github.event_name == 'status' && github.event.state == 'failure') ||
+ (github.event_name == 'check_run' && github.event.check_run.conclusion == 'failure')
uses: elastic/ai-github-actions/.github/workflows/gh-aw-pr-buildkite-detective.lock.yml@v0
with:
buildkite-org: "elastic"