
refactor(agents): migrate dependabot AW review to workflow_run trigger #612

Merged
WilliamBerryiii merged 16 commits into main from chore/aw-stale-ci on May 13, 2026

Conversation

@katriendg
Collaborator

Description

The aw-dependabot-pr-review agentic workflow used to fire on pull_request_target, which meant the resolver step captured a snapshot of PR Validation while it was still pending or in_progress:*, and the advisory review was posted before the orchestrator ever finished. PR #608 was the canonical example: the review correctly applied the Isaac Sim numpy 2.x ABI guard, but its CI banner quoted a stale in_progress:in_progress conclusion.

This PR migrates the workflow to workflow_run keyed on PR Validation completed, reads the orchestrator's terminal conclusion straight from context.payload.workflow_run.conclusion, and pre-resolves failing per-surface check-runs once in the resolver step. The persona rubric is rewritten to consume those env vars and to map every terminal conclusion explicitly - pending and in_progress:* branches are gone because they are now unreachable.

Related to #579.

Type of Change

  • 🐛 Bug fix (non-breaking change fixing an issue)
  • ✨ New feature (non-breaking change adding functionality)
  • 💥 Breaking change (fix or feature causing existing functionality to change)
  • 📚 Documentation update
  • 🏗️ Infrastructure change (Terraform/IaC)
  • ♻️ Refactoring (no functional changes)

Component(s) Affected

  • infrastructure/terraform/prerequisites/ - Azure subscription setup
  • infrastructure/terraform/ - Terraform infrastructure
  • infrastructure/setup/ - OSMO control plane / Helm
  • workflows/ - Training and evaluation workflows
  • training/ - Training pipelines and scripts
  • docs/ - Documentation

Changes

Workflow trigger and resolver

Switching to workflow_run runs the agent step against the trusted, default-branch copy of the workflow, so the gh-aw compiler can auto-inject fork-PR exclusion and the repository.id guard.

  • Replaced pull_request_target with workflow_run on workflows: ["PR Validation"], types: [completed], branches: ["dependabot/**"]. The branches: filter on workflow_run matches the triggering run's head_branch (not the base), so dependabot/** is the only value that fires for Dependabot PRs — using main here was the #583 regression fixed in #584. The workflow-level if: filters on workflow_run.event == 'pull_request', workflow_run.actor.login == 'dependabot[bot]', and a whitelist of seven terminal conclusions.
  • Kept on.bots: ["dependabot[bot]"] and on.roles: [admin, maintainer, write] at the top level — gh-aw's pre_activation guard checks the triggering actor against on.bots / on.roles independently of the workflow if:, so dropping these would resurrect the #585 / #586 User permission 'none' activation block.
  • Added checks: read to permissions: for server-side check-run enumeration; existing contents, pull-requests, and actions scopes are unchanged.
  • Rewrote the resolve-pr step. It reads context.payload.workflow_run, prefers workflow_run.pull_requests[0], and falls back to search.issuesAndPullRequests keyed on head_sha for the fork case. Both paths re-hydrate via pulls.get so body and draft are reliable.
  • Dropped the previous listWorkflowRunsForRepo lookup. PR_VALIDATION_CONCLUSION now reads directly from run.conclusion, which under types: [completed] is always one of success, failure, cancelled, timed_out, neutral, skipped, or action_required.
  • Added two new env vars exported by the resolver:
    • PR_VALIDATION_FAILING_CHECKS — JSON array of {name, html_url, conclusion} from checks.listForRef(ref=pr.head.sha) filtered to completed non-success/non-neutral/non-skipped runs.
    • PR_BODY — PR body hydrated server-side so the agent does not depend on the integrity-filtered MCP read of the PR.
  • New skip reasons in PR_DEPENDABOT_SKIP_REASON: not-a-pr-run and pr-resolution-failed, alongside the existing not-dependabot / draft.
  • Retargeted safe-outputs:
    • submit-pull-request-review.target → ${{ env.PR_NUMBER }}
    • add-comment.target → ${{ env.PR_NUMBER }} (was triggering, which is undefined under workflow_run)
    • create-pull-request-review-comment.target → "*"
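
As a rough illustration of the gating and filtering described in this list, the following Python sketch models the workflow-level `if:` filter and the `PR_VALIDATION_FAILING_CHECKS` filter as pure functions. The real logic lives in the workflow YAML and the resolver script; the function names (`should_review`, `failing_checks`) and the dictionary shapes are assumptions made for illustration, loosely following the `workflow_run` payload and check-run fields named above.

```python
import json

# The seven terminal conclusions this PR lists for types: [completed].
TERMINAL_CONCLUSIONS = {
    "success", "failure", "cancelled", "timed_out",
    "neutral", "skipped", "action_required",
}

# Conclusions that are filtered out when building the failing-check list.
NON_FAILING = {"success", "neutral", "skipped"}


def should_review(run: dict) -> bool:
    """Hypothetical model of the workflow-level if: filter."""
    return (
        run.get("event") == "pull_request"
        and run.get("actor", {}).get("login") == "dependabot[bot]"
        and run.get("conclusion") in TERMINAL_CONCLUSIONS
    )


def failing_checks(check_runs: list) -> str:
    """Build the JSON array that would be exported as PR_VALIDATION_FAILING_CHECKS."""
    failing = [
        {
            "name": check["name"],
            "html_url": check["html_url"],
            "conclusion": check["conclusion"],
        }
        for check in check_runs
        # Keep only completed runs whose conclusion is not success/neutral/skipped.
        if check.get("status") == "completed"
        and check.get("conclusion") not in NON_FAILING
    ]
    return json.dumps(failing)
```

In this sketch a run that is still in progress never reaches the failing list, which mirrors the "completed non-success/non-neutral/non-skipped" wording of the env-var contract.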

Persona verdict rubric

The agent now reasons over a final CI signal, so the rubric collapses to a clean terminal-conclusion map.

  • Rewrote the Validation Signal section in .github/agents/dependabot-pr-reviewer.agent.md. The persona is told the workflow runs after PR Validation reaches a terminal conclusion, and is explicitly forbidden from calling checks.listForRef or commits/{sha}/check-runs — it reads PR_VALIDATION_FAILING_CHECKS from the environment instead.
  • Reframed the Surface to Check Run Map as an informational lookup for mapping a failing check name back to its dependency surface. The persona no longer walks it via the API.
  • Rewrote the Verdict Adjustment block as an explicit terminal-conclusion map:
    • success + no static concern + no sticky high-risk trigger → APPROVE-eligible, citing the orchestrator conclusion plus an empty PR_VALIDATION_FAILING_CHECKS.
    • failure | cancelled | timed_out | action_required → COMMENT; body MUST quote every entry from PR_VALIDATION_FAILING_CHECKS (name plus html_url).
    • neutral | skipped | unknown or PR_DEPENDABOT_SKIP_REASON == 'pr-resolution-failed' → COMMENT with a > [!CAUTION] banner: Deterministic CI signal unavailable ({conclusion}); review is advisory only.
  • Preserved the sticky Isaac Sim ABI guard verbatim — a numpy 2.x bump still keeps the verdict at COMMENT and forces the ⚠️ Maintainer review recommended banner regardless of CI conclusion.
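
The terminal-conclusion map above is a rubric written for the persona, not code, but it can be sketched as a small Python function to make the precedence explicit. Everything here is illustrative: the `verdict` function, its parameters, and the exact body wording are assumptions. The sketch also assumes that a `success` conclusion with a non-empty failing list, or any unrecognised conclusion, falls back to COMMENT, which the rubric does not spell out.

```python
FAILED = {"failure", "cancelled", "timed_out", "action_required"}
MAINTAINER_BANNER = "⚠️ Maintainer review recommended"


def verdict(conclusion, failing, skip_reason="", static_concern=False,
            sticky_high_risk=False):
    """Return (verdict, body) for a terminal PR Validation conclusion."""
    lines = []
    # The sticky guard (e.g. a numpy 2.x bump) adds its banner regardless of CI.
    if sticky_high_risk:
        lines.append(MAINTAINER_BANNER)

    # No deterministic signal: neutral, skipped, unknown, or PR resolution failed.
    if skip_reason == "pr-resolution-failed" or conclusion not in FAILED | {"success"}:
        lines.append("> [!CAUTION]")
        lines.append(
            f"> Deterministic CI signal unavailable ({conclusion}); "
            "review is advisory only."
        )
        return "COMMENT", "\n".join(lines)

    # Failing conclusions: quote every failing check by name and URL.
    if conclusion in FAILED:
        lines.append(f"PR Validation concluded {conclusion}.")
        lines += [f"- {check['name']}: {check['html_url']}" for check in failing]
        return "COMMENT", "\n".join(lines)

    # conclusion == "success": approve only when nothing else objects.
    if static_concern or sticky_high_risk or failing:
        lines.append("PR Validation concluded success, but review concerns remain.")
        return "COMMENT", "\n".join(lines)
    lines.append(
        "PR Validation concluded success; PR_VALIDATION_FAILING_CHECKS is empty."
    )
    return "APPROVE", "\n".join(lines)
```

Under these assumptions, `verdict("success", [], sticky_high_risk=True)` stays at COMMENT even though CI passed, which is the behaviour the sticky Isaac Sim ABI guard is meant to preserve.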

Workflow documentation and lock files

  • Rewrote the Trigger Posture and step-by-step prose in aw-dependabot-pr-review.md to describe the workflow_run execution model, the gh-aw compiler's auto-injected fork-PR exclusion and repository.id guard, and the new env-var contract.
  • Bumped github/gh-aw-actions/setup v0.68.3 → v0.71.1 in .github/aw/actions-lock.json (SHA ba90f21… → 239aec4…), picked up by recompilation.
  • Regenerated .github/workflows/aw-dependabot-pr-review.lock.yml via the gh-aw compiler — diff reflects the trigger swap, the new env vars, and the setup-action SHA bump. No hand edits.

Testing Performed

  • Terraform plan reviewed (no unexpected changes)
  • Terraform apply tested in dev environment
  • Training scripts tested locally with Isaac Sim
  • OSMO workflow submitted successfully
  • Smoke tests passed (smoke_test_azure.py)

None of the templated test surfaces apply — this PR only touches .github/agents/ and .github/workflows/. Validation evidence: npm run lint:md and npm run lint:yaml pass on the changed files; the aw-dependabot-pr-review.lock.yml artifact is regenerated rather than hand-edited and matches the gh-aw compiler output for the new source. The behavioural change is observable on the next Dependabot PR — the advisory review will fire after PR Validation completes and quote the orchestrator's terminal conclusion plus any failing per-surface checks.

Documentation Impact

  • No documentation changes needed
  • Documentation updated in this PR
  • Documentation issue filed

Bug Fix Checklist

Not a bug fix — this is a refactor of an agentic-workflow trigger surface.

  • Linked to issue being fixed
  • Regression test included, OR
  • Justification for no regression test:

Checklist

Related Issues

Related to #579

Notes

The min-integrity: approved setting on tools.github is intentionally preserved. The agent's MCP PR-body read is therefore filtered, which is why the resolver hydrates PR_BODY from the REST API server-side — the persona consumes the env var rather than relying on the filtered MCP payload.

  • Lowering min-integrity to unapproved was rejected on prompt-injection grounds; the resolver-side hydration is the chosen mitigation.
  • workflow_run runs in default-branch context, which means changes to the AW workflow itself cannot be exercised by a Dependabot PR — this is the secure-by-design tradeoff documented in the GitHub Security Lab "preventing pwn requests" guide and aligns with the gh-aw workflow_run recommendation.

Follow-up Tasks

  • Validate behaviour on a grouped Dependabot update that produces multiple PR Validation runs against the same head SHA — confirm that only the latest completed run drives the advisory review.
  • After the first live Dependabot PR runs through the new trigger, compare the posted review's CI banner against the orchestrator's final conclusion and the failing-check list to confirm the staleness regression observed in PR #608 (security(deps): bump the training-dependencies group across 1 directory with 76 updates) is gone.
  • Confirm that safe-outputs.submit-pull-request-review and add-comment post successfully under workflow_run — the target: ${{ env.PR_NUMBER }} overrides are the #588 / #589 mitigation; a Not in pull request context skip in safe_outputs would mean the env var did not resolve.

katriendg and others added 2 commits May 4, 2026 18:31
- Switch trigger from pull_request_target to workflow_run gated on PR Validation completion on main
- Filter on workflow_run.actor.login == 'dependabot[bot]' (replacing pull_request_target bots:/roles: allowlists)
- Hydrate PR_VALIDATION_CONCLUSION from workflow_run payload and PR_VALIDATION_FAILING_CHECKS via checks.listForRef
- Tighten persona verdict rubric so non-success conclusions map to COMMENT with caution banner
- Replace persona check-run API walk with resolver-supplied env vars
- Regenerate aw-dependabot-pr-review.lock.yml

🤖 - Generated by Copilot

Co-authored-by: Copilot <copilot@github.com>
…dabot branches

- change workflow_run branches from main to dependabot/**
- clarify workflow execution context for Dependabot PRs

🔧 - Generated by Copilot
@katriendg katriendg requested a review from a team as a code owner May 5, 2026 06:19
@github-actions
Contributor

github-actions Bot commented May 5, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Snapshot Warnings

⚠️: No snapshots were found for the head SHA 2964d8f.
Ensure that dependencies are being submitted on PR branches and consider enabling retry-on-snapshot-warnings. See the documentation for more information and troubleshooting advice.

Scanned Files

None

@codecov-commenter

codecov-commenter commented May 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.63%. Comparing base (3f88f17) to head (2964d8f).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #612   +/-   ##
=======================================
  Coverage   88.63%   88.63%           
=======================================
  Files         252      252           
  Lines       18018    18019    +1     
  Branches     2492     2492           
=======================================
+ Hits        15971    15972    +1     
  Misses       1579     1579           
  Partials      468      468           
| Flag | Coverage Δ | *Carryforward flag |
|---|---|---|
| pester | 83.14% <100.00%> (+<0.01%) ⬆️ | |
| pytest-data-pipeline | 100.00% <ø> (ø) | Carriedforward from d53d89a |
| pytest-dataviewer | 93.60% <ø> (ø) | Carriedforward from d53d89a |
| pytest-dm-tools | 100.00% <ø> (ø) | Carriedforward from d53d89a |
| pytest-evaluation | 99.51% <ø> (ø) | |
| pytest-fuzz | 4.89% <ø> (ø) | Carriedforward from d53d89a |
| pytest-inference | 100.00% <ø> (ø) | Carriedforward from d53d89a |
| pytest-training | 93.32% <ø> (ø) | Carriedforward from d53d89a |
| vitest | 86.27% <ø> (ø) | Carriedforward from d53d89a |
| vitest-app | 86.27% <ø> (ø) | Carriedforward from d53d89a |
| vitest-components | 86.27% <ø> (ø) | Carriedforward from d53d89a |
| vitest-features | 86.27% <ø> (ø) | Carriedforward from d53d89a |
| vitest-lib | 86.27% <ø> (ø) | Carriedforward from d53d89a |
| vitest-state | 86.27% <ø> (ø) | Carriedforward from d53d89a |

*This pull request uses carry forward flags. Click here to find out more.

| Files with missing lines | Coverage Δ |
|---|---|
| scripts/linting/Invoke-YamlLint.ps1 | 93.18% <100.00%> (+0.07%) ⬆️ |

katriendg added 3 commits May 5, 2026 08:47
# Conflicts:
#	.github/workflows/aw-dependabot-pr-review.lock.yml
- Merge main to resolve lock file conflict
- Upgrade gh-aw from v0.71.1 to v0.71.5
- Recompile aw-dependabot-pr-review workflow

🤖 - Generated by Copilot
@katriendg katriendg force-pushed the chore/aw-stale-ci branch from 5d10a1c to eec026b Compare May 8, 2026 10:59
Member

@WilliamBerryiii WilliamBerryiii left a comment

Thanks @katriendg — really nice migration. Pulling the trigger, resolver, and dispatch logic into a clean workflow_run shape makes the Dependabot review path feel a lot more durable, and the explicit no-checks.listForRef-in-resolver guard is a great touch.

A couple of thoughts that aren't blocking:

  • Concurrency group scope (.lock.yml) — gh-aw-copilot-${{ github.workflow }} serializes every Dependabot review, so a batch of ~10 PRs would stack end-to-end (~10× wall time). Scoping per-PR (e.g., gh-aw-copilot-${{ github.workflow }}-${{ github.event.workflow_run.pull_requests[0].number || github.event.workflow_run.head_sha }}) would let independent reviews run in parallel while still serializing within a single PR.
  • Dead "Checkout PR branch" step (.lock.yml) — under workflow_run, neither github.event.pull_request nor github.event.issue.pull_request is set, so that step is permanently skipped. That's intentional (the resolver pulls everything from REST), but a one-line note in the source .md "Trigger Posture" section — something like "The agent runs without a working tree — all PR context comes from REST APIs in the resolver. Do not add a checkout step." — would save the next reader from trying to "fix" the missing checkout.

A few other items we looked at and don't think need action: the XPIA prompt-injection guidance reads correctly under the new trigger; the surface-to-check-run name mapping in the resolver matches the matrix names we could verify; and the > [!CAUTION] callout is the project's preferred alert syntax.

Thanks again for moving this forward!
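
To illustrate the per-PR concurrency scoping suggested in this review, here is a minimal Python model of how the proposed group expression would evaluate. It only mimics the `||` fallback from the first PR number to `head_sha`; the `concurrency_group` helper and the sample payloads are hypothetical, and the actual key is produced by GitHub Actions expression evaluation, not by code like this.

```python
def concurrency_group(workflow: str, run: dict) -> str:
    """Model the suggested group key: PR number if present, else head SHA."""
    prs = run.get("pull_requests") or []
    # Mirrors `pull_requests[0].number || head_sha` in the suggested expression.
    discriminator = prs[0]["number"] if prs else run["head_sha"]
    return f"gh-aw-copilot-{workflow}-{discriminator}"


# Two runs for different PRs get different groups and can run in parallel;
# two runs for the same PR share a group and are serialized.
same_repo_run = {"pull_requests": [{"number": 101}], "head_sha": "aaa111"}
fork_style_run = {"pull_requests": [], "head_sha": "bbb222"}
```

With these sample payloads, the first run is keyed on the PR number and the second falls back to the head SHA, so a batch of independent Dependabot PRs would no longer queue behind one another.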

Comment thread .github/workflows/aw-dependabot-pr-review.md Outdated
Comment thread .github/workflows/aw-dependabot-pr-review.md Outdated
Comment thread .github/workflows/aw-dependabot-pr-review.md Outdated
Comment thread scripts/linting/Invoke-YamlLint.ps1
Member

@bindsi bindsi left a comment

Nice Katrien!!!

katriendg added 6 commits May 11, 2026 14:23
- Disambiguate fork-PR search fallback by filtering to open
  Dependabot PRs with matching SHA; fail loudly on ambiguity
- Paginate checks.listForRef to avoid silently missing check-runs
  when the CI matrix grows beyond a single page

🤖 - Generated by Copilot
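
The disambiguation described in this commit can be sketched roughly as follows. This is a Python model of the idea, not the resolver's actual code: `resolve_pr_number`, `PrResolutionError`, and the candidate shape (already hydrated with `state`, `user.login`, and `head.sha`) are assumptions. The sketch also assumes that zero matches returns `None` so the caller can record a skip reason, while more than one match raises.

```python
class PrResolutionError(Exception):
    """Raised when more than one candidate PR matches the head SHA."""


def resolve_pr_number(run: dict, candidates: list):
    """Prefer workflow_run.pull_requests[0]; otherwise filter search candidates."""
    prs = run.get("pull_requests") or []
    if prs:
        return prs[0]["number"]

    # Fallback path: keep only open Dependabot PRs whose head SHA matches the run.
    matches = [
        pr for pr in candidates
        if pr.get("state") == "open"
        and pr.get("user", {}).get("login") == "dependabot[bot]"
        and pr.get("head", {}).get("sha") == run["head_sha"]
    ]
    if len(matches) > 1:
        # Fail loudly instead of guessing which PR the run belongs to.
        numbers = [pr["number"] for pr in matches]
        raise PrResolutionError(f"ambiguous PR match for {run['head_sha']}: {numbers}")
    return matches[0]["number"] if matches else None
```

Raising on ambiguity trades availability for correctness: a review posted on the wrong PR would be worse than no review.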
…exclusions

- Remove || 'unknown' fallback on run.conclusion since
  workflow_run types: [completed] guarantees conclusion is set
- Consolidate lock file exclusions in Invoke-YamlLint.ps1 to
  a single post-filter; drop redundant inline exclusions

🤖 - Generated by Copilot
- Add concurrency.job-discriminator keyed on workflow_run.head_sha
  so independent Dependabot PR reviews run in parallel instead of
  serializing end-to-end through a single concurrency group

🤖 - Generated by Copilot
- Clarify that the agent runs without a working tree and the
  compiler-generated checkout step is permanently skipped under
  workflow_run

🤖 - Generated by Copilot
@katriendg
Collaborator Author

Thanks for your review @WilliamBerryiii. I believe I have addressed all comments. After digging further into the docs, the concurrency group scope should be addressed as well: concurrency is a first-class frontmatter key, and the docs show both workflow-level concurrency.group and concurrency.job-discriminator for fan-out patterns. For the workflow_run trigger, the default compiler behavior maps to Schedule/Other → gh-aw-${{ github.workflow }} (no PR scoping), which is exactly the serialization you flagged.

For the second comment, I added a note to the MD file.

Ready for a second pass review. Thanks.

@katriendg katriendg requested a review from WilliamBerryiii May 11, 2026 14:42
@WilliamBerryiii WilliamBerryiii merged commit eb53059 into main May 13, 2026
49 checks passed
@WilliamBerryiii WilliamBerryiii deleted the chore/aw-stale-ci branch May 13, 2026 05:03
