refactor(agents): migrate dependabot AW review to workflow_run trigger by katriendg · Pull Request #612 · microsoft/physical-ai-toolchain

katriendg · 2026-05-05T06:19:30Z

Description

The aw-dependabot-pr-review agentic workflow used to fire on pull_request_target, which meant the resolver step captured a snapshot of PR Validation while it was still pending or in_progress:*, and the advisory review was posted before the orchestrator ever finished. PR #608 was the canonical example: the review correctly applied the Isaac Sim numpy 2.x ABI guard, but its CI banner quoted a stale in_progress:in_progress conclusion.

This PR migrates the workflow to workflow_run keyed on PR Validation completed, reads the orchestrator's terminal conclusion straight from context.payload.workflow_run.conclusion, and pre-resolves failing per-surface check-runs once in the resolver step. The persona rubric is rewritten to consume those env vars and to map every terminal conclusion explicitly - pending and in_progress:* branches are gone because they are now unreachable.

Related to #579.

Type of Change

🐛 Bug fix (non-breaking change fixing an issue)
✨ New feature (non-breaking change adding functionality)
💥 Breaking change (fix or feature causing existing functionality to change)
📚 Documentation update
🏗️ Infrastructure change (Terraform/IaC)
♻️ Refactoring (no functional changes)

Component(s) Affected

infrastructure/terraform/prerequisites/ - Azure subscription setup
infrastructure/terraform/ - Terraform infrastructure
infrastructure/setup/ - OSMO control plane / Helm
workflows/ - Training and evaluation workflows
training/ - Training pipelines and scripts
docs/ - Documentation

Changes

Workflow trigger and resolver

Switching to workflow_run runs the agent step against the trusted, default-branch copy of the workflow, so the gh-aw compiler can auto-inject fork-PR exclusion and the repository.id guard.

Replaced pull_request_target with workflow_run on workflows: ["PR Validation"], types: [completed], branches: ["dependabot/**"]. The branches: filter on workflow_run matches the triggering run's head_branch (not the base), so dependabot/** is the only value that fires for Dependabot PRs — using main here was the #583 regression fixed in #584. The workflow-level if: filters on workflow_run.event == 'pull_request', workflow_run.actor.login == 'dependabot[bot]', and a whitelist of seven terminal conclusions.
Kept on.bots: ["dependabot[bot]"] and on.roles: [admin, maintainer, write] at the top level — gh-aw's pre_activation guard checks the triggering actor against on.bots / on.roles independently of the workflow if:, so dropping these would resurrect the #585 / #586 User permission 'none' activation block.
Added checks: read to permissions: for server-side check-run enumeration; existing contents, pull-requests, and actions scopes are unchanged.
Rewrote the resolve-pr step. It reads context.payload.workflow_run, prefers workflow_run.pull_requests[0], and falls back to search.issuesAndPullRequests keyed on head_sha for the fork case. Both paths re-hydrate via pulls.get so body and draft are reliable.
Dropped the previous listWorkflowRunsForRepo lookup. PR_VALIDATION_CONCLUSION now reads directly from run.conclusion, which under types: [completed] is always one of success, failure, cancelled, timed_out, neutral, skipped, or action_required.
Added two new env vars exported by the resolver:
- PR_VALIDATION_FAILING_CHECKS — JSON array of {name, html_url, conclusion} from checks.listForRef(ref=pr.head.sha) filtered to completed non-success/non-neutral/non-skipped runs.
- PR_BODY — PR body hydrated server-side so the agent does not depend on the integrity-filtered MCP read of the PR.
New skip reasons in PR_DEPENDABOT_SKIP_REASON: not-a-pr-run and pr-resolution-failed, alongside the existing not-dependabot / draft.
Retargeted safe-outputs:
- submit-pull-request-review.target → ${{ env.PR_NUMBER }}
- add-comment.target → ${{ env.PR_NUMBER }} (was triggering, which is undefined under workflow_run)
- create-pull-request-review-comment.target → "*"
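
The resolver behaviour described above (prefer workflow_run.pull_requests[0], fall back to a head_sha search, and keep only completed check-runs that are not success, neutral, or skipped) can be sketched as follows. This is a minimal Python illustration, not the workflow's actual resolver code: the function names, the dict shapes (loosely modelled on the REST payloads), and the search_fallback callable are all assumptions made for the example.

```python
import json

# Conclusions that are NOT treated as failing when building the env var.
NON_FAILING = {"success", "neutral", "skipped"}


def failing_checks(check_runs):
    """Keep completed check-runs whose conclusion is not success/neutral/skipped."""
    return [
        {"name": r["name"], "html_url": r["html_url"], "conclusion": r["conclusion"]}
        for r in check_runs
        if r.get("status") == "completed" and r.get("conclusion") not in NON_FAILING
    ]


def resolve_pr_number(workflow_run, search_fallback):
    """Prefer workflow_run.pull_requests[0]; otherwise use a head_sha lookup.

    search_fallback is a hypothetical callable standing in for the
    search.issuesAndPullRequests call; it returns a PR number or None.
    """
    prs = workflow_run.get("pull_requests") or []
    if prs:
        return prs[0]["number"]
    return search_fallback(workflow_run["head_sha"])


# Illustrative data only; the check names and URLs are made up.
runs = [
    {"name": "lint", "status": "completed", "conclusion": "success",
     "html_url": "https://example.invalid/1"},
    {"name": "pytest-training", "status": "completed", "conclusion": "failure",
     "html_url": "https://example.invalid/2"},
    {"name": "docs", "status": "in_progress", "conclusion": None,
     "html_url": "https://example.invalid/3"},
]

# The JSON string that would be exported as PR_VALIDATION_FAILING_CHECKS.
env_value = json.dumps(failing_checks(runs))
```

With this sample input only the pytest-training entry survives the filter, and an empty pull_requests array (the fork case) sends resolution through the fallback.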

Persona verdict rubric

The agent now reasons over a final CI signal, so the rubric collapses to a clean terminal-conclusion map.

Rewrote the Validation Signal section in .github/agents/dependabot-pr-reviewer.agent.md. The persona is told the workflow runs after PR Validation reaches a terminal conclusion, and is explicitly forbidden from calling checks.listForRef or commits/{sha}/check-runs — it reads PR_VALIDATION_FAILING_CHECKS from the environment instead.
Reframed the Surface to Check Run Map as an informational lookup for mapping a failing check name back to its dependency surface. The persona no longer walks it via the API.
Rewrote the Verdict Adjustment block as an explicit terminal-conclusion map:
- success + no static concern + no sticky high-risk trigger → APPROVE-eligible, citing the orchestrator conclusion plus an empty PR_VALIDATION_FAILING_CHECKS.
- failure | cancelled | timed_out | action_required → COMMENT; body MUST quote every entry from PR_VALIDATION_FAILING_CHECKS (name plus html_url).
- neutral | skipped | unknown or PR_DEPENDABOT_SKIP_REASON == 'pr-resolution-failed' → COMMENT with a > [!CAUTION] banner: Deterministic CI signal unavailable ({conclusion}); review is advisory only.
Preserved the sticky Isaac Sim ABI guard verbatim — a numpy 2.x bump still keeps the verdict at COMMENT and forces the ⚠️ Maintainer review recommended banner regardless of CI conclusion.
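
The terminal-conclusion map above can be read as a small decision function. The sketch below is a hedged Python rendering of that rubric, not the persona's actual prompt logic: the function name, parameter names, and return shape are assumptions, and treating any unlisted combination (for example, success with a non-empty failing-check list) as COMMENT is an assumption as well.

```python
FAILING = {"failure", "cancelled", "timed_out", "action_required"}
UNAVAILABLE = {"neutral", "skipped", "unknown"}


def verdict(conclusion, failing, static_concern=False,
            sticky_high_risk=False, skip_reason=None):
    """Return (decision, notes) for a terminal PR Validation conclusion.

    failing is the parsed PR_VALIDATION_FAILING_CHECKS list.
    sticky_high_risk models the Isaac Sim numpy 2.x ABI guard.
    """
    decision = "COMMENT"  # assumed default for anything not explicitly approved
    notes = []
    if skip_reason == "pr-resolution-failed" or conclusion in UNAVAILABLE:
        notes.append(
            "> [!CAUTION]\n"
            f"> Deterministic CI signal unavailable ({conclusion}); "
            "review is advisory only."
        )
    elif conclusion in FAILING:
        # The body must quote every failing entry (name plus html_url).
        notes.extend(f"{c['name']}: {c['html_url']}" for c in failing)
    elif (conclusion == "success" and not failing
          and not static_concern and not sticky_high_risk):
        decision = "APPROVE-eligible"
    if sticky_high_risk:
        # The sticky guard applies regardless of the CI conclusion.
        notes.append("⚠️ Maintainer review recommended")
    return decision, notes
```

For example, `verdict("success", [])` yields an APPROVE-eligible decision with no notes, while the same call with `sticky_high_risk=True` stays at COMMENT and carries the maintainer-review banner.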

Workflow documentation and lock files

Rewrote the Trigger Posture and step-by-step prose in aw-dependabot-pr-review.md to describe the workflow_run execution model, the gh-aw compiler's auto-injected fork-PR exclusion and repository.id guard, and the new env-var contract.
Bumped github/gh-aw-actions/setup v0.68.3 → v0.71.1 in .github/aw/actions-lock.json (SHA ba90f21… → 239aec4…), picked up by recompilation.
Regenerated .github/workflows/aw-dependabot-pr-review.lock.yml via the gh-aw compiler — diff reflects the trigger swap, the new env vars, and the setup-action SHA bump. No hand edits.

Testing Performed

Terraform plan reviewed (no unexpected changes)
Terraform apply tested in dev environment
Training scripts tested locally with Isaac Sim
OSMO workflow submitted successfully
Smoke tests passed (smoke_test_azure.py)

None of the templated test surfaces apply — this PR only touches .github/agents/ and .github/workflows/. Validation evidence: npm run lint:md and npm run lint:yaml pass on the changed files; the aw-dependabot-pr-review.lock.yml artifact is regenerated rather than hand-edited and matches the gh-aw compiler output for the new source. The behavioural change is observable on the next Dependabot PR — the advisory review will fire after PR Validation completes and quote the orchestrator's terminal conclusion plus any failing per-surface checks.

Documentation Impact

No documentation changes needed
Documentation updated in this PR
Documentation issue filed

Bug Fix Checklist

Not a bug fix — this is a refactor of an agentic-workflow trigger surface.

Linked to issue being fixed
Regression test included, OR
Justification for no regression test:

Checklist

My code follows the project conventions
Commit messages follow conventional commit format
I have performed a self-review
Documentation impact assessed above
No new linting warnings introduced

Related Issues

Related to #579

Notes

The min-integrity: approved setting on tools.github is intentionally preserved. The agent's MCP PR-body read is therefore filtered, which is why the resolver hydrates PR_BODY from the REST API server-side — the persona consumes the env var rather than relying on the filtered MCP payload.

Lowering min-integrity to unapproved was rejected on prompt-injection grounds; the resolver-side hydration is the chosen mitigation.
workflow_run runs in default-branch context, which means changes to the AW workflow itself cannot be exercised by a Dependabot PR — this is the secure-by-design tradeoff documented in the GitHub Security Lab "preventing pwn requests" guide and aligns with the gh-aw workflow_run recommendation.

Follow-up Tasks

Validate behaviour on a grouped Dependabot update that produces multiple PR Validation runs against the same head SHA — confirm that only the latest completed run drives the advisory review.
After the first live Dependabot PR runs through the new trigger, compare the posted review's CI banner against the orchestrator's final conclusion and the failing-check list to confirm the staleness regression observed in PR #608 (security(deps): bump the training-dependencies group across 1 directory with 76 updates) is gone.
Confirm that safe-outputs.submit-pull-request-review and add-comment post successfully under workflow_run — the target: ${{ env.PR_NUMBER }} overrides are the #588 / #589 mitigation; a Not in pull request context skip in safe_outputs would mean the env var did not resolve.

- Switch trigger from pull_request_target to workflow_run gated on PR Validation completion on main
- Filter on workflow_run.actor.login == 'dependabot[bot]' (replacing pull_request_target bots:/roles: allowlists)
- Hydrate PR_VALIDATION_CONCLUSION from workflow_run payload and PR_VALIDATION_FAILING_CHECKS via checks.listForRef
- Tighten persona verdict rubric so non-success conclusions map to COMMENT with caution banner
- Replace persona check-run API walk with resolver-supplied env vars
- Regenerate aw-dependabot-pr-review.lock.yml

🤖 - Generated by Copilot

Co-authored-by: Copilot <copilot@github.com>

github-actions · 2026-05-05T06:19:44Z

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Snapshot Warnings

⚠️: No snapshots were found for the head SHA 2964d8f.

Ensure that dependencies are being submitted on PR branches and consider enabling retry-on-snapshot-warnings. See the documentation for more information and troubleshooting advice.

Scanned Files

None

codecov-commenter · 2026-05-05T06:21:24Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.63%. Comparing base (3f88f17) to head (2964d8f).

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #612   +/-   ##
=======================================
  Coverage   88.63%   88.63%           
=======================================
  Files         252      252           
  Lines       18018    18019    +1     
  Branches     2492     2492           
=======================================
+ Hits        15971    15972    +1     
  Misses       1579     1579           
  Partials      468      468

Flag	Coverage Δ		*Carryforward flag
pester	`83.14% <100.00%> (+<0.01%)`	⬆️
pytest-data-pipeline	`100.00% <ø> (ø)`		Carriedforward from d53d89a
pytest-dataviewer	`93.60% <ø> (ø)`		Carriedforward from d53d89a
pytest-dm-tools	`100.00% <ø> (ø)`		Carriedforward from d53d89a
pytest-evaluation	`99.51% <ø> (ø)`
pytest-fuzz	`4.89% <ø> (ø)`		Carriedforward from d53d89a
pytest-inference	`100.00% <ø> (ø)`		Carriedforward from d53d89a
pytest-training	`93.32% <ø> (ø)`		Carriedforward from d53d89a
vitest	`86.27% <ø> (ø)`		Carriedforward from d53d89a
vitest-app	`86.27% <ø> (ø)`		Carriedforward from d53d89a
vitest-components	`86.27% <ø> (ø)`		Carriedforward from d53d89a
vitest-features	`86.27% <ø> (ø)`		Carriedforward from d53d89a
vitest-lib	`86.27% <ø> (ø)`		Carriedforward from d53d89a
vitest-state	`86.27% <ø> (ø)`		Carriedforward from d53d89a

*This pull request uses carry forward flags. Click here to find out more.

Files with missing lines	Coverage Δ
scripts/linting/Invoke-YamlLint.ps1	`93.18% <100.00%> (+0.07%)`	⬆️

WilliamBerryiii

Thanks @katriendg — really nice migration. Pulling the trigger, resolver, and dispatch logic into a clean workflow_run shape makes the Dependabot review path feel a lot more durable, and the explicit no-checks.listForRef-in-resolver guard is a great touch.

A couple of thoughts that aren't blocking:

Concurrency group scope (.lock.yml) — gh-aw-copilot-${{ github.workflow }} serializes every Dependabot review, so a batch of ~10 PRs would stack end-to-end (~10× wall time). Scoping per-PR (e.g., gh-aw-copilot-${{ github.workflow }}-${{ github.event.workflow_run.pull_requests[0].number || github.event.workflow_run.head_sha }}) would let independent reviews run in parallel while still serializing within a single PR.
Dead "Checkout PR branch" step (.lock.yml) — under workflow_run, neither github.event.pull_request nor github.event.issue.pull_request is set, so that step is permanently skipped. That's intentional (the resolver pulls everything from REST), but a one-line note in the source .md "Trigger Posture" section — something like "The agent runs without a working tree — all PR context comes from REST APIs in the resolver. Do not add a checkout step." — would save the next reader from trying to "fix" the missing checkout.

A few other items we looked at and don't think need action: the XPIA prompt-injection guidance reads correctly under the new trigger; the surface-to-check-run name mapping in the resolver matches the matrix names we could verify; and the > [!CAUTION] callout is the project's preferred alert syntax.

Thanks again for moving this forward!

bindsi

Nice Katrien!!!

- Disambiguate fork-PR search fallback by filtering to open Dependabot PRs with matching SHA; fail loudly on ambiguity - Paginate checks.listForRef to avoid silently missing check-runs when the CI matrix grows beyond a single page 🤖 - Generated by Copilot

…exclusions - Remove || 'unknown' fallback on run.conclusion since workflow_run types: [completed] guarantees conclusion is set - Consolidate lock file exclusions in Invoke-YamlLint.ps1 to a single post-filter; drop redundant inline exclusions 🤖 - Generated by Copilot

- Add concurrency.job-discriminator keyed on workflow_run.head_sha so independent Dependabot PR reviews run in parallel instead of serializing end-to-end through a single concurrency group 🤖 - Generated by Copilot

- Clarify that the agent runs without a working tree and the compiler-generated checkout step is permanently skipped under workflow_run 🤖 - Generated by Copilot

katriendg · 2026-05-11T14:42:48Z

Thanks for your review @WilliamBerryiii - I believe I have addressed all comments, and by digging more into the docs, the concurrency group scope should also be addressed.
concurrency is a first-class frontmatter key. The docs show both workflow-level concurrency.group and concurrency.job-discriminator for fan-out patterns. For the workflow_run trigger, the default compiler behavior maps to Schedule/Other → gh-aw-${{ github.workflow }} (no PR scoping), which is exactly the serialization you flagged.

For the second comment, added a note to the MD file.

Ready for a second pass review. Thanks.

katriendg and others added 2 commits May 4, 2026 18:31

chore(workflows): update Dependabot PR review triggers to match depen…

db20d20

…dabot branches - change workflow_run branches from main to dependabot/** - clarify workflow execution context for Dependabot PRs 🔧 - Generated by Copilot

katriendg requested a review from a team as a code owner May 5, 2026 06:19

katriendg added 3 commits May 5, 2026 08:47

Merge branch 'main' into chore/aw-stale-ci

f811335

Merge remote-tracking branch 'origin/main' into chore/aw-stale-ci

7c9774a

# Conflicts: # .github/workflows/aw-dependabot-pr-review.lock.yml

build(agents): regenerate lock file with gh-aw v0.71.5

eec026b

- Merge main to resolve lock file conflict - Upgrade gh-aw from v0.71.1 to v0.71.5 - Recompile aw-dependabot-pr-review workflow 🤖 - Generated by Copilot

katriendg force-pushed the chore/aw-stale-ci branch from 5d10a1c to eec026b Compare May 8, 2026 10:59

katriendg and others added 3 commits May 8, 2026 13:48

Merge branch 'main' into chore/aw-stale-ci

5edc326

# Conflicts: # .github/workflows/aw-dependabot-pr-review.lock.yml

build(agents): recompile dependabot AW lock file after merge

0eaa3da

Merge branch 'main' into chore/aw-stale-ci

78371b9

WilliamBerryiii reviewed May 8, 2026

Comment thread .github/workflows/aw-dependabot-pr-review.md Outdated

Comment thread .github/workflows/aw-dependabot-pr-review.md Outdated

Comment thread .github/workflows/aw-dependabot-pr-review.md Outdated

Comment thread scripts/linting/Invoke-YamlLint.ps1

bindsi approved these changes May 11, 2026

katriendg added 6 commits May 11, 2026 14:23

chore: json linting

e1c86f1

Merge branch 'main' into chore/aw-stale-ci

859be42

docs(agents): note dead checkout step in Trigger Posture section

c7cc45a

- Clarify that the agent runs without a working tree and the compiler-generated checkout step is permanently skipped under workflow_run 🤖 - Generated by Copilot

katriendg requested a review from WilliamBerryiii May 11, 2026 14:42

rezatnoMsirhC approved these changes May 12, 2026

rezatnoMsirhC and others added 2 commits May 12, 2026 09:54

Merge branch 'main' into chore/aw-stale-ci

d53d89a

Merge branch 'main' into chore/aw-stale-ci

2964d8f

WilliamBerryiii merged commit eb53059 into main May 13, 2026
49 checks passed

WilliamBerryiii deleted the chore/aw-stale-ci branch May 13, 2026 05:03

physical-ai-toolchain-release Bot mentioned this pull request May 13, 2026

chore(main): release 0.9.0 #648

Open

WilliamBerryiii merged 16 commits into main from chore/aw-stale-ci

5 participants
