79 changes: 79 additions & 0 deletions .github/workflows/gh-aw-fragments/pr-context.md
@@ -37,6 +37,83 @@ steps:
jq -r '[.[] | {filename, size: ((.additions // 0) + (.deletions // 0))}] | sort_by(-.size) | .[].filename' /tmp/pr-context/files.json \
> /tmp/pr-context/file_order_largest.txt

# Determine sub-agent count based on PR size
FILE_COUNT=$(jq 'length' /tmp/pr-context/files.json)
if [ "$FILE_COUNT" -le 10 ]; then
AGENT_COUNT=0
elif [ "$FILE_COUNT" -le 20 ]; then
AGENT_COUNT=2
else
AGENT_COUNT=3
fi
echo "$AGENT_COUNT" > /tmp/pr-context/agent_count.txt
echo "PR size: ${FILE_COUNT} files → ${AGENT_COUNT} sub-agents"

# Write review strategy with precise instructions for the agent
echo "# Review Strategy" > /tmp/pr-context/review-strategy.md
echo "" >> /tmp/pr-context/review-strategy.md
echo "**PR size:** ${FILE_COUNT} files | **Sub-agents:** ${AGENT_COUNT}" >> /tmp/pr-context/review-strategy.md
echo "" >> /tmp/pr-context/review-strategy.md

if [ "$AGENT_COUNT" -eq 0 ]; then
cat >> /tmp/pr-context/review-strategy.md << 'STRATEGY_DIRECT'
## Direct Review (no sub-agents)

This PR is small enough to review directly. Do NOT spawn sub-agents.

Review the diff file by file using the ordering in `/tmp/pr-context/file_order_az.txt`. For each changed file:

1. Read the diff from `/tmp/pr-context/diffs/<filename>.diff`
2. Read the full file from the workspace for context
3. Check existing threads in `/tmp/pr-context/threads/<filename>.json` (if it exists)
4. Identify issues matching the Code Review Reference criteria
5. Verify each issue: construct a concrete failure scenario, challenge the finding, check for existing threads

Proceed to the Verify and Comment step with your findings.
STRATEGY_DIRECT
elif [ "$AGENT_COUNT" -eq 2 ]; then
cat >> /tmp/pr-context/review-strategy.md << 'STRATEGY_TWO'
## Sub-agent Review (2 agents)

Spawn exactly 2 `code-review` sub-agents in parallel:

- **Agent 1**: file ordering from `/tmp/pr-context/file_order_az.txt` (A→Z)
- **Agent 2**: file ordering from `/tmp/pr-context/file_order_za.txt` (Z→A)

Each sub-agent prompt must include:
- Instruction to read `/tmp/pr-context/review-instructions.md` for the review process, criteria, and calibration examples
- Instruction to read `/tmp/pr-context/README.md` for a manifest of all available context files
- The review intensity and minimum severity settings from the workflow
- The path to that sub-agent's file ordering — tell it to read the file for its ordered list (per-file diffs are at `/tmp/pr-context/diffs/<filename>.diff`)
- Instruction to read changed files from the workspace (the PR branch is checked out)

Each sub-agent returns a structured findings list. They do NOT leave inline comments.

After both sub-agents complete, merge and deduplicate findings per the Pick Three, Keep Many process before proceeding to the Verify and Comment step.
STRATEGY_TWO
else
cat >> /tmp/pr-context/review-strategy.md << 'STRATEGY_THREE'
## Sub-agent Review (3 agents)

Spawn exactly 3 `code-review` sub-agents in parallel:

- **Agent 1**: file ordering from `/tmp/pr-context/file_order_az.txt` (A→Z)
- **Agent 2**: file ordering from `/tmp/pr-context/file_order_za.txt` (Z→A)
- **Agent 3**: file ordering from `/tmp/pr-context/file_order_largest.txt` (largest diff first)

Each sub-agent prompt must include:
- Instruction to read `/tmp/pr-context/review-instructions.md` for the review process, criteria, and calibration examples
- Instruction to read `/tmp/pr-context/README.md` for a manifest of all available context files
- The review intensity and minimum severity settings from the workflow
- The path to that sub-agent's file ordering — tell it to read the file for its ordered list (per-file diffs are at `/tmp/pr-context/diffs/<filename>.diff`)
- Instruction to read changed files from the workspace (the PR branch is checked out)

Each sub-agent returns a structured findings list. They do NOT leave inline comments.

After all 3 sub-agents complete, merge and deduplicate findings per the Pick Three, Keep Many process before proceeding to the Verify and Comment step.
STRATEGY_THREE
fi

# Existing reviews
gh api "repos/$GITHUB_REPOSITORY/pulls/$PR_NUMBER/reviews" --paginate \
| jq -s 'add // []' > /tmp/pr-context/reviews.json
@@ -132,6 +209,8 @@ steps:
| `threads/<path>.json` | Per-file review threads — one file per changed file with existing threads, mirroring the repo path under `threads/` |
| `comments.json` | PR discussion comments (not inline) |
| `issue-{N}.json` | Linked issue details (one file per linked issue, if any) |
| `agent_count.txt` | Pre-computed sub-agent count: `0` (≤10 files, direct review), `2` (11–20 files), or `3` (>20 files) |
| `review-strategy.md` | Pre-computed review strategy with precise instructions for the agent based on PR size |
| `agents.md` | Repository conventions from `generate_agents_md` (if written by agent) |
| `review-instructions.md` | Review instructions, criteria, and calibration examples (if written by review-process fragment) |
MANIFEST
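
The size thresholds recorded in the `agent_count.txt` row can be checked at their boundaries with a small sketch. The function name `agent_count_for` is hypothetical and not part of the PR; it only mirrors the `if`/`elif` ladder added to `pr-context.md` above.

```shell
# Hypothetical helper (not in the PR): maps a changed-file count to the
# sub-agent count, using the same thresholds as the step above.
#   <= 10 files  -> 0 (direct review)
#   11-20 files  -> 2
#   > 20 files   -> 3
agent_count_for() {
  file_count="$1"
  if [ "$file_count" -le 10 ]; then
    echo 0
  elif [ "$file_count" -le 20 ]; then
    echo 2
  else
    echo 3
  fi
}
```

For example, `agent_count_for 11` prints `2`, while `agent_count_for 10` prints `0`. Note that the ladder never yields `1`, which matches the three documented values.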
4 changes: 2 additions & 2 deletions .github/workflows/gh-aw-mention-in-pr-by-id.lock.yml

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions .github/workflows/gh-aw-mention-in-pr-no-sandbox.lock.yml

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions .github/workflows/gh-aw-mention-in-pr.lock.yml

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions .github/workflows/gh-aw-pr-review-addresser.lock.yml

Large diffs are not rendered by default.

22 changes: 5 additions & 17 deletions .github/workflows/gh-aw-pr-review.lock.yml

Large diffs are not rendered by default.

18 changes: 3 additions & 15 deletions .github/workflows/gh-aw-pr-review.md
@@ -120,25 +120,13 @@ Follow these steps in order.
4. Read `/tmp/pr-context/reviews.json` to check prior review submissions from this bot. Note any prior verdicts to avoid redundant reviews.
5. Read `/tmp/pr-context/review_comments.json` to check existing review threads. Note which files already have threads and whether they are resolved, unresolved, or outdated.

-### Step 2: Sub-agent Review
+### Step 2: Review

-**File orderings are pre-computed** at `/tmp/pr-context/`:
-- **Agent 1**: `/tmp/pr-context/file_order_az.txt` — alphabetical (A → Z)
-- **Agent 2**: `/tmp/pr-context/file_order_za.txt` — reverse alphabetical (Z → A)
-- **Agent 3**: `/tmp/pr-context/file_order_largest.txt` — by diff size descending
-
-**Spawn sub-agents:** Follow the **Pick Three, Keep Many** process — spawn 3 `code-review` sub-agents to review the PR diff in parallel. Each sub-agent prompt must include:
-- Instruction to read `/tmp/pr-context/review-instructions.md` for the review process, criteria, and calibration examples
-- Instruction to read `/tmp/pr-context/README.md` for a manifest of all available context files
-- The review intensity (`${{ inputs.intensity }}`) and minimum severity (`${{ inputs.minimum_severity }}`)
-- The path to that sub-agent's file ordering (e.g., `/tmp/pr-context/file_order_az.txt`) — tell it to read the file for its ordered list (per-file diffs are at `/tmp/pr-context/diffs/<filename>.diff`)
-- Instruction to read changed files from the workspace (the PR branch is checked out)
-
-Each sub-agent returns a structured findings list. They do NOT leave inline comments.
+Read `/tmp/pr-context/review-strategy.md` for the pre-computed review strategy. The strategy is determined by PR size and specifies the exact number of sub-agents (0, 2, or 3). Follow the instructions in that file exactly.

[HIGH] Conflicting instructions still force three sub-agents

This new strategy step says to follow `/tmp/pr-context/review-strategy.md` (which can require 0 or 2 agents), but the workflow still imports `gh-aw-fragments/pick-three-keep-many.md`, whose text explicitly says “Spawn 3 sub-agents” and “Wait for all 3 sub-agents”.

Because both instructions are present in the same prompt, the model can still follow the hardcoded “3 sub-agents” rule, which defeats the deterministic sizing behavior introduced here.

Please make the prompt sources consistent (for example, gate/remove the `pick-three-keep-many` fragment when the strategy says direct review, or rewrite that fragment to be conditional on `review-strategy.md`).
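
One way to catch this kind of conflict before merging is a small lint step, sketched here under stated assumptions: the helper name `fragment_conflicts` is hypothetical, and the pattern matches only the two phrases quoted in this comment, so it is not a general prompt-consistency check.

```shell
# Hypothetical helper, not part of the PR.
# Usage: fragment_conflicts <agent_count> <fragment_file>
# Succeeds (exit 0) when the computed agent count is not 3 but the fragment
# still contains a hardcoded "Spawn 3 sub-agents" / "all 3 sub-agents" phrase.
fragment_conflicts() {
  agent_count="$1"
  fragment_file="$2"
  # A count of 3 agrees with the hardcoded wording, so there is no conflict.
  [ "$agent_count" -ne 3 ] || return 1
  grep -qiE '(spawn|all) 3 sub-agents' "$fragment_file"
}
```

A workflow step could call it with the value from `/tmp/pr-context/agent_count.txt` and the fragment path, then fail the job when it succeeds.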


### Step 3: Verify and Comment

-After merging and deduplicating sub-agent findings per the Pick Three, Keep Many process, verify each finding before leaving a comment. For every finding:
+If sub-agents were used, merge and deduplicate findings per the Pick Three, Keep Many process. Verify each finding before leaving a comment. For every finding:

1. **Read the file and surrounding context** — open the full file, not just the diff. Understand the broader code.
2. **Construct a concrete failure scenario** — what specific input or state causes the bug? If you cannot describe one, drop the finding.
22 changes: 22 additions & 0 deletions github/workflows/gh-aw-fragments/pick-three-keep-many.md
@@ -0,0 +1,22 @@
### Pick Three, Keep Many

If your review strategy requires sub-agents, parallelize your work using sub-agents. Spawn the exact number of sub-agents specified by `/tmp/pr-context/review-strategy.md`, with each sub-agent approaching the task from a different angle (for example, different focus areas, heuristics, or file order). If the strategy says direct review, do not spawn sub-agents.

**How to spawn sub-agents:** Call `runSubagent` with the `agentType` and `model` specified by the workflow instructions below (defaulting to `agentType: "general-purpose"` and `model: "${{ inputs.model }}"` if none are specified). Sub-agents cannot see your conversation history, the other sub-agents' results, or any context you have gathered so far. Each prompt must be **fully self-contained** — include everything the sub-agent needs:

- The full task description and objective (restate it, don't summarize)
- All repository context, conventions, and constraints you've gathered (e.g., from `generate_agents_md`)
- Any relevant data the sub-agent needs to do its job (diffs, file contents, existing threads)
- The quality criteria and output format you expect
- The specific angle that distinguishes this sub-agent from the others

Err on the side of providing too much context rather than too little. A well-informed sub-agent with a 10,000-token prompt will produce far better results than one that has to rediscover the codebase from scratch.

**Wait for all spawned sub-agents to complete.** Do not proceed until every sub-agent you started has returned its result.

**Merge and deduplicate findings** across all sub-agents:
1. If multiple sub-agents flagged the same issue, keep the version with the strongest evidence and clearest explanation.
2. If a finding is unique to one sub-agent, include it only if it passes the quality gate on its own merits — a finding flagged by only one sub-agent deserves extra scrutiny.
3. Drop any finding that does not meet the verification criteria.

**Filter aggressively for quality.** Your job as the parent agent is to be the quality gate. Sub-agents cast a wide net; you decide what's worth keeping. For each surviving finding, verify it yourself — check that file paths exist, line numbers are accurate, the problem is real, and the finding is actionable. Discard anything vague, speculative, or already addressed. If no findings survive filtering, call `noop`.
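
The mechanical part of the merge step can be sketched in shell. The fragment does not define a findings format, so the tab-separated layout (`<file>`, `<line>`, `<message>`, one findings file per sub-agent) and the name `merge_findings` are assumptions made only for illustration. The sketch groups findings by location and counts how many sub-agents flagged each one; choosing the version with the strongest evidence still needs the parent agent's judgment.

```shell
# Hypothetical sketch, not part of the PR.
# Input: one TSV file per sub-agent, each line "<file>\t<line>\t<message>".
# Output: "<count>\t<file>\t<line>\t<message>", one line per (file, line) key,
# where <count> is the number of input files that flagged that location and
# <message> is the first message seen for it.
merge_findings() {
  awk -F'\t' '
    FNR == 1 { agent++ }                 # first line of each input file = next sub-agent
    {
      key = $1 FS $2
      if (!(key in first)) { first[key] = $3; order[++n] = key }
      if (seen[key, agent]++ == 0) count[key]++   # count each sub-agent once per key
    }
    END {
      for (i = 1; i <= n; i++) {
        key = order[i]
        printf "%d\t%s\t%s\n", count[key], key, first[key]
      }
    }
  ' "$@"
}
```

A count of `1` marks a finding that only one sub-agent reported, which is the case the fragment singles out for extra scrutiny.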
161 changes: 161 additions & 0 deletions github/workflows/gh-aw-pr-review.md
@@ -0,0 +1,161 @@
---
inlined-imports: true
name: "PR Review"
description: "AI code review with inline comments on pull requests"
imports:
- gh-aw-fragments/elastic-tools.md
- gh-aw-fragments/runtime-setup.md
- gh-aw-fragments/formatting.md
- gh-aw-fragments/rigor.md
- gh-aw-fragments/mcp-pagination.md
- gh-aw-fragments/pr-context.md
- gh-aw-fragments/review-process.md
- gh-aw-fragments/messages-footer.md
- gh-aw-fragments/safe-output-review-comment.md
- gh-aw-fragments/safe-output-submit-review.md
- gh-aw-fragments/pick-three-keep-many.md
- gh-aw-fragments/network-ecosystems.md
engine:
id: copilot
model: ${{ inputs.model }}
concurrency:
group: "gh-aw-copilot-${{ github.workflow }}-pr-review-${{ github.event.pull_request.number }}"
on:
workflow_call:
inputs:
model:
description: "AI model to use"
type: string
required: false
default: "gpt-5.3-codex"
additional-instructions:
description: "Repo-specific instructions appended to the agent prompt"
type: string
required: false
default: ""
setup-commands:
description: "Shell commands to run before the agent starts (dependency install, build, etc.)"
type: string
required: false
default: ""
allowed-bot-users:
description: "Allowlisted bot actor usernames (comma-separated)"
type: string
required: false
default: "github-actions[bot]"
intensity:
description: "Review intensity: conservative, balanced, or aggressive"
type: string
required: false
default: "balanced"
minimum_severity:
description: "Minimum severity for inline comments: critical, high, medium, low, or nitpick. Issues below this threshold go in a collapsible section of the review body instead."
type: string
required: false
default: "low"
messages-footer:
description: "Footer appended to all agent comments and reviews"
type: string
required: false
default: ""
create-pull-request-review-comment-max:
description: "Maximum number of review comments the agent can create per run"
type: string
required: false
default: "30"
secrets:
COPILOT_GITHUB_TOKEN:
required: true
roles: [admin, maintainer, write]
bots:
- "${{ inputs.allowed-bot-users }}"
concurrency:
group: ${{ github.workflow }}-pr-review-${{ github.event.pull_request.number }}
cancel-in-progress: true
permissions:
actions: read
contents: read
pull-requests: read
issues: read
tools:
github:
toolsets: [repos, issues, pull_requests, search, actions]
bash: true
web-fetch:
safe-outputs:
activation-comments: false
strict: false
timeout-minutes: 90
steps:
- name: Repo-specific setup
if: ${{ inputs.setup-commands != '' }}
env:
SETUP_COMMANDS: ${{ inputs.setup-commands }}
run: eval "$SETUP_COMMANDS"
---

# PR Review Agent

Review pull requests in ${{ github.repository }} and provide actionable feedback via inline review comments on specific code lines.

## Context

- **Repository**: ${{ github.repository }}
- **PR**: #${{ github.event.pull_request.number }} — ${{ github.event.pull_request.title }}
- **PR context on disk**: `/tmp/pr-context/` — PR metadata, diff, files, reviews, comments, and linked issues are pre-fetched. Read from these files instead of calling the API.

## Constraints

This workflow is read-only. You can read files, search code, run commands, and interact with PRs and issues — but your only outputs are inline review comments and a review submission.

## Review Process

Follow these steps in order.

### Step 1: Gather Context

1. Call `generate_agents_md` to get the repository's coding guidelines and conventions. Write the result to `/tmp/pr-context/agents.md` so sub-agents can read it. If `generate_agents_md` fails, continue without it.
2. Read `/tmp/pr-context/pr.json` for PR details (author, description, branches).
3. Read `/tmp/pr-context/issue-*.json` files if any exist to understand linked issue motivation and acceptance criteria.
4. Read `/tmp/pr-context/reviews.json` to check prior review submissions from this bot. Note any prior verdicts to avoid redundant reviews.
5. Read `/tmp/pr-context/review_comments.json` to check existing review threads. Note which files already have threads and whether they are resolved, unresolved, or outdated.

### Step 2: Review

Read `/tmp/pr-context/review-strategy.md` for the pre-computed review strategy. The strategy is determined by PR size and specifies the exact number of sub-agents (0, 2, or 3). Follow the instructions in that file exactly.

### Step 3: Verify and Comment

If sub-agents were used, merge and deduplicate findings per the Pick Three, Keep Many process. Verify each finding before leaving a comment. For every finding:

1. **Read the file and surrounding context** — open the full file, not just the diff. Understand the broader code.
2. **Construct a concrete failure scenario** — what specific input or state causes the bug? If you cannot describe one, drop the finding.
3. **Challenge the finding** — would a senior engineer familiar with this codebase agree this is a real issue? If "probably not" or "unsure", drop it.
4. **Check existing threads** — if this issue was already flagged in a prior review (resolved or unresolved), do not duplicate.

Only leave a comment if the finding survives all four checks. Findings flagged independently by multiple sub-agents are stronger candidates. Findings from only one sub-agent deserve extra scrutiny.

Leave inline comments (`create_pull_request_review_comment`) per the **Code Review Reference** above for each finding that survives verification. Comment on each file's findings before moving to the next file. If no findings survive verification, proceed directly to Step 4.

### Step 4: Submit the Review

**Skip if nothing new:** If you left zero inline comments during this review AND your verdict would be the same as the most recent review from this bot (compare against reviews in Step 1), call `noop` with a message like "No new findings — prior review still applies" and stop. Do not submit a redundant review.

After all comments are posted, step back and consider the PR as a whole. Call **`submit_pull_request_review`** with:
- The review type (REQUEST_CHANGES, COMMENT, or APPROVE)
- A review body that is **only the verdict and only if the verdict is not APPROVE**. If you have cross-cutting feedback that spans multiple files or cannot be expressed as inline comments, include it here. Otherwise, leave the review body empty — your inline comments already contain the detail.

**Bot-authored PRs:** If the PR author is `github-actions[bot]`, you can only submit a `COMMENT` review — `APPROVE` and `REQUEST_CHANGES` will fail because GitHub does not allow bot accounts to approve or request changes on their own PRs. Use `COMMENT` and state your verdict in the review body instead.

**Do NOT** describe what the PR does, list the files you reviewed, summarize inline comments, or restate prior review feedback. The PR author already knows what their PR does. Your inline comments already contain all the detail. The review body exists solely to communicate the approve/request-changes decision and important/critical feedback that cannot be covered in inline comments.

If you have no issues, or you have only provided NITPICK and LOW issues, submit an APPROVE review. Otherwise, submit a REQUEST_CHANGES review.

## Review Settings

- **Intensity**: `${{ inputs.intensity }}`
- **Minimum inline severity**: `${{ inputs.minimum_severity }}`

These override the defaults defined in the Code Review Reference above.

${{ inputs.additional-instructions }}