Integrate code-review skill into pr-review Pre-Flight and Try-Fix#34994
Conversation
Pre-Flight (Phase 1) now includes a deep code review step: - Part A (Steps 1-6): Context gathering (unchanged) - Part B (Step 7): Invokes code-review skill as independent sub-agent with independence-first principle preserved (reads code before narrative) - Outputs: content.md (context + summary) + code-review.md (full review) Try-Fix (Phase 2) now receives code-review findings as advisory hints: - Errors, warnings, failure-mode probes, blast radius passed to each model - Hints are advisory (inform approach) not mandatory (checklist to satisfy) - Only root-cause-relevant findings included to avoid distraction Report (Phase 3) updated: - Code Review row added to Phase Status table - Hard gate: NEEDS_CHANGES verdict blocks APPROVE regardless of Gate/Try-Fix - New section: Code Review Impact on Try-Fix Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The post-ai-summary-comment.ps1 script now loads pre-flight/code-review.md as a '🔬 Code Review — Deep Analysis' collapsible section, shown between Pre-Flight and Try-Fix in the PR comment. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
🚀 Dogfood this PR with:
curl -fsSL https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.sh | bash -s -- 34994Or
iex "& { $(irm https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.ps1) } 34994" |
There was a problem hiding this comment.
Pull request overview
This PR integrates the standalone code-review skill into the pr-review orchestrator so Pre-Flight produces deep code-review findings that can guide Try-Fix exploration, and so the final report can gate approval based on code-review verdicts.
Changes:
- Extend Pre-Flight to run
code-reviewas an independence-first sub-agent and persist full output topre-flight/code-review.md. - Thread Pre-Flight code-review findings into Try-Fix prompts as structured advisory hints.
- Update report logic and the AI summary comment script to include/display the deep code-review output.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| .github/skills/pr-review/SKILL.md | Updates the orchestrator docs to include code-review in Pre-Flight and adds a code_review_findings section to the try-fix prompt template. |
| .github/pr-review/pr-preflight.md | Adds Part B (Step 7) instructing a sub-agent invocation of code-review and defines new outputs (content.md + code-review.md). |
| .github/pr-review/pr-report.md | Adds code-review status/impact sections and a hard gate rule for NEEDS_CHANGES. |
| .github/scripts/post-ai-summary-comment.ps1 | Adds a new collapsible “Code Review — Deep Analysis” section sourced from pre-flight/code-review.md. |
| .github/skills/code-review/SKILL.md | Updates the “How this differs” section to reflect pr-review’s new 3-phase layout and sub-agent usage. |
| prompt: | | ||
| Invoke the try-fix skill for PR #XXXXX: | ||
| - problem: {bug description from Pre-Flight} | ||
| - platform: {platform from Platform Selection} | ||
| - test_command: {test command from detected test type — use BuildAndRunHostApp.ps1 for UITest, Run-DeviceTests.ps1 for DeviceTest, dotnet test for UnitTest} | ||
| - target_files: | ||
| - src/{area}/{file1}.cs | ||
| - src/{area}/{file2}.cs | ||
| - code_review_findings: | ||
| - errors: | ||
| - {❌ Error finding 1 with file:line reference} | ||
| - {❌ Error finding 2 with file:line reference} | ||
| - warnings: | ||
| - {⚠️ Warning finding 1 with file:line reference} | ||
| - {⚠️ Warning finding 2 with file:line reference} | ||
| - failure_modes: | ||
| - {Failure mode 1}: {What happens in this scenario} | ||
| - {Failure mode 2}: {What happens in this scenario} | ||
| - blast_radius: {Summary — e.g., "Runs for ALL toolbar items at startup, not just badged ones"} | ||
| - verdict: {LGTM / NEEDS_CHANGES / NEEDS_DISCUSSION} (confidence: {high/medium/low}) | ||
|
|
There was a problem hiding this comment.
The try-fix invocation template introduces a new top-level code_review_findings input, but try-fix’s documented interface only defines optional Hints. To keep the skill contract consistent (and reduce prompt-parsing ambiguity), consider passing these findings under the existing hints field (or explicitly updating try-fix’s Inputs section to document code_review_findings). Also, the current example nests errors/warnings/etc as list items (- errors:, - warnings:), which reads like a YAML list-of-maps; a simple mapping structure would be clearer for humans and agents.
| > **🚨 Independence-first requirement:** Step 7 MUST be invoked as a **separate sub-agent** (via the `task` tool with `agent_type: "general-purpose"`) so the code-review skill can form its assessment from the code BEFORE reading any PR narrative. The sub-agent receives ONLY the PR number — not the context gathered in Part A. This prevents anchoring bias. | ||
|
|
||
| 7. **Invoke the code-review skill as a sub-agent:** | ||
|
|
||
| ``` | ||
| prompt: | | ||
| Run the code-review skill for PR #XXXXX. | ||
| Follow the full 6-step workflow in .github/skills/code-review/SKILL.md. | ||
| Output the review in the format specified by that skill. | ||
| agent_type: general-purpose | ||
| mode: sync | ||
| ``` |
There was a problem hiding this comment.
Step 7 says to invoke the code-review skill via the task tool as a separate sub-agent, but the example block only shows prompt, agent_type, and mode without indicating the actual task invocation shape used elsewhere (e.g., including the tool name/fields needed to launch the sub-agent, and—if required by the runner—the model selection). Tightening this snippet to match the real task call format would prevent confusion and make the “independence-first” requirement easier to follow correctly.
🔍 Multi-Model Code Review — PR #34994PR: Integrate code-review skill into pr-review Pre-Flight and Try-Fix CI Status
No test failures detected. Re-Review v3 (after commits
|
| # | Finding | Status |
|---|---|---|
| 1–4, 6–9 | Original 8 findings | ✅ FIXED (unchanged since v2) |
| 5/A | Invalid name param in task snippet |
✅ DISMISSED — name is a valid task tool parameter (generates human-readable agent ID). 2/3 reviewers confirmed after schema verification. Previous finding was incorrect. |
Verified Correct ✅
| Area | Verdict | Details |
|---|---|---|
post-gate-comment.ps1 deletion |
✅ Safe | All gate functionality migrated to unified script; legacy cleanup handles old comments |
Merge-Sessions regex fix |
✅ Correct | .Replace($block, ..., 1) — only outer <details> tag opened/collapsed; inner phase tags preserved |
| Double-call behavior | ✅ Correct | First call (gate-only) creates comment; second call (full) replaces same SHA session in-place via Merge-Sessions keying |
Review-PR.ps1 script reference |
✅ Correct | Points to post-ai-summary-comment.ps1; both early (gate) and STEP 3 (full) calls aligned |
RuntimeIdentifierOverride + _MauiDeviceTestUnpackaged |
✅ Matches CI | Verified against eng/devices/windows.cake lines 153, 181–183 |
WindowsAppSDKSelfContained removal |
✅ Correct | Replaced by _MauiDeviceTestUnpackaged which scopes it to device test projects via .props |
| Gate-only early post | ✅ Works | Gate section alone satisfies the if (-not $gateSection -and ...) guard |
| DryRun safety | ✅ Correct | Legacy cleanup only runs after actual post (DryRun exits earlier) |
New Findings
🟡 MODERATE
1. Legacy cleanup re-fetches comments already in scope (3/3 reviewers)
📁 post-ai-summary-comment.ps1 (cleanup section, ~line 354)
The script paginate-fetches all PR comments twice per run: once to find the existing <!-- AI Summary --> comment (~line 243), and again to find legacy <!-- AI Gate --> comments for cleanup (~line 356). The first fetch's $allComments is already in scope.
Fix: Reuse $allComments from the first fetch instead of making a second gh api ... --paginate call. This is called on every gate post and every phase post, so it doubles API traffic per review run.
🟢 MINOR
2. Legacy cleanup lacks a guard against deleting the unified comment (1/3 reviewers — included as defensive suggestion)
📁 post-ai-summary-comment.ps1 (cleanup section)
If a gate/content.md file ever contained the literal string <!-- AI Gate -->, the unified comment body would match the legacy marker and the cleanup would delete the just-posted comment. Risk is low (content is auto-generated), but a defensive guard is easy:
$legacy = @($allComments | Where-Object {
$_.body -and $_.body.Contains($legacyMarker) -and -not $_.body.Contains($MARKER)
})3. $postGateScript variable name is stale (1/3 reviewers)
📁 Review-PR.ps1 (~line 502)
Variable is named $postGateScript but now points to post-ai-summary-comment.ps1. Cosmetic — rename to $postSummaryScript for clarity.
Recommendation
✅ Approve
The expanded PR is well-executed. The unified comment architecture is sound (double-call with SHA-keyed merging works correctly), the Merge-Sessions regex fix is correct, the Windows device-test build fix matches CI exactly, and nothing is lost in the post-gate-comment.ps1 deletion. The previous name parameter finding was incorrect and is dismissed.
The one actionable item (redundant API call, Finding 1) is a nice-to-have optimization, not a blocker. Findings 2–3 are defensive suggestions.
📜 Review History
v1 (initial): 9 findings,
v2 (after 369cfe6): 8/9 FIXED, 1 partially fixed + 1 new moderate, ✅ APPROVE WITH MINOR FIX
v3 (after 3183be2, 8ebfe44, 67b1a9a): Previous partial fix DISMISSED (was incorrect), 1 new moderate (API optimization), 2 minor suggestions. ✅ APPROVE
… independence Finding 1: Add failure/fallback path for Step 7 code-review sub-agent. If sub-agent fails/times out: set verdict to SKIPPED, omit hints from Try-Fix prompts, do NOT apply code-review hard gate in Report. Finding 2: Add explicit read instructions in Report prerequisites. Agent must read pre-flight/content.md before determining recommendation. Finding 3: Restructure decision table as priority cascade with 'first match wins' semantics. Add row for code-review SKIPPED/unavailable. Resolve conflicting rows by explicit priority ordering. Finding 4: Fold code_review_findings into try-fix's documented 'hints' field instead of introducing an undocumented top-level input. Finding 5: Complete the task tool invocation snippet with proper task() call format including description parameter. Finding 6: Fix template/selectivity contradiction by using comments in template to show warnings are conditionally included. Finding 7: Add validation constraint for independence-first (prompt MUST NOT contain Part A context). Clarify 'independent' in Try-Fix means different fix approach, not isolation from code-review context. Finding 8: Re-add Gate row to Quick Reference table with recovery guidance for when gate result is missing from prompt. Finding 9: Fix phase numbering (Phase 3 → Phase 2 for multi-model config) and Report gate prerequisite (Phases 1-3 → Phases 1-2). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Addressed all 9 review findings in commit 369cfe6:
|
Passing /p:WindowsAppSDKSelfContained=true on the dotnet command line propagates the property to ALL ProjectReferences (e.g., Graphics.csproj), which then fail the WindowsAppSDK arch validation because the RID is not propagated to dependency builds. This caused the Gate on PR #34699 to fail both 'without fix' and 'with fix' builds with: WindowsAppSDKSelfContained requires a supported Windows architecture [Graphics.csproj::TargetFramework=net10.0-windows10.0.19041.0] Replace WindowsAppSDKSelfContained=true with the documented _MauiDeviceTestUnpackaged=true property, which the test project's Microsoft.Maui.TestUtils.DeviceTests.Runners.props converts to WindowsAppSDKSelfContained=true ONLY on the device test project itself. Also add SelfContained=true to mirror the canonical pattern in eng/devices/windows.cake (buildOnly task, lines 178-188). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
After switching to _MauiDeviceTestUnpackaged the build progressed past Graphics.csproj but then failed in the PRI generation step because TestUtils.DeviceTests was built without a RID subfolder while Controls.DeviceTests was built at win-x64\, so PRI175/PRI252 fired: WINAPPSDKGENERATEPROJECTPRIFILE : error : PRI175: 0x80070002 - Processing Resources failed with error: The system cannot find the file specified. WINAPPSDKGENERATEPROJECTPRIFILE : error : PRI252: 0xdef00071 - File ...\TestUtils.DeviceTests\...\win-x64\Microsoft.Maui.TestUtils.DeviceTests.pri not found. Plain '-r win-x64' (RuntimeIdentifier) is auto-suppressed on non-leaf ProjectReferences, leaving dependency assets in the no-RID output dir. Use RuntimeIdentifierOverride instead - this is exactly what the cake script does (eng/devices/windows.cake:153) and forces every referenced project in the graph to honor the RID. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Previously the Gate (Test Before/After Fix) result and the AI Review
summary were posted as two separate comments on the PR via
post-gate-comment.ps1 and post-ai-summary-comment.ps1 respectively.
This produced two side-by-side bot comments per review run, which is
noisy and made it hard to follow what happened on a given commit.
Now there is a single comment per PR (marker: <!-- AI Summary -->).
Each session inside that comment includes a top, always-open
"Gate -- Test Before & After Fix" section followed by the existing
collapsed phase sections (Pre-Flight / Code Review / Fix / Report).
Changes:
* post-ai-summary-comment.ps1
- Loads gate/content.md alongside the phase content; renders it as
the first <details open> section inside each session block.
- Either gate or phase content is sufficient -- no longer requires
phases to exist (so the script can be called early after the gate
finishes).
- Fixes a long-standing Merge-Sessions regex bug: the previous
[regex]::Replace(... , 1) call was interpreting the trailing 1 as
a RegexOptions value rather than a replace count, which silently
became a global replace and forced EVERY inner <details> tag in
the latest session to expand. Now uses the instance-method form
[regex]::new(pattern).Replace(input, replacement, 1) so only the
outer session wrapper is re-tagged.
- Cleans up any legacy standalone <!-- AI Gate --> comments after a
successful post.
* Review-PR.ps1
- STEP 1 now invokes post-ai-summary-comment.ps1 (which will create
or update the unified comment with just the gate section if no
phase content exists yet) instead of the deleted
post-gate-comment.ps1.
- STEP 3 still calls the same script after the review phases finish,
enriching the same comment in place via the SHA-based session
merge.
* post-gate-comment.ps1
- Deleted; no remaining callers.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…4994) > [!NOTE] > Are you waiting for the changes in this PR to be merged? > It would be very helpful if you could [test the resulting artifacts](https://github.com/dotnet/maui/wiki/Testing-PR-Builds) from this PR and let us know in a comment if this change resolves your issue. Thank you! ## Description Integrates the standalone `code-review` skill into the `pr-review` orchestrator so that deep code analysis happens during Pre-Flight and its findings feed into Try-Fix exploration. ### Changes **Pre-Flight (Phase 1)** — `.github/pr-review/pr-preflight.md` - Added Part B (Step 7): Invokes code-review skill as an independent sub-agent - Independence-first principle preserved — sub-agent receives only the PR number, not Part A context - Outputs both `content.md` (context + code review summary) and `code-review.md` (full review) **Try-Fix (Phase 2)** — `.github/skills/pr-review/SKILL.md` - Try-fix prompt template now includes `code_review_findings` (errors, warnings, failure modes, blast radius) - Findings are advisory hints — inform approach, not a mandatory checklist - Only root-cause-relevant findings included to avoid distraction **Report (Phase 3)** — `.github/pr-review/pr-report.md` - Code Review row added to Phase Status table - Hard gate: `NEEDS_CHANGES` verdict blocks `APPROVE` regardless of Gate/Try-Fix - New "Code Review Impact on Try-Fix" section **AI Summary** — `.github/scripts/post-ai-summary-comment.ps1` - Full code-review output shown as separate "🔬 Code Review — Deep Analysis" collapsible section - Gracefully skipped if `code-review.md` does not exist ### Design decisions - The `code-review` skill itself is NOT modified — stays standalone and independently invokable - The `try-fix` skill is NOT modified — already accepts hints - Code review runs as a sub-agent to prevent anchoring bias (independence-first) - Report hard gate prevents approving PRs with unresolved code review errors --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Note
Are you waiting for the changes in this PR to be merged?
It would be very helpful if you could test the resulting artifacts from this PR and let us know in a comment if this change resolves your issue. Thank you!
Description
Integrates the standalone
code-reviewskill into thepr-revieworchestrator so that deep code analysis happens during Pre-Flight and its findings feed into Try-Fix exploration.Changes
Pre-Flight (Phase 1) —
.github/pr-review/pr-preflight.mdcontent.md(context + code review summary) andcode-review.md(full review)Try-Fix (Phase 2) —
.github/skills/pr-review/SKILL.mdcode_review_findings(errors, warnings, failure modes, blast radius)Report (Phase 3) —
.github/pr-review/pr-report.mdNEEDS_CHANGESverdict blocksAPPROVEregardless of Gate/Try-FixAI Summary —
.github/scripts/post-ai-summary-comment.ps1code-review.mddoes not existDesign decisions
code-reviewskill itself is NOT modified — stays standalone and independently invokabletry-fixskill is NOT modified — already accepts hints