6 changes: 3 additions & 3 deletions .github/agents/pr/PLAN-TEMPLATE.md
@@ -57,7 +57,7 @@ See `SHARED-RULES.md` for complete details. Key points:

**Round 1: Run try-fix with each model (SEQUENTIAL)**
- [ ] claude-sonnet-4.5
- [ ] claude-opus-4.5
- [ ] claude-opus-4.6
- [ ] gpt-5.2
- [ ] gpt-5.2-codex
- [ ] gemini-3-pro-preview
@@ -68,10 +68,10 @@ See `SHARED-RULES.md` for complete details. Key points:
- [ ] Invoke EACH model: "Any NEW fix ideas?"
- [ ] Record responses in Cross-Pollination table
- [ ] Run try-fix for new ideas (SEQUENTIAL)
- [ ] Repeat until ALL 5 say "NO NEW IDEAS" (max 3 rounds)
- [ ] Repeat until ALL 6 say "NO NEW IDEAS" (max 3 rounds)

**Completion:**
- [ ] Cross-Pollination table has all 5 responses
- [ ] Cross-Pollination table has all 6 responses
- [ ] Mark Exhausted: Yes
- [ ] Compare passing candidates with PR's fix
- [ ] Select best fix (results → simplicity → robustness)
2 changes: 1 addition & 1 deletion .github/agents/pr/SHARED-RULES.md
@@ -132,7 +132,7 @@ Phase 4 uses these 5 AI models for try-fix exploration (run SEQUENTIALLY):
| Order | Model |
|-------|-------|
| 1 | `claude-sonnet-4.5` |
| 2 | `claude-opus-4.5` |
| 2 | `claude-opus-4.6` |
| 3 | `gpt-5.2` |
| 4 | `gpt-5.2-codex` |
| 5 | `gemini-3-pro-preview` |
74 changes: 71 additions & 3 deletions .github/agents/pr/post-gate.md
@@ -24,6 +24,16 @@ If Gate is not passed, go back to `.github/agents/pr.md` and complete phases 1-2

If try-fix cannot run due to environment issues, **STOP and ask the user**. Do NOT mark attempts as "BLOCKED" and continue.

### 🚨 CRITICAL: Stop on Environment Blockers (Applies to Phase 4)

The same "Stop on Environment Blockers" rule from `pr.md` applies here. If try-fix cannot run due to:
- Missing Appium drivers
- Device/emulator not available
- WinAppDriver not installed
- Platform tools missing

**STOP and ask the user** before continuing. Do NOT mark try-fix attempts as "BLOCKED" and continue. Either fix the environment issue or get explicit user permission to skip.
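
A preflight check along these lines could surface the blockers listed above before any try-fix attempt starts. This is a minimal sketch and not part of the PR: `missing_tools` is a hypothetical helper, and the command names passed to it in the usage note (`appium`, `adb`) are assumptions about what a given environment exposes on `PATH`.

```bash
# Hypothetical helper (not in the repository): report each named command
# that cannot be found on PATH. Returns 1 if at least one is missing.
missing_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf 'missing: %s\n' "$tool"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `missing_tools appium adb || echo "STOP: ask the user"` would halt before Round 1 instead of recording a "BLOCKED" attempt. It only checks that the commands exist; it does not verify that a device or emulator is actually attached.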

---

## 🔧 FIX: Explore and Select Fix (Phase 3)
@@ -52,7 +62,7 @@ Phase 4 uses a **multi-model approach** to maximize fix diversity. Each AI model

#### Round 1: Run try-fix with Each Model

Run the `try-fix` skill **5 times sequentially**, once with each model (see `SHARED-RULES.md` for model list).
Run the `try-fix` skill **6 times sequentially**, once with each model (see `SHARED-RULES.md` for model list).

**For each model**, invoke the try-fix skill:
```
@@ -70,14 +80,31 @@ Generate ONE independent fix idea. Review the PR's fix first to ensure your approach is DIFFERENT.

**Wait for each to complete before starting the next.**

**🧹 MANDATORY: Clean up between attempts.** After each try-fix completes (pass or fail), run these commands before starting the next attempt:

```bash
# 1. Restore any baseline state from the previous attempt (safe no-op if none exists)
pwsh .github/scripts/EstablishBrokenBaseline.ps1 -Restore

# 2. Restore all tracked files to HEAD (the merged PR state)
# This catches any files the previous attempt modified but didn't restore
git checkout HEAD -- .

# 3. Remove untracked files added by the previous attempt
# git checkout restores tracked files but does NOT remove new untracked files
git clean -fd --exclude=CustomAgentLogsTmp/
```

**Why this is required:** Each try-fix attempt modifies source files. If an attempt fails mid-way (build error, timeout, model error), it may not run its own cleanup step. Without explicit cleanup, the next attempt starts with a dirty working tree, which can cause missing files, corrupt state, or misleading test results. Use `HEAD` (not just `-- .`) to also restore deleted files.
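
Because the same three commands have to run after every attempt, they could be wrapped in one function so that no step is skipped. The sketch below is an illustration only: `run_step`, `cleanup_between_attempts`, and the `DRY_RUN` variable are hypothetical names, while the three commands themselves are copied from the cleanup steps above.

```bash
# Hypothetical wrapper around the three cleanup commands from this section.
# With DRY_RUN=1 each command is printed instead of executed.
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

cleanup_between_attempts() {
  # 1. Restore any baseline state left by the previous attempt.
  run_step pwsh .github/scripts/EstablishBrokenBaseline.ps1 -Restore || return 1
  # 2. Restore all tracked files to HEAD.
  run_step git checkout HEAD -- . || return 1
  # 3. Remove untracked files, keeping the agent log directory.
  run_step git clean -fd --exclude=CustomAgentLogsTmp/ || return 1
}
```

Running `DRY_RUN=1 cleanup_between_attempts` first shows exactly what would be executed, which is useful because `git clean -fd` deletes untracked files irreversibly.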

#### Round 2+: Cross-Pollination Loop (MANDATORY)

After Round 1, invoke EACH of the 5 models to ask for new ideas. **No shortcuts allowed.**

**❌ WRONG**: Using `explore`/`glob`, declaring exhaustion without invoking each model
**✅ CORRECT**: Invoke EACH model via task agent and ask explicitly

**Steps (repeat until all 5 say "NO NEW IDEAS", max 3 rounds):**
**Steps (repeat until all 6 say "NO NEW IDEAS", max 3 rounds):**

1. **Compile bounded summary** (max 3-4 bullets per attempt):
- Attempt #, approach (1 line), result (✅/❌), key learning (1 line)
@@ -176,7 +203,7 @@ Update the state file:
| Model | Round 2 Response |
|-------|------------------|
| claude-sonnet-4.5 | NO NEW IDEAS |
| claude-opus-4.5 | NO NEW IDEAS |
| claude-opus-4.6 | NO NEW IDEAS |
| gpt-5.2 | NO NEW IDEAS |
| gpt-5.2-codex | NO NEW IDEAS |
| gemini-3-pro-preview | NO NEW IDEAS |
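
The exhaustion condition recorded in this table can be expressed as a simple check: the loop may stop only when every recorded response is exactly "NO NEW IDEAS". The sketch below assumes the responses have already been collected as plain strings; `all_exhausted` is a hypothetical name, and how the responses are gathered from each model is outside its scope.

```bash
# Hypothetical check: succeed only if at least one response was recorded
# and every response is exactly "NO NEW IDEAS".
all_exhausted() {
  local response
  [ "$#" -gt 0 ] || return 1
  for response in "$@"; do
    [ "$response" = "NO NEW IDEAS" ] || return 1
  done
  return 0
}
```

An empty response list is treated as "not exhausted", so the check cannot pass without the models actually having been invoked.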
@@ -298,3 +325,44 @@ Update all phase statuses to complete.
- ❌ **Forgetting to revert between attempts** - Each try-fix must start from broken baseline, end with PR restored
- ❌ **Declaring exhaustion prematurely** - All 5 models must confirm "no new ideas" via actual invocation
- ❌ **Rushing the report** - Take time to write clear justification
- ❌ **Skipping cleanup between attempts** - ALWAYS run `-Restore` + `git checkout HEAD -- .` + `git clean -fd --exclude=CustomAgentLogsTmp/` between try-fix attempts (see Step 1)

---

## Common Errors and Recovery

### skill(try-fix) fails with "ENOENT: no such file or directory"

**Symptom:** `skill(try-fix) Failed to read skill file: Error: ENOENT: no such file or directory, open '.../.github/skills/try-fix/SKILL.md'`

**Root cause:** A previous try-fix attempt failed mid-way and left the working tree in a dirty state. Files may have been modified or deleted by `EstablishBrokenBaseline.ps1` without being restored.

**Fix:** Run cleanup before retrying:
```bash
pwsh .github/scripts/EstablishBrokenBaseline.ps1 -Restore
git checkout HEAD -- .
git clean -fd --exclude=CustomAgentLogsTmp/
```

Then retry the try-fix attempt. The skill file should now be accessible.

**Prevention:** Always run the cleanup commands between try-fix attempts (see Step 1).

### try-fix attempt starts with dirty working tree

**Symptom:** `git status` shows modified files before the attempt starts, or the build fails with unexpected errors from files the attempt didn't touch.

**Root cause:** Previous attempt didn't restore its changes (crashed, timed out, or model didn't follow Step 8 restore instructions).

**Fix:** Same as above — run `-Restore` + `git checkout HEAD -- .` + `git clean -fd --exclude=CustomAgentLogsTmp/` to reset to the merged PR state.
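
The symptom above can be detected mechanically before an attempt starts. A minimal sketch, assuming `git` is on `PATH`; `tree_is_dirty` is a hypothetical helper name, and it relies on `git status --porcelain` printing nothing for a clean tree.

```bash
# Hypothetical helper: succeed (exit 0) when the working tree at the given
# directory has modified, staged, or untracked files.
# Note: if the directory is not a git repository, git prints nothing on
# stdout and this reports "clean", so run it from inside the repo.
tree_is_dirty() {
  local dir="${1:-.}"
  [ -n "$(git -C "$dir" status --porcelain 2>/dev/null)" ]
}
```

Untracked files under `CustomAgentLogsTmp/` are deliberately kept by the cleanup commands, so they will still be reported here unless that directory is ignored by git.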

### Build errors unrelated to the fix being attempted

**Symptom:** Build fails with errors in files the try-fix attempt didn't modify (e.g., XAML parse errors, unrelated compilation failures).

**Root cause:** Often caused by dirty working tree from a previous attempt. Can also be transient environment issues.

**Fix:**
1. Run cleanup: `pwsh .github/scripts/EstablishBrokenBaseline.ps1 -Restore && git checkout HEAD -- . && git clean -fd --exclude=CustomAgentLogsTmp/`
2. Retry the attempt
3. If it fails again with the same unrelated error, treat this as an environment/worktree blocker: STOP the try-fix workflow, do NOT continue with the next model, and ask the user to investigate (see "Stop on Environment Blockers").
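
The three steps above amount to "clean up, retry once, then stop". A sketch of that control flow, under the assumption that the cleanup and the attempt are each available as a single command; `retry_after_cleanup` and its exit code 2 are hypothetical and not part of the PR.

```bash
# Hypothetical control flow for the recovery steps above.
# Usage: retry_after_cleanup <cleanup-command> <attempt-command> [args...]
# Returns 0 if the attempt succeeds (first try or after cleanup),
# 2 if cleanup fails or the attempt still fails after cleanup.
retry_after_cleanup() {
  local cleanup="$1"
  shift
  "$@" && return 0            # first try
  "$cleanup" || return 2      # step 1: run cleanup
  "$@" && return 0            # step 2: retry once
  # step 3: treat as an environment/worktree blocker
  echo "STOP: still failing after cleanup; ask the user to investigate" >&2
  return 2
}
```

A caller would check for exit code 2 and halt the workflow rather than moving on to the next model.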
8 changes: 4 additions & 4 deletions .github/scripts/BuildAndRunHostApp.ps1
@@ -262,8 +262,8 @@ if ($Platform -eq "catalyst") {

Write-Success "MacCatalyst app prepared (Appium will launch with test name)"
} else {
Write-Warning "MacCatalyst app not found at: $appPath"
Write-Warning "Test may use wrong app bundle if another version is registered"
Write-Warn "MacCatalyst app not found at: $appPath"
Write-Warn "Test may use wrong app bundle if another version is registered"
}

# Set log file path directly - app will write ILogger output here
@@ -323,7 +323,7 @@ if ($Platform -eq "android") {
& adb -s $DeviceUdid logcat -d | Select-String "com.microsoft.maui.uitests|DOTNET" > $deviceLogFile

if ((Get-Item $deviceLogFile).Length -eq 0) {
Write-Warning "No logs found for com.microsoft.maui.uitests, dumping entire logcat..."
Write-Warn "No logs found for com.microsoft.maui.uitests, dumping entire logcat..."
& adb -s $DeviceUdid logcat -d > $deviceLogFile
}

@@ -397,7 +397,7 @@ if (Test-Path $deviceLogFile) {
Write-Host ""
Write-Info "Full device log: $deviceLogFile"
} else {
Write-Warning "Could not read device log file"
Write-Warn "Could not read device log file"
}

Write-Host "═══════════════════════════════════════════════════════" -ForegroundColor Cyan
6 changes: 3 additions & 3 deletions .github/scripts/BuildAndRunSandbox.ps1
@@ -258,7 +258,7 @@ if ($Platform -eq "catalyst") {
Write-Success "MacCatalyst Sandbox app launched with log capture"
}
} else {
Write-Warning "MacCatalyst Sandbox app not found at: $appPath"
Write-Warn "MacCatalyst Sandbox app not found at: $appPath"
}
}

@@ -379,7 +379,7 @@ try {
# Fallback: If we couldn't get PID, dump entire logcat buffer (unfiltered)
# This ensures we always have logs for the agent to analyze
Write-Host ""
Write-Warning "Could not capture app PID from Appium test output"
Write-Warn "Could not capture app PID from Appium test output"
Write-Info "Dumping entire logcat buffer (unfiltered)..."
& adb -s $DeviceUdid logcat -d > $deviceLogFile
Write-Info "Logcat dumped to: $deviceLogFile (UNFILTERED - contains all apps)"
@@ -469,7 +469,7 @@ try {
Write-Info "All logs are from Sandbox app only (Maui.Controls.Sample.Sandbox)"
}
} else {
Write-Warning "Could not read device log file"
Write-Warn "Could not read device log file"
}

Write-Host "═══════════════════════════════════════════════════════" -ForegroundColor Cyan