diff --git a/.github/agents/pr-reviewer.md b/.github/agents/pr-reviewer.md index 3071174c2ba7..c791a23253b1 100644 --- a/.github/agents/pr-reviewer.md +++ b/.github/agents/pr-reviewer.md @@ -9,13 +9,29 @@ You are a specialized PR review agent for the .NET MAUI repository. ## Core Instructions -**MANDATORY FIRST STEP**: Before beginning your review, read these instruction files in order: - -1. `.github/instructions/pr-reviewer-agent/core-guidelines.md` - Core philosophy, workflow, code analysis patterns -2. `.github/instructions/pr-reviewer-agent/testing-guidelines.md` - Which app to use (Sandbox vs HostApp), fetch PR, build/deploy, edge cases, SafeArea testing -3. `.github/instructions/pr-reviewer-agent/sandbox-setup.md` - Sandbox modification, instrumentation, validation checkpoint -4. `.github/instructions/pr-reviewer-agent/error-handling.md` - Handling build errors and unexpected results -5. `.github/instructions/pr-reviewer-agent/output-format.md` - Review structure, redundancy elimination +**🚨 CRITICAL WORKFLOW RULE** + +**YOU MUST DO THESE BEFORE ANYTHING ELSE (including creating plans or todos):** + +1. Check current state: `git branch --show-current` +2. Read instruction files IN THIS EXACT ORDER: + 1. `.github/instructions/pr-reviewer-agent/core-guidelines.md` - Core philosophy, workflow, code analysis patterns + 2. `.github/instructions/pr-reviewer-agent/testing-guidelines.md` - Which app to use (Sandbox vs HostApp), fetch PR, build/deploy, edge cases, SafeArea testing + 3. `.github/instructions/pr-reviewer-agent/sandbox-setup.md` - Sandbox modification, instrumentation, validation checkpoint + 4. `.github/instructions/pr-reviewer-agent/error-handling.md` - Handling build errors and unexpected results + 5. `.github/instructions/pr-reviewer-agent/checkpoint-resume.md` - Checkpoint/resume system for environment limitations + 6. `.github/instructions/pr-reviewer-agent/output-format.md` - Review structure, redundancy elimination +3. Fetch and analyze PR details + +**ONLY AFTER completing steps 1-3 above may you:** +- Create a todo list +- Start modifying code +- Begin testing + +**Why this order matters:** +- Instructions contain critical context you MUST understand first +- Creating plans before reading instructions = wrong assumptions +- You may already be on the PR branch - check first! **ALSO READ** (context-specific): - `.github/copilot-instructions.md` - General coding standards @@ -34,4 +50,6 @@ You are a specialized PR review agent for the .NET MAUI repository. **Workflow**: Fetch PR β†’ Modify Sandbox β†’ Build/Deploy β†’ Test β†’ Compare WITH/WITHOUT PR β†’ Test edge cases β†’ Review +**Checkpoint/Resume**: If you cannot complete testing due to environment limitations (missing device, platform unavailable), use the checkpoint system in `checkpoint-resume.md`. + **See instruction files above for complete details.** \ No newline at end of file diff --git a/.github/instructions/pr-reviewer-agent/checkpoint-resume.md b/.github/instructions/pr-reviewer-agent/checkpoint-resume.md new file mode 100644 index 000000000000..d830bb6354fc --- /dev/null +++ b/.github/instructions/pr-reviewer-agent/checkpoint-resume.md @@ -0,0 +1,249 @@ +# Checkpoint/Resume System + +When you encounter environment limitations that prevent completing the review, you can pause and create a checkpoint for someone else to complete the testing. + +## When to Create a Checkpoint + +Create a checkpoint when you cannot proceed due to: +- **Physical device required** but only simulator/emulator available +- **Platform unavailable** (e.g., need Windows/Mac but working from Linux) +- **Specific OS version needed** that isn't installed +- **Hardware requirements** (e.g., specific Android API level, iOS version) +- **Performance testing** that requires real device +- **Specialized hardware** (e.g., specific screen resolution, foldable device) + +## How to Create a Checkpoint + +1. **Stop at the blocking point** - Don't try to work around limitations +2. **Document current state** - What you've completed so far +3. **Specify required action** - Exactly what needs to be done +4. **Provide resume instructions** - How to continue after the action is complete + +## Checkpoint Format + +When creating a checkpoint, output this in your response: + +````markdown +## πŸ›‘ CHECKPOINT: [Brief Description] + +### Current State +- βœ… **Completed**: [List what you've done] + - Fetched PR #XXXXX + - Modified Sandbox app with test case + - Analyzed code changes + +- ⏸️ **Paused At**: [Where you stopped] + - Ready to test on [platform/device] + - Built app is ready for deployment + +### Required Action +**Platform**: [iOS/Android/Windows/Mac] +**Device**: [Physical device / Specific simulator / Specific API level] +**Why**: [Brief explanation of why this specific environment is needed] + +**Steps to complete**: +1. [Specific command or action needed] +2. [What to observe/measure] +3. [What data to capture] + +### How to Resume + +After completing the required action, provide these results in a comment: + +``` +**Test Results from [Device/Platform]**: + +**Console Output**: +``` +[Paste console output here] +``` + +**Observed Behavior**: +- [What happened] +- [Measurements/observations] + +**Screenshots** (if applicable): +[Attach screenshots] +``` + +Then tag me to resume: @copilot please resume review +```` + +## Resume Process + +When you receive checkpoint results: +1. **Acknowledge the results** - Confirm you received the data +2. **Validate completeness** - Ensure all required data was captured +3. **Continue workflow** - Pick up where you left off +4. **Complete the review** - Finish remaining steps and provide final review + +## Example Checkpoints + +### Example 1: Physical iOS Device Testing + +```markdown +## πŸ›‘ CHECKPOINT: Physical iOS Device Testing Required + +### Current State +- βœ… **Completed**: + - Fetched PR #25789 - Fix RTL layout on iOS + - Modified Sandbox app with RTL CollectionView test + - Built app successfully for iOS + - Analyzed code changes in `CollectionViewHandler.iOS.cs` + +- ⏸️ **Paused At**: Ready to deploy to physical iPhone + +### Required Action +**Platform**: iOS +**Device**: Physical iPhone (iOS 17+) +**Why**: RTL layout rendering differences between simulator and device have been reported in the issue + +**Steps to complete**: +1. Deploy built app to physical iPhone: + +2. Launch app and observe: + - CollectionView with RTL FlowDirection + - Check padding values (should be 16px left, 0px right) + - Scroll behavior in RTL mode + - Visual alignment of items + +### How to Resume + +Provide observations, then tag @copilot to resume. +``` + +### Example 2: Specific Android API Level + +```markdown +## πŸ›‘ CHECKPOINT: Android API 31 Testing Required + +### Current State +- βœ… **Completed**: + - Fetched PR #26543 - Fix Entry focus on Android 12 + - Modified Sandbox with Entry focus test + - Built app for Android + - Analyzed handler changes + +- ⏸️ **Paused At**: Need Android 12 (API 31) emulator + +### Required Action +**Platform**: Android +**Device**: Emulator with API 31 (Android 12) +**Why**: Bug only reproduces on Android 12, not on newer versions + +**Steps to complete**: +1. Create and start Android 12 emulator: + +2. Test Entry focus behavior: + - Tap Entry control + - Observe keyboard appearance + - Check console for STATE CAPTURE logs + +### How to Resume + +Provide observations, then tag @copilot to resume. +``` + +### Example 3: Windows-Specific Testing + +```markdown +## πŸ›‘ CHECKPOINT: Windows Platform Testing Required + +### Current State +- βœ… **Completed**: + - Fetched PR #27821 - Fix Window sizing on Windows + - Modified Sandbox with Window size test + - Built app for Windows (from macOS) + - Reviewed WinUI changes + +- ⏸️ **Paused At**: Need to test on Windows machine + +### Required Action +**Platform**: Windows 11 +**Device**: Windows development machine +**Why**: WinUI-specific behavior cannot be tested on macOS + +**Steps to complete**: +1. On Windows machine, clone and build: + +2. Observe Window behavior: + - Window opens at correct size + - Resizing works smoothly + - Check console for measurements + +3. Capture console output + +### How to Resume + +Provide console output and window behavior observations, then tag @copilot to resume. +``` + +## Checkpoint Best Practices + +### DO: +- βœ… Be specific about exact requirements (iOS version, device type, API level) +- βœ… Provide complete file paths and commands +- βœ… Explain WHY the specific environment is needed +- βœ… Make it easy for someone else to complete the action +- βœ… Specify exactly what data you need to resume +- βœ… Include console output patterns to look for +- βœ… Mention if screenshots are needed + +### DON'T: +- ❌ Create checkpoints for minor inconveniences +- ❌ Use vague requirements ("test on a device") +- ❌ Skip checkpoints when truly needed +- ❌ Forget to specify resume instructions +- ❌ Assume knowledge about the environment +- ❌ Make resume process complicated + +## Special Cases + +### Multiple Checkpoints + +If you need testing on multiple platforms, create separate checkpoints: + +```markdown +## πŸ›‘ CHECKPOINT 1/2: iOS Physical Device Testing +[Details...] + +## πŸ›‘ CHECKPOINT 2/2: Android API 31 Testing +[Details...] +``` + +Resume after each checkpoint is completed. + +### Partial Results + +If someone provides partial results, acknowledge and request remaining data: + +```markdown +βœ… **Received**: iOS test results +⏸️ **Still Need**: Android API 31 test results + +Please complete checkpoint 2/2 before I can finalize the review. +``` + +## Integration with Review Workflow + +Checkpoints should fit naturally into the review workflow: + +1. **Standard Review** (no checkpoint needed): + - Fetch PR β†’ Modify Sandbox β†’ Build β†’ Test β†’ Review + +2. **Review with Checkpoint**: + - Fetch PR β†’ Modify Sandbox β†’ Build β†’ **πŸ›‘ CHECKPOINT** β†’ [Wait] β†’ Resume with results β†’ Review + +3. **Review with Multiple Checkpoints**: + - Fetch PR β†’ Build β†’ **πŸ›‘ CHECKPOINT 1** β†’ [Wait] β†’ Resume β†’ **πŸ›‘ CHECKPOINT 2** β†’ [Wait] β†’ Resume β†’ Review + +The final review should acknowledge checkpoint assistance: + +```markdown +## Review Summary + +**Testing Assistance**: Special thanks to @username for testing on physical iPhone (iOS 17.2) +and @username2 for Android API 31 validation. + +[Rest of review...] +``` diff --git a/.github/instructions/pr-reviewer-agent/core-guidelines.md b/.github/instructions/pr-reviewer-agent/core-guidelines.md index 5edf9adcb103..ea7e9369e52f 100644 --- a/.github/instructions/pr-reviewer-agent/core-guidelines.md +++ b/.github/instructions/pr-reviewer-agent/core-guidelines.md @@ -54,14 +54,117 @@ When multiple instruction files exist, follow this priority order: 4. πŸ” Test edge cases not mentioned by PR author 5. πŸ“Š Compare behavior WITH and WITHOUT the PR changes 6. πŸ“ Document findings with actual measurements and evidence -7. βœ… Write comprehensive review based on real testing, not just code inspection - - **Output**: Create a markdown file named `Review_Feedback_Issue_XXXXX.md` (replace XXXXX with actual issue number) +7. βœ… **MANDATORY**: Write comprehensive review and create `Review_Feedback_Issue_XXXXX.md` file + - **Output**: **ALWAYS** create a markdown file named `Review_Feedback_Issue_XXXXX.md` (replace XXXXX with actual issue number) + - **When**: Create this file at the end of EVERY PR review, without exception - **Content**: Include test results, measurements, edge cases tested, and evidence-based recommendations - **Location**: Save in repository root or as specified by user + - **Critical**: This file is the deliverable for every review - do not skip this step 8. πŸ“€ **If submitting changes as a PR**: Use title format `[PR-Reviewer] ` - This clearly identifies agent-generated PRs containing review feedback and suggested improvements - Example: `[PR-Reviewer] Fix RTL padding for CollectionView on iOS` +## 🎯 Critical Success Factors + +### Factor 1: Read Instructions First + +**RULE**: You MUST read all instruction files BEFORE creating any plans. + +**Why this matters**: +``` +Without instructions: With instructions: +β”œβ”€ Make assumptions β”œβ”€ Use established patterns +β”œβ”€ Use wrong app β”œβ”€ Choose correct app +β”œβ”€ Plan unnecessary steps β”œβ”€ Adapt to actual situation +└─ Waste time correcting └─ Get it right first time +``` + +**What this looks like in practice**: +- ❌ BAD: User asks for review β†’ Agent creates 10-step todo β†’ Discovers mistakes later +- βœ… GOOD: User asks for review β†’ Agent reads instructions β†’ Agent creates correct plan + +### Factor 2: Checkpoint Before Expensive Operations + +**MANDATORY CHECKPOINTS:** + +#### Checkpoint 1: After Initial Analysis (Low cost to fix) +**When**: After reading instructions and analyzing PR +**Show user**: +- Your understanding of what the PR fixes +- Which app you'll use (Sandbox or HostApp) and why +- High-level test plan + +**User can correct**: Misunderstandings, wrong app choice, missing context + +--- + +#### Checkpoint 2: Before Building (Medium cost to fix) 🚨 CRITICAL +**When**: After creating test code but BEFORE building +**Show user**: +- The exact test code you created +- What will be measured +- Why this validates the fix +- Explicitly ask: "Should I proceed with building?" + +**Why critical**: Building takes 10-15 minutes. If your test design is wrong, this checkpoint saves that wasted time. + +**User can correct**: Test design flaws, missing test cases, wrong approach + +--- + +#### Checkpoint 3: Before Final Review (High cost to fix) +**When**: After testing complete, before final recommendation +**Show user**: +- Raw data (timings, logs, observations) +- Your interpretation +- Draft recommendation + +**User can correct**: Data interpretation, missed issues, recommendation logic + +### Factor 3: Test WITH and WITHOUT the Fix + +**RULE**: You MUST test both scenarios to prove the fix works. + +**Process**: +``` +1. Checkout main branch version of changed file +2. Build and test (capture baseline behavior) +3. Restore PR branch version +4. Build and test (capture improved behavior) +5. Compare: baseline vs improved +``` + +**Why this matters**: +- Proves the fix actually fixes the issue +- Proves the fix doesn't break existing functionality +- Provides objective data for review + +**Red flags if you skip this**: +- Can't prove fix works +- Might have false positive (app worked anyway) +- Subjective review instead of data-driven + +### Factor 4: Deep Analysis Over Surface Review + +**Surface review** (❌ not acceptable): +- "This PR adds a PresentationCompleted event" +- "The PR modifies ModalNavigationManager" +- "The changes look good" + +**Deep review** (βœ… acceptable): +- "WHY was PresentationCompleted needed? Because DialogFragment.OnStart() completes after Show() returns, creating a race condition where PopModal is called before the dialog is fully presented." +- "WHY separate animated vs non-animated paths? Because animated modals naturally wait for animation completion, but non-animated modals returned immediately, before OnStart()." +- "EDGE CASE: What if OnStart() never fires? This could cause an infinite hang if the activity is destroyed." + +**How to do deep analysis**: +1. Understand the root cause, not just the symptoms +2. Explain WHY each change was made +3. Identify potential edge cases or issues +4. Think about what could go wrong +5. Consider platform-specific behavior + +--- + ## πŸ€– UI Automation: ALWAYS Use Appium **CRITICAL RULE: For ANY device UI interaction, use Appium - NEVER use direct ADB/simctl commands** diff --git a/.github/instructions/pr-reviewer-agent/error-handling.md b/.github/instructions/pr-reviewer-agent/error-handling.md index 9b0eda6ac602..31b9523cedc0 100644 --- a/.github/instructions/pr-reviewer-agent/error-handling.md +++ b/.github/instructions/pr-reviewer-agent/error-handling.md @@ -1,5 +1,161 @@ # Error Handling +## 🚫 Common Mistakes & How to Avoid Them + +### Mistake #1: Building the Wrong App ⭐ MOST COMMON + +**Symptom**: Agent tries to build `TestCases.HostApp` for PR validation + +**Why it happens**: +- PR includes test files in HostApp directory +- Agent sees test files and assumes that's what to use +- Instructions not read before planning + +**How to avoid**: +1. Read testing-guidelines.md FIRST +2. Remember: Sandbox = PR validation (default) +3. HostApp = writing/debugging UI tests only + +**Cost if not avoided**: 15+ minutes wasted building + +--- + +### Mistake #2: Planning Before Reading Instructions + +**Symptom**: Agent creates detailed todo list immediately after user request + +**Why it happens**: +- Eager to show organization and planning +- Assumes workflow without context +- Wants to appear proactive + +**How to avoid**: +1. Resist urge to plan immediately +2. Read ALL instruction files first +3. Create plan based on instructions, not assumptions + +**Cost if not avoided**: Plan has to be recreated, shows lack of process discipline + +--- + +### Mistake #3: Not Checking Current Branch + +**Symptom**: Planning to "fetch PR" or "checkout branch" when already on correct branch + +**Why it happens**: +- Assumes starting from main branch +- Doesn't check current state before planning +- Copy-paste workflow without adaptation + +**How to avoid**: +1. ALWAYS run `git branch --show-current` first +2. Adapt workflow to actual situation +3. Don't assume starting state + +**Cost if not avoided**: Unnecessary git operations, potential branch confusion + +--- + +### Mistake #4: Building Without User Validation + +**Symptom**: Agent modifies Sandbox code and immediately starts building + +**Why it happens**: +- Wants to show progress quickly +- Confident in test design +- Doesn't realize build cost + +**How to avoid**: +1. Create test code first +2. Show it to user with explanation +3. Wait for approval before building +4. Remember: building takes 10-15 minutes + +**Cost if not avoided**: Wasted build time if design is wrong + +--- + +### Mistake #5: Only Testing WITH the Fix + +**Symptom**: Agent only tests the PR branch, doesn't test baseline + +**Why it happens**: +- Assumes fix must work if tests pass +- Skips comparative analysis +- Doesn't realize need to prove fix actually fixes + +**How to avoid**: +1. Test WITHOUT fix first (baseline) +2. Document baseline behavior +3. Test WITH fix second +4. Compare baseline vs improved +5. Prove the fix actually fixes the issue + +**Cost if not avoided**: Can't prove fix works, subjective review + +--- + +### Mistake #6: Surface-Level Code Review + +**Symptom**: Describing WHAT changed without explaining WHY + +**Example bad review**: +- "This PR adds an event" +- "The code was modified" +- "Looks good to me" + +**Why it happens**: +- Focused on diff, not root cause +- Not understanding platform behavior +- Quick review over thorough review + +**How to avoid**: +1. Understand the root cause of the issue +2. Explain WHY each change was needed +3. Identify edge cases +4. Consider platform-specific implications + +**Cost if not avoided**: Missed issues, shallow review, no real validation + +--- + +## βœ… Checklist: Am I Doing This Right? + +Before proceeding with each phase, check: + +### β˜‘οΈ Initial Phase: +- [ ] Read ALL instruction files before creating plan +- [ ] Checked current branch state +- [ ] Fetched and understood PR details +- [ ] Created plan based on instructions, not assumptions + +### β˜‘οΈ Planning Phase: +- [ ] Using Sandbox app (unless writing UI tests) +- [ ] Planned to test WITHOUT and WITH fix +- [ ] Included validation checkpoint before building +- [ ] Test design will provide measurable data + +### β˜‘οΈ Implementation Phase: +- [ ] Created comprehensive test scenarios +- [ ] Added instrumentation for measurements +- [ ] Showed test code to user BEFORE building +- [ ] Got user approval to proceed + +### β˜‘οΈ Testing Phase: +- [ ] Tested WITHOUT fix (baseline) +- [ ] Captured baseline data +- [ ] Tested WITH fix +- [ ] Captured improved data +- [ ] Compared baseline vs improved + +### β˜‘οΈ Review Phase: +- [ ] Explained WHY fix works, not just WHAT changed +- [ ] Identified potential edge cases +- [ ] Provided objective data, not subjective opinion +- [ ] Made clear recommendation (approve/request changes) + +--- + ## Handling Build Errors If the build fails, follow this 3-step debugging process: diff --git a/.github/instructions/pr-reviewer-agent/testing-guidelines.md b/.github/instructions/pr-reviewer-agent/testing-guidelines.md index 1869e2ef310d..afda880a3482 100644 --- a/.github/instructions/pr-reviewer-agent/testing-guidelines.md +++ b/.github/instructions/pr-reviewer-agent/testing-guidelines.md @@ -1,6 +1,64 @@ -# Testing Guidelines +⚠️ **CRITICAL**: Read this ENTIRE file before creating any plans or taking any actions -## Which App to Use for Testing +--- + +# Testing Guidelines for PR Review + +## 🎯 The #1 Rule: Which App to Use + +### Default Answer: **Sandbox App** + +Use `src/Controls/samples/Controls.Sample.Sandbox/` for PR validation **UNLESS** you are explicitly asked to write or validate UI tests. + +### Quick Decision Tree: + +``` +Are you writing/debugging UI tests? +β”œβ”€ YES β†’ Use TestCases.HostApp +└─ NO β†’ Use Sandbox app βœ… (99% of PR reviews) +``` + +### ⚠️ Common Confusion: "But the PR has test files!" + +**Scenario**: PR adds files to `src/Controls/tests/TestCases.HostApp/Issues/IssueXXXX.cs` + +❌ **WRONG THINKING**: "The PR adds test files to HostApp, so I should use HostApp" +βœ… **RIGHT THINKING**: "The PR adds automated test files. I use Sandbox to manually validate the fix." + +**Why**: +- Those test files are for the AUTOMATED UI testing framework +- You are doing MANUAL validation with real testing +- HostApp is only needed when writing/debugging those automated tests + +### πŸ’° Cost of Wrong App Choice + +**Using HostApp when you should use Sandbox:** +- ⏱️ Wasted time: 15+ minutes building +- πŸ“¦ Unnecessary complexity: 1000+ tests in project +- πŸ› Harder debugging: Can't isolate behavior +- 😞 User frustration: Obvious mistake + +**Using Sandbox (correct choice):** +- ⏱️ Fast builds: 2-3 minutes +- 🎯 Focused testing: Only your test code +- πŸ” Easy debugging: Clear isolation +- βœ… Professional approach + +### πŸ“‹ App Selection Reference + +| Scenario | Correct App | Why | +|----------|------------|-----| +| Validating PR fix | Sandbox βœ… | Quick, isolated, easy to instrument | +| Testing before/after comparison | Sandbox βœ… | Can modify without affecting tests | +| User says "review this PR" | Sandbox βœ… | Default for all PR validation | +| User says "write a UI test" | HostApp βœ… | That's what HostApp is for | +| User says "validate the UI test" | HostApp βœ… | Testing the test itself | +| PR adds test files | Sandbox βœ… | Test files β‰  what you test with | +| Unsure which to use | Sandbox βœ… | When in doubt, default here | + +--- + +## Which App to Use for Testing (Detailed) **CRITICAL DISTINCTION**: There are two testing apps in the repository, and choosing the wrong one wastes significant time (20+ minutes for unnecessary builds). diff --git a/32774 b/32774 deleted file mode 100644 index ef77e7f1b29c..000000000000 --- a/32774 +++ /dev/null @@ -1 +0,0 @@ -fetch \ No newline at end of file