dotnet · PureWeen · Apr 30, 2026 · Apr 23, 2026 · Apr 23, 2026 · Apr 23, 2026
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
@@ -101,6 +101,22 @@ When referencing or triggering CI pipelines, use these current pipeline names:
 
 **⚠️ Old pipeline names** (e.g., `MAUI-UITests-public`, `MAUI-public`) are **outdated** and should NOT be used. Always use the names above.
 
+### Investigating CI Failures
+
+**🚨 ALWAYS use the `azdo-build-investigator` skill when investigating CI failures or assessing merge readiness.** Its instructions direct you to invoke the `ci-analysis` skill first for the core investigation workflow, then apply MAUI-specific corrections (correct pipeline names, XHarness quirks, binlog guidance).
+
+Do NOT default to manually querying AzDO APIs, and do NOT rely solely on `gh pr checks` pass/fail counts.
+
+**When to use it:**
+- "How does CI look?" / "Is CI green?" / "Can we merge?"
+- "What's failing?" / "Are these known failures?"
+- "Is this PR safe to merge?" / "Any CI concerns?"
+- After any push to a PR, to verify the build
+
+**Verifying specific tests:** When asked "did test X pass?" or "did the new test run?", query the **actual AzDO test results** — do NOT infer whether a test ran by inspecting code attributes. Class-level traits, base class categories, and assembly-level attributes can all cause a test to run even when the method itself has no visible category. Check the evidence, not the code.
+
+**Anti-pattern:** Writing ad-hoc scripts to parse AzDO build timelines. The skills handle Helix work item details, known issue cross-referencing, and test result aggregation that manual approaches miss.
+
 ### Gradle / Maven Dependency Failures (CFSClean)
 
 The official CI build uses CFSClean network isolation which blocks `repo.maven.apache.org`. All Gradle/Maven dependencies resolve through the `dotnet-public-maven` Azure Artifacts feed.
@@ -309,11 +325,6 @@ Skills are modular capabilities that can be invoked directly or used by agents.
    - **Two modes**: Verify failure only (test creation) or full verification (test + fix)
    - **Used by**: After creating tests, before considering PR complete
 
-9. **pr-build-status** (`.github/skills/pr-build-status/SKILL.md`)
-   - **Purpose**: Retrieves Azure DevOps build information for PRs (build IDs, stage status, failed jobs)
-   - **Trigger phrases**: "check build for PR #XXXXX", "why did PR build fail", "get build status"
-   - **Used by**: When investigating CI failures
-
 10. **run-integration-tests** (`.github/skills/run-integration-tests/SKILL.md`)
    - **Purpose**: Build, pack, and run .NET MAUI integration tests locally
    - **Trigger phrases**: "run integration tests", "test templates locally", "run macOSTemplates tests", "run RunOniOS tests"
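
The "Verifying specific tests" guidance added to `.github/copilot-instructions.md` in the first hunk can be illustrated with a minimal sketch. It assumes the AzDO test results have already been fetched and flattened into dictionaries with the hypothetical keys `test_name` and `outcome`; the real payload may use different field names, and the sample test names are illustrative only. The point is that the answer comes from recorded results, never from code attributes.

```python
def outcomes_for_test(results, test_name):
    """Collect every recorded outcome for one test across all result records."""
    return [r["outcome"] for r in results if r["test_name"] == test_name]


def summarize(results, test_name):
    """Answer "did test X run / pass?" from recorded results only."""
    outcomes = outcomes_for_test(results, test_name)
    if not outcomes:
        # No record means no evidence, regardless of what the code attributes suggest.
        return "no recorded result: no evidence the test ran"
    if all(outcome == "Passed" for outcome in outcomes):
        return "ran and passed"
    return "ran, but at least one result did not pass"


# Hypothetical, already-flattened result records (assumed shape).
recorded_results = [
    {"test_name": "DatePicker_Format_D", "outcome": "Passed"},
    {"test_name": "DatePicker_Format_D", "outcome": "Failed"},
    {"test_name": "Button_Click", "outcome": "Passed"},
]

for name in ("Button_Click", "DatePicker_Format_D", "Missing_Test"):
    print(name, "->", summarize(recorded_results, name))
```

An empty outcome list is reported as its own case rather than treated as a pass or a failure, which matches the "check the evidence, not the code" rule.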

diff --git a/.github/skills/azdo-build-investigator/SKILL.md b/.github/skills/azdo-build-investigator/SKILL.md
@@ -32,6 +32,10 @@ The `ci-analysis` skill and its `Get-CIStatus.ps1` script are loaded automatical
 
 Most failures are in `maui-pr`. Device test failures appear in `maui-pr-devicetests`. Focus on the first failing pipeline before checking others.
 
+**When CI hasn't run:** Community PRs require a maintainer to trigger builds. Use `/azp run maui-pr` (or `maui-pr-devicetests`, `maui-pr-uitests`) in a PR comment, or trigger via Azure CLI. Not all pipelines run automatically — `maui-pr-devicetests` and `maui-pr-uitests` may need explicit triggers depending on the changed files.
+
+**Escalation:** For deep Helix log analysis (recurring failures, machine-specific issues, comparing passing vs. failing runs), escalate to the `helix-investigation` skill.
+
 ## MAUI-Specific Quirks
 
 ### XHarness Exit-0 Blind Spot
@@ -71,6 +75,17 @@ If available, use the `mcp-binlog-tool` MCP server to analyze downloaded `.binlo
 | `No test result files found` | `maui-pr-devicetests` Helix logs | Tests never ran or app crashed on launch |
 | UI test screenshot diff | `maui-pr-uitests` | Visual regression; check baseline images |
 
+## Test Count Deduplication
+
+When querying AzDO test results directly (e.g., via the `/test/runs/{id}/results` API), **always deduplicate before reporting counts**. MAUI UI tests produce multiple test runs per test because each test executes across:
+- **Runtime variants**: CoreCLR and Mono
+- **Platform versions**: e.g., iOS 18.5 and iOS latest, Android API 30 and API 36
+- **Retry attempts**: failed jobs are retried, each attempt publishes a new test run
+
+A single failing test can appear in 4–8+ test runs. Summing raw `totalTests - passedTests` across all runs inflates failure counts dramatically.
+
+**How to deduplicate**: Group by **test name + OS platform** (extract the OS token — `ios`, `android`, `mac`, `win` — from the run name as the grouping key). For example, "DatePicker_Format_D on iOS" vs "DatePicker_Format_D on Android" are distinct failures worth reporting separately. Collapse retries and runtime variants (coreclr/mono) of the same test on the same OS — if a test fails on both coreclr and mono for iOS, that's one issue, not two.
+
 ### Gradle / Maven / CFSClean Failures
 
 **Error signatures:**
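
The grouping rule from the "Test Count Deduplication" section added in the hunk above can be sketched as follows. This is a minimal illustration, not the skill's implementation: it assumes results were already fetched and flattened into dictionaries with the hypothetical keys `run_name`, `test_name`, and `outcome`, and the run names in the sample are invented. How to treat a test that fails once and then passes on retry is not specified in the section; this sketch counts any failed result as a failure for that test and OS.

```python
from collections import defaultdict

# OS tokens named in the deduplication guidance.
OS_TOKENS = ("ios", "android", "mac", "win")


def os_token(run_name):
    """Return the first known OS token contained in the run name, or "unknown"."""
    lowered = run_name.lower()
    for token in OS_TOKENS:
        if token in lowered:
            return token
    return "unknown"


def unique_failures(results):
    """Group failed results by (test name, OS), collapsing retries and runtimes."""
    grouped = defaultdict(set)
    for result in results:
        if result["outcome"] != "Failed":
            continue
        key = (result["test_name"], os_token(result["run_name"]))
        grouped[key].add(result["run_name"])
    return dict(grouped)


# Hypothetical flattened results; run names are illustrative only.
sample_results = [
    {"run_name": "UITests iOS 18.5 CoreCLR", "test_name": "DatePicker_Format_D", "outcome": "Failed"},
    {"run_name": "UITests iOS 18.5 Mono", "test_name": "DatePicker_Format_D", "outcome": "Failed"},
    {"run_name": "UITests iOS latest CoreCLR (attempt 2)", "test_name": "DatePicker_Format_D", "outcome": "Failed"},
    {"run_name": "UITests Android API 30 CoreCLR", "test_name": "DatePicker_Format_D", "outcome": "Failed"},
    {"run_name": "UITests Android API 36 Mono", "test_name": "DatePicker_Format_D", "outcome": "Passed"},
]

raw_failed = sum(1 for r in sample_results if r["outcome"] == "Failed")
deduplicated = unique_failures(sample_results)

print("raw failed results:", raw_failed)
print("unique failures:", len(deduplicated))
for (test, platform), runs in sorted(deduplicated.items()):
    print(f"  {test} on {platform}: seen in {len(runs)} run(s)")
```

Keeping the set of run names per group preserves the evidence trail, so a report can still say which runs showed the failure while counting it once per OS.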