Integrate UI test category detection into PR review and fix gate reliability by kubaflo · Pull Request #35133 · dotnet/maui

kubaflo · 2026-04-24T18:36:16Z

Note

Are you waiting for the changes in this PR to be merged?
It would be very helpful if you could test the resulting artifacts from this PR and let us know in a comment if this change resolves your issue. Thank you!

Depends on #35136 (pipeline category detection — should merge first)

What this does

Two things:

1. UI test category detection in PR review

During the PR review workflow, Step 0.5 detects which UI test categories the PR impacts and writes the result to the AI summary comment. This gives reviewers visibility into which UI tests are relevant.

Detection reuses the 3-tier script from #35136 (test attributes → source paths → AI reasoning).

AI summary shows a new 🧪 UI Tests section with detected categories before the gate section.

2. Gate reliability fixes

Multiple fixes to make the gate (verify-tests-fail.ps1) more deterministic:

Fix	Problem it solves
Absolute path resolution	Gate scripts not found on Linux CI agents (`Resolve-Path`, `GetFullPath`)
File existence check	Instant cryptic failure when verify script is missing — now logs clear error
3x retry on ENV ERROR	Emulator timeouts, ADB failures, app crashes — transient issues that pass on retry
Strip bad report blocks	Old verify script produces `Passed: False` with empty counts — stripped instead of shown
Gate log in fallback	When report is missing, shows last 20 lines of gate output instead of just `❌ FAILED / Platform: IOS`

Files

File	Changes
`.github/scripts/Review-PR.ps1`	Step 0.5 category detection + all 5 gate fixes
`.github/scripts/post-ai-summary-comment.ps1`	Add `uitests` phase to render detected categories
`.github/pr-review/pr-preflight.md`	Step 7: AI identifies impacted UI test categories

Validation — PR reviewer builds (Apr 26)

10 builds against real PRs — all succeeded ✅. Category detection shown in AI summary comment.

PR	Categories Detected	Build	AI Summary
#35037 (WebView theme)	`ViewBaseTests,WebView`	13940071	comment
#35031 (Shell memory leak)	`Shell`	13940072	comment
#35020 (XAML Hot Reload)	(none — XAML only)	13940073	✅ Shows "No UI test categories"
#35008 (Shell SearchHandler)	`Shell`	13940074	✅
#34997 (RadioButton gradient)	`RadioButton,ViewBaseTests`	13940075	✅
#34980 (DatePicker rotation)	`ViewBaseTests`	13940076	✅
#34974 (Picker CharacterSpacing)	`ViewBaseTests`	13940077	✅
#34923 (SwipeView threshold)	`SwipeView,ViewBaseTests`	13940078	✅
#34907 (CollectionView ScrollTo)	`CollectionView`	13940079	✅
#34845 (RefreshView binding)	`RefreshView,ViewBaseTests`	13940080	✅

github-actions · 2026-04-24T18:36:25Z

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.sh | bash -s -- 35133

Or

Run remotely in PowerShell:

iex "& { $(irm https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.ps1) } 35133"

Copilot

Pull request overview

This PR adds automatic UI test category detection so PR UI test jobs can skip running when the PR doesn’t touch their category group, reducing overall UI test time and CI load.

Changes:

Introduces a PowerShell-based category detection script (test diff scanning + source-path heuristics + optional AI hints).
Adds a “Discover” stage and per-job early gating to skip irrelevant UI test jobs in maui-pr-uitests.
Adds orchestration and reporting scripts to trigger the UI test pipeline and post a consolidated PR comment, plus integrates category detection into the PR review flow.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
`eng/scripts/detect-ui-test-categories.ps1`	Detects relevant UI test categories from PR changes and emits variables for downstream filtering.
`eng/pipelines/common/ui-tests.yml`	Adds the discovery stage and wires detected category outputs into UI test stages/jobs.
`eng/pipelines/common/ui-tests-steps.yml`	Adds an early per-job filter step and conditions many steps on whether the job should run.
`eng/pipelines/ci-uitests.yml`	Adds `prNumber` / `categories` parameters to support manual targeted runs.
`eng/pipelines/ci-copilot.yml`	Exposes `AZURE_DEVOPS_EXT_PAT` for cross-org AzDO queuing from the Copilot pipeline.
`.github/scripts/trigger-uitest-pipeline.ps1`	Orchestrates detect → queue → monitor → summarize flow for UI tests.
`.github/scripts/post-uitest-categories-comment.ps1`	Posts/updates a single PR comment summarizing UI test results and failures.
`.github/scripts/post-ai-summary-comment.ps1`	Adds a UI Tests section to the unified AI summary comment.
`.github/scripts/Review-PR.ps1`	Adds a preliminary category detection step and improves gate retry/report handling.
`.github/pr-review/pr-preflight.md`	Updates the PR pre-flight checklist to include identifying impacted UI test categories.

Comments suppressed due to low confidence (1)

eng/pipelines/common/ui-tests-steps.yml:275

The iOS cleanup step runs whenever platform == ios, even when the new early category check set SHOULD_RUN_TESTS=False and all prior steps were skipped. This can waste time and may fail because test artifacts/simulators were never provisioned. Gate this step on SHOULD_RUN_TESTS as well (similar to the other tasks in this template).

- bash: |
    cat ${BASH_SOURCE[0]}
    pwsh ./build.ps1 --target=Cleanup -Script eng/devices/${{ parameters.platform }}.cake ---results="$(TestResultsDirectory)" ${{ parameters.cakeArgs }}
  displayName: Cleanup and Create Simulator Logs if Test Run Failed To
  condition: ${{ eq(parameters.platform, 'ios') }}
  continueOnError: true

Copilot · 2026-04-24T18:42:48Z

-> **Validation constraint:** The Step 7 prompt MUST NOT contain issue titles, root-cause descriptions, bug summaries, or any Part A content — only `PR #XXXXX`. If you find yourself adding context "to help" the sub-agent, you are violating independence-first.
+> **Validation constraint:** The Step 8 prompt MUST NOT contain issue titles, root-cause descriptions, bug summaries, or any Part A content — only `PR #XXXXX`. If you find yourself adding context "to help" the sub-agent, you are violating independence-first.

 7. **Invoke the code-review skill as a sub-agent:**


Step numbering is inconsistent after adding the new “Identify Impacted UI Test Categories” step: this section is now labeled “Part B: Code Review (Step 8)”, but the instruction below still says “7. Invoke the code-review skill as a sub-agent”. Update the numbering to avoid confusion when following the checklist.

Suggested change

7. **Invoke the code-review skill as a sub-agent:**

8. **Invoke the code-review skill as a sub-agent:**

Copilot · 2026-04-24T18:42:48Z

+        if ([string]::IsNullOrWhiteSpace($uitestCategories) -or $uitestCategories -eq 'NONE') {
+            Write-Host "  ℹ️ No UI test categories detected" -ForegroundColor DarkGray
+        } else {
+            Write-Host "  🎯 Detected categories: $uitestCategories" -ForegroundColor Green


detect-ui-test-categories.ps1 uses an empty UITestCategoryList output to mean “run the full matrix” (it returns without setting the variable). Here, an empty result is treated as “no UI test categories detected”, which is misleading in the common fallback-to-all case. Treat empty as “ALL/full matrix” and reserve NONE for “skip all UI tests”.

Copilot · 2026-04-24T18:42:48Z

+    if (-not [string]::IsNullOrWhiteSpace($aiContent) -and $aiContent.Trim() -ne 'NONE') {
+        # Extract category names (lines like "Button — justification")
+        $aiCatLines = @($aiContent -split "`n" | ForEach-Object {
+            if ($_ -match '^([A-Za-z]+)') { $Matches[1] }


AI category parsing only captures leading letters (^([A-Za-z]+)), which will truncate categories that contain digits (e.g., Material3 becomes Material). Expand the pattern to include digits/underscores so AI-provided categories can round-trip correctly.

Suggested change

if ($_ -match '^([A-Za-z]+)') { $Matches[1] }

if ($_ -match '^([A-Za-z0-9_]+)') { $Matches[1] }

Copilot · 2026-04-24T18:42:49Z

+# EARLY CHECK: Determine if this category group should run tests
+# This runs FIRST to avoid wasting time on provisioning if no tests will run
+# Also calculates matching categories to avoid duplicating this logic later
+- pwsh: |


This early check claims it runs first to avoid wasting time on provisioning when a job is skipped, but the template still invokes provision.yml unconditionally later in the job. That means even skipped category jobs will still pay most of the provisioning cost, which undermines the intended runtime savings. Consider gating the provisioning step(s) on SHOULD_RUN_TESTS=True as well (or adding a condition hook parameter to the provisioning template).

Copilot · 2026-04-24T18:42:49Z

+    Write-Host "Test file changes detected under '$TestRoot'." -ForegroundColor Green
+}
+
+$categoryPattern = '^\+\s*\[Category\((?<value>[^\)]*)\)\]'


The regex used to detect added [Category(...)] lines only matches attributes that end immediately with )]. Many tests use combined attributes like [Category(UITestCategories.X), Order(1)], which won't match and will cause category detection to miss relevant categories (leading to overly broad runs). Update the pattern to also match when Category(...) is followed by , or ] (and allow whitespace).

Suggested change

$categoryPattern = '^\+\s*\[Category$(?<value>[^$]*)\)\]'

$categoryPattern = '^\+\s*\[Category$(?<value>[^$]*)\)\s*(?:,|\])'

Copilot · 2026-04-24T18:42:49Z

+            $content = Get-Content $file -Raw
+            $fileMatches = [regex]::Matches($content, '\[Category\(([^\)]*)\)\]')
+            foreach ($m in $fileMatches) {
+                $rawValue = $m.Groups[1].Value.Trim()
+                if ([string]::IsNullOrWhiteSpace($rawValue)) { continue }
+                if ($rawValue -match '^UITestCategories\.(?<name>[A-Za-z0-9_]+)$') {
+                    $cat = $Matches['name']
+                } elseif ($rawValue -match '^["''](?<name>[A-Za-z0-9_ -]+)["'']$') {
+                    $cat = $Matches['name']
+                } elseif ($rawValue -match 'nameof\(UITestCategories\.(?<name>[A-Za-z0-9_]+)\)') {
+                    $cat = $Matches['name']
+                } else { continue }


When scanning the full contents of modified test files, the \[Category$([^$]*)\)\] regex has the same limitation as the diff regex and will not match combined attributes like [Category(UITestCategories.X), Order(1)]. This will prevent Tier 1 fallback detection from finding existing categories in many files. Adjust the regex to capture the Category(...) argument even when additional attributes follow in the same bracket.

Copilot · 2026-04-24T18:42:49Z

+    } elseif (-not [string]::IsNullOrWhiteSpace($env:SYSTEM_ACCESSTOKEN)) {
+        $h['Authorization'] = "Bearer $env:SYSTEM_ACCESSTOKEN"


Get-GitHubHeaders falls back to SYSTEM_ACCESSTOKEN as a GitHub Bearer token. SYSTEM_ACCESSTOKEN is an Azure DevOps token and won't authenticate to the GitHub API, so PR metadata/label calls will fail unless GH_TOKEN is present. Consider removing this fallback and emitting a clear warning when GH_TOKEN is missing (or support a dedicated GitHub token env var).

Suggested change

} elseif (-not [string]::IsNullOrWhiteSpace($env:SYSTEM_ACCESSTOKEN)) {

$h['Authorization'] = "Bearer $env:SYSTEM_ACCESSTOKEN"

} else {

Write-Host "##[warning]GH_TOKEN is not set. GitHub API requests will be sent without authentication and may fail or be rate-limited."

PR review integration (depends on #35136 for pipeline changes): - Review-PR.ps1: Step 0.5 detects categories, gate retry on ENV ERROR, absolute path resolution, bad report format stripping - post-uitest-categories-comment.ps1: Rich results with platform table, failure classification, supports -OutputFile for AI summary embedding - trigger-uitest-pipeline.ps1: Orchestrator for detect → queue → monitor - post-ai-summary-comment.ps1: UI Tests section (before gate) - pr-preflight.md: Step 7 AI category identification, Step 8 code review - ci-copilot.yml: DNCENG_PUBLIC_PAT for cross-org build queuing Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Two issues: 1. The copilot agent was overwriting gate/content.md with its own generic version that lacked actual error messages. Added explicit instruction not to touch gate/content.md. 2. The verify script's failure details were in a collapsed <details> block. Changed to an open #### heading so the error reason is immediately visible (e.g. screenshot baseline mismatch). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-04-27T10:39:03Z

🔍 Skill Validation Results

✅ Static Checks Passed

Skills checked: 15 | Agents checked: 3

Full validator output

Found 1 skill(s)
[verify-tests-fail-without-fix] 📊 verify-tests-fail-without-fix: 2,271 BPE tokens [chars/4: 2,189] (detailed ✓), 26 sections, 7 code blocks
✅ All checks passed (1 skill(s))
Found 3 agent(s)
Validated 3 agent(s)

✅ All checks passed (3 agent(s))

❌ LLM Evaluation Failed

0/1 skill(s) passed validation

Skill	Scenario	Baseline	Skilled	Verdict

❌ verify-tests-fail-without-fix: Eval scenario 'Regression: tests passing without fix means verification FAILED' prompt mentions target name 'verify-tests-fail-without-fix' (skill or agent) — remove the target name from the prompt to avoid biasing baseline runs. Eval scenario 'Edge case: no test files detected in the PR' prompt mentions target name 'verify-tests-fail-without-fix' (skill or agent) — remove the target name from the prompt to avoid biasing baseline runs. Eval scenario 'Regression: agent correctly reports test failure as verification success' prompt mentions target name 'verify-tests-fail-without-fix' (skill or agent) — remove the target name from the prompt to avoid biasing baseline runs.

🔍 Full results and investigation steps

github-actions

Expert Code Review: 8 findings posted inline (8 moderate). See summary comment for methodology and discarded findings.

Generated by Expert Code Review · ● 29.8M

github-actions · 2026-04-29T19:50:38Z

+        git remote add _detect_base $baseRepoCloneUrl
+        git fetch _detect_base "$TargetBranch" --no-tags --prune --depth=200 | Out-Null
+        git update-ref refs/remotes/origin/$TargetBranch _detect_base/$TargetBranch | Out-Null
+
+        # Fetch head commit (works for forks too) and check it out so the diff reflects the PR changes.
+        git remote remove _detect_head 2>$null | Out-Null
+        git remote add _detect_head $headRepoCloneUrl
+        git fetch _detect_head "$headSha" --no-tags --depth=200 | Out-Null
+        git checkout --quiet $headSha | Out-Null


🟡 Moderate — 3/3 consensus | Git state not restored on failure

Manual PR mode performs git checkout --quiet $headSha (detached HEAD) and adds temporary remotes (_detect_base, _detect_head) without a try/finally to restore original state. If the script fails mid-execution, the working tree is left detached and stray remotes persist in .git/config.

Recommendation: Wrap the manual-mode git block in try/finally. In finally, restore the original HEAD (git checkout $originalRef) and remove both temporary remotes unconditionally.

github-actions · 2026-04-29T19:50:38Z

+    $gateOutput | ForEach-Object { Write-Host "    $_" }
+
+    # Check if this was an ENV ERROR (emulator timeout, ADB failure, etc.)
+    $gateContentFile = Join-Path $gateOutputDir "verify-tests-fail/verification-report.md"


🟡 Moderate — 2/3 consensus | Stale report file across retries

$gateContentFile (verification-report.md) is read each retry iteration to detect ENV ERROR, but is never cleared between attempts. If attempt 1 writes an ENV ERROR report and attempt 2 crashes before overwriting it, the stale file causes misclassification of the second attempt.

Recommendation: Delete or rename $gateContentFile at the start of each loop iteration before invoking the verify script:

if (Test-Path $gateContentFile) { Remove-Item $gateContentFile -Force }

github-actions · 2026-04-29T19:50:38Z

+        $pr = Invoke-WithRetry -Uri $prUrl -Headers (Get-GitHubHeaders)
+        $TargetBranch = $pr.base.ref
+        $headRef = $pr.head.ref
+        $headSha = $pr.head.sha


🟡 Moderate — 2/3 consensus | No null guards on API response fields

If $pr.head.repo is null (common when a fork is deleted), accessing .clone_url silently returns $null in PowerShell. Subsequent git remote add _detect_head "" and git fetch _detect_head "" produce confusing errors. The outer try/catch may or may not catch native command failures depending on $ErrorActionPreference.

Recommendation: Add explicit null checks before git operations:

if ([string]::IsNullOrWhiteSpace($headSha) -or [string]::IsNullOrWhiteSpace($headRepoCloneUrl)) { Write-Host "##[warning]Incomplete PR API response (fork may be deleted). Falling back." return }

github-actions · 2026-04-29T19:50:38Z

+        Start-Sleep -Seconds 30
+    }
+}
+if ($isEnvError) {


🟡 Moderate — 2/3 consensus | $isEnvError post-loop invariant is non-obvious

The check if ($isEnvError) after the loop is technically correct (it can only be $true here if ALL iterations were env errors, since non-env-error iterations break). However, the variable name suggests "any" rather than "all", and if $maxGateAttempts is ever 0, $isEnvError would be undefined ($null → falsy, so safe but fragile).

Recommendation: Either add a clarifying comment explaining the invariant, or use a counter: if ($envErrorCount -eq $maxGateAttempts) { ... }.

github-actions · 2026-04-29T19:50:39Z

+$uitestCategories = ""
+
+$detectScript = Join-Path $RepoRoot "eng/scripts/detect-ui-test-categories.ps1"
+if (Test-Path $detectScript) {


🟡 Moderate — 2/3 consensus | Variable name shadowing

$detectScript is defined here for detect-ui-test-categories.ps1, then reassigned at line ~499 to Detect-TestsInDiff.ps1. Both live in the same script scope. Future edits could accidentally reference the wrong script path.

Recommendation: Use distinct names — e.g., $uitestDetectScript (Step 0.5) and $diffDetectScript (Step 1).

github-actions · 2026-04-29T19:50:39Z

+                } else {
+                    $message = "Unrecognized category expression '$rawValue'. Expected formats: UITestCategories.<Name>, nameof(UITestCategories.<Name>), or a quoted string."
+                    Write-Host "##[error]$message"
+                    throw $message


🟡 Moderate — 2/3 consensus | throw on unrecognized category halts all detection

When the diff scan encounters a [Category()] expression that doesn't match the three supported patterns (e.g., a constant like [Category(MyConstants.Button)]), the script throws an exception. This kills the child pwsh process, causing the parent to receive zero categories — silently skipping Tier 2/3 fallback detection.

Note: The existing-file scan at line ~314 uses continue for unrecognized formats, creating an inconsistency.

Recommendation: Replace throw $message with Write-Host "##[warning]$message" + continue. Let Tier 2/3 fill gaps gracefully.

github-actions · 2026-04-29T19:50:39Z

+    if ($touchesControls) {
+        # Changed files under src/Controls/ but couldn't map to specific categories — run all
+        Write-Host "Changed files touch Controls/Core/Essentials but no specific categories identified. Running all." -ForegroundColor Yellow
+        return


🟡 Moderate — 2/3 consensus | "Run all" fallback paths never set output variables

This return (and several others at lines 52, 57, 130, 154, 165, 174, 180) exits without emitting ##vso[task.setvariable variable=UITestCategoryList;isOutput=true] or UITestCategoryMatrix. The log says "Running all" but the pipeline receives no output — which may be interpreted as "run nothing" rather than "run everything" depending on how the consuming YAML handles an unset matrix variable.

Recommendation: For each "run all" fallback, explicitly emit a sentinel (e.g., ALL) or the full category list, ensuring the pipeline YAML has a well-defined behavior for this case.

github-actions · 2026-04-29T19:50:39Z

+# ============================================================================
+
+if (-not [string]::IsNullOrWhiteSpace($diff)) {
+    foreach ($line in $diff -split "`n") {


🟡 Moderate — 2/3 consensus | Line-by-line scan cannot detect multi-line attributes

The Tier 1 diff scan splits on \n and applies the category regex per-line. While current codebase conventions use single-line [Category(...)] attributes (making this safe today), the approach silently misses any future multi-line formatting. Under-detection is safe (falls back to broader detection) but worth noting.

Recommendation: Consider adding a comment documenting this assumption, or collapsing consecutive + lines before regex matching to handle wrapped attributes.

github-actions

Expert Code Review: 2 findings posted inline (1 warning, 1 suggestion). Both are minor — overall the implementation is solid with proper error handling, retry logic, and fail-safe fallbacks.

Note

🔒 Integrity filter blocked 5 items

The following items were blocked because they don't meet the GitHub integrity level.

683daa4 list_commits: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
8413e08 list_commits: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
388b82f list_commits: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
bd3a0e5 list_commits: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
f49d3bd list_commits: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by Expert Code Review · ● 40.6M

github-actions · 2026-04-29T20:28:29Z

+                } else {
+                    $message = "Unrecognized category expression '$rawValue'. Expected formats: UITestCategories.<Name>, nameof(UITestCategories.<Name>), or a quoted string."
+                    Write-Host "##[error]$message"
+                    throw $message


⚠️ Warning — Aggressive abort on unrecognized [Category] format

If a test in the diff uses an uncommon [Category] expression format (e.g., a constant from another class, string interpolation, or concatenation), this throw aborts the entire detection script.

In practice the caller in Review-PR.ps1 treats this as non-fatal (category detection silently returns empty), so it's fail-safe. However, in the Azure Pipelines context (direct invocation), this terminates with an error and the ##vso output variables are never set — the pipeline would fall back to "run all categories" which is fine, but the ##[error] log line may cause confusion during triage.

Consider: Write-Host "##[warning]$message" + continue instead of throw, so detection continues for remaining diff lines and only logs a warning for the unrecognized expression.

github-actions · 2026-04-29T20:28:29Z

+    $aiCatList = @($AiCategories -split '[,\n]' | ForEach-Object { ($_ -replace '\s*[-—].*$', '').Trim() } | Where-Object { $_ -and $_ -ne 'NONE' })
+    if ($aiCatList.Count -gt 0) {
+        Write-Host "Tier 3 (AI reasoning): $([string]::Join(', ', $aiCatList))" -ForegroundColor Green
+        foreach ($c in $aiCatList) { $addedCategories.Add($c) | Out-Null }


💡 Suggestion — AI-provided categories added without validation

Tier 3 adds AI-suggested categories directly to $addedCategories without checking they exist in UITestCategories.cs. If the AI halluccinates a category name (e.g., "Modal", "Toolbar"), it creates a matrix job that runs zero tests — wasting a CI slot and potentially masking detection failures.

Consider: Validate against the known category list (Tier 2's $pathToCategoryMap already enumerates valid categories), or emit a ##[warning] for unrecognized AI categories so they're visible in logs without silently creating empty jobs.

…ats and AI hallucinations - Replace 'throw' with '##[warning]' + continue when an unrecognized [Category(...)] expression is found in a diff. Previously a single unsupported expression (e.g., a constant from another class) aborted the whole detection script and silently fell back to running ALL categories. Now the script logs a warning and lets Tier 2 / Tier 3 fill in the gaps. - Validate Tier 3 (AI-suggested) categories against UITestCategories.cs before adding them to the matrix. Hallucinated category names would otherwise create matrix jobs that run zero tests, wasting a CI slot and masking detection failures. Invalid names are now skipped with a warning. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

kubaflo · 2026-04-29T21:04:53Z

Addressed both findings from the latest expert review in 3db7ee4:

⚠️ Warning — Aggressive abort on unrecognized [Category] format (line 296)
Replaced throw with Write-Host "##[warning]" + continue. A single unrecognized category expression (e.g., a constant from another class) no longer aborts the entire detection script. Tier 2 (source paths) and Tier 3 (AI) still get a chance to fill in categories.

💡 Suggestion — AI-provided categories added without validation (line 369)
Tier 3 now reads UITestCategories.cs once, builds a HashSet of valid category names (case-insensitive), and skips any AI-suggested category that isn't defined there. Hallucinated names emit a ##[warning] instead of silently creating an empty matrix job. Validated locally against the file: 74 categories parsed, Material3 ✅, NotARealCategory ❌.

Did not adopt the prior review's findings (copilot-pull-request-reviewer and github-actions reviews from earlier today) — those cover separate issues (git state restoration on failure, stale gate report file across retries, etc.) which are worth addressing but are out of scope for this targeted pass on review #4200492243. Happy to follow up on those in a separate commit if you'd like.

Copilot AI review requested due to automatic review settings April 24, 2026 18:36

kubaflo mentioned this pull request Apr 24, 2026

Add smart UI test category detection to skip irrelevant test jobs #33176

Closed

Copilot started reviewing on behalf of kubaflo April 24, 2026 18:37 View session

kubaflo assigned kubaflo and jfversluis Apr 24, 2026

Copilot AI reviewed Apr 24, 2026

View reviewed changes

kubaflo added area-ai-agents Copilot CLI agents, agent skills, AI-assisted development agentic-workflows and removed agentic-workflows labels Apr 24, 2026

github-actions Bot mentioned this pull request Apr 25, 2026

[repo-status] 🌟 Daily Repo Status - April 25, 2026 #35135

Closed

kubaflo mentioned this pull request Apr 25, 2026

Add UI test category detection to maui-pr-uitests pipeline #35136

Draft

kubaflo force-pushed the feature/detect-uitest-categories branch from 8898e29 to 8586797 Compare April 25, 2026 09:50

kubaflo changed the title ~~Add smart UI test category detection to skip irrelevant test jobs~~ Integrate UI test category detection into PR review workflow Apr 25, 2026

kubaflo force-pushed the feature/detect-uitest-categories branch 2 times, most recently from 231f9a9 to 1b54ca1 Compare April 25, 2026 10:25

kubaflo changed the title ~~Integrate UI test category detection into PR review workflow~~ Integrate UI test category detection into PR review and fix gate reliability Apr 25, 2026

kubaflo force-pushed the feature/detect-uitest-categories branch from 1b54ca1 to 8413e08 Compare April 25, 2026 23:41

github-actions Bot mentioned this pull request Apr 27, 2026

[PR Review Queue] 2026-04-27 #35146

Closed

This was referenced Apr 28, 2026

[PR Review Queue] 2026-04-28 #35182

Closed

[PR Review Queue] 2026-04-29 #35207

Closed

This comment has been minimized.

Sign in to view

github-actions Bot reviewed Apr 29, 2026

View reviewed changes

This comment has been minimized.

Sign in to view

github-actions Bot reviewed Apr 29, 2026

View reviewed changes

	7. Invoke the code-review skill as a sub-agent:
	8. Invoke the code-review skill as a sub-agent:

	if ($_ -match '^([A-Za-z]+)') { $Matches[1] }
	if ($_ -match '^([A-Za-z0-9_]+)') { $Matches[1] }

	$categoryPattern = '^\+\s\[Category\((?<value>[^\)])\)\]'
	$categoryPattern = '^\+\s\[Category\((?<value>[^\)])\)\s*(?:,\|\])'

		} elseif (-not [string]::IsNullOrWhiteSpace($env:SYSTEM_ACCESSTOKEN)) {
		$h['Authorization'] = "Bearer $env:SYSTEM_ACCESSTOKEN"

Conversation

kubaflo commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What this does

1. UI test category detection in PR review

2. Gate reliability fixes

Files

Validation — PR reviewer builds (Apr 26)

Uh oh!

github-actions Bot commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Apr 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔍 Skill Validation Results

✅ Static Checks Passed

❌ LLM Evaluation Failed

Uh oh!

This comment has been minimized.

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

This comment has been minimized.

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 29, 2026

kubaflo commented Apr 24, 2026 •

edited

Loading

github-actions Bot commented Apr 24, 2026 •

edited

Loading

github-actions Bot commented Apr 27, 2026 •

edited

Loading