Closed
Commits
71 commits
c8518ab
Add regression cross-reference script (STEP 0.6)
Copilot May 2, 2026
7b57175
Run regression tests from reverted fix PRs
Copilot May 2, 2026
ba78652
Run ALL test types in regression verification (UI, device, unit)
Copilot May 2, 2026
25d9d40
Run regression tests for OVERLAP too (max confidence)
Copilot May 3, 2026
b4ef904
Wire regression tests into try-fix candidate validation
Copilot May 3, 2026
f2c1d33
Renumber pipeline steps to whole numbers (1-8)
Copilot May 3, 2026
e4f0964
Merge regression test execution into STEP 3, renumber to 7 steps
Copilot May 3, 2026
e276532
Filter out fix PRs not merged into PR's base branch
Copilot May 4, 2026
9333658
Add STEP 3: run detected UI test categories
Copilot May 7, 2026
412aa32
Preserve STEP 3 results when Tier-3 refresh rewrites uitests/content.md
Copilot May 8, 2026
76bee77
STEP 3: enrich UI test failure detail in AI summary comment
Copilot May 8, 2026
0b8a025
STEP 3: surface build/deploy errors and avoid misleading ✓ when categ…
Copilot May 8, 2026
4ec64ea
STEP 3: split multi-line dotnet test capture and dedup failure rendering
Copilot May 8, 2026
2b3eb9e
STEP 3: detect infrastructure failures and clearly label them in AI s…
Copilot May 8, 2026
7e9a5d3
STEP 3: retry on environment errors (same as Gate's verify-tests-fail…
Copilot May 8, 2026
0291e98
STEP 3: delegate UI test runs to shared Invoke-UITestWithRetry.ps1
Copilot May 8, 2026
8dbb0e9
STEP 3: broaden infra-failure detection to 'Build FAILED + 0 passes'
Copilot May 8, 2026
58f3cf7
STEP 3: align dotnet test invocation with CI pipeline 313 (TRX, TestC…
Copilot May 8, 2026
c78a14a
STEP 3: scope TRX fallback to current run via timestamp filter
Copilot May 8, 2026
3158a35
Add Pester tests for Review-PR.ps1 helpers (Get-TrxResults, Get-DotNe…
Copilot May 8, 2026
aaab66e
STEP 3: dispatch UI tests to dedicated child pipeline (mirrors CI 313)
Copilot May 9, 2026
596064a
Revert "STEP 3: dispatch UI tests to dedicated child pipeline (mirror…
Copilot May 9, 2026
e111348
STEP 3: add inline RunDeepUITests + UpdateAISummaryComment stages
Copilot May 9, 2026
3e7f401
TEMP: skip Gate + Try-Fix to speed up inline-stages validation
Copilot May 9, 2026
44911a6
STEP 3: fix cross-stage output variable lookup syntax
Copilot May 9, 2026
3ae6653
STEP 3: install workloads + use build.ps1 in RunDeepUITests stage
Copilot May 9, 2026
d124ca7
Fix Deep UI Tests: use -Category param instead of invalid -OutputDir/…
Copilot May 9, 2026
067e142
Drop backtick line-continuation in deep UI tests step
Copilot May 9, 2026
b1b0334
Disable Xcode version validation in RunDeepUITests stage
Copilot May 9, 2026
c5d5dca
Install Node.js + Appium in RunDeepUITests stage
Copilot May 9, 2026
9f3c09a
Prefer iOS 26 simulator like main ui-tests pipeline + allow Update st…
Copilot May 9, 2026
8f75dd6
Auto-create iPhone 11 Pro on iOS-26 sim if no matching device pre-ins…
Copilot May 9, 2026
a517210
Prefer iOS-26-0 over iOS-26-1 in Start-Emulator.ps1
Copilot May 9, 2026
5aebe39
Re-enable Gate + Try-Fix after inline-stages validated
Copilot May 9, 2026
19805e2
Add failed-test names + snapshot-diff PNGs to deep results
Copilot May 9, 2026
23e892d
Wrap deep section in HTML markers to prevent duplicate stacking
Copilot May 9, 2026
3c38b3b
Enable Android AVD creation in provision for android Platform
Copilot May 9, 2026
6766a96
Prefer highest iOS 26.x runtime (26.4) to match CI baselines
Copilot May 9, 2026
ead7278
Add Android SDK tools (adb, emulator) to PATH for Deep stage
Copilot May 9, 2026
b4357b7
Use AcesShared pool for Android Deep UI Tests
Copilot May 9, 2026
096f512
Re-enable Gate + Try-Fix for full end-to-end validation
Copilot May 9, 2026
d0f0bbe
Install iOS 26.4 simulator explicitly + re-disable Gate/TryFix
Copilot May 10, 2026
a120696
Remove iOS 26.4 download step (requires macOS 26.4+, not downloadable)
Copilot May 10, 2026
edc4057
Demand Tahoe image for iOS pool to ensure iOS 26.4 runtime
Copilot May 10, 2026
55b68e7
Skip iOS simulator download in Deep stage - use Tahoe pre-installed r…
Copilot May 10, 2026
7162bdf
Revert skipSimulatorSetup - iOS 26.0 needed for build, not just tests
Copilot May 10, 2026
17bfd75
Explicitly download iOS 26.4 simulator using latest available Xcode
Copilot May 10, 2026
887d05c
TEMP: Skip ReviewPR, hardcode ViewBaseTests for fast iOS 26.4 testing
Copilot May 10, 2026
d30b88d
Try multiple iOS 26.4 download approaches on Tahoe agent
Copilot May 10, 2026
9162cc1
Add Tahoe image demand to androidPool (matches main CI)
Copilot May 10, 2026
ce5f24d
Switch Android to ubuntu-22.04 with KVM (matches main CI)
Copilot May 10, 2026
0a1a24c
Free disk space on ubuntu agents for Android emulator
Copilot May 10, 2026
7a448ac
Force-restart Android app before tests to recover from ANR
Copilot May 10, 2026
e576531
Wait for Android settings service before tests (API 30 fix)
Copilot May 10, 2026
b9f4433
Add ignoreHiddenApiPolicyError capability for Android API 30
Copilot May 10, 2026
1958ebd
Pre-build test project before app restart to avoid ANR on Android
Copilot May 10, 2026
797268d
Fix: allow restore during Android test project pre-build
Copilot May 10, 2026
789f80d
Restore full pipeline flow (ReviewPR → Deep → Update)
Copilot May 10, 2026
795ab4c
Switch Android to MAUI-1ESPT pool with 1ESPT-Ubuntu22.04 image
Copilot May 10, 2026
14fa228
TEMP: Skip ReviewPR for fast Android MAUI-1ESPT validation
Copilot May 10, 2026
274a201
Add AVD boot step to Deep stage (matches ReviewPR stage)
Copilot May 10, 2026
67ace4e
Move Android app restart to right before dotnet test execution
Copilot May 10, 2026
b92230b
Remove manual app restart - let Appium manage Android app lifecycle
Copilot May 10, 2026
3678ce3
Pass DEVICE_UDID to BuildAndRunHostApp in Deep stage
Copilot May 10, 2026
8468094
Reset LASTEXITCODE at end of Start-Emulator.ps1
Copilot May 11, 2026
e0b383e
Restore full pipeline + Android 118/119 ViewBaseTests PASS
Copilot May 11, 2026
07d8c36
Improve deep test results comment formatting
Copilot May 11, 2026
a5a335a
Add <br/> after all </summary> tags + dynamic category title
Copilot May 11, 2026
474068f
Wrap each failed test in details/summary for collapsible logs
Copilot May 11, 2026
ffe67ab
Re-enable Gate + Try-Fix for full end-to-end pipeline
Copilot May 11, 2026
8eb3d35
Move comment posting + labels to Stage 3
Copilot May 11, 2026
1 change: 1 addition & 0 deletions .github/agents/maui-expert-reviewer.md
@@ -111,6 +111,7 @@ Every bug fix needs a regression test. Modified code must be checked against git
- CHECK: Test covers the specific scenario from the issue report, not a generic case
- CHECK: Shared code changes are tested on all affected platforms
- CHECK: Previously-fixed issue numbers are cross-referenced when modifying the same code area
- CHECK: If `regression-check/risks.json` exists and contains `REVERT` entries, list the affected fix PRs/issues and require author acknowledgment that the reverted fix is intentional. The regression cross-reference script (`Find-RegressionRisks.ps1`) detects when a PR deletes lines that were previously added by a labeled bug-fix PR.
- CHECK: UI tests run on all applicable platforms unless there is a specific technical limitation
- CHECK: Snapshot baselines updated across all platforms when changing background color, font, or layout
- CHECK: Screenshot size matches capture method - a size mismatch means the capture changed, not the rendering
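
The `REVERT` check above could be sketched roughly as follows. The diff does not show the layout of `regression-check/risks.json`, so the schema used here (an array of objects with `type`, `fixPr`, and `issue` fields), the function name `Get-RevertRisk`, and the sample values are all assumptions made for illustration only.

```powershell
# Hypothetical sketch: the real risks.json schema is not shown in this diff.
# Assumed layout: a JSON array of objects with 'type', 'fixPr' and 'issue'.
function Get-RevertRisk {
    param([Parameter(Mandatory)][string]$Path)

    # A missing file means the cross-reference script produced no risks.
    if (-not (Test-Path -LiteralPath $Path)) { return }

    $entries = Get-Content -LiteralPath $Path -Raw | ConvertFrom-Json
    $entries | Where-Object { $_.type -eq 'REVERT' }
}

# Usage with a throwaway sample file (all values are made up).
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'risks-sample.json'
@'
[
  { "type": "REVERT",  "fixPr": 101, "issue": 100 },
  { "type": "OVERLAP", "fixPr": 202, "issue": 200 }
]
'@ | Set-Content -LiteralPath $sample -Encoding UTF8

# @() at the call site keeps the result an array for zero or one matches.
$reverts = @(Get-RevertRisk -Path $sample)
foreach ($r in $reverts) {
    Write-Output "Reverted fix PR #$($r.fixPr) (issue #$($r.issue)): author acknowledgment required"
}
```

A reviewer agent would presumably surface each listed entry in its review comment; how the real pipeline does that is not visible in this hunk.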
107 changes: 84 additions & 23 deletions .github/scripts/BuildAndRunHostApp.ps1
@@ -219,9 +219,13 @@ Write-Success "Test project: $TestProject"

#region Run Tests

# Determine the filter to use
# Determine the filter to use.
# NOTE: The CI pipeline `maui-pr-uitests` (definition 313) uses `TestCategory=`
# (see eng/pipelines/common/ui-tests-steps.yml lines 116-164). NUnit accepts
# both `Category=` and `TestCategory=` but Cake's RunTestWithLocalDotNet uses
# `TestCategory=` so we mirror that here for byte-for-byte parity with CI.
if ($Category) {
$effectiveFilter = "Category=$Category"
$effectiveFilter = "TestCategory=$Category"
Write-Step "Running UI tests with category: $Category"
} else {
$effectiveFilter = $TestFilter
@@ -233,27 +237,30 @@ if ($Platform -eq "android") {
Write-Info "Clearing Android logcat buffer before test..."
& adb -s $DeviceUdid logcat -c

# Dismiss any ANR dialogs that may have appeared during build/deploy.
# The emulator can sit idle during long builds, causing SystemUI ANR.
Write-Info "Dismissing any system dialogs before test..."
# Wait for Android settings service to be available.
Write-Info "Waiting for Android settings service..."
$settingsReady = $false
for ($i = 0; $i -lt 30; $i++) {
$settingsCheck = & adb -s $DeviceUdid shell settings get global device_name 2>&1
if ($settingsCheck -and $settingsCheck -notmatch "Can't find service|error") {
$settingsReady = $true
Write-Success "Settings service ready (device_name=$settingsCheck)"
break
}
Write-Info " Settings service not ready yet (attempt $($i+1)/30)..."
Start-Sleep -Seconds 5
}
if (-not $settingsReady) {
Write-Warn "Settings service may not be ready - tests might fail"
}

# Do NOT force-stop or restart the app here. Appium's UiAutomator2
# driver handles app lifecycle via appPackage/appActivity capabilities.
# Manual restart causes double-stop issues and the app ends up in a
# bad state. Just dismiss any system dialogs and let Appium handle it.
& adb -s $DeviceUdid shell am broadcast -a android.intent.action.CLOSE_SYSTEM_DIALOGS 2>$null
& adb -s $DeviceUdid shell input keyevent KEYCODE_ENTER 2>$null
& adb -s $DeviceUdid shell input keyevent KEYCODE_BACK 2>$null
Start-Sleep -Seconds 1
& adb -s $DeviceUdid shell input keyevent KEYCODE_WAKEUP 2>$null
& adb -s $DeviceUdid shell input keyevent KEYCODE_MENU 2>$null
Start-Sleep -Seconds 1

# Check for lingering ANR dialogs via window dump
$windowDump = & adb -s $DeviceUdid shell dumpsys window 2>$null | Select-String "Application Not Responding|ANR"
if ($windowDump) {
Write-Warn "ANR dialog detected - force-dismissing..."
& adb -s $DeviceUdid shell input keyevent KEYCODE_HOME 2>$null
Start-Sleep -Seconds 2
& adb -s $DeviceUdid shell am broadcast -a android.intent.action.CLOSE_SYSTEM_DIALOGS 2>$null
& adb -s $DeviceUdid shell input keyevent KEYCODE_BACK 2>$null
Start-Sleep -Seconds 1
}
}

# Capture test start time for iOS logs
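
The settings-service wait in this hunk is a poll-with-retries loop (30 attempts, 5 seconds apart). As a rough sketch of the same pattern in reusable form, assuming a hypothetical helper name `Wait-ForCondition` that is not part of this PR, and using a stand-in probe instead of `adb`:

```powershell
# Hypothetical helper, not part of the PR: polls a probe until it returns a
# truthy value or the attempt budget runs out.
function Wait-ForCondition {
    param(
        [Parameter(Mandatory)][scriptblock]$Probe,
        [int]$MaxAttempts = 30,
        [int]$DelaySeconds = 5
    )
    for ($i = 1; $i -le $MaxAttempts; $i++) {
        if (& $Probe) { return $true }
        # Unlike the loop in the diff, skip the sleep after the last attempt.
        if ($i -lt $MaxAttempts) { Start-Sleep -Seconds $DelaySeconds }
    }
    return $false
}

# Usage with a stand-in probe that only succeeds on its third call.
$script:probeCalls = 0
$ready = Wait-ForCondition -Probe {
    $script:probeCalls++
    $script:probeCalls -ge 3
} -MaxAttempts 5 -DelaySeconds 0

Write-Output "ready=$ready after $script:probeCalls probe calls"
```

In the script itself the probe would be the `adb ... shell settings get global device_name` call combined with the `-notmatch "Can't find service|error"` test shown above.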
@@ -306,17 +313,71 @@ $appiumLogFile = Join-Path $HostAppLogsDir "appium.log"
$env:APPIUM_LOG_FILE = $appiumLogFile
Write-Info "Set APPIUM_LOG_FILE: $appiumLogFile (screenshots will be saved here)"

# ── TRX setup (mirrors CI: eng/cake/dotnet.cake `RunTestWithLocalDotNet`) ──
# CI writes one trx per test run via:
# --logger "trx;LogFileName=<sanitized-name>.trx"
# --logger "console;verbosity=normal"
# --results-directory <test-results-dir>
# /p:VStestUseMSBuildOutput=false
# We reproduce that here so STEP 3's renderer can parse authoritative
# pass/fail counts from the TRX (instead of scraping console output, which is
# fragile when many tests run and lines get interleaved or wrapped).
$trxResultsDir = Join-Path $HostAppLogsDir "TestResults"
if (-not (Test-Path $trxResultsDir)) {
New-Item -ItemType Directory -Path $trxResultsDir -Force | Out-Null
}
# Sanitize the trx file name. NUnit/MSTest reject some characters. We keep
# alpha-numeric, dash, underscore and dot - same set Cake's
# SanitizeTestResultsFilename uses.
$trxBaseName = if ($Category) { "$Category-$Platform" } else {
($TestFilter -replace '[^A-Za-z0-9._-]', '_')
}
$trxBaseName = $trxBaseName -replace '[^A-Za-z0-9._-]', '_'
$trxFileName = "$trxBaseName.trx"
$trxFilePath = Join-Path $trxResultsDir $trxFileName
# Pre-clean stale TRX so we never read a previous run's results
if (Test-Path $trxFilePath) { Remove-Item $trxFilePath -Force -ErrorAction SilentlyContinue }

Write-Info "TRX file will be written to: $trxFilePath"

try {
# Run dotnet test and capture output
$testOutput = & dotnet test $TestProject --filter $effectiveFilter --logger "console;verbosity=detailed" 2>&1
# Run dotnet test using the SAME loggers and arguments CI uses in
# `RunTestWithLocalDotNet` (eng/cake/dotnet.cake lines 943-981).
$trxRunStart = Get-Date
$testArgs = @($TestProject, "--filter", $effectiveFilter,
"--logger", "trx;LogFileName=$trxFileName",
"--logger", "console;verbosity=normal",
"--results-directory", $trxResultsDir,
"/p:VStestUseMSBuildOutput=false")
Write-Info "Actual dotnet test args: $($testArgs -join ' ')"
$testOutput = & dotnet test @testArgs 2>&1

# Save test output to file
$testOutput | Out-File -FilePath $testOutputFile -Encoding UTF8

# Output test results to the output stream so callers can capture them
# (Write-Host goes to the Information stream which is not captured by 2>&1)
$testOutput | ForEach-Object { Write-Output $_ }


# Surface the TRX path on a marker line so callers (Invoke-UITestWithRetry
# and Review-PR.ps1) can locate the authoritative results file regardless
# of where the working directory was when this script ran.
if (Test-Path $trxFilePath) {
Write-Output ">>> TRX_RESULT_FILE: $trxFilePath"
} else {
# dotnet test may have written the TRX with a slightly different name
# (e.g. LogFileName argument stripped on Windows, or it injected a
# timestamp). Fall back to scanning the results dir for any .trx
# written AFTER this run started - never pick up a stale TRX from a
# previous category that shares the same results directory.
$latestTrx = Get-ChildItem -Path $trxResultsDir -Filter "*.trx" -ErrorAction SilentlyContinue |
Where-Object { $_.LastWriteTime -ge $trxRunStart } |
Sort-Object LastWriteTime -Descending | Select-Object -First 1
if ($latestTrx) {
Write-Output ">>> TRX_RESULT_FILE: $($latestTrx.FullName)"
}
}

$testExitCode = $LASTEXITCODE

Write-Host ""
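
For context on the caller side, the following is a minimal sketch of how a consumer might pick up the `>>> TRX_RESULT_FILE:` marker line and read pass/fail counts from the TRX `Counters` element. The function names `Get-TrxPathFromOutput` and `Get-TrxCounters` are hypothetical and are not the helpers in `Review-PR.ps1`; the sample TRX content and counts are made up.

```powershell
# Hypothetical sketch; the real helpers in Review-PR.ps1 are not shown here.
function Get-TrxPathFromOutput {
    param([string[]]$Lines)
    foreach ($line in $Lines) {
        if ($line -match '^>>> TRX_RESULT_FILE:\s*(.+)$') {
            return $Matches[1].Trim()
        }
    }
    return $null
}

function Get-TrxCounters {
    param([Parameter(Mandatory)][string]$Path)
    [xml]$trx = Get-Content -LiteralPath $Path -Raw
    # PowerShell's XML adapter resolves elements by local name, so the TRX
    # default namespace does not need to be registered for this lookup.
    $c = $trx.TestRun.ResultSummary.Counters
    [pscustomobject]@{
        Total  = [int]$c.total
        Passed = [int]$c.passed
        Failed = [int]$c.failed
    }
}

# Usage with a minimal made-up TRX file and simulated script output.
$trxPath = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-results.trx'
@'
<TestRun xmlns="http://microsoft.com/schemas/VisualStudio/TeamTest/2010">
  <ResultSummary outcome="Failed">
    <Counters total="119" executed="119" passed="118" failed="1" />
  </ResultSummary>
</TestRun>
'@ | Set-Content -LiteralPath $trxPath -Encoding UTF8

$output = @('some console line', ">>> TRX_RESULT_FILE: $trxPath")
$found  = Get-TrxPathFromOutput -Lines $output
$counts = Get-TrxCounters -Path $found
Write-Output "Total=$($counts.Total) Passed=$($counts.Passed) Failed=$($counts.Failed)"
```

Reading the counters from the TRX rather than the console text is the motivation given in the script comments: console lines can interleave or wrap when many tests run.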