diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index 90dad179ff44..c97b4299a8fb 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -245,7 +245,18 @@ Skills are modular capabilities that can be invoked directly or used by agents. #### User-Facing Skills -1. **issue-triage** (`.github/skills/issue-triage/SKILL.md`) +1. **pr-review** (`.github/skills/pr-review/SKILL.md`) + - **Purpose**: End-to-end PR review orchestrator — 3 phases: pr-preflight, try-fix, pr-report. Gate runs separately before this skill via Review-PR.ps1. + - **Trigger phrases**: "review PR #XXXXX", "work on PR #XXXXX", "fix issue #XXXXX", "continue PR #XXXXX" + - **Capabilities**: Multi-model fix exploration, alternative comparison, PR review recommendation + - **Do NOT use for**: Just running tests manually → Use `sandbox-agent` + - **Phase instructions** (in `.github/pr-review/`): + - `pr-preflight.md` — Context gathering from issue/PR + - `pr-report.md` — Final recommendation + - **Phase skill**: `try-fix` — Multi-model fix exploration + - **Note**: Gate (test verification) runs as a script step in `Review-PR.ps1` before this skill is invoked. Gate result is passed in the prompt. + +2. **issue-triage** (`.github/skills/issue-triage/SKILL.md`) - **Purpose**: Query and triage open issues that need milestones, labels, or investigation - **Trigger phrases**: "find issues to triage", "show me old Android issues", "what issues need attention" - **Scripts**: `init-triage-session.ps1`, `query-issues.ps1`, `record-triage.ps1` @@ -286,8 +297,8 @@ Skills are modular capabilities that can be invoked directly or used by agents. - **Trigger phrases**: "write XAML tests for #XXXXX", "test XamlC behavior", "reproduce XAML parsing bug" - **Output**: Test files for Controls.Xaml.UnitTests -8. 
**verify-tests-fail-without-fix** (`.github/skills/verify-tests-fail-without-fix/SKILL.md`) - - **Purpose**: Verifies UI tests catch the bug before fix and pass with fix +9. **verify-tests-fail-without-fix** (`.github/skills/verify-tests-fail-without-fix/SKILL.md`) + - **Purpose**: Verifies tests catch the bug before fix and pass with fix. Auto-detects test type (UI, device, unit, XAML) and dispatches to the appropriate runner. - **Two modes**: Verify failure only (test creation) or full verification (test + fix) - **Used by**: After creating tests, before considering PR complete diff --git a/.github/instructions/gh-aw-workflows.instructions.md b/.github/instructions/gh-aw-workflows.instructions.md index c8cd70ecc3be..2230e8c86492 100644 --- a/.github/instructions/gh-aw-workflows.instructions.md +++ b/.github/instructions/gh-aw-workflows.instructions.md @@ -6,6 +6,34 @@ applyTo: # gh-aw (GitHub Agentic Workflows) Guidelines +## 🚨 Before You Build: Prefer Built-in gh-aw Features + +**CRITICAL RULE:** Before implementing any trigger, output, scheduling, or interaction mechanism in a gh-aw workflow, check whether gh-aw has a built-in feature that does it. gh-aw extends GitHub Actions with many convenience features — manually reimplementing them is always worse (more code, more bugs, missing platform integration like emoji reactions, sanitized inputs, and noise reduction). + +### Step 1: Check the anti-patterns table below +### Step 2: If not listed, check the [triggers reference](https://github.github.com/gh-aw/reference/triggers/), [frontmatter reference](https://github.github.com/gh-aw/reference/frontmatter/), and [safe-outputs reference](https://github.github.com/gh-aw/reference/safe-outputs/) +### Step 3: If a built-in exists, use it. If not, proceed with manual implementation. + +### Anti-Patterns: Manual Reimplementations to Avoid + +| If you're about to implement... 
| Use this built-in instead | Docs | +|---------------------------------|--------------------------|------| +| `issue_comment` + `startsWith(comment.body, '/cmd')` | `slash_command:` trigger | [Command Triggers](https://github.github.com/gh-aw/reference/command-triggers/) | +| Manual emoji reaction on triggering comment | `reaction:` field under `on:` | [Frontmatter](https://github.github.com/gh-aw/reference/frontmatter/) | +| Posting "workflow started/completed" status comments | `status-comment: true` under `on:` | [Frontmatter](https://github.github.com/gh-aw/reference/frontmatter/) | +| Fixed cron schedule (`0 9 * * 1`) for non-critical timing | `schedule: weekly on monday around 9:00` (fuzzy) | [Triggers](https://github.github.com/gh-aw/reference/triggers/) | +| Manual `if:` to skip bot-authored PRs | `skip-bots:` under `on:` | [Triggers](https://github.github.com/gh-aw/reference/triggers/) | +| Manual `if:` to skip by author role | `skip-roles:` under `on:` | [Triggers](https://github.github.com/gh-aw/reference/triggers/) | +| Manual label check + removal for one-shot commands | `label_command:` trigger | [Triggers](https://github.github.com/gh-aw/reference/triggers/) | +| Editing old comments to collapse them | `hide-older-comments: true` on `add-comment:` | [Safe Outputs](https://github.github.com/gh-aw/reference/safe-outputs/) | +| Creating no-op report issues | `noop: report-as-issue: false` | [Safe Outputs / Monitoring](https://github.github.com/gh-aw/patterns/monitoring/) | +| Auto-closing older issues from same workflow | `close-older-issues: true` on `create-issue:` | [Safe Outputs](https://github.github.com/gh-aw/reference/safe-outputs/) | +| Disabling workflow after a date | `stop-after:` under `on:` | [Triggers](https://github.github.com/gh-aw/reference/triggers/) | +| Manual approval gating | `manual-approval:` under `on:` | [Triggers](https://github.github.com/gh-aw/reference/triggers/) | +| Search-based skip logic in `steps:` | `skip-if-match:` 
/ `skip-if-no-match:` under `on:` | [Triggers](https://github.github.com/gh-aw/reference/triggers/) | + +**Note:** gh-aw is actively developed. If a capability feels like something a framework would provide natively, check the reference docs — it probably exists even if it's not in this table yet. + ## Architecture gh-aw workflows are authored as `.md` files with YAML frontmatter, compiled to `.lock.yml` via `gh aw compile`. The lock file is auto-generated — **never edit it manually**. @@ -29,6 +57,8 @@ agent job: | Platform steps | ✅ Yes | ✅ Yes | ✅ Yes | Platform-controlled | | Agent container | ❌ Scrubbed | ❌ Scrubbed | ❌ Scrubbed | ✅ But sandboxed | +**⚠️ Agent container credential nuance:** `GITHUB_TOKEN` and `gh` CLI credentials are scrubbed inside the agent container. However, `COPILOT_TOKEN` (used for LLM inference) is present in the environment via `--env-all`. Any subprocess (e.g., `dotnet build`, `npm install`) inherits this variable. The AWF network firewall, `redact_secrets.cjs` (post-agent log scrubbing), and the threat detection agent limit the blast radius. See [Security Boundaries](#security-boundaries) below. + ### Step Ordering (Critical) User `steps:` **always run before** platform-generated steps. You cannot insert user steps after platform steps. @@ -48,6 +78,41 @@ By default, `gh aw compile` automatically injects a fork guard into the activati To **allow fork PRs**, add `forks: ["*"]` to the `pull_request` trigger in the `.md` frontmatter. The compiler removes the auto-injected guard from the compiled `if:` conditions. This is safe when the workflow uses the `Checkout-GhAwPr.ps1` pattern (checkout + trusted-infra restore) and the agent is sandboxed. +## Security Boundaries + +### Key Principles (from [GitHub Security Lab](https://securitylab.github.com/resources/github-actions-preventing-pwn-requests/)) + +1. 
**Never execute untrusted PR code with elevated credentials.** The classic "pwn-request" attack is `pull_request_target` + checkout PR + run build scripts with `GITHUB_TOKEN`. The attack surface includes build scripts (`make`, `build.ps1`), package manager hooks (`npm postinstall`, MSBuild targets), and test runners. + +2. **Treating PR contents as passive data is safe.** Reading, analyzing, or diffing PR code is fine — the danger is *executing* it. Our gh-aw workflows read code for evaluation; they never build or run it. + +3. **`pull_request_target` grants write permissions and secrets access.** This is by design — the workflow YAML comes from the base branch (trusted). But any step that checks out and runs fork code in this context creates a vulnerability. + +4. **`pull_request` from forks has no secrets access.** GitHub withholds secrets because the workflow YAML comes from the fork (untrusted). This is the safe default for CI builds on fork PRs. + +5. **The `workflow_run` pattern separates privilege from code execution.** Build in an unprivileged `pull_request` job → pass artifacts → process in a privileged `workflow_run` job. This is architecturally what gh-aw does: agent runs read-only, `safe_outputs` job has write permissions. 
+ +### gh-aw Defense Layers + +| Layer | What it does | What it doesn't do | +|-------|-------------|-------------------| +| **AWF network firewall** | Restricts outbound to allowlisted domains | Doesn't prevent reading env vars inside the container | +| **`redact_secrets.cjs`** | Scrubs known secret values from logs/artifacts post-agent | Doesn't catch encoded/obfuscated values | +| **Threat detection agent** | Reviews agent outputs before safe-outputs publishes them | Can miss novel exfiltration techniques | +| **Safe-outputs permission separation** | Write operations happen in separate job, not the agent | Agent can still request writes via safe-output tools | +| **`max: 1` on `add-comment`** | Limits agent to one comment | That one comment could contain sensitive data (mitigated by redaction) | +| **XPIA prompt** | Instructs LLM to resist prompt injection from untrusted content | LLM compliance is probabilistic, not guaranteed | +| **`pre_activation` role check** | Gates on write-access collaborators | Does not apply if `roles: all` is set | + +### Rules for gh-aw Workflow Authors + +- ✅ **DO** treat PR contents as passive data (read, analyze, diff) +- ✅ **DO** run data-gathering scripts in `steps:` (pre-agent, trusted context) not inside the agent +- ✅ **DO** use `Checkout-GhAwPr.ps1` for `workflow_dispatch` to restore trusted `.github/` from base +- ❌ **DO NOT** run `dotnet build`, `npm install`, or any build command on untrusted PR code inside the agent — build tool hooks (MSBuild targets, postinstall scripts) can read `COPILOT_TOKEN` from the environment +- ❌ **DO NOT** execute workspace scripts (`.ps1`, `.sh`, `.py`) after checking out a fork PR in `steps:` — those run with `GITHUB_TOKEN` +- ❌ **DO NOT** set `roles: all` on workflows that process PR content — this allows any user to trigger the workflow + ## Fork PR Handling ### The "pwn-request" Threat Model @@ -65,12 +130,13 @@ Reference: 
https://securitylab.github.com/resources/github-actions-preventing-pw | `workflow_dispatch` | ❌ Skipped | ✅ Works — user steps handle checkout and restore is final | | `issue_comment` (same-repo) | ✅ Yes | ✅ Works — files already on PR branch | | `issue_comment` (fork) | ✅ Yes | ⚠️ Works — `checkout_pr_branch.cjs` re-checks out fork branch after user steps, potentially overwriting restored infra. Acceptable because agent is sandboxed (no credentials, max 1 comment via safe-outputs). Pre-flight check catches missing `SKILL.md` if fork isn't rebased. | +| `slash_command` | ✅ Yes (compiles to `issue_comment` internally) | Same behavior as `issue_comment` above, but with platform-managed command matching, emoji reactions, and sanitized input. Prefer `slash_command:` over manual `issue_comment` + `startsWith()`. | ### The `issue_comment` + Fork Problem For `/slash-command` triggers on fork PRs, `checkout_pr_branch.cjs` runs AFTER all user steps and re-checks out the fork branch. This overwrites any files restored by user steps (e.g., `.github/skills/`). A fork could include a crafted `SKILL.md` that alters the agent's evaluation behavior. -**Accepted residual risk:** The agent runs in a sandboxed container with all credentials scrubbed. The worst outcome is a manipulated evaluation comment (`safe-outputs: add-comment: max: 1`). The agent has no ability to push code, access secrets, or exfiltrate data. The pre-flight check in the agent prompt catches the case where `SKILL.md` is missing entirely (fork not rebased on `main`). +**Accepted residual risk:** The agent runs in a sandboxed container with `GITHUB_TOKEN` and `gh` CLI credentials scrubbed. `COPILOT_TOKEN` (for LLM inference) remains in the environment but the AWF network firewall restricts outbound connections to an allowlist of domains, `redact_secrets.cjs` scrubs known secret values from logs/outputs post-agent, and the threat detection agent reviews outputs before they are published. 
The worst practical outcome is a manipulated evaluation comment (`safe-outputs: add-comment: max: 1`). The pre-flight check in the agent prompt catches the case where `SKILL.md` is missing entirely (fork not rebased on `main`). **Upstream issue:** [github/gh-aw#18481](https://github.com/github/gh-aw/issues/18481) — "Using gh-aw in forks of repositories" @@ -88,17 +154,15 @@ steps: ``` The script: -1. Captures the base branch SHA before checkout -2. Checks out the PR branch via `gh pr checkout` -3. Deletes `.github/skills/` and `.github/instructions/` (prevents fork-added files) -4. Restores them from the base branch SHA (best-effort, non-fatal) +1. Verifies the PR author has write access and rejects fork PRs +2. Captures the base branch SHA before checkout +3. Checks out the PR branch via `gh pr checkout` +4. Restores `.github/skills/`, `.github/instructions/`, and `.github/copilot-instructions.md` from the base branch SHA (fatal on failure) **Behavior by trigger:** - **`workflow_dispatch`**: Platform checkout is skipped, so the restore IS the final workspace state (trusted files from base branch) -- **`pull_request`** (same-repo): User step restores trusted infra. `checkout_pr_branch.cjs` runs after and re-checks out PR branch — for same-repo PRs, skill files typically match main unless the PR modified them. -- **`pull_request`** (fork with `forks: ["*"]`): Same as above, but fork's skill files may differ. Same residual risk as `issue_comment` fork case — agent is sandboxed, pre-flight catches missing `SKILL.md`. -- **`issue_comment`** (same-repo): Platform re-checks out PR branch — files already match, effectively a no-op -- **`issue_comment`** (fork): Platform re-checks out fork branch after us, overwriting restored files. Agent is sandboxed; pre-flight in the prompt catches missing `SKILL.md` +- **`slash_command`** (same-repo): Platform's `checkout_pr_branch.cjs` handles checkout. Skill files typically match main unless the PR modified them. 
+- **`slash_command`** (fork): Platform re-checks out fork branch after user steps, overwriting restored files. Agent is sandboxed; pre-flight in the prompt catches missing `SKILL.md` ### Anti-Patterns diff --git a/.github/pr-review/pr-gate.md b/.github/pr-review/pr-gate.md index 817496178333..78f98ddc5d54 100644 --- a/.github/pr-review/pr-gate.md +++ b/.github/pr-review/pr-gate.md @@ -1,8 +1,9 @@ -# PR Gate — Test Verification +# PR Gate - Test Before and After Fix > **⛔ This phase MUST pass before continuing to Try-Fix. If it fails, stop and inform user.** -> 🚨 Gate verification MUST run via task agent — never inline. +> In CI (Review-PR.ps1), the gate runs `verify-tests-fail.ps1` directly as a script step. +> For manual usage, you can invoke it yourself or via a task agent. --- @@ -26,41 +27,32 @@ Choose a platform that is BOTH affected by the bug AND available on the current ## Steps -1. **Check if tests exist:** +1. **Detect tests in PR** using the shared detection script: ```bash - gh pr view XXXXX --json files --jq '.files[].path' | grep -E "TestCases\.(HostApp|Shared\.Tests)" + pwsh .github/scripts/shared/Detect-TestsInDiff.ps1 -PRNumber XXXXX ``` - If NO tests exist → inform user, suggest `write-tests-agent`. Gate is ⚠️ SKIPPED. + This auto-detects all test types: UI tests, device tests, unit tests, XAML tests. + If NO tests detected → inform user, suggest `write-tests-agent`. Gate is ⚠️ SKIPPED. -2. **Select platform** — must be affected by bug AND available on host (see Platform Selection above). +2. **Select platform** — must be affected by bug AND available on host (see table above). -3. **Run verification via task agent** (MUST use task agent — never inline): +3. **Run verification** via `verify-tests-fail.ps1`: + ```bash + pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 \ + -Platform {platform} -RequireFullVerification + ``` + In CI, `Review-PR.ps1` calls this script directly. 
For manual usage, you can also invoke + it via a task agent for isolation: ``` Invoke the `task` agent with this prompt: "Invoke the verify-tests-fail-without-fix skill for this PR: - Platform: {platform} - - TestFilter: 'IssueXXXXX' - RequireFullVerification: true Report back: Did tests FAIL without fix? Did tests PASS with fix? Final status?" ``` -**Why task agent?** Running inline allows substituting commands and fabricating results. Task agent runs in isolation. - ---- - -## Expected Result - -``` -╔═══════════════════════════════════════════════════════════╗ -║ VERIFICATION PASSED ✅ ║ -╠═══════════════════════════════════════════════════════════╣ -║ - FAIL without fix (as expected) ║ -║ - PASS with fix (as expected) ║ -╚═══════════════════════════════════════════════════════════╝ -``` - --- ## If Gate Fails @@ -72,25 +64,44 @@ Choose a platform that is BOTH affected by the bug AND available on the current ## Output File +> 🚨 **CRITICAL OUTPUT RULES:** +> - Write gate results ONLY to `gate/content.md` — NEVER copy gate results into other phases (pre-flight, try-fix, report) +> - Use the EXACT template below — no extra explanations, no "Reason:" paragraphs, no "Notes:" sections +> - Keep it SHORT — the template is the complete output + ```bash mkdir -p CustomAgentLogsTmp/PRState/{PRNumber}/PRAgent/gate ``` -Write `content.md`: +Write `content.md` using this **exact** template (fill in values, don't add anything else): + ```markdown ### Gate Result: {✅ PASSED / ❌ FAILED / ⚠️ SKIPPED} **Platform:** {platform} -**Mode:** Full Verification -- Tests FAIL without fix: {✅/❌} -- Tests PASS with fix: {✅/❌} +| # | Type | Test Name | Filter | +|---|------|-----------|--------| +| 1 | {type} | {name} | `{filter}` | + +| Step | Expected | Actual | Result | +|------|----------|--------|--------| +| Without fix | FAIL | {FAIL/PASS} | {✅/❌} | +| With fix | PASS | {FAIL/PASS} | {✅/❌} | +``` + +If gate is SKIPPED (no tests found), write only: + +```markdown +### Gate Result: ⚠️ 
SKIPPED + +No tests detected in PR. Suggest adding tests via `write-tests-agent`. ``` --- ## Common Mistakes -- ❌ Running inline — MUST use task agent -- ❌ Using `BuildAndRunHostApp.ps1` — that runs ONE direction; the skill does TWO -- ❌ Claiming results from a single test run — script does TWO runs automatically +- ❌ Adding verbose explanations to gate/content.md — use the exact template above +- ❌ Copying gate results into try-fix/content.md or report/content.md — gate results belong ONLY in gate/content.md +- ❌ Skipping gate because tests are device tests, not UI tests — the skill supports all test types diff --git a/.github/pr-review/pr-report.md b/.github/pr-review/pr-report.md index ffe233ad1172..89651509fd3f 100644 --- a/.github/pr-review/pr-report.md +++ b/.github/pr-review/pr-report.md @@ -4,11 +4,14 @@ > 🚨 **DO NOT post any comments.** This phase only produces output files. +> 🚨 **DO NOT duplicate content from other phases.** Reference gate/try-fix results by status only (e.g., "Gate: ✅ PASSED") — do NOT copy their full output into report/content.md. 
+ --- ## Prerequisites -- Phases 1-3 (Pre-Flight, Gate, Try-Fix) must be complete before starting +- Phases 1-2 (Pre-Flight, Try-Fix) must be complete before starting +- Gate result is available from the prompt (ran separately before this skill) --- diff --git a/.github/scripts/BuildAndRunHostApp.ps1 b/.github/scripts/BuildAndRunHostApp.ps1 index 70f92a297f9a..76959772f48e 100644 --- a/.github/scripts/BuildAndRunHostApp.ps1 +++ b/.github/scripts/BuildAndRunHostApp.ps1 @@ -313,8 +313,9 @@ try { # Save test output to file $testOutput | Out-File -FilePath $testOutputFile -Encoding UTF8 - # Display test output - $testOutput | ForEach-Object { Write-Host $_ } + # Output test results to the output stream so callers can capture them + # (Write-Host goes to the Information stream which is not captured by 2>&1) + $testOutput | ForEach-Object { Write-Output $_ } $testExitCode = $LASTEXITCODE diff --git a/.github/scripts/Checkout-GhAwPr.ps1 b/.github/scripts/Checkout-GhAwPr.ps1 index a2f9533bb7d2..1231e451be7d 100644 --- a/.github/scripts/Checkout-GhAwPr.ps1 +++ b/.github/scripts/Checkout-GhAwPr.ps1 @@ -1,26 +1,25 @@ <# .SYNOPSIS - Shared PR checkout for gh-aw (GitHub Agentic Workflows). + Shared PR checkout and trusted-infra restore for gh-aw workflows. .DESCRIPTION Checks out a PR branch and restores trusted agent infrastructure (skills, - instructions) from the base branch. Works for both same-repo and fork PRs. + instructions) from the base branch. This gives the agent the PR's code + changes with the latest skills and instructions from main. - This script is only invoked for workflow_dispatch triggers. For pull_request - and issue_comment, the gh-aw platform's checkout_pr_branch.cjs handles PR - checkout automatically (it runs as a platform step after all user steps). - workflow_dispatch skips the platform checkout entirely, so this script is - the only thing that gets the PR code onto disk. + Currently used for workflow_dispatch triggers. 
For slash_command and + issue_comment triggers, the gh-aw platform's checkout_pr_branch.cjs + handles PR checkout automatically — but may overwrite trusted infra + with fork-supplied files. Call this script after platform checkout to + restore trusted .github/ from the base branch. - SECURITY NOTE: This script checks out PR code onto disk. This is safe - because NO subsequent user steps execute workspace code — the gh-aw - platform copies the workspace into a sandboxed container with scrubbed - credentials before starting the agent. The classic "pwn-request" attack - requires checkout + execution; we only do checkout. + SECURITY: Before checkout, the script verifies the PR author has + write access (write, maintain, or admin) and rejects fork PRs. + This prevents checkout of untrusted code in privileged contexts. DO NOT add steps after this that run scripts from the workspace - (e.g., ./build.sh, pwsh ./script.ps1). That would create an actual - fork code execution vulnerability. See: + (e.g., ./build.sh, pwsh ./script.ps1). That would create a code + execution vulnerability. See: https://securitylab.github.com/resources/github-actions-preventing-pwn-requests/ .NOTES @@ -42,9 +41,46 @@ if (-not $env:PR_NUMBER -or $env:PR_NUMBER -eq '0') { $PrNumber = $env:PR_NUMBER +# ── Verify PR is same-repo and author has write access ─────────────────────── + +$RawJson = gh pr view $PrNumber --repo $env:GITHUB_REPOSITORY --json author,isCrossRepository --jq '{author: .author.login, isFork: .isCrossRepository}' +if ($LASTEXITCODE -ne 0) { + Write-Host "❌ Failed to fetch PR #$PrNumber metadata" + exit 1 +} + +try { + $PrInfo = $RawJson | ConvertFrom-Json +} catch { + Write-Host "❌ PR #$PrNumber returned malformed JSON: $RawJson" + exit 1 +} + +if (-not $PrInfo -or -not $PrInfo.author) { + Write-Host "❌ PR #$PrNumber returned empty or malformed metadata" + exit 1 +} + +if ($PrInfo.isFork) { + Write-Host "⏭️ PR #$PrNumber is from a fork — skipping. 
Fork PRs are evaluated in the sandboxed agent container via the platform's checkout_pr_branch.cjs." + exit 1 +} + +$Permission = gh api "repos/$($env:GITHUB_REPOSITORY)/collaborators/$($PrInfo.author)/permission" --jq '.permission' +if ($LASTEXITCODE -ne 0) { + Write-Host "❌ Failed to check permissions for '$($PrInfo.author)'" + exit 1 +} + +$AllowedRoles = @('admin', 'write', 'maintain') +if ($Permission -notin $AllowedRoles) { + Write-Host "⏭️ PR author '$($PrInfo.author)' has '$Permission' access. workflow_dispatch only processes PRs from authors with write access." + exit 1 +} + +Write-Host "✅ PR #$PrNumber by '$($PrInfo.author)' ($Permission access, same-repo)" + # ── Save base branch SHA ───────────────────────────────────────────────────── -# Must be captured BEFORE checkout replaces HEAD. -# Exported for potential use by downstream platform steps (e.g., checkout_pr_branch.cjs) $BaseSha = git rev-parse HEAD if ($LASTEXITCODE -ne 0) { @@ -52,6 +88,7 @@ if ($LASTEXITCODE -ne 0) { exit 1 } Add-Content -Path $env:GITHUB_ENV -Value "BASE_SHA=$BaseSha" +Write-Host "Base branch SHA: $BaseSha" # ── Checkout PR branch ────────────────────────────────────────────────────── @@ -65,17 +102,15 @@ Write-Host "✅ Checked out PR #$PrNumber" git log --oneline -1 # ── Restore agent infrastructure from base branch ──────────────────────────── -# This script only runs for workflow_dispatch (other triggers use the platform's -# checkout_pr_branch.cjs instead). For workflow_dispatch the platform checkout is -# skipped, so this restore IS the final workspace state. -# rm -rf first to prevent fork-added files from surviving the restore. - -if (Test-Path '.github/skills/') { Remove-Item -Recurse -Force '.github/skills/' } -if (Test-Path '.github/instructions/') { Remove-Item -Recurse -Force '.github/instructions/' } +# Replace skills and instructions with base branch versions to ensure the agent +# always uses trusted infrastructure from main. 
Uses git checkout to read files +# directly from the commit tree — works in shallow clones (no history traversal). +# Restore BEFORE deleting so a failure doesn't leave the workspace without infra. git checkout $BaseSha -- .github/skills/ .github/instructions/ .github/copilot-instructions.md 2>&1 if ($LASTEXITCODE -eq 0) { Write-Host "✅ Restored agent infrastructure from base branch ($BaseSha)" } else { - Write-Host "⚠️ Could not restore agent infrastructure from base branch — files may come from the PR branch" + Write-Host "❌ Failed to restore agent infrastructure from base branch — aborting to prevent running with untrusted infra" + exit 1 } diff --git a/.github/scripts/EstablishBrokenBaseline.ps1 b/.github/scripts/EstablishBrokenBaseline.ps1 index 2a6d57506ec2..0fe68a2f8d34 100644 --- a/.github/scripts/EstablishBrokenBaseline.ps1 +++ b/.github/scripts/EstablishBrokenBaseline.ps1 @@ -64,6 +64,9 @@ $script:TestPathPatterns = @( "*.Tests/*", "*.UnitTests/*", "*TestCases*", + "*TestUtils*", + "*DeviceTests.Runners*", + "*DeviceTests.Shared*", "*snapshots*", "*.png", "*.jpg", diff --git a/.github/scripts/Fix-MilestoneDrift.Tests.ps1 b/.github/scripts/Fix-MilestoneDrift.Tests.ps1 new file mode 100644 index 000000000000..2628b5334f32 --- /dev/null +++ b/.github/scripts/Fix-MilestoneDrift.Tests.ps1 @@ -0,0 +1,394 @@ +#!/usr/bin/env pwsh +#Requires -Modules Pester +<# +.SYNOPSIS + Pester tests for Fix-MilestoneDrift.ps1. + Tests the pure functions (milestone mapping, matching, linked-issue extraction) + without hitting GitHub or Git. + +.EXAMPLE + Invoke-Pester ./Fix-MilestoneDrift.Tests.ps1 + Invoke-Pester ./Fix-MilestoneDrift.Tests.ps1 -Output Detailed +#> + +BeforeAll { + . 
"$PSScriptRoot/Fix-MilestoneDrift.ps1" +} + +Describe 'ConvertTo-Milestone' { + It 'maps GA tag "<Tag>" to "<Expected>"' -ForEach @( + @{ Tag = '10.0.0'; Expected = '.NET 10.0 GA' } + @{ Tag = '9.0.0'; Expected = '.NET 9.0 GA' } + ) { + ConvertTo-Milestone $Tag | Should -Be $Expected + } + + It 'maps SR tag "<Tag>" to "<Expected>"' -ForEach @( + @{ Tag = '10.0.10'; Expected = '.NET 10 SR1' } + @{ Tag = '10.0.11'; Expected = '.NET 10 SR1.1' } + @{ Tag = '10.0.20'; Expected = '.NET 10 SR2' } + @{ Tag = '10.0.31'; Expected = '.NET 10 SR3.1' } + @{ Tag = '10.0.40'; Expected = '.NET 10 SR4' } + @{ Tag = '10.0.41'; Expected = '.NET 10 SR4.1' } + @{ Tag = '10.0.50'; Expected = '.NET 10 SR5' } + @{ Tag = '9.0.82'; Expected = '.NET 9 SR8.2' } + @{ Tag = '9.0.90'; Expected = '.NET 9 SR9' } + @{ Tag = '10.0.100'; Expected = '.NET 10 SR10' } + @{ Tag = '10.0.101'; Expected = '.NET 10 SR10.1' } + ) { + ConvertTo-Milestone $Tag | Should -Be $Expected + } + + It 'maps early patch "<Tag>" to SR1' -ForEach @( + @{ Tag = '10.0.1' } + @{ Tag = '10.0.5' } + @{ Tag = '10.0.9' } + ) { + ConvertTo-Milestone $Tag | Should -Be '.NET 10.0 SR1' + } + + It 'returns $null for non-SR tags' -ForEach @( + @{ Tag = '10.0.0-preview.7.25406.3' } + @{ Tag = 'not-a-tag' } + @{ Tag = '' } + ) { + ConvertTo-Milestone $Tag | Should -BeNullOrEmpty + } + + It 'maps preview "" with label "