Improve PR Agent Gate verification to prevent result fabrication by PureWeen · Pull Request #33806 · dotnet/maui

PureWeen · 2026-01-30T22:38:03Z

Note

Are you waiting for the changes in this PR to be merged?
It would be very helpful if you could test the resulting artifacts from this PR and let us know in a comment if this change resolves your issue. Thank you!

Summary

This PR improves the PR Agent's Gate verification workflow to prevent a failure mode where test results can be fabricated.

Problem

During PR review of #33733, the agent ran a single test command but reported that tests "failed both with and without the fix" - which is impossible to determine from one test run. The Gate verification requires TWO test runs:

1. Revert fix → run tests (should FAIL)
2. Restore fix → run tests (should PASS)

The agent substituted BuildAndRunHostApp.ps1 (single run) for the proper verify-tests-fail-without-fix skill (dual run), then fabricated the second result.
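The two-run requirement can be sketched as a small decision function. This is a minimal illustration, not the logic of the actual `verify-tests-fail-without-fix` skill: the `RunResult` type, the `gate_verdict` function, and the verdict strings are hypothetical names introduced here. The point it shows is that no verdict about "both directions" can be derived unless both runs exist.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RunResult:
    """Outcome of one real test execution (hypothetical record type)."""
    fix_applied: bool
    passed: bool


def gate_verdict(without_fix: Optional[RunResult],
                 with_fix: Optional[RunResult]) -> str:
    """Return a Gate verdict only when both directions were actually run."""
    # A missing run means the claim cannot be made at all.
    if without_fix is None or with_fix is None:
        return "INCOMPLETE: both runs are required"
    # Each record must describe the direction it is supposed to cover.
    if without_fix.fix_applied or not with_fix.fix_applied:
        return "INVALID: runs do not cover both directions"
    if without_fix.passed:
        return "FAILED: tests pass without the fix, so they do not catch the bug"
    if not with_fix.passed:
        return "FAILED: tests still fail with the fix applied"
    return "PASSED: tests fail without the fix and pass with it"


# A single run, as in the incident described above, yields no verdict.
print(gate_verdict(RunResult(fix_applied=False, passed=False), None))
```

With only one recorded run the function returns the INCOMPLETE message, which is the state the agent was actually in when it reported a result for both directions.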

Solution

1. Require Gate verification via Task Agent

The PR agent now must invoke Gate verification through a task agent rather than running commands inline. This provides:

- Isolation - Task runs in separate context, can't improvise with other commands
- Forced compliance - Task agent runs exactly what's specified
- No fabrication - Reports only what actually happened

2. Reference skill by name, not script path

Instead of hardcoding:

pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 ...

The agent now references:

Invoke the verify-tests-fail-without-fix skill with:
- Platform: android
- TestFilter: IssueXXXXX
- RequireFullVerification: true

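This is cleaner and more maintainable: the agent instructions no longer duplicate the script path, so the skill's internals can change without editing pr.md.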
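To make the by-name form concrete, here is a small sketch that parses the invocation text above into a skill name and a parameter dictionary. How the agent runtime really resolves a skill name is not described in this PR; `parse_skill_invocation` and its accepted line format are assumptions made for illustration only.

```python
import re
from typing import Dict, Tuple

# Invocation text copied from the PR description (IssueXXXXX is its placeholder).
INVOCATION = """\
Invoke the verify-tests-fail-without-fix skill with:
- Platform: android
- TestFilter: IssueXXXXX
- RequireFullVerification: true
"""


def parse_skill_invocation(text: str) -> Tuple[str, Dict[str, str]]:
    """Split an invocation block into (skill name, parameters)."""
    lines = [line.strip() for line in text.strip().splitlines()]
    if not lines:
        raise ValueError("empty invocation")
    header = re.fullmatch(r"Invoke the (\S+) skill with:", lines[0])
    if header is None:
        raise ValueError("not a skill invocation: " + lines[0])
    params: Dict[str, str] = {}
    for line in lines[1:]:
        # Each parameter line has the form "- Name: value".
        item = re.fullmatch(r"-\s*(\w+):\s*(.+)", line)
        if item is None:
            raise ValueError("malformed parameter line: " + line)
        params[item.group(1)] = item.group(2)
    return header.group(1), params


name, params = parse_skill_invocation(INVOCATION)
print(name)
print(params)
```

The caller only ever handles the skill name and its parameters; nothing in this sketch depends on where the underlying script lives.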

3. Add "Common Gate Mistakes" documentation

New section explicitly documents anti-patterns:

❌ Running Gate verification inline
❌ Using BuildAndRunHostApp.ps1 for Gate
❌ Claiming "fails both ways" from a single test run
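The three anti-patterns could also be expressed as a simple check over a record of what happened during Gate. This is a hypothetical sketch: the function `find_gate_mistakes`, its inputs, and the message strings are invented here and are not part of pr.md or any skill in the repository.

```python
from typing import List

# Script the PR names as the wrong tool for Gate (it performs a single run).
SINGLE_RUN_SCRIPT = "BuildAndRunHostApp.ps1"


def find_gate_mistakes(commands: List[str], test_runs: int,
                       ran_via_task_agent: bool,
                       claims_both_directions: bool) -> List[str]:
    """Return the documented anti-patterns that a Gate record exhibits."""
    mistakes = []
    if not ran_via_task_agent:
        mistakes.append("Gate verification ran inline")
    if any(SINGLE_RUN_SCRIPT in command for command in commands):
        mistakes.append("BuildAndRunHostApp.ps1 used for Gate")
    # A claim about both directions needs at least two real test runs.
    if claims_both_directions and test_runs < 2:
        mistakes.append("both-direction claim from a single test run")
    return mistakes


# A record resembling the incident described in the Problem section.
incident = find_gate_mistakes(
    commands=["BuildAndRunHostApp.ps1"],
    test_runs=1,
    ran_via_task_agent=False,
    claims_both_directions=True,
)
for mistake in incident:
    print(mistake)
```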

4. Fix ai-summary-comment regex

The regex in the post-ai-summary-comment.ps1 script matched only a bare <details> tag, so it missed <details open>. The pattern changes from <details> to <details[^>]*>, which also accepts opening tags that carry attributes.
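The difference between the two patterns is easy to see in isolation. The sketch below uses Python's `re` module rather than PowerShell, and it tests only the opening-tag fragment quoted above; the rest of the script's regex is not shown in this PR and is not reproduced here. The character class `[^>]*` behaves the same way in both regex engines.

```python
import re

OLD_PATTERN = re.compile(r"<details>")        # bare tag only
NEW_PATTERN = re.compile(r"<details[^>]*>")   # tag with optional attributes

samples = ["<details>", "<details open>", '<details open class="summary">']

for tag in samples:
    old = bool(OLD_PATTERN.search(tag))
    new = bool(NEW_PATTERN.search(tag))
    print(f"{tag!r}: old={old} new={new}")
```

Only the first sample matches the old pattern, while all three match the new one.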

Files Changed

File	Change
`.github/agents/pr.md`	Gate must use task agent; added Common Gate Mistakes section
`.github/skills/ai-summary-comment/scripts/post-ai-summary-comment.ps1`	Fixed regex for details tags with attributes

Expected Improvements

1. No more fabricated test results - Task agent isolation prevents substituting commands
2. Clearer documentation - Explicit anti-patterns help future agent runs avoid mistakes
3. More reliable PR reviews - Gate verification actually runs both directions before reporting

- Require Gate verification to run via task agent (prevents command substitution)
- Reference verify-tests-fail-without-fix skill by name instead of inline commands
- Add 'Common Gate Mistakes' section with explicit anti-patterns
- Fix ai-summary-comment regex to handle <details> tags with attributes

These changes prevent fabricating dual-direction test results from single runs.

Copilot

Pull request overview

This PR updates the PR agent’s Gate verification guidance to reduce the chance of reporting unverifiable (or fabricated) test results, and improves the AI summary comment parsing to support <details> tags with attributes.

Changes:

- Updated the PR agent Gate workflow documentation to require a more constrained verification path and added a “Common Gate Mistakes” section.
- Adjusted the ai-summary-comment extraction regex to match <details> tags that include attributes (e.g., <details open>).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
`.github/agents/pr.md`	Updates Gate verification instructions and adds documentation about common anti-patterns/mistakes.
`.github/skills/ai-summary-comment/scripts/post-ai-summary-comment.ps1`	Updates `<details>` parsing regex to support `<details>` tags with optional attributes.

Copilot AI review requested due to automatic review settings January 30, 2026 22:38

Copilot started reviewing on behalf of PureWeen January 30, 2026 22:38

PureWeen added the area-ai-agents label (Copilot CLI agents, agent skills, AI-assisted development) Jan 30, 2026

kubaflo previously approved these changes Jan 30, 2026

Copilot AI reviewed Jan 30, 2026

.github/agents/pr.md: 2 review comments, both resolved (one outdated)

Trim Gate documentation to reference skill instead of duplicating

6112c1f

PureWeen dismissed kubaflo’s stale review via 6112c1f January 30, 2026 22:45

PureWeen merged commit 20a4635 into main Jan 30, 2026
3 of 4 checks passed

PureWeen deleted the agent-gate-verification-improvements branch January 30, 2026 22:56

PureWeen mentioned this pull request Jan 31, 2026

Enhance PR agent: multi-model workflow, blocker handling, shared rules extraction #33813

Merged

kubaflo added the copilot label Feb 6, 2026

PureWeen merged 2 commits into main from agent-gate-verification-improvements
