Skip to content

Proposal: instructions, skills and review agent#13397

Merged
MichalPavlik merged 14 commits intodotnet:mainfrom
T-Gro:feat/skill-proposals
Mar 18, 2026
Merged

Proposal: instructions, skills and review agent#13397
MichalPavlik merged 14 commits intodotnet:mainfrom
T-Gro:feat/skill-proposals

Conversation

@T-Gro
Copy link
Copy Markdown
Member

@T-Gro T-Gro commented Mar 17, 2026

What

Adds folder-scoped instructions, topic-scoped skills, and an expert review agent for GitHub Copilot working in this repo.

How it works

  • Instructions (.github/instructions/) activate automatically when editing files matching their applyTo glob. They provide folder-specific guidance (e.g., concurrency rules for src/Build/BackEnd/, evaluation model rules for src/Build/Evaluation/).

  • Skills (.github/skills/) activate on topic keywords in conversation. They cover overarching concerns like backward compatibility assessment, performance optimization, SDK integration, error authoring, and binary log compatibility. The existing changewaves skill gets a cross-reference to the new assessing-breaking-changes skill (how vs when).

  • Review agent (.github/agents/expert-reviewer.md) is invoked via @expert-reviewer. It runs a 4-wave review process:

    1. Find — one sub-agent per quality dimension (24 total), parallel batches of 6. Each returns LGTM or a finding with exact file:line and a concrete failing scenario.
    2. Validate — proves each finding by tracing code flow in the PR branch, writing proof-of-concept tests, or simulating thread timelines. Multi-model consensus (Opus + Codex + Gemini) for borderline cases.
    3. Post — inline review comments at exact diff locations via GitHub CLI.
    4. Summary — 24-dimension checkbox table as review body.

Example

Applied to PR #13350 — found a shared-state concurrency race, an unlogged exception, and a null-deref edge case. 20 of 24 dimensions passed LGTM.

Scope

No production code changes. Only .github/ files (instructions, skills, agent). One minor addition to the existing changewaves skill (cross-reference line).

T-Gro and others added 13 commits March 16, 2026 18:16
Generated from analysis of 10,081 GitHub comments (2016-2026) across
dotnet/msbuild, dotnet/sdk, dotnet/project-system, and dotnet/dotnet.

Artifacts:
- 1 reviewer persona skill (24 dimensions, 99 rules)
- 10 folder-scoped instruction files
- 5 overarching skills

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Convert reviewer persona to @rainersigwald-reviewer agent
- Add deep-review skill as slim trigger for agent invocation
- Apply progressive disclosure: SKILL.md overview + DIMENSIONS.md reference
- Rename all skill names to gerund form per best practices
- Rewrite all descriptions in third person for proper discovery
- Deduplicate instruction files against AGENTS.md content
- Trim verbose explanations (Claude is smart)
- Remove time-sensitive statistics
- Verify all 23+ doc links exist on disk

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Rename agent to expert-reviewer (no personal attribution)
- Rename skill folder to reviewing-msbuild-code
- Remove all comment counts, evidence counts, PR references
- Remove frequency/evidence labels from dimensions
- Present all content as authoritative engineering guidance

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Review dimensions are now launched as 5 parallel sub-tasks:
- Agent A: BLOCKING dims (compat, changewaves, concurrency, eval model, security)
- Agent B: MAJOR quality (tests, errors, strings, API, naming)
- Agent C: MAJOR design (perf, targets, design, cross-platform, SDK)
- Agent D: MAJOR correctness (file I/O, infra, edge cases, deps)
- Agent E: MODERATE+NIT (logging, simplification, idioms, docs, scope)

Each uses claude-opus-4.6 model via the task tool. Results are
aggregated and deduplicated before presenting consolidated findings.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… GH CLI posting

- Sub-agents must return LGTM for clean dimensions (no forced output)
- Findings require exact file:line and code-flow verification
- Phase 2: Multi-model vote (Opus+Codex+Gemini), keep ≥2/3 consensus
- Phase 3: Post inline comments via GH CLI at exact diff locations
- Phase 4: Summary table with 24-dimension checkbox list
- Hypothetical/speculative findings are rejected

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Verifiers must use the PR diff or PR branch ref, not the main
branch — PR code (new files, new methods) doesn't exist on main.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The LGTM-first wording was too permissive - sub-agents explained away
real concurrency issues (shared mutable state race, process-global CWD).

New wording:
- 'Be rigorous, not a nodder' — LGTM only when genuinely verified clean
- Do NOT invent findings, but do NOT explain away real issues
- For concurrency: trace all threads, map timelines, check overlapping access
- For edge cases: construct concrete failing scenarios
- Verification must check PR branch, not main

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Review workflow now has 4 waves:
- Wave 1: Find candidates (parallel dimension agents, rigorous not nodder)
- Wave 2: Actively validate (trace code flow in PR branch, write proof-
  of-concept tests, simulate thread timelines, multi-model consensus)
- Wave 3: Post inline comments with scenarios + test snippets as proof
- Wave 4: Summary table with 24-dimension checkboxes

Key lessons incorporated:
- 'LGTM is the best outcome' was too permissive (missed real races)
- Verification must check PR branch, not main
- Concrete failing scenarios required (no hypotheticals)
- Thread interleaving timelines for concurrency findings
- Test snippet/pseudocode in inline feedback as evidence
- Balance: rigorous (catch real bugs) not nodder (rubber stamp),
  precise (concrete scenarios) not nitpicker (hypotheticals)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Strip lessons-learned framing. Pure directive instructions:
- Wave 1: Find (concrete scenarios only, no hypotheticals)
- Wave 2: Validate (code trace, build, test, thread timeline)
- Wave 3: Post (inline at file:line with proof)
- Wave 4: Summary (24-dim checkbox table)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Each dimension gets its own focused sub-agent instead of grouping
5 dimensions per agent. Run 24 agents in parallel batches of 6
(4 batches). Each agent evaluates exactly one dimension — better
focus, clearer LGTM/ISSUE signal, no cross-dimension interference.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Agent (expert-reviewer.md): full methodology, workflow, dimensions
- Skill (reviewing-msbuild-code): 10-line discovery trigger only
- DIMENSIONS.md: detailed checklists, referenced by agent
- Deleted redundant deep-review skill (was duplicating the agent)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Dimensions are inherent to the agent — separate file risks being
skipped. All 24 dimensions with CHECK flags now live in the agent.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…(when)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@T-Gro T-Gro marked this pull request as ready for review March 17, 2026 10:33
@T-Gro T-Gro requested a review from a team as a code owner March 17, 2026 10:33
Copilot AI review requested due to automatic review settings March 17, 2026 10:33
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds repo-scoped Copilot guidance artifacts under .github/: folder-targeted instruction files, topic-targeted skills, and an @expert-reviewer agent definition to standardize MSBuild PR review.

Changes:

  • Add multiple new skills covering SDK/MSBuild integration, performance, error/warning authoring, binary log compatibility, backwards-compatibility assessment, and a “reviewing-msbuild-code” entrypoint.
  • Add multiple folder-scoped instructions to guide edits in key MSBuild areas (BackEnd, Evaluation, Framework, Tasks/Targets, Shared, Logging, CLI, etc.).
  • Add an expert reviewer agent (@expert-reviewer) and update the existing ChangeWaves skill with a cross-reference to the breaking-changes assessment skill.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
.github/skills/sdk-msbuild-integration/SKILL.md New skill documenting SDK/MSBuild boundary patterns and evaluation/import ordering guidance.
.github/skills/reviewing-msbuild-code/SKILL.md New lightweight skill that points users to invoke @expert-reviewer.
.github/skills/msbuild-performance/SKILL.md New performance-focused guidance for hot paths, allocations, and collection/string practices.
.github/skills/error-and-warning-authoring/SKILL.md New guidance for authoring actionable diagnostics, MSB code assignment, localization, and WarnAsError impact.
.github/skills/changewaves/SKILL.md Adds cross-reference to the breaking-changes assessment skill.
.github/skills/binary-log-considerations/SKILL.md New guidance for binlog format stability, event serialization, and compatibility testing.
.github/skills/backwards-compatibility/SKILL.md New skill for assessing compatibility risk and when ChangeWaves are needed.
.github/instructions/tasks.instructions.md New folder-scoped guidance for built-in task authoring, compat, and path/logging patterns.
.github/instructions/targets.instructions.md New guidance for .targets/.props ordering, conditions, compat, and incremental build correctness.
.github/instructions/shared-code.instructions.md New guidance for shared linked code, IPC packet stability, and cross-assembly impact.
.github/instructions/logging.instructions.md New guidance for logging/binary log stability and logger behavior constraints.
.github/instructions/globbing.instructions.md New guidance for globbing semantics, performance, and cross-platform behavior.
.github/instructions/framework.instructions.md New guidance for public API discipline and compatibility expectations in Microsoft.Build.Framework.
.github/instructions/evaluation.instructions.md New guidance for evaluation model integrity and hot-path safety.
.github/instructions/cli.instructions.md New guidance for CLI switch stability, parsing, server mode, and output compatibility.
.github/instructions/buildcheck.instructions.md New guidance for analyzer authoring, codes, severity, and performance constraints.
.github/instructions/backend-execution.instructions.md New guidance for concurrency, IPC/versioning, scheduler correctness, and lifecycle constraints.
.github/agents/expert-reviewer.md New agent defining a 24-dimension review methodology and multi-wave workflow.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@MichalPavlik MichalPavlik merged commit 32d344d into dotnet:main Mar 18, 2026
10 checks passed
AR-May pushed a commit to AR-May/msbuild that referenced this pull request Mar 19, 2026
## What

Adds folder-scoped **instructions**, topic-scoped **skills**, and an
**expert review agent** for GitHub Copilot working in this repo.

## How it works

- **Instructions** (`.github/instructions/`) activate automatically when
editing files matching their `applyTo` glob. They provide
folder-specific guidance (e.g., concurrency rules for
`src/Build/BackEnd/`, evaluation model rules for
`src/Build/Evaluation/`).

- **Skills** (`.github/skills/`) activate on topic keywords in
conversation. They cover overarching concerns like backward
compatibility assessment, performance optimization, SDK integration,
error authoring, and binary log compatibility. The existing
`changewaves` skill gets a cross-reference to the new
`assessing-breaking-changes` skill (how vs when).

- **Review agent** (`.github/agents/expert-reviewer.md`) is invoked via
`@expert-reviewer`. It runs a 4-wave review process:
1. **Find** — one sub-agent per quality dimension (24 total), parallel
batches of 6. Each returns LGTM or a finding with exact file:line and a
concrete failing scenario.
2. **Validate** — proves each finding by tracing code flow in the PR
branch, writing proof-of-concept tests, or simulating thread timelines.
Multi-model consensus (Opus + Codex + Gemini) for borderline cases.
3. **Post** — inline review comments at exact diff locations via GitHub
CLI.
  4. **Summary** — 24-dimension checkbox table as review body.

## Example

Applied to [PR
dotnet#13350](dotnet#13350 (review))
— found a shared-state concurrency race, an unlogged exception, and a
null-deref edge case. 20 of 24 dimensions passed LGTM.

## Scope

No production code changes. Only `.github/` files (instructions, skills,
agent). One minor addition to the existing `changewaves` skill
(cross-reference line).

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants