feat: add agents-verifier to consumer repo sync #194

Merged
stranske merged 1 commit into main from feat/add-verifier-to-consumer-sync
Dec 26, 2025

@stranske
Owner

@stranske stranske commented Dec 26, 2025

Automated Status Summary

Scope

  • The verifier workflow was only in the Workflows repo, but it makes more sense in consumer repos because:
      - Original issues are created in the consumer repo
      - PRs run in the consumer repo
      - Follow-up issues should be created in the consumer repo

Usage

Consumer repos need to:

  1. Copy agents-verifier.yml to their .github/workflows/
  2. Ensure CODEX_AUTH_JSON secret is set

The workflow automatically triggers when PRs are merged.
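
The first usage step can be sketched in Python. This is a minimal illustration under stated assumptions, not the sync mechanism this repository uses: the function name `install_verifier` and its two directory arguments are hypothetical, and only the file name `agents-verifier.yml` and the `.github/workflows/` target come from the description above. The `CODEX_AUTH_JSON` secret is a repository setting and cannot be checked from a local checkout, so step 2 is not covered.

```python
import shutil
from pathlib import Path

WORKFLOW_NAME = "agents-verifier.yml"


def install_verifier(template_dir: Path, consumer_repo: Path) -> Path:
    """Copy the verifier workflow into a consumer repo checkout (usage step 1).

    Both arguments are hypothetical: a directory holding the template file
    and the root of a local consumer repo checkout.
    """
    source = template_dir / WORKFLOW_NAME
    if not source.is_file():
        raise FileNotFoundError(f"template not found: {source}")

    # Create .github/workflows/ if the consumer repo does not have it yet.
    target_dir = consumer_repo / ".github" / "workflows"
    target_dir.mkdir(parents=True, exist_ok=True)

    target = target_dir / WORKFLOW_NAME
    shutil.copyfile(source, target)
    return target
```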

Tasks

  • Tasks section missing from source issue.

Acceptance criteria

  • Acceptance criteria section missing from source issue.

Head SHA: db7e685
Latest Runs: ❔ in progress — Gate
Required: gate: ❔ in progress

| Workflow / Job | Result | Logs |
| --- | --- | --- |
| Agents PR meta manager | ❔ in progress | View run |
| CI Autofix Loop | ✅ success | View run |
| Copilot code review | ❔ in progress | View run |
| Gate | ❔ in progress | View run |
| Health 40 Sweep | ❔ in progress | View run |
| Health 44 Gate Branch Protection | ✅ success | View run |
| Health 45 Agents Guard | ✅ success | View run |
| Health 50 Security Scan | ❔ in progress | View run |
| Maint 52 Validate Workflows | ✅ success | View run |
| PR 11 - Minimal invariant CI | ✅ success | View run |
| Selftest CI | ❔ in progress | View run |

Changes:
- Add verifier to SYNC_FILES for automatic consumer repo sync
- Update template to use 'secrets: inherit' instead of explicit secret
Copilot AI review requested due to automatic review settings December 26, 2025 20:03
@stranske stranske temporarily deployed to agent-high-privilege December 26, 2025 20:03 — with GitHub Actions Inactive
@agents-workflows-bot
Contributor

⚠️ Action Required: Unable to determine source issue for PR #194. The PR title, branch name, or body must contain the issue number (e.g. #123, branch: issue-123, or the hidden marker ).
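
A rough sketch of the lookup the bot describes, assuming it scans the branch name, title, and body for an issue reference. The function name and both regular expressions are illustrative guesses, not the bot's actual rules, and the hidden marker is not handled because its format is not shown in this thread.

```python
import re
from typing import Optional

# Illustrative patterns only; the bot's real matching rules are not shown here.
_HASH_REF = re.compile(r"#(\d+)")
_BRANCH_REF = re.compile(r"issue-(\d+)")


def find_source_issue(title: str, branch: str, body: str) -> Optional[int]:
    """Return the first issue number found in the branch, title, or body."""
    match = _BRANCH_REF.search(branch)
    if match:
        return int(match.group(1))
    for text in (title, body):
        match = _HASH_REF.search(text)
        if match:
            return int(match.group(1))
    return None
```

Under these assumptions, a branch named `feat/add-verifier-to-consumer-sync` with no `#N` reference in the title or body yields `None`, which corresponds to the warning above.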

@github-actions
Contributor

Automated Status Summary

Head SHA: f1d949c
Latest Runs: ⏳ pending — Gate
Required contexts: Gate / gate, Health 45 Agents Guard / Enforce agents workflow protections
Required: core tests (3.11): ⏳ pending, core tests (3.12): ⏳ pending, docker smoke: ⏳ pending, gate: ⏳ pending

Workflow / Job Result Logs
(no jobs reported) ⏳ pending

Coverage Overview

  • Coverage history entries: 1

Coverage Trend

| Metric | Value |
| --- | --- |
| Current | 77.97% |
| Baseline | 0.00% |
| Delta | +77.97% |
| Minimum | 70.00% |
| Status | ✅ Pass |

Updated automatically; will refresh on subsequent CI/Docker completions.
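
The coverage figures above are consistent with a simple calculation: delta is current minus baseline, and the status passes when current coverage meets the minimum. The sketch below assumes exactly that rule; the function name is hypothetical and the real summary script may compute the status differently.

```python
def coverage_status(current: float, baseline: float, minimum: float) -> tuple[float, bool]:
    """Return (delta, passed) for coverage percentages.

    Assumes delta = current - baseline and that a run passes when
    current >= minimum; both rules are inferred from the table, not confirmed.
    """
    delta = round(current - baseline, 2)
    return delta, current >= minimum
```

With the reported values, `coverage_status(77.97, 0.0, 70.0)` gives a delta of 77.97 and a passing status.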


Keepalive checklist

Scope

No scope information available

Tasks

  • No tasks defined

Acceptance criteria

  • No acceptance criteria defined

@stranske stranske merged commit fe7ac54 into main Dec 26, 2025
56 checks passed
@stranske stranske deleted the feat/add-verifier-to-consumer-sync branch December 26, 2025 20:05
Contributor

Copilot AI left a comment

Pull request overview

This PR adds the agents-verifier workflow to the consumer repository synchronization system and updates its template to use the cleaner secrets: inherit syntax.

Key Changes:

  • Added agents-verifier.yml to the sync file list for automated distribution to consumer repos
  • Updated the template to use secrets: inherit instead of explicitly passing individual secrets

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| templates/consumer-repo/.github/workflows/agents-verifier.yml | Updated to use secrets: inherit for cleaner secret passing to reusable workflow |
| .github/workflows/maint-68-sync-consumer-repos.yml | Added agents-verifier.yml to SYNC_FILES array for automatic synchronization |

"agents-orchestrator.yml:agents-70-orchestrator.yml"
"agents-pr-meta.yml:agents-pr-meta.yml"
"agents-keepalive-loop.yml:agents-keepalive-loop.yml"
"agents-verifier.yml:agents-verifier.yml"

Copilot AI Dec 26, 2025

The new entry is not placed in alphabetical order within the SYNC_FILES array. The array should be sorted alphabetically by filename. Currently the order is:

  • agents-orchestrator.yml
  • agents-orchestrator.yml (duplicate mapping)
  • agents-pr-meta.yml
  • agents-keepalive-loop.yml
  • agents-verifier.yml (new entry)

The correct alphabetical order should be:

  • agents-keepalive-loop.yml
  • agents-orchestrator.yml (both mappings)
  • agents-pr-meta.yml
  • agents-verifier.yml

Move the agents-verifier.yml entry to maintain alphabetical ordering, which improves maintainability and makes it easier to locate entries in the array.
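
The ordering the reviewer asks for can be checked mechanically. The sketch below treats each SYNC_FILES entry as a `name:name` string, as in the quoted snippet, and sorts on the two halves; the function name is hypothetical, and whether entry order matters to the sync script is not stated in this thread.

```python
def sort_sync_files(entries: list[str]) -> list[str]:
    """Sort "first:second" entries by the first name, then the second."""
    return sorted(entries, key=lambda entry: tuple(entry.split(":", 1)))


# The four entries quoted in the review comment, in their current order.
current = [
    "agents-orchestrator.yml:agents-70-orchestrator.yml",
    "agents-pr-meta.yml:agents-pr-meta.yml",
    "agents-keepalive-loop.yml:agents-keepalive-loop.yml",
    "agents-verifier.yml:agents-verifier.yml",
]
ordered = sort_sync_files(current)
```

Here `ordered` starts with the keepalive-loop entry and ends with the verifier entry, matching the order proposed in the comment.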

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Comment on lines 113 to 117
"agents-orchestrator.yml:agents-70-orchestrator.yml"
"agents-pr-meta.yml:agents-pr-meta.yml"
"agents-keepalive-loop.yml:agents-keepalive-loop.yml"
"agents-verifier.yml:agents-verifier.yml"
"autofix.yml:autofix.yml"

P1: New verifier never synced to consumers

The verifier thin-caller was added to the sync detection list, but the apply step still copies only the orchestrator, pr-meta, keepalive, autofix, and pr-00 workflows. SYNC_TEMPLATES has no agents-verifier.yml entry, so even when differences are detected, the copy step does nothing and the PR creation stage sees no changes. Consumer repos therefore will not receive or update the verifier workflow, which defeats the intended sync.
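
A guard against this kind of drift could compare the two lists before the sync runs. The sketch below assumes SYNC_FILES and SYNC_TEMPLATES hold entries in the same string format, which this thread does not confirm; the function name is hypothetical and the template list shown is only the set of workflows the review names, with placeholder destination names.

```python
def missing_from_templates(sync_files: list[str], sync_templates: list[str]) -> list[str]:
    """Return SYNC_FILES entries that have no matching SYNC_TEMPLATES entry.

    Assumes both lists use identical entry strings (an assumption).
    """
    known = set(sync_templates)
    return [entry for entry in sync_files if entry not in known]


sync_files = [
    "agents-orchestrator.yml:agents-70-orchestrator.yml",
    "agents-pr-meta.yml:agents-pr-meta.yml",
    "agents-keepalive-loop.yml:agents-keepalive-loop.yml",
    "agents-verifier.yml:agents-verifier.yml",
    "autofix.yml:autofix.yml",
]
# Hypothetical apply-step list that lacks the verifier entry.
sync_templates = [entry for entry in sync_files if not entry.startswith("agents-verifier")]

missing = missing_from_templates(sync_files, sync_templates)
```

A non-empty `missing` list would flag the situation described in the review: a file is detected as changed but never copied.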

@github-actions
Contributor

🤖 Keepalive Loop Status

PR #194 | Agent: Codex | Iteration 0/5

Current State

| Metric | Value |
| --- | --- |
| Iteration progress | [----------] 0/5 |
| Action | wait (missing-agent-label) |
| Gate | success |
| Tasks | 0/2 complete |
| Keepalive | ❌ disabled |
| Autofix | ❌ disabled |
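
The iteration progress value can be reproduced with a small formatter. This is a sketch only: the function name is hypothetical, the bar width of 10 is read off the status above, and the fill character `#` is an assumption because the thread only shows an empty bar.

```python
def render_progress(done: int, total: int, width: int = 10) -> str:
    """Render a fixed-width text progress bar such as "[----------] 0/5"."""
    if total <= 0:
        raise ValueError("total must be positive")
    done = max(0, min(done, total))  # clamp to the valid range
    filled = (done * width) // total
    # "#" for completed segments is an assumed character.
    return f"[{'#' * filled}{'-' * (width - filled)}] {done}/{total}"
```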

🔍 Failure Classification

| Error type | infrastructure |
| Error category | resource |
| Suggested recovery | Confirm the referenced resource exists (repo, PR, branch, workflow, or file). |

stranske added a commit that referenced this pull request Jan 7, 2026
Created 3 test issues:
- #193: Stripe integration (should FAIL capability check)
- #194: Health monitoring (should trigger task decomposition)
- #196: Manager list API (should detect as duplicate of #133)

Updated testing metrics dashboard to track progress.
stranske added a commit that referenced this pull request Jan 7, 2026
* Add Phase 3 integration plan with testing cycle for Manager-Database

Phase 3: Pre-Agent Intelligence (4 capabilities)
- 3A: Capability Check - supplements agents:optimize with feasibility gate
  - Runs BEFORE agent assignment on Issues (not after)
  - Adds needs-human label when agent cannot proceed
- 3B: Task Decomposition - auto-split large issues
- 3C: Duplicate Detection - comment-only mode, track false positives
- 3D: Semantic Labeling - auto-suggest/apply labels

Testing Plan:
- Test repo: Manager-Database
- ~11 test issues across 4 capabilities
- False positive tracking for dedup (target: <5%)
- Metrics dashboard for validation

Also updates:
- Mark Collab-Admin PR #113 as merged (7/7 repos now synced)
- All immediate tasks completed
- Phase 3 ready to begin

* Add Phase 3 test issues for Manager-Database

Created 3 test issues:
- #193: Stripe integration (should FAIL capability check)
- #194: Health monitoring (should trigger task decomposition)
- #196: Manager list API (should detect as duplicate of #133)

Updated testing metrics dashboard to track progress.

* Add Phase 4: Full Automation & Cleanup plan

Phase 4 includes 5 initiatives:
- 4A: Label Cleanup - Remove bloat labels, standardize across 7 repos
- 4B: User Guide - Operational documentation for label system (sync to consumers)
- 4C: Auto-Pilot Label - End-to-end issue-to-merged-PR automation
- 4D: Conflict Resolution - Automated merge conflict handling in keepalive
- 4E: Verify-to-Issue - Create follow-up issues from verification feedback

Key decisions:
- Auto-pilot uses workflow_dispatch between steps (not chained labels)
- Conflict detection added to keepalive loop (not separate workflow)
- Verify-to-issue is user-triggered (not automatic, avoids false positives)

Also identifies 7 additional automation opportunities for future phases.

Testing plan defined for Manager-Database.

* Correct label analysis after codebase search + expand Phase 5

Label Analysis Corrections:
- agents:pause/paused ARE functional (keepalive_gate.js, keepalive-runner.js)
- agents:activated IS functional (agents_pr_meta_keepalive.js)
- from:codex/copilot ARE functional (merge_manager.js)
- automerge IS functional (merge_manager.js, agents_belt_scan.js)
- agents (bare) IS functional (agent_task.yml template)
- risk:low, ci:green, codex-ready ARE functional (merge_manager.js, issue templates)

Only 5-6 labels confirmed as bloat:
- codex (bare) - redundant with agent:codex
- ai:agent - zero matches
- auto-merge-audit - zero matches
- automerge:ok - zero matches
- architecture, backend, cli, etc. - repo-specific, not synced

Phase 5 Analysis:
- 5A: Auto-labeling - label_matcher.py EXISTS, ready for workflow
- 5B: Coverage check - maint-coverage-guard.yml EXISTS, add soft PR check
- 5C: Stale PR cleanup - not needed
- 5D: Dependabot - partial (auto-label exists, add auto-merge)
- 5E: Issue lint - soft warning approach
- 5F: Cross-repo linking - weekly scan with semantic_matcher.py
- 5G: Metrics - hybrid LangSmith (LLM) + custom (workflow)

* Consolidate agents:pause to agents:paused + expand Phase 4-5 plans

Label consolidation:
- Replace agents:pause with agents:paused in all source files
- Update keepalive_gate.js PAUSE_LABEL constant
- Update keepalive_orchestrator_gate_runner.js hardcoded check
- Update test to use agents:paused
- Update documentation in README, CLAUDE.md, GoalsAndPlumbing.md

Phase 4 updates:
- 4A: Add idiosyncratic repo bloat cleanup strategy (per-repo audit)
- 4B: Add optional issue creation feature to user guide (deferred)
- 4D: Full conflict resolution implementation with code examples
- 4E: Complete verify-to-issue workflow implementation

Phase 5 updates:
- 5F: Marked as SKIPPED (not needed per user decision)
- 5G: Full LangSmith integration plan + custom metrics

All keepalive tests pass (8/8).
stranske added a commit that referenced this pull request Jan 7, 2026
* feat: Implement Phase 4-5 automation features

Phase 4 implementations:
- 4A: Add scripts/cleanup_labels.py for label auditing
  - Classifies labels as functional/bloat/idiosyncratic
  - Requires --confirm flag for actual deletion
  - Reports audit results with recommendations

- 4D: Add conflict detection for keepalive pipeline
  - .github/scripts/conflict_detector.js module
  - Detects conflicts from GitHub API, CI logs, PR comments
  - templates/consumer-repo/.github/codex/prompts/fix_merge_conflicts.md

- 4E: Add agents-verify-to-issue.yml workflow
  - Creates follow-up issues from verification feedback
  - User-triggered via verify:create-issue label
  - Extracts concerns and low scores automatically

Phase 5 implementations:
- 5A: Add agents-auto-label.yml workflow
  - Semantic label matching for new issues
  - 90% threshold for auto-apply, 75% for suggestions
  - Uses existing label_matcher.py script

- 5G: Add LangSmith tracing to tools/llm_provider.py
  - _setup_langsmith_tracing() function
  - Auto-configures when LANGSMITH_API_KEY present

Also:
- Update .github/sync-manifest.yml with new sync entries
- Update docs/LABELS.md with new label documentation
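
The two thresholds named for 5A (90% to auto-apply, 75% to suggest) can be illustrated as follows. The sketch assumes similarity scores in the range 0 to 1 and inclusive thresholds; it does not use or reproduce the interface of label_matcher.py, and all names below are hypothetical.

```python
AUTO_APPLY_THRESHOLD = 0.90  # from the commit message: 90% for auto-apply
SUGGEST_THRESHOLD = 0.75     # from the commit message: 75% for suggestions


def classify_label_matches(scores: dict[str, float]) -> tuple[list[str], list[str]]:
    """Split labels into (auto_apply, suggest) lists by similarity score.

    Inclusive comparisons are an assumption; labels below both thresholds
    are dropped.
    """
    auto_apply: list[str] = []
    suggest: list[str] = []
    for label, score in sorted(scores.items()):
        if score >= AUTO_APPLY_THRESHOLD:
            auto_apply.append(label)
        elif score >= SUGGEST_THRESHOLD:
            suggest.append(label)
    return auto_apply, suggest
```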

* fix: Address Copilot review comments on PR #650

Code quality improvements based on automated code review:

1. tools/llm_provider.py:
   - Fix LangSmith API key env var (LANGSMITH_API_KEY vs LANGCHAIN_API_KEY)
   - Improve f-string formatting for logging
   - Add usage comment for LANGSMITH_ENABLED constant

2. .github/scripts/conflict_detector.js:
   - Add debug logging in catch blocks instead of silent failures
   - Makes debugging easier when log downloads fail

3. .github/workflows/agents-verify-to-issue.yml:
   - Replace /tmp file usage with GitHub Actions environment files
   - Use heredoc delimiter for multi-line output
   - Consolidate find and extract steps for cleaner flow

4. .github/workflows/agents-auto-label.yml:
   - Make Workflows repo checkout configurable (not hardcoded)
   - Use github.paginate() for label retrieval (handles >100 labels)

5. templates/consumer-repo/.github/codex/prompts/fix_merge_conflicts.md:
   - Replace hardcoded 'main' with {{base_branch}} template variable
   - Make verification steps language-agnostic (not Python-specific)
   - Add note about checking project README for test commands
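
Item 3 above refers to GitHub Actions environment files and a heredoc-style delimiter for multi-line output. A minimal sketch of that format, written in Python rather than the shell the workflow likely uses: the function name is hypothetical, and the random delimiter is one common way to avoid collisions with the value.

```python
import uuid


def write_multiline_output(name: str, value: str, output_path: str) -> None:
    """Append a multi-line step output using the name<<DELIMITER format.

    In a workflow, output_path would be the file named by the GITHUB_OUTPUT
    environment variable.
    """
    # A random delimiter makes an accidental match inside the value unlikely.
    delimiter = f"EOF_{uuid.uuid4().hex}"
    with open(output_path, "a", encoding="utf-8") as handle:
        handle.write(f"{name}<<{delimiter}\n{value}\n{delimiter}\n")
```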

* fix: Fix CI test failures

1. Fix test_integration_template_installs_and_tests
   - The test used --user pip install flag which fails in virtualenvs
   - Added _in_virtualenv() helper to detect virtualenv environment
   - Only use --user flag when NOT in a virtualenv

2. Add new workflows to expected names mapping
   - agents-auto-label.yml: 'Auto-Label Issues'
   - agents-verify-to-issue.yml: 'Create Issue from Verification'

3. Update workflow documentation
   - docs/ci/WORKFLOWS.md: Added bullet points for new workflows
   - docs/ci/WORKFLOW_SYSTEM.md: Added table rows for new workflows

All 1120 tests now pass.
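
The virtualenv fix in item 1 can be sketched as below. The commit names a helper `_in_virtualenv()` but does not show its body, so this version is a common detection idiom offered as an assumption, and `pip_install_args` is a hypothetical companion function.

```python
import sys


def in_virtualenv() -> bool:
    """Best-effort check for an active virtual environment.

    venv sets sys.base_prefix to differ from sys.prefix; older virtualenv
    releases set sys.real_prefix instead.
    """
    return hasattr(sys, "real_prefix") or sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def pip_install_args(package: str, in_venv: bool) -> list[str]:
    """Build a pip install command, adding --user only outside a virtualenv."""
    args = [sys.executable, "-m", "pip", "install"]
    if not in_venv:
        args.append("--user")
    args.append(package)
    return args
```

A caller would pass `in_virtualenv()` as the second argument, so the `--user` flag is skipped in the environment where pip rejects it.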

* fix: Remove duplicate env key in agents-auto-label.yml

actionlint was failing because the Match labels step had two env blocks.
Merged ISSUE_TITLE and ISSUE_BODY into the main env block.
stranske added a commit that referenced this pull request Jan 8, 2026
…#653)

* feat: Add Phase 3 workflows and sync configuration

Phase 3 Pre-Agent Intelligence workflows:
- agents-capability-check.yml: Pre-flight agent feasibility gate
- agents-decompose.yml: Task decomposition for large issues
- agents-dedup.yml: Duplicate detection using embeddings
- agents-auto-label.yml: Semantic label matching

Also includes:
- agents-verify-to-issue.yml: Create follow-up issues from verification (Phase 4E)
- Updated sync-manifest.yml with all new workflow entries
- pr_verifier.py: Auth error fallback for LLM provider resilience
- Tests for fallback behavior

All Phase 3 scripts have 129 tests passing.

* docs: Add comprehensive Phase 3 testing plan

- Mark all Phase 3 implementation tasks as complete
- Add detailed test suite with 12 specific test cases:
  - Suite A: Capability Check (3 tests)
  - Suite B: Task Decomposition (3 tests)
  - Suite C: Duplicate Detection (4 tests)
  - Suite D: Auto-Label (2 tests)
- Include pre-testing checklist and execution tracking table
- Add rollback plan and success criteria
- Include sample issue bodies for reproducible tests

* docs: Add deployment verification plan for cross-repo testing

Addresses known issue: verify:compare works on Travel-Plan-Permission
but fails on Trend_Model_Project PR #4249.

New deployment verification plan includes:
- Phase 1: Sync deployment tracking across all 7 repos
- Phase 2: Existing workflow verification (investigate failures)
- Phase 3: New workflow verification with specific test cases
- Phase 4: Troubleshooting guide for common issues
- Cross-repo verification summary with minimum pass criteria

Separates deployment verification from functional regression testing.

* docs: Resolve verify:compare investigation - PR not merged (expected behavior)

Investigation findings for Trend_Model_Project PR #4249:
- Root cause: PR is OPEN, not merged
- Verifier correctly skipped (designed for merged PRs only)
- verify:* labels missing in most repos (only Travel-Plan-Permission has them)
- Added label prerequisite checklist to deployment plan
- Updated verification summary with resolved status
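
The skip behaviour found in the investigation (the verifier runs only for merged PRs) corresponds to a check like the one below. It assumes a `pull_request` webhook payload, where a merge appears as a `closed` action with `pull_request.merged` set to true; the function name is hypothetical and the verifier's actual condition is not shown in this thread.

```python
def should_run_verifier(event: dict) -> bool:
    """Return True only for a pull_request event that closed via merge."""
    pull_request = event.get("pull_request") or {}
    return event.get("action") == "closed" and pull_request.get("merged") is True
```

An open PR, such as the Trend_Model_Project case above, fails this check, so a skipped run is the expected result.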