fix: improve Codex coverage verification to prevent timeouts #386

Merged
stranske merged 2 commits into main from fix/codex-coverage-verification on Dec 31, 2025

@stranske
Owner

@stranske stranske commented Dec 31, 2025

Automated Status Summary

Scope

Test coverage is at 71.60% with 17 scripts below 95% coverage. Low coverage makes scripts risky to modify and harder to maintain.

Tasks

  • IMPORTANT: A task is NOT complete until you run pytest tests/ --cov=scripts --cov-report=term-missing and verify the script shows ≥95% in the Cover column. Do not mark a task complete without verifying the actual coverage percentage.
  • Add tests for scripts/sync_tool_versions.py until coverage ≥95% (currently 0.00%)
  • Add tests for scripts/update_residual_history.py until coverage ≥95% (currently 0.00%)
  • Add tests for scripts/validate_version_pins.py until coverage ≥95% (currently 0.00%)
  • Add tests for scripts/sync_test_dependencies.py until coverage ≥95% (currently 15.32%)
  • Add tests for scripts/auto_type_hygiene.py until coverage ≥95% (currently 34.78%)
  • Add tests for scripts/keepalive_metrics_collector.py until coverage ≥95% (currently 46.48%)
  • Add tests for scripts/keepalive_metrics_dashboard.py until coverage ≥95% (currently 56.67%)
  • Add tests for scripts/workflow_health_check.py until coverage ≥95% (currently 62.62%)
  • Add tests for scripts/classify_test_failures.py until coverage ≥95% (currently 62.87%)
  • Add tests for scripts/mypy_autofix.py until coverage ≥95% (currently 63.08%)
  • Add tests for scripts/ledger_validate.py until coverage ≥95% (currently 65.32%)
  • Add tests for scripts/mypy_return_autofix.py until coverage ≥95% (currently 82.55%)
  • Add tests for scripts/ledger_migrate_base.py until coverage ≥95% (currently 85.48%)
  • Add tests for scripts/ci_failure_analyzer.py until coverage ≥95% (currently 87.35%)
  • Add tests for scripts/fix_cosmetic_aggregate.py until coverage ≥95% (currently 92.31%)
  • Add tests for scripts/coverage_history_append.py until coverage ≥95% (currently 92.75%)
  • Add tests for scripts/workflow_validator.py until coverage ≥95% (currently 93.27%)

Acceptance criteria

  • Before marking ANY task complete, you MUST:
      1. Run pytest tests/ --cov=scripts --cov-report=term-missing
      2. Find the script in the output table
      3. Confirm the Cover column shows ≥95%
      4. Only then mark that specific task complete
  • Overall coverage ≥95%
  • Each script in scripts/ shows ≥95% in coverage output
  • All existing tests pass
  • New tests in tests/scripts/ directory
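
The verification steps above (run the coverage command, find the script's row, confirm the Cover column) can also be checked mechanically. The following is a minimal sketch and not part of this PR: the helper names `cover_for` and `meets_threshold` are hypothetical, and the sample row uses illustrative statement counts rather than real output.

```python
import re
from typing import Optional

# Matches one data row of a "term-missing" coverage table, for example:
#   scripts/foo.py     120     30    75%   10-20, 33
# One or more integer columns (Stmts, Miss, and optionally Branch/BrPart)
# are allowed before the percentage.
ROW = re.compile(r"^(?P<name>\S+)\s+(?:\d+\s+)+(?P<cover>\d+(?:\.\d+)?)%")


def cover_for(report: str, script: str) -> Optional[float]:
    """Return the Cover percentage for `script`, or None if it is not listed."""
    for line in report.splitlines():
        match = ROW.match(line)
        if match and match.group("name") == script:
            return float(match.group("cover"))
    return None


def meets_threshold(report: str, script: str, threshold: float = 95.0) -> bool:
    """True only when the script appears in the report at or above threshold."""
    cover = cover_for(report, script)
    return cover is not None and cover >= threshold


# Illustrative sample only: the Stmts/Miss/Missing values are made up.
sample = """\
Name                            Stmts   Miss   Cover   Missing
--------------------------------------------------------------
scripts/workflow_validator.py     104      7  93.27%   12-15, 88-90
"""

print(meets_threshold(sample, "scripts/workflow_validator.py"))  # False
```

A script that is absent from the report is treated as failing, so a typo in the path cannot silently pass the check.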

Head SHA: 19f0595
Latest Runs: ❔ in progress — Gate
Required: gate: ❔ in progress

Workflow / Job | Result | Logs
Agents PR meta manager | ❔ in progress | View run
CI Autofix Loop | ❔ in progress | View run
Copilot code review | ❔ in progress | View run
Gate | ❔ in progress | View run
Health 40 Sweep | ✅ success | View run
Health 44 Gate Branch Protection | ❔ in progress | View run
Health 45 Agents Guard | ✅ success | View run
Health 50 Security Scan | ❔ in progress | View run
Maint 52 Validate Workflows | ✅ success | View run
PR 11 - Minimal invariant CI | ✅ success | View run
Selftest CI | ❔ in progress | View run
Validate Sync Manifest | ✅ success | View run

- Add @slow marker to 3 integration tests that run external tools
- Update all keepalive prompts to use targeted coverage verification:
  - Use -m 'not slow' to skip slow tests
  - Use --cov=scripts/specific_module for faster feedback
  - Run specific test file when available
- Prevents Codex timeout issues during coverage verification
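
As an illustration of the first bullet, a slow marker in pytest is usually applied with `pytest.mark`. This is a minimal sketch under stated assumptions: the decorator spelling `@pytest.mark.slow`, the test name, and the placement of the registration hook in `conftest.py` are not confirmed by this PR.

```python
import pytest


def pytest_configure(config):
    # Register the marker so that `--strict-markers` does not reject it.
    # In a real project this hook would normally live in conftest.py (assumed).
    config.addinivalue_line(
        "markers", "slow: integration test that runs external tools"
    )


@pytest.mark.slow  # assumed spelling of the "@slow" marker from the description
def test_pipeline_with_external_tools():
    # Hypothetical placeholder body; the real tests invoke external tools.
    assert True
```

With a marker like this in place, `pytest -m 'not slow'` deselects the marked tests, which is what the updated prompts rely on.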
Copilot AI review requested due to automatic review settings December 31, 2025 14:44
@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@github-actions github-actions bot added the autofix Opt-in automated formatting & lint remediation label Dec 31, 2025
@agents-workflows-bot
Contributor

⚠️ Action Required: Unable to determine source issue for PR #386. The PR title, branch name, or body must contain the issue number (e.g. #123, branch: issue-123, or the hidden marker ).

@github-actions
Contributor

github-actions bot commented Dec 31, 2025

Status | ✅ no new diagnostics
History points | 1
Timestamp | 2025-12-31 14:46:03 UTC
Report artifact | autofix-report-pr-386
Remaining | 0
New | 0
No additional artifacts

Contributor

Copilot AI left a comment

Pull request overview

This PR prevents CI timeout issues during Codex coverage verification by optimizing test execution strategies. The solution involves marking slow integration tests for selective exclusion and providing Codex with targeted coverage verification instructions to avoid running the full test suite unnecessarily.

Key Changes:

  • Added @slow markers to three integration tests that invoke external tools (ruff, isort, black, mypy)
  • Updated coverage verification instructions across all Codex prompt templates to use targeted coverage checks and skip slow tests
  • Introduced pytest options (-m "not slow", -x, targeted --cov) to reduce execution time
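
The options listed above can be combined into one targeted invocation per script. The sketch below is hypothetical and not taken from the changed prompt files: the function name `targeted_coverage_command` and the example test file path are assumptions, and the `--cov=scripts/<module>` form mirrors the one quoted in the PR description.

```python
import shlex
from typing import List, Optional


def targeted_coverage_command(
    module: str, test_file: Optional[str] = None
) -> List[str]:
    """Build a pytest command that measures coverage for a single script."""
    # Run the dedicated test file when one is known; otherwise fall back
    # to the whole tests/ tree.
    target = test_file if test_file is not None else "tests/"
    return [
        "pytest",
        target,
        f"--cov=scripts/{module}",    # limit coverage measurement to one module
        "--cov-report=term-missing",  # show uncovered line numbers
        "-m", "not slow",             # skip tests marked as slow
        "-x",                         # stop at the first failure
    ]


cmd = targeted_coverage_command(
    "workflow_validator",
    test_file="tests/scripts/test_workflow_validator.py",  # assumed file name
)
print(shlex.join(cmd))
# pytest tests/scripts/test_workflow_validator.py --cov=scripts/workflow_validator --cov-report=term-missing -m 'not slow' -x
```

Returning a list rather than a string keeps the command safe to pass to `subprocess.run` without shell quoting concerns.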

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

File | Description
tests/workflows/test_autofix_pipeline_diverse.py | Added @slow marker to integration test that handles diverse autofix scenarios
tests/workflows/test_autofix_pipeline.py | Added @slow marker to integration test for trivial ruff issues
tests/workflows/test_autofix_full_pipeline.py | Added @slow marker to integration test for full lint and typing pipeline
.github/templates/keepalive-instruction.md | Updated coverage verification instructions with targeted pytest commands and slow test exclusion
.github/codex/prompts/keepalive_next_task.md | Updated coverage verification instructions with targeted pytest commands and slow test exclusion
templates/consumer-repo/.github/templates/keepalive-instruction.md | Updated coverage verification instructions with targeted pytest commands and slow test exclusion (template)
templates/consumer-repo/.github/codex/prompts/keepalive_next_task.md | Updated coverage verification instructions with targeted pytest commands and slow test exclusion (template)
agents/codex-prompt.md | Updated coverage verification instructions with targeted pytest commands and slow test exclusion

@stranske stranske merged commit ffdf1d5 into main Dec 31, 2025
23 checks passed
@stranske stranske deleted the fix/codex-coverage-verification branch December 31, 2025 14:48
@github-actions
Contributor

github-actions bot commented Dec 31, 2025

Automated Status Summary

Head SHA: 590246a
Latest Runs: ⏳ pending — Gate
Required contexts: Gate / gate, Health 45 Agents Guard / Enforce agents workflow protections
Required: core tests (3.11): ⏳ pending, core tests (3.12): ⏳ pending, docker smoke: ⏳ pending, gate: ⏳ pending

Workflow / Job Result Logs
(no jobs reported) ⏳ pending

Coverage Overview

  • Coverage history entries: 1

Coverage Trend

Metric | Value
Current | 0.00%
Baseline | 85.00%
Delta | -85.00%
Minimum | 70.00%
Status | ❌ Below minimum
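
The Delta and Status rows follow from the other three values: Delta is Current minus Baseline, and the status is "Below minimum" when Current is under Minimum. A minimal sketch of that arithmetic follows; the class name `CoverageTrend` and the "OK" label for the passing case are hypothetical, since only the failing label appears in this report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoverageTrend:
    current: float   # latest measured coverage, in percent
    baseline: float  # reference coverage to compare against, in percent
    minimum: float   # lowest acceptable coverage, in percent

    @property
    def delta(self) -> float:
        # Negative values mean coverage dropped relative to the baseline.
        return round(self.current - self.baseline, 2)

    @property
    def status(self) -> str:
        # "OK" is an assumed label; the report only shows the failing case.
        return "Below minimum" if self.current < self.minimum else "OK"


trend = CoverageTrend(current=0.00, baseline=85.00, minimum=70.00)
print(f"{trend.delta:.2f}% {trend.status}")  # -85.00% Below minimum
```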

Updated automatically; will refresh on subsequent CI/Docker completions.


@github-actions
Contributor

github-actions bot commented Dec 31, 2025

🤖 Keepalive Loop Status

PR #386 | Agent: Codex | Iteration 0/5

Current State

Metric | Value
Iteration progress | [----------] 0/5
Action | wait (missing-agent-label)
Gate | success
Tasks | 20/27 complete
Keepalive | ❌ disabled
Autofix | ❌ disabled

🔍 Failure Classification

| Error type | infrastructure |
| Error category | resource |
| Suggested recovery | Confirm the referenced resource exists (repo, PR, branch, workflow, or file). |
