feat: pulse CI self-healing — re-run stale checks after workflow fixes merge#2981
feat: pulse CI self-healing — re-run stale checks after workflow fixes merge#2981marcusquinn merged 1 commit intomainfrom
Conversation
…ow fixes merge The previous CI pattern detection (GH#2975) could detect systemic failures and file issues, but existing PRs retained stale failed/cancelled checks even after the workflow bug was fixed on main. New workflow code only runs on new events, so pre-fix PRs stay UNSTABLE indefinitely. Added self-healing logic: after detecting a systemic pattern AND confirming the fix has merged (closed issue with merged PR, or the check passes on recent PRs), the pulse re-runs the stale failed workflow runs on affected PRs. This completes the detect-fix-heal cycle. Guard rails: only re-run when fix is confirmed on main, only the specific failed run (not all checks), max 10 re-runs per cycle, don't re-run if the re-run itself fails.
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Fri Mar 6 05:02:41 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
|



Summary
Completes the detect-fix-heal cycle for systemic CI failures by adding self-healing to the pulse's CI pattern detection section.
Problem
PR #2976 added detection (identify systemic CI patterns) and PR #2974 fixed the specific bugs. But existing PRs retained stale failed/cancelled check results because the new workflow code only runs on new events. Without manual intervention (
gh run rerun), those PRs stay UNSTABLE indefinitely — the pulse could detect the pattern was fixed but couldn't heal the affected PRs.Fix
After detecting a systemic CI failure pattern AND confirming the fix has merged (closed issue with merged PR, or the check passes on recently-created PRs), the pulse now re-runs the stale failed workflow runs on affected PRs using
gh run rerun <run_id>.Guard rails:
The complete cycle
Testing
This is agent guidance (markdown), not executable code. Validated by reviewing the logic against the GH#2973 scenario — the pulse would have detected the pattern, filed an issue, a worker would fix the workflow, and on the next cycle the pulse would re-run the 9 stale checks automatically.