fix(keepalive): prevent race condition from resetting iteration counter #129
The summary job was using inputs.iteration (from evaluate job at workflow START) instead of reading the current state's iteration. When multiple runs overlap or a later run's evaluate runs before an earlier run's summary saves, stale iteration values overwrite newer ones. Now reads iteration from the current persisted state before calculating nextIteration, ensuring we never lose iteration progress due to timing.
Automated Status Summary
Head SHA: 2afab0f
Coverage Overview
Coverage Trend
Updated automatically; will refresh on subsequent CI/Docker completions.
Keepalive checklist
Scope
No scope information available
Tasks
Acceptance criteria
🤖 Keepalive Loop Status
PR #129 | Agent: Codex | Iteration 0/5
Current State
Pull request overview
This PR fixes a race condition in the keepalive loop's iteration counter where stale iteration values from the evaluate job could overwrite newer values in persisted state. When multiple workflow runs overlap, the evaluate job captures state at workflow start, but by the time the summary job runs, another workflow may have already incremented the iteration, causing the counter to reset incorrectly.
Key Changes:
- Modified updateKeepaliveLoopSummary to read iteration from persisted state instead of trusting potentially stale input values
- Added comprehensive test coverage for the race condition scenario
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| .github/scripts/keepalive_loop.js | Updated updateKeepaliveLoopSummary to prioritize previousState.iteration over inputs.iteration, preventing stale values from overwriting current state |
| .github/scripts/tests/keepalive-loop.test.js | Added test case that simulates race condition with stale inputs but current state, verifying iteration is preserved from state |
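The change to updateKeepaliveLoopSummary described in the table can be sketched roughly as follows. This is a minimal illustration of the priority order only, not the function's actual body: the helper names toCount, resolveCurrentIteration, and computeNextIteration are hypothetical, and the sketch assumes that a successful run advances the counter by exactly one.

```javascript
// Hypothetical helpers; they illustrate the priority order only and are
// not the actual body of updateKeepaliveLoopSummary.

// Parse a non-negative integer, or return null when the value is unusable.
function toCount(value) {
  if (typeof value !== 'number' && typeof value !== 'string') return null;
  if (typeof value === 'string' && value.trim() === '') return null;
  const n = Number(value);
  return Number.isInteger(n) && n >= 0 ? n : null;
}

// Persisted state wins; inputs.iteration is only a fallback.
function resolveCurrentIteration(previousState, inputs) {
  return toCount(previousState?.iteration) ?? toCount(inputs?.iteration) ?? 0;
}

// Assumption: a successful run advances the counter by exactly one.
function computeNextIteration(previousState, inputs) {
  return resolveCurrentIteration(previousState, inputs) + 1;
}

// Stale inputs (captured at workflow start) versus newer persisted state.
console.log(computeNextIteration({ iteration: 2 }, { iteration: '0' })); // 3
// No persisted iteration yet: fall back to the input value.
console.log(computeNextIteration({}, { iteration: '0' })); // 1
```

With the earlier ordering (inputs first), the first call would have produced 1 and overwritten the persisted value of 2.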
Problem
PR #124's iteration counter was showing 1 after two successful Codex runs. The keepalive loop summary showed iteration:0 in state despite multiple successful runs completing.
Root Cause
The updateKeepaliveLoopSummary function used inputs.iteration (from the evaluate job at workflow START) instead of reading the current state's iteration. When multiple runs overlap or a later run's evaluate job runs before an earlier run's summary saves, stale iteration values overwrite newer ones.
In the logs, you could see:
The evaluate job reads state at the START of the workflow, but by the time the summary job runs, another workflow may have already incremented the iteration.
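The timing problem can be reproduced with a small in-memory model. The sketch below is an assumption-laden simplification: persisted, evaluate, and summarize are hypothetical stand-ins for the saved loop state and the two jobs, and the summary step uses the pre-fix logic of trusting the value captured at workflow start.

```javascript
// Pre-fix behaviour, reduced to an in-memory model (all names hypothetical).
function simulateOverlappingRuns() {
  const persisted = { iteration: 0 }; // stand-in for the saved loop state

  // evaluate job: snapshots the iteration when its workflow run starts.
  const evaluate = () => ({ iteration: persisted.iteration });

  // summary job (old logic): trusts the snapshot instead of re-reading state.
  const summarize = (inputs) => {
    persisted.iteration = inputs.iteration + 1;
  };

  const runA = evaluate(); // run A starts and sees 0
  const runB = evaluate(); // run B starts before A's summary saves: also 0
  summarize(runA); // A saves 1
  summarize(runB); // B saves 1 again, so A's increment is lost

  return persisted.iteration;
}

console.log(simulateOverlappingRuns()); // 1, although two runs succeeded
```

The final value of 1 after two successful runs matches the symptom reported for PR #124.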
Solution
The summary job now reads the iteration from the current persisted state (previousState.iteration) before calculating nextIteration, rather than trusting the potentially-stale inputs.iteration.
Testing
Related
Discovered while investigating #124 iteration counter not incrementing
Automated Status Summary
Scope
GITHUB_STEP_SUMMARY output so iteration results are visible in the Actions UI
Tasks
agent:codex label
agents-keepalive-loop.yml after agent run
buildStatusBlock() in agents_pr_meta_update_body.js to accept agentType parameter
agentType is set (CLI agent): hide workflow table, hide head SHA/required checks
agent:* label):
<!-- gate-summary: --> comment posting (use step summary instead)
<!-- keepalive-round: N --> instruction comments (task appendix replaces this)
<!-- keepalive-loop-summary --> to be the single source of truth
agent:* label):
<!-- gate-summary: --> comment
agent_type output to detect job so downstream workflows know the mode
agents-pr-meta.yml to conditionally skip gate summary for CLI agent PRs
Acceptance criteria
Head SHA: e16dbd9
Latest Runs: ✅ success — Gate
Required: gate: ✅ success