chore: sync workflow templates #272

Merged
stranske merged 1 commit into main from sync/workflows-bc1620a3c7c4
Feb 26, 2026

Conversation

@stranske
Owner

Sync Summary

Files Updated

  • agents-auto-pilot.yml: Auto-pilot - end-to-end automation orchestrator (format → optimize → agent → verify)
  • keepalive_loop.js: Core keepalive loop logic

Files Skipped

  • pr-00-gate.yml: File exists and sync_mode is create_only
  • ci.yml: File exists and sync_mode is create_only
  • dependabot.yml: File exists and sync_mode is create_only
  • llm_slots.json: None
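
The skip reasons above suggest a per-file rule: a file marked `create_only` is written once and never overwritten afterwards. A minimal sketch of that decision, assuming a hypothetical `planSync` helper and an `overwrite` mode name (only `create_only` appears in this summary):

```javascript
// Hypothetical helper: decide what to do with one manifest entry.
// 'create_only' comes from the sync summary; 'overwrite' is an assumed
// name for the default mode and may not match the real manifest.
function planSync(entry, fileExists) {
  if (entry.sync_mode === 'create_only' && fileExists) {
    // Protect repo-specific customizations: never overwrite an existing file.
    return { action: 'skip', reason: 'File exists and sync_mode is create_only' };
  }
  return { action: fileExists ? 'update' : 'create', reason: null };
}

console.log(planSync({ path: 'ci.yml', sync_mode: 'create_only' }, true));
console.log(planSync({ path: 'keepalive_loop.js', sync_mode: 'overwrite' }, true));
```

This is the behavior the review checklist item about repo-specific customizations relies on.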

Review Checklist

  • CI passes with updated workflows
  • No repo-specific customizations were overwritten

Source: stranske/Workflows
Manifest: .github/sync-manifest.yml

Automated sync from stranske/Workflows
Template hash: bc1620a3c7c4

Changes synced from sync-manifest.yml
Copilot AI review requested due to automatic review settings February 26, 2026 18:59
@stranske added the sync and automated labels (both described as "Automated sync from Workflows") Feb 26, 2026
@stranske-keepalive
Contributor

⚠️ Action Required: Unable to determine source issue for PR #272. The PR title, branch name, or body must contain the issue number (e.g. #123, branch: issue-123, or the hidden marker ).
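
The warning above is what you would expect for a sync PR, since neither its title nor its branch name carries an issue reference. A rough sketch of such a lookup, assuming a hypothetical `findSourceIssue` helper and illustrative regular expressions (the hidden-marker form is left out because its text is not shown here):

```javascript
// Hypothetical lookup: try the title, then the branch, then the body.
// The patterns are illustrative guesses based on the examples in the warning
// ("#123" and "issue-123"); the bot's real rules may differ.
function findSourceIssue({ title = '', branch = '', body = '' }) {
  const candidates = [
    [title, /#(\d+)/],
    [branch, /(?:^|\/)issue-(\d+)/],
    [body, /#(\d+)/],
  ];
  for (const [text, pattern] of candidates) {
    const match = pattern.exec(text);
    if (match) return Number(match[1]);
  }
  return null; // nothing found: the caller would post the "Action Required" notice
}

console.log(findSourceIssue({
  title: 'chore: sync workflow templates',
  branch: 'sync/workflows-bc1620a3c7c4',
  body: 'Automated sync from stranske/Workflows',
}));
```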

@stranske-keepalive
Contributor

stranske-keepalive bot commented Feb 26, 2026

🤖 Keepalive Loop Status

PR #272 | Agent: Codex | Iteration 0/5

Current State

| Metric | Value |
| Iteration progress | [----------] 0/5 |
| Action | wait (missing-agent-label) |
| Disposition | skipped (transient) |
| Gate | success |
| Tasks | 0/6 complete |
| Timeout | 45 min (default) |
| Timeout usage | 9m elapsed (21%, 36m remaining) |
| Keepalive | ❌ disabled |
| Autofix | ❌ disabled |
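
The iteration progress cell is a fixed-width text bar. A small sketch of how such a bar could be rendered, assuming a hypothetical `renderProgress` helper and `#` as the fill character (only the empty bar appears in this status comment, so the fill character is a guess):

```javascript
// Hypothetical renderer for a fixed-width progress bar.
// The '#' fill character is an assumption; the status table only shows '-'.
function renderProgress(done, total, width = 10) {
  const ratio = total > 0 ? Math.min(Math.max(done / total, 0), 1) : 0;
  const filled = Math.round(ratio * width);
  return `[${'#'.repeat(filled)}${'-'.repeat(width - filled)}] ${done}/${total}`;
}

console.log(renderProgress(0, 5)); // [----------] 0/5
console.log(renderProgress(2, 5)); // [####------] 2/5
```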

🔍 Failure Classification

| Error type | infrastructure |
| Error category | resource |
| Suggested recovery | Confirm the referenced resource exists (repo, PR, branch, workflow, or file). |

@stranske-keepalive
Contributor

stranske-keepalive bot commented Feb 26, 2026

Keepalive Work Log

| # | Time (UTC) | Agent | Action | Result | Files | Tasks | Progress | Commit | Gate |
| 0 | 2026-02-26 19:01:18 | Codex | wait (missing-agent-label-transient) | skipped | 0 | 0/6 | | | |
| 0 | 2026-02-26 19:02:27 | Codex | wait (missing-agent-label-transient) | skipped | 0 | 0/6 | | | cancelled |
| 0 | 2026-02-26 19:09:52 | Codex | wait (missing-agent-label-transient) | skipped | 0 | 0/6 | | | success |

Contributor

Copilot AI left a comment

Pull request overview

This PR syncs workflow templates from the stranske/Workflows repository, updating the auto-pilot orchestration and keepalive loop logic. The changes enhance the reliability of belt dispatcher workflows by adding retry logic with verification, implementing automatic re-dispatch when dispatchers fail, and improving keepalive loop behavior after review actions.

Changes:

  • Enhanced belt dispatcher with 3-attempt retry loop and run verification to handle GitHub Actions silent cancellations (observed in issue #34)
  • Added automatic belt dispatcher re-dispatch in branch-check backoff logic when no active dispatcher is found
  • Reset rounds_without_task_completion counter to 0 after review actions to give agents a chance to act on feedback
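
The first change in the list, dispatch followed by a wait and a verification step, can be sketched roughly as below. This is not the synced workflow code: `dispatchWithVerification`, its injected `dispatch`, `findRun`, and `sleep` callbacks, and the "not cancelled" liveness check are all assumptions made for illustration. Only the 3 attempts and the 15s wait come from the review summary.

```javascript
// Sketch of a dispatch-then-verify retry loop (hypothetical helper).
// dispatch(): triggers the workflow; findRun(): looks up the resulting run;
// sleep(ms): waits. All three are injected so the loop can run offline.
async function dispatchWithVerification({
  dispatch,
  findRun,
  sleep,
  maxAttempts = 3,
  waitMs = 15000,
  log = console.log,
}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    const outlook = attempt < maxAttempts ? 'will retry' : 'no more attempts';
    await dispatch();
    await sleep(waitMs); // give GitHub time to register the run
    try {
      const run = await findRun();
      // Assumed liveness rule: a run exists and was not cancelled.
      if (run && run.conclusion !== 'cancelled') {
        return { ok: true, attempt, runId: run.id };
      }
      log(`Attempt ${attempt}: no live run found; ${outlook}.`);
    } catch (err) {
      log(`Attempt ${attempt}: verification failed (${err?.message}); ${outlook}.`);
    }
  }
  return { ok: false, attempt: maxAttempts, runId: null };
}

// Demo with stubs: the first two verifications find nothing, the third finds a run.
let demoCalls = 0;
dispatchWithVerification({
  dispatch: async () => { demoCalls += 1; },
  findRun: async () => (demoCalls < 3 ? null : { id: 1, status: 'queued' }),
  sleep: async () => {},
}).then((result) => console.log(result));
```

Injecting `sleep` keeps the 15s wait out of tests while leaving the production default untouched.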

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
.github/workflows/agents-auto-pilot.yml Added belt dispatcher retry logic with verification (3 attempts, 15s wait + verification per attempt) and re-dispatch logic in branch-check backoff to handle missing/cancelled dispatchers
.github/scripts/keepalive_loop.js Added special case to reset rounds_without_task_completion counter after review actions, matching the existing force_retry pattern
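
The keepalive change described in the table can be pictured as a small state transition. The sketch below is an assumption-heavy illustration, not the code in `keepalive_loop.js`: the `nextRoundsWithoutTaskCompletion` helper, the `'review'` action name, and the "increment when nothing completed" rule are all hypothetical. Only the counter name and the reset-after-review behavior come from the review.

```javascript
// Hypothetical transition for the stall counter.
// Assumed rules: reset on force retry, on a review action, or when any task
// completed this round; otherwise count one more round without progress.
function nextRoundsWithoutTaskCompletion(state, { action, tasksCompleted = 0, forceRetry = false }) {
  const previous = state.rounds_without_task_completion || 0;
  if (forceRetry || action === 'review') {
    return 0; // give the agent a fresh chance to act on the feedback
  }
  return tasksCompleted > 0 ? 0 : previous + 1;
}

const state = { rounds_without_task_completion: 2 };
console.log(nextRoundsWithoutTaskCompletion(state, { action: 'run' }));    // 3
console.log(nextRoundsWithoutTaskCompletion(state, { action: 'review' })); // 0
```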

Comment on lines +2375 to +2427
// Re-dispatch the belt if no recent dispatcher run is active
// for this issue. Only consider runs created in the last 30
// minutes to avoid matching stale runs for other issues.
let redispatched = false;
try {
  const { data: runs } = await withRetry((client) =>
    client.rest.actions.listWorkflowRuns({
      owner: context.repo.owner,
      repo: context.repo.repo,
      workflow_id: 'agents-71-codex-belt-dispatcher.yml',
      per_page: 10,
    })
  );
  const cutoff = new Date(Date.now() - 30 * 60 * 1000);
  const recentRuns = runs.workflow_runs.filter(
    r => new Date(r.created_at) >= cutoff &&
         r.event === 'workflow_dispatch'
  );
  const alive = recentRuns.find(
    r => r.status === 'queued' || r.status === 'in_progress'
  );
  if (!alive) {
    core.info(
      `No active belt dispatcher run in last 30m ` +
      `(${recentRuns.length} recent runs checked); re-dispatching`
    );
    const { data: repoInfo } = await withRetry((client) =>
      client.rest.repos.get({
        owner: context.repo.owner,
        repo: context.repo.repo,
      })
    );
    const dispatchRef = repoInfo.default_branch || 'main';
    await withRetry((client) =>
      client.rest.actions.createWorkflowDispatch({
        owner: context.repo.owner,
        repo: context.repo.repo,
        workflow_id: 'agents-71-codex-belt-dispatcher.yml',
        ref: dispatchRef,
        inputs: {
          agent_key: agentKey,
          force_issue: String(issueNumber),
          dry_run: 'false',
        },
      })
    );
    redispatched = true;
    core.info(`Re-dispatched belt for issue #${issueNumber}`);
  } else {
    core.info(
      `Belt dispatcher run ${alive.id} still ${alive.status}; skipping re-dispatch`
    );
  }

Copilot AI Feb 26, 2026

The re-dispatch logic checks for any active belt dispatcher run in the last 30 minutes but doesn't verify it's for the current issue. The comment at line 2376 states "Re-dispatch the belt if no recent dispatcher run is active for this issue", but the code at lines 2393-2395 checks for any run with status 'queued' or 'in_progress' without filtering by the issue number.

This could cause the re-dispatch to be skipped when another issue's belt dispatcher is running, even though the current issue might need its own dispatcher to be re-dispatched. Consider filtering the recentRuns to only include runs for the current issue, possibly by checking the workflow run's inputs or associated pull request.
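
One way to act on this suggestion is sketched below. As far as I know, the list-runs response does not expose `workflow_dispatch` inputs, so the sketch assumes the dispatcher workflow sets a `run-name` that includes `#<issue number>`, which then surfaces in each run's `display_title`. That naming convention and the `findActiveRunForIssue` helper are hypothetical and not part of this PR.

```javascript
// Hypothetical filter: find a queued or in-progress dispatcher run for one issue.
// Assumes the dispatcher's run-name contains "#<issue>" so that it appears in
// display_title; without that convention this filter would match nothing.
function findActiveRunForIssue(runs, issueNumber, now = Date.now(), windowMs = 30 * 60 * 1000) {
  const cutoff = now - windowMs;
  // \b prevents issue 34 from matching "#345".
  const marker = new RegExp(`#${issueNumber}\\b`);
  return runs.find(
    (r) =>
      new Date(r.created_at).getTime() >= cutoff &&
      r.event === 'workflow_dispatch' &&
      (r.status === 'queued' || r.status === 'in_progress') &&
      marker.test(r.display_title || '')
  ) || null;
}

const now = Date.parse('2026-02-26T19:10:00Z');
const sampleRuns = [
  { id: 1, created_at: '2026-02-26T19:05:00Z', event: 'workflow_dispatch', status: 'in_progress', display_title: 'Belt dispatch #345' },
  { id: 2, created_at: '2026-02-26T19:06:00Z', event: 'workflow_dispatch', status: 'queued', display_title: 'Belt dispatch #34' },
];
console.log(findActiveRunForIssue(sampleRuns, 34, now)?.id); // 2
```

With this filter, a running dispatcher for a different issue no longer suppresses the re-dispatch for the current one.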

Copilot uses AI. Check for mistakes.
} catch (checkError) {
  core.warning(
    `Could not verify dispatcher run after attempt ${attempt}: ` +
    `${checkError?.message}; status unknown, will retry.`

Copilot AI Feb 26, 2026

The error message at line 1965 always says "will retry" regardless of which attempt it is. On the final attempt (attempt 3), this message is misleading because there won't be another retry. Consider making this message conditional like the one at line 1960, showing either "will retry" or "no more attempts" based on whether attempt is less than maxDispatchAttempts.

Suggested change
- `${checkError?.message}; status unknown, will retry.`
+ `${checkError?.message}; ` +
+ (attempt < maxDispatchAttempts ? 'status unknown, will retry.' : 'status unknown, no more attempts.')

@stranske stranske merged commit fff8455 into main Feb 26, 2026
108 of 117 checks passed