fix(sync): Use OWNER_PR_PAT for consumer repo sync #234
SERVICE_BOT_PAT doesn't have push access to stranske/Template. Switch to OWNER_PR_PAT, which should have access to all owner repos. Fallback order: OWNER_PR_PAT -> SERVICE_BOT_PAT
Pull request overview
This PR fixes a 403 authorization error in the consumer repository sync workflow by switching the primary authentication token from CODESPACES_WORKFLOWS to OWNER_PR_PAT. SERVICE_BOT_PAT is retained as the fallback.
Key Changes:
- Updated the primary sync token from CODESPACES_WORKFLOWS to OWNER_PR_PAT to provide access to user-owned repositories like stranske/Template
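To illustrate the fallback order, here is a minimal JavaScript sketch of how an expression such as `secrets.OWNER_PR_PAT || secrets.SERVICE_BOT_PAT` resolves. It assumes the documented GitHub Actions behavior that an unset secret evaluates to an empty string; the helper `pickSyncToken` and the token values are hypothetical and exist only for this example.

```javascript
// Hypothetical helper, not part of the workflow: mimics the `a || b`
// fallback used for SYNC_TOKEN in the workflow expression.
function pickSyncToken(secrets) {
  // An unset secret reads as "", which is falsy, so `||` moves on
  // to the next candidate.
  return secrets.OWNER_PR_PAT || secrets.SERVICE_BOT_PAT || "";
}

// Both secrets configured: the primary token wins.
const withOwner = pickSyncToken({
  OWNER_PR_PAT: "owner-token",
  SERVICE_BOT_PAT: "bot-token",
});

// Primary secret not configured: the fallback is used.
const withoutOwner = pickSyncToken({
  OWNER_PR_PAT: "",
  SERVICE_BOT_PAT: "bot-token",
});

console.log(withOwner, withoutOwner);
```

Note that the fallback is chosen only when the first secret is empty. If OWNER_PR_PAT is set but lacks push access to a repository, the expression still selects it and the push would still fail; nothing retries with SERVICE_BOT_PAT.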
  matrix: ${{ fromJson(needs.prepare.outputs.repos) }}
  env:
-   SYNC_TOKEN: ${{ secrets.CODESPACES_WORKFLOWS || secrets.SERVICE_BOT_PAT }}
+   SYNC_TOKEN: ${{ secrets.OWNER_PR_PAT || secrets.SERVICE_BOT_PAT }}
Consider updating the similar sync workflows (maint-65-sync-label-docs.yml and maint-69-sync-integration-repo.yml) to use the same token hierarchy. Both currently use CODESPACES_WORKFLOWS || SERVICE_BOT_PAT and target some of the same repositories (e.g., stranske/Template), so they could hit the same 403 error this PR addresses. Updating them would keep token usage consistent across all repository sync workflows.
Automated Status Summary
Head SHA: deaf670
Coverage Overview
Coverage Trend
Updated automatically; will refresh on subsequent CI/Docker completions.
Keepalive checklist
Scope
No scope information available
Tasks
Acceptance criteria
🤖 Keepalive Loop Status
PR #234 | Agent: Codex | Iteration 0/5
Current State
🔍 Failure Classification
| Error type | infrastructure |
…loop
- Use nullish coalescing (??) instead of logical OR (||) for tasksTotal in work-log table rows so that 0 displays as "0" instead of "?"
- Use previousState?.iteration ?? iteration instead of bare iteration in rounds_without_task_completion recalculation to stay consistent with the "current persisted iteration" rule (line 2739-2741)
Both fixes address review feedback from Copilot on Counter_Risk PR #234.
https://claude.ai/code/session_012WnYCcttvFEY3FETnhVcNL
…loop (#1649)
- Use nullish coalescing (??) instead of logical OR (||) for tasksTotal in work-log table rows so that 0 displays as "0" instead of "?"
- Use previousState?.iteration ?? iteration instead of bare iteration in rounds_without_task_completion recalculation to stay consistent with the "current persisted iteration" rule (line 2739-2741)
Both fixes address review feedback from Copilot on Counter_Risk PR #234.
https://claude.ai/code/session_012WnYCcttvFEY3FETnhVcNL
Co-authored-by: Claude <noreply@anthropic.com>
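The difference between `||` and `??` described in these commits can be seen in a small, self-contained sketch. The function names below are hypothetical and are not taken from keepalive_loop.js; only the operators and the expression `previousState?.iteration ?? iteration` come from the commit message.

```javascript
// Hypothetical formatter using logical OR: treats every falsy value,
// including a legitimate 0, as "missing".
function formatTasksTotalWithOr(tasksTotal) {
  return String(tasksTotal || "?");
}

// Hypothetical formatter using nullish coalescing: only null and
// undefined fall back to "?".
function formatTasksTotal(tasksTotal) {
  return String(tasksTotal ?? "?");
}

// Prefer the persisted iteration when a previous state exists, even if
// that iteration is 0; otherwise use the in-memory value.
function currentIteration(previousState, iteration) {
  return previousState?.iteration ?? iteration;
}

console.log(formatTasksTotalWithOr(0)); // "?"
console.log(formatTasksTotal(0)); // "0"
console.log(formatTasksTotal(undefined)); // "?"
console.log(currentIteration({ iteration: 0 }, 3)); // 0
console.log(currentIteration(null, 3)); // 3
```

The same distinction explains why `||` is still a reasonable choice for the token fallback in the workflow: there, an empty string really does mean "not configured".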
* fix: correct tasksTotal falsy check and stale iteration in keepalive loop

  - Use nullish coalescing (??) instead of logical OR (||) for tasksTotal in work-log table rows so that 0 displays as "0" instead of "?"
  - Use previousState?.iteration ?? iteration instead of bare iteration in rounds_without_task_completion recalculation to stay consistent with the "current persisted iteration" rule (line 2739-2741)

  Both fixes address review feedback from Copilot on Counter_Risk PR #234.

  https://claude.ai/code/session_012WnYCcttvFEY3FETnhVcNL

* fix: prevent keepalive from stopping without fixing CI failures

  When all tasks were complete but Gate was failing (e.g., lint-ruff), the keepalive loop would stop after `complete_gate_failure_rounds` reached its max, without giving fix attempts a fair chance. Three interacting issues caused this:

  1. The `complete-gate-failure-max` check fired BEFORE the fix classification logic in the decision tree, blocking fix attempts once the counter reached the max.
  2. Transient gate states (cancelled) incremented the counter even though they don't represent actual fix failures, consuming the fix budget with infrastructure noise.
  3. The `consecutive_fix_rounds` counter was reset on wait/stop actions, losing track of prior fix attempts.

  Changes:
  - Restructure evaluate decision tree: handle cancelled gates first (without consuming fix budget), then try fix before stopping when all tasks complete and gate failing
  - Only increment complete_gate_failure_rounds on actual agent execution rounds (fix/run/conflict), not on wait/skip/stop
  - Preserve consecutive_fix_rounds across wait/stop/defer actions (only reset on non-fix agent execution)
  - Increase default completeGateFailureMax from 2 to 3, allowing 2 fix attempts before stopping
  - Add 10 focused tests for counter behavior

  Fixes the issue seen in Counter_Risk PR #235 where the agent completed all 27 tasks but stopped with complete-gate-failure-max despite lint-ruff failures that could have been fixed.

  https://claude.ai/code/session_012WnYCcttvFEY3FETnhVcNL

* chore: sync template scripts

* fix: address review comments on PR #1651

  1. Fix infinite wait loop for non-fixable gate failures: remove `isAgentExecution` requirement from counter increment; the `gateActuallyFailed` check already filters transient states (cancelled/pending), so the counter advances on every genuine failure round regardless of action type.
  2. Sync template keepalive_loop.js with all main file changes:
     - Restructured evaluate decision tree (cancelled → allComplete → remaining)
     - Updated counter logic (increment on actual failure, preserve on non-success)
     - Fix round preservation across wait/stop/defer actions
     - Default completeGateFailureMax 2 → 3
  3. Add test for the non-fixable wait+failure scenario; update stop-action test expectation to match new counter semantics.

  https://claude.ai/code/session_012WnYCcttvFEY3FETnhVcNL

* fix: update test to expect gate-cancelled-transient reason

  The cancelled gate test expected 'gate-cancelled' but the code now returns 'gate-cancelled-transient' for non-rate-limit cancellations. Updated the assertion to match.

  https://claude.ai/code/session_012WnYCcttvFEY3FETnhVcNL

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
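The final counter semantics described above (increment on every genuine failure round, leave the counter alone for transient gate states) might look roughly like the sketch below. It is not the code from keepalive_loop.js: the function names, the gate conclusion strings, the reset on success, and the `rounds >= max` stop condition are all assumptions made for illustration. Only the names `gateActuallyFailed` and `completeGateFailureMax`, the transient states (cancelled/pending), and the default of 3 come from the commit messages.

```javascript
// Assumption: gate conclusions are plain strings; cancelled and pending
// are the transient states named in the commit message.
const TRANSIENT_GATE_STATES = new Set(["cancelled", "pending"]);

// Hypothetical update rule for complete_gate_failure_rounds.
function nextCompleteGateFailureRounds(previousRounds, gateConclusion, allTasksComplete) {
  if (gateConclusion === "success") {
    return 0; // assumption: a passing gate clears the counter
  }
  const gateActuallyFailed = !TRANSIENT_GATE_STATES.has(gateConclusion);
  if (allTasksComplete && gateActuallyFailed) {
    return previousRounds + 1; // genuine failure round, any action type
  }
  return previousRounds; // transient state: preserve, do not consume budget
}

// Assumption: the loop stops once the counter reaches the max (default 3).
function shouldStop(rounds, completeGateFailureMax = 3) {
  return rounds >= completeGateFailureMax;
}

// Replay a sequence of gate results with all tasks complete.
let rounds = 0;
const history = [];
for (const conclusion of ["failure", "cancelled", "failure", "failure"]) {
  rounds = nextCompleteGateFailureRounds(rounds, conclusion, true);
  history.push(rounds);
}
console.log(history, shouldStop(rounds));
```

Under these assumptions the cancelled round leaves the counter unchanged, so the loop still gets two fix attempts before the third genuine failure triggers a stop.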
Summary
The sync workflow was failing for stranske/Template with a 403 error because SERVICE_BOT_PAT doesn't have push access to that repo.
Fix
Switch from CODESPACES_WORKFLOWS || SERVICE_BOT_PAT to OWNER_PR_PAT || SERVICE_BOT_PAT. OWNER_PR_PAT should have access to all repos owned by the user, including Template.
Testing
After merge, re-run Maint 68 workflow and verify Template gets a sync PR.