fix(sync): Use OWNER_PR_PAT for consumer repo sync #234

Merged
stranske merged 1 commit into main from fix/sync-use-github-app
Dec 28, 2025
@stranske
Owner

Summary

The sync workflow was failing for stranske/Template with a 403 error because SERVICE_BOT_PAT doesn't have push access to that repo.

Fix

Switch from CODESPACES_WORKFLOWS || SERVICE_BOT_PAT to OWNER_PR_PAT || SERVICE_BOT_PAT.

OWNER_PR_PAT should have access to all repos owned by the user, including Template.

Testing

After merge, re-run Maint 68 workflow and verify Template gets a sync PR.

SERVICE_BOT_PAT doesn't have push access to stranske/Template.
Switch to OWNER_PR_PAT which should have access to all owner repos.

Fallback order: OWNER_PR_PAT -> SERVICE_BOT_PAT
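
The fallback order above can be sketched in JavaScript, whose `||` operator behaves like the one in GitHub Actions expressions: the first truthy operand wins, and a secret that is not set evaluates to an empty string. The helper name `pickSyncToken` and the plain `secrets` object are illustrative assumptions, not part of the workflow.

```javascript
// Hypothetical helper mirroring `secrets.OWNER_PR_PAT || secrets.SERVICE_BOT_PAT`.
// `secrets` is a plain object standing in for the workflow's secrets context.
function pickSyncToken(secrets) {
  // An unset or empty secret is falsy, so evaluation moves on to the next operand.
  return secrets.OWNER_PR_PAT || secrets.SERVICE_BOT_PAT || '';
}

// The primary token is used whenever it is set.
console.log(pickSyncToken({ OWNER_PR_PAT: 'owner-token', SERVICE_BOT_PAT: 'bot-token' }));
// The fallback is used only when the primary is unset or empty.
console.log(pickSyncToken({ OWNER_PR_PAT: '', SERVICE_BOT_PAT: 'bot-token' }));
```

Note that the fallback is chosen only when the first secret is unset or empty. A token that is set but lacks push access is still selected, so a 403 from the chosen token does not trigger the fallback.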
Copilot AI review requested due to automatic review settings December 28, 2025 00:15
@agents-workflows-bot
Contributor

⚠️ Action Required: Unable to determine source issue for PR #234. The PR title, branch name, or body must contain the issue number (e.g. #123, branch: issue-123, or the hidden marker ).

Contributor

Copilot AI left a comment

Pull request overview

This PR fixes a 403 authorization error in the consumer repository sync workflow by switching the primary authentication token from CODESPACES_WORKFLOWS to OWNER_PR_PAT. SERVICE_BOT_PAT is retained as a fallback.

Key Changes:

  • Updated the primary sync token from CODESPACES_WORKFLOWS to OWNER_PR_PAT to provide access to user-owned repositories like stranske/Template

  matrix: ${{ fromJson(needs.prepare.outputs.repos) }}
  env:
-   SYNC_TOKEN: ${{ secrets.CODESPACES_WORKFLOWS || secrets.SERVICE_BOT_PAT }}
+   SYNC_TOKEN: ${{ secrets.OWNER_PR_PAT || secrets.SERVICE_BOT_PAT }}
Copilot AI Dec 28, 2025

Consider updating similar sync workflows (maint-65-sync-label-docs.yml and maint-69-sync-integration-repo.yml) to use the same token hierarchy. Both of these workflows currently use CODESPACES_WORKFLOWS || SERVICE_BOT_PAT and target some of the same repositories (e.g., stranske/Template), which could face the same 403 error issue this PR is addressing. This would ensure consistent token usage across all repository sync workflows.

@github-actions
Contributor

Automated Status Summary

Head SHA: deaf670
Latest Runs: ⏳ pending — Gate
Required contexts: Gate / gate, Health 45 Agents Guard / Enforce agents workflow protections
Required: core tests (3.11): ⏳ pending, core tests (3.12): ⏳ pending, docker smoke: ⏳ pending, gate: ⏳ pending

| Workflow / Job | Result | Logs |
|---|---|---|
| (no jobs reported) | ⏳ pending | |

Coverage Overview

  • Coverage history entries: 1

Coverage Trend

| Metric | Value |
|---|---|
| Current | 77.97% |
| Baseline | 0.00% |
| Delta | +77.97% |
| Minimum | 70.00% |
| Status | ✅ Pass |

Updated automatically; will refresh on subsequent CI/Docker completions.


Keepalive checklist

Scope

No scope information available

Tasks

  • No tasks defined

Acceptance criteria

  • No acceptance criteria defined

@github-actions
Contributor

🤖 Keepalive Loop Status

PR #234 | Agent: Codex | Iteration 0/5

Current State

| Metric | Value |
|---|---|
| Iteration progress | [----------] 0/5 |
| Action | wait (missing-agent-label) |
| Gate | success |
| Tasks | 0/0 complete |
| Keepalive | ❌ disabled |
| Autofix | ❌ disabled |

🔍 Failure Classification

| Error type | infrastructure |
| Error category | resource |
| Suggested recovery | Confirm the referenced resource exists (repo, PR, branch, workflow, or file). |

stranske merged commit 0c78414 into main on Dec 28, 2025
42 checks passed
stranske deleted the fix/sync-use-github-app branch on December 28, 2025 00:17
stranske pushed a commit that referenced this pull request Feb 24, 2026
…loop

- Use nullish coalescing (??) instead of logical OR (||) for tasksTotal
  in work-log table rows so that 0 displays as "0" instead of "?"
- Use previousState?.iteration ?? iteration instead of bare iteration
  in rounds_without_task_completion recalculation to stay consistent
  with the "current persisted iteration" rule (line 2739-2741)

Both fixes address review feedback from Copilot on Counter_Risk PR #234.
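
A minimal sketch of the two fixes described in this commit message. The function names are illustrative assumptions. Only `tasksTotal`, `previousState`, and `iteration` come from the message, and the real code in the keepalive loop may be structured differently.

```javascript
// Buggy form: `||` treats 0 as falsy, so a total of zero tasks renders as "?".
function formatTasksTotalWithOr(tasksTotal) {
  return String(tasksTotal || '?');
}

// Fixed form: `??` falls back only when the value is null or undefined.
function formatTasksTotal(tasksTotal) {
  return String(tasksTotal ?? '?');
}

// Prefer the persisted iteration when one exists, even if it is 0;
// otherwise fall back to the bare iteration value.
function currentPersistedIteration(previousState, iteration) {
  return previousState?.iteration ?? iteration;
}

console.log(formatTasksTotalWithOr(0)); // "?"
console.log(formatTasksTotal(0)); // "0"
console.log(currentPersistedIteration({ iteration: 0 }, 3)); // 0
console.log(currentPersistedIteration(undefined, 3)); // 3
```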

https://claude.ai/code/session_012WnYCcttvFEY3FETnhVcNL
stranske added a commit that referenced this pull request Feb 24, 2026
…loop (#1649)

- Use nullish coalescing (??) instead of logical OR (||) for tasksTotal
  in work-log table rows so that 0 displays as "0" instead of "?"
- Use previousState?.iteration ?? iteration instead of bare iteration
  in rounds_without_task_completion recalculation to stay consistent
  with the "current persisted iteration" rule (line 2739-2741)

Both fixes address review feedback from Copilot on Counter_Risk PR #234.

https://claude.ai/code/session_012WnYCcttvFEY3FETnhVcNL

Co-authored-by: Claude <noreply@anthropic.com>
stranske added a commit that referenced this pull request Feb 24, 2026
* fix: correct tasksTotal falsy check and stale iteration in keepalive loop

- Use nullish coalescing (??) instead of logical OR (||) for tasksTotal
  in work-log table rows so that 0 displays as "0" instead of "?"
- Use previousState?.iteration ?? iteration instead of bare iteration
  in rounds_without_task_completion recalculation to stay consistent
  with the "current persisted iteration" rule (line 2739-2741)

Both fixes address review feedback from Copilot on Counter_Risk PR #234.

https://claude.ai/code/session_012WnYCcttvFEY3FETnhVcNL

* fix: prevent keepalive from stopping without fixing CI failures

When all tasks were complete but Gate was failing (e.g., lint-ruff),
the keepalive loop would stop after `complete_gate_failure_rounds`
reached its max, without giving fix attempts a fair chance. Three
interacting issues caused this:

1. The `complete-gate-failure-max` check fired BEFORE the fix
   classification logic in the decision tree, blocking fix attempts
   once the counter reached the max.

2. Transient gate states (cancelled) incremented the counter even
   though they don't represent actual fix failures, consuming the
   fix budget with infrastructure noise.

3. The `consecutive_fix_rounds` counter was reset on wait/stop
   actions, losing track of prior fix attempts.

Changes:
- Restructure evaluate decision tree: handle cancelled gates first
  (without consuming fix budget), then try fix before stopping when
  all tasks complete and gate failing
- Only increment complete_gate_failure_rounds on actual agent
  execution rounds (fix/run/conflict), not on wait/skip/stop
- Preserve consecutive_fix_rounds across wait/stop/defer actions
  (only reset on non-fix agent execution)
- Increase default completeGateFailureMax from 2 to 3, allowing
  2 fix attempts before stopping
- Add 10 focused tests for counter behavior
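
The reordered decision tree described in the changes above might look roughly like the sketch below. This is an illustration under stated assumptions, not the code in keepalive_loop.js: the function shape, the input field names, and every reason string except `complete-gate-failure-max` are hypothetical.

```javascript
// Default taken from the commit message (raised from 2 to 3).
const DEFAULT_COMPLETE_GATE_FAILURE_MAX = 3;

// Hypothetical evaluate(): the field names and most reason strings are assumptions.
function evaluate({
  gateConclusion,
  allTasksComplete,
  failureIsFixable,
  completeGateFailureRounds = 0,
  completeGateFailureMax = DEFAULT_COMPLETE_GATE_FAILURE_MAX,
}) {
  // 1. Cancelled gates are handled first and do not consume fix budget.
  if (gateConclusion === 'cancelled') {
    return { action: 'wait', reason: 'gate-cancelled' };
  }

  // 2. All tasks complete but the gate is failing: try a fix before stopping.
  if (allTasksComplete && gateConclusion === 'failure') {
    if (failureIsFixable && completeGateFailureRounds < completeGateFailureMax) {
      return { action: 'fix', reason: 'gate-failure-fixable' };
    }
    if (completeGateFailureRounds >= completeGateFailureMax) {
      return { action: 'stop', reason: 'complete-gate-failure-max' };
    }
    return { action: 'wait', reason: 'gate-failure-not-fixable' };
  }

  // 3. Remaining cases: keep working while tasks are open, otherwise stop.
  return allTasksComplete
    ? { action: 'stop', reason: 'tasks-complete' }
    : { action: 'run', reason: 'tasks-remaining' };
}

console.log(evaluate({ gateConclusion: 'failure', allTasksComplete: true, failureIsFixable: true, completeGateFailureRounds: 1 }));
```

The point of the ordering is that the stop check no longer preempts a fix attempt that still has budget left.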

Fixes the issue seen in Counter_Risk PR #235 where the agent
completed all 27 tasks but stopped with complete-gate-failure-max
despite lint-ruff failures that could have been fixed.

https://claude.ai/code/session_012WnYCcttvFEY3FETnhVcNL

* chore: sync template scripts

* fix: address review comments on PR #1651

1. Fix infinite wait loop for non-fixable gate failures: remove
   `isAgentExecution` requirement from counter increment — the
   `gateActuallyFailed` check already filters transient states
   (cancelled/pending), so the counter advances on every genuine
   failure round regardless of action type.

2. Sync template keepalive_loop.js with all main file changes:
   - Restructured evaluate decision tree (cancelled → allComplete → remaining)
   - Updated counter logic (increment on actual failure, preserve on non-success)
   - Fix round preservation across wait/stop/defer actions
   - Default completeGateFailureMax 2 → 3

3. Add test for the non-fixable wait+failure scenario; update
   stop-action test expectation to match new counter semantics.
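
The counter semantics described in items 1 and 2 above can be sketched as follows. This is an assumption-laden illustration rather than the actual implementation: `updateCounters` and its input shape are hypothetical, the camelCase fields stand in for `complete_gate_failure_rounds` and `consecutive_fix_rounds`, and resetting the failure counter on a successful gate is inferred from "preserve on non-success".

```javascript
// Transient gate states named in the commit message.
const TRANSIENT_GATE_STATES = new Set(['cancelled', 'pending']);
// Actions across which the fix-round counter is preserved.
const FIX_PRESERVING_ACTIONS = new Set(['wait', 'stop', 'defer']);

// Hypothetical counter update for one keepalive round.
function updateCounters(previous, { gateConclusion, action }) {
  // A genuine failure is any non-success conclusion that is not transient.
  const gateActuallyFailed =
    gateConclusion !== 'success' && !TRANSIENT_GATE_STATES.has(gateConclusion);

  let completeGateFailureRounds = previous.completeGateFailureRounds ?? 0;
  if (gateActuallyFailed) {
    // Advances on every genuine failure round, regardless of action type.
    completeGateFailureRounds += 1;
  } else if (gateConclusion === 'success') {
    // Assumption: a passing gate clears the failure counter.
    completeGateFailureRounds = 0;
  }
  // Transient states leave the counter unchanged.

  let consecutiveFixRounds = previous.consecutiveFixRounds ?? 0;
  if (action === 'fix') {
    consecutiveFixRounds += 1;
  } else if (!FIX_PRESERVING_ACTIONS.has(action)) {
    // Only a non-fix agent execution (for example 'run') resets the count.
    consecutiveFixRounds = 0;
  }

  return { completeGateFailureRounds, consecutiveFixRounds };
}

const before = { completeGateFailureRounds: 1, consecutiveFixRounds: 1 };
console.log(updateCounters(before, { gateConclusion: 'failure', action: 'wait' }));
console.log(updateCounters(before, { gateConclusion: 'cancelled', action: 'wait' }));
```

Because a wait round with a genuinely failing gate still advances the failure counter, a non-fixable failure eventually reaches the maximum instead of waiting forever.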

https://claude.ai/code/session_012WnYCcttvFEY3FETnhVcNL

* fix: update test to expect gate-cancelled-transient reason

The cancelled gate test expected 'gate-cancelled' but the code now
returns 'gate-cancelled-transient' for non-rate-limit cancellations.
Updated the assertion to match.

https://claude.ai/code/session_012WnYCcttvFEY3FETnhVcNL

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
