fix(keepalive): remove preflight dependency from run-codex job #109
The run-codex job was not appearing in workflow runs because adding preflight as a dependency with an output-based condition caused GitHub Actions to not create the reusable workflow job. This is a GitHub Actions behavior: reusable workflow jobs with complex dependency chains involving job outputs may not be created at all. Solution: Remove the preflight dependency from run-codex while keeping preflight for informational purposes in the summary job. Secret validation happens inside the reusable workflow itself.
Automated Status Summary
Head SHA: 36e9250
Coverage Overview
Coverage Trend
Updated automatically; will refresh on subsequent CI/Docker completions.
Keepalive checklist
Scope
Tasks
Acceptance criteria
The autofix loop workflow had 'actions: read' permission but the reusable-codex-run.yml workflow declares 'actions: write'. This permission mismatch may have been causing the workflow to fail with startup_failure. Changed to 'actions: write' to match the reusable workflow and the keepalive loop which works correctly.
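For illustration, the permission change described in this commit might look like the following in the calling workflow. This is a sketch rather than the repository's actual file: the job name and the `uses:` path are assumptions, and only the change from `actions: read` to `actions: write` comes from the commit message.

```yaml
# Hypothetical excerpt of the autofix loop (caller) workflow.
permissions:
  actions: write   # was `actions: read`; must not be lower than what the called workflow declares

jobs:
  run-codex:       # job name assumed for illustration
    # Path assumed; the reusable workflow declares `actions: write` itself.
    uses: ./.github/workflows/reusable-codex-run.yml
```

A called workflow can keep or reduce the caller's `GITHUB_TOKEN` permissions but cannot raise them, which is consistent with the run failing at startup when the caller grants less than the reusable workflow requests.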
Pull request overview
This PR fixes an issue where the run-codex job (calling the reusable-codex-run.yml workflow) was not appearing in workflow runs due to a complex dependency chain introduced in PR #107. The fix removes the preflight job from the run-codex dependencies and simplifies the conditional to only check the evaluate job's output.
Key changes:
- Removes preflight from run-codex job dependencies
- Simplifies the conditional from checking both `evaluate.outputs.action == 'run'` AND `preflight.outputs.secrets_ok == 'true'` to only checking `evaluate.outputs.action == 'run'`
- Aligns the keepalive workflow pattern with the autofix workflow, which also calls reusable-codex-run.yml directly without a preflight check
           - evaluate
    -      - preflight
    -    if: needs.evaluate.outputs.action == 'run' && needs.preflight.outputs.secrets_ok == 'true'
    +    if: needs.evaluate.outputs.action == 'run'
There's a validation inconsistency to be aware of: the preflight job validates that either CODEX_AUTH_JSON or WORKFLOWS_APP_ID is present (line 142 in the full file), but the reusable workflow strictly requires CODEX_AUTH_JSON and will fail if it's missing (see reusable-codex-run.yml:198-201).
This means if only WORKFLOWS_APP_ID credentials are configured, the preflight job would pass, but run-codex will fail later with "CODEX_AUTH_JSON secret is not set or empty."
Since this change removes the preflight dependency, the job will now run and fail during execution rather than being skipped upfront. Consider either:
- Updating preflight validation logic to match the actual requirements (require CODEX_AUTH_JSON)
- Removing the preflight job entirely if it's no longer serving a useful purpose
Note: This is a pre-existing inconsistency that becomes more visible with this change, not a new issue introduced by this PR.
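As a rough sketch of the first option, a preflight check that requires `CODEX_AUTH_JSON` specifically could look like the block below. The step id `check`, the runner label, and the overall job layout are assumptions made for illustration; only the secret name, the `secrets_ok` output, and the error wording come from the discussion above.

```yaml
jobs:
  preflight:
    runs-on: ubuntu-latest          # runner assumed
    outputs:
      secrets_ok: ${{ steps.check.outputs.secrets_ok }}
    steps:
      - id: check                   # hypothetical step id
        env:
          # Secrets cannot be read directly in shell, so map the secret to an env var.
          CODEX_AUTH_JSON: ${{ secrets.CODEX_AUTH_JSON }}
        run: |
          # Require CODEX_AUTH_JSON itself, matching what the reusable workflow enforces.
          if [ -n "$CODEX_AUTH_JSON" ]; then
            echo "secrets_ok=true" >> "$GITHUB_OUTPUT"
          else
            echo "secrets_ok=false" >> "$GITHUB_OUTPUT"
            echo "::warning::CODEX_AUTH_JSON secret is not set or empty."
          fi
```

With a check of this shape, the summary job would report `secrets_ok=false` whenever only `WORKFLOWS_APP_ID` credentials are configured, so the preflight result would agree with the later failure in run-codex.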
Problem
Two issues preventing Codex from running:
1. Keepalive Loop: run-codex job not appearing
The `run-codex` job (calling reusable-codex-run.yml) was not appearing in workflow runs even when `action == 'run'` and `secrets_ok == 'true'`.
Root Cause: PR #107 added a preflight job dependency with an output-based condition (`needs.preflight.outputs.secrets_ok == 'true'`). This caused GitHub Actions to not create the reusable workflow job at all.
Evidence:
- secrets_ok=true, but `Keepalive next task` job was completely missing
- `Keepalive next task` job appeared (though skipped due to evaluate condition)
2. Autofix Loop: startup_failure on every run
The autofix loop has had `startup_failure` on every single run since it was created.
Root Cause: Permission mismatch - the autofix loop had `actions: read` permission but calls `reusable-codex-run.yml` which declares `actions: write`.
Solutions
Keepalive Loop Fix
Remove the preflight dependency from run-codex:
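A minimal sketch of the resulting job definition, reconstructed from the diff in this PR. The `uses:` path is an assumption; the `needs` entry and the `if` condition are the ones shown in the review diff.

```yaml
jobs:
  run-codex:
    needs:
      - evaluate            # preflight is no longer listed here
    # Only the evaluate output gates the job; secrets are validated inside the reusable workflow.
    if: needs.evaluate.outputs.action == 'run'
    uses: ./.github/workflows/reusable-codex-run.yml   # path assumed
```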
Autofix Loop Fix
Change `actions: read` to `actions: write` to match the reusable workflow.
Testing
After merge:
- `agent:codex` label). The `Keepalive next task` job should now appear.
- `startup_failure`.
Automated Status Summary
Scope
Tasks
Acceptance criteria
Head SHA: ea11578
Latest Runs: ✅ success — Gate
Required: gate: ✅ success