fix: label sync workflow js-yaml install#1662
Conversation
GitHub Actions ::warning:: commands truncate/mangle multi-line content. Emit a short annotation message and print full npm stderr in a collapsible ::group:: instead, so logs stay readable. https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
…ation fixes Mirror the main setup-api-client changes into the consumer-repo template to prevent template drift: - Exponential backoff retry (3 attempts, 5s/10s) for transient npm errors - --legacy-peer-deps fallback on first failure - Short ::warning:: annotations with full stderr in collapsible ::group:: - Pin lru-cache@10.4.3 (was ^10.0.0) https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
…feguard Three changes to reusable-codex-run.yml to prevent work loss on timeout: 1. Pre-timeout watchdog: A background timer fires 5 minutes before max_runtime_minutes, committing and pushing any uncommitted work so it survives the job cancellation. Killed automatically if Codex finishes before the timer fires. 2. Robust parser import: Replace sys.path-based import of codex_jsonl_parser with importlib.util.spec_from_file_location. Consumer repos (e.g. Counter_Risk) have their own tools/ package with __init__.py that shadows the Workflows tools/ on sys.path, causing "No module named 'tools.codex_jsonl_parser'". 3. Commit step always runs: Add if: always() to the "Commit and push changes" step so uncommitted work is captured even on non-zero exit codes (the watchdog handles timeout, this handles failures). https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
parseCheckboxStates() and mergeCheckboxStates() only matched top-level checkboxes (^- \[), ignoring indented sub-tasks ( - \[). When PR Meta regenerated the PR body from the issue, auto-reconciled sub-task checkboxes were silently reverted to unchecked. This caused the keepalive loop to stall with rounds_without_task_completion: 8 despite the agent completing real work — PR #256 had 5 tasks auto-checked then immediately un-checked on every push. https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
- P1: Add fetch/rebase before watchdog push to avoid non-fast-forward rejection when another workflow updates the branch during the run. Includes one retry with re-fetch/rebase and merge fallback. - P2: Export watchdog-saved in on.workflow_call.outputs so callers of the reusable workflow can observe the signal. - Copilot: Add git fetch before checking FETCH_HEAD to ensure it exists and is current (actions/checkout doesn't set FETCH_HEAD). - Copilot: Initialize watchdog-saved=false before background subshell so downstream consumers always get a defined value. https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
Update WORKFLOW_OUTPUTS.md to include the new watchdog-saved output from reusable-codex-run.yml, fixing the test_reusable_workflow_outputs_documented test. https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
The body scan in extractIssueNumberFromPull was treating patterns like "Run #2615" as issue references, causing the Upsert PR body sections check to fail with a 404 when trying to fetch non-existent issues. Add a preceding-word filter to skip #NNN when preceded by common non-issue words (run, attempt, step, job, check, task, version, v). Add 12 unit tests covering the extraction logic. https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
… to Claude runner Closes the three remaining feature gaps between the Claude and Codex runners identified in issue #1646: 1. **Session analysis (LLM-powered)**: Reuses analyze_codex_session.py which auto-detects Claude's plain-text session log (data_source=summary) and feeds it through the same LLM analysis pipeline for structured task completion assessment. Outputs feed into the keepalive loop. 2. **Completion checkpoint comment**: Posts a PR comment summarizing completed tasks and acceptance criteria using the shared post_completion_comment.js script. Supports both claude-prompt*.md and codex-prompt*.md file names. 3. **Error diagnostics**: Adds GITHUB_STEP_SUMMARY with error table, creates a diagnostics artifact (JSON + agent output), and posts a structured PR comment on non-transient failures with recovery guidance and log links. Uses a distinct <!-- claude-failure-notification --> marker. https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
Claude runner (reusable-claude-run.yml):
- Fix shell quoting of completed-tasks JSON by using env vars instead
of inline ${{ }} expansion which breaks on apostrophes in task names
- Declare OPENAI_API_KEY and CLAUDE_API_STRANSKE in workflow_call.secrets
so callers can pass them (matches Codex runner)
- Use printf instead of echo when writing PR body to disk to avoid
mangling of -n/-e prefixes or backslashes
- Add info log when falling back to codex-prompt file
Codex runner (reusable-codex-run.yml):
- Gate watchdog-saved=true on actual push success instead of emitting
it unconditionally after push attempts that may have both failed
- Use a fired-flag file so the watchdog kill only terminates the
background process if it's still sleeping (hasn't started its
commit/push work yet)
https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
All four conflicts were in reusable-codex-run.yml watchdog code where our branch has the fired-flag and push-success-gating improvements vs the unchanged main version. Kept our (HEAD) version for all. https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
- Remove "task" from the non-issue prefix filter in extractIssueNumberFromPull so "Task #123" is correctly treated as an issue reference (flagged by Codex on PAEM sync PR) - Make --legacy-peer-deps retry conditional on ERESOLVE/peer-dep errors instead of only firing on the first attempt (flagged by Copilot on TMP sync PR) - Add test for "Task #N" being treated as a valid issue ref https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
The label sync workflow (maint-69-sync-labels.yml) has been failing since Feb 2 because npm install -g js-yaml installs to the global prefix which actions/github-script can't resolve. Install locally so Node's module resolution finds it in node_modules/. https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
Automated Status SummaryHead SHA: 12dd671
Coverage Overview
Coverage Trend
Top Coverage Hotspots (lowest coverage)
Updated automatically; will refresh on subsequent CI/Docker completions. Keepalive checklistScopeNo scope information available Tasks
Acceptance criteria
|
🤖 Keepalive Loop StatusPR #1662 | Agent: Codex | Iteration 0/5 Current State
🔍 Failure Classification| Error type | infrastructure | |
Keepalive Work Log (click to expand)
|
There was a problem hiding this comment.
Pull request overview
Fixes workflow reliability issues in the Workflows repo, primarily aimed at unblocking label synchronization to consumer repos, while also tightening agent runner robustness and adding Claude session/LLM analysis + failure diagnostics.
Changes:
- Switch label-sync workflow to install
js-yamllocally soactions/github-scriptcanrequire()it. - Refine
setup-api-clientnpm retry behavior to only use--legacy-peer-depson peer-dep/ERESOLVE-like failures. - Improve Codex watchdog push reporting and extend Claude runner with session analysis, completion commenting, and richer failure diagnostics/commenting; plus update issue-number extraction logic/tests to treat “Task #N” as a valid issue reference.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
.github/workflows/maint-69-sync-labels.yml |
Installs js-yaml locally for label YAML parsing in github-script. |
.github/actions/setup-api-client/action.yml |
Adjusts npm install fallback logic for peer-dependency conflicts. |
templates/consumer-repo/.github/actions/setup-api-client/action.yml |
Mirrors the same npm fallback logic change for synced consumer action. |
.github/workflows/reusable-codex-run.yml |
Makes watchdog push outcome explicit; avoids killing watchdog mid-save if it already fired. |
.github/workflows/reusable-claude-run.yml |
Adds Claude session capture/LLM task analysis outputs, completion checkpoint comment, and failure diagnostics/artifacts/PR comment. |
.github/scripts/agents_pr_meta_keepalive.js |
Allows “Task #N” issue references by removing task from the “skip non-issue refs” list. |
templates/consumer-repo/.github/scripts/agents_pr_meta_keepalive.js |
Mirrors the same issue-ref parsing behavior for synced consumer script. |
.github/scripts/__tests__/agents-pr-meta-keepalive.test.js |
Adds coverage for “Task #N” being treated as a valid issue ref. |
|
|
||
| - name: Install js-yaml | ||
| run: npm install -g js-yaml | ||
| run: npm install js-yaml |
There was a problem hiding this comment.
npm install js-yaml will create/update package-lock.json (and potentially run install scripts), which is inconsistent with the repo’s other workflows that install one-off Node deps using --no-save --no-package-lock to avoid workspace churn. Consider installing js-yaml with --no-save --no-package-lock (and optionally --ignore-scripts) to keep the workspace clean and reduce supply-chain risk.
| run: npm install js-yaml | |
| run: npm install js-yaml --no-save --no-package-lock --ignore-scripts |
| core.setOutput('error_summary', summary || ''); | ||
|
|
There was a problem hiding this comment.
classify_failure sets an error_summary output, but nothing in this workflow references steps.classify_failure.outputs.error_summary (and it isn’t exposed via jobs.claude.outputs). Either wire this output into a downstream step / job output, or remove it to avoid dead outputs that can confuse future maintenance.
| core.setOutput('error_summary', summary || ''); |
| OPENAI_API_KEY: | ||
| required: false | ||
| description: >- | ||
| OpenAI API key for LLM analysis (enables model selection beyond GitHub Models) | ||
| CLAUDE_API_STRANSKE: | ||
| required: false | ||
| description: 'Anthropic API key for LLM analysis (enables Claude slot)' |
There was a problem hiding this comment.
The PR description/title focuses on fixing the label sync workflow’s js-yaml install, but this PR also introduces substantial new behavior in reusable-claude-run.yml (LLM session analysis, PR comments, failure artifacts/comments). Please either update the PR description to cover these additional changes (and their intent/test plan), or split them into a separate PR so the label-sync fix can be reviewed/merged independently.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9e89707e89
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| pr_number: process.env.PR_NUMBER, | ||
| commit_sha: process.env.COMMIT_SHA, | ||
| iteration: process.env.ITERATION, | ||
| prompt_file: promptFile, |
There was a problem hiding this comment.
Respect selected Claude prompt when posting completion checkpoint
This call passes a computed promptFile intended to prefer claude-prompt*.md, but postCompletionComment currently re-resolves PR-specific files as codex-prompt-${pr}.md internally (.github/scripts/post_completion_comment.js), so when both Codex and Claude prompt files exist on the same PR the Claude workflow can publish checkpoint tasks from the wrong prompt. That produces incorrect completion comments and downstream status summaries for mixed-agent PR histories.
Useful? React with 👍 / 👎.
Summary
maint-69-sync-labels.ymlwhich has been failing since Feb 2 becausenpm install -g js-yamlinstalls globally butactions/github-scriptcan't resolve global modules. Changed to local install (npm install js-yaml).This unblocks syncing
agent:claude,from:claude,agent:auto, and other missing labels to all consumer repos.Test plan
https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6