Skip to content

fix: label sync workflow js-yaml install#1662

Merged
stranske merged 13 commits intomainfrom
claude/fix-task-completion-concerns-I1gRT
Feb 26, 2026
Merged

fix: label sync workflow js-yaml install#1662
stranske merged 13 commits intomainfrom
claude/fix-task-completion-concerns-I1gRT

Conversation

@stranske
Copy link
Copy Markdown
Owner

Summary

  • Fix maint-69-sync-labels.yml which has been failing since Feb 2 because npm install -g js-yaml installs globally but actions/github-script can't resolve global modules. Changed to local install (npm install js-yaml).

This unblocks syncing agent:claude, from:claude, agent:auto, and other missing labels to all consumer repos.

Test plan

  • Label sync workflow completes successfully after merge

https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6

claude and others added 13 commits February 25, 2026 21:13
GitHub Actions ::warning:: commands truncate/mangle multi-line content.
Emit a short annotation message and print full npm stderr in a
collapsible ::group:: instead, so logs stay readable.

https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
…ation fixes

Mirror the main setup-api-client changes into the consumer-repo template
to prevent template drift:
- Exponential backoff retry (3 attempts, 5s/10s) for transient npm errors
- --legacy-peer-deps fallback on first failure
- Short ::warning:: annotations with full stderr in collapsible ::group::
- Pin lru-cache@10.4.3 (was ^10.0.0)

https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
…feguard

Three changes to reusable-codex-run.yml to prevent work loss on timeout:

1. Pre-timeout watchdog: A background timer fires 5 minutes before
   max_runtime_minutes, committing and pushing any uncommitted work
   so it survives the job cancellation. Killed automatically if
   Codex finishes before the timer fires.

2. Robust parser import: Replace sys.path-based import of
   codex_jsonl_parser with importlib.util.spec_from_file_location.
   Consumer repos (e.g. Counter_Risk) have their own tools/ package
   with __init__.py that shadows the Workflows tools/ on sys.path,
   causing "No module named 'tools.codex_jsonl_parser'".

3. Commit step always runs: Add if: always() to the "Commit and push
   changes" step so uncommitted work is captured even on non-zero
   exit codes (the watchdog handles timeout, this handles failures).

https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
parseCheckboxStates() and mergeCheckboxStates() only matched top-level
checkboxes (^- \[), ignoring indented sub-tasks (  - \[). When PR Meta
regenerated the PR body from the issue, auto-reconciled sub-task
checkboxes were silently reverted to unchecked. This caused the keepalive
loop to stall with rounds_without_task_completion: 8 despite the agent
completing real work — PR #256 had 5 tasks auto-checked then immediately
un-checked on every push.

https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
- P1: Add fetch/rebase before watchdog push to avoid non-fast-forward
  rejection when another workflow updates the branch during the run.
  Includes one retry with re-fetch/rebase and merge fallback.
- P2: Export watchdog-saved in on.workflow_call.outputs so callers
  of the reusable workflow can observe the signal.
- Copilot: Add git fetch before checking FETCH_HEAD to ensure it
  exists and is current (actions/checkout doesn't set FETCH_HEAD).
- Copilot: Initialize watchdog-saved=false before background subshell
  so downstream consumers always get a defined value.

https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
Update WORKFLOW_OUTPUTS.md to include the new watchdog-saved output
from reusable-codex-run.yml, fixing the test_reusable_workflow_outputs_documented
test.

https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
The body scan in extractIssueNumberFromPull was treating patterns like
"Run #2615" as issue references, causing the Upsert PR body sections
check to fail with a 404 when trying to fetch non-existent issues.

Add a preceding-word filter to skip #NNN when preceded by common
non-issue words (run, attempt, step, job, check, task, version, v).
Add 12 unit tests covering the extraction logic.

https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
… to Claude runner

Closes the three remaining feature gaps between the Claude and Codex runners
identified in issue #1646:

1. **Session analysis (LLM-powered)**: Reuses analyze_codex_session.py which
   auto-detects Claude's plain-text session log (data_source=summary) and
   feeds it through the same LLM analysis pipeline for structured task
   completion assessment. Outputs feed into the keepalive loop.

2. **Completion checkpoint comment**: Posts a PR comment summarizing completed
   tasks and acceptance criteria using the shared post_completion_comment.js
   script. Supports both claude-prompt*.md and codex-prompt*.md file names.

3. **Error diagnostics**: Adds GITHUB_STEP_SUMMARY with error table, creates
   a diagnostics artifact (JSON + agent output), and posts a structured PR
   comment on non-transient failures with recovery guidance and log links.
   Uses a distinct <!-- claude-failure-notification --> marker.

https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
Claude runner (reusable-claude-run.yml):
- Fix shell quoting of completed-tasks JSON by using env vars instead
  of inline ${{ }} expansion which breaks on apostrophes in task names
- Declare OPENAI_API_KEY and CLAUDE_API_STRANSKE in workflow_call.secrets
  so callers can pass them (matches Codex runner)
- Use printf instead of echo when writing PR body to disk to avoid
  mangling of -n/-e prefixes or backslashes
- Add info log when falling back to codex-prompt file

Codex runner (reusable-codex-run.yml):
- Gate watchdog-saved=true on actual push success instead of emitting
  it unconditionally after push attempts that may have both failed
- Use a fired-flag file so the watchdog kill only terminates the
  background process if it's still sleeping (hasn't started its
  commit/push work yet)

https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
All four conflicts were in reusable-codex-run.yml watchdog code where
our branch has the fired-flag and push-success-gating improvements
vs the unchanged main version. Kept our (HEAD) version for all.

https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
- Remove "task" from the non-issue prefix filter in
  extractIssueNumberFromPull so "Task #123" is correctly treated as
  an issue reference (flagged by Codex on PAEM sync PR)
- Make --legacy-peer-deps retry conditional on ERESOLVE/peer-dep
  errors instead of only firing on the first attempt (flagged by
  Copilot on TMP sync PR)
- Add test for "Task #N" being treated as a valid issue ref

https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
The label sync workflow (maint-69-sync-labels.yml) has been failing
since Feb 2 because npm install -g js-yaml installs to the global
prefix which actions/github-script can't resolve. Install locally
so Node's module resolution finds it in node_modules/.

https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
Copilot AI review requested due to automatic review settings February 26, 2026 05:14
@stranske stranske merged commit 4a8ba03 into main Feb 26, 2026
23 checks passed
@stranske stranske deleted the claude/fix-task-completion-concerns-I1gRT branch February 26, 2026 05:14
@stranske stranske temporarily deployed to agent-high-privilege February 26, 2026 05:14 — with GitHub Actions Inactive
@stranske-keepalive
Copy link
Copy Markdown
Contributor

Automated Status Summary

Head SHA: 12dd671
Latest Runs: ⏳ pending — Gate
Required contexts: Gate / gate, Health 45 Agents Guard / guard
Required: core tests (3.11): ⏳ pending, core tests (3.12): ⏳ pending, docker smoke: ⏳ pending, gate: ⏳ pending

Workflow / Job Result Logs
(no jobs reported) ⏳ pending

Coverage Overview

  • Coverage history entries: 1

Coverage Trend

Metric Value
Current 93.12%
Baseline 85.00%
Delta +8.12%
Minimum 70.00%
Status ✅ Pass

Top Coverage Hotspots (lowest coverage)

File Coverage Missing
src/cli_parser.py 81.8% 4
src/percentile_calculator.py 95.0% 1
src/aggregator.py 95.0% 2
src/__init__.py 100.0% 0
src/ndjson_parser.py 100.0% 0

Updated automatically; will refresh on subsequent CI/Docker completions.


Keepalive checklist

Scope

No scope information available

Tasks

  • No tasks defined

Acceptance criteria

  • No acceptance criteria defined

@agents-workflows-bot
Copy link
Copy Markdown
Contributor

🤖 Keepalive Loop Status

PR #1662 | Agent: Codex | Iteration 0/5

Current State

Metric Value
Iteration progress [----------] 0/5
Action wait (missing-agent-label)
Disposition skipped (transient)
Gate success
Tasks 0/2 complete
Timeout 45 min (default)
Timeout usage 4m elapsed (9%, 41m remaining)
Keepalive ❌ disabled
Autofix ❌ disabled

🔍 Failure Classification

| Error type | infrastructure |
| Error category | resource |
| Suggested recovery | Confirm the referenced resource exists (repo, PR, branch, workflow, or file). |

@agents-workflows-bot
Copy link
Copy Markdown
Contributor

Keepalive Work Log (click to expand)
# Time (UTC) Agent Action Result Files Tasks Progress Commit Gate
0 2026-02-26 05:18:15 Codex wait (missing-agent-label-transient) skipped 0 0/2 success

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes workflow reliability issues in the Workflows repo, primarily aimed at unblocking label synchronization to consumer repos, while also tightening agent runner robustness and adding Claude session/LLM analysis + failure diagnostics.

Changes:

  • Switch label-sync workflow to install js-yaml locally so actions/github-script can require() it.
  • Refine setup-api-client npm retry behavior to only use --legacy-peer-deps on peer-dep/ERESOLVE-like failures.
  • Improve Codex watchdog push reporting and extend Claude runner with session analysis, completion commenting, and richer failure diagnostics/commenting; plus update issue-number extraction logic/tests to treat “Task #N” as a valid issue reference.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
.github/workflows/maint-69-sync-labels.yml Installs js-yaml locally for label YAML parsing in github-script.
.github/actions/setup-api-client/action.yml Adjusts npm install fallback logic for peer-dependency conflicts.
templates/consumer-repo/.github/actions/setup-api-client/action.yml Mirrors the same npm fallback logic change for synced consumer action.
.github/workflows/reusable-codex-run.yml Makes watchdog push outcome explicit; avoids killing watchdog mid-save if it already fired.
.github/workflows/reusable-claude-run.yml Adds Claude session capture/LLM task analysis outputs, completion checkpoint comment, and failure diagnostics/artifacts/PR comment.
.github/scripts/agents_pr_meta_keepalive.js Allows “Task #N” issue references by removing task from the “skip non-issue refs” list.
templates/consumer-repo/.github/scripts/agents_pr_meta_keepalive.js Mirrors the same issue-ref parsing behavior for synced consumer script.
.github/scripts/__tests__/agents-pr-meta-keepalive.test.js Adds coverage for “Task #N” being treated as a valid issue ref.


- name: Install js-yaml
run: npm install -g js-yaml
run: npm install js-yaml
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

npm install js-yaml will create/update package-lock.json (and potentially run install scripts), which is inconsistent with the repo’s other workflows that install one-off Node deps using --no-save --no-package-lock to avoid workspace churn. Consider installing js-yaml with --no-save --no-package-lock (and optionally --ignore-scripts) to keep the workspace clean and reduce supply-chain risk.

Suggested change
run: npm install js-yaml
run: npm install js-yaml --no-save --no-package-lock --ignore-scripts

Copilot uses AI. Check for mistakes.
Comment on lines +1485 to +1486
core.setOutput('error_summary', summary || '');

Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

classify_failure sets an error_summary output, but nothing in this workflow references steps.classify_failure.outputs.error_summary (and it isn’t exposed via jobs.claude.outputs). Either wire this output into a downstream step / job output, or remove it to avoid dead outputs that can confuse future maintenance.

Suggested change
core.setOutput('error_summary', summary || '');

Copilot uses AI. Check for mistakes.
Comment on lines +108 to +114
OPENAI_API_KEY:
required: false
description: >-
OpenAI API key for LLM analysis (enables model selection beyond GitHub Models)
CLAUDE_API_STRANSKE:
required: false
description: 'Anthropic API key for LLM analysis (enables Claude slot)'
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description/title focuses on fixing the label sync workflow’s js-yaml install, but this PR also introduces substantial new behavior in reusable-claude-run.yml (LLM session analysis, PR comments, failure artifacts/comments). Please either update the PR description to cover these additional changes (and their intent/test plan), or split them into a separate PR so the label-sync fix can be reviewed/merged independently.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9e89707e89

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1414 to +1417
pr_number: process.env.PR_NUMBER,
commit_sha: process.env.COMMIT_SHA,
iteration: process.env.ITERATION,
prompt_file: promptFile,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Respect selected Claude prompt when posting completion checkpoint

This call passes a computed promptFile intended to prefer claude-prompt*.md, but postCompletionComment currently re-resolves PR-specific files as codex-prompt-${pr}.md internally (.github/scripts/post_completion_comment.js), so when both Codex and Claude prompt files exist on the same PR the Claude workflow can publish checkpoint tasks from the wrong prompt. That produces incorrect completion comments and downstream status summaries for mixed-agent PR histories.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants