feat: LangChain-enhanced task completion detection for keepalive #459

Merged
stranske merged 10 commits into main from feature/langchain-analysis on Jan 2, 2026

Conversation

@stranske
Owner

@stranske stranske commented Jan 2, 2026

Source: Issue #129

Automated Status Summary

Scope

  • After merging PR #103 (chore(codex): bootstrap PR for issue #101; multi-agent routing infrastructure), we need to:
    1. Validate the CLI agent pipeline works end-to-end with the new task-focused prompts
    2. Add GITHUB_STEP_SUMMARY output so iteration results are visible in the Actions UI
    3. Streamline the Automated Status Summary to reduce clutter when using CLI agents
    4. Clean up comment patterns to avoid a mix of old UI-agent and new CLI-agent comments

Tasks

Pipeline Validation

  • After PR #103 (chore(codex): bootstrap PR for issue #101) merges, create a test PR with the agent:codex label
  • Verify the task appendix appears in the Codex prompt (check workflow logs)
  • Verify Codex works on actual tasks (not random infrastructure work)
  • Verify the keepalive comment updates with iteration progress

GITHUB_STEP_SUMMARY

  • Add step summary output to agents-keepalive-loop.yml after the agent run
  • Include: iteration number, tasks completed, files changed, outcome
  • Ensure the summary is visible in the workflow run UI

Conditional Status Summary

  • Modify buildStatusBlock() in agents_pr_meta_update_body.js to accept an agentType parameter
  • When agentType is set (CLI agent): hide the workflow table, head SHA, and required checks
  • Keep Scope/Tasks/Acceptance checkboxes in all cases
  • Pass the agent type from the workflow to the update_body job

Comment Pattern Cleanup

  • For CLI agents (agent:* label):
      • Suppress <!-- gate-summary: --> comment posting (use the step summary instead)
      • Suppress <!-- keepalive-round: N --> instruction comments (the task appendix replaces them)
      • Update <!-- keepalive-loop-summary --> to be the single source of truth
      • Ensure the state marker is embedded in the summary comment (not separate)
  • For UI Codex (no agent:* label):
      • Keep existing comment patterns (instruction comments, connector bot reports)
      • Keep the <!-- gate-summary: --> comment
  • Add an agent_type output to the detect job so downstream workflows know the mode
  • Update agents-pr-meta.yml to conditionally skip the gate summary for CLI agent PRs

Acceptance criteria

  • CLI agent receives explicit tasks in prompt and works on them
  • Iteration results visible in Actions workflow run summary
  • PR body shows checkboxes but not workflow clutter when using CLI agents
  • UI Codex path (no agent label) continues to show full status summary
  • CLI agent PRs have ≤3 bot comments total (summary, one per iteration update) instead of 10+
  • State tracking is consolidated in the summary comment, not scattered
Dependencies

  • Requires PR #103 (chore(codex): bootstrap PR for issue #101) to be merged first
Head SHA: dbe2ff0
Latest Runs: ❔ in progress — Agents PR meta manager
Required: gate: ⏸️ not started

| Workflow / Job | Result | Logs |
|----------------|--------|------|
| Agents PR meta manager | ❔ in progress | View run |

Head SHA: e16dbd9
Latest Runs: ✅ success — Gate
Required: gate: ✅ success

| Workflow / Job | Result | Logs |
|----------------|--------|------|
| Agents PR meta manager | ❔ in progress | View run |
| CI Autofix Loop | ✅ success | View run |
| Copilot code review | ❔ in progress | View run |
| Gate | ✅ success | View run |
| Health 40 Sweep | ✅ success | View run |
| Health 44 Gate Branch Protection | ✅ success | View run |
| Health 45 Agents Guard | ✅ success | View run |
| Health 50 Security Scan | ✅ success | View run |
| Maint 52 Validate Workflows | ✅ success | View run |
| PR 11 - Minimal invariant CI | ✅ success | View run |
| Selftest CI | ✅ success | View run |
Head SHA: ac4aa0e
Latest Runs: ✅ success — Gate
Required: gate: ✅ success

| Workflow / Job | Result | Logs |
|----------------|--------|------|
| Agents PR meta manager | ❔ in progress | View run |
| CI Autofix Loop | ✅ success | View run |
| Gate | ✅ success | View run |
| Health 40 Sweep | ✅ success | View run |
| Health 44 Gate Branch Protection | ✅ success | View run |
| Health 45 Agents Guard | ✅ success | View run |
| Health 50 Security Scan | ✅ success | View run |
| Maint 52 Validate Workflows | ❌ failure | View run |
| PR 11 - Minimal invariant CI | ✅ success | View run |
| Selftest CI | ✅ success | View run |
| Validate Sync Manifest | ✅ success | View run |

- Add llm_provider.py with GitHub Models → OpenAI → regex fallback chain
- Add codex_jsonl_parser.py for parsing Codex --json event streams
- Add codex_session_analyzer.py for task completion detection
- Add langchain optional dependency to pyproject.toml
- Add comprehensive tests for all new modules (38 tests)
- Add integration plan document with data source options

This implements Phase 0 of the LangChain keepalive integration:
- Option A: Summary only (current --output-last-message)
- Option B: Full JSONL stream (--json mode) - recommended
- Option B filtered: High-value events only (agent_message, reasoning, todo_list)
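Option B filtered keeps only the high-value events named above. A minimal sketch of such a filter, assuming each event is a newline-delimited JSON object with a top-level `type` field (the actual Codex event schema may differ):

```python
import json

# Event types the plan above calls "high-value"; names are assumptions
# about the Codex schema, not confirmed field values.
HIGH_VALUE_TYPES = {"agent_message", "reasoning", "todo_list"}

def filter_high_value(jsonl_text: str) -> list[dict]:
    """Keep only high-value events from a JSONL stream (Option B filtered)."""
    events = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            # Skip non-JSON noise mixed into the stream (e.g. stray stderr)
            continue
        if event.get("type") in HIGH_VALUE_TYPES:
            events.append(event)
    return events
```

Tolerating non-JSON lines matters here because the workflow redirects stderr into the same file as the event stream.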

The JSONL parser handles Codex event schema variations including:
- Old (assistant_message) and new (agent_message) field names
- Streaming item updates
- Todo list items for direct task mapping

Refs #453
- Add SESSION_JSONL variable for PR-specific session file naming
- Change Codex execution to use --json flag, redirecting JSONL stream to file
- Add 'Analyze Codex session' step that parses session data with codex_jsonl_parser
- Output session metrics (events, messages, commands, file changes, todos)
- Include codex-session*.jsonl in artifact uploads

Part of #454: LangChain-enhanced task completion detection
- Add scripts/analyze_codex_session.py CLI for session analysis
  - Extract tasks from PR body checkboxes
  - Run LLM analysis via GitHub Models API (with OpenAI/regex fallback)
  - Support JSON, markdown, and github-actions output formats
  - Update PR body checkboxes based on completion detection

- Enhance workflow with dedicated LLM analysis step
  - New 'Analyze task completion with LLM' step after session parsing
  - Fetches PR body via gh CLI to extract tasks
  - Outputs completion results to GITHUB_OUTPUT

- Add 17 tests for CLI (100% pass)
  - Task extraction from PR body
  - Checkbox update logic
  - CLI integration tests
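Extracting tasks from PR body checkboxes can be sketched with a single regex; this is a simplified illustration, not the actual analyze_codex_session.py implementation:

```python
import re

# Matches markdown task-list items like "- [ ] Do thing" or "- [x] Done thing"
CHECKBOX_RE = re.compile(r"^\s*-\s*\[( |x|X)\]\s+(.+?)\s*$", re.MULTILINE)

def extract_tasks(pr_body: str) -> list[tuple[str, bool]]:
    """Return (task text, is_checked) pairs from markdown checkboxes."""
    return [
        (text, mark.lower() == "x")
        for mark, text in CHECKBOX_RE.findall(pr_body)
    ]
```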

Part of #454: LangChain-enhanced task completion detection

- Add llmCompletedTasks parameter to autoReconcileTasks()
  - LLM tasks take priority over commit-based analysis
  - Commit analysis adds supplementary matches not covered by LLM
  - Deduplicates matches by task text (case-insensitive)

- Add LLM analysis outputs to reusable-codex-run.yml
  - llm-analysis-run: whether analysis was performed
  - llm-completed-tasks: JSON array of completed tasks
  - llm-has-completions: boolean for quick check
  - session-event-count, session-todo-count for metrics

- Save analysis JSON file for debugging (codex-analysis-{PR}.json)
  - Uploaded as artifact alongside session JSONL

- Update keepalive workflows to pass LLM tasks
  - agents-keepalive-loop.yml
  - templates/consumer-repo/.github/workflows/agents-keepalive-loop.yml

All 63 JS tests pass, all 55 Python tests pass.

Part of #454: LangChain-enhanced task completion detection
- Add llm_provider, llm_confidence, llm_analysis_run inputs to updateKeepaliveLoopSummary
- Display 🧠 Task Analysis section showing which provider was used
- Show warning when fallback provider (OpenAI or regex) was used
- Add llm-provider and llm-confidence outputs to reusable-codex-run.yml
- Update agents-keepalive-loop.yml to pass LLM info to summary
- Update consumer template with same changes
- Add 3 tests for LLM provider display scenarios

This gives users visibility into whether the primary GitHub Models
provider was used or if the system fell back to OpenAI or regex.
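The fallback order described above can be sketched as a simple availability check. The environment variable names are illustrative assumptions; the real llm_provider.py may probe providers differently:

```python
import os

def pick_provider() -> str:
    """Return the provider tier that would be used, mirroring the
    GitHub Models -> OpenAI -> regex fallback chain.

    Env var names here are assumptions for the sketch, not the
    actual configuration keys.
    """
    if os.environ.get("GITHUB_TOKEN"):
        return "github-models"
    if os.environ.get("OPENAI_API_KEY"):
        return "openai"
    # Regex matching needs no credentials, so it is the last resort.
    return "regex"
```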
Copilot AI review requested due to automatic review settings January 2, 2026 20:56
@stranske stranske temporarily deployed to agent-high-privilege January 2, 2026 20:57 — with GitHub Actions Inactive
@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@github-actions github-actions bot added the autofix Opt-in automated formatting & lint remediation label Jan 2, 2026
@github-actions
Contributor

github-actions bot commented Jan 2, 2026

| Field | Value |
|-------|-------|
| Status | ✅ no new diagnostics |
| History points | 1 |
| Timestamp | 2026-01-02 22:30:51 UTC |
| Report artifact | autofix-report-pr-459 |
| Remaining | 0 |
| New | 0 |

No additional artifacts

@github-actions
Contributor

github-actions bot commented Jan 2, 2026

Automated Status Summary

Head SHA: 7de54ca
Latest Runs: ⏳ pending — Gate
Required contexts: Gate / gate, Health 45 Agents Guard / Enforce agents workflow protections
Required: core tests (3.11): ⏳ pending, core tests (3.12): ⏳ pending, docker smoke: ⏳ pending, gate: ⏳ pending

| Workflow / Job | Result | Logs |
|----------------|--------|------|
| (no jobs reported) | ⏳ pending | |

Coverage Overview

  • Coverage history entries: 1

Coverage Trend

| Metric | Value |
|--------|-------|
| Current | 92.21% |
| Baseline | 85.00% |
| Delta | +7.21% |
| Minimum | 70.00% |
| Status | ✅ Pass |

Top Coverage Hotspots (lowest coverage)

| File | Coverage | Missing |
|------|----------|---------|
| scripts/workflow_health_check.py | 62.6% | 28 |
| scripts/classify_test_failures.py | 62.9% | 37 |
| scripts/ledger_validate.py | 65.3% | 63 |
| scripts/mypy_return_autofix.py | 82.6% | 11 |
| scripts/ledger_migrate_base.py | 85.5% | 13 |
| scripts/fix_cosmetic_aggregate.py | 92.3% | 1 |
| scripts/coverage_history_append.py | 92.8% | 2 |
| scripts/workflow_validator.py | 93.3% | 4 |
| scripts/update_autofix_expectations.py | 93.9% | 1 |
| scripts/pr_metrics_tracker.py | 95.7% | 3 |
| scripts/generate_residual_trend.py | 96.6% | 1 |
| scripts/build_autofix_pr_comment.py | 97.0% | 2 |
| scripts/aggregate_agent_metrics.py | 97.2% | 0 |
| scripts/fix_numpy_asserts.py | 98.1% | 0 |
| scripts/sync_test_dependencies.py | 98.3% | 1 |

Updated automatically; will refresh on subsequent CI/Docker completions.


Keepalive checklist


@github-actions
Contributor

github-actions bot commented Jan 2, 2026

🤖 Keepalive Loop Status

PR #459 | Agent: Codex | Iteration 0/5

Current State

| Metric | Value |
|--------|-------|
| Iteration progress | [----------] 0/5 |
| Action | wait (missing-agent-label) |
| Disposition | skipped (transient) |
| Gate | success |
| Tasks | 0/55 complete |
| Keepalive | ❌ disabled |
| Autofix | ❌ disabled |

🔍 Failure Classification

| Field | Value |
|-------|-------|
| Error type | infrastructure |
| Error category | resource |
| Suggested recovery | Confirm the referenced resource exists (repo, PR, branch, workflow, or file). |

Contributor

Copilot AI left a comment

Pull request overview

This PR adds LangChain-based LLM analysis for intelligent task completion detection in the keepalive automation loop. The implementation introduces a provider fallback chain (GitHub Models → OpenAI → Regex), JSONL session parsing from Codex --json output, and integrates task analysis results into PR updates and summary comments.

Key changes:

  • New Python modules for LLM provider abstraction, JSONL parsing, and session analysis
  • Workflow modifications to capture --json output and run LLM analysis
  • JavaScript updates to display LLM provider information and merge LLM-detected tasks with commit-based detection
  • 20 new tests covering Python CLI, analysis, and JavaScript display logic

Reviewed changes

Copilot reviewed 14 out of 15 changed files in this pull request and generated 12 comments.

Show a summary per file
| File | Description |
|------|-------------|
| tools/llm_provider.py | LLM provider abstraction with GitHub Models/OpenAI/Regex fallback chain |
| tools/codex_jsonl_parser.py | Parser for Codex JSONL event stream from --json output |
| tools/codex_session_analyzer.py | Orchestrates session analysis using parsed JSONL and LLM providers |
| scripts/analyze_codex_session.py | CLI entry point for analyzing sessions from GitHub Actions |
| tests/tools/test_llm_provider.py | Unit tests for provider availability and fallback behavior |
| tests/tools/test_codex_jsonl_parser.py | Tests for JSONL parsing including schema variations |
| tests/scripts/test_analyze_codex_session.py | CLI integration tests with subprocess calls |
| .github/workflows/reusable-codex-run.yml | Captures --json output and runs analysis steps |
| templates/consumer-repo/.github/workflows/agents-keepalive-loop.yml | Template workflow integrating LLM task detection |
| .github/workflows/agents-keepalive-loop.yml | Passes LLM metadata to summary comment generation |
| .github/scripts/keepalive_loop.js | Displays LLM provider info and merges LLM/commit task sources |
| .github/scripts/__tests__/keepalive-loop.test.js | Tests for LLM provider display in PR summaries |
| pyproject.toml | Adds optional langchain dependencies |
| docs/plans/langchain-keepalive-integration.md | Planning document describing architecture and options |


Comment on lines +1228 to +1232
summaryLines.push(
  '',
  '### 🧠 Task Analysis',
  `| Provider | ${providerIcon} ${providerLabel} |`,
  `| Confidence | ${confidencePercent}% |`,
Copilot AI Jan 2, 2026

The markdown table formatting is incomplete. Lines 1231-1232 create table rows without proper markdown table syntax (missing header separator and consistent column structure). The output will render as plain text rather than a table. Add proper table headers and separators, for example:

| Field | Value |
|-------|-------|
| Provider | ... |
| Confidence | ... |


# GitHub Models API endpoint (OpenAI-compatible)
GITHUB_MODELS_BASE_URL = "https://models.inference.ai.azure.com"
DEFAULT_MODEL = "gpt-4o-mini"
Copilot AI Jan 2, 2026

The PR description states the primary provider uses "gpt-4.1-mini", but the code actually uses "gpt-4o-mini" (line 28). This is a discrepancy between documentation and implementation. "gpt-4.1-mini" doesn't appear to be a valid OpenAI model name. Update the PR description to reflect the actual model being used.

python3 << 'PYEOF'
import os
import sys
sys.path.insert(0, '.')
Copilot AI Jan 2, 2026

The inline Python script sets sys.path.insert(0, '.') (line 465) but PYTHONPATH is already set to github.workspace (line 438). The relative path '.' may not resolve correctly depending on the working directory at execution time. For consistency and reliability, use the PYTHONPATH that's already configured or use an absolute path based on github.workspace.

Suggested change
sys.path.insert(0, '.')

Comment on lines +256 to +280
with patch("tools.llm_provider.get_llm_provider") as mock_provider:
from tools.llm_provider import RegexFallbackProvider

mock_provider.return_value = RegexFallbackProvider()

result = subprocess.run(
[
sys.executable,
"scripts/analyze_codex_session.py",
"--session-file",
str(sample_session_file),
"--pr-body-file",
str(sample_pr_body_file),
"--output",
"json",
"--update-pr-body",
"--updated-body-file",
str(updated_file),
],
capture_output=True,
text=True,
cwd=Path(__file__).parent.parent.parent,
)

assert result.returncode == 0
Copilot AI Jan 2, 2026

The mock provider is set up using a context manager (lines 256-259), but the subprocess.run call (lines 261-278) spawns a separate Python process that won't inherit this mock. The patch only affects the current test process, not the subprocess. This test will actually use the real provider chain, not the mocked RegexFallbackProvider. To properly test this, either mock at the subprocess level (via environment manipulation) or refactor the CLI to be testable without subprocess calls.

Comment on lines +222 to +228
# Reuse the same prompt building logic
github_provider = GitHubModelsProvider()
prompt = github_provider._build_analysis_prompt(session_output, tasks, context)

try:
response = client.invoke(prompt)
result = github_provider._parse_response(response.content, tasks)
Copilot AI Jan 2, 2026

The OpenAI provider instantiates a GitHubModelsProvider just to reuse its private methods. This creates an unnecessary object and tightly couples the two providers. Consider extracting the prompt building and response parsing into shared helper functions or methods on a base class.

Comment on lines +103 to +114
for task in completed_tasks:
# Escape special regex characters in task
escaped_task = re.escape(task)

# Pattern to match unchecked checkbox with this task
pattern = re.compile(
rf"^([\s]*-\s*)\[ \](\s*){escaped_task}",
re.MULTILINE,
)

# Replace with checked version
updated_body = pattern.sub(rf"\1[x]\2{task}", updated_body)
Copilot AI Jan 2, 2026

The regex pattern on line 109 requires an exact match of the task text after escaping, but the escaped task is inserted unescaped back into the replacement on line 114. If the task contains characters that were escaped (e.g., parentheses, brackets), the replacement will use the original unescaped version, which could cause the pattern to not match correctly on subsequent calls or create inconsistencies. Use escaped_task in the replacement as well, or use the original task variable in both places consistently.
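One way to apply this suggestion is a callable replacement, which sidesteps escaping pitfalls in both the pattern and the substitution (a sketch of the fix, not the merged code):

```python
import re

def check_off(body: str, completed_tasks: list[str]) -> str:
    """Mark completed checkboxes as checked in a markdown task list.

    Using a callable replacement avoids re-escaping problems: the task
    text is escaped only where it acts as a pattern, and inserted
    literally in the replacement.
    """
    for task in completed_tasks:
        pattern = re.compile(
            rf"^(\s*-\s*)\[ \](\s*){re.escape(task)}",
            re.MULTILINE,
        )
        # A function replacement inserts the task text literally, so
        # parentheses, brackets, or backslashes in the task cannot be
        # misread as regex replacement syntax.
        body = pattern.sub(lambda m: f"{m.group(1)}[x]{m.group(2)}{task}", body)
    return body
```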

Comment on lines +285 to +327
for task in tasks:
task_lower = task.lower()
# Simple keyword matching
task_words = set(task_lower.split())

# Check for completion signals
is_completed = any(
word in output_lower
and any(
p in output_lower
for p in ["completed", "finished", "done", "fixed", "✓", "[x]"]
)
for word in task_words
if len(word) > 3
)

# Check for progress signals
is_in_progress = any(
word in output_lower
and any(
p in output_lower
for p in ["working on", "started", "implementing", "in progress"]
)
for word in task_words
if len(word) > 3
)

# Check for blocker signals
is_blocked = any(
word in output_lower
and any(
p in output_lower for p in ["blocked", "stuck", "failed", "error", "cannot"]
)
for word in task_words
if len(word) > 3
)

if is_completed:
completed.append(task)
elif is_blocked:
blocked.append(task)
elif is_in_progress:
in_progress.append(task)
Copilot AI Jan 2, 2026

The regex fallback matching logic has a high likelihood of false positives. The current logic checks if any task word (longer than 3 characters) appears anywhere in the output along with a completion keyword anywhere else in the output. For example, if the output contains "completed refactoring" and a task is "Update tests", both "update" and "tests" are unrelated to "completed refactoring", but if either word appears anywhere in the output, the task would be marked as completed. Consider requiring proximity between the task words and status keywords, or using the defined but unused COMPLETION_PATTERNS, PROGRESS_PATTERNS, and BLOCKER_PATTERNS regex patterns.
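A sketch of the proximity-based approach this review suggests, reusing one of the COMPLETION_PATTERNS so task words must appear inside the completion phrase itself rather than anywhere in the output (illustrative only; the IGNORECASE flag is an addition for this sketch):

```python
import re

# One of the COMPLETION_PATTERNS defined in the snippet above.
COMPLETION_RE = re.compile(
    r"(?:completed?|finished|done|implemented|fixed|resolved)\s+(?:the\s+)?(.+?)(?:\.|$)",
    re.IGNORECASE | re.MULTILINE,
)

def completed_tasks(output: str, tasks: list[str]) -> list[str]:
    """Mark a task complete only if a completion phrase actually names it."""
    # Collect the object of each completion phrase, e.g. "parser refactor"
    # from "Completed the parser refactor."
    phrases = [m.group(1).lower() for m in COMPLETION_RE.finditer(output)]
    done = []
    for task in tasks:
        words = {w for w in task.lower().split() if len(w) > 3}
        # Require overlap with the phrase itself, not the whole output.
        if any(words & set(phrase.split()) for phrase in phrases):
            done.append(task)
    return done
```

This still uses bag-of-words overlap, but scoping it to the captured phrase removes the class of false positive the review describes.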

Comment on lines +246 to +264
# Patterns indicating task completion
COMPLETION_PATTERNS = [
r"(?:completed?|finished|done|implemented|fixed|resolved)\s+(?:the\s+)?(.+?)(?:\.|$)",
r"✓\s+(.+?)(?:\.|$)",
r"\[x\]\s+(.+?)(?:\.|$)",
r"successfully\s+(?:completed?|implemented|fixed)\s+(.+?)(?:\.|$)",
]

# Patterns indicating work in progress
PROGRESS_PATTERNS = [
r"(?:working on|started|beginning|implementing)\s+(.+?)(?:\.|$)",
r"(?:in progress|ongoing):\s*(.+?)(?:\.|$)",
]

# Patterns indicating blockers
BLOCKER_PATTERNS = [
r"(?:blocked|stuck|cannot|failed|error)\s+(?:on\s+)?(.+?)(?:\.|$)",
r"(?:issue|problem|bug)\s+(?:with\s+)?(.+?)(?:\.|$)",
]
Copilot AI Jan 2, 2026

The COMPLETION_PATTERNS, PROGRESS_PATTERNS, and BLOCKER_PATTERNS class variables are defined but never used. The analyze_completion method implements its own simpler keyword matching instead. Either remove these unused patterns or refactor the logic to use them.

items = event.get("items", [])
if not items and content:
# Try to parse from content
import contextlib
Copilot AI Jan 2, 2026

The import of contextlib is done inside the method body rather than at the module level. This is unconventional and adds unnecessary overhead on each call. Move this import to the top of the file with other imports.

Comment on lines +404 to +406
eval "codex exec --json --skip-git-repo-check --sandbox \"$SANDBOX\" --output-last-message \"$OUTPUT_FILE\" $EXTRA_ARGS \"\$(cat \"\$PROMPT_FILE\")\"" > "$SESSION_JSONL" 2>&1 || CODEX_EXIT=$?
else
codex exec --skip-git-repo-check --sandbox "$SANDBOX" --output-last-message "$OUTPUT_FILE" "$(cat "$PROMPT_FILE")" || CODEX_EXIT=$?
codex exec --json --skip-git-repo-check --sandbox "$SANDBOX" --output-last-message "$OUTPUT_FILE" "$(cat "$PROMPT_FILE")" > "$SESSION_JSONL" 2>&1 || CODEX_EXIT=$?
Copilot AI Jan 2, 2026

The codex exec command redirects both stdout and stderr to the SESSION_JSONL file (using > "$SESSION_JSONL" 2>&1). This means any stderr output (warnings, errors, debug messages) will be mixed with the JSONL events, which could cause parsing failures. Consider separating stderr or using a more robust approach like tee to capture stdout while still allowing stderr to be visible in logs, or redirect stderr separately.

Root cause: The reusable workflow was calling scripts/analyze_codex_session.py
but the scripts were only available in the Workflows repo, not in consumer repos
that call the reusable workflow.

Changes:
- Expanded sparse checkout to include scripts/ and tools/ directories
- Made Workflows repo checkout ref dynamic (github.job_workflow_sha) so testing
  feature branches works correctly
- Updated PYTHONPATH to include .workflows-lib for imports
- Fixed script paths to use .workflows-lib/ prefix
- Added LLM dependency installation step from .workflows-lib/tools/requirements.txt
- Added requirements.txt for LLM dependencies (langchain-openai)
- Added error output display for debugging when LLM analysis fails
@stranske stranske temporarily deployed to agent-high-privilege January 2, 2026 21:57 — with GitHub Actions Inactive
The github.job_workflow_sha doesn't work correctly for checkout@v4 when using
sparse-checkout. Instead, extract the ref from github.workflow_ref which contains
the full path including the ref (e.g., refs/heads/feature/langchain-analysis).
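Extracting the ref amounts to splitting github.workflow_ref on its first '@' (a sketch; the workflow does this in shell, and the example value below is illustrative):

```python
def ref_from_workflow_ref(workflow_ref: str) -> str:
    """Extract the ref portion from a github.workflow_ref value.

    Example input shape (illustrative):
    'owner/Workflows/.github/workflows/reusable-codex-run.yml@refs/heads/feature/langchain-analysis'
    """
    # Everything after the first '@' is the fully-qualified ref.
    return workflow_ref.split("@", 1)[1]
```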
@stranske stranske temporarily deployed to agent-high-privilege January 2, 2026 22:05 — with GitHub Actions Inactive
Temporarily disable sparse-checkout to do a full checkout and ensure the
scripts/ and tools/ directories are available. Will re-enable sparse-checkout
once the checkout issue is resolved.
@stranske stranske temporarily deployed to agent-high-privilege January 2, 2026 22:10 — with GitHub Actions Inactive
Add debugging to understand what context variables are available and
use fetch-depth: 0 to ensure the SHA is fetchable when using job_workflow_sha.
@stranske stranske temporarily deployed to agent-high-privilege January 2, 2026 22:16 — with GitHub Actions Inactive
Add a new input 'workflows_ref' that callers can use to specify which ref
of the Workflows repo to checkout for scripts. This is needed because
github.job_workflow_sha is not available in reusable workflow context.

Callers should set workflows_ref to match their @ref in the uses: line.
Defaults to 'main'.
@stranske stranske temporarily deployed to agent-high-privilege January 2, 2026 22:30 — with GitHub Actions Inactive
@stranske stranske merged commit c240629 into main Jan 2, 2026
498 of 499 checks passed
@stranske stranske deleted the feature/langchain-analysis branch January 2, 2026 22:42

Labels

autofix Opt-in automated formatting & lint remediation
