Conversation

Automated sync from stranske/Workflows. Template hash: 640b9071ddde. Changes synced from sync-manifest.yml.

🤖 Keepalive Loop Status: PR #151 | Agent: Codex | Iteration 0/5

Current State

🔍 Failure Classification

| Field | Value |
|---|---|
| Error type | infrastructure |
| Status | ✅ no new diagnostics |
Pull request overview
This PR syncs workflow templates and supporting scripts from the central Workflows repository. The changes introduce new LLM-based functionality for task completion analysis and issue formatting, along with minor improvements to existing keepalive and guard scripts.
Key Changes
- Adds new LLM provider abstraction with fallback chain (GitHub Models → OpenAI → regex patterns)
- Introduces issue formatting utilities using LangChain for converting raw issues to standardized template format
- Enhances keepalive prompt routing with additional CI failure mode aliases
- Updates documentation timestamp and improves security comment clarity in agents-guard
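The fallback chain described above (GitHub Models → OpenAI → regex patterns) can be sketched roughly as follows. This is an illustrative sketch only: `analyze_with_fallback` and the stub provider functions are hypothetical names, not identifiers from `tools/llm_provider.py`.

```python
# Minimal sketch of a provider fallback chain (illustrative names only):
# each provider is tried in order, and the first successful analysis wins.
def analyze_with_fallback(session_output, tasks, providers):
    errors = []
    for provider in providers:
        try:
            return provider(session_output, tasks)
        except RuntimeError as e:  # provider unavailable or failed
            errors.append(str(e))
    # Last resort: a trivial regex-style stand-in that claims nothing is done.
    return {"completed": [], "confidence": 0.0, "errors": errors}

def github_models(session_output, tasks):
    # Simulate the first provider being unavailable.
    raise RuntimeError("GitHub Models unavailable")

def openai(session_output, tasks):
    # Simulate a successful second provider.
    return {"completed": list(tasks), "confidence": 0.8, "errors": []}

result = analyze_with_fallback("log text", ["task-1"], [github_models, openai])
```

The ordering of `providers` encodes the preference chain; the regex fallback never raises, so the function always returns an analysis.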
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tools/llm_provider.py | New LLM provider abstraction with GitHub Models and OpenAI support, includes confidence validation system |
| scripts/langchain/issue_formatter.py | New issue formatter that converts raw GitHub issues to AGENT_ISSUE_TEMPLATE format using LLM or regex fallback |
| scripts/langchain/prompts/format_issue.md | Prompt template defining rules for LLM-based issue formatting |
| docs/LABELS.md | Updates documentation timestamp from December 2025 to January 2026 |
| .github/scripts/keepalive_prompt_routing.js | Adds new CI failure mode aliases (ci_failure, fix-ci-failure) to FIX_MODES set |
| .github/scripts/keepalive_loop.js | Refactors verification status checking to avoid redundant toLowerCase() calls |
| .github/scripts/agents-guard.js | Expands security comment explaining the agents:allow-change label bypass behavior |
```python
def _parse_response(
    self,
    content: str,
    tasks: list[str],
    quality_context: SessionQualityContext | None = None,
) -> CompletionAnalysis:
    """Parse LLM response into CompletionAnalysis with BS detection."""
    try:
        # Try to extract JSON from response
        json_start = content.find("{")
        json_end = content.rfind("}") + 1
        if json_start >= 0 and json_end > json_start:
            data = json.loads(content[json_start:json_end])
        else:
            raise ValueError("No JSON found in response")

        raw_confidence = float(data.get("confidence", 0.5))
        completed = data.get("completed", [])
        in_progress = data.get("in_progress", [])
        reasoning = data.get("reasoning", "")

        # Apply BS detection to validate/adjust confidence
        adjusted_confidence, warnings = self._validate_confidence(
            raw_confidence=raw_confidence,
            completed_count=len(completed),
            in_progress_count=len(in_progress),
            quality_context=quality_context,
            reasoning=reasoning,
        )

        return CompletionAnalysis(
            completed_tasks=completed,
            in_progress_tasks=in_progress,
            blocked_tasks=data.get("blocked", []),
            confidence=adjusted_confidence,
            reasoning=reasoning,
            provider_used=self.name,
            raw_confidence=raw_confidence if adjusted_confidence != raw_confidence else None,
            confidence_adjusted=adjusted_confidence != raw_confidence,
            quality_warnings=warnings if warnings else None,
        )
    except (json.JSONDecodeError, ValueError) as e:
        logger.warning(f"Failed to parse LLM response: {e}")
        # Return empty analysis on parse failure
        return CompletionAnalysis(
            completed_tasks=[],
            in_progress_tasks=[],
            blocked_tasks=[],
            confidence=0.0,
            reasoning=f"Failed to parse response: {e}",
            provider_used=self.name,
        )
```
The _parse_response method accepts a quality_context parameter and forwards it to _validate_confidence, but the parameter is not documented in the method's docstring or interface. When the method is called from OpenAIProvider.analyze_completion on line 378, no quality_context is passed, so it is always None. As a result, the OpenAI provider gets weaker confidence validation than the GitHub Models provider.
```python
def analyze_completion(
    self,
    session_output: str,
    tasks: list[str],
    context: str | None = None,
) -> CompletionAnalysis:
    client = self._get_client()
    if not client:
        raise RuntimeError("LangChain OpenAI not available")

    # Reuse the same prompt building logic
    github_provider = GitHubModelsProvider()
    prompt = github_provider._build_analysis_prompt(session_output, tasks, context)

    try:
        response = client.invoke(prompt)
        result = github_provider._parse_response(response.content, tasks)
        # Override provider name
        return CompletionAnalysis(
            completed_tasks=result.completed_tasks,
            in_progress_tasks=result.in_progress_tasks,
            blocked_tasks=result.blocked_tasks,
            confidence=result.confidence,
            reasoning=result.reasoning,
            provider_used=self.name,
        )
    except Exception as e:
        logger.error(f"OpenAI API error: {e}")
        raise
```
The OpenAIProvider.analyze_completion method doesn't accept a quality_context parameter, but the GitHubModelsProvider version does, leaving the two providers with inconsistent APIs. In addition, when OpenAIProvider calls github_provider._parse_response() on line 378 it doesn't pass a quality_context, so the OpenAI provider never benefits from confidence validation. Consider standardizing the interface across all providers.
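One way to standardize the interface is for the OpenAI-style provider to accept quality_context (defaulting to None) and forward it to the parser instead of silently dropping it. A sketch using simplified stand-ins for the classes in this diff, not the actual implementation:

```python
from typing import Optional

class SessionQualityContext:  # stand-in for the real class in tools/llm_provider.py
    pass

def _parse_response(content, tasks, quality_context=None):
    # Stand-in for GitHubModelsProvider._parse_response: records whether
    # the quality context actually reached the parser.
    return {"quality_context_used": quality_context is not None}

def analyze_completion(session_output, tasks, context=None,
                       quality_context: Optional[SessionQualityContext] = None):
    # OpenAI-style provider that now accepts and forwards quality_context.
    return _parse_response("{}", tasks, quality_context=quality_context)

forwarded = analyze_completion("out", ["t1"], quality_context=SessionQualityContext())
dropped = analyze_completion("out", ["t1"])
```

With the parameter defaulting to None, existing callers keep working, while callers that do have a quality context get the stronger validation path.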
```python
{task_list}

## Session Output
{session_output[:8000]}  # Truncate to avoid token limits
```
The session output is truncated to 8000 characters on line 255, but this magic number is not defined as a named constant. Consider extracting this to a module-level constant like MAX_SESSION_OUTPUT_LENGTH = 8000 to improve maintainability and make it easier to adjust if needed.
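The suggested refactor could look roughly like this; `MAX_SESSION_OUTPUT_LENGTH` is the reviewer's proposed name and `build_session_section` is an illustrative helper, neither of which exists in the file yet:

```python
# Module-level constant instead of an inline magic number.
MAX_SESSION_OUTPUT_LENGTH = 8000

def build_session_section(session_output: str) -> str:
    # Truncate to avoid exceeding the model's token limit.
    truncated = session_output[:MAX_SESSION_OUTPUT_LENGTH]
    return f"## Session Output\n{truncated}"

section = build_session_section("x" * 10_000)
```

The constant gives the limit one authoritative definition, so adjusting it (or referencing it in tests and docs) no longer requires hunting for the literal 8000.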
```python
def analyze_completion(
    self,
    session_output: str,
    tasks: list[str],
    context: str | None = None,
    quality_context: SessionQualityContext | None = None,
) -> CompletionAnalysis:
```
The method signature includes a quality_context parameter that is not present in the abstract base class definition. This creates an inconsistent API where the subclass has a different signature than the abstract method it's implementing. Consider either adding this parameter to the base class abstract method or handling it through a different mechanism to maintain consistency.
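A minimal sketch of adding the parameter to the abstract method so subclass signatures line up; `LLMProvider`, `EchoProvider`, and the stand-in `SessionQualityContext` are assumptions for illustration, not the actual class names:

```python
from abc import ABC, abstractmethod
from typing import Optional

class SessionQualityContext:  # stand-in for the real dataclass
    pass

class LLMProvider(ABC):  # hypothetical base-class name
    @abstractmethod
    def analyze_completion(
        self,
        session_output: str,
        tasks: list,
        context: Optional[str] = None,
        quality_context: Optional[SessionQualityContext] = None,
    ):
        ...

class EchoProvider(LLMProvider):
    def analyze_completion(self, session_output, tasks, context=None,
                           quality_context=None):
        # Trivial implementation showing the signatures now match the base.
        return {"tasks": tasks, "has_quality": quality_context is not None}

result = EchoProvider().analyze_completion("out", ["t1"],
                                           quality_context=SessionQualityContext())
```

Declaring the parameter on the abstract method makes the optional argument part of the contract, so type checkers and future providers cannot drift apart again.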
Sync Summary
Files Updated
Files Skipped
Review Checklist
Source: stranske/Workflows
Manifest: .github/sync-manifest.yml