Automated sync from stranske/Workflows
Template hash: c5a1cca8dc92
Changes synced from sync-manifest.yml
🤖 Keepalive Loop Status
PR #142 | Agent: Codex | Iteration 0/5
Current State
🔍 Failure Classification
| Error type | infrastructure |
Pull request overview
This PR syncs workflow templates from the central stranske/Workflows repository, focusing on enhancing keepalive loop logic with evidence-based productivity tracking and quality metrics for AI agent task analysis.
- Adds a comprehensive productivity scoring system that tracks file changes, task completion, and historical trends
- Implements diminishing returns detection so the loop can stop early once work has naturally completed
- Introduces quality metrics infrastructure for LLM analysis validation and confidence adjustment warnings
- Removes runtime dependencies from the shared autofix versions file, keeping only dev tools
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| .github/workflows/autofix-versions.env | Removes runtime dependency pins (PyYAML, Pydantic, Hypothesis, jsonschema) with clarifying documentation that only dev tools should be synced |
| .github/workflows/agents-keepalive-loop.yml | Adds five new quality metric inputs (raw confidence, quality warnings, data quality, effort score, analysis text length) to be passed to keepalive loop summary |
| .github/scripts/keepalive_loop.js | Implements evidence-based productivity scoring, diminishing returns detection, and quality metrics display infrastructure with BS detection warnings |

    // Quality metrics for BS detection and evidence-based decisions
    const llmRawConfidence = toNumber(inputs.llm_raw_confidence ?? inputs.llmRawConfidence, llmConfidence);
    const llmConfidenceAdjusted = toBool(inputs.llm_confidence_adjusted ?? inputs.llmConfidenceAdjusted, false);
The parameter name llm_confidence_adjusted is referenced in the JavaScript code but is never passed from the YAML workflow file. This will cause llmConfidenceAdjusted to always be false, making the confidence adjustment warning on lines 1289-1295 never display even when confidence has been adjusted. Either add the missing input parameter to the YAML file or remove the unused logic if this feature is not yet implemented.
Suggested change:
    - const llmConfidenceAdjusted = toBool(inputs.llm_confidence_adjusted ?? inputs.llmConfidenceAdjusted, false);
    + const llmConfidenceAdjusted = llmRawConfidence !== llmConfidence;
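A minimal sketch of why the flag stays false when the workflow never passes the input, and how the suggested derivation behaves instead. The `toBool` helper here is a hypothetical stand-in (the script's real helper is not shown in this review), and the confidence values are made up for illustration.

```javascript
// Hypothetical stand-in for the script's toBool helper; the real one may differ.
function toBool(value, fallback) {
  if (value === undefined || value === null || value === '') return fallback;
  if (typeof value === 'boolean') return value;
  return ['true', '1', 'yes'].includes(String(value).trim().toLowerCase());
}

// Inputs as they would arrive if llm_confidence_adjusted is never wired up in the YAML.
const inputs = { llm_raw_confidence: '0.9' };

// Neither key exists, so the default (false) is always returned.
const fromInput = toBool(inputs.llm_confidence_adjusted ?? inputs.llmConfidenceAdjusted, false);

// Alternative from the suggestion: derive the flag from the two confidence values.
const llmConfidence = 0.7; // illustrative adjusted confidence
const llmRawConfidence = Number(inputs.llm_raw_confidence);
const derived = llmRawConfidence !== llmConfidence;

console.log(fromInput); // false
console.log(derived);   // true
```

Deriving the flag avoids a second input that can drift out of sync with the workflow file, at the cost of treating any numeric difference as an adjustment.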
    // Track task completion trend
    const previousTasks = state.tasks || {};
    const prevUnchecked = toNumber(previousTasks.unchecked, checkboxCounts.unchecked);
    const tasksCompletedSinceLastRound = prevUnchecked - checkboxCounts.unchecked;
The tasksCompletedSinceLastRound calculation can result in a negative value if tasks are added between iterations (prevUnchecked < checkboxCounts.unchecked). This negative value is then used in productivity scoring (line 843) where only positive values are expected, potentially skewing the productivity score. Consider adding a check to ensure the value is non-negative: const tasksCompletedSinceLastRound = Math.max(0, prevUnchecked - checkboxCounts.unchecked);
Suggested change:
    - const tasksCompletedSinceLastRound = prevUnchecked - checkboxCounts.unchecked;
    + const tasksCompletedSinceLastRound = Math.max(0, prevUnchecked - checkboxCounts.unchecked);
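To make the negative case concrete, here is a small sketch with illustrative counts (the numbers are assumptions, not taken from the PR): two tasks were added between rounds and none were completed.

```javascript
// Illustrative counts: the previous round had 3 unchecked tasks, the current one has 5.
const prevUnchecked = 3;
const checkboxCounts = { unchecked: 5 };

// Unclamped subtraction goes negative when tasks are added.
const rawDelta = prevUnchecked - checkboxCounts.unchecked;

// Clamped version, as in the suggestion, never drops below zero.
const clampedDelta = Math.max(0, rawDelta);

console.log(rawDelta);     // -2
console.log(clampedDelta); // 0
```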
    iteration >= 2 &&
    prevFilesChanged > 0 &&
    lastFilesChanged === 0 &&
    tasksCompletedSinceLastRound === 0;
In the diminishing returns detection, tasksCompletedSinceLastRound === 0 is checked, but this value can be negative if tasks are added between iterations. A negative value fails the strict equality, so the check would not fire in that round even though no tasks were completed. The logic should check for tasksCompletedSinceLastRound <= 0 or ensure the value is clamped to non-negative values when calculated (see line 837).
Suggested change:
    - tasksCompletedSinceLastRound === 0;
    + tasksCompletedSinceLastRound <= 0;
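The difference between the two comparisons can be sketched as follows. The `diminishingReturns` wrapper and the sample round are hypothetical; only the four conditions come from the snippet under review.

```javascript
// Hypothetical wrapper around the reviewed condition; `strict` selects === 0 versus <= 0.
function diminishingReturns(round, strict) {
  const { iteration, prevFilesChanged, lastFilesChanged, tasksDelta } = round;
  const noTaskProgress = strict ? tasksDelta === 0 : tasksDelta <= 0;
  return (
    iteration >= 2 &&
    prevFilesChanged > 0 &&
    lastFilesChanged === 0 &&
    noTaskProgress
  );
}

// Illustrative round: no files changed, and one task was added (delta of -1).
const round = { iteration: 3, prevFilesChanged: 4, lastFilesChanged: 0, tasksDelta: -1 };

console.log(diminishingReturns(round, true));  // false
console.log(diminishingReturns(round, false)); // true
```

Clamping the delta at calculation time would make both comparisons agree, so only one of the two fixes is strictly needed.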
      `| Provider | ${providerIcon} ${providerLabel} |`,
      `| Confidence | ${confidencePercent}% |`,
    );

    // Show quality metrics if available
    if (sessionDataQuality) {
      const qualityIcon = sessionDataQuality === 'high' ? '🟢' :
        sessionDataQuality === 'medium' ? '🟡' :
        sessionDataQuality === 'low' ? '🟠' : '🔴';
      summaryLines.push(`| Data Quality | ${qualityIcon} ${sessionDataQuality} |`);
    }
    if (sessionEffortScore > 0) {
      summaryLines.push(`| Effort Score | ${sessionEffortScore}/100 |`);
    }
The markdown table structure is broken when additional rows are conditionally added. Lines 1273-1274 start a table without a header row or separator, then lines 1282 and 1285 add rows to this table. Markdown tables require a header row followed by a separator row (e.g., | --- | --- |) before data rows. This will render incorrectly. Consider either using a definition list format or adding proper table headers and separators.
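One way to address this, sketched under assumptions: emit a header row and a separator row before the first data row, then append the conditional rows as before. The column names `Metric` and `Value` and the sample values are hypothetical; the variable names mirror the snippet under review.

```javascript
// Hypothetical values standing in for the script's variables.
const confidencePercent = 85;
const sessionDataQuality = 'medium';
const sessionEffortScore = 40;

const summaryLines = [];
summaryLines.push(
  '| Metric | Value |', // header row (column names are an assumption)
  '| --- | --- |',      // separator row that GitHub-flavored tables require
  `| Confidence | ${confidencePercent}% |`,
);

// Conditional rows now extend a well-formed table.
if (sessionDataQuality) {
  summaryLines.push(`| Data Quality | ${sessionDataQuality} |`);
}
if (sessionEffortScore > 0) {
  summaryLines.push(`| Effort Score | ${sessionEffortScore}/100 |`);
}

console.log(summaryLines.join('\n'));
```

If the surrounding summary already opens a table with its own header, the extra header here would be redundant, so this only applies when these rows start a new table.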
Sync Summary
Files Updated
Files Skipped
Review Checklist
Source: stranske/Workflows
Manifest: .github/sync-manifest.yml