Automated sync from stranske/Workflows. Template hash: 63e90e8d6749. Changes synced from sync-manifest.yml.
🤖 Keepalive Loop Status
PR #272 | Agent: Codex | Iteration 0/5
Current State
🔍 Failure Classification
| Error type | infrastructure |
Pull request overview
This PR syncs workflow templates and supporting scripts from the stranske/Workflows repository, updating several agent automation workflows and related tooling. The changes introduce a new comprehensive auto-pilot workflow and update existing workflows for duplicate detection, auto-labeling, and verification.
Changes:
- Upgraded default LLM model from gpt-4o-mini to gpt-4o for better evaluation quality
- Added new agents-auto-pilot.yml workflow for end-to-end issue automation
- Refactored multiple workflows to check out the local repository instead of the external Workflows repo
- Enhanced duplicate detection with stricter similarity thresholds and title-based filtering
- Updated auto-label logic to apply only the best matching label to prevent over-labeling
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| tools/llm_provider.py | Changed default model to gpt-4o with explanation |
| scripts/langchain/issue_formatter.py | Added task decomposition functionality with new helper functions |
| .github/workflows/agents-verifier.yml | Updated default models and descriptions for evaluation |
| .github/workflows/agents-dedup.yml | Changed to local checkout, raised similarity threshold, added title-based filtering |
| .github/workflows/agents-auto-pilot.yml | New comprehensive workflow for end-to-end issue automation |
| .github/workflows/agents-auto-label.yml | Changed to local checkout, refined auto-apply logic |
| .github/scripts/parse_chatgpt_topics.py | Added admin/dependency keywords to implementation_notes section |
| .github/scripts/error_classifier.js | Added debug logging for error classification |
| .github/scripts/conflict_detector.js | Added filtering for automated comments in conflict detection |
| # Require at least 40% of words to overlap for a duplicate flag | ||
| max_words = max(len(title_words_new), len(title_words_match), 1) | ||
| overlap_ratio = len(shared_words) / max_words | ||
| if m.score >= 0.95 or overlap_ratio >= 0.4: |
The condition on line 143 checks for very high similarity (score >= 0.95) OR title overlap >= 0.4. However, this creates an inconsistency: matches with score >= 0.95 bypass the title overlap requirement entirely. This means issues with 95%+ content similarity but completely different titles (overlap_ratio = 0) would still be flagged as duplicates. Consider if this is the intended behavior or if high-similarity matches should also require some minimum title overlap.
| # Require at least 40% of words to overlap for a duplicate flag | |
| max_words = max(len(title_words_new), len(title_words_match), 1) | |
| overlap_ratio = len(shared_words) / max_words | |
| if m.score >= 0.95 or overlap_ratio >= 0.4: | |
| # Require at least 40% of words to overlap for a duplicate flag, | |
| # and ensure even very high-similarity matches have some title overlap. | |
| max_words = max(len(title_words_new), len(title_words_match), 1) | |
| overlap_ratio = len(shared_words) / max_words | |
| if (m.score >= 0.95 and overlap_ratio > 0.0) or overlap_ratio >= 0.4: |
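The difference between the two conditions can be checked with a minimal sketch. The `Match` class and the `title_overlap` helper are hypothetical stand-ins (the snippet does not show how `title_words_new` and `title_words_match` are built, so lowercase whitespace-split word sets are assumed here); only the two boolean conditions are copied from the diff.

```python
from dataclasses import dataclass


@dataclass
class Match:
    """Hypothetical stand-in for a dedup match; only `score` appears in the diff."""
    score: float


def title_overlap(title_new: str, title_match: str) -> float:
    """Shared-word ratio relative to the longer title (assumed tokenization)."""
    words_new = set(title_new.lower().split())
    words_match = set(title_match.lower().split())
    shared_words = words_new & words_match
    max_words = max(len(words_new), len(words_match), 1)
    return len(shared_words) / max_words


def flag_original(m: Match, overlap_ratio: float) -> bool:
    # Condition as written in the PR: a high score bypasses the title check.
    return m.score >= 0.95 or overlap_ratio >= 0.4


def flag_suggested(m: Match, overlap_ratio: float) -> bool:
    # Suggested condition: a high score still needs some title overlap.
    return (m.score >= 0.95 and overlap_ratio > 0.0) or overlap_ratio >= 0.4


# High content similarity, but the titles share no words.
unrelated = title_overlap("Fix login timeout", "Update README badges")
print(flag_original(Match(0.97), unrelated))   # True
print(flag_suggested(Match(0.97), unrelated))  # False
```

Whether the stricter behavior is wanted depends on how often near-identical issue bodies are filed under unrelated titles; the sketch only makes the behavioral difference explicit.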
|
|
||
| - name: Set up Python | ||
| uses: actions/setup-python@v5 | ||
| with: |
The Python version was downgraded from 3.12 to 3.11. If this change is intentional for compatibility reasons, the reason should be documented. If the codebase or its dependencies rely on Python 3.12 features, the downgrade could cause runtime errors.
| with: | |
| with: | |
| # Use Python 3.11 intentionally: some dependencies in this workflow (including optional extras) | |
| # are not yet fully compatible with Python 3.12 in our CI environment. |
| repository: stranske/Workflows | ||
| path: workflows-repo | ||
| - name: Checkout repository | ||
| uses: actions/checkout@v4 |
The checkout action was downgraded from v6 to v4. This is an unusual change as typically actions are upgraded to newer versions. If this is intentional for compatibility, it should be documented. Otherwise, consider keeping v6 which may have security or performance improvements.
| uses: actions/checkout@v4 | |
| uses: actions/checkout@v6 |
| if not tasks: | ||
| return formatted | ||
|
|
||
| from scripts.langchain import task_decomposer |
The function attempts to import 'task_decomposer' from 'scripts.langchain', but this module doesn't exist in the repository. This will cause an ImportError at runtime when _apply_task_decomposition is called, breaking the issue formatting functionality.
| from scripts.langchain import task_decomposer | |
| try: | |
| from scripts.langchain import task_decomposer | |
| except ImportError: | |
| # Task decomposition is optional; if the module is not available, | |
| # fall back to returning the original formatted text. | |
| return formatted |
| auto_apply = [best_match] | ||
| # Move other high-confidence matches to suggestions | ||
| for m in matches[1:]: | ||
| if m.score >= auto_threshold and m not in suggestions: |
The condition 'm not in suggestions' on line 141 is redundant. The 'suggestions' list (line 132) only contains matches where score < auto_threshold, so any match with score >= auto_threshold will never be in 'suggestions'. The check 'm not in suggestions' will always evaluate to True when 'm.score >= auto_threshold' is True, making the second condition unnecessary.
| if m.score >= auto_threshold and m not in suggestions: | |
| if m.score >= auto_threshold: |
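A minimal sketch of the auto-label split, to show why the membership check can never fail. `LabelMatch` and `split_matches` are hypothetical names, and the sketch assumes `matches` is sorted by descending score and that `suggestions` is built from matches below `auto_threshold`, as the comment describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelMatch:
    """Hypothetical label match; field names are assumptions."""
    label: str
    score: float


def split_matches(matches, auto_threshold):
    """Auto-apply only the best label; demote the rest to suggestions."""
    # Assumed construction: suggestions start as the below-threshold matches.
    suggestions = [m for m in matches if m.score < auto_threshold]
    auto_apply = []
    if matches and matches[0].score >= auto_threshold:
        best_match = matches[0]
        auto_apply = [best_match]
        # Move other high-confidence matches to suggestions.
        for m in matches[1:]:
            # No `m not in suggestions` check is needed: anything at or above
            # the threshold was excluded from `suggestions` above.
            if m.score >= auto_threshold:
                suggestions.append(m)
    return auto_apply, suggestions


matches = [
    LabelMatch("bug", 0.95),
    LabelMatch("regression", 0.92),
    LabelMatch("docs", 0.60),
]
auto_apply, suggestions = split_matches(matches, auto_threshold=0.9)
print([m.label for m in auto_apply])   # ['bug']
print([m.label for m in suggestions])  # ['docs', 'regression']
```

The check would only matter if `matches` could contain the same entry twice, which the sketch assumes it cannot.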
| GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} | ||
| OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }} | ||
| PYTHONPATH: ${{ github.workspace }}/workflows-repo | ||
| PYTHONPATH: ${{ github.workspace }} |
The PYTHONPATH environment variable is set but unused. The 'cd workflows-repo' commands on lines 38 and 83 were removed, but PYTHONPATH was left in place on line 79. Since the checkout is now local (line 29), this environment variable is unnecessary.
| PYTHONPATH: ${{ github.workspace }} |
|
|
||
| - name: Set up Python | ||
| uses: actions/setup-python@v5 | ||
| with: |
The Python version was downgraded from 3.12 to 3.11. If this change is intentional for compatibility reasons, the reason should be documented. If the codebase or its dependencies rely on Python 3.12 features, the downgrade could cause runtime errors.
| with: | |
| with: | |
| # Intentionally pinned to Python 3.11 for current dependency/runtime compatibility. | |
| # TODO: Re-evaluate and upgrade to 3.12+ once all dependencies fully support it. |
| repository: stranske/Workflows | ||
| path: workflows-repo | ||
| - name: Checkout repository | ||
| uses: actions/checkout@v4 |
The checkout action was downgraded from v6 to v4. This is an unusual change as typically actions are upgraded to newer versions. If this is intentional for compatibility, it should be documented. Otherwise, consider keeping v6 which may have security or performance improvements.
| uses: actions/checkout@v4 | |
| uses: actions/checkout@v6 |
| if not suggestions: | ||
| return formatted | ||
|
|
||
| from scripts.langchain import issue_optimizer |
The function attempts to import 'issue_optimizer' from 'scripts.langchain' and call '_apply_task_decomposition', but this module doesn't exist in the repository. This will cause an ImportError at runtime, breaking the issue formatting functionality.
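One way to keep formatting working when an optional helper module is absent is to guard the import and fall back to the unmodified text. The sketch below is an assumption about how that could look, not the repository's code: `load_optional`, `apply_optional_step`, and the `apply` hook name are all hypothetical.

```python
import importlib


def load_optional(module_name: str):
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # module or package ends up here as well.
        return None


def apply_optional_step(formatted: str, module_name: str) -> str:
    """Run an optional transform if its module is present, else pass through."""
    module = load_optional(module_name)
    if module is None:
        return formatted
    # Hypothetical hook name; the real module's entry point is not shown
    # in the diff.
    transform = getattr(module, "apply", None)
    return transform(formatted) if callable(transform) else formatted


# With the module missing, the text comes back unchanged.
print(apply_optional_step("## Tasks\n- [ ] one", "scripts.langchain.issue_optimizer"))
```

Silently skipping the step hides the missing dependency, so logging a warning in the `except` branch may be worth adding in practice.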
| if (mode === 'compare') { | ||
| core.setOutput('model', 'gpt-4o'); // GitHub Models - current flagship | ||
| core.setOutput('model2', 'gpt-5.2'); // OpenAI - GPT-5.2 | ||
| // gpt-5.2 (OpenAI) + Mistral-large (GitHub Models) for cross-provider comparison | ||
| core.setOutput('model', 'gpt-5.2'); | ||
| core.setOutput('model2', 'Mistral-large-2407'); | ||
| } else { | ||
| core.setOutput('model', ''); // Use default | ||
| core.setOutput('model', ''); // Use default (gpt-4o) | ||
| core.setOutput('model2', ''); | ||
| } | ||
| core.setOutput('provider', ''); // Use default |
When compare mode is triggered via label, the model is set to 'gpt-5.2' (an OpenAI model) on line 159, but the provider is set to empty string on line 165, which will use the default 'github-models' provider (line 62). This mismatch will cause the workflow to fail as gpt-5.2 is not available on GitHub Models. The provider should be set to 'openai' when using gpt-5.2, or the logic should be updated to handle cross-provider comparison correctly.
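The mismatch could be avoided by deriving the provider from the model whenever no provider is given. The workflow step itself is JavaScript; the following is only a Python sketch of the lookup idea. The table and `resolve_provider` are hypothetical, while the model and provider names are the ones mentioned in this thread.

```python
# Hypothetical lookup table; model and provider names come from the review thread.
MODEL_PROVIDERS = {
    "gpt-5.2": "openai",
    "gpt-4o": "github-models",
    "Mistral-large-2407": "github-models",
}
DEFAULT_PROVIDER = "github-models"


def resolve_provider(model: str, provider: str = "") -> str:
    """Pick a provider: an explicit value wins, otherwise infer from the model."""
    if provider:
        return provider
    # Unknown or empty model names fall back to the default provider.
    return MODEL_PROVIDERS.get(model, DEFAULT_PROVIDER)


print(resolve_provider("gpt-5.2"))             # openai
print(resolve_provider("Mistral-large-2407"))  # github-models
print(resolve_provider(""))                    # github-models
```

In compare mode this would have to run once per model, since the two models in the diff are served by different providers.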
Sync Summary
Files Updated
Files Skipped
Review Checklist
Source: stranske/Workflows
Manifest: .github/sync-manifest.yml