300 changes: 300 additions & 0 deletions .github/workflows/agents-issue-optimizer.yml
@@ -0,0 +1,300 @@
name: Agents Issue Optimizer

on:
  issues:
    types: [labeled]
  workflow_dispatch:
    inputs:
      issue_number:
        description: "Issue number to optimize"
        required: true
        type: number
      phase:
        description: "Phase to run (analyze, apply, or format)"
        required: true
        type: choice
        options:
          - analyze
          - apply
          - format
Comment on lines +12 to +19 (Copilot AI, Jan 5, 2026):

The PR description mentions this is "Phase 1" and includes issue formatting capabilities, but the workflow file includes references to Phase 2 (apply suggestions) and Phase 3 (format) operations that depend on the missing 'issue_optimizer.py' module. Either the PR description should clarify that this is a complete multi-phase system, or the workflow should be scoped to only include the formatting phase (Phase 3) that can work with the included 'issue_formatter.py' module.

jobs:
  optimize_issue:
    runs-on: ubuntu-latest
    permissions:
      issues: write
      contents: read

    steps:
      - name: Check trigger conditions
        id: check
        env:
          EVENT_NAME: ${{ github.event_name }}
          LABEL_NAME: ${{ github.event.label.name }}
          DISPATCH_PHASE: ${{ inputs.phase }}
        run: |
          if [[ "$EVENT_NAME" == "workflow_dispatch" ]]; then
            {
              echo "phase=$DISPATCH_PHASE"
              echo "issue_number=${{ inputs.issue_number }}"
              echo "should_run=true"
            } >> "$GITHUB_OUTPUT"
          elif [[ "$LABEL_NAME" == "agents:optimize" ]]; then
            {
              echo "phase=analyze"
              echo "issue_number=${{ github.event.issue.number }}"
              echo "should_run=true"
            } >> "$GITHUB_OUTPUT"
          elif [[ "$LABEL_NAME" == "agents:apply-suggestions" ]]; then
            {
              echo "phase=apply"
              echo "issue_number=${{ github.event.issue.number }}"
              echo "should_run=true"
            } >> "$GITHUB_OUTPUT"
          elif [[ "$LABEL_NAME" == "agents:format" ]]; then
            {
              echo "phase=format"
              echo "issue_number=${{ github.event.issue.number }}"
              echo "should_run=true"
            } >> "$GITHUB_OUTPUT"
          else
            echo "should_run=false" >> "$GITHUB_OUTPUT"
          fi

      - name: Checkout repository
        if: steps.check.outputs.should_run == 'true'
        uses: actions/checkout@v4

      - name: Set up Python
        if: steps.check.outputs.should_run == 'true'
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dependencies
        if: steps.check.outputs.should_run == 'true'
        run: |
          python -m pip install --upgrade pip
          if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
          # Install langchain dependencies
          pip install langchain langchain-core langchain-openai langchain-community

      - name: Get issue body
        if: steps.check.outputs.should_run == 'true'
        id: get_issue
        env:
          GH_TOKEN: ${{ github.token }}
          ISSUE_NUMBER: ${{ steps.check.outputs.issue_number }}
        run: |
          gh api "repos/${{ github.repository }}/issues/${ISSUE_NUMBER}" > /tmp/issue.json
          jq -r '.body' /tmp/issue.json > /tmp/issue_body.md
          echo "Issue body saved to /tmp/issue_body.md"

      - name: Phase 1 - Analyze Issue
        if: steps.check.outputs.should_run == 'true' && steps.check.outputs.phase == 'analyze'
        id: analyze
        env:
          ISSUE_NUMBER: ${{ steps.check.outputs.issue_number }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: |
          echo "Running analysis on issue #${ISSUE_NUMBER}"
          python scripts/langchain/issue_optimizer.py \
Copilot AI (Jan 5, 2026):

The workflow references 'scripts/langchain/issue_optimizer.py' which does not exist in the repository. The script execution will fail when attempting to run this file. This includes imports of IssueOptimizationResult, format_suggestions_comment, _extract_suggestions_json, and apply_suggestions which are all expected to be in this missing module.
            --input-file /tmp/issue_body.md \
            --json > /tmp/suggestions.json

          # Format the comment
          python -c "
          import json
          import sys
          sys.path.insert(0, 'scripts/langchain')
          from issue_optimizer import IssueOptimizationResult, format_suggestions_comment

          with open('/tmp/suggestions.json') as f:
              data = json.load(f)

          result = IssueOptimizationResult(
              task_splitting=data.get('task_splitting', []),
              blocked_tasks=data.get('blocked_tasks', []),
              objective_criteria=data.get('objective_criteria', []),
              missing_sections=data.get('missing_sections', []),
              formatting_issues=data.get('formatting_issues', []),
              overall_notes=data.get('overall_notes', ''),
              provider_used=data.get('provider_used')
          )

          comment = format_suggestions_comment(result)
          with open('/tmp/comment.md', 'w') as f:
              f.write(comment)
          " || {
            echo "Failed to format comment, using raw JSON"
            cat /tmp/suggestions.json > /tmp/comment.md
          }

          # Post comment
          gh issue comment "${ISSUE_NUMBER}" --body-file /tmp/comment.md

          echo "Analysis complete. Review suggestions and add 'agents:apply-suggestions' label to apply."

      - name: Advisory issue dedup check
        if: steps.check.outputs.should_run == 'true' && steps.check.outputs.phase == 'analyze'
        env:
          GH_TOKEN: ${{ github.token }}
          ISSUE_NUMBER: ${{ steps.check.outputs.issue_number }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: |
          echo "Checking for potential duplicate issues (advisory)"
          gh api "repos/${{ github.repository }}/issues?state=open&per_page=100" --paginate > /tmp/open_issues.json
          gh api "repos/${{ github.repository }}/issues/${ISSUE_NUMBER}/comments" --paginate > /tmp/dedup_comments.json

          python - <<'PY' || true
          import json
          from scripts.langchain import issue_dedup

          with open('/tmp/issue.json', encoding='utf-8') as f:
              issue = json.load(f)
          with open('/tmp/open_issues.json', encoding='utf-8') as f:
              open_issues = json.load(f)
          with open('/tmp/dedup_comments.json', encoding='utf-8') as f:
              comments = json.load(f)

          marker = issue_dedup.SIMILAR_ISSUES_MARKER
          for comment in comments or []:
              body = (comment or {}).get('body') or ''
              if marker in body:
                  raise SystemExit(0)

          # Conservative defaults; can be tuned later.
          threshold = 0.82
          store = issue_dedup.build_issue_vector_store(open_issues)
          if store is None:
              raise SystemExit(0)

          title = (issue.get('title') or '').strip()
          body = (issue.get('body') or '').strip()
          query = f"{title}\n{body}".strip() if body else title
          if not query:
              raise SystemExit(0)

          matches = issue_dedup.find_similar_issues(store, query, threshold=threshold, k=5)
          comment = issue_dedup.format_similar_issues_comment(matches, max_items=5)
          if comment:
              with open('/tmp/dedup_comment.md', 'w', encoding='utf-8') as out:
                  out.write(comment)
Comment on lines +159 to +184 (Copilot AI, Jan 5, 2026):

Inconsistent indentation in the Python inline script. Lines 159-160, 162-166, and 170-172 use 2-space indentation while lines 157-158 and 168-169 use 0-space indentation. This will cause a Python IndentationError when the script runs. All lines within the same scope should use consistent indentation (either all at the same level after the opening 'with' statements, or properly nested).

Suggested change (the script body with consistent indentation):

    with open('/tmp/dedup_comments.json', encoding='utf-8') as f:
        comments = json.load(f)

    marker = issue_dedup.SIMILAR_ISSUES_MARKER
    for comment in comments or []:
        body = (comment or {}).get('body') or ''
        if marker in body:
            raise SystemExit(0)

    # Conservative defaults; can be tuned later.
    threshold = 0.82
    store = issue_dedup.build_issue_vector_store(open_issues)
    if store is None:
        raise SystemExit(0)

    title = (issue.get('title') or '').strip()
    body = (issue.get('body') or '').strip()
    query = f"{title}\n{body}".strip() if body else title
    if not query:
        raise SystemExit(0)

    matches = issue_dedup.find_similar_issues(store, query, threshold=threshold, k=5)
    comment = issue_dedup.format_similar_issues_comment(matches, max_items=5)
    if comment:
        with open('/tmp/dedup_comment.md', 'w', encoding='utf-8') as out:
            out.write(comment)
Comment on lines +152 to +184 (Copilot AI, Jan 5, 2026):

The workflow references 'scripts.langchain.issue_dedup' module which does not exist in the repository. The import statement will fail at runtime. Either this module needs to be added as part of this PR, or the dedup check step should be removed or updated to reference an existing module.

Suggested change (replace the inline dedup script with a no-op skip):

    import sys
    # The advisory issue deduplication helper module is not available
    # in this repository. Skip this optional check without failing
    # the workflow.
    sys.stdout.write("Advisory deduplication module not found; skipping duplicate issue check.\n")
          PY

          if [[ -f /tmp/dedup_comment.md ]]; then
            gh issue comment "${ISSUE_NUMBER}" --body-file /tmp/dedup_comment.md || true
          else
            echo "No likely duplicates detected."
          fi
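The dedup step above depends on `scripts.langchain.issue_dedup`, which the review notes is missing from this PR, so its behavior can only be inferred from the call sites (`build_issue_vector_store`, `find_similar_issues`, the 0.82 threshold, top-5 matches). As a rough illustration only: the real module apparently embeds open issues into a vector store, while this dependency-free sketch substitutes `difflib` string similarity and is not the module's actual implementation.

```python
# Hypothetical stand-in for the missing issue_dedup helper: scores each open
# issue's title+body against the new issue's text and keeps the top-k matches
# at or above a similarity threshold.
from difflib import SequenceMatcher


def find_similar_issues(open_issues, query, threshold=0.82, k=5):
    scored = []
    for issue in open_issues:
        text = f"{issue.get('title', '')}\n{issue.get('body') or ''}".strip()
        score = SequenceMatcher(None, query.lower(), text.lower()).ratio()
        if score >= threshold:
            scored.append((score, issue["number"]))
    # Highest similarity first.
    scored.sort(reverse=True)
    return scored[:k]


issues = [
    {"number": 1, "title": "Fix login crash", "body": "App crashes on login"},
    {"number": 2, "title": "Add dark mode", "body": "Theme request"},
]
matches = find_similar_issues(issues, "Fix login crash\nApp crashes on login")
print(matches)  # [(1.0, 1)]
```

The 0.82 cutoff is deliberately conservative: near-duplicates score close to 1.0, while merely related issues usually land well below it.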

      - name: Phase 2 - Apply Suggestions
        if: steps.check.outputs.should_run == 'true' && steps.check.outputs.phase == 'apply'
        id: apply
        env:
          ISSUE_NUMBER: ${{ steps.check.outputs.issue_number }}
          GH_TOKEN: ${{ github.token }}
        run: |
          echo "Extracting suggestions from comments on issue #${ISSUE_NUMBER}"

          # Get all comments and find the one with suggestions JSON
          gh api "repos/${{ github.repository }}/issues/${ISSUE_NUMBER}/comments" --paginate > /tmp/comments.json

          # Extract suggestions JSON from comment
          python -c "
          import json
          import sys
          import re
          sys.path.insert(0, 'scripts/langchain')
          from issue_optimizer import _extract_suggestions_json, apply_suggestions

Comment on lines +210 to +212 (Copilot AI, Jan 5, 2026):

The workflow references 'from issue_optimizer import _extract_suggestions_json, apply_suggestions' which relies on the non-existent 'issue_optimizer.py' module. This import will fail at runtime and prevent Phase 2 from executing successfully.

Suggested change (define the helpers inline instead of importing them):

    def _extract_suggestions_json(body: str):
        \"\"\"Extract a JSON object containing suggestions from a comment body.

        This looks first for a fenced ```json code block, and falls back to
        the first top-level JSON object found in the text.
        \"\"\"
        if not body:
            return None
        # Try to find a ```json ... ``` fenced block
        fenced_match = re.search(r\"```json\\s*(\\{[\\s\\S]*?\\})\\s*```\", body, re.IGNORECASE)
        candidate = None
        if fenced_match:
            candidate = fenced_match.group(1)
        else:
            # Fallback: first JSON object-like substring
            brace_match = re.search(r\"\\{[\\s\\S]*\\}\", body)
            if brace_match:
                candidate = brace_match.group(0)
        if not candidate:
            return None
        try:
            return json.loads(candidate)
        except Exception:
            return None

    def apply_suggestions(issue_body: str, suggestions, use_llm: bool = False):
        \"\"\"Apply suggestions to the issue body.

        Expects suggestions to contain a 'formatted_body' or similar field.
        Falls back to returning the original body if the structure is unknown.
        \"\"\"
        if isinstance(suggestions, dict):
            if 'formatted_body' in suggestions and isinstance(suggestions['formatted_body'], str):
                formatted = suggestions['formatted_body']
            elif 'body' in suggestions and isinstance(suggestions['body'], str):
                formatted = suggestions['body']
            elif 'issue_body' in suggestions and isinstance(suggestions['issue_body'], str):
                formatted = suggestions['issue_body']
            else:
                formatted = issue_body
        else:
            formatted = issue_body
        return {'formatted_body': formatted}
          with open('/tmp/comments.json') as f:
              comments = json.load(f)

          suggestions = None
          for comment in comments:
              body = comment.get('body', '')
              extracted = _extract_suggestions_json(body)
              if extracted:
                  suggestions = extracted
                  break

          if not suggestions:
              print('ERROR: No suggestions JSON found in comments')
              sys.exit(1)

          # Read current issue body
          with open('/tmp/issue_body.md') as f:
              issue_body = f.read()

          # Apply suggestions
          result = apply_suggestions(issue_body, suggestions, use_llm=False)

          with open('/tmp/updated_body.md', 'w') as f:
              f.write(result['formatted_body'])

          print('Suggestions applied successfully')
          " || exit 1

          # Update issue body
          gh issue edit "${ISSUE_NUMBER}" --body-file /tmp/updated_body.md

          echo "Issue body updated with applied suggestions"

      - name: Phase 3 - Format Issue
        if: steps.check.outputs.should_run == 'true' && steps.check.outputs.phase == 'format'
        id: format
        env:
          ISSUE_NUMBER: ${{ steps.check.outputs.issue_number }}
          GH_TOKEN: ${{ github.token }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: |
          echo "Formatting issue #${ISSUE_NUMBER} into AGENT_ISSUE_TEMPLATE structure"

          # Format the issue using issue_formatter.py
          python scripts/langchain/issue_formatter.py \
            --input-file /tmp/issue_body.md \
            --json > /tmp/format_result.json

          # Extract formatted body
          python -c "
          import json
          with open('/tmp/format_result.json') as f:
              result = json.load(f)
          formatted = result.get('formatted_body', '')
          if not formatted:
              print('ERROR: No formatted body returned')
              import sys
              sys.exit(1)
          with open('/tmp/formatted_body.md', 'w') as f:
              f.write(formatted)
          print('Issue formatted successfully')
          " || exit 1

          # Update issue body with formatted version
          gh issue edit "${ISSUE_NUMBER}" --body-file /tmp/formatted_body.md

          echo "Issue body updated with formatted structure"

      - name: Manage labels
        if: steps.check.outputs.should_run == 'true'
        env:
          GH_TOKEN: ${{ github.token }}
          ISSUE_NUMBER: ${{ steps.check.outputs.issue_number }}
          PHASE: ${{ steps.check.outputs.phase }}
        run: |
          if [[ "$PHASE" == "apply" ]]; then
            # Remove both optimization labels and add formatted label
            gh issue edit "${ISSUE_NUMBER}" --remove-label "agents:optimize" || true
            gh issue edit "${ISSUE_NUMBER}" --remove-label "agents:apply-suggestions"
            gh issue edit "${ISSUE_NUMBER}" --add-label "agents:formatted"
            echo "Labels updated: removed optimize/apply-suggestions, added formatted"
          elif [[ "$PHASE" == "format" ]]; then
            # Remove format trigger label and add formatted result label
            gh issue edit "${ISSUE_NUMBER}" --remove-label "agents:format"
            gh issue edit "${ISSUE_NUMBER}" --add-label "agents:formatted"
            echo "Labels updated: removed format, added formatted"
          fi
2 changes: 1 addition & 1 deletion autofix_report_enriched.json
@@ -1 +1 @@
{"changed": true, "classification": {"total": 0, "new": 0, "allowed": 0}, "timestamp": "2026-01-01T08:32:52Z", "files": ["scripts/sync_test_dependencies.py"]}
{"changed": true, "classification": {"total": 0, "new": 0, "allowed": 0}, "timestamp": "2026-01-05T21:20:11Z", "files": ["scripts/validate_dependency_test_setup.py"]}
26 changes: 25 additions & 1 deletion scripts/langchain/issue_formatter.py
@@ -286,6 +286,29 @@ def _formatted_output_valid(text: str) -> bool:
return all(section in text for section in required)


def _select_code_fence(text: str) -> str:
    runs = [len(match.group(0)) for match in re.finditer(r"`+", text)]
    fence_len = max(3, max(runs, default=0) + 1)
    return "`" * fence_len


def _append_raw_issue_section(formatted: str, issue_body: str) -> str:
    raw = issue_body.strip()
    if not raw:
        return formatted
    marker = "<summary>Original Issue</summary>"
    if marker in formatted:
        return formatted
    fence = _select_code_fence(raw)
    details = (
        "\n\n<details>\n"
        "<summary>Original Issue</summary>\n\n"
        f"{fence}text\n{raw}\n{fence}\n"
        "</details>"
    )
    return f"{formatted.rstrip()}{details}\n"


def format_issue_body(issue_body: str, *, use_llm: bool = True) -> dict[str, Any]:
    if not issue_body:
        issue_body = ""
@@ -306,13 +329,14 @@ def format_issue_body(issue_body: str, *, use_llm: bool = True) -> dict[str, Any
            content = getattr(response, "content", None) or str(response)
            formatted = content.strip()
            if _formatted_output_valid(formatted):
                formatted = _append_raw_issue_section(formatted, issue_body)
                return {
                    "formatted_body": formatted,
                    "provider_used": provider,
                    "used_llm": True,
                }

    formatted = _append_raw_issue_section(_format_issue_fallback(issue_body), issue_body)
    return {
        "formatted_body": formatted,
        "provider_used": None,