fix: improve verifier follow-up issue format#310
Conversation
1. Remove confusing 'Satisfy:' prefix from tasks - it's redundant since the task list already implies items need to be completed 2. Filter out markdown section headers (## Related, etc.) that were incorrectly being captured as acceptance criteria 3. Filter out PR/Issue references that look like list items but aren't actual acceptance criteria This makes follow-up issues clearer and removes noise from the task list.
Automated Status SummaryHead SHA: b8957a8
Coverage Overview
Coverage Trend
Top Coverage Hotspots (lowest coverage)
Low Coverage Files (<50.0%)
Updated automatically; will refresh on subsequent CI/Docker completions. Keepalive checklistScopeNo scope information available Tasks
Acceptance criteria
|
🤖 Keepalive Loop StatusPR #310 | Agent: Codex | Iteration 0/5 Current State
🔍 Failure Classification| Error type | infrastructure | |
There was a problem hiding this comment.
Pull request overview
This PR improves the format of verifier follow-up issues by removing redundant prefixes and filtering out incorrectly captured markdown content from task lists, addressing user feedback about confusing issue #306 formatting.
Key Changes:
- Removes "Satisfy:" prefix from task items generated from unmet criteria
- Adds filtering to prevent markdown section headers from being captured as acceptance criteria
- Adds filtering to prevent PR/Issue reference lines from being captured as task items
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| newTasks = findings.gaps.map((gap) => `Address: ${gap}`); | ||
| } else if (findings.unmetCriteria.length > 0) { | ||
| // Use verifier's specific unmet criteria | ||
| newTasks = findings.unmetCriteria.map((criterion) => `Satisfy: ${criterion}`); | ||
| // Use verifier's specific unmet criteria (no prefix - it's clear these need work) | ||
| newTasks = findings.unmetCriteria; | ||
| } else { | ||
| // Fall back to creating tasks from unmet acceptance criteria | ||
| newTasks = refinedUnmetCriteria.map((criterion) => `Satisfy: ${criterion}`); | ||
| newTasks = refinedUnmetCriteria; |
There was a problem hiding this comment.
There's an inconsistency in how tasks are generated. When gaps are identified (line 362), tasks are prefixed with "Address:", but when unmet criteria or refined unmet criteria are used (lines 365, 368), no prefix is added. This creates inconsistent task formatting in the follow-up issue.
For consistency, consider either:
- Removing the "Address:" prefix on line 362 to match the other cases (recommended based on the PR's rationale)
- Adding appropriate prefixes to lines 365 and 368 to match line 362
The first option aligns with the PR description's goal of removing redundant prefixes since task list items already imply they need work.
| // Skip items that look like markdown links to sections (e.g., "- PR #123") | ||
| if (/^[-–]\s*(PR|Issue)\s*#\d+/i.test(text)) { |
There was a problem hiding this comment.
The regex pattern is checking for text that starts with a hyphen/dash followed by "PR" or "Issue", but at this point text has already been extracted from the checkbox content (via match[1].trim()). The leading - or * was consumed by the checkbox pattern on line 184, so text will never start with - or –.
For example, if the input line is - [ ] - PR #296 - Description, then match[1] captures - PR #296 - Description, which will correctly match this pattern. However, if the input is - [ ] PR #296 - Description (without a leading dash after the checkbox), it won't match.
Consider simplifying the pattern to /^(PR|Issue)\s*#\d+/i to match text that starts directly with "PR" or "Issue" followed by a number. Alternatively, if you want to handle both cases, use /^[-–]?\s*(PR|Issue)\s*#\d+/i to make the leading dash optional.
| // Skip items that look like markdown links to sections (e.g., "- PR #123") | |
| if (/^[-–]\s*(PR|Issue)\s*#\d+/i.test(text)) { | |
| // Skip items that look like markdown links to sections (e.g., "- PR #123" or "PR #123") | |
| if (/^[-–]?\s*(PR|Issue)\s*#\d+/i.test(text)) { |
| items.push(match[1].trim()); | ||
| const text = match[1].trim(); | ||
| // Skip markdown section headers that were incorrectly captured as criteria | ||
| if (text.startsWith('#') || text.startsWith('##')) { |
There was a problem hiding this comment.
The check for startsWith('##') is redundant since any string starting with '##' will also start with '#'. The first condition will always catch both cases. Consider removing the || text.startsWith('##') portion to simplify the logic without changing behavior.
| if (text.startsWith('#') || text.startsWith('##')) { | |
| if (text.startsWith('#')) { |
|
@codex fix comments |
|
Summary
Testing
|
1. Add helper functions to detect placeholder and garbage content: - isPlaceholderContent(): Detects "section missing from source issue" text - looksLikeSectionHeader(): Filters markdown headers like "## Related" - looksLikeReferenceLink(): Filters PR/Issue reference links 2. Update extractUncheckedItems() to use the new filters, removing: - Placeholder text from bot-generated PR templates - Markdown section headers incorrectly captured as criteria - PR/Issue reference links that aren't actual acceptance criteria 3. Add hasSubstantiveContent flag to formatFollowUpIssue() return: - true when there are real tasks/criteria/gaps to address - false when all content is placeholders or empty 4. Update both verifier workflows to skip issue creation when hasSubstantiveContent is false Fixes issues like #313 (contentless) and #306 (garbage content). Tested: - PR #304 scenario: Now correctly filters out "## Related" and "- PR #N" - PR #310 scenario: Now returns hasSubstantiveContent=false - All 39 existing tests pass
* docs: update SystemEvaluation.md - parallel testing is implemented The documentation incorrectly stated that pytest-xdist was missing and tests run sequentially. In fact: - pytest-xdist is already in pyproject.toml (version 3.8.0) - reusable-10-ci-python.yml installs pytest-xdist automatically - The workflow uses '-n auto --dist=loadgroup' when xdist is detected Updated the Performance section and Recommendations to reflect the actual implemented state. * fix: prevent contentless follow-up issues from verifier 1. Add helper functions to detect placeholder and garbage content: - isPlaceholderContent(): Detects "section missing from source issue" text - looksLikeSectionHeader(): Filters markdown headers like "## Related" - looksLikeReferenceLink(): Filters PR/Issue reference links 2. Update extractUncheckedItems() to use the new filters, removing: - Placeholder text from bot-generated PR templates - Markdown section headers incorrectly captured as criteria - PR/Issue reference links that aren't actual acceptance criteria 3. Add hasSubstantiveContent flag to formatFollowUpIssue() return: - true when there are real tasks/criteria/gaps to address - false when all content is placeholders or empty 4. Update both verifier workflows to skip issue creation when hasSubstantiveContent is false Fixes issues like #313 (contentless) and #306 (garbage content). Tested: - PR #304 scenario: Now correctly filters out "## Related" and "- PR #N" - PR #310 scenario: Now returns hasSubstantiveContent=false - All 39 existing tests pass * Update .github/scripts/verifier_issue_formatter.js Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update .github/scripts/verifier_issue_formatter.js Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* docs: update SystemEvaluation.md - parallel testing is implemented The documentation incorrectly stated that pytest-xdist was missing and tests run sequentially. In fact: - pytest-xdist is already in pyproject.toml (version 3.8.0) - reusable-10-ci-python.yml installs pytest-xdist automatically - The workflow uses '-n auto --dist=loadgroup' when xdist is detected Updated the Performance section and Recommendations to reflect the actual implemented state. * fix: prevent contentless follow-up issues from verifier 1. Add helper functions to detect placeholder and garbage content: - isPlaceholderContent(): Detects "section missing from source issue" text - looksLikeSectionHeader(): Filters markdown headers like "## Related" - looksLikeReferenceLink(): Filters PR/Issue reference links 2. Update extractUncheckedItems() to use the new filters, removing: - Placeholder text from bot-generated PR templates - Markdown section headers incorrectly captured as criteria - PR/Issue reference links that aren't actual acceptance criteria 3. Add hasSubstantiveContent flag to formatFollowUpIssue() return: - true when there are real tasks/criteria/gaps to address - false when all content is placeholders or empty 4. Update both verifier workflows to skip issue creation when hasSubstantiveContent is false Fixes issues like #313 (contentless) and #306 (garbage content). Tested: - PR #304 scenario: Now correctly filters out "## Related" and "- PR #N" - PR #310 scenario: Now returns hasSubstantiveContent=false - All 39 existing tests pass * Update .github/scripts/verifier_issue_formatter.js Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update .github/scripts/verifier_issue_formatter.js Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Initial plan * Fix misleading comment in formatSimpleFollowUpIssue The comment claimed "Simple format always has substantive content" but the code correctly checks for it. Updated comment to accurately reflect that we verify content rather than assume it exists. Co-authored-by: stranske <23046322+stranske@users.noreply.github.com> --------- Co-authored-by: stranske <stranske@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: stranske <23046322+stranske@users.noreply.github.com>
Automated Status Summary
Scope
Tasks
Acceptance criteria
Head SHA: 9334b84
Latest Runs: ❔ in progress — Gate
Required: gate: ❔ in progress