Add Ralph autonomous development workflow #728
Integrate Ralph pattern for autonomous AI development loops using GitHub issues as specs. This enables structured spec refinement and autonomous implementation with proper context rotation.

Changes:
- Add mask ralph spec command for interactive spec building
- Add mask ralph loop command for autonomous implementation
- Create SPEC.md issue template with standard format
- Document Ralph workflow in CLAUDE.md
- Create in-progress and needs-attention labels

The workflow uses GitHub issue checkboxes as state, labels for lifecycle tracking (refining -> ready -> in-progress), and pre-commit hooks as the verification gate.

Closes #723

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
📝 Walkthrough
This PR adds Ralph autonomous development workflow infrastructure including documentation in CLAUDE.md, CLI command definitions in maskfile.md (with a noted duplicate block), updated issue template for spec-based workflows, environment configuration for jq dependency, and a new preflight validation script for dependency checking.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 5 passed
Greptile Overview
Greptile Summary
This PR implements the Ralph autonomous development workflow, adding commands for spec refinement (
Key changes:
The workflow provides a complete autonomous development loop with checkboxes for tracking progress, git commit verification gates, and automatic PR creation on completion.
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| .github/ISSUE_TEMPLATE/ISSUE_TEMPLATE.md | Converted generic issue template to structured spec template with sections for requirements, decisions, and implementation |
| CLAUDE.md | Added Ralph workflow documentation including commands, labels, workflow steps, and completion signals - references labels that may not exist yet |
| maskfile.md | Added comprehensive Ralph autonomous development commands (setup, spec, loop, backlog, pull-request) with proper error handling and temp file cleanup |
| tools/ralph-preflight.sh | Added shared preflight validation script for Ralph commands to check required dependencies |
Actionable comments posted: 10
🤖 Fix all issues with AI agents
In @.github/ISSUE_TEMPLATE/SPEC.md:
- Around line 12-13: Update the SPEC template to remind users to manually
replace the date placeholders by adding an inline comment or note next to the
"**Created:** YYYY-MM-DD" and "**Last discussed:** YYYY-MM-DD" lines (search for
the exact "YYYY-MM-DD" strings or heading tokens "**Created:**" and "**Last
discussed:**") that instructs maintainers that GitHub does not inject dates and
they must update these fields manually.
- Line 6: Replace the hardcoded project reference projects:
["pocketsizefund/11"] in the issue templates (BUG.md, FEATURE.md, SPEC.md) with
the correct project identifier or remove the projects field entirely; locate the
exact string projects: ["pocketsizefund/11"] in each template and either change
it to projects: ["oscmcompany/1"] (per README) or delete that line so
contributors can manually assign issues.
In `@CLAUDE.md`:
- Around line 81-83: Rename the subsection header "### Wiggum Learnings" to "###
Ralph Learnings" so the title matches the description; locate the header string
"### Wiggum Learnings" in the CLAUDE.md content and replace it with "### Ralph
Learnings" (leave the following paragraph "Document failure patterns here after
Ralph loops to prevent recurrence:" unchanged).
In `@maskfile.md`:
- Around line 738-1064: The issue: the Ralph workflow assumes labels (refining,
ready, in-progress, needs-attention) exist but never creates or documents them;
add a setup step that creates these labels before first use. Implement a new
"ralph setup" command/section in maskfile.md that iterates the four label specs
(names, colors, descriptions) and uses gh label create (or skips if exists via
gh label list) to ensure labels exist, and update documentation/CLAUDE.md to
instruct running "mask ralph setup" before using "mask ralph spec" or "mask
ralph loop"; reference the label names
("refining","ready","in-progress","needs-attention"), the new command name
("mask ralph setup" or "ralph setup"), and use "gh label create" / "gh label
list" in the implementation.
- Around line 770-808: The inline template assigned to template_body in the spec
command diverges from the canonical .github/ISSUE_TEMPLATE/SPEC.md (missing
front matter and mismatched labels); update the spec command to load and use the
SPEC.md content instead of duplicating it inline by replacing the heredoc
assignment to template_body with code that reads .github/ISSUE_TEMPLATE/SPEC.md
(extracting the body after the front-matter and substituting dates), and ensure
the labels passed (e.g., "refining" and "feature") match the front-matter so
both sources remain identical.
- Around line 770-808: The heredoc assigned to template_body uses a
single-quoted delimiter (cat <<'TEMPLATE') which prevents command substitution
so $(date +%Y-%m-%d) will be literal; change the heredoc to an unquoted
delimiter (cat <<TEMPLATE) so command substitution runs, and escape any dollar
signs you want to remain literal inside the template; update the template_body
assignment accordingly (look for template_body and the heredoc start cat
<<'TEMPLATE').
- Around line 996-1011: The PR body template stored in pr_body is missing the
required completion marker; update the pr_body construction to append the
literal tag <promise>COMPLETE</promise> (alongside existing "Closes
#${issue_number}") so the loop detector that checks for that marker can detect
completion; ensure you modify the block that builds pr_body (uses variables
pr_body, issue_number, iteration) to include the tag before EOF.
- Around line 1031-1043: The modified_files report uses `git diff --name-only
origin/master` while the branch may not be pushed, so the posted failure_comment
can show an incomplete diff; ensure the branch is pushed before generating
modified_files and building failure_comment (e.g., run a push for branch_name
such as `git push --set-upstream origin ${branch_name}`) so the comparison
reflects the remote state, and/or change the modified_files logic to compare
against the local HEAD only if push fails and clearly indicate that the branch
wasn't pushed in the failure_comment; update the modified_files assignment and
the failure_comment construction to reference this pushed-or-local status.
- Around line 878-884: The script hardcodes "master" as the default branch;
change it to detect the repo's actual default branch and compare against that.
Replace the fixed "master" check using current_branch=$(git rev-parse
--abbrev-ref HEAD) by computing default_branch dynamically (e.g., use git
symbolic-ref refs/remotes/origin/HEAD 2>/dev/null | sed
's|refs/remotes/origin/||' or fallback to parsing git remote show origin or
checking for "main"/"master"), then compare current_branch to default_branch and
adjust the echo/exit logic (references: current_branch, default_branch, git
rev-parse, git symbolic-ref, git remote show origin).
- Around line 974-979: The CLI invocation that launches Claude (the line
containing the claude command with arguments including
"--dangerously-skip-permissions") must not silently disable permissions; remove
the "--dangerously-skip-permissions" flag from the claude invocation and instead
restrict tools via Claude's config (explicitly enable only required tools like
git/test runners) or add an explicit user-acknowledgement step before running
the loop (e.g., a confirmation prompt/check) so autonomous iterations cannot
perform destructive shell/git actions without consent; update the invocation
string and document the required tool permissions accordingly.
Pull request overview
Adds a “Ralph” autonomous development workflow to the repo by introducing Mask commands for spec refinement and autonomous implementation loops, plus supporting documentation/templates.
Changes:
- Adds `mask ralph spec [issue_number]` to create/refine GitHub-issue-backed specs via an interactive Claude session.
- Adds `mask ralph loop <issue_number>` to run an autonomous implementation loop with pre-flight checks, label transitions, and PR creation on completion.
- Introduces a `SPEC.md` GitHub issue template and documents the Ralph workflow in `CLAUDE.md`.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| maskfile.md | Adds the ralph spec and ralph loop Mask tasks that create/refine spec issues and run the autonomous loop. |
| CLAUDE.md | Documents Ralph commands, labels, workflow steps, and completion signal. |
| .github/ISSUE_TEMPLATE/SPEC.md | Adds a standardized spec issue template (Problem/Requirements/Open Questions/Decisions/Specification). |
Implements #727: Adds a new command to the Ralph workflow that analyzes open GitHub issues for:
- Potential duplicates or overlapping issues
- Stale issues (no activity for 60+ days)
- Issues that may already be implemented (via codebase search)
- Consolidation opportunities

The command posts analysis results as comments on a dedicated "Backlog Review" tracking issue, creating it if it doesn't exist.

Key features:
- On-demand execution via `mask ralph backlog`
- Uses Claude to analyze issue similarities and search codebase
- Suggestions only (no auto-closing)
- Includes confidence levels on findings
- Creates/reuses backlog-review labeled tracking issue

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add backlog command to the Ralph Workflow documentation and include the new backlog-review label in the labels section.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@maskfile.md`:
- Around line 814-815: The command uses an undocumented label --label "feature"
which will fail because "feature" is not in the repository labels per CLAUDE.md;
remove the --label "feature" argument (or replace it with one of the documented
labels such as "refining", "ready", "in-progress", "needs-attention", or
"backlog-review") so the command only uses valid labels and succeeds (search for
the string --label "feature" in maskfile.md to locate and update it).
♻️ Duplicate comments (13)
CLAUDE.md (2)
59-62: Verify that workflow labels exist before first use.
The workflow requires four labels (`refining`, `ready`, `in-progress`, `needs-attention`), but the PR objectives mention creating only two (`in-progress` and `needs-attention`). The commands will fail if the `refining` or `ready` labels don't exist.
90-90: Rename "Wiggum Learnings" to "Ralph Learnings" for consistency.
The subsection contradicts its description: "Document failure patterns here after Ralph loops." The entire workflow is named "Ralph," and "Wiggum" appears nowhere else in the codebase or documentation.
maskfile.md (11)
770-808: Command substitution not evaluated in single-quoted heredoc.
The heredoc uses `<<'TEMPLATE'` (single-quoted delimiter), which prevents shell expansion. Line 773's `$(date +%Y-%m-%d)` will be inserted literally as the string `$(date +%Y-%m-%d)` rather than the actual date.
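The quoting behavior described in this comment can be reproduced in isolation. The following is a minimal sketch, not the maskfile code itself: the variable names `literal_body` and `expanded_body` and the one-line template are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Quoted delimiter: the body is taken literally, so $(...) is NOT expanded.
literal_body=$(cat <<'TEMPLATE'
**Created:** $(date +%Y-%m-%d)
TEMPLATE
)

# Unquoted delimiter: command substitution runs when the heredoc is read.
expanded_body=$(cat <<TEMPLATE
**Created:** $(date +%Y-%m-%d)
TEMPLATE
)

printf 'quoted:   %s\n' "$literal_body"
printf 'unquoted: %s\n' "$expanded_body"
```

With an unquoted delimiter, any `$` that should stay literal in the template has to be escaped as `\$`.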
770-808: Template duplication with `.github/ISSUE_TEMPLATE/SPEC.md`.
The inline template differs from the canonical file template. Maintaining two copies will cause drift.
824-826: Consider using `gh --jq` flag to simplify parsing.
The current approach pipes `gh` output through `jq`, which works but requires `jq` as a separate dependency.
878-884: Hardcoded "master" branch may not work for repositories using "main".
The pre-flight checks assume the default branch is "master", but many repositories use "main" as their default branch.
894-900: Silent error handling could hide authentication or network failures.
The `|| echo ""` fallback silently converts `gh issue view` failures (missing auth, network errors) into an empty labels list, producing a misleading error message about the missing `ready` label.
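One way to keep a failed fetch distinct from an issue that lacks the label is sketched below. `fetch_labels` is a hypothetical stand-in for the real `gh issue view` call, and `FAKE_STATUS` / `FAKE_LABELS` are test knobs invented for this example; neither appears in the PR.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the `gh issue view` call; prints one label per line.
# FAKE_STATUS and FAKE_LABELS exist only to simulate failure or success.
fetch_labels() {
    [ "${FAKE_STATUS:-0}" -eq 0 ] || return "${FAKE_STATUS}"
    printf '%s\n' "${FAKE_LABELS:-}"
}

# Returns 0 if 'ready' is present, 1 if it is missing, 2 if the fetch failed.
check_ready() {
    local labels
    # Declaring and assigning separately keeps the exit status of the fetch.
    if ! labels=$(fetch_labels); then
        echo "Error: could not fetch issue labels (check gh auth and network)" >&2
        return 2
    fi
    if printf '%s\n' "$labels" | grep -qxF ready; then
        return 0
    fi
    echo "Error: issue is missing the 'ready' label" >&2
    return 1
}
```

The separate exit codes let a caller print an auth or network hint instead of the misleading "missing label" message.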
962-972: Trap inside loop causes temp file leak.
Creating `tmpfile` and setting `trap` inside the loop (lines 971-972) means each iteration overwrites the trap, so only the last temp file gets cleaned up. Earlier iterations' temp files will remain on disk.
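A minimal sketch of the alternative: create one temp file and register one cleanup trap before the loop, then reuse the file on every iteration. The function name `loop_with_single_tmpfile` is hypothetical, and the subshell is used here only so the `EXIT` trap fires when the function finishes; a standalone script would not need it.

```shell
#!/usr/bin/env bash
# Runs $1 iterations while reusing a single temp file; prints the temp path.
loop_with_single_tmpfile() {
    (
        tmpfile=$(mktemp)
        # Register cleanup once, before the loop starts.
        trap 'rm -f "$tmpfile"' EXIT
        printf '%s\n' "$tmpfile"

        i=1
        while [ "$i" -le "$1" ]; do
            # Overwrite the same file each iteration instead of creating a new one.
            printf 'iteration %s\n' "$i" > "$tmpfile"
            i=$((i + 1))
        done
    )
}
```

Because only one file ever exists, the single trap is enough to leave nothing behind regardless of how many iterations run.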
974-983: Pipeline failure with `set -euo pipefail` leaves issue in inconsistent state.
Any failure in the `claude | grep | tee | jq` pipeline will abort the script immediately, leaving the issue labeled `in-progress` without transitioning to `needs-attention` or posting a failure comment.
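The failure mode can be avoided by testing the pipeline inside an `if`, which keeps `set -e` from aborting before the issue state is updated. In the sketch below, `iteration_pipeline` and `mark_needs_attention` are hypothetical stand-ins for the real pipeline and for the label transition plus failure comment; the PR itself later adopts an ERR trap, which is a different mechanism with the same goal.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the `claude | grep | tee | jq` pipeline.
# The subshell limits `pipefail` to this pipeline only.
iteration_pipeline() {
    (
        set -o pipefail
        "$@" | cat
    )
}

# Hypothetical stand-in for relabeling the issue and posting a failure comment.
mark_needs_attention() {
    echo "label: in-progress -> needs-attention"
}

run_iteration() {
    # A command tested by `if` does not trigger errexit, so the failure
    # branch still runs and can move the issue to a consistent state.
    if ! iteration_pipeline "$@"; then
        mark_needs_attention
        return 1
    fi
}
```

For example, `run_iteration false` takes the failure branch, while `run_iteration echo ok` passes its output through unchanged.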
978-978: Security risk: `--dangerously-skip-permissions` disables all permission prompts.
This flag allows Claude to execute any operation without user confirmation. A malicious or compromised issue spec could use prompt injection to execute arbitrary commands when `mask ralph loop` runs.
996-1011: Missing `<promise>COMPLETE</promise>` marker in PR body.
The PR objectives specify including the completion marker in the PR body, but the template (lines 996-1011) only includes `Closes #${issue_number}` without the `<promise>COMPLETE</promise>` tag detected on line 986.
1031-1043: Branch not pushed before failure report, making review difficult.
Line 1031 reports modified files using `git diff --name-only origin/master`, but the branch hasn't been pushed when max iterations is reached. Reviewers checking the failure comment won't be able to access the branch.
738-1064: Labels assumed to exist but never created.
The Ralph commands use four labels (`refining`, `ready`, `in-progress`, `needs-attention`), but there's no command or documentation for creating them before first use.
Implements #729: Adds a command to process PR review feedback with two phases:

Planning (Interactive):
- Auto-detects PR from current branch (or use --pr flag)
- Fetches unresolved review threads via GraphQL API
- Presents each suggestion one by one
- User can Accept, Skip, Modify, or Quit
- Builds a plan of approved changes

Execution (Autonomous):
- Claude implements all approved suggestions
- Commits changes with meaningful messages
- Posts reply to each thread explaining what was done
- Resolves conversations via GraphQL API
- Pushes changes

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
- Remove dates from spec template (issue state has date info)
- Move tmpfile creation and trap outside loop to prevent orphaned temp files
- Add ERR trap handler to clean up and update labels on unexpected exits
- Add explicit gh auth status check before fetching issue labels
- Add <promise>COMPLETE</promise> marker to PR body per requirements
- Push branch before posting failure comment so reviewers can access it
- Use --template flag for gh issue create instead of inline body
- Use JSON output for issue creation to reliably parse issue numbers
- Remove hardcoded pocketsizefund/11 project reference from all templates

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Link new issues to oscmcompany/1 project board automatically. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Show first 3 lines of code blocks with line count summary instead of full code. Makes review comments with large suggested diffs easier to parse during interactive planning. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
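The truncation described in this commit could look roughly like the sketch below. It is not the code from the PR: the function name `summarize_code_blocks` and the exact wording of the summary line are assumptions. It reads a markdown comment on stdin and keeps only the first three lines of each fenced block.

```shell
#!/usr/bin/env bash
# Keep the first 3 lines of every fenced code block and append a line count
# when a block is longer. The summary format is a placeholder.
summarize_code_blocks() {
    awk -v keep=3 '
        # Build the fence marker (three backticks) from character code 96.
        BEGIN { fence = sprintf("%c%c%c", 96, 96, 96) }
        index($0, fence) == 1 {
            if (in_block) {
                if (count > keep) printf "... (%d lines total)\n", count
                in_block = 0
            } else {
                in_block = 1
                count = 0
            }
            print
            next
        }
        in_block { count++; if (count <= keep) print; next }
        { print }
    '
}
```

Usage would be along the lines of `printf '%s\n' "$comment_body" | summarize_code_blocks`, where `comment_body` is a hypothetical variable holding the review comment text.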
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @.github/ISSUE_TEMPLATE/SPEC.md:
- Line 1: Remove the trailing space on line 1 of the SPEC.md template: open
.github/ISSUE_TEMPLATE/SPEC.md and delete the extra whitespace at the end of the
first line so the file starts with a clean line without any trailing space.
In `@maskfile.md`:
- Around line 1412-1416: The PR uses the unsafe flag
--dangerously-skip-permissions when invoking the claude CLI (in the command that
sets result=$(claude ... --dangerously-skip-permissions --system-prompt
"${system_prompt}" "Implement the suggestions in the plan. Output results JSON
when done.") — replace this with an explicit, least-privilege permission option
or remove it and require an explicit consent flow, or add a clear
comment/docstring near the claude invocation referencing the security trade-off
and linking to an explicit permission configuration; update the invocation site
(the claude call and any callers that set system_prompt/result) to use the safer
permission mechanism or document the justification.
♻️ Duplicate comments (7)
CLAUDE.md (2)
58-65: Verify that all documented labels exist in the repository.
The PR description mentions creating `in-progress` and `needs-attention` labels, but the workflow also requires `refining` and `ready` labels. Ensure all five labels exist before using the Ralph workflow.
```bash
#!/bin/bash
# Verify all Ralph workflow labels exist
required_labels=("refining" "ready" "in-progress" "needs-attention" "backlog-review")
echo "Checking for required labels..."
existing_labels=$(gh label list --json name --jq '.[].name')
for label in "${required_labels[@]}"; do
  if echo "$existing_labels" | grep -q "^${label}$"; then
    echo "✓ Label '${label}' exists"
  else
    echo "✗ Label '${label}' is MISSING"
  fi
done
```
91-93: Rename "Wiggum Learnings" to "Ralph Learnings" for consistency.
The subsection title "Wiggum Learnings" contradicts its description which references "Ralph loops". This should be "Ralph Learnings" to match the workflow context throughout the document.
📝 Suggested fix
```diff
-### Wiggum Learnings
+### Ralph Learnings
```
maskfile.md (5)
773-774: Verify `feature` label exists or remove it.
Per learnings: "Only use labels already available on the GitHub repository." The `feature` label is used here but may not exist. Consider using only documented Ralph labels or verify the label exists first.
```bash
#!/bin/bash
# Check if 'feature' label exists
gh label list --json name --jq '.[].name' | grep -q "^feature$" && echo "Label exists" || echo "Label MISSING"
```
751-765: Add `jq` to the dependency checks.
The command uses `jq` at lines 777 and 787-788 but doesn't verify it's installed. With `set -euo pipefail`, this will fail with an unhelpful "command not found" error.
🔧 Proposed fix
```diff
 if ! command -v claude &> /dev/null; then
     echo "Claude CLI is required"
     exit 1
 fi
+
+if ! command -v jq &> /dev/null; then
+    echo "jq is required"
+    exit 1
+fi
```
840-854: Hardcoded "master" branch may not work for all repositories.
Many repositories use "main" as the default branch. Consider detecting the default branch dynamically.
🔧 Proposed fix
```diff
+default_branch=$(git symbolic-ref refs/remotes/origin/HEAD 2>/dev/null | sed 's|refs/remotes/origin/||' || echo "master")
 current_branch=$(git rev-parse --abbrev-ref HEAD)
-if [ "$current_branch" != "master" ]; then
-    echo "Error: Not on master branch (currently on: ${current_branch})"
-    echo "Run: git checkout master"
+if [ "$current_branch" != "$default_branch" ]; then
+    echo "Error: Not on default branch ${default_branch} (currently on: ${current_branch})"
+    echo "Run: git checkout ${default_branch}"
     exit 1
 fi
-echo " On master branch"
+echo " On default branch (${default_branch})"

-echo " Pulling latest master"
-if ! git pull --ff-only origin master; then
-    echo "Error: Could not pull latest master"
+echo " Pulling latest ${default_branch}"
+if ! git pull --ff-only origin "$default_branch"; then
+    echo "Error: Could not pull latest ${default_branch}"
     echo "Resolve conflicts or check network/auth"
     exit 1
 fi
-echo " Master is up to date"
+echo " ${default_branch} is up to date"
```
964-972: Security concern: `--dangerously-skip-permissions` disables all permission prompts.
This flag allows Claude to execute any operation without user confirmation. Combined with feeding the GitHub issue body (which could contain prompt injection) as context, this creates a risk where a malicious issue spec could cause arbitrary command execution.
Consider:
- Removing the flag and configuring Claude's permissions explicitly via its configuration file
- Adding an explicit user acknowledgment before running the loop
- Documenting this security decision prominently if autonomous execution is required
1082-1096: Add `jq` to the dependency checks.
The command uses `jq` at lines 1135-1136 but doesn't verify it's installed.
🔧 Proposed fix
```diff
 if ! command -v claude &> /dev/null; then
     echo "Claude CLI is required"
     exit 1
 fi
+
+if ! command -v jq &> /dev/null; then
+    echo "jq is required"
+    exit 1
+fi
```
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ng master Changed the ralph loop pre-flight checks to dynamically detect the repository's default branch instead of assuming "master". This supports repositories that use "main" or other branch names as their default. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Creates refining, ready, in-progress, needs-attention, and backlog-review labels if they don't exist. Skips labels that already exist. Run `mask ralph setup` once before using other Ralph commands. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
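A create-if-missing setup step along these lines might look like the following sketch. It relies on `gh label list --json name --jq` and `gh label create --color --description`; the function name `ensure_labels`, the `GH` override variable, and every color and description are placeholders rather than values from the PR.

```shell
#!/usr/bin/env bash
# GH is overridable so the sketch can be exercised without the real CLI.
GH=${GH:-gh}

ensure_labels() {
    local existing name color description
    existing=$("$GH" label list --json name --jq '.[].name') || return 1

    # One label per line as name|color|description (placeholder values).
    while IFS='|' read -r name color description; do
        if printf '%s\n' "$existing" | grep -qxF -- "$name"; then
            echo "skip: ${name}"
        else
            # Redirect stdin so the command cannot consume the label list.
            "$GH" label create "$name" --color "$color" \
                --description "$description" < /dev/null
        fi
    done <<'LABELS'
refining|FBCA04|Spec is being refined
ready|0E8A16|Spec is ready for implementation
in-progress|1D76DB|Implementation loop is running
needs-attention|D93F0B|Loop stopped and needs a human
backlog-review|5319E7|Backlog review tracking issue
LABELS
}
```

Matching on the whole line with `grep -x` avoids treating `ready` as present merely because a longer label name contains it.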
Re: @forstmeier's comment on maskfile.md line 783 (label colors): Applied the suggested color scheme:
Addressed @forstmeier's comments on maskfile.md lines 867 and 870: Added
Addressed @forstmeier's comment on maskfile.md line 1 (formatting): Removed separator lines (
Addressed @forstmeier's comments on maskfile.md lines 973 and 1030: Error comments now include diagnostic context - git status (last 20 files) and recent commits (last 5) are appended to the issue comment when the loop fails unexpectedly.
Re: @forstmeier's comment on maskfile.md line 1086 (using existing templates): Keeping current behavior where Ralph uses its own simpler format for PRs. The spec template is already used for issues. Template integration for PRs could be explored in a follow-up if needed.
Addressed @forstmeier's comment on maskfile.md line 1189: Created
Addressed @forstmeier's comments on maskfile.md lines 1207 and 1256: Removed staleness checking from
Addressed @forstmeier's comment on maskfile.md line 1263: Changed placeholder instructions in the backlog report template to HTML comments (
Re: @forstmeier's comment on maskfile.md line 1295: The command is already named
Re: @forstmeier's comment on maskfile.md line 1373 (GraphQL usage): The
Addressed @forstmeier's comment about assignee field: The Ralph loop now automatically assigns the current GitHub user to the issue when starting work, and assigns them to the PR when it's created. Updated CLAUDE.md workflow documentation to reflect this.
Labels:
- Rename refining -> in-refinement, needs-attention -> attention-needed
- Add 'ralph' actor label for human/Ralph handoff flexibility
- Apply consistent color scheme per reviewer suggestion

Commands:
- Add 'mask ralph ready <issue_number>' command
- Extract shared preflight checks to tools/ralph-preflight.sh
- Auto-assign current GitHub user to issue and PR

Documentation:
- Change [issue] -> [issue_number] for clarity
- Add compaction note to Ralph Learnings section
- Document auto-assignment in workflow steps

Output formatting:
- Remove separator lines (===) from echo output
- Add git status and recent commits to error comments
- Use HTML comments for placeholders in backlog report

Removed:
- Staleness checking (handled by stale workflow action)

Template consolidation:
- Remove separate BUG.md, FEATURE.md, SPEC.md (merged via #722)
- Update consolidated ISSUE_TEMPLATE.md with in-refinement label

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@maskfile.md`:
- Around line 829-833: Update the system prompt text that still instructs to use
a "Problem" section so it matches the current spec template which uses
"Description": find and replace the literal prompt phrase "Problem" with
"Description" in the system instruction block (also ensure the negative
instruction about adding the 'ready' label remains unchanged). Verify any other
occurrences of the word "Problem" in the prompt template are updated to
"Description" so headings won't be mismatched when generating specs.
- Around line 746-751: The preflight script invoked by source
"${MASKFILE_DIR}/tools/ralph-preflight.sh" and called via ralph_preflight
doesn't verify jq, causing ralph setup to fail when jq is missing; update the
ralph-preflight logic to include jq in its required tools check (e.g., add a
command -v jq or include "jq" in the required tools list) and ensure the script
exits with a clear error if jq is not found so ralph_preflight will fail fast
with a helpful message.
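A fail-fast dependency check of the kind requested here could be sketched as follows. The real `tools/ralph-preflight.sh` is not shown in this conversation, so the signature below (tool names passed as arguments) is an assumption, not the script's actual interface.

```shell
#!/usr/bin/env bash
# Report every missing tool, then return non-zero if any were missing.
ralph_preflight() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" > /dev/null 2>&1; then
            echo "Error: ${tool} is required but not installed" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A command would then call something like `ralph_preflight gh claude jq || exit 1` before doing any work, so a missing `jq` is reported up front instead of surfacing later as "command not found".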
Pull request overview
Copilot reviewed 5 out of 6 changed files in this pull request and generated 13 comments.
forstmeier left a comment
Looks good. 👍
Bots have a few points of feedback to address but after that we can just cut additional stories based on anything they find and merge this.
Summary
- `mask ralph spec [issue]` command for interactive spec building via Claude conversation
- `mask ralph loop <issue>` command for autonomous implementation with pre-flight checks, label management, and PR creation
- `mask ralph backlog` command for reviewing open issues for duplicates, overlaps, and staleness
- `mask ralph pr [--pr <number>]` command for processing PR review feedback interactively
- `in-progress`, `needs-attention`, and `backlog-review` labels for workflow tracking

Test plan
- `gh label list | grep -E "in-progress|needs-attention|backlog-review"`
- `mask ralph --help`
- `mask ralph spec` creates new issue with template
- `mask ralph spec <issue>` starts interactive session
- `mask ralph backlog` creates tracking issue and posts report
- `mask ralph pr` on a PR with review comments

Closes #723
Closes #727
Closes #729
🤖 Generated with Claude Code
Summary by CodeRabbit
Documentation
New Features
Configuration