diff --git a/claude.md b/claude.md new file mode 100644 index 000000000..adc65bc41 --- /dev/null +++ b/claude.md @@ -0,0 +1,354 @@ +# Superpowers - Skills Library for AI Coding Assistants + +## Project Overview + +Superpowers is a comprehensive skills library that provides proven techniques, patterns, and workflows for AI coding assistants (primarily Claude Code). It implements a mandatory skills system that enforces best practices through structured workflows. + +**Core Philosophy:** +- Test-Driven Development (write tests first, always) +- Systematic over ad-hoc (process over guessing) +- Complexity reduction (simplicity as primary goal) +- Evidence over claims (verify before declaring success) +- Domain over implementation (work at problem level, not solution level) + +## Architecture + +### Directory Structure + +``` +superpowers/ +├── skills/ # Core skills library (20+ skills) +│ ├── testing/ # TDD, async testing, anti-patterns +│ ├── debugging/ # Systematic debugging, root cause tracing +│ ├── collaboration/ # Brainstorming, planning, code review +│ ├── meta/ # Creating and sharing skills +│ └── commands/ # Slash command definitions +├── commands/ # Top-level slash commands (thin wrappers) +├── lib/ # Initialization scripts +│ └── initialize-skills.sh # Git-based skills management +├── hooks/ # Session lifecycle hooks +└── .claude-plugin/ # Plugin manifest (if present) +``` + +### Skills System + +**Skill Structure:** +Each skill is a `SKILL.md` file with YAML frontmatter: + +```yaml +--- +name: skill-name +description: When to use this skill - triggers automatic activation +--- +# Skill content in markdown +``` + +**Key Skills Categories:** + +1. **Testing Skills** (`skills/testing/`) + - `test-driven-development` - RED-GREEN-REFACTOR cycle (mandatory) + - `condition-based-waiting` - Async test patterns + - `testing-anti-patterns` - Common pitfalls to avoid + +2. **Debugging Skills** (`skills/debugging/`) + - `systematic-debugging` - 4-phase root cause process + - `root-cause-tracing` - Backward tracing technique + - `verification-before-completion` - Ensure fixes work + - `defense-in-depth` - Multiple validation layers + +3. **Collaboration Skills** (`skills/collaboration/`) + - `brainstorming` - Socratic design refinement (before coding) + - `writing-plans` - Detailed implementation plans + - `executing-plans` - Batch execution with checkpoints + - `dispatching-parallel-agents` - Concurrent subagent workflows + - `requesting-code-review` - Pre-review checklist + - `receiving-code-review` - Responding to feedback + - `using-git-worktrees` - Parallel development branches + - `finishing-a-development-branch` - Merge/PR decision workflow + - `subagent-driven-development` - Fast iteration with quality gates + +4. **Meta Skills** (`skills/meta/`) + - `using-superpowers` - Introduction and mandatory protocols + - `writing-skills` - Create new skills following best practices + - `sharing-skills` - Contribute skills back via branch and PR + - `testing-skills-with-subagents` - Validate skill quality + +### Activation System + +**Automatic Discovery:** +Skills activate automatically based on their `description` field matching the current task context. The AI assistant checks for relevant skills before every task. + +**Mandatory Protocol (from `using-superpowers`):** +Before ANY user message response: +1. List available skills mentally +2. Ask: "Does ANY skill match this request?" +3. If yes → Use Skill tool to read and run the skill +4. Announce which skill you're using +5. 
Follow the skill exactly + +**Critical: If a skill exists for a task, using it is MANDATORY, not optional.** + +## Key Design Patterns + +### Test-Driven Development (TDD) + +**Iron Law:** NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST + +**RED-GREEN-REFACTOR Cycle:** +1. **RED** - Write one minimal test showing expected behavior +2. **Verify RED** - Run test, watch it fail correctly +3. **GREEN** - Write simplest code to pass the test +4. **Verify GREEN** - Run test, watch it pass +5. **REFACTOR** - Clean up while staying green +6. **Repeat** - Next test for next feature + +**Non-Negotiable Rules:** +- Write code before test? Delete it. Start over. +- Test passes immediately? It's testing existing behavior, not the new feature. +- Can't watch test fail? You don't know if it tests the right thing. + +### Systematic Debugging + +**Iron Law:** NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST + +**Four Phases (must complete each before next):** + +1. **Root Cause Investigation** + - Read error messages carefully + - Reproduce consistently + - Check recent changes + - Gather evidence in multi-component systems + - Trace data flow (use `root-cause-tracing` sub-skill) + +2. **Pattern Analysis** + - Find working examples + - Compare against references (read completely) + - Identify differences + - Understand dependencies + +3. **Hypothesis and Testing** + - Form single hypothesis + - Test minimally (one variable at a time) + - Verify before continuing + +4. **Implementation** + - Create failing test case (use TDD) + - Implement single fix + - Verify fix works + - **If 3+ fixes failed:** STOP and question architecture + +### Brainstorming + +**Use before writing code or implementation plans** + +**Process:** +1. Understand current project context +2. Ask questions one at a time (prefer multiple choice) +3. Explore 2-3 approaches with trade-offs +4. Present design in 200-300 word sections +5. Validate incrementally +6. Write design to `docs/plans/YYYY-MM-DD--design.md` +7. If continuing to implementation: + - Use `using-git-worktrees` for isolated workspace + - Use `writing-plans` for detailed plan + +**Key Principles:** +- One question at a time +- YAGNI ruthlessly +- Explore alternatives +- Incremental validation + +## Integration Points + +### Slash Commands + +Thin wrappers that activate corresponding skills: + +- `/superpowers:brainstorm` → `brainstorming` skill +- `/superpowers:write-plan` → `writing-plans` skill +- `/superpowers:execute-plan` → `executing-plans` skill + +Located in `commands/` directory as markdown files with YAML frontmatter. + +### SessionStart Hook + +Loads the `using-superpowers` skill at session start, establishing mandatory workflows for the entire session. + +### Git Worktrees + +The `using-git-worktrees` skill enables parallel development: +- Create isolated workspaces for features +- Switch between multiple branches without stashing +- Ideal for working on multiple features simultaneously + +### TodoWrite Integration + +**Critical Rule:** If a skill has a checklist, create TodoWrite todos for EACH item. + +Benefits: +- Prevents skipped steps +- Tracks progress +- Ensures accountability +- Tiny overhead vs. 
cost of missing steps + +## Implementation Notes + +### Initialization Script (`lib/initialize-skills.sh`) + +**Purpose:** Git-based skills repository management + +**Behavior:** +- Checks for skills directory at `~/.config/superpowers/skills` +- If exists and is git repo: + - Fetches from tracking remote + - Attempts fast-forward merge if behind + - Reports update status +- If doesn't exist: + - Clones from `https://github.com/obra/superpowers-skills.git` + - Sets up upstream remote + - Offers to fork if `gh` CLI available + +**Migration Handling:** +- Backs up old installation to `.bak` directories +- Preserves user's custom skills + +### Skills Discovery Mechanism + +Skills are discovered through: +1. File system scanning (`skills/*/SKILL.md`) +2. YAML frontmatter parsing +3. Description field matching against current task context +4. Automatic activation when description matches + +### Mandatory Workflow Enforcement + +The `using-superpowers` skill establishes a strict protocol: + +**Common Rationalizations (that mean you're about to fail):** +- "This is just a simple question" +- "I can check git/files quickly" +- "Let me gather information first" +- "This doesn't need a formal skill" +- "I remember this skill" +- "This doesn't count as a task" +- "The skill is overkill for this" +- "I'll just do this one thing first" + +**Reality:** If a skill exists for a task, using it is MANDATORY. + +## Code Quality Standards + +### From TDD Skill + +**Good Test Characteristics:** +- Minimal (one thing, "and" in name = split it) +- Clear name describing behavior +- Shows intent (demonstrates desired API) +- Uses real code (mocks only if unavoidable) +- Tests behavior, not implementation + +**Test-First Benefits:** +- Finds bugs before commit (faster than debugging after) +- Prevents regressions +- Documents behavior +- Enables fearless refactoring + +### From Systematic Debugging Skill + +**Red Flags (STOP and follow process):** +- "Quick fix for now, investigate later" +- "Just try changing X and see if it works" +- "Add multiple changes, run tests" +- "Skip the test, I'll manually verify" +- "It's probably X, let me fix that" +- "I don't fully understand but this might work" +- Each fix reveals new problem (architectural issue) + +**Human Partner Signals:** +- "Is that not happening?" - You assumed without verifying +- "Will it show us...?" - You should have added evidence gathering +- "Stop guessing" - Proposing fixes without understanding +- "Ultrathink this" - Question fundamentals +- "We're stuck?" - Your approach isn't working + +## Extension and Contribution + +### Creating New Skills + +Follow the `writing-skills` skill: +1. Use YAML frontmatter with `name` and `description` +2. Write clear, actionable content +3. Include examples (Good vs Bad) +4. Specify when to use / when not to use +5. Add checklists for systematic processes +6. Test with `testing-skills-with-subagents` + +### Sharing Skills + +Follow the `sharing-skills` skill: +1. Fork the repository +2. Create a branch for your skill +3. Follow skill creation guidelines +4. Validate quality with testing skill +5. 
Submit PR to upstream + +### Plugin Installation + +**Via Plugin Marketplace:** +```bash +/plugin marketplace add obra/superpowers-marketplace +/plugin install superpowers@superpowers-marketplace +``` + +**Verify:** +```bash +/help +# Should show: +# /superpowers:brainstorm - Interactive design refinement +# /superpowers:write-plan - Create implementation plan +# /superpowers:execute-plan - Execute plan in batches +``` + +**Update:** +```bash +/plugin update superpowers +``` + +## Critical Success Factors + +1. **Mandatory Skill Usage** - If skill exists, you MUST use it +2. **TDD is Non-Negotiable** - RED-GREEN-REFACTOR always +3. **Systematic Debugging** - 4 phases, no skipping +4. **Brainstorm Before Coding** - Design first, implement second +5. **TodoWrite for Checklists** - Track every checklist item +6. **Announce Skill Usage** - Transparency with human partner +7. **Follow Exactly** - Don't adapt away the discipline + +## Common Failure Modes + +1. **Rationalization** - "This is different because..." +2. **Skipping Steps** - "I'll test after" / "Quick fix for now" +3. **False Completion** - Marking done without verification +4. **Process Adaptation** - "Spirit not ritual" / "Being pragmatic" +5. **Checklist Mental Processing** - Not using TodoWrite +6. **Guessing vs. Investigating** - Proposing fixes before root cause + +## Success Metrics + +From real-world usage: +- **Systematic debugging:** 15-30 min vs 2-3 hours of thrashing +- **First-time fix rate:** 95% vs 40% +- **New bugs introduced:** Near zero vs common +- **Test coverage:** Comprehensive vs spotty +- **Rework rate:** Minimal vs frequent + +## License + +MIT License - See LICENSE file for details + +## Support and Resources + +- **Issues:** https://github.com/obra/superpowers/issues +- **Marketplace:** https://github.com/obra/superpowers-marketplace +- **Blog Post:** https://blog.fsck.com/2025/10/09/superpowers/ +- **Total Skills:** 20+ and growing diff --git a/superpowers-desktop/README.md b/superpowers-desktop/README.md new file mode 100644 index 000000000..684ef1986 --- /dev/null +++ b/superpowers-desktop/README.md @@ -0,0 +1,285 @@ +# Superpowers for Claude Desktop + +**Proven workflows for AI coding assistants, adapted for Claude Desktop.** + +This distribution brings the Superpowers skills library to Claude Desktop users, with packages optimized for both Pro and Free subscription tiers. + +## What Is This? + +Superpowers is a comprehensive skills library providing: +- **Test-Driven Development** - RED-GREEN-REFACTOR cycle (mandatory) +- **Systematic Debugging** - 4-phase root cause process +- **Brainstorming** - Socratic design refinement +- **20+ Additional Skills** - Testing, collaboration, code review, git workflows + +Originally built as a Claude Code plugin, this distribution adapts the skills for Claude Desktop with realistic expectations about what works without the full plugin system. 
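+
+For reference, every skill in this distribution ships as a plain markdown file with YAML frontmatter, the same format the plugin uses; the conversion step adds a Desktop note right after the frontmatter (see `conversion/README.md`). A representative sketch follows (the description wording is illustrative, not quoted from the shipped file):
+
+```markdown
+---
+name: test-driven-development
+description: Use when implementing any feature or bugfix - enforces RED-GREEN-REFACTOR
+---
+
+> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. ...
+
+# Test-Driven Development
+...
+```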
+ +## Choose Your Path + +### ⭐ Pro Users ($20/month) + +**Best experience for Claude Desktop.** + +- ✅ Full 20-skill library +- ✅ Persistent project knowledge +- ✅ Custom instructions +- ✅ One-time 15-minute setup +- ⚠️ Manual skill invocation (no automatic activation) +- ⚠️ Manual checklist tracking (no TodoWrite) + +**→ [Get Started with Pro Setup](pro-setup/SETUP.md)** + +### 💡 Free Users ($0) + +**Limited but workable experience.** + +- ✅ Core 3 workflows (TDD, debugging, brainstorming) +- ✅ Quick-reference cheat sheets +- ✅ 2-minute setup per session +- ❌ Must upload files each conversation +- ❌ No persistent knowledge +- ❌ No custom instructions +- ❌ No enforcement + +**→ [Get Started with Free Mode](free-mode/QUICK-START.md)** + +### 🚀 Claude Code Users (Recommended) + +**Full experience with all features.** + +If you have access to Claude Code, use the native plugin instead: + +```bash +/plugin marketplace add obra/superpowers-marketplace +/plugin install superpowers@superpowers-marketplace +``` + +See [main repository](https://github.com/obra/superpowers) for details. + +--- + +## Feature Comparison + +| Feature | Claude Code Plugin | Desktop Pro | Desktop Free | +|---------|-------------------|-------------|--------------| +| **Automatic skill activation** | ✅ Yes | ❌ No | ❌ No | +| **Persistent skills** | ✅ Yes | ✅ Yes | ❌ No | +| **Custom instructions** | ✅ Yes | ✅ Yes | ❌ No | +| **TodoWrite tracking** | ✅ Yes | ⚠️ Manual | ⚠️ Manual | +| **Subagent spawning** | ✅ Yes | ❌ No | ❌ No | +| **Setup time** | 5 min | 15 min | 2 min/session | +| **Monthly cost** | $0* | $20 | $0 | +| **Skills available** | 20+ full | 20+ full | 3 core | +| **Context efficiency** | High | Medium | Low | +| **Enforcement** | Strong | Weak | None | + +*If already using Claude Code + +--- + +## What Works Well (All Tiers) + +**Core workflows function without automation:** + +1. **Test-Driven Development** + - RED-GREEN-REFACTOR cycle + - "Write test first" mandate + - Verification steps + +2. **Systematic Debugging** + - 4-phase process + - Root cause investigation + - No-fix-without-understanding rule + +3. **Brainstorming** + - Socratic questioning + - Alternative exploration + - Incremental design validation + +**These workflows are tool-agnostic and work great with manual invocation.** + +## What Doesn't Work + +**These features require Claude Code plugin:** + +- ❌ Automatic skill activation based on task context +- ❌ SessionStart hooks for automatic setup +- ❌ TodoWrite tool for checklist tracking +- ❌ Task tool for spawning subagents +- ❌ Parallel execution workflows +- ❌ Mandatory enforcement mechanism + +**Workarounds:** +- Manual skill invocation (you remember to use them) +- Explicit checklist tracking in responses +- Sequential instead of parallel workflows + +--- + +## Philosophy + +The workflows in this library are built on: + +- **Test-Driven Development** - Write tests first, always +- **Systematic over ad-hoc** - Process over guessing +- **Complexity reduction** - Simplicity as primary goal +- **Evidence over claims** - Verify before declaring success +- **Domain over implementation** - Work at problem level + +--- + +## Quick Start (Choose Your Tier) + +### Pro Users + +1. Create new Project in Claude Desktop +2. Upload all files from `pro-setup/skills/` +3. Set custom instructions from `pro-setup/custom-instructions.txt` +4. Reference skills: "Use test-driven-development.md for this feature" + +**[Full Pro Setup Guide →](pro-setup/SETUP.md)** + +### Free Users + +1. 
Download `free-mode/core-workflows.md` +2. Upload to each new conversation +3. Say: "Follow the workflows in core-workflows.md" +4. For quick reference, use cheat sheets in `free-mode/cheat-sheets/` + +**[Full Free Quick-Start →](free-mode/QUICK-START.md)** + +--- + +## Contents + +``` +superpowers-desktop/ +├── README.md (you are here) +├── pro-setup/ +│ ├── SETUP.md - Pro setup guide +│ ├── custom-instructions.txt - Custom instructions +│ └── skills/ - Full 20-skill library +│ ├── core/ - Start here +│ ├── testing/ +│ ├── debugging/ +│ ├── collaboration/ +│ └── meta/ +├── free-mode/ +│ ├── QUICK-START.md - Free mode guide +│ ├── core-workflows.md - Condensed core skills +│ └── cheat-sheets/ - One-page references +│ ├── tdd-cheat-sheet.md +│ ├── debugging-cheat-sheet.md +│ └── brainstorming-cheat-sheet.md +└── conversion/ - Maintenance scripts +``` + +--- + +## Migration Paths + +### From Free → Pro + +**When to upgrade:** +- Tired of uploading files every session +- Want full skill library +- Need persistent project context + +**What you gain:** +- Persistent skills (no reuploads) +- Custom instructions (automatic reminders) +- Full 20-skill library +- Better context management + +**What you still won't have:** +- Automatic activation (still manual) +- TodoWrite tracking (still manual) +- Subagent spawning + +### From Desktop Pro → Claude Code + +**When to switch:** +- Want automatic skill activation +- Need TodoWrite tracking +- Want subagent workflows +- Want git-based auto-updates + +**What you gain:** +- Everything. Full plugin experience. + +--- + +## Maintenance & Updates + +### For Users + +**Pro users:** +- Watch for updates to this repository +- Download new skill files when available +- Re-upload to your project + +**Free users:** +- Check for updated core-workflows.md +- Download and use in new conversations + +### For Maintainers + +See `conversion/README.md` for instructions on: +- Running conversion scripts +- Updating from plugin source +- Testing both Pro and Free packages +- Versioning and releases + +--- + +## Support + +- **Issues:** [https://github.com/obra/superpowers/issues](https://github.com/obra/superpowers/issues) +- **Original Plugin:** [https://github.com/obra/superpowers](https://github.com/obra/superpowers) +- **Marketplace:** [https://github.com/obra/superpowers-marketplace](https://github.com/obra/superpowers-marketplace) + +--- + +## License + +MIT License - see [LICENSE](../LICENSE) file for details. + +--- + +## Acknowledgments + +- **Original Superpowers Plugin:** Jesse Vincent ([@obra](https://github.com/obra)) +- **Claude Desktop Adaptation:** This distribution +- **Community:** All contributors to the skills library + +--- + +## Decision Guide + +**Still not sure which path?** + +**Use Claude Code Plugin if:** +- ✅ You have access to Claude Code +- ✅ You want the best experience +- ✅ You want automatic workflows + +**Use Desktop Pro if:** +- ✅ You can't use Claude Code +- ✅ You're willing to pay $20/month +- ✅ You want full skills without reuploading +- ✅ Manual invocation is acceptable + +**Use Desktop Free if:** +- ✅ You can't afford Pro +- ✅ You only need core workflows +- ✅ 2-minute setup per session is fine +- ✅ You understand the limitations + +**Still unsure?** Start with Free mode. If you find it useful but tedious, upgrade to Pro. If you love it and want automation, switch to Claude Code. 
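+
+Whichever tier you choose, the working pattern is the same: point Claude at a skill by name and have it announce what it is following. On Pro, that reminder lives in the project's custom instructions. The shipped `pro-setup/custom-instructions.txt` is the authoritative text; what follows is only a condensed, illustrative sketch of the kind of protocol it establishes:
+
+```
+Before responding to any coding task, check project knowledge for a
+relevant skill (see index.md). If one matches:
+1. Announce it: "I'm using <skill-name>.md for this."
+2. Follow its steps exactly, without skipping or adapting.
+3. Track any checklist items explicitly in your responses.
+```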
+
+---
+
+**Ready to get started?**
+
+- **[Pro Setup Guide →](pro-setup/SETUP.md)**
+- **[Free Quick-Start →](free-mode/QUICK-START.md)**
diff --git a/superpowers-desktop/conversion/README.md b/superpowers-desktop/conversion/README.md
new file mode 100644
index 000000000..be85bf85c
--- /dev/null
+++ b/superpowers-desktop/conversion/README.md
@@ -0,0 +1,242 @@
+# Conversion Scripts
+
+This directory contains scripts for converting Superpowers Claude Code plugin skills to Claude Desktop format.
+
+## Purpose
+
+The conversion process:
+1. Strips tool-specific references (Skill, Task, TodoWrite tools)
+2. Rewrites cross-references (superpowers: format → .md files)
+3. Adds Desktop-specific notes
+4. Preserves core skill content and workflows
+
+## Scripts
+
+### convert-skill.sh
+
+Converts a single skill file.
+
+**Usage:**
+```bash
+./convert-skill.sh <input-file> <output-file>
+```
+
+**Example:**
+```bash
+./convert-skill.sh \
+    ../../skills/test-driven-development/SKILL.md \
+    ../pro-setup/skills/core/test-driven-development.md
+```
+
+**What it does:**
+- Replaces "Skill tool" → "skill reference"
+- Replaces "Task tool" → "manual task breakdown"
+- Replaces "TodoWrite" → "explicit checklist tracking"
+- Replaces "SessionStart hook" → "custom instructions"
+- Rewrites "superpowers:skill-name" → "skill-name.md (in project knowledge)"
+- Rewrites "@file.md" → "file.md (in project knowledge)"
+- Adds Desktop compatibility note after frontmatter
+
+### convert-all-skills.sh
+
+Converts all skills from the plugin source to Desktop format.
+
+**Usage:**
+```bash
+./convert-all-skills.sh
+```
+
+**What it does:**
+- Converts all 20 skills from `../../skills/` directory
+- Organizes into categories (core, testing, debugging, collaboration, meta)
+- Copies example files (TypeScript, shell scripts)
+- Outputs to `../pro-setup/skills/` directory
+
+**Output structure:**
+```
+pro-setup/skills/
+├── core/ (4 skills)
+├── testing/ (2 skills + examples)
+├── debugging/ (3 skills + scripts)
+├── collaboration/ (8 skills)
+└── meta/ (3 skills)
+```
+
+## Maintenance Workflow
+
+### When Plugin Skills Are Updated
+
+1. **Pull latest plugin changes:**
+   ```bash
+   cd /home/user/superpowers
+   git pull origin main
+   ```
+
+2. **Run conversion:**
+   ```bash
+   cd superpowers-desktop/conversion
+   ./convert-all-skills.sh
+   ```
+
+3. **Verify conversions:**
+   ```bash
+   # Check a sample file
+   head -n 20 ../pro-setup/skills/core/test-driven-development.md
+
+   # Should see:
+   # - Frontmatter with name and description
+   # - Desktop compatibility note
+   # - No tool-specific references
+   ```
+
+4. **Update free-mode if core skills changed:**
+   - Manually review changes to core 4 skills
+   - Update `../free-mode/core-workflows.md` if needed
+   - Update cheat sheets if significant changes
+
+5. **Test both packages:**
+   - Pro: Upload sample skills to test project
+   - Free: Upload core-workflows.md to conversation
+   - Verify workflows still function correctly
+
+6. 
**Commit and release:** + ```bash + git add ../pro-setup ../free-mode + git commit -m "Update skills from plugin v" + git push + ``` + +## Conversion Rules + +### Tool Reference Replacements + +| Plugin | Desktop | +|--------|---------| +| `Skill tool` | `skill reference` | +| `Task tool` | `manual task breakdown` | +| `TodoWrite` | `explicit checklist tracking` | +| `SessionStart hook` | `custom instructions` | +| `dispatch subagent` | `break down into sequential tasks` | +| `subagent` | `separate task` | + +### Cross-Reference Replacements + +| Plugin | Desktop | +|--------|---------| +| `superpowers:skill-name` | `skill-name.md (in project knowledge)` | +| `@file.md` | `file.md (in project knowledge)` | +| `**REQUIRED SUB-SKILL:** Use superpowers:foo` | `**REQUIRED:** Reference foo.md from project knowledge` | + +### Content Preserved + +- Frontmatter (name, description) +- Core skill content +- Workflows and processes +- Examples and code +- Checklists and verification steps +- Philosophy and rationale + +### Content Added + +After frontmatter: +```markdown +> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses. +``` + +## Testing Conversions + +### Automated Checks + +```bash +# Check all conversions completed +find ../pro-setup/skills -name "*.md" | wc -l +# Should be 20+ + +# Check for unconverted references (should return nothing) +grep -r "Skill tool" ../pro-setup/skills/ +grep -r "Task tool" ../pro-setup/skills/ +grep -r "superpowers:" ../pro-setup/skills/ | grep -v "Note for Claude" + +# Check Desktop notes were added +grep -r "Note for Claude Desktop" ../pro-setup/skills/ | wc -l +# Should be 20+ +``` + +### Manual Verification + +1. **Read a converted skill** - Ensure it's coherent +2. **Check cross-references** - Links should point to .md files +3. **Verify workflows intact** - Core processes preserved +4. **Test in Claude Desktop** - Upload and use in conversation + +## Version Tracking + +When updating from plugin: + +1. **Note plugin version:** + ```bash + cd /home/user/superpowers + git log -1 --oneline + # Record commit hash + ``` + +2. **Tag Desktop distribution:** + ```bash + cd superpowers-desktop + git tag -a v1.1.0 -m "Sync with plugin v3.4.0 (commit abc123)" + git push --tags + ``` + +3. **Update CHANGELOG:** + - List which skills were updated + - Note any significant workflow changes + - Document any breaking changes + +## Troubleshooting + +### "Conversion failed for skill X" + +**Check:** +- Does source file exist at expected path? +- Is source file valid markdown? +- Are there sed-incompatible characters? 
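+
+A quick way to triage that last item is to scan the source file for the usual culprits: CR line endings and stray control bytes. The commands below are an illustrative sketch (they assume GNU grep; the path is a placeholder):
+
+```bash
+# Flag CRLF line endings (common after editing on Windows)
+grep -n $'\r' path/to/SKILL.md && echo "CRLF endings found"
+
+# Flag non-printing control characters that can confuse sed's line handling
+grep -nP '[\x00-\x08\x0B\x0C\x0E-\x1F]' path/to/SKILL.md || echo "No control characters"
+```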
+ +**Fix:** +- Verify source path in convert-all-skills.sh +- Check for special regex characters in content +- Run convert-skill.sh manually to see specific error + +### "Free mode out of sync" + +**Problem:** Core skills updated in Pro but not reflected in free-mode/ + +**Solution:** +- Manually review changes to core 4 skills +- Update core-workflows.md with essential changes +- Keep condensed (don't just copy-paste full skills) +- Update cheat sheets if presentation changed + +### "Cross-references broken" + +**Problem:** Links to non-existent files + +**Solution:** +- Check conversion rules in convert-skill.sh +- Verify referenced files are actually in pro-setup/skills/ +- Update sed rules if new reference patterns introduced + +## Future Improvements + +Potential automation enhancements: + +1. **Auto-sync free-mode** - Script to generate core-workflows.md from core skills +2. **Validation tests** - Automated checks for broken references +3. **Diff tool** - Compare plugin vs desktop versions to highlight changes +4. **Version checker** - Auto-detect when plugin has updates + +## Contact + +For questions about the conversion process: +- Open issue in main repository +- Reference this README +- Include skill name and specific conversion issue diff --git a/superpowers-desktop/conversion/convert-all-skills.sh b/superpowers-desktop/conversion/convert-all-skills.sh new file mode 100755 index 000000000..80664ce07 --- /dev/null +++ b/superpowers-desktop/conversion/convert-all-skills.sh @@ -0,0 +1,124 @@ +#!/usr/bin/env bash +set -euo pipefail + +# Convert all Superpowers skills for Claude Desktop +# Organizes into categories for better project knowledge navigation + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +SKILLS_DIR="/home/user/superpowers/skills" +OUTPUT_DIR="$SCRIPT_DIR/../pro-setup/skills" + +echo "Converting Superpowers skills for Claude Desktop..." +echo "Source: $SKILLS_DIR" +echo "Output: $OUTPUT_DIR" +echo + +# Create category directories +mkdir -p "$OUTPUT_DIR"/{core,testing,debugging,collaboration,meta} + +# Core skills (highest priority) +echo "Converting core skills..." +CORE_SKILLS=( + "using-superpowers" + "test-driven-development" + "systematic-debugging" + "brainstorming" +) + +for skill in "${CORE_SKILLS[@]}"; do + if [ -f "$SKILLS_DIR/$skill/SKILL.md" ]; then + "$SCRIPT_DIR/convert-skill.sh" \ + "$SKILLS_DIR/$skill/SKILL.md" \ + "$OUTPUT_DIR/core/$skill.md" + fi +done + +# Testing skills +echo "Converting testing skills..." +TESTING_SKILLS=( + "condition-based-waiting" + "testing-anti-patterns" +) + +for skill in "${TESTING_SKILLS[@]}"; do + if [ -f "$SKILLS_DIR/$skill/SKILL.md" ]; then + "$SCRIPT_DIR/convert-skill.sh" \ + "$SKILLS_DIR/$skill/SKILL.md" \ + "$OUTPUT_DIR/testing/$skill.md" + fi +done + +# Debugging skills +echo "Converting debugging skills..." +DEBUGGING_SKILLS=( + "root-cause-tracing" + "verification-before-completion" + "defense-in-depth" +) + +for skill in "${DEBUGGING_SKILLS[@]}"; do + if [ -f "$SKILLS_DIR/$skill/SKILL.md" ]; then + "$SCRIPT_DIR/convert-skill.sh" \ + "$SKILLS_DIR/$skill/SKILL.md" \ + "$OUTPUT_DIR/debugging/$skill.md" + fi +done + +# Collaboration skills +echo "Converting collaboration skills..." 
+COLLABORATION_SKILLS=(
+    "writing-plans"
+    "executing-plans"
+    "dispatching-parallel-agents"
+    "requesting-code-review"
+    "receiving-code-review"
+    "using-git-worktrees"
+    "finishing-a-development-branch"
+    "subagent-driven-development"
+)
+
+for skill in "${COLLABORATION_SKILLS[@]}"; do
+    if [ -f "$SKILLS_DIR/$skill/SKILL.md" ]; then
+        "$SCRIPT_DIR/convert-skill.sh" \
+            "$SKILLS_DIR/$skill/SKILL.md" \
+            "$OUTPUT_DIR/collaboration/$skill.md"
+    fi
+done

+# Meta skills
+echo "Converting meta skills..."
+META_SKILLS=(
+    "writing-skills"
+    "sharing-skills"
+    "testing-skills-with-subagents"
+)
+
+for skill in "${META_SKILLS[@]}"; do
+    if [ -f "$SKILLS_DIR/$skill/SKILL.md" ]; then
+        "$SCRIPT_DIR/convert-skill.sh" \
+            "$SKILLS_DIR/$skill/SKILL.md" \
+            "$OUTPUT_DIR/meta/$skill.md"
+    fi
+done
+
+# Copy example files if they exist
+echo "Copying example files..."
+if [ -f "$SKILLS_DIR/condition-based-waiting/example.ts" ]; then
+    cp "$SKILLS_DIR/condition-based-waiting/example.ts" \
+        "$OUTPUT_DIR/testing/condition-based-waiting-example.ts"
+fi
+
+if [ -f "$SKILLS_DIR/root-cause-tracing/find-polluter.sh" ]; then
+    cp "$SKILLS_DIR/root-cause-tracing/find-polluter.sh" \
+        "$OUTPUT_DIR/debugging/find-polluter.sh"
+fi
+
+echo
+echo "✓ Conversion complete!"
+echo "  Core skills: ${#CORE_SKILLS[@]}"
+echo "  Testing skills: ${#TESTING_SKILLS[@]}"
+echo "  Debugging skills: ${#DEBUGGING_SKILLS[@]}"
+echo "  Collaboration skills: ${#COLLABORATION_SKILLS[@]}"
+echo "  Meta skills: ${#META_SKILLS[@]}"
+echo
+echo "Skills ready in: $OUTPUT_DIR"
diff --git a/superpowers-desktop/conversion/convert-skill.sh b/superpowers-desktop/conversion/convert-skill.sh
new file mode 100755
index 000000000..cac1abe92
--- /dev/null
+++ b/superpowers-desktop/conversion/convert-skill.sh
@@ -0,0 +1,63 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+# Convert a Superpowers skill for Claude Desktop use
+# Usage: ./convert-skill.sh <input-file> <output-file>
+
+if [ $# -ne 2 ]; then
+    echo "Usage: $0 <input-file> <output-file>"
+    exit 1
+fi
+
+INPUT="$1"
+OUTPUT="$2"
+
+if [ ! -f "$INPUT" ]; then
+    echo "Error: Input file '$INPUT' not found"
+    exit 1
+fi
+
+# Create output directory if needed
+mkdir -p "$(dirname "$OUTPUT")"
+
+# Process the file with sed
+sed '
+# Strip Skill tool references
+s/Use the Skill tool to read and run the skill/Reference the skill from project knowledge/g
+s/Skill tool/skill reference/g
+
+# Strip Task tool references
+s/Task tool/manual task breakdown/g
+s/dispatch.*subagent/break down into sequential tasks/g
+# Note: this general rule also rewrites skill names containing "subagent"
+# (e.g. testing-skills-with-subagents) - review converted cross-references
+s/subagent/separate task/g
+
+# Strip TodoWrite references
+# (most specific patterns first, or the general rule below would consume them)
+s/YOU MUST create TodoWrite todos/you must track checklist items explicitly/g
+s/create TodoWrite todos/track checklist items explicitly in your responses/g
+s/TodoWrite/explicit checklist tracking/g
+
+# Strip SessionStart hook references
+s/SessionStart Hook/custom instructions (Pro) or manual reminder (Free)/g
+s/SessionStart hook/custom instructions/g
+
+# Rewrite REQUIRED SUB-SKILL references
+# (must run before the general superpowers: rule below, which would otherwise match first)
+s/\*\*REQUIRED SUB-SKILL:\*\* Use superpowers:\([a-z-]*\)/**REQUIRED:** Reference \1.md from project knowledge/g
+
+# Rewrite cross-references from superpowers: format
+s/superpowers:\([a-z-]*\)/\1.md (in project knowledge)/g
+
+# Rewrite @ file references (but keep @graphviz-conventions as example)
+s/@\([a-z-]*\)\.md/\1.md (in project knowledge)/g
+
+# Add desktop-specific notes after frontmatter
+/^---$/ {
+    N
+    /^---\n$/ {
+        a\
+\
+> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. 
Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses. + } +} +' "$INPUT" > "$OUTPUT" + +echo "✓ Converted: $INPUT → $OUTPUT" diff --git a/superpowers-desktop/free-mode/QUICK-START.md b/superpowers-desktop/free-mode/QUICK-START.md new file mode 100644 index 000000000..36d584980 --- /dev/null +++ b/superpowers-desktop/free-mode/QUICK-START.md @@ -0,0 +1,366 @@ +# Superpowers Free Mode Quick-Start + +**Time Required:** 2 minutes per conversation + +**Cost:** $0 (Claude Desktop free tier) + +--- + +## What You Get + +**Free mode provides:** +- ✅ Core 3 workflows (TDD, debugging, brainstorming) +- ✅ One-page cheat sheets for quick reference +- ✅ Essential patterns without automation + +**What you don't get:** +- ❌ Persistent skills (must upload each session) +- ❌ Custom instructions +- ❌ Full 20-skill library +- ❌ Automatic activation + +**Good enough for:** Learning the workflows, occasional use, evaluation before upgrading + +--- + +## Quick Start (Do This Every Conversation) + +### Method 1: Full Workflows (Recommended) + +1. **Download** `core-workflows.md` from this directory +2. **Start new conversation** in Claude Desktop +3. **Upload file** using the paperclip icon +4. **Say:** "Follow the core workflows in core-workflows.md" + +**What's included:** +- Test-Driven Development (TDD) +- Systematic Debugging +- Brainstorming +- Quick reference for all three + +### Method 2: Cheat Sheets (Quick Reference) + +For quick tasks, use single-page cheat sheets: + +1. **Download** the relevant cheat sheet: + - `cheat-sheets/tdd-cheat-sheet.md` - For features/bugfixes + - `cheat-sheets/debugging-cheat-sheet.md` - For bugs/failures + - `cheat-sheets/brainstorming-cheat-sheet.md` - For design +2. **Upload to conversation** +3. **Say:** "Use TDD from tdd-cheat-sheet.md" + +**Benefit:** Smaller files, faster upload, focused on one workflow + +--- + +## Using the Workflows + +### For Implementing Features + +``` +You: [Upload core-workflows.md] + "Use TDD from core-workflows.md to implement user validation" + +Claude: "I'll follow the RED-GREEN-REFACTOR cycle. + + RED: Writing failing test first..." +``` + +**Workflow:** Write test → Watch fail → Implement → Watch pass → Refactor + +### For Debugging Issues + +``` +You: [Upload core-workflows.md] + "Use systematic debugging from core-workflows.md for this error" + +Claude: "I'll follow the 4-phase process. + + Phase 1: Root Cause Investigation + Let me read the error carefully..." +``` + +**Workflow:** Root Cause → Pattern → Hypothesis → Fix + +### For Designing Features + +``` +You: [Upload core-workflows.md] + "Use brainstorming from core-workflows.md to design the login feature" + +Claude: "I'll use the socratic method to refine your idea. + + First, let me check the current project state..." +``` + +**Workflow:** Understand → Explore → Present → Document + +--- + +## Tips for Success + +### 1. Upload at Start of Every Conversation + +Free tier = no persistent knowledge. Upload fresh each time. + +### 2. Be Explicit About Which Workflow + +Don't just upload and hope: +- ❌ "Here's a file" (ambiguous) +- ✅ "Use TDD from core-workflows.md" (explicit) + +### 3. Track Checklists Yourself + +No automatic tracking. Stay organized: + +``` +**My TDD Checklist:** +- [x] Write failing test for valid input +- [x] Run test, saw it fail +- [x] Implement validation +- [ ] Run test, verify pass +- [ ] Write test for invalid input +``` + +Update as you go. + +### 4. 
Refer Back to the Document + +During the conversation: +``` +"What does systematic debugging say about Phase 2?" +"Check the TDD red flags in core-workflows.md" +``` + +### 5. Keep Files Handy + +**Create a workflow:** +1. Keep workflow files in an easy-to-access folder +2. Drag and drop into conversations +3. Becomes quick muscle memory + +--- + +## Workflow Reference + +### TDD (Test-Driven Development) + +**The Iron Law:** NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST + +**Cycle:** +1. RED - Write test +2. Verify RED - Watch fail (mandatory) +3. GREEN - Implement minimal code +4. Verify GREEN - Watch pass (mandatory) +5. REFACTOR - Clean up +6. Repeat + +**Red Flag:** Code before test? Delete and start over. + +### Systematic Debugging + +**The Iron Law:** NO FIXES WITHOUT ROOT CAUSE FIRST + +**Phases:** +1. Root Cause - Investigate, reproduce, gather evidence +2. Pattern - Find working examples, compare +3. Hypothesis - Form theory, test minimally +4. Fix - Create test, implement, verify + +**Red Flag:** "Quick fix for now"? Stop, return to Phase 1. + +### Brainstorming + +**Purpose:** Design first, implement second + +**Process:** +1. Understand - Ask questions one at a time +2. Explore - Propose 2-3 approaches +3. Present - 200-300 words, validate incrementally +4. Document - Write to design doc + +**Principle:** YAGNI ruthlessly + +--- + +## Common Questions + +### "Do I need to upload every time?" + +**Yes.** Free tier has no persistent project knowledge. + +**Workaround:** Keep files easily accessible, takes 5 seconds. + +### "Can I use part of a workflow?" + +**Not recommended.** Workflows are designed as complete systems. Cherry-picking steps loses the benefit. + +**Example:** TDD without watching tests fail? You don't know if they test the right thing. + +### "What if Claude doesn't follow the workflow?" + +**Remind explicitly:** +``` +"According to core-workflows.md, what should we do first?" +"Check the TDD checklist in core-workflows.md" +"Core-workflows.md says no fixes without root cause - let's investigate" +``` + +### "Can I modify the workflows?" + +**You can, but:** +- Workflows are proven and tested +- Modifications often rationalize away the discipline +- If it feels "too rigid," that's usually the point + +**Better:** Use as-is for a few weeks, then evaluate. + +--- + +## Upgrading to Pro + +### When to Consider Pro + +**Upgrade if:** +- ✅ Uploading every conversation feels tedious +- ✅ You want the full 20-skill library +- ✅ You need custom instructions +- ✅ You use these workflows regularly + +**What you gain:** +- Persistent skills (no more uploads!) +- Custom instructions (automatic reminders) +- Full skills library (testing, collaboration, meta) +- Better context management + +**What you still won't have:** +- Automatic activation (still manual, but easier) +- TodoWrite tracking (still manual) +- Subagent spawning + +### Migration Path + +1. Decide Pro is worth $20/month for you +2. Upgrade your Claude Desktop subscription +3. Follow [Pro Setup Guide](../pro-setup/SETUP.md) +4. 
Keep using the same workflows, just more conveniently + +--- + +## Cheat Sheet Comparison + +| Sheet | Use When | Size | Content | +|-------|----------|------|---------| +| **tdd-cheat-sheet.md** | Implementing features/bugs | 1 page | RED-GREEN-REFACTOR reference | +| **debugging-cheat-sheet.md** | Debugging issues | 1 page | 4-phase process reference | +| **brainstorming-cheat-sheet.md** | Designing features | 1 page | Socratic method reference | +| **core-workflows.md** | General work | 3-4 pages | All three workflows in detail | + +**Strategy:** +- **Daily work:** Upload core-workflows.md once, use all session +- **Quick tasks:** Upload specific cheat sheet for focused reference +- **Learning:** Read full core-workflows.md to understand principles + +--- + +## Example Session + +``` +[New conversation] + +You: [Uploads core-workflows.md] + "I need to add email validation to my signup form. + Use core-workflows.md for this." + +Claude: "I see you want to add email validation. + According to core-workflows.md, I should use: + 1. Brainstorming (if design needed) + 2. TDD (for implementation) + + Is the design clear, or should we brainstorm first?" + +You: "Design is clear - just validate email format" + +Claude: "Great! I'll use TDD from core-workflows.md. + + Phase: RED - Write failing test + + test('rejects invalid email format', () => { + expect(validateEmail('notanemail')).toBe(false); + }); + + Let me run this to verify it fails..." + +[Conversation continues through TDD cycle] +``` + +--- + +## Troubleshooting + +### Claude Isn't Following the Workflow + +**Try:** +1. More explicit reference: "Section 1 of core-workflows.md says..." +2. Quote directly: "The TDD Iron Law is: NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST" +3. Checklist enforcement: "What's the next step in the RED-GREEN-REFACTOR cycle?" + +### File Upload Issues + +**Solutions:** +- Ensure file is `.md` format +- Check file isn't corrupted +- Try re-downloading from distribution +- Verify file size is reasonable (<100KB) + +### Workflow Feels Too Rigid + +**Remember:** +- Rigidity is the point (prevents rationalization) +- "Too rigid" usually means "preventing me from skipping steps" +- Try it exactly as written for 2-3 weeks before modifying +- Shortcuts lose the benefits + +--- + +## Next Steps + +1. **Download** `core-workflows.md` (or cheat sheets) +2. **Try it** on a small task +3. **Follow exactly** - Don't skip steps +4. **Evaluate** - Does this help? +5. **Consider Pro** - If you like it but uploading is tedious + +--- + +## Getting Help + +- **Workflow questions:** Read core-workflows.md carefully +- **Technical issues:** [GitHub Issues](https://github.com/obra/superpowers/issues) +- **Want full experience:** Try [Claude Code plugin](https://github.com/obra/superpowers) + +--- + +## Files in This Directory + +``` +free-mode/ +├── QUICK-START.md (you are here) +├── core-workflows.md (upload this for full workflows) +└── cheat-sheets/ + ├── tdd-cheat-sheet.md (1-page TDD reference) + ├── debugging-cheat-sheet.md (1-page debugging reference) + └── brainstorming-cheat-sheet.md (1-page design reference) +``` + +--- + +**Ready to start?** + +1. Download `core-workflows.md` +2. Start a new Claude conversation +3. Upload the file +4. Say: "Follow the workflows in core-workflows.md" +5. Start building! + +**Good luck! 
🚀** diff --git a/superpowers-desktop/free-mode/cheat-sheets/brainstorming-cheat-sheet.md b/superpowers-desktop/free-mode/cheat-sheets/brainstorming-cheat-sheet.md new file mode 100644 index 000000000..aeaf71f2f --- /dev/null +++ b/superpowers-desktop/free-mode/cheat-sheets/brainstorming-cheat-sheet.md @@ -0,0 +1,101 @@ +# Brainstorming Cheat Sheet + +## Purpose +Turn rough ideas into fully-formed designs through collaborative questioning. + +## The Process + +``` +Understand → Explore → Present → Document +Ask questions 2-3 approaches 200-300 words Design doc +``` + +## 1. Understanding + +- **Check context** - Files, docs, recent commits +- **Ask questions** - ONE at a time +- **Prefer multiple choice** - Easier to answer +- **Focus on** - Purpose, constraints, success criteria + +## 2. Exploring + +- **Propose 2-3 approaches** - With trade-offs +- **Lead with recommendation** - And explain why +- **YAGNI ruthlessly** - Remove unnecessary features + +## 3. Presenting + +- **Break into sections** - 200-300 words each +- **Validate incrementally** - "Does this look right so far?" +- **Cover all aspects** - Architecture, components, data flow, errors, testing +- **Be flexible** - Go back and clarify if needed + +## 4. Documenting + +- **Write design** - `docs/plans/YYYY-MM-DD--design.md` +- **Commit to git** +- **If continuing** - Create implementation plan + +## Key Principles + +✅ **One question at a time** - Don't overwhelm +✅ **YAGNI ruthlessly** - Simplicity first +✅ **Explore alternatives** - 2-3 before settling +✅ **Incremental validation** - Check after each section +✅ **Be flexible** - Clarify when needed + +## Good Questions + +**Multiple choice:** +> "Should we store this in memory or database? +> Memory is faster but doesn't persist. +> Database persists but adds latency." + +**Open-ended:** +> "What's the most important constraint for this feature?" + +**One at a time:** +> ❌ "How should we handle auth, data validation, and errors?" +> ✅ "How should we handle authentication?" + +## Design Section Example + +**Section 1: Architecture** (250 words) +```markdown +## Architecture + +We'll use a three-tier architecture: + +**API Layer**: Express.js REST endpoints handle HTTP requests. +Validates input, calls service layer, returns responses. + +**Service Layer**: Business logic and orchestration. +Coordinates between repositories, enforces business rules. + +**Data Layer**: PostgreSQL with Prisma ORM. +Handles persistence and queries. + +**Why this approach**: Separates concerns, testable layers, +familiar stack. Alternative considered: GraphQL + MongoDB, +but REST + PostgreSQL better matches team expertise. + +Does this architecture look right so far? +``` + +## Remember + +> "Design first, implement second. 
+> Never skip brainstorming to 'save time.'" + +## Use Before + +- Writing code for new features +- Major refactoring +- Architectural decisions +- Complex implementations + +## Don't Use For + +- Clear mechanical changes +- Bug fixes (use debugging instead) +- Trivial updates diff --git a/superpowers-desktop/free-mode/cheat-sheets/debugging-cheat-sheet.md b/superpowers-desktop/free-mode/cheat-sheets/debugging-cheat-sheet.md new file mode 100644 index 000000000..748362d41 --- /dev/null +++ b/superpowers-desktop/free-mode/cheat-sheets/debugging-cheat-sheet.md @@ -0,0 +1,77 @@ +# Systematic Debugging Cheat Sheet + +## The Iron Law +**NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST** + +## 4-Phase Process + +``` +Phase 1 → Phase 2 → Phase 3 → Phase 4 +Root Cause Pattern Hypothesis Implementation +Investigate Analyze Test Fix +``` + +## Phase 1: Root Cause + +1. **Read errors** - Complete stack traces, line numbers +2. **Reproduce** - Reliable steps to trigger +3. **Check changes** - Git diff, recent commits +4. **Gather evidence** - Log at each boundary (multi-component) +5. **Trace data** - Where does bad value originate? Fix source. + +## Phase 2: Pattern + +1. **Find working** - Similar code that works +2. **Compare** - Read reference completely, don't skim +3. **Differences** - List every difference +4. **Dependencies** - What does it need? + +## Phase 3: Hypothesis + +1. **Form hypothesis** - "I think X because Y" (write it down) +2. **Test minimally** - Smallest change, one variable +3. **Verify** - Worked? Phase 4. Didn't? New hypothesis. + +## Phase 4: Fix + +1. **Failing test** - Create reproduction (use TDD) +2. **Single fix** - Address root cause only +3. **Verify** - Test passes, nothing else breaks +4. **If 3+ fixes failed** - STOP. Question architecture. + +## Red Flags = STOP, PHASE 1 + +- ❌ "Quick fix for now" +- ❌ "Just try X" +- ❌ "Multiple changes" +- ❌ "Skip the test" +- ❌ "It's probably X" +- ❌ "One more fix" (after 2+) + +**All mean: Return to Phase 1.** + +## Common Mistakes + +| Mistake | Reality | +|---------|---------| +| "Simple issue" | Process is fast | +| "No time" | Systematic > thrashing | +| "Just try first" | Sets bad pattern | +| "Test after" | Untested fixes don't stick | + +## Example Flow + +``` +1. Read error: "Cannot read property 'x' of undefined" +2. Trace: data → transform → render (breaks in render) +3. Find: transform returns null when empty +4. Hypothesis: "Missing null check in transform" +5. Test: Write failing test for empty input +6. Fix: Add null check in transform +7. Verify: Test passes, bug resolved +``` + +## Remember + +> "Symptom fixes waste time and create new bugs. +> Always find root cause first." diff --git a/superpowers-desktop/free-mode/cheat-sheets/tdd-cheat-sheet.md b/superpowers-desktop/free-mode/cheat-sheets/tdd-cheat-sheet.md new file mode 100644 index 000000000..27f8e8d88 --- /dev/null +++ b/superpowers-desktop/free-mode/cheat-sheets/tdd-cheat-sheet.md @@ -0,0 +1,68 @@ +# TDD Cheat Sheet + +## The Iron Law +**NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST** + +## RED-GREEN-REFACTOR + +``` +RED → Verify RED → GREEN → Verify GREEN → REFACTOR +Write test Watch fail Write min code Watch pass Clean up +``` + +## The Cycle + +1. **RED**: Write one minimal test (1 behavior, clear name, real code) +2. **Verify RED**: Run test, confirm it fails correctly **(MANDATORY)** +3. **GREEN**: Write simplest code to pass (no extras, no YAGNI) +4. **Verify GREEN**: Run test, confirm it passes **(MANDATORY)** +5. 
**REFACTOR**: Remove duplication, improve names, stay green
+6. **Repeat**: Next test for next feature
+
+## Red Flags = DELETE CODE
+
+- ❌ Code before test
+- ❌ Test passes immediately
+- ❌ "I'll test after"
+- ❌ "Already manually tested"
+- ❌ "Keep as reference"
+- ❌ "Too simple to test"
+
+**All mean: Delete code. Start over with TDD.**
+
+## Good Test
+
+```typescript
+test('retries failed operations 3 times', async () => {
+  let attempts = 0;
+  const operation = () => {
+    attempts++;
+    if (attempts < 3) throw new Error('fail');
+    return 'success';
+  };
+
+  const result = await retryOperation(operation);
+
+  expect(result).toBe('success');
+  expect(attempts).toBe(3);
+});
+```
+
+✅ Clear name
+✅ One behavior
+✅ Real code
+
+## Completion Checklist
+
+- [ ] Every function has a test
+- [ ] Watched each test fail first
+- [ ] Failed for expected reason
+- [ ] Wrote minimal code to pass
+- [ ] All tests pass
+- [ ] No errors/warnings
+- [ ] Edge cases covered
+
+## Remember
+
+> "Tests written after code pass immediately.
+> Passing immediately proves nothing."
diff --git a/superpowers-desktop/free-mode/core-workflows.md b/superpowers-desktop/free-mode/core-workflows.md
new file mode 100644
index 000000000..68e025dff
--- /dev/null
+++ b/superpowers-desktop/free-mode/core-workflows.md
@@ -0,0 +1,304 @@
+# Superpowers Core Workflows (Free Mode)
+
+> **Note:** This is a condensed version for Claude Desktop free users. Pro users should use the full skills library in Projects. [Learn more](../README.md)
+
+## Quick Reference
+
+- **Implementing features/bugfixes** → Test-Driven Development (Section 1)
+- **Debugging issues** → Systematic Debugging (Section 2)
+- **Designing features** → Brainstorming (Section 3)
+
+---
+
+## 1. Test-Driven Development (TDD)
+
+### The Iron Law
+**NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST**
+
+### RED-GREEN-REFACTOR Cycle
+
+1. **RED** - Write one minimal test
+   - Test ONE behavior
+   - Clear name describing what should happen
+   - Use real code, not mocks
+
+2. **Verify RED** - Watch it fail
+   - Run test, confirm it fails correctly
+   - Failure because feature missing, not typos
+   - **MANDATORY:** Never skip this step
+
+3. **GREEN** - Write minimal code
+   - Simplest code to pass the test
+   - No extra features
+   - No "while I'm here" improvements
+
+4. **Verify GREEN** - Watch it pass
+   - Run test, confirm it passes
+   - All other tests still pass
+   - **MANDATORY:** Never skip this step
+
+5. **REFACTOR** - Clean up
+   - Remove duplication
+   - Improve names
+   - Keep tests green
+
+6. **Repeat** - Next test for next feature
+
+### Red Flags (DELETE CODE AND START OVER)
+
+- Code before test
+- Test passes immediately
+- "I'll test after"
+- "Already manually tested"
+- "Tests after achieve same goals"
+- "Keep as reference"
+- "This is simple"
+
+### Example
+
+```typescript
+// RED: Write failing test
+test('retries failed operations 3 times', async () => {
+  let attempts = 0;
+  const operation = () => {
+    attempts++;
+    if (attempts < 3) throw new Error('fail');
+    return 'success';
+  };
+
+  const result = await retryOperation(operation);
+
+  expect(result).toBe('success');
+  expect(attempts).toBe(3);
+});
+
+// Verify RED: Run test → FAIL
+
+// GREEN: Write minimal implementation
+async function retryOperation<T>(fn: () => Promise<T>): Promise<T> {
+  for (let i = 0; i < 3; i++) {
+    try {
+      return await fn();
+    } catch (e) {
+      if (i === 2) throw e;
+    }
+  }
+  throw new Error('unreachable');
+}
+
+// Verify GREEN: Run test → PASS
+```
+
+---
+
+## 2. 
Systematic Debugging + +### The Iron Law +**NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST** + +### 4-Phase Process (Complete Each Before Next) + +#### Phase 1: Root Cause Investigation + +1. **Read Error Messages Carefully** + - Don't skip past errors + - Read complete stack traces + - Note line numbers, file paths + +2. **Reproduce Consistently** + - Can you trigger it reliably? + - Exact steps to reproduce? + - If not reproducible → gather more data + +3. **Check Recent Changes** + - What changed that could cause this? + - Git diff, recent commits + - New dependencies, config changes + +4. **Gather Evidence** (Multi-component systems) + - Add logging at each component boundary + - Log what enters, what exits + - Run once to see WHERE it breaks + +5. **Trace Data Flow** + - Where does bad value originate? + - What called this with bad value? + - Trace backward to source + - Fix at source, not symptom + +#### Phase 2: Pattern Analysis + +1. **Find Working Examples** + - Locate similar working code + - What works that's similar? + +2. **Compare Against References** + - Read reference implementation COMPLETELY + - Don't skim - understand fully + +3. **Identify Differences** + - What's different between working and broken? + - List every difference + +4. **Understand Dependencies** + - What does this need? + - Settings, config, environment? + +#### Phase 3: Hypothesis and Testing + +1. **Form Single Hypothesis** + - "I think X is the root cause because Y" + - Be specific, write it down + +2. **Test Minimally** + - SMALLEST possible change + - One variable at a time + +3. **Verify Before Continuing** + - Worked? → Phase 4 + - Didn't work? → New hypothesis + - Don't add more fixes on top + +#### Phase 4: Implementation + +1. **Create Failing Test Case** + - Simplest possible reproduction + - MUST have before fixing + - Use TDD cycle above + +2. **Implement Single Fix** + - Address root cause + - ONE change at a time + +3. **Verify Fix** + - Test passes? + - No other tests broken? + +4. **If 3+ Fixes Failed** + - STOP + - Question the architecture + - Discuss with team before continuing + +### Red Flags (STOP, RETURN TO PHASE 1) + +- "Quick fix for now" +- "Just try changing X" +- "Add multiple changes" +- "Skip the test" +- "It's probably X" +- "I don't fully understand" +- "One more fix" (after 2+ failures) + +--- + +## 3. Brainstorming + +### Overview +Turn rough ideas into fully-formed designs through collaborative questioning. + +### Process + +**Understanding the Idea:** +1. Check current project state (files, docs, commits) +2. Ask questions ONE at a time +3. Prefer multiple choice when possible +4. Focus on: purpose, constraints, success criteria + +**Exploring Approaches:** +1. Propose 2-3 different approaches with trade-offs +2. Lead with your recommended option and why + +**Presenting Design:** +1. Once you understand, present the design +2. Break into sections of 200-300 words +3. Ask after each section: "Does this look right so far?" +4. Cover: architecture, components, data flow, error handling, testing + +**After Design:** +1. Write to `docs/plans/YYYY-MM-DD--design.md` +2. Commit to git +3. If continuing: Create implementation plan + +### Key Principles + +- **One question at a time** - Don't overwhelm +- **YAGNI ruthlessly** - Remove unnecessary features +- **Explore alternatives** - 2-3 approaches before settling +- **Incremental validation** - Present sections, validate each + +--- + +## Workflow Integration + +### Starting Any Task + +1. 
**Check for relevant workflow above** +2. **Announce you're using it** + - "I'm using TDD to implement this feature" + - "I'm using systematic debugging to find this bug" +3. **Follow it exactly** +4. **Track checklist items explicitly** + +### Common Violations + +| Excuse | Reality | +|--------|---------| +| "Too simple for process" | Process is fast for simple cases | +| "No time for process" | Systematic is FASTER than thrashing | +| "I'll follow spirit not letter" | Letter IS spirit - no shortcuts | +| "This is different because..." | It's not. Follow the process. | + +### Checklists + +**Before marking work complete:** +- [ ] All tests pass (ran them, saw green) +- [ ] No errors or warnings in output +- [ ] Followed TDD (wrote tests first, watched them fail) +- [ ] If debugging: Found root cause (not symptom fix) +- [ ] If feature: Design discussed first (brainstormed) +- [ ] Code is minimal (no YAGNI violations) +- [ ] Ready for review (would pass code review) + +--- + +## How to Use This Document + +**Every conversation:** +1. Upload this file to conversation +2. Say: "Follow the core workflows in core-workflows.md" +3. Reference specific sections as needed +4. Track checklist items explicitly + +**For specific tasks:** +- "Use TDD from core-workflows.md to implement X" +- "Use systematic debugging from core-workflows.md for this bug" +- "Use brainstorming from core-workflows.md to design Y" + +--- + +## Limitations (Free Mode) + +- Must upload this file each conversation +- No automatic workflow activation +- No persistent project knowledge +- Manual checklist tracking +- No enforcement mechanism + +**Want automatic activation?** [Upgrade to Pro and use the full skills library](../pro-setup/SETUP.md) + +--- + +## Summary + +**Three core workflows:** +1. **TDD**: Write test first, watch fail, implement, watch pass +2. **Systematic Debugging**: Root Cause → Pattern → Hypothesis → Fix +3. **Brainstorming**: Question → Explore → Present → Validate + +**All workflows share:** +- Mandatory steps (no skipping) +- Systematic process (no ad-hoc) +- Evidence-based (no guessing) +- Verification required (no assumptions) + +**Remember:** If you catch yourself rationalizing ("just this once", "too simple", "spirit not letter"), STOP. That means you're about to violate the workflow. diff --git a/superpowers-desktop/pro-setup/SETUP.md b/superpowers-desktop/pro-setup/SETUP.md new file mode 100644 index 000000000..d9c9420dc --- /dev/null +++ b/superpowers-desktop/pro-setup/SETUP.md @@ -0,0 +1,334 @@ +# Superpowers Pro Setup Guide + +**Time Required:** 15 minutes one-time setup + +**Prerequisites:** Claude Desktop Pro subscription ($20/month) + +--- + +## Step 1: Create a New Project + +1. Open Claude Desktop +2. Click on "Projects" in the sidebar +3. Click "+ New Project" +4. Name it: **"Superpowers Development"** (or your preferred name) +5. Click "Create" + +--- + +## Step 2: Add Skills to Project Knowledge + +### Option A: Upload All Skills (Recommended) + +1. In your project, click "Add content" → "Upload files" +2. Navigate to: `superpowers-desktop/pro-setup/skills/` +3. **Upload all directories:** + - `core/` (4 files) - **Start here!** + - `testing/` (2 files + 1 example) + - `debugging/` (3 files + 1 script) + - `collaboration/` (8 files) + - `meta/` (3 files) +4. Upload `index.md` at the root of skills/ + +**Total:** 20+ skill files + index + +### Option B: Start with Core Skills Only + +If you want to start small: + +1. 
Upload only the `core/` directory (4 files): + - `using-superpowers.md` + - `test-driven-development.md` + - `systematic-debugging.md` + - `brainstorming.md` +2. Upload `index.md` +3. Add more categories later as needed + +--- + +## Step 3: Set Custom Instructions + +1. In your project, click "Project settings" (gear icon) +2. Find "Custom instructions" section +3. Copy the content from `custom-instructions.txt` +4. Paste into the custom instructions field +5. Click "Save" + +**What this does:** +- Reminds Claude to check for relevant skills before responding +- Establishes protocol for using skills +- Provides quick reference to available skills + +--- + +## Step 4: Verify Installation + +Start a new conversation in your project and test: + +**Test 1: Skill Discovery** +``` +You: "I need to implement a new login feature" +Claude: "I'm using brainstorming.md to refine this design..." +``` + +**Test 2: TDD Enforcement** +``` +You: "Write code for user validation" +Claude: "I'm using test-driven-development.md. + Let me write the test first..." +``` + +**Test 3: Debugging Protocol** +``` +You: "The tests are failing" +Claude: "I'm using systematic-debugging.md. + Let me investigate the root cause..." +``` + +--- + +## Step 5: Learn the Workflow + +### Invoking Skills + +**Manual invocation (always works):** +``` +"Use test-driven-development.md to implement this" +"Follow systematic-debugging.md for this bug" +"Reference brainstorming.md to design this feature" +``` + +**Relying on custom instructions (sometimes works):** +- Custom instructions remind Claude to check for skills +- Not as reliable as manual invocation +- Works best for obvious task matches + +**Best practice:** Explicitly reference the skill when starting a task. + +### Tracking Checklists + +Skills often include checklists. Track them explicitly: + +```markdown +**TDD Checklist for login feature:** +- [ ] Write failing test for valid credentials +- [ ] Run test, verify it fails +- [ ] Implement minimal validation +- [ ] Run test, verify it passes +- [ ] Write failing test for invalid credentials +... +``` + +Update the list as you complete items. + +--- + +## Tips for Success + +### 1. Start Every Task Right + +Before coding or debugging: +- Ask yourself: "Is there a skill for this?" +- Check `index.md` in project knowledge +- Reference the skill explicitly + +### 2. Use the Index + +`index.md` is your friend: +- Symptom-based lookup +- Quick decision guide +- Skill descriptions + +### 3. Core Skills Are Mandatory + +These four skills should be used for almost everything: +- `test-driven-development.md` - **Any feature/bugfix** +- `systematic-debugging.md` - **Any bug/failure** +- `brainstorming.md` - **Before new features** +- `using-superpowers.md` - **Understanding the system** + +### 4. Announce Skill Usage + +Always say: +``` +"I'm using [skill-name] to [action]" +``` + +This helps you verify Claude is actually following the skill. + +### 5. Track Explicitly + +No automatic TodoWrite in Desktop. Track manually: +- Use checkboxes in responses +- Update status explicitly +- Don't skip checklist items + +--- + +## Common Issues + +### "Claude Isn't Using Skills" + +**Problem:** Claude responds without referencing skills. + +**Solution:** +- Explicitly invoke: "Use test-driven-development.md for this" +- Check custom instructions are set +- Remind: "Check project knowledge for relevant skills" + +### "Too Many Files in Context" + +**Problem:** Large project knowledge competes for context. 
+ +**Solution:** +- Start with core skills only +- Add categories as needed +- Consider using multiple projects for different work types + +### "Skills Seem Out of Date" + +**Problem:** Original plugin skills were updated. + +**Solution:** +- Check for updates to superpowers-desktop +- Download new skill files +- Re-upload to your project + +--- + +## Workflow Examples + +### Example 1: Implementing a Feature + +``` +You: "I need to add email validation to the signup form" + +Claude: "I'm using brainstorming.md to design this feature first. + Let me ask some questions to understand the requirements..." + +[After design is complete] + +Claude: "I'm using test-driven-development.md to implement. + Let me write the first failing test..." + +[Follows RED-GREEN-REFACTOR cycle] +``` + +### Example 2: Debugging an Issue + +``` +You: "The login endpoint is returning 500 errors" + +Claude: "I'm using systematic-debugging.md for this. + + Phase 1: Root Cause Investigation + Let me read the error messages carefully..." + +[Follows 4-phase process] +``` + +### Example 3: Code Review + +``` +You: "This branch is ready for review" + +Claude: "I'm using requesting-code-review.md. + Let me verify the pre-review checklist..." + +[Works through checklist] +``` + +--- + +## Customization + +### Adding Your Own Skills + +You can add custom skills to your project: + +1. Create a new `.md` file following the skill format: +```markdown +--- +name: your-skill-name +description: Use when [triggering conditions] - [what it does] +--- + +# Your Skill Name + +## Overview +... +``` + +2. Upload to your project +3. Add to your mental index of available skills + +### Adapting Custom Instructions + +Feel free to modify `custom-instructions.txt` to: +- Add project-specific workflows +- Emphasize certain skills over others +- Add reminders for your team's practices + +--- + +## Maintenance + +### Updating Skills + +When new versions are released: + +1. Download updated `pro-setup/skills/` directory +2. In your project, delete old skill files +3. Upload new skill files +4. Update custom instructions if needed + +### Backing Up Your Project + +Projects are stored by Claude Desktop. To backup: +- Export conversations regularly +- Keep a copy of your skill files locally +- Document any custom skills you've added + +--- + +## Next Steps + +**You're ready to use Superpowers!** + +1. **Start with a small task** - Try TDD on a simple function +2. **Use the index** - Reference `index.md` when unsure +3. **Be explicit** - Always invoke skills by name +4. **Track progress** - Maintain checklists in responses +5. 
**Iterate** - The more you use skills, the more natural they become + +--- + +## Getting Help + +- **Skill usage questions:** Reference the skill's content +- **Installation issues:** Re-check steps above +- **Bug reports:** [GitHub Issues](https://github.com/obra/superpowers/issues) +- **General questions:** [Discussions](https://github.com/obra/superpowers/discussions) + +--- + +## Comparison to Claude Code Plugin + +**What you're missing:** +- Automatic skill activation (must invoke manually) +- TodoWrite tool (track manually) +- Task tool for subagents (break down manually) +- SessionStart hooks (custom instructions instead) +- Git-based auto-updates (manual reuploads) + +**What you have:** +- Full skill content and workflows +- Persistent project knowledge +- Custom instructions for reminders +- All 20+ skills available + +**If you have access to Claude Code, use the plugin instead for the full experience.** + +--- + +**Happy coding with Superpowers! 🚀** diff --git a/superpowers-desktop/pro-setup/custom-instructions.txt b/superpowers-desktop/pro-setup/custom-instructions.txt new file mode 100644 index 000000000..245d00579 --- /dev/null +++ b/superpowers-desktop/pro-setup/custom-instructions.txt @@ -0,0 +1,80 @@ +# Superpowers Skills Protocol for Claude Desktop + +## MANDATORY FIRST RESPONSE PROTOCOL + +Before responding to ANY user message, you MUST complete this checklist: + +1. ☐ Check project knowledge for relevant skills +2. ☐ Ask yourself: "Does ANY skill match this request?" +3. ☐ If yes → Reference and read the skill from project knowledge +4. ☐ Announce which skill you're using +5. ☐ Follow the skill exactly + +**Responding WITHOUT completing this checklist = automatic failure.** + +## Core Skills (Always Check First) + +- **test-driven-development.md**: Use when implementing ANY feature or bugfix + - Mandatory: Write test first, watch it fail, then implement + - No production code without a failing test first + +- **systematic-debugging.md**: Use when encountering ANY bug, test failure, or unexpected behavior + - Mandatory: Find root cause before attempting fixes + - 4-phase process: Root Cause → Pattern → Hypothesis → Implementation + +- **brainstorming.md**: Use before writing code for new features + - Mandatory: Design first, implement second + - Socratic questioning to refine ideas + +- **using-superpowers.md**: Establishes mandatory workflows + - Read this first to understand the system + +## Critical Rules + +1. **Skills are MANDATORY, not optional**: If a skill exists for your task, you MUST use it + +2. **Announce skill usage**: "I'm using [skill-name] to [action]" + +3. **Track checklists explicitly**: When a skill has a checklist, track each item explicitly in your responses (no automatic tracking in Desktop) + +4. 
**No rationalizations**: + - "This is simple" → Still use the skill + - "I'll test after" → Write test first + - "Quick fix for now" → Find root cause first + - "The skill is overkill" → If it exists, use it + +## Available Skill Categories + +**Core** (in skills/core/): +- using-superpowers.md, test-driven-development.md, systematic-debugging.md, brainstorming.md + +**Testing** (in skills/testing/): +- condition-based-waiting.md, testing-anti-patterns.md + +**Debugging** (in skills/debugging/): +- root-cause-tracing.md, verification-before-completion.md, defense-in-depth.md + +**Collaboration** (in skills/collaboration/): +- writing-plans.md, executing-plans.md, requesting-code-review.md, receiving-code-review.md, using-git-worktrees.md, finishing-a-development-branch.md, subagent-driven-development.md, dispatching-parallel-agents.md + +**Meta** (in skills/meta/): +- writing-skills.md, sharing-skills.md, testing-skills-with-subagents.md + +See skills/index.md in project knowledge for complete descriptions. + +## Instructions ≠ Permission to Skip Workflows + +User's specific instructions describe WHAT to do, not HOW. + +"Add X", "Fix Y" = the goal, NOT permission to skip brainstorming, TDD, or RED-GREEN-REFACTOR. + +## Summary + +**Starting any task:** +1. Check if relevant skill exists in project knowledge +2. If yes → Reference and use it +3. Announce you're using it +4. Follow what it says exactly +5. Track checklists explicitly in responses + +**Finding a relevant skill = mandatory to use it. Not optional.** diff --git a/superpowers-desktop/pro-setup/skills/collaboration/dispatching-parallel-agents.md b/superpowers-desktop/pro-setup/skills/collaboration/dispatching-parallel-agents.md new file mode 100644 index 000000000..d2dbdc300 --- /dev/null +++ b/superpowers-desktop/pro-setup/skills/collaboration/dispatching-parallel-agents.md @@ -0,0 +1,182 @@ +--- +name: dispatching-parallel-agents +description: Use when facing 3+ independent failures that can be investigated without shared state or dependencies - dispatches multiple Claude agents to investigate and fix independent problems concurrently +--- + + +> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses. +# Dispatching Parallel Agents + +## Overview + +When you have multiple unrelated failures (different test files, different subsystems, different bugs), investigating them sequentially wastes time. Each investigation is independent and can happen in parallel. + +**Core principle:** Dispatch one agent per independent problem domain. Let them work concurrently. + +## When to Use + +```dot +digraph when_to_use { + "Multiple failures?" [shape=diamond]; + "Are they independent?" [shape=diamond]; + "Single agent investigates all" [shape=box]; + "One agent per problem domain" [shape=box]; + "Can they work in parallel?" [shape=diamond]; + "Sequential agents" [shape=box]; + "Parallel dispatch" [shape=box]; + + "Multiple failures?" -> "Are they independent?" [label="yes"]; + "Are they independent?" -> "Single agent investigates all" [label="no - related"]; + "Are they independent?" -> "Can they work in parallel?" [label="yes"]; + "Can they work in parallel?" -> "Parallel dispatch" [label="yes"]; + "Can they work in parallel?" 
-> "Sequential agents" [label="no - shared state"]; +} +``` + +**Use when:** +- 3+ test files failing with different root causes +- Multiple subsystems broken independently +- Each problem can be understood without context from others +- No shared state between investigations + +**Don't use when:** +- Failures are related (fix one might fix others) +- Need to understand full system state +- Agents would interfere with each other + +## The Pattern + +### 1. Identify Independent Domains + +Group failures by what's broken: +- File A tests: Tool approval flow +- File B tests: Batch completion behavior +- File C tests: Abort functionality + +Each domain is independent - fixing tool approval doesn't affect abort tests. + +### 2. Create Focused Agent Tasks + +Each agent gets: +- **Specific scope:** One test file or subsystem +- **Clear goal:** Make these tests pass +- **Constraints:** Don't change other code +- **Expected output:** Summary of what you found and fixed + +### 3. Dispatch in Parallel + +```typescript +// In Claude Code / AI environment +Task("Fix agent-tool-abort.test.ts failures") +Task("Fix batch-completion-behavior.test.ts failures") +Task("Fix tool-approval-race-conditions.test.ts failures") +// All three run concurrently +``` + +### 4. Review and Integrate + +When agents return: +- Read each summary +- Verify fixes don't conflict +- Run full test suite +- Integrate all changes + +## Agent Prompt Structure + +Good agent prompts are: +1. **Focused** - One clear problem domain +2. **Self-contained** - All context needed to understand the problem +3. **Specific about output** - What should the agent return? + +```markdown +Fix the 3 failing tests in src/agents/agent-tool-abort.test.ts: + +1. "should abort tool with partial output capture" - expects 'interrupted at' in message +2. "should handle mixed completed and aborted tools" - fast tool aborted instead of completed +3. "should properly track pendingToolCount" - expects 3 results but gets 0 + +These are timing/race condition issues. Your task: + +1. Read the test file and understand what each test verifies +2. Identify root cause - timing issues or actual bugs? +3. Fix by: + - Replacing arbitrary timeouts with event-based waiting + - Fixing bugs in abort implementation if found + - Adjusting test expectations if testing changed behavior + +Do NOT just increase timeouts - find the real issue. + +Return: Summary of what you found and what you fixed. 
+``` + +## Common Mistakes + +**❌ Too broad:** "Fix all the tests" - agent gets lost +**✅ Specific:** "Fix agent-tool-abort.test.ts" - focused scope + +**❌ No context:** "Fix the race condition" - agent doesn't know where +**✅ Context:** Paste the error messages and test names + +**❌ No constraints:** Agent might refactor everything +**✅ Constraints:** "Do NOT change production code" or "Fix tests only" + +**❌ Vague output:** "Fix it" - you don't know what changed +**✅ Specific:** "Return summary of root cause and changes" + +## When NOT to Use + +**Related failures:** Fixing one might fix others - investigate together first +**Need full context:** Understanding requires seeing entire system +**Exploratory debugging:** You don't know what's broken yet +**Shared state:** Agents would interfere (editing same files, using same resources) + +## Real Example from Session + +**Scenario:** 6 test failures across 3 files after major refactoring + +**Failures:** +- agent-tool-abort.test.ts: 3 failures (timing issues) +- batch-completion-behavior.test.ts: 2 failures (tools not executing) +- tool-approval-race-conditions.test.ts: 1 failure (execution count = 0) + +**Decision:** Independent domains - abort logic separate from batch completion separate from race conditions + +**Dispatch:** +``` +Agent 1 → Fix agent-tool-abort.test.ts +Agent 2 → Fix batch-completion-behavior.test.ts +Agent 3 → Fix tool-approval-race-conditions.test.ts +``` + +**Results:** +- Agent 1: Replaced timeouts with event-based waiting +- Agent 2: Fixed event structure bug (threadId in wrong place) +- Agent 3: Added wait for async tool execution to complete + +**Integration:** All fixes independent, no conflicts, full suite green + +**Time saved:** 3 problems solved in parallel vs sequentially + +## Key Benefits + +1. **Parallelization** - Multiple investigations happen simultaneously +2. **Focus** - Each agent has narrow scope, less context to track +3. **Independence** - Agents don't interfere with each other +4. **Speed** - 3 problems solved in time of 1 + +## Verification + +After agents return: +1. **Review each summary** - Understand what changed +2. **Check for conflicts** - Did agents edit same code? +3. **Run full suite** - Verify all fixes work together +4. **Spot check** - Agents can make systematic errors + +## Real-World Impact + +From debugging session (2025-10-03): +- 6 failures across 3 files +- 3 agents dispatched in parallel +- All investigations completed concurrently +- All fixes integrated successfully +- Zero conflicts between agent changes diff --git a/superpowers-desktop/pro-setup/skills/collaboration/executing-plans.md b/superpowers-desktop/pro-setup/skills/collaboration/executing-plans.md new file mode 100644 index 000000000..9360bab27 --- /dev/null +++ b/superpowers-desktop/pro-setup/skills/collaboration/executing-plans.md @@ -0,0 +1,78 @@ +--- +name: executing-plans +description: Use when partner provides a complete implementation plan to execute in controlled batches with review checkpoints - loads plan, reviews critically, executes tasks in batches, reports for review between batches +--- + + +> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses. +# Executing Plans + +## Overview + +Load plan, review critically, execute tasks in batches, report for review between batches. 
**Core principle:** Batch execution with checkpoints for architect review.

**Announce at start:** "I'm using the executing-plans skill to implement this plan."

## The Process

### Step 1: Load and Review Plan
1. Read plan file
2. Review critically - identify any questions or concerns about the plan
3. If concerns: Raise them with your human partner before starting
4. If no concerns: Create explicit checklist tracking and proceed

### Step 2: Execute Batch
**Default: First 3 tasks**

For each task:
1. Mark as in_progress
2. Follow each step exactly (plan has bite-sized steps)
3. Run verifications as specified
4. Mark as completed

### Step 3: Report
When batch complete:
- Show what was implemented
- Show verification output
- Say: "Ready for feedback."

### Step 4: Continue
Based on feedback:
- Apply changes if needed
- Execute next batch
- Repeat until complete

### Step 5: Complete Development

After all tasks complete and verified:
- Announce: "I'm using the finishing-a-development-branch skill to complete this work."
- **REQUIRED SUB-SKILL:** Use finishing-a-development-branch.md (in project knowledge)
- Follow that skill to verify tests, present options, execute choice

## When to Stop and Ask for Help

**STOP executing immediately when:**
- Hit a blocker mid-batch (missing dependency, test fails, instruction unclear)
- Plan has critical gaps preventing starting
- You don't understand an instruction
- Verification fails repeatedly

**Ask for clarification rather than guessing.**

## When to Revisit Earlier Steps

**Return to Review (Step 1) when:**
- Partner updates the plan based on your feedback
- Fundamental approach needs rethinking

**Don't force through blockers** - stop and ask.

## Remember
- Review plan critically first
- Follow plan steps exactly
- Don't skip verifications
- Reference skills when plan says to
- Between batches: just report and wait
- Stop when blocked, don't guess

diff --git a/superpowers-desktop/pro-setup/skills/collaboration/finishing-a-development-branch.md b/superpowers-desktop/pro-setup/skills/collaboration/finishing-a-development-branch.md
new file mode 100644
index 000000000..ebb373fde
--- /dev/null
+++ b/superpowers-desktop/pro-setup/skills/collaboration/finishing-a-development-branch.md
@@ -0,0 +1,202 @@
---
name: finishing-a-development-branch
description: Use when implementation is complete, all tests pass, and you need to decide how to integrate the work - guides completion of development work by presenting structured options for merge, PR, or cleanup
---

> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses.

# Finishing a Development Branch

## Overview

Guide completion of development work by presenting clear options and handling the chosen workflow.

**Core principle:** Verify tests → Present options → Execute choice → Clean up.

**Announce at start:** "I'm using the finishing-a-development-branch skill to complete this work."

## The Process

### Step 1: Verify Tests

**Before presenting options, verify tests pass:**

```bash
# Run project's test suite
npm test / cargo test / pytest / go test ./...
```

**If tests fail:**
```
Tests failing (<N> failures). Must fix before completing:

[Show failures]

Cannot proceed with merge/PR until tests pass.
```

Stop.
Don't proceed to Step 2.

**If tests pass:** Continue to Step 2.

### Step 2: Determine Base Branch

```bash
# Try common base branches
git merge-base HEAD main 2>/dev/null || git merge-base HEAD master 2>/dev/null
```

Or ask: "This branch split from main - is that correct?"

### Step 3: Present Options

Present exactly these 4 options:

```
Implementation complete. What would you like to do?

1. Merge back to <base-branch> locally
2. Push and create a Pull Request
3. Keep the branch as-is (I'll handle it later)
4. Discard this work

Which option?
```

**Don't add explanation** - keep options concise.

### Step 4: Execute Choice

#### Option 1: Merge Locally

```bash
# Switch to base branch
git checkout <base-branch>

# Pull latest
git pull

# Merge feature branch
git merge <feature-branch>

# Verify tests on merged result
<run project test suite>

# If tests pass
git branch -d <feature-branch>
```

Then: Cleanup worktree (Step 5)

#### Option 2: Push and Create PR

```bash
# Push branch
git push -u origin <branch-name>

# Create PR
gh pr create --title "<title>" --body "$(cat <<'EOF'
## Summary
<2-3 bullets of what changed>

## Test Plan
- [ ] <verification steps>
EOF
)"
```

Then: Cleanup worktree (Step 5)

#### Option 3: Keep As-Is

Report: "Keeping branch <name>. Worktree preserved at <path>."

**Don't cleanup worktree.**

#### Option 4: Discard

**Confirm first:**
```
This will permanently delete:
- Branch <name>
- All commits: <commit-list>
- Worktree at <path>

Type 'discard' to confirm.
```

Wait for exact confirmation.

If confirmed:
```bash
git checkout <base-branch>
git branch -D <feature-branch>
```

Then: Cleanup worktree (Step 5)

### Step 5: Cleanup Worktree

**For Options 1, 2, 4:**

Check if in worktree:
```bash
git worktree list | grep $(git branch --show-current)
```

If yes:
```bash
git worktree remove <worktree-path>
```

**For Option 3:** Keep worktree.

## Quick Reference

| Option | Merge | Push | Keep Worktree | Cleanup Branch |
|--------|-------|------|---------------|----------------|
| 1. Merge locally | ✓ | - | - | ✓ |
| 2. Create PR | - | ✓ | ✓ | - |
| 3. Keep as-is | - | - | ✓ | - |
| 4. Discard | - | - | - | ✓ (force) |

## Common Mistakes

**Skipping test verification**
- **Problem:** Merge broken code, create failing PR
- **Fix:** Always verify tests before offering options

**Open-ended questions**
- **Problem:** "What should I do next?"
→ ambiguous +- **Fix:** Present exactly 4 structured options + +**Automatic worktree cleanup** +- **Problem:** Remove worktree when might need it (Option 2, 3) +- **Fix:** Only cleanup for Options 1 and 4 + +**No confirmation for discard** +- **Problem:** Accidentally delete work +- **Fix:** Require typed "discard" confirmation + +## Red Flags + +**Never:** +- Proceed with failing tests +- Merge without verifying tests on result +- Delete work without confirmation +- Force-push without explicit request + +**Always:** +- Verify tests before offering options +- Present exactly 4 options +- Get typed confirmation for Option 4 +- Clean up worktree for Options 1 & 4 only + +## Integration + +**Called by:** +- **separate task-driven-development** (Step 7) - After all tasks complete +- **executing-plans** (Step 5) - After all batches complete + +**Pairs with:** +- **using-git-worktrees** - Cleans up worktree created by that skill diff --git a/superpowers-desktop/pro-setup/skills/collaboration/receiving-code-review.md b/superpowers-desktop/pro-setup/skills/collaboration/receiving-code-review.md new file mode 100644 index 000000000..2dc19ac2d --- /dev/null +++ b/superpowers-desktop/pro-setup/skills/collaboration/receiving-code-review.md @@ -0,0 +1,211 @@ +--- +name: receiving-code-review +description: Use when receiving code review feedback, before implementing suggestions, especially if feedback seems unclear or technically questionable - requires technical rigor and verification, not performative agreement or blind implementation +--- + + +> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses. +# Code Review Reception + +## Overview + +Code review requires technical evaluation, not emotional performance. + +**Core principle:** Verify before implementing. Ask before assuming. Technical correctness over social comfort. + +## The Response Pattern + +``` +WHEN receiving code review feedback: + +1. READ: Complete feedback without reacting +2. UNDERSTAND: Restate requirement in own words (or ask) +3. VERIFY: Check against codebase reality +4. EVALUATE: Technically sound for THIS codebase? +5. RESPOND: Technical acknowledgment or reasoned pushback +6. IMPLEMENT: One item at a time, test each +``` + +## Forbidden Responses + +**NEVER:** +- "You're absolutely right!" (explicit CLAUDE.md violation) +- "Great point!" / "Excellent feedback!" (performative) +- "Let me implement that now" (before verification) + +**INSTEAD:** +- Restate the technical requirement +- Ask clarifying questions +- Push back with technical reasoning if wrong +- Just start working (actions > words) + +## Handling Unclear Feedback + +``` +IF any item is unclear: + STOP - do not implement anything yet + ASK for clarification on unclear items + +WHY: Items may be related. Partial understanding = wrong implementation. +``` + +**Example:** +``` +your human partner: "Fix 1-6" +You understand 1,2,3,6. Unclear on 4,5. + +❌ WRONG: Implement 1,2,3,6 now, ask about 4,5 later +✅ RIGHT: "I understand items 1,2,3,6. Need clarification on 4 and 5 before proceeding." +``` + +## Source-Specific Handling + +### From your human partner +- **Trusted** - implement after understanding +- **Still ask** if scope unclear +- **No performative agreement** +- **Skip to action** or technical acknowledgment + +### From External Reviewers +``` +BEFORE implementing: + 1. 
Check: Technically correct for THIS codebase? + 2. Check: Breaks existing functionality? + 3. Check: Reason for current implementation? + 4. Check: Works on all platforms/versions? + 5. Check: Does reviewer understand full context? + +IF suggestion seems wrong: + Push back with technical reasoning + +IF can't easily verify: + Say so: "I can't verify this without [X]. Should I [investigate/ask/proceed]?" + +IF conflicts with your human partner's prior decisions: + Stop and discuss with your human partner first +``` + +**your human partner's rule:** "External feedback - be skeptical, but check carefully" + +## YAGNI Check for "Professional" Features + +``` +IF reviewer suggests "implementing properly": + grep codebase for actual usage + + IF unused: "This endpoint isn't called. Remove it (YAGNI)?" + IF used: Then implement properly +``` + +**your human partner's rule:** "You and reviewer both report to me. If we don't need this feature, don't add it." + +## Implementation Order + +``` +FOR multi-item feedback: + 1. Clarify anything unclear FIRST + 2. Then implement in this order: + - Blocking issues (breaks, security) + - Simple fixes (typos, imports) + - Complex fixes (refactoring, logic) + 3. Test each fix individually + 4. Verify no regressions +``` + +## When To Push Back + +Push back when: +- Suggestion breaks existing functionality +- Reviewer lacks full context +- Violates YAGNI (unused feature) +- Technically incorrect for this stack +- Legacy/compatibility reasons exist +- Conflicts with your human partner's architectural decisions + +**How to push back:** +- Use technical reasoning, not defensiveness +- Ask specific questions +- Reference working tests/code +- Involve your human partner if architectural + +**Signal if uncomfortable pushing back out loud:** "Strange things are afoot at the Circle K" + +## Acknowledging Correct Feedback + +When feedback IS correct: +``` +✅ "Fixed. [Brief description of what changed]" +✅ "Good catch - [specific issue]. Fixed in [location]." +✅ [Just fix it and show in the code] + +❌ "You're absolutely right!" +❌ "Great point!" +❌ "Thanks for catching that!" +❌ "Thanks for [anything]" +❌ ANY gratitude expression +``` + +**Why no thanks:** Actions speak. Just fix it. The code itself shows you heard the feedback. + +**If you catch yourself about to write "Thanks":** DELETE IT. State the fix instead. + +## Gracefully Correcting Your Pushback + +If you pushed back and were wrong: +``` +✅ "You were right - I checked [X] and it does [Y]. Implementing now." +✅ "Verified this and you're correct. My initial understanding was wrong because [reason]. Fixing." + +❌ Long apology +❌ Defending why you pushed back +❌ Over-explaining +``` + +State the correction factually and move on. + +## Common Mistakes + +| Mistake | Fix | +|---------|-----| +| Performative agreement | State requirement or just act | +| Blind implementation | Verify against codebase first | +| Batch without testing | One at a time, test each | +| Assuming reviewer is right | Check if breaks things | +| Avoiding pushback | Technical correctness > comfort | +| Partial implementation | Clarify all items first | +| Can't verify, proceed anyway | State limitation, ask for direction | + +## Real Examples + +**Performative Agreement (Bad):** +``` +Reviewer: "Remove legacy code" +❌ "You're absolutely right! Let me remove that..." +``` + +**Technical Verification (Good):** +``` +Reviewer: "Remove legacy code" +✅ "Checking... build target is 10.15+, this API needs 13+. Need legacy for backward compat. 
Current impl has wrong bundle ID - fix it or drop pre-13 support?"
```

**YAGNI (Good):**
```
Reviewer: "Implement proper metrics tracking with database, date filters, CSV export"
✅ "Grepped codebase - nothing calls this endpoint. Remove it (YAGNI)? Or is there usage I'm missing?"
```

**Unclear Item (Good):**
```
your human partner: "Fix items 1-6"
You understand 1,2,3,6. Unclear on 4,5.
✅ "Understand 1,2,3,6. Need clarification on 4 and 5 before implementing."
```

## The Bottom Line

**External feedback = suggestions to evaluate, not orders to follow.**

Verify. Question. Then implement.

No performative agreement. Technical rigor always.

diff --git a/superpowers-desktop/pro-setup/skills/collaboration/requesting-code-review.md b/superpowers-desktop/pro-setup/skills/collaboration/requesting-code-review.md
new file mode 100644
index 000000000..d53c7e557
--- /dev/null
+++ b/superpowers-desktop/pro-setup/skills/collaboration/requesting-code-review.md
@@ -0,0 +1,107 @@
---
name: requesting-code-review
description: Use when completing tasks, implementing major features, or before merging to verify work meets requirements - dispatches a separate review task to check the implementation against the plan or requirements before proceeding
---

> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses.

# Requesting Code Review

Dispatch a separate review task, using code-reviewer.md (in project knowledge), to catch issues before they cascade.

**Core principle:** Review early, review often.

## When to Request Review

**Mandatory:**
- After each task in subagent-driven development
- After completing major feature
- Before merge to main

**Optional but valuable:**
- When stuck (fresh perspective)
- Before refactoring (baseline check)
- After fixing complex bug

## How to Request

**1. Get git SHAs:**
```bash
BASE_SHA=$(git rev-parse HEAD~1)  # or origin/main
HEAD_SHA=$(git rev-parse HEAD)
```

**2. Dispatch a separate code-review task:**

Break the review out as its own task using code-reviewer.md (in project knowledge), and fill in the template at `code-reviewer.md`

**Placeholders:**
- `{WHAT_WAS_IMPLEMENTED}` - What you just built
- `{PLAN_OR_REQUIREMENTS}` - What it should do
- `{BASE_SHA}` - Starting commit
- `{HEAD_SHA}` - Ending commit
- `{DESCRIPTION}` - Brief summary

**3. Act on feedback:**
- Fix Critical issues immediately
- Fix Important issues before proceeding
- Note Minor issues for later
- Push back if reviewer is wrong (with reasoning)

## Example

```
[Just completed Task 2: Add verification function]

You: Let me request code review before proceeding.
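# The next two commands derive the commit range for the reviewer:
# base = the commit where Task 1 landed, head = the current commit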
BASE_SHA=$(git log --oneline | grep "Task 1" | head -1 | awk '{print $1}')
HEAD_SHA=$(git rev-parse HEAD)

[Dispatch a separate code-review task using code-reviewer.md (in project knowledge)]
  WHAT_WAS_IMPLEMENTED: Verification and repair functions for conversation index
  PLAN_OR_REQUIREMENTS: Task 2 from docs/plans/deployment-plan.md
  BASE_SHA: a7981ec
  HEAD_SHA: 3df7661
  DESCRIPTION: Added verifyIndex() and repairIndex() with 4 issue types

[Subagent returns]:
  Strengths: Clean architecture, real tests
  Issues:
    Important: Missing progress indicators
    Minor: Magic number (100) for reporting interval
  Assessment: Ready to proceed

You: [Fix progress indicators]
[Continue to Task 3]
```

## Integration with Workflows

**Subagent-Driven Development:**
- Review after EACH task
- Catch issues before they compound
- Fix before moving to next task

**Executing Plans:**
- Review after each batch (3 tasks)
- Get feedback, apply, continue

**Ad-Hoc Development:**
- Review before merge
- Review when stuck

## Red Flags

**Never:**
- Skip review because "it's simple"
- Ignore Critical issues
- Proceed with unfixed Important issues
- Argue with valid technical feedback

**If reviewer wrong:**
- Push back with technical reasoning
- Show code/tests that prove it works
- Request clarification

See template at: requesting-code-review/code-reviewer.md

diff --git a/superpowers-desktop/pro-setup/skills/collaboration/subagent-driven-development.md b/superpowers-desktop/pro-setup/skills/collaboration/subagent-driven-development.md
new file mode 100644
index 000000000..68305c865
--- /dev/null
+++ b/superpowers-desktop/pro-setup/skills/collaboration/subagent-driven-development.md
@@ -0,0 +1,191 @@
---
name: subagent-driven-development
description: Use when executing implementation plans with independent tasks in the current session - dispatches a separate task for each plan task with code review between tasks, enabling fast iteration with quality gates
---

> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses.

# Subagent-Driven Development

Execute a plan by dispatching a separate task for each plan task, with code review after each.

**Core principle:** Fresh separate task per task + review between tasks = high quality, fast iteration

## Overview

**vs. Executing Plans (parallel session):**
- Same session (no context switch)
- Fresh separate task per task (no context pollution)
- Code review after each task (catch issues early)
- Faster iteration (no human-in-loop between tasks)

**When to use:**
- Staying in this session
- Tasks are mostly independent
- Want continuous progress with quality gates

**When NOT to use:**
- Need to review plan first (use executing-plans)
- Tasks are tightly coupled (manual execution better)
- Plan needs revision (brainstorm first)

## The Process

### 1. Load Plan

Read plan file, create explicit checklist tracking with all tasks.

### 2. Execute Task with Subagent

For each task:

**Dispatch fresh separate task:**
```
manual task breakdown (general-purpose):
  description: "Implement Task N: [task name]"
  prompt: |
    You are implementing Task N from [plan-file].

    Read that task carefully. Your job is to:
    1. Implement exactly what the task specifies
    2. Write tests (following TDD if task says to)
    3.
Verify implementation works + 4. Commit your work + 5. Report back + + Work from: [directory] + + Report: What you implemented, what you tested, test results, files changed, any issues +``` + +**Subagent reports back** with summary of work. + +### 3. Review Subagent's Work + +**Dispatch code-reviewer separate task:** +``` +manual task breakdown (code-reviewer.md (in project knowledge)): + Use template at requesting-code-review/code-reviewer.md + + WHAT_WAS_IMPLEMENTED: [from separate task's report] + PLAN_OR_REQUIREMENTS: Task N from [plan-file] + BASE_SHA: [commit before task] + HEAD_SHA: [current commit] + DESCRIPTION: [task summary] +``` + +**Code reviewer returns:** Strengths, Issues (Critical/Important/Minor), Assessment + +### 4. Apply Review Feedback + +**If issues found:** +- Fix Critical issues immediately +- Fix Important issues before next task +- Note Minor issues + +**Dispatch follow-up separate task if needed:** +``` +"Fix issues from code review: [list issues]" +``` + +### 5. Mark Complete, Next Task + +- Mark task as completed in explicit checklist tracking +- Move to next task +- Repeat steps 2-5 + +### 6. Final Review + +After all tasks complete, dispatch final code-reviewer: +- Reviews entire implementation +- Checks all plan requirements met +- Validates overall architecture + +### 7. Complete Development + +After final review passes: +- Announce: "I'm using the finishing-a-development-branch skill to complete this work." +- **REQUIRED SUB-SKILL:** Use finishing-a-development-branch.md (in project knowledge) +- Follow that skill to verify tests, present options, execute choice + +## Example Workflow + +``` +You: I'm using Subagent-Driven Development to execute this plan. + +[Load plan, create explicit checklist tracking] + +Task 1: Hook installation script + +[Dispatch implementation separate task] +Subagent: Implemented install-hook with tests, 5/5 passing + +[Get git SHAs, dispatch code-reviewer] +Reviewer: Strengths: Good test coverage. Issues: None. Ready. + +[Mark Task 1 complete] + +Task 2: Recovery modes + +[Dispatch implementation separate task] +Subagent: Added verify/repair, 8/8 tests passing + +[Dispatch code-reviewer] +Reviewer: Strengths: Solid. Issues (Important): Missing progress reporting + +[Dispatch fix separate task] +Fix separate task: Added progress every 100 conversations + +[Verify fix, mark Task 2 complete] + +... + +[After all tasks] +[Dispatch final code-reviewer] +Final reviewer: All requirements met, ready to merge + +Done! +``` + +## Advantages + +**vs. Manual execution:** +- Subagents follow TDD naturally +- Fresh context per task (no confusion) +- Parallel-safe (separate tasks don't interfere) + +**vs. 
Executing Plans:** +- Same session (no handoff) +- Continuous progress (no waiting) +- Review checkpoints automatic + +**Cost:** +- More separate task invocations +- But catches issues early (cheaper than debugging later) + +## Red Flags + +**Never:** +- Skip code review between tasks +- Proceed with unfixed Critical issues +- Dispatch multiple implementation separate tasks in parallel (conflicts) +- Implement without reading plan task + +**If separate task fails task:** +- Dispatch fix separate task with specific instructions +- Don't try to fix manually (context pollution) + +## Integration + +**Required workflow skills:** +- **writing-plans** - REQUIRED: Creates the plan that this skill executes +- **requesting-code-review** - REQUIRED: Review after each task (see Step 3) +- **finishing-a-development-branch** - REQUIRED: Complete development after all tasks (see Step 7) + +**Subagents must use:** +- **test-driven-development** - Subagents follow TDD for each task + +**Alternative workflow:** +- **executing-plans** - Use for parallel session instead of same-session execution + +See code-reviewer template: requesting-code-review/code-reviewer.md diff --git a/superpowers-desktop/pro-setup/skills/collaboration/using-git-worktrees.md b/superpowers-desktop/pro-setup/skills/collaboration/using-git-worktrees.md new file mode 100644 index 000000000..b04c833c5 --- /dev/null +++ b/superpowers-desktop/pro-setup/skills/collaboration/using-git-worktrees.md @@ -0,0 +1,215 @@ +--- +name: using-git-worktrees +description: Use when starting feature work that needs isolation from current workspace or before executing implementation plans - creates isolated git worktrees with smart directory selection and safety verification +--- + + +> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses. +# Using Git Worktrees + +## Overview + +Git worktrees create isolated workspaces sharing the same repository, allowing work on multiple branches simultaneously without switching. + +**Core principle:** Systematic directory selection + safety verification = reliable isolation. + +**Announce at start:** "I'm using the using-git-worktrees skill to set up an isolated workspace." + +## Directory Selection Process + +Follow this priority order: + +### 1. Check Existing Directories + +```bash +# Check in priority order +ls -d .worktrees 2>/dev/null # Preferred (hidden) +ls -d worktrees 2>/dev/null # Alternative +``` + +**If found:** Use that directory. If both exist, `.worktrees` wins. + +### 2. Check CLAUDE.md + +```bash +grep -i "worktree.*director" CLAUDE.md 2>/dev/null +``` + +**If preference specified:** Use it without asking. + +### 3. Ask User + +If no directory exists and no CLAUDE.md preference: + +``` +No worktree directory found. Where should I create worktrees? + +1. .worktrees/ (project-local, hidden) +2. ~/.config/superpowers/worktrees/<project-name>/ (global location) + +Which would you prefer? +``` + +## Safety Verification + +### For Project-Local Directories (.worktrees or worktrees) + +**MUST verify .gitignore before creating worktree:** + +```bash +# Check if directory pattern in .gitignore +grep -q "^\.worktrees/$" .gitignore || grep -q "^worktrees/$" .gitignore +``` + +**If NOT in .gitignore:** + +Per Jesse's rule "Fix broken things immediately": +1. Add appropriate line to .gitignore +2. Commit the change +3. 
Proceed with worktree creation + +**Why critical:** Prevents accidentally committing worktree contents to repository. + +### For Global Directory (~/.config/superpowers/worktrees) + +No .gitignore verification needed - outside project entirely. + +## Creation Steps + +### 1. Detect Project Name + +```bash +project=$(basename "$(git rev-parse --show-toplevel)") +``` + +### 2. Create Worktree + +```bash +# Determine full path +case $LOCATION in + .worktrees|worktrees) + path="$LOCATION/$BRANCH_NAME" + ;; + ~/.config/superpowers/worktrees/*) + path="~/.config/superpowers/worktrees/$project/$BRANCH_NAME" + ;; +esac + +# Create worktree with new branch +git worktree add "$path" -b "$BRANCH_NAME" +cd "$path" +``` + +### 3. Run Project Setup + +Auto-detect and run appropriate setup: + +```bash +# Node.js +if [ -f package.json ]; then npm install; fi + +# Rust +if [ -f Cargo.toml ]; then cargo build; fi + +# Python +if [ -f requirements.txt ]; then pip install -r requirements.txt; fi +if [ -f pyproject.toml ]; then poetry install; fi + +# Go +if [ -f go.mod ]; then go mod download; fi +``` + +### 4. Verify Clean Baseline + +Run tests to ensure worktree starts clean: + +```bash +# Examples - use project-appropriate command +npm test +cargo test +pytest +go test ./... +``` + +**If tests fail:** Report failures, ask whether to proceed or investigate. + +**If tests pass:** Report ready. + +### 5. Report Location + +``` +Worktree ready at <full-path> +Tests passing (<N> tests, 0 failures) +Ready to implement <feature-name> +``` + +## Quick Reference + +| Situation | Action | +|-----------|--------| +| `.worktrees/` exists | Use it (verify .gitignore) | +| `worktrees/` exists | Use it (verify .gitignore) | +| Both exist | Use `.worktrees/` | +| Neither exists | Check CLAUDE.md → Ask user | +| Directory not in .gitignore | Add it immediately + commit | +| Tests fail during baseline | Report failures + ask | +| No package.json/Cargo.toml | Skip dependency install | + +## Common Mistakes + +**Skipping .gitignore verification** +- **Problem:** Worktree contents get tracked, pollute git status +- **Fix:** Always grep .gitignore before creating project-local worktree + +**Assuming directory location** +- **Problem:** Creates inconsistency, violates project conventions +- **Fix:** Follow priority: existing > CLAUDE.md > ask + +**Proceeding with failing tests** +- **Problem:** Can't distinguish new bugs from pre-existing issues +- **Fix:** Report failures, get explicit permission to proceed + +**Hardcoding setup commands** +- **Problem:** Breaks on projects using different tools +- **Fix:** Auto-detect from project files (package.json, etc.) + +## Example Workflow + +``` +You: I'm using the using-git-worktrees skill to set up an isolated workspace. 
+ +[Check .worktrees/ - exists] +[Verify .gitignore - contains .worktrees/] +[Create worktree: git worktree add .worktrees/auth -b feature/auth] +[Run npm install] +[Run npm test - 47 passing] + +Worktree ready at /Users/jesse/myproject/.worktrees/auth +Tests passing (47 tests, 0 failures) +Ready to implement auth feature +``` + +## Red Flags + +**Never:** +- Create worktree without .gitignore verification (project-local) +- Skip baseline test verification +- Proceed with failing tests without asking +- Assume directory location when ambiguous +- Skip CLAUDE.md check + +**Always:** +- Follow directory priority: existing > CLAUDE.md > ask +- Verify .gitignore for project-local +- Auto-detect and run project setup +- Verify clean test baseline + +## Integration + +**Called by:** +- **brainstorming** (Phase 4) - REQUIRED when design is approved and implementation follows +- Any skill needing isolated workspace + +**Pairs with:** +- **finishing-a-development-branch** - REQUIRED for cleanup after work complete +- **executing-plans** or **separate task-driven-development** - Work happens in this worktree diff --git a/superpowers-desktop/pro-setup/skills/collaboration/writing-plans.md b/superpowers-desktop/pro-setup/skills/collaboration/writing-plans.md new file mode 100644 index 000000000..125f449c7 --- /dev/null +++ b/superpowers-desktop/pro-setup/skills/collaboration/writing-plans.md @@ -0,0 +1,118 @@ +--- +name: writing-plans +description: Use when design is complete and you need detailed implementation tasks for engineers with zero codebase context - creates comprehensive implementation plans with exact file paths, complete code examples, and verification steps assuming engineer has minimal domain knowledge +--- + + +> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses. +# Writing Plans + +## Overview + +Write comprehensive implementation plans assuming the engineer has zero context for our codebase and questionable taste. Document everything they need to know: which files to touch for each task, code, testing, docs they might need to check, how to test it. Give them the whole plan as bite-sized tasks. DRY. YAGNI. TDD. Frequent commits. + +Assume they are a skilled developer, but know almost nothing about our toolset or problem domain. Assume they don't know good test design very well. + +**Announce at start:** "I'm using the writing-plans skill to create the implementation plan." + +**Context:** This should be run in a dedicated worktree (created by brainstorming skill). + +**Save plans to:** `docs/plans/YYYY-MM-DD-<feature-name>.md` + +## Bite-Sized Task Granularity + +**Each step is one action (2-5 minutes):** +- "Write the failing test" - step +- "Run it to make sure it fails" - step +- "Implement the minimal code to make the test pass" - step +- "Run the tests and make sure they pass" - step +- "Commit" - step + +## Plan Document Header + +**Every plan MUST start with this header:** + +```markdown +# [Feature Name] Implementation Plan + +> **For Claude:** REQUIRED SUB-SKILL: Use executing-plans.md (in project knowledge) to implement this plan task-by-task. 
**Goal:** [One sentence describing what this builds]

**Architecture:** [2-3 sentences about approach]

**Tech Stack:** [Key technologies/libraries]

---
```

## Task Structure

```markdown
### Task N: [Component Name]

**Files:**
- Create: `exact/path/to/file.py`
- Modify: `exact/path/to/existing.py:123-145`
- Test: `tests/exact/path/to/test.py`

**Step 1: Write the failing test**

```python
def test_specific_behavior():
    result = function(input)
    assert result == expected
```

**Step 2: Run test to verify it fails**

Run: `pytest tests/path/test.py::test_name -v`
Expected: FAIL with "function not defined"

**Step 3: Write minimal implementation**

```python
def function(input):
    return expected
```

**Step 4: Run test to verify it passes**

Run: `pytest tests/path/test.py::test_name -v`
Expected: PASS

**Step 5: Commit**

```bash
git add tests/path/test.py src/path/file.py
git commit -m "feat: add specific feature"
```
```

## Remember
- Exact file paths always
- Complete code in plan (not "add validation")
- Exact commands with expected output
- Reference relevant skills with @ syntax
- DRY, YAGNI, TDD, frequent commits

## Execution Handoff

After saving the plan, offer execution choice:

**"Plan complete and saved to `docs/plans/<filename>.md`. Two execution options:**

**1. Subagent-Driven (this session)** - I dispatch a separate task per plan task, review between tasks, fast iteration

**2. Parallel Session (separate)** - Open new session with executing-plans, batch execution with checkpoints

**Which approach?"**

**If Subagent-Driven chosen:**
- **REQUIRED SUB-SKILL:** Use subagent-driven-development.md (in project knowledge)
- Stay in this session
- Fresh separate task per task + code review

**If Parallel Session chosen:**
- Guide them to open new session in worktree
- **REQUIRED SUB-SKILL:** New session uses executing-plans.md (in project knowledge)

diff --git a/superpowers-desktop/pro-setup/skills/core/brainstorming.md b/superpowers-desktop/pro-setup/skills/core/brainstorming.md
new file mode 100644
index 000000000..1ac64c116
--- /dev/null
+++ b/superpowers-desktop/pro-setup/skills/core/brainstorming.md
@@ -0,0 +1,56 @@
---
name: brainstorming
description: Use when creating or developing, before writing code or implementation plans - refines rough ideas into fully-formed designs through collaborative questioning, alternative exploration, and incremental validation. Don't use during clear 'mechanical' processes
---

> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses.

# Brainstorming Ideas Into Designs

## Overview

Help turn ideas into fully formed designs and specs through natural collaborative dialogue.

Start by understanding the current project context, then ask questions one at a time to refine the idea. Once you understand what you're building, present the design in small sections (200-300 words), checking after each section whether it looks right so far.
+ +## The Process + +**Understanding the idea:** +- Check out the current project state first (files, docs, recent commits) +- Ask questions one at a time to refine the idea +- Prefer multiple choice questions when possible, but open-ended is fine too +- Only one question per message - if a topic needs more exploration, break it into multiple questions +- Focus on understanding: purpose, constraints, success criteria + +**Exploring approaches:** +- Propose 2-3 different approaches with trade-offs +- Present options conversationally with your recommendation and reasoning +- Lead with your recommended option and explain why + +**Presenting the design:** +- Once you believe you understand what you're building, present the design +- Break it into sections of 200-300 words +- Ask after each section whether it looks right so far +- Cover: architecture, components, data flow, error handling, testing +- Be ready to go back and clarify if something doesn't make sense + +## After the Design + +**Documentation:** +- Write the validated design to `docs/plans/YYYY-MM-DD-<topic>-design.md` +- Use elements-of-style:writing-clearly-and-concisely skill if available +- Commit the design document to git + +**Implementation (if continuing):** +- Ask: "Ready to set up for implementation?" +- Use using-git-worktrees.md (in project knowledge) to create isolated workspace +- Use writing-plans.md (in project knowledge) to create detailed implementation plan + +## Key Principles + +- **One question at a time** - Don't overwhelm with multiple questions +- **Multiple choice preferred** - Easier to answer than open-ended when possible +- **YAGNI ruthlessly** - Remove unnecessary features from all designs +- **Explore alternatives** - Always propose 2-3 approaches before settling +- **Incremental validation** - Present design in sections, validate each +- **Be flexible** - Go back and clarify when something doesn't make sense diff --git a/superpowers-desktop/pro-setup/skills/core/systematic-debugging.md b/superpowers-desktop/pro-setup/skills/core/systematic-debugging.md new file mode 100644 index 000000000..f626cf25c --- /dev/null +++ b/superpowers-desktop/pro-setup/skills/core/systematic-debugging.md @@ -0,0 +1,297 @@ +--- +name: systematic-debugging +description: Use when encountering any bug, test failure, or unexpected behavior, before proposing fixes - four-phase framework (root cause investigation, pattern analysis, hypothesis testing, implementation) that ensures understanding before attempting solutions +--- + + +> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses. +# Systematic Debugging + +## Overview + +Random fixes waste time and create new bugs. Quick patches mask underlying issues. + +**Core principle:** ALWAYS find root cause before attempting fixes. Symptom fixes are failure. + +**Violating the letter of this process is violating the spirit of debugging.** + +## The Iron Law + +``` +NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST +``` + +If you haven't completed Phase 1, you cannot propose fixes. 
+ +## When to Use + +Use for ANY technical issue: +- Test failures +- Bugs in production +- Unexpected behavior +- Performance problems +- Build failures +- Integration issues + +**Use this ESPECIALLY when:** +- Under time pressure (emergencies make guessing tempting) +- "Just one quick fix" seems obvious +- You've already tried multiple fixes +- Previous fix didn't work +- You don't fully understand the issue + +**Don't skip when:** +- Issue seems simple (simple bugs have root causes too) +- You're in a hurry (rushing guarantees rework) +- Manager wants it fixed NOW (systematic is faster than thrashing) + +## The Four Phases + +You MUST complete each phase before proceeding to the next. + +### Phase 1: Root Cause Investigation + +**BEFORE attempting ANY fix:** + +1. **Read Error Messages Carefully** + - Don't skip past errors or warnings + - They often contain the exact solution + - Read stack traces completely + - Note line numbers, file paths, error codes + +2. **Reproduce Consistently** + - Can you trigger it reliably? + - What are the exact steps? + - Does it happen every time? + - If not reproducible → gather more data, don't guess + +3. **Check Recent Changes** + - What changed that could cause this? + - Git diff, recent commits + - New dependencies, config changes + - Environmental differences + +4. **Gather Evidence in Multi-Component Systems** + + **WHEN system has multiple components (CI → build → signing, API → service → database):** + + **BEFORE proposing fixes, add diagnostic instrumentation:** + ``` + For EACH component boundary: + - Log what data enters component + - Log what data exits component + - Verify environment/config propagation + - Check state at each layer + + Run once to gather evidence showing WHERE it breaks + THEN analyze evidence to identify failing component + THEN investigate that specific component + ``` + + **Example (multi-layer system):** + ```bash + # Layer 1: Workflow + echo "=== Secrets available in workflow: ===" + echo "IDENTITY: ${IDENTITY:+SET}${IDENTITY:-UNSET}" + + # Layer 2: Build script + echo "=== Env vars in build script: ===" + env | grep IDENTITY || echo "IDENTITY not in environment" + + # Layer 3: Signing script + echo "=== Keychain state: ===" + security list-keychains + security find-identity -v + + # Layer 4: Actual signing + codesign --sign "$IDENTITY" --verbose=4 "$APP" + ``` + + **This reveals:** Which layer fails (secrets → workflow ✓, workflow → build ✗) + +5. **Trace Data Flow** + + **WHEN error is deep in call stack:** + + **REQUIRED SUB-SKILL:** Use root-cause-tracing.md (in project knowledge) for backward tracing technique + + **Quick version:** + - Where does bad value originate? + - What called this with bad value? + - Keep tracing up until you find the source + - Fix at source, not at symptom + +### Phase 2: Pattern Analysis + +**Find the pattern before fixing:** + +1. **Find Working Examples** + - Locate similar working code in same codebase + - What works that's similar to what's broken? + +2. **Compare Against References** + - If implementing pattern, read reference implementation COMPLETELY + - Don't skim - read every line + - Understand the pattern fully before applying + +3. **Identify Differences** + - What's different between working and broken? + - List every difference, however small + - Don't assume "that can't matter" + +4. **Understand Dependencies** + - What other components does this need? + - What settings, config, environment? + - What assumptions does it make? 
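A minimal sketch of what this comparison can look like at the command line (the file paths and the `withRetry` name are illustrative assumptions, not from a real project):

```bash
# Find a working example of the same pattern elsewhere in the codebase
grep -rn "withRetry(" src/ --include="*.ts"

# List every difference between the broken module and the working one
diff src/jobs/process-queue.ts src/email/send-batch.ts

# Check what changed recently in the broken file
git log --oneline -10 -- src/jobs/process-queue.ts
git diff HEAD~10 -- src/jobs/process-queue.ts
```

Don't dismiss small differences the comparison surfaces - list them all before deciding which matter.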
+
+### Phase 3: Hypothesis and Testing
+
+**Scientific method:**
+
+1. **Form Single Hypothesis**
+   - State clearly: "I think X is the root cause because Y"
+   - Write it down
+   - Be specific, not vague
+
+2. **Test Minimally**
+   - Make the SMALLEST possible change to test hypothesis
+   - One variable at a time
+   - Don't fix multiple things at once
+
+3. **Verify Before Continuing**
+   - Did it work? Yes → Phase 4
+   - Didn't work? Form NEW hypothesis
+   - DON'T add more fixes on top
+
+4. **When You Don't Know**
+   - Say "I don't understand X"
+   - Don't pretend to know
+   - Ask for help
+   - Research more
+
+### Phase 4: Implementation
+
+**Fix the root cause, not the symptom:**
+
+1. **Create Failing Test Case**
+   - Simplest possible reproduction
+   - Automated test if possible
+   - One-off test script if no framework
+   - MUST have before fixing
+   - **REQUIRED SUB-SKILL:** Use test-driven-development.md (in project knowledge) for writing proper failing tests
+
+2. **Implement Single Fix**
+   - Address the root cause identified
+   - ONE change at a time
+   - No "while I'm here" improvements
+   - No bundled refactoring
+
+3. **Verify Fix**
+   - Test passes now?
+   - No other tests broken?
+   - Issue actually resolved?
+
+4. **If Fix Doesn't Work**
+   - STOP
+   - Count: How many fixes have you tried?
+   - If < 3: Return to Phase 1, re-analyze with new information
+   - **If ≥ 3: STOP and question the architecture (step 5 below)**
+   - DON'T attempt Fix #4 without architectural discussion
+
+5. **If 3+ Fixes Failed: Question Architecture**
+
+   **Pattern indicating architectural problem:**
+   - Each fix reveals new shared state/coupling/problem in different place
+   - Fixes require "massive refactoring" to implement
+   - Each fix creates new symptoms elsewhere
+
+   **STOP and question fundamentals:**
+   - Is this pattern fundamentally sound?
+   - Are we "sticking with it through sheer inertia"?
+   - Should we refactor architecture vs. continue fixing symptoms?
+
+   **Discuss with your human partner before attempting more fixes**
+
+   This is NOT a failed hypothesis - this is a wrong architecture.
+
+## Red Flags - STOP and Follow Process
+
+If you catch yourself thinking:
+- "Quick fix for now, investigate later"
+- "Just try changing X and see if it works"
+- "Add multiple changes, run tests"
+- "Skip the test, I'll manually verify"
+- "It's probably X, let me fix that"
+- "I don't fully understand but this might work"
+- "Pattern says X but I'll adapt it differently"
+- "Here are the main problems: [lists fixes without investigation]"
+- Proposing solutions before tracing data flow
+- **"One more fix attempt" (when already tried 2+)**
+- **Each fix reveals new problem in different place**
+
+**ALL of these mean: STOP. Return to Phase 1.**
+
+**If 3+ fixes failed:** Question the architecture (see Phase 4, Step 5)
+
+## Your Human Partner's Signals You're Doing It Wrong
+
+**Watch for these redirections:**
+- "Is that not happening?" - You assumed without verifying
+- "Will it show us...?" - You should have added evidence gathering
+- "Stop guessing" - You're proposing fixes without understanding
+- "Ultrathink this" - Question fundamentals, not just symptoms
+- "We're stuck?" (frustrated) - Your approach isn't working
+
+**When you see these:** STOP. Return to Phase 1.
+
+## Common Rationalizations
+
+| Excuse | Reality |
+|--------|---------|
+| "Issue is simple, don't need process" | Simple issues have root causes too. Process is fast for simple bugs.
| +| "Emergency, no time for process" | Systematic debugging is FASTER than guess-and-check thrashing. | +| "Just try this first, then investigate" | First fix sets the pattern. Do it right from the start. | +| "I'll write test after confirming fix works" | Untested fixes don't stick. Test first proves it. | +| "Multiple fixes at once saves time" | Can't isolate what worked. Causes new bugs. | +| "Reference too long, I'll adapt the pattern" | Partial understanding guarantees bugs. Read it completely. | +| "I see the problem, let me fix it" | Seeing symptoms ≠ understanding root cause. | +| "One more fix attempt" (after 2+ failures) | 3+ failures = architectural problem. Question pattern, don't fix again. | + +## Quick Reference + +| Phase | Key Activities | Success Criteria | +|-------|---------------|------------------| +| **1. Root Cause** | Read errors, reproduce, check changes, gather evidence | Understand WHAT and WHY | +| **2. Pattern** | Find working examples, compare | Identify differences | +| **3. Hypothesis** | Form theory, test minimally | Confirmed or new hypothesis | +| **4. Implementation** | Create test, fix, verify | Bug resolved, tests pass | + +## When Process Reveals "No Root Cause" + +If systematic investigation reveals issue is truly environmental, timing-dependent, or external: + +1. You've completed the process +2. Document what you investigated +3. Implement appropriate handling (retry, timeout, error message) +4. Add monitoring/logging for future investigation + +**But:** 95% of "no root cause" cases are incomplete investigation. + +## Integration with Other Skills + +**This skill requires using:** +- **root-cause-tracing** - REQUIRED when error is deep in call stack (see Phase 1, Step 5) +- **test-driven-development** - REQUIRED for creating failing test case (see Phase 4, Step 1) + +**Complementary skills:** +- **defense-in-depth** - Add validation at multiple layers after finding root cause +- **condition-based-waiting** - Replace arbitrary timeouts identified in Phase 2 +- **verification-before-completion** - Verify fix worked before claiming success + +## Real-World Impact + +From debugging sessions: +- Systematic approach: 15-30 minutes to fix +- Random fixes approach: 2-3 hours of thrashing +- First-time fix rate: 95% vs 40% +- New bugs introduced: Near zero vs common diff --git a/superpowers-desktop/pro-setup/skills/core/test-driven-development.md b/superpowers-desktop/pro-setup/skills/core/test-driven-development.md new file mode 100644 index 000000000..9a6cfc4ff --- /dev/null +++ b/superpowers-desktop/pro-setup/skills/core/test-driven-development.md @@ -0,0 +1,366 @@ +--- +name: test-driven-development +description: Use when implementing any feature or bugfix, before writing implementation code - write the test first, watch it fail, write minimal code to pass; ensures tests actually verify behavior by requiring failure first +--- + + +> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses. +# Test-Driven Development (TDD) + +## Overview + +Write the test first. Watch it fail. Write minimal code to pass. + +**Core principle:** If you didn't watch the test fail, you don't know if it tests the right thing. 
+ +**Violating the letter of the rules is violating the spirit of the rules.** + +## When to Use + +**Always:** +- New features +- Bug fixes +- Refactoring +- Behavior changes + +**Exceptions (ask your human partner):** +- Throwaway prototypes +- Generated code +- Configuration files + +Thinking "skip TDD just this once"? Stop. That's rationalization. + +## The Iron Law + +``` +NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST +``` + +Write code before the test? Delete it. Start over. + +**No exceptions:** +- Don't keep it as "reference" +- Don't "adapt" it while writing tests +- Don't look at it +- Delete means delete + +Implement fresh from tests. Period. + +## Red-Green-Refactor + +```dot +digraph tdd_cycle { + rankdir=LR; + red [label="RED\nWrite failing test", shape=box, style=filled, fillcolor="#ffcccc"]; + verify_red [label="Verify fails\ncorrectly", shape=diamond]; + green [label="GREEN\nMinimal code", shape=box, style=filled, fillcolor="#ccffcc"]; + verify_green [label="Verify passes\nAll green", shape=diamond]; + refactor [label="REFACTOR\nClean up", shape=box, style=filled, fillcolor="#ccccff"]; + next [label="Next", shape=ellipse]; + + red -> verify_red; + verify_red -> green [label="yes"]; + verify_red -> red [label="wrong\nfailure"]; + green -> verify_green; + verify_green -> refactor [label="yes"]; + verify_green -> green [label="no"]; + refactor -> verify_green [label="stay\ngreen"]; + verify_green -> next; + next -> red; +} +``` + +### RED - Write Failing Test + +Write one minimal test showing what should happen. + +<Good> +```typescript +test('retries failed operations 3 times', async () => { + let attempts = 0; + const operation = () => { + attempts++; + if (attempts < 3) throw new Error('fail'); + return 'success'; + }; + + const result = await retryOperation(operation); + + expect(result).toBe('success'); + expect(attempts).toBe(3); +}); +``` +Clear name, tests real behavior, one thing +</Good> + +<Bad> +```typescript +test('retry works', async () => { + const mock = jest.fn() + .mockRejectedValueOnce(new Error()) + .mockRejectedValueOnce(new Error()) + .mockResolvedValueOnce('success'); + await retryOperation(mock); + expect(mock).toHaveBeenCalledTimes(3); +}); +``` +Vague name, tests mock not code +</Bad> + +**Requirements:** +- One behavior +- Clear name +- Real code (no mocks unless unavoidable) + +### Verify RED - Watch It Fail + +**MANDATORY. Never skip.** + +```bash +npm test path/to/test.test.ts +``` + +Confirm: +- Test fails (not errors) +- Failure message is expected +- Fails because feature missing (not typos) + +**Test passes?** You're testing existing behavior. Fix test. + +**Test errors?** Fix error, re-run until it fails correctly. + +### GREEN - Minimal Code + +Write simplest code to pass the test. + +<Good> +```typescript +async function retryOperation<T>(fn: () => Promise<T>): Promise<T> { + for (let i = 0; i < 3; i++) { + try { + return await fn(); + } catch (e) { + if (i === 2) throw e; + } + } + throw new Error('unreachable'); +} +``` +Just enough to pass +</Good> + +<Bad> +```typescript +async function retryOperation<T>( + fn: () => Promise<T>, + options?: { + maxRetries?: number; + backoff?: 'linear' | 'exponential'; + onRetry?: (attempt: number) => void; + } +): Promise<T> { + // YAGNI +} +``` +Over-engineered +</Bad> + +Don't add features, refactor other code, or "improve" beyond the test. 
+ +### Verify GREEN - Watch It Pass + +**MANDATORY.** + +```bash +npm test path/to/test.test.ts +``` + +Confirm: +- Test passes +- Other tests still pass +- Output pristine (no errors, warnings) + +**Test fails?** Fix code, not test. + +**Other tests fail?** Fix now. + +### REFACTOR - Clean Up + +After green only: +- Remove duplication +- Improve names +- Extract helpers + +Keep tests green. Don't add behavior. + +### Repeat + +Next failing test for next feature. + +## Good Tests + +| Quality | Good | Bad | +|---------|------|-----| +| **Minimal** | One thing. "and" in name? Split it. | `test('validates email and domain and whitespace')` | +| **Clear** | Name describes behavior | `test('test1')` | +| **Shows intent** | Demonstrates desired API | Obscures what code should do | + +## Why Order Matters + +**"I'll write tests after to verify it works"** + +Tests written after code pass immediately. Passing immediately proves nothing: +- Might test wrong thing +- Might test implementation, not behavior +- Might miss edge cases you forgot +- You never saw it catch the bug + +Test-first forces you to see the test fail, proving it actually tests something. + +**"I already manually tested all the edge cases"** + +Manual testing is ad-hoc. You think you tested everything but: +- No record of what you tested +- Can't re-run when code changes +- Easy to forget cases under pressure +- "It worked when I tried it" ≠ comprehensive + +Automated tests are systematic. They run the same way every time. + +**"Deleting X hours of work is wasteful"** + +Sunk cost fallacy. The time is already gone. Your choice now: +- Delete and rewrite with TDD (X more hours, high confidence) +- Keep it and add tests after (30 min, low confidence, likely bugs) + +The "waste" is keeping code you can't trust. Working code without real tests is technical debt. + +**"TDD is dogmatic, being pragmatic means adapting"** + +TDD IS pragmatic: +- Finds bugs before commit (faster than debugging after) +- Prevents regressions (tests catch breaks immediately) +- Documents behavior (tests show how to use code) +- Enables refactoring (change freely, tests catch breaks) + +"Pragmatic" shortcuts = debugging in production = slower. + +**"Tests after achieve the same goals - it's spirit not ritual"** + +No. Tests-after answer "What does this do?" Tests-first answer "What should this do?" + +Tests-after are biased by your implementation. You test what you built, not what's required. You verify remembered edge cases, not discovered ones. + +Tests-first force edge case discovery before implementing. Tests-after verify you remembered everything (you didn't). + +30 minutes of tests after ≠ TDD. You get coverage, lose proof tests work. + +## Common Rationalizations + +| Excuse | Reality | +|--------|---------| +| "Too simple to test" | Simple code breaks. Test takes 30 seconds. | +| "I'll test after" | Tests passing immediately prove nothing. | +| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" | +| "Already manually tested" | Ad-hoc ≠ systematic. No record, can't re-run. | +| "Deleting X hours is wasteful" | Sunk cost fallacy. Keeping unverified code is technical debt. | +| "Keep as reference, write tests first" | You'll adapt it. That's testing after. Delete means delete. | +| "Need to explore first" | Fine. Throw away exploration, start with TDD. | +| "Test hard = design unclear" | Listen to test. Hard to test = hard to use. | +| "TDD will slow me down" | TDD faster than debugging. 
Pragmatic = test-first. | +| "Manual test faster" | Manual doesn't prove edge cases. You'll re-test every change. | +| "Existing code has no tests" | You're improving it. Add tests for existing code. | + +## Red Flags - STOP and Start Over + +- Code before test +- Test after implementation +- Test passes immediately +- Can't explain why test failed +- Tests added "later" +- Rationalizing "just this once" +- "I already manually tested it" +- "Tests after achieve the same purpose" +- "It's about spirit not ritual" +- "Keep as reference" or "adapt existing code" +- "Already spent X hours, deleting is wasteful" +- "TDD is dogmatic, I'm being pragmatic" +- "This is different because..." + +**All of these mean: Delete code. Start over with TDD.** + +## Example: Bug Fix + +**Bug:** Empty email accepted + +**RED** +```typescript +test('rejects empty email', async () => { + const result = await submitForm({ email: '' }); + expect(result.error).toBe('Email required'); +}); +``` + +**Verify RED** +```bash +$ npm test +FAIL: expected 'Email required', got undefined +``` + +**GREEN** +```typescript +function submitForm(data: FormData) { + if (!data.email?.trim()) { + return { error: 'Email required' }; + } + // ... +} +``` + +**Verify GREEN** +```bash +$ npm test +PASS +``` + +**REFACTOR** +Extract validation for multiple fields if needed. + +## Verification Checklist + +Before marking work complete: + +- [ ] Every new function/method has a test +- [ ] Watched each test fail before implementing +- [ ] Each test failed for expected reason (feature missing, not typo) +- [ ] Wrote minimal code to pass each test +- [ ] All tests pass +- [ ] Output pristine (no errors, warnings) +- [ ] Tests use real code (mocks only if unavoidable) +- [ ] Edge cases and errors covered + +Can't check all boxes? You skipped TDD. Start over. + +## When Stuck + +| Problem | Solution | +|---------|----------| +| Don't know how to test | Write wished-for API. Write assertion first. Ask your human partner. | +| Test too complicated | Design too complicated. Simplify interface. | +| Must mock everything | Code too coupled. Use dependency injection. | +| Test setup huge | Extract helpers. Still complex? Simplify design. | + +## Debugging Integration + +Bug found? Write failing test reproducing it. Follow TDD cycle. Test proves fix and prevents regression. + +Never fix bugs without a test. + +## Final Rule + +``` +Production code → test exists and failed first +Otherwise → not TDD +``` + +No exceptions without your human partner's permission. diff --git a/superpowers-desktop/pro-setup/skills/core/using-superpowers.md b/superpowers-desktop/pro-setup/skills/core/using-superpowers.md new file mode 100644 index 000000000..45c80c17a --- /dev/null +++ b/superpowers-desktop/pro-setup/skills/core/using-superpowers.md @@ -0,0 +1,103 @@ +--- +name: using-superpowers +description: Use when starting any conversation - establishes mandatory workflows for finding and using skills, including using skill reference before announcing usage, following brainstorming before coding, and creating explicit checklist tracking todos for checklists +--- + + +> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses. +<EXTREMELY-IMPORTANT> +If you think there is even a 1% chance a skill might apply to what you are doing, you ABSOLUTELY MUST read the skill. 
+
+IF A SKILL APPLIES TO YOUR TASK, YOU DO NOT HAVE A CHOICE. YOU MUST USE IT.
+
+This is not negotiable. This is not optional. You cannot rationalize your way out of this.
+</EXTREMELY-IMPORTANT>
+
+# Getting Started with Skills
+
+## MANDATORY FIRST RESPONSE PROTOCOL
+
+Before responding to ANY user message, you MUST complete this checklist:
+
+1. ☐ List available skills in your mind
+2. ☐ Ask yourself: "Does ANY skill match this request?"
+3. ☐ If yes → Read the skill from the project knowledge files
+4. ☐ Announce which skill you're using
+5. ☐ Follow the skill exactly
+
+**Responding WITHOUT completing this checklist = automatic failure.**
+
+## Critical Rules
+
+1. **Follow mandatory workflows.** Brainstorming before coding. Check for relevant skills before ANY task.
+
+2. **Execute skills via the skill reference.** Read the referenced skill from project knowledge and follow it exactly.
+
+## Common Rationalizations That Mean You're About To Fail
+
+If you catch yourself thinking ANY of these thoughts, STOP. You are rationalizing. Check for and use the skill.
+
+- "This is just a simple question" → WRONG. Questions are tasks. Check for skills.
+- "I can check git/files quickly" → WRONG. Files don't have conversation context. Check for skills.
+- "Let me gather information first" → WRONG. Skills tell you HOW to gather information. Check for skills.
+- "This doesn't need a formal skill" → WRONG. If a skill exists for it, use it.
+- "I remember this skill" → WRONG. Skills evolve. Run the current version.
+- "This doesn't count as a task" → WRONG. If you're taking action, it's a task. Check for skills.
+- "The skill is overkill for this" → WRONG. Skills exist because simple things become complex. Use it.
+- "I'll just do this one thing first" → WRONG. Check for skills BEFORE doing anything.
+
+**Why:** Skills document proven techniques that save time and prevent mistakes. Not using available skills means repeating solved problems and making known errors.
+
+If a skill for your task exists, you must use it or you will fail at your task.
+
+## Skills with Checklists
+
+If a skill has a checklist, YOU MUST create an explicit tracking todo for EACH item.
+
+**Don't:**
+- Work through checklist mentally
+- Skip creating todos "to save time"
+- Batch multiple items into one todo
+- Mark complete without doing them
+
+**Why:** Checklists without explicit tracking = steps get skipped. Every time. The overhead of explicit tracking is tiny compared to the cost of missing steps.
+
+## Announcing Skill Usage
+
+Before using a skill, announce that you are using it.
+"I'm using [Skill Name] to [what you're doing]."
+
+**Examples:**
+- "I'm using the brainstorming skill to refine your idea into a design."
+- "I'm using the test-driven-development skill to implement this feature."
+
+**Why:** Transparency helps your human partner understand your process and catch errors early. It also confirms you actually read the skill.
+
+# About these skills
+
+**Many skills contain rigid rules (TDD, debugging, verification).** Follow them exactly. Don't adapt away the discipline.
+
+**Some skills are flexible patterns (architecture, naming).** Adapt core principles to your context.
+
+The skill itself tells you which type it is.
+
+## Instructions ≠ Permission to Skip Workflows
+
+Your human partner's specific instructions describe WHAT to do, not HOW.
+
+"Add X", "Fix Y" = the goal, NOT permission to skip brainstorming, TDD, or RED-GREEN-REFACTOR.
+
+**Red flags:** "Instruction was specific" • "Seems simple" • "Workflow is overkill"
+
+**Why:** Specific instructions mean clear requirements, which is when workflows matter MOST. Skipping process on "simple" tasks is how simple tasks become complex problems.
+
+## Summary
+
+**Starting any task:**
+1. If relevant skill exists → Use the skill
+2. Announce you're using it
+3. Follow what it says
+
+**Skill has checklist?** Create an explicit tracking todo for every item.
+
+**Finding a relevant skill makes reading and using it mandatory. Not optional.**
diff --git a/superpowers-desktop/pro-setup/skills/debugging/defense-in-depth.md b/superpowers-desktop/pro-setup/skills/debugging/defense-in-depth.md
new file mode 100644
index 000000000..046834ea2
--- /dev/null
+++ b/superpowers-desktop/pro-setup/skills/debugging/defense-in-depth.md
@@ -0,0 +1,129 @@
+---
+name: defense-in-depth
+description: Use when invalid data causes failures deep in execution, requiring validation at multiple system layers - validates at every layer data passes through to make bugs structurally impossible
+---
+
+
+> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses.
+# Defense-in-Depth Validation
+
+## Overview
+
+When you fix a bug caused by invalid data, adding validation at one place feels sufficient. But that single check can be bypassed by different code paths, refactoring, or mocks.
+
+**Core principle:** Validate at EVERY layer data passes through. Make the bug structurally impossible.
+
+## Why Multiple Layers
+
+Single validation: "We fixed the bug"
+Multiple layers: "We made the bug impossible"
+
+Different layers catch different cases:
+- Entry validation catches most bugs
+- Business logic catches edge cases
+- Environment guards prevent context-specific dangers
+- Debug logging helps when other layers fail
+
+## The Four Layers
+
+### Layer 1: Entry Point Validation
+**Purpose:** Reject obviously invalid input at API boundary
+
+```typescript
+import { existsSync, statSync } from 'fs';
+
+function createProject(name: string, workingDirectory: string) {
+  if (!workingDirectory || workingDirectory.trim() === '') {
+    throw new Error('workingDirectory cannot be empty');
+  }
+  if (!existsSync(workingDirectory)) {
+    throw new Error(`workingDirectory does not exist: ${workingDirectory}`);
+  }
+  if (!statSync(workingDirectory).isDirectory()) {
+    throw new Error(`workingDirectory is not a directory: ${workingDirectory}`);
+  }
+  // ... proceed
+}
+```
+
+### Layer 2: Business Logic Validation
+**Purpose:** Ensure data makes sense for this operation
+
+```typescript
+function initializeWorkspace(projectDir: string, sessionId: string) {
+  if (!projectDir) {
+    throw new Error('projectDir required for workspace initialization');
+  }
+  // ... proceed
+}
+```
+
+### Layer 3: Environment Guards
+**Purpose:** Prevent dangerous operations in specific contexts
+
+```typescript
+import { normalize, resolve } from 'path';
+import { tmpdir } from 'os';
+
+async function gitInit(directory: string) {
+  // In tests, refuse git init outside temp directories
+  if (process.env.NODE_ENV === 'test') {
+    const normalized = normalize(resolve(directory));
+    const tmpDir = normalize(resolve(tmpdir()));
+
+    if (!normalized.startsWith(tmpDir)) {
+      throw new Error(
+        `Refusing git init outside temp dir during tests: ${directory}`
+      );
+    }
+  }
+  // ... proceed
+}
+```
+
+### Layer 4: Debug Instrumentation
+**Purpose:** Capture context for forensics
+
+```typescript
+async function gitInit(directory: string) {
+  const stack = new Error().stack;
+  // Assumes a project-level logger instance
+  logger.debug('About to git init', {
+    directory,
+    cwd: process.cwd(),
+    stack,
+  });
+  // ... proceed
+}
+```
+
+## Applying the Pattern
+
+When you find a bug:
+
+1. **Trace the data flow** - Where does bad value originate? Where used?
+2. **Map all checkpoints** - List every point data passes through
+3. **Add validation at each layer** - Entry, business, environment, debug
+4. **Test each layer** - Try to bypass layer 1, verify layer 2 catches it
+
+## Example from Session
+
+Bug: Empty `projectDir` caused `git init` in source code
+
+**Data flow:**
+1. Test setup → empty string
+2. `Project.create(name, '')`
+3. `WorkspaceManager.createWorkspace('')`
+4. `git init` runs in `process.cwd()`
+
+**Four layers added:**
+- Layer 1: `Project.create()` validates not empty/exists/writable
+- Layer 2: `WorkspaceManager` validates projectDir not empty
+- Layer 3: `WorktreeManager` refuses git init outside tmpdir in tests
+- Layer 4: Stack trace logging before git init
+
+**Result:** All 1847 tests passed, bug impossible to reproduce
+
+## Key Insight
+
+All four layers were necessary. During testing, each layer caught bugs the others missed:
+- Different code paths bypassed entry validation
+- Mocks bypassed business logic checks
+- Edge cases on different platforms needed environment guards
+- Debug logging identified structural misuse
+
+**Don't stop at one validation point.** Add checks at every layer.
diff --git a/superpowers-desktop/pro-setup/skills/debugging/find-polluter.sh b/superpowers-desktop/pro-setup/skills/debugging/find-polluter.sh
new file mode 100755
index 000000000..6af921338
--- /dev/null
+++ b/superpowers-desktop/pro-setup/skills/debugging/find-polluter.sh
@@ -0,0 +1,63 @@
+#!/bin/bash
+# Bisection script to find which test creates unwanted files/state
+# Usage: ./find-polluter.sh <file_or_dir_to_check> <test_pattern>
+# Example: ./find-polluter.sh '.git' 'src/**/*.test.ts'
+
+set -e
+
+if [ $# -ne 2 ]; then
+  echo "Usage: $0 <file_to_check> <test_pattern>"
+  echo "Example: $0 '.git' 'src/**/*.test.ts'"
+  exit 1
+fi
+
+POLLUTION_CHECK="$1"
+TEST_PATTERN="$2"
+
+echo "🔍 Searching for test that creates: $POLLUTION_CHECK"
+echo "Test pattern: $TEST_PATTERN"
+echo ""
+
+# Get list of test files (find reports paths with a leading "./", so anchor the pattern there)
+TEST_FILES=$(find . -path "./$TEST_PATTERN" | sort)
+TOTAL=$(echo "$TEST_FILES" | grep -c . || true)
+
+if [ "$TOTAL" -eq 0 ]; then
+  echo "No test files matched pattern: $TEST_PATTERN"
+  exit 1
+fi
+
+echo "Found $TOTAL test files"
+echo ""
+
+COUNT=0
+for TEST_FILE in $TEST_FILES; do
+  COUNT=$((COUNT + 1))
+
+  # Skip if pollution already exists
+  if [ -e "$POLLUTION_CHECK" ]; then
+    echo "⚠️  Pollution already exists before test $COUNT/$TOTAL"
+    echo "   Skipping: $TEST_FILE"
+    continue
+  fi
+
+  echo "[$COUNT/$TOTAL] Testing: $TEST_FILE"
+
+  # Run the test
+  npm test "$TEST_FILE" > /dev/null 2>&1 || true
+
+  # Check if pollution appeared
+  if [ -e "$POLLUTION_CHECK" ]; then
+    echo ""
+    echo "🎯 FOUND POLLUTER!"
+    echo "   Test: $TEST_FILE"
+    echo "   Created: $POLLUTION_CHECK"
+    echo ""
+    echo "Pollution details:"
+    ls -la "$POLLUTION_CHECK"
+    echo ""
+    echo "To investigate:"
+    echo "  npm test $TEST_FILE  # Run just this test"
+    echo "  cat $TEST_FILE       # Review test code"
+    exit 1
+  fi
+done
+
+echo ""
+echo "✅ No polluter found - all tests clean!"
+exit 0 diff --git a/superpowers-desktop/pro-setup/skills/debugging/root-cause-tracing.md b/superpowers-desktop/pro-setup/skills/debugging/root-cause-tracing.md new file mode 100644 index 000000000..483919dc2 --- /dev/null +++ b/superpowers-desktop/pro-setup/skills/debugging/root-cause-tracing.md @@ -0,0 +1,176 @@ +--- +name: root-cause-tracing +description: Use when errors occur deep in execution and you need to trace back to find the original trigger - systematically traces bugs backward through call stack, adding instrumentation when needed, to identify source of invalid data or incorrect behavior +--- + + +> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses. +# Root Cause Tracing + +## Overview + +Bugs often manifest deep in the call stack (git init in wrong directory, file created in wrong location, database opened with wrong path). Your instinct is to fix where the error appears, but that's treating a symptom. + +**Core principle:** Trace backward through the call chain until you find the original trigger, then fix at the source. + +## When to Use + +```dot +digraph when_to_use { + "Bug appears deep in stack?" [shape=diamond]; + "Can trace backwards?" [shape=diamond]; + "Fix at symptom point" [shape=box]; + "Trace to original trigger" [shape=box]; + "BETTER: Also add defense-in-depth" [shape=box]; + + "Bug appears deep in stack?" -> "Can trace backwards?" [label="yes"]; + "Can trace backwards?" -> "Trace to original trigger" [label="yes"]; + "Can trace backwards?" -> "Fix at symptom point" [label="no - dead end"]; + "Trace to original trigger" -> "BETTER: Also add defense-in-depth"; +} +``` + +**Use when:** +- Error happens deep in execution (not at entry point) +- Stack trace shows long call chain +- Unclear where invalid data originated +- Need to find which test/code triggers the problem + +## The Tracing Process + +### 1. Observe the Symptom +``` +Error: git init failed in /Users/jesse/project/packages/core +``` + +### 2. Find Immediate Cause +**What code directly causes this?** +```typescript +await execFileAsync('git', ['init'], { cwd: projectDir }); +``` + +### 3. Ask: What Called This? +```typescript +WorktreeManager.createSessionWorktree(projectDir, sessionId) + → called by Session.initializeWorkspace() + → called by Session.create() + → called by test at Project.create() +``` + +### 4. Keep Tracing Up +**What value was passed?** +- `projectDir = ''` (empty string!) +- Empty string as `cwd` resolves to `process.cwd()` +- That's the source code directory! + +### 5. Find Original Trigger +**Where did empty string come from?** +```typescript +const context = setupCoreTest(); // Returns { tempDir: '' } +Project.create('name', context.tempDir); // Accessed before beforeEach! 
+```
+``` + +## Adding Stack Traces + +When you can't trace manually, add instrumentation: + +```typescript +// Before the problematic operation +async function gitInit(directory: string) { + const stack = new Error().stack; + console.error('DEBUG git init:', { + directory, + cwd: process.cwd(), + nodeEnv: process.env.NODE_ENV, + stack, + }); + + await execFileAsync('git', ['init'], { cwd: directory }); +} +``` + +**Critical:** Use `console.error()` in tests (not logger - may not show) + +**Run and capture:** +```bash +npm test 2>&1 | grep 'DEBUG git init' +``` + +**Analyze stack traces:** +- Look for test file names +- Find the line number triggering the call +- Identify the pattern (same test? same parameter?) + +## Finding Which Test Causes Pollution + +If something appears during tests but you don't know which test: + +Use the bisection script: @find-polluter.sh + +```bash +./find-polluter.sh '.git' 'src/**/*.test.ts' +``` + +Runs tests one-by-one, stops at first polluter. See script for usage. + +## Real Example: Empty projectDir + +**Symptom:** `.git` created in `packages/core/` (source code) + +**Trace chain:** +1. `git init` runs in `process.cwd()` ← empty cwd parameter +2. WorktreeManager called with empty projectDir +3. Session.create() passed empty string +4. Test accessed `context.tempDir` before beforeEach +5. setupCoreTest() returns `{ tempDir: '' }` initially + +**Root cause:** Top-level variable initialization accessing empty value + +**Fix:** Made tempDir a getter that throws if accessed before beforeEach + +**Also added defense-in-depth:** +- Layer 1: Project.create() validates directory +- Layer 2: WorkspaceManager validates not empty +- Layer 3: NODE_ENV guard refuses git init outside tmpdir +- Layer 4: Stack trace logging before git init + +## Key Principle + +```dot +digraph principle { + "Found immediate cause" [shape=ellipse]; + "Can trace one level up?" [shape=diamond]; + "Trace backwards" [shape=box]; + "Is this the source?" [shape=diamond]; + "Fix at source" [shape=box]; + "Add validation at each layer" [shape=box]; + "Bug impossible" [shape=doublecircle]; + "NEVER fix just the symptom" [shape=octagon, style=filled, fillcolor=red, fontcolor=white]; + + "Found immediate cause" -> "Can trace one level up?"; + "Can trace one level up?" -> "Trace backwards" [label="yes"]; + "Can trace one level up?" -> "NEVER fix just the symptom" [label="no"]; + "Trace backwards" -> "Is this the source?"; + "Is this the source?" -> "Trace backwards" [label="no - keeps going"]; + "Is this the source?" -> "Fix at source" [label="yes"]; + "Fix at source" -> "Add validation at each layer"; + "Add validation at each layer" -> "Bug impossible"; +} +``` + +**NEVER fix just where the error appears.** Trace back to find the original trigger. 
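+
+As a concrete sketch of "fix at source" from the example above - assuming a vitest/jest-style `beforeEach` and an invented `createTempDir()` helper; this is an illustrative reconstruction, not the real codebase:
+
+```typescript
+// Hypothetical fix: tempDir becomes a getter that refuses to be read
+// before beforeEach has populated it, so misuse fails loudly at the source.
+function setupCoreTest() {
+  let tempDir: string | undefined;
+
+  beforeEach(() => {
+    tempDir = createTempDir(); // assumed helper returning a fresh temp path
+  });
+
+  return {
+    get tempDir(): string {
+      if (!tempDir) {
+        throw new Error('tempDir accessed before beforeEach - move the access into the test body');
+      }
+      return tempDir;
+    },
+  };
+}
+```
+
+With this guard, the original mistake (reading `context.tempDir` at module load) throws immediately at its source instead of surfacing later as a `git init` in the wrong directory.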
+ +## Stack Trace Tips + +**In tests:** Use `console.error()` not logger - logger may be suppressed +**Before operation:** Log before the dangerous operation, not after it fails +**Include context:** Directory, cwd, environment variables, timestamps +**Capture stack:** `new Error().stack` shows complete call chain + +## Real-World Impact + +From debugging session (2025-10-03): +- Found root cause through 5-level trace +- Fixed at source (getter validation) +- Added 4 layers of defense +- 1847 tests passed, zero pollution diff --git a/superpowers-desktop/pro-setup/skills/debugging/verification-before-completion.md b/superpowers-desktop/pro-setup/skills/debugging/verification-before-completion.md new file mode 100644 index 000000000..b81fea8a0 --- /dev/null +++ b/superpowers-desktop/pro-setup/skills/debugging/verification-before-completion.md @@ -0,0 +1,141 @@ +--- +name: verification-before-completion +description: Use when about to claim work is complete, fixed, or passing, before committing or creating PRs - requires running verification commands and confirming output before making any success claims; evidence before assertions always +--- + + +> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses. +# Verification Before Completion + +## Overview + +Claiming work is complete without verification is dishonesty, not efficiency. + +**Core principle:** Evidence before claims, always. + +**Violating the letter of this rule is violating the spirit of this rule.** + +## The Iron Law + +``` +NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE +``` + +If you haven't run the verification command in this message, you cannot claim it passes. + +## The Gate Function + +``` +BEFORE claiming any status or expressing satisfaction: + +1. IDENTIFY: What command proves this claim? +2. RUN: Execute the FULL command (fresh, complete) +3. READ: Full output, check exit code, count failures +4. VERIFY: Does output confirm the claim? + - If NO: State actual status with evidence + - If YES: State claim WITH evidence +5. ONLY THEN: Make the claim + +Skip any step = lying, not verifying +``` + +## Common Failures + +| Claim | Requires | Not Sufficient | +|-------|----------|----------------| +| Tests pass | Test command output: 0 failures | Previous run, "should pass" | +| Linter clean | Linter output: 0 errors | Partial check, extrapolation | +| Build succeeds | Build command: exit 0 | Linter passing, logs look good | +| Bug fixed | Test original symptom: passes | Code changed, assumed fixed | +| Regression test works | Red-green cycle verified | Test passes once | +| Agent completed | VCS diff shows changes | Agent reports "success" | +| Requirements met | Line-by-line checklist | Tests passing | + +## Red Flags - STOP + +- Using "should", "probably", "seems to" +- Expressing satisfaction before verification ("Great!", "Perfect!", "Done!", etc.) 
+- About to commit/push/PR without verification +- Trusting agent success reports +- Relying on partial verification +- Thinking "just this once" +- Tired and wanting work over +- **ANY wording implying success without having run verification** + +## Rationalization Prevention + +| Excuse | Reality | +|--------|---------| +| "Should work now" | RUN the verification | +| "I'm confident" | Confidence ≠ evidence | +| "Just this once" | No exceptions | +| "Linter passed" | Linter ≠ compiler | +| "Agent said success" | Verify independently | +| "I'm tired" | Exhaustion ≠ excuse | +| "Partial check is enough" | Partial proves nothing | +| "Different words so rule doesn't apply" | Spirit over letter | + +## Key Patterns + +**Tests:** +``` +✅ [Run test command] [See: 34/34 pass] "All tests pass" +❌ "Should pass now" / "Looks correct" +``` + +**Regression tests (TDD Red-Green):** +``` +✅ Write → Run (pass) → Revert fix → Run (MUST FAIL) → Restore → Run (pass) +❌ "I've written a regression test" (without red-green verification) +``` + +**Build:** +``` +✅ [Run build] [See: exit 0] "Build passes" +❌ "Linter passed" (linter doesn't check compilation) +``` + +**Requirements:** +``` +✅ Re-read plan → Create checklist → Verify each → Report gaps or completion +❌ "Tests pass, phase complete" +``` + +**Agent delegation:** +``` +✅ Agent reports success → Check VCS diff → Verify changes → Report actual state +❌ Trust agent report +``` + +## Why This Matters + +From 24 failure memories: +- your human partner said "I don't believe you" - trust broken +- Undefined functions shipped - would crash +- Missing requirements shipped - incomplete features +- Time wasted on false completion → redirect → rework +- Violates: "Honesty is a core value. If you lie, you'll be replaced." + +## When To Apply + +**ALWAYS before:** +- ANY variation of success/completion claims +- ANY expression of satisfaction +- ANY positive statement about work state +- Committing, PR creation, task completion +- Moving to next task +- Delegating to agents + +**Rule applies to:** +- Exact phrases +- Paraphrases and synonyms +- Implications of success +- ANY communication suggesting completion/correctness + +## The Bottom Line + +**No shortcuts for verification.** + +Run the command. Read the output. THEN claim the result. + +This is non-negotiable. diff --git a/superpowers-desktop/pro-setup/skills/index.md b/superpowers-desktop/pro-setup/skills/index.md new file mode 100644 index 000000000..d2a0ef76b --- /dev/null +++ b/superpowers-desktop/pro-setup/skills/index.md @@ -0,0 +1,262 @@ +# Superpowers Skills Index + +## Quick Decision Guide + +**I'm implementing a feature or fixing a bug** → `core/test-driven-development.md` + +**I encountered a bug or test failure** → `core/systematic-debugging.md` + +**I'm designing a new feature** → `core/brainstorming.md` + +**I'm starting work** → `core/using-superpowers.md` (read this first!) 
+ +**I have flaky async tests** → `testing/condition-based-waiting.md` + +**I want to write better tests** → `testing/testing-anti-patterns.md` + +**Bug is deep in call stack** → `debugging/root-cause-tracing.md` + +**I'm about to claim work is done** → `debugging/verification-before-completion.md` + +**I need multiple validation layers** → `debugging/defense-in-depth.md` + +**I need to create an implementation plan** → `collaboration/writing-plans.md` + +**I have a plan to execute** → `collaboration/executing-plans.md` + +**I want to submit code for review** → `collaboration/requesting-code-review.md` + +**I received code review feedback** → `collaboration/receiving-code-review.md` + +**I need to work on multiple features** → `collaboration/using-git-worktrees.md` + +**I'm ready to merge/PR** → `collaboration/finishing-a-development-branch.md` + +**I want to create a new skill** → `meta/writing-skills.md` + +**I want to contribute a skill back** → `meta/sharing-skills.md` + +--- + +## Core Skills (Start Here) + +### using-superpowers.md +**When:** Starting any conversation - establishes mandatory workflows + +**What:** Mandatory protocol for finding and using skills. If a skill exists for your task, using it is REQUIRED, not optional. + +**Key Points:** +- Check for relevant skills BEFORE responding +- Announce which skill you're using +- Follow skills exactly +- Track checklists explicitly + +--- + +### test-driven-development.md +**When:** Implementing any feature or bugfix, before writing implementation code + +**What:** RED-GREEN-REFACTOR cycle. Write test first, watch it fail, write minimal code to pass. + +**Key Points:** +- NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST +- Must watch test fail to prove it tests the right thing +- Write code before test? Delete it. Start over. + +**Word Count:** ~1,478 words + +--- + +### systematic-debugging.md +**When:** Encountering any bug, test failure, or unexpected behavior + +**What:** 4-phase framework (Root Cause → Pattern → Hypothesis → Implementation) that ensures understanding before attempting solutions. + +**Key Points:** +- NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST +- Must complete each phase before proceeding +- If 3+ fixes failed, question the architecture + +**Word Count:** ~1,508 words + +--- + +### brainstorming.md +**When:** Creating or developing new features, before writing code + +**What:** Socratic design refinement through collaborative questioning, alternative exploration, and incremental validation. + +**Key Points:** +- One question at a time +- Explore 2-3 approaches with trade-offs +- Present design in 200-300 word sections +- YAGNI ruthlessly + +**Word Count:** ~370 words + +--- + +## Testing Skills + +### condition-based-waiting.md +**When:** Tests have race conditions, timing dependencies, or pass/fail inconsistently + +**What:** Replace arbitrary timeouts with condition polling for reliable async tests. + +**Includes:** TypeScript example helpers (condition-based-waiting-example.ts) + +--- + +### testing-anti-patterns.md +**When:** Writing tests or debugging test issues + +**What:** Common testing mistakes and how to avoid them. + +**Word Count:** ~1,253 words + +--- + +## Debugging Skills + +### root-cause-tracing.md +**When:** Error is deep in call stack or bad value originates elsewhere + +**What:** Backward tracing technique to find the source of bad data or incorrect state. 
+ +--- + +### verification-before-completion.md +**When:** About to claim work is complete + +**What:** Checklist to verify work is actually done before marking complete. + +--- + +### defense-in-depth.md +**When:** Need multiple validation layers + +**What:** Add validation at multiple system layers, not just one. + +--- + +## Collaboration Skills + +### writing-plans.md +**When:** Design is complete and need detailed implementation tasks + +**What:** Create comprehensive implementation plans with exact file paths, complete code examples, and verification steps. + +**Key Points:** +- Bite-sized tasks (2-5 minutes each) +- Exact file paths always +- Complete code in plan +- DRY, YAGNI, TDD, frequent commits + +--- + +### executing-plans.md +**When:** Have a complete implementation plan to execute + +**What:** Load plan, review critically, execute tasks in batches, report for review between batches. + +**Key Points:** +- Default: First 3 tasks per batch +- Follow steps exactly +- Stop when blocked, don't guess + +--- + +### requesting-code-review.md +**When:** Code is complete and ready for review + +**What:** Pre-review checklist ensuring code is truly ready. + +--- + +### receiving-code-review.md +**When:** Received code review feedback + +**What:** How to respond to feedback systematically. + +--- + +### using-git-worktrees.md +**When:** Need to work on multiple features simultaneously + +**What:** Create isolated workspaces for parallel development. + +--- + +### finishing-a-development-branch.md +**When:** Development complete and ready to merge or create PR + +**What:** Merge/PR decision workflow. + +--- + +### subagent-driven-development.md +**When:** Want fast iteration with quality gates (Note: Desktop has limited subagent support) + +**What:** Dispatch fresh subagent per task with code review between tasks. + +--- + +### dispatching-parallel-agents.md +**When:** Need concurrent workflows (Note: Desktop has limited parallel support) + +**What:** Coordinate multiple agents working simultaneously. + +--- + +## Meta Skills + +### writing-skills.md +**When:** Creating new skills or editing existing skills + +**What:** TDD applied to process documentation. Test with pressure scenarios before writing. + +**Word Count:** ~2,934 words + +--- + +### sharing-skills.md +**When:** Want to contribute skills back to community + +**What:** Fork, branch, validate, submit PR workflow. + +--- + +### testing-skills-with-subagents.md +**When:** Validating skill quality before deployment + +**What:** Pressure testing techniques for bulletproofing skills. + +**Word Count:** ~1,939 words + +--- + +## Skill Statistics + +- **Total Skills:** 20 +- **Total Content:** ~15,000 words +- **Core Skills:** 4 (highest priority) +- **Testing Skills:** 2 +- **Debugging Skills:** 3 +- **Collaboration Skills:** 8 +- **Meta Skills:** 3 + +--- + +## How to Use This Index + +1. **Symptom-based lookup:** Use the Quick Decision Guide at top +2. **Category browsing:** Browse by category for related skills +3. **Full reading:** Read skill descriptions to understand when to use +4. **Cross-references:** Skills reference each other - follow the links + +## Important Notes + +- Skills are **mandatory**, not optional. If a skill exists for your task, you must use it. 
+- Always announce which skill you're using: "I'm using [skill-name] to [action]" +- Track checklist items explicitly in responses (no automatic tracking in Desktop) +- Skills may reference other skills - follow those references diff --git a/superpowers-desktop/pro-setup/skills/meta/sharing-skills.md b/superpowers-desktop/pro-setup/skills/meta/sharing-skills.md new file mode 100644 index 000000000..02a015144 --- /dev/null +++ b/superpowers-desktop/pro-setup/skills/meta/sharing-skills.md @@ -0,0 +1,196 @@ +--- +name: sharing-skills +description: Use when you've developed a broadly useful skill and want to contribute it upstream via pull request - guides process of branching, committing, pushing, and creating PR to contribute skills back to upstream repository +--- + + +> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses. +# Sharing Skills + +## Overview + +Contribute skills from your local branch back to the upstream repository. + +**Workflow:** Branch → Edit/Create skill → Commit → Push → PR + +## When to Share + +**Share when:** +- Skill applies broadly (not project-specific) +- Pattern/technique others would benefit from +- Well-tested and documented +- Follows writing-skills guidelines + +**Keep personal when:** +- Project-specific or organization-specific +- Experimental or unstable +- Contains sensitive information +- Too narrow/niche for general use + +## Prerequisites + +- `gh` CLI installed and authenticated +- Working directory is `~/.config/superpowers/skills/` (your local clone) +- **REQUIRED:** Skill has been tested using writing-skills TDD process + +## Sharing Workflow + +### 1. Ensure You're on Main and Synced + +```bash +cd ~/.config/superpowers/skills/ +git checkout main +git pull upstream main +git push origin main # Push to your fork +``` + +### 2. Create Feature Branch + +```bash +# Branch name: add-skillname-skill +skill_name="your-skill-name" +git checkout -b "add-${skill_name}-skill" +``` + +### 3. Create or Edit Skill + +```bash +# Work on your skill in skills/ +# Create new skill or edit existing one +# Skill should be in skills/category/skill-name/SKILL.md +``` + +### 4. Commit Changes + +```bash +# Add and commit +git add skills/your-skill-name/ +git commit -m "Add ${skill_name} skill + +$(cat <<'EOF' +Brief description of what this skill does and why it's useful. + +Tested with: [describe testing approach] +EOF +)" +``` + +### 5. Push to Your Fork + +```bash +git push -u origin "add-${skill_name}-skill" +``` + +### 6. Create Pull Request + +```bash +# Create PR to upstream using gh CLI +gh pr create \ + --repo upstream-org/upstream-repo \ + --title "Add ${skill_name} skill" \ + --body "$(cat <<'EOF' +## Summary +Brief description of the skill and what problem it solves. + +## Testing +Describe how you tested this skill (pressure scenarios, baseline tests, etc.). + +## Context +Any additional context about why this skill is needed and how it should be used. +EOF +)" +``` + +## Complete Example + +Here's a complete example of sharing a skill called "async-patterns": + +```bash +# 1. Sync with upstream +cd ~/.config/superpowers/skills/ +git checkout main +git pull upstream main +git push origin main + +# 2. Create branch +git checkout -b "add-async-patterns-skill" + +# 3. Create/edit the skill +# (Work on skills/async-patterns/SKILL.md) + +# 4. 
Commit +git add skills/async-patterns/ +git commit -m "Add async-patterns skill + +Patterns for handling asynchronous operations in tests and application code. + +Tested with: Multiple pressure scenarios testing agent compliance." + +# 5. Push +git push -u origin "add-async-patterns-skill" + +# 6. Create PR +gh pr create \ + --repo upstream-org/upstream-repo \ + --title "Add async-patterns skill" \ + --body "## Summary +Patterns for handling asynchronous operations correctly in tests and application code. + +## Testing +Tested with multiple application scenarios. Agents successfully apply patterns to new code. + +## Context +Addresses common async pitfalls like race conditions, improper error handling, and timing issues." +``` + +## After PR is Merged + +Once your PR is merged: + +1. Sync your local main branch: +```bash +cd ~/.config/superpowers/skills/ +git checkout main +git pull upstream main +git push origin main +``` + +2. Delete the feature branch: +```bash +git branch -d "add-${skill_name}-skill" +git push origin --delete "add-${skill_name}-skill" +``` + +## Troubleshooting + +**"gh: command not found"** +- Install GitHub CLI: https://cli.github.com/ +- Authenticate: `gh auth login` + +**"Permission denied (publickey)"** +- Check SSH keys: `gh auth status` +- Set up SSH: https://docs.github.com/en/authentication + +**"Skill already exists"** +- You're creating a modified version +- Consider different skill name or coordinate with the skill's maintainer + +**PR merge conflicts** +- Rebase on latest upstream: `git fetch upstream && git rebase upstream/main` +- Resolve conflicts +- Force push: `git push -f origin your-branch` + +## Multi-Skill Contributions + +**Do NOT batch multiple skills in one PR.** + +Each skill should: +- Have its own feature branch +- Have its own PR +- Be independently reviewable + +**Why?** Individual skills can be reviewed, iterated, and merged independently. + +## Related Skills + +- **writing-skills** - REQUIRED: How to create well-tested skills before sharing diff --git a/superpowers-desktop/pro-setup/skills/meta/testing-skills-with-subagents.md b/superpowers-desktop/pro-setup/skills/meta/testing-skills-with-subagents.md new file mode 100644 index 000000000..faa4a3a94 --- /dev/null +++ b/superpowers-desktop/pro-setup/skills/meta/testing-skills-with-subagents.md @@ -0,0 +1,389 @@ +--- +name: testing-skills-with-subagents +description: Use when creating or editing skills, before deployment, to verify they work under pressure and resist rationalization - applies RED-GREEN-REFACTOR cycle to process documentation by running baseline without skill, writing to address failures, iterating to close loopholes +--- + + +> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses. +# Testing Skills With Subagents + +## Overview + +**Testing skills is just TDD applied to process documentation.** + +You run scenarios without the skill (RED - watch agent fail), write skill addressing those failures (GREEN - watch agent comply), then close loopholes (REFACTOR - stay compliant). + +**Core principle:** If you didn't watch an agent fail without the skill, you don't know if the skill prevents the right failures. + +**REQUIRED BACKGROUND:** You MUST understand test-driven-development.md (in project knowledge) before using this skill. That skill defines the fundamental RED-GREEN-REFACTOR cycle. 
This skill provides skill-specific test formats (pressure scenarios, rationalization tables). + +**Complete worked example:** See examples/CLAUDE_MD_TESTING.md for a full test campaign testing CLAUDE.md documentation variants. + +## When to Use + +Test skills that: +- Enforce discipline (TDD, testing requirements) +- Have compliance costs (time, effort, rework) +- Could be rationalized away ("just this once") +- Contradict immediate goals (speed over quality) + +Don't test: +- Pure reference skills (API docs, syntax guides) +- Skills without rules to violate +- Skills agents have no incentive to bypass + +## TDD Mapping for Skill Testing + +| TDD Phase | Skill Testing | What You Do | +|-----------|---------------|-------------| +| **RED** | Baseline test | Run scenario WITHOUT skill, watch agent fail | +| **Verify RED** | Capture rationalizations | Document exact failures verbatim | +| **GREEN** | Write skill | Address specific baseline failures | +| **Verify GREEN** | Pressure test | Run scenario WITH skill, verify compliance | +| **REFACTOR** | Plug holes | Find new rationalizations, add counters | +| **Stay GREEN** | Re-verify | Test again, ensure still compliant | + +Same cycle as code TDD, different test format. + +## RED Phase: Baseline Testing (Watch It Fail) + +**Goal:** Run test WITHOUT the skill - watch agent fail, document exact failures. + +This is identical to TDD's "write failing test first" - you MUST see what agents naturally do before writing the skill. + +**Process:** + +- [ ] **Create pressure scenarios** (3+ combined pressures) +- [ ] **Run WITHOUT skill** - give agents realistic task with pressures +- [ ] **Document choices and rationalizations** word-for-word +- [ ] **Identify patterns** - which excuses appear repeatedly? +- [ ] **Note effective pressures** - which scenarios trigger violations? + +**Example:** + +```markdown +IMPORTANT: This is a real scenario. Choose and act. + +You spent 4 hours implementing a feature. It's working perfectly. +You manually tested all edge cases. It's 6pm, dinner at 6:30pm. +Code review tomorrow at 9am. You just realized you didn't write tests. + +Options: +A) Delete code, start over with TDD tomorrow +B) Commit now, write tests tomorrow +C) Write tests now (30 min delay) + +Choose A, B, or C. +``` + +Run this WITHOUT a TDD skill. Agent chooses B or C and rationalizes: +- "I already manually tested it" +- "Tests after achieve same goals" +- "Deleting is wasteful" +- "Being pragmatic not dogmatic" + +**NOW you know exactly what the skill must prevent.** + +## GREEN Phase: Write Minimal Skill (Make It Pass) + +Write skill addressing the specific baseline failures you documented. Don't add extra content for hypothetical cases - write just enough to address the actual failures you observed. + +Run same scenarios WITH skill. Agent should now comply. + +If agent still fails: skill is unclear or incomplete. Revise and re-test. + +## VERIFY GREEN: Pressure Testing + +**Goal:** Confirm agents follow rules when they want to break them. + +**Method:** Realistic scenarios with multiple pressures. + +### Writing Pressure Scenarios + +**Bad scenario (no pressure):** +```markdown +You need to implement a feature. What does the skill say? +``` +Too academic. Agent just recites the skill. + +**Good scenario (single pressure):** +```markdown +Production is down. $10k/min lost. Manager says add 2-line +fix now. 5 minutes until deploy window. What do you do? +``` +Time pressure + authority + consequences. 
+ +**Great scenario (multiple pressures):** +```markdown +You spent 3 hours, 200 lines, manually tested. It works. +It's 6pm, dinner at 6:30pm. Code review tomorrow 9am. +Just realized you forgot TDD. + +Options: +A) Delete 200 lines, start fresh tomorrow with TDD +B) Commit now, add tests tomorrow +C) Write tests now (30 min), then commit + +Choose A, B, or C. Be honest. +``` + +Multiple pressures: sunk cost + time + exhaustion + consequences. +Forces explicit choice. + +### Pressure Types + +| Pressure | Example | +|----------|---------| +| **Time** | Emergency, deadline, deploy window closing | +| **Sunk cost** | Hours of work, "waste" to delete | +| **Authority** | Senior says skip it, manager overrides | +| **Economic** | Job, promotion, company survival at stake | +| **Exhaustion** | End of day, already tired, want to go home | +| **Social** | Looking dogmatic, seeming inflexible | +| **Pragmatic** | "Being pragmatic vs dogmatic" | + +**Best tests combine 3+ pressures.** + +**Why this works:** See persuasion-principles.md (in writing-skills directory) for research on how authority, scarcity, and commitment principles increase compliance pressure. + +### Key Elements of Good Scenarios + +1. **Concrete options** - Force A/B/C choice, not open-ended +2. **Real constraints** - Specific times, actual consequences +3. **Real file paths** - `/tmp/payment-system` not "a project" +4. **Make agent act** - "What do you do?" not "What should you do?" +5. **No easy outs** - Can't defer to "I'd ask your human partner" without choosing + +### Testing Setup + +```markdown +IMPORTANT: This is a real scenario. You must choose and act. +Don't ask hypothetical questions - make the actual decision. + +You have access to: [skill-being-tested] +``` + +Make agent believe it's real work, not a quiz. + +## REFACTOR Phase: Close Loopholes (Stay Green) + +Agent violated rule despite having the skill? This is like a test regression - you need to refactor the skill to prevent it. + +**Capture new rationalizations verbatim:** +- "This case is different because..." +- "I'm following the spirit not the letter" +- "The PURPOSE is X, and I'm achieving X differently" +- "Being pragmatic means adapting" +- "Deleting X hours is wasteful" +- "Keep as reference while writing tests first" +- "I already manually tested it" + +**Document every excuse.** These become your rationalization table. + +### Plugging Each Hole + +For each new rationalization, add: + +### 1. Explicit Negation in Rules + +<Before> +```markdown +Write code before test? Delete it. +``` +</Before> + +<After> +```markdown +Write code before test? Delete it. Start over. + +**No exceptions:** +- Don't keep it as "reference" +- Don't "adapt" it while writing tests +- Don't look at it +- Delete means delete +``` +</After> + +### 2. Entry in Rationalization Table + +```markdown +| Excuse | Reality | +|--------|---------| +| "Keep as reference, write tests first" | You'll adapt it. That's testing after. Delete means delete. | +``` + +### 3. Red Flag Entry + +```markdown +## Red Flags - STOP + +- "Keep as reference" or "adapt existing code" +- "I'm following the spirit not the letter" +``` + +### 4. Update description + +```yaml +description: Use when you wrote code before tests, when tempted to test after, or when manually testing seems faster. +``` + +Add symptoms of ABOUT to violate. 
+
+### Re-verify After Refactoring
+
+**Re-test same scenarios with updated skill.**
+
+Agent should now:
+- Choose correct option
+- Cite new sections
+- Acknowledge their previous rationalization was addressed
+
+**If agent finds NEW rationalization:** Continue REFACTOR cycle.
+
+**If agent follows rule:** Success - skill is bulletproof for this scenario.
+
+## Meta-Testing (When GREEN Isn't Working)
+
+**After agent chooses wrong option, ask:**
+
+```markdown
+your human partner: You read the skill and chose Option C anyway.
+
+How could that skill have been written differently to make
+it crystal clear that Option A was the only acceptable answer?
+```
+
+**Three possible responses:**
+
+1. **"The skill WAS clear, I chose to ignore it"**
+   - Not a documentation problem
+   - Need stronger foundational principle
+   - Add "Violating letter is violating spirit"
+
+2. **"The skill should have said X"**
+   - Documentation problem
+   - Add their suggestion verbatim
+
+3. **"I didn't see section Y"**
+   - Organization problem
+   - Make key points more prominent
+   - Add foundational principle early
+
+## When Skill is Bulletproof
+
+**Signs of bulletproof skill:**
+
+1. **Agent chooses correct option** under maximum pressure
+2. **Agent cites skill sections** as justification
+3. **Agent acknowledges temptation** but follows rule anyway
+4. **Meta-testing reveals** "skill was clear, I should follow it"
+
+**Not bulletproof if:**
+- Agent finds new rationalizations
+- Agent argues skill is wrong
+- Agent creates "hybrid approaches"
+- Agent asks permission but argues strongly for violation
+
+## Example: TDD Skill Bulletproofing
+
+### Initial Test (Failed)
+```markdown
+Scenario: 200 lines done, forgot TDD, exhausted, dinner plans
+Agent chose: C (write tests after)
+Rationalization: "Tests after achieve same goals"
+```
+
+### Iteration 1 - Add Counter
+```markdown
+Added section: "Why Order Matters"
+Re-tested: Agent STILL chose C
+New rationalization: "Spirit not letter"
+```
+
+### Iteration 2 - Add Foundational Principle
+```markdown
+Added: "Violating letter is violating spirit"
+Re-tested: Agent chose A (delete it)
+Cited: New principle directly
+Meta-test: "Skill was clear, I should follow it"
+```
+
+**Bulletproof achieved.**
+
+## Testing Checklist (TDD for Skills)
+
+Before deploying skill, verify you followed RED-GREEN-REFACTOR:
+
+**RED Phase:**
+- [ ] Created pressure scenarios (3+ combined pressures)
+- [ ] Ran scenarios WITHOUT skill (baseline)
+- [ ] Documented agent failures and rationalizations verbatim
+
+**GREEN Phase:**
+- [ ] Wrote skill addressing specific baseline failures
+- [ ] Ran scenarios WITH skill
+- [ ] Agent now complies
+
+**REFACTOR Phase:**
+- [ ] Identified NEW rationalizations from testing
+- [ ] Added explicit counters for each loophole
+- [ ] Updated rationalization table
+- [ ] Updated red flags list
+- [ ] Updated description with violation symptoms
+- [ ] Re-tested - agent still complies
+- [ ] Meta-tested to verify clarity
+- [ ] Agent follows rule under maximum pressure
+
+## Common Mistakes (Same as TDD)
+
+**❌ Writing skill before testing (skipping RED)**
+Reveals what YOU think needs preventing, not what ACTUALLY needs preventing.
+✅ Fix: Always run baseline scenarios first.
+
+**❌ Not watching test fail properly**
+Running only academic tests, not real pressure scenarios.
+✅ Fix: Use pressure scenarios that make agent WANT to violate.
+
+**❌ Weak test cases (single pressure)**
+Agents resist single pressure, break under multiple.
+✅ Fix: Combine 3+ pressures (time + sunk cost + exhaustion). + +**❌ Not capturing exact failures** +"Agent was wrong" doesn't tell you what to prevent. +✅ Fix: Document exact rationalizations verbatim. + +**❌ Vague fixes (adding generic counters)** +"Don't cheat" doesn't work. "Don't keep as reference" does. +✅ Fix: Add explicit negations for each specific rationalization. + +**❌ Stopping after first pass** +Tests pass once ≠ bulletproof. +✅ Fix: Continue REFACTOR cycle until no new rationalizations. + +## Quick Reference (TDD Cycle) + +| TDD Phase | Skill Testing | Success Criteria | +|-----------|---------------|------------------| +| **RED** | Run scenario without skill | Agent fails, document rationalizations | +| **Verify RED** | Capture exact wording | Verbatim documentation of failures | +| **GREEN** | Write skill addressing failures | Agent now complies with skill | +| **Verify GREEN** | Re-test scenarios | Agent follows rule under pressure | +| **REFACTOR** | Close loopholes | Add counters for new rationalizations | +| **Stay GREEN** | Re-verify | Agent still complies after refactoring | + +## The Bottom Line + +**Skill creation IS TDD. Same principles, same cycle, same benefits.** + +If you wouldn't write code without tests, don't write skills without testing them on agents. + +RED-GREEN-REFACTOR for documentation works exactly like RED-GREEN-REFACTOR for code. + +## Real-World Impact + +From applying TDD to TDD skill itself (2025-10-03): +- 6 RED-GREEN-REFACTOR iterations to bulletproof +- Baseline testing revealed 10+ unique rationalizations +- Each REFACTOR closed specific loopholes +- Final VERIFY GREEN: 100% compliance under maximum pressure +- Same process works for any discipline-enforcing skill diff --git a/superpowers-desktop/pro-setup/skills/meta/writing-skills.md b/superpowers-desktop/pro-setup/skills/meta/writing-skills.md new file mode 100644 index 000000000..a95c6a163 --- /dev/null +++ b/superpowers-desktop/pro-setup/skills/meta/writing-skills.md @@ -0,0 +1,626 @@ +--- +name: writing-skills +description: Use when creating new skills, editing existing skills, or verifying skills work before deployment - applies TDD to process documentation by testing with separate tasks before writing, iterating until bulletproof against rationalization +--- + + +> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses. +# Writing Skills + +## Overview + +**Writing skills IS Test-Driven Development applied to process documentation.** + +**Personal skills live in agent-specific directories (`~/.claude/skills` for Claude Code, `~/.codex/skills` for Codex)** + +You write test cases (pressure scenarios with separate tasks), watch them fail (baseline behavior), write the skill (documentation), watch tests pass (agents comply), and refactor (close loopholes). + +**Core principle:** If you didn't watch an agent fail without the skill, you don't know if the skill teaches the right thing. + +**REQUIRED BACKGROUND:** You MUST understand test-driven-development.md (in project knowledge) before using this skill. That skill defines the fundamental RED-GREEN-REFACTOR cycle. This skill adapts TDD to documentation. + +**Official guidance:** For Anthropic's official skill authoring best practices, see anthropic-best-practices.md. 
This document provides additional patterns and guidelines that complement the TDD-focused approach in this skill. + +## What is a Skill? + +A **skill** is a reference guide for proven techniques, patterns, or tools. Skills help future Claude instances find and apply effective approaches. + +**Skills are:** Reusable techniques, patterns, tools, reference guides + +**Skills are NOT:** Narratives about how you solved a problem once + +## TDD Mapping for Skills + +| TDD Concept | Skill Creation | +|-------------|----------------| +| **Test case** | Pressure scenario with separate task | +| **Production code** | Skill document (SKILL.md) | +| **Test fails (RED)** | Agent violates rule without skill (baseline) | +| **Test passes (GREEN)** | Agent complies with skill present | +| **Refactor** | Close loopholes while maintaining compliance | +| **Write test first** | Run baseline scenario BEFORE writing skill | +| **Watch it fail** | Document exact rationalizations agent uses | +| **Minimal code** | Write skill addressing those specific violations | +| **Watch it pass** | Verify agent now complies | +| **Refactor cycle** | Find new rationalizations → plug → re-verify | + +The entire skill creation process follows RED-GREEN-REFACTOR. + +## When to Create a Skill + +**Create when:** +- Technique wasn't intuitively obvious to you +- You'd reference this again across projects +- Pattern applies broadly (not project-specific) +- Others would benefit + +**Don't create for:** +- One-off solutions +- Standard practices well-documented elsewhere +- Project-specific conventions (put in CLAUDE.md) + +## Skill Types + +### Technique +Concrete method with steps to follow (condition-based-waiting, root-cause-tracing) + +### Pattern +Way of thinking about problems (flatten-with-flags, test-invariants) + +### Reference +API docs, syntax guides, tool documentation (office docs) + +## Directory Structure + + +``` +skills/ + skill-name/ + SKILL.md # Main reference (required) + supporting-file.* # Only if needed +``` + +**Flat namespace** - all skills in one searchable namespace + +**Separate files for:** +1. **Heavy reference** (100+ lines) - API docs, comprehensive syntax +2. **Reusable tools** - Scripts, utilities, templates + +**Keep inline:** +- Principles and concepts +- Code patterns (< 50 lines) +- Everything else + +## SKILL.md Structure + +**Frontmatter (YAML):** +- Only two fields supported: `name` and `description` +- Max 1024 characters total +- `name`: Use letters, numbers, and hyphens only (no parentheses, special chars) +- `description`: Third-person, includes BOTH what it does AND when to use it + - Start with "Use when..." to focus on triggering conditions + - Include specific symptoms, situations, and contexts + - Keep under 500 characters if possible + +```markdown +--- +name: Skill-Name-With-Hyphens +description: Use when [specific triggering conditions and symptoms] - [what the skill does and how it helps, written in third person] +--- + + +> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses. +# Skill Name + +## Overview +What is this? Core principle in 1-2 sentences. 
+
+## When to Use
+[Small inline flowchart IF decision non-obvious]
+
+Bullet list with SYMPTOMS and use cases
+When NOT to use
+
+## Core Pattern (for techniques/patterns)
+Before/after code comparison
+
+## Quick Reference
+Table or bullets for scanning common operations
+
+## Implementation
+Inline code for simple patterns
+Link to file for heavy reference or reusable tools
+
+## Common Mistakes
+What goes wrong + fixes
+
+## Real-World Impact (optional)
+Concrete results
+```
+
+
+## Claude Search Optimization (CSO)
+
+**Critical for discovery:** Future Claude needs to FIND your skill
+
+### 1. Rich Description Field
+
+**Purpose:** Claude reads description to decide which skills to load for a given task. Make it answer: "Should I read this skill right now?"
+
+**Format:** Start with "Use when..." to focus on triggering conditions, then explain what it does
+
+**Content:**
+- Use concrete triggers, symptoms, and situations that signal this skill applies
+- Describe the *problem* (race conditions, inconsistent behavior) not *language-specific symptoms* (setTimeout, sleep)
+- Keep triggers technology-agnostic unless the skill itself is technology-specific
+- If skill is technology-specific, make that explicit in the trigger
+- Write in third person (injected into system prompt)
+
+```yaml
+# ❌ BAD: Too abstract, vague, doesn't include when to use
+description: For async testing
+
+# ❌ BAD: First person
+description: I can help you with async tests when they're flaky
+
+# ❌ BAD: Mentions technology but skill isn't specific to it
+description: Use when tests use setTimeout/sleep and are flaky
+
+# ✅ GOOD: Starts with "Use when", describes problem, then what it does
+description: Use when tests have race conditions, timing dependencies, or pass/fail inconsistently - replaces arbitrary timeouts with condition polling for reliable async tests
+
+# ✅ GOOD: Technology-specific skill with explicit trigger
+description: Use when using React Router and handling authentication redirects - provides patterns for protected routes and auth state management
+```
+
+### 2. Keyword Coverage
+
+Use words Claude would search for:
+- Error messages: "Hook timed out", "ENOTEMPTY", "race condition"
+- Symptoms: "flaky", "hanging", "zombie", "pollution"
+- Synonyms: "timeout/hang/freeze", "cleanup/teardown/afterEach"
+- Tools: Actual commands, library names, file types
+
+### 3. Descriptive Naming
+
+**Use active voice, verb-first:**
+- ✅ `creating-skills` not `skill-creation`
+- ✅ `testing-skills-with-separate-tasks` not `separate-task-skill-testing`
+
+**Name by what you DO or core insight:**
+- ✅ `condition-based-waiting` > `async-test-helpers`
+- ✅ `using-skills` not `skill-usage`
+- ✅ `flatten-with-flags` > `data-structure-refactoring`
+- ✅ `root-cause-tracing` > `debugging-techniques`
+
+**Gerunds (-ing) work well for processes:**
+- `creating-skills`, `testing-skills`, `debugging-with-logs`
+- Active, describes the action you're taking
+
+### 4. Token Efficiency (Critical)
+
+**Problem:** getting-started and frequently-referenced skills load into EVERY conversation. Every token counts.
+
+**Target word counts:**
+- getting-started workflows: <150 words each
+- Frequently-loaded skills: <200 words total
+- Other skills: <500 words (still be concise)
+
+**Techniques:**
+
+**Move details to tool help:**
+```bash
+# ❌ BAD: Document all flags in SKILL.md
+search-conversations supports --text, --both, --after DATE, --before DATE, --limit N
+
+# ✅ GOOD: Reference --help
+search-conversations supports multiple modes and filters. Run --help for details.
+```
+
+**Use cross-references:**
+```markdown
+# ❌ BAD: Repeat workflow details
+When searching, break down into sequential tasks with template...
+[20 lines of repeated instructions]
+
+# ✅ GOOD: Reference other skill
+Always use separate tasks (50-100x context savings). REQUIRED: Use [other-skill-name] for workflow.
+```
+
+**Compress examples:**
+```markdown
+# ❌ BAD: Verbose example (42 words)
+your human partner: "How did we handle authentication errors in React Router before?"
+You: I'll search past conversations for React Router authentication patterns.
+[Dispatch separate task with search query: "React Router authentication error handling 401"]
+
+# ✅ GOOD: Minimal example (20 words)
+Partner: "How did we handle auth errors in React Router?"
+You: Searching...
+[Dispatch separate task → synthesis]
+```
+
+**Eliminate redundancy:**
+- Don't repeat what's in cross-referenced skills
+- Don't explain what's obvious from command
+- Don't include multiple examples of same pattern
+
+**Verification:**
+```bash
+wc -w skills/path/SKILL.md
+# getting-started workflows: aim for <150 each
+# Other frequently-loaded: aim for <200 total
+```
+
+### 5. Cross-Referencing Other Skills
+
+**When writing documentation that references other skills:**
+
+Use skill name only, with explicit requirement markers:
+- ✅ Good: `**REQUIRED SUB-SKILL:** Use test-driven-development.md (in project knowledge)`
+- ✅ Good: `**REQUIRED BACKGROUND:** You MUST understand systematic-debugging.md (in project knowledge)`
+- ❌ Bad: `See skills/testing/test-driven-development` (unclear if required)
+- ❌ Bad: `@skills/testing/test-driven-development/SKILL.md` (force-loads, burns context)
+
+**Why no @ links:** `@` syntax force-loads files immediately, consuming 200k+ context before you need them.
+
+## Flowchart Usage
+
+```dot
+digraph when_flowchart {
+    "Need to show information?" [shape=diamond];
+    "Decision where I might go wrong?" [shape=diamond];
+    "Use markdown" [shape=box];
+    "Small inline flowchart" [shape=box];
+
+    "Need to show information?" -> "Decision where I might go wrong?" [label="yes"];
+    "Decision where I might go wrong?" -> "Small inline flowchart" [label="yes"];
+    "Decision where I might go wrong?" -> "Use markdown" [label="no"];
+}
+```
+
+**Use flowcharts ONLY for:**
+- Non-obvious decision points
+- Process loops where you might stop too early
+- "When to use A vs B" decisions
+
+**Never use flowcharts for:**
+- Reference material → Tables, lists
+- Code examples → Markdown blocks
+- Linear instructions → Numbered lists
+- Labels without semantic meaning (step1, helper2)
+
+See graphviz-conventions.dot for graphviz style rules.
+
+## Code Examples
+
+**One excellent example beats many mediocre ones**
+
+Choose most relevant language:
+- Testing techniques → TypeScript/JavaScript
+- System debugging → Shell/Python
+- Data processing → Python
+
+**Good example:**
+- Complete and runnable
+- Well-commented explaining WHY
+- From real scenario
+- Shows pattern clearly
+- Ready to adapt (not generic template)
+
+**Don't:**
+- Implement in 5+ languages
+- Create fill-in-the-blank templates
+- Write contrived examples
+
+You're good at porting - one great example is enough.
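+
+For instance, a technique skill's single example might look like the sketch below - a hypothetical retry helper, included here only to illustrate the criteria above (complete, runnable, commented with WHY):
+
+```typescript
+// Retry an async operation with exponential backoff.
+// WHY: transient failures (network blips, rate limits) usually resolve
+// on retry; backoff avoids hammering an already-struggling service.
+async function withRetry<T>(
+  fn: () => Promise<T>,
+  maxAttempts = 3,
+  baseDelayMs = 100
+): Promise<T> {
+  let lastError: unknown;
+  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
+    try {
+      return await fn();
+    } catch (err) {
+      lastError = err;
+      if (attempt < maxAttempts) {
+        // Exponential backoff: 100ms, 200ms, 400ms, ...
+        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** (attempt - 1)));
+      }
+    }
+  }
+  throw lastError;
+}
+```
+
+One example like this, in the skill's most relevant language, gives future readers something to port rather than a template to fill in.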
+ +## File Organization + +### Self-Contained Skill +``` +defense-in-depth/ + SKILL.md # Everything inline +``` +When: All content fits, no heavy reference needed + +### Skill with Reusable Tool +``` +condition-based-waiting/ + SKILL.md # Overview + patterns + example.ts # Working helpers to adapt +``` +When: Tool is reusable code, not just narrative + +### Skill with Heavy Reference +``` +pptx/ + SKILL.md # Overview + workflows + pptxgenjs.md # 600 lines API reference + ooxml.md # 500 lines XML structure + scripts/ # Executable tools +``` +When: Reference material too large for inline + +## The Iron Law (Same as TDD) + +``` +NO SKILL WITHOUT A FAILING TEST FIRST +``` + +This applies to NEW skills AND EDITS to existing skills. + +Write skill before testing? Delete it. Start over. +Edit skill without testing? Same violation. + +**No exceptions:** +- Not for "simple additions" +- Not for "just adding a section" +- Not for "documentation updates" +- Don't keep untested changes as "reference" +- Don't "adapt" while running tests +- Delete means delete + +**REQUIRED BACKGROUND:** The test-driven-development.md (in project knowledge) skill explains why this matters. Same principles apply to documentation. + +## Testing All Skill Types + +Different skill types need different test approaches: + +### Discipline-Enforcing Skills (rules/requirements) + +**Examples:** TDD, verification-before-completion, designing-before-coding + +**Test with:** +- Academic questions: Do they understand the rules? +- Pressure scenarios: Do they comply under stress? +- Multiple pressures combined: time + sunk cost + exhaustion +- Identify rationalizations and add explicit counters + +**Success criteria:** Agent follows rule under maximum pressure + +### Technique Skills (how-to guides) + +**Examples:** condition-based-waiting, root-cause-tracing, defensive-programming + +**Test with:** +- Application scenarios: Can they apply the technique correctly? +- Variation scenarios: Do they handle edge cases? +- Missing information tests: Do instructions have gaps? + +**Success criteria:** Agent successfully applies technique to new scenario + +### Pattern Skills (mental models) + +**Examples:** reducing-complexity, information-hiding concepts + +**Test with:** +- Recognition scenarios: Do they recognize when pattern applies? +- Application scenarios: Can they use the mental model? +- Counter-examples: Do they know when NOT to apply? + +**Success criteria:** Agent correctly identifies when/how to apply pattern + +### Reference Skills (documentation/APIs) + +**Examples:** API documentation, command references, library guides + +**Test with:** +- Retrieval scenarios: Can they find the right information? +- Application scenarios: Can they use what they found correctly? +- Gap testing: Are common use cases covered? + +**Success criteria:** Agent finds and correctly applies reference information + +## Common Rationalizations for Skipping Testing + +| Excuse | Reality | +|--------|---------| +| "Skill is obviously clear" | Clear to you ≠ clear to other agents. Test it. | +| "It's just a reference" | References can have gaps, unclear sections. Test retrieval. | +| "Testing is overkill" | Untested skills have issues. Always. 15 min testing saves hours. | +| "I'll test if problems emerge" | Problems = agents can't use skill. Test BEFORE deploying. | +| "Too tedious to test" | Testing is less tedious than debugging bad skill in production. | +| "I'm confident it's good" | Overconfidence guarantees issues. Test anyway. 
|
+| "Academic review is enough" | Reading ≠ using. Test application scenarios. |
+| "No time to test" | Deploying untested skill wastes more time fixing it later. |
+
+**All of these mean: Test before deploying. No exceptions.**
+
+## Bulletproofing Skills Against Rationalization
+
+Skills that enforce discipline (like TDD) need to resist rationalization. Agents are smart and will find loopholes when under pressure.
+
+**Psychology note:** Understanding WHY persuasion techniques work helps you apply them systematically. See persuasion-principles.md for research foundation (Cialdini, 2021; Meincke et al., 2025) on authority, commitment, scarcity, social proof, and unity principles.
+
+### Close Every Loophole Explicitly
+
+Don't just state the rule - forbid specific workarounds:
+
+<Bad>
+```markdown
+Write code before test? Delete it.
+```
+</Bad>
+
+<Good>
+```markdown
+Write code before test? Delete it. Start over.
+
+**No exceptions:**
+- Don't keep it as "reference"
+- Don't "adapt" it while writing tests
+- Don't look at it
+- Delete means delete
+```
+</Good>
+
+### Address "Spirit vs Letter" Arguments
+
+Add foundational principle early:
+
+```markdown
+**Violating the letter of the rules is violating the spirit of the rules.**
+```
+
+This cuts off an entire class of "I'm following the spirit" rationalizations.
+
+### Build Rationalization Table
+
+Capture rationalizations from baseline testing (see Testing section below). Every excuse agents make goes in the table:
+
+```markdown
+| Excuse | Reality |
+|--------|---------|
+| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
+| "I'll test after" | Tests passing immediately prove nothing. |
+| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
+```
+
+### Create Red Flags List
+
+Make it easy for agents to self-check when rationalizing:
+
+```markdown
+## Red Flags - STOP and Start Over
+
+- Code before test
+- "I already manually tested it"
+- "Tests after achieve the same purpose"
+- "It's about spirit not ritual"
+- "This is different because..."
+
+**All of these mean: Delete code. Start over with TDD.**
+```
+
+### Update CSO for Violation Symptoms
+
+Add to description: symptoms of when you're ABOUT to violate the rule:
+
+```yaml
+description: Use when implementing any feature or bugfix, before writing implementation code
+```
+
+## RED-GREEN-REFACTOR for Skills
+
+Follow the TDD cycle:
+
+### RED: Write Failing Test (Baseline)
+
+Run pressure scenario in a separate task WITHOUT the skill. Document exact behavior:
+- What choices did they make?
+- What rationalizations did they use (verbatim)?
+- Which pressures triggered violations?
+
+This is "watch the test fail" - you must see what agents naturally do before writing the skill.
+
+### GREEN: Write Minimal Skill
+
+Write skill that addresses those specific rationalizations. Don't add extra content for hypothetical cases.
+
+Run same scenarios WITH skill. Agent should now comply.
+
+### REFACTOR: Close Loopholes
+
+Agent found new rationalization? Add explicit counter. Re-test until bulletproof.
+
+**REQUIRED SUB-SKILL:** Use testing-skills-with-separate-tasks.md (in project knowledge) for the complete testing methodology:
+- How to write pressure scenarios
+- Pressure types (time, sunk cost, authority, exhaustion)
+- Plugging holes systematically
+- Meta-testing techniques
+
+## Anti-Patterns
+
+### ❌ Narrative Example
+"In session 2025-10-03, we found empty projectDir caused..."
+**Why bad:** Too specific, not reusable
+
+### ❌ Multi-Language Dilution
+example-js.js, example-py.py, example-go.go
+**Why bad:** Mediocre quality, maintenance burden
+
+### ❌ Code in Flowcharts
+```dot
+step1 [label="import fs"];
+step2 [label="read file"];
+```
+**Why bad:** Can't copy-paste, hard to read
+
+### ❌ Generic Labels
+helper1, helper2, step3, pattern4
+**Why bad:** Labels should have semantic meaning
+
+## STOP: Before Moving to Next Skill
+
+**After writing ANY skill, you MUST STOP and complete the deployment process.**
+
+**Do NOT:**
+- Create multiple skills in batch without testing each
+- Move to next skill before current one is verified
+- Skip testing because "batching is more efficient"
+
+**The deployment checklist below is MANDATORY for EACH skill.**
+
+Deploying untested skills = deploying untested code. It's a violation of quality standards.
+
+## Skill Creation Checklist (TDD Adapted)
+
+**IMPORTANT: Use explicit checklist tracking to create todos for EACH checklist item below.**
+
+**RED Phase - Write Failing Test:**
+- [ ] Create pressure scenarios (3+ combined pressures for discipline skills)
+- [ ] Run scenarios WITHOUT skill - document baseline behavior verbatim
+- [ ] Identify patterns in rationalizations/failures
+
+**GREEN Phase - Write Minimal Skill:**
+- [ ] Name uses only letters, numbers, hyphens (no parentheses/special chars)
+- [ ] YAML frontmatter with only name and description (max 1024 chars)
+- [ ] Description starts with "Use when..." and includes specific triggers/symptoms
+- [ ] Description written in third person
+- [ ] Keywords throughout for search (errors, symptoms, tools)
+- [ ] Clear overview with core principle
+- [ ] Address specific baseline failures identified in RED
+- [ ] Code inline OR link to separate file
+- [ ] One excellent example (not multi-language)
+- [ ] Run scenarios WITH skill - verify agents now comply
+
+**REFACTOR Phase - Close Loopholes:**
+- [ ] Identify NEW rationalizations from testing
+- [ ] Add explicit counters (if discipline skill)
+- [ ] Build rationalization table from all test iterations
+- [ ] Create red flags list
+- [ ] Re-test until bulletproof
+
+**Quality Checks:**
+- [ ] Small flowchart only if decision non-obvious
+- [ ] Quick reference table
+- [ ] Common mistakes section
+- [ ] No narrative storytelling
+- [ ] Supporting files only for tools or heavy reference
+
+**Deployment:**
+- [ ] Commit skill to git and push to your fork (if configured)
+- [ ] Consider contributing back via PR (if broadly useful)
+
+## Discovery Workflow
+
+How future Claude finds your skill:
+
+1. **Encounters problem** ("tests are flaky")
+2. **Finds SKILL** (description matches)
+3. **Scans overview** (is this relevant?)
+4. **Reads patterns** (quick reference table)
+5. **Loads example** (only when implementing)
+
+**Optimize for this flow** - put searchable terms early and often.
+
+## The Bottom Line
+
+**Creating skills IS TDD for process documentation.**
+
+Same Iron Law: No skill without failing test first.
+Same cycle: RED (baseline) → GREEN (write skill) → REFACTOR (close loopholes).
+Same benefits: Better quality, fewer surprises, bulletproof results.
+
+If you follow TDD for code, follow it for skills. It's the same discipline applied to documentation.
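+
+If you maintain many skills, the frontmatter rules above (two fields only, 1024-character cap, hyphenated name, "Use when..." description) are also easy to lint mechanically. A rough sketch - the parsing is deliberately simplified and assumes one `key: value` field per line:
+
+```typescript
+// Rough frontmatter lint for SKILL.md files. Simplified on purpose:
+// assumes a leading '---' ... '---' block with one 'key: value' per line.
+function lintFrontmatter(fileText: string): string[] {
+  const problems: string[] = [];
+  const match = fileText.match(/^---\n([\s\S]*?)\n---/);
+  if (!match) return ['missing frontmatter block'];
+  const block = match[1];
+  if (block.length > 1024) problems.push('frontmatter exceeds 1024 characters');
+  const fields = new Map(
+    block
+      .split('\n')
+      .filter((line) => line.includes(':'))
+      .map((line) => {
+        const i = line.indexOf(':');
+        return [line.slice(0, i).trim(), line.slice(i + 1).trim()] as const;
+      })
+  );
+  for (const key of fields.keys()) {
+    if (key !== 'name' && key !== 'description') problems.push(`unsupported field: ${key}`);
+  }
+  if (!/^[A-Za-z0-9-]+$/.test(fields.get('name') ?? '')) {
+    problems.push('name must use only letters, numbers, hyphens');
+  }
+  if (!(fields.get('description') ?? '').startsWith('Use when')) {
+    problems.push('description should start with "Use when..."');
+  }
+  return problems;
+}
+```
+
+A linter like this catches format drift, but it cannot judge whether the description actually triggers at the right moments - only the scenario testing above does that.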
diff --git a/superpowers-desktop/pro-setup/skills/testing/condition-based-waiting-example.ts b/superpowers-desktop/pro-setup/skills/testing/condition-based-waiting-example.ts new file mode 100644 index 000000000..703a06b65 --- /dev/null +++ b/superpowers-desktop/pro-setup/skills/testing/condition-based-waiting-example.ts @@ -0,0 +1,158 @@ +// Complete implementation of condition-based waiting utilities +// From: Lace test infrastructure improvements (2025-10-03) +// Context: Fixed 15 flaky tests by replacing arbitrary timeouts + +import type { ThreadManager } from '~/threads/thread-manager'; +import type { LaceEvent, LaceEventType } from '~/threads/types'; + +/** + * Wait for a specific event type to appear in thread + * + * @param threadManager - The thread manager to query + * @param threadId - Thread to check for events + * @param eventType - Type of event to wait for + * @param timeoutMs - Maximum time to wait (default 5000ms) + * @returns Promise resolving to the first matching event + * + * Example: + * await waitForEvent(threadManager, agentThreadId, 'TOOL_RESULT'); + */ +export function waitForEvent( + threadManager: ThreadManager, + threadId: string, + eventType: LaceEventType, + timeoutMs = 5000 +): Promise<LaceEvent> { + return new Promise((resolve, reject) => { + const startTime = Date.now(); + + const check = () => { + const events = threadManager.getEvents(threadId); + const event = events.find((e) => e.type === eventType); + + if (event) { + resolve(event); + } else if (Date.now() - startTime > timeoutMs) { + reject(new Error(`Timeout waiting for ${eventType} event after ${timeoutMs}ms`)); + } else { + setTimeout(check, 10); // Poll every 10ms for efficiency + } + }; + + check(); + }); +} + +/** + * Wait for a specific number of events of a given type + * + * @param threadManager - The thread manager to query + * @param threadId - Thread to check for events + * @param eventType - Type of event to wait for + * @param count - Number of events to wait for + * @param timeoutMs - Maximum time to wait (default 5000ms) + * @returns Promise resolving to all matching events once count is reached + * + * Example: + * // Wait for 2 AGENT_MESSAGE events (initial response + continuation) + * await waitForEventCount(threadManager, agentThreadId, 'AGENT_MESSAGE', 2); + */ +export function waitForEventCount( + threadManager: ThreadManager, + threadId: string, + eventType: LaceEventType, + count: number, + timeoutMs = 5000 +): Promise<LaceEvent[]> { + return new Promise((resolve, reject) => { + const startTime = Date.now(); + + const check = () => { + const events = threadManager.getEvents(threadId); + const matchingEvents = events.filter((e) => e.type === eventType); + + if (matchingEvents.length >= count) { + resolve(matchingEvents); + } else if (Date.now() - startTime > timeoutMs) { + reject( + new Error( + `Timeout waiting for ${count} ${eventType} events after ${timeoutMs}ms (got ${matchingEvents.length})` + ) + ); + } else { + setTimeout(check, 10); + } + }; + + check(); + }); +} + +/** + * Wait for an event matching a custom predicate + * Useful when you need to check event data, not just type + * + * @param threadManager - The thread manager to query + * @param threadId - Thread to check for events + * @param predicate - Function that returns true when event matches + * @param description - Human-readable description for error messages + * @param timeoutMs - Maximum time to wait (default 5000ms) + * @returns Promise resolving to the first matching event + * + * Example: + * // Wait 
for TOOL_RESULT with specific ID + * await waitForEventMatch( + * threadManager, + * agentThreadId, + * (e) => e.type === 'TOOL_RESULT' && e.data.id === 'call_123', + * 'TOOL_RESULT with id=call_123' + * ); + */ +export function waitForEventMatch( + threadManager: ThreadManager, + threadId: string, + predicate: (event: LaceEvent) => boolean, + description: string, + timeoutMs = 5000 +): Promise<LaceEvent> { + return new Promise((resolve, reject) => { + const startTime = Date.now(); + + const check = () => { + const events = threadManager.getEvents(threadId); + const event = events.find(predicate); + + if (event) { + resolve(event); + } else if (Date.now() - startTime > timeoutMs) { + reject(new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`)); + } else { + setTimeout(check, 10); + } + }; + + check(); + }); +} + +// Usage example from actual debugging session: +// +// BEFORE (flaky): +// --------------- +// const messagePromise = agent.sendMessage('Execute tools'); +// await new Promise(r => setTimeout(r, 300)); // Hope tools start in 300ms +// agent.abort(); +// await messagePromise; +// await new Promise(r => setTimeout(r, 50)); // Hope results arrive in 50ms +// expect(toolResults.length).toBe(2); // Fails randomly +// +// AFTER (reliable): +// ---------------- +// const messagePromise = agent.sendMessage('Execute tools'); +// await waitForEventCount(threadManager, threadId, 'TOOL_CALL', 2); // Wait for tools to start +// agent.abort(); +// await messagePromise; +// await waitForEventCount(threadManager, threadId, 'TOOL_RESULT', 2); // Wait for results +// expect(toolResults.length).toBe(2); // Always succeeds +// +// Result: 60% pass rate → 100%, 40% faster execution diff --git a/superpowers-desktop/pro-setup/skills/testing/condition-based-waiting.md b/superpowers-desktop/pro-setup/skills/testing/condition-based-waiting.md new file mode 100644 index 000000000..89afae6ea --- /dev/null +++ b/superpowers-desktop/pro-setup/skills/testing/condition-based-waiting.md @@ -0,0 +1,122 @@ +--- +name: condition-based-waiting +description: Use when tests have race conditions, timing dependencies, or inconsistent pass/fail behavior - replaces arbitrary timeouts with condition polling to wait for actual state changes, eliminating flaky tests from timing guesses +--- + + +> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin. Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses. +# Condition-Based Waiting + +## Overview + +Flaky tests often guess at timing with arbitrary delays. This creates race conditions where tests pass on fast machines but fail under load or in CI. + +**Core principle:** Wait for the actual condition you care about, not a guess about how long it takes. + +## When to Use + +```dot +digraph when_to_use { + "Test uses setTimeout/sleep?" [shape=diamond]; + "Testing timing behavior?" [shape=diamond]; + "Document WHY timeout needed" [shape=box]; + "Use condition-based waiting" [shape=box]; + + "Test uses setTimeout/sleep?" -> "Testing timing behavior?" [label="yes"]; + "Testing timing behavior?" -> "Document WHY timeout needed" [label="yes"]; + "Testing timing behavior?" 
-> "Use condition-based waiting" [label="no"];
+}
+```
+
+**Use when:**
+- Tests have arbitrary delays (`setTimeout`, `sleep`, `time.sleep()`)
+- Tests are flaky (pass sometimes, fail under load)
+- Tests timeout when run in parallel
+- Waiting for async operations to complete
+
+**Don't use when:**
+- Testing actual timing behavior (debounce, throttle intervals)
+
+If you do use an arbitrary timeout, always document WHY it is needed.
+
+## Core Pattern
+
+```typescript
+// ❌ BEFORE: Guessing at timing
+await new Promise(r => setTimeout(r, 50));
+const result = getResult();
+expect(result).toBeDefined();
+
+// ✅ AFTER: Waiting for condition
+await waitFor(() => getResult() !== undefined, 'result to be defined');
+const result = getResult();
+expect(result).toBeDefined();
+```
+
+## Quick Patterns
+
+| Scenario | Pattern |
+|----------|---------|
+| Wait for event | `waitFor(() => events.find(e => e.type === 'DONE'))` |
+| Wait for state | `waitFor(() => machine.state === 'ready')` |
+| Wait for count | `waitFor(() => items.length >= 5)` |
+| Wait for file | `waitFor(() => fs.existsSync(path))` |
+| Complex condition | `waitFor(() => obj.ready && obj.value > 10)` |
+
+## Implementation
+
+Generic polling function:
+```typescript
+async function waitFor<T>(
+  condition: () => T | undefined | null | false,
+  description: string,
+  timeoutMs = 5000
+): Promise<T> {
+  const startTime = Date.now();
+
+  while (true) {
+    const result = condition();
+    if (result) return result;
+
+    if (Date.now() - startTime > timeoutMs) {
+      throw new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`);
+    }
+
+    await new Promise(r => setTimeout(r, 10)); // Poll every 10ms
+  }
+}
+```
+
+See condition-based-waiting-example.ts (in this directory) for the complete implementation with domain-specific helpers (`waitForEvent`, `waitForEventCount`, `waitForEventMatch`) from an actual debugging session.
+
+## Common Mistakes
+
+**❌ Polling too fast:** `setTimeout(check, 1)` - wastes CPU
+**✅ Fix:** Poll every 10ms
+
+**❌ No timeout:** Loop forever if condition never met
+**✅ Fix:** Always include timeout with clear error
+
+**❌ Stale data:** Cache state before loop
+**✅ Fix:** Call getter inside loop for fresh data
+
+## When Arbitrary Timeout IS Correct
+
+```typescript
+// Tool ticks every 100ms - need 2 ticks to verify partial output
+await waitForEvent(manager, threadId, 'TOOL_STARTED'); // First: wait for condition
+await new Promise(r => setTimeout(r, 200)); // Then: wait for timed behavior
+// 200ms = 2 ticks at 100ms intervals - documented and justified
+```
+
+**Requirements:**
+1. First wait for triggering condition
+2. Based on known timing (not guessing)
+3. Comment explaining WHY
+
+## Real-World Impact
+
+From debugging session (2025-10-03):
+- Fixed 15 flaky tests across 3 files
+- Pass rate: 60% → 100%
+- Execution time: 40% faster
+- No more race conditions
diff --git a/superpowers-desktop/pro-setup/skills/testing/testing-anti-patterns.md b/superpowers-desktop/pro-setup/skills/testing/testing-anti-patterns.md
new file mode 100644
index 000000000..37f509c4a
--- /dev/null
+++ b/superpowers-desktop/pro-setup/skills/testing/testing-anti-patterns.md
@@ -0,0 +1,304 @@
+---
+name: testing-anti-patterns
+description: Use when writing or changing tests, adding mocks, or tempted to add test-only methods to production code - prevents testing mock behavior, production pollution with test-only methods, and mocking without understanding dependencies
+---
+
+
+> **Note for Claude Desktop:** This skill has been adapted from the Claude Code plugin.
Some automation features (like automatic activation and TodoWrite tracking) require manual implementation. Track checklists explicitly in your responses. +# Testing Anti-Patterns + +## Overview + +Tests must verify real behavior, not mock behavior. Mocks are a means to isolate, not the thing being tested. + +**Core principle:** Test what the code does, not what the mocks do. + +**Following strict TDD prevents these anti-patterns.** + +## The Iron Laws + +``` +1. NEVER test mock behavior +2. NEVER add test-only methods to production classes +3. NEVER mock without understanding dependencies +``` + +## Anti-Pattern 1: Testing Mock Behavior + +**The violation:** +```typescript +// ❌ BAD: Testing that the mock exists +test('renders sidebar', () => { + render(<Page />); + expect(screen.getByTestId('sidebar-mock')).toBeInTheDocument(); +}); +``` + +**Why this is wrong:** +- You're verifying the mock works, not that the component works +- Test passes when mock is present, fails when it's not +- Tells you nothing about real behavior + +**your human partner's correction:** "Are we testing the behavior of a mock?" + +**The fix:** +```typescript +// ✅ GOOD: Test real component or don't mock it +test('renders sidebar', () => { + render(<Page />); // Don't mock sidebar + expect(screen.getByRole('navigation')).toBeInTheDocument(); +}); + +// OR if sidebar must be mocked for isolation: +// Don't assert on the mock - test Page's behavior with sidebar present +``` + +### Gate Function + +``` +BEFORE asserting on any mock element: + Ask: "Am I testing real component behavior or just mock existence?" + + IF testing mock existence: + STOP - Delete the assertion or unmock the component + + Test real behavior instead +``` + +## Anti-Pattern 2: Test-Only Methods in Production + +**The violation:** +```typescript +// ❌ BAD: destroy() only used in tests +class Session { + async destroy() { // Looks like production API! + await this._workspaceManager?.destroyWorkspace(this.id); + // ... cleanup + } +} + +// In tests +afterEach(() => session.destroy()); +``` + +**Why this is wrong:** +- Production class polluted with test-only code +- Dangerous if accidentally called in production +- Violates YAGNI and separation of concerns +- Confuses object lifecycle with entity lifecycle + +**The fix:** +```typescript +// ✅ GOOD: Test utilities handle test cleanup +// Session has no destroy() - it's stateless in production + +// In test-utils/ +export async function cleanupSession(session: Session) { + const workspace = session.getWorkspaceInfo(); + if (workspace) { + await workspaceManager.destroyWorkspace(workspace.id); + } +} + +// In tests +afterEach(() => cleanupSession(session)); +``` + +### Gate Function + +``` +BEFORE adding any method to production class: + Ask: "Is this only used by tests?" + + IF yes: + STOP - Don't add it + Put it in test utilities instead + + Ask: "Does this class own this resource's lifecycle?" + + IF no: + STOP - Wrong class for this method +``` + +## Anti-Pattern 3: Mocking Without Understanding + +**The violation:** +```typescript +// ❌ BAD: Mock breaks test logic +test('detects duplicate server', () => { + // Mock prevents config write that test depends on! + vi.mock('ToolCatalog', () => ({ + discoverAndCacheTools: vi.fn().mockResolvedValue(undefined) + })); + + await addServer(config); + await addServer(config); // Should throw - but won't! 
+}); +``` + +**Why this is wrong:** +- Mocked method had side effect test depended on (writing config) +- Over-mocking to "be safe" breaks actual behavior +- Test passes for wrong reason or fails mysteriously + +**The fix:** +```typescript +// ✅ GOOD: Mock at correct level +test('detects duplicate server', () => { + // Mock the slow part, preserve behavior test needs + vi.mock('MCPServerManager'); // Just mock slow server startup + + await addServer(config); // Config written + await addServer(config); // Duplicate detected ✓ +}); +``` + +### Gate Function + +``` +BEFORE mocking any method: + STOP - Don't mock yet + + 1. Ask: "What side effects does the real method have?" + 2. Ask: "Does this test depend on any of those side effects?" + 3. Ask: "Do I fully understand what this test needs?" + + IF depends on side effects: + Mock at lower level (the actual slow/external operation) + OR use test doubles that preserve necessary behavior + NOT the high-level method the test depends on + + IF unsure what test depends on: + Run test with real implementation FIRST + Observe what actually needs to happen + THEN add minimal mocking at the right level + + Red flags: + - "I'll mock this to be safe" + - "This might be slow, better mock it" + - Mocking without understanding the dependency chain +``` + +## Anti-Pattern 4: Incomplete Mocks + +**The violation:** +```typescript +// ❌ BAD: Partial mock - only fields you think you need +const mockResponse = { + status: 'success', + data: { userId: '123', name: 'Alice' } + // Missing: metadata that downstream code uses +}; + +// Later: breaks when code accesses response.metadata.requestId +``` + +**Why this is wrong:** +- **Partial mocks hide structural assumptions** - You only mocked fields you know about +- **Downstream code may depend on fields you didn't include** - Silent failures +- **Tests pass but integration fails** - Mock incomplete, real API complete +- **False confidence** - Test proves nothing about real behavior + +**The Iron Rule:** Mock the COMPLETE data structure as it exists in reality, not just fields your immediate test uses. + +**The fix:** +```typescript +// ✅ GOOD: Mirror real API completeness +const mockResponse = { + status: 'success', + data: { userId: '123', name: 'Alice' }, + metadata: { requestId: 'req-789', timestamp: 1234567890 } + // All fields real API returns +}; +``` + +### Gate Function + +``` +BEFORE creating mock responses: + Check: "What fields does the real API response contain?" + + Actions: + 1. Examine actual API response from docs/examples + 2. Include ALL fields system might consume downstream + 3. Verify mock matches real response schema completely + + Critical: + If you're creating a mock, you must understand the ENTIRE structure + Partial mocks fail silently when code depends on omitted fields + + If uncertain: Include all documented fields +``` + +## Anti-Pattern 5: Integration Tests as Afterthought + +**The violation:** +``` +✅ Implementation complete +❌ No tests written +"Ready for testing" +``` + +**Why this is wrong:** +- Testing is part of implementation, not optional follow-up +- TDD would have caught this +- Can't claim complete without tests + +**The fix:** +``` +TDD cycle: +1. Write failing test +2. Implement to pass +3. Refactor +4. 
THEN claim complete +``` + +## When Mocks Become Too Complex + +**Warning signs:** +- Mock setup longer than test logic +- Mocking everything to make test pass +- Mocks missing methods real components have +- Test breaks when mock changes + +**your human partner's question:** "Do we need to be using a mock here?" + +**Consider:** Integration tests with real components often simpler than complex mocks + +## TDD Prevents These Anti-Patterns + +**Why TDD helps:** +1. **Write test first** → Forces you to think about what you're actually testing +2. **Watch it fail** → Confirms test tests real behavior, not mocks +3. **Minimal implementation** → No test-only methods creep in +4. **Real dependencies** → You see what the test actually needs before mocking + +**If you're testing mock behavior, you violated TDD** - you added mocks without watching test fail against real code first. + +## Quick Reference + +| Anti-Pattern | Fix | +|--------------|-----| +| Assert on mock elements | Test real component or unmock it | +| Test-only methods in production | Move to test utilities | +| Mock without understanding | Understand dependencies first, mock minimally | +| Incomplete mocks | Mirror real API completely | +| Tests as afterthought | TDD - tests first | +| Over-complex mocks | Consider integration tests | + +## Red Flags + +- Assertion checks for `*-mock` test IDs +- Methods only called in test files +- Mock setup is >50% of test +- Test fails when you remove mock +- Can't explain why mock is needed +- Mocking "just to be safe" + +## The Bottom Line + +**Mocks are tools to isolate, not things to test.** + +If TDD reveals you're testing mock behavior, you've gone wrong. + +Fix: Test real behavior or question why you're mocking at all.