docs(writing-skills): add automation-over-documentation lesson #146
Conversation
- Tested with RED-GREEN-REFACTOR using 4 pressure scenarios
- Scenarios validate mechanical vs judgment decision-making
- All tests pass: agents correctly identify automation opportunities
- Word count optimized to 424 words (target: ≤500)
- Integration points added to SKILL.md line 500

Test results: 4/4 scenarios pass (GREEN phase: first try)

Test methodology: Baseline without doc showed documentation failures; tested version correctly guided agents to mechanical enforcement
**Walkthrough:** Added explanatory content to the writing-skills module; a deep-dive case study was inserted.
Actionable comments posted: 2
🧹 Nitpick comments (2)
skills/writing-skills/SKILL.md (1)
500-501: Clarify cross-reference format for discoverability.

Reference formatting aligns with CSO guidelines (no @ force-load), but consider whether this warrants a "RECOMMENDED" marker (similar to REQUIRED patterns in lines 256-257) to signal importance, or if an optional "see also" is intentional.
skills/writing-skills/automation-over-documentation.md (1)
63-63: Expand Defense-in-Depth concept for completeness.

Line 63 briefly mentions layering "mechanical enforcement (code) + judgment guidance (skills) + outcome validation (tests)". This is a powerful concept that deserves a dedicated section with examples. Consider adding:
- How each layer protects against failures of the others
- When to apply all three vs. subset
- Examples beyond emoji (e.g., line-length automation + readability skill + test validation)
This would strengthen the framework's applicability to future constraint problems.
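Purely as an illustration of the line-length example in that last bullet (this sketch is not part of the review or the PR, and the limit below is hypothetical), the mechanical layer of such a trio can be very small:

```python
from pathlib import Path

MAX_LINE_LENGTH = 100  # hypothetical limit; use whatever the skill actually enforces

def long_lines(path):
    """Mechanical layer: report over-long lines so agents never have to
    remember the limit. Readability judgment stays in the skill doc;
    outcome validation stays in the test suite."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [(n, len(line)) for n, line in enumerate(lines, start=1)
            if len(line) > MAX_LINE_LENGTH]
```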
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- skills/writing-skills/SKILL.md (1 hunks)
- skills/writing-skills/automation-over-documentation.md (1 hunks)
🔇 Additional comments (3)
skills/writing-skills/SKILL.md (1)
500-501: Inconsistency between summary and provided code.

The AI summary claims the Deep dive note is inserted twice (after violation-symptoms and after the STOP section), but the provided code shows only one instance at line 500. Please confirm whether the reference should appear in multiple locations per the PR design.
skills/writing-skills/automation-over-documentation.md (2)
1-79: Framework effectively addresses PR problem and aligns with TDD discipline.

The document establishes a clear, actionable decision framework grounded in TDD signal recognition ("REFACTOR not stabilizing = wrong tool"). The emoji case study concretely demonstrates the problem, and the psychological rationale (lines 50-52) explains why documentation fails for mechanical constraints. This directly addresses the PR objective of reducing wasted TDD cycles on documentation-only solutions for mechanical issues.
52-52: Citation "Cialdini 2021" is accurate.

Robert B. Cialdini published "Influence, New and Expanded: The Psychology of Persuasion" in May 2021 (Harper Business), which updates the classic work with new research and adds a seventh principle. The citation year is correct.
Likely an incorrect or invalid review comment.
## Cost-Benefit

| Approach | Cost | Benefit | Maintenance |
|----------|------|---------|-------------|
| Documentation | Multiple TDD cycles, bloat | None (0/5 tests) | High (whack-a-mole) |
| Automation | 15 lines code | 100% (5/5 tests) | Near-zero |

**ROI:** Automation paid for itself first test run.
Temper cost-benefit claims; frame as emoji-specific case study.
The cost-benefit table presents very strong claims ("None (0/5 tests)" vs. "100% (5/5 tests)") based on the emoji-stripping case. This may not generalize to all mechanical constraints (e.g., line-length limits, JSON schema validation, or whitespace normalization may have different failure modes or false-positive rates).
Consider reframing as: "In the emoji-stripping case study, automation succeeded immediately" rather than implying universal success across all mechanical constraints.
🤖 Prompt for AI Agents
In skills/writing-skills/automation-over-documentation.md around lines 65 to 72,
the cost-benefit table overgeneralizes results from the emoji-stripping example
by claiming universal failure for documentation and universal success for
automation; update the phrasing and table to frame these as results from the
specific emoji-stripping case study only (e.g., change "None (0/5 tests)" and
"100% (5/5 tests)" to "None in this case (0/5 tests)" and "100% in this case
(5/5 tests)" or add a footnote/row header stating these numbers apply to the
emoji-stripping case), and add a short sentence after the table noting that
outcomes may vary for other mechanical constraints like line-length, JSON
schema, or whitespace normalization.
Thanks @coderabbitai for the review! Addressing your feedback:

**Nitpick 1: RECOMMENDED marker for cross-reference (SKILL.md:500)**

Decision: Won't implement

The current "Deep dive:" format is intentional. It signals optional reference material for readers who want deeper understanding. Adding "RECOMMENDED" would over-emphasize it and break the existing pattern where only workflow-critical items (like line 525's "REQUIRED SUB-SKILL") receive markers. The reference is already discoverable in its current location within the testing methodology section.

**Nitpick 2: Expand Defense-in-Depth concept (automation-over-documentation.md:63)**

Decision: Won't implement

While this would be valuable content, it conflicts with testing methodology requirements:

The existing example is clear and actionable: "Strip emojis (Python) → guide severity labeling (skill) → validate actionability (test)"

Future iteration could explore this in a separate reference doc if demand emerges.

**Additional Comment: Inconsistency about double insertion**

Clarification: False positive

The reference appears only once (line 500 of SKILL.md). There is no double insertion. The AI summary's claim that it appears "after violation-symptoms and after the STOP section" is incorrect.

Verification:

$ grep -n "automation-over-documentation" skills/writing-skills/SKILL.md
500:**Deep dive:** See automation-over-documentation.md...
@EthanJStark Thank you for the detailed explanations and for catching that error in the AI summary!

**Verification Confirmed ✓**

You're absolutely correct: the reference appears only once at line 500. The AI summary's claim about double insertion was inaccurate. I apologize for that confusion.

**Your Decisions**

I appreciate your thoughtful reasoning on both nitpicks:

Nitpick 1 (RECOMMENDED marker): Your point about maintaining the existing pattern makes sense. The "Deep dive:" format clearly signals optional reference material, and overusing markers would dilute their meaning.

Nitpick 2 (Defense-in-Depth expansion): The constraints you've outlined are compelling, especially the word count budget (424/500) and the testing requirements per the ws skill Iron Law. Since the current content passed all 4 pressure scenarios, expansion isn't justified at this time. Your approach of deferring potential enhancements to future iterations if demand emerges is pragmatic.

Thanks for taking the time to provide such thorough responses! The PR looks good to go.
- Use emoji library for production-ready approach
- Include commented regex alternative for zero dependencies
- Addresses CodeRabbit feedback: code examples must be runnable
- Word count: 442 words (still under 500 target)
@coderabbitai Good catch on the placeholder code! Updated with working implementation (commit 1133e33):

Production approach: uses the emoji library:

```python
import emoji

def strip_emojis(text):
    return emoji.replace_emoji(text, replace='')
```

Zero-dependency alternative: included as a commented regex for environments where adding dependencies isn't preferred.

Why emoji library over full regex:

Word count: 443 words (up from 424, still well under 500 target)

Thanks for ensuring the code examples meet the "complete and runnable" standard!
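The commented regex fallback itself isn't reproduced in this thread; purely as an illustration, a zero-dependency version might look roughly like the sketch below (the Unicode ranges are approximate and are not the PR's actual pattern):

```python
import re

# Illustrative zero-dependency fallback. Broad ranges like these can miss
# newer emojis or over-match some symbols; prefer the emoji library when
# adding a dependency is acceptable.
_EMOJI_RE = re.compile(
    "["
    "\U0001F300-\U0001FAFF"   # symbols, pictographs, extended pictographs
    "\U00002600-\U000027BF"   # miscellaneous symbols and dingbats
    "\U0001F1E6-\U0001F1FF"   # regional indicator (flag) pairs
    "\uFE0F"                  # variation selector-16
    "]+"
)

def strip_emojis_regex(text):
    return _EMOJI_RE.sub("", text)
```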
@EthanJStark Excellent work! 🎉 I've verified the implementation in commit 1133e33, and it looks great:

**Code Quality ✓**

Primary implementation:

Zero-dependency fallback:

**Design Decision ✓**

Your rationale for choosing the emoji library:

The inclusion of both approaches serves readers well: they get a production-quality example and a self-contained alternative.

**Document Quality ✓**

The PR is in great shape. Thanks for the thorough implementation! If you found this review helpful, would you consider giving us a shout-out on X? Thank you for using CodeRabbit!
Problem
Skills that enforce mechanical constraints (emoji prohibition, line length, format validation) fail to achieve compliance despite multiple TDD cycles. Documentation-based approaches lead to endless rationalization whack-a-mole where each REFACTOR closes one loophole only to reveal another.
Root Cause
Fighting LLM training with documentation is ineffective for mechanical constraints. Models are trained on data that uses emojis for status indicators, varies line lengths naturally, and prioritizes readability over strict formatting. Asking agents to override this training through documentation creates cognitive dissonance.
Solution
Add a decision framework distinguishing mechanical enforcement from judgment guidance:
Framework aligns with Anthropic's official guidance on matching specificity to task fragility.
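As a rough sketch of what "mechanical enforcement" can mean in practice (the script name and CI wiring here are hypothetical, not part of the PR), the emoji prohibition becomes a check that code performs instead of prose that agents must remember:

```python
# check_no_emojis.py -- hypothetical CI gate, not the PR's actual script.
import sys
from pathlib import Path

import emoji  # pip install emoji

def check_no_emojis(paths):
    """Return (path, emoji_count) pairs for any file that still contains emojis."""
    failures = []
    for path in paths:
        text = Path(path).read_text(encoding="utf-8")
        found = emoji.emoji_list(text)
        if found:
            failures.append((path, len(found)))
    return failures

if __name__ == "__main__":
    bad = check_no_emojis(sys.argv[1:])
    for path, count in bad:
        print(f"{path}: {count} emoji(s) found; strip before committing")
    sys.exit(1 if bad else 0)
```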
Why This Approach
Evidence-based validation: Tested with RED-GREEN-REFACTOR using 4 pressure scenarios. Baseline showed agents consistently suggesting documentation improvements for mechanical constraints. With framework, agents immediately identified automation opportunities while correctly preserving documentation for judgment calls.
Operationalizes Anthropic best practices: Their guidance for skills docs recommends automation for "fragile, error-prone operations requiring exact sequences" and documentation where "decisions depend on context." Our framework provides concrete decision criteria and red flags.
Prevents wasted TDD cycles: Recognizing when REFACTOR phase won't stabilize (mechanical constraint, wrong tool) saves weeks of iteration. Reference doc includes cost-benefit analysis showing 15 lines of automation code beats endless documentation refinement.
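To make the outcome-validation side concrete, here is a minimal pytest sketch; the inputs and test name are illustrative, not the PR's actual pressure scenarios:

```python
import emoji
import pytest

def strip_emojis(text):
    return emoji.replace_emoji(text, replace="")

@pytest.mark.parametrize("text", [
    "Fix applied ✅",
    "🚀 Shipping now",
    "Status: 🔴 blocked",
    "Mixed 🎉 content 🎯 here",
    "No emojis at all",
])
def test_stripped_output_contains_no_emojis(text):
    # Mechanical constraint validated mechanically: zero emojis remain.
    assert emoji.emoji_count(strip_emojis(text)) == 0
```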
Integration
Added to writing-skills as a reference doc (424 words), with an integration point in the SKILL.md decision framework section. Complements the existing TDD methodology by helping identify when automation should replace documentation entirely.