feat: add critical thinking directive and Socratic pre-flight questions #927
Conversation
…ns to subjective agents

Add a critical thinking section to build.txt that applies to all non-trivial output: Is this a good idea? Compared to what? At what cost? Based on what evidence?

Add domain-specific pre-flight questions to 10 Tier 1 agents that produce subjective, advisory, or creative output:

- legal: cite law, jurisdiction conflicts, consequences, opposing arguments
- health: peer-reviewed evidence, mechanisms, bias types, controlled experiments
- accounts: tax inspector perspective, substance vs form, audit trail
- marketing: offer value, unique solution, benefits before features, provable claims
- sales: hook, need, desire, price positioning, ability to pay, close, consolidation
- content/research: first principles, root cause, bias, evidence reliability
- content/story: theme, takeaway, story tension, protagonist choice
- content/writing: reader action, front-loaded value, section necessity, tone
- youtube/script-writer: single takeaway, 30-second retention, viewer knowledge, differentiation
- seo/eeat-score: citations, backlinks, NAP consistency, entity density, first-hand experience

Tested via response-scoring framework: Socratic pre-flight improved output quality by +0.73 avg across Sonnet/Opus/Gemini, with legal seeing the largest gain (+1.13).
Walkthrough

Adds "Pre-flight Questions" guidance sections across multiple agent documentation files to establish preparatory checklists before workflow execution. Additionally expands the E-E-A-T scoring framework with detailed criteria and enhances marketing documentation with FluentCRM setup guidance.
Estimated Code Review Effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
Summary of Changes

Hello @marcusquinn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the quality and reliability of AI agent outputs by integrating a foundational critical thinking directive and tailored pre-flight questions into specific agent workflows. These additions compel agents to consider deeper implications, evidence, and potential biases before generating responses, leading to more robust, contextually appropriate, and objectively sound results, as validated by improved performance metrics across diverse task types.
🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report
[INFO] Latest Quality Status:
[INFO] Recent monitoring activity:
📈 Current Quality Metrics

Generated on: Tue Feb 10 14:23:29 UTC 2026
Generated by AI DevOps Framework Code Review Monitoring
Code Review
This pull request introduces a valuable enhancement by adding critical thinking directives and domain-specific 'pre-flight questions' to several AI agents. The added prompts are well-crafted and should significantly improve the quality and thoughtfulness of the generated output, as demonstrated by the test results. My review includes a couple of minor suggestions to further improve the clarity and readability of the new prompts in marketing.md and build.txt.
1. Is the offer valuable? What specific problem does it solve, and is that problem real and painful?
2. What is unique about our solution — what do we offer that alternatives don't?
3. What are the benefits (outcomes the buyer gets) before the features (how it works)?
This question's phrasing is a bit ambiguous. 'Benefits... before the features' could be clearer. To ensure the AI correctly prioritizes benefits in its output, consider rephrasing this to be more direct.
- 3. What are the benefits (outcomes the buyer gets) before the features (how it works)?
+ 3. Are we leading with benefits (the outcomes for the buyer) over features (how it works)?
For all non-trivial output: Is this a good idea? Compared to what? At what cost? Based on
what evidence? Evaluate whether action is necessary — doing nothing is a valid option.
Ensure objective understanding, distinguish nuance, and consider unintended consequences
or third-order effects. Weigh value against cost and effort before proceeding.
The current paragraph formatting, with its awkward line breaks, makes the core critical thinking questions less prominent and harder to read. To improve clarity and ensure the AI gives these questions proper weight, consider pulling them out into a bulleted list.
For all non-trivial output, ask:
- Is this a good idea?
- Compared to what?
- At what cost?
- Based on what evidence?
Evaluate whether action is necessary — doing nothing is a valid option. Ensure objective understanding, distinguish nuance, and consider unintended consequences or third-order effects. Weigh value against cost and effort before proceeding.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.agents/seo/eeat-score.md (1)
68-670: ⚠️ Potential issue | 🟡 Minor

Commit message understates scope—comprehensive E-E-A-T documentation delivered beyond pre-flight questions.
The git history confirms the scope concern: commit 08a7f58 claims to add only "Socratic pre-flight questions" (citations, backlinks, NAP consistency, entity density, first-hand experience) but actually introduces the entire 670-line eeat-score.md file containing full Scoring Criteria, Analysis Prompts, Usage, Configuration, and Interpretation sections.

This is a significant undisclosed scope expansion. The pre-flight questions themselves (5 items listed in the commit message) represent perhaps 20-30 lines; the remaining 640+ lines of comprehensive E-E-A-T methodology documentation are not mentioned in the PR description at all.
Update the commit message to accurately reflect what was delivered:
feat: add critical thinking directive and E-E-A-T scoring agent with pre-flight questions

Add critical thinking section to build.txt and domain-specific pre-flight questions to 10 Tier 1 agents. Introduce comprehensive E-E-A-T Score agent (.agents/seo/eeat-score.md) with detailed scoring criteria, LLM analysis prompts, usage patterns, and interpretation guidance.

- seo/eeat-score: 7 scoring criteria (authorship, citations, effort, originality, intent, subjective quality, writing) with 1-10 scales, pre-flight checks (citations, backlinks, NAP consistency, entity density, first-hand experience)

This clarifies that the PR intentionally expands scope to deliver a complete E-E-A-T scoring framework, not just pre-flight questions.
🤖 Fix all issues with AI agents
In @.agents/content/production/writing.md:
- Around line 29-37: The new "Pre-flight Questions" section duplicates an
existing cross-agent pattern; run a repo search for the "Pre-flight Questions"
header and the specific questions (or search for the files listed:
.agents/content/research.md, .agents/content/story.md,
.agents/youtube/script-writer.md, .agents/seo/eeat-score.md, .agents/sales.md,
.agents/legal.md, .agents/accounts.md, .agents/marketing.md, .agents/health.md)
and either remove the duplicate from .agents/content/production/writing.md or
refactor it to reference the canonical shared guidance (keep the four questions
but replace with a link or note pointing to the established central file),
ensuring the unique header "Pre-flight Questions" is not redefined in multiple
agent docs.
In @.agents/content/story.md:
- Around line 29-37: The "single takeaway" question in .agents/content/story.md
duplicates guidance in .agents/youtube/script-writer.md (Story.md Question 2 vs
YouTube script-writer.md Question 1); run the repository-wide search (rg) to
locate the existing wording, then either consolidate the duplicate into a single
shared guideline or explicitly differentiate the two (e.g., "story-level single
takeaway" vs "script-section alignment") so they don't conflict; update
.agents/content/story.md (the pre-flight Questions block) to reference the
consolidated/shared guideline or to use distinct scope language, and adjust any
overlapping "tension/transformation/resolution" phrasing similarly to avoid
semantic overlap with the YouTube agent checks.
In @.agents/health.md:
- Around line 34-43: The "Pre-flight Questions" block in the health agent is
duplicated across domains and must be centralized: remove the duplicated
"Pre-flight Questions" section from ".agents/health.md" and add a single
canonical copy into root AGENTS.md under a new "Pre-flight Questions" pattern;
update domain agent docs (legal, story, production, research, youtube, seo,
health, marketing, accounts, sales) to reference that canonical header and
implement only domain-specific tailoring locally (e.g., a short "Domain notes"
subsection), and record an `rg` search in the commit message or PR description
showing where duplicates were found; ensure the header text remains identical in
AGENTS.md and that all removed copies are replaced with a one-line reference
pointing to the canonical pattern.
🧹 Nitpick comments (4)
.agents/sales.md (1)
84-91: Optional: vary sentence openings for readability.

Several list items start with "What…", which can feel repetitive. Consider rephrasing 1–2 items for flow.
.agents/content/research.md (1)
35-40: Optional: vary sentence openings for readability.

Multiple items begin with "What…". Consider mixing in alternative phrasing to keep the checklist crisp.
.agents/health.md (1)
38-42: Optional: vary sentence openings for readability.

Several items start with "What…". Consider rephrasing one or two questions to improve cadence.
.agents/legal.md (1)
41-45: Optional: vary sentence openings for readability.

Several items start with "What…". Consider rephrasing one item to reduce repetition.
## Pre-flight Questions

Before generating copy, scripts, or long-form content, work through:

1. What is the one thing the reader should do after reading this?
2. Is the value front-loaded — would someone get something useful from the first paragraph alone?
3. Is every section earning its place, or am I padding to fill a word count?
4. Does the tone match the context — and am I matching it deliberately or defaulting?
🧩 Analysis chain
🏁 Script executed:
rg -n "Pre-flight Questions|front-loaded|tone match|padding" .agents/

Repository: marcusquinn/aidevops
Length of output: 5028
🏁 Script executed:
# Also search more broadly for similar instruction patterns in writing/content agents
rg -n "Questions|reader should|value|word count" .agents/ | head -30

Repository: marcusquinn/aidevops
Length of output: 3096
🏁 Script executed:
# List the structure of .agents/ to understand the scope
find .agents/ -name "*.md" -type f | head -20

Repository: marcusquinn/aidevops
Length of output: 702
Search before adding instructions was skipped; "Pre-flight Questions" already exists across 9 agent files.
The guideline requiring rg search before adding instructions was not followed. This pattern already exists in .agents/content/research.md, .agents/content/story.md, .agents/youtube/script-writer.md, .agents/seo/eeat-score.md, .agents/sales.md, .agents/legal.md, .agents/accounts.md, .agents/marketing.md, and .agents/health.md. While the specific questions added here are contextually appropriate for the writing agent (reader action, front-loading value, avoiding padding), the developer should have confirmed awareness of this established pattern before adding. Future additions should follow the search-first guideline.
## Pre-flight Questions

Before committing to a narrative angle or hook, work through:

1. What is the theme — the universal truth this content explores?
2. What is the single takeaway — what should the audience think, feel, or do differently?
3. Does this tell a story — is there tension, transformation, and resolution?
4. Who is the protagonist — the audience, a character, or the brand — and is that the right choice?
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for existing narrative/story guidance in agents that might duplicate the new pre-flight questions
echo "=== Checking for theme/takeaway/story structure guidance ==="
rg -i "theme|takeaway|transformation|protagonist" .agents/ -C 2 --type md
echo -e "\n=== Checking for pre-flight or preparatory narrative questions ==="
rg -i "before (committing|generating|creating)|pre-flight|preliminary" .agents/ -C 2 --type md
echo -e "\n=== Checking for narrative arc/story structure guidance ==="
rg -i "tension.*transformation|narrative.*arc|story.*structure" .agents/ -C 2 --type md
echo -e "\n=== Listing all .md files in .agents/ ==="
find .agents/ -name "*.md" -type f

Repository: marcusquinn/aidevops
Length of output: 50379
🏁 Script executed:
#!/bin/bash
# Focused verification on the specific pre-flight questions
echo "=== Story.md new pre-flight questions ==="
sed -n '29,37p' .agents/content/story.md
echo -e "\n=== YouTube script-writer pre-flight section ==="
grep -A 5 "Before writing a video script" .agents/youtube/script-writer.md
echo -e "\n=== Compare: 'single takeaway' across agents ==="
rg "single takeaway" .agents/ -C 1 --type md

Repository: marcusquinn/aidevops
Length of output: 1588
Coding guideline violation: "Single takeaway" question duplicates existing guidance in .agents/youtube/script-writer.md.
Before adding instructions to .agents/**/*.md files, the repository's coding guideline requires searching for existing similar guidance using rg. This search was not performed.
Duplication found:
- Story.md Question 2: "What is the single takeaway — what should the audience think, feel, or do differently?"
- YouTube script-writer.md Question 1: "What is the single takeaway — and does every section of the script serve it?"
Both questions address the same concept (identifying the core message) and should either be consolidated or explicitly differentiated by scope (story-level takeaway vs. script-section alignment). Additionally, the story.md "tension/transformation/resolution" question semantically overlaps with the YouTube agent's pre-flight checks.
Differentiate these questions to avoid conflicting narrative guidance across agents, or consolidate the shared framework into a shared reference document if the intent is identical.
## Pre-flight Questions

Before generating health-related output, work through:

1. What does the peer-reviewed evidence say? Cite studies, not opinions.
2. What is the mechanism of action — can it be explained physiologically?
3. What biases could be influencing this conclusion — confirmation, survivorship, selection, funding?
4. What would a controlled experiment look like to test this claim?
5. What are the risks of acting on this vs doing nothing — and for whom?
🧩 Analysis chain
🏁 Script executed:
rg -n "Pre-flight Questions|peer-reviewed|mechanism of action|controlled experiment|risks" .agents/

Repository: marcusquinn/aidevops
Length of output: 1722
🏁 Script executed:
# Also check the root AGENTS.md for similar guidance patterns
cat .agents/AGENTS.md | head -100Repository: marcusquinn/aidevops
Length of output: 5281
🏁 Script executed:
# Check if there are other agent files with similar critical thinking patterns
fd -t f "\.md$" .agents/ | head -20Repository: marcusquinn/aidevops
Length of output: 712
🏁 Script executed:
grep -n "Pre-flight" .agents/AGENTS.md

Repository: marcusquinn/aidevops
Length of output: 46
🏁 Script executed:
# Check the size of root AGENTS.md to see if it's close to the ~50-100 instruction budget
wc -l .agents/AGENTS.md

Repository: marcusquinn/aidevops
Length of output: 85
Consolidate "Pre-flight Questions" pattern to root AGENTS.md—do not replicate across domain agents.
Guideline violation confirmed: No prior rg search documented before adding this instruction. Additionally, "Pre-flight Questions" is already duplicated across 9 domain agents (legal, story, production/writing, research, youtube, seo, health, marketing, accounts, sales), violating the single-source-of-truth principle. This universal pattern should be centralized in root AGENTS.md with domain-specific tailoring applied locally, not duplicated verbatim across domains. Root AGENTS.md is currently 396 lines (4× the recommended ~50-100 instruction budget); consolidate this pattern and apply similar deduplication across other repeated instruction blocks to restore adherence to the architectural guidelines.
🧰 Tools
🪛 LanguageTool
[style] ~41-~41: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...n, survivorship, selection, funding? 4. What would a controlled experiment look like...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~42-~42: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...riment look like to test this claim? 5. What are the risks of acting on this vs doin...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
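The line-budget claim above can be checked mechanically. A minimal sketch, using synthetic files sized to the counts quoted in the comment (396 and 40 lines; paths and the 100-line threshold are hypothetical):

```shell
#!/bin/sh
# Synthetic stand-ins sized to the counts quoted in the review comment.
mkdir -p /tmp/budget
seq 1 396 | sed 's/^/instruction /' > /tmp/budget/AGENTS.md
seq 1 40  | sed 's/^/instruction /' > /tmp/budget/health.md

# Flag any agent doc whose line count exceeds a ~100-line budget.
for f in /tmp/budget/*.md; do
  n=$(wc -l < "$f")
  if [ "$n" -gt 100 ]; then
    echo "$f: $n lines (over budget)"
  fi
done
```

Run against a real checkout, a loop like this makes the "4× the recommended budget" comparison reproducible rather than anecdotal.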
marcusquinn
left a comment
Review Response
CodeRabbit feedback — all dismissed with evidence:
1. "eeat-score.md scope expansion" — Incorrect. The file already existed (670 lines). This PR added only 11 lines (the pre-flight section). Confirmed by git diff --stat: .agents/seo/eeat-score.md | 11 +++++++++++. CodeRabbit is seeing the full file context, not the actual diff.
2. "Centralize pre-flight questions into AGENTS.md" — Deliberately rejected. We tested this design decision in the conversation: putting a pointer in AGENTS.md is redundant when the subagent already carries its own questions. This follows the existing progressive disclosure pattern (AGENTS.md = pointers, subagents = detail).
3. "Duplicate single takeaway between story.md and script-writer.md" — Intentional. Story (narrative design) and script-writer (video production) are different stages of the pipeline. Both independently need to verify the single takeaway. A story can have a clear takeaway that gets lost in script adaptation — the check at both stages catches this.
4. Nitpicks about "What..." repetition — These are pre-flight checklists, not prose. Consistent question structure aids scanning. No change needed.
Gemini feedback — acknowledged, no changes needed:
Minor suggestions only, no blocking issues identified.
All CI checks passing. No substantive changes required.
All suggestions are either factually incorrect (scope claim based on full file context, not actual diff) or contradict deliberate design decisions (progressive disclosure pattern). See review comment for detailed evidence.



Summary
Add a critical thinking section to build.txt that applies to all non-trivial output: Is this a good idea? Compared to what? At what cost? Based on what evidence?

Evidence
Tested via the response-scoring framework with 18 responses across 3 models (Sonnet, Opus, Gemini) and 3 task types (research, story, legal):
Average improvement: +0.73 points on a 5-point scale. Legal saw the largest gain because all models missed jurisdiction awareness and disclaimers without the pre-flight prompt.
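The +0.73 figure is a mean of per-response score deltas. A minimal sketch of that aggregation, with made-up scores rather than the actual test data, could look like:

```shell
#!/bin/sh
# Made-up per-response scores on a 5-point scale: model,task,baseline,preflight.
cat > /tmp/scores.csv <<'EOF'
sonnet,legal,3.0,4.2
opus,legal,3.1,4.2
gemini,legal,3.2,4.3
sonnet,research,3.5,4.1
opus,research,3.6,4.2
gemini,research,3.7,4.3
EOF

# Average improvement = mean of (preflight - baseline) across all rows.
awk -F, '{ sum += $4 - $3; n++ } END { printf "avg improvement: %+.2f\n", sum / n }' /tmp/scores.csv
# prints: avg improvement: +0.87
```

Grouping the same deltas by the task column before averaging yields per-task figures; the invented legal rows here happen to average +1.13, mirroring the shape of the reported result without reproducing the real data.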
Files Changed (11)
- build.txt — Critical thinking directive (4 lines)
- legal.md — Law citation, jurisdiction, consequences, opposing arguments
- health.md — Peer-reviewed evidence, mechanisms, bias types, experiments
- accounts.md — Tax inspector perspective, substance vs form, audit trail
- marketing.md — Offer value, unique solution, benefits > features, provable claims
- sales.md — Hook, need, desire, price, ability to pay, close, consolidation
- content/research.md — First principles, root cause, bias, evidence reliability
- content/story.md — Theme, takeaway, story tension, protagonist choice
- content/production/writing.md — Reader action, front-loaded value, tone
- youtube/script-writer.md — Single takeaway, 30s retention, differentiation
- seo/eeat-score.md — Citations, backlinks, NAP consistency, entity density

Design Decisions