Conversation


@marcusquinn marcusquinn commented Feb 10, 2026

Summary

  • Adds proof_logs SQLite table with structured audit trail for every supervisor decision point
  • Instruments 10+ decision points across the pulse lifecycle (evaluate, complete, retry, blocked, verify, deploy, quality gate, deliverable verification, lifecycle transitions)
  • Provides proof-log CLI command with table/JSON/timeline output and aggregate statistics

What it does

The existing state_log table records state transitions but not why decisions were made. The new proof_logs table captures:

| Field | Purpose |
|---|---|
| event | What happened (evaluate, complete, retry, blocked, verify_pass, etc.) |
| decision | The decision made (e.g., complete:PR_URL, retry:rate_limited) |
| evidence | Evidence used (e.g., exit_code=0, signal=FULL_LOOP_COMPLETE) |
| decision_maker | Who/what decided (e.g., evaluate_worker, quality_gate, pulse:phase1) |
| duration_secs | Stage duration for pipeline latency analysis (t219 prep) |
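To make the shape concrete, the documented schema can be exercised in a throwaway database. This is a sketch only, assuming the column names listed above; the real table lives in the supervisor database and is created by ensure_db():

```shell
# Sketch: recreate the documented proof_logs shape in a scratch DB and
# record one "complete" decision point, then read it back.
db=$(mktemp)
sqlite3 "$db" "CREATE TABLE proof_logs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    task_id TEXT NOT NULL,
    event TEXT NOT NULL,
    stage TEXT,
    decision TEXT,
    evidence TEXT,
    decision_maker TEXT,
    pr_url TEXT,
    duration_secs INTEGER,
    metadata TEXT,
    timestamp TEXT DEFAULT (datetime('now'))
);"
sqlite3 "$db" "INSERT INTO proof_logs (task_id, event, decision, evidence, decision_maker)
    VALUES ('t001', 'complete', 'complete:PR_URL', 'exit_code=0', 'evaluate_worker');"
row=$(sqlite3 "$db" "SELECT task_id, event, decision FROM proof_logs;")
echo "$row"
rm -f "$db"
```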

Query interface

supervisor-helper.sh proof-log t001              # Table view
supervisor-helper.sh proof-log t001 --json       # JSON export
supervisor-helper.sh proof-log t001 --timeline   # Stage timing timeline
supervisor-helper.sh proof-log --recent 20       # Recent entries
supervisor-helper.sh proof-log --stats           # Aggregate statistics
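The --stats mode presumably aggregates over proof_logs; a hypothetical sketch of that kind of query follows (the PR's actual SQL is not shown in this thread):

```shell
# Hypothetical aggregate, not the script's actual --stats SQL:
# per-event counts and average stage duration.
db=$(mktemp)
sqlite3 "$db" "CREATE TABLE proof_logs (task_id TEXT, event TEXT, duration_secs INTEGER);"
sqlite3 "$db" "INSERT INTO proof_logs VALUES
    ('t001','evaluate',12), ('t001','complete',3), ('t002','evaluate',9);"
stats=$(sqlite3 "$db" "SELECT event, COUNT(*), AVG(duration_secs)
    FROM proof_logs GROUP BY event ORDER BY event;")
echo "$stats"
rm -f "$db"
```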

Design decisions

  • Best-effort writes: proof-log failures never block the pipeline
  • Immutable append-only: no UPDATE/DELETE on proof_logs (audit integrity)
  • Auto-migration: existing databases get the table via ensure_db()
  • Stage timing: _proof_log_stage_duration() calculates elapsed time between entries (t219 prep)
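The best-effort contract can be sketched as follows. This is a hypothetical stub, not the PR's write_proof_log() itself; the point is the trailing || true that keeps errexit pipelines alive even when the write fails:

```shell
# Stub illustrating the best-effort contract: the DB path is deliberately
# unwritable, yet the caller never sees a failure.
write_proof_log() {
    sqlite3 /nonexistent/dir/proof.db \
        "INSERT INTO proof_logs (task_id, event) VALUES ('t001', 'evaluate');" \
        2>/dev/null || true
}
set -e              # even under errexit...
write_proof_log     # ...a failed proof-log write does not abort the pipeline
status="pipeline continues"
echo "$status"
```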

Testing

  • bash -n syntax check: PASS
  • ShellCheck on new functions: PASS (zero violations)
  • SQLite schema + query verification: PASS
  • DB migration tested with INSERT/SELECT/aggregate queries

Closes

Closes #913 (t218)

Summary by CodeRabbit

  • New Features
    • Added a proof-log feature to track and audit task transitions, decisions, and verification outcomes throughout the system.
    • New proof-log command provides multiple modes to query, export to JSON, view timelines, check recent entries, and review statistics for enhanced traceability and audit verification.

… (t218)

Adds a proof_logs SQLite table and write_proof_log() function that records
immutable evidence at each supervisor decision point. This creates a
structured audit trail answering 'why was this task marked complete?'

Schema: proof_logs(task_id, event, stage, decision, evidence,
decision_maker, pr_url, duration_secs, metadata, timestamp)

Instrumented decision points:
- evaluate_worker outcome (with evaluation tier used)
- complete/retry/blocked/failed decisions in pulse Phase 1
- quality gate escalation
- deliverable verification (PR merged + substantive files)
- post-merge verify pass/fail (VERIFY.md checks)
- lifecycle stage transitions (dispatch, pr_review, merge, deploy, verify)

Query interface: supervisor-helper.sh proof-log <task_id> [--json|--timeline]
  --recent [N]  Show recent entries across all tasks
  --stats       Aggregate statistics (event counts, avg stage durations)
  --timeline    Stage timing timeline with durations (t219 prep)

Stage timing: _proof_log_stage_duration() calculates elapsed time between
proof-log entries, enabling pipeline latency analysis for t219.
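The elapsed-time idea can be sketched like this. It is hypothetical — _proof_log_stage_duration()'s actual implementation is not shown here — but illustrates seconds between two proof-log timestamps, delegated to SQLite to avoid non-portable date(1) flags in shell:

```shell
# Seconds elapsed between two proof-log timestamps, computed via SQLite's
# strftime('%s', ...) epoch conversion.
prev='2026-02-10 05:00:00'
curr='2026-02-10 05:02:30'
elapsed=$(sqlite3 :memory: \
    "SELECT CAST(strftime('%s','$curr') - strftime('%s','$prev') AS INTEGER);")
echo "$elapsed"
```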

Design decisions:
- Best-effort writes (never block pipeline on proof-log failure)
- Immutable append-only (no UPDATE/DELETE on proof_logs)
- Auto-migration for existing databases via ensure_db()
- Follows existing patterns: local var=$1, explicit returns, sql_escape()
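The sql_escape() helper referenced here is not shown in this thread; the conventional SQLite approach it presumably follows is doubling single quotes so values are safe inside a '...' string literal:

```shell
# Assumed shape of sql_escape() (the script's actual body may differ):
# double every single quote for use in a SQLite '...' literal.
sql_escape() { printf '%s' "$1" | sed "s/'/''/g"; }
escaped=$(sql_escape "it's done")
echo "$escaped"
```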

@coderabbitai
Contributor

coderabbitai bot commented Feb 10, 2026

Walkthrough

This PR implements a structured proof-log audit trail system (t218) for task verification. It introduces a SQLite proof_logs table, a write_proof_log() function to record entries at decision points throughout the task lifecycle, helper functions for duration tracking, and a new proof-log CLI command with modes to query, export, and analyze logs for audit verification and traceability.

Changes

All changes are in .agents/scripts/supervisor-helper.sh:

  • Proof-log Storage & Core Functions: Adds proof_logs SQLite table creation with columns (id, task_id, event, stage, decision, evidence, decision_maker, pr_url, duration_secs, metadata, timestamp), indices, and DB migration initialization; includes write_proof_log() function for ingesting structured entries with field escaping and a _proof_log_stage_duration() helper for computing durations since the last relevant entry. Non-blocking error handling preserves pipeline semantics.
  • Proof-log Injection Points: Injects proof-log entries across multiple decision points: cmd_transition() for task lifecycle stages (dispatched, complete, retrying, blocked, failed), cmd_pulse()/evaluate_worker() for evaluation outcomes (evaluate, complete, retry, blocked, failed), run_verify_checks() for verification results (verify_pass, verify_fail), and deliverable verification outcomes. Includes duration tracking where applicable.
  • CLI Command & Integration: Adds cmd_proof_log() handler supporting multiple modes: display task proof-logs, export to JSON, timeline visualization, recent entries, and statistics. Integrates into the main command dispatcher as the proof-log entry point.
  • Documentation & Migration: Updates help/README text to announce proof-log functionality and the t218 migration note; adds migration blocks to ensure the proof_logs table exists during upgrades; includes sample log echoes for validation.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

📜 Tasks now tell their structured tale,
Each decision, logged without fail,
From dispatch to the final check,
Proof-logs shine, no audit wreck! ✨
Trust flows deep through SQLite's keep.

🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 inconclusive)
  • Linked Issues check — ❓ Inconclusive: The PR implements core requirements from issue #913: structured proof_logs table with required fields, instrumentation at decision points, and CLI commands for querying proof-logs. However, the audit command and proof-log enforcement in update_todo_on_complete() are not fully addressed. Resolution: verify that update_todo_on_complete() enforces proof-log existence before marking tasks complete, and confirm audit CLI functionality is included or planned for a follow-up.
✅ Passed checks (4 passed)
  • Description Check — ✅ Passed: Check skipped; CodeRabbit's high-level summary is enabled.
  • Title check — ✅ Passed: The title clearly and concisely summarizes the main change: adding structured proof-logs for task completion trust, with reference to the linked issue (t218).
  • Out of Scope Changes check — ✅ Passed: All changes are focused on proof-log infrastructure: new SQLite table, write functions, instrumentation at decision points, and CLI commands. No unrelated refactoring or scope creep detected.
  • Docstring Coverage — ✅ Passed: No functions found in the changed files to evaluate docstring coverage; skipping docstring coverage check.


@github-actions

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 33 code smells

[INFO] Recent monitoring activity:
Tue Feb 10 05:42:11 UTC 2026: Code review monitoring started
Tue Feb 10 05:42:11 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 33

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 33
  • VULNERABILITIES: 0

Generated on: Tue Feb 10 05:42:14 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring


@marcusquinn marcusquinn merged commit 1b6da26 into main Feb 10, 2026
9 of 10 checks passed
marcusquinn added a commit that referenced this pull request Feb 10, 2026

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In @.agents/scripts/supervisor-helper.sh:
- Around line 1808-1819: The stage duration is being calculated by
_proof_log_stage_duration using "$current_state" but then written to
write_proof_log under stage "$new_state", shifting timing one stage forward; to
fix, ensure the duration is attributed to the same stage you calculate — either
call _proof_log_stage_duration with "$new_state" if you intend to measure the
upcoming stage, or (preferably) keep the current call with "$current_state" and
change the write_proof_log --stage argument to "$current_state" so the --stage
value matches the computed _stage_duration; update the invocation around
_proof_log_stage_duration, write_proof_log, and the variables
task_id/current_state/new_state accordingly.
- Around line 10783-10789: The --limit handling currently assigns limit_n from
the next arg without validation; add a numeric check using the same regex
/^[0-9]+$/ that the positional argument uses and reject or default if it fails.
Specifically, after the argument-parsing loop that sets limit_n (the --limit
branch in the case block) and before the main case "$mode" block, validate
limit_n with the regex and if it does not match set limit_n to a safe default
(e.g., 20) or exit with an error, ensuring LIMIT $limit_n in SQL only ever
receives numeric input.
- Around line 10860-10874: The code builds JSON using manual escaping of
evidence/meta via _esc_evidence and _esc_meta which only replaces quotes and
leaves newlines, backslashes, tabs, etc., unescaped; replace that logic by
passing the raw pevidence and pmeta through the existing JSON escaper `jq -Rs .`
(e.g., set _esc_evidence=$(printf '%s' "$pevidence" | jq -Rs .) and similarly
for _esc_meta) and then use those values in the printf that constructs the
object (update the printf placeholders to accept the jq-escaped strings as-is);
apply the same change to the other block around lines 10949-10963 that uses
_esc_evidence/_esc_meta.
- Around line 323-330: The proof log insertion currently interpolates $duration
directly into the SQL; validate that the value from _proof_log_stage_duration()
(variable duration) is a numeric (integer or float) before using it unquoted,
and if it fails validation treat it as NULL or skip adding duration_secs. Update
the block that builds cols and vals so it first checks duration with a numeric
regex (e.g. /^[0-9]+(\.[0-9]+)?$/) or an arithmetic test, only appends
"duration_secs" and the numeric value to cols/vals when valid, and otherwise
either omit the column or insert NULL (ensuring the SQL value is quoted/escaped
consistently). Also keep the existing metadata handling and the db "INSERT INTO
proof_logs (...) VALUES (...)" invocation unchanged except for using the
validated/quoted duration variable.

Comment on lines +323 to +330
    if [[ -n "$duration" ]]; then
        cols="$cols, duration_secs"; vals="$vals, $duration"
    fi
    if [[ -n "$metadata" ]]; then
        cols="$cols, metadata"; vals="$vals, '$e_meta'"
    fi

    db "$SUPERVISOR_DB" "INSERT INTO proof_logs ($cols) VALUES ($vals);" 2>/dev/null || true
⚠️ Potential issue | 🟠 Major

Validate duration as numeric before SQL interpolation to maintain robust automation standards.

Although _proof_log_stage_duration() currently returns only arithmetic results (guaranteed numeric or empty), inserting duration_secs unquoted into SQL violates defensive coding practices. The guidelines require automation scripts to prioritize robustness, and unquoted dynamic values in SQL are a security anti-pattern. Add validation to future-proof against code changes and maintain A-grade standards.

Suggested fix
-    if [[ -n "$duration" ]]; then
-        cols="$cols, duration_secs"; vals="$vals, $duration"
-    fi
+    if [[ -n "$duration" ]]; then
+        if [[ "$duration" =~ ^[0-9]+$ ]]; then
+            cols="$cols, duration_secs"; vals="$vals, $duration"
+        else
+            log_verbose "proof-log: ignoring non-numeric duration '$duration'"
+        fi
+    fi

Comment on lines +1808 to +1819
    # Proof-log: record lifecycle stage transitions (t218)
    # Only log transitions that represent significant pipeline stages
    # (not every micro-transition, to keep proof-logs focused)
    case "$new_state" in
        dispatched|pr_review|review_triage|merging|merged|deploying|deployed|verifying|verified|verify_failed)
            local _stage_duration
            _stage_duration=$(_proof_log_stage_duration "$task_id" "$current_state")
            write_proof_log --task "$task_id" --event "transition" --stage "$new_state" \
                --decision "$current_state->$new_state" \
                --evidence "${error_msg:+error=$error_msg}" \
                --maker "cmd_transition" \
                ${pr_url:+--pr-url "$pr_url"} \
⚠️ Potential issue | 🟠 Major

Stage durations are attributed to the wrong stage.

_proof_log_stage_duration is computed for the current stage, but logged under stage="$new_state", which shifts timings one stage forward in timelines/stats. This undermines the latency analysis objective (t219).

Suggested fix
-            write_proof_log --task "$task_id" --event "transition" --stage "$new_state" \
+            write_proof_log --task "$task_id" --event "transition" --stage "$current_state" \
                 --decision "$current_state->$new_state" \

Comment on lines +10783 to +10789
    while [[ $# -gt 0 ]]; do
        case "$1" in
            --json)     format="json"; shift ;;
            --timeline) mode="timeline"; shift ;;
            --recent)   mode="recent"; shift ;;
            --stats)    mode="stats"; shift ;;
            --limit)    limit_n="${2:-20}"; shift 2 ;;
⚠️ Potential issue | 🟠 Major


Validate --limit parameter before use in SQL queries.

The --limit option assigns limit_n without validation, unlike the positional argument which validates with ^[0-9]+$. Non-numeric values will cause SQL syntax errors when injected into LIMIT $limit_n;. Add validation to match the existing regex pattern used for positional arguments and ensure query reliability.

Suggested fix
+    # Validate limit_n is numeric before SQL execution
+    if [[ ! "$limit_n" =~ ^[0-9]+$ ]]; then
+        log_error "--limit must be a positive integer"
+        return 1
+    fi

Insert after argument parsing (before the case "$mode" block at line ~10815).

Also applies to: 10876-10890
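A POSIX-shell equivalent of the suggested guard can be sketched like this (the review's ^[0-9]+$ bash regex expressed as a case pattern; the function name is illustrative, not from the script):

```shell
# Reject anything non-numeric before it reaches "LIMIT $limit_n" in SQL.
validate_limit() {
    case "${1:-}" in
        ''|*[!0-9]*) printf '20' ;;   # non-numeric: fall back to safe default
        *)           printf '%s' "$1" ;;
    esac
}
good=$(validate_limit 50)
bad=$(validate_limit '50; DROP TABLE proof_logs')
echo "$good $bad"
```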


Comment on lines +10860 to +10874
            echo "["
            local first=true
            while IFS='|' read -r pid ptask pevent pstage pdecision pevidence pmaker ppr pdur pmeta pts; do
                [[ -z "$pid" ]] && continue
                if [[ "$first" != "true" ]]; then echo ","; fi
                first=false
                local _esc_evidence="${pevidence:-}"
                _esc_evidence="${_esc_evidence//\"/\\\"}"
                local _esc_meta="${pmeta:-}"
                _esc_meta="${_esc_meta//\"/\\\"}"
                printf '  {"id":%s,"task_id":"%s","event":"%s","stage":"%s","decision":"%s","evidence":"%s","decision_maker":"%s","pr_url":"%s","duration_secs":%s,"metadata":"%s","timestamp":"%s"}' \
                    "$pid" "$ptask" "$pevent" "${pstage:-}" "${pdecision:-}" \
                    "$_esc_evidence" \
                    "${pmaker:-}" "${ppr:-}" "${pdur:-null}" \
                    "$_esc_meta" "$pts"
⚠️ Potential issue | 🟠 Major


JSON export emits invalid JSON for evidence and metadata fields.

The current escaping only removes double quotes via parameter expansion (${var//\"/\\\"}), but leaves newlines, backslashes, tabs, and other control characters unescaped. This breaks JSON structure downstream. Use the proper jq -Rs . JSON escaper (already used throughout the script) for all string values.

Suggested fix
+    _json_escape() { printf '%s' "${1:-}" | jq -Rs .; }
...
-                    local _esc_evidence="${pevidence:-}"
-                    _esc_evidence="${_esc_evidence//\"/\\\"}"
-                    local _esc_meta="${pmeta:-}"
-                    _esc_meta="${_esc_meta//\"/\\\"}"
-                    printf '  {"id":%s,"task_id":"%s","event":"%s","stage":"%s","decision":"%s","evidence":"%s","decision_maker":"%s","pr_url":"%s","duration_secs":%s,"metadata":"%s","timestamp":"%s"}' \
-                        "$pid" "$ptask" "$pevent" "${pstage:-}" "${pdecision:-}" \
-                        "$_esc_evidence" \
-                        "${pmaker:-}" "${ppr:-}" "${pdur:-null}" \
-                        "$_esc_meta" "$pts"
+                    printf '  {"id":%s,"task_id":%s,"event":%s,"stage":%s,"decision":%s,"evidence":%s,"decision_maker":%s,"pr_url":%s,"duration_secs":%s,"metadata":%s,"timestamp":%s}' \
+                        "$pid" \
+                        "$(_json_escape "$ptask")" "$(_json_escape "$pevent")" "$(_json_escape "${pstage:-}")" \
+                        "$(_json_escape "${pdecision:-}")" "$(_json_escape "${pevidence:-}")" \
+                        "$(_json_escape "${pmaker:-}")" "$(_json_escape "${ppr:-}")" \
+                        "${pdur:-null}" "$(_json_escape "${pmeta:-}")" "$(_json_escape "$pts")"

Also applies to: 10949-10963
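The difference between quote-only escaping and jq -Rs can be demonstrated directly with a small self-contained check (assumes jq is installed):

```shell
# A value with a newline, a tab, quotes, and a backslash.
raw=$(printf 'line1\nline2\t"quoted" \\slash')

# Quote-only escaping (the pattern flagged above) leaves the raw newline
# in place, so the resulting JSON is invalid.
naive=$(printf '%s' "$raw" | sed 's/"/\\"/g')
printf '{"evidence":"%s"}' "$naive" | jq -e . >/dev/null 2>&1 || naive_ok=broken

# jq -Rs emits a complete, fully escaped JSON string literal.
proper=$(printf '%s' "$raw" | jq -Rs .)
printf '{"evidence":%s}' "$proper" | jq -e . >/dev/null 2>&1 && proper_ok=valid
echo "$naive_ok $proper_ok"
```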


marcusquinn added a commit that referenced this pull request Feb 10, 2026


Development

Successfully merging this pull request may close these issues.

t218: Verification proof-logs — structured audit trail for task completion trust
