GH#3916: Recover high-value hardening into mergeable replacement PR#4126
GH#3916: Recover high-value hardening into mergeable replacement PR#4126marcusquinn merged 3 commits intomainfrom
Conversation
Salvage high-value security fixes without reverting newer framework behavior by restoring only the stable script hardening changes for content scanning, deploy safety, and settings updates.
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
|
Missing issue link. This PR references issue #3916, but the PR body doesn't contain a closing keyword. Add |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughThree shell scripts updated: content-scanner now normalizes input before pre-filtering, adds regex keyword support, sentinel handling, and stronger entropy; deploy-agents tightens revision validation, surfaces errors, and uses safer argument handling; settings-helper validates keys and switches to jq path-based setpath/getpath updates. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Wed Mar 11 03:52:45 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.agents/scripts/content-scanner-helper.sh:
- Around line 232-245: Entry-point scripts capture payloads with command
substitution which strips trailing newlines before _cs_normalize_nfkc() runs;
update scan-file, scan-stdin, annotate-file, and annotate-stdin to stop using
$() / $(<file) and instead feed the raw bytes into _cs_normalize_nfkc without
intermediate substitution (e.g., pipe the content into the function or redirect
the file into it) so the sentinel logic inside _cs_normalize_nfkc() can preserve
trailing newlines; find calls that look like content=$(...) or content=$(<file)
and replace them with direct redirection or piping into _cs_normalize_nfkc().
In @.agents/scripts/deploy-agents-on-merge.sh:
- Around line 157-163: The git diff invocation that assigns changed_files (git
-C "$REPO_DIR" diff --name-only "$since_commit" HEAD -- '.agents/' || echo "")
is swallowing real git errors by falling back to an empty string; change this so
failures are detected and propagated: run the git diff without the "|| echo
\"\"" fallback, capture its exit status, and on non-zero exit log the git error
(use log_error) and return a failing exit code (non-2) so the deploy script
fails fast; keep the successful output assigned to changed_files when git
returns 0 and only treat an empty changed_files as "no agent changes." Ensure
you reference the changed_files assignment and the git diff command when making
this change.
In @.agents/scripts/settings-helper.sh:
- Around line 264-272: The array branch currently feeds raw JSON to jq
(--argjson v "$value") without validating it, which can cause jq to exit with
code 2 under set -euo pipefail; before calling jq in the branch where value
starts with "[" validate the JSON (e.g., run jq -e . on the raw "$value") and if
validation fails call print_error with a clear message and exit/return non-zero;
keep the existing behavior for the comma-separated path that produces
json_array, but also ensure the jq call that writes to "$tmp_file" uses the same
setpath invocation and variables (key, value, json_array, SETTINGS_FILE,
tmp_file) so the error path is handled consistently.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a684d32c-6555-493b-8a59-1ee747c5177f
📒 Files selected for processing (3)
.agents/scripts/content-scanner-helper.sh.agents/scripts/deploy-agents-on-merge.sh.agents/scripts/settings-helper.sh
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Wed Mar 11 06:21:40 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
|
Addressed the CHANGES_REQUESTED feedback while keeping scope on the recovery-hardening intent:\n\n- Removed newline-stripping input capture in /// by feeding file/stdin directly into \n- Made fail fast on errors instead of swallowing failures\n- Added explicit JSON validation for array !=0 USAGE: COMMANDS: SETTINGS KEYS (dot-notation): PRECEDENCE: EXAMPLES: FILE: DOCS: In .agents/scripts/settings-helper.sh line 31: For more information: Testing keyword pre-filter: Testing NFKC normalization: Testing boundary annotation: Testing full scan pipeline (integration with prompt-guard): ════════════════════════════════════════════════════════════ |
|
Correction: previous comment had shell-formatting artifacts.\n\nResolved CHANGES_REQUESTED items while keeping scope to recovery hardening:\n- Removed newline-stripping input capture in scan-file, scan-stdin, annotate-file, and annotate-stdin by feeding file/stdin directly into the normalizer\n- Updated detect_changes to fail fast when git diff fails (no silent fallback)\n- Added array JSON validation before jq argjson in settings-helper set\n\nVerification rerun:\n- shellcheck .agents/scripts/content-scanner-helper.sh .agents/scripts/deploy-agents-on-merge.sh .agents/scripts/settings-helper.sh\n- bash .agents/scripts/content-scanner-helper.sh test\n- bash .agents/scripts/content-scanner-helper.sh help >/dev/null\n- bash .agents/scripts/deploy-agents-on-merge.sh --help >/dev/null\n- bash .agents/scripts/settings-helper.sh --help >/dev/null |
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
.agents/scripts/content-scanner-helper.sh (1)
223-305:⚠️ Potential issue | 🟠 MajorTrailing newlines are still lost at the
_cs_normalize_nfkcAPI boundary.The inner Python/Perl sentinel only protects
_cs_normalize_nfkc()'s own command substitutions. The function strips that sentinel before returning, and its callers still capture the function with$(), so final trailing newlines are dropped again inscan_content,scan-file,scan-stdin,annotate-file, andannotate-stdin.Please keep the sentinel alive until after the outer capture strips it, or switch these flows to a temp-file/process-substitution pattern instead of returning normalized content through
$().Based on learnings, trailing newlines are only preserved when the sentinel survives the command-substitution boundary and is stripped afterward by the caller.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.agents/scripts/content-scanner-helper.sh around lines 223 - 305, _cs_normalize_nfkc currently removes its internal sentinel before returning, so callers that capture its output with $() still lose trailing newlines; change the API so the sentinel is left in the output and let callers (e.g. scan_content, scan-file, scan-stdin, annotate-file, annotate-stdin) strip the sentinel after their command-substitution, or refactor callers to avoid $() by using a temp file or process substitution to read normalized content; update _cs_normalize_nfkc and each caller (names above) to agree on preserving-and-stripping the sentinel (or to use the temp-file/process-substitution pattern) so trailing newlines survive the outer command-substitution boundary..agents/scripts/settings-helper.sh (1)
264-271:⚠️ Potential issue | 🟠 MajorThe JSON-array branch still accepts non-array values.
jq -e .only proves that$valueis valid JSON.settings-helper.sh set onboarding.familiarity '{"x":1}'will now succeed and store an object in a field whose default type isarray.Suggested fix
- if [[ "$value" == "["* ]]; then - if ! jq -e . <<<"$value" >/dev/null 2>&1; then - print_error "Invalid array JSON value: $value" + if [[ "$value" == "["* ]]; then + if ! jq -e 'if type == "array" then . else error("not array") end' <<<"$value" >/dev/null 2>&1; then + print_error "Invalid array value: $value" return 1 fi jq --arg k "$key" --argjson v "$value" 'setpath($k | split("."); $v)' "$SETTINGS_FILE" >"$tmp_file"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.agents/scripts/settings-helper.sh around lines 264 - 271, In the array) case validate that $value is a JSON array (not just any JSON) before writing: replace the current jq -e . check with a type check such as jq -e 'if type=="array" then . else error("not an array") end' <<<"$value" and keep using jq --arg k "$key" --argjson v "$value" 'setpath($k | split("."); $v)' "$SETTINGS_FILE" >"$tmp_file"; this ensures the array branch (variables: key, value, SETTINGS_FILE, tmp_file) only accepts true JSON arrays.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.agents/scripts/deploy-agents-on-merge.sh:
- Around line 214-215: The cleanup steps that run find "$target_scripts_dir"
-mindepth 1 -delete || true and the corresponding fallback at lines ~272-279
must not swallow failures; remove the "|| true" and ensure the script exits on
failure (either rely on existing set -e or explicitly check the command's exit
status and call exit 1 with a clear error message). Update both occurrences so
that if the deletion fails the deploy halts rather than continuing with stale
files, and include a short log (e.g., echo or printf) identifying the failing
command before exiting to aid debugging.
---
Duplicate comments:
In @.agents/scripts/content-scanner-helper.sh:
- Around line 223-305: _cs_normalize_nfkc currently removes its internal
sentinel before returning, so callers that capture its output with $() still
lose trailing newlines; change the API so the sentinel is left in the output and
let callers (e.g. scan_content, scan-file, scan-stdin, annotate-file,
annotate-stdin) strip the sentinel after their command-substitution, or refactor
callers to avoid $() by using a temp file or process substitution to read
normalized content; update _cs_normalize_nfkc and each caller (names above) to
agree on preserving-and-stripping the sentinel (or to use the
temp-file/process-substitution pattern) so trailing newlines survive the outer
command-substitution boundary.
In @.agents/scripts/settings-helper.sh:
- Around line 264-271: In the array) case validate that $value is a JSON array
(not just any JSON) before writing: replace the current jq -e . check with a
type check such as jq -e 'if type=="array" then . else error("not an array")
end' <<<"$value" and keep using jq --arg k "$key" --argjson v "$value"
'setpath($k | split("."); $v)' "$SETTINGS_FILE" >"$tmp_file"; this ensures the
array branch (variables: key, value, SETTINGS_FILE, tmp_file) only accepts true
JSON arrays.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ffa0f8c0-5077-4116-8bdf-e9474438eca8
📒 Files selected for processing (3)
.agents/scripts/content-scanner-helper.sh.agents/scripts/deploy-agents-on-merge.sh.agents/scripts/settings-helper.sh
|
Dispatching worker to address requested review changes.\n- Model: default headless rotation (via headless-runtime-helper.sh)\n- Branch: bugfix/recover-pr-3916-salvage\n- Scope: resolve CHANGES_REQUESTED feedback on PR #4126 and push minimal fixes to the same branch\n- Attempt: 1 of 1\n- Direction: prioritize reviewer-requested corrections over broad refactors, keep PR linked to GH#3916 context |
|
Supervisor pulse intervention: previous worker was stopped after unproductive thrashing.\n\n- Model: sonnet (anthropic/claude-sonnet-4-6)\n- Branch: bugfix/recover-pr-3916-salvage\n- Reason: struggle_ratio 47790 with 0 commits after 34m\n- Diagnosis: worker appears stuck in analysis/retry loop without producing branch updates\n- Next action: dispatched a fresh focused worker on the same branch to address CHANGES_REQUESTED with minimal diff |
|
Dispatching worker to address requested review changes and unblock this PR. Linked issue: #3916. |
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Wed Mar 11 08:24:25 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
|
Addressed remaining CHANGES_REQUESTED items and pushed commit 9045111 to this branch. Changes made:
Verification rerun:
|
|
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.agents/scripts/settings-helper.sh (1)
221-239:⚠️ Potential issue | 🟠 MajorFinish migrating the read path away from jq source strings.
cmd_set()now correctly treats keys as data and preserves empty values, butcmd_get(),cmd_list(), andcmd_export_settings()still do not. These functions use// emptyplus[[ -n ]]or hardcoded filter strings, so documented empty-string settings likeonboarding.work_typeare reported as unknown, and settings cannot be reliably read. Migrate all read paths to--arg+getpath()to match the write path.Affected locations:
cmd_get()lines 183, 192cmd_list()lines 327–328cmd_export_settings()line 429🛠️ Suggested direction
- value=$(jq -r "$jq_path // empty" "$SETTINGS_FILE" 2>/dev/null || echo "") - if [[ -n "$value" ]]; then + value=$(jq -r --arg k "$key" 'getpath($k | split(".")) // "__MISSING__"' "$SETTINGS_FILE" 2>/dev/null || echo "__MISSING__") + if [[ "$value" != "__MISSING__" ]]; then echo "$value" return 0 fi- keys=$(jq -r ".$section | keys[]" "$SETTINGS_FILE" 2>/dev/null) + keys=$(jq -r --arg s "$section" 'getpath([$s]) | keys[]' "$SETTINGS_FILE" 2>/dev/null) ... - value=$(jq -r ".$section.$key" "$SETTINGS_FILE" 2>/dev/null) + value=$(jq -r --arg s "$section" --arg k "$key" 'getpath([$s, $k])' "$SETTINGS_FILE" 2>/dev/null)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.agents/scripts/settings-helper.sh around lines 221 - 239, cmd_get(), cmd_list(), and cmd_export_settings() still read settings by embedding the key into jq source strings and using fallbacks like // empty or [[ -n ]], which treats legitimate empty-string values as missing; update their read paths to mirror cmd_set() by passing keys as data via --arg (or --argjson as needed) and using getpath($k|split(".")) to fetch values, detecting "__MISSING__" when absent and preserving empty strings and types. Concretely, replace any jq expressions that interpolate keys or use hardcoded filter strings with patterns like jq -r --arg k "$key" 'getpath($k|split(".")) // "__MISSING__"' (and analogous list/export queries that map keys to getpath calls), remove the // empty and [[ -n ]] presence checks, and ensure the subsequent logic treats "__MISSING__" as absent while accepting "" as a valid value; use _generate_defaults and the same getpath-based type extraction (e.g., | type) to determine value types consistently with cmd_set().
♻️ Duplicate comments (1)
.agents/scripts/content-scanner-helper.sh (1)
305-313:⚠️ Potential issue | 🟠 MajorThe sentinel strip step still drops trailing newlines.
_cs_normalize_nfkc()preserves trailing\nby appending the sentinel before the first capture, but every caller then removes it with$(_cs_strip_normalize_sentinel ...). That second command substitution strips the trailing newlines again, soscan-file,scan-stdin,annotate-file,annotate-stdin, and the in-memory scan path still mutate payloads ending in newlines. Strip the sentinel from the already-captured variable instead of printing it back through$().🛠️ Suggested direction
-_cs_strip_normalize_sentinel() { +_cs_require_normalize_sentinel() { local content="$1" if [[ "$content" != *"$_CS_NORMALIZE_SENTINEL" ]]; then _cs_log_error "Normalization output missing sentinel" return 1 fi - printf '%s' "${content%"$_CS_NORMALIZE_SENTINEL"}" return 0 }normalized=$(_cs_normalize_nfkc "$content") -if ! normalized=$(_cs_strip_normalize_sentinel "$normalized"); then +if ! _cs_require_normalize_sentinel "$normalized"; then return 1 fi +normalized="${normalized%"$_CS_NORMALIZE_SENTINEL"}"Apply the same in-place strip pattern at each file/stdin call site.
This demonstrates the shell behavior directly. Expected result:
broken-bytes=5andfixed-bytes=7, because the second$()drops the two trailing newlines.#!/bin/bash set -euo pipefail sent="__CS_SENTINEL__" normalize() { printf 'line1\n\n%s' "$sent" } strip_via_subshell() { local content="$1" printf '%s' "${content%"$sent"}" } captured=$(normalize) broken=$(strip_via_subshell "$captured") fixed="$captured" fixed="${fixed%"$sent"}" printf 'broken-bytes=%s\n' "$(printf '%s' "$broken" | wc -c | tr -d ' ')" printf 'fixed-bytes=%s\n' "$(printf '%s' "$fixed" | wc -c | tr -d ' ')"Based on learnings, trailing newlines in
.agents/scripts/content-scanner-helper.sh_cs_normalize_nfkc()must be stripped with${var%x}after the in-process sentinel, because a separate command substitution drops them again.Also applies to: 375-376, 461-468, 480-485, 524-529, 543-548
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.agents/scripts/content-scanner-helper.sh around lines 305 - 313, The sentinel-strip approach currently uses a subshell printf via _cs_strip_normalize_sentinel which causes trailing newlines to be lost; update callers of _cs_normalize_nfkc() (e.g., scan-file, scan-stdin, annotate-file, annotate-stdin and the in-memory scan path) to strip the sentinel in-place on the captured variable instead of reprinting it: after capturing the normalized output do captured_var="${captured_var%$_CS_NORMALIZE_SENTINEL}" (or use the exact local var names used at each call site) rather than feeding the value through $(_cs_strip_normalize_sentinel ...); you can keep _cs_strip_normalize_sentinel for validation but change call sites to use the shell parameter expansion pattern to preserve trailing newlines.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In @.agents/scripts/settings-helper.sh:
- Around line 221-239: cmd_get(), cmd_list(), and cmd_export_settings() still
read settings by embedding the key into jq source strings and using fallbacks
like // empty or [[ -n ]], which treats legitimate empty-string values as
missing; update their read paths to mirror cmd_set() by passing keys as data via
--arg (or --argjson as needed) and using getpath($k|split(".")) to fetch values,
detecting "__MISSING__" when absent and preserving empty strings and types.
Concretely, replace any jq expressions that interpolate keys or use hardcoded
filter strings with patterns like jq -r --arg k "$key" 'getpath($k|split("."))
// "__MISSING__"' (and analogous list/export queries that map keys to getpath
calls), remove the // empty and [[ -n ]] presence checks, and ensure the
subsequent logic treats "__MISSING__" as absent while accepting "" as a valid
value; use _generate_defaults and the same getpath-based type extraction (e.g.,
| type) to determine value types consistently with cmd_set().
---
Duplicate comments:
In @.agents/scripts/content-scanner-helper.sh:
- Around line 305-313: The sentinel-strip approach currently uses a subshell
printf via _cs_strip_normalize_sentinel which causes trailing newlines to be
lost; update callers of _cs_normalize_nfkc() (e.g., scan-file, scan-stdin,
annotate-file, annotate-stdin and the in-memory scan path) to strip the sentinel
in-place on the captured variable instead of reprinting it: after capturing the
normalized output do captured_var="${captured_var%$_CS_NORMALIZE_SENTINEL}" (or
use the exact local var names used at each call site) rather than feeding the
value through $(_cs_strip_normalize_sentinel ...); you can keep
_cs_strip_normalize_sentinel for validation but change call sites to use the
shell parameter expansion pattern to preserve trailing newlines.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: caefdbd5-82f1-4e6c-9fed-f7a10c3f53f4
📒 Files selected for processing (3)
.agents/scripts/content-scanner-helper.sh.agents/scripts/deploy-agents-on-merge.sh.agents/scripts/settings-helper.sh
|
Dispatching worker to push PR #4126 to merge-ready state.\n- Model: alternating default via headless runtime helper\n- Branch: bugfix/recover-pr-3916-salvage\n- Scope: resolve actionable blockers and stabilize merge readiness\n- Attempt: 1 of 1\n- Direction: focus only on blockers preventing merge |



Summary
content-scanner-helper.sh(normalization order, regex-safe prefilter handling, stronger boundary-id fallback, newline-safe normalization)deploy-agents-on-merge.sh,settings-helper.sh) for commit validation, safer file ops, and jq/setpath key handlingVerification
shellcheck .agents/scripts/content-scanner-helper.sh .agents/scripts/deploy-agents-on-merge.sh .agents/scripts/settings-helper.sh(SC1091 info-only)bash .agents/scripts/content-scanner-helper.sh testbash .agents/scripts/content-scanner-helper.sh help >/dev/nullbash .agents/scripts/deploy-agents-on-merge.sh --help >/dev/nullbash .agents/scripts/settings-helper.sh --help >/dev/nullContext
Closes #3916
Summary by CodeRabbit
New Features
Bug Fixes
Chores
Tests