Skip to content

fix: security hardening — path traversal, injection, jq filter bug (batch 3)#3877

Merged
marcusquinn merged 6 commits intomarcusquinn:mainfrom
johnwaldo:bugfix/critical-batch3
Mar 9, 2026
Merged

fix: security hardening — path traversal, injection, jq filter bug (batch 3)#3877
marcusquinn merged 6 commits intomarcusquinn:mainfrom
johnwaldo:bugfix/critical-batch3

Conversation

@johnwaldo
Copy link
Contributor

@johnwaldo johnwaldo commented Mar 8, 2026

Summary

Security hardening fixes for 3 scripts addressing path traversal, command injection, and data corruption vulnerabilities.

Issues Fixed

Issue File Fix
#3152 worker-token-helper.sh Validate --token-file path is within TOKEN_DIR (prevents path traversal + arbitrary file deletion); validate --ttl is numeric (prevents bash arithmetic injection); remove 2>/dev/null from strategy fallback calls
#3179 ai-judgment-helper.sh Use awk -v for safe variable passing (prevents command injection via threshold); use jq for JSON construction (prevents JSON injection via LLM output)
#3240 stuck-detection-helper.sh Fix jq filter to match on both issue number AND repo slug (prevents cross-repo data corruption); use awk -v for safe variable passing; remove 2>/dev/null from gh issue edit calls

Vulnerability Details

Closes #3152, closes #3240

Partially addresses #3179 (critical and high findings fixed; medium findings remain)

Summary by CodeRabbit

  • Bug Fixes
    • Surface errors instead of suppressing them so failures propagate to callers.
    • Produce consistent, structured JSON for evaluator results using a robust builder.
    • Enforce path confinement and add validation for token operations, including numeric TTL and filename checks.
    • Improve state and label management in detection workflows: deduplicate and match by repository+issue, add numeric input validation, and ensure state progression when confidence is low.

…njection, JSON injection, jq filter bug

Security fixes:
- marcusquinn#3152: worker-token-helper.sh — validate --token-file path within TOKEN_DIR
  to prevent path traversal; validate --ttl is numeric to prevent bash
  arithmetic injection; remove 2>/dev/null from strategy fallback calls
- marcusquinn#3179: ai-judgment-helper.sh — use awk -v for safe variable passing
  (prevents command injection via threshold); use jq for JSON construction
  (prevents JSON injection via LLM output)
- marcusquinn#3240: stuck-detection-helper.sh — fix jq filter to match on both issue
  number AND repo slug (prevents cross-repo data corruption); use awk -v
  for safe variable passing; remove 2>/dev/null from gh issue edit calls
@github-actions github-actions bot added the bug Auto-created from TODO.md tag label Mar 8, 2026
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 8, 2026

Warning

Rate limit exceeded

@marcusquinn has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 18 minutes and 27 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4d5d4b8e-322d-493d-b7de-c98100187b31

📥 Commits

Reviewing files that changed from the base of the PR and between 0a39ee6 and 9a61e30.

📒 Files selected for processing (2)
  • .agents/scripts/stuck-detection-helper.sh
  • .agents/scripts/worker-token-helper.sh

Walkthrough

Refactors in three shell helper scripts: JSON construction moved to jq with typed fields, silent error redirections removed to allow error propagation, and worker-token inputs and file paths validated to prevent injection and path traversal.

Changes

Cohort / File(s) Summary
AI evaluation / JSON output
.agents/scripts/ai-judgment-helper.sh
Replace inline awk float compare with parameterized awk; build evaluator result and fallback JSON via jq -cn to preserve numeric/boolean typing.
Stuck-detection / state & GH ops
.agents/scripts/stuck-detection-helper.sh
Remove 2>/dev/null suppressions so write and gh errors surface; validate numeric inputs and parameterize awk threshold checks; update jq state filters to deduplicate/match by both issue and repo (use two-field unique keys).
Worker token / validation & safety
.agents/scripts/worker-token-helper.sh
Add TTL numeric validation; remove error-suppressing redirections so token creation errors propagate; enforce realpath confinement to TOKEN_DIR and require .token extension for revocation; perform validations earlier in control flow.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

needs-review

Poem

🔧 Three scripts tightened, errors left to speak,
jq sculpts the output, types no longer leak,
validation gates the tokens' hidden door,
state now checks repos — duplicates to explore,
a safer shell dance, tidy and sleek.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title precisely describes the main changes: security hardening across three helper scripts addressing path traversal, injection vulnerabilities, and jq filter bugs.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the security posture of several critical agent scripts by addressing multiple vulnerabilities. It introduces robust input validation and safer data handling practices to prevent common attack vectors such as path traversal, command injection, and JSON injection, while also rectifying a logic error that could lead to cross-repository data corruption. The changes aim to make these scripts more resilient and secure against malicious inputs and incorrect state management.

Highlights

  • Path Traversal & Arithmetic Injection Prevention: Implemented robust validation for --token-file paths and --ttl values in worker-token-helper.sh to prevent path traversal and arithmetic injection vulnerabilities, ensuring file operations are confined and numerical inputs are safe.
  • Command & JSON Injection Mitigation: Refactored ai-judgment-helper.sh to use awk -v for safe variable passing and jq for secure JSON construction, effectively eliminating command and JSON injection risks from external inputs.
  • Cross-Repo Data Corruption Fix: Updated stuck-detection-helper.sh to correctly identify issues by both issue number and repository slug when managing labeled issues, preventing unintended data corruption across different repositories.
  • Improved Error Visibility: Removed 2>/dev/null from several command calls across worker-token-helper.sh and stuck-detection-helper.sh to ensure that potential error messages are no longer suppressed and are visible for debugging and monitoring.
Changelog
  • .agents/scripts/ai-judgment-helper.sh
    • Updated awk command to use -v for safe variable passing of score and threshold, preventing command injection.
    • Refactored JSON construction for result_json and fallback_json to use jq -n --arg, preventing JSON injection.
  • .agents/scripts/stuck-detection-helper.sh
    • Removed 2>/dev/null from printf command when writing temporary state files.
    • Updated awk command to use -v for safe variable passing of confidence and STUCK_CONFIDENCE_THRESHOLD.
    • Removed 2>/dev/null from gh issue edit calls for adding and removing labels.
    • Removed 2>/dev/null from gh issue comment calls.
    • Added a redundant --arg repo to a jq command when updating labeled_issues state.
    • Modified jq filter to include repo in unique_by for labeled_issues to prevent cross-repo data corruption.
    • Modified jq filter to include repo in select for labeled_issues removal to prevent cross-repo data corruption.
  • .agents/scripts/worker-token-helper.sh
    • Added duplicate validation for the --ttl argument to ensure it is a positive integer, preventing arithmetic injection.
    • Removed 2>/dev/null from create_app_token and create_delegated_token function calls.
    • Implemented path validation using realpath to ensure --token-file is within TOKEN_DIR in cmd_validate and cmd_revoke functions, preventing path traversal.
Activity
  • No specific activity (comments, reviews, progress) was provided in the context for this pull request.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces several important security hardening fixes across three scripts, addressing vulnerabilities like path traversal, command injection, and data corruption. The changes are well-implemented, using safer alternatives like awk -v and jq --arg for passing variables and jq for robust JSON construction. I've identified a few minor issues related to duplicated code and incorrect indentation, which are detailed in the comments. Overall, this is a great set of security improvements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
.agents/scripts/worker-token-helper.sh (1)

370-379: Drop the duplicated TTL guard.

Line 370 through Line 379 repeat the same regex check and error path. Keeping a single validation block avoids drift the next time this rule changes.

Proposed cleanup
 		--ttl | -t)
 			ttl="$2"
 			# Validate TTL is numeric to prevent arithmetic injection
 			if ! [[ "$ttl" =~ ^[0-9]+$ ]]; then
 				log_token "ERROR" "TTL must be a positive integer: ${ttl}"
 				return 1
 			fi
-			# Validate TTL is numeric to prevent arithmetic injection
-			if ! [[ "$ttl" =~ ^[0-9]+$ ]]; then
-				log_token "ERROR" "TTL must be a positive integer: ${ttl}"
-				return 1
-			fi
 			if ((ttl > MAX_TTL)); then
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/scripts/worker-token-helper.sh around lines 370 - 379, The script
contains a duplicated TTL numeric-guard block using the variable ttl and calling
log_token on failure; remove the redundant second validation so only one check
remains: keep the single if [[ "$ttl" =~ ^[0-9]+$ ]] { ... } block that logs via
log_token "ERROR" "TTL must be a positive integer: ${ttl}" and returns 1, and
delete the repeated identical lines to avoid duplication and future drift.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.agents/scripts/ai-judgment-helper.sh:
- Around line 908-911: The JSON output is being pretty-printed by jq -n which
breaks the JSONL one-record-per-line contract; update the jq invocations that
build result JSON (e.g., where result_json is assigned and the other jq -n usage
that emits evaluator records used by eval_dataset) to use compact mode by
changing jq -n to jq -cn so each record is emitted as a single-line JSON object,
preserving the existing use of --arg/--argjson for fields like eval_type, score,
passed and details.

In @.agents/scripts/stuck-detection-helper.sh:
- Around line 344-349: The jq deduplication is wrong and a duplicate --arg is
present: in the pipeline that builds new_state from state (using variables
issue_number, repo_slug, now) change unique_by(.issue + .repo) to
unique_by([.issue, .repo]) so uniqueness is based on the tuple rather than
string concatenation, and remove the duplicated --arg repo "$repo_slug" argument
so --arg repo is only passed once.

In @.agents/scripts/worker-token-helper.sh:
- Around line 529-538: The current realpath call on the token file can fail if
the file is already deleted; change the check to canonicalize the parent
directory instead: compute parent_dir=$(dirname "$token_file") and realpath that
(realpath "$parent_dir") and verify it starts with ${TOKEN_DIR}, and separately
validate the filename via basename to ensure it contains no directory
components; keep using log_token for errors and preserve the existing -f guards
in the cleanup logic so deletion remains safe and idempotent.

---

Nitpick comments:
In @.agents/scripts/worker-token-helper.sh:
- Around line 370-379: The script contains a duplicated TTL numeric-guard block
using the variable ttl and calling log_token on failure; remove the redundant
second validation so only one check remains: keep the single if [[ "$ttl" =~
^[0-9]+$ ]] { ... } block that logs via log_token "ERROR" "TTL must be a
positive integer: ${ttl}" and returns 1, and delete the repeated identical lines
to avoid duplication and future drift.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2311bd43-8d71-4b3b-b53e-d499c026a7a3

📥 Commits

Reviewing files that changed from the base of the PR and between 2b20a17 and 5e4113c.

📒 Files selected for processing (3)
  • .agents/scripts/ai-judgment-helper.sh
  • .agents/scripts/stuck-detection-helper.sh
  • .agents/scripts/worker-token-helper.sh

@marcusquinn
Copy link
Owner

This PR is from an external contributor (@johnwaldo). Auto-merge is disabled for external PRs — a maintainer must review and merge manually.

…idempotent revoke

- Remove duplicated TTL validation block in worker-token-helper.sh
- Use jq -cn (compact) instead of jq -n to maintain JSONL contract
- Fix indentation on fallback_json block in ai-judgment-helper.sh
- Remove duplicate --arg repo in stuck-detection-helper.sh
- Fix unique_by(.issue + .repo) → unique_by([.issue, .repo]) to prevent
  string concatenation collisions (e.g. issue 12 + repo 3/x vs issue 1 + repo 23/x)
- Fix cmd_revoke realpath to resolve parent dir instead of token file,
  so cleanup works when only .meta remains (token already deleted)
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.agents/scripts/stuck-detection-helper.sh (1)

269-277: ⚠️ Potential issue | 🟠 Major

Validate confidence as numeric before threshold comparison to prevent string comparison semantics.

Line 271 uses awk to compare confidence against threshold, but awk treats non-numeric strings as strings. When confidence is a non-numeric value (e.g., "high", "foo"), the comparison becomes a lexicographic comparison, causing strings that appear "greater than" the threshold to spuriously return 1 and trigger label/comment operations.

The repro confirms the issue:

0.8   -> 1  ✓ correct
0.6   -> 0  ✓ correct
high  -> 1  ✗ WRONG (string "high" >= "0.7")
foo   -> 1  ✗ WRONG (string "foo" >= "0.7")

Add numeric validation before the awk comparison:

Suggested fix
 	# Check confidence threshold
 	local above_threshold
+	if ! [[ "$confidence" =~ ^([0-9]+([.][0-9]+)?|[.][0-9]+)$ ]] || \
+		! [[ "$STUCK_CONFIDENCE_THRESHOLD" =~ ^([0-9]+([.][0-9]+)?|[.][0-9]+)$ ]]; then
+		_sd_log_error "confidence values must be numeric (got confidence=${confidence}, threshold=${STUCK_CONFIDENCE_THRESHOLD})"
+		return 1
+	fi
 	above_threshold=$(awk -v c="$confidence" -v t="$STUCK_CONFIDENCE_THRESHOLD" 'BEGIN { print (c >= t) ? 1 : 0 }') || above_threshold="0"

This aligns with the coding guideline requirement for automation scripts: Reliability and robustness.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/scripts/stuck-detection-helper.sh around lines 269 - 277, The
confidence variable must be validated as a numeric value before running the awk
threshold comparison to avoid lexicographic comparisons; modify the block around
above_threshold (and the STUCK_CONFIDENCE_THRESHOLD check) to first test
confidence with a numeric regex (e.g., allow integers or decimals) and if it
fails, set a safe numeric default (0) or treat it as below threshold and log a
warning via _sd_log_info; then run the existing awk comparison to compute
above_threshold and proceed to call _sd_record_milestone/return as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.agents/scripts/worker-token-helper.sh:
- Around line 445-454: The token validation in cmd_validate resolves token_file
with realpath but compares it to the raw TOKEN_DIR string, which can fail when
TOKEN_DIR contains symlinks; update cmd_validate to canonicalize TOKEN_DIR
(e.g., compute a TOKEN_DIR_REAL via realpath or similar) and compare real_path
of token_file against that canonical TOKEN_DIR path (same approach used in
cmd_revoke), ensuring the check uses the canonicalized directory before
returning error.

---

Outside diff comments:
In @.agents/scripts/stuck-detection-helper.sh:
- Around line 269-277: The confidence variable must be validated as a numeric
value before running the awk threshold comparison to avoid lexicographic
comparisons; modify the block around above_threshold (and the
STUCK_CONFIDENCE_THRESHOLD check) to first test confidence with a numeric regex
(e.g., allow integers or decimals) and if it fails, set a safe numeric default
(0) or treat it as below threshold and log a warning via _sd_log_info; then run
the existing awk comparison to compute above_threshold and proceed to call
_sd_record_milestone/return as before.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: edb72120-8811-4e49-8b86-c8002f879460

📥 Commits

Reviewing files that changed from the base of the PR and between 5e4113c and 2cd10c7.

📒 Files selected for processing (3)
  • .agents/scripts/ai-judgment-helper.sh
  • .agents/scripts/stuck-detection-helper.sh
  • .agents/scripts/worker-token-helper.sh

…ze TOKEN_DIR in cmd_validate

- stuck-detection-helper.sh: Add numeric regex validation for confidence and
  threshold before awk comparison to prevent string comparison semantics
  (e.g., "high" >= "0.7" is true lexicographically in awk)
- worker-token-helper.sh: Canonicalize TOKEN_DIR via realpath in cmd_validate
  to match the pattern already used in cmd_revoke, preventing false rejections
  when TOKEN_DIR contains symlinks

Addresses remaining CodeRabbit review feedback on PR marcusquinn#3877.
@marcusquinn
Copy link
Owner

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 8, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.agents/scripts/stuck-detection-helper.sh (1)

264-283: ⚠️ Potential issue | 🟠 Major

Validate all parameters before GitHub operations and state mutations.

The guard at line 264 checks issue_number, milestone_min, and confidence, but omits elapsed_min—yet elapsed_min is interpolated into the comment body at line 313. More critically, milestone_min is passed to _sd_record_milestone (line 276) which pipes it to jq --argjson ms "$milestone" at line 242; non-numeric values like "30m" cause jq to exit with code 2, leaving GitHub labels applied but state unrecorded. Similarly, confidence enters an awk comparison at line 270 without validation, risking lexicographic semantics if the variable contains non-numeric text.

Add numeric validation before any side effects:

	if [[ -z "$issue_number" || -z "$milestone_min" || -z "$elapsed_min" || -z "$confidence" ]]; then
		_sd_log_error "usage: label-stuck <issue_number> <milestone_min> <elapsed_min> <confidence> <reasoning> <suggested_actions> [--repo <slug>]"
		return 1
	fi

+	if ! [[ "$milestone_min" =~ ^[0-9]+$ ]] || ! [[ "$elapsed_min" =~ ^[0-9]+$ ]]; then
+		_sd_log_error "milestone_min and elapsed_min must be positive integers (got milestone=${milestone_min}, elapsed=${elapsed_min})"
+		return 1
+	fi
+
+	if ! [[ "$confidence" =~ ^([0-9]+([.][0-9]+)?|[.][0-9]+)$ ]]; then
+		_sd_log_error "confidence must be numeric (got confidence=${confidence})"
+		return 1
+	fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/scripts/stuck-detection-helper.sh around lines 264 - 283, The guard
currently omits validating elapsed_min and does not ensure milestone_min and
confidence are numeric before performing GitHub ops or calling
_sd_record_milestone; update the parameter validation at the start of
label-stuck to: check that elapsed_min is provided, validate milestone_min and
confidence (and STUCK_CONFIDENCE_THRESHOLD) match a numeric regex before any
side effects, and return non-zero with an _sd_log_error if validation fails;
ensure the check happens before any calls that mutate state (e.g., before
invoking _sd_record_milestone and before any labeling/GitHub API calls) so
functions like _sd_record_milestone and the awk comparison receive only numeric
values.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.agents/scripts/stuck-detection-helper.sh:
- Around line 423-428: The jq expression that assigns new_state can fail when
.labeled_issues is missing; update the filter used in the new_state assignment
(the jq call using --arg key "$issue_key" --arg issue "$issue_number" --arg repo
"$repo_slug") to coalesce .labeled_issues to an empty array before iterating it
(for example using (.labeled_issues // []) or (.labeled_issues // [] ) ), e.g.
keep del(.milestones_checked[$key]) and set .labeled_issues = [(.labeled_issues
// [])[] | select(not (.issue == $issue and .repo == $repo))] so the jq call no
longer errors and the || true masking is unnecessary for this case.

---

Outside diff comments:
In @.agents/scripts/stuck-detection-helper.sh:
- Around line 264-283: The guard currently omits validating elapsed_min and does
not ensure milestone_min and confidence are numeric before performing GitHub ops
or calling _sd_record_milestone; update the parameter validation at the start of
label-stuck to: check that elapsed_min is provided, validate milestone_min and
confidence (and STUCK_CONFIDENCE_THRESHOLD) match a numeric regex before any
side effects, and return non-zero with an _sd_log_error if validation fails;
ensure the check happens before any calls that mutate state (e.g., before
invoking _sd_record_milestone and before any labeling/GitHub API calls) so
functions like _sd_record_milestone and the awk comparison receive only numeric
values.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b59465a8-23b7-4464-bc8b-0fc5a609c26f

📥 Commits

Reviewing files that changed from the base of the PR and between 2cd10c7 and 0a39ee6.

📒 Files selected for processing (2)
  • .agents/scripts/stuck-detection-helper.sh
  • .agents/scripts/worker-token-helper.sh
🚧 Files skipped from review as they are similar to previous changes (1)
  • .agents/scripts/worker-token-helper.sh

@marcusquinn
Copy link
Owner

Dispatching fix worker to address CodeRabbit CHANGES_REQUESTED review and failing CI (Monitor & Auto-Fix Code Quality, SonarCloud — systemic issues #3848/#3837 cover those; Label PR from Conventional Commit is per-PR).

marcusquinn and others added 2 commits March 8, 2026 15:54
…gainst missing state

Address remaining CodeRabbit CHANGES_REQUESTED review feedback on PR marcusquinn#3877:

- Add elapsed_min to required parameter check in cmd_label_stuck
- Validate milestone_min and elapsed_min as positive integers before
  any GitHub operations or state mutations (prevents jq --argjson
  failures and comment interpolation of non-numeric values)
- Guard cmd_label_clear jq filter against missing .labeled_issues key
  using null coalescing (// []) to prevent 'Cannot iterate over null'
  errors that were silently masked by || true
- ai-judgment-helper: use jq -cn for compact one-record-per-line output
- stuck-detection-helper: validate elapsed_min/milestone_min/confidence as numeric
- stuck-detection-helper: fix unique_by string concat collision, use tuple key
- stuck-detection-helper: handle missing .labeled_issues key in clear filter
- worker-token-helper: resolve TOKEN_DIR with realpath for symlink consistency
- worker-token-helper: handle missing token file gracefully in cmd_revoke
@johnwaldo
Copy link
Contributor Author

@coderabbitai review

All review feedback addressed:

  • ai-judgment-helper: added -cn flag for compact JSON output
  • stuck-detection-helper: added numeric validation for milestone_min, elapsed_min, confidence
  • stuck-detection-helper: fixed unique_by to use tuple key instead of string concatenation
  • stuck-detection-helper: fixed labeled_issues filter to handle missing key with // []
  • worker-token-helper: resolved TOKEN_DIR with realpath for symlink consistency in both validate and revoke

Please re-review.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 8, 2026

@johnwaldo Thanks for the thorough follow-up and for addressing the review feedback! I'll kick off a fresh incremental review now to validate all the changes across the three scripts:

  • ai-judgment-helper.sh-cn flag for compact JSON output
  • stuck-detection-helper.sh — numeric validation, tuple unique_by key, // [] fallback
  • worker-token-helper.shrealpath canonicalization for both validate and revoke

🧠 Learnings used
Learnt from: marcusquinn
Repo: marcusquinn/aidevops PR: 0
File: :0-0
Timestamp: 2026-03-03T16:33:18.313Z
Learning: For the marcusquinn/aidevops repository: The daily quality sweep system (pulse-wrapper.sh) should only mention coderabbitai for full codebase review when there are significant changes (Quality Gate failure, +10 issues in one day, etc.), not on every automated sweep with stable metrics.

Learnt from: CR
Repo: marcusquinn/aidevops PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-08T12:35:20.071Z
Learning: Consult tools/build-agent/agent-review.md for guidance on reviewing and improving agents
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.agents/scripts/stuck-detection-helper.sh (1)

149-158: ⚠️ Potential issue | 🟠 Major

Use a symlink-safe temp file for state writes.

"${state_file}.tmp.$$" is predictable, so a pre-created symlink in SUPERVISOR_DIR can redirect the write into another file. Also, mv ... 2>/dev/null still hides the real failure reason in the atomic-write path.

Suggested fix
-	local tmp_file="${state_file}.tmp.$$"
-	if ! printf '%s\n' "$state_json" >"$tmp_file"; then
+	local tmp_file
+	tmp_file=$(mktemp "$(dirname "$state_file")/$(basename "$state_file").tmp.XXXXXX") || {
+		_sd_log_warn "failed to allocate temp state file"
+		return 1
+	}
+	if ! printf '%s\n' "$state_json" >"$tmp_file"; then
 		_sd_log_warn "failed to write temp state file: $tmp_file"
 		rm -f "$tmp_file" 2>/dev/null || true
 		return 1
 	fi
-	if ! mv -f "$tmp_file" "$state_file" 2>/dev/null; then
+	if ! mv -f "$tmp_file" "$state_file"; then
 		_sd_log_warn "failed to move temp state to: $state_file"
 		rm -f "$tmp_file" 2>/dev/null || true
 		return 1
 	fi

As per coding guidelines, ".agents/scripts/*.sh: Automation scripts - focus on: Reliability and robustness" and "Clear logging and feedback".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/scripts/stuck-detection-helper.sh around lines 149 - 158, Replace
the predictable temp filename logic that sets tmp_file="${state_file}.tmp.$$"
with a symlink-safe mktemp call that creates the temp file in the same directory
as state_file (e.g., use mktemp --tmpdir="$(dirname "$state_file")" or mktemp
"$(dirname "$state_file")/tmp.XXXXXX"), and ensure the created tmp_file is
cleaned up on failures; also stop silencing mv errors—capture mv's stderr when
moving tmp_file to state_file and pass that error text to _sd_log_warn (include
the state_file and captured error), then remove tmp_file and return 1.
Reference: tmp_file, state_file, and _sd_log_warn.
🧹 Nitpick comments (2)
.agents/scripts/worker-token-helper.sh (1)

370-378: Solid arithmetic injection guard.

The regex validation ensures TTL contains only digits before any arithmetic evaluation, effectively blocking injection payloads like 1; rm -rf /.

One edge-case nitpick: bash arithmetic interprets leading zeros as octal (e.g., 010 → 8 decimal). If you want explicit decimal interpretation regardless of input format, you could use 10#$ttl in the comparison. This is quite minor since users rarely pass octal-formatted TTLs.

,

Optional: force decimal interpretation
-		if ((ttl > MAX_TTL)); then
+		if ((10#$ttl > MAX_TTL)); then
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/scripts/worker-token-helper.sh around lines 370 - 378, The
comparison against MAX_TTL can misinterpret inputs with leading zeros as octal;
update the arithmetic check that uses ttl (and subsequent assignment) to force
decimal interpretation by applying the bash base prefix (e.g., use 10#$ttl) when
evaluating ttl in the (( ttl > MAX_TTL )) test and any arithmetic context
involving ttl (refer to the ttl variable and the MAX_TTL comparison in the
worker-token-helper.sh snippet and keep the existing regex validation and
log_token calls intact).
.agents/scripts/stuck-detection-helper.sh (1)

291-295: Clarify exit code semantics: "skipped" paths should use distinct status.

The label-stuck command returns 0 in three distinct scenarios: confidence below threshold (line 295), GitHub skipped via SD_SKIP_GITHUB=true (line 317), and successful labeling (line 372). This ambiguity means future callers cannot distinguish between "checked and skipped" versus "checked and labeled"—currently no active callers exist, but integrating this helper without a separate exit code for skipped scenarios will create the misreporting risk described. Consider using exit code 2 (or similar) for the "checked but skipped" paths to keep the contract unambiguous.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/scripts/stuck-detection-helper.sh around lines 291 - 295, The
"label-stuck" script currently returns 0 for both "checked but skipped" and
"checked and labeled" outcomes; change the skipped-return semantics so callers
can distinguish them by returning a non-zero but distinct code (e.g., 2) for
skipped paths. Specifically, update the branch that logs "confidence ... below
threshold" (the block using _sd_log_info, _sd_record_milestone and currently
`return 0`) to `return 2`, and likewise update the SD_SKIP_GITHUB path (the
branch checking SD_SKIP_GITHUB=true) to return the same distinct code; keep the
successful labeling/normal completion return as 0 so success remains unchanged.
Ensure the referenced helpers (_sd_log_info, _sd_record_milestone) remain called
as before and only the return values change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.agents/scripts/stuck-detection-helper.sh:
- Around line 351-354: The gh issue comment failure handler currently calls
_sd_log_warn which exits 0, so failures are swallowed; update the handler after
the gh issue comment invocation to propagate a non-zero status (e.g., change the
failure branch to call _sd_log_warn and then exit 1 or return 1 depending on
whether this code runs in the main script or inside a function) so that the gh
issue comment failure is visible to the caller; target the gh issue comment
invocation and its failure block and ensure it returns/ exits non-zero instead
of relying on _sd_log_warn's current behavior.

---

Outside diff comments:
In @.agents/scripts/stuck-detection-helper.sh:
- Around line 149-158: Replace the predictable temp filename logic that sets
tmp_file="${state_file}.tmp.$$" with a symlink-safe mktemp call that creates the
temp file in the same directory as state_file (e.g., use mktemp
--tmpdir="$(dirname "$state_file")" or mktemp "$(dirname
"$state_file")/tmp.XXXXXX"), and ensure the created tmp_file is cleaned up on
failures; also stop silencing mv errors—capture mv's stderr when moving tmp_file
to state_file and pass that error text to _sd_log_warn (include the state_file
and captured error), then remove tmp_file and return 1. Reference: tmp_file,
state_file, and _sd_log_warn.

---

Nitpick comments:
In @.agents/scripts/stuck-detection-helper.sh:
- Around line 291-295: The "label-stuck" script currently returns 0 for both
"checked but skipped" and "checked and labeled" outcomes; change the
skipped-return semantics so callers can distinguish them by returning a non-zero
but distinct code (e.g., 2) for skipped paths. Specifically, update the branch
that logs "confidence ... below threshold" (the block using _sd_log_info,
_sd_record_milestone and currently `return 0`) to `return 2`, and likewise
update the SD_SKIP_GITHUB path (the branch checking SD_SKIP_GITHUB=true) to
return the same distinct code; keep the successful labeling/normal completion
return as 0 so success remains unchanged. Ensure the referenced helpers
(_sd_log_info, _sd_record_milestone) remain called as before and only the return
values change.

In @.agents/scripts/worker-token-helper.sh:
- Around line 370-378: The comparison against MAX_TTL can misinterpret inputs
with leading zeros as octal; update the arithmetic check that uses ttl (and
subsequent assignment) to force decimal interpretation by applying the bash base
prefix (e.g., use 10#$ttl) when evaluating ttl in the (( ttl > MAX_TTL )) test
and any arithmetic context involving ttl (refer to the ttl variable and the
MAX_TTL comparison in the worker-token-helper.sh snippet and keep the existing
regex validation and log_token calls intact).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a36c1c74-c0c7-44a6-997a-b1be80ed69b8

📥 Commits

Reviewing files that changed from the base of the PR and between 0a39ee6 and e03d196.

📒 Files selected for processing (2)
  • .agents/scripts/stuck-detection-helper.sh
  • .agents/scripts/worker-token-helper.sh

Capture gh issue comment failure status and return it after state
recording completes. State is always recorded regardless of comment
success (since the label was already applied), preventing re-labeling
on the next check cycle.
@johnwaldo
Copy link
Contributor Author

@coderabbitai review

Addressed the remaining review comment:

  • stuck-detection-helper.sh: Advisory-comment failure now propagated via comment_failed flag. State recording still happens regardless (label was already applied), preventing re-labeling on next check. Returns non-zero only if the comment failed.

All other review comments (duplicate TTL validation, indentation, duplicate --arg, jq -cn, unique_by tuple, realpath canonicalization, labeled_issues null guard) were already addressed in prior commits.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 8, 2026

@johnwaldo Thanks for the update! The comment_failed flag approach for stuck-detection-helper.sh is a clean solution — propagating the advisory-comment failure while preserving state ensures the label-was-applied record is maintained and avoids re-labeling on the next cycle. Kicking off a fresh incremental review now to validate the final fix across all three scripts.


🧠 Learnings used
Learnt from: marcusquinn
Repo: marcusquinn/aidevops PR: 0
File: :0-0
Timestamp: 2026-03-03T16:33:18.313Z
Learning: For the marcusquinn/aidevops repository: The daily quality sweep system (pulse-wrapper.sh) should only mention coderabbitai for full codebase review when there are significant changes (Quality Gate failure, +10 issues in one day, etc.), not on every automated sweep with stable metrics.

Learnt from: marcusquinn
Repo: marcusquinn/aidevops PR: 3893
File: .agents/scripts/pulse-wrapper.sh:452-455
Timestamp: 2026-03-08T16:59:32.817Z
Learning: In marcusquinn/aidevops, the `supervisor` label has been in production on all managed repos for months as a role label for health dashboard issues. The `contributor` label was added in PR `#3893` following the same convention for non-maintainer health dashboards. Renaming these labels would break existing health issues and require migration logic — label namespacing is a known deferred follow-up item, not an immediate fix requirement. Both labels are only created on repos with `pulse: true` in repos.json, which are fully framework-managed repos.
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@marcusquinn
Copy link
Owner

Dispatching worker to address review feedback.

@johnwaldo
Copy link
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 8, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@marcusquinn
Copy link
Owner

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 8, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@marcusquinn
Copy link
Owner

Worker killed after 3h34m with 2 commits (struggle_ratio: 19996).

  • Model: sonnet (anthropic/claude-sonnet-4-6)
  • Branch: bugfix/critical-batch3
  • Reason: Extreme thrashing — 39992 messages, only 2 commits. Context exhaustion loop.
  • Diagnosis: 7 unresolved bot suggestions detected. Worker likely looped on review feedback without resolving suggestions.
  • Next action: Re-dispatch to address 7 unresolved bot suggestions, then merge (PR is APPROVED + CI PASS).

@marcusquinn
Copy link
Owner

Dispatching fix worker to address 7 unresolved bot review suggestions.

  • Model: sonnet (anthropic/claude-sonnet-4-6)
  • Branch: bugfix/critical-batch3
  • Scope: Address unresolved inline suggestions from coderabbit/gemini on this PR
  • Attempt: 1 of 3

@marcusquinn
Copy link
Owner

Review Bot Suggestions — Resolution Summary

All CodeRabbit and Gemini Code Assist findings have been addressed. Here's the status of each:

Already Fixed (prior commits)

Finding File Status
Duplicate TTL validation block worker-token-helper.sh Fixed — single block remains
Indentation inconsistency ai-judgment-helper.sh Fixed — consistent single-tab
Duplicate --arg repo stuck-detection-helper.sh Fixed — single arg
unique_by(.issue + .repo) string concat stuck-detection-helper.sh Fixed — uses tuple [.issue, .repo]
jq -njq -cn for JSONL ai-judgment-helper.sh Fixed — compact output on both calls
realpath on deleted token file worker-token-helper.sh:cmd_revoke Fixed — resolves parent dir + validates basename
TOKEN_DIR symlink canonicalization worker-token-helper.sh:cmd_validate Fixed — realpath "$TOKEN_DIR" before comparison
Numeric validation for confidence/milestone/elapsed stuck-detection-helper.sh Fixed — regex guards before any side effects
Predictable temp filename (symlink attack) stuck-detection-helper.sh Fixed — uses mktemp with XXXXXX suffix
.labeled_issues null coalesce in label-clear stuck-detection-helper.sh Fixed — (.labeled_issues // [])
Octal interpretation of leading-zero TTL worker-token-helper.sh Fixed — 10#$ttl forces decimal
Exit code 2 for "checked but skipped" paths stuck-detection-helper.sh Fixed — return 2 for below-threshold and skip-github

Intentionally Not Applied

Finding Reason
return 1 in gh issue comment failure handler Current design defers return via comment_failed flag (line 375) to ensure state recording completes. Immediate return 1 would skip _sd_record_milestone and state write, causing re-labeling on next check.

All inline comments have been replied to individually.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

already-fixed Already fixed by another change bug Auto-created from TODO.md tag external-contributor PR from external contributor, requires maintainer review needs-review-fixes

Projects

None yet

2 participants