fix: prevent ShellCheck memory explosion with RSS watchdog, rate limiting, and PATH fix #3146
…ting, and PATH fix

Three defenses against the Mar 7 crash (18.5 GB shellcheck, 31 GB total):

shellcheck-wrapper.sh:
- Replace broken ulimit -v (no-op on macOS ARM: EINVAL) with background RSS watchdog that polls every 2s and kills at 1 GB RSS
- Add respawn rate limiter with exponential backoff (5s, 10s, 20s... 300s) to prevent kill-respawn-grow cycles from bash-language-server
- Add 120s hard timeout as additional safety net

memory-pressure-monitor.sh (v2.1.0):
- Lower thresholds: warn at 1 GB, kill at 2 GB (was 2/4 GB)
- Lower shellcheck runtime: 5 min (was 10 min)
- Fix zsh false positives: match MONITORED_PATTERNS against command basename only, not full command line with arguments
- Adaptive polling: 10s when shellcheck detected, 30s normal (was 60s)
- Launchd plist interval: 30s (was 60s)

setup-modules/shell-env.sh:
- Fix PATH ordering: strip existing shim_dir from PATH before prepending to guarantee ~/.aidevops/bin is FIRST, not appended at end
- Document ulimit -v macOS ARM limitation
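The watchdog and backoff described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in shellcheck-wrapper.sh: the function names `backoff_delay`, `rss_mb`, and `watch_rss` are hypothetical, and only the numbers (2s poll, 1 GB limit, 120s timeout, 5s to 300s backoff) come from the commit message.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not taken from the wrapper.
# Variables are global because POSIX sh has no `local`.

# backoff_delay KILLS: seconds to wait before respawning after the Nth kill
# (1 -> 5, 2 -> 10, 3 -> 20, ... capped at 300).
backoff_delay() {
    kills=$1
    delay=5
    i=1
    while [ "$i" -lt "$kills" ] && [ "$delay" -lt 300 ]; do
        delay=$((delay * 2))
        i=$((i + 1))
    done
    if [ "$delay" -gt 300 ]; then
        delay=300
    fi
    echo "$delay"
}

# rss_mb PID: resident set size of PID in whole MB (ps reports KB).
# Prints 0 if the process is gone or ps output is not a plain number.
rss_mb() {
    kb=$(ps -o rss= -p "$1" 2>/dev/null | head -n 1 | tr -d ' ')
    case $kb in
        ''|*[!0-9]*) kb=0 ;;
    esac
    echo $((kb / 1024))
}

# watch_rss PID LIMIT_MB INTERVAL TIMEOUT: poll until PID is gone.
# Kills PID once its RSS reaches LIMIT_MB or TIMEOUT seconds have elapsed.
# Returns 0 if PID went away on its own, 1 if the watchdog killed it.
watch_rss() {
    pid=$1 limit_mb=$2 interval=$3 timeout=$4
    elapsed=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$(rss_mb "$pid")" -ge "$limit_mb" ] || [ "$elapsed" -ge "$timeout" ]; then
            kill -KILL "$pid" 2>/dev/null
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}
```

In a wrapper, the watchdog would presumably run in the background (for example `watch_rss "$pid" 1024 2 120 &`) while the foreground shell waits on the child, so that the child is reaped promptly and `kill -0` stops succeeding once it exits.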
… fix PATH security

Address CHANGES_REQUESTED review from CodeRabbit and Gemini Code Assist:

shellcheck-wrapper.sh:
- Add _validate_int() to sanitize RSS_LIMIT_MB, WATCHDOG_INTERVAL, HARD_TIMEOUT before use in arithmetic/sleep (prevents tight loops from typos like 'abc')
- Fix race condition in _record_kill() with mkdir-based atomic lock
- Remove blanket 2>/dev/null from mkdir -p (surface permission errors)
- Add explicit return 0 to _record_kill()

setup-modules/shell-env.sh:
- Fix PATH trailing colon security issue in launchctl setenv (empty clean_path produced 'shim_dir:' which resolves '.', a PATH injection vector)
- Replace presence-only case guard with sanitize-and-prepend logic that strips existing shim_dir from PATH before prepending (fixes upgrade path where shim was appended at end, bypassing wrapper in new shells)
- Clean up stale old-form case-guard entries from .zshenv and rc files
- Update fish shell PATH line to use same sanitize-and-prepend pattern

CI failures (SonarCloud, Label PR, Monitor) are all fork permission issues (403 'Resource not accessible by integration'), which is expected for external PRs.
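For readers unfamiliar with the two wrapper fixes in this commit, here is a hedged sketch of integer validation and a mkdir-based lock. The signatures of `validate_int`, `with_lock`, and `record_kill` are assumptions made for illustration. The real `_validate_int()` and `_record_kill()` are not shown in this PR text and may differ. The minimums passed below are likewise an assumed mapping.

```shell
#!/bin/sh
# Sketch only: signatures are assumed, not copied from shellcheck-wrapper.sh.

# validate_int VALUE DEFAULT MIN: print VALUE if it is a plain non-negative
# integer >= MIN, otherwise print DEFAULT. A typo such as 'abc' therefore
# never reaches arithmetic or `sleep`.
validate_int() {
    case $1 in
        ''|*[!0-9]*) echo "$2"; return 0 ;;
    esac
    if [ "$1" -lt "$3" ]; then
        echo "$2"
    else
        echo "$1"
    fi
}

# with_lock LOCKDIR CMD...: run CMD while holding a mkdir-based lock.
# mkdir either creates the directory or fails, atomically, so exactly one
# concurrent caller acquires the lock. Gives up after 3 attempts.
with_lock() {
    _lock_dir=$1
    shift
    _lock_tries=0
    until mkdir "$_lock_dir" 2>/dev/null; do
        _lock_tries=$((_lock_tries + 1))
        if [ "$_lock_tries" -ge 3 ]; then
            return 1
        fi
        sleep 1
    done
    "$@"
    _lock_rc=$?
    rmdir "$_lock_dir"
    return "$_lock_rc"
}

# record_kill FILE: append the current epoch time to FILE (hypothetical format).
record_kill() {
    date +%s >> "$1"
    return 0
}

# Assumed defaults and minimums, applied to the env vars named in the commit.
RSS_LIMIT_MB=$(validate_int "${RSS_LIMIT_MB:-}" 1024 128)
WATCHDOG_INTERVAL=$(validate_int "${WATCHDOG_INTERVAL:-}" 2 1)
HARD_TIMEOUT=$(validate_int "${HARD_TIMEOUT:-}" 120 10)
```

A call such as `with_lock "$state_dir/kill.lock" record_kill "$state_dir/kills"` (paths hypothetical) serializes concurrent writers. This sketch does not handle stale locks left behind by a crashed holder, which a production version would need to consider.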
🔍 Code Quality Report
[MONITOR] Code Review Monitoring Report
[INFO] Latest Quality Status:
[INFO] Recent monitoring activity:
📈 Current Quality Metrics
Generated on: Sat Mar 7 18:08:50 UTC 2026
Generated by AI DevOps Framework Code Review Monitoring
@coderabbitai review

Review bots were rate-limited when this PR was created (affected: coderabbitai, gemini-code-assist). Requesting a review retry.
This PR appears orphaned — no active worker process found and no activity for 25+ hours. Flagging for re-dispatch. If work is still in progress, remove the |
Merging via pulse supervisor.

Summary
Rebased version of #3105 — all CodeRabbit and Gemini Code Assist review feedback addressed. Rebased onto current main to eliminate stale version/planning file diffs.
Fixes the recurring ShellCheck memory explosion that caused a system crash on Mar 7 (20 GB -> 48 GB+ in minutes, 18.5 GB single shellcheck process).
Root cause analysis:
- `ulimit -v` is a complete no-op on macOS ARM (setrlimit failed: invalid argument), so the wrapper's memory cap was doing nothing

Three-layer fix:
1. shellcheck-wrapper.sh — RSS watchdog + rate limiter
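- RSS watchdog replaces the broken `ulimit -v`: polls child process RSS every 2s, kills at 1 GB
- Respawn rate limiter with exponential backoff (5s, 10s, 20s... 300s) prevents kill-respawn-grow cycles from bash-language-server
- 120s hard timeout as an additional safety net
- Input validation (`_validate_int`) prevents tight loops from invalid env vars
- Atomic lock in `_record_kill` prevents race conditions from concurrent kills
- Preserves the `.real` fast-path from main for binary discovery

2. memory-pressure-monitor.sh v2.1.0 — faster detection
- Lower thresholds: warn at 1 GB, kill at 2 GB (was 2/4 GB)
- Lower shellcheck runtime limit: 5 min (was 10 min)
- Fix zsh false positives: match MONITORED_PATTERNS against command basename only, not the full command line with arguments
- Adaptive polling: 10s when shellcheck detected, 30s normal (was 60s)
- Launchd plist interval: 30s (was 60s)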
3. setup-modules/shell-env.sh — PATH ordering fix
- Strips existing `~/.aidevops/bin` from PATH before prepending, guaranteeing first position
- Applied to `.zshenv`, all rc files, and fish config
- Stale old-form `case` guard entries cleaned up on upgrade

Review feedback addressed (from #3105)
- Input validation: `_validate_int()` with min bounds (128/1/10)
- Race condition in `_record_kill`: fixed with a `mkdir`-based atomic lock
- Remove blanket `2>/dev/null` from `mkdir -p` to surface permission errors

Testing
Closes #2915
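
As an illustration of the sanitize-and-prepend PATH logic described in section 3, a minimal sketch might look like the following. The helper name `prepend_path` is hypothetical and the actual code in setup-modules/shell-env.sh is not shown here. The sketch also drops empty PATH entries, which is an assumption motivated by the trailing-colon issue mentioned in the commits.

```shell
#!/bin/sh
# Sketch only: prepend_path is a hypothetical helper, not the PR's code.

# prepend_path DIR PATHVALUE: print PATHVALUE with every occurrence of DIR
# (and every empty entry, which would resolve to '.') removed, and DIR placed
# first. No trailing colon is produced when nothing else remains.
# The body runs in a subshell so the IFS and `set -f` changes do not leak.
prepend_path() (
    dir=$1
    rest=
    IFS=:
    set -f
    for entry in $2; do
        if [ -n "$entry" ] && [ "$entry" != "$dir" ]; then
            rest="${rest:+$rest:}$entry"
        fi
    done
    if [ -n "$rest" ]; then
        printf '%s\n' "${dir}:${rest}"
    else
        printf '%s\n' "${dir}"
    fi
)
```

Typical use would be `PATH=$(prepend_path "$HOME/.aidevops/bin" "$PATH"); export PATH`. Because existing occurrences are stripped first, running it repeatedly leaves the shim directory in first position exactly once.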