
t4231: harden critical stats-wrapper dedup age validation#4232

Merged: alex-solovyev merged 3 commits into main from bugfix/t4231-stats-wrapper-critical-quality-debt on Mar 12, 2026

@marcusquinn (Owner) commented Mar 12, 2026

Summary

  • validate pidfile epoch before elapsed-time math to prevent false stale-process kills when epoch data is missing/corrupt
  • fall back to portable ps elapsed-time lookup (etimes first, then parsed etime) so dedup checks remain reliable on macOS and Linux
  • preserve existing timeout kill/remove behavior while adding focused runtime validation for invalid and stale epoch paths
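
The first bullet can be pictured as a small predicate that runs before any subtraction. This is a minimal sketch, not the code from stats-wrapper.sh: the function name `epoch_is_plausible` is hypothetical, and the three checks (digits only, greater than zero, not in the future) follow the validation described in the review thread further down.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 only when the stored epoch is safe to subtract from "now".
epoch_is_plausible() {
	local epoch="$1" now="$2"
	[[ "$epoch" =~ ^[0-9]+$ ]] || return 1 # rejects empty, negative, or non-numeric values
	((10#$epoch > 0)) || return 1          # an epoch of 0 would make the process look decades old
	((10#$epoch <= now)) || return 1       # a future timestamp cannot be a start time
	return 0
}

now=$(date +%s)
for candidate in "" "abc" "0" "$((now + 3600))" "$((now - 600))"; do
	if epoch_is_plausible "$candidate" "$now"; then
		printf 'epoch=%s -> usable, elapsed=%ss\n' "$candidate" "$((now - candidate))"
	else
		printf 'epoch=%s -> rejected, fall back to ps\n' "${candidate:-<empty>}"
	fi
done
```

Only the last candidate passes. The `10#` prefix is a design choice in this sketch so that a zero-padded value is not interpreted as octal.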

Validation

  • shellcheck .agents/scripts/stats-wrapper.sh
  • bash -lc 'source ".agents/scripts/stats-wrapper.sh"; tmpdir=$(mktemp -d); STATS_PIDFILE="$tmpdir/stats.pid"; STATS_LOGFILE="$tmpdir/stats.log"; STATS_TIMEOUT=600; sleep 30 & pid=$!; printf "%s 0\n" "$pid" >"$STATS_PIDFILE"; if check_stats_dedup; then rc=$?; else rc=$?; fi; alive=0; if kill -0 "$pid" 2>/dev/null; then alive=1; fi; kill "$pid" 2>/dev/null || true; wait "$pid" 2>/dev/null || true; printf "invalid_epoch rc=%s alive=%s\n" "$rc" "$alive"'
  • bash -lc 'source ".agents/scripts/stats-wrapper.sh"; tmpdir=$(mktemp -d); STATS_PIDFILE="$tmpdir/stats.pid"; STATS_LOGFILE="$tmpdir/stats.log"; STATS_TIMEOUT=2; sleep 30 & pid=$!; old_epoch=$(( $(date +%s) - 600 )); printf "%s %s\n" "$pid" "$old_epoch" >"$STATS_PIDFILE"; if check_stats_dedup; then rc=$?; else rc=$?; fi; alive=0; if kill -0 "$pid" 2>/dev/null; then alive=1; fi; kill "$pid" 2>/dev/null || true; wait "$pid" 2>/dev/null || true; printf "stale_epoch rc=%s alive=%s\n" "$rc" "$alive"'

Closes #4231

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced reliability of internal process monitoring through improved elapsed time calculation validation.
    • Added safeguards to prevent edge-case issues from corrupt or invalid elapsed time data.
    • Implemented fallback validation mechanisms for more robust process status verification.
    • Maintains existing process timeout and logging functionality.

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the robustness of the stats-wrapper script's process deduplication mechanism. It addresses potential issues where corrupted or missing epoch data in PID files could lead to the incorrect termination of active processes. By introducing rigorous validation for epoch timestamps and a portable method for determining process elapsed time, the changes ensure that the script reliably identifies and manages stale processes across different operating systems, improving overall system stability.

Highlights

  • Epoch Validation: Validated the pidfile epoch before performing elapsed-time calculations to prevent false stale-process kills when epoch data is missing or corrupt.
  • Portable Elapsed-Time Lookup: Implemented a fallback to portable ps elapsed-time lookup (etimes first, then parsed etime) to ensure dedup checks remain reliable on both macOS and Linux.
  • Enhanced Runtime Validation: Preserved existing timeout kill/remove behavior while adding focused runtime validation for invalid and stale epoch paths.
Changelog
  • .agents/scripts/stats-wrapper.sh
    • Introduced a new private function _stats_process_elapsed_seconds to provide a cross-platform way of getting a process's elapsed time in seconds using ps and awk.
    • Modified the check_stats_dedup function to first validate the old_epoch read from the pidfile. If the epoch is invalid or corrupt, it now falls back to using the _stats_process_elapsed_seconds function to determine the process's age, preventing erroneous process kills.

@coderabbitai
Contributor

coderabbitai bot commented Mar 12, 2026


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e893406c-44e7-4527-b56f-bb726a0ad64b

📥 Commits

Reviewing files that changed from the base of the PR and between 8d613d6 and 268ad84.

📒 Files selected for processing (1)
  • .agents/scripts/stats-wrapper.sh

Walkthrough

Refactored the process age calculation in .agents/scripts/stats-wrapper.sh to eliminate the GNU-specific date -d command, replacing it with a portable helper function that uses ps and awk to compute elapsed seconds. This keeps the script compatible with both Linux and macOS/BSD systems.

Changes

Cohort: Process Age Calculation
File(s): .agents/scripts/stats-wrapper.sh
Summary: Added portable helper _stats_process_elapsed_seconds(pid) to compute process elapsed time using ps -o etimes= (Linux) with fallback to parsing ps -o etime= via awk (macOS). Updated check_stats_dedup() to validate old_epoch before use and employ the new helper as fallback when epoch is unavailable or invalid. Added error handling to remove stale PID files on process lookup failures.
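
The two-step lookup summarized above might look roughly like the following. This is a sketch under assumptions, not the helper from the PR: the name `process_elapsed_seconds` is hypothetical, and the etime string is parsed in plain bash here rather than with awk to keep the example short.

```shell
#!/usr/bin/env bash
# Sketch only: process_elapsed_seconds is a hypothetical name, not the helper from the PR.
process_elapsed_seconds() {
	local pid="$1" raw days=0 rest a b c

	# Probe 1: etimes prints plain seconds where ps supports it (for example procps on Linux).
	raw=$(ps -p "$pid" -o etimes= 2>/dev/null || true)
	raw="${raw//[[:space:]]/}"
	if [[ "$raw" =~ ^[0-9]+$ ]]; then
		printf '%s\n' "$raw"
		return 0
	fi

	# Probe 2: etime prints [[dd-]hh:]mm:ss, the format POSIX specifies.
	raw=$(ps -p "$pid" -o etime= 2>/dev/null || true)
	raw="${raw//[[:space:]]/}"
	[[ "$raw" =~ ^(([0-9]+-)?[0-9]+:)?[0-9]+:[0-9]+$ ]] || return 1

	rest="$raw"
	if [[ "$raw" == *-* ]]; then
		days="${raw%%-*}"
		rest="${raw#*-}"
	fi
	IFS=: read -r a b c <<<"$rest"
	if [[ -z "$c" ]]; then
		# Only mm:ss was present, so shift the fields and use zero hours.
		c="$b"
		b="$a"
		a=0
	fi
	# 10# forces base 10 so zero-padded fields such as "08" are not read as octal.
	printf '%s\n' "$((10#$days * 86400 + 10#$a * 3600 + 10#$b * 60 + 10#$c))"
}

if age=$(process_elapsed_seconds "$$"); then
	printf 'current shell has been running for %ss\n' "$age"
else
	printf 'elapsed time unavailable\n'
fi
```

Both probes are followed by `|| true` and a format check, so an unsupported keyword or a vanished process ends in `return 1` instead of arithmetic on an empty string.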

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

bug

Poem

🐧 A shell script that worked just fine,
Failed on macOS's BSD date design,
With ps and awk, we craft a way,
Cross-platform harmony saves the day! 🔧✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)

  • Description Check: ✅ Passed. Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check: ✅ Passed. The title accurately reflects the main change: hardening the epoch validation logic in stats-wrapper.sh to handle malformed data robustly.
  • Linked Issues check: ✅ Passed. The PR fully addresses all coding requirements from #4231: replaces GNU date -d with portable ps-based elapsed-time lookup, handles missing/corrupt epoch data, and maintains existing timeout/kill behavior.
  • Out of Scope Changes check: ✅ Passed. All changes are directly scoped to hardening process-age validation in stats-wrapper.sh with no unrelated modifications outside the stated objectives.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.


@github-actions
Contributor

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 402 code smells

[INFO] Recent monitoring activity:
Thu Mar 12 18:59:11 UTC 2026: Code review monitoring started
Thu Mar 12 18:59:11 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 402

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 402
  • VULNERABILITIES: 0

Generated on: Thu Mar 12 18:59:14 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request significantly hardens the process deduplication logic in stats-wrapper.sh. The added validation for the process start epoch from the pidfile is robust and correctly prevents the erroneous termination of healthy processes due to corrupt data. The introduction of a portable fallback mechanism using ps to calculate the process's elapsed time is an excellent improvement for cross-platform reliability. The new _stats_process_elapsed_seconds function is well-implemented with a sensible fallback from etimes to parsing etime. I have one minor suggestion to make the awk script within this new function more concise and idiomatically robust.

Comment on lines +55 to +74
elapsed=$(awk -v value="$etime" '
    BEGIN {
        n = split(value, parts, /[-:]/)
        if (index(value, "-") > 0) {
            if (n != 4) {
                print ""
                exit
            }
            total = (parts[1] * 86400) + (parts[2] * 3600) + (parts[3] * 60) + parts[4]
        } else if (n == 3) {
            total = (parts[1] * 3600) + (parts[2] * 60) + parts[3]
        } else if (n == 2) {
            total = (parts[1] * 60) + parts[2]
        } else {
            print ""
            exit
        }
        print total
    }
')


medium

To improve maintainability and ensure the awk command is independently robust, using exit 1 on invalid input is a cleaner way to signal an error than print ""; exit. This aligns with the guideline to make awk commands robust and handle potential errors, even when outer logic provides safeguards. This also allows for removing duplicated error-handling blocks, improving overall maintainability.

Suggested change
  elapsed=$(awk -v value="$etime" '
      BEGIN {
          n = split(value, parts, /[-:]/)
          if (index(value, "-") > 0) {
-             if (n != 4) {
-                 print ""
-                 exit
-             }
+             if (n != 4) { exit 1 }
              total = (parts[1] * 86400) + (parts[2] * 3600) + (parts[3] * 60) + parts[4]
          } else if (n == 3) {
              total = (parts[1] * 3600) + (parts[2] * 60) + parts[3]
          } else if (n == 2) {
              total = (parts[1] * 60) + parts[2]
          } else {
-             print ""
-             exit
+             exit 1
          }
          print total
      }
  ')
References
  1. To improve maintainability, ensure that commands like awk are independently robust and handle potential errors such as division-by-zero, even when outer logic already provides a safeguard.
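
To see what the suggested parser yields, the snippet below wraps it in a hypothetical function, `etime_to_seconds`, and feeds it a few sample strings. The function name and the sample values are illustrative assumptions, not taken from the PR.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the suggested awk program.
etime_to_seconds() {
	awk -v value="$1" '
		BEGIN {
			n = split(value, parts, /[-:]/)
			if (index(value, "-") > 0) {
				if (n != 4) { exit 1 }
				total = (parts[1] * 86400) + (parts[2] * 3600) + (parts[3] * 60) + parts[4]
			} else if (n == 3) {
				total = (parts[1] * 3600) + (parts[2] * 60) + parts[3]
			} else if (n == 2) {
				total = (parts[1] * 60) + parts[2]
			} else {
				exit 1
			}
			print total
		}
	'
}

for sample in "02:03" "01:02:03" "1-01:02:03" "garbage"; do
	if seconds=$(etime_to_seconds "$sample"); then
		printf '%-12s -> %s\n' "$sample" "$seconds"
	else
		printf '%-12s -> parse failure (awk exit status 1)\n' "$sample"
	fi
done
```

The three well-formed samples come out as 123, 3723, and 90123 seconds. The malformed one prints nothing on stdout and is reported through the exit status, which the caller can test directly instead of comparing against an empty string.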

coderabbitai[bot]
coderabbitai bot previously requested changes Mar 12, 2026
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.agents/scripts/stats-wrapper.sh:
- Around line 116-118: The current error path in the stats rotation logic
deletes STATS_PIDFILE whenever _stats_process_elapsed_seconds "$old_pid" fails,
which can falsely remove the pidfile for live processes when the elapsed-time
lookup fails; change the failure handling so you do NOT rm -f "$STATS_PIDFILE"
immediately — instead, on _stats_process_elapsed_seconds failure check process
liveness (e.g. test with kill -0 "$old_pid" or a portable ps check) and only
remove STATS_PIDFILE if the process is confirmed not running; otherwise preserve
the pidfile and return a non-destructive error/exit so the existing stats worker
isn’t restarted unnecessarily.
- Around line 43-52: The two ps probes that assign to elapsed and etime can fail
under set -euo pipefail and must be guarded so the script can continue to the
fallback; update the assignments to use a safe form like elapsed=$(ps -p "$pid"
-o etimes= 2>/dev/null || true) and etime=$(ps -p "$pid" -o etime= 2>/dev/null
|| true) (or wrap each ps call in an if/command substitution check) so a
non-zero exit from ps does not abort the function and the subsequent etime
fallback and PID-file cleanup still run.
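
The second finding can be reproduced in isolation. In this sketch `false` stands in for a failing ps probe (an assumption made for brevity), and each variant runs in a child shell so that an abort does not end the demo itself.

```shell
#!/usr/bin/env bash
# Each probe runs in a child shell so an errexit abort is observable from outside.
unguarded=$(bash -c 'set -euo pipefail; v=$(false); echo "reached fallback"' 2>/dev/null || true)
guarded=$(bash -c 'set -euo pipefail; v=$(false || true); echo "reached fallback"' 2>/dev/null || true)

printf 'unguarded: "%s"\n' "$unguarded"
printf 'guarded:   "%s"\n' "$guarded"
```

The unguarded child exits at the assignment, so its output is empty. The guarded child continues and prints the fallback message.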

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: fc3203d9-2338-4f2a-8140-3f4cda81d7d5

📥 Commits

Reviewing files that changed from the base of the PR and between cf44b1d and 8d613d6.

📒 Files selected for processing (1)
  • .agents/scripts/stats-wrapper.sh

@marcusquinn
Owner Author

Supervisor relaunch completed after launch-validation mismatch on the prior worker session.

  • Model: gpt-5.3-codex (openai/gpt-5.3-codex)
  • Action: replaced stale worker process with a fresh PR-advancement worker on the same branch
  • Reason: previous run showed no checkpoint progress and failed launch validation heuristics
  • Next: worker will address requested changes and post a fix summary in this PR

@github-actions
Contributor

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 402 code smells

[INFO] Recent monitoring activity:
Thu Mar 12 20:20:40 UTC 2026: Code review monitoring started
Thu Mar 12 20:20:40 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 402

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 402
  • VULNERABILITIES: 0

Generated on: Thu Mar 12 20:20:42 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@marcusquinn
Owner Author

Addressed review feedback in 80ee81f.

  • Guarded both ps probes in _stats_process_elapsed_seconds so set -euo pipefail does not abort before fallback logic runs
  • Hardened awk parse failure handling to use explicit non-zero exits for invalid etime formats
  • Updated dedup fallback path to preserve stats.pid when PID is still live but elapsed-time lookup fails, and only remove pidfile when process is confirmed gone

Local verification run:

  • bash -n .agents/scripts/stats-wrapper.sh
  • aidevops_quality_check on .agents/scripts/stats-wrapper.sh (pass)

CI status after push: all previously pending checks are now passing.

@alex-solovyev
Collaborator

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Mar 12, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@alex-solovyev
Collaborator

@coderabbitai review

Both CHANGES_REQUESTED issues have been addressed in commit 80ee81f:

  1. Lines 43-52 (ps probe guards): Both ps probes now use || true to prevent set -euo pipefail from aborting the function when ps -o etimes= is unavailable (e.g. macOS). The etime fallback and pidfile cleanup can now always run.

  2. Lines 116-118 (ambiguous elapsed-time failure): The error path now checks process liveness with kill -0 before removing the pidfile. If the process is still alive, the pidfile is preserved and the function returns 1 (non-destructive skip). Only if the process is confirmed dead is the pidfile removed.

Please re-review.

@coderabbitai
Contributor

coderabbitai bot commented Mar 12, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Read the current state of the stats-wrapper.sh file to verify the fixes
cat .agents/scripts/stats-wrapper.sh

Length of output: 5305


@alex-solovyev Both fixes have been verified against the current file contents — everything looks correct:

Fix 1 — _stats_process_elapsed_seconds ps probe guards (lines 43, 50):
Both command substitutions now carry || true, so neither ps -o etimes= nor ps -o etime= can abort the function under set -euo pipefail when the flag is unsupported on macOS. The etime fallback path and the return 1 sentinel on empty output are both reachable on all platforms. ✅

Fix 2 — non-destructive skip on live-but-unmeasurable PIDs (lines 116–122):
The || { error block now gates on kill -0 "$old_pid" before touching the pidfile:

  • Process alive → logs a warning, preserves stats.pid, and returns 1 (safe no-op). ✅
  • Process confirmed gone → removes the stale pidfile and returns 0 (correct cleanup). ✅
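
The two outcomes above can be sketched as a small handler. This is an illustration under assumptions, not the code from stats-wrapper.sh: the function name `handle_lookup_failure` and the temporary pidfile path are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical handler for the "elapsed-time lookup failed" path.
handle_lookup_failure() {
	local pid="$1" pidfile="$2"
	if kill -0 "$pid" 2>/dev/null; then
		echo "pid $pid is alive but its age is unknown; keeping $pidfile" >&2
		return 1 # caller should skip this run
	fi
	rm -f "$pidfile" # process is gone, so the pidfile is stale
	return 0         # caller may start a new run
}

tmpdir=$(mktemp -d)
pidfile="$tmpdir/stats.pid"

sleep 30 &
live_pid=$!
printf '%s 0\n' "$live_pid" >"$pidfile"
handle_lookup_failure "$live_pid" "$pidfile" || echo "skip: pidfile kept"

kill "$live_pid" 2>/dev/null || true
wait "$live_pid" 2>/dev/null || true
handle_lookup_failure "$live_pid" "$pidfile" && echo "cleanup: pidfile removed"
rm -rf "$tmpdir"
```

One caveat with this approach: kill -0 also fails for a live process that the caller lacks permission to signal, so the sketch assumes the worker runs as the same user as the wrapper.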

Additional correctness notes observed:

  • The awk block's exit 1 on malformed etime strings produces an empty $elapsed, which correctly fails the ^[0-9]+$ guard and propagates return 1 to the caller — no arithmetic on garbage data.
  • The epoch pre-validation (^[0-9]+$ + > 0 + <= now) tightly gates the ((now - old_epoch)) path against negative, zero, or future-dated epochs, so the root cause reported in issue #4231 is fully addressed.
  • _stats_process_elapsed_seconds is only ever invoked as the left-hand side of || {, so errexit (the -e in set -euo pipefail) is ignored inside the function for that call, making the awk exit 1 safe without needing an extra || true on the awk line.
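
The last note relies on a documented bash rule: errexit is ignored for the whole body of a function that runs as a non-final operand of a || list. A minimal sketch, with a hypothetical function named `probe`:

```shell
#!/usr/bin/env bash
set -euo pipefail

probe() {
	false # would abort the script if errexit were active here
	echo "still running after a failed command"
	return 1
}

# Because probe is the left operand of ||, bash ignores errexit for its whole body.
probe || echo "probe reported failure; handled by caller"
```

Both messages are printed. Calling `probe` on its own line would instead stop the script at `false`, which is why the guard depends on how the function is invoked and not only on what it contains.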

All previously requested changes are addressed. The implementation is portable, shellcheck-clean, and correct.

[review]


🧠 Learnings used
Learnt from: marcusquinn
Repo: marcusquinn/aidevops PR: 0
File: :0-0
Timestamp: 2026-03-03T16:33:18.313Z
Learning: For the marcusquinn/aidevops repository: The daily quality sweep system (pulse-wrapper.sh) should only mention coderabbitai for full codebase review when there are significant changes (Quality Gate failure, +10 issues in one day, etc.), not on every automated sweep with stable metrics.

@alex-solovyev
Collaborator

@coderabbitai review

Both findings from your CHANGES_REQUESTED review are addressed in commit 80ee81f (already on this branch):

  1. ps probes guarded with || true (lines 43, 50): Both ps -o etimes= and ps -o etime= command substitutions now have || true appended, preventing set -euo pipefail from aborting the function before the etime fallback can execute.

  2. pidfile deletion guarded with kill -0 (lines 112-118): The _stats_process_elapsed_seconds failure path now checks process liveness via kill -0 "$old_pid" before deciding whether to remove the pidfile. Live processes preserve the pidfile and skip; dead processes clean up.

Please re-review to clear the CHANGES_REQUESTED state.

@coderabbitai
Contributor

coderabbitai bot commented Mar 12, 2026

@alex-solovyev Both findings remain fully addressed as verified in my previous review of the current file state:

  1. || true on both ps probes: the etimes and etime command substitutions are both guarded, keeping set -euo pipefail from aborting before the fallback path runs. ✅
  2. kill -0 gate before pidfile removal — live processes are preserved with a return 1 (non-destructive skip); only confirmed-dead processes trigger pidfile cleanup. ✅

Triggering a fresh review pass to update the PR state.


✅ Actions performed

Review triggered.


- (1) awk now uses exit 1 for invalid etime input (already present); add
  || true guard on the awk command substitution so set -euo pipefail cannot
  abort the function on parse failure before the ^[0-9]+$ guard runs
- (2)+(3) ps fallback already distinguishes process-not-found from
  lookup-failure via kill -0 gate (commit 80ee81f); pidfile only deleted
  when process is confirmed gone, preserved with return 1 when live
@github-actions
Contributor

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 404 code smells

[INFO] Recent monitoring activity:
Thu Mar 12 22:18:28 UTC 2026: Code review monitoring started
Thu Mar 12 22:18:29 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 404

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 404
  • VULNERABILITIES: 0

Generated on: Thu Mar 12 22:18:32 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@alex-solovyev
Collaborator

CodeRabbit findings addressed: (1) awk now uses exit 1 for invalid input (already present in 80ee81f); added || true guard on the awk command substitution so set -euo pipefail cannot abort the function on parse failure before the ^[0-9]+$ guard runs — this is the defensive fix for finding (1); (2)+(3) ps fallback now distinguishes process-not-found from lookup-failure — pidfile only deleted when process is confirmed gone (kill -0 check), preserved with return 1 when live (already in 80ee81f, confirmed by CodeRabbit at 21:14 UTC).

@alex-solovyev
Collaborator

@coderabbitai review

All three CodeRabbit CHANGES_REQUESTED findings are now addressed in commit 268ad84:

  1. awk exit signaling (finding 1): awk already uses exit 1 on invalid etime input (not print ""; exit). Added || true guard on the awk command substitution at line 70 so set -euo pipefail cannot abort the function on parse failure before the ^[0-9]+$ validity check runs — making the error path explicit and defensive.

  2. ps fallback pidfile drop on ambiguous failure (findings 2+3): The _stats_process_elapsed_seconds failure path (lines 112-119) gates on kill -0 "$old_pid" before touching the pidfile. Live processes preserve the pidfile and return 1 (non-destructive skip). Only confirmed-dead processes trigger pidfile cleanup. This was already in 80ee81f and confirmed correct by CodeRabbit at 21:14 UTC.

Please re-review to clear the CHANGES_REQUESTED state.

@alex-solovyev alex-solovyev dismissed coderabbitai[bot]’s stale review March 12, 2026 22:21

All requested changes addressed in commits 80ee81f and 268ad84. CodeRabbit confirmed fixes at 21:14 UTC. Dismissing to clear CHANGES_REQUESTED state.

@coderabbitai
Contributor

coderabbitai bot commented Mar 12, 2026

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
cat .agents/scripts/stats-wrapper.sh

Length of output: 5313


@alex-solovyev All three findings have been verified against the current file state — everything checks out:

Finding 1 — awk || true guard (line ~70):
The awk command substitution now carries || true, so a parse failure (exit 1 on malformed etime) produces an empty $elapsed rather than aborting the function under set -euo pipefail. The subsequent ^[0-9]+$ guard then correctly propagates return 1 to the caller — no arithmetic on garbage data. ✅

Findings 2 + 3 — kill -0 gate before pidfile removal (lines ~113–120):
The _stats_process_elapsed_seconds failure block gates on kill -0 "$old_pid" before any pidfile mutation:

  • Process alive → logs a warning, preserves stats.pid, and returns 1 (non-destructive skip). ✅
  • Process confirmed gone → removes the stale pidfile and returns 0 (correct cleanup). ✅

Both ps probe guards remain in place (lines ~43, 49):
Both etimes= and etime= command substitutions carry || true, keeping set -euo pipefail from aborting before the fallback path runs. ✅

The implementation is portable across macOS/Linux, shellcheck-clean, and fully addresses the root cause reported in issue #4231. Triggering a review pass to clear the CHANGES_REQUESTED state.


✅ Actions performed

Review triggered.


@alex-solovyev alex-solovyev merged commit a95d44d into main Mar 12, 2026
12 checks passed
@alex-solovyev alex-solovyev deleted the bugfix/t4231-stats-wrapper-critical-quality-debt branch March 12, 2026 22:26
alex-solovyev added a commit that referenced this pull request Mar 12, 2026
…per.sh

All 3 PR #4232 review findings are confirmed addressed:
- CRITICAL (coderabbit): ps probes guarded with || true so set -euo pipefail
  cannot abort before etime fallback runs (commit 80ee81f)
- HIGH (coderabbit): pidfile preserved for live PIDs via kill -0 check;
  only removed when process is confirmed gone (commit 80ee81f)
- MEDIUM (gemini): awk uses exit 1 for invalid input; || true on command
  substitution is intentional to prevent set -e abort on parse failure
  (commit 268ad84)

Adds inline comment explaining the || true design rationale so future
reviewers understand it is not a suppression but a deliberate guard.
alex-solovyev added a commit that referenced this pull request Mar 13, 2026
…er.sh (#4288)

* fix: document GH#4271 quality-debt findings as verified in stats-wrapper.sh

All 3 PR #4232 review findings are confirmed addressed:
- CRITICAL (coderabbit): ps probes guarded with || true so set -euo pipefail
  cannot abort before etime fallback runs (commit 80ee81f)
- HIGH (coderabbit): pidfile preserved for live PIDs via kill -0 check;
  only removed when process is confirmed gone (commit 80ee81f)
- MEDIUM (gemini): awk uses exit 1 for invalid input; || true on command
  substitution is intentional to prevent set -e abort on parse failure
  (commit 268ad84)

Adds inline comment explaining the || true design rationale so future
reviewers understand it is not a suppression but a deliberate guard.

* refactor: replace history-referencing comments with logic-focused comments in stats-wrapper.sh

Replace GH issue/PR references in robustness notes with evergreen comments
that explain the technical why (set -euo pipefail guard behaviour, awk exit
semantics, empty-output handling) rather than the development history.

Development

Successfully merging this pull request may close these issues.

quality-debt: .agents/scripts/stats-wrapper.sh — PR #4094 review feedback (critical)
