t1425, t1426, t1427: Fix pulse-wrapper zombie accumulation and blocking #4020

alex-solovyev merged 3 commits into main from
Conversation
Fix pulse-wrapper zombie accumulation and blocking (t1425, t1426, t1427)

- Write PID file immediately after `check_dedup()` to close the race window where cron spawns parallel instances during the setup phase (t1425)
- Skip person-stats calls when the GitHub Search API rate limit is < 10, avoiding 56s+ blocking waits that stall the pulse cycle (t1426)
- Guard empty `sqlite3 -json` output before piping to Python, preventing `JSONDecodeError` tracebacks in pulse logs (t1427)

Closes #4015, closes #4016, closes #4009
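The t1427 guard can be sketched as follows. This is an illustrative reproduction, not the exact script: the variable name `query_result` matches the reviewed diff, and the zero-row case is simulated directly since `sqlite3 -json` emits nothing (not `[]`) when a query returns no rows.

```shell
#!/usr/bin/env bash
# Illustrative sketch of the t1427 fix. sqlite3 -json prints an empty string
# (not "[]") when a query returns no rows; json.loads("") then raises
# JSONDecodeError downstream. Here the zero-row case is simulated directly.
query_result=""   # what sqlite3 -json emits for zero rows

# Coerce empty or non-array output to a valid empty JSON array.
if [[ -z "$query_result" || "${query_result:0:1}" != "[" ]]; then
    query_result="[]"
fi

# Now safe to pipe to Python: prints the row count (0) instead of a traceback.
printf '%s' "$query_result" |
    python3 -c 'import json, sys; print(len(json.load(sys.stdin)))'
```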
Walkthrough

Adds a `sqlite3 -json` output guard in contributor-activity-helper.sh to coerce empty/non-array results to `[]`.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Pulse as Pulse Wrapper
    participant CacheFS as Cache (filesystem)
    participant Helper as Contributor Activity Helper
    participant GH as GitHub Search API
    rect rgba(200,220,255,0.5)
        Pulse->>CacheFS: check PERSON_STATS_LAST_RUN
        alt cache stale
            Pulse->>Helper: _refresh_person_stats_cache()
            Helper->>GH: fetch per-person stats (search API)
            GH-->>Helper: return person stats
            Helper->>CacheFS: write person-stats cache files
        else cache fresh
            Pulse->>CacheFS: read person-stats cache files
        end
        Pulse->>Helper: run per-repo health flow with cached person-stats
        Helper-->>Pulse: per-repo & cross-repo markdown
    end
```
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.agents/scripts/pulse-wrapper.sh:
- Around line 1957-1964: The current gate uses a fixed ">=10" check which
doesn't account for the dynamic cost of person_stats() (it consumes ~4 Search
API calls per login) so the blocking helper can still sleep on repos with 3+
contributors; update the logic to either (A) compute an estimated required
budget before invoking the helper by querying the expected number of logins and
comparing search_remaining to (estimated_logins * 4) or (B) move the skip
decision into the helper functions person_stats() / cross_repo_person_stats() so
they can count actual logins and decide to skip or sleep themselves; modify the
call site that sets person_stats_md and the activity_helper invocation
accordingly so the skip decision uses the estimated_cost or the helper-internal
check rather than the hardcoded ">=10" threshold.
Code Review
This pull request introduces several fixes to improve the robustness of the pulse-wrapper and contributor-activity-helper scripts. The changes address a race condition by writing a PID file earlier, prevent blocking waits by checking the GitHub API rate limit before expensive operations, and handle empty JSON output from sqlite3 to avoid errors.
My review focuses on improving code maintainability and aligning with established repository rules. I've suggested simplifying a conditional check in contributor-activity-helper.sh to ensure valid JSON output and refactoring duplicated code in pulse-wrapper.sh into a helper function to improve readability and reduce redundancy. Overall, the changes are solid and address important stability issues.
Note: Security Review did not run due to the size of the PR.
```bash
if [[ -z "$query_result" || "${query_result:0:1}" != "[" ]]; then
    query_result="[]"
fi
```
The condition that checks whether `query_result` is a valid JSON array can be simplified. The current check `[[ -z "$query_result" || "${query_result:0:1}" != "[" ]]` is a bit verbose. A simpler and more robust way to check whether the string starts with `[` is a glob pattern match: `[[ "$query_result" != "["* ]]`. This single check correctly handles both an empty string and a string that doesn't start with `[`, making the code more concise. It also aligns with the practice that shell functions expected to return JSON should output a valid empty JSON array (`[]`) on no results, so consumers like `jq` do not break.
```diff
-if [[ -z "$query_result" || "${query_result:0:1}" != "[" ]]; then
+if [[ "$query_result" != "["* ]]; then
     query_result="[]"
 fi
```
References
- When a shell function is expected to return JSON, it should output a valid empty JSON array (`[]`) or object (`{}`) on no results, rather than returning a non-zero exit code or empty string, to ensure consumers of the output (like pipes to `jq`) do not break.
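As a quick illustration (not code from the PR), the single glob test covers both failure modes that the two-clause check handled; the helper name below is hypothetical:

```shell
#!/usr/bin/env bash
# Demonstrates the suggested simplification: [[ "$s" != "["* ]] is true for an
# empty string and for any string that does not start with "[", so one glob
# test replaces the two-clause -z / first-character check.
coerce_json_array() {
    local s="$1"
    if [[ "$s" != "["* ]]; then
        s="[]"
    fi
    printf '%s\n' "$s"
}

coerce_json_array ""             # prints []
coerce_json_array "Error: ..."   # prints []
coerce_json_array '[{"id":1}]'   # prints [{"id":1}]
```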
Applied in follow-up PR. Simplified to `[[ "$query_result" != "["* ]]` as suggested -- handles both empty string and non-array in a single glob match.
`.agents/scripts/pulse-wrapper.sh` (outdated)
```bash
local search_remaining
search_remaining=$(gh api rate_limit --jq '.resources.search.remaining' 2>/dev/null) || search_remaining=0
if [[ "$search_remaining" -ge 10 ]]; then
    person_stats_md=$(bash "$activity_helper" person-stats "$repo_path" --period month --format markdown || echo "_Person stats unavailable._")
else
    person_stats_md="_Person stats skipped (GitHub Search API rate limit: ${search_remaining} remaining)._"
fi
```
This logic for checking the GitHub search API rate limit is duplicated in the update_health_issues function (lines 2190-2197). To avoid code duplication and centralize the API call, consider creating a helper function. This aligns with the practice of extracting repeated logic into an internal helper function to improve maintainability.
For example, you could define a function like this:
```bash
get_search_rate_limit_remaining() {
    local remaining
    remaining=$(gh api rate_limit --jq '.resources.search.remaining' 2>/dev/null) || remaining=0
    echo "$remaining"
}
```
Then, you can use it in both places like this:
```bash
local search_remaining
search_remaining=$(get_search_rate_limit_remaining)
if [[ "$search_remaining" -ge 10 ]]; then
    # ...
else
    # ...
fi
```
This would make the code more maintainable by adhering to the DRY (Don't Repeat Yourself) principle.
References
- In shell scripts, extract repeated logic into an internal helper function to improve maintainability. This applies even for standalone scripts where external `source` dependencies are avoided.
Not applying. The code was refactored in c044f23 -- the original duplicate at the call site no longer exists (replaced with cache reads at lines 1964-1970). The remaining rate limit checks in `_refresh_person_stats_cache()` are in different contexts (pre-flight, per-repo bail-out, cross-repo gate) with different comparison values. Extracting a helper for a one-liner `gh api rate_limit --jq ...` adds indirection without meaningful DRY benefit.
Add _refresh_person_stats_cache() with 1-hour timestamp guard. Per-repo and cross-repo person-stats written to cache files, read by health issue updates each pulse. Replaces rate limit guards — 44 Search API calls/hr instead of ~1320/hr.
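A minimal sketch of the timestamp guard this commit describes; the paths, stamp-file name, and function body below are illustrative assumptions, not the actual pulse-wrapper code:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a 1-hour refresh guard like the one the commit
# describes for _refresh_person_stats_cache(). Names are illustrative.
CACHE_DIR="${PERSON_STATS_CACHE_DIR:-/tmp/person-stats-cache}"
STAMP_FILE="$CACHE_DIR/PERSON_STATS_LAST_RUN"
mkdir -p "$CACHE_DIR"

refresh_person_stats_cache() {
    local now last
    now=$(date +%s)
    last=0
    [[ -f "$STAMP_FILE" ]] && last=$(<"$STAMP_FILE")

    if (( now - last < 3600 )); then
        echo "cache fresh; skipping refresh"
        return 0
    fi

    # The expensive GitHub Search API calls would happen here; the stamp is
    # written only after success, so a failed refresh retries next pulse.
    echo "refreshing person-stats cache"
    printf '%s' "$now" > "$STAMP_FILE"
}

refresh_person_stats_cache   # first call: refreshes and writes the stamp
refresh_person_stats_cache   # within the hour: skips
```

Reading the cache files each pulse then costs no API budget at all, which is how the hourly ceiling (44 calls/hr instead of ~1320/hr) falls out.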
Dispatching worker to address coderabbit CHANGES_REQUESTED.
🧹 Nitpick comments (1)
.agents/scripts/pulse-wrapper.sh (1)
1963-1971: Minor: `slug_safe` is already defined earlier in this function.

The variable `slug_safe` is computed identically at line 1663 for the cache file path. This redefinition is harmless but redundant.

♻️ Remove redundant variable declaration

```diff
 fi
 # t1426: person-stats from hourly cache (see _refresh_person_stats_cache)
-local slug_safe="${repo_slug//\//-}"
 local ps_cache="${PERSON_STATS_CACHE_DIR}/person-stats-cache-${slug_safe}.md"
 if [[ -f "$ps_cache" ]]; then
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In @.agents/scripts/pulse-wrapper.sh around lines 1963-1971: the `slug_safe` variable is redundantly redefined before computing `ps_cache`; remove the repeated declaration and reuse the `slug_safe` already set earlier in the function, so `ps_cache` is built via `PERSON_STATS_CACHE_DIR` and the previously defined `slug_safe` only once.
The fixed >=10 threshold didn't account for the dynamic cost of person_stats() (~4 Search API calls per contributor per repo). With 3+ contributors, the helper's internal sleep branch could still stall the pulse. Replace the hardcoded threshold with cost-aware gating:

- Estimate minimum budget (repos × 4 queries) before starting
- Re-check remaining budget before each repo, bail early if exhausted
- Gate cross-repo stats on remaining budget too

Addresses CodeRabbit review suggestion on PR #4020.
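Such a cost-aware gate might look like the sketch below. The repo list, helper name, and fallback behavior are assumptions for illustration, not the PR's exact code; only the `gh api rate_limit --jq '.resources.search.remaining'` query appears in the reviewed diff.

```shell
#!/usr/bin/env bash
# Illustrative cost-aware gate: estimate the Search API budget up front
# (repos x ~4 queries each) and re-check before each repo, so a mid-run
# drop bails early instead of sleeping. Sketch only, not pulse-wrapper code.
repos=(repo-a repo-b repo-c)   # hypothetical repo list
per_repo_cost=4

search_budget() {
    # Falls back to 0 when gh is missing, unauthenticated, or rate-limited.
    gh api rate_limit --jq '.resources.search.remaining' 2>/dev/null || echo 0
}

estimated_cost=$(( ${#repos[@]} * per_repo_cost ))
remaining=$(search_budget)

if (( remaining < estimated_cost )); then
    echo "skipping person-stats: need ~${estimated_cost}, have ${remaining}"
else
    for repo in "${repos[@]}"; do
        remaining=$(search_budget)   # re-check before each repo
        if (( remaining < per_repo_cost )); then
            echo "budget exhausted before ${repo}; stopping"
            break
        fi
        echo "would fetch person-stats for ${repo}"
    done
fi
```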
@coderabbitai review

✅ Actions performed: Review triggered.
Dispatching worker to address 1 unresolved CodeRabbit suggestion.
Follow-up PR addressing remaining review suggestions.

Applied:
- Simplify JSON array check to single glob match (Gemini suggestion)
- Remove redundant slug_safe redefinition (CodeRabbit nitpick)

Dismissed:
- Rate-limit helper extraction (see earlier reply)
- Simplify JSON array check to single glob match (Gemini suggestion) - Remove redundant slug_safe redefinition (CodeRabbit nitpick)



Summary

- Write PID file immediately after `check_dedup()` to close the race window where cron spawns 18+ parallel instances during the setup phase
- Skip `person-stats`/`cross-repo-person-stats` when the GitHub Search API rate limit is < 10, avoiding 56s+ blocking waits that stall the pulse cycle
- Guard empty `sqlite3 -json` output (returns `""`, not `"[]"`, on no rows) before piping to Python, preventing `JSONDecodeError` tracebacks

Closes #4015, closes #4016, closes #4009