GH#4142: fix auto-update recovery for dirty tree, diverged branch, and API rate limits#4144
Conversation
…te limits Port recovery logic from aidevops.sh cmd_update() into auto-update-helper.sh cmd_check() to prevent indefinite update failures: 1. Ensure on main branch before pull (handles detached HEAD) 2. Clean dirty working tree before pull (handles setup.sh leftovers) 3. Fall back to git reset --hard origin/main when ff-only fails 4. Clean working tree after setup.sh to prevent next-cycle failures 5. Use authenticated gh api for version checks (5000/hr vs 60/hr) Fixes 130 pull failures and 106 remote=unknown events observed in logs. Closes marcusquinn#4142
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the robustness and reliability of the auto-update mechanism by integrating comprehensive recovery logic previously found only in manual update commands. The changes address critical failure modes such as API rate limiting, dirty working trees, and diverged Git branches, ensuring a more resilient and self-healing update process for the system. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
Important Review skippedAuto reviews are limited based on label configuration. 🚫 Review skipped — only excluded labels are configured. (1)
Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAuto-update helper now prefers authenticated GitHub API, falls back to unauthenticated API and raw CDN, adds pre-pull housekeeping (ensure Changes
Sequence Diagram(s)sequenceDiagram
participant AutoUpdater as "Auto-update script\n(.agents/.../auto-update-helper.sh)"
participant GH_CLI as "gh (GitHub CLI)"
participant GitHubAPI as "GitHub API\n(unauthenticated)"
participant RawCDN as "raw.githubusercontent.com"
participant Git as "git / origin"
participant Freshness as "run_freshness_checks"
AutoUpdater->>GH_CLI: request remote version (authenticated)
alt gh returns version
GH_CLI-->>AutoUpdater: base64 content -> decoded version
else gh missing/unauth
AutoUpdater->>GitHubAPI: request remote version (unauthenticated)
alt API returns version
GitHubAPI-->>AutoUpdater: version
else
AutoUpdater->>RawCDN: fetch raw version file
RawCDN-->>AutoUpdater: raw version
end
end
AutoUpdater->>Git: ensure on main branch
AutoUpdater->>Git: clean working tree (reset & checkout)
AutoUpdater->>Git: fetch && pull --ff-only
alt pull succeeds
Git-->>AutoUpdater: updated
AutoUpdater->>Freshness: run_freshness_checks (post-update)
else pull fails
Git-->>AutoUpdater: pull failed
AutoUpdater->>Git: reset --hard origin/main
alt reset succeeds
Git-->>AutoUpdater: reset complete
AutoUpdater->>Freshness: run_freshness_checks (post-recovery)
else
AutoUpdater->>Freshness: run_freshness_checks (log failure)
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan for PR comments
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
.agents/scripts/auto-update-helper.sh (1)
1088-1094: Consider logging if branch checkout fails.The silent
|| trueat the end of the checkout chain means both attempts could fail without any record. While the subsequent pull failure recovery handles this functionally, a warning log would aid debugging when investigating unusual update failures.♻️ Optional: Add warning on checkout failure
local current_branch current_branch=$(git -C "$INSTALL_DIR" branch --show-current 2>/dev/null || echo "") if [[ "$current_branch" != "main" ]]; then log_info "Not on main branch ($current_branch), switching..." - git -C "$INSTALL_DIR" checkout main --quiet 2>/dev/null || - git -C "$INSTALL_DIR" checkout -b main origin/main --quiet 2>/dev/null || true + if ! git -C "$INSTALL_DIR" checkout main --quiet 2>/dev/null && + ! git -C "$INSTALL_DIR" checkout -b main origin/main --quiet 2>/dev/null; then + log_warn "Could not switch to main branch — pull may fail" + fi fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.agents/scripts/auto-update-helper.sh around lines 1088 - 1094, The checkout chain that attempts to switch to main (using current_branch and git -C "$INSTALL_DIR" checkout ...) ends with a silent "|| true" so failures are hidden; update the logic to detect if both checkout attempts failed and call the existing logging helper (e.g., log_warn or log_error) with a message including current_branch and $INSTALL_DIR to record the failure. Concretely, after trying git -C "$INSTALL_DIR" checkout main --quiet and the fallback git -C "$INSTALL_DIR" checkout -b main origin/main --quiet, check the exit status and if non-zero call log_warn "Failed to switch to main branch from '$current_branch' in $INSTALL_DIR" (or log_error) so the failure is surfaced instead of swallowed by || true.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In @.agents/scripts/auto-update-helper.sh:
- Around line 1088-1094: The checkout chain that attempts to switch to main
(using current_branch and git -C "$INSTALL_DIR" checkout ...) ends with a silent
"|| true" so failures are hidden; update the logic to detect if both checkout
attempts failed and call the existing logging helper (e.g., log_warn or
log_error) with a message including current_branch and $INSTALL_DIR to record
the failure. Concretely, after trying git -C "$INSTALL_DIR" checkout main
--quiet and the fallback git -C "$INSTALL_DIR" checkout -b main origin/main
--quiet, check the exit status and if non-zero call log_warn "Failed to switch
to main branch from '$current_branch' in $INSTALL_DIR" (or log_error) so the
failure is surfaced instead of swallowed by || true.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d5e12110-fa5e-4dfe-9ae1-0eb2ffab3dda
📒 Files selected for processing (1)
.agents/scripts/auto-update-helper.sh
There was a problem hiding this comment.
Code Review
This pull request significantly improves the robustness of the auto-update mechanism by incorporating recovery logic for dirty Git trees, diverged branches, and GitHub API rate limits. The use of authenticated gh api calls is a good optimization to prevent rate limiting issues. The added git recovery steps, including ensuring the correct branch and cleaning the working tree, are crucial for a managed installation directory. Overall, these changes directly address the identified problems and enhance the reliability of the framework's self-update capabilities.
Note: Security Review is unavailable for this PR.
| if command -v gh &>/dev/null && gh auth status &>/dev/null; then | ||
| version=$(gh api repos/marcusquinn/aidevops/contents/VERSION \ | ||
| --jq '.content' 2>/dev/null | | ||
| base64 -d 2>/dev/null | | ||
| tr -d '\n') | ||
| if [[ -n "$version" ]]; then | ||
| echo "$version" | ||
| return 0 | ||
| fi |
There was a problem hiding this comment.
The gh api command uses --jq '.content', which relies on jq being available. To align with the repository rule of preferring built-in tools like grep over external dependencies like jq for simple tasks, consider parsing the JSON output directly using grep or sed instead of gh's --jq flag. This avoids the jq dependency entirely and improves portability.
| if command -v gh &>/dev/null && gh auth status &>/dev/null; then | |
| version=$(gh api repos/marcusquinn/aidevops/contents/VERSION \ | |
| --jq '.content' 2>/dev/null | | |
| base64 -d 2>/dev/null | | |
| tr -d '\n') | |
| if [[ -n "$version" ]]; then | |
| echo "$version" | |
| return 0 | |
| fi | |
| if command -v gh >/dev/null && gh auth status >/dev/null; then | |
| version=$(gh api repos/marcusquinn/aidevops/contents/VERSION 2>/dev/null | \ | |
| grep -oP '"content":\s*"\K[^"]+' | \ | |
| base64 -d 2>/dev/null | \ | |
| tr -d '\n') | |
| if [[ -n "$version" ]]; then | |
| echo "$version" | |
| return 0 | |
| fi | |
| fi |
References
- For shell scripts that need to be portable, prefer built-in tools like
grepover external dependencies likejqfor simple tasks, to avoid installation requirements on diverse systems.
| git -C "$INSTALL_DIR" checkout main --quiet 2>/dev/null || | ||
| git -C "$INSTALL_DIR" checkout -b main origin/main --quiet 2>/dev/null || true |
There was a problem hiding this comment.
The || true at the end of this compound command can silently mask failures if both git checkout main and git checkout -b main origin/main fail. It's important to know if the branch switch was unsuccessful, especially after logging an log_info message about switching. Consider removing || true and adding an log_error if the operation fails. Additionally, for git commands, prefer relying on the --quiet flag to suppress stdout and avoid 2>/dev/null to ensure error messages on stderr remain visible for debugging.
| git -C "$INSTALL_DIR" checkout main --quiet 2>/dev/null || | |
| git -C "$INSTALL_DIR" checkout -b main origin/main --quiet 2>/dev/null || true | |
| git -C "$INSTALL_DIR" checkout main --quiet || \ | |
| git -C "$INSTALL_DIR" checkout -b main origin/main --quiet || \ | |
| log_error "Failed to switch to or create main branch" |
References
- When using git commands (like 'init', 'remote') in shell scripts, use the '-q' flag to suppress standard output instead of '2>/dev/null', ensuring that error messages on stderr remain visible for debugging.
| git -C "$INSTALL_DIR" reset HEAD -- . 2>/dev/null || true | ||
| git -C "$INSTALL_DIR" checkout -- . 2>/dev/null || true |
There was a problem hiding this comment.
Similar to the previous comment, the || true here can silently hide failures during the working tree cleanup. While the intent is to ensure the pull is not blocked, if git reset or git checkout fail, it could indicate a more serious underlying issue that should be logged as an error. Additionally, avoid 2>/dev/null for git commands to ensure error messages on stderr remain visible for debugging, even if a --quiet flag is not applicable for these specific forms.
| git -C "$INSTALL_DIR" reset HEAD -- . 2>/dev/null || true | |
| git -C "$INSTALL_DIR" checkout -- . 2>/dev/null || true | |
| git -C "$INSTALL_DIR" reset HEAD -- . || log_error "Failed to reset HEAD" | |
| git -C "$INSTALL_DIR" checkout -- . || log_error "Failed to checkout working tree" |
References
- When using git commands (like 'init', 'remote') in shell scripts, use the '-q' flag to suppress standard output instead of '2>/dev/null', ensuring that error messages on stderr remain visible for debugging.
| log_error "git reset --hard origin/main also failed" | ||
| update_state "update" "$remote" "pull_failed" | ||
| check_skill_freshness | ||
| check_openclaw_freshness | ||
| check_tool_freshness | ||
| check_upstream_watch | ||
| return 1 |
There was a problem hiding this comment.
The sequence of update_state, check_skill_freshness, check_openclaw_freshness, check_tool_freshness, and check_upstream_watch is duplicated here. This block of code is executed if git pull --ff-only fails and then git reset --hard origin/main also fails. This exact same block is present in the original if condition (lines 1080-1084 in the LEFT side). Consider refactoring to avoid this duplication, perhaps by encapsulating these calls in a function or moving them to a single failure point.
References
- In shell scripts, extract repeated logic into an internal helper function to improve maintainability. This applies even for standalone scripts where external
sourcedependencies are avoided.
| # (e.g., chmod on tracked scripts, scan results written to repo) | ||
| # Prevents dirty tree from blocking the next update cycle. | ||
| # See: https://github.com/marcusquinn/aidevops/issues/2286 | ||
| git -C "$INSTALL_DIR" checkout -- . 2>/dev/null || true |
There was a problem hiding this comment.
The || true at the end of this line can mask potential failures during the post-setup working tree cleanup. If git checkout -- . fails, it might indicate a problem that should be logged, even if it's a cleanup step. Additionally, avoid 2>/dev/null for git commands to ensure error messages on stderr remain visible for debugging, even if a --quiet flag is not applicable for this specific form.
| git -C "$INSTALL_DIR" checkout -- . 2>/dev/null || true | |
| git -C "$INSTALL_DIR" checkout -- . || log_error "Post-setup cleanup failed" |
References
- When using git commands (like 'init', 'remote') in shell scripts, use the '-q' flag to suppress standard output instead of '2>/dev/null', ensuring that error messages on stderr remain visible for debugging.
… error logging Address review feedback from Gemini Code Assist on PR marcusquinn#4144: - Extract duplicated 4-call freshness-check block into run_freshness_checks() helper, reducing 7 call sites to 1 definition + 7 single-line calls - Replace || true on branch checkout with proper error logging and early return on failure (silent failures here block updates indefinitely) - Replace || true on working tree cleanup (reset/checkout) with log_warn so cleanup failures are visible in the update log - Replace || true on post-setup cleanup with log_warn for dirty-state visibility on next update cycle - Redirect git stderr to LOG_FILE instead of /dev/null for debuggability Dismisses Gemini comment on line 319 (gh --jq uses gh's built-in jq engine, not the external jq binary — no portability concern).
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.agents/scripts/auto-update-helper.sh:
- Around line 1088-1102: The branch-switch logic can fail before cleanup runs,
so move the working-tree cleanup commands (git reset HEAD -- . and git checkout
-- .) to run immediately after detecting a non-main branch (i.e., inside the if
[[ "$current_branch" != "main" ]] block, before attempting git -C "$INSTALL_DIR"
checkout main ...), ensuring cleanup always executes prior to the git checkout;
keep the existing error handling (log_info, log_error, update_state "update"
"$remote" "branch_switch_failed", and run_freshness_checks) unchanged and only
attempt the checkout after the cleanup steps complete.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 098ee155-88cd-4352-82a2-a9e849d42ae1
📒 Files selected for processing (1)
.agents/scripts/auto-update-helper.sh
|
Addressed the CHANGES_REQUESTED feedback in 1df47ea by moving dirty-tree cleanup ahead of branch switching in cmd_check(), so checkout failures caused by local changes are handled fail-safe before branch recovery. All PR checks are now passing on this head commit. |
|
Addressed CHANGES_REQUESTED feedback and pushed b33af98 with additional branch-switch diagnostics. @coderabbitai review\n/gemini review |
There was a problem hiding this comment.
Code Review
This pull request significantly improves the robustness of the auto-update mechanism by adding recovery logic for common failure scenarios like a dirty git tree, diverged branches, and API rate limits. The use of authenticated gh api calls is a great addition to prevent rate-limiting issues. The code is also refactored for better maintainability. I've found one potential logic issue regarding the order of git operations that could cause update failures in some cases, for which I've left a specific comment.
| # Ensure we're on the main branch (detached HEAD or stale branch blocks pull) | ||
| # Mirrors recovery logic from aidevops.sh cmd_update() | ||
| # See: https://github.com/marcusquinn/aidevops/issues/4142 | ||
| local current_branch | ||
| current_branch=$(git -C "$INSTALL_DIR" branch --show-current 2>/dev/null || echo "") | ||
| if [[ "$current_branch" != "main" ]]; then | ||
| log_info "Not on main branch ($current_branch), switching..." | ||
| if ! git -C "$INSTALL_DIR" checkout main --quiet 2>>"$LOG_FILE" && | ||
| ! git -C "$INSTALL_DIR" checkout -b main origin/main --quiet 2>>"$LOG_FILE"; then | ||
| log_error "Failed to switch to main branch from '$current_branch' in $INSTALL_DIR" | ||
| update_state "update" "$remote" "branch_switch_failed" | ||
| run_freshness_checks | ||
| return 1 | ||
| fi | ||
| fi | ||
|
|
||
| # Pull latest changes | ||
| if ! git -C "$INSTALL_DIR" fetch origin main --quiet 2>>"$LOG_FILE"; then | ||
| log_error "git fetch failed" | ||
| update_state "update" "$remote" "fetch_failed" | ||
| check_skill_freshness | ||
| check_openclaw_freshness | ||
| check_tool_freshness | ||
| check_upstream_watch | ||
| run_freshness_checks | ||
| return 1 | ||
| fi |
There was a problem hiding this comment.
There's a logic error in the order of operations here. The git fetch command on line 1119 needs to run before the branch switching logic (lines 1102-1116). The git checkout -b main origin/main command on line 1110 depends on origin/main being fetched and up-to-date. Without fetching first, this command may fail if origin/main is not available locally or is stale, preventing the script from recovering from a detached HEAD or incorrect branch.
While this mirrors the logic in aidevops.sh, that implementation appears to have the same issue. Fixing it here would be a great improvement.
# Pull latest changes
if ! git -C "$INSTALL_DIR" fetch origin main --quiet 2>>"$LOG_FILE"; then
log_error "git fetch failed"
update_state "update" "$remote" "fetch_failed"
run_freshness_checks
return 1
fi
# Ensure we're on the main branch (detached HEAD or stale branch blocks pull)
# Mirrors recovery logic from aidevops.sh cmd_update()
# See: https://github.com/marcusquinn/aidevops/issues/4142
local current_branch
current_branch=$(git -C "$INSTALL_DIR" branch --show-current 2>/dev/null || echo "")
if [[ "$current_branch" != "main" ]]; then
log_info "Not on main branch ($current_branch), switching..."
if ! git -C "$INSTALL_DIR" checkout main --quiet 2>>"$LOG_FILE" &&
! git -C "$INSTALL_DIR" checkout -b main origin/main --quiet 2>>"$LOG_FILE"; then
log_error "Failed to switch to main branch from '$current_branch' in $INSTALL_DIR"
update_state "update" "$remote" "branch_switch_failed"
run_freshness_checks
return 1
fi
fi|
This PR is from an external contributor (@johnwaldo). Auto-merge is disabled for external PRs — a maintainer must review and approve manually. To approve or decline, comment on this PR:
|
Summary
aidevops.sh cmd_update()intoauto-update-helper.sh cmd_check()to prevent indefinite update failuresgh apifor version checks (5000/hr vs 60/hr unauthenticated)Problem
auto-update-helper.sh checkusesgit pull --ff-onlybut has no recovery when it fails. Three failure modes observed in production logs:remote=unknown): 106 failures over ~10 hours overnight when unauthenticated GitHub API quota (60/hr) was exhaustedThe manual
aidevops updatecommand already handles all three cases correctly (lines 656-718 ofaidevops.sh). The auto-update helper was missing this recovery logic entirely.Changes
get_remote_version()— authenticated API firstgh apiwith auth token (5000 req/hr) before falling back to unauthenticated curl (60 req/hr)remote=unknownfailurescmd_check()— pre-pull recoverymainbranch before pull (handles detached HEAD)git reset HEAD -- . && git checkout -- .)cmd_check()— pull failure fallbackgit pull --ff-onlyfails, fall back togit reset --hard origin/main~/Git/aidevopsrepo is a managed install directory — it should always trackorigin/mainexactlycmd_check()— post-setup cleanupgit checkout -- .after setup.sh to prevent dirty tree from blocking the next cycleTesting
cmd_update()implementation that has been working correctlyCloses #4142
Summary by CodeRabbit
Bug Fixes
Refactor