fix: tool-version-check.sh hangs — add timeout and hung-process detection#2932

Merged
marcusquinn merged 1 commit into main from bugfix/version-check-timeout on Mar 5, 2026

@marcusquinn (Owner)

Summary

  • Replace fragile perl alarm timeout fallback with background+kill approach that escalates SIGTERM → SIGKILL — fixes the root cause where macos-automator-mcp --version hung indefinitely
  • Capture timeout_sec exit code (124) to distinguish "command timed out" from "command failed" — returns "timeout" instead of "unknown"
  • Add VERSION_TIMEOUT=10 constant for local --version calls (separate from PKG_QUERY_TIMEOUT=30 for remote queries)
  • Add timeout status display with distinct icon (⏱) and red color in both console and JSON output
  • Show timed-out tools in --quiet mode since they indicate broken tools

Closes #2920

Testing

Verified on macOS (no timeout/gtimeout available — exercises the background+kill fallback):

  • gh --version → 2.87.3 (normal path works)
  • nonexistent tool → not installed (unchanged)
  • sleep 999 (simulates hung process) → timeout after ~10s (new behaviour)
  • ShellCheck clean (only pre-existing SC1091 info)

get_installed_version() already had timeout_sec wrapping the --version
call, but three issues remained:

1. timeout_sec used perl alarm as macOS fallback — SIGALRM doesn't kill
   child processes that trap or ignore signals (e.g., Node MCP servers
   like macos-automator-mcp). Replaced with background+kill approach
   that escalates SIGTERM -> SIGKILL.

2. The exit code from timeout_sec was swallowed by '|| true', making it
   impossible to distinguish 'command timed out' from 'command failed'.
   Now captures the exit code and returns 'timeout' (not 'unknown') when
   exit code is 124.

3. The hardcoded 5s timeout was too short for slow interpreters. Added
   VERSION_TIMEOUT=10 constant (matching the issue's 5-10s recommendation)
   alongside the existing PKG_QUERY_TIMEOUT=30 for remote queries.
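
The background+kill approach from item 1 can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function name run_with_timeout, the 1-second grace period, and the mapping of signal exit codes (143, 137) to 124 are all assumptions made for the example. The real timeout_sec also prefers timeout or gtimeout when they are installed, which this sketch leaves out.

```shell
# Hypothetical sketch of a background+kill timeout wrapper.
# Usage: run_with_timeout SECONDS COMMAND [ARGS...]
# Returns the command's own exit code, or 124 if it had to be signalled.
run_with_timeout() {
	local secs="$1"
	shift

	# Start the command in the background so this shell keeps control.
	"$@" &
	local cmd_pid=$!

	# Watchdog: after the deadline send SIGTERM, then SIGKILL if the
	# process is still there after a short grace period (1s, assumed).
	# Output goes to /dev/null so a leftover sleep cannot hold a pipe open.
	(
		sleep "$secs"
		kill -TERM "$cmd_pid" 2>/dev/null || exit 0
		sleep 1
		kill -KILL "$cmd_pid" 2>/dev/null
	) >/dev/null 2>&1 &
	local watchdog_pid=$!

	# Wait for the command and record its status without tripping errexit.
	local rc=0
	wait "$cmd_pid" 2>/dev/null || rc=$?

	# The command is done, so the watchdog is no longer needed.
	kill "$watchdog_pid" 2>/dev/null
	wait "$watchdog_pid" 2>/dev/null

	# 143 = 128 + SIGTERM, 137 = 128 + SIGKILL: report both as a timeout.
	if [ "$rc" -eq 143 ] || [ "$rc" -eq 137 ]; then
		return 124
	fi
	return "$rc"
}
```

Unlike a SIGALRM delivered to the wrapper itself, SIGKILL cannot be trapped or ignored by the target process, so a tool that ignores SIGTERM is still stopped after the grace period. One simplification to be aware of: a command that is killed by SIGTERM or SIGKILL for unrelated reasons would also be reported as 124 here.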

Also adds:
- 'timeout' status in check_tool() with distinct icon and color
- timeout tools shown in --quiet mode (they indicate broken tools)
- Consistent exit code 124 across all timeout_sec code paths
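
To illustrate why capturing the exit code matters, here is a small sketch of the status mapping. The helper name probe_version and the version regex are hypothetical, and it uses command substitution instead of the log file the real get_installed_version writes to; only the 124 convention and the "timeout"/"unknown" strings come from the description above.

```shell
# Hypothetical, simplified stand-in for get_installed_version.
# Usage: probe_version COMMAND [ARGS...]
# COMMAND is expected to return 124 when it timed out (the timeout_sec convention).
probe_version() {
	local rc=0 out
	# With '|| true' the status would be discarded here; '|| rc=$?' keeps it.
	out=$("$@" 2>/dev/null) || rc=$?

	if [ "$rc" -eq 124 ]; then
		# The command hung and was killed by the timeout wrapper.
		echo "timeout"
	elif [ "$rc" -ne 0 ] || [ -z "$out" ]; then
		# The command ran but failed or printed nothing.
		echo "unknown"
	else
		# Extract the first dotted version number (illustrative regex).
		echo "$out" | grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
	fi
}
```

For example, probe_version sh -c 'exit 124' prints timeout, while probe_version sh -c 'exit 1' prints unknown.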
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the reliability and user experience of the tool-version-check.sh script by addressing issues where tool version checks could hang indefinitely. The changes introduce a more robust timeout mechanism, particularly for macOS environments, and provide clearer feedback to the user when a tool's version check times out. This prevents script stagnation and improves diagnostic capabilities by distinguishing between command failures and timeouts.

Highlights

  • Robust Timeout Mechanism: Replaced the fragile perl alarm fallback with a more reliable background process and SIGTERM then SIGKILL escalation for macOS, ensuring tool-version-check.sh no longer hangs indefinitely.
  • Improved Timeout Detection: The script now differentiates between a command timing out (exit code 124) and a command failing, returning a distinct "timeout" status instead of "unknown".
  • Configurable Timeout: Introduced a VERSION_TIMEOUT constant set to 10 seconds specifically for local --version calls, providing a dedicated timeout separate from remote query timeouts.
  • Enhanced User Feedback: Added a distinct "timeout" status with a specific icon (⏱) and red color in both console and JSON output, and ensured that timed-out tools are displayed even when running in --quiet mode.

Changelog

  • .agents/scripts/tool-version-check.sh
    • Defined VERSION_TIMEOUT constant for local version checks.
    • Updated get_installed_version to utilize VERSION_TIMEOUT and correctly interpret the timeout_sec exit code 124 as a "timeout" status.
    • Reworked the timeout_sec function to implement a robust background process with SIGTERM and SIGKILL escalation for macOS when timeout or gtimeout are unavailable, ensuring it returns 124 on timeout.
    • Added a new conditional branch in check_tool to handle the "timeout" status, assigning a distinct icon and red color.
    • Modified the check_tool's quiet mode logic to ensure tools with a "timeout" status are still displayed.
    • Added a new case for "timeout" status in check_tool's console output, providing a descriptive message indicating the tool hung and was killed.
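
The check_tool changes listed above could look something like the sketch below. It is only an illustration: render_status, its arguments, the "ok" status name, and the assumption that quiet mode hides up-to-date tools are all hypothetical; the ⏱ icon, the red color, and the rule that timed-out tools stay visible in --quiet mode come from the PR description.

```shell
# Hypothetical renderer; the real check_tool has a different signature.
RED=$(printf '\033[0;31m')
GREEN=$(printf '\033[0;32m')
RESET=$(printf '\033[0m')

# Usage: render_status TOOL_NAME STATUS [QUIET]
render_status() {
	local name="$1" status="$2" quiet="${3:-false}"
	case "$status" in
	timeout)
		# Always printed, even in quiet mode: a hung tool is a broken tool.
		printf '%s⏱ %s: timed out (tool hung and was killed)%s\n' "$RED" "$name" "$RESET"
		;;
	ok)
		# Assumption: quiet mode suppresses tools that need no attention.
		[ "$quiet" = "true" ] && return 0
		printf '%s✓ %s: up to date%s\n' "$GREEN" "$name" "$RESET"
		;;
	*)
		# Any other status is printed as-is in this sketch.
		printf '? %s: %s\n' "$name" "$status"
		;;
	esac
}
```

Calling render_status macos-automator-mcp timeout true still prints a line, whereas render_status gh ok true prints nothing.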

Activity

  • The pull request was opened by marcusquinn with a clear description of the problem and proposed solutions.
  • Testing was performed on macOS, verifying the new background+kill fallback for sleep 999 simulation and confirming normal operation for gh --version and nonexistent tools.

@coderabbitai (Contributor)

coderabbitai bot commented Mar 5, 2026

Warning: rate limit exceeded

@marcusquinn has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 4 minutes and 21 seconds before requesting another review.

📥 Commits

Reviewing files that changed from the base of the PR and between 6cb02b0 and e50e3ef.

📒 Files selected for processing (1)
  • .agents/scripts/tool-version-check.sh

@gemini-code-assist bot left a comment

Code Review

This pull request effectively addresses the issue of hanging processes during version checks by implementing a more robust timeout mechanism. The new timeout_sec function with its background process and kill fallback is a significant improvement over the previous perl alarm approach. The changes to handle and display the timeout status are also clear and user-friendly. I have one suggestion regarding error stream redirection to improve debuggability, in line with the repository's general guidelines.

+ local _ver_rc=0
  # shellcheck disable=SC2086
- timeout_sec 5 "$cmd" $ver_flag >"$_ver_log" 2>/dev/null || true
+ timeout_sec "$VERSION_TIMEOUT" "$cmd" $ver_flag >"$_ver_log" 2>/dev/null || _ver_rc=$?

medium

While this works, redirecting stderr to /dev/null can hide important error messages if a command fails for reasons other than hanging (e.g., permission issues, missing dependencies). This makes debugging difficult. The project's general rules advise against blanket error suppression.

To improve debuggability, consider removing 2>/dev/null and letting any error messages from the version command be printed.

Suggested change
- timeout_sec "$VERSION_TIMEOUT" "$cmd" $ver_flag >"$_ver_log" 2>/dev/null || _ver_rc=$?
+ timeout_sec "$VERSION_TIMEOUT" "$cmd" $ver_flag >"$_ver_log" || _ver_rc=$?

References
  1. In shell scripts, avoid blanket suppression of errors with '2>/dev/null' to ensure that authentication, syntax, or system issues remain visible for debugging.
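
A possible middle ground between discarding stderr and printing it unconditionally is to capture it and surface it only when the command fails. This is not what the reviewer proposed and not part of the PR; the helper name run_version_cmd and the separate error log are assumptions made for the sake of the sketch.

```shell
# Hypothetical helper: keep stdout and stderr in separate files and
# only report stderr when the command exits non-zero.
# Usage: run_version_cmd OUT_LOG ERR_LOG COMMAND [ARGS...]
run_version_cmd() {
	local out_log="$1" err_log="$2"
	shift 2

	local rc=0
	"$@" >"$out_log" 2>"$err_log" || rc=$?

	# Surface the captured errors so permission or dependency problems
	# stay visible, without adding noise to successful runs.
	if [ "$rc" -ne 0 ] && [ -s "$err_log" ]; then
		echo "warning: '$*' exited with status $rc:" >&2
		sed 's/^/  /' "$err_log" >&2
	fi
	return "$rc"
}
```

In the PR's context the command would be the timeout_sec call, so a status of 124 would still propagate to the caller unchanged.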

Development

Successfully merging this pull request may close these issues.

bug: tool-version-check.sh hangs — get_installed_version has no timeout on local --version calls
