feat: add file discovery performance guidance and fd/ripgrep installers #115

Merged
marcusquinn merged 1 commit into main from chore/file-discovery-performance
Jan 17, 2026

@marcusquinn marcusquinn commented Jan 17, 2026

Summary

  • Add file discovery guidance to AGENTS.md with preference order for AI agents
  • Add setup_file_discovery_tools() to setup.sh for automatic fd/ripgrep installation
  • Update README.md with file discovery tools documentation

Problem

The mcp_glob tool (Claude Code's built-in glob) can be CPU-intensive on large codebases because it:

  1. Traverses the entire directory tree
  2. Doesn't always respect .gitignore efficiently
  3. Runs in the main Node.js thread

Solution

Guide AI agents toward more efficient alternatives:

| Tool | Speed | Use Case |
|------|-------|----------|
| git ls-files | Instant | Git-tracked files only |
| fd | ~10x faster | Untracked files, system-wide searches |
| rg --files | ~10x faster | When already using ripgrep |
| mcp_glob | Baseline | Fallback when bash unavailable |
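
The preference order in the table can be sketched as a single shell function. This is a minimal illustration, not code from the PR: the name `list_files` is hypothetical, the final `find` fallback is an assumption standing in for "no fast tool available", and output paths are relative to the directory for `git ls-files` but prefixed with it for the other tools.

```shell
# Hypothetical helper: list files under a directory with the fastest tool available.
list_files() {
    local dir="${1:-.}"
    if git -C "$dir" rev-parse --is-inside-work-tree >/dev/null 2>&1; then
        # Inside a Git work tree: list tracked files only.
        git -C "$dir" ls-files
    elif command -v fd >/dev/null 2>&1; then
        # fd respects .gitignore and skips hidden files by default.
        fd --type f . "$dir"
    elif command -v fdfind >/dev/null 2>&1; then
        # Debian/Ubuntu ship the fd binary under the name fdfind.
        fdfind --type f . "$dir"
    elif command -v rg >/dev/null 2>&1; then
        rg --files "$dir"
    else
        # Assumed last resort when none of the tools above exist.
        find "$dir" -type f
    fi
}

# Usage: list_files path/to/project
```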

Changes

AGENTS.md

  • Added "File Discovery" section to Critical Rules with preference order
  • Explains when to use each tool

setup.sh

  • New setup_file_discovery_tools() function
  • Checks for fd and ripgrep installation
  • Offers to install via detected package manager
  • Handles Debian/Ubuntu fd-find package naming
  • Provides manual installation instructions for all platforms

README.md

  • Added fd and ripgrep to Requirements install commands
  • New "File Discovery Tools" section with comparison table
  • Documents preference order for AI agents

Testing

  • Local linting passed (ShellCheck, secretlint)
  • No new SonarCloud violations introduced
  • setup.sh syntax validated

Summary by CodeRabbit

  • Documentation

    • Introduced file discovery tools documentation with cross-platform usage guidance and performance considerations
    • Expanded localhost standards guidance to include SSL requirements and automatic port detection
  • Chores

    • Updated dependency installation instructions for macOS and Linux with optional file discovery utilities
    • Setup automation now includes optional installation of file discovery tools with package manager support

- Add file discovery guidance to AGENTS.md with preference order:
  1. git ls-files (instant, tracked files)
  2. fd (fast, respects .gitignore)
  3. rg --files (fast, respects .gitignore)
  4. mcp_glob (fallback when bash unavailable)

- Add setup_file_discovery_tools() to setup.sh:
  - Checks for fd and ripgrep installation
  - Offers to install via detected package manager
  - Handles Debian/Ubuntu fd-find package naming
  - Provides manual installation instructions

- Update README.md Requirements section:
  - Add fd and ripgrep to install commands
  - Add File Discovery Tools section with comparison table
  - Document preference order for AI agents

This addresses CPU overhead from mcp_glob on large codebases by
guiding AI agents toward more efficient alternatives.

coderabbitai bot commented Jan 17, 2026

Walkthrough

Documentation and shell scripting updates introduce file discovery tools (fd and ripgrep) with setup automation. Changes include documentation in AGENTS.md and README.md, and a new setup function in setup.sh that detects these utilities and installs them across platforms. A duplicate function declaration in setup.sh requires attention.

Changes

| Cohort / File(s) | Summary |
|------------------|---------|
| Documentation - File Discovery Guidance: .agent/AGENTS.md, README.md | Added "File Discovery" subsection documenting four discovery methods (git ls-files, fd, rg, mcp_glob) with usage notes and CPU considerations. Expanded "Localhost Standards" with SSL/Traefik guidance. README now includes "File Discovery Tools" section with fd and ripgrep descriptions, preferred discovery sequence, and setup script automation notes. |
| Setup Script - File Discovery Tool Installation: setup.sh | Introduces setup_file_discovery_tools() function with version detection, user prompts, and cross-package-manager handling (apt/brew naming abstraction). Integrates into main setup flow post-Git CLI setup. Note: Duplicate function declaration present in file; requires verification and removal. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes


Poem

🔍 From fd's swift stride to ripgrep's gleam,
Discovery tools fulfill the automation dream,
Cross-platform harmony, set with care,
Zero tech debt floating through the air! 🚀

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|------------|--------|-------------|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main changes: adding file discovery guidance and fd/ripgrep installers across documentation and setup script. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


@gemini-code-assist

Summary of Changes

Hello @marcusquinn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the efficiency of file discovery operations for AI agents by integrating and promoting the use of high-performance tools like fd and ripgrep. By guiding agents towards these faster alternatives and providing automated installation, the changes aim to reduce CPU load, enhance codebase navigation, and ensure a smoother development experience, especially in large repositories.

Highlights

  • Enhanced File Discovery Guidance: Introduced clear guidance for AI agents on efficient file discovery, prioritizing tools like git ls-files, fd, and ripgrep over the CPU-intensive mcp_glob.
  • Automated Tool Installation: Added a new function to setup.sh that automatically detects and offers to install fd and ripgrep, streamlining the setup process for optimal performance.
  • Comprehensive Documentation Updates: Updated AGENTS.md with critical rules for file discovery and README.md with installation requirements and a detailed comparison of file discovery tools.

@github-actions

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 272 code smells

[INFO] Recent monitoring activity:
Sat Jan 17 05:07:38 UTC 2026: Code review monitoring started
Sat Jan 17 05:07:39 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 272
Sat Jan 17 05:07:39 UTC 2026: Qlty - 0 issues found, auto-formatting applied
Sat Jan 17 05:07:41 UTC 2026: Codacy analysis completed with auto-fixes

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 272
  • VULNERABILITIES: 0

Generated on: Sat Jan 17 05:08:20 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring


augmentcode bot commented Jan 17, 2026

🤖 Augment PR Summary

Summary: Improves AI-agent guidance and automation around fast file discovery in large repos.

  • Documents preferred file listing/search order in .agent/AGENTS.md and README.md (git → fd → rg → glob).
  • Adds setup_file_discovery_tools() to setup.sh to detect and optionally install fd/ripgrep via the detected package manager.
  • Hooks the new step into the interactive setup.sh flow and calls out Debian/Ubuntu’s fd-find/fdfind naming.

@augmentcode augmentcode bot left a comment

Review completed. 2 suggestions posted.

local missing_names=()

# Check for fd (fd-find)
if ! command -v fd >/dev/null 2>&1; then

On Debian/Ubuntu, fd-find commonly provides the fdfind binary (without fd), but this check only looks for fd and will still report it missing/prompt even when fdfind is already installed.
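
One way to address this is to probe for both binary names before reporting fd as missing. The sketch below is illustrative only: `first_available_command` is a hypothetical helper name, not something defined in setup.sh.

```shell
# Hypothetical helper: print the first of the given command names that is
# installed, or return non-zero if none of them is on PATH.
first_available_command() {
    local candidate
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Usage: accept either fd or the Debian/Ubuntu name fdfind.
if fd_cmd=$(first_available_command fd fdfind); then
    echo "fd available as: $fd_cmd"
else
    echo "fd not installed"
fi
```

With a check like this, fd would only be added to the missing list when neither name resolves.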

# Handle package name differences across package managers
local actual_packages=()
for pkg in "${missing_packages[@]}"; do
    case "$pkg_manager" in

The install mapping only rewrites fd to fd-find for apt, but your manual instructions say Fedora uses fd-find; with dnf (and potentially yum) this will try to install fd and may fail.
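
A small lookup function could keep the package-name differences in one place. This is a sketch under assumptions: `map_package_name` is a hypothetical name, the dnf entry follows the PR's own manual instructions (`sudo dnf install fd-find ripgrep`), and yum is left unmapped because the thread does not confirm its package name.

```shell
# Hypothetical helper: translate a canonical tool name into the package name
# used by a given package manager.
map_package_name() {
    local pkg_manager="$1" pkg="$2"
    case "$pkg_manager:$pkg" in
        # Debian/Ubuntu (apt) and Fedora (dnf) name the fd package fd-find.
        apt:fd | dnf:fd)
            printf '%s\n' "fd-find"
            ;;
        # Everything else passes through unchanged (assumption for yum too).
        *)
            printf '%s\n' "$pkg"
            ;;
    esac
}

# Usage: map_package_name dnf fd
```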

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@setup.sh`:
- Around line 574-582: The check currently looks only for the fd executable
(command -v fd) so on Debian/Ubuntu systems where the package installs fdfind it
will still be reported missing; update the logic around the fd check to also
probe for fdfind, set fd_version by calling whichever binary exists (fd or
fdfind) and call print_success accordingly, and if only fdfind is found ensure
missing_packages/missing_names reflect the apt package name "fd-find" or map
fdfind to the canonical "fd" name in missing_tools; reference the existing
identifiers fd_version, missing_tools, missing_packages, missing_names, and
print_success when making the change.

Comment on lines +574 to +582
if ! command -v fd >/dev/null 2>&1; then
    missing_tools+=("fd")
    missing_packages+=("fd")
    missing_names+=("fd (fast file finder)")
else
    local fd_version
    fd_version=$(fd --version 2>/dev/null | head -1 || echo "unknown")
    print_success "fd found: $fd_version"
fi

⚠️ Potential issue | 🟡 Minor

Verify fd command availability after fd-find installation on Debian/Ubuntu.

On Debian/Ubuntu, fd-find installs the binary as fdfind, not fd. The check at line 574 (command -v fd) will still fail after installation since the binary name differs.

Consider checking for fdfind as well on apt-based systems, or updating the success message to be more accurate.

🔧 Suggested improvement
     # Check for fd (fd-find)
-    if ! command -v fd >/dev/null 2>&1; then
+    if ! command -v fd >/dev/null 2>&1 && ! command -v fdfind >/dev/null 2>&1; then
         missing_tools+=("fd")
         missing_packages+=("fd")
         missing_names+=("fd (fast file finder)")
     else
         local fd_version
-        fd_version=$(fd --version 2>/dev/null | head -1 || echo "unknown")
+        fd_version=$(fd --version 2>/dev/null || fdfind --version 2>/dev/null | head -1 || echo "unknown")
         print_success "fd found: $fd_version"
     fi
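
A note on the suggested one-liner: in the shell, `|` binds more tightly than `||`, so `head -1` applies only to the `fdfind` call, and a pipeline's exit status is normally that of its last command, so the `echo "unknown"` fallback may never run. A more explicit variant is sketched below; `tool_version` is a hypothetical helper name, not part of setup.sh.

```shell
# Hypothetical helper: print the first line of "<cmd> --version" for the first
# installed candidate, or "unknown" if none is installed or the output is empty.
tool_version() {
    local candidate version
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            version=$("$candidate" --version 2>/dev/null | head -n 1)
            # Fall back to "unknown" when the command printed nothing.
            printf '%s\n' "${version:-unknown}"
            return 0
        fi
    done
    printf '%s\n' "unknown"
    return 1
}

# Usage: fd_version=$(tool_version fd fdfind)
```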

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces performance guidance for file discovery and adds installers for fd and ripgrep. The documentation updates in AGENTS.md and README.md are clear and helpful. The new setup_file_discovery_tools function in setup.sh correctly detects and offers to install these tools. I've provided a couple of suggestions to improve the maintainability and scalability of the new shell script logic by reducing code duplication. Overall, this is a valuable addition to improve the performance of AI agents on large codebases.

Comment on lines +573 to +593
# Check for fd (fd-find)
if ! command -v fd >/dev/null 2>&1; then
    missing_tools+=("fd")
    missing_packages+=("fd")
    missing_names+=("fd (fast file finder)")
else
    local fd_version
    fd_version=$(fd --version 2>/dev/null | head -1 || echo "unknown")
    print_success "fd found: $fd_version"
fi

# Check for ripgrep
if ! command -v rg >/dev/null 2>&1; then
    missing_tools+=("rg")
    missing_packages+=("ripgrep")
    missing_names+=("ripgrep (fast content search)")
else
    local rg_version
    rg_version=$(rg --version 2>/dev/null | head -1 || echo "unknown")
    print_success "ripgrep found: $rg_version"
fi

medium

The current implementation checks for fd and ripgrep in separate blocks of code. This can be refactored into a data-driven loop to improve maintainability and make it easier to add more tools in the future. By defining the tools in an array, you can iterate through them, reducing code duplication and making the logic more scalable.

Suggested change
local tools_to_check=(
    "fd:fd:fd (fast file finder)"
    "rg:ripgrep:ripgrep (fast content search)"
)
for tool_info in "${tools_to_check[@]}"; do
    IFS=':' read -r cmd pkg_name friendly_name <<< "$tool_info"
    if ! command -v "$cmd" >/dev/null 2>&1; then
        missing_tools+=("$cmd")
        missing_packages+=("$pkg_name")
        missing_names+=("$friendly_name")
    else
        local version
        version=$($cmd --version 2>/dev/null | head -1 || echo "unknown")
        print_success "$friendly_name found: $version"
    fi
done

Comment on lines +609 to +660
if [[ "$pkg_manager" != "unknown" ]]; then
    read -r -p "Install file discovery tools (${missing_packages[*]}) using $pkg_manager? (y/n): " install_fd_tools

    if [[ "$install_fd_tools" == "y" ]]; then
        print_info "Installing ${missing_packages[*]}..."

        # Handle package name differences across package managers
        local actual_packages=()
        for pkg in "${missing_packages[@]}"; do
            case "$pkg_manager" in
                apt)
                    # Debian/Ubuntu uses fd-find instead of fd
                    if [[ "$pkg" == "fd" ]]; then
                        actual_packages+=("fd-find")
                    else
                        actual_packages+=("$pkg")
                    fi
                    ;;
                *)
                    actual_packages+=("$pkg")
                    ;;
            esac
        done

        if install_packages "$pkg_manager" "${actual_packages[@]}"; then
            print_success "File discovery tools installed"

            # On Debian/Ubuntu, fd is installed as fdfind - create alias
            if [[ "$pkg_manager" == "apt" ]] && command -v fdfind >/dev/null 2>&1 && ! command -v fd >/dev/null 2>&1; then
                print_info "Note: On Debian/Ubuntu, fd is installed as 'fdfind'"
                echo "  Consider adding to your shell config: alias fd=fdfind"
            fi
        else
            print_warning "Failed to install some file discovery tools (non-critical)"
        fi
    else
        print_info "Skipped file discovery tools installation"
        echo ""
        echo "  Manual installation:"
        echo "    macOS:        brew install fd ripgrep"
        echo "    Ubuntu/Debian: sudo apt install fd-find ripgrep"
        echo "    Fedora:       sudo dnf install fd-find ripgrep"
        echo "    Arch:         sudo pacman -S fd ripgrep"
    fi
else
    echo ""
    echo "  Manual installation:"
    echo "    macOS:        brew install fd ripgrep"
    echo "    Ubuntu/Debian: sudo apt install fd-find ripgrep"
    echo "    Fedora:       sudo dnf install fd-find ripgrep"
    echo "    Arch:         sudo pacman -S fd ripgrep"
fi

medium

The code block for printing manual installation instructions is duplicated. This can be refactored to avoid repetition, which improves maintainability. By restructuring the logic, you can have a single block that handles the case where automatic installation is not possible or was skipped by the user.

        local attempt_install=false
        if [[ "$pkg_manager" != "unknown" ]]; then
            read -r -p "Install file discovery tools (${missing_packages[*]}) using $pkg_manager? (y/n): " install_fd_tools
            if [[ "$install_fd_tools" == "y" ]]; then
                attempt_install=true
            fi
        fi

        if [[ "$attempt_install" == "true" ]]; then
            print_info "Installing ${missing_packages[*]}..."
            
            # Handle package name differences across package managers
            local actual_packages=()
            for pkg in "${missing_packages[@]}"; do
                case "$pkg_manager" in
                    apt)
                        # Debian/Ubuntu uses fd-find instead of fd
                        if [[ "$pkg" == "fd" ]]; then
                            actual_packages+=("fd-find")
                        else
                            actual_packages+=("$pkg")
                        fi
                        ;;
                    *)
                        actual_packages+=("$pkg")
                        ;;
                esac
            done
            
            if install_packages "$pkg_manager" "${actual_packages[@]}"; then
                print_success "File discovery tools installed"
                
                # On Debian/Ubuntu, fd is installed as fdfind - create alias
                if [[ "$pkg_manager" == "apt" ]] && command -v fdfind >/dev/null 2>&1 && ! command -v fd >/dev/null 2>&1; then
                    print_info "Note: On Debian/Ubuntu, fd is installed as 'fdfind'"
                    echo "  Consider adding to your shell config: alias fd=fdfind"
                fi
            else
                print_warning "Failed to install some file discovery tools (non-critical)"
            fi
        else
            if [[ "$pkg_manager" != "unknown" ]]; then
                print_info "Skipped file discovery tools installation"
            fi
            echo ""
            echo "  Manual installation:"
            echo "    macOS:        brew install fd ripgrep"
            echo "    Ubuntu/Debian: sudo apt install fd-find ripgrep"
            echo "    Fedora:       sudo dnf install fd-find ripgrep"
            echo "    Arch:         sudo pacman -S fd ripgrep"
        fi
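
An alternative that leaves the original branch structure untouched is to move the repeated instructions into a helper and call it from both places. The function name `print_manual_install_instructions` is hypothetical; the command lines are copied from the instructions already in the PR.

```shell
# Hypothetical helper: print the manual installation hints in one place so the
# "skipped" and "unknown package manager" branches can share them.
print_manual_install_instructions() {
    cat <<'EOF'

  Manual installation:
    macOS:         brew install fd ripgrep
    Ubuntu/Debian: sudo apt install fd-find ripgrep
    Fedora:        sudo dnf install fd-find ripgrep
    Arch:          sudo pacman -S fd ripgrep
EOF
}

# Usage: call it wherever the echo block was duplicated.
print_manual_install_instructions
```

A quoted here-document also avoids one `echo` per line, which makes the column alignment easier to maintain.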
