@daniel-lxs (Member) commented Sep 5, 2025

Description

This PR fixes issue #7588, where codebase indexing fails with a "Maximum call stack size exceeded" error when scanning large projects with 200k+ files.

Problem

When indexing large codebases, the recursive directory scan in the listFilteredDirectories function had no bound on how far it recursed, so it could overflow the call stack.

Solution

  • Added a limit parameter to listFilteredDirectories to control the maximum number of directories scanned
  • Implemented early termination when the limit is reached during recursive scanning
  • The limit is calculated as the remaining capacity after accounting for files found
  • Added comprehensive tests to verify the fix handles large projects without stack overflow
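
The early-termination idea in the list above can be sketched as follows. This is a minimal illustration, not the code from list-files.ts: the DirNode type and the collectDirectories function are hypothetical names, and an in-memory tree stands in for the file system so the example runs on its own.

```typescript
// Hypothetical in-memory stand-in for a directory tree (assumption: the real
// implementation reads the file system instead).
interface DirNode {
  name: string
  children: DirNode[]
}

// Collects directory paths depth-first and stops as soon as `limit` paths
// have been gathered. `dirCount` mirrors the counter named in the PR text.
function collectDirectories(root: DirNode, limit: number): string[] {
  const results: string[] = []
  let dirCount = 0

  // Returns false once the limit is reached so that callers stop
  // iterating over the remaining siblings as well.
  function visit(node: DirNode, prefix: string): boolean {
    for (const child of node.children) {
      if (dirCount >= limit) return false
      const path = `${prefix}/${child.name}`
      results.push(path)
      dirCount++
      if (!visit(child, path)) return false
    }
    return true
  }

  visit(root, root.name)
  return results
}
```

With a limit of 0 the function returns an empty list without descending into any child, and with any positive limit the result never holds more than that many entries.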

Changes

  • Modified src/services/glob/list-files.ts:
    • Added limit parameter to listFilteredDirectories function
    • Added dirCount tracking and early termination logic
    • Updated listFiles to pass remaining limit to directory scanning
  • Added src/services/glob/__tests__/list-files-limit.spec.ts:
    • Tests for stack overflow prevention
    • Tests for early termination when limit is reached
    • Tests for proper limit distribution between files and directories
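
The limit distribution between files and directories might look roughly like the sketch below. Both helper names are hypothetical, and the choice to keep files before directories is an assumption made for illustration; the PR text only states that directories receive the capacity left over after files are counted.

```typescript
// Hypothetical helper: number of directory slots left once files are counted.
// Never returns a negative value, so a full file list leaves zero slots.
function remainingDirectoryLimit(totalLimit: number, fileCount: number): number {
  return Math.max(0, totalLimit - fileCount)
}

// Hypothetical helper: combines files and directories without exceeding
// totalLimit items overall. Assumption: files take priority over directories.
function combineWithLimit(files: string[], dirs: string[], totalLimit: number): string[] {
  const cap = Math.max(0, totalLimit) // guard against negative limits
  const keptFiles = files.slice(0, cap)
  const dirLimit = remainingDirectoryLimit(cap, keptFiles.length)
  return [...keptFiles, ...dirs.slice(0, dirLimit)]
}
```

When the file count already equals the limit, remainingDirectoryLimit yields 0 and no directories are included.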

Testing

All tests pass ✅

cd src && npx vitest run services/glob/__tests__/list-files-limit.spec.ts

Fixes #7588


Important

Fixes stack overflow in listFilteredDirectories by adding a limit parameter and tests for large project indexing.

  • Behavior:
    • Fixes stack overflow in listFilteredDirectories by adding a limit parameter that caps the number of directories scanned.
    • Implements early termination when the limit is reached.
    • Adjusts listFiles to pass remaining limit to directory scanning.
  • Testing:
    • Adds list-files-limit.spec.ts with tests for stack overflow prevention, early termination, and limit distribution.
  • Misc:
    • Updates list-files.ts to track dirCount and handle limits in listFilteredDirectories.

This description was created by Ellipsis for 8b8ef00.

- Add limit parameter to listFilteredDirectories to control directory scanning depth
- Implement early termination when limit is reached during recursive scanning
- Pass remaining limit from listFiles to ensure total items don't exceed MAX_LIST_FILES_LIMIT_CODE_INDEX
- Add comprehensive tests to verify the fix handles projects with 200k+ items

Fixes #7588
@dosubot bot added the size:L (this PR changes 100-499 lines, ignoring generated files) and bug labels Sep 5, 2025
@hannesrudolph added the Issue/PR - Triage label Sep 5, 2025
@roomote roomote bot left a comment

Thank you for your contribution! I've reviewed the changes and the implementation correctly addresses the stack overflow issue. The solution elegantly prevents the problem by adding early termination when limits are reached.

Positive observations:

  • The fix properly prevents stack overflow by adding limit checks and early termination
  • Comprehensive test coverage for the main scenarios
  • The limit distribution between files and directories is well thought out
  • Maintains backward compatibility with existing code

Suggestions for improvement:

  1. Potential race condition in limit checking: The limit is checked before processing each directory entry, but dirCount is incremented after adding to results. While not currently an issue since this runs synchronously, incrementing dirCount atomically with the limit check could prevent any potential race conditions in future concurrent scenarios.

  2. Missing edge case handling: While zero limit is handled, negative limits could cause unexpected behavior. Consider adding validation to ensure limit is non-negative.

  3. Documentation improvement: The JSDoc for listFilteredDirectories doesn't mention that the limit parameter represents the maximum number of directories to return.

  4. Performance consideration: Using Number.MAX_SAFE_INTEGER instead of Infinity might be more explicit about the intent.

  5. Test coverage: Consider adding a test case where the file count exactly equals the limit to ensure directories are completely excluded in that edge case.
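
Suggestions 2 and 4 could be combined in a small normalization step, sketched below. The normalizeLimit function is a hypothetical name that does not appear in the PR, and throwing on invalid input is only one possible policy; clamping negative values to 0 would be another.

```typescript
// Hypothetical helper: turns an optional limit into a safe non-negative integer.
// - undefined means "no limit" and maps to Number.MAX_SAFE_INTEGER
// - NaN and negative values are rejected
// - Infinity and oversized values are clamped to Number.MAX_SAFE_INTEGER
function normalizeLimit(limit?: number): number {
  if (limit === undefined) return Number.MAX_SAFE_INTEGER
  if (Number.isNaN(limit) || limit < 0) {
    throw new RangeError(`limit must be a non-negative number, got ${limit}`)
  }
  return Math.min(Math.floor(limit), Number.MAX_SAFE_INTEGER)
}
```

Normalizing once at the entry point would let the scanning loop compare dirCount against a plain integer without further checks.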

… limit

Using MAX_SAFE_INTEGER is more explicit and safer than Infinity for numeric operations
@dosubot bot added the lgtm (this PR has been approved by a maintainer) label Sep 5, 2025
@mrubens mrubens merged commit 0510c03 into main Sep 5, 2025
9 checks passed
@mrubens mrubens deleted the fix/issue-7588-codebase-indexing-limit branch September 5, 2025 18:51
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Sep 5, 2025
@github-project-automation github-project-automation bot moved this from Triage to Done in Roo Code Roadmap Sep 5, 2025
Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Codebase indexing large project errors

4 participants