Codex/add ignorecontent configuration and cli support br7p2k #822

Closed
Ruddy35 wants to merge 3 commits into yamadashy:main from Ruddy35:codex/add-ignorecontent-configuration-and-cli-support-br7p2k

@Ruddy35 Ruddy35 commented Sep 7, 2025

Checklist

  • Run npm run test
  • Run npm run lint

@Ruddy35 Ruddy35 requested a review from yamadashy as a code owner September 7, 2025 11:26

coderabbitai bot commented Sep 7, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

Adds a new ignore-content mechanism across CLI, config, core file collection/processing, and docs. Introduces --ignore-content and ignoreContent config. File collection tags tasks with skipContent to omit reading contents; downstream processing and security paths filter out files without content. Tests and schemas updated; package.json adds prepare script.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| CLI option wiring | `src/cli/cliRun.ts`, `src/cli/actions/defaultAction.ts`, `src/cli/types.ts` | Adds `--ignore-content <patterns>` flag; parses to CliOptions.ignoreContent (string), splits into array in default action, forwards to config merge. |
| Config schema and loading | `src/config/configSchema.ts`, `src/config/configLoad.ts` | Adds ignoreContent to base/default schemas and merged config; defaults to []; merges values from base/file/CLI configs. |
| File collection and worker | `src/core/file/fileCollect.ts`, `src/core/file/workers/fileCollectWorker.ts`, `src/core/file/fileTypes.ts` | Introduces glob-based decision to skip content; sets skipContent per task; worker respects skipContent to avoid reading; RawFile.content becomes optional; exports SkippedFileInfo. |
| File processing pipeline | `src/core/file/fileProcess.ts`, `src/core/file/fileProcessContent.ts`, `src/core/packager.ts` | Filters out files without content before processing; processContent now errors if content is undefined; packager passes only files with content to processing. |
| Security pipeline | `src/core/security/securityCheck.ts`, `src/core/security/validateFileSafety.ts` | Tightens runSecurityCheck to require content on inputs; validateFileSafety filters raw files to those with content before calling security checks. |
| Documentation updates | `README.md`, `website/client/src/*/guide/command-line-options.md`, `website/client/src/*/guide/configuration.md` | Documents new --ignore-content and ignoreContent across README and localized docs; adds examples and defaults. |
| Tests | `tests/cli/actions/defaultAction.test.ts`, `tests/config/configSchema.test.ts`, `tests/core/file/fileCollect.test.ts`, `tests/core/security/securityCheck.test.ts`, `tests/testing/testUtils.ts` | Adds/updates tests for parsing, schema default, collection behavior (patterns, negation, dotfiles), security type expectations, and test utils to include ignoreContent. |
| Build script | `package.json` | Adds prepare script: npm run build. |
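
The processing and security rows above both describe narrowing raw files to those that carry content. A minimal sketch of how that narrowing could look with a TypeScript type guard follows. The `RawFile` shape, the `RawFileWithContent` alias, and the `hasContent` helper are assumptions made for illustration; they are not taken from the PR's code.

```typescript
// Hypothetical shapes for illustration only; the real types are defined in
// src/core/file/fileTypes.ts and may differ.
interface RawFile {
  path: string;
  content?: string; // undefined when the file matched an ignoreContent pattern
}

type RawFileWithContent = RawFile & { content: string };

// Type guard (hypothetical name): narrows RawFile to the variant that carries content.
const hasContent = (file: RawFile): file is RawFileWithContent => file.content !== undefined;

const rawFiles: RawFile[] = [
  { path: 'src/index.ts', content: 'export {};' },
  { path: 'assets/big.json' }, // content was skipped during collection
  { path: 'empty.txt', content: '' }, // empty string still counts as content
];

// filter() with a type guard yields RawFileWithContent[], so downstream
// functions can require `content: string` without extra checks.
const filesWithContent: RawFileWithContent[] = rawFiles.filter(hasContent);
console.log(filesWithContent.map((f) => f.path)); // ['src/index.ts', 'empty.txt']
```

Comparing against `undefined` rather than testing truthiness keeps genuinely empty files in the pipeline, which seems to be the distinction the optional `content` field is meant to express.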

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant CLI as CLI
  participant Config as ConfigLoad/Merge
  participant Collect as FileCollector
  participant Worker as FileCollectWorker
  participant Pack as Packager
  participant Proc as FileProcessor
  participant Sec as SecurityCheck

  User->>CLI: repomix --ignore-content "<patterns>"
  CLI->>Config: buildCliConfig(ignoreContent)
  Config-->>CLI: merged config (include, ignore, ignoreContent)

  CLI->>Collect: collectFiles(config)
  Collect->>Collect: shouldSkipContent(filePath, ignoreContent)
  alt skipContent = true
    Collect->>Worker: { path, skipContent: true }
    Worker-->>Collect: RawFile { path } (no content)
  else skipContent = false
    Collect->>Worker: { path, skipContent: false }
    Worker-->>Collect: RawFile { path, content }
  end
  Collect-->>Pack: rawFiles (some with undefined content)

  Pack->>Pack: filter files with content
  Pack->>Proc: process(files with content)
  Proc-->>Pack: processed files

  Pack->>Sec: runSecurityCheck(files with content)
  Sec-->>Pack: suspicious results
  Pack-->>User: packaged output
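
The collection half of the diagram can be sketched roughly as below. This is a simplified, synchronous stand-in: the task shape, `collectOne`, the in-memory `fakeDisk`, and the `.lock` predicate are all hypothetical, and the worker described in the walkthrough is presumably asynchronous.

```typescript
// Hypothetical task and file shapes, loosely following the diagram.
interface FileCollectTask {
  path: string;
  skipContent: boolean;
}

interface RawFile {
  path: string;
  content?: string;
}

// Stand-in for the worker: reads content unless the task says to skip it.
const collectOne = (task: FileCollectTask, readFile: (p: string) => string): RawFile =>
  task.skipContent ? { path: task.path } : { path: task.path, content: readFile(task.path) };

// Stand-in for the collector: tags each path with skipContent, then delegates.
const collectFiles = (
  paths: string[],
  shouldSkip: (p: string) => boolean,
  readFile: (p: string) => string,
): RawFile[] => paths.map((p) => collectOne({ path: p, skipContent: shouldSkip(p) }, readFile));

// In-memory "disk" so the sketch runs without touching the file system.
const fakeDisk = new Map<string, string>([
  ['src/a.ts', 'const a = 1;'],
  ['yarn.lock', '# very large lock file'],
]);
let reads = 0;
const readFile = (p: string): string => {
  reads += 1;
  const content = fakeDisk.get(p);
  if (content === undefined) throw new Error(`missing file: ${p}`);
  return content;
};

const collected = collectFiles(['src/a.ts', 'yarn.lock'], (p) => p.endsWith('.lock'), readFile);
console.log(collected); // yarn.lock appears with a path but no content
console.log(reads); // 1
```

The point of tagging at the task level is that skipped files are never read at all, so the saving applies to I/O and not only to output size.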

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • yamadashy

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 17f3c73 and a4b0eef.

📒 Files selected for processing (44)
  • README.md (4 hunks)
  • package.json (1 hunks)
  • src/cli/actions/defaultAction.ts (1 hunks)
  • src/cli/cliRun.ts (1 hunks)
  • src/cli/types.ts (1 hunks)
  • src/config/configLoad.ts (1 hunks)
  • src/config/configSchema.ts (2 hunks)
  • src/core/file/fileCollect.ts (2 hunks)
  • src/core/file/fileProcess.ts (2 hunks)
  • src/core/file/fileProcessContent.ts (1 hunks)
  • src/core/file/fileTypes.ts (1 hunks)
  • src/core/file/workers/fileCollectWorker.ts (2 hunks)
  • src/core/packager.ts (1 hunks)
  • src/core/security/securityCheck.ts (1 hunks)
  • src/core/security/validateFileSafety.ts (1 hunks)
  • tests/cli/actions/defaultAction.test.ts (1 hunks)
  • tests/config/configSchema.test.ts (2 hunks)
  • tests/core/file/fileCollect.test.ts (1 hunks)
  • tests/core/security/securityCheck.test.ts (1 hunks)
  • tests/testing/testUtils.ts (1 hunks)
  • website/client/src/de/guide/command-line-options.md (2 hunks)
  • website/client/src/de/guide/configuration.md (2 hunks)
  • website/client/src/en/guide/command-line-options.md (2 hunks)
  • website/client/src/en/guide/configuration.md (2 hunks)
  • website/client/src/es/guide/command-line-options.md (2 hunks)
  • website/client/src/es/guide/configuration.md (2 hunks)
  • website/client/src/fr/guide/command-line-options.md (2 hunks)
  • website/client/src/fr/guide/configuration.md (2 hunks)
  • website/client/src/hi/guide/command-line-options.md (2 hunks)
  • website/client/src/hi/guide/configuration.md (2 hunks)
  • website/client/src/id/guide/command-line-options.md (2 hunks)
  • website/client/src/id/guide/configuration.md (2 hunks)
  • website/client/src/ja/guide/command-line-options.md (2 hunks)
  • website/client/src/ja/guide/configuration.md (2 hunks)
  • website/client/src/ko/guide/command-line-options.md (2 hunks)
  • website/client/src/ko/guide/configuration.md (2 hunks)
  • website/client/src/pt-br/guide/command-line-options.md (2 hunks)
  • website/client/src/pt-br/guide/configuration.md (2 hunks)
  • website/client/src/vi/guide/command-line-options.md (2 hunks)
  • website/client/src/vi/guide/configuration.md (2 hunks)
  • website/client/src/zh-cn/guide/command-line-options.md (2 hunks)
  • website/client/src/zh-cn/guide/configuration.md (2 hunks)
  • website/client/src/zh-tw/guide/command-line-options.md (2 hunks)
  • website/client/src/zh-tw/guide/configuration.md (2 hunks)

@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @Ruddy35, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the file processing capabilities by introducing a mechanism to selectively ignore the content of files based on glob patterns. This allows for more precise control over what data is processed, which can be crucial for managing sensitive information or optimizing performance by skipping large, irrelevant file contents. The feature is integrated into both the command-line interface and the configuration system, complete with robust testing and updated documentation.

Highlights

  • New ignoreContent Feature: Introduced a new configuration option and CLI flag (--ignore-content) to allow users to specify patterns for files whose content should be ignored during processing, while still including their paths.
  • Flexible Pattern Matching: The ignoreContent patterns support glob syntax, including negation (!) to selectively keep content for specific paths within a broader ignored set.
  • Refined File Processing Flow: The internal file collection and processing logic has been updated to handle files with optional content, ensuring that content-intensive operations like security checks only run on files that actually have their content loaded.
  • Comprehensive Documentation: The new feature is fully documented in the README.md and across all localized guide pages for both command-line options and configuration files.
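
To make the CLI side of these highlights concrete, here is a rough sketch of turning a comma-separated `--ignore-content` value into a pattern array. The `splitPatterns` helper shown here is written for this example and is an assumption; the project may split and validate the flag differently.

```typescript
// Hypothetical helper: split a comma-separated flag value into trimmed,
// non-empty patterns. An absent flag yields an empty list.
const splitPatterns = (value: string | undefined): string[] =>
  (value ?? '')
    .split(',')
    .map((pattern) => pattern.trim())
    .filter((pattern) => pattern.length > 0);

// e.g. repomix --ignore-content "docs/**, !docs/README.md,,*.lock"
const cliValue = 'docs/**, !docs/README.md,,*.lock';
const ignoreContent = splitPatterns(cliValue);

console.log(ignoreContent); // ['docs/**', '!docs/README.md', '*.lock']
console.log(splitPatterns(undefined)); // []
```

One caveat with naive comma splitting: a brace pattern such as `src/{a,b}/**` contains a comma itself and would be cut in two, so such patterns are safer in the config file's array form.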

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a valuable --ignore-content feature, allowing files to be included in the project structure without their content. The implementation is thorough, with updates to the CLI, configuration, core logic, tests, and documentation across multiple languages. My review focuses on improving the robustness of the pattern matching logic, removing a small piece of redundant code, and ensuring documentation consistency. I've also noted that the new feature's documentation needs to be translated in the non-English guides.

Comment on lines +34 to +47
const shouldSkipContent = (filePath: string, patterns: string[]): boolean => {
  let skip = false;
  for (const pattern of patterns) {
    const normalizedPattern = normalizeGlobPattern(pattern.startsWith('!') ? pattern.slice(1) : pattern);
    if (pattern.startsWith('!')) {
      if (minimatch(filePath, normalizedPattern, { dot: true })) {
        skip = false;
      }
    } else if (minimatch(filePath, normalizedPattern, { dot: true })) {
      skip = true;
    }
  }
  return skip;
};

high

The current implementation of shouldSkipContent is dependent on the order of patterns in the configuration. For example, with ['!foo', 'foo'], a file named foo will be skipped, but with ['foo', '!foo'], it will not. This can be confusing and lead to unexpected behavior. A more robust approach is to give negated patterns (!) precedence regardless of their order.

  const shouldSkipContent = (filePath: string, patterns: string[]): boolean => {
    // Negated patterns ("!...") should always take precedence to ensure content is included.
    const isExplicitlyIncluded = patterns
      .filter((pattern) => pattern.startsWith('!'))
      .some((pattern) => minimatch(filePath, normalizeGlobPattern(pattern.slice(1)), { dot: true }));

    if (isExplicitlyIncluded) {
      return false; // Do not skip content.
    }

    // If not explicitly included, check if it matches any ignore pattern.
    const isIgnored = patterns
      .filter((pattern) => !pattern.startsWith('!'))
      .some((pattern) => minimatch(filePath, normalizeGlobPattern(pattern), { dot: true }));

    return isIgnored;
  };
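
The difference between the two strategies can be checked with a small self-contained comparison. To avoid depending on minimatch, this sketch injects the matcher and uses exact string equality as a stand-in; `Matcher`, `exactMatch`, `skipLastMatchWins`, and `skipNegationFirst` are names invented for the example.

```typescript
// Matcher is injected so the sketch runs without minimatch; exactMatch is a
// deliberately trivial stand-in for glob matching.
type Matcher = (filePath: string, pattern: string) => boolean;
const exactMatch: Matcher = (filePath, pattern) => filePath === pattern;

// Order-dependent: the last matching pattern decides (same logic as the PR's loop).
const skipLastMatchWins = (filePath: string, patterns: string[], matches: Matcher): boolean => {
  let skip = false;
  for (const pattern of patterns) {
    const negated = pattern.startsWith('!');
    if (matches(filePath, negated ? pattern.slice(1) : pattern)) {
      skip = !negated;
    }
  }
  return skip;
};

// Order-independent: any matching "!" pattern keeps the content (the review suggestion).
const skipNegationFirst = (filePath: string, patterns: string[], matches: Matcher): boolean => {
  const kept = patterns.some((p) => p.startsWith('!') && matches(filePath, p.slice(1)));
  if (kept) return false;
  return patterns.some((p) => !p.startsWith('!') && matches(filePath, p));
};

console.log(skipLastMatchWins('foo', ['!foo', 'foo'], exactMatch)); // true
console.log(skipLastMatchWins('foo', ['foo', '!foo'], exactMatch)); // false
console.log(skipNegationFirst('foo', ['!foo', 'foo'], exactMatch)); // false
console.log(skipNegationFirst('foo', ['foo', '!foo'], exactMatch)); // false
```

Which behavior is preferable is a design choice. Last-match-wins mirrors how `.gitignore` resolves conflicting patterns, so it may already be familiar to users, while negation-first is easier to reason about when patterns are merged from several config sources in an order the user does not control.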

| `output.git.includeLogs` | Apakah akan menyertakan log git dalam output. Menampilkan riwayat commit dengan tanggal, pesan, dan jalur file | `false` |
| `output.git.includeLogsCount` | Jumlah commit log git yang akan disertakan dalam output | `50` |
| `include` | Pola file untuk disertakan menggunakan [pola glob](https://github.com/mrmlnc/fast-glob?tab=readme-ov-file#pattern-syntax) | `[]` |
| `ignoreContent` | Patterns of files whose content should be ignored (using [glob patterns](https://github.com/mrmlnc/fast-glob?tab=readme-ov-file#pattern-syntax)); prefix with `!` to keep specific paths | `[]` |

medium

The description for ignoreContent is in English. It should be translated into Indonesian for consistency.

Suggested change
| `ignoreContent` | Patterns of files whose content should be ignored (using [glob patterns](https://github.com/mrmlnc/fast-glob?tab=readme-ov-file#pattern-syntax)); prefix with `!` to keep specific paths | `[]` |
| `ignoreContent` | Pola file yang kontennya harus diabaikan (menggunakan [pola glob](https://github.com/mrmlnc/fast-glob?tab=readme-ov-file#pattern-syntax)); awali dengan `!` untuk menyimpan path tertentu | `[]` |

## Options de sélection de fichiers
- `--include <patterns>`: Liste des motifs d'inclusion (séparés par des virgules)
- `-i, --ignore <patterns>`: Motifs d'ignorance supplémentaires (séparés par des virgules)
- `--ignore-content <patterns>`: Skip file content for matched patterns (comma-separated; prefix with `!` to keep specific paths)

medium

The description for --ignore-content is in English. It should be translated into French for consistency.

Suggested change
- `--ignore-content <patterns>`: Skip file content for matched patterns (comma-separated; prefix with `!` to keep specific paths)
- `--ignore-content <patterns>`: Motifs pour ignorer le contenu des fichiers correspondants (séparés par des virgules ; préfixer avec `!` pour conserver des chemins spécifiques)

## Dateiauswahloptionen
- `--include <patterns>`: Liste der Einschlussmuster (kommagetrennt)
- `-i, --ignore <patterns>`: Zusätzliche Ignoriermuster (kommagetrennt)
- `--ignore-content <patterns>`: Skip file content for matched patterns (comma-separated; prefix with `!` to keep specific paths)

medium

The description for --ignore-content is in English. It should be translated into German for consistency with the rest of the document.

Suggested change
- `--ignore-content <patterns>`: Skip file content for matched patterns (comma-separated; prefix with `!` to keep specific paths)
- `--ignore-content <patterns>`: Muster zum Überspringen von Dateiinhalten für übereinstimmende Muster (kommagetrennt; mit `!` voranstellen, um bestimmte Pfade beizubehalten)

| `output.git.includeLogs` | Ob Git-Logs in der Ausgabe enthalten sein sollen. Zeigt Commit-Historie mit Daten, Nachrichten und Dateipfaden an | `false` |
| `output.git.includeLogsCount` | Anzahl der Git-Log-Commits, die in die Ausgabe einbezogen werden sollen | `50` |
| `include` | Zu einschließende Dateimuster (verwendet [glob-Muster](https://github.com/mrmlnc/fast-glob?tab=readme-ov-file#pattern-syntax)) | `[]` |
| `ignoreContent` | Patterns of files whose content should be ignored (using [glob patterns](https://github.com/mrmlnc/fast-glob?tab=readme-ov-file#pattern-syntax)); prefix with `!` to keep specific paths | `[]` |

medium

The description for ignoreContent is in English. It should be translated into German for consistency.

Suggested change
| `ignoreContent` | Patterns of files whose content should be ignored (using [glob patterns](https://github.com/mrmlnc/fast-glob?tab=readme-ov-file#pattern-syntax)); prefix with `!` to keep specific paths | `[]` |
| `ignoreContent` | Muster von Dateien, deren Inhalt ignoriert werden soll (verwendet [glob-Muster](https://github.com/mrmlnc/fast-glob?tab=readme-ov-file#pattern-syntax)); mit `!` voranstellen, um bestimmte Pfade beizubehalten | `[]` |

| `output.git.includeLogs` | Có nên bao gồm nhật ký git trong đầu ra hay không. Hiển thị lịch sử commit với ngày tháng, thông điệp và đường dẫn tệp | `false` |
| `output.git.includeLogsCount` | Số lượng commit git logs để bao gồm trong đầu ra | `50` |
| `include` | Các mẫu file để bao gồm sử dụng [mẫu glob](https://github.com/mrmlnc/fast-glob?tab=readme-ov-file#pattern-syntax) | `[]` |
| `ignoreContent` | Patterns of files whose content should be ignored (using [glob patterns](https://github.com/mrmlnc/fast-glob?tab=readme-ov-file#pattern-syntax)); prefix with `!` to keep specific paths | `[]` |

medium

The description for ignoreContent is in English. It should be translated into Vietnamese for consistency.

Suggested change
| `ignoreContent` | Patterns of files whose content should be ignored (using [glob patterns](https://github.com/mrmlnc/fast-glob?tab=readme-ov-file#pattern-syntax)); prefix with `!` to keep specific paths | `[]` |
| `ignoreContent` | Các mẫu tệp có nội dung cần bỏ qua (sử dụng [mẫu glob](https://github.com/mrmlnc/fast-glob?tab=readme-ov-file#pattern-syntax)); tiền tố `!` để giữ lại các đường dẫn cụ thể | `[]` |

## 文件选择选项
- `--include <patterns>`: 包含模式列表(逗号分隔)
- `-i, --ignore <patterns>`: 附加忽略模式(逗号分隔)
- `--ignore-content <patterns>`: Skip file content for matched patterns (comma-separated; prefix with `!` to keep specific paths)

medium

The description for --ignore-content is in English. It should be translated into Simplified Chinese for consistency.

Suggested change
- `--ignore-content <patterns>`: Skip file content for matched patterns (comma-separated; prefix with `!` to keep specific paths)
- `--ignore-content <patterns>`: 跳过匹配模式的文件内容(逗号分隔;以`!`为前缀保留特定路径)

| `output.git.includeLogs` | 是否在输出中包含Git日志。显示提交历史的日期、消息和文件路径 | `false` |
| `output.git.includeLogsCount` | 要包含的Git日志提交数量。限制历史深度以了解开发模式 | `50` |
| `include` | 要包含的文件模式(使用[glob模式](https://github.com/mrmlnc/fast-glob?tab=readme-ov-file#pattern-syntax)) | `[]` |
| `ignoreContent` | Patterns of files whose content should be ignored (using [glob patterns](https://github.com/mrmlnc/fast-glob?tab=readme-ov-file#pattern-syntax)); prefix with `!` to keep specific paths | `[]` |

medium

The description for ignoreContent is in English. It should be translated into Simplified Chinese for consistency.

Suggested change
| `ignoreContent` | Patterns of files whose content should be ignored (using [glob patterns](https://github.com/mrmlnc/fast-glob?tab=readme-ov-file#pattern-syntax)); prefix with `!` to keep specific paths | `[]` |
| `ignoreContent` | 其内容应被忽略的文件的模式(使用[glob模式](https://github.com/mrmlnc/fast-glob?tab=readme-ov-file#pattern-syntax));以`!`为前缀保留特定路径 | `[]` |

## 檔案選擇選項
- `--include <patterns>`: 包含模式清單(逗號分隔)
- `-i, --ignore <patterns>`: 附加忽略模式(逗號分隔)
- `--ignore-content <patterns>`: Skip file content for matched patterns (comma-separated; prefix with `!` to keep specific paths)

medium

The description for --ignore-content is in English. It should be translated into Traditional Chinese for consistency.

Suggested change
- `--ignore-content <patterns>`: Skip file content for matched patterns (comma-separated; prefix with `!` to keep specific paths)
- `--ignore-content <patterns>`: 跳過符合模式的檔案內容(逗號分隔;以`!`為前綴以保留特定路徑)

| `output.git.includeLogs` | 是否在輸出中包含Git記錄。顯示提交歷史包括日期、訊息和檔案路徑 | `false` |
| `output.git.includeLogsCount` | 在輸出中包含的git記錄提交數量 | `50` |
| `include` | 要包含的檔案模式(使用[glob模式](https://github.com/mrmlnc/fast-glob?tab=readme-ov-file#pattern-syntax)) | `[]` |
| `ignoreContent` | Patterns of files whose content should be ignored (using [glob patterns](https://github.com/mrmlnc/fast-glob?tab=readme-ov-file#pattern-syntax)); prefix with `!` to keep specific paths | `[]` |

medium

The description for ignoreContent is in English. It should be translated into Traditional Chinese for consistency.

Suggested change
| `ignoreContent` | Patterns of files whose content should be ignored (using [glob patterns](https://github.com/mrmlnc/fast-glob?tab=readme-ov-file#pattern-syntax)); prefix with `!` to keep specific paths | `[]` |
| `ignoreContent` | 其內容應被忽略的檔案的模式(使用[glob模式](https://github.com/mrmlnc/fast-glob?tab=readme-ov-file#pattern-syntax));以`!`為前綴以保留特定路徑 | `[]` |

@Ruddy35 Ruddy35 closed this Sep 7, 2025
@Ruddy35 Ruddy35 deleted the codex/add-ignorecontent-configuration-and-cli-support-br7p2k branch September 7, 2025 11:32