-
-
Notifications
You must be signed in to change notification settings - Fork 541
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Code compression #336
Code compression #336
Conversation
📝 WalkthroughWalkthroughThe changes introduce several new features and modifications across the project. Two new dependencies, Sequence Diagram(s)sequenceDiagram
participant CLI as CLI
participant Config as buildCliConfig
participant Worker as fileProcessWorker
participant Parser as parseFile
CLI->>Config: Run command with --compress option
Config-->>CLI: Return configuration with compress:true
CLI->>Worker: Execute file processing with config
alt compress is true
Worker->>Parser: Call parseFile(content, filePath, config)
Parser-->>Worker: Return parsed/compressed content
else compress is false
Worker->>Worker: Process content with default formatting
end
sequenceDiagram
participant App as Caller
participant LP as LanguageParser
participant LL as loadLanguage
App->>LP: init()
LP->>LP: Execute Parser.init()
App->>LP: getParserForLang(language)
alt Parser not cached
LP->>LL: prepareLang(language)
LL-->>LP: Return Parser.Language instance
else Parser cached
LP-->>App: Return cached parser instance
end
App->>LP: getQueryForLang(language)
LP-->>App: Return corresponding query instance
Possibly related PRs
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Nitpick comments (30)
src/core/tree-sitter/LanguageParser.ts (2)
24-30
: ValidateloadLanguage
error handling.
IfloadLanguage(name)
fails or throws, consider wrapping this logic in a try/catch to handle the error more gracefully, rather than allowing the exception to bubble up.
56-60
: Prevent re-initialization race conditions.
Calls togetReady()
in parallel might cause potential race conditions if invoked from different parts of the code. Even thoughParser.init()
is guarded byisParserInitialized
, consider using an internal Promise or a synchronization mechanism to ensure graceful handling if multiple calls take place concurrently.src/core/tree-sitter/queries/python.ts (1)
5-11
: Query enhancements for nested classes or decorators.
While this query captures class and function definitions, consider whether nested classes or decorated functions also need explicit handling. Tree-sitter queries can be extended to match additional constructs if needed.src/core/tree-sitter/queries/php.ts (1)
6-15
: Distinguish methods from functions.
Currently, both function definitions and method declarations are tagged as@definition.function
. If you need to differentiate methods, consider using a distinct label for the method declaration. For example:- name: (name) @name.definition.function) @definition.function + name: (name) @name.definition.method) @definition.methodsrc/core/tree-sitter/ext2lang.ts (1)
1-20
: Consider adding support for additional file extensions.The mapping is comprehensive but could be enhanced by adding support for more file extensions:
- JavaScript:
.mjs
(ES modules)- TypeScript:
.cts
,.mts
(modules)- C++:
.cc
,.hxx
,.cxx
- Objective-C/C++:
.m
,.mm
- Other languages:
.scala
,.kt
,.ex
,.exs
,.dart
,.lua
,.r
,.pl
,.pm
,.sh
,.bash
src/core/tree-sitter/queries/java.ts (1)
1-15
: Consider adding queries for additional Java constructs.The queries cover basic Java constructs but could be enhanced by adding support for:
- Enum declarations
- Record declarations (Java 16+)
- Annotation declarations
- Constructor declarations
- Field declarations
Example queries:
(enum_declaration name: (identifier) @name.definition.enum) @definition.enum (record_declaration name: (identifier) @name.definition.record) @definition.record (annotation_declaration name: (identifier) @name.definition.annotation) @definition.annotation (constructor_declaration name: (identifier) @name.definition.constructor) @definition.constructor (field_declaration declarator: (variable_declarator name: (identifier) @name.definition.field)) @definition.field
src/core/tree-sitter/queries/rust.ts (1)
1-16
: Consider adding queries for additional Rust constructs.The queries cover basic Rust constructs but could be enhanced by adding support for:
- Enum declarations
- Trait declarations
- Impl blocks
- Type aliases
- Const declarations
- Static declarations
Example queries:
(enum_item name: (type_identifier) @name.definition.enum) @definition.enum (trait_item name: (type_identifier) @name.definition.trait) @definition.trait (impl_item) @definition.impl (type_item name: (type_identifier) @name.definition.type) @definition.type (const_item name: (identifier) @name.definition.const) @definition.const (static_item name: (identifier) @name.definition.static) @definition.static
src/core/tree-sitter/loadLanguage.ts (1)
10-12
: Consider making the WASM path configurable.The path to WASM modules is hard-coded and lacks validation.
Consider:
- Making the path configurable through environment variables or configuration files.
- Adding validation to ensure the WASM file exists before attempting to load it.
Example implementation:
import fs from 'node:fs/promises'; const WASM_PATH = process.env.TREE_SITTER_WASM_PATH || path.join('node_modules', 'tree-sitter-wasms', 'out'); async function getWasmPath(langName: string): Promise<string> { const wasmPath = path.join(WASM_PATH, `tree-sitter-${langName}.wasm`); try { await fs.access(wasmPath); return wasmPath; } catch { throw new Error(`WASM file not found for language ${langName}`); } }src/core/tree-sitter/queries/c.ts (1)
1-6
: Enhance documentation with query syntax examples.The documentation lists the types of declarations but could be more helpful with examples of the C code patterns that each query matches.
Add examples like:
/* - struct declarations + Example: struct Point { int x, y; } - union declarations + Example: union Data { int i; float f; } - function declarations + Example: void foo(int x); - typedef declarations + Example: typedef int Integer; */src/core/tree-sitter/queries/c_sharp.ts (2)
1-6
: Enhance documentation with query syntax examples.The documentation lists the types of declarations but could be more helpful with examples of the C# code patterns that each query matches.
Add examples like:
/* - class declarations + Example: public class MyClass { } - interface declarations + Example: public interface IMyInterface { } - method declarations + Example: public void MyMethod() { } - namespace declarations + Example: namespace MyNamespace { } */
8-22
: Add support for additional C# declarations.The current implementation covers basic declarations but misses some important C# language constructs.
Consider adding patterns for:
- Property declarations (e.g.,
public int MyProperty { get; set; }
)- Event declarations (e.g.,
public event EventHandler MyEvent;
)- Delegate declarations (e.g.,
public delegate void MyDelegate();
)- Enum declarations (e.g.,
public enum MyEnum { }
)export default ` (class_declaration name: (identifier) @name.definition.class ) @definition.class (interface_declaration name: (identifier) @name.definition.interface ) @definition.interface (method_declaration name: (identifier) @name.definition.method ) @definition.method (namespace_declaration name: (identifier) @name.definition.module ) @definition.module + +(property_declaration + name: (identifier) @name.definition.property +) @definition.property + +(event_field_declaration + name: (identifier) @name.definition.event +) @definition.event + +(delegate_declaration + name: (identifier) @name.definition.delegate +) @definition.delegate + +(enum_declaration + name: (identifier) @name.definition.enum +) @definition.enum `;src/core/tree-sitter/lang2Query.ts (2)
1-14
: Consider using TypeScript's type system for imports.The imports look good but could benefit from TypeScript's type system.
Consider adding a type for the query functions:
+type QueryFunction = string; + import { - cQuery, + cQuery: QueryFunction, - cppQuery, + cppQuery: QueryFunction, // ... rest of the imports } from './queries/index.js';
16-29
: Consider making the mapping type-safe.The language-to-query mapping looks good but could benefit from TypeScript's type system.
Consider adding type safety:
+type SupportedLanguage = + | 'javascript' + | 'typescript' + | 'c' + | 'cpp' + | 'python' + | 'rust' + | 'go' + | 'c_sharp' + | 'ruby' + | 'java' + | 'php' + | 'swift'; + -export const lang2Query = { +export const lang2Query: Record<SupportedLanguage, QueryFunction> = { javascript: javascriptQuery, typescript: typescriptQuery, // ... rest of the mappings };src/core/tree-sitter/queries/index.ts (1)
1-12
: Consider using TypeScript's module resolution.The exports look good but could benefit from TypeScript's module resolution.
Consider:
- Using TypeScript's module resolution by removing the
.js
extension- Adding type information for the exports
-export { default as phpQuery } from './php.js'; +export { default as phpQuery } from './php'; +export type { QueryFunction } from './types'; // ... rest of the exportsAlso, consider adding JSDoc comments to document the purpose of each query:
+/** Query for extracting PHP declarations */ export { default as phpQuery } from './php.js'; +/** Query for extracting TypeScript declarations */ export { default as typescriptQuery } from './typescript.js'; // ... rest of the exportssrc/core/tree-sitter/queries/go.ts (1)
25-26
: Consider adding more Go-specific patterns.While the current patterns are correct, consider adding patterns for other Go constructs to make the query more comprehensive:
- Interface declarations
- Struct declarations
- Constant and variable declarations
Example additions:
+(interface_declaration + name: (type_identifier) @name.definition.interface) @definition.interface + +(struct_declaration + name: (type_identifier) @name.definition.class) @definition.class + +(const_declaration + (const_spec name: (identifier) @name.definition.constant)) @definition.constantsrc/core/tree-sitter/queries/cpp.ts (2)
10-10
: Consider using distinct captures for structs and unions.Currently, both structs and unions are captured as
@definition.class
. Consider using more specific captures to differentiate them:-(struct_specifier name: (type_identifier) @name.definition.class body:(_)) @definition.class +(struct_specifier name: (type_identifier) @name.definition.struct body:(_)) @definition.struct -(declaration type: (union_specifier name: (type_identifier) @name.definition.class)) @definition.class +(declaration type: (union_specifier name: (type_identifier) @name.definition.union)) @definition.unionAlso applies to: 12-12
1-23
: Consider adding more C++-specific patterns.Consider adding patterns for other C++ constructs to make the query more comprehensive:
+// Namespace declarations +(namespace_definition + name: (identifier) @name.definition.namespace) @definition.namespace + +// Template declarations +(template_declaration + (class_specifier name: (type_identifier) @name.definition.class)) @definition.class + +// Constructor declarations +(function_declarator + declarator: (qualified_identifier + name: (identifier) @name.definition.constructor)) @definition.constructor + (#match? @name.definition.constructor "^[A-Z]") + +// Destructor declarations +(function_declarator + declarator: (qualified_identifier + name: (identifier) @name.definition.destructor)) @definition.destructor + (#match? @name.definition.destructor "^~")src/core/tree-sitter/queries/swift.ts (1)
1-45
: Consider adding more Swift-specific patterns.Consider adding patterns for other Swift constructs to make the query more comprehensive:
+// Enum declarations +(enum_declaration + name: (type_identifier) @name.definition.enum) @definition.enum + +// Extension declarations +(extension_declaration + name: (type_identifier) @name.definition.extension) @definition.extension + +// Struct declarations +(struct_declaration + name: (type_identifier) @name.definition.struct) @definition.struct + +// Operator declarations +(operator_declaration + name: (custom_operator) @name.definition.operator) @definition.operatorsrc/core/tree-sitter/queries/typescript.ts (1)
1-37
: Consider adding more TypeScript-specific patterns.Consider adding patterns for other TypeScript constructs to make the query more comprehensive:
+// Interface declarations +(interface_declaration + name: (type_identifier) @name.definition.interface) @definition.interface + +// Type alias declarations +(type_alias_declaration + name: (type_identifier) @name.definition.type) @definition.type + +// Enum declarations +(enum_declaration + name: (identifier) @name.definition.enum) @definition.enum + +// Decorator declarations +(decorator + "@" @punctuation.decorator + ((identifier) @name.definition.decorator)) @definition.decoratortests/core/tree-sitter/LanguageParser.test.ts (2)
13-22
: Add error handling test cases forgetReady
.Consider adding test cases to verify error handling when
Parser.init
fails.describe('getReady', () => { it('should initialize the parser', async () => { // Mock Parser.init to avoid actual initialization during testing Parser.init = vi.fn().mockResolvedValue(undefined); await parser.getReady(); expect(Parser.init).toHaveBeenCalled(); }); + + it('should handle initialization errors', async () => { + // Mock Parser.init to simulate failure + Parser.init = vi.fn().mockRejectedValue(new Error('Init failed')); + + await expect(parser.getReady()).rejects.toThrow('Init failed'); + }); });
24-38
: Add edge cases forguessTheLang
.Consider adding test cases for:
- Files with multiple extensions (e.g.,
file.test.js
)- Case sensitivity in file extensions (e.g.,
.JS
)- Files with no extension
describe('guessTheLang', () => { it('should return the correct language based on file extension', () => { const filePath = 'file.js'; const lang = parser.guessTheLang(filePath); expect(lang).toBe('javascript'); }); it('should return undefined for unsupported extensions', () => { const filePath = 'file.txt'; const lang = parser.guessTheLang(filePath); expect(lang).toBeUndefined(); }); + + it('should handle files with multiple extensions', () => { + const filePath = 'file.test.js'; + const lang = parser.guessTheLang(filePath); + + expect(lang).toBe('javascript'); + }); + + it('should be case-insensitive for file extensions', () => { + const filePath = 'file.JS'; + const lang = parser.guessTheLang(filePath); + + expect(lang).toBe('javascript'); + }); + + it('should handle files without extension', () => { + const filePath = 'Dockerfile'; + const lang = parser.guessTheLang(filePath); + + expect(lang).toBeUndefined(); + }); });src/core/tree-sitter/queries/ruby.ts (2)
1-5
: Enhance documentation for Tree-sitter query.Consider adding more detailed comments explaining:
- The purpose and behavior of
#strip!
and#select-adjacent!
predicates- The structure of captured nodes
- Examples of Ruby code that matches each pattern
/* - method definitions (including singleton methods and aliases, with associated comments) - class definitions (including singleton classes, with associated comments) - module definitions + + Query predicates: + - #strip!: Removes leading comment characters and whitespace + - #select-adjacent!: Associates comments with their corresponding definitions + + Example matches: + ```ruby + # This is a method comment + def foo(bar) + # Method body + end + + # This is a class comment + class MyClass + # Class body + end + + # This is a module comment + module MyModule + # Module body + end + ``` */
6-52
: Consider adding support for more Ruby-specific constructs.The query could be enhanced to capture:
- Module functions (
module_function
)- Include/extend statements
- Attribute accessors (
attr_*
)export default ` ( (comment)* @doc . [ (method name: (_) @name.definition.method) @definition.method (singleton_method name: (_) @name.definition.method) @definition.method ] (#strip! @doc "^#\\s*") (#select-adjacent! @doc @definition.method) ) (alias name: (_) @name.definition.method) @definition.method ( (comment)* @doc . [ (class name: [ (constant) @name.definition.class (scope_resolution name: (_) @name.definition.class) ]) @definition.class (singleton_class value: [ (constant) @name.definition.class (scope_resolution name: (_) @name.definition.class) ]) @definition.class ] (#strip! @doc "^#\\s*") (#select-adjacent! @doc @definition.class) ) ( (module name: [ (constant) @name.definition.module (scope_resolution name: (_) @name.definition.module) ]) @definition.module ) + +( + (comment)* @doc + . + (call + method: (identifier) @name + (#match? @name "^(attr_reader|attr_writer|attr_accessor)$") + ) @definition.attribute + (#strip! @doc "^#\\s*") + (#select-adjacent! @doc @definition.attribute) +) + +( + (comment)* @doc + . + (call + method: (identifier) @name + (#eq? @name "module_function") + ) @definition.module_function + (#strip! @doc "^#\\s*") + (#select-adjacent! @doc @definition.module_function) +) `;src/core/tree-sitter/queries/javascript.ts (1)
1-6
: Enhance documentation and add TypeScript support.Consider:
- Adding examples of code that matches each pattern
- Adding support for TypeScript-specific constructs
- Adding support for decorators and static methods
/* - class definitions - method definitions - named function declarations - arrow functions and function expressions assigned to variables + + Examples: + ```typescript + // Class with decorators and static methods + @decorator + class Example { + static staticMethod() {} + method() {} + } + + // TypeScript interface + interface IExample { + method(): void; + } + + // TypeScript type alias + type Handler = (event: Event) => void; + ``` + + Note: This query currently supports JavaScript. For TypeScript support, + additional patterns for interfaces, type aliases, and decorators are needed. */src/core/tree-sitter/parseFile.ts (2)
51-53
: Enhance error handling and logging.The current error handling is too generic. Consider:
- Using specific error types
- Adding error context
- Using a proper logging framework
- } catch (error) { - console.log(`Error parsing file: ${error}\n`); + } catch (error: unknown) { + const context = { + filePath, + language: lang, + error: error instanceof Error ? error.message : String(error) + }; + console.error('Error parsing file:', context); + throw new Error(`Failed to parse ${filePath}: ${context.error}`);
5-5
: Consider renaming the variable for clarity.The variable name
magicBlob
is not descriptive of its purpose.- const magicBlob = new LanguageParser(); + const languageParser = new LanguageParser();src/cli/cliRun.ts (1)
34-34
: Enhance the CLI option description.The description "remove everything except definitions" could be more specific about what constitutes a "definition" and which file types are supported.
- .option('--only-defs', 'remove everything except definitions') + .option('--only-defs', 'extract only type definitions, interfaces, and classes from TypeScript files')Also applies to: 57-57
src/core/file/fileProcess.ts (2)
8-8
: Consider using a more descriptive function name.The name
getFn_parseFile
is not very descriptive and doesn't follow TypeScript naming conventions.-import { getFn_parseFile } from '../tree-sitter/parseFile.js'; +import { getTypeScriptParser } from '../tree-sitter/parseFile.js';
59-61
: Add progress indication for parsing operation.The parsing operation could be time-consuming for large files, but there's no progress indication.
if (config.output.onlyDefs && filePath.endsWith('.ts')) { + logger.trace(`Parsing definitions from ${filePath}`); const parseFile = await getFn_parseFile(); processedContent = (await parseFile(processedContent, filePath, config)) ?? processedContent; + logger.trace(`Completed parsing definitions from ${filePath}`); }tests/core/tree-sitter/parseFile.test.ts (1)
5-192
: Consider enhancing the test coverage.While the test suite provides good coverage for basic parsing functionality across multiple languages, consider adding:
- Error cases (e.g., invalid syntax, unsupported file extensions)
- Tests for the
onlyDefs
configuration- More complex code examples that test nested functions, classes, and methods
Example test case for
onlyDefs
:+test('should respect onlyDefs configuration', async () => { + const fileContent = ` + // A comment that should be ignored + function helper() { return 42; } + function main() { return helper(); } + `; + const filePath = 'dummy.js'; + const config = { output: { onlyDefs: true } }; + const parseFile = await getFn_parseFile(); + const result = await parseFile(fileContent, filePath, config as RepomixConfigMerged); + expect(typeof result).toBe('string'); + expect(result).toContain('helper'); + expect(result).toContain('main'); + expect(result).not.toContain('comment'); +});
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
package-lock.json
is excluded by!**/package-lock.json
📒 Files selected for processing (29)
package.json
(1 hunks)repomix.config.json
(1 hunks)src/cli/actions/defaultAction.ts
(1 hunks)src/cli/cliRun.ts
(2 hunks)src/config/configSchema.ts
(2 hunks)src/core/file/fileProcess.ts
(2 hunks)src/core/tree-sitter/LanguageParser.ts
(1 hunks)src/core/tree-sitter/ext2lang.ts
(1 hunks)src/core/tree-sitter/lang2Query.ts
(1 hunks)src/core/tree-sitter/loadLanguage.ts
(1 hunks)src/core/tree-sitter/parseFile.ts
(1 hunks)src/core/tree-sitter/queries/c.ts
(1 hunks)src/core/tree-sitter/queries/c_sharp.ts
(1 hunks)src/core/tree-sitter/queries/cpp.ts
(1 hunks)src/core/tree-sitter/queries/go.ts
(1 hunks)src/core/tree-sitter/queries/index.ts
(1 hunks)src/core/tree-sitter/queries/java.ts
(1 hunks)src/core/tree-sitter/queries/javascript.ts
(1 hunks)src/core/tree-sitter/queries/php.ts
(1 hunks)src/core/tree-sitter/queries/python.ts
(1 hunks)src/core/tree-sitter/queries/ruby.ts
(1 hunks)src/core/tree-sitter/queries/rust.ts
(1 hunks)src/core/tree-sitter/queries/swift.ts
(1 hunks)src/core/tree-sitter/queries/typescript.ts
(1 hunks)tests/cli/actions/defaultAction.test.ts
(1 hunks)tests/cli/cliRun.test.ts
(2 hunks)tests/config/configSchema.test.ts
(3 hunks)tests/core/tree-sitter/LanguageParser.test.ts
(1 hunks)tests/core/tree-sitter/parseFile.test.ts
(1 hunks)
🔇 Additional comments (19)
src/core/tree-sitter/LanguageParser.ts (1)
20-22
: Consider edge cases with files lacking extensions.
Currently,getFileExtension
returns an empty string for files with no extension or a trailing dot. Ensure that calling workflows handle these edge cases gracefully.src/core/tree-sitter/queries/c.ts (1)
8-14
: Verify query patterns against Tree-sitter C grammar.The query patterns look correct but should be verified against the Tree-sitter C grammar to ensure they capture all valid C declaration forms.
src/core/tree-sitter/queries/go.ts (2)
8-14
: LGTM! Function declaration pattern looks good.The pattern correctly captures function names and associated comments, with proper stripping of comment whitespace.
17-23
: LGTM! Method declaration pattern looks good.The pattern correctly captures method names and associated comments, maintaining consistency with function declarations.
src/core/tree-sitter/queries/cpp.ts (1)
14-18
: LGTM! Function patterns look good.The patterns correctly handle both simple identifiers and field identifiers for functions.
src/core/tree-sitter/queries/swift.ts (2)
8-12
: LGTM! Class and protocol patterns look good.The patterns correctly capture class and protocol names with appropriate definition types.
14-27
: LGTM! Method patterns are comprehensive.The patterns handle various method types including regular functions, subscripts, initializers, and deinitializers within class scope.
src/core/tree-sitter/queries/typescript.ts (2)
9-10
: LGTM! Function patterns are comprehensive.The patterns correctly handle function signatures, declarations, and variable declarations containing functions.
Also applies to: 24-25, 33-36
12-16
: LGTM! Method patterns look good.The patterns handle both regular and abstract method signatures appropriately.
Also applies to: 27-28
src/config/configSchema.ts (1)
28-28
: LGTM! Schema changes are well-structured.The
onlyDefs
property is correctly added to both base and default schemas with appropriate types and defaults.Also applies to: 68-68
src/cli/actions/defaultAction.ts (1)
134-136
: LGTM! CLI option handling is consistent.The implementation follows the established pattern for boolean options and maintains immutability.
tests/config/configSchema.test.ts (3)
72-72
: LGTM!The
onlyDefs
property is correctly added to the valid configuration test case forrepomixConfigDefaultSchema
with a default value offalse
.
157-157
: LGTM!The
onlyDefs
property is correctly added to the valid configuration test case forrepomixConfigMergedSchema
with a default value offalse
.
196-196
: LGTM!The
onlyDefs
property is correctly included in the invalid configuration test case for type validation.tests/cli/cliRun.test.ts (1)
57-57
: LGTM!The
onlyDefs
property is correctly added to the mocked configuration objects with a default value offalse
for both default and remote actions.Also applies to: 96-96
tests/core/tree-sitter/parseFile.test.ts (1)
1-4
: LGTM!The test file setup and imports are correctly configured.
repomix.config.json (2)
3-4
: Updated Output Configuration to Plain Text Format
The changes to"filePath"
and"style"
correctly reflect the new plain text output format. This update appears consistent with the intended simplification of the output.
11-11
: Added "onlyDefs" Property for Definition-Only Output
The new"onlyDefs": true
property has been added to enable output of definitions only. Please ensure that the documentation and CLI help texts are updated to reflect this new option. Also, verify that this override is intentional given that the default in the schema is set tofalse
.package.json (1)
82-83
: New Dependencies for Tree-sitter Integration
The addition of"tree-sitter-wasms": "^0.1.12"
and"web-tree-sitter": "^0.24.7"
in the dependencies is consistent with the enhancements for parsing capabilities using Tree-sitter. Please ensure that these dependencies are documented in the project README and that any version-specific caveats are noted.
@huy-trn I'll take a more detailed look tomorrow. Thank you so much for your constant support and contributions! Really appreciate it! 🚀 |
a94e854
to
a5215be
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #336 +/- ##
==========================================
- Coverage 90.12% 89.56% -0.57%
==========================================
Files 49 66 +17
Lines 2724 2912 +188
Branches 562 594 +32
==========================================
+ Hits 2455 2608 +153
- Misses 269 304 +35 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
♻️ Duplicate comments (3)
src/core/tree-sitter/parseFile.ts (3)
4-58
: 🛠️ Refactor suggestionOptimize memory usage and improve type safety.
Consider using generators for chunks and adding type guards.
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 12-13: src/core/tree-sitter/parseFile.ts#L12-L13
Added lines #L12 - L13 were not covered by tests
[warning] 17-18: src/core/tree-sitter/parseFile.ts#L17-L18
Added lines #L17 - L18 were not covered by tests
[warning] 47-47: src/core/tree-sitter/parseFile.ts#L47
Added line #L47 was not covered by tests
[warning] 53-54: src/core/tree-sitter/parseFile.ts#L53-L54
Added lines #L53 - L54 were not covered by tests
8-8
: 🛠️ Refactor suggestionAdd input validation.
The function lacks input validation for its parameters.
- const parseFile = async (fileContent: string, filePath: string, config: RepomixConfigMerged) => { + const parseFile = async ( + fileContent: string, + filePath: string, + config: RepomixConfigMerged + ): Promise<string | undefined> => { + // Validate input + if (!fileContent || !filePath) { + throw new Error('Invalid input: fileContent and filePath are required'); + }
52-54
:⚠️ Potential issueImprove error handling.
The current error handling only logs to console and swallows the error. This could hide issues and make debugging difficult.
- } catch (error) { - console.log(`Error parsing file: ${error}\n`); + } catch (error: unknown) { + const context = { + filePath, + language: lang, + error: error instanceof Error ? error.message : String(error) + }; + logger.error('Error parsing file:', context); + throw new Error(`Failed to parse ${filePath}: ${context.error}`);🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 53-54: src/core/tree-sitter/parseFile.ts#L53-L54
Added lines #L53 - L54 were not covered by tests
🧹 Nitpick comments (10)
src/core/tree-sitter/ext2lang.ts (1)
1-20
: Add JSDoc documentation for better maintainability.The mapping is comprehensive and well-structured. Consider adding JSDoc documentation to describe the purpose and usage of this mapping.
+/** + * Maps file extensions to their corresponding programming languages. + * Used by LanguageParser to determine the language of a file based on its extension. + * @example + * ext2lang.js // returns 'javascript' + * ext2lang.py // returns 'python' + */ export const ext2lang = {src/core/tree-sitter/lang2Query.ts (2)
1-14
: Useimport type
for better type safety.Since the imported types are only used for type checking, use
import type
to ensure they are removed during compilation.-import { +import type { cQuery, cppQuery, csharpQuery, goQuery, javaQuery, javascriptQuery, phpQuery, pythonQuery, rubyQuery, rustQuery, swiftQuery, typescriptQuery, } from './queries/index.js';
16-29
: Add JSDoc documentation for better maintainability.The mapping is well-structured. Consider adding JSDoc documentation to describe the purpose and usage of this mapping.
+/** + * Maps programming languages to their corresponding query functions. + * Used by LanguageParser to retrieve the appropriate query for a language. + * @example + * lang2Query.javascript // returns javascriptQuery + * lang2Query.python // returns pythonQuery + */ export const lang2Query = {src/core/tree-sitter/queries/ruby.ts (1)
1-5
: Enhance documentation for better maintainability.Consider expanding the documentation to include:
- Examples of the patterns being matched
- Description of the capture groups
- Explanation of the Tree-sitter features being used
-/* -- method definitions (including singleton methods and aliases, with associated comments) -- class definitions (including singleton classes, with associated comments) -- module definitions -*/ +/** + * Tree-sitter query patterns for Ruby code. + * + * Captures: + * - Method definitions: + * - Regular methods with associated comments + * - Singleton methods with associated comments + * - Method aliases + * - Class definitions: + * - Regular classes with associated comments + * - Singleton classes with associated comments + * - Module definitions + * + * Features: + * - Uses #strip! to remove leading comment characters + * - Uses #select-adjacent! to link comments to their definitions + * + * @example + * # This is a method comment + * def foo(bar) + * # Method body + * end + */src/core/tree-sitter/LanguageParser.ts (3)
1-6
: Useimport type
for better type safety.Since
SupportedLang
is only used for type checking, useimport type
to ensure it is removed during compilation.import * as path from 'node:path'; import Parser from 'web-tree-sitter'; import { ext2lang } from './ext2lang.js'; -import { lang2Query, SupportedLang } from './lang2Query.js'; +import { lang2Query } from './lang2Query.js'; +import type { SupportedLang } from './lang2Query.js'; import { loadLanguage } from './loadLanguage.js';🧰 Tools
🪛 Biome (1.9.4)
[error] 5-5: Some named imports are only used as types.
This import is only used as a type.
Importing the types with import type ensures that they are removed by the compilers and avoids loading unnecessary modules.
Safe fix: Add inline type keywords.(lint/style/useImportType)
17-20
: Handle edge cases in getFileExtension.Consider handling edge cases such as files without extensions or with multiple dots.
private getFileExtension(filePath: string) { - return path.extname(filePath).toLowerCase().slice(1); + const ext = path.extname(filePath).toLowerCase().slice(1); + return ext || ''; // Return empty string for files without extension }
44-51
: Improve type safety in guessTheLang.Consider using type assertions more safely and handling potential type mismatches.
public guessTheLang(filePath: string) { const ext = this.getFileExtension(filePath); - if (!Object.keys(ext2lang).includes(ext)) { + if (!(ext in ext2lang)) { return undefined; } - const lang = ext2lang[ext as keyof typeof ext2lang] as keyof typeof lang2Query; + const lang = ext2lang[ext as keyof typeof ext2lang]; + if (!(lang in lang2Query)) { + return undefined; + } return lang; }tests/core/tree-sitter/parseFile.test.ts (3)
1-4
: Remove unused import.The
vi
import from 'vitest' is not used in the test file.-import { describe, expect, test, vi } from 'vitest'; +import { describe, expect, test } from 'vitest';
7-25
: Enhance JavaScript/JSX test coverage.The current tests only cover basic function declarations. Consider adding test cases for:
- Arrow functions:
const greet = (name) => console.log("Hello, " + name);
- Class methods:
class Greeter { sayHello(name) { console.log("Hello, " + name); } }
- Async functions:
async function greet(name) { await delay(100); console.log("Hello, " + name); }
- Generator functions:
function* counter() { yield 1; yield 2; }
- Edge cases: empty files, invalid syntax
28-46
: Add TypeScript-specific test cases.The current tests don't cover TypeScript-specific features. Consider adding test cases for:
- Interfaces:
interface Greeter { greet(name: string): void; }
- Type aliases:
type GreetFn = (name: string) => void;
- Abstract classes:
abstract class BaseGreeter { abstract greet(name: string): void; }
- Generics:
function identity<T>(arg: T): T { return arg; }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
package-lock.json
is excluded by!**/package-lock.json
📒 Files selected for processing (29)
package.json
(1 hunks)repomix.config.json
(1 hunks)src/cli/actions/defaultAction.ts
(1 hunks)src/cli/cliRun.ts
(1 hunks)src/config/configSchema.ts
(2 hunks)src/core/file/workers/fileProcessWorker.ts
(2 hunks)src/core/tree-sitter/LanguageParser.ts
(1 hunks)src/core/tree-sitter/ext2lang.ts
(1 hunks)src/core/tree-sitter/lang2Query.ts
(1 hunks)src/core/tree-sitter/loadLanguage.ts
(1 hunks)src/core/tree-sitter/parseFile.ts
(1 hunks)src/core/tree-sitter/queries/c.ts
(1 hunks)src/core/tree-sitter/queries/c_sharp.ts
(1 hunks)src/core/tree-sitter/queries/cpp.ts
(1 hunks)src/core/tree-sitter/queries/go.ts
(1 hunks)src/core/tree-sitter/queries/index.ts
(1 hunks)src/core/tree-sitter/queries/java.ts
(1 hunks)src/core/tree-sitter/queries/javascript.ts
(1 hunks)src/core/tree-sitter/queries/php.ts
(1 hunks)src/core/tree-sitter/queries/python.ts
(1 hunks)src/core/tree-sitter/queries/ruby.ts
(1 hunks)src/core/tree-sitter/queries/rust.ts
(1 hunks)src/core/tree-sitter/queries/swift.ts
(1 hunks)src/core/tree-sitter/queries/typescript.ts
(1 hunks)tests/cli/actions/defaultAction.test.ts
(1 hunks)tests/cli/cliRun.test.ts
(2 hunks)tests/config/configSchema.test.ts
(3 hunks)tests/core/tree-sitter/LanguageParser.test.ts
(1 hunks)tests/core/tree-sitter/parseFile.test.ts
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (20)
- src/core/tree-sitter/queries/c.ts
- src/core/tree-sitter/queries/python.ts
- src/core/tree-sitter/loadLanguage.ts
- src/core/tree-sitter/queries/rust.ts
- src/core/tree-sitter/queries/typescript.ts
- tests/config/configSchema.test.ts
- src/core/tree-sitter/queries/swift.ts
- src/core/tree-sitter/queries/php.ts
- src/core/tree-sitter/queries/java.ts
- src/cli/cliRun.ts
- package.json
- tests/cli/cliRun.test.ts
- repomix.config.json
- tests/cli/actions/defaultAction.test.ts
- src/core/tree-sitter/queries/go.ts
- src/core/tree-sitter/queries/c_sharp.ts
- src/core/tree-sitter/queries/javascript.ts
- src/core/tree-sitter/queries/cpp.ts
- tests/core/tree-sitter/LanguageParser.test.ts
- src/core/tree-sitter/queries/index.ts
🧰 Additional context used
🪛 GitHub Check: codecov/patch
src/core/file/workers/fileProcessWorker.ts
[warning] 38-39: src/core/file/workers/fileProcessWorker.ts#L38-L39
Added lines #L38 - L39 were not covered by tests
src/core/tree-sitter/LanguageParser.ts
[warning] 31-32: src/core/tree-sitter/LanguageParser.ts#L31-L32
Added lines #L31 - L32 were not covered by tests
src/core/tree-sitter/parseFile.ts
[warning] 12-13: src/core/tree-sitter/parseFile.ts#L12-L13
Added lines #L12 - L13 were not covered by tests
[warning] 17-18: src/core/tree-sitter/parseFile.ts#L17-L18
Added lines #L17 - L18 were not covered by tests
[warning] 47-47: src/core/tree-sitter/parseFile.ts#L47
Added line #L47 was not covered by tests
[warning] 53-54: src/core/tree-sitter/parseFile.ts#L53-L54
Added lines #L53 - L54 were not covered by tests
src/cli/actions/defaultAction.ts
[warning] 171-172: src/cli/actions/defaultAction.ts#L171-L172
Added lines #L171 - L172 were not covered by tests
🪛 Biome (1.9.4)
src/core/tree-sitter/LanguageParser.ts
[error] 5-5: Some named imports are only used as types.
This import is only used as a type.
Importing the types with import type ensures that they are removed by the compilers and avoids loading unnecessary modules.
Safe fix: Add inline type keywords.
(lint/style/useImportType)
⏰ Context from checks skipped due to timeout of 90000ms (10)
- GitHub Check: Test (windows-latest, 23.x)
- GitHub Check: Test (windows-latest, 22.x)
- GitHub Check: Test (windows-latest, 21.x)
- GitHub Check: Test (windows-latest, 20.x)
- GitHub Check: Test (windows-latest, 19.x)
- GitHub Check: Test (windows-latest, 18.x)
- GitHub Check: Test (windows-latest, 18.0.0)
- GitHub Check: Build and run (windows-latest, 22.x)
- GitHub Check: Build and run (windows-latest, 21.x)
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (7)
src/core/tree-sitter/lang2Query.ts (1)
31-31
: LGTM!The type definition ensures type safety when working with supported languages.
src/core/tree-sitter/queries/ruby.ts (1)
6-52
: LGTM!The query patterns are well-structured and comprehensive, covering all major Ruby constructs including regular and singleton methods/classes, along with their associated comments.
src/core/tree-sitter/LanguageParser.ts (2)
8-16
: LGTM!The class structure with private caches for parsers and queries is well-designed.
29-34
: Add test coverage for error handling.The error handling path in
getParserForLang
is not covered by tests.❌ Verification inconclusive
🏁 Script executed:
#!/bin/bash # Description: Check test coverage for error handling in getParserForLang # Search for test files related to LanguageParser fd -e ts -e js "LanguageParser\.test" | xargs rg -l "getParserForLang.*error"Length of output: 79
Missing Error Handling Test Coverage in getParserForLang
Our search across test files produced no evidence that error scenarios (such as failures fromprepareLang
) are being exercised. Please add tests that simulate errors fromprepareLang
so that the error handling path ingetParserForLang
is covered.🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 31-32: src/core/tree-sitter/LanguageParser.ts#L31-L32
Added lines #L31 - L32 were not covered by testssrc/core/file/workers/fileProcessWorker.ts (1)
37-40
: Add test coverage for compressed mode.The new code path for compressed output mode is not covered by tests.
❌ Verification inconclusive
🏁 Script executed:
#!/bin/bash # Description: Check test coverage for compressed mode fd -e test.ts -e spec.ts | xargs rg -l "compressed.*true"Length of output: 59
Let's run an improved search to verify if any tests set the configuration with
compressed: true
:
🏁 Script executed:
#!/bin/bash # Searching for test cases that have a configuration object with output.compressed set to true fd -e ts -e spec.ts | xargs rg -Hn "output\s*:\s*\{[^}]*compressed\s*:\s*true"Length of output: 80
Action Required: Add tests for compressed mode
The searches did not reveal any tests setting
config.output.compressed
to true. This confirms that the new code branch insrc/core/file/workers/fileProcessWorker.ts
handling compressed output is not covered by tests.
- Location:
src/core/file/workers/fileProcessWorker.ts
(lines 37–40)- Issue: No tests currently trigger the
if (config.output.compressed)
block.- Recommendation: Please add test cases that set
config.output.compressed
to true to ensure this branch is exercised.🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 38-39: src/core/file/workers/fileProcessWorker.ts#L38-L39
Added lines #L38 - L39 were not covered by testssrc/config/configSchema.ts (1)
28-28
: LGTM!The schema changes for the compressed option are well-structured and consistent with the existing pattern.
Also applies to: 68-68
src/cli/actions/defaultAction.ts (1)
170-172
: Add test coverage for compressed CLI option.The new code path for handling the compressed CLI option is not covered by tests.
❌ Verification inconclusive
🏁 Script executed:
#!/bin/bash # Description: Check test coverage for compressed CLI option fd -e test.ts -e spec.ts | xargs rg -l "compressed.*options"Length of output: 62
Missing test coverage for the compressed CLI option
Our search did not locate any tests exercising the code path that updates the CLI configuration with thecompressed
option insrc/cli/actions/defaultAction.ts
(lines 170–172). Please add tests that invoke the CLI with the compressed option to verify that it behaves as expected.🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 171-172: src/cli/actions/defaultAction.ts#L171-L172
Added lines #L171 - L172 were not covered by tests
a5215be
to
548790e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (5)
src/core/tree-sitter/LanguageParser.ts (2)
21-27
:⚠️ Potential issueAdd error handling in prepareLang.
The method should handle potential errors from
loadLanguage
andparser.setLanguage
.private async prepareLang(name: SupportedLang) { - const lang = await loadLanguage(name); - const parser = new Parser(); - parser.setLanguage(lang); - this.loadedParsers[name] = parser; - this.loadedQueries[name] = lang.query(lang2Query[name]); + try { + const lang = await loadLanguage(name); + const parser = new Parser(); + parser.setLanguage(lang); + this.loadedParsers[name] = parser; + this.loadedQueries[name] = lang.query(lang2Query[name]); + } catch (error) { + throw new Error(`Failed to prepare language ${name}: ${error.message}`); + } }
53-55
: 🛠️ Refactor suggestionAdd cleanup method for proper resource management.
Consider adding a cleanup method to properly dispose of parsers when they are no longer needed.
public async init() { await Parser.init(); } + +/** + * Cleanup resources by disposing of all loaded parsers. + * Call this method when the parser is no longer needed. + */ +public cleanup() { + Object.values(this.loadedParsers).forEach(parser => { + parser.delete(); + }); + this.loadedParsers = {}; + this.loadedQueries = {}; +}src/core/tree-sitter/parseFile.ts (1)
8-56
: 🛠️ Refactor suggestionOptimize memory usage and improve type safety.
The function needs input validation, better error handling, and memory optimization.
-const parseFile = async (fileContent: string, filePath: string, config: RepomixConfigMerged) => { +const parseFile = async ( + fileContent: string, + filePath: string, + config: RepomixConfigMerged +): Promise<string | undefined> => { + // Validate input + if (!fileContent || !filePath) { + throw new Error('Invalid input: fileContent and filePath are required'); + } // Split the file content into individual lines const lines = fileContent.split('\n'); if (lines.length < 1) { return ''; } const lang = magicBlob.guessTheLang(filePath); if (lang === undefined) { // Language not supported return undefined; } const query = await magicBlob.getQueryForLang(lang); const parser = await magicBlob.getParserForLang(lang); - const chunks = []; + // Use a generator to yield chunks + async function* generateChunks() { try { const tree = parser.parse(fileContent); const captures = query.captures(tree.rootNode); captures.sort((a, b) => a.node.startPosition.row - b.node.startPosition.row); for (const capture of captures) { const { node, name } = capture; const startRow = node.startPosition.row; const endRow = node.endPosition.row; if (!name.includes('name') || !lines[startRow]) { continue; } const selectedLines = lines.slice(startRow, endRow + 1); if (selectedLines.length < 1) { continue; } - const chunk = selectedLines.join('\n'); - chunks.push(chunk); + yield selectedLines.join('\n'); } - } catch (error) { - console.log(`Error parsing file: ${error}\n`); + } catch (error: unknown) { + const context = { + filePath, + language: lang, + error: error instanceof Error ? error.message : String(error) + }; + console.error('Error parsing file:', context); + throw new Error(`Failed to parse ${filePath}: ${context.error}`); } - return chunks.join('\n'); + } + // Collect chunks with reduced memory usage + const chunks: string[] = []; + for await (const chunk of generateChunks()) { + chunks.push(chunk); + } + return chunks.join('\n'); };src/core/file/workers/fileProcessWorker.ts (1)
37-40
:⚠️ Potential issueAdd error handling for parseFile result.
The compressed mode needs better error handling.
if (config.output.compressed) { const parseFile = await getFn_parseFile(); - processedContent = (await parseFile(processedContent, rawFile.path, config)) ?? processedContent; + try { + const parsedContent = await parseFile(processedContent, rawFile.path, config); + if (parsedContent === undefined) { + logger.warn(`Failed to parse ${rawFile.path} in compressed mode. Using original content.`); + } + processedContent = parsedContent ?? processedContent; + } catch (error) { + logger.error(`Error parsing ${rawFile.path} in compressed mode: ${error}`); + throw error; + } } else if (config.output.showLineNumbers) {tests/core/tree-sitter/parseFile.test.ts (1)
5-192
: 🛠️ Refactor suggestionImprove test organization and coverage.
The tests need better organization and more comprehensive coverage.
Test Configuration:
- Add tests with
compressed: true
in the config- Consider testing other configuration options
Test Setup:
- Use test fixtures or helper functions to reduce duplicate code:
function createParseTest(language: string, content: string, expectedIdentifier: string) { test(`should parse ${language} correctly`, async () => { const filePath = `dummy.${language}`; const config = {}; const parseFile = await getFn_parseFile(); const result = await parseFile(content, filePath, config as RepomixConfigMerged); expect(typeof result).toBe('string'); expect(result).toContain(expectedIdentifier); }); }Add tests for:
- Empty files
- Files with syntax errors
- Files with multiple declarations
- Files with comments and documentation
- Language-specific features (classes, interfaces, etc.)
🧹 Nitpick comments (3)
src/core/tree-sitter/LanguageParser.ts (1)
5-5
: Useimport type
for type-only imports.The
SupportedLang
import is only used as a type. Consider usingimport type
to ensure it's removed during compilation.-import { ext2lang } from './ext2lang.js'; -import { lang2Query, SupportedLang } from './lang2Query.js'; +import { ext2lang } from './ext2lang.js'; +import { lang2Query } from './lang2Query.js'; +import type { SupportedLang } from './lang2Query.js';🧰 Tools
🪛 Biome (1.9.4)
[error] 5-5: Some named imports are only used as types.
This import is only used as a type.
Importing the types with import type ensures that they are removed by the compilers and avoids loading unnecessary modules.
Safe fix: Add inline type keywords.(lint/style/useImportType)
src/core/tree-sitter/parseFile.ts (2)
7-7
: Address the TODO comment about unused config parameter.The
config
parameter is currently unused. Consider either using it or removing it if not needed.Would you like me to help implement the configuration handling or remove the unused parameter?
5-5
: Use a more descriptive variable name.The variable name
magicBlob
is not descriptive. Consider using a more meaningful name likelanguageParser
that reflects its purpose.-const magicBlob = new LanguageParser(); +const languageParser = new LanguageParser();
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
package-lock.json
is excluded by!**/package-lock.json
📒 Files selected for processing (29)
package.json
(1 hunks)repomix.config.json
(1 hunks)src/cli/actions/defaultAction.ts
(1 hunks)src/cli/cliRun.ts
(1 hunks)src/config/configSchema.ts
(2 hunks)src/core/file/workers/fileProcessWorker.ts
(2 hunks)src/core/tree-sitter/LanguageParser.ts
(1 hunks)src/core/tree-sitter/ext2lang.ts
(1 hunks)src/core/tree-sitter/lang2Query.ts
(1 hunks)src/core/tree-sitter/loadLanguage.ts
(1 hunks)src/core/tree-sitter/parseFile.ts
(1 hunks)src/core/tree-sitter/queries/c.ts
(1 hunks)src/core/tree-sitter/queries/c_sharp.ts
(1 hunks)src/core/tree-sitter/queries/cpp.ts
(1 hunks)src/core/tree-sitter/queries/go.ts
(1 hunks)src/core/tree-sitter/queries/index.ts
(1 hunks)src/core/tree-sitter/queries/java.ts
(1 hunks)src/core/tree-sitter/queries/javascript.ts
(1 hunks)src/core/tree-sitter/queries/php.ts
(1 hunks)src/core/tree-sitter/queries/python.ts
(1 hunks)src/core/tree-sitter/queries/ruby.ts
(1 hunks)src/core/tree-sitter/queries/rust.ts
(1 hunks)src/core/tree-sitter/queries/swift.ts
(1 hunks)src/core/tree-sitter/queries/typescript.ts
(1 hunks)tests/cli/actions/defaultAction.test.ts
(1 hunks)tests/cli/cliRun.test.ts
(2 hunks)tests/config/configSchema.test.ts
(3 hunks)tests/core/tree-sitter/LanguageParser.test.ts
(1 hunks)tests/core/tree-sitter/parseFile.test.ts
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (25)
- src/core/tree-sitter/queries/php.ts
- tests/cli/cliRun.test.ts
- tests/config/configSchema.test.ts
- src/core/tree-sitter/queries/ruby.ts
- src/config/configSchema.ts
- src/core/tree-sitter/queries/java.ts
- src/core/tree-sitter/lang2Query.ts
- src/core/tree-sitter/queries/rust.ts
- src/core/tree-sitter/queries/swift.ts
- src/cli/actions/defaultAction.ts
- src/core/tree-sitter/queries/c_sharp.ts
- tests/cli/actions/defaultAction.test.ts
- src/cli/cliRun.ts
- src/core/tree-sitter/queries/javascript.ts
- src/core/tree-sitter/queries/python.ts
- src/core/tree-sitter/queries/cpp.ts
- repomix.config.json
- package.json
- src/core/tree-sitter/queries/typescript.ts
- src/core/tree-sitter/queries/go.ts
- src/core/tree-sitter/loadLanguage.ts
- src/core/tree-sitter/ext2lang.ts
- src/core/tree-sitter/queries/c.ts
- tests/core/tree-sitter/LanguageParser.test.ts
- src/core/tree-sitter/queries/index.ts
🧰 Additional context used
🪛 Biome (1.9.4)
src/core/tree-sitter/LanguageParser.ts
[error] 5-5: Some named imports are only used as types.
This import is only used as a type.
Importing the types with import type ensures that they are removed by the compilers and avoids loading unnecessary modules.
Safe fix: Add inline type keywords.
(lint/style/useImportType)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: Test (macos-latest, 21.x)
- GitHub Check: Test (windows-latest, 23.x)
- GitHub Check: Test (windows-latest, 22.x)
- GitHub Check: Test (windows-latest, 21.x)
- GitHub Check: Test (windows-latest, 20.x)
- GitHub Check: Test (windows-latest, 19.x)
- GitHub Check: Test (windows-latest, 18.x)
- GitHub Check: Test coverage
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (2)
src/core/tree-sitter/LanguageParser.ts (1)
44-51
: LGTM!The method correctly handles file extension extraction and type-safe language lookup.
tests/core/tree-sitter/parseFile.test.ts (1)
5-192
: Add error handling tests.The test suite needs to verify error handling scenarios.
Add tests for:
- Malformed code
- Unsupported file extensions
- Null or undefined inputs
- Invalid configurations
Run the following script to verify current error handling:
#!/bin/bash # Description: Verify error handling across different scenarios # Test 1: Check if getFn_parseFile is exported correctly rg -A 5 "export.*getFn_parseFile" src/ # Test 2: Check Tree-sitter query patterns for each language fd -e scm -x cat {} \; -x echo "---" # Test 3: Check for any existing error handling rg "throw|catch|Error" src/ -g "!*.test.ts"
b2fd315
to
74bd3f9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
src/core/tree-sitter/parseFile.ts (1)
52-54
: 🛠️ Refactor suggestionImprove error handling and logging.
The current error handling simply logs to console and continues. Consider:
- Using proper error logging with severity levels
- Adding error context
- Allowing error handling configuration
- } catch (error) { - console.log(`Error parsing file: ${error}\n`); + } catch (error: unknown) { + const context = { + filePath, + language: lang, + error: error instanceof Error ? error.message : String(error) + }; + logger.error('Error parsing file:', context); + if (config.failOnError) { + throw new Error(`Failed to parse ${filePath}: ${context.error}`); + }🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 53-54: src/core/tree-sitter/parseFile.ts#L53-L54
Added lines #L53 - L54 were not covered by tests
🧹 Nitpick comments (17)
src/core/tree-sitter/queries/python.ts (1)
1-14
: Consider enhancing Python query coverage.While the current queries cover basic class/function definitions and calls, consider adding support for:
- Decorated functions/classes
- Lambda functions
- Async functions/methods
- Class methods and static methods
Here's an example of enhanced queries:
export default ` (class_definition name: (identifier) @name.definition.class) @definition.class +(decorated_definition + definition: (class_definition + name: (identifier) @name.definition.class)) @definition.class + (function_definition name: (identifier) @name.definition.function) @definition.function +(decorated_definition + definition: (function_definition + name: (identifier) @name.definition.function)) @definition.function + +(lambda) @definition.function + +(async_function_definition + name: (identifier) @name.definition.function) @definition.function + (call function: [ (identifier) @name.reference.call (attribute attribute: (identifier) @name.reference.call) ]) @reference.call `;src/core/tree-sitter/queries/java.ts (1)
1-22
: Enhance Java query coverage for generics and annotations.The current queries handle basic Java constructs well, but could be improved by adding support for:
- Generic type parameters in class/interface declarations
- Constructor declarations
- Annotated elements
- Generic method declarations
Here's an example of enhanced queries:
export default ` (class_declaration name: (identifier) @name.definition.class) @definition.class +(generic_type + type: (type_identifier) @name.reference.type) @reference.type + +(constructor_declaration + name: (identifier) @name.definition.constructor) @definition.constructor + +(annotation + name: (identifier) @name.reference.annotation) @reference.annotation + (method_declaration name: (identifier) @name.definition.method) @definition.method +(generic_method + name: (identifier) @name.definition.method) @definition.method + (method_invocation name: (identifier) @name.reference.call arguments: (argument_list) @reference.call) (interface_declaration name: (identifier) @name.definition.interface) @definition.interface (type_list (type_identifier) @name.reference.implementation) @reference.implementation (object_creation_expression type: (type_identifier) @name.reference.class) @reference.class (superclass (type_identifier) @name.reference.class) @reference.class `;src/core/tree-sitter/queries/php.ts (1)
1-28
: Enhance PHP query coverage for static and namespaced elements.The queries handle most PHP constructs well, but could be improved by adding support for:
- Static method calls and properties
- Namespaced class and function references
- Anonymous functions and arrow functions
Here's an example of enhanced queries:
export default ` (class_declaration name: (name) @name.definition.class) @definition.class +(namespace_definition + name: (namespace_name) @name.definition.namespace) @definition.namespace + +(namespace_use_declaration + (namespace_name) @name.reference.namespace) @reference.namespace + (function_definition name: (name) @name.definition.function) @definition.function +(arrow_function) @definition.function + +(anonymous_function) @definition.function + (method_declaration name: (name) @name.definition.function) @definition.function (object_creation_expression [ (qualified_name (name) @name.reference.class) (variable_name (name) @name.reference.class) ]) @reference.class +(static_call_expression + class: [ + (qualified_name (name) @name.reference.class) + (variable_name (name) @name.reference.class) + ] + name: (name) @name.reference.call) @reference.call + (function_call_expression function: [ (qualified_name (name) @name.reference.call) (variable_name (name)) @name.reference.call ]) @reference.call (scoped_call_expression name: (name) @name.reference.call) @reference.call (member_call_expression name: (name) @name.reference.call) @reference.call `;src/core/tree-sitter/lang2query.ts (2)
1-13
: Consider using path aliases for imports.The relative imports could be simplified using path aliases, making the code more maintainable if the file structure changes.
Example configuration in
tsconfig.json
:{ "compilerOptions": { "baseUrl": ".", "paths": { "@queries/*": ["src/core/tree-sitter/queries/*"] } } }Then update imports:
-import c from './queries/c.js'; -import c_sharp from './queries/c_sharp.js'; +import c from '@queries/c.js'; +import c_sharp from '@queries/c_sharp.js'; // ... and so on
14-27
: Consider using Record type for better type safety.The language mapping could benefit from using TypeScript's Record type for better type safety.
+type QueryModule = string; + -export const lang2Query = { +export const lang2Query: Record<string, QueryModule> = { javascript: javascript, typescript: typescript, // ... rest of the mappings };src/core/tree-sitter/queries/cpp.ts (1)
1-17
: Enhance C++ query patterns for better coverage.The current queries cover basic C++ declarations but could be improved by adding patterns for:
- Template declarations and specializations
- Namespace declarations
- Constructor and destructor declarations
- Comments and documentation strings (like in other language queries)
Here's an example of additional patterns to consider:
+// Template declarations +(template_declaration + declaration: (class_specifier name: (type_identifier) @name.definition.class)) @definition.class + +// Namespace declarations +(namespace_definition + name: (identifier) @name.definition.module) @definition.module + +// Constructor declarations +(function_declarator + declarator: (qualified_identifier + scope: (namespace_identifier) + name: (identifier) @name.definition.constructor)) @definition.constructor + +// Comments +( + (comment)* @doc + . + (class_specifier name: (type_identifier) @name.definition.class) @definition.class + (#strip! @doc "^//\\s*") + (#set-adjacent! @doc @definition.class) +)src/core/tree-sitter/queries/go.ts (1)
1-32
: Add patterns for Go interfaces and structs.The queries are well-structured but could be enhanced by adding patterns for:
- Interface declarations
- Struct declarations
- Type alias declarations
Here's an example of additional patterns to consider:
+// Interface declarations +( + (comment)* @doc + . + (type_spec + name: (type_identifier) @name.definition.interface + type: (interface_type)) @definition.interface + (#strip! @doc "^//\\s*") + (#set-adjacent! @doc @definition.interface) +) + +// Struct declarations +( + (comment)* @doc + . + (type_spec + name: (type_identifier) @name.definition.class + type: (struct_type)) @definition.class + (#strip! @doc "^//\\s*") + (#set-adjacent! @doc @definition.class) +) + +// Type alias declarations +(type_spec + name: (type_identifier) @name.definition.type + type: (type_identifier) @name.reference.type) @definition.typesrc/core/tree-sitter/queries/c_sharp.ts (1)
1-48
: Add patterns for C# attributes, properties, and events.The queries are comprehensive but could be enhanced by adding:
- Attribute declarations and usage
- Property declarations
- Event declarations
- XML documentation comments
Here's an example of additional patterns to consider:
+// Attribute declarations +(attribute + name: (identifier) @name.reference.class) @reference.class + +// Property declarations +(property_declaration + name: (identifier) @name.definition.property) @definition.property + +// Event declarations +(event_declaration + name: (identifier) @name.definition.event) @definition.event + +// XML documentation comments +( + (documentation_comment)* @doc + . + [ + (class_declaration + name: (identifier) @name.definition.class) @definition.class + (method_declaration + name: (identifier) @name.definition.method) @definition.method + ] + (#strip! @doc "^///\\s*") + (#select-adjacent! @doc [@definition.class @definition.method]) +)src/core/tree-sitter/queries/ruby.ts (1)
1-66
: Add patterns for Ruby blocks, procs, and lambdas.The queries are well-structured but could be enhanced by adding patterns for:
- Block declarations
- Proc and lambda declarations
- Instance and class variable declarations
Here's an example of additional patterns to consider:
+// Block declarations +(block + parameters: (block_parameters + (identifier) @name.definition.parameter)) @definition.block + +// Proc declarations +(call + method: (identifier) @name + arguments: (argument_list + (lambda))) @definition.lambda +(#eq? @name "proc") + +// Instance and class variables +(instance_variable) @name.reference.instance_variable +(class_variable) @name.reference.class_variableAdditionally, consider adding a pattern to exclude more Ruby keywords in the call pattern on line 64:
-(#not-match? @name.reference.call "^(lambda|load|require|require_relative|__FILE__|__LINE__)$") +(#not-match? @name.reference.call "^(lambda|load|require|require_relative|__FILE__|__LINE__|include|extend|attr_reader|attr_writer|attr_accessor)$")src/core/tree-sitter/queries/rust.ts (3)
4-11
: Consider adding visibility modifiers to ADT patterns.The ADT patterns could be more specific by including visibility modifiers (pub, pub(crate), etc.) to better distinguish between public and private types.
(struct_item + visibility: (_)? ; Optional visibility modifier name: (type_identifier) @name.definition.class) @definition.class (enum_item + visibility: (_)? ; Optional visibility modifier name: (type_identifier) @name.definition.class) @definition.class (union_item + visibility: (_)? ; Optional visibility modifier name: (type_identifier) @name.definition.class) @definition.class
20-22
: Add pattern for associated functions.The method definition pattern only captures instance methods. Consider adding a pattern for associated functions (static methods).
(declaration_list (function_item - name: (identifier) @name.definition.method)) @definition.method + name: (identifier) @name.definition.method + parameters: (parameters + (self_parameter)? @is_method ; Optional self parameter + ))) @definition.method
56-61
: Add pattern for generic type parameters.The implementation patterns could be enhanced to capture generic type parameters.
(impl_item + type_parameters: (type_parameters)? ; Optional generic parameters trait: (type_identifier) @name.reference.implementation) @reference.implementation (impl_item + type_parameters: (type_parameters)? ; Optional generic parameters type: (type_identifier) @name.reference.implementation !trait) @reference.implementationsrc/core/tree-sitter/LanguageParser.ts (1)
44-51
: Improve type safety for file extensions.The file extension handling could be more type-safe by using a predefined set of supported extensions.
+type SupportedExtension = keyof typeof ext2lang; + public guessTheLang(filePath: string) { const ext = this.getFileExtension(filePath); - if (!Object.keys(ext2lang).includes(ext)) { + if (!this.isSupportedExtension(ext)) { return undefined; } - const lang = ext2lang[ext as keyof typeof ext2lang] as keyof typeof lang2Query; + const lang = ext2lang[ext as SupportedExtension] as keyof typeof lang2Query; return lang; } + +private isSupportedExtension(ext: string): ext is SupportedExtension { + return ext in ext2lang; +}src/core/tree-sitter/parseFile.ts (1)
7-7
: Implement the unused config parameter.The TODO comment indicates that the
config
parameter is not used. Consider implementing configuration options such as:
- Minimum line threshold for compression
- Custom patterns to include/exclude
- Compression level settings
Would you like me to help implement these configuration options?
src/core/tree-sitter/queries/javascript.ts (3)
1-10
: Consider handling getters and setters in method definitions.The method definition pattern excludes constructors but doesn't specifically handle getters and setters. Consider adding patterns for these special method types.
(method_definition name: (property_identifier) @name.definition.method) @definition.method (#not-eq? @name.definition.method "constructor") + (#not-match? @name.definition.method "^(get|set)\\s")
25-38
: Enhance function declaration patterns for async/generator functions.The function declaration pattern could be more specific about async and generator functions to provide better context in the output.
[ (function_declaration name: (identifier) @name.definition.function) (generator_function name: (identifier) @name.definition.function) (generator_function_declaration name: (identifier) @name.definition.function) + (async_function_declaration + name: (identifier) @name.definition.function) ] @definition.function
75-79
: Improve call expression pattern robustness.The call expression pattern could be more robust by handling additional cases like
new.target
,super()
, etc.( (call_expression function: (identifier) @name.reference.call) @reference.call (#not-match? @name.reference.call "^(require)$") + (#not-match? @name.reference.call "^(super|new\\.target)$") )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
package-lock.json
is excluded by!**/package-lock.json
📒 Files selected for processing (29)
package.json
(1 hunks)repomix.config.json
(1 hunks)src/cli/actions/defaultAction.ts
(1 hunks)src/cli/cliRun.ts
(1 hunks)src/config/configSchema.ts
(2 hunks)src/core/file/workers/fileProcessWorker.ts
(2 hunks)src/core/tree-sitter/LanguageParser.ts
(1 hunks)src/core/tree-sitter/ext2lang.ts
(1 hunks)src/core/tree-sitter/lang2query.ts
(1 hunks)src/core/tree-sitter/loadLanguage.ts
(1 hunks)src/core/tree-sitter/parseFile.ts
(1 hunks)src/core/tree-sitter/queries/README.md
(1 hunks)src/core/tree-sitter/queries/c.ts
(1 hunks)src/core/tree-sitter/queries/c_sharp.ts
(1 hunks)src/core/tree-sitter/queries/cpp.ts
(1 hunks)src/core/tree-sitter/queries/go.ts
(1 hunks)src/core/tree-sitter/queries/java.ts
(1 hunks)src/core/tree-sitter/queries/javascript.ts
(1 hunks)src/core/tree-sitter/queries/php.ts
(1 hunks)src/core/tree-sitter/queries/python.ts
(1 hunks)src/core/tree-sitter/queries/ruby.ts
(1 hunks)src/core/tree-sitter/queries/rust.ts
(1 hunks)src/core/tree-sitter/queries/swift.ts
(1 hunks)src/core/tree-sitter/queries/typescript.ts
(1 hunks)tests/cli/actions/defaultAction.test.ts
(1 hunks)tests/cli/cliRun.test.ts
(2 hunks)tests/config/configSchema.test.ts
(3 hunks)tests/core/tree-sitter/LanguageParser.test.ts
(1 hunks)tests/core/tree-sitter/parseFile.test.ts
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- src/core/tree-sitter/queries/README.md
🚧 Files skipped from review as they are similar to previous changes (13)
- src/cli/cliRun.ts
- package.json
- repomix.config.json
- tests/cli/cliRun.test.ts
- src/core/tree-sitter/queries/swift.ts
- tests/cli/actions/defaultAction.test.ts
- src/core/tree-sitter/queries/typescript.ts
- src/config/configSchema.ts
- src/core/tree-sitter/ext2lang.ts
- src/core/tree-sitter/queries/c.ts
- src/core/tree-sitter/loadLanguage.ts
- tests/core/tree-sitter/LanguageParser.test.ts
- tests/config/configSchema.test.ts
🧰 Additional context used
🪛 GitHub Check: codecov/patch
src/core/tree-sitter/parseFile.ts
[warning] 12-13: src/core/tree-sitter/parseFile.ts#L12-L13
Added lines #L12 - L13 were not covered by tests
[warning] 17-18: src/core/tree-sitter/parseFile.ts#L17-L18
Added lines #L17 - L18 were not covered by tests
[warning] 47-47: src/core/tree-sitter/parseFile.ts#L47
Added line #L47 was not covered by tests
[warning] 53-54: src/core/tree-sitter/parseFile.ts#L53-L54
Added lines #L53 - L54 were not covered by tests
src/core/tree-sitter/LanguageParser.ts
[warning] 31-32: src/core/tree-sitter/LanguageParser.ts#L31-L32
Added lines #L31 - L32 were not covered by tests
src/cli/actions/defaultAction.ts
[warning] 171-172: src/cli/actions/defaultAction.ts#L171-L172
Added lines #L171 - L172 were not covered by tests
src/core/file/workers/fileProcessWorker.ts
[warning] 38-39: src/core/file/workers/fileProcessWorker.ts#L38-L39
Added lines #L38 - L39 were not covered by tests
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Test (windows-latest, 23.x)
- GitHub Check: Test (windows-latest, 20.x)
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (10)
src/core/tree-sitter/lang2query.ts (1)
29-29
: LGTM!The type definition using
keyof typeof
is a good practice for type safety.src/core/tree-sitter/LanguageParser.ts (2)
21-27
: Add error handling in prepareLang.The method should handle potential errors from
loadLanguage
andparser.setLanguage
.
53-55
: Add cleanup method for proper resource management.Consider adding a cleanup method to properly dispose of parsers when they are no longer needed.
src/core/file/workers/fileProcessWorker.ts (2)
37-40
: Add error handling for parseFile result.The compressed mode needs better error handling. The current implementation silently falls back to the original content when parsing fails.
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 38-39: src/core/file/workers/fileProcessWorker.ts#L38-L39
Added lines #L38 - L39 were not covered by tests
37-40
: Add test coverage for compress functionality.The new compress functionality is not covered by tests. Consider adding test cases for:
- Successful compression
- Failed compression with fallback
- Unsupported file types
❌ Verification inconclusive
🏁 Script executed:
#!/bin/bash # Check test coverage for compress functionality rg -l "test.*compress" "tests/"Length of output: 31
Missing Test Coverage for Compress Functionality
After re-verifying the tests by searching for the keyword "compress" in the tests directory, there are no test cases covering the compress functionality. The review comment is correct and highlights the need to add tests for:
- Successful compression
- Failed compression with fallback
- Unsupported file types
Please add the necessary tests.
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 38-39: src/core/file/workers/fileProcessWorker.ts#L38-L39
Added lines #L38 - L39 were not covered by testssrc/core/tree-sitter/parseFile.ts (1)
8-56
: Optimize memory usage and improve type safety.Consider using a generator to yield chunks instead of storing them in memory.
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 12-13: src/core/tree-sitter/parseFile.ts#L12-L13
Added lines #L12 - L13 were not covered by tests
[warning] 17-18: src/core/tree-sitter/parseFile.ts#L17-L18
Added lines #L17 - L18 were not covered by tests
[warning] 47-47: src/core/tree-sitter/parseFile.ts#L47
Added line #L47 was not covered by tests
[warning] 53-54: src/core/tree-sitter/parseFile.ts#L53-L54
Added lines #L53 - L54 were not covered by testssrc/cli/actions/defaultAction.ts (1)
169-172
: Add test coverage for compress option.The new compress option handling is not covered by tests. Add test cases to verify the behavior when compress option is set to true/false.
❌ Verification inconclusive
🏁 Script executed:
#!/bin/bash # Description: Verify test coverage for compress option # Check existing test coverage rg -l "compress.*options" tests/Length of output: 34
Test Coverage Missing for Compress Option
The repository lacks tests verifying the new compress option behavior (both true and false scenarios). Please add test cases that explicitly check that the CLI configuration is updated correctly when the compress option is provided.
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 171-172: src/cli/actions/defaultAction.ts#L171-L172
Added lines #L171 - L172 were not covered by teststests/core/tree-sitter/parseFile.test.ts (3)
1-15
: Add test helper to reduce code duplication.The test setup is repeated across all test cases. Consider extracting it to a helper function.
function createParseTest(language: string, content: string, expectedIdentifier: string) { test(`should parse ${language} correctly`, async () => { const filePath = `dummy.${language}`; const config = {}; const parseFile = await getFn_parseFile(); const result = await parseFile(content, filePath, config as RepomixConfigMerged); expect(typeof result).toBe('string'); expect(result).toContain(expectedIdentifier); }); }
48-57
: Add error handling tests.The test suite lacks error handling scenarios. Add tests for malformed code, unsupported file extensions, and invalid inputs.
test('should handle malformed code gracefully', async () => { const fileContent = 'def greet(name: print(f"Hello, {name}")'; // Missing closing parenthesis const filePath = 'dummy.py'; const config = {}; const parseFile = await getFn_parseFile(); await expect(parseFile(fileContent, filePath, config as RepomixConfigMerged)).rejects.toThrow(); }); test('should handle unsupported file extensions', async () => { const fileContent = 'function test() {}'; const filePath = 'dummy.unsupported'; const config = {}; const parseFile = await getFn_parseFile(); await expect(parseFile(fileContent, filePath, config as RepomixConfigMerged)).rejects.toThrow(); });
1-192
: Add tests for compressed output format.The test suite should verify the behavior when compression is enabled.
test('should handle compressed output format', async () => { const fileContent = 'function test() { console.log("test"); }'; const filePath = 'dummy.js'; const config = { output: { compress: true } }; const parseFile = await getFn_parseFile(); const result = await parseFile(fileContent, filePath, config as RepomixConfigMerged); expect(result).not.toContain('\n'); expect(result).toContain('test'); });
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/core/tree-sitter/loadLanguage.ts (1)
7-10
:⚠️ Potential issueAdd type safety, validation, and error handling.
The function needs improvements in several areas:
- Missing return type annotation
- No input validation
- No error handling
- No Parser initialization check
Apply this diff:
-export async function loadLanguage(langName: string) { +export async function loadLanguage(langName: string): Promise<Parser.Language> { + if (!langName || typeof langName !== 'string') { + throw new Error('Invalid language name'); + } + + try { + if (!Parser.isInitialized) { + await Parser.init(); + } + const wasmPath = require.resolve(`tree-sitter-wasms/out/tree-sitter-${langName}.wasm`); - return await Parser.Language.load(wasmPath); + return await Parser.Language.load(wasmPath); + } catch (error) { + throw new Error(`Failed to load language ${langName}: ${error.message}`); + } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/core/tree-sitter/loadLanguage.ts
(1 hunks)
🧰 Additional context used
🪛 Biome (1.9.4)
src/core/tree-sitter/loadLanguage.ts
[error] 3-3: A Node.js builtin module should be imported with the node: protocol.
Using the node: protocol is more explicit and signals that the imported module belongs to Node.js.
Unsafe fix: Add the node: protocol.
(lint/style/useNodejsImportProtocol)
🪛 GitHub Actions: CI
src/core/tree-sitter/loadLanguage.ts
[error] 3-3: A Node.js builtin module should be imported with the node: protocol.
🔇 Additional comments (1)
src/core/tree-sitter/loadLanguage.ts (1)
5-5
: LGTM!The require initialization is correctly implemented using
createRequire
withimport.meta.url
.
Hi @yamadashy I've got some update:
|
@huy-trn I really appreciate the effort you put into this! In the future, we could extend this by allowing compression types to be specified, something like: # Default - extracts function/class definitions (current implementation)
repomix --compress
# Most minimal output - only essential structure
repomix --compress minimal
# Extract only API interfaces and types
repomix --compress api
... But that's for future enhancement. For now, let's focus on getting this solid foundation merged and released. Great work! I'll take a closer look at the code later, but this is looking really promising! 👍 |
84d7641
to
43525d1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (3)
src/core/tree-sitter/loadLanguage.ts (2)
8-19
: LGTM! Consider improving error handling.The function correctly validates input and handles errors. However, consider adding more specific error types for better error handling.
Apply this diff to improve error handling:
+class LanguageLoadError extends Error { + constructor(langName: string, cause: Error) { + super(`Failed to load language ${langName}: ${cause.message}`); + this.name = 'LanguageLoadError'; + this.cause = cause; + } +} export async function loadLanguage(langName: string): Promise<Parser.Language> { if (!langName) { - throw new Error('Invalid language name'); + throw new LanguageLoadError(langName, new Error('Invalid language name')); } try { const wasmPath = await getWasmPath(langName); return await Parser.Language.load(wasmPath); } catch (error: any) { - throw new Error(`Failed to load language ${langName}: ${error.message}`); + throw new LanguageLoadError(langName, error); } }🧰 Tools
🪛 Biome (1.9.4)
[error] 16-16: Unexpected any. Specify a different type.
any disables many type checking rules. Its use should be avoided.
(lint/suspicious/noExplicitAny)
🪛 GitHub Check: codecov/patch
[warning] 10-11: src/core/tree-sitter/loadLanguage.ts#L10-L11
Added lines #L10 - L11 were not covered by tests
[warning] 17-18: src/core/tree-sitter/loadLanguage.ts#L17-L18
Added lines #L17 - L18 were not covered by tests
21-29
: LGTM! Consider improving error handling.The function correctly resolves and verifies the WASM file path. However, consider adding more specific error types for better error handling.
Apply this diff to improve error handling:
+class WasmFileNotFoundError extends Error { + constructor(langName: string, wasmPath: string) { + super(`WASM file not found for language ${langName}: ${wasmPath}`); + this.name = 'WasmFileNotFoundError'; + } +} async function getWasmPath(langName: string): Promise<string> { const wasmPath = require.resolve(`tree-sitter-wasms/out/tree-sitter-${langName}.wasm`); try { await fs.access(wasmPath); return wasmPath; } catch { - throw new Error(`WASM file not found for language ${langName}: ${wasmPath}`); + throw new WasmFileNotFoundError(langName, wasmPath); } }🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 27-28: src/core/tree-sitter/loadLanguage.ts#L27-L28
Added lines #L27 - L28 were not covered by testssrc/core/file/workers/fileProcessWorker.ts (1)
37-50
: LGTM! Consider improving error handling.The function correctly handles compressed output and errors. However, consider adding more specific error types and improving error messages.
Apply this diff to improve error handling:
+class ParseFileError extends Error { + constructor(filePath: string, cause: Error) { + super(`Error parsing ${filePath} in compressed mode: ${cause.message}`); + this.name = 'ParseFileError'; + this.cause = cause; + } +} if (config.output.compress) { const parseFile = await getFn_parseFile(); try { const parsedContent = await parseFile(processedContent, rawFile.path, config); if (parsedContent === undefined) { - logger.trace(`Failed to parse ${rawFile.path} in compressed mode. Using original content.`); + logger.warn(`Failed to parse ${rawFile.path} in compressed mode. Using original content.`); } processedContent = parsedContent ?? processedContent; } catch (error: any) { - logger.error(`Error parsing ${rawFile.path} in compressed mode: ${error.message}`); - //re-throw error - throw error; + throw new ParseFileError(rawFile.path, error); } }🧰 Tools
🪛 Biome (1.9.4)
[error] 45-45: Unexpected any. Specify a different type.
any disables many type checking rules. Its use should be avoided.
(lint/suspicious/noExplicitAny)
🪛 GitHub Check: codecov/patch
[warning] 38-46: src/core/file/workers/fileProcessWorker.ts#L38-L46
Added lines #L38 - L46 were not covered by tests
[warning] 48-49: src/core/file/workers/fileProcessWorker.ts#L48-L49
Added lines #L48 - L49 were not covered by tests
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/core/file/workers/fileProcessWorker.ts
(2 hunks)src/core/tree-sitter/loadLanguage.ts
(1 hunks)
🧰 Additional context used
🪛 Biome (1.9.4)
src/core/file/workers/fileProcessWorker.ts
[error] 45-45: Unexpected any. Specify a different type.
any disables many type checking rules. Its use should be avoided.
(lint/suspicious/noExplicitAny)
src/core/tree-sitter/loadLanguage.ts
[error] 16-16: Unexpected any. Specify a different type.
any disables many type checking rules. Its use should be avoided.
(lint/suspicious/noExplicitAny)
🪛 GitHub Check: codecov/patch
src/core/file/workers/fileProcessWorker.ts
[warning] 38-46: src/core/file/workers/fileProcessWorker.ts#L38-L46
Added lines #L38 - L46 were not covered by tests
[warning] 48-49: src/core/file/workers/fileProcessWorker.ts#L48-L49
Added lines #L48 - L49 were not covered by tests
src/core/tree-sitter/loadLanguage.ts
[warning] 10-11: src/core/tree-sitter/loadLanguage.ts#L10-L11
Added lines #L10 - L11 were not covered by tests
[warning] 17-18: src/core/tree-sitter/loadLanguage.ts#L17-L18
Added lines #L17 - L18 were not covered by tests
[warning] 27-28: src/core/tree-sitter/loadLanguage.ts#L27-L28
Added lines #L27 - L28 were not covered by tests
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: Test (macos-latest, 22.x)
- GitHub Check: Test (windows-latest, 23.x)
- GitHub Check: Test (windows-latest, 22.x)
- GitHub Check: Test (windows-latest, 21.x)
- GitHub Check: Test (windows-latest, 20.x)
- GitHub Check: Test (windows-latest, 19.x)
- GitHub Check: Test (windows-latest, 18.x)
- GitHub Check: Test (windows-latest, 18.0.0)
- GitHub Check: Analyze (javascript-typescript)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Nitpick comments (5)
src/core/tree-sitter/loadLanguage.ts (2)
8-20
: Enhance input validation.While the function includes basic input validation, it could be more thorough to prevent potential issues.
Apply this diff to improve input validation:
export async function loadLanguage(langName: string): Promise<Parser.Language> { - if (!langName) { + if (!langName || typeof langName !== 'string' || langName.trim() === '') { throw new Error('Invalid language name'); } + langName = langName.trim().toLowerCase();
22-30
: Add input validation and improve error handling.The function could benefit from input validation and more descriptive error messages.
Apply this diff to improve the function:
async function getWasmPath(langName: string): Promise<string> { + if (!langName || typeof langName !== 'string' || langName.trim() === '') { + throw new Error('Invalid language name'); + } + langName = langName.trim().toLowerCase(); + const wasmPath = require.resolve(`tree-sitter-wasms/out/tree-sitter-${langName}.wasm`); try { await fs.access(wasmPath); return wasmPath; - } catch { - throw new Error(`WASM file not found for language ${langName}: ${wasmPath}`); + } catch (error) { + const message = error instanceof Error ? error.message : String(error); + throw new Error(`WASM file not found for language ${langName} at ${wasmPath}: ${message}`); } }src/core/tree-sitter/LanguageParser.ts (2)
18-20
: Add input validation and handle edge cases.The method should validate input and handle edge cases like files without extensions.
Apply this diff to improve the method:
private getFileExtension(filePath: string) { + if (!filePath || typeof filePath !== 'string') { + throw new Error('Invalid file path'); + } + const ext = path.extname(filePath).toLowerCase().slice(1); + return ext || ''; - return path.extname(filePath).toLowerCase().slice(1); }
50-57
: Add input validation and improve type safety.The method should validate input and use type guards instead of type assertions.
Apply this diff to improve the method:
public guessTheLang(filePath: string) { + if (!filePath || typeof filePath !== 'string') { + throw new Error('Invalid file path'); + } const ext = this.getFileExtension(filePath); - if (!Object.keys(ext2lang).includes(ext)) { + if (!(ext in ext2lang)) { return undefined; } - const lang = ext2lang[ext as keyof typeof ext2lang] as keyof typeof lang2Query; + const lang = ext2lang[ext as keyof typeof ext2lang]; + if (!(lang in lang2Query)) { + return undefined; + } return lang; }src/core/tree-sitter/parseFile.ts (1)
8-8
: Address the TODO comment.The config parameter is marked as unused. Consider removing it if not needed or document why it's reserved for future use.
Do you want me to help implement the config parameter usage or remove it?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/core/file/workers/fileProcessWorker.ts
(2 hunks)src/core/tree-sitter/LanguageParser.ts
(1 hunks)src/core/tree-sitter/loadLanguage.ts
(1 hunks)src/core/tree-sitter/parseFile.ts
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: Test (macos-latest, 19.x)
- GitHub Check: Test (windows-latest, 23.x)
- GitHub Check: Test (windows-latest, 22.x)
- GitHub Check: Test (windows-latest, 21.x)
- GitHub Check: Test (windows-latest, 20.x)
- GitHub Check: Test (windows-latest, 19.x)
- GitHub Check: Test (windows-latest, 18.x)
- GitHub Check: Test (windows-latest, 18.0.0)
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (1)
src/core/tree-sitter/LanguageParser.ts (1)
43-48
: Add error handling and input validation.The method should validate input and handle errors from
prepareLang
.Apply this diff to improve the method:
public async getQueryForLang(name: SupportedLang) { + if (!name) { + throw new Error('Invalid language name'); + } + try { if (!this.loadedQueries[name]) { await this.prepareLang(name); } return this.loadedQueries[name]; + } catch (error) { + const message = error instanceof Error ? error.message : String(error); + throw new Error(`Failed to get query for language ${name}: ${message}`); + } }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
src/core/tree-sitter/LanguageParser.ts (3)
18-20
: Add input validation and edge case handling.The
getFileExtension
method could be more robust by handling edge cases.private getFileExtension(filePath: string) { + if (!filePath) { + return ''; + } return path.extname(filePath).toLowerCase().slice(1); }
35-40
: Add input validation to parser and query getters.Both methods could benefit from input validation.
Apply this pattern to both methods:
public async getParserForLang(name: SupportedLang) { + if (!name || typeof name !== 'string') { + throw new RepomixError('Invalid language name'); + } if (!this.loadedParsers[name]) { await this.prepareLang(name); } return this.loadedParsers[name]; }Also applies to: 43-48
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 37-38: src/core/tree-sitter/LanguageParser.ts#L37-L38
Added lines #L37 - L38 were not covered by tests
50-57
: Improve type safety and add documentation.The type casting in
guessTheLang
could be more type-safe, and the method would benefit from documentation.+/** + * Attempts to determine the programming language based on file extension. + * @param filePath - The path to the file + * @returns The detected language or undefined if not supported + */ public guessTheLang(filePath: string) { const ext = this.getFileExtension(filePath); - if (!Object.keys(ext2lang).includes(ext)) { + if (!ext || !Object.prototype.hasOwnProperty.call(ext2lang, ext)) { return undefined; } - const lang = ext2lang[ext as keyof typeof ext2lang] as keyof typeof lang2Query; + const lang = ext2lang[ext as keyof typeof ext2lang]; + return lang satisfies keyof typeof lang2Query; }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/core/tree-sitter/LanguageParser.ts
(1 hunks)
🧰 Additional context used
🪛 GitHub Check: codecov/patch
src/core/tree-sitter/LanguageParser.ts
[warning] 30-32: src/core/tree-sitter/LanguageParser.ts#L30-L32
Added lines #L30 - L32 were not covered by tests
[warning] 37-38: src/core/tree-sitter/LanguageParser.ts#L37-L38
Added lines #L37 - L38 were not covered by tests
[warning] 63-65: src/core/tree-sitter/LanguageParser.ts#L63-L65
Added lines #L63 - L65 were not covered by tests
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Test (windows-latest, 20.x)
🔇 Additional comments (3)
src/core/tree-sitter/LanguageParser.ts (3)
1-17
: LGTM! Well-structured class with proper type definitions.The imports are well-organized, and the class structure follows TypeScript best practices with proper type annotations for the private fields.
22-33
: Add test coverage for error handling paths.The error handling in
prepareLang
is not covered by tests according to static analysis. Please add test cases to cover error scenarios.Would you like me to help generate test cases for error handling scenarios? For example:
- Language loading failure
- Parser initialization failure
- Query creation failure
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 30-32: src/core/tree-sitter/LanguageParser.ts#L30-L32
Added lines #L30 - L32 were not covered by tests
59-66
: Add test coverage for initialization error handling.The error handling in
init
is not covered by tests according to static analysis.Would you like me to help generate test cases for the initialization error scenario? For example:
- Mock
Parser.init()
to throw an error- Verify that the error is properly caught and wrapped
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 63-65: src/core/tree-sitter/LanguageParser.ts#L63-L65
Added lines #L63 - L65 were not covered by tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (3)
src/core/tree-sitter/LanguageParser.ts (3)
35-40
: 🛠️ Refactor suggestionAdd error handling and input validation.
The function should validate input and handle errors from
prepareLang
.Apply this diff to improve the method:
public async getParserForLang(name: SupportedLang) { + if (!name) { + throw new Error('Invalid language name'); + } + try { if (!this.loadedParsers[name]) { await this.prepareLang(name); } return this.loadedParsers[name]; + } catch (error) { + const message = error instanceof Error ? error.message : String(error); + throw new Error(`Failed to get parser for language ${name}: ${message}`); + } }🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 37-38: src/core/tree-sitter/LanguageParser.ts#L37-L38
Added lines #L37 - L38 were not covered by tests
22-33
: 🛠️ Refactor suggestionAdd resource cleanup to prevent memory leaks.
The function should clean up resources in case of errors to prevent memory leaks.
Apply this diff to add cleanup:
private async prepareLang(name: SupportedLang) { + let parser: Parser | undefined; try { const lang = await loadLanguage(name); - const parser = new Parser(); + parser = new Parser(); parser.setLanguage(lang); this.loadedParsers[name] = parser; this.loadedQueries[name] = lang.query(lang2Query[name]); } catch (error) { + if (parser) { + parser.delete(); + } const message = error instanceof Error ? error.message : String(error); throw new RepomixError(`Failed to prepare language ${name}: ${message}`); } }🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 30-32: src/core/tree-sitter/LanguageParser.ts#L30-L32
Added lines #L30 - L32 were not covered by tests
43-48
: 🛠️ Refactor suggestionAdd error handling and input validation.
The function should validate input and handle errors from
prepareLang
.Apply this diff to improve the method:
public async getQueryForLang(name: SupportedLang) { + if (!name) { + throw new Error('Invalid language name'); + } + try { if (!this.loadedQueries[name]) { await this.prepareLang(name); } return this.loadedQueries[name]; + } catch (error) { + const message = error instanceof Error ? error.message : String(error); + throw new Error(`Failed to get query for language ${name}: ${message}`); + } }
🧹 Nitpick comments (1)
src/core/tree-sitter/parseFile.ts (1)
11-62
: Optimize memory usage with generators.The function should use generators to yield chunks instead of storing them in memory.
Apply this diff to optimize memory usage:
const parseFile = async (fileContent: string, filePath: string, config: RepomixConfigMerged) => { + if (!fileContent || !filePath) { + throw new Error('Invalid input: fileContent and filePath are required'); + } const lines = fileContent.split('\n'); if (lines.length < 1) { return ''; } const lang = languageParser.guessTheLang(filePath); if (lang === undefined) { return undefined; } const query = await languageParser.getQueryForLang(lang); const parser = await languageParser.getParserForLang(lang); - const chunks = []; + async function* generateChunks() { try { const tree = parser.parse(fileContent); const captures = query.captures(tree.rootNode); captures.sort((a, b) => a.node.startPosition.row - b.node.startPosition.row); for (const capture of captures) { const { node, name } = capture; const startRow = node.startPosition.row; const endRow = node.endPosition.row; if (!name.includes('name') || !lines[startRow]) { continue; } const selectedLines = lines.slice(startRow, endRow + 1); if (selectedLines.length < 1) { continue; } - const chunk = selectedLines.join('\n'); - chunks.push(chunk); + yield selectedLines.join('\n'); } } catch (error: unknown) { - logger.log(`Error parsing file: ${error}\n`); + const message = error instanceof Error ? error.message : String(error); + logger.error(`Error parsing file ${filePath}: ${message}`); + throw error; } } - return chunks.join('\n'); + const chunks = []; + for await (const chunk of generateChunks()) { + chunks.push(chunk); + } + return chunks.join('\n'); };🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 15-16: src/core/tree-sitter/parseFile.ts#L15-L16
Added lines #L15 - L16 were not covered by tests
[warning] 21-22: src/core/tree-sitter/parseFile.ts#L21-L22
Added lines #L21 - L22 were not covered by tests
[warning] 53-53: src/core/tree-sitter/parseFile.ts#L53
Added line #L53 was not covered by tests
[warning] 59-60: src/core/tree-sitter/parseFile.ts#L59-L60
Added lines #L59 - L60 were not covered by tests
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/core/tree-sitter/LanguageParser.ts
(1 hunks)src/core/tree-sitter/parseFile.ts
(1 hunks)src/core/tree-sitter/queries/typescript.ts
(1 hunks)tests/core/tree-sitter/parseFile.test.ts
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/core/tree-sitter/queries/typescript.ts
🧰 Additional context used
🪛 GitHub Check: codecov/patch
src/core/tree-sitter/parseFile.ts
[warning] 15-16: src/core/tree-sitter/parseFile.ts#L15-L16
Added lines #L15 - L16 were not covered by tests
[warning] 21-22: src/core/tree-sitter/parseFile.ts#L21-L22
Added lines #L21 - L22 were not covered by tests
[warning] 53-53: src/core/tree-sitter/parseFile.ts#L53
Added line #L53 was not covered by tests
[warning] 59-60: src/core/tree-sitter/parseFile.ts#L59-L60
Added lines #L59 - L60 were not covered by tests
src/core/tree-sitter/LanguageParser.ts
[warning] 30-32: src/core/tree-sitter/LanguageParser.ts#L30-L32
Added lines #L30 - L32 were not covered by tests
[warning] 37-38: src/core/tree-sitter/LanguageParser.ts#L37-L38
Added lines #L37 - L38 were not covered by tests
[warning] 63-65: src/core/tree-sitter/LanguageParser.ts#L63-L65
Added lines #L63 - L65 were not covered by tests
🔇 Additional comments (2)
src/core/tree-sitter/LanguageParser.ts (2)
18-20
: LGTM!The function correctly extracts and normalizes file extensions.
50-57
: LGTM!The function correctly handles language detection with proper type safety.
Apply Deepseek-R1's solution to resolving tree-sitter-wasms path
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
a55a13d
to
4da50f6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/core/tree-sitter/parseFile.ts (1)
17-71
: 🛠️ Refactor suggestionOptimize memory usage and improve error handling.
The current implementation can be improved in several ways:
- Use generators to yield chunks instead of storing them in memory
- Add input validation
- Improve error handling
- Add resource cleanup
Apply this diff to improve the implementation:
-export const parseFile = async (fileContent: string, filePath: string, config: RepomixConfigMerged) => { +export const parseFile = async ( + fileContent: string, + filePath: string, + config: RepomixConfigMerged +): Promise<string | undefined> => { + if (!fileContent || !filePath) { + throw new Error('Invalid input: fileContent and filePath are required'); + } + const languageParser = await getLanguageParserSingleton(); const lines = fileContent.split('\n'); if (lines.length < 1) { return ''; } const lang: SupportedLang | undefined = languageParser.guessTheLang(filePath); if (lang === undefined) { return undefined; } const query = await languageParser.getQueryForLang(lang); const parser = await languageParser.getParserForLang(lang); - const chunks = []; + async function* generateChunks() { try { const tree = parser.parse(fileContent); const captures = query.captures(tree.rootNode); captures.sort((a, b) => a.node.startPosition.row - b.node.startPosition.row); for (const capture of captures) { const { node, name } = capture; const startRow = node.startPosition.row; const endRow = node.endPosition.row; if (!name.includes('name') || !lines[startRow]) { continue; } const selectedLines = lines.slice(startRow, endRow + 1); if (selectedLines.length < 1) { continue; } - const chunk = selectedLines.join('\n'); - chunks.push(chunk); + yield selectedLines.join('\n'); } } catch (error: unknown) { - logger.log(`Error parsing file: ${error}\n`); + const message = error instanceof Error ? error.message : String(error); + logger.error(`Error parsing file ${filePath}: ${message}`); + throw error; } - return chunks.join('\n'); + } + + const chunks = []; + for await (const chunk of generateChunks()) { + chunks.push(chunk); + } + return chunks.join('\n'); }; + +parseFile.cleanup = async () => { + const languageParser = await getLanguageParserSingleton(); + await languageParser.cleanup(); +};
🧹 Nitpick comments (2)
src/core/tree-sitter/parseFile.ts (1)
16-17
: Address the TODO comment.The TODO comment indicates that the
config
parameter is not being used. Consider removing it if it's not needed, or document its intended future use.Do you want me to help implement the configuration handling?
tests/core/tree-sitter/parseFile.test.ts (1)
50-50
: Remove debug logging.The
console.log
statement should be removed as it's not needed in tests.- console.log(result);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
package-lock.json
is excluded by!**/package-lock.json
📒 Files selected for processing (29)
package.json
(1 hunks)repomix.config.json
(1 hunks)src/cli/actions/defaultAction.ts
(1 hunks)src/cli/cliRun.ts
(1 hunks)src/config/configSchema.ts
(2 hunks)src/core/file/workers/fileProcessWorker.ts
(2 hunks)src/core/tree-sitter/LanguageParser.ts
(1 hunks)src/core/tree-sitter/ext2lang.ts
(1 hunks)src/core/tree-sitter/lang2query.ts
(1 hunks)src/core/tree-sitter/loadLanguage.ts
(1 hunks)src/core/tree-sitter/parseFile.ts
(1 hunks)src/core/tree-sitter/queries/README.md
(1 hunks)src/core/tree-sitter/queries/c.ts
(1 hunks)src/core/tree-sitter/queries/c_sharp.ts
(1 hunks)src/core/tree-sitter/queries/cpp.ts
(1 hunks)src/core/tree-sitter/queries/go.ts
(1 hunks)src/core/tree-sitter/queries/java.ts
(1 hunks)src/core/tree-sitter/queries/javascript.ts
(1 hunks)src/core/tree-sitter/queries/php.ts
(1 hunks)src/core/tree-sitter/queries/python.ts
(1 hunks)src/core/tree-sitter/queries/ruby.ts
(1 hunks)src/core/tree-sitter/queries/rust.ts
(1 hunks)src/core/tree-sitter/queries/swift.ts
(1 hunks)src/core/tree-sitter/queries/typescript.ts
(1 hunks)tests/cli/actions/defaultAction.test.ts
(1 hunks)tests/cli/cliRun.test.ts
(2 hunks)tests/config/configSchema.test.ts
(3 hunks)tests/core/tree-sitter/LanguageParser.test.ts
(1 hunks)tests/core/tree-sitter/parseFile.test.ts
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (25)
- src/cli/cliRun.ts
- repomix.config.json
- src/core/tree-sitter/ext2lang.ts
- src/core/tree-sitter/queries/python.ts
- src/core/tree-sitter/queries/php.ts
- src/core/tree-sitter/queries/README.md
- src/core/tree-sitter/queries/rust.ts
- tests/cli/cliRun.test.ts
- src/core/tree-sitter/queries/ruby.ts
- src/cli/actions/defaultAction.ts
- src/config/configSchema.ts
- tests/config/configSchema.test.ts
- package.json
- tests/cli/actions/defaultAction.test.ts
- src/core/tree-sitter/queries/swift.ts
- src/core/tree-sitter/loadLanguage.ts
- src/core/tree-sitter/lang2query.ts
- src/core/tree-sitter/queries/c.ts
- src/core/tree-sitter/queries/go.ts
- src/core/tree-sitter/queries/cpp.ts
- src/core/tree-sitter/queries/typescript.ts
- src/core/tree-sitter/queries/c_sharp.ts
- tests/core/tree-sitter/LanguageParser.test.ts
- src/core/tree-sitter/queries/java.ts
- src/core/tree-sitter/queries/javascript.ts
🔇 Additional comments (2)
src/core/tree-sitter/LanguageParser.ts (1)
1-67
: LGTM!The implementation looks good. The class provides a robust interface for language parsing with proper error handling and resource cleanup.
src/core/file/workers/fileProcessWorker.ts (1)
1-62
: LGTM!The implementation looks good. The file processing logic is well-structured with proper error handling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (1)
src/core/tree-sitter/parseFile.ts (1)
67-69
: 🛠️ Refactor suggestionImprove error handling and logging.
The error handling could be improved by:
- Using
logger.error
instead oflogger.log
- Including more context in the error message
- Rethrowing the error for better error propagation
Apply this diff to improve error handling:
} catch (error: unknown) { - logger.log(`Error parsing file: ${error}\n`); + const message = error instanceof Error ? error.message : String(error); + const context = { + filePath, + language: lang, + error: message + }; + logger.error('Error parsing file:', context); + throw new Error(`Failed to parse ${filePath}: ${message}`); }🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 68-69: src/core/tree-sitter/parseFile.ts#L68-L69
Added lines #L68 - L69 were not covered by tests
🧹 Nitpick comments (6)
src/core/tree-sitter/ext2Lang.ts (1)
4-23
: Consider adding more file extensions.The mapping is comprehensive but could benefit from additional common file extensions:
.mjs
and.mts
for ECMAScript modules.cc
for C++.m
and.mm
for Objective-C/C++Apply this diff to add more extensions:
export const ext2Lang = { vue: 'javascript', cjs: 'javascript', js: 'javascript', + mjs: 'javascript', jsx: 'javascript', ts: 'typescript', + mts: 'typescript', tsx: 'typescript', h: 'c', c: 'c', hpp: 'cpp', cpp: 'cpp', + cc: 'cpp', py: 'python', rs: 'rust', java: 'java', go: 'go', cs: 'c_sharp', rb: 'ruby', php: 'php', swift: 'swift', + m: 'c', + mm: 'cpp', };src/core/tree-sitter/queries/php.ts (1)
1-28
: Consider adding more PHP-specific patterns.The queries cover basic PHP constructs but could benefit from additional patterns:
- Trait declarations
- Interface declarations
- Namespace declarations
- Use statements
Add these patterns to enhance PHP support:
export const queryPhp = ` (class_declaration name: (name) @name.definition.class) @definition.class +; Trait declarations +(trait_declaration + name: (name) @name.definition.trait) @definition.trait + +; Interface declarations +(interface_declaration + name: (name) @name.definition.interface) @definition.interface + +; Namespace declarations +(namespace_definition + name: (namespace_name) @name.definition.namespace) @definition.namespace + +; Use statements +(use_declaration + (qualified_name (name) @name.reference.use)) @reference.use + (function_definition name: (name) @name.definition.function) @definition.function (method_declaration name: (name) @name.definition.function) @definition.function (object_creation_expression [ (qualified_name (name) @name.reference.class) (variable_name (name) @name.reference.class) ]) @reference.class (function_call_expression function: [ (qualified_name (name) @name.reference.call) (variable_name (name)) @name.reference.call ]) @reference.call (scoped_call_expression name: (name) @name.reference.call) @reference.call (member_call_expression name: (name) @name.reference.call) @reference.call `;src/core/tree-sitter/queries/go.ts (1)
1-32
: Consider adding more Go-specific patterns.The queries handle basic Go constructs well, especially with comment handling, but could benefit from additional patterns:
- Interface declarations
- Struct declarations
- Package declarations
- Import statements
Add these patterns to enhance Go support:
export const queryGo = ` ( (comment)* @doc . (function_declaration name: (identifier) @name.definition.function) @definition.function (#strip! @doc "^//\\s*") (#set-adjacent! @doc @definition.function) ) ( (comment)* @doc . (method_declaration name: (field_identifier) @name.definition.method) @definition.method (#strip! @doc "^//\\s*") (#set-adjacent! @doc @definition.method) ) +; Interface declarations +( + (comment)* @doc + . + (type_declaration + (type_spec + name: (type_identifier) @name.definition.interface + type: (interface_type))) @definition.interface + (#strip! @doc "^//\\s*") + (#set-adjacent! @doc @definition.interface) +) + +; Struct declarations +( + (comment)* @doc + . + (type_declaration + (type_spec + name: (type_identifier) @name.definition.struct + type: (struct_type))) @definition.struct + (#strip! @doc "^//\\s*") + (#set-adjacent! @doc @definition.struct) +) + +; Package declarations +(package_clause + name: (identifier) @name.definition.package) @definition.package + +; Import declarations +(import_spec + path: (interpreted_string_literal) @name.reference.import) @reference.import + (call_expression function: [ (identifier) @name.reference.call (parenthesized_expression (identifier) @name.reference.call) (selector_expression field: (field_identifier) @name.reference.call) (parenthesized_expression (selector_expression field: (field_identifier) @name.reference.call)) ]) @reference.call (type_spec name: (type_identifier) @name.definition.type) @definition.type (type_identifier) @name.reference.type @reference.type `;src/core/tree-sitter/queries/cSharp.ts (1)
1-48
: Enhance C# query patterns for better code coverage.The current Tree-sitter query patterns could be enhanced to cover more C# language constructs:
- Properties and events
- Async/await methods
- Generic type parameters
- Delegates and lambda expressions
Add these patterns to improve code coverage:
export const queryCSharp = ` +; property definitions +(property_declaration + name: (identifier) @name.definition.property) @definition.property + +; event definitions +(event_declaration + name: (identifier) @name.definition.event) @definition.event + +; async method definitions +(method_declaration + modifiers: (modifier_list [(async_modifier)]) @modifiers + name: (identifier) @name.definition.method) @definition.method + +; generic type parameters +(type_parameter_list + (type_parameter + name: (identifier) @name.definition.type_parameter)) @definition.type_parameter + +; delegate definitions +(delegate_declaration + name: (identifier) @name.definition.delegate) @definition.delegate + +; lambda expressions +(lambda_expression + parameters: (parameter_list) @parameters) @definition.lambda ${queryCSharp}`src/core/tree-sitter/queries/rust.ts (1)
1-62
: Enhance Rust query patterns for better code coverage.The current Tree-sitter query patterns could be enhanced to cover more Rust language constructs:
- Async functions and blocks
- Lifetime parameters
- Const and static items
- Associated types in traits
Add these patterns to improve code coverage:
export const queryRust = ` +; async function definitions +(async_fn_item + name: (identifier) @name.definition.async_function) @definition.async_function + +; lifetime parameters +(lifetime + name: (identifier) @name.definition.lifetime) @definition.lifetime + +; const items +(const_item + name: (identifier) @name.definition.const) @definition.const + +; static items +(static_item + name: (identifier) @name.definition.static) @definition.static + +; associated types in traits +(assoc_type + name: (type_identifier) @name.definition.type) @definition.type ${queryRust}`src/core/tree-sitter/parseFile.ts (1)
16-17
: Address the TODO comment about unused config parameter.The config parameter is marked as unused. Consider using it to configure parsing behavior, such as:
- Filtering specific types of nodes
- Controlling the depth of parsing
- Setting parsing options
Would you like me to help implement the configuration options?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (19)
src/cli/actions/defaultAction.ts
(2 hunks)src/cli/actions/remoteAction.ts
(1 hunks)src/cli/cliSpinner.ts
(1 hunks)src/core/tree-sitter/LanguageParser.ts
(1 hunks)src/core/tree-sitter/ext2Lang.ts
(1 hunks)src/core/tree-sitter/lang2Query.ts
(1 hunks)src/core/tree-sitter/parseFile.ts
(1 hunks)src/core/tree-sitter/queries/c.ts
(1 hunks)src/core/tree-sitter/queries/cSharp.ts
(1 hunks)src/core/tree-sitter/queries/cpp.ts
(1 hunks)src/core/tree-sitter/queries/go.ts
(1 hunks)src/core/tree-sitter/queries/java.ts
(1 hunks)src/core/tree-sitter/queries/javascript.ts
(1 hunks)src/core/tree-sitter/queries/php.ts
(1 hunks)src/core/tree-sitter/queries/python.ts
(1 hunks)src/core/tree-sitter/queries/ruby.ts
(1 hunks)src/core/tree-sitter/queries/rust.ts
(1 hunks)src/core/tree-sitter/queries/swift.ts
(1 hunks)src/core/tree-sitter/queries/typescript.ts
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- src/cli/actions/remoteAction.ts
🚧 Files skipped from review as they are similar to previous changes (8)
- src/core/tree-sitter/queries/c.ts
- src/core/tree-sitter/queries/ruby.ts
- src/core/tree-sitter/queries/swift.ts
- src/core/tree-sitter/queries/python.ts
- src/core/tree-sitter/queries/java.ts
- src/core/tree-sitter/queries/cpp.ts
- src/core/tree-sitter/queries/typescript.ts
- src/core/tree-sitter/queries/javascript.ts
🧰 Additional context used
🪛 GitHub Check: codecov/patch
src/core/tree-sitter/LanguageParser.ts
[warning] 30-32: src/core/tree-sitter/LanguageParser.ts#L30-L32
Added lines #L30 - L32 were not covered by tests
[warning] 37-38: src/core/tree-sitter/LanguageParser.ts#L37-L38
Added lines #L37 - L38 were not covered by tests
[warning] 63-65: src/core/tree-sitter/LanguageParser.ts#L63-L65
Added lines #L63 - L65 were not covered by tests
src/core/tree-sitter/parseFile.ts
[warning] 23-24: src/core/tree-sitter/parseFile.ts#L23-L24
Added lines #L23 - L24 were not covered by tests
[warning] 29-30: src/core/tree-sitter/parseFile.ts#L29-L30
Added lines #L29 - L30 were not covered by tests
[warning] 62-62: src/core/tree-sitter/parseFile.ts#L62
Added line #L62 was not covered by tests
[warning] 68-69: src/core/tree-sitter/parseFile.ts#L68-L69
Added lines #L68 - L69 were not covered by tests
src/cli/actions/defaultAction.ts
[warning] 171-172: src/cli/actions/defaultAction.ts#L171-L172
Added lines #L171 - L172 were not covered by tests
⏰ Context from checks skipped due to timeout of 90000ms (8)
- GitHub Check: Test (windows-latest, 23.x)
- GitHub Check: Test (windows-latest, 22.x)
- GitHub Check: Test (windows-latest, 21.x)
- GitHub Check: Test (windows-latest, 20.x)
- GitHub Check: Test (windows-latest, 19.x)
- GitHub Check: Test (windows-latest, 18.x)
- GitHub Check: Test (windows-latest, 18.0.0)
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (5)
src/cli/cliSpinner.ts (1)
6-6
: LGTM! Good improvement in export style.The change from default export to named export is a good practice as it makes imports more explicit and improves IDE auto-import functionality.
src/cli/actions/defaultAction.ts (2)
14-14
: LGTM! Import statement updated correctly.The import statement has been properly updated to use named import syntax, aligning with the changes in
cliSpinner.ts
.
170-172
:❓ Verification inconclusive
Add test coverage for the compress option.
The compress option handling logic needs test coverage to ensure it works as expected.
Would you like me to help you generate test cases for the compress option handling? The test cases should cover:
- Setting compress to true/false via CLI options
- Verifying the option is correctly merged into the final config
- Ensuring the option affects the output as expected
🏁 Script executed:
#!/bin/bash # Description: Check for existing test coverage of the compress option # Search for test files that might contain compress option tests rg -l 'compress' tests/Length of output: 163
Verify Complete Test Coverage for the Compress Option
- It appears that multiple test files (e.g.,
tests/cli/actions/defaultAction.test.ts
,tests/cli/cliRun.test.ts
) contain references to "compress." Please ensure these tests explicitly cover:
- Setting
compress
to bothtrue
andfalse
via CLI options.- Verifying that the
compress
property is correctly merged into the final configuration (i.e., incliConfig.output
).- Confirming that the compress option correctly affects the output behavior.
If the current tests don’t sufficiently cover these scenarios, please add targeted test cases. Let me know if you’d like assistance generating those tests.
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 171-172: src/cli/actions/defaultAction.ts#L171-L172
Added lines #L171 - L172 were not covered by testssrc/core/tree-sitter/ext2Lang.ts (1)
1-3
: LGTM! Documentation is helpful.The reference to Tree-sitter WASM output helps understand the source of language support.
src/core/tree-sitter/lang2Query.ts (1)
1-30
: LGTM! Well-organized implementation.The implementation is clean and well-structured:
- Imports are organized logically
- Language-to-query mapping is comprehensive
- Type definition using
keyof typeof
is elegant and type-safe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
src/core/tree-sitter/languageParser.ts (2)
22-33
: Unify error handling approach for consistency.Currently,
prepareLang
throws aRepomixError
, whileinit
rethrows a genericError
. For consistency across the codebase, consider usingRepomixError
in both methods to maintain uniform error handling.Here is a potential fix:
-throw new Error(`Failed to initialize parser: ${message}`); +throw new RepomixError(`Failed to initialize parser: ${message}`);
36-40
: Consider concurrency handling forprepareLang
.When two calls to
getParserForLang
with the same language occur in parallel, they may triggerprepareLang
simultaneously. Currently, nothing prevents duplicative initialization. While Node.js is single-threaded, concurrency through async awaits can still overlap. You may want to employ a locking mechanism or a queue to avoid redundant loads.src/core/tree-sitter/parseFile.ts (2)
16-16
: Address the TODO to utilizeconfig
.A
TODO
note indicates thatconfig
is not used. Decide whether the configuration influences filtering, output formatting, or AST capturing. Otherwise, remove the parameter to avoid confusion.
34-40
: Consider logging errors at the error level.Currently, the code logs parsing errors with
logger.log(...)
. For clarity and structured logging, uselogger.error(...)
when an exception occurs during parsing, ensuring that errors can be differentiated from regular logs.- logger.log(`Error parsing file: ${error}\n`); + logger.error(`Error parsing file: ${error}`);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/core/tree-sitter/languageParser.ts
(1 hunks)src/core/tree-sitter/parseFile.ts
(1 hunks)tests/core/tree-sitter/LanguageParser.test.ts
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/core/tree-sitter/LanguageParser.test.ts
⏰ Context from checks skipped due to timeout of 90000ms (7)
- GitHub Check: Test (windows-latest, 23.x)
- GitHub Check: Test (windows-latest, 22.x)
- GitHub Check: Test (windows-latest, 21.x)
- GitHub Check: Test (windows-latest, 20.x)
- GitHub Check: Test (windows-latest, 19.x)
- GitHub Check: Test coverage
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (2)
src/core/tree-sitter/languageParser.ts (1)
59-67
: Ensure that parser initialization is retried or handled gracefully.If
Parser.init()
fails once, future calls toinit()
might need a way to retry or clean up partial states. Consider whether partial initializations are possible and implement a safe fallback. For now, re-invokinginit()
might not fix partial failures.src/core/tree-sitter/parseFile.ts (1)
20-71
: Optional: Incorporate streaming or generator approach for large files.As your captures can iterate through large file contents, consider yielding snippets instead of loading all into memory. This can enhance performance, especially for huge files. Revisit existing bot suggestions about using generators if memory usage is a concern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (2)
src/core/tree-sitter/parseFile.ts (2)
21-23
:⚠️ Potential issueAdd input validation.
The function should validate input parameters.
Apply this diff to add input validation:
export const parseFile = async (fileContent: string, filePath: string, config: RepomixConfigMerged) => { + if (!fileContent) { + throw new Error('Invalid input: fileContent is required'); + } + if (!filePath) { + throw new Error('Invalid input: filePath is required'); + } + if (!config) { + throw new Error('Invalid input: config is required'); + } const languageParser = await getLanguageParserSingleton();
77-79
: 🛠️ Refactor suggestionImprove error handling.
The current error handling lacks proper error typing and logging structure.
Apply this diff to improve error handling:
- } catch (error: unknown) { - logger.log(`Error parsing file: ${error}\n`); + } catch (error: unknown) { + const context = { + filePath, + language: lang, + error: error instanceof Error ? error.message : String(error) + }; + logger.error('Error parsing file:', context); + throw new Error(`Failed to parse ${filePath}: ${context.error}`); }🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 78-79: src/core/tree-sitter/parseFile.ts#L78-L79
Added lines #L78 - L79 were not covered by tests
🧹 Nitpick comments (2)
src/core/tree-sitter/parseFile.ts (2)
16-18
: Enhance chunk normalization.The
normalizeChunk
function could be more robust in handling edge cases.Apply this diff to improve the implementation:
function normalizeChunk(chunk: string): string { - return chunk.trim(); + if (!chunk) return ''; + // Normalize line endings and remove extra whitespace + return chunk + .replace(/\r\n/g, '\n') + .split('\n') + .map(line => line.trim()) + .join('\n') + .trim(); }
20-20
: Address the TODO comment.The config parameter is currently unused. Consider using it to configure parsing behavior.
Would you like me to help implement configuration options for the parser? Some potential options could include:
- Configurable chunk size limits
- Filtering options for specific node types
- Custom normalization rules
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/core/tree-sitter/parseFile.ts
(1 hunks)
🧰 Additional context used
🪛 GitHub Check: codecov/patch
src/core/tree-sitter/parseFile.ts
[warning] 27-28: src/core/tree-sitter/parseFile.ts#L27-L28
Added lines #L27 - L28 were not covered by tests
[warning] 33-34: src/core/tree-sitter/parseFile.ts#L33-L34
Added lines #L33 - L34 were not covered by tests
[warning] 67-67: src/core/tree-sitter/parseFile.ts#L67
Added line #L67 was not covered by tests
[warning] 78-79: src/core/tree-sitter/parseFile.ts#L78-L79
Added lines #L78 - L79 were not covered by tests
🪛 GitHub Actions: CI
src/core/tree-sitter/parseFile.ts
[error] 68-68: Unexpected whitespace change detected. Please remove unnecessary blank lines.
🔇 Additional comments (1)
src/core/tree-sitter/parseFile.ts (1)
26-28
: Add test coverage for edge cases.Several code paths lack test coverage according to the static analysis:
- Empty file handling
- Unsupported language handling
- Empty chunk handling
- Error handling
Would you like me to help generate comprehensive test cases for these scenarios?
Also applies to: 33-34, 66-68, 78-79
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 27-28: src/core/tree-sitter/parseFile.ts#L27-L28
Added lines #L27 - L28 were not covered by tests
How is this going? |
9265066
to
9d4f260
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/core/tree-sitter/parseFile.ts (1)
20-20
: Address the TODO comment.The TODO comment indicates that the
config
parameter is not used. Please ensure that the parameter is utilized or remove it if unnecessary.Do you want me to help implement the configuration handling or open a new issue to track this task?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
README.md
(4 hunks)repomix.config.json
(1 hunks)src/core/tree-sitter/parseFile.ts
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- repomix.config.json
🧰 Additional context used
🪛 LanguageTool
README.md
[uncategorized] ~417-~417: Loose punctuation mark.
Context: ...can increase token count. - --compress
: Perform intelligent code extraction, fo...
(UNLIKELY_OPENING_PUNCTUATION)
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: Test (windows-latest, 23.x)
- GitHub Check: Test (windows-latest, 22.x)
- GitHub Check: Test (windows-latest, 21.x)
- GitHub Check: Test (windows-latest, 20.x)
- GitHub Check: Test (windows-latest, 19.x)
🔇 Additional comments (6)
src/core/tree-sitter/parseFile.ts (4)
6-14
: Improve type safety and add cleanup method.The singleton pattern implementation can be improved:
- Use
undefined
instead ofnull
for better TypeScript idioms.- Add a cleanup method for proper resource management.
Apply this diff to improve the implementation:
-let languageParserSingleton: LanguageParser | null = null; +let languageParserSingleton: LanguageParser | undefined; const getLanguageParserSingleton = async () => { if (!languageParserSingleton) { languageParserSingleton = new LanguageParser(); await languageParserSingleton.init(); } return languageParserSingleton; }; + +export const cleanupLanguageParser = async () => { + if (languageParserSingleton) { + await languageParserSingleton.cleanup(); + languageParserSingleton = undefined; + } +};
16-18
: LGTM!The function is simple, focused, and correctly implements string trimming.
21-23
: Add input validation.The function should validate input parameters.
Apply this diff to add input validation:
export const parseFile = async (fileContent: string, filePath: string, config: RepomixConfigMerged) => { + if (!fileContent) { + throw new Error('Invalid input: fileContent is required'); + } + if (!filePath) { + throw new Error('Invalid input: filePath is required'); + } + if (!config) { + throw new Error('Invalid input: config is required'); + } const languageParser = await getLanguageParserSingleton();
38-81
: Optimize memory usage and improve type safety.Consider the following improvements:
- Use a generator to yield chunks instead of storing them in memory
- Add type guards for error handling
- Add input validation
Apply this diff to improve the function:
export const parseFile = async (fileContent: string, filePath: string, config: RepomixConfigMerged) => { const languageParser = await getLanguageParserSingleton(); // Split the file content into individual lines const lines = fileContent.split('\n'); if (lines.length < 1) { return ''; } const lang = languageParser.guessTheLang(filePath); if (lang === undefined) { // Language not supported return undefined; } const query = await languageParser.getQueryForLang(lang); const parser = await languageParser.getParserForLang(lang); - const processedChunks = new Set<string>(); - const chunks = []; + + async function* generateChunks() { try { // Parse the file content into an Abstract Syntax Tree (AST) const tree = parser.parse(fileContent); const captures = query.captures(tree.rootNode); captures.sort((a, b) => a.node.startPosition.row - b.node.startPosition.row); for (const capture of captures) { const { node, name } = capture; const startRow = node.startPosition.row; const endRow = node.endPosition.row; if (!name.includes('name') || !lines[startRow]) { continue; } const selectedLines = lines.slice(startRow, endRow + 1); if (selectedLines.length < 1) { continue; } const chunk = selectedLines.join('\n'); const normalizedChunk = normalizeChunk(chunk); - if (!processedChunks.has(normalizedChunk)) { - processedChunks.add(normalizedChunk); - chunks.push(chunk); - } + yield chunk; } - } catch (error: unknown) { - logger.log(`Error parsing file: ${error}\n`); + } catch (error: unknown) { + const message = error instanceof Error ? error.message : String(error); + logger.error(`Error parsing file ${filePath}: ${message}`); + throw error; } - return chunks.join('\n'); + } + + const processedChunks = new Set<string>(); + const chunks = []; + for await (const chunk of generateChunks()) { + const normalizedChunk = normalizeChunk(chunk); + if (!processedChunks.has(normalizedChunk)) { + processedChunks.add(normalizedChunk); + chunks.push(chunk); + } + } + return chunks.join('\n'); };README.md (2)
417-417
: LGTM!The new
--compress
option is well-documented with a clear description.🧰 Tools
🪛 LanguageTool
[uncategorized] ~417-~417: Loose punctuation mark.
Context: ...can increase token count. ---compress
: Perform intelligent code extraction, fo...(UNLIKELY_OPENING_PUNCTUATION)
515-549
: LGTM!The code compression section is well-documented with:
- Clear explanation of the feature
- Helpful example demonstrating the compression
- Note about supported languages
9d4f260
to
169ea22
Compare
@huy-trn I plan to update the website documentation after this gets merged. Would you be okay with merging this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
README.md (2)
424-427
: [CLI Option Description Enhancement]
The newly added--compress
option is clearly introduced in the CLI options section with a concise description. However, the description would benefit from consistent punctuation. For example, adding a period at the end of the sentence can improve clarity.- `--compress`: Perform intelligent code extraction, focusing on essential function and class signatures to reduce token count + `--compress`: Perform intelligent code extraction, focusing on essential function and class signatures to reduce token count.🧰 Tools
🪛 LanguageTool
[uncategorized] ~426-~426: Loose punctuation mark.
Context: ...can increase token count. ---compress
: Perform intelligent code extraction, fo...(UNLIKELY_OPENING_PUNCTUATION)
524-548
: [Code Compression Section Clarity]
The new "### Code Compression" section provides a thorough explanation of how the--compress
option works, including a practical example that demonstrates the transformation from full code to a compressed output. This section helps users understand the benefits and expected behavior of the compression feature. One suggestion is to consider adding a brief note about any limitations or scenarios where the compression might not capture all the necessary contextual details, which can further guide users.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
README.md
(5 hunks)repomix.config.json
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- repomix.config.json
🧰 Additional context used
🪛 LanguageTool
README.md
[uncategorized] ~426-~426: Loose punctuation mark.
Context: ...can increase token count. - --compress
: Perform intelligent code extraction, fo...
(UNLIKELY_OPENING_PUNCTUATION)
🔇 Additional comments (1)
README.md (1)
597-603
: [Configuration Example Consistency]
The example configuration now includes the"compress": false,
property. This update aligns the configuration schema with the new CLI option, ensuring that users can control code compression via their config file. The default value offalse
is appropriate and clearly documented.
Hi @SpyC0der77! The project is going well! |
Wow! That’s quite an effort, @yamadashy. I agree that we can merge it at this stage and address any issues later, since it’s a new feature and won’t affect other parts of Repomix. Sorry I couldn’t find time to finish the PR sooner or write cleaner code, but thanks to you, it’s all covered now! Again, great work! |
@huy-trn I'll merge and release it now! |
Hi @yamadashy,
This PR is still a work in progress, but as per your suggestion, I'm submitting it early for easier discussion.
I would greatly appreciate your feedback, particularly on the CLI options and how we should handle non-parseable content from users' code.
I'll resolve the merge conflicts soon when I get a chance.
Need help:
I'm facing some challenges with the Tree-sitter queries, as I'm not very familiar with either Tree-sitter queries or the wide range of languages we're aiming to support. For instance, the parser is currently unable to identify arrow function declarations in TypeScript code.
Ref:
Checklist
npm run test
npm run lint