-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
search: add support for counting individual matches (Fixes #566) #814
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@balajisivaraman This looks great to me! I left some comments but I think they are all pretty minor!
src/search_buffer.rs
Outdated
@@ -21,7 +21,8 @@ pub struct BufferSearcher<'a, W: 'a> { | |||
grep: &'a Grep, | |||
path: &'a Path, | |||
buf: &'a [u8], | |||
match_count: u64, | |||
matching_line_count: u64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you name this match_line_count
so that it's more consistent with match_count
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, in the future (no need to do this for this PR), it would be super helpful to put renames (and similarly uninteresting changes) in a separate commit. In this case, a commit that came before the principle change. The reason is that I don't really care about the mechanics of a name change so long as the compiler is happy, but it increases the noise in the diff among other changes that I do carefully want to look at.
It can be hard to do this though because it does require some forethought, so don't feel like I'm imposing a strong requirement! But if you think of it and it's not too much work, it definitely helps the review process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah this does indeed make a lot of sense to me. I understand completely and will make the change in a separate commit for this PR also. No issues. :-) Good housekeeping is always welcome.
src/search_buffer.rs
Outdated
if self.opts.count && self.matching_line_count > 0 { | ||
self.printer.path_count(self.path, self.matching_line_count); | ||
} else if self.opts.count_matches | ||
&& self.match_count.map_or(false, |c| c > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When wrapping a conditional to multiple lines, could you move the brace down to the next line so that the division between the conditional and the conditional body is more clear?
src/search_buffer.rs
Outdated
let mut last_end = 0; | ||
for m in self.grep.iter(self.buf) { | ||
if self.opts.invert_match { | ||
self.print_inverted_matches(last_end, m.start()); | ||
} else { | ||
self.print_match(m.start(), m.end()); | ||
self.count_individual_matches(m.start(), m.end()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to move this call into print_match
itself?
src/search_stream.rs
Outdated
@@ -291,6 +305,7 @@ impl<'a, R: io::Read, W: WriteColor> Searcher<'a, R, W> { | |||
self.print_after_context(start); | |||
self.print_before_context(start); | |||
self.print_match(start, end); | |||
self.count_individual_matches(start, end); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as for the mmap searcher. Can you move this into the print_match
method?
@balajisivaraman OK, I finally got a chance to merge this today in 27fc9f2. All I did was fix up the conflicts with latest master and reformatted the commit message. (Commit messages should note the issues they close.) I also added an additional commit that causes |
As discussed in #566, this PR adds a
--count-matches
flag that will count the individual matches instead of the matching lines.This does essentially the same thing
printer.matched
does for printing only matches, which is to use the Regex'sfind_iter
method for getting the individual matches. We then increment a variable to keep track of the count, ignoring the actual match itself. Since this will always happen behind a flag, I hope this won't have any performance implications.I initially considered using the existing
match_count
variable insearch_stream
andsearch_buffer
to keep track of individual matches if--count-matches
were passed in. This would've been handy if we later wanted to use the individual match count for something like the--stats
flag, as it would get returned all the way up tomain.rs
; we can then use this for our purposes.However, I decided against this because it conflicts with the existing
--max-count
argument, which terminates the search early if we've hit the max count of matching lines. As a result,rg
still terminates for matching line count and not individual match count even after this change. I'm just throwing this out there for your consideration.Currently, I've renamed the existing
match_count
tomatching_line_count
to better reflect what it keeps track of.match_count
is now anOption<usize>
and used to keep track of individual match count.-c/--count
and--count-matches
will override each other.