This commit fixes a bug where the reverse suffix literal optimization
wasn't quite right. It was too eagerly skipping past parts of the input
without verifying that there was no match. We fix this by being a bit more
careful with what we're searching by keeping track of the starting position
of the last literal matched. Subsequent literal searches then start
immediately after the last one.
This is necessary in particular when the suffix literal can have
overlapping matches. e.g., searching `000` in `0000` can match at either
position 0 or 1, but searching `abc` in `abcd` can only match at position
0.
This was initially reported as a bug against ripgrep:
BurntSushi/ripgrep#1203
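
To make the skipping problem concrete, here is a minimal toy model in Python. It is not the regex crate's code: the function name `find_with_suffix_literal`, the `eager_skip` flag, and the restriction to "a fixed number of ASCII digits followed by a literal" are all assumptions made for illustration. The eager variant resumes the literal search after the end of the previous literal hit, while the careful variant resumes one position past its start, as the commit message describes.

```python
def find_with_suffix_literal(haystack, literal, prefix_len, eager_skip):
    """Toy model: match `prefix_len` ASCII digits followed by `literal`.

    Candidates are located by searching for the suffix literal first and
    then checking the text immediately before it.
    Returns a (start, end) tuple, or None if nothing matches.
    """
    search_from = 0
    while True:
        lit_start = haystack.find(literal, search_from)
        if lit_start == -1:
            return None
        start = lit_start - prefix_len
        prefix = haystack[start:lit_start] if start >= 0 else None
        if prefix is not None and all(c in "0123456789" for c in prefix):
            return (start, lit_start + len(literal))
        if eager_skip:
            # Assumed buggy strategy: skip the whole literal hit, which
            # hides any overlapping occurrence of the literal.
            search_from = lit_start + len(literal)
        else:
            # Careful strategy: resume just after the START of the last
            # literal hit, so overlapping occurrences are still visited.
            search_from = lit_start + 1


haystack = "153.230000"
print(find_with_suffix_literal(haystack, "000", 3, eager_skip=True))
print(find_with_suffix_literal(haystack, "000", 3, eager_skip=False))
```

In this model the eager variant reports no match on `153.230000`, because the first `000` (offset 6) is preceded by `.23` and the overlapping `000` at offset 7 is never examined. The careful variant finds `230000` at offsets 4 through 10.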
Thanks so much for reporting this! It is indeed a bug in the regex engine. You can see this and this for the specific changes to the regex engine to fix this. The short story is that you were tripping over a reverse suffix literal optimization that wasn't quite correct. The reason why it seemed sensitive to different regexes and inputs is because it is! :-) The reverse suffix literal optimization only runs in very specific circumstances related to the size and structure of the regex, and this particular bug is only tripped when a suffix literal (such as `000`) can result in potentially overlapping matches. In this case, both your pattern and your haystack did this.
This should now be fixed on master, since I've bumped ripgrep's regex dependency to 1.1.2.
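
As a rough illustration of what "potentially overlapping matches" means for a literal, the following Python sketch checks whether a literal can overlap with itself and lists every offset at which it occurs. The helper names `can_overlap_itself` and `occurrences` are hypothetical and are not part of ripgrep or the regex crate.

```python
def can_overlap_itself(literal):
    """True if some proper prefix of `literal` equals a suffix of it,
    meaning two occurrences can overlap inside a longer text."""
    return any(literal[:k] == literal[-k:] for k in range(1, len(literal)))


def occurrences(haystack, literal):
    """All start offsets of `literal` in `haystack`, overlaps included."""
    last_start = len(haystack) - len(literal)
    return [i for i in range(last_start + 1) if haystack.startswith(literal, i)]


print(can_overlap_itself("000"), occurrences("0000", "000"))
print(can_overlap_itself("abc"), occurrences("abcd", "abc"))
```

Here `000` can overlap itself and occurs at offsets 0 and 1 in `0000`, while `abc` cannot and occurs only at offset 0 in `abcd`.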
What version of ripgrep are you using?
How did you install ripgrep?
What operating system are you using ripgrep on?
macOS 10.14.3 (18D109)
Describe your question, feature request, or bug.
`rg` appears to fail to find a certain pattern in a one-line file that definitely contains that pattern. I must be missing something (this seems very unlikely to be a legitimate bug), but I can't figure out what.
If this is a bug, what are the steps to reproduce the behavior?
1. `echo 153.230000 >| test.txt`
2. `rg '\d\d\d00' test.txt`. This successfully finds a match of `23000`.
3. `rg '\d\d\d000' test.txt`. This fails to find any match, when it should match `230000`.

Note that `grep '\d\d\d000' test.txt` correctly matches `230000`. (`grep --version` reports `grep (BSD grep) 2.5.1-FreeBSD`.)
If this is a bug, what is the actual behavior?
If this is a bug, what is the expected behavior?
`rg '\d\d\d000' test.txt` should identify the single match in the file, as `grep` does. Specifically:
$ rg '\d\d\d000' test.txt
1:153.230000
Other
Note that changing the corpus in seemingly irrelevant ways can cause the bug to change or disappear. For example, the `\d\d\d000` pattern matches if three `0` characters are prepended to the contents of the file (that is, the file contains `000153.230000`).
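
One way to cross-check the expected matches independently of ripgrep is to run the same patterns through a different regex engine. The sketch below uses Python's standard `re` module purely as a reference; the helper name `first_match` is hypothetical, and Python's engine is a backtracking implementation, so it says nothing about how ripgrep searches internally.

```python
import re


def first_match(pattern, text):
    """Return the leftmost match of `pattern` in `text`, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None


line = "153.230000"
print(first_match(r"\d\d\d00", line))
print(first_match(r"\d\d\d000", line))
print(first_match(r"\d\d\d000", "000" + line))
```

Under this reference engine, `\d\d\d00` yields `23000` and `\d\d\d000` yields `230000`, both with and without the three prepended `0` characters.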