author Andrew Gallant <jamslam@gmail.com> 2020-05-08 08:09:26 -0400
committer Andrew Gallant <jamslam@gmail.com> 2020-05-08 23:24:40 -0400
commit 7ed9a31819aa4f1c1b25f1fa95bdf602232ddbb0
tree 70790f5efb9e96b85b00bef4a035abf38f121320 /tests
parent a2e6aec7a4d9382941932245e8854f0ae5703a5e
printer: fix --count-matches output
In order to implement --count-matches, we simply re-execute the regex on the spans reported by the searcher. The spans always correspond to the lines that participated in the match. This is the correct thing to do, except when the regex contains look-ahead (or look-behind).

In particular, look-around permits the regex's match success to depend on an arbitrary point before or after the lines actually reported as participating in the match. Since only the matched lines are reported to the printer, it is possible for subsequent searching on those lines to fail.

A true fix for this would somehow make the total span available to the printer. But that seems tricky since it isn't always available. For PCRE2's case in multiline mode, it is available because we force it to be so for correctness.

For now, we simply detect this corner case heuristically. If the match count is zero, then it necessarily means there is some kind of look-around that isn't matching. So we set the match count to 1. This is probably incorrect in some cases, although my brain can't quite come up with a concrete example. Nevertheless, this is strictly better than the status quo.

Fixes #1573
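The failure mode and the heuristic described above can be sketched in plain Rust. This is only an illustration, not the printer's actual code. To stay within the standard library, a hand-written matcher stands in for the PCRE2 pattern `(?s)def (\w+);(?=.*use \w+)`. The function names (`count_matches`, `lookahead_sees_use`, `count_with_heuristic`) are hypothetical, and the sketch assumes the searcher reports one span per matched line.

```rust
/// Hypothetical stand-in for the look-ahead `(?=.*use \w+)` with `(?s)`:
/// true if "use " followed by a word character occurs anywhere in `rest`.
fn lookahead_sees_use(rest: &str) -> bool {
    let mut pos = 0;
    while let Some(i) = rest[pos..].find("use ") {
        let after = pos + i + 4;
        let is_word = rest[after..]
            .chars()
            .next()
            .map_or(false, |c| c.is_alphanumeric() || c == '_');
        if is_word {
            return true;
        }
        pos += i + 1;
    }
    false
}

/// Hypothetical stand-in for counting matches of
/// `(?s)def (\w+);(?=.*use \w+)` in `haystack`.
fn count_matches(haystack: &str) -> usize {
    let mut count = 0;
    let mut pos = 0;
    while let Some(i) = haystack[pos..].find("def ") {
        let start = pos + i;
        let word_start = start + 4;
        // Length in bytes of the `\w+` run following "def ".
        let word_len: usize = haystack[word_start..]
            .chars()
            .take_while(|c| c.is_alphanumeric() || *c == '_')
            .map(|c| c.len_utf8())
            .sum();
        let semi = word_start + word_len;
        if word_len > 0 && haystack[semi..].starts_with(';') {
            let end = semi + 1;
            // The look-ahead inspects text *after* the match itself.
            if lookahead_sees_use(&haystack[end..]) {
                count += 1;
                pos = end;
                continue;
            }
        }
        pos = start + 1;
    }
    count
}

/// The heuristic from the commit message: a span was reported as a match,
/// so a re-count of zero is assumed to be a look-around that cannot see
/// its context, and is bumped to 1.
fn count_with_heuristic(span: &str) -> usize {
    let n = count_matches(span);
    if n == 0 { 1 } else { n }
}

fn main() {
    let haystack = "def A;\ndef B;\nuse A;\nuse B;\n";
    // Assumed spans: only the lines that participated in each match.
    let spans = ["def A;\n", "def B;\n"];

    let full = count_matches(haystack);
    let naive: usize = spans.iter().map(|s| count_matches(s)).sum();
    let fixed: usize = spans.iter().map(|s| count_with_heuristic(s)).sum();

    println!("full haystack: {full}, naive re-count: {naive}, with heuristic: {fixed}");
}
```

Re-counting on each span alone loses the `use` lines that the look-ahead depends on, so the naive total drops to zero while the heuristic restores one match per reported span.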
Diffstat (limited to 'tests')
-rw-r--r-- tests/regression.rs 42
1 files changed, 42 insertions, 0 deletions
diff --git a/tests/regression.rs b/tests/regression.rs
index 1bfd4f0f..b34fb0ff 100644
--- a/tests/regression.rs
+++ b/tests/regression.rs
@@ -822,3 +822,45 @@ foo: TaskID int `json:\"taskID\"`
";
eqnice!(expected, cmd.arg("TaskID +int").stdout());
});
+
+// See: https://github.com/BurntSushi/ripgrep/issues/1573
+//
+// Tests that if look-ahead is used, then --count-matches is correct.
+rgtest!(r1573, |dir: Dir, mut cmd: TestCommand| {
+    // Only PCRE2 supports look-ahead.
+    if !dir.is_pcre2() {
+        return;
+    }
+
+    dir.create_bytes("foo", b"\xFF\xFE\x00\x62");
+    dir.create(
+        "foo",
+        "\
+def A;
+def B;
+use A;
+use B;
+",
+    );
+
+    // Check that normal --count is correct.
+    cmd.args(&[
+        "--pcre2",
+        "--multiline",
+        "--count",
+        r"(?s)def (\w+);(?=.*use \w+)",
+        "foo",
+    ]);
+    eqnice!("2\n", cmd.stdout());
+
+    // Now check --count-matches.
+    let mut cmd = dir.command();
+    cmd.args(&[
+        "--pcre2",
+        "--multiline",
+        "--count-matches",
+        r"(?s)def (\w+);(?=.*use \w+)",
+        "foo",
+    ]);
+    eqnice!("2\n", cmd.stdout());
+});