author Andrew Gallant <jamslam@gmail.com> 2023-10-09 19:51:44 -0400
committer Andrew Gallant <jamslam@gmail.com> 2023-10-09 20:29:52 -0400
commit 5011f6e9f1da44ffd923d612e75e70411d63a0ea
tree bd4af8c3c00736d39b4a8736ec109758d863cdf3
parent a2799ccb41078c75a0a0420299a80ca4b1361632
changelog: add perf bug fix for \b
Like the previous CHANGELOG entry, this marks a bug that was fixed likely
with the introduction of regex 1.9:

    $ hyperfine "rg-13.0.0 -ic '\bfoo\b \bbar\b' git-3a06386e.txt" "rg -ic '\bfoo\b \bbar\b' git-3a06386e.txt"
    Benchmark 1: rg-13.0.0 -ic '\bfoo\b \bbar\b' git-3a06386e.txt
      Time (mean ± σ):      1.034 s ±  0.011 s    [User: 1.030 s, System: 0.004 s]
      Range (min … max):    1.021 s …  1.053 s    10 runs

    Benchmark 2: rg -ic '\bfoo\b \bbar\b' git-3a06386e.txt
      Time (mean ± σ):       6.3 ms ±   0.3 ms    [User: 4.6 ms, System: 1.6 ms]
      Range (min … max):     5.6 ms …   7.3 ms    343 runs

    Summary
      'rg -ic '\bfoo\b \bbar\b' git-3a06386e.txt' ran
      164.95 ± 7.70 times faster than 'rg-13.0.0 -ic '\bfoo\b \bbar\b' git-3a06386e.txt'

This was not fixed by making \b itself faster, but rather, by improving
inner literal extraction. In particular, if the regex doesn't have any
literals extracted, then search time can still be quite slow:

    $ time rg-13.0.0 -ic '\b[a-z]{3}\b\s\b[a-z]{3}\b' git-3a06386e.txt
    57538

    real    0.427
    user    0.423
    sys     0.003
    maxmem  46 MB
    faults  0

    $ time rg -ic '\b[a-z]{3}\b\s\b[a-z]{3}\b' git-3a06386e.txt
    57538

    real    0.337
    user    0.333
    sys     0.003
    maxmem  46 MB
    faults  0

But then again, so is grep, because grep doesn't benefit from any literal
optimizations either:

    $ time grep -E -ic '\b[a-z]{3}\b\s\b[a-z]{3}\b' git-3a06386e.txt
    62396

    real    1.316
    user    1.292
    sys     0.007
    maxmem  13 MB
    faults  7

The count mismatch should probably be investigated.

Fixes #1760
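
For readers unfamiliar with the idea, here is a rough sketch of why an extracted inner literal helps. It is not how the regex crate implements this. It is a toy Python illustration that assumes the caller supplies the required literals by hand (the hypothetical `required_literals` argument) and that those literals are plain ASCII. Lines lacking a required literal are rejected with a cheap substring test, so the full regex only runs on plausible candidates:

```python
import re


def count_matching_lines(lines, pattern, required_literals, flags=0):
    """Count lines matching `pattern`, using literals as a cheap prefilter.

    `required_literals` is a hand-written assumption: substrings that every
    match of `pattern` must contain. A real engine would derive them from
    the regex itself.
    """
    regex = re.compile(pattern, flags)
    fold = bool(flags & re.IGNORECASE)
    # Fold the needles once; ASCII-only literals are assumed here.
    needles = [lit.lower() if fold else lit for lit in required_literals]

    count = 0
    for line in lines:
        haystack = line.lower() if fold else line
        # Fast path: a line missing any required literal cannot match.
        if not all(needle in haystack for needle in needles):
            continue
        # Slow path: only candidate lines reach the regex engine.
        if regex.search(line):
            count += 1
    return count


lines = ["foo bar", "a Foo Bar b", "foobar", "food bar", "nothing here"]
pattern = r"\bfoo\b \bbar\b"

result = count_matching_lines(lines, pattern, ["foo", "bar"], re.IGNORECASE)
baseline = sum(1 for line in lines if re.search(pattern, line, re.IGNORECASE))
```

When no literal can be supplied, as with '\b[a-z]{3}\b\s\b[a-z]{3}\b', there is nothing to prefilter on and every line has to go through the regex engine, which is consistent with the slower timings above.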
-rw-r--r-- CHANGELOG.md 2
1 file changed, 2 insertions, 0 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 3c160b9f..f9180118 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -10,6 +10,8 @@ Unreleased changes. Release notes have not yet been written.
Performance improvements:
+* [PERF #1760](https://github.com/BurntSushi/ripgrep/issues/1760):
+ Make most searches with `\b` look-arounds (among others) much faster.
* [PERF #2591](https://github.com/BurntSushi/ripgrep/pull/2591):
Parallel directory traversal now uses work stealing for faster searches.