author | Andrew Gallant <jamslam@gmail.com> | 2020-02-16 10:36:38 -0500 |
---|---|---|
committer | Andrew Gallant <jamslam@gmail.com> | 2020-02-17 17:16:28 -0500 |
commit | 6a0e0147e03a0322fc8e7e959e787f7a635df906 (patch) | |
tree | 46e64ca747d648a8bb0b4c8e21f816464527c0c8 /CHANGELOG.md | |
parent | ad97e9c93fc0687ba7a96680ddc749c1da664446 (diff) |
grep-regex: improve literal detection with -w
When the -w/--word-regexp flag was used, ripgrep would in many cases fail to
apply literal optimizations. This occurs specifically when the regex
given by the user is an alternation of literals with no common prefixes
or suffixes, e.g.,
rg -w 'foo|bar|baz|quux'
In this case, the inner literal detector fails. Normally, this would
result in literal prefixes being detected by the regex engine. But
because of the -w/--word-regexp flag, the actual regex that we run ends
up looking like this:
(^|\W)(foo|bar|baz|quux)($|\W)
which of course defeats any prefix or suffix literal optimizations in
the regex crate's somewhat naive extractor. (A better extractor could
still do literal optimizations in the above case.)
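To make the failure concrete, here is a minimal sketch of why the wrapped regex defeats a naive extractor. Both `wrap_word` and `naive_prefix_literals` are hypothetical helpers written for illustration; they are not the regex crate's or ripgrep's actual code, and the real extractor works on a parsed regex rather than on strings.

```rust
/// Wrap a user pattern the way the commit message describes for -w.
/// (Hypothetical helper; only the resulting regex shape comes from the source.)
fn wrap_word(pattern: &str) -> String {
    format!(r"(^|\W)({})($|\W)", pattern)
}

/// A deliberately naive prefix extractor (an assumption for illustration):
/// it succeeds only when the whole pattern is a top-level alternation of
/// plain ASCII alphanumeric literals.
fn naive_prefix_literals(pattern: &str) -> Option<Vec<String>> {
    let mut lits = Vec::new();
    for alt in pattern.split('|') {
        // Any empty branch or regex metacharacter makes the extractor give up.
        if alt.is_empty() || !alt.chars().all(|c| c.is_ascii_alphanumeric()) {
            return None;
        }
        lits.push(alt.to_string());
    }
    Some(lits)
}
```

On the raw pattern 'foo|bar|baz|quux' this toy extractor yields the four literals, but on the output of `wrap_word` it gives up at the first branch, `(^`, which is not a literal.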
So this commit fixes this by falling back to prefix or suffix literals
when they're available instead of prematurely giving up and assuming the
regex engine will do the rest.
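The fallback described above can be sketched roughly as follows. The `Literals` struct and `fallback_literals` function are hypothetical names, not grep-regex's API, and preferring prefixes over suffixes here is an arbitrary simplification; the actual commit may choose between the two sets differently.

```rust
/// Hypothetical summary of the literals found in the user's pattern,
/// computed before the -w wrapping is applied.
struct Literals {
    /// A literal that every match must contain, if one was detected.
    inner: Option<String>,
    /// Literals such that every match starts with one of them.
    prefixes: Vec<String>,
    /// Literals such that every match ends with one of them.
    suffixes: Vec<String>,
}

/// Choose literals for a fast pre-filter. Instead of giving up when no inner
/// literal is found, fall back to prefix literals, then to suffix literals.
fn fallback_literals(lits: &Literals) -> Option<Vec<String>> {
    if let Some(inner) = &lits.inner {
        return Some(vec![inner.clone()]);
    }
    if !lits.prefixes.is_empty() {
        return Some(lits.prefixes.clone());
    }
    if !lits.suffixes.is_empty() {
        return Some(lits.suffixes.clone());
    }
    // Nothing usable: leave the whole job to the regex engine.
    None
}
```

For 'foo|bar|baz|quux' there is no single inner literal, so under these assumptions the prefix set (foo, bar, baz, quux) would be returned and could drive a literal pre-filter even though the regex that actually runs is the wrapped one.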
Diffstat (limited to 'CHANGELOG.md')
-rw-r--r-- | CHANGELOG.md | 2 |
1 files changed, 2 insertions, 0 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 49b7a692..34caeaa2 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -10,6 +10,8 @@ Performance improvements:
   Improve inner literal detection to cover more cases more effectively.
   e.g., ` +Sherlock Holmes +` now has ` Sherlock Holmes ` extracted instead
   of ` `.
+* PERF:
+  Improve literal detection when the `-w/--word-regexp` flag is used.
 
 Feature enhancements:
 