author Andrew Gallant <jamslam@gmail.com> 2023-08-04 14:05:54 -0400
committer Andrew Gallant <jamslam@gmail.com> 2023-08-05 09:33:57 -0400
commit 7227e94ce5f661355a1547a1838284bb7ad5b815
tree 4879bfe5a3c3bcde8b755c6cd08a6c419921e543
parent 341a19e0d05bc6f0cabeb2dc756b37cfc047f8f0
globset: use non-capture groups in regex transform

We currently implement globs by converting them to regexes, and in doing so, sometimes use grouping. In all but one case, we used non-capturing groups. But for alternations, we used capturing groups, which was likely just an oversight. We don't make use of capture groups at all, and while they usually don't have any overhead, they lead to weird cases like this one: https://github.com/rust-lang/regex/issues/1059

That particular issue is also a bug in the regex crate itself, which is fixed in https://github.com/rust-lang/regex/pull/1062. Note though that the bug fix in the regex crate is required. Even with this patch to globset, memory usage is reduced (by about half in rust-lang/regex#1059) but is not returned to where it was prior to the regex 1.9 release.

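The alternation handling described above can be sketched in isolation. This is a minimal illustration, not globset's actual code: the function name `alternation_to_regex` is hypothetical, and it assumes each branch has already been translated to a regex fragment. It only shows the shape of the change, emitting `(?:` instead of `(` so the group does not capture.

```rust
/// Hypothetical helper (not part of globset's API): joins branches that
/// were already translated to regex fragments into one non-capturing group.
fn alternation_to_regex(parts: &[String]) -> String {
    let mut re = String::new();
    // The patch skips empty alternations entirely; this sketch does the
    // same and returns an empty string when there are no branches.
    if !parts.is_empty() {
        // "(?:" opens a non-capturing group; a bare "(" would capture.
        re.push_str("(?:");
        re.push_str(&parts.join("|"));
        re.push(')');
    }
    re
}
```

Under these assumptions, passing the fragments `a` and `b` yields `(?:a|b)`. The sketch keeps the branches in the order given, so it does not reproduce the `b|a` ordering seen in the `re35` test added by this commit.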
-rw-r--r-- crates/globset/src/glob.rs 3
1 files changed, 2 insertions, 1 deletions
diff --git a/crates/globset/src/glob.rs b/crates/globset/src/glob.rs
index cda39cab..d19c70ed 100644
--- a/crates/globset/src/glob.rs
+++ b/crates/globset/src/glob.rs
@@ -736,7 +736,7 @@ impl Tokens {
 // It is possible to have an empty set in which case the
 // resulting alternation '()' would be an error.
 if !parts.is_empty() {
-    re.push('(');
+    re.push_str("(?:");
     re.push_str(&parts.join("|"));
     re.push(')');
 }
@@ -1276,6 +1276,7 @@ mod tests {
 toregex!(re32, "/a**", r"^/a.*.*$");
 toregex!(re33, "/**a", r"^/.*.*a$");
 toregex!(re34, "/a**b", r"^/a.*.*b$");
+toregex!(re35, "{a,b}", r"^(?:b|a)$");
 matches!(match1, "a", "a");
 matches!(match2, "a*b", "a_b");