Age | Commit message | Author
2023-07-08 | readme: fix awkward grammar | Jakub Wilk
Closes #2402
2023-07-08 | readme: add winget installation section | sitiom
Closes #2409
2023-07-08 | ignore/types: add USD to the default file types | Mark Sisson
Closes #2432
2023-07-08 | ignore/types: add Gentoo eclass type | Sam James
Eclasses are "ebuild libraries" and generally if you're filtering for/filtering out an ebuild/eclass, you don't want the other either.
Followup to 4dfea016b915bb1e88679361de83a91e60447835
Closes #2437
2023-07-08 | ignore/types: improve Elixir globs | angrycandy
Closes #2450
2023-07-08 | core: don't let context flags override each other | Andrew Gallant
This matches the behavior of GNU grep, which does not ignore before-context and after-context completely if the context flag is also provided. Note that this change wasn't done just to match GNU grep. In this case, GNU grep has the more sensible behavior.
Fixes #2288, Closes #2451
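The flag interaction described above can be sketched as follows. This is a minimal illustration, not ripgrep's actual argument handling: the `ContextFlags` type and its field names are hypothetical, and the precedence rule (a side-specific flag wins for its own side, the combined flag fills in the rest) is an assumption drawn from the GNU grep behavior the message refers to.

```rust
/// Hypothetical holder for the three context flags. `None` means the flag
/// was not given on the command line.
#[derive(Clone, Copy, Debug, Default)]
struct ContextFlags {
    /// Value of the before-context flag, if given.
    before: Option<usize>,
    /// Value of the after-context flag, if given.
    after: Option<usize>,
    /// Value of the combined context flag, if given.
    both: Option<usize>,
}

impl ContextFlags {
    /// Returns `(before, after)` line counts.
    ///
    /// Assumed rule: the combined flag supplies a default for both sides,
    /// and a side-specific flag replaces that default only for its side.
    fn resolve(&self) -> (usize, usize) {
        let fallback = self.both.unwrap_or(0);
        (self.before.unwrap_or(fallback), self.after.unwrap_or(fallback))
    }
}
```

Under this sketch, giving an after-context of 2 together with a combined context of 1 resolves to one line before and two lines after, rather than one flag discarding the other.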
2023-07-08 | doc: add another example for the config file | Andrew Gallant
Closes #2453
2023-07-08 | doc: note '-n' and '-N' override each other | Misaki
Closes #2460
2023-07-08 | ignore/gitignore: expose `gitconfig_excludes_path` | Eric Arellano
I have reservations about this, but it looks useful and doesn't seem terribly onerous to support. The `ignore` crate will really always need to have some kind of logic supporting this in some form I think. Closes #2482
2023-07-08 | test: test that regex inline flags work as intended | Gal Ofri
This was originally fixed by using non-capturing groups when joining patterns in crates/core/args.rs, but before that landed, it ended up getting fixed via a refactor in the course of migrating to regex 1.9. Namely, it's now fixed by pushing pattern joining down into the regex layer, so that patterns can be joined in the most effective way possible.
Still, #2488 contains a useful test, so we bring that in here. The test actually failed for `rg -e ')('`, since it expected the command to fail with a syntax error. But my refactor actually causes this command to succeed. And indeed, #2488 worked around this by special casing a single pattern. That work-around fixes it for the single pattern case, but doesn't fix it for the -w or -X or multi-pattern case. So for now, we're content to leave well enough alone. The only way to fix this for real is to parse each regexp individually and verify that each is valid on its own. It's not clear that doing so is worth it.
Fixes #2480, Closes #2488
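For illustration, the string-level approach mentioned first (wrapping each pattern in a non-capturing group before joining) might look like the sketch below. The function name `join_patterns` is hypothetical, and this is not the fix that landed, which joins patterns inside the regex layer instead.

```rust
/// Joins several patterns into one alternation, wrapping each pattern in a
/// non-capturing group so that an inline flag such as `(?i)` in one pattern
/// cannot leak into the patterns that follow it.
///
/// Hypothetical helper for illustration only.
fn join_patterns(patterns: &[&str]) -> String {
    patterns
        .iter()
        .map(|p| format!("(?:{})", p))
        .collect::<Vec<String>>()
        .join("|")
}
```

The sketch also shows why `)(` is awkward for this approach: wrapping it yields `(?:)()`, which is syntactically valid even though the pattern is not valid on its own.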
2023-07-08 | ignore: tweak regex crate features | Jakub Jirutka
This removes most of the Unicode features as they aren't currently used. We can always add them back later if necessary. We can avoid the unicode-perl feature by changing `\s` to `[[:space:]]`, which uses the ASCII-only definition of `\s`. Since we don't expect non-ASCII whitespace in git config files, this seems okay. Closes #2502
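The difference between the two whitespace definitions can be sketched with the standard library alone. The helper names below are hypothetical. The character set used for the ASCII class (tab, line feed, vertical tab, form feed, carriage return, space) is the usual POSIX `[[:space:]]` set, and treating Unicode `\s` as the `White_Space` property is stated here as an assumption about the regex crate rather than something taken from the commit.

```rust
/// ASCII-only whitespace test, mirroring the POSIX `[[:space:]]` set.
/// Hypothetical helper for illustration.
fn is_ascii_space(c: char) -> bool {
    matches!(c, '\t' | '\n' | '\x0B' | '\x0C' | '\r' | ' ')
}

/// Trims ASCII whitespace from both ends of a config value, leaving any
/// non-ASCII whitespace (which is not expected in git config files) as-is.
fn trim_ascii_space(s: &str) -> &str {
    s.trim_matches(is_ascii_space)
}
```

A no-break space (U+00A0) is whitespace according to `char::is_whitespace` but not according to `is_ascii_space`, which is the kind of input the commit decides not to worry about.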
2023-07-08 | ignore/types: add 'graphql' type | Jon Parise
GraphQL file extensions: .graphql and .graphqls (schema)
We could also add `.gql`, but perhaps it's less correct to do so. We'll start conservatively here, and we can always add `.gql` later.
Closes #2439, Closes #2508
2023-07-08 | cli: make resolve_binary take COM executables into account | mataha
When `resolve_binary()` searches `PATH` on Windows for a program given without an extension, `ripgrep` assumes the extension is `.exe`, as it's the *de facto* standard, which works most (99.99%) of the time...
...unless the binary is a COM executable (we're on Windows, duh).
Closes #2523
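A rough sketch of the idea, using only the standard library: when the program name has no extension, try more than one candidate extension in each directory. The function names, the `ASSUMED_EXTENSIONS` constant, and the order in which the extensions are tried are all assumptions made for illustration and are not taken from the `grep-cli` source.

```rust
use std::path::{Path, PathBuf};

/// Extensions tried for an extension-less program name. The order here is
/// an assumption for this sketch.
const ASSUMED_EXTENSIONS: [&str; 2] = ["com", "exe"];

/// Lists the file paths worth checking for `prog` inside `dir`.
/// A name that already carries an extension is returned unchanged.
fn candidate_paths(dir: &Path, prog: &str) -> Vec<PathBuf> {
    let base = dir.join(prog);
    if base.extension().is_some() {
        return vec![base];
    }
    ASSUMED_EXTENSIONS
        .iter()
        .map(|ext| base.with_extension(ext))
        .collect()
}

/// Returns the first candidate that exists as a file in any of `dirs`.
fn resolve_in_dirs(dirs: &[PathBuf], prog: &str) -> Option<PathBuf> {
    dirs.iter()
        .flat_map(|dir| candidate_paths(dir, prog))
        .find(|path| path.is_file())
}
```

In a real resolver, `dirs` would come from splitting the `PATH` environment variable, for example with `std::env::split_paths`.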
2023-07-08 | ignore/types: add cml to the default types list | Yifei Teng
It's used in Fuchsia to mean "component manifest language."[1]
[1]: https://fuchsia.dev/reference/cml?hl=en
Closes #2529
2023-07-08 | doc: update rust-version in Cargo.toml | Jonathan Schwender
The MSRV got bumped a little bit ago, so this is just catchup. Closes #2539
2023-07-05 | grep-cli-0.1.8 (tag: grep-cli-0.1.8) | Andrew Gallant
2023-07-05 | ci: try to fix CI | Andrew Gallant
2023-07-05 | regex: remove old inner literal extractor | Andrew Gallant
(It had already been removed from the crate.)
2023-07-05 | deps: update everything | Andrew Gallant
2023-07-05 | deps: drop temporary patch and move to bstr 1.6 | Andrew Gallant
Now that regex 1.9 is out, we can depend on it from crates.io.
2023-07-05 | deps: update everything | Andrew Gallant
2023-07-05 | regex: add new inner literal extractor | Andrew Gallant
This is mostly a copy of the prefix literal extractor in regex-syntax, but with a tweaked notion of Seq that keeps track of whether it's a prefix of an expression or not. If it isn't, then we can't cross it as a suffix to another Seq. This new extractor should be a lot more robust than the old one. We actually will keep going through the regex to try and find the "best" literals to search for (according to some heuristic).
2023-07-05 | regex: tweak formatting of regex-automata version spec | Andrew Gallant
This makes it easier to enable the `logging` feature for regex-automata. I wish I could just enable it unconditionally, but it winds up producing a lot of output because ripgrep uses regexes for things other than the primary search (like every glob). Sigh.
2023-07-05 | regex: refactor matcher construction | Andrew Gallant
This does a little bit of refactoring so that we can pass both a ConfiguredHIR and a Regex to the inner literal extraction routine. One downside of this approach is that a regex object hangs on to a ConfiguredHIR. But the extra memory usage is probably negligible. A benefit though is that converting the HIR to its concrete syntax is now lazy and only happens when logging is enabled.
2023-07-05 | regex: tweak DFA settings | Andrew Gallant
This increases the limits a bit for when the regex engine will build and use a fully compiled DFA. They can be faster in some circumstances. For example, '(?-u)^\w{30,}$' gets a nice speed boost from state acceleration. We are also able to remove `regex` proper as a dependency. Wow.
2023-07-05 | regex: push more pattern handling to matcher construction | Andrew Gallant
Previously, ripgrep core was responsible for escaping regex patterns and implementing the --line-regexp flag. This commit moves that responsibility down into the matchers such that ripgrep just needs to hand the patterns it gets off to the matcher builder. The builder will then take care of escaping and all that.
This was done to make pattern construction completely owned by the matcher builders. With the arrival of regex-automata, this means we can move to the HIR very quickly and then never move back to the concrete syntax. We can then build our regex directly from the HIR. This overall can save quite a bit of time, especially when searching for large dictionaries.
We still aren't quite as fast as GNU grep when searching something on the scale of /usr/share/dict/words, but we are basically within spitting distance. Prior to this, we were about an order of magnitude slower.
This architecture in particular lets us write a pretty simple fast path that avoids AST parsing and HIR translation entirely: the case where one is just searching for a literal. In that case, we can hand construct the HIR directly.
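The division of labor described above can be sketched at the string level. Everything here is hypothetical (the `Plan` type, the `plan` function, its parameters), and it deliberately works on pattern strings, whereas the commit builds the HIR directly; flags such as case-insensitivity are ignored. It only illustrates a builder that owns escaping, line anchoring, and the decision to skip the parser for plain literals.

```rust
/// What a hypothetical builder decides to do with one pattern.
#[derive(Debug, PartialEq)]
enum Plan {
    /// The pattern is a plain literal; no regex parsing is needed.
    Literal(String),
    /// The pattern must go through the regex parser.
    Regex(String),
}

/// Characters treated as regex metacharacters in this sketch.
fn is_meta(c: char) -> bool {
    matches!(
        c,
        '\\' | '.' | '+' | '*' | '?' | '(' | ')' | '|' | '[' | ']' | '{' | '}'
            | '^' | '$' | '#' | '&' | '-' | '~'
    )
}

/// Backslash-escapes every metacharacter so the text matches literally.
fn escape(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for c in text.chars() {
        if is_meta(c) {
            out.push('\\');
        }
        out.push(c);
    }
    out
}

/// Decides how to handle `pattern` given the fixed-strings and
/// line-regexp settings.
fn plan(pattern: &str, fixed_strings: bool, line_regexp: bool) -> Plan {
    let literal = fixed_strings || !pattern.chars().any(is_meta);
    if literal && !line_regexp {
        // Fast path: nothing to parse.
        return Plan::Literal(pattern.to_string());
    }
    let body = if literal { escape(pattern) } else { pattern.to_string() };
    if line_regexp {
        Plan::Regex(format!("^(?:{})$", body))
    } else {
        Plan::Regex(body)
    }
}
```

The caller only passes the raw pattern and the two settings; escaping and anchoring never leak back into the calling code, which is the ownership change the commit describes.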
2023-07-05 | globset: fix build error in tests | Andrew Gallant
I guess we haven't been testing with the Serde feature enabled? Weird.
2023-07-05 | deps: update to pcre2 0.2.4 | Andrew Gallant
0.2.4 updates to PCRE2 10.42 and has a few other nice changes. For example, when `utf` is enabled, the crate will always set the PCRE2_MATCH_INVALID_UTF option. That means we no longer need to do transcoding or UTF-8 validity checks. Because of this, we actually get to remove one of the two uses of `unsafe` in ripgrep's `main` program. (This also updates a couple other dependencies for convenience.)
2023-07-05 | regex: small cleanups | Andrew Gallant
Just some small polishing. We also get rid of thread_local in favor of using regex-automata, mostly just in the name of reducing dependencies. (We should eventually be able to drop thread_local completely.)
2023-07-05 | regex: s/locations/captures | Andrew Gallant
Now that we use regex-automata, we no longer use any type with "locations" in it. Instead, that's mostly legacy from the top-level regex crate.
2023-07-05 | regex: simplify AST analysis a bit | Andrew Gallant
The verbatim literal stuff hasn't been used for a while and I don't foresee it being used. If it's really needed, it would probably be better to just implement it by looking at the pattern string itself, which avoids parsing it into an AST altogether.
2023-07-05 | regex: some small cleanup in 'strip.rs' | Andrew Gallant
We also utilize bstr's methods to get rid of some helpers we had written by hand.
2023-07-05 | BREAKING: regex: finally remove CRLF hack | Andrew Gallant
Now that Rust's regex crate finally supports a CRLF mode, we can remove this giant hack in ripgrep to enable it. (And it assuredly did not work in all cases.)
The way this works in the regex engine is actually subtly different than what ripgrep previously did. Namely, --crlf would previously treat either \r\n or \n as a line terminator. But now it treats \r\n, \n and \r as line terminators. In effect, it is implemented by treating \r and \n as line terminators, but ^ and $ will never match at a position between a \r and a \n. So basically this means that $ will end up matching in more cases than might be intended, but I don't expect this to be a big problem in practice.
Note that passing --crlf to ripgrep and enabling CRLF mode in the regex via the `R` inline flag (e.g., `(?R:$)`) are subtly different. The `R` flag just controls the regex engine, but --crlf instructs all of ripgrep to use \r\n as a line terminator. There are likely some inconsistencies or corner cases that are wrong as a result of this cognitive dissonance, but we choose to leave well enough alone for now.
Fixing this for real will probably require re-thinking how line terminators are handled in ripgrep. For example, one "problem" with how they're handled now is that ripgrep will re-insert its own line terminators when printing output instead of copying the input. This is maybe not so great and perhaps unexpected. (ripgrep probably can't get away with not inserting any line terminators. Users probably expect files that don't end with a line terminator whose last line matches to have a line terminator inserted.)
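The anchor rule described above (both \r and \n terminate lines, but no anchor matches between a \r and the \n that follows it) can be written out as a small byte-level sketch. The function names are hypothetical, the code is derived from the prose of this message rather than copied from the regex engine, and it assumes multi-line anchors, where ^ and $ match at line boundaries and not only at the ends of the haystack.

```rust
/// Reports whether a line-end anchor may match at byte offset `at`,
/// where `at <= haystack.len()`.
fn is_line_end_crlf(haystack: &[u8], at: usize) -> bool {
    at == haystack.len()
        || haystack[at] == b'\r'
        // Before a \n, but not in the middle of a \r\n pair.
        || (haystack[at] == b'\n' && (at == 0 || haystack[at - 1] != b'\r'))
}

/// Reports whether a line-start anchor may match at byte offset `at`,
/// where `at <= haystack.len()`.
fn is_line_start_crlf(haystack: &[u8], at: usize) -> bool {
    at == 0
        || haystack[at - 1] == b'\n'
        // After a \r, but not in the middle of a \r\n pair.
        || (haystack[at - 1] == b'\r'
            && (at == haystack.len() || haystack[at] != b'\n'))
}

/// Collects every offset at which a line-end anchor may match.
fn line_ends(haystack: &[u8]) -> Vec<usize> {
    (0..=haystack.len())
        .filter(|&at| is_line_end_crlf(haystack, at))
        .collect()
}

/// Collects every offset at which a line-start anchor may match.
fn line_starts(haystack: &[u8]) -> Vec<usize> {
    (0..=haystack.len())
        .filter(|&at| is_line_start_crlf(haystack, at))
        .collect()
}
```

For the input `a\r\nb\rc\n`, the line-end offsets are 1, 4, 6 and 7. Offset 4 sits before a lone \r, which is one of the extra places where $ now matches, and offset 2 (between \r and \n) is excluded.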
2023-07-05 | regex: migrate grep-regex to regex-automata | Andrew Gallant
We just do a "basic" dumb migration. We don't try to improve anything here.
2023-07-05 | deps: initial migration steps to regex 1.9 | Andrew Gallant
This leaves the grep-regex crate in tatters. Pretty much the entire thing needs to be re-worked. The upshot is that it should result in some big simplifications. I hope. The idea here is to drop down and actually use regex-automata 0.3 instead of the regex crate itself.
2023-06-12 | readme: update Debian instructions | Andrew Gallant
We probably don't need to mention Buster specifically nor Debian unstable since ripgrep has been in Debian for a while now. But we can't just get rid of the `deb` file either, because Debian might package a very old version. Fixes #2531
2023-06-05 | cli: replace atty with std::io::IsTerminal | Martin Nordholts
The `atty` crate is unmaintained[1] and `std::io::IsTerminal` was stabilized in Rust 1.70.
[1]: https://rustsec.org/advisories/RUSTSEC-2021-0145.html
PR #2526
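As a brief illustration of the standard library replacement, the sketch below queries standard output through the `IsTerminal` trait. The helper names `stdout_is_tty` and `output_mode` are hypothetical and not part of ripgrep; only `std::io::IsTerminal::is_terminal` is the real API, and it needs Rust 1.70 or newer.

```rust
use std::io::{self, IsTerminal};

/// Reports whether standard output is attached to a terminal, using the
/// standard library trait instead of the `atty` crate.
fn stdout_is_tty() -> bool {
    io::stdout().is_terminal()
}

/// Hypothetical helper: picks a label for how output should be presented,
/// based on the terminal check.
fn output_mode(is_tty: bool) -> &'static str {
    if is_tty {
        "interactive"
    } else {
        "piped"
    }
}
```

A caller would combine the two as `output_mode(stdout_is_tty())`; the same trait method is available on `io::stdin()` and `io::stderr()`.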
2023-05-26 | ignore/types: add 'mdwn' to Markdown | Francois Marier
PR #2520
2023-05-25 | deps: update everything else | Andrew Gallant
2023-05-25 | deps: bump regex to 1.8.3 | Andrew Gallant
This brings in an update from the regex crate that fixes a matching bug for particular kinds of alternations of literals. Fixes #2518
2023-05-23 | ignore/types: add *.pyi for Python | Ville Skyttä
https://peps.python.org/pep-0484/#stub-files PR #2517
2023-05-19 | searcher: re-enable mmap on 32-bit architectures | Adam Reichold
memmap2 v0.3.0 introduced a regression when trying to map files larger than 4GB on 32-bit architectures[1], which was subsequently fixed in v0.3.1[2]. This commit bumps the locked version of the memmap2 dependency to the current v0.5.0 and reverts fdfc418be55ff91e0c2efad6a3e27db054cb5534 to re-enable mmap on 32-bit architectures as a different approach to fixing [3].
This was tested to report matches from the end of a 5GB file using MinGW and Wine.
Ref #1911, PR #2000
[1] https://github.com/RazrFalcon/memmap2-rs/commit/5e271224c8411c89b42060294f9393cfc7b12a2a
[2] https://github.com/RazrFalcon/memmap2-rs/commit/9aa838aed99a4879d8357ff295a0ca1c98ba1ae5
[3] https://github.com/BurntSushi/ripgrep/issues/1911
2023-05-16 | deps: update everything | Andrew Gallant
This does unfortunately bring in both regex-syntax 0.6 and 0.7, but we'll fix that once regex 1.9 is out.
2023-05-16 | deps: update minimum version of grep crate | Andrew Gallant
Ref #2516
2023-05-16 | grep-0.2.12 (tag: grep-0.2.12) | Andrew Gallant
2023-05-16 | crates/grep: remove 'deny(missing_docs)' | Andrew Gallant
This crate is only a shim over a bunch of other crates. I'm not sure that there's anything to add to each of the `pub extern` items. So instead of just writing fluff, I removed the lint. Fixes #2516
2023-03-28 | doc: fix --quiet docs | Ryan Whitehouse
The wording was previously inverted, giving it the opposite of the intended meaning.
Fixes #1962
2023-03-21 | ignore/types: add support for docker-compose files | Manu
Default file is docker-compose.yml and the documentation mentions overrides in the form of docker-compose.*.yml. PR #2469
2023-03-15 | readme: add a link to delta's support for ripgrep | Andrew Gallant
Ref: https://github.com/BurntSushi/ripgrep/issues/86#issuecomment-1469717706
2023-02-09 | ignore/types: add *.sln for msbuild | David Ringo
.sln is the extension for Visual Studio Project Solution files, one of the file types accepted as inputs by MSBuild.
PR #2415