path: root/grep
Age | Commit message | Author
2018-09-04 | grep-cli: introduce new grep-cli crate | Andrew Gallant
This commit moves a lot of "utility" code from ripgrep core into grep-cli. Any one of these things might not be worth creating a new crate, but combining everything together results in a fair number of convenience routines that make up a decent sized crate. There is potentially more we could move into the crate, but much of what remains in ripgrep core is almost entirely dealing with the number of flags we support. In the course of moving things to the grep-cli crate, we clean up a lot of gunk and improve failure modes in a number of cases. In particular, we've fixed a bug where other processes could deadlock if they write too much to stderr. Fixes #990
2018-08-23 | deps: update walkdir minimum version | Andrew Gallant
We'll want to be using the new `same_file_system` option soon.
2018-08-21 | ignore: fix false positive in path_is_symlink | Andrew Gallant
This commit fixes a bug where the first path always reported itself as a symlink via `path_is_symlink`. Part of this fix includes updating walkdir to 2.2.1, which also includes a corresponding bug fix. Fixes #984
2018-08-20 | deps: update libripgrep crate versions | Andrew Gallant
This prepares them for an initial 0.1.0 release.
2018-08-20 | ripgrep: migrate to libripgrep | Andrew Gallant
This commit does the work to delete the old `grep` crate and effectively rewrite most of ripgrep core to use the new libripgrep crates. The new `grep` crate is now a facade that collects the various crates that make up libripgrep. The most complex part of ripgrep core is now arguably the translation between command line parameters and the library options, which is ultimately where we want to be.
2018-08-15 | grep: remove senseless test | Andrew Gallant
It was pulling in a sizable data file and doesn't appear to be testing anything meaningful that isn't covered by a variety of other tests.
2018-08-03 | grep-0.1.9 (tag: grep-0.1.9) | Andrew Gallant
2018-07-17 | grep: small literal detection fix | Andrew Gallant
This commit tweaks the inner literal detection heuristic such that if it comes up with any literal that is all whitespace, then it's likely a bad literal to look for since it's so common. Therefore, we simply reject the inner literal optimization in this case and let the regex engine do its thang.
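The heuristic described above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual code: the names `is_all_whitespace` and `choose_inner_literal` are hypothetical, and preferring the longest candidate is an assumption made only for this sketch.

```rust
/// Returns true if every byte of the candidate literal is ASCII whitespace.
/// (Hypothetical helper; an empty literal also counts as unusable here.)
fn is_all_whitespace(lit: &[u8]) -> bool {
    lit.iter().all(|b| b.is_ascii_whitespace())
}

/// Picks an inner literal to prefilter with, or returns None to signal
/// "skip the optimization and let the regex engine do the work".
fn choose_inner_literal<'a>(candidates: &[&'a [u8]]) -> Option<&'a [u8]> {
    // If any candidate is all whitespace, reject the optimization entirely,
    // since whitespace is too common to be a useful thing to search for.
    if candidates.iter().any(|lit| is_all_whitespace(lit)) {
        return None;
    }
    // Assumption for this sketch: otherwise prefer the longest candidate.
    candidates.iter().copied().max_by_key(|lit| lit.len())
}
```

With these definitions, a candidate set containing `"  "` yields `None`, while a set such as `"foo"` and `"ba"` yields `"foo"`.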
2018-05-07 | deps: update regex to 1.0 | Bastien Orivel
We retain the `simd-accel` feature on globset for backwards compatibility, but will remove it in the next semver release.
2018-03-13 | grep: add "perfect" smart case detection | Andrew Gallant
This commit removes the previous smart case detection logic and replaces it with detection based on the regex AST. This particular AST is a faithful representation of the concrete syntax, which lets us be very precise in how we handle it. Closes #851
2018-03-13 | grep: upgrade to regex-syntax 0.5 | Andrew Gallant
This update brings with it many bug fixes:
* Better error messages are printed overall. We also include explicit call out for unsupported features like backreferences and look-around.
* Regexes like `\s*{` no longer emit incomprehensible errors.
* Unicode escape sequences, such as `\u{..}` are now supported.
For the most part, this upgrade was done in a straight-forward way. We resist the urge to refactor the `grep` crate, in anticipation of it being rewritten anyway. Note that we removed the `--fixed-strings` suggestion whenever a regex syntax error occurs. In practice, I've found that it results in a lot of false positives, and I believe that its use is not as paramount now that regex parse errors are much more readable.
Closes #268, Closes #395, Closes #702, Closes #853
2018-03-12 | deps: update regex crate | Andrew Gallant
This update brings with it a new feature of the regex crate which will now use SIMD optimizations automatically at runtime with no necessary compile time flags. All that's needed is to enable the `unstable` feature. Other crates, such as bytecount and encoding_rs, are still using the old-style SIMD support, so we leave the simd-accel and avx-accel features. However, the binaries we distribute on Github no longer have those features enabled, which makes them truly portable. Fixes #135
2018-02-11 | grep: release 0.1.8 (tag: grep-0.1.8) | Andrew Gallant
2018-02-04 | logger: drop env_logger | Andrew Gallant
This commit updates the `log` crate to 0.4 and drops the dependency on env_logger. In particular, the latest version of env_logger brings in additional non-optional dependencies such as chrono that I don't think are worth including in ripgrep. It turns out ripgrep doesn't need any fancy logging. We just need a concept of log levels and the ability to print to stderr. Therefore, we just roll our own super simple logger. This update is motivated by the persistent configuration task. In particular, we need the ability to toggle the global log level more than once, and this doesn't appear to be possible with older versions of the log crate.
2018-01-01 | cleanup: replace try! with ? | Balaji Sivaraman
2017-12-18 | Improve detection of upper-case characters by smart-case feature | dana
Fixes #717 (partially)
The previous implementation of the smart-case feature was actually *too* smart, in that it inspected the final character ranges in the AST to determine if the pattern contained upper-case characters. This meant that patterns like `foo\w` would not be handled case-insensitively, since `\w` includes the range of upper-case characters A–Z. As a medium-term solution to this problem, we now inspect the input pattern itself for upper-case characters, ignoring any that immediately follow a `\`. This neatly handles all of the most basic cases like `\w`, `\S`, and `É`, though it still has problems with more complex features like `\p{Ll}`. Handling those correctly will require improvements to the AST.
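A rough sketch of the pattern-scanning approach described in this message is shown below. The function names are hypothetical and the code is an illustration under that description, not the commit's actual implementation.

```rust
/// Returns true if the pattern contains an upper-case character that does
/// not immediately follow a backslash. (Hypothetical helper name.)
fn has_uppercase_literal(pattern: &str) -> bool {
    let mut escaped = false;
    for c in pattern.chars() {
        if escaped {
            // This character belongs to an escape such as `\W` or `\S`,
            // so it is ignored regardless of its case.
            escaped = false;
            continue;
        }
        if c == '\\' {
            escaped = true;
        } else if c.is_uppercase() {
            return true;
        }
    }
    false
}

/// Smart case: search case-insensitively only when the pattern has no
/// unescaped upper-case character.
fn smart_case_insensitive(pattern: &str) -> bool {
    !has_uppercase_literal(pattern)
}
```

Under this sketch, `foo\w` and `foo\S` are searched case-insensitively, while `Foo` and `É` are not. A pattern like `\p{Ll}` still counts as containing upper case, because only the `p` directly after the backslash is skipped, which matches the limitation the message mentions.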
2017-10-21 | cargo: bump to 0.7.0 (tags: ignore-0.3.0, grep-0.1.7, globset-0.2.1, 0.7.0) | Andrew Gallant
2017-10-21 | deps: upgrade to memchr 2 | Andrew Gallant
2017-03-12 | Bump and update deps. (tags: wincolor-0.1.3, termcolor-0.3.1, ignore-0.1.8, grep-0.1.6, globset-0.1.4) | Andrew Gallant
2017-03-12 | Add license files to each crate. | Andrew Gallant
Fixes #381
2017-01-13 | 0.4.0 (tag: 0.4.0) | Andrew Gallant
2017-01-01 | Update to regex 0.2. | Andrew Gallant
2016-12-30 | bump various versions | Andrew Gallant
2016-12-27 | Remove superfluous memmap dependency in `grep` crate. | Andrew Gallant
Fixes #295.
2016-11-28 | Disable Unicode mode for literal regex. | Andrew Gallant
When ripgrep detects a literal, it emits them as raw hex escaped byte sequences to Regex::new. This permits literal optimizations for arbitrary byte sequences (i.e., possibly invalid UTF-8). The problem is that Regex::new interprets hex escaped byte sequences as *Unicode codepoints* by default, but we want them to actually stand for their raw byte values. Therefore, disable Unicode mode. This is OK, since the regex is composed entirely of literals and literal extraction does Unicode case folding. Fixes #251
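As a hedged illustration of the idea, the sketch below turns a raw byte literal into a hex-escaped pattern string with Unicode mode switched off. The helper name `literal_to_pattern` is hypothetical, and using the inline `(?-u:...)` group is an assumption of this sketch; the commit only says that Unicode mode is disabled, not how.

```rust
/// Builds a regex pattern string that matches the given raw bytes literally.
/// (Hypothetical helper; not taken from the commit.)
fn literal_to_pattern(bytes: &[u8]) -> String {
    // `(?-u:...)` turns Unicode mode off for the group, so that `\xNN`
    // stands for the raw byte NN rather than the codepoint U+00NN.
    let mut pat = String::from("(?-u:");
    for b in bytes {
        // Escape every byte as two lowercase hex digits.
        pat.push_str(&format!("\\x{:02x}", b));
    }
    pat.push(')');
    pat
}
```

For the bytes `[0xff, b'a']` this produces `(?-u:\xff\x61)`. Because such a pattern can match invalid UTF-8, it would presumably need to be compiled with a byte-oriented matcher such as `regex::bytes::Regex`.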
2016-11-28 | Detect more uppercase literals for --smart-case. | Andrew Gallant
This changes the uppercase literal detection for the "smart case" functionality. In particular, a character class is considered to have an uppercase literal if at least one of its ranges starts or stops with an uppercase literal. Fixes #229
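The character class rule above can be sketched as a check over the class's ranges. Representing a class as a slice of `(start, end)` pairs and the name `class_has_uppercase` are assumptions made for illustration only.

```rust
/// Returns true if at least one range in the class starts or ends with an
/// upper-case character. A class is modeled here as inclusive
/// `(start, end)` pairs, which is an assumption of this sketch.
fn class_has_uppercase(ranges: &[(char, char)]) -> bool {
    ranges
        .iter()
        .any(|&(start, end)| start.is_uppercase() || end.is_uppercase())
}
```

For example, `[A-Z]` and `[0-Z]` count as containing an uppercase literal under this rule, while `[a-z0-9]` does not.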
2016-11-06 | grep-0.1.4 (tag: grep-0.1.4) | Andrew Gallant
2016-11-06 | Fixes a bug with --smart-case. | Andrew Gallant
This was a subtle bug, but the big picture was that the smart case information wasn't being carried through to the literal extraction in some cases. When this happened, it was possible to get back an incomplete set of literals, which would therefore miss some valid matches. The fix to this is to actually parse the regex and determine whether smart case applies before doing anything else. It's a little extra work, but parsing is pretty fast. Fixes #199
2016-10-29 | Move all gitignore matching to separate crate. | Andrew Gallant
This PR introduces a new sub-crate, `ignore`, which primarily provides a fast recursive directory iterator that respects ignore files like gitignore and other configurable filtering rules based on globs or even file types. This results in a substantial source of complexity moved out of ripgrep's core and into a reusable component that others can now (hopefully) benefit from. While much of the ignore code carried over from ripgrep's core, a substantial portion of it was rewritten with the following goals in mind:
1. Reuse matchers built from gitignore files across directory iteration.
2. Design the matcher data structure to be amenable for parallelizing directory iteration. (Indeed, writing the parallel iterator is the next step.)
Fixes #9, #44, #45
2016-10-10 | Fix debug expression statement. | Andrew Gallant
2016-09-25 | grep 0.1.3 | Andrew Gallant
2016-09-25 | Don't union inner literals of repetitions. | Andrew Gallant
If we do, this results in extracting `foofoofoo` from `(\wfoo){3}`, which is wrong. This does prevent us from extracting `foofoofoo` from `foo{3}`, which is unfortunate, but we miss plenty of other stuff too. Literal extracting needs a good rethink (all the way down into the regex engine). Fixes #93
2016-09-24 | Add --smart-case. | Andrew Gallant
It does what it says on the tin. Closes #70.
2016-09-21 | grep 0.1.2 | Andrew Gallant
2016-09-21 | Fix a performance bug where using -w could result in very bad performance. | Andrew Gallant
The specific issue is that -w causes the regex to be wrapped in Unicode word boundaries. Regrettably, Unicode word boundaries are the one thing our regex engine can't handle well in the presence of non-ASCII text. We work around its slowness by stripping word boundaries in some circumstances, and using the resulting expression as a way to produce match candidates that are then verified by the full original regex. This doesn't fix all cases, but it should fix all cases where -w is used.
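The "find candidates cheaply, then verify" shape of this workaround can be illustrated with a simplified stand-in. The sketch below uses a plain substring search in place of the stripped regex and a manual Unicode-aware boundary check in place of the full original regex. The function names are hypothetical, and none of this is the commit's actual code.

```rust
/// Approximates a "word" character for the purposes of this sketch.
fn is_word_char(c: char) -> bool {
    c.is_alphanumeric() || c == '_'
}

/// Returns the byte offsets at which `needle` occurs as a whole word.
fn find_whole_words(haystack: &str, needle: &str) -> Vec<usize> {
    if needle.is_empty() {
        return Vec::new();
    }
    haystack
        // Step 1: a cheap candidate search that ignores word boundaries.
        .match_indices(needle)
        // Step 2: verify each candidate against the boundary requirement.
        .filter(|&(start, m)| {
            let end = start + m.len();
            let before = haystack[..start].chars().next_back();
            let after = haystack[end..].chars().next();
            !before.map_or(false, is_word_char) && !after.map_or(false, is_word_char)
        })
        .map(|(start, _)| start)
        .collect()
}
```

The expensive boundary check runs only on the few positions the fast search reports, rather than on every position in the text.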
2016-09-21 | Bump regex version. | Andrew Gallant
2016-09-17 | grep 0.1.1 | Andrew Gallant
2016-09-16 | Improve the "bad literal" error message. | Andrew Gallant
Incidentally, this was done by using the Debug impl for `char` instead of the Display impl. Cute. Fixes #5.
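The difference between the two impls is easy to see with a control character. The message text and function names below are hypothetical; only the `{}` versus `{:?}` formatting behavior is the point.

```rust
/// Formats the offending character with Display: a newline is emitted raw.
fn bad_literal_display(c: char) -> String {
    format!("bad literal: {}", c)
}

/// Formats the offending character with Debug: the character is quoted and
/// escaped, so a newline shows up as '\n' in the message.
fn bad_literal_debug(c: char) -> String {
    format!("bad literal: {:?}", c)
}
```

With a newline as input, the Display version breaks the message across two lines, while the Debug version keeps it on one line with a visible, quoted escape sequence.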
2016-09-13 | add readme | Andrew Gallant
2016-09-13 | update grep Cargo.toml | Andrew Gallant
2016-09-11 | Update regex. | Andrew Gallant
2016-09-10 | Fix off-by-one bug in searcher. | Andrew Gallant
2016-09-08 | Rename xrep to ripgrep. | Andrew Gallant
2016-09-06 | Fix grep match iterator. | Andrew Gallant
2016-09-06 | Fix required literal handling and add debug prints. | Andrew Gallant
In particular, if we had an inner literal and were doing a case insensitive search, then the literals are dropped because we previously only allowed a single inner literal to have an effect. Now we allow alternations of inner literals, but still don't quite take full advantage.
2016-09-05 | Fix deps so that others can build it. | Andrew Gallant
2016-09-03 | making search work (finally) | Andrew Gallant
2016-08-29 | The search code is a mess, but... | Andrew Gallant
... we now support inverted matches and line numbers!
2016-08-28 | Implementing core functionality. | Andrew Gallant
Initially experimenting with crossbeam to manage synchronization.
2016-08-24 | docs and small polish | Andrew Gallant