path: root/src/main.rs
Age | Commit message | Author
2018-04-23 | logging: add new --no-ignore-messages flag | Andrew Gallant
The new --no-ignore-messages flag permits suppressing errors related to parsing .gitignore or .ignore files. These error messages can be somewhat annoying since they can surface from repositories that one has no control over. Fixes #646
2018-03-10 | output: add --stats flag | Balaji Sivaraman
This commit provides basic support for a --stats flag, which will print various aggregate statistics about a search after all of the results have been printed. This is mostly intended to support a similar feature found in the Silver Searcher. Note though that we don't emit the total bytes searched; this is a first pass at an implementation and we can improve upon it later. Closes #411, Closes #799
2018-03-10 | cleanup: rename match_count to match_line_count | Balaji Sivaraman
2018-02-04 | config: add persistent configuration | Andrew Gallant
This commit adds support for reading configuration files that change ripgrep's default behavior. The format of the configuration file is an "rc" style and is very simple. It is defined by two rules:

1. Every line is a shell argument, after trimming ASCII whitespace.
2. Lines starting with '#' (optionally preceded by any amount of ASCII whitespace) are ignored.

ripgrep will look for a single configuration file if and only if the RIPGREP_CONFIG_PATH environment variable is set and is non-empty. ripgrep will parse shell arguments from this file on startup and will behave as if the arguments in this file were prepended to any explicit arguments given to ripgrep on the command line.

For example, if your ripgreprc file contained a single line:

    --smart-case

then the following command

    RIPGREP_CONFIG_PATH=wherever/.ripgreprc rg foo

would behave identically to the following command

    rg --smart-case foo

This commit also adds a new flag, --no-config, that when present will suppress any and all support for configuration. This includes any future support for auto-loading configuration files from pre-determined paths (which this commit does not add).

Conflicts between configuration files and explicit arguments are handled exactly like conflicts in the same command line invocation. That is, this command:

    RIPGREP_CONFIG_PATH=wherever/.ripgreprc rg foo --case-sensitive

is exactly equivalent to

    rg --smart-case foo --case-sensitive

in which case, the --case-sensitive flag would override the --smart-case flag.

Closes #196
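The two parsing rules and the "prepend" behavior described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `parse_config` and `effective_args` are hypothetical, arguments are modeled as `String` rather than OS strings, and skipping empty lines is an assumption the commit message does not spell out.

```rust
/// Parse the contents of an rc-style config file into arguments.
/// Rule 1: every line is one argument after trimming ASCII whitespace.
/// Rule 2: lines whose first non-whitespace character is '#' are ignored.
/// Assumption of this sketch: lines that are empty after trimming are skipped.
fn parse_config(contents: &str) -> Vec<String> {
    contents
        .lines()
        .map(|line| line.trim_matches(|c: char| c.is_ascii_whitespace()))
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .map(String::from)
        .collect()
}

/// Behave as if the config arguments were prepended to the explicit ones.
fn effective_args(config: &str, explicit: &[&str]) -> Vec<String> {
    let mut args = parse_config(config);
    args.extend(explicit.iter().map(|s| s.to_string()));
    args
}
```

With a config containing only `--smart-case`, `effective_args(config, &["foo", "--case-sensitive"])` yields `--smart-case foo --case-sensitive`, so a later explicit flag can still override the configured one.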
2018-02-04 | logger: drop env_logger | Andrew Gallant
This commit updates the `log` crate to 0.4 and drops the dependency on env_logger. In particular, the latest version of env_logger brings in additional non-optional dependencies such as chrono that I don't think are worth including in ripgrep. It turns out ripgrep doesn't need any fancy logging. We just need a concept of log levels and the ability to print to stderr. Therefore, we just roll our own super simple logger. This update is motivated by the persistent configuration task. In particular, we need the ability to toggle the global log level more than once, and this doesn't appear to be possible with older versions of the log crate.
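To make "a concept of log levels and the ability to print to stderr" concrete, here is a standard-library-only sketch. It deliberately does not use the `log` crate, so it is not the logger this commit adds; the names `Level`, `set_max_level`, `enabled` and `log` are hypothetical. What it does show is a global level that can be toggled more than once.

```rust
use std::io::Write;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical log levels, ordered from least to most verbose.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Level {
    Error = 1,
    Warn = 2,
    Info = 3,
    Debug = 4,
}

/// Global maximum level. Stored in an atomic so it can be changed repeatedly.
static MAX_LEVEL: AtomicUsize = AtomicUsize::new(Level::Warn as usize);

fn set_max_level(level: Level) {
    MAX_LEVEL.store(level as usize, Ordering::Relaxed);
}

fn enabled(level: Level) -> bool {
    level as usize <= MAX_LEVEL.load(Ordering::Relaxed)
}

/// Print the message to stderr if the level is enabled.
/// Returns whether the message was emitted.
fn log(level: Level, msg: &str) -> bool {
    if !enabled(level) {
        return false;
    }
    let _ = writeln!(std::io::stderr(), "{:?}: {}", level, msg);
    true
}
```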
2018-02-01 | windows: fix OneDrive traversals | Andrew Gallant
This commit fixes a bug on Windows where directory traversals were completely broken when attempting to scan OneDrive directories that use the "file on demand" strategy. The specific problem was that Rust's standard library treats OneDrive directories as reparse points instead of directories, which causes methods like `FileType::is_file` and `FileType::is_dir` to always return false, even when retrieved via methods like `metadata` that purport to follow symbolic links. We fix this by peppering our code with checks on the underlying file attributes exposed by Windows. We consider an entry a directory if and only if the directory bit is set on the attributes. We are careful to make sure that the code remains the same on non-Windows platforms. Note that we also bump the dependency on `walkdir`, which contains a similar fix for its traversals. This bug is recorded upstream: https://github.com/rust-lang/rust/issues/46484 Upstream also has a pending PR: https://github.com/rust-lang/rust/pull/47956 Fixes #705
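A minimal sketch of the attribute check described above, assuming a hypothetical helper named `is_dir`. On Windows it tests the directory bit (`FILE_ATTRIBUTE_DIRECTORY`, `0x10`) on the raw file attributes; on every other platform it defers to the standard library, so behavior there is unchanged. It is not the code from this commit.

```rust
use std::fs::Metadata;

/// FILE_ATTRIBUTE_DIRECTORY as defined by the Windows API.
#[cfg(windows)]
const FILE_ATTRIBUTE_DIRECTORY: u32 = 0x10;

/// On Windows, an entry is a directory if and only if the directory bit
/// is set on its attributes, even when it is also a reparse point.
#[cfg(windows)]
fn is_dir(md: &Metadata) -> bool {
    use std::os::windows::fs::MetadataExt;
    md.file_attributes() & FILE_ATTRIBUTE_DIRECTORY != 0
}

/// Everywhere else, keep the standard library's answer.
#[cfg(not(windows))]
fn is_dir(md: &Metadata) -> bool {
    md.is_dir()
}
```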
2018-01-31 | worker: better error handling for memory maps | Andrew Gallant
Previously, we would bail out of using memory maps if we could detect ahead of time that opening a memory map would fail. The only case we checked was whether the file size was 0 or not. This is actually insufficient. The mmap call can return ENODEV errors when a file doesn't support memory maps. This is the case for new files exposed by Linux, for example, /sys/devices/system/cpu/vulnerabilities/meltdown. We fix this by checking the actual error codes returned by the mmap call. If ENODEV (or EOVERFLOW) is returned, then we fall back to regular `read` calls. If any other error occurs, we report it to the user. Fixes #760
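The decision described above (fall back on ENODEV or EOVERFLOW, report anything else) might look roughly like the following. The `Strategy` enum and `strategy_for` function are hypothetical, the mapping attempt itself is left out, and the errno constants are hard-coded to their Linux values as an assumption, where real code would take them from a crate such as `libc`.

```rust
use std::io;

// Assumption: Linux errno values, hard-coded to keep the sketch dependency-free.
const ENODEV: i32 = 19;
const EOVERFLOW: i32 = 75;

#[derive(Debug, PartialEq)]
enum Strategy {
    UseMmap,
    FallBackToRead,
    ReportError,
}

/// Decide what to do based on the result of attempting a memory map.
fn strategy_for<T>(mmap_result: &io::Result<T>) -> Strategy {
    match mmap_result {
        Ok(_) => Strategy::UseMmap,
        Err(err) => match err.raw_os_error() {
            // The file does not support memory maps: use regular reads.
            Some(ENODEV) | Some(EOVERFLOW) => Strategy::FallBackToRead,
            // Anything else is a real error the user should see.
            _ => Strategy::ReportError,
        },
    }
}
```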
2018-01-31 | style: remove eprintln macro | Andrew Gallant
The eprintln! macro was added to Rust's standard library in Rust 1.19.0, which is below ripgrep's minimum Rust version. Therefore, we can rely on the standard library variant now.
2018-01-30 | search: add support for searching compressed files | Balaji Sivaraman
This commit adds opt-in support for searching compressed files during recursive search. This behavior is only enabled when the `-z/--search-zip` flag is passed to ripgrep. When enabled, a limited set of common compression formats are recognized via file extension, and a new process is spawned to perform the decompression. ripgrep then searches the stdout of that spawned process. Closes #539
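As a rough illustration of "recognize by extension, then spawn a process and search its stdout": the extension table below and the `-d -c` flags are assumptions of this sketch (the commit message does not list the supported formats), and `decompressor_for` and `spawn_decompressor` are hypothetical names.

```rust
use std::io;
use std::path::Path;
use std::process::{Child, Command, Stdio};

/// Map a file extension to a decompression program.
/// Assumption: this table is illustrative, not ripgrep's actual list.
fn decompressor_for(path: &Path) -> Option<&'static str> {
    match path.extension()?.to_str()? {
        "gz" => Some("gzip"),
        "bz2" => Some("bzip2"),
        "xz" => Some("xz"),
        _ => None,
    }
}

/// Spawn the decompressor with its stdout piped, so the caller can search
/// the child's stdout instead of the file itself.
/// Returns None when the extension is not recognized.
fn spawn_decompressor(path: &Path) -> Option<io::Result<Child>> {
    let program = decompressor_for(path)?;
    Some(
        Command::new(program)
            .args(&["-d", "-c"]) // decompress to stdout
            .arg(path)
            .stdin(Stdio::null())
            .stdout(Stdio::piped())
            .spawn(),
    )
}
```

A caller would read from `child.stdout` and fall back to a normal search when `spawn_decompressor` returns `None`.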
2017-11-22 | clippy: main.rs: call Clone() on trait instead of ref-counted pointers and pass Arc<Args> by ref more often. | Matthias Krüger
2017-09-18 | Avoid expensive check with --files (fixes #600) | Christof Marti
2017-08-27 | restore the default SIGPIPE behavior as a temporary workaround | Jack O'Connor
See https://github.com/BurntSushi/ripgrep/issues/200.
2017-08-08 | Remove unused libc dependency | Vurich
2017-05-19 | Make --quiet flag apply when using --files option | Marc Tiehuis
Fixes #483.
2017-03-12 | Add support for additional text encodings. | Andrew Gallant
This includes, but is not limited to, UTF-16, latin-1, GBK, EUC-JP and Shift_JIS. (Courtesy of the `encoding_rs` crate.) Specifically, this feature enables ripgrep to search files that are encoded in an encoding other than UTF-8. The list of available encodings is tied directly to what the `encoding_rs` crate supports, which is in turn tied to the Encoding Standard. The full list of available encodings can be found here: https://encoding.spec.whatwg.org/#concept-encoding-get

This pull request also introduces the notion that text encodings can be automatically detected on a best effort basis. Currently, the only support for this is checking for a UTF-16 BOM. In all other cases, a text encoding of `auto` (the default) implies a UTF-8 or ASCII compatible source encoding. When a text encoding is otherwise specified, it is unconditionally used for all files searched.

Since ripgrep's regex engine is fundamentally built on top of UTF-8, this feature works by transcoding the files to be searched from their source encoding to UTF-8. This transcoding only happens when:

1. `auto` is specified and a non-UTF-8 encoding is detected.
2. A specific encoding is given by end users (including UTF-8).

When transcoding occurs, errors are handled by automatically inserting the Unicode replacement character. In this case, ripgrep's output is guaranteed to be valid UTF-8 (excluding non-UTF-8 file paths, if they are printed).

In all other cases, the source text is searched directly, which implies an assumption that it is at least ASCII compatible, but where UTF-8 is most useful. In this scenario, encoding errors are not detected. In this case, ripgrep's output will match the input exactly, byte-for-byte.

This design may not be optimal in all cases, but it has some advantages:

1. The happy path ("UTF-8 everywhere") remains happy. I have not been able to witness any performance regressions.
2. In the non-UTF-8 path, implementation complexity is kept relatively low.

The cost here is transcoding itself. A potentially superior implementation might build decoding of any encoding into the regex engine itself. In particular, the fundamental problem with transcoding everything first is that literal optimizations are nearly negated.

Future work should entail improving the user experience. For example, we might want to auto-detect more text encodings. A more elaborate UX might permit end users to specify multiple text encodings, although this seems hard to pull off in an ergonomic way.

Fixes #1
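The `auto` behavior (sniff a UTF-16 BOM, transcode to UTF-8 with replacement characters, otherwise search the bytes as-is) can be illustrated with the standard library alone. This is a stand-in and not the `encoding_rs`-based implementation; `transcode_if_utf16` is a hypothetical name, and a `None` result means "no BOM found, search the raw bytes directly".

```rust
use std::char::{decode_utf16, REPLACEMENT_CHARACTER};

/// If `bytes` starts with a UTF-16 BOM, transcode the rest to UTF-8,
/// replacing invalid sequences with U+FFFD. Otherwise return None.
fn transcode_if_utf16(bytes: &[u8]) -> Option<String> {
    let (little_endian, body) = match bytes {
        [0xFF, 0xFE, rest @ ..] => (true, rest),
        [0xFE, 0xFF, rest @ ..] => (false, rest),
        _ => return None,
    };
    let units = body.chunks(2).map(|pair| {
        if pair.len() < 2 {
            // A trailing odd byte cannot form a code unit. Emit an unpaired
            // surrogate so the decoder turns it into a replacement character.
            return 0xDC00;
        }
        let raw = [pair[0], pair[1]];
        if little_endian {
            u16::from_le_bytes(raw)
        } else {
            u16::from_be_bytes(raw)
        }
    });
    Some(
        decode_utf16(units)
            .map(|r| r.unwrap_or(REPLACEMENT_CHARACTER))
            .collect(),
    )
}
```

Because errors become U+FFFD, the transcoded text is always valid UTF-8, which matches the guarantee described for the transcoding path.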
2017-02-18 | Remove Windows deps from ripgrep proper. | Andrew Gallant
All Windows specific code has been (mostly) pushed out of ripgrep and into its constituent libraries.
2017-01-15 | Replace internal atty module with atty crate. | Andrew Gallant
This removes all use of explicit unsafe in ripgrep proper except for one: accessing the contents of a memory map. (Which may never go away.)
2017-01-09 | Don't search stdout redirected file. | Andrew Gallant
When running ripgrep like this:

    rg foo > output

we must be careful not to search `output` since ripgrep is actively writing to it. Searching it can cause massive blowups where the file grows without bound.

While this is conceptually easy to fix (check the inode of the redirection and the inode of the file you're about to search), there are a few problems with it.

First, inodes are a Unix thing, so we need a Windows specific solution to this as well. To resolve this concern, I created a new crate, `same-file`, which provides a cross platform abstraction.

Second, stat'ing every file is costly. This is not avoidable on Windows, but on Unix, we can get the inode number directly from directory traversal. However, this information wasn't exposed, but now it is (through both the ignore and walkdir crates).

Fixes #286
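The "conceptually easy" Unix half of the fix, comparing device and inode numbers, can be sketched with the standard library. This is not the `same-file` crate's API; `is_same_file` is a hypothetical helper, and the non-Unix branch is only a rough stand-in based on canonical paths, labeled as such.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// On Unix, two paths refer to the same file when both the device number
/// and the inode number match.
#[cfg(unix)]
fn is_same_file(a: &Path, b: &Path) -> io::Result<bool> {
    use std::os::unix::fs::MetadataExt;
    let (ma, mb) = (fs::metadata(a)?, fs::metadata(b)?);
    Ok(ma.dev() == mb.dev() && ma.ino() == mb.ino())
}

/// Assumption: a crude fallback for other platforms that compares
/// canonicalized paths. A real solution needs platform-specific handles.
#[cfg(not(unix))]
fn is_same_file(a: &Path, b: &Path) -> io::Result<bool> {
    Ok(fs::canonicalize(a)? == fs::canonicalize(b)?)
}
```

A search loop would skip any candidate for which `is_same_file(candidate, redirected_stdout)` returns true.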
2016-12-24 | Remove special ^C handling. | Andrew Gallant
This means that ripgrep will no longer try to reset your colors in your terminal if you kill it while searching. This could result in messing up the colors in your terminal, and the fix is to simply run some other command that resets them for you. For example:

    $ echo -ne "\033[0m"

The reason why the ^C handling was removed is because it is irrevocably broken on Windows and is impossible to do correctly and efficiently in ANSI terminals.

Fixes #281
2016-12-24 | Small code cleanups. | Andrew Gallant
2016-12-04 | Simplify code. | Andrew Gallant
Instead of `Ok(n) if n == 0` we can just write `Ok(0)`.
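For illustration, here is the simplified pattern in a typical read loop. The function `count_bytes` is a hypothetical example and not code from this commit; it only shows that the literal pattern `Ok(0)` covers the end-of-input case that `Ok(n) if n == 0` used to match.

```rust
use std::io::{self, Read};

/// Count the bytes produced by a reader.
fn count_bytes<R: Read>(mut rdr: R) -> io::Result<usize> {
    let mut buf = [0u8; 4096];
    let mut total = 0;
    loop {
        match rdr.read(&mut buf) {
            // Literal pattern, replacing the guard `Ok(n) if n == 0`.
            Ok(0) => return Ok(total),
            Ok(n) => total += n,
            // Retry reads that were interrupted by a signal.
            Err(ref e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}
```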
2016-11-20 | Completely re-work colored output and tty handling. | Andrew Gallant
This commit completely guts all of the color handling code and replaces most of it with two new crates: wincolor and termcolor. wincolor provides a simple API for coloring using the Windows console and termcolor provides a platform independent coloring API tuned for multithreaded command line programs. This required a lot more flexibility than what the `term` crate provided, so it was dropped. We instead switch to writing ANSI escape sequences directly and ignore the TERMINFO database.

In addition to fixing several bugs, this commit also permits end users to customize colors to a certain extent. For example, this command will set the match color to magenta and the line number background to yellow:

    rg --colors 'match:fg:magenta' --colors 'line:bg:yellow' foo

For tty handling, we've adopted a hack from `git` to do tty detection in MSYS/mintty terminals. As a result, ripgrep should get both color detection and piping correct on Windows regardless of which terminal you use.

Finally, switch to line buffering. Performance doesn't seem to be impacted and it's an otherwise more user friendly option.

Fixes #37, Fixes #51, Fixes #94, Fixes #117, Fixes #182, Fixes #231
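"Writing ANSI escape sequences directly" amounts to wrapping text in SGR codes. The sketch below is not the termcolor API; `write_colored` and the two constants are hypothetical names, while the numeric codes (35 for a magenta foreground, 43 for a yellow background, 0 for reset) are the standard SGR values.

```rust
use std::io::{self, Write};

/// Standard SGR codes for the two colors in the example command above.
const FG_MAGENTA: u8 = 35;
const BG_YELLOW: u8 = 43;

/// Write `text` wrapped in an SGR color sequence, followed by a reset.
fn write_colored<W: Write>(mut wtr: W, sgr_code: u8, text: &str) -> io::Result<()> {
    write!(wtr, "\x1b[{}m{}\x1b[0m", sgr_code, text)
}
```

For example, `write_colored(io::stdout(), FG_MAGENTA, "foo")` would print a match in magenta on an ANSI terminal, and `BG_YELLOW` could be used the same way for a line number.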
2016-11-17 | Switch from Docopt to Clap. | Andrew Gallant
There were two important reasons for the switch:

1. Performance. Docopt does poorly when the argv becomes large, which is a reasonably common use case for search tools. (e.g., use with xargs)
2. Better failure modes. Clap knows a lot more about how a particular argv might be invalid, and can therefore provide much clearer error messages.

While both were important, (1) made it urgent.

Note that since Clap requires at least Rust 1.11, this will in turn increase the minimum Rust version supported by ripgrep from Rust 1.9 to Rust 1.11. It is therefore a breaking change, so the soonest release of ripgrep with Clap will have to be 0.3.

There is also at least one subtle breaking change in real usage. Previous to this commit, this used to work:

    rg -e -foo

Where this would cause ripgrep to search for the string `-foo`. Clap currently has problems supporting this use case (see: https://github.com/kbknapp/clap-rs/issues/742), but it can be worked around by using this instead:

    rg -e [-]foo

or even

    rg [-]foo

and this still works:

    rg -- -foo

This commit also adds Bash, Fish and PowerShell completion files to the release, fixes a bug that prevented ripgrep from working on file paths containing invalid UTF-8 and shows short descriptions in the output of `-h` but longer descriptions in the output of `--help`.

Fixes #136, Fixes #189, Fixes #210, Fixes #230
2016-11-06 | Don't ever search directories. | Andrew Gallant
2016-11-06 | Always search paths given by user. | Andrew Gallant
This permits doing `rg -a test /dev/sda1` for example, whereas before /dev/sda1 was skipped because it wasn't a regular file.
2016-11-06 | Add --no-messages flag. | Andrew Gallant
This flag is similar to what's found in grep: it will suppress all error messages, such as those shown when a particular file couldn't be read. Closes #149
2016-11-06 | Add -m/--max-count flag. | Andrew Gallant
This flag limits the number of matches printed *per file*. Closes #159
2016-11-05 | Add parallel recursive directory iterator. | Andrew Gallant
This adds a new walk type in the `ignore` crate, `WalkParallel`, which provides a way for recursively iterating over a set of paths in parallel while respecting various ignore rules. The API is a bit strange, as a closure producing a closure isn't something one often sees, but it does seem to work well. This also allowed us to simplify much of the worker logic in ripgrep proper, where MultiWorker is now gone.
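The "closure producing a closure" shape can be illustrated without the `ignore` crate. The sketch below is not `WalkParallel`'s actual signature; `visit_parallel` is a hypothetical function over a plain work queue. The outer closure runs once per thread to build that thread's own visitor, and the inner closure is then called for every item the thread picks up.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Process `items` on `threads` worker threads.
/// `mkf` is called once per thread to produce that thread's visitor.
fn visit_parallel<F>(items: Vec<String>, threads: usize, mut mkf: F)
where
    F: FnMut() -> Box<dyn FnMut(String) + Send + 'static>,
{
    let queue = Arc::new(Mutex::new(items));
    let mut handles = Vec::new();
    for _ in 0..threads.max(1) {
        let queue = Arc::clone(&queue);
        // Outer closure: builds per-thread state and returns the visitor.
        let mut visit = mkf();
        handles.push(thread::spawn(move || loop {
            // The lock guard is dropped at the end of this statement,
            // so the visitor runs without holding the lock.
            let next = queue.lock().unwrap().pop();
            match next {
                Some(item) => visit(item),
                None => break,
            }
        }));
    }
    for handle in handles {
        handle.join().unwrap();
    }
}
```

The design lets each worker own mutable state (a buffer, a matcher, a counter) without sharing it, which is presumably part of why the worker logic could be simplified.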
2016-10-29 | Reset the terminal when Ctrl-C is pressed | Brian Campbell
If a user hits Ctrl-C to exit out of a search in the middle of printing a line, we don't want to leave the terminal colors screwed up for them. Catch Ctrl-C using the ctrlc crate, obtain a stdout lock to ensure that other threads don't continue writing after we do so, reset the terminal, and exit the program. Closes #119
2016-10-29 | Move all gitignore matching to separate crate. | Andrew Gallant
This PR introduces a new sub-crate, `ignore`, which primarily provides a fast recursive directory iterator that respects ignore files like gitignore and other configurable filtering rules based on globs or even file types. This results in a substantial source of complexity moved out of ripgrep's core and into a reusable component that others can now (hopefully) benefit from.

While much of the ignore code carried over from ripgrep's core, a substantial portion of it was rewritten with the following goals in mind:

1. Reuse matchers built from gitignore files across directory iteration.
2. Design the matcher data structure to be amenable for parallelizing directory iteration. (Indeed, writing the parallel iterator is the next step.)

Fixes #9, #44, #45
2016-10-11 | Switch to thread_local crate in lieu of thread_local!. | Andrew Gallant
This is to work around a bug where using a thread_local! was causing a segfault on macOS. Fixes #164.
2016-09-30 | Move glob implementation to new crate. | Andrew Gallant
It is isolated and complex enough that it deserves attention all on its own. It's also eminently reusable.
2016-09-28 | Be better with short circuiting with --quiet. | Andrew Gallant
It didn't make sense for --quiet to be part of the printer, because --quiet doesn't just mean "don't print," it also means, "stop after the first match is found." This needs to be wired all the way up through directory traversal, and it also needs to cause all of the search workers to quit as well. We do it with an atomic that is only checked when --quiet is given. Fixes #116.
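A small sketch of the short-circuit idea: one shared atomic flag that every worker checks before doing more work and sets on the first match. `quiet_search` is a hypothetical function operating on in-memory lines rather than files, so it shows the signalling only and not ripgrep's traversal.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Return true if any line in any haystack contains `needle`.
/// Each haystack is searched on its own thread, and all threads stop
/// as soon as one of them finds a match.
fn quiet_search(haystacks: Vec<Vec<String>>, needle: &str) -> bool {
    let found = Arc::new(AtomicBool::new(false));
    let handles: Vec<_> = haystacks
        .into_iter()
        .map(|lines| {
            let found = Arc::clone(&found);
            let needle = needle.to_string();
            thread::spawn(move || {
                for line in lines {
                    // Another worker already matched: quit early.
                    if found.load(Ordering::SeqCst) {
                        return;
                    }
                    if line.contains(&needle) {
                        found.store(true, Ordering::SeqCst);
                        return;
                    }
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    found.load(Ordering::SeqCst)
}
```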
2016-09-26 | Don't print empty lines in single threaded mode. | Andrew Gallant
Fixes #99.
2016-09-26 | Don't quit if opening a file fails. | Andrew Gallant
This was already working correctly in multithreaded mode, but in single threaded mode, a file failing to open caused search to stop. That's bad. Fixes #98.
2016-09-25 | Don't use an intermediate buffer when --threads=1. | Andrew Gallant
Fixes #8
2016-09-25 | Merge pull request #71 from catchmrbharath/issue46 | Andrew Gallant
[Fixes #46] Use 1 less worker thread than number of threads
2016-09-24 | [Fixes #46] Use 1 less worker thread than number of threads | Bharath M R
The main thread does directory traversal. Hence, number of threads = main thread + number of worker threads. We should have at least one worker thread.
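The arithmetic above fits in one line. `worker_threads` is a hypothetical helper name used only for this illustration: subtract one for the main thread, but never go below one worker.

```rust
/// Number of worker threads for a requested total thread count.
/// The main thread counts as one, and at least one worker is always used.
fn worker_threads(total_threads: usize) -> usize {
    std::cmp::max(1, total_threads.saturating_sub(1))
}
```

Using `saturating_sub` avoids an underflow panic if the requested total is 0.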
2016-09-24 | If a file is empty, still try to search it. | Andrew Gallant
Files like /proc/cpuinfo will advertise themselves as a normal file with size 0. Normally, this isn't a problem, but if ripgrep decides to use a memory map, it skipped searching if the file was empty since it's an error to memory map an empty file. Instead of returning 0, we should just fall back to standard read calls. Fixes #55.
2016-09-20 | Add an error message for catching a common failure mode. | Andrew Gallant
If you're in a directory that has a parent .gitignore (like, your $HOME), then it can cause ripgrep to simply not do anything depending on your ignore rules. There are probably other scenarios where ripgrep applies some filter that an end user doesn't expect, so try to catch the worst case (when ripgrep doesn't search anything).
2016-09-15 | Rework glob sets. | Andrew Gallant
We try to reduce the pressure on regexes and offload some of it to Aho-Corasick or exact lookups.
2016-09-14 | Replace crossbeam with deque. | Andrew Gallant
deque appears faster.
2016-09-13 | We don't use thread_local any more, so remove it. | Andrew Gallant
2016-09-13 | Stream results when feasible. [0.0.19] | Andrew Gallant
For example, when only a single file (or stdin) is being searched, then we should be able to print directly to the terminal instead of intermediate buffers. (The buffers are only necessary for parallelism.) Closes #4.
2016-09-11 | We don't need regex-syntax directly in ripgrep. | Andrew Gallant
2016-09-10 | Rename search module to search_stream. | Andrew Gallant
The name better reflects the difference between it and the search_buffer module.
2016-09-10 | Rejigger the atty detection stuff. | Andrew Gallant
2016-09-08 | Refactor how coloring is done. [0.0.14] | Andrew Gallant
All in the name of appeasing Windows.
2016-09-07 | Hack in Windows console coloring. [0.0.11] | Andrew Gallant
The code has suffered and needs refactoring/commenting. BUT... IT WORKS!
2016-09-06 | Add support for memory maps. | Andrew Gallant
I thought plain `read` had usurped them, but when searching a very small number of files, mmaps can be around 20% faster on Linux. It'd be really unfortunate to leave that on the table. Mmap searching doesn't support contexts yet, but we probably don't really care. And duplicating that logic doesn't sound fun. Without contexts, mmap searching is delightfully simple.