| Field | Value | Date |
|---|---|---|
| author | sharkdp <davidpeter@web.de> | 2019-09-17 20:07:23 +0200 |
| committer | Andrew Gallant <jamslam@gmail.com> | 2020-02-17 17:16:28 -0500 |
| commit | a18cf6ec39cd1d8fc7b4061858bb9263b6e5e073 | |
| tree | cdbec45c8989f7dd337c6d5f083be0261ed3e800 /CHANGELOG.md | |
| parent | c78c3236a8c530af076ec3d56c48e4bfa7dcd677 | |
ignore: add existence check for ignore files

This commit adds a simple `.exists()` check for `.gitignore`,
`.ignore`, and other similar files before actually calling
`File::open(…)` in `GitignoreBuilder::add`.

The reason is that a simple existence check via `stat` can be faster
than actually trying to `open` the file; see
https://stackoverflow.com/a/12774387/704831. Since we presumably have
many more directories *without* ignore files than directories *with*
ignore files, this leads to an overall speedup.

The performance gain is not huge for `rg`, but can be quite significant
if more `.gitignore`-like files are added via
`add_custom_ignore_filename`. The speedup is *larger* for folders with
*low* files-per-directory ratios.

Note, though, that we skip this check on Windows until a
Windows-specific analysis shows that it is beneficial. Namely, Windows
generally has slower file system operations, so it is not clear whether
this speculative check is actually a benefit there.

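
The idea described above can be sketched in a few lines of standard-library Rust. This is only an illustration under stated assumptions, not the code from this commit: the helper name `read_ignore_lines` and its return type are hypothetical, and the real change lives in the `ignore` crate's own builder logic. The sketch shows the two points from the message: a cheap existence check before opening the file, and no speculative check on Windows.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};
use std::path::Path;

/// Hypothetical helper (not part of the `ignore` crate): returns the lines
/// of the ignore file `name` inside `dir`, or `None` if there is no such file.
fn read_ignore_lines(dir: &Path, name: &str) -> io::Result<Option<Vec<String>>> {
    let path = dir.join(name);

    // Speculative existence check: most directories are expected to have no
    // ignore file, so bail out early. On Windows the check is skipped and we
    // always fall through to `File::open`.
    if !cfg!(windows) && !path.exists() {
        return Ok(None);
    }

    match File::open(&path) {
        Ok(file) => {
            let lines = BufReader::new(file)
                .lines()
                .collect::<io::Result<Vec<String>>>()?;
            Ok(Some(lines))
        }
        // The file may be absent (Windows path) or may have vanished between
        // the check and the open; treat both as "no ignore file".
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}
```

Note that the existence check is purely an optimization in this sketch: the `NotFound` arm still handles a missing file, so removing the check would change the cost but not the result.
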
Benchmark results
-----------------
`rg --files` in my home folder (200k results, 6.5 files per directory):
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./rg-master --files` | 396.4 ± 3.2 | 390.9 | 400.0 | 1.05 |
| `./rg-feature --files` | 376.0 ± 3.6 | 369.3 | 383.5 | 1.00 |
`rg --files --hidden` in my home folder (800k results, 5.4 files per directory):
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `./rg-master --files --hidden` | 1.575 ± 0.012 | 1.560 | 1.597 | 1.06 |
| `./rg-feature --files --hidden` | 1.479 ± 0.011 | 1.464 | 1.496 | 1.00 |
`rg --files` in the chromium-79.0.3915.2 source tree (300k results, 12.7 files per directory):
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `~/rg-master --files` | 445.2 ± 5.3 | 435.6 | 453.0 | 1.04 |
| `~/rg-feature --files` | 428.9 ± 7.0 | 418.2 | 440.0 | 1.00 |
`rg --files` in the linux-5.3 source tree (65k results, 15.1 files per directory):
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./rg-master --files` | 94.5 ± 1.9 | 89.8 | 98.5 | 1.02 |
| `./rg-feature --files` | 92.6 ± 2.7 | 88.4 | 98.7 | 1.00 |
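
As a sanity check on the tables above, the `Relative` column appears to be each mean divided by the fastest mean in the same table, rounded to two decimals. The sketch below recomputes it from the reported means under that assumption; the function names `relative` and `relative_column` and the row labels are made up for illustration.

```rust
/// Ratio of a mean run time to the fastest mean, rounded to two decimals.
/// Both arguments must use the same unit (ms or s), so the ratio is unitless.
fn relative(mean: f64, fastest: f64) -> f64 {
    (mean / fastest * 100.0).round() / 100.0
}

/// Recomputes the `Relative` value of the `rg-master` row for each benchmark.
/// The means are copied from the tables; the labels are illustrative only.
fn relative_column() -> Vec<(&'static str, f64)> {
    // (label, rg-master mean, rg-feature mean)
    let rows = [
        ("home, --files [ms]", 396.4, 376.0),
        ("home, --files --hidden [s]", 1.575, 1.479),
        ("chromium, --files [ms]", 445.2, 428.9),
        ("linux-5.3, --files [ms]", 94.5, 92.6),
    ];
    rows.iter()
        .map(|&(label, master, feature)| (label, relative(master, feature)))
        .collect()
}
```

The recomputed ratios fall from about 5-6% in the home folder (5.4 to 6.5 files per directory) to about 2% in the Linux tree (15.1 files per directory), which is consistent with the remark that the speedup is larger for low files-per-directory ratios.
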
Closes #1381
Diffstat (limited to 'CHANGELOG.md')
-rw-r--r-- | CHANGELOG.md | 5 |
1 files changed, 5 insertions, 0 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 68762a14..083000de 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,11 @@ TBD
 ===
 TODO
 
+Performance improvements:
+
+* [PERF #1381](https://github.com/BurntSushi/ripgrep/pull/1381):
+  Directory traversal is sped up with speculative ignore-file existence checks.
+
 Bug fixes:
 
 * [BUG #1335](https://github.com/BurntSushi/ripgrep/issues/1335):