author | Andrew Gallant <jamslam@gmail.com> | 2017-02-18 16:20:21 -0500 |
---|---|---|
committer | Andrew Gallant <jamslam@gmail.com> | 2017-02-18 16:20:21 -0500 |
commit | 79d40d0e20a5710476b43c58c42db5124c2010a1 (patch) | |
tree | 93fb8b258fa8bb2ece2efc7c508ca6142f4b2e5f /src/search_buffer.rs | |
parent | 525b27804949d9362d4e1890a0cfa18a2eb272bd (diff) |
Tweak how binary files are handled internally.
This commit fixes two issues. The first issue is that if a file contained
many NUL bytes without any LF bytes, then the InputBuffer would read the
entire file into memory. This is not typically a problem, but if you run
rg on /proc, then bad things can happen when reading virtual memory mapping
files. Arguably, such files should be ignored, but we should also try to
avoid exhausting memory too. We fix this by pushing the `-a/--text` option
down into InputBuffer, so that, unless that option is set, it stops
immediately when it finds a NUL byte.
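
To illustrate the idea, here is a minimal sketch of a chunked reader that gives up at the first NUL byte unless a `text` override is set. The names `Scan` and `scan` and the chunk size are hypothetical and chosen for this example; this is not the actual InputBuffer code.

```rust
use std::io::{self, Read};

/// Outcome of scanning a reader (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Scan {
    /// Reached end of input; carries the total number of bytes read.
    Text(usize),
    /// Stopped early because a NUL byte was seen; carries bytes read so far.
    Binary(usize),
}

/// Read `rdr` in chunks of `chunk_size` bytes. Unless `text` is true, stop
/// after the first chunk that contains a NUL byte, so a file full of NUL
/// bytes and no LF bytes is never read into memory in its entirety.
fn scan<R: Read>(mut rdr: R, text: bool, chunk_size: usize) -> io::Result<Scan> {
    let mut chunk = vec![0u8; chunk_size];
    let mut total = 0;
    loop {
        let n = rdr.read(&mut chunk)?;
        if n == 0 {
            // End of input without hitting the binary check.
            return Ok(Scan::Text(total));
        }
        total += n;
        if !text && chunk[..n].contains(&0) {
            return Ok(Scan::Binary(total));
        }
    }
}
```

With `text` set to `false`, a 1 MiB input consisting only of NUL bytes is abandoned after the first chunk. With `text` set to `true`, the same input is read to the end.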
The other issue is that binary detection was applied only to the first
buffer. It is now applied to every buffer, which helps avoid classifying too
many files as plain text when the first part of a binary file happens to
contain no NUL bytes. This issue still persists somewhat in the memory map
searcher, since we probably don't want to search the entire file upfront
for NUL bytes before actually performing our search. Instead, we search the
first 10KB for now.
Fixes #52, Fixes #311
Diffstat (limited to 'src/search_buffer.rs')
-rw-r--r-- | src/search_buffer.rs | 4 |
1 files changed, 2 insertions, 2 deletions
diff --git a/src/search_buffer.rs b/src/search_buffer.rs
index 2b792b5c..4745c2f8 100644
--- a/src/search_buffer.rs
+++ b/src/search_buffer.rs
@@ -113,8 +113,8 @@ impl<'a, W: WriteColor> BufferSearcher<'a, W> {
     #[inline(never)]
     pub fn run(mut self) -> u64 {
-        let binary_upto = cmp::min(4096, self.buf.len());
-        if !self.opts.text && is_binary(&self.buf[..binary_upto]) {
+        let binary_upto = cmp::min(10240, self.buf.len());
+        if !self.opts.text && is_binary(&self.buf[..binary_upto], true) {
             return 0;
         }
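
The prefix check in this hunk can be sketched in isolation as follows. `looks_binary` is a hypothetical stand-in that treats any NUL byte as a sign of binary data; the real `is_binary` takes a second argument whose meaning is not visible in this diff, so it is left out here.

```rust
use std::cmp;

/// Number of leading bytes to inspect; 10240 matches the new value in the diff.
const BINARY_CHECK_LEN: usize = 10240;

/// Hypothetical stand-in for `is_binary`: returns true if a NUL byte occurs
/// within the first `BINARY_CHECK_LEN` bytes of `buf`.
fn looks_binary(buf: &[u8]) -> bool {
    // Clamp to the buffer length so short buffers do not cause a slice panic.
    let upto = cmp::min(BINARY_CHECK_LEN, buf.len());
    buf[..upto].contains(&0)
}
```

A NUL byte that first appears beyond the inspected prefix goes undetected, which is the residual limitation the commit message describes for the memory map searcher.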