author | Andrew Gallant <jamslam@gmail.com> | 2017-03-12 20:21:22 -0400 |
---|---|---|
committer | Andrew Gallant <jamslam@gmail.com> | 2017-03-12 20:21:22 -0400 |
commit | 8db24e135375a2510e3eca85c72005172788471e (patch) | |
tree | 986e18f17647b5f56b63dfbbd23ed7e1db67909a | |
parent | 8bbe58d623db78a32b04eabff9a69667ad23ff7b (diff) |
Stop aggressive inlining.
It's not clear what exactly is happening here, but the Read implementation
for text decoding appears a bit sensitive. Small perturbations in the code
appear to have a nearly 100% impact on the overall speed of ripgrep when
searching UTF-16 files.
I haven't had the time to examine the generated code in detail, but
`perf stat` seems to think that the instruction cache is performing a lot
worse when the code slows down. This might mean that excessive inlining
causes a different code structure that leads to less-than-optimal icache
usage, but it's at best a guess.
Explicitly disabling the inline for the cold path seems to help the
optimizer figure out the right thing.
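As a rough illustration of the pattern described above, here is a minimal sketch of a reader whose one-time detection step is kept out of line. `BomStrippingReader` and `strip_bom` are hypothetical names invented for this example; this is not ripgrep's `DecodeReader`, and it only handles a UTF-8 BOM. Note that `#[inline(never)]` is a hint to the compiler, so any effect on speed has to be measured.

```rust
use std::io::{self, Read};

const UTF8_BOM: [u8; 3] = [0xEF, 0xBB, 0xBF];

/// Hypothetical reader: drops a leading UTF-8 BOM, then passes bytes through.
struct BomStrippingReader<R> {
    rdr: R,
    detected: bool,
    /// Bytes consumed during detection that were not a BOM.
    pending: Vec<u8>,
    pos: usize,
}

impl<R: Read> BomStrippingReader<R> {
    fn new(rdr: R) -> Self {
        BomStrippingReader { rdr, detected: false, pending: Vec::new(), pos: 0 }
    }

    /// Cold path: runs once per stream. Keeping it out of line stops its
    /// body from being merged into the hot `read` loop.
    #[inline(never)]
    fn detect(&mut self) -> io::Result<()> {
        self.detected = true;
        let mut buf = [0u8; 3];
        let mut n = 0;
        // A single read may return fewer than 3 bytes, so loop until
        // the buffer is full or the stream ends.
        while n < buf.len() {
            match self.rdr.read(&mut buf[n..])? {
                0 => break,
                k => n += k,
            }
        }
        if &buf[..n] != &UTF8_BOM[..] {
            // Not a BOM: hand these bytes back to the caller later.
            self.pending.extend_from_slice(&buf[..n]);
        }
        Ok(())
    }
}

impl<R: Read> Read for BomStrippingReader<R> {
    /// Hot path: called repeatedly for the whole stream.
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        if !self.detected {
            self.detect()?;
        }
        if self.pos < self.pending.len() {
            let n = out.len().min(self.pending.len() - self.pos);
            out[..n].copy_from_slice(&self.pending[self.pos..self.pos + n]);
            self.pos += n;
            return Ok(n);
        }
        self.rdr.read(out)
    }
}

/// Convenience helper for the example: read everything, minus any BOM.
fn strip_bom(input: &[u8]) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    BomStrippingReader::new(input).read_to_end(&mut out)?;
    Ok(out)
}
```

Whether a change like this helps depends on the surrounding code and the compiler version, so it is worth comparing timings and `perf stat` counters with and without the attribute.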
-rw-r--r-- | src/decoder.rs | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/src/decoder.rs b/src/decoder.rs
index d43cbdbb..345389a5 100644
--- a/src/decoder.rs
+++ b/src/decoder.rs
@@ -251,6 +251,7 @@ impl<R: io::Read, B: AsMut<[u8]>> DecodeReader<R, B> {
         Ok(nwrite)
     }
 
+    #[inline(never)] // impacts perf...
     fn detect(&mut self) -> io::Result<()> {
         let bom = try!(self.rdr.peek_bom());
         self.decoder = bom.decoder();