author Andrew Gallant <jamslam@gmail.com> 2017-03-12 20:21:22 -0400
committer Andrew Gallant <jamslam@gmail.com> 2017-03-12 20:21:22 -0400
commit 8db24e135375a2510e3eca85c72005172788471e
tree 986e18f17647b5f56b63dfbbd23ed7e1db67909a
parent 8bbe58d623db78a32b04eabff9a69667ad23ff7b
Stop aggressive inlining.
It's not clear exactly what is happening here, but the Read implementation for text decoding appears to be quite sensitive. Small perturbations in the code appear to have a nearly 100% impact on the overall speed of ripgrep when searching UTF-16 files. I haven't had time to examine the generated code in detail, but `perf stat` suggests that the instruction cache performs much worse when the code slows down. This might mean that excessive inlining produces a different code structure with less-than-optimal icache usage, but that is a guess at best. Explicitly disabling inlining for the cold path seems to help the optimizer figure out the right thing.
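As a rough illustration of the pattern described above, here is a minimal sketch of a reader whose one-time detection step is kept out of line with `#[inline(never)]`, so the per-call `read` body stays small. `BomStripReader` and its fields are hypothetical names invented for this example; it is not ripgrep's `DecodeReader`, and it only strips a UTF-8 BOM rather than decoding UTF-16. The sketch shows where the attribute goes. It does not reproduce the slowdown, which depends on the compiler version and target.

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for a decoding reader: it drops a leading UTF-8
/// BOM, then passes the remaining bytes through unchanged.
pub struct BomStripReader<R> {
    rdr: R,
    detected: bool,
    /// Bytes consumed during detection that turned out not to be a BOM.
    pending: Vec<u8>,
    pos: usize,
}

impl<R: Read> BomStripReader<R> {
    pub fn new(rdr: R) -> Self {
        BomStripReader { rdr, detected: false, pending: Vec::new(), pos: 0 }
    }

    /// Cold path: runs once per stream. `#[inline(never)]` asks the
    /// compiler not to merge this body into `read`.
    #[inline(never)]
    fn detect(&mut self) -> io::Result<()> {
        let mut buf = [0u8; 3];
        let mut n = 0;
        // Fill up to three bytes, stopping early at end of input.
        while n < buf.len() {
            match self.rdr.read(&mut buf[n..])? {
                0 => break,
                k => n += k,
            }
        }
        // Keep the bytes unless they are exactly the UTF-8 BOM.
        if !(n == 3 && buf == [0xEF, 0xBB, 0xBF]) {
            self.pending.extend_from_slice(&buf[..n]);
        }
        self.detected = true;
        Ok(())
    }
}

impl<R: Read> Read for BomStripReader<R> {
    /// Hot path: called repeatedly for the lifetime of the stream.
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        if !self.detected {
            self.detect()?;
        }
        // Drain any bytes held back by detection before reading more.
        if self.pos < self.pending.len() {
            let n = out.len().min(self.pending.len() - self.pos);
            out[..n].copy_from_slice(&self.pending[self.pos..self.pos + n]);
            self.pos += n;
            return Ok(n);
        }
        self.rdr.read(out)
    }
}
```

Wrapping a byte slice such as `&b"\xEF\xBB\xBFhello"[..]` and calling `read_to_string` yields `hello`. Rust also provides a `#[cold]` attribute for marking rarely called functions; the commit does not use it, and whether either attribute helps in a given case is something to measure rather than assume.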
-rw-r--r-- src/decoder.rs 1
1 file changed, 1 insertion, 0 deletions
diff --git a/src/decoder.rs b/src/decoder.rs
index d43cbdbb..345389a5 100644
--- a/src/decoder.rs
+++ b/src/decoder.rs
@@ -251,6 +251,7 @@ impl<R: io::Read, B: AsMut<[u8]>> DecodeReader<R, B> {
         Ok(nwrite)
     }

+    #[inline(never)] // impacts perf...
     fn detect(&mut self) -> io::Result<()> {
         let bom = try!(self.rdr.peek_bom());
         self.decoder = bom.decoder();