summaryrefslogtreecommitdiffstats
path: root/grep-searcher/src/lib.rs
blob: 6a9f4ba7e39eacc0e1c3e3d0ecdc52bfe4134469 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
/*!
This crate provides an implementation of line oriented search, with optional
support for multi-line search.

# Brief overview

The principle type in this crate is a
[`Searcher`](struct.Searcher.html),
which can be configured and built by a
[`SearcherBuilder`](struct.SearcherBuilder.html).
A `Searcher` is responsible for reading bytes from a source (e.g., a file),
executing a search of those bytes using a `Matcher` (e.g., a regex) and then
reporting the results of that search to a
[`Sink`](trait.Sink.html)
(e.g., stdout). The `Searcher` itself is principally responsible for managing
the consumption of bytes from a source and applying a `Matcher` over those
bytes in an efficient way. The `Searcher` is also responsible for inverting
a search, counting lines, reporting contextual lines, detecting binary data
and even deciding whether or not to use memory maps.

A `Matcher` (which is defined in the
[`grep-matcher`](https://crates.io/crates/grep-matcher)
crate) is a trait for describing the lowest levels of pattern search in a
generic way. The interface itself is very similar to the interface of a regular
expression. For example, the
[`grep-regex`](https://crates.io/crates/grep-regex)
crate provides an implementation of the `Matcher` trait using Rust's
[`regex`](https://crates.io/crates/regex)
crate.

Finally, a `Sink` describes how callers receive search results producer by a
`Searcher`. This includes routines that are called at the beginning and end of
a search, in addition to routines that are called when matching or contextual
lines are found by the `Searcher`. Implementations of `Sink` can be trivially
simple, or extraordinarily complex, such as the
`Standard` printer found in the
[`grep-printer`](https://crates.io/crates/grep-printer)
crate, which effectively implements grep-like output.
This crate also provides convenience `Sink` implementations in the
[`sinks`](sinks/index.html)
sub-module for easy searching with closures.

# Example

This example shows how to execute the searcher and read the search results
using the
[`UTF8`](sinks/struct.UTF8.html)
implementation of `Sink`.

```
extern crate grep_matcher;
extern crate grep_regex;
extern crate grep_searcher;

use std::error::Error;

use grep_matcher::Matcher;
use grep_regex::RegexMatcher;
use grep_searcher::Searcher;
use grep_searcher::sinks::UTF8;

const SHERLOCK: &'static [u8] = b"\
For the Doctor Watsons of this world, as opposed to the Sherlock
Holmeses, success in the province of detective work must always
be, to a very large extent, the result of luck. Sherlock Holmes
can extract a clew from a wisp of straw or a flake of cigar ash;
but Doctor Watson has to have it taken out for him and dusted,
and exhibited clearly, with a label attached.
";

# fn main() { example().unwrap() }
fn example() -> Result<(), Box<Error>> {
    let matcher = RegexMatcher::new(r"Doctor \w+")?;
    let mut matches: Vec<(u64, String)> = vec![];
    Searcher::new().search_slice(&matcher, SHERLOCK, UTF8(|lnum, line| {
        // We are guaranteed to find a match, so the unwrap is OK.
        let mymatch = matcher.find(line.as_bytes())?.unwrap();
        matches.push((lnum, line[mymatch].to_string()));
        Ok(true)
    }))?;

    assert_eq!(matches.len(), 2);
    assert_eq!(
        matches[0],
        (1, "Doctor Watsons".to_string())
    );
    assert_eq!(
        matches[1],
        (5, "Doctor Watson".to_string())
    );
    Ok(())
}
```

See also `examples/search-stdin.rs` from the root of this crate's directory
to see a similar example that accepts a pattern on the command line and
searches stdin.
*/

#![deny(missing_docs)]

extern crate bstr;
extern crate bytecount;
extern crate encoding_rs;
extern crate encoding_rs_io;
extern crate grep_matcher;
#[macro_use]
extern crate log;
extern crate memmap;
#[cfg(test)]
extern crate regex;

pub use lines::{LineIter, LineStep};
pub use searcher::{
    BinaryDetection, ConfigError, Encoding, MmapChoice,
    Searcher, SearcherBuilder,
};
pub use sink::{
    Sink, SinkError,
    SinkContext, SinkContextKind, SinkFinish, SinkMatch,
};
pub use sink::sinks;

#[macro_use]
mod macros;

mod line_buffer;
mod lines;
mod searcher;
mod sink;
#[cfg(test)]
mod testutil;