author | Andrew Gallant <jamslam@gmail.com> | 2018-04-29 09:29:52 -0400 |
---|---|---|
committer | Andrew Gallant <jamslam@gmail.com> | 2018-08-20 07:10:19 -0400 |
commit | d9ca5293569efb255608d3c601107bcfe7060f15 (patch) | |
tree | 7fd8611c333a2f7d703987de3a379ee8292013e2 /grep-printer | |
parent | 0958837ee104985412f08e81b6f08df1e5291042 (diff) |
libripgrep: initial commit introducing libripgrep

libripgrep is not any one library, but rather, a collection of libraries
that roughly separate the following key distinct phases in a grep
implementation:

1. Pattern matching (e.g., by a regex engine).
2. Searching a file using a pattern matcher.
3. Printing results.

Ultimately, both (1) and (3) are defined by de-coupled interfaces, of
which there may be multiple implementations. Namely, (1) is satisfied by
the `Matcher` trait in the `grep-matcher` crate and (3) is satisfied by
the `Sink` trait in the `grep2` crate. The searcher (2) ties everything
together and finds results using a matcher and reports those results
using a `Sink` implementation.

Closes #162
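
The three phases described in the commit message can be sketched roughly as follows. This is a minimal illustration only: the `Matcher` and `Sink` traits below are simplified, hypothetical stand-ins defined locally, not the real traits from `grep-matcher` or the searcher crate, and `Literal`, `LinePrinter` and `search` are names invented for this example.

```rust
use std::io::{self, Write};

/// Phase 1 (hypothetical, simplified): report the byte range of the first
/// match in a line, if there is one.
trait Matcher {
    fn find(&self, haystack: &str) -> Option<(usize, usize)>;
}

/// Phase 3 (hypothetical, simplified): receive each matching line. Returning
/// `Ok(false)` asks the searcher to stop early.
trait Sink {
    fn matched(&mut self, line_number: u64, line: &str) -> io::Result<bool>;
}

/// A matcher for a literal substring, standing in for a regex engine.
struct Literal(String);

impl Matcher for Literal {
    fn find(&self, haystack: &str) -> Option<(usize, usize)> {
        haystack
            .find(self.0.as_str())
            .map(|start| (start, start + self.0.len()))
    }
}

/// A sink that prints `line_number:line` to any writer.
struct LinePrinter<W: Write> {
    wtr: W,
}

impl<W: Write> Sink for LinePrinter<W> {
    fn matched(&mut self, line_number: u64, line: &str) -> io::Result<bool> {
        writeln!(self.wtr, "{}:{}", line_number, line)?;
        Ok(true)
    }
}

/// Phase 2 (hypothetical, simplified): walk the input line by line, ask the
/// matcher whether each line matches, and report matching lines to the sink.
fn search<M: Matcher, S: Sink>(
    matcher: &M,
    haystack: &str,
    sink: &mut S,
) -> io::Result<()> {
    for (i, line) in haystack.lines().enumerate() {
        if matcher.find(line).is_some() {
            // Line numbers are 1-based; stop if the sink asks us to.
            if !sink.matched(i as u64 + 1, line)? {
                break;
            }
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let matcher = Literal("Sink".to_string());
    let mut printer = LinePrinter { wtr: io::stdout() };
    search(&matcher, "a Matcher\na Sink\na Searcher\n", &mut printer)
}
```

Because the searcher in this sketch only depends on the two traits, a different matcher or a different output format can be swapped in without touching `search`, which is the kind of decoupling the commit message describes.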
Diffstat (limited to 'grep-printer')
-rw-r--r-- | grep-printer/Cargo.toml | 30 |
-rw-r--r-- | grep-printer/LICENSE-MIT | 21 |
-rw-r--r-- | grep-printer/README.md | 35 |
-rw-r--r-- | grep-printer/UNLICENSE | 24 |
-rw-r--r-- | grep-printer/src/color.rs | 366 |
-rw-r--r-- | grep-printer/src/counter.rs | 90 |
-rw-r--r-- | grep-printer/src/json.rs | 921 |
-rw-r--r-- | grep-printer/src/jsont.rs | 213 |
-rw-r--r-- | grep-printer/src/lib.rs | 106 |
-rw-r--r-- | grep-printer/src/macros.rs | 23 |
-rw-r--r-- | grep-printer/src/standard.rs | 3049 |
-rw-r--r-- | grep-printer/src/stats.rs | 147 |
-rw-r--r-- | grep-printer/src/summary.rs | 1068 |
-rw-r--r-- | grep-printer/src/util.rs | 392 |
14 files changed, 6485 insertions, 0 deletions
diff --git a/grep-printer/Cargo.toml b/grep-printer/Cargo.toml new file mode 100644 index 00000000..ffc85bc8 --- /dev/null +++ b/grep-printer/Cargo.toml @@ -0,0 +1,30 @@ +[package] +name = "grep-printer" +version = "0.0.1" #:version +authors = ["Andrew Gallant <jamslam@gmail.com>"] +description = """ +An implementation of the grep crate's Sink trait that provides standard +printing of search results, similar to grep itself. +""" +documentation = "https://docs.rs/grep-printer" +homepage = "https://github.com/BurntSushi/ripgrep" +repository = "https://github.com/BurntSushi/ripgrep" +readme = "README.md" +keywords = ["grep", "pattern", "print", "printer", "sink"] +license = "Unlicense/MIT" + +[features] +default = ["serde1"] +serde1 = ["base64", "serde", "serde_derive", "serde_json"] + +[dependencies] +base64 = { version = "0.9", optional = true } +grep-matcher = { version = "0.0.1", path = "../grep-matcher" } +grep-searcher = { version = "0.0.1", path = "../grep-searcher" } +termcolor = "1" +serde = { version = "1", optional = true } +serde_derive = { version = "1", optional = true } +serde_json = { version = "1", optional = true } + +[dev-dependencies] +grep-regex = { version = "0.0.1", path = "../grep-regex" } diff --git a/grep-printer/LICENSE-MIT b/grep-printer/LICENSE-MIT new file mode 100644 index 00000000..3b0a5dc0 --- /dev/null +++ b/grep-printer/LICENSE-MIT @@ -0,0 +1,21 @@ +The MIT License (MIT) + +Copyright (c) 2015 Andrew Gallant + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or 
substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. diff --git a/grep-printer/README.md b/grep-printer/README.md new file mode 100644 index 00000000..8ccdf951 --- /dev/null +++ b/grep-printer/README.md @@ -0,0 +1,35 @@ +grep-printer +------------ +Print results from line oriented searching in a human readable, aggregate or +JSON Lines format. + +[![Linux build status](https://api.travis-ci.org/BurntSushi/ripgrep.svg)](https://travis-ci.org/BurntSushi/ripgrep) +[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep) +[![](https://img.shields.io/crates/v/grep-printer.svg)](https://crates.io/crates/grep-printer) + +Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org). + +### Documentation + +[https://docs.rs/grep-printer](https://docs.rs/grep-printer) + +**NOTE:** You probably don't want to use this crate directly. Instead, you +should prefer the facade defined in the +[`grep`](https://docs.rs/grep) +crate. + + +### Usage + +Add this to your `Cargo.toml`: + +```toml +[dependencies] +grep-printer = "0.1" +``` + +and this to your crate root: + +```rust +extern crate grep_printer; +``` diff --git a/grep-printer/UNLICENSE b/grep-printer/UNLICENSE new file mode 100644 index 00000000..68a49daa --- /dev/null +++ b/grep-printer/UNLICENSE @@ -0,0 +1,24 @@ +This is free and unencumbered software released into the public domain. 
+ +Anyone is free to copy, modify, publish, use, compile, sell, or +distribute this software, either in source code form or as a compiled +binary, for any purpose, commercial or non-commercial, and by any +means. + +In jurisdictions that recognize copyright laws, the author or authors +of this software dedicate any and all copyright interest in the +software to the public domain. We make this dedication for the benefit +of the public at large and to the detriment of our heirs and +successors. We intend this dedication to be an overt act of +relinquishment in perpetuity of all present and future rights to this +software under copyright law. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. +IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR +OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, +ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR +OTHER DEALINGS IN THE SOFTWARE. + +For more information, please refer to <http://unlicense.org/> diff --git a/grep-printer/src/color.rs b/grep-printer/src/color.rs new file mode 100644 index 00000000..dcaca59d --- /dev/null +++ b/grep-printer/src/color.rs @@ -0,0 +1,366 @@ +use std::error; +use std::fmt; +use std::str::FromStr; + +use termcolor::{Color, ColorSpec, ParseColorError}; + +/// An error that can occur when parsing color specifications. +#[derive(Clone, Debug, Eq, PartialEq)] +pub enum ColorError { + /// This occurs when an unrecognized output type is used. + UnrecognizedOutType(String), + /// This occurs when an unrecognized spec type is used. + UnrecognizedSpecType(String), + /// This occurs when an unrecognized color name is used. + UnrecognizedColor(String, String), + /// This occurs when an unrecognized style attribute is used. 
+ UnrecognizedStyle(String), + /// This occurs when the format of a color specification is invalid. + InvalidFormat(String), +} + +impl error::Error for ColorError { + fn description(&self) -> &str { + match *self { + ColorError::UnrecognizedOutType(_) => "unrecognized output type", + ColorError::UnrecognizedSpecType(_) => "unrecognized spec type", + ColorError::UnrecognizedColor(_, _) => "unrecognized color name", + ColorError::UnrecognizedStyle(_) => "unrecognized style attribute", + ColorError::InvalidFormat(_) => "invalid color spec", + } + } +} + +impl ColorError { + fn from_parse_error(err: ParseColorError) -> ColorError { + ColorError::UnrecognizedColor( + err.invalid().to_string(), + err.to_string(), + ) + } +} + +impl fmt::Display for ColorError { + fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { + match *self { + ColorError::UnrecognizedOutType(ref name) => { + write!( + f, + "unrecognized output type '{}'. Choose from: \ + path, line, column, match.", + name, + ) + } + ColorError::UnrecognizedSpecType(ref name) => { + write!( + f, + "unrecognized spec type '{}'. Choose from: \ + fg, bg, style, none.", + name, + ) + } + ColorError::UnrecognizedColor(_, ref msg) => { + write!(f, "{}", msg) + } + ColorError::UnrecognizedStyle(ref name) => { + write!( + f, + "unrecognized style attribute '{}'. Choose from: \ + nobold, bold, nointense, intense, nounderline, \ + underline.", + name, + ) + } + ColorError::InvalidFormat(ref original) => { + write!( + f, + "invalid color spec format: '{}'. Valid format \ + is '(path|line|column|match):(fg|bg|style):(value)'.", + original, + ) + } + } + } +} + +/// A merged set of color specifications. +/// +/// This set of color specifications represents the various color types that +/// are supported by the printers in this crate. A set of color specifications +/// can be created from a sequence of +/// [`UserColorSpec`s](struct.UserColorSpec.html). 
+#[derive(Clone, Debug, Default, Eq, PartialEq)] +pub struct ColorSpecs { + path: ColorSpec, + line: ColorSpec, + column: ColorSpec, + matched: ColorSpec, +} + +/// A single color specification provided by the user. +/// +/// ## Format +/// +/// The format of a `Spec` is a triple: `{type}:{attribute}:{value}`. Each +/// component is defined as follows: +/// +/// * `{type}` can be one of `path`, `line`, `column` or `match`. +/// * `{attribute}` can be one of `fg`, `bg` or `style`. `{attribute}` may also +/// be the special value `none`, in which case, `{value}` can be omitted. +/// * `{value}` is either a color name (for `fg`/`bg`) or a style instruction. +/// +/// `{type}` controls which part of the output should be styled. +/// +/// When `{attribute}` is `none`, then this should cause any existing style +/// settings to be cleared for the specified `type`. +/// +/// `{value}` should be a color when `{attribute}` is `fg` or `bg`, or it +/// should be a style instruction when `{attribute}` is `style`. When +/// `{attribute}` is `none`, `{value}` must be omitted. +/// +/// Valid colors are `black`, `blue`, `green`, `red`, `cyan`, `magenta`, +/// `yellow`, `white`. Extended colors can also be specified, and are formatted +/// as `x` (for 256-bit colors) or `x,x,x` (for 24-bit true color), where +/// `x` is a number between 0 and 255 inclusive. `x` may be given as a normal +/// decimal number of a hexadecimal number, where the latter is prefixed by +/// `0x`. +/// +/// Valid style instructions are `nobold`, `bold`, `intense`, `nointense`, +/// `underline`, `nounderline`. +/// +/// ## Example +/// +/// The standard way to build a `UserColorSpec` is to parse it from a string. +/// Once multiple `UserColorSpec`s have been constructed, they can be provided +/// to the standard printer where they will automatically be applied to the +/// output. 
+/// +/// A `UserColorSpec` can also be converted to a `termcolor::ColorSpec`: +/// +/// ```rust +/// extern crate grep_printer; +/// extern crate termcolor; +/// +/// # fn main() { +/// use termcolor::{Color, ColorSpec}; +/// use grep_printer::UserColorSpec; +/// +/// let user_spec1: UserColorSpec = "path:fg:blue".parse().unwrap(); +/// let user_spec2: UserColorSpec = "match:bg:0xff,0x7f,0x00".parse().unwrap(); +/// +/// let spec1 = user_spec1.to_color_spec(); +/// let spec2 = user_spec2.to_color_spec(); +/// +/// assert_eq!(spec1.fg(), Some(&Color::Blue)); +/// assert_eq!(spec2.bg(), Some(&Color::Rgb(0xFF, 0x7F, 0x00))); +/// # } +/// ``` +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct UserColorSpec { + ty: OutType, + value: SpecValue, +} + +impl UserColorSpec { + /// Convert this user provided color specification to a specification that + /// can be used with `termcolor`. This drops the type of this specification + /// (where the type indicates where the color is applied in the standard + /// printer, e.g., to the file path or the line numbers, etc.). + pub fn to_color_spec(&self) -> ColorSpec { + let mut spec = ColorSpec::default(); + self.value.merge_into(&mut spec); + spec + } +} + +/// The actual value given by the specification. +#[derive(Clone, Debug, Eq, PartialEq)] +enum SpecValue { + None, + Fg(Color), + Bg(Color), + Style(Style), +} + +/// The set of configurable portions of ripgrep's output. +#[derive(Clone, Debug, Eq, PartialEq)] +enum OutType { + Path, + Line, + Column, + Match, +} + +/// The specification type. +#[derive(Clone, Debug, Eq, PartialEq)] +enum SpecType { + Fg, + Bg, + Style, + None, +} + +/// The set of available styles for use in the terminal. +#[derive(Clone, Debug, Eq, PartialEq)] +enum Style { + Bold, + NoBold, + Intense, + NoIntense, + Underline, + NoUnderline +} + +impl ColorSpecs { + /// Create color specifications from a list of user supplied + /// specifications. 
+ pub fn new(specs: &[UserColorSpec]) -> ColorSpecs { + let mut merged = ColorSpecs::default(); + for spec in specs { + match spec.ty { + OutType::Path => spec.merge_into(&mut merged.path), + OutType::Line => spec.merge_into(&mut merged.line), + OutType::Column => spec.merge_into(&mut merged.column), + OutType::Match => spec.merge_into(&mut merged.matched), + } + } + merged + } + + /// Return the color specification for coloring file paths. + pub fn path(&self) -> &ColorSpec { + &self.path + } + + /// Return the color specification for coloring line numbers. + pub fn line(&self) -> &ColorSpec { + &self.line + } + + /// Return the color specification for coloring column numbers. + pub fn column(&self) -> &ColorSpec { + &self.column + } + + /// Return the color specification for coloring matched text. + pub fn matched(&self) -> &ColorSpec { + &self.matched + } +} + +impl UserColorSpec { + /// Merge this spec into the given color specification. + fn merge_into(&self, cspec: &mut ColorSpec) { + self.value.merge_into(cspec); + } +} + +impl SpecValue { + /// Merge this spec value into the given color specification. 
+ fn merge_into(&self, cspec: &mut ColorSpec) { + match *self { + SpecValue::None => cspec.clear(), + SpecValue::Fg(ref color) => { cspec.set_fg(Some(color.clone())); } + SpecValue::Bg(ref color) => { cspec.set_bg(Some(color.clone())); } + SpecValue::Style(ref style) => { + match *style { + Style::Bold => { cspec.set_bold(true); } + Style::NoBold => { cspec.set_bold(false); } + Style::Intense => { cspec.set_intense(true); } + Style::NoIntense => { cspec.set_intense(false); } + Style::Underline => { cspec.set_underline(true); } + Style::NoUnderline => { cspec.set_underline(false); } + } + } + } + } +} + +impl FromStr for UserColorSpec { + type Err = ColorError; + + fn from_str(s: &str) -> Result<UserColorSpec, ColorError> { + let pieces: Vec<&str> = s.split(':').collect(); + if pieces.len() <= 1 || pieces.len() > 3 { + return Err(ColorError::InvalidFormat(s.to_string())); + } + let otype: OutType = pieces[0].parse()?; + match pieces[1].parse()? { + SpecType::None => { + Ok(UserColorSpec { + ty: otype, + value: SpecValue::None, + }) + } + SpecType::Style => { + if pieces.len() < 3 { + return Err(ColorError::InvalidFormat(s.to_string())); + } + let style: Style = pieces[2].parse()?; + Ok(UserColorSpec { ty: otype, value: SpecValue::Style(style) }) + } + SpecType::Fg => { + if pieces.len() < 3 { + return Err(ColorError::InvalidFormat(s.to_string())); + } + let color: Color = pieces[2] + .parse() + .map_err(ColorError::from_parse_error)?; + Ok(UserColorSpec { ty: otype, value: SpecValue::Fg(color) }) + } + SpecType::Bg => { + if pieces.len() < 3 { + return Err(ColorError::InvalidFormat(s.to_string())); + } + let color: Color = pieces[2] + .parse() + .map_err(ColorError::from_parse_error)?; + Ok(UserColorSpec { ty: otype, value: SpecValue::Bg(color) }) + } + } + } +} + +impl FromStr for OutType { + type Err = ColorError; + + fn from_str(s: &str) -> Result<OutType, ColorError> { + match &*s.to_lowercase() { + "path" => Ok(OutType::Path), + "line" => Ok(OutType::Line), + 
"column" => Ok(OutType::Column), + "match" => Ok(OutType::Match), + _ => Err(ColorError::UnrecognizedOutType(s.to_string())), + } + } +} + +impl FromStr for SpecType { + type Err = ColorError; + + fn from_str(s: &str) -> Result<SpecType, ColorError> { + match &*s.to_lowercase() { + "fg" => Ok(SpecType::Fg), + "bg" => Ok(SpecType::Bg), + "style" => Ok(SpecType::Style), + "none" => Ok(SpecType::None), + _ => Err(ColorError::UnrecognizedSpecType(s.to_string())), + } + } +} + +impl FromStr for Style { + type Err = ColorError; + + fn from_str(s: &str) -> Result<Style, ColorError> { + match &*s.to_lowercase() { + "bold" => Ok(Style::Bold), + "nobold" => Ok(Style::NoBold), + "intense" => Ok(Style::Intense), + "nointense" => Ok(Style::NoIntense), + "underline" => Ok(Style::Underline), + "nounderline" => Ok(Style::NoUnderline), + _ => Err(ColorError::UnrecognizedStyle(s.to_string())), + } + } +} diff --git a/grep-printer/src/counter.rs b/grep-printer/src/counter.rs new file mode 100644 index 00000000..c2faac83 --- /dev/null +++ b/grep-printer/src/counter.rs @@ -0,0 +1,90 @@ +use std::io::{self, Write}; + +use termcolor::{ColorSpec, WriteColor}; + +/// A writer that counts the number of bytes that have been successfully +/// written. +#[derive(Clone, Debug)] +pub struct CounterWriter<W> { + wtr: W, + count: u64, + total_count: u64, +} + +impl<W: Write> CounterWriter<W> { + pub fn new(wtr: W) -> CounterWriter<W> { + CounterWriter { wtr: wtr, count: 0, total_count: 0 } + } +} + +impl<W> CounterWriter<W> { + /// Returns the total number of bytes written since construction or the + /// last time `reset` was called. + pub fn count(&self) -> u64 { + self.count + } + + /// Returns the total number of bytes written since construction. + pub fn total_count(&self) -> u64 { + self.total_count + self.count + } + + /// Resets the number of bytes written to `0`. 
+ pub fn reset_count(&mut self) { + self.total_count += self.count; + self.count = 0; + } + + /// Clear resets all counting related state for this writer. + /// + /// After this call, the total count of bytes written to the underlying + /// writer is erased and reset. + #[allow(dead_code)] + pub fn clear(&mut self) { + self.count = 0; + self.total_count = 0; + } + + #[allow(dead_code)] + pub fn get_ref(&self) -> &W { + &self.wtr + } + + pub fn get_mut(&mut self) -> &mut W { + &mut self.wtr + } + + pub fn into_inner(self) -> W { + self.wtr + } +} + +impl<W: Write> Write for CounterWriter<W> { + fn write(&mut self, buf: &[u8]) -> Result<usize, io::Error> { + let n = self.wtr.write(buf)?; + self.count += n as u64; + Ok(n) + } + + fn flush(&mut self) -> Result<(), io::Error> { + self.wtr.flush() + } +} + +impl<W: WriteColor> WriteColor for CounterWriter<W> { + fn supports_color(&self) -> bool { + self.wtr.supports_color() + } + + fn set_color(&mut self, spec: &ColorSpec) -> io::Result<()> { + self.wtr.set_color(spec) + } + + fn reset(&mut self) -> io::Result<()> { + self.wtr.reset() + } + + fn is_synchronous(&self) -> bool { + self.wtr.is_synchronous() + } +} diff --git a/grep-printer/src/json.rs b/grep-printer/src/json.rs new file mode 100644 index 00000000..45d6d682 --- /dev/null +++ b/grep-printer/src/json.rs @@ -0,0 +1,921 @@ +use std::io::{self, Write}; +use std::path::Path; +use std::time::Instant; + +use grep_matcher::{Match, Matcher}; +use grep_searcher::{ + Searcher, + Sink, SinkError, SinkContext, SinkContextKind, SinkFinish, SinkMatch, +}; +use serde_json as json; + +use counter::CounterWriter; +use jsont; +use stats::Stats; + +/// The configuration for the JSON printer. +/// +/// This is manipulated by the JSONBuilder and then referenced by the actual +/// implementation. Once a printer is build, the configuration is frozen and +/// cannot changed. 
+#[derive(Debug, Clone)] +struct Config { + pretty: bool, + max_matches: Option<u64>, + always_begin_end: bool, +} + +impl Default for Config { + fn default() -> Config { + Config { + pretty: false, + max_matches: None, + always_begin_end: false, + } + } +} + +/// A builder for a JSON lines printer. +/// +/// The builder permits configuring how the printer behaves. The JSON printer +/// has fewer configuration options than the standard printer because it is +/// a structured format, and the printer always attempts to find the most +/// information possible. +/// +/// Some configuration options, such as whether line numbers are included or +/// whether contextual lines are shown, are drawn directly from the +/// `grep_searcher::Searcher`'s configuration. +/// +/// Once a `JSON` printer is built, its configuration cannot be changed. +#[derive(Clone, Debug)] +pub struct JSONBuilder { + config: Config, +} + +impl JSONBuilder { + /// Return a new builder for configuring the JSON printer. + pub fn new() -> JSONBuilder { + JSONBuilder { config: Config::default() } + } + + /// Create a JSON printer that writes results to the given writer. + pub fn build<W: io::Write>(&self, wtr: W) -> JSON<W> { + JSON { + config: self.config.clone(), + wtr: CounterWriter::new(wtr), + matches: vec![], + } + } + + /// Print JSON in a pretty printed format. + /// + /// Enabling this will no longer produce a "JSON lines" format, in that + /// each JSON object printed may span multiple lines. + /// + /// This is disabled by default. + pub fn pretty(&mut self, yes: bool) -> &mut JSONBuilder { + self.config.pretty = yes; + self + } + + /// Set the maximum amount of matches that are printed. + /// + /// If multi line search is enabled and a match spans multiple lines, then + /// that match is counted exactly once for the purposes of enforcing this + /// limit, regardless of how many lines it spans. 
+ pub fn max_matches(&mut self, limit: Option<u64>) -> &mut JSONBuilder { + self.config.max_matches = limit; + self + } + + /// When enabled, the `begin` and `end` messages are always emitted, even + /// when no match is found. + /// + /// When disabled, the `begin` and `end` messages are only shown if there + /// is at least one `match` or `context` message. + /// + /// This is disabled by default. + pub fn always_begin_end(&mut self, yes: bool) -> &mut JSONBuilder { + self.config.always_begin_end = yes; + self + } +} + +/// The JSON printer, which emits results in a JSON lines format. +/// +/// This type is generic over `W`, which represents any implementation of +/// the standard library `io::Write` trait. +/// +/// # Format +/// +/// This section describes the JSON format used by this printer. +/// +/// To skip the rigamarole, take a look at the +/// [example](#example) +/// at the end. +/// +/// ## Overview +/// +/// The format of this printer is the [JSON Lines](http://jsonlines.org/) +/// format. Specifically, this printer emits a sequence of messages, where +/// each message is encoded as a single JSON value on a single line. There are +/// four different types of messages (and this number may expand over time): +/// +/// * **begin** - A message that indicates a file is being searched. +/// * **end** - A message the indicates a file is done being searched. This +/// message also include summary statistics about the search. +/// * **match** - A message that indicates a match was found. This includes +/// the text and offsets of the match. +/// * **context** - A message that indicates a contextual line was found. +/// This includes the text of the line, along with any match information if +/// the search was inverted. +/// +/// Every message is encoded in the same envelope format, which includes a tag +/// indicating the message type along with an object for the payload: +/// +/// ```json +/// { +/// "type": "{begin|end|match|context}", +/// "data": { ... 
} +/// } +/// ``` +/// +/// The message itself is encoded in the envelope's `data` key. +/// +/// ## Text encoding +/// +/// Before describing each message format, we first must briefly discuss text +/// encoding, since it factors into every type of message. In particular, JSON +/// may only be encoded in UTF-8, UTF-16 or UTF-32. For the purposes of this +/// printer, we need only worry about UTF-8. The problem here is that searching +/// is not limited to UTF-8 exclusively, which in turn implies that matches +/// may be reported that contain invalid UTF-8. Moreover, this printer may +/// also print file paths, and the encoding of file paths is itself not +/// guarnateed to be valid UTF-8. Therefore, this printer must deal with the +/// presence of invalid UTF-8 somehow. The printer could silently ignore such +/// things completely, or even lossily transcode invalid UTF-8 to valid UTF-8 +/// by replacing all invalid sequences with the Unicode replacement character. +/// However, this would prevent consumers of this format from accessing the +/// original data in a non-lossy way. +/// +/// Therefore, this printer will emit valid UTF-8 encoded bytes as normal +/// JSON strings and otherwise base64 encode data that isn't valid UTF-8. To +/// communicate whether this process occurs or not, strings are keyed by the +/// name `text` where as arbitrary bytes are keyed by `bytes`. +/// +/// For example, when a path is included in a message, it is formatted like so, +/// if and only if the path is valid UTF-8: +/// +/// ```json +/// { +/// "path": { +/// "text": "/home/ubuntu/lib.rs" +/// } +/// } +/// ``` +/// +/// If instead our path was `/home/ubuntu/lib\xFF.rs`, where the `\xFF` byte +/// makes it invalid UTF-8, the path would instead be encoded like so: +/// +/// ```json +/// { +/// "path": { +/// "bytes": "L2hvbWUvdWJ1bnR1L2xpYv8ucnM=" +/// } +/// } +/// ``` +/// +/// This same representation is used for reporting matches as well. 
+/// +/// The printer guarantees that the `text` field is used whenever the +/// underlying bytes are valid UTF-8. +/// +/// ## Wire format +/// +/// This section documents the wire format emitted by this printer, starting +/// with the four types of messages. +/// +/// Each message has its own format, and is contained inside an envelope that +/// indicates the type of message. The envelope has these fields: +/// +/// * **type** - A string indicating the type of this message. It may be one +/// of four possible strings: `begin`, `end`, `match` or `context`. This +/// list may expand over time. +/// * **data** - The actual message data. The format of this field depends on +/// the value of `type`. The possible message formats are +/// [`begin`](#message-begin), +/// [`end`](#message-end), +/// [`match`](#message-match), +/// [`context`](#message-context). +/// +/// #### Message: **begin** +/// +/// This message indicates that a search has begun. It has these fields: +/// +/// * **path** - An +/// [arbitrary data object](#object-arbitrary-data) +/// representing the file path corresponding to the search, if one is +/// present. If no file path is available, then this field is `null`. +/// +/// #### Message: **end** +/// +/// This message indicates that a search has finished. It has these fields: +/// +/// * **path** - An +/// [arbitrary data object](#object-arbitrary-data) +/// representing the file path corresponding to the search, if one is +/// present. If no file path is available, then this field is `null`. +/// * **binary_offset** - The absolute offset in the data searched +/// corresponding to the place at which binary data was detected. If no +/// binary data was detected (or if binary detection was disabled), then this +/// field is `null`. +/// * **stats** - A [`stats` object](#object-stats) that contains summary +/// statistics for the previous search. +/// +/// #### Message: **match** +/// +/// This message indicates that a match has been found. 
A match generally +/// corresponds to a single line of text, although it may correspond to +/// multiple lines if the search can emit matches over multiple lines. It +/// has these fields: +/// +/// * **path** - An +/// [arbitrary data object](#object-arbitrary-data) +/// representing the file path corresponding to the search, if one is +/// present. If no file path is available, then this field is `null`. +/// * **lines** - An +/// [arbitrary data object](#object-arbitrary-data) +/// representing one or more lines contained in this match. +/// * **line_number** - If the searcher has been configured to report line +/// numbers, then this corresponds to the line number of the first line +/// in `lines`. If no line numbers are available, then this is `null`. +/// * **absolute_offset** - The absolute byte offset corresponding to the start +/// of `lines` in the data being searched. +/// * **submatches** - An array of [`submatch` objects](#object-submatch) +/// corresponding to matches in `lines`. The offsets included in each +/// `submatch` correspond to byte offsets into `lines`. (If `lines` is base64 +/// encoded, then the byte offsets correspond to the data after base64 +/// decoding.) The `submatch` objects are guaranteed to be sorted by their +/// starting offsets. Note that it is possible for this array to be empty, +/// for example, when searching reports inverted matches. +/// +/// #### Message: **context** +/// +/// This message indicates that a contextual line has been found. A contextual +/// line is a line that doesn't contain a match, but is generally adjacent to +/// a line that does contain a match. The precise way in which contextual lines +/// are reported is determined by the searcher. It has these fields, which are +/// exactly the same fields found in a [`match`](#message-match): +/// +/// * **path** - An +/// [arbitrary data object](#object-arbitrary-data) +/// representing the file path corresponding to the search, if one is +/// present. 
If no file path is available, then this field is `null`. +/// * **lines** - An +/// [arbitrary data object](#object-arbitrary-data) +/// representing one or more lines contained in this context. This includes +/// line terminators, if they're present. +/// * **line_number** - If the searcher has been configured to report line +/// numbers, then this corresponds to the line number of the first line +/// in `lines`. If no line numbers are available, then this is `null`. +/// * **absolute_ |