diff options
author | har7an <99636919+har7an@users.noreply.github.com> | 2022-09-09 13:21:03 +0000 |
---|---|---|
committer | GitHub <noreply@github.com> | 2022-09-09 13:21:03 +0000 |
commit | 99e2bef8c68bd166cf89e90c8ffe8c02272ab4d3 (patch) | |
tree | 1eb34d081e8d96c9f6d49f26beb8e0d97666aa6e /zellij-utils | |
parent | 3f43a057cb65e5790f8024fe0d961255f6699678 (diff) |
Feature: Better error handling/reporting (#1670)
* utils: re-export "anyhow" unconditionally
even for wasm targets.
* utils/errors: Share wasm-compatible code
and move everything that can't run in wasm into a separate submodule.
* utils: Share "errors" module unconditionally
The module is now structured such that all code incompatible with wasm
targets lives in its own submodule that isn't included when compiling
for wasm targets.
* utils/errors: Add "Help wanted" doc section
that informs the reader about the endeavour to improve error handling
throughout the zellij code base.
* plugins: Handle errors returned by `zellij_tile`
now that the panic calls have been removed.
* utils/errors: Extend `anyhow::Result` with traits
that allow for easy/concise logging of `anyhow::Result` types and
panicking the application when they are fatal or non-fatal.
* utils/errors: Fix doctest
* utils/errors: Add prelude
that applications can import to conveniently access the error handling
functionality part of the module. Re-exports some parts and macros from
anyhow and the `LoggableError` and `FatalError` traits.
* server/screen: Adopt error handling
and make all fallible functions from the public API return a `Result`.
Append error contexts in all functions that can come across error types
to allow tracing where an error originates and what lead there.
* server/lib: Catch errors from `screen`
and make them `fatal`. This will log the errors first, before unwrapping
on the error type and panicking the application.
* server/unit/screen: Fix unit tests
and unwrap on the `Result` types introduced from the new error handling.
* utils/errors: Track error source
in calls to `fatal`, so we keep track of the location where the panic
really originates. Otherwise, any call to `fatal` will show the code in
`errors` as source, which of course isn't true.
Also change the error formatting and do not call `to_log` for fatal
errors anymore, because the panic is already logged and contains much
more information.
* utils/errors: Update `FatalError` docs
* plugins: Undo accidental modifications
* utils/errors: Improve module docs
explain some error handling facilities and the motivation behind using
them.
* server/screen: Remove `Result` from Infallible
functions that are part of the public API.
Diffstat (limited to 'zellij-utils')
-rw-r--r-- | zellij-utils/src/errors.rs | 447 | ||||
-rw-r--r-- | zellij-utils/src/lib.rs | 9 |
2 files changed, 331 insertions, 125 deletions
diff --git a/zellij-utils/src/errors.rs b/zellij-utils/src/errors.rs index 07dfb89e0..a46427ed6 100644 --- a/zellij-utils/src/errors.rs +++ b/zellij-utils/src/errors.rs @@ -1,19 +1,106 @@ //! Error context system based on a thread-local representation of the call stack, itself based on //! the instructions that are sent between threads. - -use crate::channels::{SenderWithContext, ASYNCOPENCALLS, OPENCALLS}; +//! +//! # Help wanted +//! +//! As of writing this, zellij relies on `unwrap()` to catch errors (terminate execution) in many +//! functions, rather than returning a [`Result`] to propagate these errors further up. While we +//! don't consider `unwrap` to be a bad thing in general, it hides the underlying error and leaves +//! the user only with a stack trace to go on. Worse than this, it will crash the application. This +//! is particularly bad when the user is using long-running sessions to perform tasks. +//! +//! Hence, we would like to eliminate `unwrap()` statements from the code where possible, and apply +//! better error handling instead. This way, functions higher up in the call stack can react to +//! errors from underlying functions and either try to recover, or give some meaningful error +//! messages if recovery isn't possible. +//! +//! Since the zellij codebase is pretty big and growing rapidly, this endeavour will continue to be +//! pursued over time, as zellij develops. The idea is that modules or single files are converted +//! bit by bit, preferrably in small PRs that each target a specific module or file. **If you are +//! looking to contribute to zellij, this may be an ideal start for you!** This way you get to know +//! the codebase and get an idea which modules are used at which other places in the code. +//! +//! If you have an interest in this, don't hesitate to get in touch with us. +//! +//! +//! # Error handling facilities +//! +//! ## Displaying panic messages +//! +//! Panics are generally handled via the [`Panic`] error type and the +//! [`handle_panic`][`handle_panic`] panic handler function. The fancy formatting is performed by +//! the [`miette`] crate. +//! +//! +//! ## Propagating errors +//! +//! We use the [`anyhow`] crate to propagate errors up the call stack. At the moment, zellij +//! doesn't have custom error types, so we wrap whatever errors the underlying libraries give us, +//! if any. [`anyhow`] serves the purpose of providing [`context`][`context`] about where (i.e. +//! under which circumstances) an error happened. +//! +//! A critical requirement for propagating errors is that all functions involved must return the +//! [`Result`] type. This allows convenient error handling with the `?` operator. +//! +//! At some point you will likely stop propagating errors and decide what to do with the error. +//! Generally you can: +//! +//! 1. Try to recover from the error, or +//! 2. Report the error to the user and either +//! 1. Terminate program execution (See [`fatal`][`fatal`]), or +//! 2. Continue program execution (See [`non_fatal`][`non_fatal`]) +//! +//! +//! ## Handling errors +//! +//! Ideally, when the program encounters an error it will try to recover as best as it can. This +//! can mean falling back to some sane default if a specific value (e.g. an environment variable) +//! cannot be found. Note that this isn't always applicable. If in doubt, don't hesitate to ask. +//! +//! Recovery usually isn't an option if an operation has changed the internal state (i.e. the value +//! or content of specific variables) of objects in the code. In this case, if an error is +//! encountered, it is best to declare the program state corrupted and terminate the whole +//! application. This can be done by [`unwrap`]ing on the [`Result`] type. Always try to propagate +//! the error as best as you can and attach meaningful context before [`unwrap`]ing. This gives the +//! user an idea what went wrong and can also help developers in quickly identifying which parts of +//! the code to debug if necessary. +//! +//! When you encounter such a fatal error and cannot propagate it further up (e.g. because the +//! current function cannot be changed to return a [`Result`], or because it is the "root" function +//! of a program thread), use the [`fatal`][`fatal`] function to panic the application. It will +//! attach some small context to the error and finally [`unwrap`] it. Using this function over the +//! regular [`unwrap`] has the added benefit that other developers seeing this in the code know +//! that someone has previously spent some thought about error handling at this location. +//! +//! If you encounter a non-fatal error, use the [`non_fatal`][`non_fatal`] function to handle +//! it. Instead of [`panic`]ing the application, the error is written to the application log and +//! execution continues. Please use this sparingly, as an error usually calls for actions to be +//! taken rather than ignoring it. +//! +//! +//! [`handle_panic`]: not_wasm::handle_panic +//! [`context`]: anyhow::Context +//! [`fatal`]: FatalError::fatal +//! [`non_fatal`]: FatalError::non_fatal + +use anyhow::Context; use colored::*; use log::error; use serde::{Deserialize, Serialize}; use std::fmt::{Display, Error, Formatter}; -use std::panic::PanicInfo; -use miette::{Diagnostic, GraphicalReportHandler, GraphicalTheme, Report}; +use miette::Diagnostic; use thiserror::Error as ThisError; -/// The maximum amount of calls an [`ErrorContext`] will keep track -/// of in its stack representation. This is a per-thread maximum. -const MAX_THREAD_CALL_STACK: usize = 6; +/// Re-exports of common error-handling code. +pub mod prelude { + pub use super::FatalError; + pub use super::LoggableError; + pub use anyhow::anyhow; + pub use anyhow::bail; + pub use anyhow::Context; + pub use anyhow::Result; +} pub trait ErrorInstruction { fn error(err: String) -> Self; @@ -44,136 +131,106 @@ Also, if you want to see the backtrace, you can set the `RUST_BACKTRACE` environ } } -fn fmt_report(diag: Report) -> String { - let mut out = String::new(); - GraphicalReportHandler::new_themed(GraphicalTheme::unicode()) - .render_report(&mut out, diag.as_ref()) - .unwrap(); - out -} - -/// Custom panic handler/hook. Prints the [`ErrorContext`]. -pub fn handle_panic<T>(info: &PanicInfo<'_>, sender: &SenderWithContext<T>) -where - T: ErrorInstruction + Clone, -{ - use std::{process, thread}; - let thread = thread::current(); - let thread = thread.name().unwrap_or("unnamed"); - - let msg = match info.payload().downcast_ref::<&'static str>() { - Some(s) => Some(*s), - None => info.payload().downcast_ref::<String>().map(|s| &**s), - } - .unwrap_or("An unexpected error occurred!"); - - let err_ctx = OPENCALLS.with(|ctx| *ctx.borrow()); - - let mut report: Report = Panic(format!("\u{1b}[0;31m{}\u{1b}[0;0m", msg)).into(); - - let mut location_string = String::new(); - if let Some(location) = info.location() { - location_string = format!( - "At {}:{}:{}", - location.file(), - location.line(), - location.column() - ); - report = report.wrap_err(location_string.clone()); +/// Helper trait to easily log error types. +/// +/// The `print_error` function takes a closure which takes a `&str` and fares with it as necessary +/// to log the error to some usable location. For convenience, logging to stdout, stderr and +/// `log::error!` is already implemented. +/// +/// Note that the trait functions pass the error through unmodified, so they can be chained with +/// the usual handling of [`std::result::Result`] types. +pub trait LoggableError<T>: Sized { + /// Gives a formatted error message derived from `self` to the closure `fun` for + /// printing/logging as appropriate. + /// + /// # Examples + /// + /// ```should_panic + /// use anyhow; + /// use zellij_utils::errors::LoggableError; + /// + /// let my_err: anyhow::Result<&str> = Err(anyhow::anyhow!("Test error")); + /// my_err + /// .print_error(|msg| println!("{msg}")) + /// .unwrap(); + /// ``` + fn print_error<F: Fn(&str)>(self, fun: F) -> Self; + + /// Convenienve function, calls `print_error` with the closure `|msg| log::error!("{}", msg)`. + fn to_log(self) -> Self { + self.print_error(|msg| log::error!("{}", msg)) } - if !err_ctx.is_empty() { - report = report.wrap_err(format!("{}", err_ctx)); + /// Convenienve function, calls `print_error` with the closure `|msg| eprintln!("{}", msg)`. + fn to_stderr(self) -> Self { + self.print_error(|msg| eprintln!("{}", msg)) } - report = report.wrap_err(format!( - "Thread '\u{1b}[0;31m{}\u{1b}[0;0m' panicked.", - thread - )); - - error!( - "{}", - format!( - "Panic occured: - thread: {} - location: {} - message: {}", - thread, location_string, msg - ) - ); - - if thread == "main" { - // here we only show the first line because the backtrace is not readable otherwise - // a better solution would be to escape raw mode before we do this, but it's not trivial - // to get os_input here - println!("\u{1b}[2J{}", fmt_report(report)); - process::exit(1); - } else { - let _ = sender.send(T::error(fmt_report(report))); + /// Convenienve function, calls `print_error` with the closure `|msg| println!("{}", msg)`. + fn to_stdout(self) -> Self { + self.print_error(|msg| println!("{}", msg)) } } -pub fn get_current_ctx() -> ErrorContext { - ASYNCOPENCALLS - .try_with(|ctx| *ctx.borrow()) - .unwrap_or_else(|_| OPENCALLS.with(|ctx| *ctx.borrow())) -} - -/// A representation of the call stack. -#[derive(Clone, Copy, Serialize, Deserialize, Debug)] -pub struct ErrorContext { - calls: [ContextType; MAX_THREAD_CALL_STACK], -} - -impl ErrorContext { - /// Returns a new, blank [`ErrorContext`] containing only [`Empty`](ContextType::Empty) - /// calls. - pub fn new() -> Self { - Self { - calls: [ContextType::Empty; MAX_THREAD_CALL_STACK], - } - } - - /// Returns `true` if the calls has all [`Empty`](ContextType::Empty) calls. - pub fn is_empty(&self) -> bool { - self.calls.iter().all(|c| c == &ContextType::Empty) - } - - /// Adds a call to this [`ErrorContext`]'s call stack representation. - pub fn add_call(&mut self, call: ContextType) { - for ctx in &mut self.calls { - if let ContextType::Empty = ctx { - *ctx = call; - break; +impl<T> LoggableError<T> for anyhow::Result<T> { + fn print_error<F: Fn(&str)>(self, fun: F) -> Self { + if let Err(ref err) = self { + let mut msg = format!("ERROR: {}", err); + for cause in err.chain().skip(1) { + msg = format!("{msg}\nbecause: {cause}"); } + fun(&msg); } - self.update_thread_ctx() + self } +} - /// Updates the thread local [`ErrorContext`]. - pub fn update_thread_ctx(&self) { - ASYNCOPENCALLS - .try_with(|ctx| *ctx.borrow_mut() = *self) - .unwrap_or_else(|_| OPENCALLS.with(|ctx| *ctx.borrow_mut() = *self)); - } +/// Special trait to mark fatal/non-fatal errors. +/// +/// This works in tandem with `LoggableError` above and is meant to make reading code easier with +/// regard to whether an error is fatal or not (i.e. can be ignored, or at least doesn't make the +/// application crash). +/// +/// This essentially degrades any `std::result::Result<(), _>` to a simple `()`. +pub trait FatalError<T> { + /// Mark results as being non-fatal. + /// + /// If the result is an `Err` variant, this will [print the error to the log][`to_log`]. + /// Discards the result type afterwards. + /// + /// [`to_log`]: LoggableError::to_log + fn non_fatal(self); + + /// Mark results as being fatal. + /// + /// If the result is an `Err` variant, this will unwrap the error and panic the application. + /// If the result is an `Ok` variant, the inner value is unwrapped and returned instead. + /// + /// # Panics + /// + /// If the given result is an `Err` variant. + #[track_caller] + fn fatal(self) -> T; } -impl Default for ErrorContext { - fn default() -> Self { - Self::new() +/// Helper function to silence `#[warn(unused_must_use)]` cargo warnings. Used exclusively in +/// `FatalError::non_fatal`! +fn discard_result<T>(_arg: anyhow::Result<T>) {} + +impl<T> FatalError<T> for anyhow::Result<T> { + fn non_fatal(self) { + if self.is_err() { + discard_result(self.context("a non-fatal error occured").to_log()); + } } -} -impl Display for ErrorContext { - fn fmt(&self, f: &mut Formatter) -> Result<(), Error> { - writeln!(f, "Originating Thread(s)")?; - for (index, ctx) in self.calls.iter().enumerate() { - if *ctx == ContextType::Empty { - break; - } - writeln!(f, "\t\u{1b}[0;0m{}. {}", index + 1, ctx)?; + fn fatal(self) -> T { + if let Ok(val) = self { + val + } else { + self.context("a fatal error occured") + .expect("Program terminates") } - Ok(()) } } @@ -379,3 +436,151 @@ pub enum PtyWriteContext { Write, Exit, } + +#[cfg(not(target_family = "wasm"))] +pub use not_wasm::*; + +#[cfg(not(target_family = "wasm"))] +mod not_wasm { + use super::*; + use crate::channels::{SenderWithContext, ASYNCOPENCALLS, OPENCALLS}; + use miette::{GraphicalReportHandler, GraphicalTheme, Report}; + use std::panic::PanicInfo; + + /// The maximum amount of calls an [`ErrorContext`] will keep track + /// of in its stack representation. This is a per-thread maximum. + const MAX_THREAD_CALL_STACK: usize = 6; + + /// Custom panic handler/hook. Prints the [`ErrorContext`]. + pub fn handle_panic<T>(info: &PanicInfo<'_>, sender: &SenderWithContext<T>) + where + T: ErrorInstruction + Clone, + { + use std::{process, thread}; + let thread = thread::current(); + let thread = thread.name().unwrap_or("unnamed"); + + let msg = match info.payload().downcast_ref::<&'static str>() { + Some(s) => Some(*s), + None => info.payload().downcast_ref::<String>().map(|s| &**s), + } + .unwrap_or("An unexpected error occurred!"); + + let err_ctx = OPENCALLS.with(|ctx| *ctx.borrow()); + + let mut report: Report = Panic(format!("\u{1b}[0;31m{}\u{1b}[0;0m", msg)).into(); + + let mut location_string = String::new(); + if let Some(location) = info.location() { + location_string = format!( + "At {}:{}:{}", + location.file(), + location.line(), + location.column() + ); + report = report.wrap_err(location_string.clone()); + } + + if !err_ctx.is_empty() { + report = report.wrap_err(format!("{}", err_ctx)); + } + + report = report.wrap_err(format!( + "Thread '\u{1b}[0;31m{}\u{1b}[0;0m' panicked.", + thread + )); + + error!( + "{}", + format!( + "Panic occured: + thread: {} + location: {} + message: {}", + thread, location_string, msg + ) + ); + + if thread == "main" { + // here we only show the first line because the backtrace is not readable otherwise + // a better solution would be to escape raw mode before we do this, but it's not trivial + // to get os_input here + println!("\u{1b}[2J{}", fmt_report(report)); + process::exit(1); + } else { + let _ = sender.send(T::error(fmt_report(report))); + } + } + + pub fn get_current_ctx() -> ErrorContext { + ASYNCOPENCALLS + .try_with(|ctx| *ctx.borrow()) + .unwrap_or_else(|_| OPENCALLS.with(|ctx| *ctx.borrow())) + } + + fn fmt_report(diag: Report) -> String { + let mut out = String::new(); + GraphicalReportHandler::new_themed(GraphicalTheme::unicode()) + .render_report(&mut out, diag.as_ref()) + .unwrap(); + out + } + + /// A representation of the call stack. + #[derive(Clone, Copy, Serialize, Deserialize, Debug)] + pub struct ErrorContext { + calls: [ContextType; MAX_THREAD_CALL_STACK], + } + + impl ErrorContext { + /// Returns a new, blank [`ErrorContext`] containing only [`Empty`](ContextType::Empty) + /// calls. + pub fn new() -> Self { + Self { + calls: [ContextType::Empty; MAX_THREAD_CALL_STACK], + } + } + + /// Returns `true` if the calls has all [`Empty`](ContextType::Empty) calls. + pub fn is_empty(&self) -> bool { + self.calls.iter().all(|c| c == &ContextType::Empty) + } + + /// Adds a call to this [`ErrorContext`]'s call stack representation. + pub fn add_call(&mut self, call: ContextType) { + for ctx in &mut self.calls { + if let ContextType::Empty = ctx { + *ctx = call; + break; + } + } + self.update_thread_ctx() + } + + /// Updates the thread local [`ErrorContext`]. + pub fn update_thread_ctx(&self) { + ASYNCOPENCALLS + .try_with(|ctx| *ctx.borrow_mut() = *self) + .unwrap_or_else(|_| OPENCALLS.with(|ctx| *ctx.borrow_mut() = *self)); + } + } + + impl Default for ErrorContext { + fn default() -> Self { + Self::new() + } + } + + impl Display for ErrorContext { + fn fmt(&self, f: &mut Formatter) -> Result<(), Error> { + writeln!(f, "Originating Thread(s)")?; + for (index, ctx) in self.calls.iter().enumerate() { + if *ctx == ContextType::Empty { + break; + } + writeln!(f, "\t\u{1b}[0;0m{}. {}", index + 1, ctx)?; + } + Ok(()) + } + } +} diff --git a/zellij-utils/src/lib.rs b/zellij-utils/src/lib.rs index 3cff1f046..e871aeee3 100644 --- a/zellij-utils/src/lib.rs +++ b/zellij-utils/src/lib.rs @@ -2,6 +2,7 @@ pub mod cli; pub mod consts; pub mod data; pub mod envs; +pub mod errors; pub mod input; pub mod pane_size; pub mod position; @@ -12,14 +13,14 @@ pub mod shared; #[cfg(not(target_family = "wasm"))] pub mod channels; // Requires async_std #[cfg(not(target_family = "wasm"))] -pub mod errors; // Requires async_std (via channels) -#[cfg(not(target_family = "wasm"))] pub mod ipc; // Requires interprocess #[cfg(not(target_family = "wasm"))] pub mod logging; // Requires log4rs #[cfg(not(target_family = "wasm"))] pub use ::{ - anyhow, async_std, clap, interprocess, lazy_static, libc, nix, regex, serde, serde_yaml, - signal_hook, tempfile, termwiz, vte, + async_std, clap, interprocess, lazy_static, libc, nix, regex, serde, serde_yaml, signal_hook, + tempfile, termwiz, vte, }; + +pub use anyhow; |