author Ilya Mashchenko <ilya@netdata.cloud> 2023-08-15 20:56:24 +0300
committer GitHub <noreply@github.com> 2023-08-15 20:56:24 +0300
commit d5bdb7cf15b73ef4e761d31298eda9b7567bc8a8
tree 8c42bc12a3d491849933f1ac2d756129312d5ab4 /health
parent 4040a16ba2e68191b237b3501599bfa3c585f655
docs rename alarm to alert (#15812)
Diffstat (limited to 'health')
-rw-r--r-- health/README.md | 6
-rw-r--r-- health/REFERENCE.md | 429
-rw-r--r-- health/notifications/README.md | 20
-rw-r--r-- health/notifications/awssns/README.md | 30
-rw-r--r-- health/notifications/custom/README.md | 32
-rw-r--r-- health/notifications/dynatrace/README.md | 4
-rw-r--r-- health/notifications/email/README.md | 2
-rw-r--r-- health/notifications/flock/README.md | 2
-rw-r--r-- health/notifications/gotify/README.md | 2
-rw-r--r-- health/notifications/hangouts/README.md | 10
-rw-r--r-- health/notifications/irc/README.md | 10
-rw-r--r-- health/notifications/matrix/README.md | 2
-rw-r--r-- health/notifications/ntfy/README.md | 4
-rw-r--r-- health/notifications/opsgenie/README.md | 5
-rw-r--r-- health/notifications/rocketchat/README.md | 4
-rw-r--r-- health/notifications/slack/README.md | 4
-rw-r--r-- health/notifications/stackpulse/README.md | 34
17 files changed, 299 insertions, 301 deletions
diff --git a/health/README.md b/health/README.md
index 96f71f87a2..eec8ad06ff 100644
--- a/health/README.md
+++ b/health/README.md
@@ -2,10 +2,10 @@
The Netdata Agent is a health watchdog for the health and performance of your systems, services, and applications. We've
worked closely with our community of DevOps engineers, SREs, and developers to define hundreds of production-ready
-alarms that work without any configuration.
+alerts that work without any configuration.
-The Agent's health monitoring system is also dynamic and fully customizable. You can write entirely new alarms, tune the
-community-configured alarms for every app/service [the Agent collects metrics from](https://github.com/netdata/netdata/blob/master/collectors/COLLECTORS.md), or
+The Agent's health monitoring system is also dynamic and fully customizable. You can write entirely new alerts, tune the
+community-configured alerts for every app/service [the Agent collects metrics from](https://github.com/netdata/netdata/blob/master/collectors/COLLECTORS.md), or
silence anything you're not interested in. You can even power complex lookups by running statistical algorithms against
your metrics.
diff --git a/health/REFERENCE.md b/health/REFERENCE.md
index e5179b4e54..a451f671b4 100644
--- a/health/REFERENCE.md
+++ b/health/REFERENCE.md
@@ -1,15 +1,15 @@
# Configure alerts
-Netdata's health watchdog is highly configurable, with support for dynamic thresholds, hysteresis, alarm templates, and
-more. You can tweak any of the existing alarms based on your infrastructure's topology or specific monitoring needs, or
+Netdata's health watchdog is highly configurable, with support for dynamic thresholds, hysteresis, alert templates, and
+more. You can tweak any of the existing alerts based on your infrastructure's topology or specific monitoring needs, or
create new entities.
-You can use health alarms in conjunction with any of Netdata's [collectors](https://github.com/netdata/netdata/blob/master/collectors/README.md) (see
+You can use health alerts in conjunction with any of Netdata's [collectors](https://github.com/netdata/netdata/blob/master/collectors/README.md) (see
the [supported collector list](https://github.com/netdata/netdata/blob/master/collectors/COLLECTORS.md)) to monitor the health of your systems, containers, and
applications in real time.
-While you can see active alarms both on the local dashboard and Netdata Cloud, all health alarms are configured _per
-node_ via individual Netdata Agents. If you want to deploy a new alarm across your
+While you can see active alerts both on the local dashboard and Netdata Cloud, all health alerts are configured _per
+node_ via individual Netdata Agents. If you want to deploy a new alert across your
[infrastructure](https://github.com/netdata/netdata/blob/master/docs/quickstart/infrastructure.md), you must configure each node with the same health configuration
files.
@@ -55,7 +55,7 @@ template: 10min_cpu_usage
to: sysadmin
```
-To tune this alarm to trigger warning and critical alarms at a lower CPU utilization, change the `warn` and `crit` lines
+To tune this alert to trigger warning and critical alerts at a lower CPU utilization, change the `warn` and `crit` lines
to the values of your choosing. For example:
```yaml
@@ -79,7 +79,7 @@ In the `netdata.conf` `[health]` section, set `enabled` to `no`, and restart the
In the `netdata.conf` `[health]` section, set `enabled alarms` to a
[simple pattern](https://github.com/netdata/netdata/edit/master/libnetdata/simple_pattern/README.md) that
-excludes one or more alerts. e.g. `enabled alarms = !oom_kill *` will load all alarms except `oom_kill`.
+excludes one or more alerts. e.g. `enabled alarms = !oom_kill *` will load all alerts except `oom_kill`.
You can also [edit the file where the alert is defined](#edit-individual-alerts), comment out its definition,
and [reload Netdata's health configuration](#reload-health-configuration).
@@ -112,7 +112,7 @@ or restarting the agent.
## Write a new health entity
-While tuning existing alarms may work in some cases, you may need to write entirely new health entities based on how
+While tuning existing alerts may work in some cases, you may need to write entirely new health entities based on how
your systems, containers, and applications work.
Read the [health entity reference](#health-entity-reference) for a full listing of the format,
@@ -128,8 +128,8 @@ sudo touch health.d/ram-usage.conf
sudo ./edit-config health.d/ram-usage.conf
```
-For example, here is a health entity that triggers a warning alarm when a node's RAM usage rises above 80%, and a
-critical alarm above 90%:
+For example, here is a health entity that triggers a warning alert when a node's RAM usage rises above 80%, and a
+critical alert above 90%:
```yaml
alarm: ram_usage
@@ -151,7 +151,7 @@ Let's look into each of the lines to see how they create a working health entity
- `on`: Which chart the entity listens to.
-- `lookup`: Which metrics the alarm monitors, the duration of time to monitor, and how to process the metrics into a
+- `lookup`: Which metrics the alert monitors, the duration of time to monitor, and how to process the metrics into a
usable format.
- `average`: Calculate the average of all the metrics collected.
- `-1m`: Use metrics from 1 minute ago until now to calculate that average.
@@ -160,13 +160,13 @@ Let's look into each of the lines to see how they create a working health entity
- `units`: Use percentages rather than absolute units.
-- `every`: How often to perform the `lookup` calculation to decide whether or not to trigger this alarm.
+- `every`: How often to perform the `lookup` calculation to decide whether to trigger this alert.
-- `warn`/`crit`: The value at which Netdata should trigger a warning or critical alarm. This example uses simple
+- `warn`/`crit`: The value at which Netdata should trigger a warning or critical alert. This example uses simple
syntax, but most pre-configured health entities use
[hysteresis](#special-use-of-the-conditional-operator) to avoid superfluous notifications.
-- `info`: A description of the alarm, which will appear in the dashboard and notifications.
+- `info`: A description of the alert, which will appear in the dashboard and notifications.
In human-readable format:
@@ -174,8 +174,8 @@ In human-readable format:
> metrics from the **used** dimension and calculates the **average** of all those metrics in a **percentage** format,
> using a **% unit**. The entity performs this lookup **every minute**.
>
-> If the average RAM usage percentage over the last 1 minute is **more than 80%**, the entity triggers a warning alarm.
-> If the usage is **more than 90%**, the entity triggers a critical alarm.
+> If the average RAM usage percentage over the last 1 minute is **more than 80%**, the entity triggers a warning alert.
+> If the usage is **more than 90%**, the entity triggers a critical alert.
When you finish writing this new health entity, [reload Netdata's health configuration](#reload-health-configuration) to
see it live on the local dashboard or Netdata Cloud.
@@ -188,20 +188,20 @@ without restarting all of Netdata, run `netdatacli reload-health` or `killall -U
## Health entity reference
The following reference contains information about the syntax and options of _health entities_, which Netdata attaches
-to charts in order to trigger alarms.
+to charts in order to trigger alerts.
### Entity types
There are two entity types: **alarms** and **templates**. They have the same format and feature set—the only difference
is their label.
-**Alarms** are attached to specific charts and use the `alarm` label.
+**Alerts** are attached to specific charts and use the `alarm` label.
**Templates** define rules that apply to all charts of a specific context, and use the `template` label. Templates help
you apply one entity to all disks, all network interfaces, all MySQL databases, and so on.
-Alarms have higher precedence and will override templates. If an alarm and template entity have the same name and attach
-to the same chart, Netdata will use the alarm.
+Alerts have higher precedence and will override templates.
+If an `alarm` entity and a `template` entity have the same name and are attached to the same chart, Netdata will use the `alarm`.
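As a sketch of this precedence rule, assume a template and an alarm share a name. The entity name, the `net.net` context, the `net.eth0` chart, the `received` dimension, and both thresholds are hypothetical and only illustrate the layout:

```yaml
# Hypothetical entities: names, chart, dimension, and thresholds are illustrative only.
template: inbound_traffic_sketch
      on: net.net
  lookup: average -1m of received
   every: 1m
    warn: $this > 1000

# Same name, attached to one specific chart. On that chart, this entity
# is expected to be used instead of the template above.
   alarm: inbound_traffic_sketch
      on: net.eth0
  lookup: average -1m of received
   every: 1m
    warn: $this > 5000
```

Every other chart in the context would keep the template's threshold.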
### Entity format
@@ -219,39 +219,39 @@ Netdata parses the following lines. Beneath the table is an in-depth explanation
This comes in handy if your `info` line consists of several sentences.
| line | required | functionality |
-| --------------------------------------------------- | --------------- | ------------------------------------------------------------------------------------- |
-| [`alarm`/`template`](#alarm-line-alarm-or-template) | yes | Name of the alarm/template. |
-| [`on`](#alarm-line-on) | yes | The chart this alarm should attach to. |
-| [`class`](#alarm-line-class) | no | The general alarm classification. |
-| [`type`](#alarm-line-type) | no | What area of the system the alarm monitors. |
-| [`component`](#alarm-line-component) | no | Specific component of the type of the alarm. |
-| [`os`](#alarm-line-os) | no | Which operating systems to run this chart. |
-| [`hosts`](#alarm-line-hosts) | no | Which hostnames will run this alarm. |
-| [`plugin`](#alarm-line-plugin) | no | Restrict an alarm or template to only a certain plugin. |
-| [`module`](#alarm-line-module) | no | Restrict an alarm or template to only a certain module. |
-| [`charts`](#alarm-line-charts) | no | Restrict an alarm or template to only certain charts. |
-| [`families`](#alarm-line-families) | no | Restrict a template to only certain families. |
-| [`lookup`](#alarm-line-lookup) | yes | The database lookup to find and process metrics for the chart specified through `on`. |
-| [`calc`](#alarm-line-calc) | yes (see above) | A calculation to apply to the value found via `lookup` or another variable. |
-| [`every`](#alarm-line-every) | no | The frequency of the alarm. |
-| [`green`/`red`](#alarm-lines-green-and-red) | no | Set the green and red thresholds of a chart. |
-| [`warn`/`crit`](#alarm-lines-warn-and-crit) | yes (see above) | Expressions evaluating to true or false, and when true, will trigger the alarm. |
-| [`to`](#alarm-line-to) | no | A list of roles to send notifications to. |
-| [`exec`](#alarm-line-exec) | no | The script to execute when the alarm changes status. |
-| [`delay`](#alarm-line-delay) | no | Optional hysteresis settings to prevent floods of notifications. |
-| [`repeat`](#alarm-line-repeat) | no | The interval for sending notifications when an alarm is in WARNING or CRITICAL mode. |
-| [`options`](#alarm-line-options) | no | Add an option to not clear alarms. |
-| [`host labels`](#alarm-line-host-labels) | no | Restrict an alarm or template to a list of matching labels present on a host. |
-| [`chart labels`](#alarm-line-chart-labels) | no | Restrict an alarm or template to a list of matching labels present on a host. |
-| [`info`](#alarm-line-info) | no | A brief description of the alarm. |
+|-----------------------------------------------------|-----------------|---------------------------------------------------------------------------------------|
+| [`alarm`/`template`](#alert-line-alarm-or-template) | yes | Name of the alert/template. |
+| [`on`](#alert-line-on) | yes | The chart this alert should attach to. |
+| [`class`](#alert-line-class) | no | The general alert classification. |
+| [`type`](#alert-line-type) | no | What area of the system the alert monitors. |
+| [`component`](#alert-line-component) | no | Specific component of the type of the alert. |
+| [`os`](#alert-line-os) | no | Which operating systems this alert runs on. |
+| [`hosts`](#alert-line-hosts) | no | Which hostnames will run this alert. |
+| [`plugin`](#alert-line-plugin) | no | Restrict an alert or template to only a certain plugin. |
+| [`module`](#alert-line-module) | no | Restrict an alert or template to only a certain module. |
+| [`charts`](#alert-line-charts) | no | Restrict an alert or template to only certain charts. |
+| [`families`](#alert-line-families) | no | Restrict a template to only certain families. |
+| [`lookup`](#alert-line-lookup) | yes | The database lookup to find and process metrics for the chart specified through `on`. |
+| [`calc`](#alert-line-calc) | yes (see above) | A calculation to apply to the value found via `lookup` or another variable. |
+| [`every`](#alert-line-every) | no | The frequency of the alert. |
+| [`green`/`red`](#alert-lines-green-and-red) | no | Set the green and red thresholds of a chart. |
+| [`warn`/`crit`](#alert-lines-warn-and-crit) | yes (see above) | Expressions evaluating to true or false, and when true, will trigger the alert. |
+| [`to`](#alert-line-to) | no | A list of roles to send notifications to. |
+| [`exec`](#alert-line-exec) | no | The script to execute when the alert changes status. |
+| [`delay`](#alert-line-delay) | no | Optional hysteresis settings to prevent floods of notifications. |
+| [`repeat`](#alert-line-repeat) | no | The interval for sending notifications when an alert is in WARNING or CRITICAL mode. |
+| [`options`](#alert-line-options) | no | Add an option to not clear alerts. |
+| [`host labels`](#alert-line-host-labels) | no | Restrict an alert or template to a list of matching labels present on a host. |
+| [`chart labels`](#alert-line-chart-labels) | no | Restrict an alert or template to a list of matching labels present on a chart. |
+| [`info`](#alert-line-info) | no | A brief description of the alert. |
The `alarm` or `template` line must be the first line of any entity.
-#### Alarm line `alarm` or `template`
+#### Alert line `alarm` or `template`
-This line starts an alarm or template based on the [entity type](#entity-types) you're interested in creating.
+This line starts an alert or template based on the [entity type](#entity-types) you're interested in creating.
-**Alarm:**
+**Alert:**
```yaml
alarm: NAME
@@ -266,11 +266,11 @@ template: NAME
`NAME` can be any alpha character, with `.` (period) and `_` (underscore) as the only allowed symbols, but the names
cannot be `chart name`, `dimension name`, `family name`, or `chart variables names`.
-#### Alarm line `on`
+#### Alert line `on`
-This line defines the chart this alarm should attach to.
+This line defines the chart this alert should attach to.
-**Alarms:**
+**Alerts:**
```yaml
on: CHART
@@ -297,40 +297,40 @@ shows a disk I/O chart, the tooltip reads: `proc:/proc/diskstats, disk.io`.
You're interested in what comes after the comma: `disk.io`. That's the name of the chart's context.
-If you create a template using the `disk.io` context, it will apply an alarm to every disk available on your system.
+If you create a template using the `disk.io` context, it will apply an alert to every disk available on your system.
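A minimal sketch of such a template might look like the following. It assumes the `disk.io` context exposes a `reads` dimension; the entity name and the threshold are placeholders rather than recommended values:

```yaml
# Hypothetical template: the name, dimension, and threshold are assumptions.
template: disk_reads_sketch
      on: disk.io
  lookup: average -10m of reads
   every: 1m
    warn: $this > 100000
    info: Average disk reads over the last 10 minutes.
```

Because the entity uses `template` with a context rather than `alarm` with a chart, one definition covers every disk that produces a `disk.io` chart.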
-#### Alarm line `class`
+#### Alert line `class`
-This indicates the type of error (or general problem area) that the alarm or template applies to. For example, `Latency` can be used for alarms that trigger on latency issues on network interfaces, web servers, or database systems. Example:
+This indicates the type of error (or general problem area) that the alert or template applies to. For example, `Latency` can be used for alerts that trigger on latency issues on network interfaces, web servers, or database systems. Example:
```yaml
class: Latency
```
<details>
-<summary>Netdata's stock alarms use the following `class` attributes by default:</summary>
+<summary>Netdata's stock alerts use the following `class` attributes by default:</summary>
-| Class |
-| ----------------|
-| Errors |
-| Latency |
-| Utilization |
-| Workload |
+| Class |
+|-------------|
+| Errors |
+| Latency |
+| Utilization |
+| Workload |
</details>
-`class` will default to `Unknown` if the line is missing from the alarm configuration.
+`class` will default to `Unknown` if the line is missing from the alert configuration.
-#### Alarm line `type`
+#### Alert line `type`
-Type can be used to indicate the broader area of the system that the alarm applies to. For example, under the general `Database` type, you can group together alarms that operate on various database systems, like `MySQL`, `CockroachDB`, `CouchDB` etc. Example:
+Type can be used to indicate the broader area of the system that the alert applies to. For example, under the general `Database` type, you can group together alerts that operate on various database systems, like `MySQL`, `CockroachDB`, `CouchDB` etc. Example:
```yaml
type: Database
```
<details>
-<summary>Netdata's stock alarms use the following `type` attributes by default, but feel free to adjust for your own requirements.</summary>
+<summary>Netdata's stock alerts use the following `type` attributes by default, but feel free to adjust for your own requirements.</summary>
| Type | Description |
|-----------------|------------------------------------------------------------------------------------------------|
@@ -352,7 +352,7 @@ type: Database
| Power Supply | Alerts from power supply related services (e.g. apcupsd) |
| Search engine | Alerts for search services (e.g. elasticsearch) |
| Storage | Class for alerts dealing with storage services (storage devices typically live under `System`) |
-| System | General system alarms (e.g. cpu, network, etc.) |
+| System | General system alerts (e.g. cpu, network, etc.) |
| Virtual Machine | Virtual Machine software |
| Web Proxy | Web proxy software (e.g. squid) |
| Web Server | Web server software (e.g. Apache, nginx, etc.) |
@@ -360,11 +360,11 @@ type: Database
</details>
-If an alarm configuration is missing the `type` line, its value will default to `Unknown`.
+If an alert configuration is missing the `type` line, its value will default to `Unknown`.
-#### Alarm line `component`
+#### Alert line `component`
-Component can be used to narrow down what the previous `type` value specifies for each alarm or template. Continuing from the previous example, `component` might include `MySQL`, `CockroachDB`, `MongoDB`, all under the same `Database` type. Example:
+Component can be used to narrow down what the previous `type` value specifies for each alert or template. Continuing from the previous example, `component` might include `MySQL`, `CockroachDB`, `MongoDB`, all under the same `Database` type. Example:
```yaml
component: MySQL
@@ -372,9 +372,9 @@ component: MySQL
As with the `class` and `type` line, if `component` is missing from the configuration, its value will default to `Unknown`.
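Taken together, the three classification lines from the examples above could sit in one entity like this (a sketch that only combines the values already shown):

```yaml
# Classification lines only; the rest of the entity is omitted.
    class: Latency
     type: Database
component: MySQL
```

Reading from top to bottom, each line narrows the previous one: the kind of problem, the area of the system, and the specific software.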
-#### Alarm line `os`
+#### Alert line `os`
-The alarm or template will be used only if the operating system of the host matches this list specified in `os`. The
+The alert or template will be used only if the operating system of the host matches the list specified in `os`. The
value is a space-separated list.
The following example enables the entity on Linux, FreeBSD, and macOS, but no other operating systems.
@@ -383,9 +383,9 @@ The following example enables the entity on Linux, FreeBSD, and macOS, but no ot
os: linux freebsd macos
```
-#### Alarm line `hosts`
+#### Alert line `hosts`
-The alarm or template will be used only if the hostname of the host matches this space-separated list.
+The alert or template will be used only if the hostname of the host matches this space-separated list.
The following example will load on systems with the hostnames `server1` and `server2`, and any system with hostnames that
begin with `database`. It _will not load_ on the host `redis3`, but will load on any _other_ systems with hostnames that
@@ -395,47 +395,47 @@ begin with `redis`.
hosts: server1 server2 database* !redis3 redis*
```
-#### Alarm line `plugin`
+#### Alert line `plugin`
-The `plugin` line filters which plugin within the context this alarm should apply to. The value is a space-separated
+The `plugin` line filters which plugin within the context this alert should apply to. The value is a space-separated
list of [simple patterns](https://github.com/netdata/netdata/blob/master/libnetdata/simple_pattern/README.md). For example,
-you can create a filter for an alarm that applies specifically to `python.d.plugin`:
+you can create a filter for an alert that applies specifically to `python.d.plugin`:
```yaml
plugin: python.d.plugin
```
The `plugin` line is best used with other options like `module`. When used alone, the `plugin` line creates a very
-inclusive filter that is unlikely to be of much use in production. See [`module`](#alarm-line-module) for a
+inclusive filter that is unlikely to be of much use in production. See [`module`](#alert-line-module) for a
comprehensive example using both.
-#### Alarm line `module`
+#### Alert line `module`
-The `module` line filters which module within the context this alarm should apply to. The value is a space-separated
+The `module` line filters which module within the context this alert should apply to. The value is a space-separated
list of [simple patterns](https://github.com/netdata/netdata/blob/master/libnetdata/simple_pattern/README.md). For
-example, you can create an alarm that applies only on the `isc_dhcpd` module started by `python.d.plugin`:
+example, you can create an alert that applies only on the `isc_dhcpd` module started by `python.d.plugin`:
```yaml
plugin: python.d.plugin
module: isc_dhcpd
```
-#### Alarm line `charts`
+#### Alert line `charts`
-The `charts` line filters which chart this alarm should apply to. It is only available on entities using the
-[`template`](#alarm-line-alarm-or-template) line.
+The `charts` line filters which chart this alert should apply to. It is only available on entities using the
+[`template`](#alert-line-alarm-or-template) line.
The value is a space-separated list of [simple patterns](https://github.com/netdata/netdata/blob/master/libnetdata/simple_pattern/README.md). For
-example, a template that applies to `disk.svctm` (Average Service Time) context, but excludes the disk `sdb` from alarms:
+example, a template that applies to `disk.svctm` (Average Service Time) context, but excludes the disk `sdb` from alerts:
```yaml
-template: disk_svctm_alarm
+template: disk_svctm_alert
on: disk.svctm
charts: !*sdb* *
```
-#### Alarm line `families`
+#### Alert line `families`
-The `families` line, used only alongside templates, filters which families within the context this alarm should apply
+The `families` line, used only alongside templates, filters which families within the context this alert should apply
to.
The value is a space-separated list of simple patterns. See our [simple patterns docs](https://github.com/netdata/netdata/blob/master/libnetdata/simple_pattern/README.md) for
@@ -448,9 +448,9 @@ families: sda sdb
```
Please note that the use of the `families` filter is planned to be deprecated in upcoming Netdata releases.
-Please use [`chart labels`](#alarm-line-chart-labels) instead.
+Please use [`chart labels`](#alert-line-chart-labels) instead.
-#### Alarm line `lookup`
+#### Alert line `lookup`
This line makes a database lookup to find a value. The result of this lookup is available as `$this`.
@@ -485,17 +485,17 @@ The full [database query API](https://github.com/netdata/netdata/blob/master/web
`,` or `|` instead of spaces)_ and the `match-ids` and `match-names` options affect the searches
for dimensions.
-- `foreach DIMENSIONS` is optional and works only with [templates](#alarm-line-alarm-or-template), will always be the last parameter, and uses the same `,`/`|`
+- `foreach DIMENSIONS` is optional and works only with [templates](#alert-line-alarm-or-template), will always be the last parameter, and uses the same `,`/`|`
rules as the `of` parameter. Each dimension you specify in `foreach` will use the same rule
- to trigger an alarm. If you set both `of` and `foreach`, Netdata will ignore the `of` parameter
+ to trigger an alert. If you set both `of` and `foreach`, Netdata will ignore the `of` parameter
and replace it with one of the dimensions you gave to `foreach`. This option allows you to
- [use dimension templates to create dynamic alarms](#use-dimension-templates-to-create-dynamic-alarms).
+ [use dimension templates to create dynamic alerts](#use-dimension-templates-to-create-dynamic-alerts).
The result of the lookup will be available as `$this` and `$NAME` in expressions.
The timestamps of the timeframe evaluated by the database lookup are available as the variables
`$after` and `$before` (both are Unix timestamps).
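As an illustration of `foreach`, the sketch below assumes a `system.cpu` context with `system`, `user`, and `nice` dimensions. The entity name and thresholds are hypothetical:

```yaml
# Hypothetical template: dimension names and thresholds are assumptions.
template: cpu_dimensions_sketch
      on: system.cpu
  lookup: average -1m percentage foreach system,user,nice
   units: %
   every: 1m
    warn: $this > 50
    crit: $this > 80
```

Each dimension listed after `foreach` is evaluated against the same `warn` and `crit` expressions, so one definition stands in for three near-identical entities.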
-#### Alarm line `calc`
+#### Alert line `calc`
A `calc` is designed to apply some calculation to the values or variables available to the entity. The result of the
calculation will be made available as the `$this` variable, overwriting the value from your `lookup`, to use in warning
@@ -512,9 +512,9 @@ The `calc` line uses [expressions](#expressions) for its syntax.
calc: EXPRESSION
```
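For instance, a `calc` line can rescale the value returned by `lookup` before the thresholds are evaluated. This is only a sketch: the `used` dimension, the divisor, and the threshold are assumptions, and the divisor presumes the looked-up value is expressed in KiB:

```yaml
# Hypothetical lines: dimension, divisor, and threshold are illustrative.
lookup: average -1m of used
  calc: $this / 1024
  warn: $this > 500
```

After `calc` runs, `$this` in the `warn` line refers to the rescaled value, not the raw lookup result.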
-#### Alarm line `every`
+#### Alert line `every`
-Sets the update frequency of this alarm. This is the same to the `every DURATION` given
+Sets the update frequency of this alert. This is the same as the `every DURATION` given
in the `lookup` lines.
Format:
@@ -525,11 +525,11 @@ every: DURATION
`DURATION` accepts `s` for seconds, `m` for minutes, `h` for hours, and `d` for days.
-#### Alarm lines `green` and `red`
+#### Alert lines `green` and `red`
Set the green and red thresholds of a chart. Both are available as `$green` and `$red` in expressions. If multiple
-alarms define different thresholds, the ones defined by the first alarm will be used. These will eventually visualized
-on the dashboard, so only one set of them is allowed. If you need multiple sets of them in different alarms, use
+alerts define different thresholds, the ones defined by the first alert will be used. These will eventually be visualized
+on the dashboard, so only one set of them is allowed. If you need multiple sets of them in different alerts, use
absolute numbers instead of `$red` and `$green`.
Format:
@@ -539,9 +539,9 @@ green: NUMBER
red: NUMBER
```
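A short sketch of how the thresholds might then be referenced from the trigger expressions, with placeholder numbers:

```yaml
# Placeholder thresholds, reused through $green and $red.
green: 80
  red: 90
 warn: $this > $green
 crit: $this > $red
```

Keeping the numbers in `green` and `red` means the trigger expressions do not repeat them.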
-#### Alarm lines `warn` and `crit`
+#### Alert lines `warn` and `crit`
-Define the expression that triggers either a warning or critical alarm. These are optional, and should evaluate to
+Define the expression that triggers either a warning or critical alert. These are optional, and should evaluate to
either true or false (or zero/non-zero).
The format uses Netdata's [expressions syntax](#expressions).
@@ -551,9 +551,9 @@ warn: EXPRESSION
crit: EXPRESSION
```
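The hysteresis pattern used by many pre-configured entities can be sketched with the conditional operator. The thresholds are illustrative, and `$status`, `$WARNING`, and `$CRITICAL` are assumed here to be the variables Netdata exposes for the entity's current state:

```yaml
# Illustrative thresholds; $status, $WARNING, and $CRITICAL are assumed variables.
warn: $this > (($status >= $WARNING)  ? (75) : (85))
crit: $this > (($status == $CRITICAL) ? (85) : (95))
```

Under those assumptions, a warning is raised above 85 but is held until the value falls back to 75 or below, which avoids repeated notifications when a metric hovers around a single threshold.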
-#### Alarm line `to`
+#### Alert line `to`
-This will be the first parameter of the script to be executed when the alarm switches status. Its meaning is left up to
+This will be the first parameter passed to the script that is executed when the alert changes status. Its meaning is left up to
the `exec` script.
The default `exec` script, `alarm-notify.sh`, uses this field as a space separated list of roles, which are then
@@ -565,9 +565,9 @@ Format:
to: ROLE1 ROLE2 ROLE3 ...
```
-#### Alarm line `exec`
+#### Alert line `exec`
-The script that will be executed when the alarm changes status.
+Script to be executed when the alert status changes.
Format:
@@ -578,10 +578,10 @@ exec: SCRIPT
The default `SCRIPT` is Netdata's `alarm-notify.sh`, which supports all the notification methods Netdata supports,
including custom hooks.
-#### Alarm line `delay`
+#### Alert line `delay`
This is used to provide optional hysteresis settings for the notifications, to defend against notification floods. These
-settings do not affect the actual alarm - only the time the `exec` script is executed.
+settings do not affect the actual alert - only the time the `exec` script is executed.
Format:
@@ -589,45 +589,45 @@ Format:
delay: [[[up U] [down D] multiplier M] max X]
```
-- `up U` defines the delay to be applied to a notification for an alarm that raised its status
+- `up U` defines the delay to be applied to a notification for an alert that raised its status
(i.e. CLEAR to WARNING, CLEAR to CRITICAL, WARNING to CRITICAL). For example, `up 10s`, the
notification for this event will be sent 10 seconds after the actual event. This is used in
- hope the alarm will get back to its previous state within the duration given. The default `U`
+ hope the alert will get back to its previous state within the duration given. The default `U`
is zero.
-- `down D` defines the delay to be applied to a notification for an alarm that moves to lower
+- `down D` defines the delay to be applied to a notification for an alert that moves to lower
state (i.e. CRITICAL to WARNING, CRITICAL to CLEAR, WARNING to CLEAR). For example, `down 1m`
will delay the notification by 1 minute. This is used to prevent notifications for flapping
- alarms. The default `D` is zero.
+ alerts. The default `D` is zero.
-- `multiplier M` multiplies `U` and `D` when an alarm changes state, while a notification is
+- `multiplier M` multiplies `U` and `D` when an alert changes state, while a notification is
delayed. The default multiplier is `1.0`.
-- `max X` defines the maximum absolute notification delay an alarm may get. The default `X`
+- `max X` defines the maximum absolute notification delay an alert may get. The default `X`
is `max(U * M, D * M)` (i.e. the max duration of `U` or `D` multiplied once with `M`).
Example:
`delay: up 10s down 15m multiplier 2 max 1h`
- The time is `00:00:00` and the status of the alarm is CLEAR.
+ The time is `00:00:00` and the status of the alert is CLEAR.
| time of event | new status | delay | notification will be sent | why |
- | ------------- | ---------- | --- | ------------------------- | --- |
+ |---------------|------------|---------------------|---------------------------|-------------------------------------------------------------------------------|
| 00:00:01 | WARNING | `up 10s` | 00:00:11 | first state switch |
- | 00:00:05 | CLEAR | `down 15m x2` | 00:30:05 | the alarm changes state while a notification is delayed, so it was multiplied |
+ | 00:00:05 | CLEAR | `down 15m x2` | 00:30:05 | the alert changes state while a notification is delayed, so it was multiplied |
| 00:00:06 | WARNING | `up 10s x2 x2` | 00:00:26 | multiplied twice |
| 00:00:07 | CLEAR | `down 15m x2 x2 x2` | 00:45:07 | multiplied 3 times. |
So:
- - `U` and `D` are multiplied by `M` every time the alarm changes state (any state, not just
+ - `U` and `D` are multiplied by `M` every time the alert changes state (any state, not just
their matching one) and a delay is in place.
- - All are reset to their defaults when the alarm switches state without a delay in place.
+ - All are reset to their defaults when the alert switches state without a delay in place.
-#### Alarm line `repeat`
+#### Alert line `repeat`
-Defines the interval between repeating notifications for the alarms in CRITICAL or WARNING mode. This will override the
+Defines the interval between repeating notifications for the alerts in CRITICAL or WARNING mode. This will override the
default interval settings inherited from health settings in `netdata.conf`. The default settings for repeating
notifications are `default repeat warning = DURATION` and `default repeat critical = DURATION`, which can be found in
the stock health configuration. When one of these intervals is greater than 0, Netdata will activate the repeat notification
@@ -639,14 +639,14 @@ Format:
repeat: [off] [warning DURATION] [critical DURATION]
```
-- `off`: Turns off the repeating feature for the current alarm. This is effective when the default repeat settings has
+- `off`: Tur