author: Joel Hans <joel@netdata.cloud> 2020-06-04 09:05:25 -0700
committer: GitHub <noreply@github.com> 2020-06-04 09:05:25 -0700
commit: fecbb89d0c33e2bbe84aa14c0b3204cb60134218
tree: 40b7581657e2bf13c0f72646c7f5137b4770172e /health
parent: 78ca668e50d88670f8aaf4d2434e325f705f975c
Move/refactor docs to accomodate new Guides section on Learn (#9266)
* Move directories and change verbiage to guide
* Move health guides
* Quick fix to collectors quickstart
* Fix broken links
* Remove health/tutorials dir
* Fix links in collectors quickstart
* Fix links to go.d pages
Diffstat (limited to 'health')
-rw-r--r-- health/QUICKSTART.md                          |   4
-rw-r--r-- health/README.md                              |  28
-rw-r--r-- health/REFERENCE.md                           |   2
-rw-r--r-- health/tutorials/dimension-templates.md       | 178
-rw-r--r-- health/tutorials/stop-notifications-alarms.md |  94
5 files changed, 9 insertions, 297 deletions
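The link changes in this commit follow a small set of path moves. The sketch below writes out the moves visible in the changed lines of this diff as a lookup table. It is an illustration only, not the tooling used to produce this commit. The names `OLD_TO_NEW` and `rewrite_links` are hypothetical, and treating `/health/tutorials/` and `/docs/step-by-step/` as whole-prefix moves is an assumption based on the few links shown here.

```python
# Hypothetical helper, not part of the Netdata repository: rewrite link
# targets using only the path moves that appear in this diff.

# Old link fragment -> new link fragment, copied from the changed lines.
OLD_TO_NEW = {
    "/health/tutorials/": "/docs/guides/monitor/",
    "/docs/step-by-step/": "/docs/guides/step-by-step/",
    "/docs/tutorials/using-host-labels.md": "/docs/guides/using-host-labels.md",
    "/health/README.md#tutorials": "/health/README.md#guides",
}


def rewrite_links(text: str) -> str:
    """Replace each old link fragment in `text` with its new location."""
    # Apply longer fragments first so a specific rule wins over a shorter one.
    for old in sorted(OLD_TO_NEW, key=len, reverse=True):
        text = text.replace(old, OLD_TO_NEW[old])
    return text
```

Running `rewrite_links` over a file's contents leaves already-updated links untouched, because none of the new paths contain an old fragment.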
diff --git a/health/QUICKSTART.md b/health/QUICKSTART.md
index 57baa7ea04..e9418a46a6 100644
--- a/health/QUICKSTART.md
+++ b/health/QUICKSTART.md
@@ -37,7 +37,7 @@ cd /etc/netdata/ # Replace with your Netdata configuration directory, if not /et
> You may need to use `sudo` or another method of elevating your privileges: `sudo ./edit-config health.d/cpu.conf`.
>
> You can also use the `$EDITOR` environment variable to use your preferred terminal editor with `edit-config`. See
-> [this page](/docs/step-by-step/step-04.md#use-edit-config-to-open-netdataconf) for details.
+> [this page](/docs/guides/step-by-step/step-04.md#use-edit-config-to-open-netdataconf) for details.
Each health configuration file contains one or more health entities, which always begin with an `alarm:` or `template:`
line. You can edit these entities based on your needs. To make any changes live, be sure to [reload your health
@@ -140,7 +140,7 @@ killall -USR2 netdata
To learn about all of Netdata's health configuration options, view the [reference guide](/health/REFERENCE.md).
-Or, get guided insights into specific health configurations with our [health tutorials](/health/README.md#tutorials).
+Or, get guided insights into specific health configurations with our [health guides](/health/README.md#guides).
Finally, move on to Netdata's [notification system](/health/notifications/README.md) to learn more about how Netdata can
let you know when the health of your systems or apps goes awry.
diff --git a/health/README.md b/health/README.md
index b3d26e31e0..aa84426a8c 100644
--- a/health/README.md
+++ b/health/README.md
@@ -17,35 +17,19 @@ operates like a health watchdog for your services and applications.
You can even run statistical algorithms against the metrics you've collected to power complex lookups. Your imagination,
and the needs of your infrastructure, are your only limits.
-**Take the next steps with health monitoring:**
+[Quickstart](/health/QUICKSTART.md)
-<DocsSteps>
+[Configuration reference](/health/REFERENCE.md)
-[<FiPlay /> Quickstart](/health/QUICKSTART.md)
-
-[<FiCode /> Configuration reference](/health/REFERENCE.md)
-
-[<FiBook /> Tutorials](#tutorials)
-
-</DocsSteps>
-
-## Tutorials
+## Guides
Every infrastructure is different, so we're not interested in mandating how you should configure Netdata's health
-monitoring features. Instead, these tutorials should give you the details you need to tweak alarms to your heart's
+monitoring features. Instead, these guides should give you the details you need to tweak alarms to your heart's
content.
-<DocsTutorials>
-<div>
-
-### Health entities
-
-[Stopping notifications for individual alarms](/health/tutorials/stop-notifications-alarms.md)
-
-[Use dimension templates to create dynamic alarms](/health/tutorials/dimension-templates.md)
+[Stopping notifications for individual alarms](/docs/guides/monitor/stop-notifications-alarms.md)
-</div>
-</DocsTutorials>
+[Use dimension templates to create dynamic alarms](/docs/guides/monitor/dimension-templates.md)
## Related features
diff --git a/health/REFERENCE.md b/health/REFERENCE.md
index 8d38bf2d99..8150c8b9ad 100644
--- a/health/REFERENCE.md
+++ b/health/REFERENCE.md
@@ -383,7 +383,7 @@ good idea to tell Netdata to not clear the notification, by using the `no-clear-
#### Alarm line `host labels`
-Defines the list of labels present on a host. See our [host labels tutorial](/docs/tutorials/using-host-labels.md) for
+Defines the list of labels present on a host. See our [host labels guide](/docs/guides/using-host-labels.md) for
an explanation of host labels and how to implement them.
For example, let's suppose that `netdata.conf` is configured with the following labels:
diff --git a/health/tutorials/dimension-templates.md b/health/tutorials/dimension-templates.md
deleted file mode 100644
index e22796a9ca..0000000000
--- a/health/tutorials/dimension-templates.md
+++ /dev/null
@@ -1,178 +0,0 @@
-<!--
----
-title: "Use dimension templates to create dynamic alarms"
-custom_edit_url: https://github.com/netdata/netdata/edit/master/health/tutorials/dimension-templates.md
----
--->
-
-# Use dimension templates to create dynamic alarms
-
-Your ability to monitor the health of your systems and applications relies on your ability to create and maintain
-the best set of alarms for your particular needs.
-
-In v1.18 of Netdata, we introduced **dimension templates** for alarms, which simplifies the process of writing [alarm
-entities](../REFERENCE.md#health-entity-reference) for charts with many dimensions.
-
-Dimension templates can condense many individual entities into one—no more copy-pasting one entity and changing the
-`alarm`/`template` and `lookup` lines for each dimension you'd like to monitor.
-
-They are, however, an advanced health monitoring feature. For more basic instructions on creating your first alarm,
-check out our [health monitoring documentation](../README.md), which also includes
-[examples](../REFERENCE.md#example-alarms).
-
-## The fundamentals of `foreach`
-
-Our dimension templates update creates a new `foreach` parameter to the existing [`lookup`
-line](../REFERENCE.md#alarm-line-lookup). This is where the magic happens.
-
-You use the `foreach` parameter to specify which dimensions you want to monitor with this single alarm. You can separate
-them with a comma (`,`) or a pipe (`|`). You can also use a [Netdata simple pattern](../../libnetdata/simple_pattern/README.md)
-to create many alarms with a regex-like syntax.
-
-The `foreach` parameter _has_ to be the last parameter in your `lookup` line, and if you have both `of` and `foreach` in
-the same `lookup` line, Netdata will ignore the `of` parameter and use `foreach` instead.
-
-Let's get into some examples so you can see how the new parameter works.
-
-> ⚠️ The following entities are examples to showcase the functionality and syntax of dimension templates. They are not
-> meant to be run as-is on production systems.
-
-## Condensing entities with `foreach`
-
-Let's say you want to monitor the `system`, `user`, and `nice` dimensions in your system's overall CPU utilization.
-Before dimension templates, you would need the following three entities:
-
-```yaml
- alarm: cpu_system
- on: system.cpu
-lookup: average -10m percentage of system
- every: 1m
- warn: $this > 50
- crit: $this > 80
-
- alarm: cpu_user
- on: system.cpu
-lookup: average -10m percentage of user
- every: 1m
- warn: $this > 50
- crit: $this > 80
-
- alarm: cpu_nice
- on: system.cpu
-lookup: average -10m percentage of nice
- every: 1m
- warn: $this > 50
- crit: $this > 80
-```
-
-With dimension templates, you can condense these into a single alarm. Take note of the `alarm` and `lookup` lines.
-
-```yaml
- alarm: cpu_template
- on: system.cpu
-lookup: average -10m percentage foreach system,user,nice
- every: 1m
- warn: $this > 50
- crit: $this > 80
-```
-
-The `alarm` line specifies the naming scheme Netdata will use. You can use whatever naming scheme you'd like, with `.`
-and `_` being the only allowed symbols.
-
-The `lookup` line has changed from `of` to `foreach`, and we're now passing three dimensions.
-
-In this example, Netdata will create three alarms with the names `cpu_template_system`, `cpu_template_user`, and
-`cpu_template_nice`. Every minute, each alarm will use the same database query to calculate the average CPU usage for
-the `system`, `user`, and `nice` dimensions over the last 10 minutes and send out alarms if necessary.
-
-You can find these three alarms active by clicking on the **Alarms** button in the top navigation, and then clicking on
-the **All** tab and scrolling to the **system - cpu** collapsible section.
-
-![Three new alarms created from the dimension template](https://user-images.githubusercontent.com/1153921/66218994-29523800-e67f-11e9-9bcb-9bca23e2c554.png)
-
-Let's look at some other examples of how `foreach` works so you can best apply it in your configurations.
-
-### Using a Netdata simple pattern in `foreach`
-
-In the last example, we used `foreach system,user,nice` to create three distinct alarms using dimension templates. But
-what if you want to quickly create alarms for _all_ the dimensions of a given chart?
-
-Use a [simple pattern](../../libnetdata/simple_pattern/README.md)! One example of a simple pattern is a single wildcard
-(`*`).
-
-Instead of monitoring system CPU usage, let's monitor per-application CPU usage using the `apps.cpu` chart. Passing a
-wildcard as the simple pattern tells Netdata to create a separate alarm for _every_ process on your system:
-
-```yaml
- alarm: app_cpu
- on: apps.cpu
-lookup: average -10m percentage foreach *
- every: 1m
- warn: $this > 50
- crit: $this > 80
-```
-
-This entity will now create alarms for every dimension in the `apps.cpu` chart. Given that most `apps.cpu` charts have
-10 or more dimensions, using the wildcard ensures you catch every CPU-hogging process.
-
-To learn more about how to use simple patterns with dimension templates, see our [simple patterns
-documentation](../../libnetdata/simple_pattern/README.md).
-
-## Using `foreach` with alarm templates
-
-Dimension templates also work with [alarm templates](../REFERENCE.md#alarm-line-alarm-or-template). Alarm
-templates help you create alarms for all the charts with a given context—for example, all the cores of your system's
-CPU.
-
-By combining the two, you can create dozens of individual alarms with a single template entity. Here's how you would
-create alarms for the `system`, `user`, and `nice` dimensions for every chart in the `cpu.cpu` context—or, in other
-words, every CPU core.
-
-```yaml
-template: cpu_template
- on: cpu.cpu
- lookup: average -10m percentage foreach system,user,nice
- every: 1m
- warn: $this > 50
- crit: $this > 80
-```
-
-On a system with a 6-core, 12-thread Ryzen 5 1600 CPU, this one entity creates alarms on the following charts and
-dimensions:
-
-- `cpu.cpu0`
- - `cpu_template_user`
- - `cpu_template_system`
- - `cpu_template_nice`
-- `cpu.cpu1`
- - `cpu_template_user`
- - `cpu_template_system`
- - `cpu_template_nice`
-- `cpu.cpu2`
- - `cpu_template_user`
- - `cpu_template_system`
- - `cpu_template_nice`
-- ...
-- `cpu.cpu11`
- - `cpu_template_user`
- - `cpu_template_system`
- - `cpu_template_nice`
-
-And how just a few of those dimension template-generated alarms look like in the Netdata dashboard.
-
-![A few of the created alarms in the Netdata dashboard](https://user-images.githubusercontent.com/1153921/66219669-708cf880-e680-11e9-8b3a-7bfe178fa28b.png)
-
-All in all, this single entity creates 36 individual alarms. Much easier than writing 36 separate entities in your
-health configuration files!
-
-## What's next?
-
-We hope you're excited about the possibilities of using dimension templates! Maybe they'll inspire you to build new
-alarms that will help you better monitor the health of your systems.
-
-Or, at the very least, simplify your configuration files.
-
-For information about other advanced features in Netdata's health monitoring toolkit, check out our [health
-documentation](../../health/). And if you have some cool alarms you built using dimension templates,
-
-[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fhealth%2Ftutorials%2dimension-templates&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>)
diff --git a/health/tutorials/stop-notifications-alarms.md b/health/tutorials/stop-notifications-alarms.md
deleted file mode 100644
index 59ea3b9edc..0000000000
--- a/health/tutorials/stop-notifications-alarms.md
+++ /dev/null
@@ -1,94 +0,0 @@
-<!--
----
-title: "Stop notifications for individual alarms"
-custom_edit_url: https://github.com/netdata/netdata/edit/master/health/tutorials/stop-notifications-alarms.md
----
--->
-
-# Stop notifications for individual alarms
-
-In this short tutorial, you'll learn how to stop notifications for individual alarms in Netdata's health
-monitoring system. We also refer to this process as _silencing_ the alarm.
-
-Why silence alarms? We designed Netdata's pre-configured alarms for production systems, so they might not be
-relevant if you run Netdata on your laptop or a small virtual server. If they're not helpful, they can be a distraction
-to real issues with health and performance.
-
-Silencing individual alarms is an excellent solution for situations where you're not interested in seeing a specific
-alarm but don't want to disable a [notification system](../notifications/README.md) entirely.
-
-## Find the alarm configuration file
-
-To silence an alarm, you need to know where to find its configuration file.
-
-Let's use the `system.cpu` chart as an example. It's the first chart you'll see on most Netdata dashboards.
-
-To figure out which file you need to edit, open up Netdata's dashboard and, click the **Alarms** button at the top
-of the dashboard, followed by clicking on the **All** tab.
-
-In this example, we're looking for the `system - cpu` entity, which, when opened, looks like this:
-
-![The system - cpu alarm
-entity](https://user-images.githubusercontent.com/1153921/67034648-ebb4cc80-f0cc-11e9-9d49-1023629924f5.png)
-
-In the `source` row, you see that this chart is getting its configuration from
-`4@/usr/lib/netdata/conf.d/health.d/cpu.conf`. The relevant part of begins at `health.d`: `health.d/cpu.conf`. That's
-the file you need to edit if you want to silence this alarm.
-
-For more information about editing or referencing health configuration files on your system, see the [health
-quickstart](../QUICKSTART.md#edit-health-configuration-files).
-
-## Edit the file to enable silencing
-
-To edit `health.d/cpu.conf`, use `edit-config` from inside of your Netdata configuration directory.
-
-```bash
-cd /etc/netdata/ # Replace with your Netdata configuration directory, if not /etc/netdata/
-./edit-config health.d/cpu.conf
-```
-
-> You may need to use `sudo` or another method of elevating your privileges.
-
-The beginning of the file looks like this:
-
-```yaml
-template: 10min_cpu_usage
- on: system.cpu
- os: linux
- hosts: *
- lookup: average -10m unaligned of user,system,softirq,irq,guest
- units: %
- every: 1m
- warn: $this > (($status >= $WARNING) ? (75) : (85))
- crit: $this > (($status == $CRITICAL) ? (85) : (95))
- delay: down 15m multiplier 1.5 max 1h
- info: average cpu utilization for the last 10 minutes (excluding iowait, nice and steal)
- to: sysadmin
-```
-
-To silence this alarm, change `sysadmin` to `silent`.
-
-```yaml
- to: silent
-```
-
-Use `killall -USR2 netdata` to reload your health configuration and ensure you get no more notifications about that
-alarm.
-
-You can add `to: silence` to any alarm you'd rather not bother you with notifications.
-
-## What's next?
-
-You should now know the fundamentals behind silencing any individual alarm in Netdata.
-
-To learn about _all_ of Netdata's health configuration possibilities, visit the [health reference
-guide](../REFERENCE.md), or check out other [tutorials on health monitoring](../README.md#tutorials).
-
-Or, take better control over how you get notified about alarms via the [notification
-system](../notifications/README.md).
-
-You can also use Netdata's [Health Management API](../../web/api/health/README.md#health-management-api) to control
-health checks and notifications while Netdata runs. With this API, you can disable health checks during a maintenance
-window or backup process, for example.
-
-[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fhealth%2Ftutorials%2Fstop-notifications-alarms%2F&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>)
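
The dimension-templates guide removed in this diff says that an entity named `cpu_template` with `foreach system,user,nice` produces alarms named `cpu_template_system`, `cpu_template_user`, and `cpu_template_nice`. The sketch below models only that naming rule. It is not Netdata's implementation: `expand_foreach` is a hypothetical helper, and it handles only explicit comma- or pipe-separated lists, not simple patterns such as `*`.

```python
import re
from typing import List


def expand_foreach(alarm_name: str, lookup: str) -> List[str]:
    """Return the alarm names a `lookup` line would yield under the guide's rule."""
    # The guide requires `foreach` to be the last parameter on the line,
    # so everything after the keyword is treated as the dimension list.
    match = re.search(r"\bforeach\s+(.+)$", lookup.strip())
    if match is None:
        # No dimension template: the entity stays a single alarm.
        return [alarm_name]
    # Dimensions may be separated by a comma or a pipe.
    dims = [d.strip() for d in re.split(r"[,|]", match.group(1)) if d.strip()]
    return [f"{alarm_name}_{dim}" for dim in dims]
```

Applied to the guide's template example, three dimensions on each of the twelve `cpu.cpu0` through `cpu.cpu11` charts gives 12 × 3 = 36 alarms, which matches the count the guide reports.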