author | Chris Akritidis <43294513+cakrit@users.noreply.github.com> | 2023-02-17 12:20:08 -0800 |
---|---|---|
committer | GitHub <noreply@github.com> | 2023-02-17 12:20:08 -0800 |
commit | 1413b5bac327e8f90229361fbd9005aa0e139fa9 | |
tree | a61dfb806f26187428dfdce85b6ae773b4aad483 | |
parent | 851ce5a184abd4f38377d826635848093f022f4f |
Reorg learn 021723 (#14556)
* Change titles of agent alert notifications
* Reintroduce netdata for iot
* Eliminate guides category, merge health config docs
* Rename setup to configuration
* Codacy fixes and move health config reference
62 files changed, 390 insertions, 1208 deletions
diff --git a/aclk/README.md b/aclk/README.md index f4149eef4c..fc39032552 100644 --- a/aclk/README.md +++ b/aclk/README.md @@ -6,7 +6,7 @@ custom_edit_url: "https://github.com/netdata/netdata/edit/master/aclk/README.md" sidebar_label: "Agent-Cloud link (ACLK)" learn_status: "Published" learn_topic_type: "Tasks" -learn_rel_path: "Setup" +learn_rel_path: "Configuration" --> # Agent-cloud link (ACLK) diff --git a/claim/README.md b/claim/README.md index 50591c9071..85c438210d 100644 --- a/claim/README.md +++ b/claim/README.md @@ -5,7 +5,7 @@ custom_edit_url: "https://github.com/netdata/netdata/edit/master/claim/README.md sidebar_label: "Connect Agent to Cloud" learn_status: "Published" learn_topic_type: "Tasks" -learn_rel_path: "Setup" +learn_rel_path: "Configuration" --> # Connect Agent to Cloud diff --git a/collectors/REFERENCE.md b/collectors/REFERENCE.md index 96c6057d35..f19533f213 100644 --- a/collectors/REFERENCE.md +++ b/collectors/REFERENCE.md @@ -4,7 +4,7 @@ custom_edit_url: "https://github.com/netdata/netdata/edit/master/collectors/REFE sidebar_label: "Collectors configuration" learn_status: "Published" learn_topic_type: "Tasks" -learn_rel_path: "Setup" +learn_rel_path: "Configuration" --> # Collectors configuration reference diff --git a/database/README.md b/database/README.md index c23e74ec4c..3d4a89467b 100644 --- a/database/README.md +++ b/database/README.md @@ -5,7 +5,7 @@ custom_edit_url: "https://github.com/netdata/netdata/edit/master/database/README sidebar_label: "Database" learn_status: "Published" learn_topic_type: "Tasks" -learn_rel_path: "Setup" +learn_rel_path: "Configuration" --> # Database diff --git a/docs/cloud/insights/anomaly-advisor.md b/docs/cloud/insights/anomaly-advisor.md index 0e99522c4b..adce593a4d 100644 --- a/docs/cloud/insights/anomaly-advisor.md +++ b/docs/cloud/insights/anomaly-advisor.md @@ -32,8 +32,6 @@ To enable ML on your Netdata Agent, you need to edit the `[ml]` section in your At a minimum you just need to set `enabled 
= yes` to enable ML with default params. More details about configuration can be found in the [Netdata Agent ML docs](https://learn.netdata.cloud/docs/agent/ml#configuration). -**Note**: Follow [this guide](https://github.com/netdata/netdata/blob/master/docs/guides/step-by-step/step-04.md) if you are unfamiliar with making configuration changes in Netdata. - When you have finished your configuration, restart Netdata with a command like `sudo systemctl restart netdata` for the config changes to take effect. You can find more info on restarting Netdata [here](https://github.com/netdata/netdata/blob/master/docs/configure/start-stop-restart.md). After a brief delay, you should see the number of `trained` dimensions start to increase on the "dimensions" chart of the "Anomaly Detection" menu on the Overview page. By default the `minimum num samples to train = 3600` parameter means at least 1 hour of data is required to train initial models, but you could set this to `900` if you want to train initial models quicker but on less data. Over time, they will retrain on up to `maximum num samples to train = 14400` (4 hours by default), but you could increase this is you wanted to train on more data. 
diff --git a/docs/configure/common-changes.md b/docs/configure/common-changes.md index 933525f9a3..4a2ca7c68a 100644 --- a/docs/configure/common-changes.md +++ b/docs/configure/common-changes.md @@ -5,7 +5,7 @@ custom_edit_url: "https://github.com/netdata/netdata/edit/master/docs/configure/ sidebar_label: "Common configuration changes" learn_status: "Published" learn_topic_type: "Tasks" -learn_rel_path: "Setup" +learn_rel_path: "Configuration" --> # Common configuration changes diff --git a/docs/configure/nodes.md b/docs/configure/nodes.md index 0b6ebe6ab3..9ed1c8966e 100644 --- a/docs/configure/nodes.md +++ b/docs/configure/nodes.md @@ -2,10 +2,10 @@ title: "Configure the Netdata Agent" description: "Netdata is zero-configuration for most users, but complex infrastructures may require you to tweak some of the Agent's granular settings." custom_edit_url: "https://github.com/netdata/netdata/edit/master/docs/configure/nodes.md" -sidebar_label: "Setup" +sidebar_label: "Configuration" learn_status: "Published" learn_topic_type: "Tasks" -learn_rel_path: "Setup" +learn_rel_path: "Configuration" sidebar_position: 30 --> diff --git a/docs/configure/secure-nodes.md b/docs/configure/secure-nodes.md index c3d6a33e80..5381fcc273 100644 --- a/docs/configure/secure-nodes.md +++ b/docs/configure/secure-nodes.md @@ -5,7 +5,7 @@ custom_edit_url: "https://github.com/netdata/netdata/edit/master/docs/configure/ sidebar_label: "Secure your nodes" learn_status: "Published" learn_topic_type: "Tasks" -learn_rel_path: "Setup" +learn_rel_path: "Configuration" --> # Secure your nodes diff --git a/docs/export/enable-connector.md b/docs/export/enable-connector.md index 8efd61bde7..02e380e15d 100644 --- a/docs/export/enable-connector.md +++ b/docs/export/enable-connector.md @@ -5,7 +5,7 @@ custom_edit_url: "https://github.com/netdata/netdata/edit/master/docs/export/ena sidebar_label: "Enable an exporting connector" learn_status: "Published" learn_topic_type: "Tasks" -learn_rel_path: "Setup" 
+learn_rel_path: "Configuration" --> # Enable an exporting connector diff --git a/docs/guides/collect-apache-nginx-web-logs.md b/docs/guides/collect-apache-nginx-web-logs.md index 87137332c2..dc3afc045d 100644 --- a/docs/guides/collect-apache-nginx-web-logs.md +++ b/docs/guides/collect-apache-nginx-web-logs.md @@ -4,7 +4,7 @@ sidebar_label: "Monitor Nginx or Apache web server log files with Netdata" custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/collect-apache-nginx-web-logs.md learn_status: "Published" learn_topic_type: "Tasks" -learn_rel_path: "Guides" +learn_rel_path: "Miscellaneous" --> # Monitor Nginx or Apache web server log files with Netdata @@ -120,12 +120,5 @@ You can also edit this file directly with `edit-config`: ./edit-config health.d/weblog.conf ``` -For more information about editing the defaults or writing new alarm entities, see our [health monitoring -documentation](https://github.com/netdata/netdata/blob/master/health/README.md). - -## What's next? - -Now that you have web log collection up and running, we recommend you take a look at the collector's [documentation](https://github.com/netdata/go.d.plugin/blob/master/modules/weblog/README.md) for some ideas of how you can turn these rather "boring" logs into powerful real-time tools for keeping your servers happy. - -Don't forget to give GitHub user [Wing924](https://github.com/Wing924) a big 👍 for his hard work in starting up the Go -refactoring effort. +For more information about editing the defaults or writing new alarm entities, see our +[health monitoring documentation](https://github.com/netdata/netdata/blob/master/health/README.md). 
diff --git a/docs/guides/collect-unbound-metrics.md b/docs/guides/collect-unbound-metrics.md index e033af5cb5..c5f4deb518 100644 --- a/docs/guides/collect-unbound-metrics.md +++ b/docs/guides/collect-unbound-metrics.md @@ -5,7 +5,7 @@ date: 2020-03-31 custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/collect-unbound-metrics.md learn_status: "Published" learn_topic_type: "Tasks" -learn_rel_path: "Guides" +learn_rel_path: "Miscellaneous" --> # Monitor Unbound DNS servers with Netdata diff --git a/docs/guides/configure/performance.md b/docs/guides/configure/performance.md index 820e8f5982..9782f9af68 100644 --- a/docs/guides/configure/performance.md +++ b/docs/guides/configure/performance.md @@ -6,7 +6,7 @@ image: /img/seo/guides/configure/performance.png custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/configure/performance.md learn_status: "Published" learn_topic_type: "Tasks" -learn_rel_path: "Guides" +learn_rel_path: "Configuration" --> # How to optimize the Netdata Agent's performance diff --git a/docs/guides/longer-metrics-storage.md b/docs/guides/longer-metrics-storage.md deleted file mode 100644 index bea3b8a51f..0000000000 --- a/docs/guides/longer-metrics-storage.md +++ /dev/null @@ -1,162 +0,0 @@ -<!-- -title: "Netdata Longer Metrics Retention" -sidebar_label: "Netdata Longer Metrics Retention" -description: "" -custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/longer-metrics-storage.md -learn_status: "Published" -learn_topic_type: "Tasks" -learn_rel_path: "Guides" ---> - -# Netdata Longer Metrics Retention - -Metrics retention affects 3 parameters on the operation of a Netdata Agent: - -1. The disk space required to store the metrics. -2. The memory the Netdata Agent will require to have that retention available for queries. -3. The CPU resources that will be required to query longer time-frames. 
- -As retention increases, the resources required to support that retention increase too. - -Since Netdata Agents usually run at the edge, inside production systems, Netdata Agent **parents** should be considered. When having a **parent - child** setup, the child (the Netdata Agent running on a production system) delegates all its functions, including longer metrics retention and querying, to the parent node that can dedicate more resources to this task. A single Netdata Agent parent can centralize multiple children Netdata Agents (dozens, hundreds, or even thousands depending on its available resources). - - -## Ephemerality of metrics - -The ephemerality of metrics plays an important role in retention. In environments where metrics stop being collected and new metrics are constantly being generated, we are interested about 2 parameters: - -1. The **expected concurrent number of metrics** as an average for the lifetime of the database. - This affects mainly the storage requirements. - -2. The **expected total number of unique metrics** for the lifetime of the database. - This affects mainly the memory requirements for having all these metrics indexed and available to be queried. - -## Granularity of metrics - -The granularity of metrics (the frequency they are collected and stored, i.e. their resolution) is significantly affecting retention. - -Lowering the granularity from per second to every two seconds, will double their retention and half the CPU requirements of the Netdata Agent, without affecting disk space or memory requirements. - -## Which database mode to use - -Netdata Agents support multiple database modes. - -The default mode `[db].mode = dbengine` has been designed to scale for longer retentions. - -The other available database modes are designed to minimize resource utilization and should usually be considered on **parent - child** setups at the children side. - -So, - -* On a single node setup, use `[db].mode = dbengine` to increase retention. 
-* On a **parent - child** setup, use `[db].mode = dbengine` on the parent to increase retention and a more resource efficient mode (like `save`, `ram` or `none`) for the child to minimize resources utilization. - -To use `dbengine`, set this in `netdata.conf` (it is the default): - -``` -[db] - mode = dbengine -``` - -## Tiering - -`dbengine` supports tiering. Tiering allows having up to 3 versions of the data: - -1. Tier 0 is the high resolution data. -2. Tier 1 is the first tier that samples data every 60 data collections of Tier 0. -3. Tier 2 is the second tier that samples data every 3600 data collections of Tier 0 (60 of Tier 1). - -To enable tiering set `[db].storage tiers` in `netdata.conf` (the default is 1, to enable only Tier 0): - -``` -[db] - mode = dbengine - storage tiers = 3 -``` - -## Disk space requirements - -Netdata Agents require about 1 bytes on disk per database point on Tier 0 and 4 times more on higher tiers (Tier 1 and 2). They require 4 times more storage per point compared to Tier 0, because for every point higher tiers store `min`, `max`, `sum`, `count` and `anomaly rate` (the values are 5, but they require 4 times the storage because `count` and `anomaly rate` are 16-bit integers). The `average` is calculated on the fly at query time using `sum / count`. - -### Tier 0 - per second for a week - -For 2000 metrics, collected every second and retained for a week, Tier 0 needs: 1 byte x 2000 metrics x 3600 secs per hour x 24 hours per day x 7 days per week = 1100MB. - -The setting to control this is in `netdata.conf`: - -``` -[db] - mode = dbengine - - # per second data collection - update every = 1 - - # enable only Tier 0 - storage tiers = 1 - - # Tier 0, per second data for a week - dbengine multihost disk space MB = 1100 -``` - -By setting it to `1100` and restarting the Netdata Agent, this node will start maintaining about a week of data. But pay attention to the number of metrics. 
If you have more than 2000 metrics on a node, or you need more that a week of high resolution metrics, you may need to adjust this setting accordingly. - -### Tier 1 - per minute for a month - -Tier 1 is by default sampling the data every 60 points of Tier 0. If Tier 0 is per second, then Tier 1 is per minute. - -Tier 1 needs 4 times more storage per point compared to Tier 0. So, for 2000 metrics, with per minute resolution, retained for a month, Tier 1 needs: 4 bytes x 2000 metrics x 60 minutes per hour x 24 hours per day x 30 days per month = 330MB. - -Do this in `netdata.conf`: - -``` -[db] - mode = dbengine - - # per second data collection - update every = 1 - - # enable only Tier 0 and Tier 1 - storage tiers = 2 - - # Tier 0, per second data for a week - dbengine multihost disk space MB = 1100 - - # Tier 1, per minute data for a month - dbengine tier 1 multihost disk space MB = 330 -``` - -Once `netdata.conf` is edited, the Netdata Agent needs to be restarted for the changes to take effect. - -### Tier 2 - per hour for a year - -Tier 2 is by default sampling data every 3600 points of Tier 0 (60 of Tier 1). If Tier 0 is per second, then Tier 2 is per hour. - -The storage requirements are the same to Tier 1. - -For 2000 metrics, with per hour resolution, retained for a year, Tier 2 needs: 4 bytes x 2000 metrics x 24 hours per day x 365 days per year = 67MB. - -Do this in `netdata.conf`: - -``` -[db] - mode = dbengine - - # per second data collection - update every = 1 - - # enable only Tier 0 and Tier 1 - storage tiers = 3 - - # Tier 0, per second data for a week - dbengine multihost disk space MB = 1100 - - # Tier 1, per minute data for a month - dbengine tier 1 multihost disk space MB = 330 - - # Tier 2, per hour data for a year - dbengine tier 2 multihost disk space MB = 67 -``` - -Once `netdata.conf` is edited, the Netdata Agent needs to be restarted for the changes to take effect. 
- - - diff --git a/docs/guides/monitor-cockroachdb.md b/docs/guides/monitor-cockroachdb.md index bc3c47f939..0d5397c0c4 100644 --- a/docs/guides/monitor-cockroachdb.md +++ b/docs/guides/monitor-cockroachdb.md @@ -4,7 +4,7 @@ sidebar_label: "Monitor CockroachDB metrics with Netdata" custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor-cockroachdb.md learn_status: "Published" learn_topic_type: "Tasks" -learn_rel_path: "Guides" +learn_rel_path: "Miscellaneous" --> # Monitor CockroachDB metrics with Netdata @@ -29,7 +29,6 @@ Let's dive in and walk through the process of monitoring CockroachDB metrics wit - [Configure the CockroachDB collector](#configure-the-cockroachdb-collector) - [Manual setup for a local CockroachDB database](#manual-setup-for-a-local-cockroachdb-database) - [Tweak CockroachDB alarms](#tweak-cockroachdb-alarms) - - [What's next?](#whats-next) ## Configure the CockroachDB collector @@ -117,23 +116,3 @@ cd /etc/netdata/ # Replace with your Netdata configuration directory, if not /et ``` For more information about editing the defaults or writing new alarm entities, see our documentation on [configuring health alarms](https://github.com/netdata/netdata/blob/master/docs/monitor/configure-alarms.md). - -## What's next? - -Now that you're collecting metrics from your CockroachDB databases, let us know how it's working for you! There's always -room for improvement or refinement based on real-world use cases. Feel free to [file an -issue](https://github.com/netdata/netdata/issues/new?assignees=&labels=bug%2Cneeds+triage&template=BUG_REPORT.yml) with -your -thoughts. 
- -Also, be sure to check out these useful resources: - -- [Netdata's CockroachDB documentation](https://github.com/netdata/go.d.plugin/blob/master/modules/cockroachdb/README.md) -- [Netdata's CockroachDB configuration](https://github.com/netdata/go.d.plugin/blob/master/config/go.d/cockroachdb.conf) -- [Netdata's CockroachDB alarms](https://github.com/netdata/netdata/blob/29d9b5e51603792ee27ef5a21f1de0ba8e130158/health/health.d/cockroachdb.conf) -- [CockroachDB homepage](https://www.cockroachlabs.com/product/) -- [CockroachDB documentation](https://www.cockroachlabs.com/docs/stable/) -- [`_status/vars` endpoint docs](https://www.cockroachlabs.com/docs/stable/monitoring-and-alerting.html#prometheus-endpoint) -- [Monitor CockroachDB with Prometheus](https://www.cockroachlabs.com/docs/stable/monitor-cockroachdb-with-prometheus.html) - - diff --git a/docs/guides/monitor-hadoop-cluster.md b/docs/guides/monitor-hadoop-cluster.md index 0fb3b7c407..91282b9559 100644 --- a/docs/guides/monitor-hadoop-cluster.md +++ b/docs/guides/monitor-hadoop-cluster.md @@ -4,7 +4,7 @@ sidebar_label: "Monitor a Hadoop cluster with Netdata" custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor-hadoop-cluster.md learn_status: "Published" learn_topic_type: "Tasks" -learn_rel_path: "Guides" +learn_rel_path: "Miscellaneous" --> # Monitor a Hadoop cluster with Netdata @@ -188,20 +188,5 @@ sudo /etc/netdata/edit-config health.d/hdfs.conf sudo /etc/netdata/edit-config health.d/zookeeper.conf ``` -For more information about editing the defaults or writing new alarm entities, see our [health monitoring -documentation](https://github.com/netdata/netdata/blob/master/health/README.md). - -## What's next? 
- -If you're having issues with Netdata auto-detecting your HDFS/Zookeeper servers, or want to help improve how Netdata -collects or presents metrics from these services, feel free to [file an -issue](https://github.com/netdata/netdata/issues/new?assignees=&labels=bug%2Cneeds+triage&template=BUG_REPORT.yml). - -- Read up on the [HDFS configuration - file](https://github.com/netdata/go.d.plugin/blob/master/config/go.d/hdfs.conf) to understand how to configure - global options or per-job options, such as username/password, TLS certificates, timeouts, and mo |