author thiagoftsm <thiagoftsm@gmail.com> 2024-01-17 13:41:52 +0000
committer GitHub <noreply@github.com> 2024-01-17 15:41:52 +0200
commit aae9a7ea6365aa596ef7e95721ed1d85ec337c3f
tree f9aad7a3d5de617af0fb7cb4e834440546b2039a
parent 9bb7a0e605b09afdf5f7b1f804a394d2f5aab974
Update replication documentation (#16778)
Co-authored-by: ilyam8 <ilya@netdata.cloud>
-rw-r--r-- streaming/README.md 45
-rw-r--r-- streaming/stream.conf 8
2 files changed, 41 insertions, 12 deletions
diff --git a/streaming/README.md b/streaming/README.md
index 153286e5fb..fae02cf715 100644
--- a/streaming/README.md
+++ b/streaming/README.md
@@ -52,8 +52,8 @@ node**. This file is automatically generated by Netdata the first time it is sta
|-----------------------------------------------|----------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `enabled` | `no` | Whether this API KEY enabled or disabled. |
| [`allow from`](#allow-from) | `*` | A space-separated list of [Netdata simple patterns](https://github.com/netdata/netdata/blob/master/libnetdata/simple_pattern/README.md) matching the IPs of nodes that will stream metrics using this API key. [Read more &rarr;](#allow-from) |
-| `default history` | `3600` | The default amount of child metrics history to retain when using the `ram` memory mode. |
-| [`default memory mode`](#default-memory-mode) | `ram` | The [database](https://github.com/netdata/netdata/blob/master/database/README.md) to use for all nodes using this `API_KEY`. Valid settings are `dbengine`, `ram`, or `none`. [Read more &rarr;](#default-memory-mode) |
+| `default history` | `3600` | The default amount of child metrics history to retain when using the `ram` memory mode. |
+| [`default memory mode`](#default-memory-mode) | `ram` | The [database](https://github.com/netdata/netdata/blob/master/database/README.md) to use for all nodes using this `API_KEY`. Valid settings are `dbengine`, `ram`, or `none`. [Read more &rarr;](#default-memory-mode) |
| `health enabled by default` | `auto` | Whether alerts and notifications should be enabled for nodes using this `API_KEY`. `auto` enables alerts when the child is connected. `yes` enables alerts always, and `no` disables alerts. |
| `default postpone alarms on connect seconds` | `60` | Postpone alerts and notifications for a period of time after the child connects. |
| `default health log history` | `432000` | History of health log events (in seconds) kept in the database. |
@@ -61,6 +61,11 @@ node**. This file is automatically generated by Netdata the first time it is sta
| `default proxy destination` | | Space-separated list of `IP:PORT` for proxies. |
| `default proxy api key` | | The `API_KEY` of the proxy. |
| `default send charts matching` | `*` | See [`send charts matching`](#send-charts-matching). |
+| `enable compression` | `yes` | Enable/disable stream compression. |
+| `enable replication` | `yes` | Enable/disable replication. |
+| `seconds to replicate` | `86400` | How many seconds of data to replicate from each child at a time |
+| `seconds per replication step` | `600` | The duration we want to replicate per each replication step. |
+| `is ephemeral node` | `no` | Indicate whether this child is an ephemeral node. An ephemeral node will become unavailable after the specified duration of "cleanup ephemeral hosts after secs" from the time of the node's last connection. |
#### `destination`
@@ -143,13 +148,13 @@ cache size` and `dbengine multihost disk space` settings in the `[global]` secti
### `netdata.conf`
-| Setting | Default | Description |
-|--------------------------------------------|-------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| `[global]` section | | |
+| Setting | Default | Description |
+|--------------------------------------------|-------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `[global]` section | | |
| `memory mode` | `dbengine` | Determines the [database type](https://github.com/netdata/netdata/blob/master/database/README.md) to be used on that node. Other options settings include `none`, and `ram`. `none` disables the database at this host. This also disables alerts and notifications, as those can't run without a database. |
-| `[web]` section | | |
-| `mode` | `static-threaded` | Determines the [web server](https://github.com/netdata/netdata/blob/master/web/server/README.md) type. The other option is `none`, which disables the dashboard, API, and registry. |
-| `accept a streaming request every seconds` | `0` | Set a limit on how often a parent node accepts streaming requests from child nodes. `0` equals no limit. If this is set, you may see `... too busy to accept new streaming request. Will be allowed in X secs` in Netdata's `error.log`. |
+| `[web]` section | | |
+| `mode` | `static-threaded` | Determines the [web server](https://github.com/netdata/netdata/blob/master/web/server/README.md) type. The other option is `none`, which disables the dashboard, API, and registry. |
+| `accept a streaming request every seconds` | `0` | Set a limit on how often a parent node accepts streaming requests from child nodes. `0` equals no limit. If this is set, you may see `... too busy to accept new streaming request. Will be allowed in X secs` in Netdata's `error.log`. |
### Basic use cases
@@ -459,6 +464,30 @@ In addition, edit `netdata.conf` on each child node to disable the database and
enabled = no
```
+## Replication
+
+Netdata streaming automatically replicates data from child nodes to parent nodes, ensuring that the parent node has a complete and up-to-date view of all metrics.
+This replication process ensures data continuity even if child nodes temporarily disconnect.
+
+Replication is enabled by default in Netdata, but you can customize the replication behavior by modifying the `[API_KEY]` section of the `stream.conf` file. Here's an example configuration:
+
+```conf
+[11111111-2222-3333-4444-555555555555]
+ # Enable replication for all hosts using this api key. Default: yes.
+ enable replication = yes
+
+ # How many seconds of data to replicate from each child at a time. Default: a day (86400 seconds).
+ seconds to replicate = 86400
+
+ # The duration we want to replicate per each replication step. Default: 600 seconds (10 minutes).
+ seconds per replication step = 600
+```
+
+You can monitor the replication process in two ways:
+
+1. **Netdata Monitoring**: access the Netdata Monitoring section and look for the Replication charts.
+2. **Streaming Function**: use the Streaming function (Top) to see the replication status of children nodes. This function provides real-time insights into the replication status of each child node.
+
## Troubleshooting
Both parent and child nodes log information at `/var/log/netdata/error.log`.
diff --git a/streaming/stream.conf b/streaming/stream.conf
index 36213af022..b8343e5aa0 100644
--- a/streaming/stream.conf
+++ b/streaming/stream.conf
@@ -181,12 +181,12 @@
#seconds to replicate = 86400
# The duration we want to replicate per each step.
- #replication_step = 600
+ #seconds per replication step = 600
# Indicate whether this child is an ephemeral node. An ephemeral node will become unavailable
# after the specified duration of "cleanup ephemeral hosts after secs" (as defined in the db section of netdata.conf)
# from the time of the node's last connection.
- #is ephemeral node = false
+ #is ephemeral node = no
# -----------------------------------------------------------------------------
# 3. PER SENDING HOST SETTINGS, ON PARENT NETDATA
@@ -257,9 +257,9 @@
#seconds to replicate = 86400
# The duration we want to replicate per each step.
- #replication_step = 600
+ #seconds per replication step = 600
# Indicate whether this child is an ephemeral node. An ephemeral node will become unavailable
# after the specified duration of "cleanup ephemeral hosts after secs" (as defined in the db section of netdata.conf)
# from the time of the node's last connection.
- #is ephemeral node = false
+ #is ephemeral node = no
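
Note on the replication settings documented in this commit: the sketch below reads the three replication options from a `stream.conf`-style section and estimates how many replication steps the configured window would take. It is not part of the patch and not Netdata code. The helper names `read_replication_settings` and `replication_steps` are hypothetical, the parser handles only the simple `key = value` layout shown in the README example, and the step count assumes each step covers at most `seconds per replication step` seconds of the `seconds to replicate` window, which the documentation implies but does not state.

```python
import math

# Defaults as listed in the README table added by this commit.
DEFAULTS = {
    "enable replication": "yes",
    "seconds to replicate": "86400",
    "seconds per replication step": "600",
}


def read_replication_settings(conf_text, section):
    """Return the replication settings found in one [section] of stream.conf-style text.

    Hypothetical helper: handles only plain `key = value` lines, not the full
    Netdata configuration syntax.
    """
    settings = dict(DEFAULTS)
    current = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments, including commented-out defaults
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip()
            continue
        if current == section and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            if key in settings:
                settings[key] = value
    return settings


def replication_steps(settings):
    """Estimate the steps needed to cover the window (assumption: one step per step duration)."""
    if settings["enable replication"] != "yes":
        return 0
    window = int(settings["seconds to replicate"])
    step = int(settings["seconds per replication step"])
    return math.ceil(window / step)


example = """
[11111111-2222-3333-4444-555555555555]
    enable replication = yes
    seconds to replicate = 86400
    seconds per replication step = 600
"""
settings = read_replication_settings(example, "11111111-2222-3333-4444-555555555555")
print(replication_steps(settings))  # 86400 / 600 = 144
```

With the documented defaults, a full day of data in 10-minute steps works out to 144 steps under that assumption. Lines that are still commented out, such as `#seconds per replication step = 600` in the shipped `stream.conf`, are skipped, so the defaults apply.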