diff options
author | Costa Tsaousis <costa@netdata.cloud> | 2022-06-28 17:04:37 +0300 |
---|---|---|
committer | GitHub <noreply@github.com> | 2022-06-28 17:04:37 +0300 |
commit | c3dfbe52a61dd0d1995bc420b0e0576cf058fd74 (patch) | |
tree | 193bfe3de88bff1a8effb9dd062a96beda8d16c6 /daemon | |
parent | e86cec2631c961b434031e2e09597701a9ec53f8 (diff) |
netdata doubles (#13217)
* netdata doubles
* fix cmocka test
* fix cmocka test again
* fix left-overs of long double to NETDATA_DOUBLE
* RRDDIM detached from disk representation; db settings in [db] section of netdata.conf
* update the memory before saving
* rrdset is now detached from file structures too
* on memory mode map, update the memory mapped structures on every iteration
* allow RRD_ID_LENGTH_MAX to be changed
* granularity secs, back to update every
* fix formatting
* more formatting
Diffstat (limited to 'daemon')
-rw-r--r-- | daemon/config/README.md | 78 | ||||
-rw-r--r-- | daemon/global_statistics.c | 2 | ||||
-rw-r--r-- | daemon/main.c | 135 | ||||
-rw-r--r-- | daemon/unit_test.c | 251 |
4 files changed, 279 insertions, 187 deletions
diff --git a/daemon/config/README.md b/daemon/config/README.md index 72f688543a..a44d31e28c 100644 --- a/daemon/config/README.md +++ b/daemon/config/README.md @@ -26,20 +26,21 @@ the [web server access lists](/web/server/README.md#access-lists). `netdata.conf` has sections stated with `[section]`. You will see the following sections: 1. `[global]` to [configure](#global-section-options) the [Netdata daemon](/daemon/README.md). -2. `[directories]` to [configure](#directories-section-options) the directories used by Netdata. -3. `[logs]` to [configure](#logs-section-options) the Netdata logging. -4. `[environment variables]` to [configure](#environment-variables-section-options) the environment variables used +2. `[db]` to [configure](#db-section-options) the database of Netdata. +3. `[directories]` to [configure](#directories-section-options) the directories used by Netdata. +4. `[logs]` to [configure](#logs-section-options) the Netdata logging. +5. `[environment variables]` to [configure](#environment-variables-section-options) the environment variables used Netdata. -5. `[sqlite]` to [configure](#sqlite-section-options) the [Netdata daemon](/daemon/README.md) SQLite settings. -6. `[ml]` to configure settings for [machine learning](/ml/README.md). -7. `[health]` to [configure](#health-section-options) general settings for [health monitoring](/health/README.md). -8. `[web]` to [configure the web server](/web/server/README.md). -9. `[registry]` for the [Netdata registry](/registry/README.md). -10. `[global statistics]` for the [Netdata registry](/registry/README.md). -11. `[statsd]` for the general settings of the [stats.d.plugin](/collectors/statsd.plugin/README.md). -12. `[plugins]` to [configure](#plugins-section-options) which [collectors](/collectors/README.md) to use and PATH +6. `[sqlite]` to [configure](#sqlite-section-options) the [Netdata daemon](/daemon/README.md) SQLite settings. +7. `[ml]` to configure settings for [machine learning](/ml/README.md). +8. `[health]` to [configure](#health-section-options) general settings for [health monitoring](/health/README.md). +9. `[web]` to [configure the web server](/web/server/README.md). +10. `[registry]` for the [Netdata registry](/registry/README.md). +11. `[global statistics]` for the [Netdata registry](/registry/README.md). +12. `[statsd]` for the general settings of the [stats.d.plugin](/collectors/statsd.plugin/README.md). +13. `[plugins]` to [configure](#plugins-section-options) which [collectors](/collectors/README.md) to use and PATH settings. -13. `[plugin:NAME]` sections for each collector plugin, under the +14. `[plugin:NAME]` sections for each collector plugin, under the comment [Per plugin configuration](#per-plugin-configuration). The configuration file is a `name = value` dictionary. Netdata will not complain if you set options unknown to it. When @@ -67,30 +68,35 @@ Please note that your data history will be lost if you have modified `history` p ### [global] section options -| setting | default | info | -|:-------------------------------------:|:------------------------------------------------------------------:|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| process scheduling policy | `keep` | See [Netdata process scheduling policy](/daemon/README.md#netdata-process-scheduling-policy) | -| OOM score | `0` | | -| glibc malloc arena max for plugins | `1` | See [Virtual memory](/daemon/README.md#virtual-memory). | -| glibc malloc arena max for Netdata | `1` | See [Virtual memory](/daemon/README.md#virtual-memory). | -| hostname | auto-detected | The hostname of the computer running Netdata. | -| history | `3996` | Used with `memory mode = save/map/ram/alloc`, not the default `memory mode = dbengine`. This number reflects the number of entries the `netdata` daemon will by default keep in memory for each chart dimension. This setting can also be configured per chart. Check [Memory Requirements](/database/README.md) for more information. | -| update every | `1` | The frequency in seconds, for data collection. For more information see the [performance guide](/docs/guides/configure/performance.md). | -| memory mode | `dbengine` | `dbengine`: The default for long-term metrics storage with efficient RAM and disk usage. Can be extended with `page cache size` and `dbengine disk space`. <br />`save`: Netdata will save its round robin database on exit and load it on startup. <br />`map`: Cache files will be updated in real-time. Not ideal for systems with high load or slow disks (check `man mmap`). <br />`ram`: The round-robin database will be temporary and it will be lost when Netdata exits. <br />`none`: Disables the database at this host, and disables health monitoring entirely, as that requires a database of metrics. | -| page cache size | 32 | Determines the amount of RAM in MiB that is dedicated to caching Netdata metric values. | -| dbengine disk space | 256 | Determines the amount of disk space in MiB that is dedicated to storing Netdata metric values and all related metadata describing them. | -| dbengine multihost disk space | 256 | Same functionality as `dbengine disk space`, but includes support for storing metrics streamed to a parent node by its children. Can be used in single-node environments as well. | -| host access prefix | | This is used in docker environments where /proc, /sys, etc have to be accessed via another path. You may also have to set SYS_PTRACE capability on the docker for this work. Check [issue 43](https://github.com/netdata/netdata/issues/43). | -| memory deduplication (ksm) | `yes` | When set to `yes`, Netdata will offer its in-memory round robin database to kernel same page merging (KSM) for deduplication. For more information check [Memory Deduplication - Kernel Same Page Merging - KSM](/database/README.md#ksm) | -| timezone | auto-detected | The timezone retrieved from the environment variable | -| run as user | `netdata` | The user Netdata will run as. | -| pthread stack size | auto-detected | | -| cleanup obsolete charts after seconds | `3600` | See [monitoring ephemeral containers](/collectors/cgroups.plugin/README.md#monitoring-ephemeral-containers), also sets the timeout for cleaning up obsolete dimensions | -| gap when lost iterations above | `1` | | -| cleanup orphan hosts after seconds | `3600` | How long to wait until automatically removing from the DB a remote Netdata host (child) that is no longer sending data. | -| delete obsolete charts files | `yes` | See [monitoring ephemeral containers](/collectors/cgroups.plugin/README.md#monitoring-ephemeral-containers), also affects the deletion of files for obsolete dimensions | -| delete orphan hosts files | `yes` | Set to `no` to disable non-responsive host removal. | -| enable zero metrics | `no` | Set to `yes` to show charts when all their metrics are zero. | +| setting | default | info | +|:-------------------------------------:|:-------------:|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| process scheduling policy | `keep` | See [Netdata process scheduling policy](/daemon/README.md#netdata-process-scheduling-policy) | +| OOM score | `0` | | +| glibc malloc arena max for plugins | `1` | See [Virtual memory](/daemon/README.md#virtual-memory). | +| glibc malloc arena max for Netdata | `1` | See [Virtual memory](/daemon/README.md#virtual-memory). | +| hostname | auto-detected | The hostname of the computer running Netdata. | +| host access prefix | empty | This is used in docker environments where /proc, /sys, etc have to be accessed via another path. You may also have to set SYS_PTRACE capability on the docker for this work. Check [issue 43](https://github.com/netdata/netdata/issues/43). | +| timezone | auto-detected | The timezone retrieved from the environment variable | +| run as user | `netdata` | The user Netdata will run as. | +| pthread stack size | auto-detected | | + +### [db] section options + +| setting | default | info | +|:----------------------------------:|:----------:|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| mode | `dbengine` | `dbengine`: The default for long-term metrics storage with efficient RAM and disk usage. Can be extended with `page cache size MB` and `dbengine disk space MB`. <br />`save`: Netdata will save its round robin database on exit and load it on startup. <br />`map`: Cache files will be updated in real-time. Not ideal for systems with high load or slow disks (check `man mmap`). <br />`ram`: The round-robin database will be temporary and it will be lost when Netdata exits. <br />`none`: Disables the database at this host, and disables health monitoring entirely, as that requires a database of metrics. | +| retention | `3600` | Used with `mode = save/map/ram/alloc`, not the default `mode = dbengine`. This number reflects the number of entries the `netdata` daemon will by default keep in memory for each chart dimension. Check [Memory Requirements](/database/README.md) for more information. | +| update every | `1` | The frequency in seconds, for data collection. For more information see the [performance guide](/docs/guides/configure/performance.md). | +| page cache size MB | 32 | Determines the amount of RAM in MiB that is dedicated to caching Netdata metric values. | +| dbengine disk space MB | 256 | Determines the amount of disk space in MiB that is dedicated to storing Netdata metric values and all related metadata describing them. | +| dbengine multihost disk space MB | 256 | Same functionality as `dbengine disk space MB`, but includes support for storing metrics streamed to a parent node by its children. Can be used in single-node environments as well. | +| memory deduplication (ksm) | `yes` | When set to `yes`, Netdata will offer its in-memory round robin database and the dbengine page cache to kernel same page merging (KSM) for deduplication. For more information check [Memory Deduplication - Kernel Same Page Merging - KSM](/database/README.md#ksm) | +| cleanup obsolete charts after secs | `3600` | See [monitoring ephemeral containers](/collectors/cgroups.plugin/README.md#monitoring-ephemeral-containers), also sets the timeout for cleaning up obsolete dimensions | +| gap when lost iterations above | `1` | | +| cleanup orphan hosts after secs | `3600` | How long to wait until automatically removing from the DB a remote Netdata host (child) that is no longer sending data. | +| delete obsolete charts files | `yes` | See [monitoring ephemeral containers](/collectors/cgroups.plugin/README.md#monitoring-ephemeral-containers), also affects the deletion of files for obsolete dimensions | +| delete orphan hosts files | `yes` | Set to `no` to disable non-responsive host removal. | +| enable zero metrics | `no` | Set to `yes` to show charts when all their metrics are zero. | ### [directories] section options diff --git a/daemon/global_statistics.c b/daemon/global_statistics.c index b5740b176f..1d268ce81f 100644 --- a/daemon/global_statistics.c +++ b/daemon/global_statistics.c @@ -1146,7 +1146,7 @@ static void workers_utilization_update_chart(struct worker_utilization *wu) { if(wu->workers_cpu_registered == 0) rrddim_set_by_pointer(wu->st_workers_cpu, wu->rd_workers_cpu_avg, 0); else - rrddim_set_by_pointer(wu->st_workers_cpu, wu->rd_workers_cpu_avg, (collected_number)( wu->workers_cpu_total * 10000ULL / (calculated_number)wu->workers_cpu_registered )); + rrddim_set_by_pointer(wu->st_workers_cpu, wu->rd_workers_cpu_avg, (collected_number)( wu->workers_cpu_total * 10000ULL / (NETDATA_DOUBLE)wu->workers_cpu_registered )); rrdset_done(wu->st_workers_cpu); } diff --git a/daemon/main.c b/daemon/main.c index 48d22f00c0..f2714e08eb 100644 --- a/daemon/main.c +++ b/daemon/main.c @@ -522,6 +522,58 @@ static void backwards_compatible_config() { config_move(CONFIG_SECTION_STATSD, "enabled", CONFIG_SECTION_PLUGINS, "statsd"); + + config_move(CONFIG_SECTION_GLOBAL, "memory mode", + CONFIG_SECTION_DB, "mode"); + + config_move(CONFIG_SECTION_GLOBAL, "history", + CONFIG_SECTION_DB, "retention"); + + config_move(CONFIG_SECTION_GLOBAL, "update every", + CONFIG_SECTION_DB, "update every"); + + config_move(CONFIG_SECTION_GLOBAL, "page cache size", + CONFIG_SECTION_DB, "page cache size MB"); + + config_move(CONFIG_SECTION_GLOBAL, "page cache uses malloc", + CONFIG_SECTION_DB, "page cache with malloc"); + + config_move(CONFIG_SECTION_GLOBAL, "dbengine disk space", + CONFIG_SECTION_DB, "dbengine disk space MB"); + + config_move(CONFIG_SECTION_GLOBAL, "dbengine multihost disk space", + CONFIG_SECTION_DB, "dbengine multihost disk space MB"); + + config_move(CONFIG_SECTION_GLOBAL, "memory deduplication (ksm)", + CONFIG_SECTION_DB, "memory deduplication (ksm)"); + + config_move(CONFIG_SECTION_GLOBAL, "dbengine page fetch timeout", + CONFIG_SECTION_DB, "dbengine page fetch timeout secs"); + + config_move(CONFIG_SECTION_GLOBAL, "dbengine page fetch retries", + CONFIG_SECTION_DB, "dbengine page fetch retries"); + + config_move(CONFIG_SECTION_GLOBAL, "dbengine extent pages", + CONFIG_SECTION_DB, "dbengine pages per extent"); + + config_move(CONFIG_SECTION_GLOBAL, "cleanup obsolete charts after seconds", + CONFIG_SECTION_DB, "cleanup obsolete charts after secs"); + + config_move(CONFIG_SECTION_GLOBAL, "gap when lost iterations above", + CONFIG_SECTION_DB, "gap when lost iterations above"); + + config_move(CONFIG_SECTION_GLOBAL, "cleanup orphan hosts after seconds", + CONFIG_SECTION_DB, "cleanup orphan hosts after secs"); + + config_move(CONFIG_SECTION_GLOBAL, "delete obsolete charts files", + CONFIG_SECTION_DB, "delete obsolete charts files"); + + config_move(CONFIG_SECTION_GLOBAL, "delete orphan hosts files", + CONFIG_SECTION_DB, "delete orphan hosts files"); + + config_move(CONFIG_SECTION_GLOBAL, "enable zero metrics", + CONFIG_SECTION_DB, "enable zero metrics"); + } static void get_netdata_configured_variables() { @@ -539,28 +591,40 @@ static void get_netdata_configured_variables() { debug(D_OPTIONS, "hostname set to '%s'", netdata_configured_hostname); // ------------------------------------------------------------------------ - // get default database size - - default_rrd_history_entries = (int) config_get_number(CONFIG_SECTION_GLOBAL, "history", align_entries_to_pagesize(default_rrd_memory_mode, RRD_DEFAULT_HISTORY_ENTRIES)); + // get default database update frequency - long h = align_entries_to_pagesize(default_rrd_memory_mode, default_rrd_history_entries); - if(h != default_rrd_history_entries) { - config_set_number(CONFIG_SECTION_GLOBAL, "history", h); - default_rrd_history_entries = (int)h; + default_rrd_update_every = (int) config_get_number(CONFIG_SECTION_DB, "update every", UPDATE_EVERY); + if(default_rrd_update_every < 1 || default_rrd_update_every > 600) { + error("Invalid data collection frequency (update every) %d given. Defaulting to %d.", default_rrd_update_every, UPDATE_EVERY); + default_rrd_update_every = UPDATE_EVERY; + config_set_number(CONFIG_SECTION_DB, "update every", default_rrd_update_every); } - if(default_rrd_history_entries < 5 || default_rrd_history_entries > RRD_HISTORY_ENTRIES_MAX) { - error("Invalid history entries %d given. Defaulting to %d.", default_rrd_history_entries, RRD_DEFAULT_HISTORY_ENTRIES); - default_rrd_history_entries = RRD_DEFAULT_HISTORY_ENTRIES; + // ------------------------------------------------------------------------ + // get default memory mode for the database + + { + const char *mode = config_get(CONFIG_SECTION_DB, "mode", rrd_memory_mode_name(default_rrd_memory_mode)); + default_rrd_memory_mode = rrd_memory_mode_id(mode); + if(strcmp(mode, rrd_memory_mode_name(default_rrd_memory_mode)) != 0) { + error("Invalid memory mode '%s' given. Using '%s'", mode, rrd_memory_mode_name(default_rrd_memory_mode)); + config_set(CONFIG_SECTION_DB, "mode", rrd_memory_mode_name(default_rrd_memory_mode)); + } } // ------------------------------------------------------------------------ - // get default database update frequency + // get default database size - default_rrd_update_every = (int) config_get_number(CONFIG_SECTION_GLOBAL, "update every", UPDATE_EVERY); - if(default_rrd_update_every < 1 || default_rrd_update_every > 600) { - error("Invalid data collection frequency (update every) %d given. Defaulting to %d.", default_rrd_update_every, UPDATE_EVERY_MAX); - default_rrd_update_every = UPDATE_EVERY; + if(default_rrd_memory_mode != RRD_MEMORY_MODE_DBENGINE && default_rrd_memory_mode != RRD_MEMORY_MODE_NONE) { + default_rrd_history_entries = (int)config_get_number( + CONFIG_SECTION_DB, "retention", + align_entries_to_pagesize(default_rrd_memory_mode, RRD_DEFAULT_HISTORY_ENTRIES)); + + long h = align_entries_to_pagesize(default_rrd_memory_mode, default_rrd_history_entries); + if (h != default_rrd_history_entries) { + config_set_number(CONFIG_SECTION_DB, "retention", h); + default_rrd_history_entries = (int)h; + } } // ------------------------------------------------------------------------ @@ -582,39 +646,38 @@ static void get_netdata_configured_variables() { netdata_configured_primary_plugins_dir = plugin_directories[PLUGINSD_STOCK_PLUGINS_DIRECTORY_PATH]; } - // ------------------------------------------------------------------------ - // get default memory mode for the database - - default_rrd_memory_mode = rrd_memory_mode_id(config_get(CONFIG_SECTION_GLOBAL, "memory mode", rrd_memory_mode_name(default_rrd_memory_mode))); #ifdef ENABLE_DBENGINE // ------------------------------------------------------------------------ // get default Database Engine page cache size in MiB - db_engine_use_malloc = config_get_boolean(CONFIG_SECTION_GLOBAL, "page cache uses malloc", CONFIG_BOOLEAN_NO); - default_rrdeng_page_cache_mb = (int) config_get_number(CONFIG_SECTION_GLOBAL, "page cache size", default_rrdeng_page_cache_mb); + db_engine_use_malloc = config_get_boolean(CONFIG_SECTION_DB, "page cache with malloc", CONFIG_BOOLEAN_NO); + default_rrdeng_page_cache_mb = (int) config_get_number(CONFIG_SECTION_DB, "page cache size MB", default_rrdeng_page_cache_mb); if(default_rrdeng_page_cache_mb < RRDENG_MIN_PAGE_CACHE_SIZE_MB) { error("Invalid page cache size %d given. Defaulting to %d.", default_rrdeng_page_cache_mb, RRDENG_MIN_PAGE_CACHE_SIZE_MB); default_rrdeng_page_cache_mb = RRDENG_MIN_PAGE_CACHE_SIZE_MB; + config_set_number(CONFIG_SECTION_DB, "page cache size MB", default_rrdeng_page_cache_mb); } // ------------------------------------------------------------------------ // get default Database Engine disk space quota in MiB - default_rrdeng_disk_quota_mb = (int) config_get_number(CONFIG_SECTION_GLOBAL, "dbengine disk space", default_rrdeng_disk_quota_mb); + default_rrdeng_disk_quota_mb = (int) config_get_number(CONFIG_SECTION_DB, "dbengine disk space MB", default_rrdeng_disk_quota_mb); if(default_rrdeng_disk_quota_mb < RRDENG_MIN_DISK_SPACE_MB) { error("Invalid dbengine disk space %d given. Defaulting to %d.", default_rrdeng_disk_quota_mb, RRDENG_MIN_DISK_SPACE_MB); default_rrdeng_disk_quota_mb = RRDENG_MIN_DISK_SPACE_MB; + config_set_number(CONFIG_SECTION_DB, "dbengine disk space MB", default_rrdeng_disk_quota_mb); } - default_multidb_disk_quota_mb = (int) config_get_number(CONFIG_SECTION_GLOBAL, "dbengine multihost disk space", compute_multidb_diskspace()); + default_multidb_disk_quota_mb = (int) config_get_number(CONFIG_SECTION_DB, "dbengine multihost disk space MB", compute_multidb_diskspace()); if(default_multidb_disk_quota_mb < RRDENG_MIN_DISK_SPACE_MB) { error("Invalid multidb disk space %d given. Defaulting to %d.", default_multidb_disk_quota_mb, default_rrdeng_disk_quota_mb); default_multidb_disk_quota_mb = default_rrdeng_disk_quota_mb; + config_set_number(CONFIG_SECTION_DB, "dbengine multihost disk space MB", default_multidb_disk_quota_mb); } #else if (default_rrd_memory_mode == RRD_MEMORY_MODE_DBENGINE) { - error_report("RRD_MEMORY_MODE_DBENGINE is not supported in th |