author | Costa Tsaousis <costa@netdata.cloud> | 2022-09-19 23:46:13 +0300 |
---|---|---|
committer | GitHub <noreply@github.com> | 2022-09-19 23:46:13 +0300 |
commit | cb7af25c09d8775d1967cb0553268075cda868d4 (patch) | |
tree | 9e86bc359bb2b1ec72d3a1382236703dc633ad63 /daemon | |
parent | 62246029160025a8d6503d9fbb617c7b029b9126 (diff) |
RRD structures managed by dictionaries (#13646)
* rrdset - in progress
* rrdset optimal constructor; rrdset conflict
* rrdset final touches
* re-organization of rrdset object members
* prevent use-after-free
* dictionary dfe also supports counting of iterations
* rrddim managed by dictionary
* rrd.h cleanup
* DICTIONARY_ITEM now is referencing actual dictionary items in the code
* removed rrdset linked list
* Revert "removed rrdset linked list"
This reverts commit 690d6a588b4b99619c2c5e10f84e8f868ae6def5.
* removed rrdset linked list
* added comments
* Switch chart uuid to static allocation in rrdset
Remove unused functions
* rrdset_archive() and friends...
* always create rrdfamily
* enable ml_free_dimension
* rrddim_foreach done with dfe
* most custom rrddim loops replaced with rrddim_foreach
* removed accesses to rrddim->dimensions
* removed locks that are no longer needed
* rrdsetvar is now managed by the dictionary
* set rrdset is rrdsetvar, fixes https://github.com/netdata/netdata/pull/13646#issuecomment-1242574853
* conflict callback of rrdsetvar now properly checks if it has to reset the variable
* dictionary registered callbacks accept as first parameter the DICTIONARY_ITEM
* dictionary dfe now uses internal counter to report; avoided excess variables defined with dfe
* dictionary walkthrough callbacks get dictionary acquired items
* dictionary reference counters that can be dupped from zero
* added advanced functions for get and del
* rrdvar managed by dictionaries
* thread safety for rrdsetvar
* faster rrdvar initialization
* rrdvar string lengths should match in all add, del, get functions
* rrdvar internals hidden from the rest of the world
* rrdvar is now acquired throughout netdata
* hide the internal structures of rrdsetvar
* rrdsetvar is now acquired throughout netdata
* rrddimvar managed by dictionary; rrddimvar linked list removed; rrddimvar structures hidden from the rest of netdata
* better error handling
* don't create variables if not initialized for health
* don't create variables if not initialized for health again
* rrdfamily is now managed by dictionaries; references of it are acquired dictionary items
* type checking on acquired objects
* rrdcalc renaming of functions
* type checking for rrdfamily_acquired
* rrdcalc managed by dictionaries
* rrdcalc double free fix
* host rrdvars is always needed
* attempt to fix deadlock 1
* attempt to fix deadlock 2
* Remove unused variable
* attempt to fix deadlock 3
* snprintfz
* rrdcalc index in rrdset fix
* Stop storing active charts and computing chart hashes
* Remove store active chart function
* Remove compute chart hash function
* Remove sql_store_chart_hash function
* Remove store_active_dimension function
* dictionary delayed destruction
* formatting and cleanup
* zero dictionary base on rrdsetvar
* added internal error to log delayed destructions of dictionaries
* typo in rrddimvar
* added debugging info to dictionary
* debug info
* fix for rrdcalc keys being empty
* remove forgotten unlock
* remove deadlock
* Switch to metadata version 5 and drop chart_hash, chart_hash_map, chart_active, dimension_active, v_chart_hash
* SQL cosmetic changes
* do not busy wait while destroying a referenced dictionary
* remove deadlock
* code cleanup; re-organization;
* fast cleanup and flushing of dictionaries
* number formatting fixes
* do not delete configured alerts when archiving a chart
* rrddim obsolete linked list management outside dictionaries
* removed duplicate contexts call
* fix crash when rrdfamily is not initialized
* don't keep rrddimvar referenced
* properly cleanup rrdvar
* removed some locks
* Do not attempt to cleanup chart_hash / chart_hash_map
* rrdcalctemplate managed by dictionary
* register callbacks on the right dictionary
* removed some more locks
* rrdcalc secondary index replaced with linked-list; rrdcalc labels updates are now executed by health thread
* when looking up an alarm, use both chart id and chart name
* host initialization a bit more modular
* init rrdlabels on host update
* preparation for dictionary views
* improved comment
* unused variables without internal checks
* service threads isolation and worker info
* more worker info in service thread
* thread cancelability debugging with internal checks
* strings data races addressed; fixes https://github.com/netdata/netdata/issues/13647
* dictionary modularization
* Remove unused SQL statement definition
* unit-tested thread safety of dictionaries; removed data race conditions on dictionaries and strings; dictionaries can now detect if the caller holds a write lock and automatically switch all calls to their unsafe versions; all direct calls to the unsafe versions are eliminated
* remove worker_is_idle() from the exit of service functions, because we lose the lock time between loops
* rewritten dictionary to have 2 separate locks, one for indexing and another for traversal
* Update collectors/cgroups.plugin/sys_fs_cgroup.c
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
* Update collectors/cgroups.plugin/sys_fs_cgroup.c
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
* Update collectors/proc.plugin/proc_net_dev.c
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
* fix memory leak in rrdset cache_dir
* minor dictionary changes
* don't use index locks when single-threaded
* obsolete dict option
* rrddim options and flags separation; rrdset_done() optimization to keep array of reference pointers to rrddim;
* fix jump on uninitialized value in dictionary; remove double free of cache_dir
* addressed codacy findings
* removed debugging code
* use the private refcount on dictionaries
* make dictionary item destructors work on dictionary destruction; stricter control on dictionary API; proper cleanup sequence on rrddim;
* more dictionary statistics
* global statistics about dictionary operations, memory, items, callbacks
* dictionary support for views - missing the public API
* removed warning about unused parameter
* chart and context name for cloud
* chart and context name for cloud, again
* dictionary statistics fixed; first implementation of dictionary views - not currently used
* only the master can globally delete an item
* context needs netdata prefix
* fix context and chart id of spins
* fix for host variables when health is not enabled
* run garbage collector on item insert too
* Fix info message; remove extra "using"
* update dict unittest for new placement of garbage collector
* we need RRDHOST->rrdvars for maintaining custom host variables
* Health initialization needs the host->host_uuid
* split STRING to its own files; no code changes other than that
* initialize health unconditionally
* unit tests do not pollute the global scope with their variables
* Skip initialization when creating archived hosts on startup. When a child connects it will initialize properly
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
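Several of the bullets above ("dictionary reference counters that can be dupped from zero", "dictionary delayed destruction", "do not busy wait while destroying a referenced dictionary") revolve around one idea: an object that is still referenced is not freed on delete; the deletion is recorded and completed by the last holder that releases it. What follows is a minimal C11 sketch of that idea only. Every name in it (`sketch_item` and the `sketch_item_*` functions) is hypothetical and is not part of netdata's dictionary API, and the `destroyed` flag merely stands in for the real cleanup.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical type for illustration; not netdata's DICTIONARY_ITEM. */
typedef struct sketch_item {
    atomic_int  refcount;          /* >= 0: number of holders, -1: destroyed */
    atomic_bool delete_requested;  /* deletion asked for, possibly deferred  */
    bool        destroyed;         /* stands in for the real free()          */
} sketch_item;

static void sketch_item_init(sketch_item *it) {
    atomic_init(&it->refcount, 0);
    atomic_init(&it->delete_requested, false);
    it->destroyed = false;
}

/* Take a reference. Works from zero, fails once the item is destroyed. */
static bool sketch_item_acquire(sketch_item *it) {
    int rc = atomic_load(&it->refcount);
    do {
        if (rc < 0) return false;
    } while (!atomic_compare_exchange_weak(&it->refcount, &rc, rc + 1));
    return true;
}

/* Destroy only if deletion was requested and nobody holds a reference. */
static bool sketch_item_try_destroy(sketch_item *it) {
    if (!atomic_load(&it->delete_requested)) return false;
    int expected = 0;
    if (!atomic_compare_exchange_strong(&it->refcount, &expected, -1))
        return false;
    it->destroyed = true;
    return true;
}

/* Request deletion; true if it happened now, false if it was deferred. */
static bool sketch_item_request_delete(sketch_item *it) {
    atomic_store(&it->delete_requested, true);
    return sketch_item_try_destroy(it);
}

/* Drop a reference; the last holder completes a deferred deletion. */
static bool sketch_item_release(sketch_item *it) {
    int prev = atomic_fetch_sub(&it->refcount, 1);
    assert(prev > 0);
    return prev == 1 && sketch_item_try_destroy(it);
}
```

In this sketch the deleter never waits: if a holder exists, `sketch_item_request_delete()` returns immediately and the holder's final `sketch_item_release()` finishes the job. The compare-and-swap from 0 to -1 is what prevents a late `sketch_item_acquire()` from reviving an item that is already gone.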
Diffstat (limited to 'daemon')
-rw-r--r-- | daemon/analytics.c | 60 | ||||
-rw-r--r-- | daemon/global_statistics.c | 369 | ||||
-rw-r--r-- | daemon/main.c | 4 | ||||
-rw-r--r-- | daemon/service.c | 251 | ||||
-rw-r--r-- | daemon/unit_test.c | 36 |
5 files changed, 667 insertions, 53 deletions
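Most of the lines added to daemon/global_statistics.c repeat one pattern per chart: copy each live counter with a relaxed atomic load into a local snapshot, sum the copies, and create or update the chart only if it already exists or the sum is non-zero. The sketch below isolates that pattern under stated assumptions: `sketch_stats`, `sketch_load_entry`, `sketch_snapshot` and `sketch_should_update` are hypothetical names, the struct has two fields instead of the real ones, and `__atomic_load_n` is the GCC/Clang builtin used in the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters; the real struct dictionary_stats has more fields. */
struct sketch_stats {
    size_t active;
    size_t deleted;
};

/* Same shape as load_dictionary_stats_entry() in the diff: copy one counter
 * with a relaxed atomic load and add it to a running total. */
#define sketch_load_entry(x) \
    total += (size_t)(snap->x = __atomic_load_n(&live->x, __ATOMIC_RELAXED))

/* Fill snap from live and return the sum of all copied counters. */
static size_t sketch_snapshot(const struct sketch_stats *live,
                              struct sketch_stats *snap) {
    assert(live && snap);
    size_t total = 0;
    sketch_load_entry(active);
    sketch_load_entry(deleted);
    return total;
}

/* Charts are created lazily: only once one exists or there is data to show. */
static int sketch_should_update(int chart_exists, size_t total) {
    return chart_exists || total != 0;
}
```

Relaxed ordering is enough here because each counter is read independently for display; the snapshot is not meant to be a consistent view across fields.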
diff --git a/daemon/analytics.c b/daemon/analytics.c index 6c40930b75..bf8668d065 100644 --- a/daemon/analytics.c +++ b/daemon/analytics.c @@ -249,8 +249,7 @@ void analytics_exporters(void) buffer_free(bi); } -int collector_counter_callb(const char *name, void *entry, void *data) { - (void)name; +int collector_counter_callb(const DICTIONARY_ITEM *item __maybe_unused, void *entry, void *data) { struct array_printer *ap = (struct array_printer *)data; struct collector *col = (struct collector *)entry; @@ -279,21 +278,22 @@ int collector_counter_callb(const char *name, void *entry, void *data) { void analytics_collectors(void) { RRDSET *st; - DICTIONARY *dict = dictionary_create(DICTIONARY_FLAG_SINGLE_THREADED); + DICTIONARY *dict = dictionary_create(DICT_OPTION_SINGLE_THREADED); char name[500]; BUFFER *bt = buffer_create(1000); - rrdset_foreach_read(st, localhost) - { - if (rrdset_is_available_for_viewers(st)) { - struct collector col = { - .plugin = rrdset_plugin_name(st), - .module = rrdset_module_name(st) - }; - snprintfz(name, 499, "%s:%s", col.plugin, col.module); - dictionary_set(dict, name, &col, sizeof(struct collector)); - } + rrdset_foreach_read(st, localhost) { + if(!rrdset_is_available_for_viewers(st)) + continue; + + struct collector col = { + .plugin = rrdset_plugin_name(st), + .module = rrdset_module_name(st) + }; + snprintfz(name, 499, "%s:%s", col.plugin, col.module); + dictionary_set(dict, name, &col, sizeof(struct collector)); } + rrdset_foreach_done(st); struct array_printer ap; ap.c = 0; @@ -398,12 +398,11 @@ void analytics_charts(void) { RRDSET *st; int c = 0; + rrdset_foreach_read(st, localhost) - { - if (rrdset_is_available_for_viewers(st)) { - c++; - } - } + if(rrdset_is_available_for_viewers(st)) c++; + rrdset_foreach_done(st); + { char b[7]; snprintfz(b, 6, "%d", c); @@ -415,22 +414,19 @@ void analytics_metrics(void) { RRDSET *st; long int dimensions = 0; - RRDDIM *rd; - rrdset_foreach_read(st, localhost) - { - rrdset_rdlock(st); - + 
rrdset_foreach_read(st, localhost) { if (rrdset_is_available_for_viewers(st)) { - rrddim_foreach_read(rd, st) - { - if (rrddim_flag_check(rd, RRDDIM_FLAG_HIDDEN) || rrddim_flag_check(rd, RRDDIM_FLAG_OBSOLETE)) + RRDDIM *rd; + rrddim_foreach_read(rd, st) { + if (rrddim_option_check(rd, RRDDIM_OPTION_HIDDEN) || rrddim_flag_check(rd, RRDDIM_FLAG_OBSOLETE)) continue; dimensions++; } + rrddim_foreach_done(rd); } - - rrdset_unlock(st); } + rrdset_foreach_done(st); + { char b[7]; snprintfz(b, 6, "%ld", dimensions); @@ -443,7 +439,7 @@ void analytics_alarms(void) int alarm_warn = 0, alarm_crit = 0, alarm_normal = 0; char b[10]; RRDCALC *rc; - foreach_rrdcalc_in_rrdhost(localhost, rc) { + foreach_rrdcalc_in_rrdhost_read(localhost, rc) { if (unlikely(!rc->rrdset || !rc->rrdset->last_collected_time.tv_sec)) continue; @@ -458,6 +454,7 @@ void analytics_alarms(void) alarm_normal++; } } + foreach_rrdcalc_in_rrdhost_done(rc); snprintfz(b, 9, "%d", alarm_normal); analytics_set_data(&analytics_data.netdata_alarms_normal, b); @@ -527,16 +524,11 @@ void analytics_gather_immutable_meta_data(void) */ void analytics_gather_mutable_meta_data(void) { - rrdhost_rdlock(localhost); - analytics_collectors(); analytics_alarms(); analytics_charts(); analytics_metrics(); analytics_aclk(); - - rrdhost_unlock(localhost); - analytics_mirrored_hosts(); analytics_alarms_notifications(); diff --git a/daemon/global_statistics.c b/daemon/global_statistics.c index ce2cdc6a10..1d4da897a5 100644 --- a/daemon/global_statistics.c +++ b/daemon/global_statistics.c @@ -10,8 +10,9 @@ #define WORKER_JOB_DBENGINE 3 #define WORKER_JOB_HEARTBEAT 4 #define WORKER_JOB_STRINGS 5 +#define WORKER_JOB_DICTIONARIES 6 -#if WORKER_UTILIZATION_MAX_JOB_TYPES < 6 +#if WORKER_UTILIZATION_MAX_JOB_TYPES < 7 #error WORKER_UTILIZATION_MAX_JOB_TYPES has to be at least 5 #endif @@ -1216,6 +1217,367 @@ static void update_heartbeat_charts() { } // 
--------------------------------------------------------------------------------------------------------------------- +// dictionary statistics + +struct dictionary_categories { + struct dictionary_stats *stats; + const char *family; + const char *context_prefix; + int priority; + + RRDSET *st_dicts; + RRDDIM *rd_dicts_active; + RRDDIM *rd_dicts_deleted; + + RRDSET *st_items; + RRDDIM *rd_items_entries; + RRDDIM *rd_items_referenced; + RRDDIM *rd_items_pending_deletion; + + RRDSET *st_ops; + RRDDIM *rd_ops_creations; + RRDDIM *rd_ops_destructions; + RRDDIM *rd_ops_flushes; + RRDDIM *rd_ops_traversals; + RRDDIM *rd_ops_walkthroughs; + RRDDIM *rd_ops_garbage_collections; + RRDDIM *rd_ops_searches; + RRDDIM *rd_ops_inserts; + RRDDIM *rd_ops_resets; + RRDDIM *rd_ops_deletes; + + RRDSET *st_callbacks; + RRDDIM *rd_callbacks_inserts; + RRDDIM *rd_callbacks_conflicts; + RRDDIM *rd_callbacks_reacts; + RRDDIM *rd_callbacks_deletes; + + RRDSET *st_memory; + RRDDIM *rd_memory_indexed; + RRDDIM *rd_memory_values; + RRDDIM *rd_memory_dict; + + RRDSET *st_spins; + RRDDIM *rd_spins_use; + RRDDIM *rd_spins_search; + RRDDIM *rd_spins_insert; + +} dictionary_categories[] = { + { .stats = &dictionary_stats_category_other, "dictionaries", "dictionaries", 900000 }, + + // terminator + { .stats = NULL, NULL, NULL, 0 }, +}; + +#define load_dictionary_stats_entry(x) total += (size_t)(stats.x = __atomic_load_n(&c->stats->x, __ATOMIC_RELAXED)) + +static void update_dictionary_category_charts(struct dictionary_categories *c) { + struct dictionary_stats stats; + stats.name = c->stats->name; + + // ------------------------------------------------------------------------ + + size_t total = 0; + load_dictionary_stats_entry(dictionaries.active); + load_dictionary_stats_entry(dictionaries.deleted); + + if(c->st_dicts || total != 0) { + if (unlikely(!c->st_dicts)) { + char id[RRD_ID_LENGTH_MAX + 1]; + snprintfz(id, RRD_ID_LENGTH_MAX, "%s.%s.dictionaries", c->context_prefix, stats.name); + + char 
context[RRD_ID_LENGTH_MAX + 1]; + snprintfz(context, RRD_ID_LENGTH_MAX, "netdata.%s.category.dictionaries", c->context_prefix); + + c->st_dicts = rrdset_create_localhost( + "netdata" + , id + , NULL + , c->family + , context + , "Dictionaries" + , "dictionaries" + , "netdata" + , "stats" + , c->priority + 0 + , localhost->rrd_update_every + , RRDSET_TYPE_LINE + ); + + c->rd_dicts_active = rrddim_add(c->st_dicts, "active", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE); + c->rd_dicts_deleted = rrddim_add(c->st_dicts, "deleted", NULL, -1, 1, RRD_ALGORITHM_ABSOLUTE); + + rrdlabels_add(c->st_dicts->rrdlabels, "category", stats.name, RRDLABEL_SRC_AUTO); + } + else + rrdset_next(c->st_dicts); + + rrddim_set_by_pointer(c->st_dicts, c->rd_dicts_active, (collected_number)stats.dictionaries.active); + rrddim_set_by_pointer(c->st_dicts, c->rd_dicts_deleted, (collected_number)stats.dictionaries.deleted); + rrdset_done(c->st_dicts); + } + + // ------------------------------------------------------------------------ + + total = 0; + load_dictionary_stats_entry(items.entries); + load_dictionary_stats_entry(items.referenced); + load_dictionary_stats_entry(items.pending_deletion); + + if(c->st_items || total != 0) { + if (unlikely(!c->st_items)) { + char id[RRD_ID_LENGTH_MAX + 1]; + snprintfz(id, RRD_ID_LENGTH_MAX, "%s.%s.items", c->context_prefix, stats.name); + + char context[RRD_ID_LENGTH_MAX + 1]; + snprintfz(context, RRD_ID_LENGTH_MAX, "netdata.%s.category.items", c->context_prefix); + + c->st_items = rrdset_create_localhost( + "netdata" + , id + , NULL + , c->family + , context + , "Dictionary Items" + , "items" + , "netdata" + , "stats" + , c->priority + 1 + , localhost->rrd_update_every + , RRDSET_TYPE_LINE + ); + + c->rd_items_entries = rrddim_add(c->st_items, "active", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE); + c->rd_items_pending_deletion = rrddim_add(c->st_items, "deleted", NULL, -1, 1, RRD_ALGORITHM_ABSOLUTE); + c->rd_items_referenced = rrddim_add(c->st_items, "referenced", NULL, 1, 
1, RRD_ALGORITHM_ABSOLUTE); + + rrdlabels_add(c->st_items->rrdlabels, "category", stats.name, RRDLABEL_SRC_AUTO); + } + else + rrdset_next(c->st_items); + + rrddim_set_by_pointer(c->st_items, c->rd_items_entries, stats.items.entries); + rrddim_set_by_pointer(c->st_items, c->rd_items_pending_deletion, stats.items.pending_deletion); + rrddim_set_by_pointer(c->st_items, c->rd_items_referenced, stats.items.referenced); + rrdset_done(c->st_items); + } + + // ------------------------------------------------------------------------ + + total = 0; + load_dictionary_stats_entry(ops.creations); + load_dictionary_stats_entry(ops.destructions); + load_dictionary_stats_entry(ops.flushes); + load_dictionary_stats_entry(ops.traversals); + load_dictionary_stats_entry(ops.walkthroughs); + load_dictionary_stats_entry(ops.garbage_collections); + load_dictionary_stats_entry(ops.searches); + load_dictionary_stats_entry(ops.inserts); + load_dictionary_stats_entry(ops.resets); + load_dictionary_stats_entry(ops.deletes); + + if(c->st_ops || total != 0) { + if (unlikely(!c->st_ops)) { + char id[RRD_ID_LENGTH_MAX + 1]; + snprintfz(id, RRD_ID_LENGTH_MAX, "%s.%s.ops", c->context_prefix, stats.name); + + char context[RRD_ID_LENGTH_MAX + 1]; + snprintfz(context, RRD_ID_LENGTH_MAX, "netdata.%s.category.ops", c->context_prefix); + + c->st_ops = rrdset_create_localhost( + "netdata" + , id + , NULL + , c->family + , context + , "Dictionary Operations" + , "ops/s" + , "netdata" + , "stats" + , c->priority + 2 + , localhost->rrd_update_every + , RRDSET_TYPE_LINE + ); + + c->rd_ops_creations = rrddim_add(c->st_ops, "creations", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + c->rd_ops_destructions = rrddim_add(c->st_ops, "destructions", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + c->rd_ops_flushes = rrddim_add(c->st_ops, "flushes", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + c->rd_ops_traversals = rrddim_add(c->st_ops, "traversals", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + c->rd_ops_walkthroughs = 
rrddim_add(c->st_ops, "walkthroughs", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + c->rd_ops_garbage_collections = rrddim_add(c->st_ops, "garbage_collections", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + c->rd_ops_searches = rrddim_add(c->st_ops, "searches", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + c->rd_ops_inserts = rrddim_add(c->st_ops, "inserts", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + c->rd_ops_resets = rrddim_add(c->st_ops, "resets", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + c->rd_ops_deletes = rrddim_add(c->st_ops, "deletes", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + + rrdlabels_add(c->st_ops->rrdlabels, "category", stats.name, RRDLABEL_SRC_AUTO); + } + else + rrdset_next(c->st_ops); + + rrddim_set_by_pointer(c->st_ops, c->rd_ops_creations, (collected_number)stats.ops.creations); + rrddim_set_by_pointer(c->st_ops, c->rd_ops_destructions, (collected_number)stats.ops.destructions); + rrddim_set_by_pointer(c->st_ops, c->rd_ops_flushes, (collected_number)stats.ops.flushes); + rrddim_set_by_pointer(c->st_ops, c->rd_ops_traversals, (collected_number)stats.ops.traversals); + rrddim_set_by_pointer(c->st_ops, c->rd_ops_walkthroughs, (collected_number)stats.ops.walkthroughs); + rrddim_set_by_pointer(c->st_ops, c->rd_ops_garbage_collections, (collected_number)stats.ops.garbage_collections); + rrddim_set_by_pointer(c->st_ops, c->rd_ops_searches, (collected_number)stats.ops.searches); + rrddim_set_by_pointer(c->st_ops, c->rd_ops_inserts, (collected_number)stats.ops.inserts); + rrddim_set_by_pointer(c->st_ops, c->rd_ops_resets, (collected_number)stats.ops.resets); + rrddim_set_by_pointer(c->st_ops, c->rd_ops_deletes, (collected_number)stats.ops.deletes); + + rrdset_done(c->st_ops); + } + + // ------------------------------------------------------------------------ + + total = 0; + load_dictionary_stats_entry(callbacks.inserts); + load_dictionary_stats_entry(callbacks.conflicts); + load_dictionary_stats_entry(callbacks.reacts); + 
load_dictionary_stats_entry(callbacks.deletes); + + if(c->st_callbacks || total != 0) { + if (unlikely(!c->st_callbacks)) { + char id[RRD_ID_LENGTH_MAX + 1]; + snprintfz(id, RRD_ID_LENGTH_MAX, "%s.%s.callbacks", c->context_prefix, stats.name); + + char context[RRD_ID_LENGTH_MAX + 1]; + snprintfz(context, RRD_ID_LENGTH_MAX, "netdata.%s.category.callbacks", c->context_prefix); + + c->st_callbacks = rrdset_create_localhost( + "netdata" + , id + , NULL + , c->family + , context + , "Dictionary Callbacks" + , "callbacks/s" + , "netdata" + , "stats" + , c->priority + 3 + , localhost->rrd_update_every + , RRDSET_TYPE_LINE + ); + + c->rd_callbacks_inserts = rrddim_add(c->st_callbacks, "inserts", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + c->rd_callbacks_deletes = rrddim_add(c->st_callbacks, "deletes", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + c->rd_callbacks_conflicts = rrddim_add(c->st_callbacks, "conflicts", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + c->rd_callbacks_reacts = rrddim_add(c->st_callbacks, "reacts", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + + rrdlabels_add(c->st_callbacks->rrdlabels, "category", stats.name, RRDLABEL_SRC_AUTO); + } + else + rrdset_next(c->st_callbacks); + + rrddim_set_by_pointer(c->st_callbacks, c->rd_callbacks_inserts, (collected_number)stats.callbacks.inserts); + rrddim_set_by_pointer(c->st_callbacks, c->rd_callbacks_conflicts, (collected_number)stats.callbacks.conflicts); + rrddim_set_by_pointer(c->st_callbacks, c->rd_callbacks_reacts, (collected_number)stats.callbacks.reacts); + rrddim_set_by_pointer(c->st_callbacks, c->rd_callbacks_deletes, (collected_number)stats.callbacks.deletes); + + rrdset_done(c->st_callbacks); + } + + // ------------------------------------------------------------------------ + + total = 0; + load_dictionary_stats_entry(memory.indexed); + load_dictionary_stats_entry(memory.values); + load_dictionary_stats_entry(memory.dict); + + if(c->st_memory || total != 0) { + if (unlikely(!c->st_memory)) { + char 
id[RRD_ID_LENGTH_MAX + 1]; + snprintfz(id, RRD_ID_LENGTH_MAX, "%s.%s.memory", c->context_prefix, stats.name); + + char context[RRD_ID_LENGTH_MAX + 1]; + snprintfz(context, RRD_ID_LENGTH_MAX, "netdata.%s.category.memory", c->context_prefix); + + c->st_memory = rrdset_create_localhost( + "netdata" + , id + , NULL + , c->family + , context + , "Dictionary Memory" + , "bytes" + , "netdata" + , "stats" + , c->priority + 4 + , localhost->rrd_update_every + , RRDSET_TYPE_STACKED + ); + + c->rd_memory_indexed = rrddim_add(c->st_memory, "index", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE); + c->rd_memory_values = rrddim_add(c->st_memory, "data", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE); + c->rd_memory_dict = rrddim_add(c->st_memory, "structures", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE); + + rrdlabels_add(c->st_memory->rrdlabels, "category", stats.name, RRDLABEL_SRC_AUTO); + } + else + rrdset_next(c->st_memory); + + rrddim_set_by_pointer(c->st_memory, c->rd_memory_indexed, (collected_number)stats.memory.indexed); + rrddim_set_by_pointer(c->st_memory, c->rd_memory_values, (collected_number)stats.memory.values); + rrddim_set_by_pointer(c->st_memory, c->rd_memory_dict, (collected_number)stats.memory.dict); + + rrdset_done(c->st_memory); + } + + // ------------------------------------------------------------------------ + + total = 0; + load_dictionary_stats_entry(spin_locks.use); + load_dictionary_stats_entry(spin_locks.search); + load_dictionary_stats_entry(spin_locks.insert); + + if(c->st_spins || total != 0) { + if (unlikely(!c->st_spins)) { + char id[RRD_ID_LENGTH_MAX + 1]; + snprintfz(id, RRD_ID_LENGTH_MAX, "%s.%s.spins", c->context_prefix, stats.name); + + char context[RRD_ID_LENGTH_MAX + 1]; + snprintfz(context, RRD_ID_LENGTH_MAX, "netdata.%s.category.spins", c->context_prefix); + + c->st_spins = rrdset_create_localhost( + "netdata" + , id + , NULL + , c->family + , context + , "Dictionary Spins" + , "count" + , "netdata" + , "stats" + , c->priority + 5 + , localhost->rrd_update_every + 
, RRDSET_TYPE_LINE + ); + + c->rd_spins_use = rrddim_add(c->st_spins, "use", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + c->rd_spins_search = rrddim_add(c->st_spins, "search", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + c->rd_spins_insert = rrddim_add(c->st_spins, "insert", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + + rrdlabels_add(c->st_spins->rrdlabels, "category", stats.name, RRDLABEL_SRC_AUTO); + } + else + rrdset_next(c->st_spins); + + rrddim_set_by_pointer(c->st_spins, c->rd_spins_use, (collected_number)stats.spin_locks.use); + rrddim_set_by_pointer(c->st_spins, c->rd_spins_search, (collected_number)stats.spin_locks.search); + rrddim_set_by_pointer(c->st_spins, c->rd_spins_insert, (collected_number)stats.spin_locks.insert); + + rrdset_done(c->st_spins); + } +} + +static void dictionary_statistics(void) { + for(int i = 0; dictionary_categories[i].stats ;i++) { + update_dictionary_category_charts(&dictionary_categories[i]); + } +} + +// --------------------------------------------------------------------------------------------------------------------- // worker utilization #define WORKERS_MIN_PERCENT_DEFAULT 10000.0 @@ -1334,6 +1696,7 @@ static struct worker_utilization all_workers_utilization[] = { { .name = "TIMEX", .family = "workers plugin timex", .priority = 1000000 }, { .name = "IDLEJITTER", .family = "workers plugin idlejitter", .priority = 1000000 }, { .name = "RRDCONTEXT", .family = "workers contexts", .priority = 1000000 }, + { .name = "SERVICE", .family = "workers service", .priority = 1000000 }, // has to be terminated with a NULL { .name = NULL, .family = NULL } @@ -2011,6 +2374,7 @@ void *global_statistics_main(void *ptr) worker_register_job_name(WORKER_JOB_WORKERS, "workers"); worker_register_job_name(WORKER_JOB_DBENGINE, "dbengine"); worker_register_job_name(WORKER_JOB_STRINGS, "strings"); + worker_register_job_name(WORKER_JOB_DICTIONARIES, "dictionaries"); netdata_thread_cleanup_push(global_statistics_cleanup, ptr); @@ -2048,6 +2412,9 @@ void 
*global_statistics_main(void *ptr) worker_is_busy(WORKER_JOB_STRINGS); update_strings_charts(); + + worker_is_busy(WORKER_JOB_DICTIONARIES); + dictionary_statistics(); } netdata_thread_cleanup_pop(1); diff --git a/daemon/main.c b/daemon/main.c index ada3c14f2a..a51e4a94c8 100644 --- a/daemon/main.c +++ b/daemon/main.c @@ -1004,6 +1004,7 @@ int main(int argc, char **argv) { if(test_dbengine()) return 1; #endif if(test_sqlite()) return 1; + if(string_unittest(10000)) return 1; if (dictionary_unittest(10000)) return 1; if (rrdlabels_unittest()) @@ -1028,6 +1029,9 @@ int main(int argc, char **argv) { else if(strcmp(optarg, "dicttest") == 0) { return dictionary_unittest(10000); } + else if(strcmp(optarg, "stringtest") == 0) { + return string_unittest(10000); + } else if(strcmp(optarg, "rrdlabelstest") == 0) { return rrdlabels_unittest(); } diff --git a/daemon/service.c b/daemon/service.c index 61cc1281ae..f8371103b4 100644 --- a/daemon/service.c +++ b/daemon/service.c @@ -5,12 +5,238 @@ /* Run service jobs every X seconds */ #define SERVICE_HEARTBEAT 10 +#define WORKER_JOB_CHILD_CHART_OBSOLETION_CHECK 1 +#define WORKER_JOB_CLEANUP_OBSOLETE_CHARTS 2 +#define WORKER_JOB_ARCHIVE_CHART 3 +#define WORKER_JOB_ARCHIVE_CHART_DIMENSIONS 4 +#define WORKER_JOB_ARCHIVE_DIMENSION 5 +#define WORKER_JOB_CLEANUP_ORPHAN_HOSTS 6 +#define WORKER_JOB_CLEANUP_OBSOLETE_CHARTS_ON_HOSTS 7 +#define WORKER_JOB_FREE_HOST 9 +#define WORKER_JOB_SAVE_HOST_CHARTS 10 +#define WORKER_JOB_DELETE_HOST_CHARTS 11 +#define WORKER_JOB_FREE_CHART 12 +#define WORKER_JOB_SAVE_CHART 13 +#define WORKER_JOB_DELETE_CHART 14 +#define WORKER_JOB_FREE_DIMENSION 15 + +static void svc_rrddim_obsolete_to_archive(RRDDIM *rd) { + RRDSET *st = rd->rrdset; + + if(rrddim_flag_check(rd, RRDDIM_FLAG_ARCHIVED | RRDDIM_FLAG_ACLK) || !rrddim_flag_check(rd, RRDDIM_FLAG_OBSOLETE)) + return; + + worker_is_busy(WORKER_JOB_ARCHIVE_DIMENSION); + + rrddim_flag_set(rd, RRDDIM_FLAG_ARCHIVED); + rrddim_flag_clear(rd, RRDDIM_FLAG_OBSOLETE); 
+ + const char *cache_filename = rrddim_cache_filename(rd); + if(cache_filename) { + info("Deleting dimension file '%s'.", cache_filename); + if (unlikely(unlink(cache_filename) == -1)) + error("Cannot delete dimension file '%s'", cache_filename); + } + + if (rd->rrd_memory_mode == RRD_MEMORY_MODE_DBENGINE) { + rrddimvar_delete_all(rd); + + /* only a collector can mark a chart as obsolete, so we must remove the reference */ + + size_t tiers_available = 0, tiers_said_yes = 0; + for(int tier = 0; tier < storage_tiers ;tier++) { + if(rd->tiers[tier]) { + tiers_available++; + + if(rd->tiers[tier]->collect_ops.finalize(rd->tiers[tier]->db_collection_handle)) + tiers_said_yes++; + + rd->tiers[tier]->db_collection_handle = NULL; + } + } + + if (tiers_available == tiers_said_yes && tiers_said_yes) { + /* This metric has no data and no references */ + delete_dimension_uuid(&rd->metric_uuid); + } + else { + /* Do not delete this dimension */ +#ifdef ENABLE_ACLK + queue_dimension_to_aclk(rd, calc_dimension_liveness(rd, now_realtime_sec())); +#endif + return; + } + } + + worker_is_busy(WORKER_JOB_FREE_DIMENSION); + rrddim_free(st, rd); +} + +static void svc_rrdset_archive_obsolete_dimensions(RRDSET *st, bool all_dimensions) { + worker_is_busy(WORKER_JOB_ARCHIVE_CHART_DIMENSIONS); + + RRDDIM *rd; + time_t now = now_realtime_sec(); + + dfe_start_reentrant(st->rrddim_root_index, rd) { + if(unlikely( + all_dimensions || + (rrddim_flag_check(rd, RRDDIM_FLAG_OBSOLETE) && (rd->last_collected_time.tv_sec + rrdset_free_obsolete_time < now)) + )) { + + info("Removing obsolete dimension '%s' (%s) of '%s' (%s).", rrddim_name(rd), rrddim_id(rd), rrdset_name(st), rrdset_id(st)); + svc_rrddim_obsolete_to_archive(rd); + + } + } + dfe_done(rd); +} + +static void svc_rrdset_obsolete_to_archive(RRDSET *st) { + worker_is_busy(WORKER_JOB_ARCHIVE_CHART); + + rrdset_flag_set(st, RRDSET_FLAG_ARCHIVED); + rrdset_flag_clear(st, RRDSET_FLAG_OBSOLETE); + + rrdcalc_unlink_all_rrdset_alerts(st); + + 
svc_rrdset_archive_obsolete_dimensions(st, true); + + rrdsetvar_release_and_delete_all(st); + + // has to be run after all dimensions are archived - or use-after-free will occur + rrdvar_delete_all(st->rrdvars); + + if(st->rrd_memory_mode != RRD_MEMORY_MODE_DBENGINE) { + if(rrdhost_flag_check(st->rrdhost, RRDHOST_FLAG_DELETE_OBSOLETE_CHARTS)) { + worker_is_busy(WORKER_JOB_DELETE_CHART); + rrdset_delete_files(st); + } + else { + worker_is_busy(WORKER_JOB_SAVE_CHART); + rrdset_save(st); + } + + worker_is_busy(WORKER_JOB_FREE_CHART); + rrdset_free(st); + } +} + +static void svc_rrdhost_cleanup_obsolete_charts(RRDHOST *host) { + worker_is_busy(WORKER_JOB_CLEANUP_OBSOLETE_CHARTS); + + time_t now = now_realtime_sec(); + RRDSET *st; + rrdset_foreach_reentrant(st, host) { + if(unlikely(rrdset_flag_check(st, RRDSET_FLAG_OBSOLETE) + && st->last_accessed_time + rrdset_free_obsolete_time < now + && st->last_updated.tv_sec + rrdset_free_obsolete_time < now + && st->last_collected_time.tv_sec + rrdset_free_obsolete_time < now + )) { + svc_rrdset_obsolete_to_archive(st); + } + else if(rrdset_flag_check(st, RRDSET_FLAG_OBSOLETE_DIMENSIONS)) { + rrdset_flag_clear(st, RRDSET_FLAG_OBSOLETE_DIMENSIONS); + svc_rrdset_archive_obsolete_dimensions(st, false); + } +#ifdef ENABLE_ACLK + else + sql_check_chart_liveness(st); +#endif + } + rrdset_foreach_done(st); +} + +static void svc_rrdset_check_obsoletion(RRDHOST *host) { + worker_is_busy(WORKER_JOB_CHILD_CHART_OBSOLETION_CHECK); + + time_t last_entry_t; + RRDSET *st; + rrdset_foreach_read(st, host) { + last_entry_t = rrdset_last_entry_t(st); + + if(last_entry_t && last_entry_t < host->senders_connect_time) + rrdset_is_obsolete(st); + + } + rrdset_foreach_done(st); +} + +static void svc_rrd_cleanup_obsolete_charts_from_all_hosts() { + worker_is_busy(WORKER_JOB_CLEANUP_OBSOLETE_CHARTS_ON_HOSTS); + + rrd_rdlock(); + + RRDHOST *host; + rrdhost_foreach_read(host) { + + if(rrdhost_flag_check(host, 
RRDHOST_FLAG_PENDING_OBSOLETE_CHARTS|RRDHOST_FLAG_PENDING_OBSOLETE_DIMENSIONS)) { + rrdhost_flag_clear(host, RRDHOST_FLAG_PENDING_OBSOLETE_CHARTS|RRDHOST_FLAG_PENDING_OBSOLETE_DIMENSIONS); + svc_rrdhost_cleanup_obsolete_charts(host); + } + + if(host != localhost + && host->trigger_chart_obsoletion_check + && ( + ( + host->senders_last_chart_command + && host->senders_last_chart_command + host->health_delay_up_to < now_realtime_sec() + ) |