diff options
author | Costa Tsaousis <costa@netdata.cloud> | 2022-09-05 19:31:06 +0300 |
---|---|---|
committer | GitHub <noreply@github.com> | 2022-09-05 19:31:06 +0300 |
commit | 5e1b95cf92168c4df74586fb4430dc284806da82 (patch) | |
tree | f42077d8b02eaf316683453a7474bd1f599a833d /daemon | |
parent | 544aef1fde6e79ac57d2dea85d3f063076d7f885 (diff) |
Deduplicate all netdata strings (#13570)
* rrdfamily
* rrddim
* rrdset plugin and module names
* rrdset units
* rrdset type
* rrdset family
* rrdset title
* rrdset title more
* rrdset context
* rrdcalctemplate context and removal of context hash from rrdset
* strings statistics
* rrdset name
* rearranged members of rrdset
* eliminate rrdset name hash; rrdcalc chart converted to STRING
* rrdset id, eliminated rrdset hash
* rrdcalc, alarm_entry, alert_config and some of rrdcalctemplate
* rrdcalctemplate
* rrdvar
* eval_variable
* rrddimvar and rrdsetvar
* rrdhost hostname, os and tags
* fix master commits
* added thread cache; implemented string_dup without locks
* faster thread cache
* rrdset and rrddim now use dictionaries for indexing
* rrdhost now uses dictionary
* rrdfamily now uses DICTIONARY
* rrdvar using dictionary instead of AVL
* allocate the right size to rrdvar flag members
* rrdhost remaining char * members to STRING *
* better error handling on indexing
* strings now use a read/write lock to allow parallel searches to the index
* removed AVL support from dictionaries; implemented STRING with native Judy calls
* string releases should be negative
* only 31 bits are allowed for enum flags
* proper locking on strings
* string threading unittest and fixes
* fix lgtm finding
* fixed naming
* stream chart/dimension definitions at the beginning of a streaming session
* thread stack variable is undefined on thread cancel
* rrdcontext garbage collect per host on startup
* worker control in garbage collection
* relaxed deletion of rrdmetrics
* type checking on dictfe
* netdata chart to monitor rrdcontext triggers
* Group chart label updates
* rrdcontext better handling of collected rrdsets
* rrdpush incremental transmition of definitions should use as much buffer as possible
* require 1MB per chart
* empty the sender buffer before enabling metrics streaming
* fill up to 50% of buffer
* reset signaling metrics sending
* use the shared variable for status
* use separate host flag for enabling streaming of metrics
* make sure the flag is clear
* add logging for streaming
* add logging for streaming on buffer overflow
* circular_buffer proper sizing
* removed obsolete logs
* do not execute worker jobs if not necessary
* better messages about compression disabling
* proper use of flags and updating rrdset last access time every time the obsoletion flag is flipped
* monitor stream sender used buffer ratio
* Update exporting unit tests
* no need to compare label value with strcmp
* streaming send workers now monitor bandwidth
* workers now use strings
* streaming receiver monitors incoming bandwidth
* parser shift of worker ids
* minor fixes
* Group chart label updates
* Populate context with dimensions that have data
* Fix chart id
* better shift of parser worker ids
* fix for streaming compression
* properly count received bytes
* ensure LZ4 compression ring buffer does not wrap prematurely
* do not stream empty charts; do not process empty instances in rrdcontext
* need_to_send_chart_definition() does not need an rrdset lock any more
* rrdcontext objects are collected, after data have been written to the db
* better logging of RRDCONTEXT transitions
* always set all variables needed by the worker utilization charts
* implemented double linked list for most objects; eliminated alarm indexes from rrdhost; and many more fixes
* lockless strings design - string_dup() and string_freez() are totally lockless when they dont need to touch Judy - only Judy is protected with a read/write lock
* STRING code re-organization for clarity
* thread_cache improvements; double numbers precision on worker threads
* STRING_ENTRY now shadown STRING, so no duplicate definition is required; string_length() renamed to string_strlen() to follow the paradigm of all other functions, STRING internal statistics are now only compiled with NETDATA_INTERNAL_CHECKS
* rrdhost index by hostname now cleans up; aclk queries of archieved hosts do not index hosts
* Add index to speed up database context searches
* Removed last_updated optimization (was also buggy after latest merge with master)
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
Diffstat (limited to 'daemon')
-rw-r--r-- | daemon/analytics.c | 14 | ||||
-rw-r--r-- | daemon/global_statistics.c | 383 | ||||
-rw-r--r-- | daemon/global_statistics.h | 4 | ||||
-rw-r--r-- | daemon/unit_test.c | 62 |
4 files changed, 400 insertions, 63 deletions
diff --git a/daemon/analytics.c b/daemon/analytics.c index 370818b8a3..6c40930b75 100644 --- a/daemon/analytics.c +++ b/daemon/analytics.c @@ -10,8 +10,8 @@ extern void analytics_build_info (BUFFER *b); extern int aclk_connected; struct collector { - char *plugin; - char *module; + const char *plugin; + const char *module; }; struct array_printer { @@ -286,8 +286,10 @@ void analytics_collectors(void) rrdset_foreach_read(st, localhost) { if (rrdset_is_available_for_viewers(st)) { - struct collector col = { .plugin = st->plugin_name ? st->plugin_name : "", - .module = st->module_name ? st->module_name : "" }; + struct collector col = { + .plugin = rrdset_plugin_name(st), + .module = rrdset_module_name(st) + }; snprintfz(name, 499, "%s:%s", col.plugin, col.module); dictionary_set(dict, name, &col, sizeof(struct collector)); } @@ -441,7 +443,7 @@ void analytics_alarms(void) int alarm_warn = 0, alarm_crit = 0, alarm_normal = 0; char b[10]; RRDCALC *rc; - for (rc = localhost->alarms; rc; rc = rc->next) { + foreach_rrdcalc_in_rrdhost(localhost, rc) { if (unlikely(!rc->rrdset || !rc->rrdset->last_collected_time.tv_sec)) continue; @@ -539,7 +541,7 @@ void analytics_gather_mutable_meta_data(void) analytics_alarms_notifications(); analytics_set_data( - &analytics_data.netdata_config_is_parent, (localhost->next || configured_as_parent()) ? "true" : "false"); + &analytics_data.netdata_config_is_parent, (rrdhost_hosts_available() > 1 || configured_as_parent()) ? "true" : "false"); char *claim_id = get_agent_claimid(); analytics_set_data(&analytics_data.netdata_host_agent_claimed, claim_id ? "true" : "false"); diff --git a/daemon/global_statistics.c b/daemon/global_statistics.c index 9ef9ebc851..22fb0997a0 100644 --- a/daemon/global_statistics.c +++ b/daemon/global_statistics.c @@ -9,8 +9,9 @@ #define WORKER_JOB_WORKERS 2 #define WORKER_JOB_DBENGINE 3 #define WORKER_JOB_HEARTBEAT 4 +#define WORKER_JOB_STRINGS 5 -#if WORKER_UTILIZATION_MAX_JOB_TYPES < 5 +#if WORKER_UTILIZATION_MAX_JOB_TYPES < 6 #error WORKER_UTILIZATION_MAX_JOB_TYPES has to be at least 5 #endif @@ -38,6 +39,10 @@ static struct global_statistics { volatile uint64_t sqlite3_queries_failed_locked; volatile uint64_t sqlite3_rows; + volatile uint64_t rrdcontext_metric_triggers; + volatile uint64_t rrdcontext_instance_triggers; + volatile uint64_t rrdcontext_context_triggers; + } global_statistics = { .connected_clients = 0, .web_requests = 0, @@ -80,6 +85,18 @@ void rrdr_query_completed(uint64_t db_points_read, uint64_t result_points_genera __atomic_fetch_add(&global_statistics.rrdr_result_points_generated, result_points_generated, __ATOMIC_RELAXED); } +void rrdcontext_triggered_update_on_rrdmetric() { + __atomic_fetch_add(&global_statistics.rrdcontext_metric_triggers, 1, __ATOMIC_RELAXED); +} + +void rrdcontext_triggered_update_on_rrdinstance() { + __atomic_fetch_add(&global_statistics.rrdcontext_instance_triggers, 1, __ATOMIC_RELAXED); +} + +void rrdcontext_triggered_update_on_rrdcontext() { + __atomic_fetch_add(&global_statistics.rrdcontext_context_triggers, 1, __ATOMIC_RELAXED); +} + void finished_web_request_statistics(uint64_t dt, uint64_t bytes_received, uint64_t bytes_sent, @@ -133,6 +150,10 @@ static inline void global_statistics_copy(struct global_statistics *gs, uint8_t gs->sqlite3_queries_failed_busy = __atomic_load_n(&global_statistics.sqlite3_queries_failed_busy, __ATOMIC_RELAXED); gs->sqlite3_queries_failed_locked = __atomic_load_n(&global_statistics.sqlite3_queries_failed_locked, __ATOMIC_RELAXED); gs->sqlite3_rows = __atomic_load_n(&global_statistics.sqlite3_rows, __ATOMIC_RELAXED); + + gs->rrdcontext_metric_triggers = __atomic_load_n(&global_statistics.rrdcontext_metric_triggers, __ATOMIC_RELAXED); + gs->rrdcontext_instance_triggers = __atomic_load_n(&global_statistics.rrdcontext_instance_triggers, __ATOMIC_RELAXED); + gs->rrdcontext_context_triggers = __atomic_load_n(&global_statistics.rrdcontext_context_triggers, __ATOMIC_RELAXED); } static void global_statistics_charts(void) { @@ -579,6 +600,42 @@ static void global_statistics_charts(void) { rrdset_done(st_sqlite3_rows); } + + // ---------------------------------------------------------------- + + if(gs.rrdcontext_context_triggers || gs.rrdcontext_instance_triggers || gs.rrdcontext_metric_triggers) { + static RRDSET *st_rrdcontext_triggers = NULL; + static RRDDIM *rd_context = NULL, *rd_instance = NULL, *rd_metric = NULL; + + if (unlikely(!st_rrdcontext_triggers)) { + st_rrdcontext_triggers = rrdset_create_localhost( + "netdata" + , "rrdcontext_triggers" + , NULL + , "rrdcontext" + , NULL + , "Netdata RRD context triggers" + , "triggers/s" + , "netdata" + , "stats" + , 131200 + , localhost->rrd_update_every + , RRDSET_TYPE_LINE + ); + + rd_metric = rrddim_add(st_rrdcontext_triggers, "metric", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + rd_instance = rrddim_add(st_rrdcontext_triggers, "instance", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + rd_context = rrddim_add(st_rrdcontext_triggers, "context", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + } + else + rrdset_next(st_rrdcontext_triggers); + + rrddim_set_by_pointer(st_rrdcontext_triggers, rd_metric, (collected_number)gs.rrdcontext_metric_triggers); + rrddim_set_by_pointer(st_rrdcontext_triggers, rd_instance, (collected_number)gs.rrdcontext_instance_triggers); + rrddim_set_by_pointer(st_rrdcontext_triggers, rd_context, (collected_number)gs.rrdcontext_context_triggers); + + rrdset_done(st_rrdcontext_triggers); + } } static void dbengine_statistics_charts(void) { @@ -988,6 +1045,93 @@ static void dbengine_statistics_charts(void) { #endif } +static void update_strings_charts() { + static RRDSET *st_ops = NULL, *st_entries = NULL, *st_mem = NULL; + static RRDDIM *rd_ops_inserts = NULL, *rd_ops_deletes = NULL, *rd_ops_searches = NULL, *rd_ops_duplications = NULL, *rd_ops_releases = NULL; + static RRDDIM *rd_entries_entries = NULL, *rd_entries_refs = NULL; + static RRDDIM *rd_mem = NULL; + + size_t inserts, deletes, searches, entries, references, memory, duplications, releases; + + string_statistics(&inserts, &deletes, &searches, &entries, &references, &memory, &duplications, &releases); + + if (unlikely(!st_ops)) { + st_ops = rrdset_create_localhost( + "netdata" + , "strings_ops" + , NULL + , "strings" + , NULL + , "Strings operations" + , "ops/s" + , "netdata" + , "stats" + , 910000 + , localhost->rrd_update_every + , RRDSET_TYPE_LINE); + + rd_ops_inserts = rrddim_add(st_ops, "inserts", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + rd_ops_deletes = rrddim_add(st_ops, "deletes", NULL, -1, 1, RRD_ALGORITHM_INCREMENTAL); + rd_ops_searches = rrddim_add(st_ops, "searches", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + rd_ops_duplications = rrddim_add(st_ops, "duplications", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + rd_ops_releases = rrddim_add(st_ops, "releases", NULL, -1, 1, RRD_ALGORITHM_INCREMENTAL); + } else + rrdset_next(st_ops); + + rrddim_set_by_pointer(st_ops, rd_ops_inserts, (collected_number)inserts); + rrddim_set_by_pointer(st_ops, rd_ops_deletes, (collected_number)deletes); + rrddim_set_by_pointer(st_ops, rd_ops_searches, (collected_number)searches); + rrddim_set_by_pointer(st_ops, rd_ops_duplications, (collected_number)duplications); + rrddim_set_by_pointer(st_ops, rd_ops_releases, (collected_number)releases); + rrdset_done(st_ops); + + if (unlikely(!st_entries)) { + st_entries = rrdset_create_localhost( + "netdata" + , "strings_entries" + , NULL + , "strings" + , NULL + , "Strings entries" + , "entries" + , "netdata" + , "stats" + , 910001 + , localhost->rrd_update_every + , RRDSET_TYPE_AREA); + + rd_entries_entries = rrddim_add(st_entries, "entries", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE); + rd_entries_refs = rrddim_add(st_entries, "references", NULL, 1, -1, RRD_ALGORITHM_ABSOLUTE); + } else + rrdset_next(st_entries); + + rrddim_set_by_pointer(st_entries, rd_entries_entries, (collected_number)entries); + rrddim_set_by_pointer(st_entries, rd_entries_refs, (collected_number)references); + rrdset_done(st_entries); + + if (unlikely(!st_mem)) { + st_mem = rrdset_create_localhost( + "netdata" + , "strings_memory" + , NULL + , "strings" + , NULL + , "Strings memory" + , "bytes" + , "netdata" + , "stats" + , 910001 + , localhost->rrd_update_every + , RRDSET_TYPE_AREA); + + rd_mem = rrddim_add(st_mem, "memory", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE); + } else + rrdset_next(st_mem); + + rrddim_set_by_pointer(st_mem, rd_mem, (collected_number)memory); + rrdset_done(st_mem); +} + static void update_heartbeat_charts() { static RRDSET *st_heartbeat = NULL; static RRDDIM *rd_heartbeat_min = NULL; @@ -1032,13 +1176,26 @@ static void update_heartbeat_charts() { #define WORKERS_MIN_PERCENT_DEFAULT 10000.0 -struct worker_job_type { - char name[WORKER_UTILIZATION_MAX_JOB_NAME_LENGTH + 1]; +struct worker_job_type_gs { + STRING *name; + STRING *units; + size_t jobs_started; usec_t busy_time; RRDDIM *rd_jobs_started; RRDDIM *rd_busy_time; + + WORKER_METRIC_TYPE type; + NETDATA_DOUBLE min_value; + NETDATA_DOUBLE max_value; + NETDATA_DOUBLE sum_value; + size_t count_value; + + RRDSET *st; + RRDDIM *rd_min; + RRDDIM *rd_max; + RRDDIM *rd_avg; }; struct worker_thread { @@ -1071,7 +1228,7 @@ struct worker_utilization { char *name_lowercase; - struct worker_job_type per_job_type[WORKER_UTILIZATION_MAX_JOB_TYPES]; + struct worker_job_type_gs per_job_type[WORKER_UTILIZATION_MAX_JOB_TYPES]; size_t workers_registered; size_t workers_busy; @@ -1170,14 +1327,16 @@ static void workers_total_cpu_utilization_chart(void) { if(!wu->workers_cpu_registered) continue; if(!wu->rd_total_cpu_utilizaton) - wu->rd_total_cpu_utilizaton = rrddim_add(st, wu->name_lowercase, NULL, 1, 10000ULL, RRD_ALGORITHM_ABSOLUTE); + wu->rd_total_cpu_utilizaton = rrddim_add(st, wu->name_lowercase, NULL, 1, 100, RRD_ALGORITHM_ABSOLUTE); - rrddim_set_by_pointer(st, wu->rd_total_cpu_utilizaton, (collected_number)((double)wu->workers_cpu_total * 10000.0)); + rrddim_set_by_pointer(st, wu->rd_total_cpu_utilizaton, (collected_number)((double)wu->workers_cpu_total * 100.0)); } rrdset_done(st); } +#define WORKER_CHART_DECIMAL_PRECISION 100 + static void workers_utilization_update_chart(struct worker_utilization *wu) { if(!wu->workers_registered) return; @@ -1215,28 +1374,28 @@ static void workers_utilization_update_chart(struct worker_utilization *wu) { // we add the min and max dimensions only when we have multiple workers if(unlikely(!wu->rd_workers_time_min && wu->workers_registered > 1)) - wu->rd_workers_time_min = rrddim_add(wu->st_workers_time, "min", NULL, 1, 10000, RRD_ALGORITHM_ABSOLUTE); + wu->rd_workers_time_min = rrddim_add(wu->st_workers_time, "min", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE); if(unlikely(!wu->rd_workers_time_max && wu->workers_registered > 1)) - wu->rd_workers_time_max = rrddim_add(wu->st_workers_time, "max", NULL, 1, 10000, RRD_ALGORITHM_ABSOLUTE); + wu->rd_workers_time_max = rrddim_add(wu->st_workers_time, "max", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE); if(unlikely(!wu->rd_workers_time_avg)) - wu->rd_workers_time_avg = rrddim_add(wu->st_workers_time, "average", NULL, 1, 10000, RRD_ALGORITHM_ABSOLUTE); + wu->rd_workers_time_avg = rrddim_add(wu->st_workers_time, "average", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE); rrdset_next(wu->st_workers_time); if(unlikely(wu->workers_min_busy_time == WORKERS_MIN_PERCENT_DEFAULT)) wu->workers_min_busy_time = 0.0; if(wu->rd_workers_time_min) - rrddim_set_by_pointer(wu->st_workers_time, wu->rd_workers_time_min, (collected_number)((double)wu->workers_min_busy_time * 10000.0)); + rrddim_set_by_pointer(wu->st_workers_time, wu->rd_workers_time_min, (collected_number)((double)wu->workers_min_busy_time * WORKER_CHART_DECIMAL_PRECISION)); if(wu->rd_workers_time_max) - rrddim_set_by_pointer(wu->st_workers_time, wu->rd_workers_time_max, (collected_number)((double)wu->workers_max_busy_time * 10000.0)); + rrddim_set_by_pointer(wu->st_workers_time, wu->rd_workers_time_max, (collected_number)((double)wu->workers_max_busy_time * WORKER_CHART_DECIMAL_PRECISION)); if(wu->workers_total_duration == 0) rrddim_set_by_pointer(wu->st_workers_time, wu->rd_workers_time_avg, 0); else - rrddim_set_by_pointer(wu->st_workers_time, wu->rd_workers_time_avg, (collected_number)((double)wu->workers_total_busy_time * 100.0 * 10000.0 / (double)wu->workers_total_duration)); + rrddim_set_by_pointer(wu->st_workers_time, wu->rd_workers_time_avg, (collected_number)((double)wu->workers_total_busy_time * 100.0 * WORKER_CHART_DECIMAL_PRECISION / (double)wu->workers_total_duration)); rrdset_done(wu->st_workers_time); @@ -1268,28 +1427,28 @@ static void workers_utilization_update_chart(struct worker_utilization *wu) { } if (unlikely(!wu->rd_workers_cpu_min && wu->workers_registered > 1)) - wu->rd_workers_cpu_min = rrddim_add(wu->st_workers_cpu, "min", NULL, 1, 10000ULL, RRD_ALGORITHM_ABSOLUTE); + wu->rd_workers_cpu_min = rrddim_add(wu->st_workers_cpu, "min", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE); if (unlikely(!wu->rd_workers_cpu_max && wu->workers_registered > 1)) - wu->rd_workers_cpu_max = rrddim_add(wu->st_workers_cpu, "max", NULL, 1, 10000ULL, RRD_ALGORITHM_ABSOLUTE); + wu->rd_workers_cpu_max = rrddim_add(wu->st_workers_cpu, "max", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE); if(unlikely(!wu->rd_workers_cpu_avg)) - wu->rd_workers_cpu_avg = rrddim_add(wu->st_workers_cpu, "average", NULL, 1, 10000ULL, RRD_ALGORITHM_ABSOLUTE); + wu->rd_workers_cpu_avg = rrddim_add(wu->st_workers_cpu, "average", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE); rrdset_next(wu->st_workers_cpu); if(unlikely(wu->workers_cpu_min == WORKERS_MIN_PERCENT_DEFAULT)) wu->workers_cpu_min = 0.0; if(wu->rd_workers_cpu_min) - rrddim_set_by_pointer(wu->st_workers_cpu, wu->rd_workers_cpu_min, (collected_number)(wu->workers_cpu_min * 10000ULL)); + rrddim_set_by_pointer(wu->st_workers_cpu, wu->rd_workers_cpu_min, (collected_number)(wu->workers_cpu_min * WORKER_CHART_DECIMAL_PRECISION)); if(wu->rd_workers_cpu_max) - rrddim_set_by_pointer(wu->st_workers_cpu, wu->rd_workers_cpu_max, (collected_number)(wu->workers_cpu_max * 10000ULL)); + rrddim_set_by_pointer(wu->st_workers_cpu, wu->rd_workers_cpu_max, (collected_number)(wu->workers_cpu_max * WORKER_CHART_DECIMAL_PRECISION)); if(wu->workers_cpu_registered == 0) rrddim_set_by_pointer(wu->st_workers_cpu, wu->rd_workers_cpu_avg, 0); else - rrddim_set_by_pointer(wu->st_workers_cpu, wu->rd_workers_cpu_avg, (collected_number)( wu->workers_cpu_total * 10000ULL / (NETDATA_DOUBLE)wu->workers_cpu_registered )); + rrddim_set_by_pointer(wu->st_workers_cpu, wu->rd_workers_cpu_avg, (collected_number)( wu->workers_cpu_total * WORKER_CHART_DECIMAL_PRECISION / (NETDATA_DOUBLE)wu->workers_cpu_registered )); rrdset_done(wu->st_workers_cpu); } @@ -1325,10 +1484,13 @@ static void workers_utilization_update_chart(struct worker_utilization *wu) { { size_t i; for(i = 0; i < WORKER_UTILIZATION_MAX_JOB_TYPES ;i++) { - if (wu->per_job_type[i].name[0]) { + if(unlikely(wu->per_job_type[i].type != WORKER_METRIC_IDLE_BUSY)) + continue; + + if (wu->per_job_type[i].name) { if(unlikely(!wu->per_job_type[i].rd_jobs_started)) - wu->per_job_type[i].rd_jobs_started = rrddim_add(wu->st_workers_jobs_per_job_type, wu->per_job_type[i].name, NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE); + wu->per_job_type[i].rd_jobs_started = rrddim_add(wu->st_workers_jobs_per_job_type, string2str(wu->per_job_type[i].name), NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE); rrddim_set_by_pointer(wu->st_workers_jobs_per_job_type, wu->per_job_type[i].rd_jobs_started, (collected_number)(wu->per_job_type[i].jobs_started)); } @@ -1367,10 +1529,13 @@ static void workers_utilization_update_chart(struct worker_utilization *wu) { { size_t i; for(i = 0; i < WORKER_UTILIZATION_MAX_JOB_TYPES ;i++) { - if (wu->per_job_type[i].name[0]) { + if(unlikely(wu->per_job_type[i].type != WORKER_METRIC_IDLE_BUSY)) + continue; + + if (wu->per_job_type[i].name) { if(unlikely(!wu->per_job_type[i].rd_busy_time)) - wu->per_job_type[i].rd_busy_time = rrddim_add(wu->st_workers_busy_per_job_type, wu->per_job_type[i].name, NULL, 1, USEC_PER_MS, RRD_ALGORITHM_ABSOLUTE); + wu->per_job_type[i].rd_busy_time = rrddim_add(wu->st_workers_busy_per_job_type, string2str(wu->per_job_type[i].name), NULL, 1, USEC_PER_MS, RRD_ALGORITHM_ABSOLUTE); rrddim_set_by_pointer(wu->st_workers_busy_per_job_type, wu->per_job_type[i].rd_busy_time, (collected_number)(wu->per_job_type[i].busy_time)); } @@ -1414,6 +1579,122 @@ static void workers_utilization_update_chart(struct worker_utilization *wu) { rrddim_set_by_pointer(wu->st_workers_threads, wu->rd_workers_threads_busy, (collected_number)(wu->workers_busy)); rrdset_done(wu->st_workers_threads); } + + // ---------------------------------------------------------------------- + // custom metric types WORKER_METRIC_ABSOLUTE + + { + size_t i; + for (i = 0; i < WORKER_UTILIZATION_MAX_JOB_TYPES; i++) { + if(wu->per_job_type[i].type != WORKER_METRIC_ABSOLUTE) + continue; + + if(!wu->per_job_type[i].count_value) + continue; + + if(!wu->per_job_type[i].st) { + size_t job_name_len = string_strlen(wu->per_job_type[i].name); + if(job_name_len > RRD_ID_LENGTH_MAX) job_name_len = RRD_ID_LENGTH_MAX; + + char job_name_sanitized[job_name_len + 1]; + rrdset_strncpyz_name(job_name_sanitized, string2str(wu->per_job_type[i].name), job_name_len); + + char name[RRD_ID_LENGTH_MAX + 1]; + snprintfz(name, RRD_ID_LENGTH_MAX, "workers_%s_value_%s", wu->name_lowercase, job_name_sanitized); + + char context[RRD_ID_LENGTH_MAX + 1]; + snprintf(context, RRD_ID_LENGTH_MAX, "netdata.workers.%s.value.%s", wu->name_lowercase, job_name_sanitized); + + char title[1000 + 1]; + snprintf(title, 1000, "Netdata Workers %s Value of %s", wu->name_lowercase, string2str(wu->per_job_type[i].name)); + + wu->per_job_type[i].st = rrdset_create_localhost( + "netdata" + , name + , NULL + , wu->family + , context + , title + , (wu->per_job_type[i].units)?string2str(wu->per_job_type[i].units):"value" + , "netdata" + , "stats" + , wu->priority + 5 + , localhost->rrd_update_every + , RRDSET_TYPE_LINE + ); + + wu->per_job_type[i].rd_min = rrddim_add(wu->per_job_type[i].st, "min", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE); + wu->per_job_type[i].rd_max = rrddim_add(wu->per_job_type[i].st, "max", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE); + wu->per_job_type[i].rd_avg = rrddim_add(wu->per_job_type[i].st, "average", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE); + } + else + rrdset_next(wu->per_job_type[i].st); + + rrddim_set_by_pointer(wu->per_job_type[i].st, wu->per_job_type[i].rd_min, (collected_number)(wu->per_job_type[i].min_value * WORKER_CHART_DECIMAL_PRECISION)); + rrddim_set_by_pointer(wu->per_job_type[i].st, wu->per_job_type[i].rd_max, (collected_number)(wu->per_job_type[i].max_value * WORKER_CHART_DECIMAL_PRECISION)); + rrddim_set_by_pointer(wu->per_job_type[i].st, wu->per_job_type[i].rd_avg, (collected_number)(wu->per_job_type[i].sum_value / wu->per_job_type[i].count_value * WORKER_CHART_DECIMAL_PRECISION)); + + rrdset_done(wu->per_job_type[i].st); + } + } + + // ---------------------------------------------------------------------- + // custom metric types WORKER_METRIC_INCREMENTAL + + { + size_t i; + for (i = 0; i < WORKER_UTILIZATION_MAX_JOB_TYPES; i++) { + if(wu->per_job_type[i].type != WORKER_METRIC_INCREMENTAL) + continue; + + if(!wu->per_job_type[i].count_value) + continue; + + if(!wu->per_job_type[i].st) { + size_t job_name_len = string_strlen(wu->per_job_type[i].name); + if(job_name_len > RRD_ID_LENGTH_MAX) job_name_len = RRD_ID_LENGTH_MAX; + + char job_name_sanitized[job_name_len + 1]; + rrdset_strncpyz_name(job_name_sanitized, string2str(wu->per_job_type[i].name), job_name_len); + + char name[RRD_ID_LENGTH_MAX + 1]; + snprintfz(name, RRD_ID_LENGTH_MAX, "workers_%s_rate_%s", wu->name_lowercase, job_name_sanitized); + + char context[RRD_ID_LENGTH_MAX + 1]; + snprintf(context, RRD_ID_LENGTH_MAX, "netdata.workers.%s.rate.%s", wu->name_lowercase, job_name_sanitized); + + char title[1000 + 1]; + snprintf(title, 1000, "Netdata Workers %s Rate of %s", wu->name_lowercase, string2str(wu->per_job_type[i].name)); + + wu->per_job_type[i].st = rrdset_create_localhost( + "netdata" + , name + , NULL + , wu->family + , context + , title + , (wu->per_job_type[i].units)?string2str(wu->per_job_type[i].units):"rate" + , "netdata" + , "stats" + , wu->priority + 5 + , localhost->rrd_update_every + , RRDSET_TYPE_LINE + ); + + wu->per_job_type[i].rd_min = rrddim_add(wu->per_job_type[i].st, "min", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE); + wu->per_job_type[i].rd_max = rrddim_add(wu->per_job_type[i].st, "max", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE); + wu->per_job_type[i].rd_avg = rrddim_add(wu->per_job_type[i].st, "average", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE); + } + else + rrdset_next(wu->per_job_type[i].st); + + rrddim_set_by_pointer(wu->per_job_type[i].st, wu->per_job_type[i].rd_min, (collected_number)(wu->per_job_type[i].min_value * WORKER_CHART_DECIMAL_PRECISION)); + rrddim_set_by_pointer(wu->per_job_type[i].st, wu->per_job_type[i].rd_max, (collected_number)(wu->per_job_type[i].max_value * WORKER_CHART_DECIMAL_PRECISION)); + rrddim_set_by_pointer(wu->per_job_type[i].st, wu->per_job_type[i].rd_avg, (collected_number)(wu->per_job_type[i].sum_value / wu->per_job_type[i].count_value * WORKER_CHART_DECIMAL_PRECISION)); + + rrdset_done(wu->per_job_type[i].st); + } + } } static void workers_utilization_reset_statistics(struct worker_utilization *wu) { @@ -1440,6 +1721,11 @@ static void workers_utilization_reset_statistics(struct worker_utilization *wu) wu->per_job_type[i].jobs_started = 0; wu->per_job_type[i].busy_time = 0; + + wu->per_job_type[i].min_value = NAN; + wu->per_job_type[i].max_value = NAN; + wu->per_job_type[i].sum_value = NAN; + wu->per_job_type[i].count_value = 0; } struct worker_thread *wt; @@ -1522,7 +1808,20 @@ static struct worker_thread *worker_thread_find_or_create(struct worker_utilizat return wt; } -static void worker_utilization_charts_callback(void *ptr, pid_t pid __maybe_unused, const char *thread_tag __maybe_unused, size_t utilization_usec __maybe_unused, size_t duration_usec __maybe_unused, size_t jobs_started __maybe_unused, size_t is_running __maybe_unused, const char **job_types_names __maybe_unused, size_t *job_types_jobs_started __maybe_unused, usec_t *job_types_busy_time __maybe_unused) { +static void worker_utilization_charts_callback(void *ptr + , pid_t pid __maybe_unused + , const char *thread_tag __maybe_unused + , size_t utilization_usec __maybe_unused + , size_t duration_usec __maybe_unused + , size_t jobs_started __maybe_unused + , size_t is_running __maybe_unused + , STRING **job_types_names __maybe_unused + , STRING **job_types_units __maybe_unused + , WORKER_METRIC_TYPE *job_types_metric_types __maybe_unused + , size_t *job_types_jobs_started __maybe_unused + , usec_t *job_types_busy_time __maybe_unused + , NETDATA_DOUBLE *job_types_custom_metrics __maybe_unused + ) { struct worker_utilization *wu = (struct worker_utilization *)ptr; // find the worker_thread in the list @@ -1552,12 +1851,32 @@ static void worker_utilization_charts_callback(void *ptr, pid_t pid __maybe_unus // accumulate per job type statistics size_t i; for(i = 0; i < WORKER_UTILIZATION_MAX_JOB_TYPES ;i++) { + if(!wu->per_job_type[i].name && job_types_names[i]) + wu->per_job_type[i].name = string_dup(job_types_names[i]); + + if(!wu->per_job_type[i].units && job_types_units[i]) + wu->per_job_type[i].units = string_dup(job_types_units[i]); + + wu->per_job_type[i].type = job_types_metric_types[i]; + wu->per_job_type[i].jobs_started += job_types_jobs_started[i]; wu->per_job_type[i].busy_time += job_types_busy_time[i]; - // new job type found - if(unlikely(!wu->per_job_type[i].name[0] && job_types_names[i])) - strncpyz(wu->per_job_type[i].name, job_types_names[i], WORKER_UTILIZATION_MAX_JOB_NAME_LENGTH); + NETDATA_DOUBLE value = job_types_custom_metrics[i]; + if(netdata_double_isnumber(value)) { + if(!wu->per_job_type[i].count_value) { + wu->per_job_type[i].count_value = 1; + wu->per_job_type[i].min_value = value; + wu->per_job_type[i].max_value = value; + wu->per_job_type[i].sum_value = value; + } + else { + wu->per_job_type[i].count_value++; + wu->per_job_type[i].sum_value += value; + if(value < wu->per_job_type[i].min_value) wu->per_job_type[i].min_value = value; + if(value > wu->per_job_type[i].max_value) wu->per_job_type[i].max_value = value; + } + } } // find its CPU utilization @@ -1607,6 +1926,14 @@ static void worker_utilization_finish(void) { wu->name_lowercase = NULL; } + for(i = 0; i < WORKER_UTILIZATION_MAX_JOB_TYPES ;i++) { + string_freez(wu->per_job_type[i].name); + wu->per_job_type[i].name = NULL; + + string_freez(wu->per_job_type[i].units); + wu->per_job_type[i].units = NULL; + } + // mark all threads as not enabled struct worker_thread *t; for(t = wu->threads; t ; t = t->next) t->enabled = 0; @@ -1639,6 +1966,7 @@ void *global_statistics_main(void *ptr) worker_register_job_name(WORKER_JOB_REGISTRY, "registry"); worker_register_job_name(WORKER_JOB_WORKERS, "workers"); worker_register_job_name(WORKER_JOB_DBENGINE, "dbengine"); + worker_register_job_name(WORKER_JOB_STRINGS, "strings"); netdata_thread_cleanup_push(global_statistics_cleanup, ptr); @@ -1673,6 +2001,9 @@ void *global_statistics_main(void *ptr) worker_is_busy(WORKER_JOB_HEARTBEAT); update_heartbeat_charts(); + + worker_is_busy(WORKER_JOB_STRINGS); + update_strings_charts(); } netdata_thread_cleanup_pop(1); diff --git a/daemon/global_statistics.h b/daemon/global_statistics.h index 8d4a63d085..8f50a9087c 100644 --- a/daemon/global_statistics.h +++ b/daemon/global_statistics.h @@ -12,6 +12,10 @@ extern void rrdr_query_completed(uint64_t db_points_read, uint64_t result_points extern void sqlite3_query_completed(bool success, bool busy, bool locked); extern void sqlite3_row_completed(void); +extern void rrdcontext_triggered_update_on_rrdmetric(void); +extern void rrdcontext_triggered_update_on_rrdinstance(void); +extern void rrdcontext_triggered_update_on_rrdcontext(void); + extern void finished_web_request_statistics(uint64_t dt, uint64_t bytes_received, uint64_t bytes_sent, diff --git a/daemon/unit_test.c b/daemon/unit_test.c index 33df4da357..a2cff95493 100644 --- a/daemon/unit_test.c +++ b/daemon/unit_test.c @@ -1197,12 +1197,12 @@ int run_test(struct test *test) fprintf(stderr, " > %s: feeding position %lu\n", test->name, c+1); } - fprintf(stderr, " >> %s with value " COLLECTED_NUMBER_FORMAT "\n", rd->name, test->feed[c].value); + fprintf(stderr, " >> %s with value " COLLECTED_NUMBER_FORMAT "\n", rrddim_name(rd), test->feed[c].value); rrddim_set(st, "dim1", test->feed[c].value); last = test->feed[c].value; if(rd2) { - fprintf(stderr, " >> %s with value " COLLECTED_NUMBER_FORMAT "\n", rd2->name, test->feed2[c]); + fprintf(stderr, " >> %s with value " COLLECTED_NUMBER_FORMAT "\n", rrddim_name(rd2), test->feed2[c]); rrddim_set(st, "dim2", test->feed2[c]); } @@ -1231,7 +1231,7 @@ int run_test(struct test *test) int same = (roundndd(v * 10000000.0) == roundndd(n * 10000000.0))?1:0; fprintf(stderr, " %s/%s: checking position %lu (at %"PRId64" secs), expecting value " NETDATA_DOUBLE_FORMAT ", found " NETDATA_DOUBLE_FORMAT ", %s\n", - test->name, rd->name, c+1, + test->name, rrddim_name(rd), c+1, (int64_t)((rrdset_first_entry_t(st) + c * st->update_every) - time_start), n, v, (same)?"OK":"### E R R O R ###"); @@ -1243,7 +1243,7 @@ int run_test(struct test *test) same = (roundndd(v * 10000000.0) == roundndd(n * 10000000.0))?1:0; fprintf(stderr, " %s/%s: checking position %lu (at %"PRId64" secs), expecting value " NETDATA_DOUBLE_FORMAT ", found " NETDATA_DOUBLE_FORMAT ", %s\n", - test->name, rd2->name, c+1, + test->name, rrddim_name(rd2), c+1, (int64_t)((rrdset_first_entry_t(st) + c * st->update_every) - time_start), n, v, (same)?"OK":"### E R R O R ###"); if(!same) errors++; @@ -1258,39 +1258,39 @@ static int test_variable_renames(void) { fprintf(stderr, "Creating chart\n"); RRDSET *st = rrdset_create_localhost("chart", "ID", NULL, "family", "context", "Unit Testing", "a value", "unittest", NULL, 1, 1, RRDSET_TYPE_LINE); - fprintf(stderr, "Cre |