summaryrefslogtreecommitdiffstats
path: root/aclk
AgeCommit message (Collapse)Author
2022-10-28Remove option to use MQTT 3 (#13824)Timotej S
* remove mqtt3 support
2022-10-24app to api netdata cloud (#13856)Timotej S
2022-10-19Health thread per host (#13712)Emmanuel Vasilakis
* Rebased * rebased * health_execute_pending_updates -> health_execute_delayed_initializations * fix labels for current host only * missing bracket * misc fixes, reload health for disconnected hosts * remove volatile, add comment
2022-10-17Inject costallocz to mqtt_websockets library and its children (#13813)Timotej S
* use mallocz, freez & family also from within the mqtt libs
2022-10-13Count currently streaming senders on the localhost (#13755)Emmanuel Vasilakis
rebased
2022-10-13allow disabling netdata monitoring section of the dashboard (#13788)Costa Tsaousis
* allow disabling netdata monitoring section of the dashboard * disable-netdata-stats: Modify eBPF.plugin to disable statistic charts according user selection, by default it is enabled * Don't send internal statistics for exporting engine if it's disabled. * Fix global statistics flag initialization * Don't send internal statistics for checks plugin if it's disabled. Co-authored-by: Thiago Marques <thiagoftsm@gmail.com> Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-10-13dbengine free from RRDSET and RRDDIM (#13772)Costa Tsaousis
* dbengine free from RRDSET and RRDDIM * fix for excess parameters to query ops * add comment about ML * update_every from int to uint32_t * rrddim_mem storage engine working * fixes for update_every_s * working dbengine * a lot of changes in dbengine regarding timestamps * better logging of not sequential points * rrdset_done() now gives aligned timestamps for higher tiers * dont change the end_time of descriptors, because they cant be loaded back * fixes for cmake * fixes for db mode ram * Global counters for dbengine loading errors. Ensure dbengine store metrics always has aligned metrics or breaks the page when storing new data. * update lgtm config * fixes for 32-bit systems * update unittests * Don't try to find and create a host on the fly if not already in memory * Remove unused functions * print backtrace in case of fatal * always set ctx to page_index * detect ctx and metric uuid discrepancies * use legacy uuid if multihost is not available * fix for last commit * prevent repeating log * Do not try to access archived charts when executing a data query * Remove unused function * log inconsistent collections once every 10 mins Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
2022-10-11fix warning when -Wfree-nonheap-object is used (#13805)Timotej S
2022-10-09full memory tracking and profiling of Netdata Agent (#13789)Costa Tsaousis
* full memory tracking and profiling of Netdata Agent * initialize dbengine only when it is needed * handling of dbengine compiled but not available * restore unittest * restore unittest again * more improvements about ifdef dbengine * fix compilation when dbengine is not enabled * check if dbengine is enabled on exit * call freez() not free() * aral unittest * internal checks activate trace allocations; dev mode activates internal checks
2022-10-06Bump websockets submodule (#13776)Timotej S
* update mqtt websockets submodule and reflect API changes
2022-10-05Allow netdata plugins to expose functions for querying more information ↵Costa Tsaousis
about specific charts (#13720) * function renames and code cleanup in popen.c; no actual code changes * netdata popen() now opens both child process stdin and stdout and returns FILE * for both * pass both input and output to parser structures * updated rrdset to call custom functions * RRDSET FUNCTION leading calls for both sync and async operation * put RRDSET functions to a separate file * added format and timeout at function definition * support for synchronous (internal plugins) and asynchronous (external plugins and children) functions * /api/v1/function endpoint * functions are now attached to the host and there is a dictionary view per chart * functions implemented at plugins.d * remove the defer until keyword hook from plugins.d when it is done * stream sender implementation of functions * sanitization of all functions so that certain characters are only allowed * strictier sanitization * common max size * 1st working plugins.d example * always init inflight dictionary * properly destroy dictionaries to avoid parallel insertion of items * add more debugging on disconnection reasons * add more debugging on disconnection reasons again * streaming receiver respects newlines * dont use the same fp for both streaming receive and send * dont free dbengine memory with internal checks * make sender proceed in the buffer * added timing info and garbage collection at plugins.d * added info about routing nodes * added info about routing nodes with delay * added more info about delays * added more info about delays again * signal sending thread to wake up * streaming version labeling and commented code to support capabilities * added functions to /api/v1/data, /api/v1/charts, /api/v1/chart, /api/v1/info * redirect top output to stdout * address coverity findings * fix resource leaks of popen * log attempts to connect to individual destinations * better messages * properly parse destinations * try to find a function from the most matching to the least matching * log added streaming destinations * rotate destinations bypassing a node in the middle that does not accept our connection * break the loops properly * use typedef to define callbacks * capabilities negotiation during streaming * functions exposed upstream based on capabilities; compression disabled per node persisting reconnects; always try to connect with all capabilities * restore functionality to lookup functions * better logging of capabilities * remove old versions from capabilities when a newer version is there * fix formatting * optimization for plugins.d rrdlabels to avoid creating and destructing dictionaries all the time * delayed health initialization for rrddim and rrdset * cleanup health initialization * fix for popen() not returning the right value * add health worker jobs for initializing rrdset and rrddim * added content type support for functions; apps.plugin permanent function to display all the processes * fixes for functions parameters parsing in apps.plugin * fix for process matching in apps.plugiin * first working function for apps.plugin * Dashboard ACL is disabled for functions; Function errors are all in JSON format * apps.plugin function processes returns json table * use json_escape_string() to escape message * fix formatting * apps.plugin exposes all its metrics to function processes * fix json formatting when filtering out some rows * reopen the internal pipe of rrdpush in case of errors * misplaced statement * do not use buffer->len * support for GLOBAL functions (functions that are not linked to a chart * added /api/v1/functions endpoint; removed format from the FUNCTIONS api; * swagger documentation about the new api end points * added plugins.d documentation about functions * never re-close a file * remove uncessesary ifdef * fixed issues identified by codacy * fix for null label value * make edit-config copy-and-paste friendly * Revert "make edit-config copy-and-paste friendly" This reverts commit 54500c0e0a97f65a0c66c4d34e966f6a9056698e. * reworked sender handshake to fix coverity findings * timeout is zero, for both send_timeout() and recv_timeout() * properly detect that parent closed the socket * support caching of function responses; limit function response to 10MB; added protection from malformed function responses * disabled excessive logging * added units to apps.plugin function processes and normalized all values to be human readable * shorter field names * fixed issues reported * fixed apps.plugin error response; tested that pluginsd can properly handle faulty responses * use double linked list macros for double linked list management * faster apps.plugin function printing by minimizing file operations * added memory percentage * fix compatibility issues with older compilers and FreeBSD * rrdpush sender code cleanup; rrhost structure cleanup from sender flags and variables; * fix letftover variable in ifdef * apps.plugin: do not call detach from the thread; exit immediately when input is broken * exclude AR charts from health * flush cleaner; prefer sender output * clarity * do not fill the cbuffer if not connected * fix * dont enabled host->sender if streaming is not enabled; send host label updates to parent; * functions are only available through ACLK * Prepared statement reports only in dev mode * fix AR chart detection * fix for streaming not being enabling itself * more cleanup of sender and receiver structures * moved read-only flags and configuration options to rrdhost->options * fixed merge with master * fix for incomplete rename * prevent service thread from working on charts that are being collected Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
2022-09-27Remove Chart/Dim based communication (#13650)Timotej S
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
2022-09-07Obsolete RRDSET state (#13635)Costa Tsaousis
* move chart_labels to rrdset * rename chart_labels to rrdlabels * renamed hash_id to uuid * turned is_ar_chart into an rrdset flag * removed rrdset state * removed unused senders_connected member of rrdhost * removed unused host flag RRDHOST_FLAG_MULTIHOST * renamed rrdhost host_labels to rrdlabels * Update exporting unit tests Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-09-05Deduplicate all netdata strings (#13570)Costa Tsaousis
* rrdfamily * rrddim * rrdset plugin and module names * rrdset units * rrdset type * rrdset family * rrdset title * rrdset title more * rrdset context * rrdcalctemplate context and removal of context hash from rrdset * strings statistics * rrdset name * rearranged members of rrdset * eliminate rrdset name hash; rrdcalc chart converted to STRING * rrdset id, eliminated rrdset hash * rrdcalc, alarm_entry, alert_config and some of rrdcalctemplate * rrdcalctemplate * rrdvar * eval_variable * rrddimvar and rrdsetvar * rrdhost hostname, os and tags * fix master commits * added thread cache; implemented string_dup without locks * faster thread cache * rrdset and rrddim now use dictionaries for indexing * rrdhost now uses dictionary * rrdfamily now uses DICTIONARY * rrdvar using dictionary instead of AVL * allocate the right size to rrdvar flag members * rrdhost remaining char * members to STRING * * better error handling on indexing * strings now use a read/write lock to allow parallel searches to the index * removed AVL support from dictionaries; implemented STRING with native Judy calls * string releases should be negative * only 31 bits are allowed for enum flags * proper locking on strings * string threading unittest and fixes * fix lgtm finding * fixed naming * stream chart/dimension definitions at the beginning of a streaming session * thread stack variable is undefined on thread cancel * rrdcontext garbage collect per host on startup * worker control in garbage collection * relaxed deletion of rrdmetrics * type checking on dictfe * netdata chart to monitor rrdcontext triggers * Group chart label updates * rrdcontext better handling of collected rrdsets * rrdpush incremental transmition of definitions should use as much buffer as possible * require 1MB per chart * empty the sender buffer before enabling metrics streaming * fill up to 50% of buffer * reset signaling metrics sending * use the shared variable for status * use separate host flag for enabling streaming of metrics * make sure the flag is clear * add logging for streaming * add logging for streaming on buffer overflow * circular_buffer proper sizing * removed obsolete logs * do not execute worker jobs if not necessary * better messages about compression disabling * proper use of flags and updating rrdset last access time every time the obsoletion flag is flipped * monitor stream sender used buffer ratio * Update exporting unit tests * no need to compare label value with strcmp * streaming send workers now monitor bandwidth * workers now use strings * streaming receiver monitors incoming bandwidth * parser shift of worker ids * minor fixes * Group chart label updates * Populate context with dimensions that have data * Fix chart id * better shift of parser worker ids * fix for streaming compression * properly count received bytes * ensure LZ4 compression ring buffer does not wrap prematurely * do not stream empty charts; do not process empty instances in rrdcontext * need_to_send_chart_definition() does not need an rrdset lock any more * rrdcontext objects are collected, after data have been written to the db * better logging of RRDCONTEXT transitions * always set all variables needed by the worker utilization charts * implemented double linked list for most objects; eliminated alarm indexes from rrdhost; and many more fixes * lockless strings design - string_dup() and string_freez() are totally lockless when they dont need to touch Judy - only Judy is protected with a read/write lock * STRING code re-organization for clarity * thread_cache improvements; double numbers precision on worker threads * STRING_ENTRY now shadown STRING, so no duplicate definition is required; string_length() renamed to string_strlen() to follow the paradigm of all other functions, STRING internal statistics are now only compiled with NETDATA_INTERNAL_CHECKS * rrdhost index by hostname now cleans up; aclk queries of archieved hosts do not index hosts * Add index to speed up database context searches * Removed last_updated optimization (was also buggy after latest merge with master) Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com> Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-08-24Remove aclk_api.[ch] (#13540)Timotej S
* get rid of aclk_starter middleman * get rid of aclk_api.[ch]
2022-08-15reduce memcpy and memory usage on mqtt5 (#13450)Timotej S
* reduce memcpy on mqtt5
2022-08-04Send chart context with alert events to the cloud (#13409)Emmanuel Vasilakis
* add chart context to alert events * migrate health log tables to add chart_context * send it via proto message * add from v3 to v4 * free table * free chart_context
2022-07-28Revert "Query queue only for queries" (#13452)Stelios Fragkakis
Revert "Query queue only for queries (#13431)" This reverts commit 221fd512873613a10b3d95b25a8a4d542b2c4801.
2022-07-28Query queue only for queries (#13431)Timotej S
simplify and clean up
2022-07-24Rrdcontext (#13335)Costa Tsaousis
* type checking on dictionary return values * first STRING implementation, used by DICTIONARY and RRDLABEL * enable AVL compilation of STRING * Initial functions to store context info * Call simple test functions * Add host_id when getting charts * Allow host to be null and in this case it will process the localhost * Simplify init Do not use strdupz - link directly to sqlite result set * Init the database during startup * make it compile - no functionality yet * intermediate commit * intermidiate * first interface to sql * loading instances * check if we need to update cloud * comparison of rrdcontext on conflict * merge context titles * rrdcontext public interface; statistics on STRING; scratchpad on DICTIONARY * dictionaries maintain version numbers; rrdcontext api * cascading changes * first operational cleanup * string unittest * proper cleanup of referenced dictionaries * added rrdmetrics * rrdmetric starting retention * Add fields to context Adjuct context creation and delete * Memory cleanup * Fix get context list Fix memory double free in tests Store context with two hosts * calculated retention * rrdcontext retention with collection * Persist database and shutdown * loading all from sql * Get chart list and dimension list changes * fully working attempt 1 * fully working attempt 2 * missing archived flag from log * fixed archived / collected * operational * proper cleanup * cleanup - implemented all interface functions - dictionary react callback triggers after the dictionary is unlocked * track all reasons for changes * proper tracking of reasons of changes * fully working thread * better versioning of contexts * fix string indexing with AVL * running version per context vs hub version; ifdef dbengine * added option to disable rrdmetrics * release old context when a chart changes context * cleanup properly * renamed config * cleanup contexts; general cleanup; * deletion inline with dequeue; lots of cleanup; child connected/disconnected * ml should start after rrdcontext * added missing NULL to ri->rrdset; rrdcontext flags are now only changed under a mutex lock * fix buggy STRING under AVL * Rework database initialization Add migration logic to the context database * fix data race conditions during context deletion * added version hash algorithm * fix string over AVL * update aclk-schemas * compile new ctx related protos * add ctx stream message utils * add context messages * add dummy rx message handlers * add the new topics * add ctx capability * add helper functions to send the new messages * update cmake build to not fail * update topic names * handle rrdcontext_enabled * add more functions * fatal on OOM cases instead of return NULL * silence unknown query type error * fully working attempt 1 * fully working attempt 2 * allow compiling without ACLK * added family to the context * removed excess character in UUID * smarter merging of titles and families * Database migration code to add family Add family to SQL_CHART_DATA and VERSIONED_CONTEXT_DATA * add family to context message * enable ctx in communication * hardcoded enabled contexts * Add hard code for CTX * add update node collectors to json * add context message log * fix log about last_time_t * fix collected flags for queued items * prevent crash on charts cleanup * fix bug in AVL indexing of dictionaries; make sure react callback of dictionaries has a reference counter, which is acquired while the dictionary is locked * fixed dictionary unittest * strict policy to cleanup and garbage collector * fix db rotation and garbage collection timings * remove deadlock * proper garbage collection - a lot faster retention recalculation * Added not NULL in database columns Remove migration code for context -- we will ship with version 1 of the table schema Added define for query in tests to detect localhost * Use UUID_STR_LEN instead of GUID_LEN + 1 Use realistic timestamps when adding test data in the database * Add NULL checks for passed parameters * Log deleted context when compiled with NETDATA_INTERNAL_CHECKS * Error checking for null host id * add missing ContextsCheckpoint log convertor * Fix spelling in VACCUM * Hold additional information for host -- prepare to load archived hosts on startup * Make sure claim id is valid * is_get_claimed is actually get the current claim id * Simplify ctx get chart list query * remove env negotiation * fix string unittest when there are some strings already in the index * propagate live-retention flag upstream; cleanup all update reasons; updated instances logging; automated attaching started/stopped collecting flags; * first implementation of /api/v1/contexts * full contexts API; updated swagger * disabled debugging; rrdcontext enabled by default * final cleanup and renaming of global variables * return current time on currently collected contexts, charts and dimensions * added option "deepscan" to the API to have the server refresh the retention and recalculate the contexts on the fly * fixed identation of yaml * Add constrains to the host table * host->node_id may not be available * new capabilities * lock the context while rendering json * update aclk-schemas * added permanent labels to all charts about plugin, module and family; added labels to all proc plugin modules * always add the labels * allow merging of families down to [x] * dont show uuids by default, added option to enable them; response is now accepting after,before to show only data for a specific timeframe; deleted items are only shown when "deleted" is requested; hub version is now shown when "queue" is requested * Use the localhost claim id * Fix to handle host constrains better * cgroups: add "k8s." prefix to chart context in k8s * Improve sqlite metadata version migration check * empty values set to "[none]"; fix labels unit test to reflect that * Check if we reached the version we want first (address CODACY report re: Array index 'i' is used before limits check) * Rewrite condition to address CODACY report (Redundant condition: t->filter_callback. '!A || (A && B)' is equivalent to '!A || B') * Properly unlock context * fixed memory leak on rrdcontexts - it was not freeing all dictionaries in rrdhost; added wait of up to 100ms on dictionary_destroy() to give time to dictionaries to release their items before destroying them * fixed memory leak on rrdlabels not freed on rrdinstances * fixed leak when dimensions and charts are redefined * Mark entries for charts and dimensions as submitted to the cloud 3600 seconds after their creation Mark entries for charts and dimensions as updated (confirmed by the cloud) 1800 seconds after their submission * renamed struct string * update cgroups alarms * fixed codacy suggestions * update dashboard info * fix k8s_cgroup_10s_received_packets_storm alarm * added filtering options to /api/v1/contexts and /api/v1/context * fix eslint * fix eslint * Fix pointer binding for host / chart uuids * Fix cgroups unit tests * fixed non-retention updates not propagated upstream * removed non-fatal fatals * Remove context from 2 way string merge. * Move string_2way_merge to dictionary.c * Add 2-way string merge tests. * split long lines * fix indentation in netdata-swagger.yaml * update netdata-swagger.json * yamllint please * remove the deleted flag when a context is collected * fix yaml warning in swagger * removed non-fatal fatals * charts should now be able to switch contexts * allow deletion of unused metrics, instances and contexts * keep the queued flag * cleanup old rrdinstance labels * dont hide objects when there is no filter; mark objects as deleted when there are no sub-objects * delete old instances once they changed context * delete all instances and contexts that do not have sub-objects * more precise transitions * Load archived hosts on startup (part 1) * update the queued time every time * disable by default; dedup deleted dimensions after snapshot * Load archived hosts on startup (part 2) * delayed processing of events until charts are being collected * remove dont-trigger flag when object is collected * polish all triggers given the new dont_process flag * Remove always true condition Enums for readbility / create_host_callback only if ACLK is enabled (for now) * Skip retention message if context streaming is enabled Add messages in the access log if context streaming is enabled * Check for node id being a UUID that can be parsed Improve error check / reporting when loading archived hosts and creating ACLK sync threads * collected, archived, deleted are now mutually exclusive * Enable the "orphan" handling for now Remove dead code Fix memory leak on free host * Queue charts and dimensions will be no-op if host is set to stream contexts * removed unused parameter and made sure flags are set on rrdcontext insert * make the rrdcontext thread abort mid-work when exiting * Skip chart hash computation and storage if contexts streaming is enabled Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com> Co-authored-by: Timo <timotej@netdata.cloud> Co-authored-by: ilyam8 <ilya@netdata.cloud> Co-authored-by: Vladimir Kobal <vlad@prokk.net> Co-authored-by: Vasilis Kalintiris <vasilis@netdata.cloud>
2022-07-14Docs housekeeping (#13179)Tasos Katsoulas
* Change the title of the /docs/configure/machine-learning doc * fix some outdated links * Clarify some details in the postgres collector
2022-07-08Better ACLK debug communication log (#13281)Timotej S
2022-07-07UpdateNodeCollectors message (#13330)Emmanuel Vasilakis
* add new aclk-schemas. remove services related * add updatenodecollectors message * build with --disable-cloud
2022-07-06Use new MQTT as default (revert #13258)" (#13287)Timotej S
This reverts commit b92f5a3271bb0fb7d16b2c406e473eec7bd68102.
2022-06-30Remove warnings when openssl 3 is used. (#13170)thiagoftsm
* remove_warnings_openssl_v3: Add new macro to define latest OpenSSL version * remove_warnings_openssl_v3: Add headers necessary for new API * remove_warnings_openssl_v3: Add compatible variables and adjst code inside load_private_key * remove_warnings_openssl_v3: Adjust function aclk_get_mqtt_otp according to openssl version * remove_warnings_openssl_v3: Adjust function private_decrypt * remove_warnings_openssl_v3: Fix function private_decrypt * remove_warnings_openssl_v3: Update error message * remove_warnings_openssl_v3: Update missing error message
2022-06-29Use old mqtt implementation as default (#13258)Emmanuel Vasilakis
mqtt5 as default off
2022-06-28netdata doubles (#13217)Costa Tsaousis
* netdata doubles * fix cmocka test * fix cmocka test again * fix left-overs of long double to NETDATA_DOUBLE * RRDDIM detached from disk representation; db settings in [db] section of netdata.conf * update the memory before saving * rrdset is now detached from file structures too * on memory mode map, update the memory mapped structures on every iteration * allow RRD_ID_LENGTH_MAX to be changed * granularity secs, back to update every * fix formatting * more formatting
2022-06-27Removes Legacy JSON Cloud Protocol Support In Agent (#13111)Timotej S
* removes old protocol support (cloud removed support already)
2022-06-27Use new mqtt implementation as default (#13132)Timotej S
flips mqtt5 to be default instead of mqtt3
2022-06-14ACLK statistics on bytes recvd and sent (#13091)Timotej S
2022-06-13Labels with dictionary (#13070)Costa Tsaousis
* squashed and rebased to master * fix overflow and single character bug in sanitize; include rrd.h instead of node_info.h * added unittest for UTF-8 multibyte sanitization * Fix unit test compilation * Fix CMake build * remove double sanitizer for opentsdb; cleanup sanitize_json_string() * rename error_description to error_message to avoid conflict with json-c * revert last and undef error_description from json-c * more unittests; attempt to fix protobuf map issue * get rid of rrdlabels_get() and replace it with a safe version that writes the value to a buffer * added dictionary sorting unittest; rrdlabels_to_buffer() now is sorted * better sorted dictionary checking * proper unittesting for sorted dictionaries * call dictionary deletion callback when destroying the dictionary * remove obsolete variable * Fix exporting unit tests * Fix k8s label parsing test * workaround for cmocka and strdupz() * Bypass cmocka memory allocation check * Revert "Bypass cmocka memory allocation check" This reverts commit 4c49923839d9229bea23ca914dd8a0be1ebe2bf4. * Revert "workaround for cmocka and strdupz()" This reverts commit 7bebee04801db1865c748a7896d5fa54bb7104a5. * Bypass cmocka memory allocation checks * respect json formatting for chart labels * cloud sends colons * print the value only once * allow parenthesis in values and spaces; make stream sender send quotes for values Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-06-06Allow usage of the new MQTT 5 implementation (#12838)Timotej S
* adds support for new MQTT5 implementation in agent, currently by default disabled as Tech Preview
2022-05-31Fix coverity issue 378617,378615 (#13021)Stelios Fragkakis
* Fix CID 378617 * Fix CID 378615 * Make sure the ST rrdr lock indicator is set/reset while holding a lock * Switch to int
2022-05-30Trigger queue removed alerts on health log exchange with cloud (#12954)Emmanuel Vasilakis
trigger queue removed on health log exchange with cloud
2022-05-27Cleanup Challenge Response Code (#11730)Timotej S
- Use OpenSSL for Base64 encode/decode - general cleanup of code
2022-05-24Stream and advertise metric correlations to the cloud (#12940)Emmanuel Vasilakis
* stream and advertise mc to the cloud * better reporting * remove log * remove aclk debug
2022-05-13Implements new capability fields in aclk_schemas (#12602)Timotej S
use new capability fields
2022-05-12Take into account the in queue wait time when executing a data query (#12885)Stelios Fragkakis
Take into account the in queue wait time when executing a query with a timeout
2022-05-09Workers utilization charts (#12807)Costa Tsaousis
* initial version of worker utilization * working example * without mutexes * monitoring DBENGINE, ACLKSYNC, WEB workers * added charts to monitor worker usage * fixed charts units * updated contexts * updated priorities * added documentation * converted threads to stacked chart * One query per query thread * Revert "One query per query thread" This reverts commit 6aeb391f5987c3c6ba2864b559fd7f0cd64b14d3. * fixed priority for web charts * read worker cpu utilization from proc * read workers cpu utilization via /proc/self/task/PID/stat, so that we have cpu utilization even when the jobs are too long to finish within our update_every frequency * disabled web server cpu utilization monitoring - it is now monitored by worker utilization * tight integration of worker utilization to web server * monitoring statsd worker threads * code cleanup and renaming of variables * contrained worker and statistics conflict to just one variable * support for rendering jobs per type * better priorities and removed the total jobs chart * added busy time in ms per job type * added proc.plugin monitoring, switch clock to MONOTONIC_RAW if available, global statistics now cleans up old worker threads * isolated worker thread families * added cgroups.plugin workers * remove unneeded dimensions when then expected worker is just one * plugins.d and streaming monitoring * rebased; support worker_is_busy() to be called one after another * added diskspace plugin monitoring * added tc.plugin monitoring * added ML threads monitoring * dont create dimensions and charts that are not needed * fix crash when job types are added on the fly * added timex and idlejitter plugins; collected heartbeat statistics; reworked heartbeat according to the POSIX * the right name is heartbeat for this chart * monitor streaming senders * added streaming senders to global stats * prevent division by zero * added clock_init() to external C plugins * added freebsd and macos plugins * added freebsd and macos to global statistics * dont use new as a variable; address compiler warnings on FreeBSD and MacOS * refactored contexts to be unique; added health threads monitoring Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
2022-05-05Accept a data query timeout parameter from the cloud (#12823)Stelios Fragkakis
* Add the ability to parse a "timeout" parameter in the incoming command * Cancel the incoming query if already in the queue for too long
2022-05-02Make atomics a hard-dep. (#12730)vkalintiris
They are used extensively throughout our code base, and not having support for them does not generate a thread-safe agent.
2022-04-28use aclk_parse_otp_error (#12767)Timotej S
2022-04-19Add the ability to perform a data query using an offline node id (#12650)Stelios Fragkakis
* Add the ability to build a host structure by node id to execute queries for archived hosts * Add the ability to execute queries from the cloud for archived hosts by node id * Add free_temporary_host function
2022-04-19Fix Valgrind errors (#12619)Vladimir Kobal
2022-04-01Fix memory leaks on Netdata exit (#12511)Vladimir Kobal
* Fix memory leaks in dimensions and charts * Initialize superblock memory regions * Clean up static threads * Fix memory leaks in compression * Fix memory leaks in rrdcaltemplate * Fix memory leaks in health config * Fix ACLK memory leaks
2022-03-22Extend aclk-state information (#12458)Timotej S
2022-03-21Implement fine-grained error replies to cloud queries (#12460)Timotej S
2022-03-18Add delay on missing priv_key (#12450)Timotej S
2022-03-09Improve agent to cloud synchronization performance (#12348)Stelios Fragkakis
* Switch to prepare statement when storing active charts / dimensions * Switch to prepare statement when storing chart labels * Switch to prepare statement when doing a node id lookup * Switch to prepare statement when loading the node id for a host * Improve performance by avoiding db query * Use prepare statement when counting pending chart messages to send to the cloud * Delay locking while preparing commands * No need to use buffer, avoid memory allocation overhead * Switch to prepare statement when loading pending chart updates to send to the cloud
2022-03-09Adds more info to aclk-state API call (#12231)Timotej S