path: root/exporting
Age | Commit message | Author
2022-11-28 | replication fixes No 7 (#14053) | Costa Tsaousis
* move global statistics workers to a separate thread; query statistics per query source; query statistics for ML, exporters, backfilling; reset replication point in time every 10 seconds, instead of every 1; fix compilation warnings; optimize the replication queries code; prevent long tail of replication requests (big sleeps); provide query statistics about replication; optimize replication sender when most senders are full; optimize replication_request_get_first_available(); reset replication completion calculation;
* remove workers utilization from global statistics thread
2022-11-22 | Do not force internal collectors to call rrdset_next. (#13926) | vkalintiris
* Remove calls to rrdset_next().
* Rm checks plugin
* Update documentation
* Call rrdset_next from within rrdset_done

  This wraps up the removal of rrdset_next from internal collectors, which removes a lot of unnecessary code and the need for if/else clauses in every place. The pluginsd parser is the only component that calls rrdset_next*() functions because it's not strictly speaking a collector but more of a collector manager/proxy. With the current changes it's possible to simplify the API we expose from RRD significantly, but this will be follow-up work in the future.
* Remove stale reference to checks.plugin
* Fix RRD unit test

  rrdset_next is not meant to be called from these tests.
* Fix db engine unit test.
* Schedule rrdset_next when we have completed at least one collection.
* Mark chart creation clauses as unlikely.
* Add missing brace to fix FreeBSD plugin.
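The idea of folding the "next" step into "done" can be illustrated with a small mock. This is only a sketch under stated assumptions: `mock_chart`, `mock_chart_set`, `mock_chart_done` and `mock_collect_once` are hypothetical names invented for illustration, not the RRD API, and the real `rrdset_done()` does far more work than this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a chart; NOT the real RRDSET structure. */
struct mock_chart {
    size_t collections_done; /* completed collection cycles */
    size_t next_calls;       /* times the "next" step has run */
    long long last_value;    /* most recently collected value */
};

/* Stand-in for the per-iteration bookkeeping that collectors used to trigger. */
static void mock_chart_next(struct mock_chart *chart)
{
    chart->next_calls++;
}

/* Stand-in for storing a collected value. */
void mock_chart_set(struct mock_chart *chart, long long value)
{
    assert(chart != NULL);
    chart->last_value = value;
}

/* "done" runs the "next" step itself, but only once at least one
 * collection has completed, so callers need no first-iteration branch. */
void mock_chart_done(struct mock_chart *chart)
{
    assert(chart != NULL);
    if (chart->collections_done > 0)
        mock_chart_next(chart);
    chart->collections_done++;
}

/* One collector iteration: the same two calls every time. */
void mock_collect_once(struct mock_chart *chart, long long value)
{
    mock_chart_set(chart, value);
    mock_chart_done(chart);
}
```

With this shape a collector loop calls `mock_collect_once()` on every iteration and never needs an if/else to tell the first iteration from later ones, which is the simplification the commit message describes.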
2022-11-18 | Change relative links to absolute for learn components (#14015) | Tasos Katsoulas
Change relative links to absolute based on @site

Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
2022-11-11 | Add _total suffix to raw increment metrics for remote write (#13977) | Vladimir Kobal
Fixes https://github.com/netdata/netdata/issues/13963
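As a rough illustration of the naming rule in this commit title, the sketch below appends `_total` to the name of a raw incremental metric and leaves other names untouched. The function name `format_remote_write_name` and its parameters are assumptions made for this example; the exporter's actual code is not shown in this log.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not the exporter's real function: writes the metric
 * name into dst, adding "_total" when the metric is a raw increment.
 * Returns 0 on success, -1 if dst is too small. */
int format_remote_write_name(char *dst, size_t size,
                             const char *name, int is_raw_increment)
{
    assert(dst != NULL && name != NULL);

    const char *suffix = is_raw_increment ? "_total" : "";
    size_t needed = strlen(name) + strlen(suffix) + 1; /* +1 for the terminator */

    if (needed > size)
        return -1;

    snprintf(dst, size, "%s%s", name, suffix);
    return 0;
}
```

The `_total` suffix follows the common Prometheus naming convention for counters, which is presumably why remote write consumers expect it.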
2022-11-09 | Remove anomaly rates chart. (#13763) | vkalintiris
2022-10-23 | QUERY_TARGET: new query engine for Netdata Agent (#13697) | Costa Tsaousis
* initial implementation of QUERY_TARGET * rrd2rrdr() interface * rrddim_find_best_tier_for_timeframe() ported * added dimension filtering * added db object in query target * rrd2rrdr() ported * working on formatters * working on jsonwrapper * finally, it compiles... * 1st run without crashes * query planer working * cleanup old code * review changes * fix also changing data collection frequency * fix signess * fix rrdlabels and dimension ordering * fixes * remove unused variable * ml should accept NULL response from rrd2rrdr() * number formatting fixes * more number formatting fixes * more number formatting fixes * support mc parallel queries * formatting and cleanup * added rrd2rrdr_legacy() as a simplified interface to run a query * make sure rrdset_find_natural_update_every_for_timeframe() returns a value * make signed comparisons * weights endpoint using rrdcontexts * fix for legacy db modes and cleanup * fix for chart_ids and remove AR chart from weights endpoint * Ignore command if not initialized yet * remove unused members * properly initialize window * code cleanup - rrddim linked list is gone; rrdset rwlock is gone too * reviewed RRDR.internal members * eliminate unnecessary members of QUERY_TARGET * more complete query ids; more detailed information on aborted queries * properly terminate option strings * query id contains group_options which is controlled by users, so escaping is necessary * tense in query id * tense in query id - again * added the remaining query options to the query id * Expose hidden option to the dimension * use the hidden flag when loading context dimensions * Specify table alias for option * dont update chart last access time, unless at least a dimension of the chart will be queried Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
2022-10-13 | Fix exporting unit tests (#13816) | Vladimir Kobal
2022-10-13 | allow disabling netdata monitoring section of the dashboard (#13788) | Costa Tsaousis
* allow disabling netdata monitoring section of the dashboard
* disable-netdata-stats: Modify eBPF.plugin to disable statistic charts according to user selection, by default it is enabled
* Don't send internal statistics for exporting engine if it's disabled.
* Fix global statistics flag initialization
* Don't send internal statistics for checks plugin if it's disabled.

Co-authored-by: Thiago Marques <thiagoftsm@gmail.com>
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-10-13 | dbengine free from RRDSET and RRDDIM (#13772) | Costa Tsaousis
* dbengine free from RRDSET and RRDDIM * fix for excess parameters to query ops * add comment about ML * update_every from int to uint32_t * rrddim_mem storage engine working * fixes for update_every_s * working dbengine * a lot of changes in dbengine regarding timestamps * better logging of not sequential points * rrdset_done() now gives aligned timestamps for higher tiers * dont change the end_time of descriptors, because they cant be loaded back * fixes for cmake * fixes for db mode ram * Global counters for dbengine loading errors. Ensure dbengine store metrics always has aligned metrics or breaks the page when storing new data. * update lgtm config * fixes for 32-bit systems * update unittests * Don't try to find and create a host on the fly if not already in memory * Remove unused functions * print backtrace in case of fatal * always set ctx to page_index * detect ctx and metric uuid discrepancies * use legacy uuid if multihost is not available * fix for last commit * prevent repeating log * Do not try to access archived charts when executing a data query * Remove unused function * log inconsistent collections once every 10 mins Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
2022-10-09 | Remove extern from function declared in headers. (#13790) | vkalintiris
By default functions are declared as extern in C/C++ headers. The goal of this PR is to reduce the wall of text that many headers have and, more importantly, to make the declaration of extern'd variables - of which we have many dispersed in various places - easily and quickly identifiable.

Automatically generated with:

    $ git grep -l '^extern.*(' '**.h' | \
        grep -v libjudy | \
        grep -v 'sqlite3.h' | \
        xargs sed -i -e 's/extern \(.*(.*$\)/\1/'

This is a NFC.
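The language rule behind this change can be shown in a few lines of C. The names `sum_values` and `values_summed` are made up for this example and do not come from the netdata headers; the point is only that `extern` is redundant on a function declaration but meaningful on a variable declaration.

```c
#include <assert.h>
#include <stddef.h>

/* Functions have external linkage by default, so these two declarations
 * are equivalent and may both appear. */
extern long sum_values(const long *values, size_t count);
long sum_values(const long *values, size_t count);

/* For objects the keyword matters: this line only declares the variable. */
extern long values_summed;

/* The single definition, which in a real project lives in one .c file. */
long values_summed = 0;

long sum_values(const long *values, size_t count)
{
    assert(values != NULL || count == 0);

    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];

    values_summed += (long)count; /* track how many values were processed */
    return total;
}
```

Once `extern` is dropped from function declarations, every remaining `extern` in a header marks a shared variable, which is what makes those variables easy to find.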
2022-10-05 | Allow netdata plugins to expose functions for querying more information ↵ | Costa Tsaousis
about specific charts (#13720) * function renames and code cleanup in popen.c; no actual code changes * netdata popen() now opens both child process stdin and stdout and returns FILE * for both * pass both input and output to parser structures * updated rrdset to call custom functions * RRDSET FUNCTION leading calls for both sync and async operation * put RRDSET functions to a separate file * added format and timeout at function definition * support for synchronous (internal plugins) and asynchronous (external plugins and children) functions * /api/v1/function endpoint * functions are now attached to the host and there is a dictionary view per chart * functions implemented at plugins.d * remove the defer until keyword hook from plugins.d when it is done * stream sender implementation of functions * sanitization of all functions so that certain characters are only allowed * strictier sanitization * common max size * 1st working plugins.d example * always init inflight dictionary * properly destroy dictionaries to avoid parallel insertion of items * add more debugging on disconnection reasons * add more debugging on disconnection reasons again * streaming receiver respects newlines * dont use the same fp for both streaming receive and send * dont free dbengine memory with internal checks * make sender proceed in the buffer * added timing info and garbage collection at plugins.d * added info about routing nodes * added info about routing nodes with delay * added more info about delays * added more info about delays again * signal sending thread to wake up * streaming version labeling and commented code to support capabilities * added functions to /api/v1/data, /api/v1/charts, /api/v1/chart, /api/v1/info * redirect top output to stdout * address coverity findings * fix resource leaks of popen * log attempts to connect to individual destinations * better messages * properly parse destinations * try to find a function from the most matching to the least matching * log 
added streaming destinations * rotate destinations bypassing a node in the middle that does not accept our connection * break the loops properly * use typedef to define callbacks * capabilities negotiation during streaming * functions exposed upstream based on capabilities; compression disabled per node persisting reconnects; always try to connect with all capabilities * restore functionality to lookup functions * better logging of capabilities * remove old versions from capabilities when a newer version is there * fix formatting * optimization for plugins.d rrdlabels to avoid creating and destructing dictionaries all the time * delayed health initialization for rrddim and rrdset * cleanup health initialization * fix for popen() not returning the right value * add health worker jobs for initializing rrdset and rrddim * added content type support for functions; apps.plugin permanent function to display all the processes * fixes for functions parameters parsing in apps.plugin * fix for process matching in apps.plugiin * first working function for apps.plugin * Dashboard ACL is disabled for functions; Function errors are all in JSON format * apps.plugin function processes returns json table * use json_escape_string() to escape message * fix formatting * apps.plugin exposes all its metrics to function processes * fix json formatting when filtering out some rows * reopen the internal pipe of rrdpush in case of errors * misplaced statement * do not use buffer->len * support for GLOBAL functions (functions that are not linked to a chart * added /api/v1/functions endpoint; removed format from the FUNCTIONS api; * swagger documentation about the new api end points * added plugins.d documentation about functions * never re-close a file * remove uncessesary ifdef * fixed issues identified by codacy * fix for null label value * make edit-config copy-and-paste friendly * Revert "make edit-config copy-and-paste friendly" This reverts commit 
54500c0e0a97f65a0c66c4d34e966f6a9056698e. * reworked sender handshake to fix coverity findings * timeout is zero, for both send_timeout() and recv_timeout() * properly detect that parent closed the socket * support caching of function responses; limit function response to 10MB; added protection from malformed function responses * disabled excessive logging * added units to apps.plugin function processes and normalized all values to be human readable * shorter field names * fixed issues reported * fixed apps.plugin error response; tested that pluginsd can properly handle faulty responses * use double linked list macros for double linked list management * faster apps.plugin function printing by minimizing file operations * added memory percentage * fix compatibility issues with older compilers and FreeBSD * rrdpush sender code cleanup; rrhost structure cleanup from sender flags and variables; * fix letftover variable in ifdef * apps.plugin: do not call detach from the thread; exit immediately when input is broken * exclude AR charts from health * flush cleaner; prefer sender output * clarity * do not fill the cbuffer if not connected * fix * dont enabled host->sender if streaming is not enabled; send host label updates to parent; * functions are only available through ACLK * Prepared statement reports only in dev mode * fix AR chart detection * fix for streaming not being enabling itself * more cleanup of sender and receiver structures * moved read-only flags and configuration options to rrdhost->options * fixed merge with master * fix for incomplete rename * prevent service thread from working on charts that are being collected Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
2022-09-26 | Fix warnings during compilation time on ARM (32 bits) (#13681) | thiagoftsm
2022-09-26 | Update exporting unit tests (#13706) | Vladimir Kobal
2022-09-19 | RRD structures managed by dictionaries (#13646) | Costa Tsaousis
* rrdset - in progress * rrdset optimal constructor; rrdset conflict * rrdset final touches * re-organization of rrdset object members * prevent use-after-free * dictionary dfe supports also counting of iterations * rrddim managed by dictionary * rrd.h cleanup * DICTIONARY_ITEM now is referencing actual dictionary items in the code * removed rrdset linked list * Revert "removed rrdset linked list" This reverts commit 690d6a588b4b99619c2c5e10f84e8f868ae6def5. * removed rrdset linked list * added comments * Switch chart uuid to static allocation in rrdset Remove unused functions * rrdset_archive() and friends... * always create rrdfamily * enable ml_free_dimension * rrddim_foreach done with dfe * most custom rrddim loops replaced with rrddim_foreach * removed accesses to rrddim->dimensions * removed locks that are no longer needed * rrdsetvar is now managed by the dictionary * set rrdset is rrdsetvar, fixes https://github.com/netdata/netdata/pull/13646#issuecomment-1242574853 * conflict callback of rrdsetvar now properly checks if it has to reset the variable * dictionary registered callbacks accept as first parameter the DICTIONARY_ITEM * dictionary dfe now uses internal counter to report; avoided excess variables defined with dfe * dictionary walkthrough callbacks get dictionary acquired items * dictionary reference counters that can be dupped from zero * added advanced functions for get and del * rrdvar managed by dictionaries * thread safety for rrdsetvar * faster rrdvar initialization * rrdvar string lengths should match in all add, del, get functions * rrdvar internals hidden from the rest of the world * rrdvar is now acquired throughout netdata * hide the internal structures of rrdsetvar * rrdsetvar is now acquired through out netdata * rrddimvar managed by dictionary; rrddimvar linked list removed; rrddimvar structures hidden from the rest of netdata * better error handling * dont create variables if not initialized for health * dont create variables if not 
initialized for health again * rrdfamily is now managed by dictionaries; references of it are acquired dictionary items * type checking on acquired objects * rrdcalc renaming of functions * type checking for rrdfamily_acquired * rrdcalc managed by dictionaries * rrdcalc double free fix * host rrdvars is always needed * attempt to fix deadlock 1 * attempt to fix deadlock 2 * Remove unused variable * attempt to fix deadlock 3 * snprintfz * rrdcalc index in rrdset fix * Stop storing active charts and computing chart hashes * Remove store active chart function * Remove compute chart hash function * Remove sql_store_chart_hash function * Remove store_active_dimension function * dictionary delayed destruction * formatting and cleanup * zero dictionary base on rrdsetvar * added internal error to log delayed destructions of dictionaries * typo in rrddimvar * added debugging info to dictionary * debug info * fix for rrdcalc keys being empty * remove forgotten unlock * remove deadlock * Switch to metadata version 5 and drop chart_hash chart_hash_map chart_active dimension_active v_chart_hash * SQL cosmetic changes * do not busy wait while destroying a referenced dictionary * remove deadlock * code cleanup; re-organization; * fast cleanup and flushing of dictionaries * number formatting fixes * do not delete configured alerts when archiving a chart * rrddim obsolete linked list management outside dictionaries * removed duplicate contexts call * fix crash when rrdfamily is not initialized * dont keep rrddimvar referenced * properly cleanup rrdvar * removed some locks * Do not attempt to cleanup chart_hash / chart_hash_map * rrdcalctemplate managed by dictionary * register callbacks on the right dictionary * removed some more locks * rrdcalc secondary index replaced with linked-list; rrdcalc labels updates are now executed by health thread * when looking up for an alarm look using both chart id and chart name * host initialization a bit more modular * init rrdlabels on host 
update * preparation for dictionary views * improved comment * unused variables without internal checks * service threads isolation and worker info * more worker info in service thread * thread cancelability debugging with internal checks * strings data races addressed; fixes https://github.com/netdata/netdata/issues/13647 * dictionary modularization * Remove unused SQL statement definition * unit-tested thread safety of dictionaries; removed data race conditions on dictionaries and strings; dictionaries now can detect if the caller is holds a write lock and automatically all the calls become their unsafe versions; all direct calls to unsafe version is eliminated * remove worker_is_idle() from the exit of service functions, because we lose the lock time between loops * rewritten dictionary to have 2 separate locks, one for indexing and another for traversal * Update collectors/cgroups.plugin/sys_fs_cgroup.c Co-authored-by: Vladimir Kobal <vlad@prokk.net> * Update collectors/cgroups.plugin/sys_fs_cgroup.c Co-authored-by: Vladimir Kobal <vlad@prokk.net> * Update collectors/proc.plugin/proc_net_dev.c Co-authored-by: Vladimir Kobal <vlad@prokk.net> * fix memory leak in rrdset cache_dir * minor dictionary changes * dont use index locks in single threaded * obsolete dict option * rrddim options and flags separation; rrdset_done() optimization to keep array of reference pointers to rrddim; * fix jump on uninitialized value in dictionary; remove double free of cache_dir * addressed codacy findings * removed debugging code * use the private refcount on dictionaries * make dictionary item desctructors work on dictionary destruction; strictier control on dictionary API; proper cleanup sequence on rrddim; * more dictionary statistics * global statistics about dictionary operations, memory, items, callbacks * dictionary support for views - missing the public API * removed warning about unused parameter * chart and context name for cloud * chart and context name for cloud, again 
* dictionary statistics fixed; first implementation of dictionary views - not currently used * only the master can globally delete an item * context needs netdata prefix * fix context and chart it of spins * fix for host variables when health is not enabled * run garbage collector on item insert too * Fix info message; remove extra "using" * update dict unittest for new placement of garbage collector * we need RRDHOST->rrdvars for maintaining custom host variables * Health initialization needs the host->host_uuid * split STRING to its own files; no code changes other than that * initialize health unconditionally * unit tests do not pollute the global scope with their variables * Skip initialization when creating archived hosts on startup. When a child connects it will initialize properly Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com> Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-09-07 | Obsolete RRDSET state (#13635) | Costa Tsaousis
* move chart_labels to rrdset
* rename chart_labels to rrdlabels
* renamed hash_id to uuid
* turned is_ar_chart into an rrdset flag
* removed rrdset state
* removed unused senders_connected member of rrdhost
* removed unused host flag RRDHOST_FLAG_MULTIHOST
* renamed rrdhost host_labels to rrdlabels
* Update exporting unit tests

Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-09-05 | Deduplicate all netdata strings (#13570) | Costa Tsaousis
* rrdfamily * rrddim * rrdset plugin and module names * rrdset units * rrdset type * rrdset family * rrdset title * rrdset title more * rrdset context * rrdcalctemplate context and removal of context hash from rrdset * strings statistics * rrdset name * rearranged members of rrdset * eliminate rrdset name hash; rrdcalc chart converted to STRING * rrdset id, eliminated rrdset hash * rrdcalc, alarm_entry, alert_config and some of rrdcalctemplate * rrdcalctemplate * rrdvar * eval_variable * rrddimvar and rrdsetvar * rrdhost hostname, os and tags * fix master commits * added thread cache; implemented string_dup without locks * faster thread cache * rrdset and rrddim now use dictionaries for indexing * rrdhost now uses dictionary * rrdfamily now uses DICTIONARY * rrdvar using dictionary instead of AVL * allocate the right size to rrdvar flag members * rrdhost remaining char * members to STRING * * better error handling on indexing * strings now use a read/write lock to allow parallel searches to the index * removed AVL support from dictionaries; implemented STRING with native Judy calls * string releases should be negative * only 31 bits are allowed for enum flags * proper locking on strings * string threading unittest and fixes * fix lgtm finding * fixed naming * stream chart/dimension definitions at the beginning of a streaming session * thread stack variable is undefined on thread cancel * rrdcontext garbage collect per host on startup * worker control in garbage collection * relaxed deletion of rrdmetrics * type checking on dictfe * netdata chart to monitor rrdcontext triggers * Group chart label updates * rrdcontext better handling of collected rrdsets * rrdpush incremental transmition of definitions should use as much buffer as possible * require 1MB per chart * empty the sender buffer before enabling metrics streaming * fill up to 50% of buffer * reset signaling metrics sending * use the shared variable for status * use separate host flag for enabling streaming 
of metrics * make sure the flag is clear * add logging for streaming * add logging for streaming on buffer overflow * circular_buffer proper sizing * removed obsolete logs * do not execute worker jobs if not necessary * better messages about compression disabling * proper use of flags and updating rrdset last access time every time the obsoletion flag is flipped * monitor stream sender used buffer ratio * Update exporting unit tests * no need to compare label value with strcmp * streaming send workers now monitor bandwidth * workers now use strings * streaming receiver monitors incoming bandwidth * parser shift of worker ids * minor fixes * Group chart label updates * Populate context with dimensions that have data * Fix chart id * better shift of parser worker ids * fix for streaming compression * properly count received bytes * ensure LZ4 compression ring buffer does not wrap prematurely * do not stream empty charts; do not process empty instances in rrdcontext * need_to_send_chart_definition() does not need an rrdset lock any more * rrdcontext objects are collected, after data have been written to the db * better logging of RRDCONTEXT transitions * always set all variables needed by the worker utilization charts * implemented double linked list for most objects; eliminated alarm indexes from rrdhost; and many more fixes * lockless strings design - string_dup() and string_freez() are totally lockless when they dont need to touch Judy - only Judy is protected with a read/write lock * STRING code re-organization for clarity * thread_cache improvements; double numbers precision on worker threads * STRING_ENTRY now shadown STRING, so no duplicate definition is required; string_length() renamed to string_strlen() to follow the paradigm of all other functions, STRING internal statistics are now only compiled with NETDATA_INTERNAL_CHECKS * rrdhost index by hostname now cleans up; aclk queries of archieved hosts do not index hosts * Add index to speed up database context 
searches * Removed last_updated optimization (was also buggy after latest merge with master) Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com> Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-08-16 | Support chart labels in alerts (#13290) | Emmanuel Vasilakis
* chart labels for alerts
* proper termination
* use strchr
* change if statement
* change label variable. add docs
* change doc
* assign buf to temp
* use new dictionary functions
* reduce variable scope
* reduce line length
* make sure rrdcalc updates labels after inserted
* reduce var scope
* add rrdcalc.c for cmocka tests
* Revert "add rrdcalc.c for cmocka tests"

  This reverts commit 5fe122adcf7abcbe6d67fa2ebd7c4ff8620cf9c8.
* Fix cmocka unit tests
* valgrind errors

Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-07-26 | Dont duplicate buffered bytes (#13435) | Vladimir Kobal
2022-07-11 | Move host tags to netdata_info (#13358) | Vladimir Kobal
2022-07-11 | Exporting/send variables (#13221) | boxjan
2022-07-11 | Get rid of extra comma in OpenTSDB exporting (#13355) | Vladimir Kobal
2022-07-06 | Multi-Tier database backend for long term metrics storage (#13263) | Stelios Fragkakis
* Tier part 1 * Tier part 2 * Tier part 3 * Tier part 4 * Tier part 5 * Fix some ML compilation errors * fix more conflicts * pass proper tier * move metric_uuid from state to RRDDIM * move aclk_live_status from state to RRDDIM * move ml_dimension from state to RRDDIM * abstracted the data collection interface * support flushing for mem db too * abstracted the query api * abstracted latest/oldest time per metric * cleanup * store_metric for tier1 * fix for store_metric * allow multiple tiers, more than 2 * state to tier * Change storage type in db. Query param to request min, max, sum or average * Store tier data correctly * Fix skipping tier page type * Add tier grouping in the tier * Fix to handle archived charts (part 1) * Temp fix for query granularity when requesting tier1 data * Fix parameters in the correct order and calculate the anomaly based on the anomaly count * Proper tiering grouping * Anomaly calculation based on anomaly count * force type checking on storage handles * update cmocka tests * fully dynamic number of storage tiers * fix static allocation * configure grouping for all tiers; disable tiers for unittest; disable statsd configuration for private charts mode * use default page dt using the tiering info * automatic selection of tier * fix for automatic selection of tier * working prototype of dynamic tier selection * automatic selection of tier done right (I hope) * ask for the proper tier value, based on the grouping function * fixes for unittests and load_metric_next() * fixes for lgtm findings * minor renames * add dbengine to page cache size setting * add dbengine to page cache with malloc * query engine optimized to loop as little are required based on the view_update_every * query engine grouping methods now do not assume a constant number of points per group and they allocate memory with OWA * report db points per tier in jsonwrap * query planer that switches database tiers on the fly to satisfy the query for the entire timeframe * 
dbegnine statistics and documentation (in progress) * calculate average point duration in db * handle single point pages the best we can * handle single point pages even better * Keep page type in the rrdeng_page_descr * updated doc * handle future backwards compatibility - improved statistics * support &tier=X in queries * enfore increasing iterations on tiers * tier 1 is always 1 iteration * backfilling higher tiers on first data collection * reversed anomaly bit * set up to 5 tiers * natural points should only be offered on tier 0, except a specific tier is selected * do not allow more than 65535 points of tier0 to be aggregated on any tier * Work only on actually activated tiers * fix query interpolation * fix query interpolation again * fix lgtm finding * Activate one tier for now * backfilling of higher tiers using raw metrics from lower tiers * fix for crash on start when storage tiers is increased from the default * more statistics on exit * fix bug that prevented higher tiers to get any values; added backfilling options * fixed the statistics log line * removed limit of 255 iterations per tier; moved the code of freezing rd->tiers[x]->db_metric_handle * fixed division by zero on zero points_wanted * removed dead code * Decide on the descr->type for the type of metric * dont store metrics on unknown page types * free db_metric_handle on sql based context queries * Disable STORAGE_POINT value check in the exporting engine unit tests * fix for db modes other than dbengine * fix for aclk archived chart queries destroying db_metric_handles of valid rrddims * fix left-over freez() instead of OWA freez on median queries Co-authored-by: Costa Tsaousis <costa@netdata.cloud> Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-06-29 | Get rid of extra semicolon in Graphite exporting (#13261) | Vladimir Kobal
2022-06-29 | fix: fix a base64_encode bug (#13074) | kklionz
2022-06-28 | netdata doubles (#13217) | Costa Tsaousis
* netdata doubles
* fix cmocka test
* fix cmocka test again
* fix left-overs of long double to NETDATA_DOUBLE
* RRDDIM detached from disk representation; db settings in [db] section of netdata.conf
* update the memory before saving
* rrdset is now detached from file structures too
* on memory mode map, update the memory mapped structures on every iteration
* allow RRD_ID_LENGTH_MAX to be changed
* granularity secs, back to update every
* fix formatting
* more formatting
2022-06-22 | Query Engine multi-granularity support (and MC improvements) (#13155) | Costa Tsaousis
* set grouping functions * storage engine should check the validity of timestamps, not the query engine * calculate and store in RRDR anomaly rates for every query * anomaly rate used by volume metric correlations * mc volume should use absolute data, to avoid cancelling effect * return anomaly-rates in jasonwrap with jw-anomaly-rates option to data queries * dont return null on anomaly rates * allow passing group query options from the URL * added countif to the query engine and used it in metric correlations * fix configure * fix countif and anomaly rate percentages * added group_options to metric correlations; updated swagger * added newline at the end of yaml file * always check the time the highlighted window was above/below the highlighted window * properly track time in memory queries * error for internal checks only * moved pack_storage_number() into the storage engines * moved unpack_storage_number() inside the storage engines * remove old comment * pass unit tests * properly detect zero or subnormal values in pack_storage_number() * fill nulls before the value, not after * make sure math.h is included * workaround for isfinite() * fix for isfinite() * faster isfinite() alternative * fix for faster isfinite() alternative * next_metric() now returns end_time too * variable step implemented in a generic way * remove left-over variables * ensure we always complete the wanted number of points * fixes * ensure no infinite loop * mc-volume-improvements: Add information about invalid condition * points should have a duration in the past * removed unneeded info() line * Fix unit tests for exporting engine * new_point should only be checked when it is fetched from the db; better comment about the premature breaking of the main query loop Co-authored-by: Thiago Marques <thiagoftsm@gmail.com> Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-06-13 | Labels with dictionary (#13070) | Costa Tsaousis
* squashed and rebased to master * fix overflow and single character bug in sanitize; include rrd.h instead of node_info.h * added unittest for UTF-8 multibyte sanitization * Fix unit test compilation * Fix CMake build * remove double sanitizer for opentsdb; cleanup sanitize_json_string() * rename error_description to error_message to avoid conflict with json-c * revert last and undef error_description from json-c * more unittests; attempt to fix protobuf map issue * get rid of rrdlabels_get() and replace it with a safe version that writes the value to a buffer * added dictionary sorting unittest; rrdlabels_to_buffer() now is sorted * better sorted dictionary checking * proper unittesting for sorted dictionaries * call dictionary deletion callback when destroying the dictionary * remove obsolete variable * Fix exporting unit tests * Fix k8s label parsing test * workaround for cmocka and strdupz() * Bypass cmocka memory allocation check * Revert "Bypass cmocka memory allocation check" This reverts commit 4c49923839d9229bea23ca914dd8a0be1ebe2bf4. * Revert "workaround for cmocka and strdupz()" This reverts commit 7bebee04801db1865c748a7896d5fa54bb7104a5. * Bypass cmocka memory allocation checks * respect json formatting for chart labels * cloud sends colons * print the value only once * allow parenthesis in values and spaces; make stream sender send quotes for values Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-05-06make 'send charts matching' behave the same as 'filter' (#12832)Ilya Mashchenko
2022-05-05Add chart filtering parameter to the allmetrics API query (#12820)Vladimir Kobal
* Add chart filtering in the allmetrics API call * Fix compilation warnings * Remove unnecessary function * Update the documentation * Apply suggestions from code review * Check for filter instead of filter_string * Do not check both - chart id and name for prometheus and shell formats * Fix unit tests Co-authored-by: Ilya Mashchenko <ilya@netdata.cloud>
2022-05-03Remove per chart configuration. (#12728)vkalintiris
After https://github.com/netdata/netdata/pull/12209 per-chart configuration was used for (a) enabling/disabling a chart, and (b) renaming dimensions.

Regarding the first use case: We already have component-specific configuration options|flags to finely control how a chart should behave, e.g. "send charts matching" in streaming, "charts to skip from training" in ML, etc. If we really need the concept of a disabled chart, we can add a host-level simple pattern to match these charts.

Regarding the second use case: It's not obvious why we'd need to provide support for remapping dimension names through a chart-specific configuration from the core agent. If the need arises, we could add such support at the right place, i.e. an exporter/streaming config section.

This will allow each flag to act independently of the others and avoid managing flag state manually in various places, e.g.:

```
if(unlikely(!rrdset_flag_check(st, RRDSET_FLAG_ENABLED))) {
    rrdset_flag_clear(st, RRDSET_FLAG_UPSTREAM_SEND);
    rrdset_flag_set(st, RRDSET_FLAG_UPSTREAM_IGNORE);
}
...
```
2022-03-15Remove backends subsystem (#12146)Vladimir Kobal
2022-02-24Track anomaly rates with DBEngine. (#12083)vkalintiris
* Track anomaly rates with DBEngine. This commit adds support for tracking anomaly rates with DBEngine. We do so by creating a single chart with id "anomaly_detection.anomaly_rates" for each trainable/predictable host, which is responsible for tracking the anomaly rate of each dimension that we train/predict for that host. The rrdset->state->is_ar_chart boolean flag is set to true only for anomaly rates charts. We use this flag to: - Disable exposing the anomaly rates charts through the functionality in backends/, exporting/ and streaming/. - Skip generation of configuration options for the name, algorithm, multiplier, divisor of each dimension in an anomaly rates chart. - Skip the creation of health variables for anomaly rates dimensions. - Skip the chart/dim queue of ACLK. - Post-process the RRDR result of an anomaly rates chart, so that we can return a sorted, trimmed number of anomalous dimensions. In a child/parent configuration where both the child and the parent run ML for the child, we want to be able to stream the rest of the ML-related charts to the parent. To be able to do this without any chart name collisions, the charts are now created on localhost and their IDs and titles have the node's machine_guid and hostname as a suffix, respectively. * Fix exporting_engine tests. * Restore default ML configuration. The reverted changes were meant for local testing only. This commit restores the default values that we want to have when someone runs anomaly detection on their node. * Set context for anomaly_detection.* charts. * Check for anomaly rates chart only with a valid pointer. * Remove duplicate code. * Use a more descriptive name for id/title pair variable
2022-02-17Docs: Removed Google Analytics tags (#12145)Tina Luedtke
2022-02-10Docs: Fix paths to install boxes (#12109)Tina Luedtke
* Updated doc to match the new component name * Updated filepaths to match learn repo
2022-02-08Added interactive kickstart scripts where possible (#12098)Tina Luedtke
2022-02-02Docs install cleanup (#12057)Tina Luedtke
* Extensively reworked macOS installation page. * Removing outdated information * Updated more instances of the old kickstart script * Update kickstart command with tmp directories * amend command to avoid merge conflict * Removed reviewers note
2022-01-10Update dependencies for the pubsub exporting connector (#11872)Vladimir Kobal
2022-01-10Fix unit tests for Prometheus remote write exporting connector (#11883)Vladimir Kobal
2021-12-20Fix title of exporting reference doc (#11252)Joel Hans
* Fix title of exporting reference doc * Align titles
2021-12-20Fix slight errors (#11902)AR Dabbour
2021-12-20fix(docs): unresolved file references (#11903)Ilya Mashchenko
2021-11-19Cleanup compilation warnings (#11810)Stelios Fragkakis
* Fix compilation warnings (variables used when debugging is enabled using NETDATA_INTERNAL_CHECKS) * Fix compilation warning (casting)
2021-11-16Fix typos (#11782)Dimitris Apostolou
Co-authored-by: ilyam8 <ilya@netdata.cloud>
2021-10-22Reuse the SN_EXISTS bit to track anomaly status. (#11154)vkalintiris
* Replace all usages of SN_EXISTS with SN_DEFAULT_FLAGS. * Remove references to SN_NOT_EXISTS in comments. * Replace raw zero constant with SN_EMPTY_SLOT. * Use get_storage_number_flags only in storage_number.{c,h} * Compare against SN_EMPTY_SLOT to check if a storage_number exists. This is safe because: 1. rrdset_done_interpolate() is the only place where we call store_metric(), 2. All store_metric() calls, except for one, store an SN_EMPTY_SLOT value. 3. When we are not storing an SN_EMPTY_SLOT value, the flags that we pass to pack_storage_number() can be either SN_EXISTS *or* SN_EXISTS_RESET. * Compare only the SN_EXISTS_RESET bit to find reset values. * Remove get_storage_number_flags from storage_number.h * Do not set storage_number flags outside of rrdset_done_interpolate(). This is an NFC intended to limit the scope of storage_number flags processing to just one function. * Set reset bit without overwriting the rest of the flags. * Rename SN_EXISTS to SN_ANOMALY_BIT. * Use GOTOs in pack_storage_number to return from a single place. * Teach pack_storage_number how to handle anomalous zero values. Up until now, a storage_number had always either the SN_EXISTS or SN_EXISTS_RESET bit set. This meant that it was not possible for any packed storage_number to compare equal to the SN_EMPTY_SLOT. However, the SN_ANOMALY_BIT can be set to zero. This is fine for every value other than the anomalous 0 value, because it would compare equal to SN_EMPTY_SLOT. We address this issue by mapping the anomalous zero value to SN_EXISTS_100 (a number which was not possible to generate with the previous versions of the agent, ie. it won't exist in older dbengine files). This change was tested manually by intentionally flipping the anomaly bit for odd/even iterations in rrdset_done_interpolate. Prior to this change, charts whose dimensions had 0 values were showing up in the dashboard as gaps (SN_EMPTY_SLOT), whereas with this commit the values are displayed correctly.
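The anomalous-zero handling described in the commit message above can be illustrated with a minimal sketch. This is not the agent's pack_storage_number(): the `sketch_*` names, the 24-bit integer magnitude, and the bit positions are all assumptions made for illustration, and the real function packs floating-point values. The sketch only shows why a cleared anomaly bit plus a zero value would collide with the empty-slot marker, and how remapping to a dedicated flag avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit packed sample; layout is assumed, not taken from netdata. */
typedef uint32_t sketch_sn;

#define SKETCH_EMPTY_SLOT  ((sketch_sn)0u)           /* "no sample stored" marker      */
#define SKETCH_VALUE_MASK  ((sketch_sn)0x00FFFFFFu)  /* low 24 bits hold the magnitude */
#define SKETCH_ANOMALY_BIT ((sketch_sn)1u << 24)     /* set = sample is NOT anomalous  */
#define SKETCH_RESET_BIT   ((sketch_sn)1u << 25)     /* set = counter was reset        */
#define SKETCH_EXISTS_100  ((sketch_sn)1u << 26)     /* stand-in for SN_EXISTS_100     */

/* Pack a magnitude and its flags into one word. */
sketch_sn sketch_pack(uint32_t magnitude, bool anomalous, bool reset) {
    assert(magnitude <= SKETCH_VALUE_MASK);

    sketch_sn sn = magnitude;
    if (!anomalous) sn |= SKETCH_ANOMALY_BIT;
    if (reset)      sn |= SKETCH_RESET_BIT;

    /* An anomalous zero with no reset would pack to all zero bits and be
     * indistinguishable from an empty slot, so remap it to a dedicated flag. */
    if (sn == SKETCH_EMPTY_SLOT) sn = SKETCH_EXISTS_100;
    return sn;
}

/* A sample exists if it differs from the empty-slot marker. */
bool sketch_exists(sketch_sn sn) {
    return sn != SKETCH_EMPTY_SLOT;
}

/* A stored sample is anomalous when the anomaly bit is cleared. */
bool sketch_is_anomalous(sketch_sn sn) {
    return sketch_exists(sn) && !(sn & SKETCH_ANOMALY_BIT);
}

/* Recover the magnitude; the remapped zero decodes to 0 because
 * SKETCH_EXISTS_100 lies outside the value mask. */
uint32_t sketch_magnitude(sketch_sn sn) {
    return sn & SKETCH_VALUE_MASK;
}
```

With this layout, `sketch_pack(0, true, false)` yields `SKETCH_EXISTS_100` rather than `SKETCH_EMPTY_SLOT`, so a reader that compares against the empty-slot marker still sees the sample as present, anomalous, and equal to zero.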
2021-09-04Clean netdata naming (#11484)Andrew Maguire
* replace "NetData" with "Netdata" * replace "NetData" with "Netdata"
2021-08-04Add HTTP basic authentication to some exporting connectors (#11394)Vladimir Kobal
2021-07-02[docs] fix prometheus node cpu alert rule (#11309)Ilya Mashchenko
2021-06-28Extra posthog attributes (#11237)Emmanuel Vasilakis
* add some more analytics items to posthog * add CI check * use empty string if install_type can not be read * better check of CI variable * reduce scope * get prebuilt distro * check for legacy/ng aclk implementation * use else * add list delimiter to exporting * Revert "check for legacy/ng aclk implementation" This reverts commit 4f0adf872176d75f75232ac95117beebfffdd50d. * formatting * use snprintfz * use a function for getting the value * fix buf size and formatting * fix crash when exporting is not enabled * remove netdata_is_in_ci
2021-04-27Provide more agent analytics to posthog (#11020)Emmanuel Vasilakis
* Move statistics related functions to analytics.c * error message change, space added after if * start an analytics thread * use heartbeat instead of sleep * add late environment (after rrdinit) pick of some attributes * change loop * re-enable info messages * remove possible new line * log and report hits on allmetrics pages. detect if exporting engines are enabled/in use, and report them * use lowercase for analytics variables * add collectors * add buildinfo * more attributes from late environment * add new attributes to v1/info * re-gather meta data before exit. update allmetrics counters to be available in v1/info * log hits to dashboard * add mirrored hosts * added notification methods * fix spaces, proper JSON naming * add alerts, charts and metrics count * more attributes * keep the thread up, and report a meta event every 2 hours * small formatting changes. Disable analytics_log_prometheus when unit testing. Add the new attributes to the anonymous-statistics.sh.in script * applied clang-format * don't gather data again on exit * safe buffer length in snprintfz * add rrdset lock * remove show_archived * remove setenv * calculate lengths during sets
2021-04-21Revert "Provide more agent analytics to posthog (#10887)" (#11011)Emmanuel Vasilakis