path: root/web
Age  Commit message  Author
2022-08-22  add docker dashboard info (#13547)  Ilya Mashchenko
2022-08-19  Cleanup of APIs (#13539)  Timotej S
ACLK related API cleanup
2022-08-18  Add summary dashboard for PostgreSQL (#13534)  Shyam Sreevalsan
Co-authored-by: Ilya Mashchenko <ilya@netdata.cloud>
2022-08-11  chore(python.d): remove python.d/* announced in v1.36.0 deprecation notice (#13503)  Ilya Mashchenko
2022-08-08  add PgBouncer charts description and icon to dashboard info (#13493)  Ilya Mashchenko
Co-authored-by: thiagoftsm <thiagoftsm@gmail.com>
2022-08-05  Trimmed-median, trimmed-mean and percentile (#13469)  Costa Tsaousis
2022-08-04  chore: add WireGuard description and icon to dashboard info (#13483)  Ilya Mashchenko
2022-08-03  update postgres dashboard info (#13474)  Ilya Mashchenko
2022-08-01  /api/v1/weights endpoint (#13449)  Costa Tsaousis
* /api/v1/weights endpoints * high resolution anomaly rate in parallel with queries; points and options in /api/v1/weights reflect the truth * context printing * merged metric_correlations with weights API; added parameter tier to select the tier to run the query; weight api now returns points per tier; added swagger info about weights api * moved metric_correlations files to web/api/queries as weights * added contexts filtering; renamed correlated_dimensions; weights API is always enabled; code cleanup * allow returning zero results
2022-08-01  rrdcontext support for hidden charts (#13466)  Costa Tsaousis
* rrdcontext support for hidden charts * support unhiding charts
2022-07-28  Get last_entry_t only when st changes (#13448)  Emmanuel Vasilakis
get last_entry_t when st changes
2022-07-28  additional stats (#13445)  Costa Tsaousis
2022-07-27  Fix typo in PostgreSQL section header (#13440)  Shyam Sreevalsan
* Fix typo in PostgreSQL section header * Update dashboard_info.js
2022-07-26  Tiering statistics API endpoint (#13420)  Costa Tsaousis
* calculator statistics * added metrics and metrics_pages counters * implemented API * updates to match sheet * updates to match sheet No2 * fix update every calculation for single point pages * fix lgtm finding
2022-07-26  Set value to SN_EMPTY_SLOT if flags is SN_EMPTY_SLOT (#13417)  Emmanuel Vasilakis
* set value to SN_EMPTY_SLOT if flags is SN_EMPTY_SLOT * SN_EMPTY_SLOT should be SN_ANOMALOUS_ZERO * added the const attribute to pack_storage_number() * tier1 uses floats * zero should not be empty slot * add unlikely * rename all SN flags to be more meaningful * proper check for zero double value * properly check if pages are full with empty points Co-authored-by: Costa Tsaousis <costa@netdata.cloud>
2022-07-24  Rrdcontext (#13335)  Costa Tsaousis
* type checking on dictionary return values * first STRING implementation, used by DICTIONARY and RRDLABEL * enable AVL compilation of STRING * Initial functions to store context info * Call simple test functions * Add host_id when getting charts * Allow host to be null and in this case it will process the localhost * Simplify init Do not use strdupz - link directly to sqlite result set * Init the database during startup * make it compile - no functionality yet * intermediate commit * intermidiate * first interface to sql * loading instances * check if we need to update cloud * comparison of rrdcontext on conflict * merge context titles * rrdcontext public interface; statistics on STRING; scratchpad on DICTIONARY * dictionaries maintain version numbers; rrdcontext api * cascading changes * first operational cleanup * string unittest * proper cleanup of referenced dictionaries * added rrdmetrics * rrdmetric starting retention * Add fields to context Adjuct context creation and delete * Memory cleanup * Fix get context list Fix memory double free in tests Store context with two hosts * calculated retention * rrdcontext retention with collection * Persist database and shutdown * loading all from sql * Get chart list and dimension list changes * fully working attempt 1 * fully working attempt 2 * missing archived flag from log * fixed archived / collected * operational * proper cleanup * cleanup - implemented all interface functions - dictionary react callback triggers after the dictionary is unlocked * track all reasons for changes * proper tracking of reasons of changes * fully working thread * better versioning of contexts * fix string indexing with AVL * running version per context vs hub version; ifdef dbengine * added option to disable rrdmetrics * release old context when a chart changes context * cleanup properly * renamed config * cleanup contexts; general cleanup; * deletion inline with dequeue; lots of cleanup; child connected/disconnected * ml should start 
after rrdcontext * added missing NULL to ri->rrdset; rrdcontext flags are now only changed under a mutex lock * fix buggy STRING under AVL * Rework database initialization Add migration logic to the context database * fix data race conditions during context deletion * added version hash algorithm * fix string over AVL * update aclk-schemas * compile new ctx related protos * add ctx stream message utils * add context messages * add dummy rx message handlers * add the new topics * add ctx capability * add helper functions to send the new messages * update cmake build to not fail * update topic names * handle rrdcontext_enabled * add more functions * fatal on OOM cases instead of return NULL * silence unknown query type error * fully working attempt 1 * fully working attempt 2 * allow compiling without ACLK * added family to the context * removed excess character in UUID * smarter merging of titles and families * Database migration code to add family Add family to SQL_CHART_DATA and VERSIONED_CONTEXT_DATA * add family to context message * enable ctx in communication * hardcoded enabled contexts * Add hard code for CTX * add update node collectors to json * add context message log * fix log about last_time_t * fix collected flags for queued items * prevent crash on charts cleanup * fix bug in AVL indexing of dictionaries; make sure react callback of dictionaries has a reference counter, which is acquired while the dictionary is locked * fixed dictionary unittest * strict policy to cleanup and garbage collector * fix db rotation and garbage collection timings * remove deadlock * proper garbage collection - a lot faster retention recalculation * Added not NULL in database columns Remove migration code for context -- we will ship with version 1 of the table schema Added define for query in tests to detect localhost * Use UUID_STR_LEN instead of GUID_LEN + 1 Use realistic timestamps when adding test data in the database * Add NULL checks for passed parameters * Log deleted 
context when compiled with NETDATA_INTERNAL_CHECKS * Error checking for null host id * add missing ContextsCheckpoint log convertor * Fix spelling in VACCUM * Hold additional information for host -- prepare to load archived hosts on startup * Make sure claim id is valid * is_get_claimed is actually get the current claim id * Simplify ctx get chart list query * remove env negotiation * fix string unittest when there are some strings already in the index * propagate live-retention flag upstream; cleanup all update reasons; updated instances logging; automated attaching started/stopped collecting flags; * first implementation of /api/v1/contexts * full contexts API; updated swagger * disabled debugging; rrdcontext enabled by default * final cleanup and renaming of global variables * return current time on currently collected contexts, charts and dimensions * added option "deepscan" to the API to have the server refresh the retention and recalculate the contexts on the fly * fixed identation of yaml * Add constrains to the host table * host->node_id may not be available * new capabilities * lock the context while rendering json * update aclk-schemas * added permanent labels to all charts about plugin, module and family; added labels to all proc plugin modules * always add the labels * allow merging of families down to [x] * dont show uuids by default, added option to enable them; response is now accepting after,before to show only data for a specific timeframe; deleted items are only shown when "deleted" is requested; hub version is now shown when "queue" is requested * Use the localhost claim id * Fix to handle host constrains better * cgroups: add "k8s." 
prefix to chart context in k8s * Improve sqlite metadata version migration check * empty values set to "[none]"; fix labels unit test to reflect that * Check if we reached the version we want first (address CODACY report re: Array index 'i' is used before limits check) * Rewrite condition to address CODACY report (Redundant condition: t->filter_callback. '!A || (A && B)' is equivalent to '!A || B') * Properly unlock context * fixed memory leak on rrdcontexts - it was not freeing all dictionaries in rrdhost; added wait of up to 100ms on dictionary_destroy() to give time to dictionaries to release their items before destroying them * fixed memory leak on rrdlabels not freed on rrdinstances * fixed leak when dimensions and charts are redefined * Mark entries for charts and dimensions as submitted to the cloud 3600 seconds after their creation Mark entries for charts and dimensions as updated (confirmed by the cloud) 1800 seconds after their submission * renamed struct string * update cgroups alarms * fixed codacy suggestions * update dashboard info * fix k8s_cgroup_10s_received_packets_storm alarm * added filtering options to /api/v1/contexts and /api/v1/context * fix eslint * fix eslint * Fix pointer binding for host / chart uuids * Fix cgroups unit tests * fixed non-retention updates not propagated upstream * removed non-fatal fatals * Remove context from 2 way string merge. * Move string_2way_merge to dictionary.c * Add 2-way string merge tests. 
* split long lines * fix indentation in netdata-swagger.yaml * update netdata-swagger.json * yamllint please * remove the deleted flag when a context is collected * fix yaml warning in swagger * removed non-fatal fatals * charts should now be able to switch contexts * allow deletion of unused metrics, instances and contexts * keep the queued flag * cleanup old rrdinstance labels * dont hide objects when there is no filter; mark objects as deleted when there are no sub-objects * delete old instances once they changed context * delete all instances and contexts that do not have sub-objects * more precise transitions * Load archived hosts on startup (part 1) * update the queued time every time * disable by default; dedup deleted dimensions after snapshot * Load archived hosts on startup (part 2) * delayed processing of events until charts are being collected * remove dont-trigger flag when object is collected * polish all triggers given the new dont_process flag * Remove always true condition Enums for readbility / create_host_callback only if ACLK is enabled (for now) * Skip retention message if context streaming is enabled Add messages in the access log if context streaming is enabled * Check for node id being a UUID that can be parsed Improve error check / reporting when loading archived hosts and creating ACLK sync threads * collected, archived, deleted are now mutually exclusive * Enable the "orphan" handling for now Remove dead code Fix memory leak on free host * Queue charts and dimensions will be no-op if host is set to stream contexts * removed unused parameter and made sure flags are set on rrdcontext insert * make the rrdcontext thread abort mid-work when exiting * Skip chart hash computation and storage if contexts streaming is enabled Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com> Co-authored-by: Timo <timotej@netdata.cloud> Co-authored-by: ilyam8 <ilya@netdata.cloud> Co-authored-by: Vladimir Kobal <vlad@prokk.net> 
Co-authored-by: Vasilis Kalintiris <vasilis@netdata.cloud>
2022-07-13  chore(dashboard): update chrony dashboard info (#13371)  Ilya Mashchenko
2022-07-12  query engine: omit first point if not needed (#13345)  Costa Tsaousis
* omit first point if not needed * point end time should be bigger
2022-07-06  Multi-Tier database backend for long term metrics storage (#13263)  Stelios Fragkakis
* Tier part 1 * Tier part 2 * Tier part 3 * Tier part 4 * Tier part 5 * Fix some ML compilation errors * fix more conflicts * pass proper tier * move metric_uuid from state to RRDDIM * move aclk_live_status from state to RRDDIM * move ml_dimension from state to RRDDIM * abstracted the data collection interface * support flushing for mem db too * abstracted the query api * abstracted latest/oldest time per metric * cleanup * store_metric for tier1 * fix for store_metric * allow multiple tiers, more than 2 * state to tier * Change storage type in db. Query param to request min, max, sum or average * Store tier data correctly * Fix skipping tier page type * Add tier grouping in the tier * Fix to handle archived charts (part 1) * Temp fix for query granularity when requesting tier1 data * Fix parameters in the correct order and calculate the anomaly based on the anomaly count * Proper tiering grouping * Anomaly calculation based on anomaly count * force type checking on storage handles * update cmocka tests * fully dynamic number of storage tiers * fix static allocation * configure grouping for all tiers; disable tiers for unittest; disable statsd configuration for private charts mode * use default page dt using the tiering info * automatic selection of tier * fix for automatic selection of tier * working prototype of dynamic tier selection * automatic selection of tier done right (I hope) * ask for the proper tier value, based on the grouping function * fixes for unittests and load_metric_next() * fixes for lgtm findings * minor renames * add dbengine to page cache size setting * add dbengine to page cache with malloc * query engine optimized to loop as little are required based on the view_update_every * query engine grouping methods now do not assume a constant number of points per group and they allocate memory with OWA * report db points per tier in jsonwrap * query planer that switches database tiers on the fly to satisfy the query for the entire timeframe * 
dbegnine statistics and documentation (in progress) * calculate average point duration in db * handle single point pages the best we can * handle single point pages even better * Keep page type in the rrdeng_page_descr * updated doc * handle future backwards compatibility - improved statistics * support &tier=X in queries * enfore increasing iterations on tiers * tier 1 is always 1 iteration * backfilling higher tiers on first data collection * reversed anomaly bit * set up to 5 tiers * natural points should only be offered on tier 0, except a specific tier is selected * do not allow more than 65535 points of tier0 to be aggregated on any tier * Work only on actually activated tiers * fix query interpolation * fix query interpolation again * fix lgtm finding * Activate one tier for now * backfilling of higher tiers using raw metrics from lower tiers * fix for crash on start when storage tiers is increased from the default * more statistics on exit * fix bug that prevented higher tiers to get any values; added backfilling options * fixed the statistics log line * removed limit of 255 iterations per tier; moved the code of freezing rd->tiers[x]->db_metric_handle * fixed division by zero on zero points_wanted * removed dead code * Decide on the descr->type for the type of metric * dont store metrics on unknown page types * free db_metric_handle on sql based context queries * Disable STORAGE_POINT value check in the exporting engine unit tests * fix for db modes other than dbengine * fix for aclk archived chart queries destroying db_metric_handles of valid rrddims * fix left-over freez() instead of OWA freez on median queries Co-authored-by: Costa Tsaousis <costa@netdata.cloud> Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-06-30  query engine fixes for alarms and dashboards (#13282)  Costa Tsaousis
* fix health alignment to future; fix logs * ensure the query always covers the entire duration requested * better comments
2022-06-30  Rename the chart of real memory usage in FreeBSD (#13271)  Vladimir Kobal
2022-06-30  Fix alignment in charts endpoint (#13275)  thiagoftsm
2022-06-30  Update documentation about our REST API documentation. (#13269)  Austin S. Hemmelgarn
2022-06-29  Query engine with natural and virtual points (#13248)  Costa Tsaousis
* new query engine * use Index * Revert change that changed in-memory page indexing to start time - update_every + 1 * use internal_error() to cleanup the code * interpolates values when generating points Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
2022-06-28  netdata doubles (#13217)  Costa Tsaousis
* netdata doubles * fix cmocka test * fix cmocka test again * fix left-overs of long double to NETDATA_DOUBLE * RRDDIM detached from disk representation; db settings in [db] section of netdata.conf * update the memory before saving * rrdset is now detached from file structures too * on memory mode map, update the memory mapped structures on every iteration * allow RRD_ID_LENGTH_MAX to be changed * granularity secs, back to update every * fix formatting * more formatting
2022-06-27  Removes Legacy JSON Cloud Protocol Support In Agent (#13111)  Timotej S
* removes old protocol support (cloud removed support already)
2022-06-23  Print INTERNAL BUG messages only when NETDATA_INTERNAL_CHECKS is enabled (#13207)  Emmanuel Vasilakis
put INTERNAL BUG messages inside NETDATA_INTERNAL_CHECKS
2022-06-22  Query Engine multi-granularity support (and MC improvements) (#13155)  Costa Tsaousis
* set grouping functions * storage engine should check the validity of timestamps, not the query engine * calculate and store in RRDR anomaly rates for every query * anomaly rate used by volume metric correlations * mc volume should use absolute data, to avoid cancelling effect * return anomaly-rates in jasonwrap with jw-anomaly-rates option to data queries * dont return null on anomaly rates * allow passing group query options from the URL * added countif to the query engine and used it in metric correlations * fix configure * fix countif and anomaly rate percentages * added group_options to metric correlations; updated swagger * added newline at the end of yaml file * always check the time the highlighted window was above/below the highlighted window * properly track time in memory queries * error for internal checks only * moved pack_storage_number() into the storage engines * moved unpack_storage_number() inside the storage engines * remove old comment * pass unit tests * properly detect zero or subnormal values in pack_storage_number() * fill nulls before the value, not after * make sure math.h is included * workaround for isfinite() * fix for isfinite() * faster isfinite() alternative * fix for faster isfinite() alternative * next_metric() now returns end_time too * variable step implemented in a generic way * remove left-over variables * ensure we always complete the wanted number of points * fixes * ensure no infinite loop * mc-volume-improvements: Add information about invalid condition * points should have a duration in the past * removed unneeded info() line * Fix unit tests for exporting engine * new_point should only be checked when it is fetched from the db; better comment about the premature breaking of the main query loop Co-authored-by: Thiago Marques <thiagoftsm@gmail.com> Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-06-22  Update dashboard to version v2.26.5. (#13192)  Netdata bot
Co-authored-by: netdatabot <netdatabot@users.noreply.github.com>
2022-06-20  add k8s_state dashboard_info (#13181)  Ilya Mashchenko
2022-06-20  Update dashboard to version v2.26.2. (#13177)  Netdata bot
Co-authored-by: netdatabot <netdatabot@users.noreply.github.com>
2022-06-20  feat(proc/proc_net_dev): add dim per phys link state to the "Interface Physical Link State" chart (#13176)  Ilya Mashchenko
* add dim per carrier state * fix down state
2022-06-17  feat(proc/proc_net_dev): add dim per duplex state to the "Interface Duplex State" chart (#13165)  Ilya Mashchenko
2022-06-17  Revert "Configurable storage engine for Netdata agents: step 3 (#12892)" (#13171)  vkalintiris
This reverts commit 100a12c6cc01222b1518e5e50d2147f592d8a111. A couple parent/child startup/shutdown scenarios can lead to crashes.
2022-06-17  feat(proc/proc_net_dev): add dim per operstate to the "Interface Operational State" chart (#13167)  Ilya Mashchenko
2022-06-17  Fix data query on stale chart (#13159)  Stelios Fragkakis
* Fix data query on stale chart * Remove more checks vs the last timestamp of a point collected
2022-06-16  Configurable storage engine for Netdata agents: step 3 (#12892)  Adrien Béraud
* storage engine: add host context API Add a new API to allow storage engines to manage host contexts. * Replace single global context with per-engine global context * Context is full managed by storage engines: a storage engine can use no context, a global engine context, per host contexts, or a mix of these. * Currently, only dbengine uses contexts. Following the current logic, legacy hosts use their own context, while non-legacy hosts share the global context. * storage engine: use empty function instead of null for context ops * rrdhost: don't check return value for void call * rrdhost: create context with host * storage engine: move rrddim ops to rrddim_mem.{c,h} * storage engine: don't use NULL for end-of-list marker * storage engine: fallback to default engine
2022-06-16  Add mem.available chart to FreeBSD (#13140)  Emmanuel Vasilakis
2022-06-15  use ks2 as MC default (#13131)  Andrew Maguire
2022-06-14  fixed coveriry 379136 379135 379134 379133 (#13123)  Costa Tsaousis
2022-06-13  73x times faster metrics correlations at the agent (#13107)  Costa Tsaousis
* faster correlations * 4x times faster correlations * a little bit more help * 10x times faster metrics correlations * 6 digits precision; better comments * enabled metrics correlations by default * abstracted DIFFS_NUMBER to allow easily changing it * reworked the entire logic to have more accuracy and support a baseline that is power of two multiple of highlight * properly calculate shifts * even more improved version * added support for timeout; fixed another memory leak; skipped hidden dimensions * default timeout 1min * reduce memory even further * use dictionary for the list of charts and optimize locks * return 403 forbidden, when mc is not enabled * added query options * dont process zero dimensions * added volume method as an option to metric correlations ; now metric correlations can support multiple implementations * make sure we will never crash * spread results evenly for both kstwo and volume * fixed bug in query engine that was missing misaligned queries when a single point was requested from the db; improved comments; improved query flags * updated swagger and added sane defaults; query options are now supported, including anomaly-bit * added "raw" option to allow cross node correlations; added "group" option to allow different time aggregations; allowed calling metric correlations without any parameters; allowed calling metric correlations with relative timestamps; added timeout to volume method; properly handled timeout on ks2 method; json output now sends all parameters back - same for json_wrap; modified query engine to use present time for relative timestamps; modified "allow_past" to mean both past backwards and forwards * emulate the old behaviour about zero points * 100% accuracy against python ks_2samp(); now the default is volume and the default points are 500 * added config option to change default metric correlations method * removed work-arounds now that rrdlabels are merged
2022-06-13  Labels with dictionary (#13070)  Costa Tsaousis
* squashed and rebased to master * fix overflow and single character bug in sanitize; include rrd.h instead of node_info.h * added unittest for UTF-8 multibyte sanitization * Fix unit test compilation * Fix CMake build * remove double sanitizer for opentsdb; cleanup sanitize_json_string() * rename error_description to error_message to avoid conflict with json-c * revert last and undef error_description from json-c * more unittests; attempt to fix protobuf map issue * get rid of rrdlabels_get() and replace it with a safe version that writes the value to a buffer * added dictionary sorting unittest; rrdlabels_to_buffer() now is sorted * better sorted dictionary checking * proper unittesting for sorted dictionaries * call dictionary deletion callback when destroying the dictionary * remove obsolete variable * Fix exporting unit tests * Fix k8s label parsing test * workaround for cmocka and strdupz() * Bypass cmocka memory allocation check * Revert "Bypass cmocka memory allocation check" This reverts commit 4c49923839d9229bea23ca914dd8a0be1ebe2bf4. * Revert "workaround for cmocka and strdupz()" This reverts commit 7bebee04801db1865c748a7896d5fa54bb7104a5. * Bypass cmocka memory allocation checks * respect json formatting for chart labels * cloud sends colons * print the value only once * allow parenthesis in values and spaces; make stream sender send quotes for values Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-06-02  Initialize chart label key parameter correctly (#13061)  Stelios Fragkakis
Pass chart label key
2022-06-02  Fix coverity 378625 (#13055)  Emmanuel Vasilakis
free jsonb buffer
2022-06-01  Dictionary with JudyHS and double linked list (#13032)  Costa Tsaousis
* dictionary internals isolation * more dictionary cleanups * added unit test * we should use DICT internally * disable cups in cmake * implement DICTIONARY with Judy arrays * operational JUDY implementation * JUDY cleanup * JUDY summary added * JudyHS implementation with double linked list * test negative searches too * optimize destruction * optimize set to insert first without lookup * updated stats * code cleanup; better organization; updated info * more code cleanup and commenting * more cleanup, renames and comments * fix rename * more cleanups * use Judy.h from system paths * added foreach traversal; added flag to add item in front; isolated locks to their own functions; destruction returns the number of bytes freed * more comments; flags are now 16-bit * completed unittesting * addressed comments and added reference counters maintainance * added unittest in main; tested removal of items in front, back and middle * added read/write walkthrough and foreach; allowed walkthrough and foreach in write mode to delete the current element (used by cups.plugin); referenced counters removed from the API * DICTFE.name should be const too * added API calls for exposing all statistics * dictionary flags as enum and reference counters as atomic operations * more comments; improved error handling at unit tests * added functions to allow unsafe access while traversing the dictionary with locks in place * check for libcups in cmake * added delete callback; implemented statsd with this dictionary * added missing dfe_done() * added alternative implementation with AVL * added documentation * added comments and warning about AVL * dictionary walktrhough on new code * simplified foreach; updated docs * updated docs * AVL is much faster without hashes * AVL should follow DBENGINE
2022-05-31  Add additional metadata to the data response (#13036)  Stelios Fragkakis
* Consolidate query params * Add new option to show full dimensions in the json header (this will include dimensions, charts and chart labels) * Group and pass parameters with query_params
2022-05-31  Fix coverity issue 378617,378615 (#13021)  Stelios Fragkakis
* Fix CID 378617 * Fix CID 378615 * Make sure the ST rrdr lock indicator is set/reset while holding a lock * Switch to int
2022-05-28  add hostname to mirrored hosts (#13030)  Costa Tsaousis
2022-05-27  Update dashboard to version v2.25.6. (#13028)  Netdata bot
Co-authored-by: netdatabot <netdatabot@users.noreply.github.com>
2022-05-27  prevent gap filling on dbengine gaps (#13027)  Costa Tsaousis