summaryrefslogtreecommitdiffstats
path: root/web
AgeCommit message (Collapse)Author
2023-06-20Update dashboard to version v3.0.0. (#15219)Netdata bot
Co-authored-by: netdatabot <netdatabot@users.noreply.github.com>
2023-06-19Obvious memory reductions (#15204)Costa Tsaousis
* remove rd->update_every * reduce amount of memory for RRDDIM * reorgnize rrddim->db entries * optimize rrdset and statsd * optimize dictionaries * RW_SPINLOCK for dictionaries * fix codeql warning * rw_spinlock improvements * remove obsolete assertion * fix crash on health_alarm_log_process() * use RW_SPINLOCK for AVL trees * add RW_SPINLOCK read/write trylock * pgc and mrg now use rw_spinlocks; cache line optimizations for mrg * thread tag of dbegnine init * append created datafile, lockless * make DOUBLE_LINKED_LIST_APPEND_ITEM_UNSAFE friendly for lockless use * thread cancelability in spinlocks; optimize thread cancelability management * introduce a JudyL to index datafiles and use it during queries to quickly find the relevant files * use the last timestamp of each journal file for indexing * when the previous cannot be found, start from the beginning * add more stats to PDC to trace routing easier * rename spinlock functions * fix for spinlock renames * revert statsd socket statistics to size_t * turn fatal into internal_fatal() * show candidates always * show connected status and connection attempts
2023-06-19/api/v2/nodes and streaming function (#15168)Costa Tsaousis
* dummy streaming function * expose global functions upstream * separate function for pushing global functions * add missing conditions * allow streaming function to run async * started internal API for functions * cache host retention and expose it to /api/v2/nodes * internal API for function table fields; more progress on streaming status * abstracted and unified rrdhost status * port old coverity warning fix - although it is not needed * add ML information to rrdhost status * add ML capability to streaming to signal the transmission of ML information; added ML information to host status * protect host->receiver * count metrics and instances per host * exposed all inbound and outbound streaming * fix for ML status and dependency of DATA_WITH_ML to INTERPOLATED, not IEEE754 * update ML dummy * added all fields * added streaming group by and cleaned up accepted values by cloud * removed type * Revert "removed type" This reverts commit faae4177e603d4f85b7433f33f92ef3ccd23976e. * added context to db summary * new /api/v2/nodes schema * added ML type * change default function charts * log to trace new capa * add more debug * removed debugging code * retry on receive interrupted read; respect sender reconnect delay in all cases * set disconnected host flag and manipulate localhost child count atomically, inside set/clear receiver * fix infinite loop * send_to_plugin() now has a spinlock to ensure that only 1 thread is writing to the plugin/child at the same time * global cloud_status() call * cloud should be a section, since it will contain error information * put cloud capabilities into cloud * aclk status in /api/v2 agents sections * keep aclk_connection_counter * updates on /api/v2/nodes * final /api/v2/nodes and addition of /api/v2/nodes_instances * parametrize all /api/v2/xxx output to control which info is outputed per endpoint * always accept nodes selector * st needs to be per instance, not per node * fix merging of contexts; fix cups plugin priorities * add after and before parameters to /api/v2/contexts/nodes/nodes_instances/q * give each libuv worker a unique id * aclk http_api_v2 version 4
2023-06-14Redirect to index.html when a file is not found by web server (#15143)Emmanuel Vasilakis
* redirect to index.html when a file is not found * return index.html only when not found is likely a directory * recursively find the first path that works --------- Co-authored-by: Costa Tsaousis <costa@netdata.cloud>
2023-06-08api v2 nodes for streaming statuses (#15162)Costa Tsaousis
* api v2 nodes for streaming statuses * remove test * move parts of the output * in api/v2/data return 5 values per point when aggregation=percentage and raw option is given; return final values when aggregation=percentage is not the final grouping
2023-06-07Re-write of SSL support in Netdata; restoration of SIGCHLD; detection of ↵Costa Tsaousis
stale plugins; streaming improvements (#15113) * add information about streaming connections to /api/v2/nodes; reset defer time when sender or receivers connect or disconnect * make each streaming destination respect its SSL settings * to not send SSL traffic over non-SSL connection * keep track of outgoing streaming connection attempts * retry SSL reads when SSL_read() returns SSL_ERROR_WANT_READ * Revert "retry SSL reads when SSL_read() returns SSL_ERROR_WANT_READ" This reverts commit 14c858677c6f2d3b08c94f298e2f45ecdb74c801. * cleanup SSL connections properly * initialize SSL in rpt before takeover * sender should free SSL when talking to a non-SSL destination * do not shutdown SSL when receiver exits * restore operation of SIGCHLD when the reaper is not enabled * create an fgets function that checks for data and times out * work on error handling of plugins exiting * remove newlines from logs * global call to waitid(), caching the result for netdata_pclose() to process * receiver tid * parser timeouts in 2 minutes instead of 10 * fix crash when UUID is NULL in SQLite * abstract sqlite3 parsing for uuid and text * write proper ssl errors on read and write * fix for SSL_ERROR_WANT_RETRY_VERIFY * SSL WANT per function * unified SSL error logging * fix compilation warning * additional logging about parser cleanup * streaming parser should call the pluginsd parser cleanup * SSL error handling work * SSL initialization unification * check for pending data when receiving SSL response with timeout * macro to check if an SSL connection has been established * remove SSL_pending() * check for SSL macros * use SSL_peek() to find if there is a response * SSL renames * more SSL renames & cleanup * rrdpush ssl connection function * abstract all SSL functions into security.c * keep track of SSL connections and always attempt to use SSL read/write when on SSL connection * signal openssl to skip certificate validation when configured to do so * better SSL error handling and logging * SSL code cleanup * SSL retry on SSL_connect and SSL_accept * SSL provide default return value for old compilers * SSL read/write functions emulate system read/write functions * fix receive/send timeout and switch from SSL_peek() to SSL_pending() * remove SSL_pending() * removed sender auto-retry and debug info for initial recevier response * ssl skip certificate verification config for web server * ssl errors log ip and port of the peer * keep ssl with web_client for its whole lifetime * thread safe socket peers to text * use error_limit() for common ssl errors * cleanup * more cleanup * coverity fixes * ssl error logs include both local and remote ip/port info * remove obsolete code
2023-05-31Percentage of group aggregatable at cloud - fixed for backwards ↵Costa Tsaousis
compatibility (#15126) * percentage of group is now aggregatable at cloud across multiple nodes * do not break backwards compatibility with percentage-of-instance * calculate the percentage when percentage-of-instance is requested * increase capability version
2023-05-31Revert "percentage of group is now aggregatable at cloud across multiple ↵Costa Tsaousis
nodes" (#15122) Revert "percentage of group is now aggregatable at cloud across multiple nodes (#15109)" This reverts commit 44b6c223b3e13774df45a96dd48588aa8a66ba42.
2023-05-30percentage of group is now aggregatable at cloud across multiple nodes (#15109)Costa Tsaousis
2023-05-27percentage-of-group: fix uninitialized array vh (#15106)Costa Tsaousis
fix uninitialized array vh
2023-05-25/api/v2/data percentage calculation on grouped queries (#15100)Costa Tsaousis
allow aggregation=percentage to calculate the percentage over any grouping
2023-05-23Better cleanup of health log table (#15045)Emmanuel Vasilakis
2023-05-15Debugfs collector (#15017)thiagoftsm
Co-authored-by: Fotis Voutsas <fotis@netdata.cloud> Co-authored-by: Austin S. Hemmelgarn <ahferroin7@gmail.com> Co-authored-by: ilyam8 <ilya@netdata.cloud>
2023-05-10make zlib compulsory dep (#14928)Timotej S
zlib compulsory
2023-05-10initial minimal h2o webserver integration (#14585)Timotej S
Introduces h2o based web server as an alternative
2023-05-03Fix coverity issues (#15005)Stelios Fragkakis
Fix coverity issues 384022, 384021, 383825
2023-05-03weights endpoint: volume diff of anomaly rates (#15004)Costa Tsaousis
when querying the volume delta of anomaly rates, only consider higher highlighted area anomalies
2023-04-27interrupt callback on api/v1/data (#14978)Costa Tsaousis
* interrupt callback on api/v1/data * smaller tier_grouping member size
2023-04-26do not convert to percentage, when the raw option is given (#14969)Costa Tsaousis
2023-04-24Terminate JSX element in doc file (#14952)Fotis Voutsas
Terminate JSX element
2023-04-22Update README.md (#14948)Chris Akritidis
2023-04-20WEBRTC for communication between agents and browsers (#14874)Costa Tsaousis
* initial webrtc setup * missing files * rewrite of webrtc integration * initialization and cleanup of webrtc connections * make it compile without libdatachannel * add missing webrtc_initialize() function when webrtc is not enabled * make c++17 optional * add build/m4/ax_compiler_vendor.m4 * add ax_cxx_compile_stdcxx.m4 * added new m4 files to makefile.am * id all webrtc connections * show warning when webrtc is disabled * fixed message * moved all webrtc error checking inside webrtc.cpp * working webrtc connection establishment and cleanup * remove obsolete code * rewrote webrtc code in C to remove dependency for c++17 * fixed left-over reference * detect binary and text messages * minor fix * naming of webrtc threads * added webrtc configuration * fix for thread_get_name_np() * smaller web_client memory footprint * universal web clients cache * free web clients every 100 uses * webrtc is now enabled by default only when compiled with internal checks * webrtc responses to /api/ requests, including LZ4 compression * fix for binary and text messages * web_client_cache is now global * unification of the internal web server API, for web requests, aclk request, webrtc requests * more cleanup and unification of web client timings * fixed compiler warnings * update sent and received bytes * eliminated of almost all big buffers in web client * registry now uses the new json generation * cookies are now an array; fixed redirects * fix redirects, again * write cookies directly to the header buffer, eliminating the need for cookie structures in web client * reset the has_cookies flag * gathered all web client cleanup to one function * fixes redirects * added summary.globals in /api/v2/data response * ars to arc in /api/v2/data * properly handle host impersonation * set the context of mem.numa_nodes
2023-04-18Fix warnings and error when compiling with --disable-dbengine (#14919)Stelios Fragkakis
2023-04-13/api/v2 part 10 (#14904)Costa Tsaousis
/api/v2/weights nonzero output
2023-04-12Update README.md (#14899)Chris Akritidis
2023-04-12Update README.md (#14898)Chris Akritidis
Clarify that these queries work for the `lookup` line in alerts as well.
2023-04-12Collect additional BTRFS metrics (#14636)Dimitris P
* Add commit_stats metrics to BTRFS section * Add error_stats metrics (per device) to BTRFS section * Simplify commit stats variables and chart ids/names * Add basic BTRFS error alarms. Configured to trip whenever any of the error dimensions is non-zero. * Add chart descriptions for new charts. * Remove duplicate code * Comment out some debugging code * Always create error stats dimensions, even if zero * Show rate of commits and commit duration instead of totals * Change current commit metrics to absolute from incremental * Change commits dimension to absolute and add separate commits time share chart * Rename 'device_' rrdlabels to 'filesystem_' * Replace all snprintf() calls with snprintfz() * Fix codacy warning * Provide separate error charts for each filesystem device * Accept code review suggestions for more descriptive context and labels Co-authored-by: Ilya Mashchenko <ilya@netdata.cloud> * Add 'device' prefix to id, name, title of errors chart * Add 'device_id' label to device_errors * Update health.d/btrfs.conf to match new errors charts * Remove commented out code * Do not disable all BTRFS metrics collection if only commit_stats is missing * Do not disable all BTRFS metrics collection if only error_stats is missing * Fix bug of BTRFS device add/remove not being detected properly * Fix double free() error when deleting a device * Update dashboard info with bold tags Co-authored-by: Ilya Mashchenko <ilya@netdata.cloud> --------- Co-authored-by: Austin S. Hemmelgarn <austin@netdata.cloud> Co-authored-by: Ilya Mashchenko <ilya@netdata.cloud>
2023-04-11/api/v2 part 9 (#14888)Costa Tsaousis
* /api/v2/weights now supports group-by * /api/v2/weights now follows the same principles for describing responses as /api/v2/data
2023-04-10/api/v2 part 8 (#14885)Costa Tsaousis
* configure extent cache size * /api/v2/data responsed with the id of dimensions in result.labels * provide the original units and sts to db.dimensions * updated swagger for the changes
2023-04-07Boost dbengine (#14832)Costa Tsaousis
* configure extent cache size * workers can now execute up to 10 jobs in a run, boosting query prep and extent reads * fix dispatched and executing counters * boost to the max * increase libuv worker threads * query prep always get more prio than extent reads; stop processing in batch when dbengine is queue is critical * fix accounting of query prep * inlining of time-grouping functions, to speed up queries with billions of points * make switching based on a local const variable * print one pending contexts loading message per iteration * inlined store engine query API * inlined storage engine data collection api * inlined all storage engine query ops * eliminate and inline data collection ops * simplified query group-by * more error handling * optimized partial trimming of group-by queries * preparative work to support multiple passes of group-by * more preparative work to support multiple passes of group-by (accepts multiple group-by params) * unified query timings * unified query timings - weights endpoint * query target is no longer a static thread variable - there is a list of cached query targets, each of which of freed every 1000 queries * fix query memory accounting * added summary.dimension[].pri and sorted summary.dimensions based on priority and then name * limit max ACLK WEB response size to 30MB * the response type should be text/plain * more preparative work for multiple group-by passes * create functions for generating group by keys, ids and names * multiple group-by passes are now supported * parse group-by options array also with an index * implemented percentage-of-instance group by function * family is now merged in multi-node contexts * prevent uninitialized use
2023-03-28/api/v2/X part 7 (#14797)Costa Tsaousis
* /api/v2/weights, points key renamed to result * /api/v2/weights, add node ids in response * /api/v2/data remove NONZERO flag when all dimensions are zero and fix MIN/MAX grouping and statistics * /api/v2/data expose view.dimensions.sts{} * /api/v2 endpoints expose agents and additional info per node, that is needed to unify cloud responses * /api/v2 nodes output now includes the duration of time spent per node * jsonwrap view object renames and cleanup * rework of the statistics returned by the query engine * swagger work * swagger work * more swagger work * updated swagger json * added the remaining of the /api/v2 endpoints to swagger * point.ar has been renamed point.arp * updated weights endpoint * fix compilation warnings
2023-03-22/api/v2/X part 6 (#14785)Costa Tsaousis
* additional check for consistency of rrdr * do not query the host if the contexts are not initialized for it
2023-03-21/api/v2/X part 5 (#14718)Costa Tsaousis
* query timestamps are now pre-determined and alignment on timestamps is guarranteed * turn internal_fatal() to internal_error() to investigate the issue * handle query when no data exist in the db * check for non NULL dict when running dictionary garbage collect * support API v2 requests via ACLK * add nodes detailed information to /api/v2/nodes * fixed keys and added dummy nodes for completeness * added nodes_hard_hash, alerts_hard_hash, alerts_soft_hash; started building a nodes status object to reflect the current status of a node * make sure replication does not double count charts that are already being replicated * expose min and max in sts structures * added view_minimum_value and view_maximum_value; percentage calculation is now an additional pass on the data, removed from formatters; absolute value calculation is now done at the query level, removed from formatters * respect trimming in percentage calculation; updated swagger * api/v2/weights preparative work to support multi-node queries - still single node though * multi-node /api/v2/weights endpoint, supporting all the filtering parameters of /api/v2/data * when passing the raw option, the query exposes the hidden dimensions * fix compilation issues on older systems * the query engine now calculates per dimension min, max, sum, count, anomaly count * use the macro to calculate storage point anomaly rate * weights endpoint exposing version hashes * weights method=value shows min, max, average, sum, count, anomaly count, anomaly rate * query: expose RESET flag; do not add the same point multiple times to the aggregated point * weights: more compact output * weights requests can be interrupted * all /api/v2 requests can be interrupted and timeout * allow relative timestamps in weights * fix macos compilation warnings * Revert "fix macos compilation warnings" This reverts commit 8a1d24e41e9b58de566ac59f0c4b1c465bcc0592. * /api/v2/data group-by now works on dimension names, not ids * /api/v2/weights does not query metrics without retention and new output format * /api/v2/weights value and anomaly queries do context queries when contexts are filtered; query timeout is now always in ms
2023-03-21Update dashboard to version v2.30.1. (#14784)Netdata bot
Co-authored-by: netdatabot <netdatabot@users.noreply.github.com>
2023-03-21Update API (#14772)Chris Akritidis
* Update README.md * Add info on the cloud API * Update web/api/README.md Co-authored-by: Tasos Katsoulas <12612986+tkatsoulas@users.noreply.github.com> --------- Co-authored-by: Tasos Katsoulas <12612986+tkatsoulas@users.noreply.github.com>
2023-03-20Accept all=true for alarms api v1 call (#14762)Emmanuel Vasilakis
accept all=true for alarms api v1
2023-03-15Update dashboard to version v2.30.0. (#14732)Netdata bot
Co-authored-by: netdatabot <netdatabot@users.noreply.github.com>
2023-03-13/api/v2 part 4 (#14706)Costa Tsaousis
* expose the order of group by * key renames in json wrapper v2 * added group by context and group by units * added view_average_values * fix for view_average_values when percentage is specified * option group-by-labels is enabling the exposure of all the labels that are used for each of the final grouped dimensions * when executing group by queries, allocate one dimension data at a time - not all of them * respect hidden dimensions * cancel running data query on socket error * use poll to detect socket errors * use POLLRDHUP to detect half closed connections * make sure POLLRDHUP is available * do not destroy aral-by-size arals * completed documentation of /api/v2/data. * moved min, max back to view; updated swagger yaml and json * default format for /api/v2/data is json2
2023-03-10/api/v2/X improvements part 3 (#14665)Costa Tsaousis
* max web request size to 64KB * fix the request too big message * increase max request reading tries to 100 * support for bigger web requests * add "avg" as a shortcut for "average" to both group by aggregation and time aggregation; discard the last partial points of a query in play mode, up to max update every; group by hidden dimensions too * better implementation for partial data trimming * added group_by=selected to return only one dimension for all selected metrics * fix acceptance of group_by=selected * passing option "raw" disables partial data trimming * remove obsolete option "plan"; use "debug" * fix view.min and view.max calculation - there were 2 bugs: a) min and max were reset for every row and b) min and max were corrupted by GBC and AR printing * per row annotations * added time column to point annotations * disable caching for /api/v2/contexts responses * added api format json2 that returns an array for each points, having all the point values and annotations in them * work on swagger about /api/v2 * prevent infinite loop * cleanup and swagger work * allow negative simple pattern expressions to work as expected * do not lookup in the dictionary empty names * garbage collect dictionaries * make query_target allocate less aggressively; queries fill the remaining points with nulls * reusable query ops to save memory on huge queries * move parts of query plans into query ops to save query target memory * remove storage engine from query metric tiers, to save memory, and recalculate it when it is needed
2023-03-10Refactor ML code. (#14659)vkalintiris
* Refactor ML code. This commit introduces only non-functional changes. Originally, the C++ code exposed C functions to be called from the rest of the agent. When we migrated from C++ to C, we did not eliminate these wrapper functions to make the PR easier to understand and keep the total LOC low. This commit removes the wrapper functions and "reclaims" the `ml_` prefix that we used for the public API of the old implementation. Also, the nlohmann Json library has been removed and its functionality was replaced with the equivalent Json functionality that we added in libnetdata's BUFFERs. * Remove missing headers from build systems. * Fix CMake build. * rrddim_free is outside of rrd "internals" now.
2023-03-08eBPF new charts (user ring) (#14623)thiagoftsm
2023-03-02/api/v2/contexts (#14592)Costa Tsaousis
* preparation for /api/v2/contexts * working /api/v2/contexts * add anomaly rate information in all statistics; when sum-count is requested, return sums and counts instead of averages * minor fix * query targegt now accurately counts hosts, contexts, instances, dimensions, metrics * cleanup /api/v2/contexts * full text search with /api/v2/contexts * simple patterns now support the option to search ignoring case * full text search API with /api/v2/q * simple pattern execution optimization * do not show q when not given * full text search accounting * separated /api/v2/nodes from /api/v2/contexts * fix ssv queries for group_by * count query instances queried and failed per context and host * split rrdcontext.c to multiple files * add query totals * fix anomaly rate calculation; provide "ni" for indexing hosts * do not generate zero valued members * faster calculation of anomaly rate; by just summing integers for each db points and doing math once for every generated point * fix typo when printing dimensions totals * added option minify to remove spaces and newlines fron JSON output * send instance ids and names when they differ * do not add in query target dimensions, instances, contexts and hosts for which there is no retention in the current timeframe * fix for the previous + renames and code cleanup * when a dimension is filtered, include in the response all the other dimensions that are selectable * do not add nodes that do not have retention in the current window * move selection of dimensions to query_dimension_add(), instead of query_metric_add() * increase the pre-processing capacity of queries * generate instance fqdn ids and names only when they are needed * provide detailed statistics about tiers retention, queries, points, update_every * late allocation of query dimensions * cleanup * more cleanup * support for annotations per displayed point, RESET and PARTIAL * new type annotations * if a chart is not linked to contexts and it is collected, link it when it is collected * make ML run reentrant * make ML rrdr query synchronous * optimize replication memory allocation of replication_sort_entry * change units to percentage, when requesting a coefficinet of variation, or a percentage query * initialize replication before starting main threads * properly decrement no room requests counter * propagate the non-zero flag to group-by * the same by avoiding the extra loop * respect non-zero in all dimension arrays * remove dictionary garbage collection from dictionary_entries() and dictionary_version() * be more verbose when jv2 indexing is postponed * prevent infinite loop * use hidden dimensions even when dimensions pattern is unset * traverse hosts using dictionaries * fix dictionary unittests
2023-02-28Misc SSL improvements 3 (#14602)Emmanuel Vasilakis
* go to normal only if its in stream mode * use ssl specific flags for wait receive and send * remove brackets
2023-02-28Make the title metadata H1 in all markdown files (#14625)Fotis Voutsas
* make the title metadta the H1 * Update collectors/python.d.plugin/zscores/README.md * Update libnetdata/ebpf/README.md * Update ml/README.md * Update libnetdata/string/README.md --------- Co-authored-by: Chris Akritidis <43294513+cakrit@users.noreply.github.com>
2023-02-27Reorg learn 0227 (#14621)Chris Akritidis
* reorg batch 1 * remove duplicate cloud custom dashboard and agent dashboard * Simplify the root web/README * Merge streaming references * Make enable streaming the overall intro and the README the reference * Remove reference-streaming document * Update overview pages
2023-02-25Fix links to chart interactions (#14609)Chris Akritidis
2023-02-23Replace web server readme with its improved replica (#14598)Chris Akritidis
2023-02-22/api/v2/data - multi-host/context/instance/dimension/label queries (#14564)Costa Tsaousis
* fundamentals for having /api/v2/ working * use an atomic to prevent writing to internal pipe too much * first attempt of multi-node, multi-context, multi-chart, multi-dimension queries * v2 jsonwrap * first attempt for group by * cleaned up RRDR and fixed group by * improvements to /api/v2/api * query instance may be realloced, so pointers to it get invalid; solved memory leaks * count of quried metrics in summary information * provide detailed information about selected, excluded, queried and failed metrics for each entity * select instances by fqdn too * add timing information to json output * link charts to rrdcontexts, if a query comes in and it is found unlinked * calculate min, max, sum, average, volume, count per metric * api v2 parameters naming * renders alerts and units * render machine_guid and node_id in all sections it is relevant * unified keys * group by now takes into account units and when there are multiple units involved, it creates a dimension per unit * request and detailed are hidden behind an option * summary includes only a flattened list of alerts * alert counts per host and instance * count of grouped metrics per dimension * added contexts to summary * added chart title * added dimension priorities and chart type * support for multiple group by at the same time * minor fixes * labels are now a tree * keys uniformity * filtering by alerts, both having a specific alert and having a specific alert in a specific status * added scope of hosts and contexts * count of instances on contexts and hosts * make the api return valid responses even when the response contains no data * calculate average and contribution % for every item in the summary * fix compilation warnings * fix compilation warnings - again
2023-02-22Misc SSL improvements 2 (#14334)Emmanuel Vasilakis
* set to wait receive/send when ssl returns wait read/write * compare the bytes * set to normal to prevent going into stream mode with incomplete request * disable wait send
2023-02-21Fix coverity issues (#14543)Stelios Fragkakis
* Fix coverity 383236: Resource leak * Fix coverity 382915 : Logically dead code * Fix coverity 379133 : Division or modulo by float zero * Fix coverity 382783 : Copy into fixed size buffer * Fix coverity 381151 : Missing unlock * Fix coverity 381903 : Dereference after null check