path: root/database
Age | Commit message | Author
2023-07-28 | Fix health query (#15589) | Stelios Fragkakis
Fix query
2023-07-27 | Drop duplicate / unused index (#15568) | Stelios Fragkakis
2023-07-26 | fix expiration dates for API responses (#15546) | Costa Tsaousis
2023-07-26 | Avoid an extra uuid_copy when creating new MRG entries (#15502) | Stelios Fragkakis
Avoid an extra uuid_copy when creating new mrg entries
2023-07-26 | Refactor RRD code. (#15423) | vkalintiris
* Storage engine.
* Host indexes to rrdb
* Move globals to rrdb
* Move storage_tiers_backfill to rrdb
* default_rrd_update_every to rrdb
* default_rrd_history_entries to rrdb
* gap_when_lost_iterations_above to rrdb
* rrdset_free_obsolete_time_s to rrdb
* libuv_worker_threads to rrdb
* ieee754_doubles to rrdb
* rrdhost_free_orphan_time_s to rrdb
* rrd_rwlock to rrdb
* localhost to rrdb
* rm extern from func decls
* mv rrd macro under rrd.h
* default_rrdeng_page_cache_mb to rrdb
* default_rrdeng_extent_cache_mb to rrdb
* db_engine_journal_check to rrdb
* default_rrdeng_disk_quota_mb to rrdb
* default_multidb_disk_quota_mb to rrdb
* multidb_ctx to rrdb
* page_type_size to rrdb
* tier_page_size to rrdb
* No storage_engine_id in rrdim functions
* storage_engine_id is provided by st
* Update to fix merge conflict.
* Update field name
* Remove unnecessary macros from rrd.h
* Rm unused type decls
* Rm duplicate func decls
* make internal function static
* Make the rest of public dbengine funcs accept a storage_instance.
* No more rrdengine_instance :)
* rm rrdset_debug from rrd.h
* Use rrdb to access globals in ML and ACLK
  Missed due to not having the submodules in the worktree.
* rm total_number
* rm RRDVAR_TYPE_TOTAL
* rm unused inline
* Rm names from typedef'd enums
* rm unused header include
* Move include
* Rm unused header include
* s/rrdhost_find_or_create/rrdhost_get_or_create/g
* s/find_host_by_node_id/rrdhost_find_by_node_id/
  Also, remove duplicate definition in rrdcontext.c
* rm macro used only once
* rm macro used only once
* Reduce rrd.h api by moving funcs into a collector specific utils header
* Remove unused func
* Move parser specific function out of rrd.h
* return storage_number instead of void pointer
* move code related to rrd initialization out of rrdhost.c
* Remove tier_grouping from rrdim_tier
  Saves 8 * storage_tiers bytes per dimension.
* Fix rebase
* s/rrd_update_every/update_every/
* Mark functions as static and constify args
* Add license notes and file to build systems.
* Remove remaining non-log/config mentions of memory mode
* Move rrdlabels api to separate file. Also, move localhost functions that loads labels outside of database/ and into daemon/
* Remove function decl in rrd.h
* merge rrdhost_cache_dir_for_rrdset_alloc into rrdset_cache_dir
* Do not expose internal function from rrd.h
* Rm NETDATA_RRD_INTERNALS
  Only one function decl is covered. We have more database internal functions that we currently expose for no good reason. These will be placed in a separate internal header in follow up PRs.
* Add license note
* Include libnetdata.h instead of aral.h
* Use rrdb to access localhost
* Fix builds without dbengine
* Add header to build system files
* Add rrdlabels.h to build systems
* Move func def from rrd.h to rrdhost.c
* Fix macos build
* Rm non-existing function
* Rebase master
* Define buffer length macro in ad_charts.
* Fix FreeBSD builds.
* Mark functions static
* Rm func decls without definitions
* Rebase master
* Rebase master
* Properly initialize value of storage tiers.
* Fix build after rebase.
2023-07-25 | Allow to create alert hashes with --disable-cloud (#15519) | Emmanuel Vasilakis
* check for alarm ids with zero hashes
* use zeroblob(16)
2023-07-25 | wait for node_id while claiming (#15526) | Costa Tsaousis
2023-07-22 | Improve the update of the alert chart name in the database (#15490) | Stelios Fragkakis
Disable check during health init
Store chart_name when storing a new transition
2023-07-20 | Store and transmit chart_name to cloud in alert events (#15441) | Emmanuel Vasilakis
2023-07-18 | fix alerts transitions search when something specific is asked for (#15447) | Costa Tsaousis
2023-07-18 | added missing fields to alerts instances (#15442) | Costa Tsaousis
2023-07-18 | add chart id and name to alert instances and transitions (#15430) | Costa Tsaousis
2023-07-14 | Pre release fixes (#15405) | Costa Tsaousis
2023-07-13 | Fix CodeQL alert (#15384) | Stelios Fragkakis
Fix CodeQL alert -- Multiplication result converted to larger type
2023-07-13 | Rename log_access and log_health (#15368) | Emmanuel Vasilakis
2023-07-12 | Keep health log history in seconds (#15314) | Emmanuel Vasilakis
* rebase
* changes queries to delete based on when
* readme changes
* no need to do migration
* wip, protect un-updated events from cleanup
* remove index on when_key
* fix query for claimed cleanup
* if set less than minimum, set minimum
* fix query
* correct config assign
2023-07-11 | Rename log Macros (debug) (#15322) | thiagoftsm
2023-07-11 | bearer improvements (#15342) | Costa Tsaousis
2023-07-10 | Use spinlock in host and chart (#15328) | Stelios Fragkakis
* Switch alarm log lock to spinlock
* Switch the alerts lock in the chart structure to spinlock
* Proper lock usage
2023-07-09 | alerts_transitions outputs hostnames and items statistics (#15329) | Costa Tsaousis
* alerts_transitions outputs hostnames and items statistics
* return details about the items in the database
* added comments to items list and made the whole of statsd available under debug
2023-07-06 | Rename generic `error` function (#15296) | thiagoftsm
2023-07-06 | avoid memory allocations for alert transitions facets processing (#15318) | Costa Tsaousis
2023-07-06 | add summary linking to alert instances (ati) when options=summary,values is requested (#15317) | Costa Tsaousis
2023-07-06 | fix alerts transitions sorting (#15315) | Costa Tsaousis
2023-07-06 | stale virtual hosts (#15313) | Costa Tsaousis
wrong parenthesis fixed
2023-07-06 | Code reorg and cleanup - enrichment of /api/v2 (#15294) | Costa Tsaousis
* claim script now accepts the same params as the kickstart
* rewrote buildinfo to unify all methods
* added cloud unavailable in cloud status
* added all exporters
* renamed httpd to h2o
* rename ENABLE_COMPRESSION to ENABLE_LZ4
* rename global variable
* rename ENABLE_HTTPS to ENABLE_OPENSSL
* fix coverity-scan for openssl
* add lz4 to coverity-scan
* added all plugins and most of the features
* added all plugins and most of the features
* generalize bitmap code so that we can have any size of bitmaps
* cleanup
* fix compilation without protobuf
* fix compilation with other allocators
* fix bitmap
* comprehensive bitmaps unit test
* bitmap as macros
* added developer mode
* added system info to build info
* cloud available/unavailable
* added /api/v2/info
* added units and ni to transitions
* when showing instances and transitions, show only the instances that have transitions
* cleanup
* add missing quotes
* add anchor to transitions
* added more to build info
* calculate retention per tier and expose it to /api/v2/info
* added currently collected metrics
* do not show space and retention when no numbers are available
* fix impossible overflow
* Add function for transitions and execute callback
* In case of error, reset and try next dictionary entry
* Fix error message
* simpler logic to maintain retention per tier
* /api/v2/alert_transitions
* Handle case of recipient null
  Convert after and before to usec
* Add classification, type and component
* working /api/v2/alert_transitions
* Fix query to properly handle context and alert name
* cleanup
* Add search with transition
* accept transition in /api/v2/alert_transitions
* totally dynamic facets
* fixed debug info
* restructured facets
* cleanup; removal of options=transitions
* updated alert entries flags
* method to exec
* Return also exec run timestamp
  Temp table cleanup only when we don't execute with a transition
* cleanup obsolete anchor parameter
* Add sql_get_alert_configuration function
* added options=config to alert_transitions
* added /api/v2/alert_config
* preliminary work for /api/v2/claim
* initialize variables; do not expose expected retention if no disk space info is available; do not report aclk as initializing when not claimed
* fix claim session key filename
* put a newline into the session key file
* more progress on claiming
* final /api/v2/claim endpoint
* after claiming, refresh our state at the output
* Fix query to fetch config
* Remove debug log
* add configuration objects
* add configuration objects - fixed
* respect the NETDATA_DISABLE_CLOUD env variable
* NETDATA_DISABLE_CLOUD env variable sets the default, but the config sets the final value
* use a new claimed_id on every claiming
* regenerate random key on claiming and wait for online status
* ignore write() return value when writing a newline
* dont show cloud status disabled when claimed_id is missing
* added ctx to alert instances
* cleanup config and transitions from /api/v2/alerts
* fix unused variable
* in /api/v2/alert_config show 1 config without an array
* show alert values conditionally, by appending options=values
* When storing host info if the key value is empty, store unknown
* added options=summary to control when the alerts summary is shown
* increased http_api_v2 to version 5
* claiming random key file is now not world readable
* added local-listeners binary that detects all the listening ports, their IPs and their command lines
---------
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
2023-07-04 | Change query to store host system info values (#15300) | Emmanuel Vasilakis
* change query to store host info
* change define name
* change rc check
2023-07-04 | Check for source field when requesting /api/v1/alarm_log (#15306) | Emmanuel Vasilakis
check for source field
2023-07-03 | Change info to netdata_log_info in sqlite_db_migration.c (#15303) | Emmanuel Vasilakis
change info to netdata_log_info
2023-07-03 | Send alert chart labels config key to cloud (#15283) | Emmanuel Vasilakis
* add chart_labels to alert_hash
* store chart_labels in alert_hash
* transmit to cloud
2023-07-01 | Optimizations part 3 (#15293) | Costa Tsaousis
* use madvise to speed up indexing
* collect all rrddim members into a collector structure
* use tier 0 virtual point for storing last stored value
* reorganize key fields in rrddim
* remove fgets from pluginsd and replace it with read()
* properly uncork the web server sockets
* Revert "reorganize key fields in rrddim"
  This reverts commit 2d45fa3959087e05462d387ff115a260f3a04b60.
* Revert "use tier 0 virtual point for storing last stored value"
  This reverts commit a576cdd377ad4778a3b8608cabbb7ea7bb19a3a8.
* fix cork names
* fix compilation warnings
2023-06-30 | Replace `info` macro with a less generic name (#15266) | Carlo Cabrera
2023-06-29 | use stat() instead of lstat() (#15287) | Costa Tsaousis
2023-06-29 | Misc alert fixes (#15274) | Emmanuel Vasilakis
* rebase
* proper pointer
2023-06-29 | Optimizations part 2 (#15280) | Costa Tsaousis
* make all pluginsd functions inline, instead of function pointers
* dynamic MRG partitions based on the number of CPUs
* report the right size of the MRG
* prevent invalid read on pluginsd exit
* faster service_running() check; fix compiler warnings; shutdown replication after streaming to prevent crash on shutdown
* sender is now using a spinlock
* rrdcontext uses spinlock
* replace select() with poll()
* signed calculation of threads
* disable read-ahead on jnfv2 files during scan
2023-06-29 | Revert "Optimizations Part 2" (#15279) | Costa Tsaousis
Revert "Optimizations Part 2 (#15267)"
This reverts commit b52a989497f68cddeeb0282f5fd650c4e373e477.
2023-06-28 | Optimizations Part 2 (#15267) | Costa Tsaousis
* make all pluginsd functions inline, instead of function pointers
* dynamic MRG partitions based on the number of CPUs
* report the right size of the MRG
2023-06-28 | rewrite /api/v2/alerts (#15257) | Costa Tsaousis
* rewrite /api/v2/alerts
* implement searching for transition
* Find transition id and issue callback
* Fix parameters
* call and transition filter
* Search with transition as well
* renames and cleanup
* render flags
* what if scenario for moving transitions at the top level
* If transition is given, limit the query appropriately
* Add alert transitions
* Optimize find transition to use prepared query
  Drop temp table properly
* enabled alert instances again
* Order by when key
* Order by global_id
* Return last X transitions
* updated field names
* add ati to configurations and show all keys in debug mode
* Code cleanup and optimizations
* Drop temp table in case of error
* Finalize temp table population statement to prevent memory leak
* final changes
---------
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
2023-06-26 | use gperf for the pluginsd/streaming parser hashtable (#15251) | Costa Tsaousis
* use gperf for the pluginsd parser
* simplify pluginsd_parser by removing void pointers to user
* pluginsd_split_words() with inlined pluginsd_space()
* quoted_string_splitter() now uses a map instead of a function for determining spaces
* add stress test for pluginsd parser
* optimized BITMAP256
* optimized rrdpush receiver reception
* optimized rrdpush sender compression
* renames and cleanup
* remove wrong negation
* unify handshake and disconnection reasons
* use parser_find_keyword
* register job names only for the current repertoire
2023-06-26 | Relax jnfv2 caching (#15224) | Costa Tsaousis
* readers should be able to recursively acquire the lock, even when there is a writer waiting
* dont madvise dontneed and random
* dont validate extents and metrics on jnfv2
* dont validate crc
* Delay journal metric check
* added MRG stress test
---------
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
2023-06-23 | Fix coverity 393183 & 393182 (#15234) | Emmanuel Vasilakis
fix coverity 393183 393182
2023-06-22 | New alerts endpoint (#15232) | Stelios Fragkakis
* alerts / alerts_log v2
* Add global_id to ae
  Populate entries with global id
* Remove transition id from template
  Change history to instances
* Link ae to rc in all cases
  Code cleanup
2023-06-22 | Create index for health log migration (#15233) | Stelios Fragkakis
Create health_log_id index
2023-06-21 | /api/v2 improvements (#15227) | Costa Tsaousis
* readers should be able to recursively acquire the lock, even when there is a writer waiting
* added health section into nodes
* uniformity of nodes
* nodes instances should not return node info; http_api_v2 capability should be version 4 everywhere
* added /api/v2/versions
* added /api/v2/functions
* /api/v2/version should be neat
2023-06-21 | Use a single health log table (#15157) | Emmanuel Vasilakis
* move old health log tables to one
* change table in sqlite_health
* remove check for off period of agent
* changes in aclk_alert
* fixes
* add new field insert_mark_timestamp
* cleanup
* remove hostname, create the health log table during sqlite init
* create the health_log during migration
* move source from health_log to alert_hash. Remove class, component and type field from health_log
* Register now_usec sqlite function
* use global_id instead of insert_mark_timestamp. Use function now_usec to populate it
* create functions earlier to have them during migration
* small unit test fix
* create additional health_log_detail table. Do the insert of an alert event on both
* do the update on health_log_detail
* change more queries
* more indexes, fix inject removed
* change last executed and select health log queries
* random uuid for sqlite
* do migration from old tables
* queries to send alerts to cloud
* cleanup queries
* get an alarm id from db if not found in memory
* small fix on query
* add info when migration completes
* dont pick health_log_detail during migration
* check proper old health_log table
* safer migration
* proper log sent alerts. small fix in claimed cleanup
* cleanups
* extra check for cleanup
* also get an alarm_event_id from sql
* check for empty source
* remove cleanup of main health log table
---------
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
2023-06-20 | Fix /api/v2/contexts,nodes,nodes_instances,q before match (#15223) | Costa Tsaousis
* readers should be able to recursively acquire the lock, even when there is a writer waiting
* in /api/v2/contexts/nodes/nodes_instances/q calls, when the context is collected, before should be matched against now, not the latest cached retention
2023-06-19 | Obvious memory reductions (#15204) | Costa Tsaousis
* remove rd->update_every
* reduce amount of memory for RRDDIM
* reorganize rrddim->db entries
* optimize rrdset and statsd
* optimize dictionaries
* RW_SPINLOCK for dictionaries
* fix codeql warning
* rw_spinlock improvements
* remove obsolete assertion
* fix crash on health_alarm_log_process()
* use RW_SPINLOCK for AVL trees
* add RW_SPINLOCK read/write trylock
* pgc and mrg now use rw_spinlocks; cache line optimizations for mrg
* thread tag of dbengine init
* append created datafile, lockless
* make DOUBLE_LINKED_LIST_APPEND_ITEM_UNSAFE friendly for lockless use
* thread cancelability in spinlocks; optimize thread cancelability management
* introduce a JudyL to index datafiles and use it during queries to quickly find the relevant files
* use the last timestamp of each journal file for indexing
* when the previous cannot be found, start from the beginning
* add more stats to PDC to trace routing easier
* rename spinlock functions
* fix for spinlock renames
* revert statsd socket statistics to size_t
* turn fatal into internal_fatal()
* show candidates always
* show connected status and connection attempts
2023-06-19 | /api/v2/nodes and streaming function (#15168) | Costa Tsaousis
* dummy streaming function
* expose global functions upstream
* separate function for pushing global functions
* add missing conditions
* allow streaming function to run async
* started internal API for functions
* cache host retention and expose it to /api/v2/nodes
* internal API for function table fields; more progress on streaming status
* abstracted and unified rrdhost status
* port old coverity warning fix - although it is not needed
* add ML information to rrdhost status
* add ML capability to streaming to signal the transmission of ML information; added ML information to host status
* protect host->receiver
* count metrics and instances per host
* exposed all inbound and outbound streaming
* fix for ML status and dependency of DATA_WITH_ML to INTERPOLATED, not IEEE754
* update ML dummy
* added all fields
* added streaming group by and cleaned up accepted values by cloud
* removed type
* Revert "removed type"
  This reverts commit faae4177e603d4f85b7433f33f92ef3ccd23976e.
* added context to db summary
* new /api/v2/nodes schema
* added ML type
* change default function charts
* log to trace new capa
* add more debug
* removed debugging code
* retry on receive interrupted read; respect sender reconnect delay in all cases
* set disconnected host flag and manipulate localhost child count atomically, inside set/clear receiver
* fix infinite loop
* send_to_plugin() now has a spinlock to ensure that only 1 thread is writing to the plugin/child at the same time
* global cloud_status() call
* cloud should be a section, since it will contain error information
* put cloud capabilities into cloud
* aclk status in /api/v2 agents sections
* keep aclk_connection_counter
* updates on /api/v2/nodes
* final /api/v2/nodes and addition of /api/v2/nodes_instances
* parametrize all /api/v2/xxx output to control which info is output per endpoint
* always accept nodes selector
* st needs to be per instance, not per node
* fix merging of contexts; fix cups plugin priorities
* add after and before parameters to /api/v2/contexts/nodes/nodes_instances/q
* give each libuv worker a unique id
* aclk http_api_v2 version 4
2023-06-19 | Add two functions that allow someone to start/stop ML. (#15185) | vkalintiris
* Add two functions that allow someone to start/stop ML.
* Shutdown ML after stopping collector services
* Remove unnecessary mutex from ml charts.
  There's already a spinlock that protects the chart when someone calls rrdset_done().
* Use a lightweight spinlock instead of a mutex for ML dimensions.
2023-06-15 | sqlite_health.c: remove `uuid.h` include (#15195) | Nanda H Krishna