path: root/database
Age  Commit message  Author
2024-02-01  Create a top-level directory to contain source code. (#16896)  vkalintiris
* Move ML under src * Move spawn under src * Move cli/ under src/ * move registry/ under src/ * move streaming/ under src/ * Move claim under src. Update docs * Move database/ under src/ * Move libnetdata/ under src/ * Update references to libnetdata * Fix logsmanagement includes * Update generated script path.
2024-01-31  add the CLOEXEC flag to all sockets and files (#16881)  Costa Tsaousis
* add the CLOEXEC flag to all sockets and files * add network-viewer to apps.plugin; min update frequency 5 seconds
2024-01-29  Update statistics to address slow queries (#16838)  Stelios Fragkakis
* Run analyze on aclk_alert tables Add analyze option -W sqlite-analyze * Remove empty line * Remove analyze during runtime * Remove health_log_entries_written * Replace index * Remove forced index skip * Change version and run database analyze * Adjust analyze to run on specific tables Fix previous migration v14 -> v15 typo * Fix v15 -> v16 migration message * Fix v15 -> v16 migration message (typo) * Increase analysis limit
2024-01-29  Remove old mention of save db mode (#16864)  Fotis Voutsas
2024-01-29  New Permissions System (#16837)  Costa Tsaousis
* wip of migrating to bitmap permissions * replace role with bitmapped permissions * formatting permissions using macros * accept view and edit permissions for all dynamic configuration * work on older compilers * parse the header in hex * agreed permissions updates * map permissions to old roles * new permissions management * fix function rename * build libdatachannel when enabled - currently for code maintenance * dyncfg now keeps 2 sets of statuses, to keep track of what happens to dyncfg and what actually happens with the plugin * complete the additions of jobs and solve unittests * fix renumbering of ACL bits * processes function shows the cmdline based on permissions and the presence of the sensitive data permission * now the agent returns 412 when authorization is missing, 403 when authorization exists but permissions are not enough, 451 when access control list prevents the user from accessing the dashboard * enable cmdline on processes with the HTTP_ACCESS_VIEW_AGENT_CONFIG permission * by default functions require anonymous-data access * fix compilation on debian * fix left-over renamed define * updated schema for alerts * updated permissions * require a name when loading json payloads, if the name is not provided by dyncfg
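The status-code rule stated in the commit above (412 when authorization is missing, 403 when it exists but is insufficient, 451 when an access control list blocks the user) can be sketched as below. This is an illustrative Python sketch, not the agent's C code: the `Access` bit names, the function `http_status_for`, and the order in which the checks run are all assumptions made for the example.

```python
from enum import IntFlag
from typing import Optional


class Access(IntFlag):
    # Hypothetical permission bits; the real agent defines its own set.
    NONE = 0
    ANONYMOUS_DATA = 1 << 0
    SENSITIVE_DATA = 1 << 1
    VIEW_AGENT_CONFIG = 1 << 2


def http_status_for(granted: Optional[Access],
                    required: Access,
                    acl_allows: bool = True) -> int:
    """Map an authorization state to an HTTP status code.

    granted is None when the request carries no authorization at all.
    The check order (ACL first) is an assumption of this sketch.
    """
    if not acl_allows:
        return 451  # blocked by the access control list
    if granted is None:
        return 412  # authorization is missing
    if (granted & required) != required:
        return 403  # authorized, but some required bit is not granted
    return 200


# Example: a caller holding only anonymous-data access asks for sensitive data.
status = http_status_for(Access.ANONYMOUS_DATA,
                         Access.ANONYMOUS_DATA | Access.SENSITIVE_DATA)
```

Keeping permissions as bits makes the "is everything required granted" test a single mask comparison, which is the main attraction of a bitmap over named roles.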
2024-01-24  Fix coverity issue (#16831)  Stelios Fragkakis
CID 414122: Resource leaks (RESOURCE_LEAK)
2024-01-23  Change query label matching logic (#16827)  Stelios Fragkakis
* Match multi labels * Rework, add support for weights * Fix function return value * Cleanup function
2024-01-23  DYNCFG: dynamically configured alerts (#16779)  Costa Tsaousis
* cleanup alerts * fix references * fix references * fix references * load alerts once and apply them to each node * simplify health_create_alarm_entry() * Compile without warnings with compiler flags: -Wall -Wextra -Wformat=2 -Wshadow -Wno-format-nonliteral -Winit-self * code re-organization and cleanup * generate patterns when applying prototypes; give unique dyncfg names to all alerts * eval expressions keep the source and the parsed_as as STRING pointers * renamed host to node in dyncfg ids * renamed host to node in dyncfg ids * add all cloud roles to the list of parsed X-Netdata-Role header and also default to member access level * working functionality * code re-organization: moved health event-loop to a new file, moved health globals to health.c * rrdcalctemplate is removed; alert_cfg is removed; foreach dimension is removed; RRDCALCs are now instantiated only when they are linked to RRDSETs * dyncfg alert prototypes initialization for alerts * health dyncfg split to separate file * cleanup not-needed code * normalize matches between parsing and json * also detect !* for disabled alerts * dyncfg capability disabled * Store alert config part1 * Add rrdlabels_common_count * wip health variables lookup without indexes * Improve rrdlabels_common_count by reusing rrdlabels_find_label_with_key_unsafe with an additional parameter * working variables with runtime lookup * working variables with runtime lookup * delete rrddimvar and rrdfamily index * remove rrdsetvar; now all variables are in RRDVARs inside hosts and charts * added /api/v1/variable that resolves a variable the same way alerts do * remove rrdcalc from eval * remove debug code * remove duplicate assignment * Fix memory leak * all alert variables are now handled by alert_variable_lookup() and EVAL is now independent of alerts * hide all internal structures of EVAL * Enable -Wformat flag Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud> * Adjust binding for calculation, warning, critical * Remove unused macro * Update config hash id * use the right info and summary in alerts log * use synchronous queries for alerts * Handle cases when config_hash_id is missing from health_log * remove deadlock from health worker * parsing to json payload for health alert prototypes * cleaner parsing and avoiding memory leaks in case of duplicate members in json * fix left-over rename of function * Keep original lookup field to send to the cloud Cleanup / rename function to store config Remove unused DEFINEs, functions * Use ac->lookup * link jobs to the host when the template is registered; do not accept running a function without a host * full dyncfg support for health alerts, except action TEST * working dyncfg additions, updates, removals * fixed missing source, wrong status updates * add alerts by type, component, classification, recipient and module at the /api/v2/alerts endpoint * fix dyncfg unittest * rename functions * generalize the json-c parser macros and move them to libnetdata * report progress when enabling and disabling dyncfg templates * moved rrdcalc and rrdvar to health * update alarms * added schema for alerts; separated alert_action_options from rrdr_options; restructured the json payload for alerts * enable parsed json alerts; allow sending back accepted but disabled * added format_version for alerts payload; enables/disables status now is also inherited by the status of the rules; fixed variable names in json output * remove the RRDHOST pointer from DYNCFG * Fix command field submitted to the cloud * do not send updates to creation requests, for DYNCFG jobs --------- Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud> Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com> Co-authored-by: Tasos Katsoulas <tasos@netdata.cloud> Co-authored-by: ilyam8 <ilya@netdata.cloud>
2024-01-22  Preserve label source during migration (#16821)  Stelios Fragkakis
2024-01-15  Add additional fail reason and source during database initialization (#16794)  Stelios Fragkakis
2024-01-15  Use original summary for alert transition (#16793)  Stelios Fragkakis
Use original summary for alert Fetch transaction and global id for transitions safely
2024-01-15  Improve context load (#16659)  Stelios Fragkakis
* Improve single thread load. Handle thread creation failure as well Remove RRDHOST_FLAG_CONTEXT_LOAD_IN_PROGRESS Improve chart label cleanup * Init thread index
2024-01-11  Fix sanitizer errors (#16759)  Costa Tsaousis
* fix sanitizer errors in logs.c * fix sanitizer errors in rrdlabels.c * cleanup sanitizer exceptions
2024-01-11  Delete memory mode "map" and "save". (#16604)  vkalintiris
* Delete memory modes "map" and "save". * Remove unmaintained exporting tests * Remove references of map/save modes in docs. * Remove more references to map/save from docs.
2024-01-11  dyncfg v2 (#16702)  Costa Tsaousis
* split rrdfunctions streaming and progress * simplified internal inline functions API * split rrdfunctions inflight management * split rrd functions exporters * renames * base dyncfg structure * config pluginsd * intercept dyncfg function calls * loading and saving of dyncfg metadata and data * save metadata and payload to a single file; added code to update the plugins with jobs and saved configs * basic working unit test * added payload to functions execution * removed old dyncfg code that is not needed any more * more cleanup * cleanup sender for functions with payload * dyncfg functions are not exposed as functions * remaining work to avoid indexing the \0 terminating character in dictionary keys * added back old dyncfg plugins.d commands as noop, to allow plugins continue working * working api; working streaming; * updated plugins.d documentation * aclk and http api requests share the same header parsing logic * added source type internal * fixed crashes * added god mode for tests * fixes * fixed messages * save host machine guids to configs * cleaner manipulation of supported commands * the functions event loop for external plugins can now process dyncfg requests * unified internal and external plugins dyncfg API * Netdata serves schema requests from /etc/netdata/schema.d and /var/lib/netdata/conf.d/schema.d * cleanup and various fixes; fixed bug in previous dyncfg implementation on streaming that was sending the payload in a way that allowed other streaming commands to be multiplexed * internals go to a separate header file * fix duplicate ACLK requests sent by aclk queue mechanism * use fstat instead of stat * working api * plugin actions renamed to create and delete; dyncfg files are removed only from user actions * prevent deadlock by using the react callback * fix for string_strndupz() * better dyncfg unittests * more tests at the unittests * properly detect dyncfg functions * hide config functions from the UI * tree response improvements * send the initial update with payload * determine tty using stdout, not stderr * changes to statuses, cleanup and the code to bring all business logic into interception * do not crash when the status is empty * functions now propagate the source of the requests to plugins * avoid warning about unused functions * in the count at items for attention, do not count the orphan entries * save source into dyncfg * make the list null terminated * fixed invalid comparison * prevent memory leak on duplicated headers; log x-forwarded-for * more unit tests * added dyncfg unittests into the default unittests * more unit tests and fixes * more unit tests and fixes * fix dictionary unittests * config functions require admin access
2024-01-11  Name storage engine variables consistently. (#16753)  vkalintiris
* Consistent naming of STORAGE_INSTANCE instances. Replace usages of `db_instance` and `instance` with `si`. * Rename array `storage_metrics_groups[tier]` to `smg[tier]` * Rename db_metric_handle to smh * Rename instances of `storage_engine_query_handle` to `seqh`. * Rename instances of STORAGE_ENGINE_BACKEND to `seb`. * Rename instances of STORAGE_COLLECT_HANDLE to `sch`.
2024-01-10  Address sanitizer through CMake and use it for unit tests. (#16748)  vkalintiris
* Disable address sanitizer for some functions. These functions report some issues when running the address sanitizer with `-W unittest`. We want to run the sanitized binary on Github PRs to catch newly-introduced issues. FIXMEs were added so that we know which ones already existed prior to this change. * Add cmake option to use address sanitizer * Run unit tests with address sanitizer. * Specify attribute before the function declaration. * Disable hardening flags.
2024-01-10  Remove unused file (#16747)  Stelios Fragkakis
* Remove unused file Fix compilation warning * Use PRIu32
2024-01-09  Fatal relaxation of unknown page types. (#16682)  vkalintiris
Mostly to make the agent downgradable when dealing with unknown page types.
2023-12-29  fix quota calculation when the db is empty (#16699)  Costa Tsaousis
* fix quota calculation when the db is empty * do not compute workers utilization if extended statistics is not enabled
2023-12-29  improve the error message when accessing functions (#16692)  Costa Tsaousis
2023-12-29  atomically load the metric reference count (#16687)  Costa Tsaousis
2023-12-28  cmake missing defines (#16680)  Costa Tsaousis
* added HAVE_ACCEPT4 * added HAVE_FINITE, HAVE_ISFINITE * added SIZEOF_VOID_P * added HAVE_NICE, HAVE_RECVMMSG, HAVE_GETPRIORITY * added HAVE_C__GENERIC * added HAVE_C_MALLOPT * added HAVE_BACKTRACE, HAVE_CLOSE_RANGE, HAVE_SCHED_GETSCHEDULER, HAVE_SCHED_SETSCHEDULER, HAVE_SCHED_GET_PRIORITY_MIN, HAVE_SCHED_GET_PRIORITY_MAX * added HAVE_DLSYM * added function attributes checks * fix SIZEOF_VOID_P * added HAVE_PTHREAD_GETNAME_NP * fixed compiler warnings
2023-12-27  set log level of too-old-data message to debug (#16663)  Ilya Mashchenko
2023-12-27  Shutdown dbengine event loop properly (#16658)  Stelios Fragkakis
* Shutdown dbengine event loop properly * Adjust messages
2023-12-22  Fix overrun in crc32set (#16654)  Stelios Fragkakis
2023-12-19  Fix UB of unaligned loads/stores and signed shifts. (#16628)  vkalintiris
* Ignore build/ dir. This directory is the default dir for many LSPs and for IDEs using cmake. "Reserve" it by ignoring it in .gitignore. * Fix format specifier. * Use unsigned literals when shifting. * Do not sanitize shifts in libjudy. * Fix unaligned loads/stores of dbengine's CRCs * Fix unaligned load when partitioning metrics. * Use unsigned literals when shifting.
2023-12-15  Queries Progress (#16574)  Costa Tsaousis
* track the progress of queries * add query_progress in libnetdata Makefile.am * add acl, response size and response code to the tracking * define the required functions * fix the last commit * added /api/v2/progress?transaction=ID to report the progress of queries * added function to report netdata-queries * track hashtable additions * when reusing a transaction, maintain the counter * keep track of linked and indexing * added X-Forwarded-Host and X-Forwarded-For to logs. X-Forwarded-For is also added in progress tracking * report compact uuids to match logs; register the actual duration of the transaction * added rowOptions to function; now web_client keeps track if it tracks progress or not * add http request method to progress * add tags per function; /api/vX/functions is now not protected * compact the sanitization array * split pluginsd_parser into multiple files * cleanup keyword definitions * code cleanup * extracted rrd_collector to separate files * added http access level to functions * renamed access "all" to "any" * implemented optional protection on functions * add priority to functions, to allow the UI select the best function (lower priority) when the user has not selected a function * added progress report from the plugins to netdata and from children to parents - untested * added progress reporting in systemd-journal * query timeout is now handled by evloop for external plugins * propagate progress reports to children and plugins * fix codeql warning * adapt to cmake * minor changes * extend function timeout when progress is received; added streaming capability to propagate progress reports to parents and send progress requests to children * revert change in dictionary.h * add log when access level is invalid * update access level of functions * added logs when processing progress updates * log when the deferred response is too big * comment out sender progress to find the issue * added missing newline in streaming progress reports * propagate progress reports to functions * fix logs
2023-12-14  Remove assert (#16611)  Stelios Fragkakis
2023-12-14  Fix coverity issues (#16596)  Stelios Fragkakis
* Fix coverity issues * Prevent potential overflow
2023-12-13  CMake build system. (#15996)  vkalintiris
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Austin S. Hemmelgarn <austin@netdata.cloud> Co-authored-by: Tasos Katsoulas <12612986+tkatsoulas@users.noreply.github.com> Co-authored-by: Emmanuel Vasilakis <mrzammler@mm.st> Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com> Co-authored-by: netdatabot <bot@netdata.cloud> Co-authored-by: Ilya Mashchenko <ilya@netdata.cloud>
2023-12-13  Fix coverity issues (#16589)  Stelios Fragkakis
* Fix coverity issues * More issues fixed
2023-12-12  code cleanup (#16542)  Costa Tsaousis
fixed minor code cleanup warnings
2023-12-12  Handle coverity issues related to Y2K38_SAFETY (#16583)  Stelios Fragkakis
* Switch update_every_s to uint32_t Fix coverity issues related to Y2K38_SAFETY * Fix CI
2023-12-08  Fix memory leak during host chart label cleanup (#16568)  Stelios Fragkakis
Fix memory leak
2023-12-07  Resolve issue on startup in servers with 1 core (#16565)  Stelios Fragkakis
* Use at least one thread to do context load Check for uninitialized last connected value * Simplify / fix compilation warning
2023-12-06  Improve page validity check during database extent load (#16552)  Stelios Fragkakis
Pages with type 2 (gorilla compression) can be > 4096 bytes in multiples of GORILLA PAGE SIZE
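One way to read the rule in the commit above is sketched here in Python. It is an interpretation, not the dbengine code: the function name `page_length_is_valid` is hypothetical, the value given to `GORILLA_PAGE_SIZE` is a placeholder (the commit does not state it), and treating 4096 bytes as the cap for every other page type is an assumption drawn from the wording.

```python
GORILLA_PAGE_TYPE = 2        # "type 2 (gorilla compression)" per the commit text
MAX_PLAIN_PAGE_BYTES = 4096  # limit the commit implies for other page types
GORILLA_PAGE_SIZE = 512      # placeholder value, NOT taken from the source


def page_length_is_valid(page_type: int, length: int) -> bool:
    """Return True if a page of this type may have this length in bytes."""
    if length <= 0:
        return False
    if page_type == GORILLA_PAGE_TYPE:
        # Gorilla pages may exceed 4096 bytes, but only in whole
        # multiples of the gorilla page size.
        return length % GORILLA_PAGE_SIZE == 0
    return length <= MAX_PLAIN_PAGE_BYTES


# An 8192-byte page is rejected for a plain type but accepted for gorilla.
plain_ok = page_length_is_valid(0, 8192)
gorilla_ok = page_length_is_valid(GORILLA_PAGE_TYPE, 8192)
```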
2023-12-05  change level to debug "took too long to be updated" (#16540)  Ilya Mashchenko
2023-12-01  change log level to debug for dbengine routine operations on start (#16518)  Ilya Mashchenko
2023-12-01  Code cleanup (#16448)  Stelios Fragkakis
* Code cleanup * More cleanup * More cleanup * Use FILENAME_MAX * query fix
2023-11-30  convert some error messages to info (#16508)  Ilya Mashchenko
2023-11-30  Resolve coverity issue 410232 (#16507)  Stelios Fragkakis
Resolve CID 410232: Error handling issues (CHECKED_RETURN)
2023-11-29  When unregistering an ephemeral host, delete its chart labels (#16486)  Stelios Fragkakis
* When unregistering an ephemeral host, delete its chart labels * Fix memory leak in case of query preparation or bind failure * Add check to handle CID 410125 Dereference null return value
2023-11-28  Fix occasional shutdown deadlock (#16495)  Stelios Fragkakis
* Wait for RRDENG_OPCODE_CTX_QUIESCE to complete before attempting rrd_finalize_collection_for_all_hosts * Submit RRDENG_OPCODE_CTX_QUIESCE for all tiers and then wait for completion
2023-11-28  Check context post processing queue before sending status to cloud (#16472)  Stelios Fragkakis
Waiting until the context post processing queue is empty before sending node info and collectors
2023-11-23  Handle ephemeral hosts (#16381)  Stelios Fragkakis
* Handle ephemeral hosts * Node ephemeral removal timeout 86400 seconds (1 day) * Move config from health to global section * Set a node to queryable false when it is ephemeral and is removed * Log queryable. Send queryable=0 only when forcing host deletion (the node is ephemeral) * Switch to "is ephemeral node" Document stream.conf * Unregister node id
2023-11-22  New logging layer (#16357)  Costa Tsaousis
* cleanup of logging - wip * first working iteration * add errno annotator * replace old logging functions with netdata_logger() * cleanup * update error_limit * fix remaining error_limit references * work on fatal() * started working on structured logs * full cleanup * default logging to files; fix all plugins initialization * fix formatting of numbers * cleanup and reorg * fix coverity issues * cleanup obsolete code * fix formatting of numbers * fix log rotation * fix for older systems * add detection of systemd journal via stderr * finished on access.log * remove left-over transport * do not add empty fields to the logs * journal get compact uuids; X-Transaction-ID header is added in web responses * allow compiling on systems without memfd sealing * added libnetdata/uuid directory * move datetime formatters to libnetdata * add missing files * link the makefiles in libnetdata * added uuid_parse_flexi() to parse UUIDs with and without hyphens; the web server now read X-Transaction-ID and uses it for functions and web responses * added stream receiver, sender, proc plugin and pluginsd log stack * iso8601 advanced usage; line_splitter module in libnetdata; code cleanup * add message ids to streaming inbound and outbound connections * cleanup line_splitter between lines to avoid logging garbage; when killing children, kill them with SIGABRT if internal checks is enabled * send SIGABRT to external plugins only if we are not shutting down * fix cross cleanup in pluginsd parser * fatal when there is a stack error in logs * compile netdata with -fexceptions * do not kill external plugins with SIGABRT * metasync info logs to debug level * added severity to logs * added json output; added options per log output; added documentation; fixed issues mentioned * allow memfd only on linux * moved journal low level functions to journal.c/h * move health logs to daemon.log with proper priorities * fixed a couple of bugs; health log in journal * updated docs * systemd-cat-native command to push structured logs to journal from the command line * fix makefiles * restored NETDATA_LOG_SEVERITY_LEVEL * fix makefiles * systemd-cat-native can also work as the logger of Netdata scripts * do not require a socket to systemd-journal to log-as-netdata * alarm notify logs in native format * properly compare log ids * fatals log alerts; alarm-notify.sh working * fix overflow warning * alarm-notify.sh now logs the request (command line) * annotate external plugins logs with the function cmd they run * added context, component and type to alarm-notify.sh; shell sanitization removes control character and characters that may be expanded by bash * reformatted alarm-notify logs * unify cgroup-network-helper.sh * added quotes around params * charts.d.plugin switched logging to journal native * quotes for logfmt * unify the status codes of streaming receivers and senders * alarm-notify: don't log anything, if there is nothing to do * all external plugins log to stderr when running outside netdata; alarm-notify now shows an error when notification methods are needed but are not available * migrate cgroup-name.sh to new logging * systemd-cat-native now supports messages with newlines * socket.c logs use priority * cleanup log field types * inherit the systemd set INVOCATION_ID if found * allow systemd-cat-native to send messages to a systemd-journal-remote URL * log2journal command that can convert structured logs to journal export format * various fixes and documentation of log2journal * updated log2journal docs * updated log2journal docs * updated documentation of fields * allow compiling without libcurl * do not use socket as format string * added version information to newly added tools * updated documentation and help messages * fix the namespace socket path * print errno with error * do not timeout * updated docs * updated docs * updated docs * log2journal updated docs and params * when talking to a remote journal, systemd-cat-native batches the messages * enable lz4 compression for systemd-cat-native when sending messages to a systemd-journal-remote * Revert "enable lz4 compression for systemd-cat-native when sending messages to a systemd-journal-remote" This reverts commit b079d53c11f6687cd64d804fdd7b24c0492bf245. * note about uncompressed traffic * log2journal: code reorg and cleanup to make modular * finished rewriting log2journal * more comments * rewriting rules support * increased limits * updated docs * updated docs * fix old log call * use journal only when stderr is connected to journal * update netdata.spec for libcurl, libpcre2 and log2journal * pcre2-devel * do not require pcre2 in centos < 8, amazonlinux < 2023, open suse * log2journal only on systems pcre2 is available * ignore log2journal in .gitignore * avoid log2journal on centos 7, amazonlinux 2 and opensuse * add pcre2-8 to static build * undo last commit * Bundle to static Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud> * Add build deps for deb packages Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud> * Add dependencies; build from source Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud> * Test build for amazon linux and centos expect to fail for suse Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud> * fix minor oversight Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud> * Reorg code * Add the install from source (deps) as a TODO * Not enable the build on suse ecosystem Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud> --------- Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud> Co-authored-by: Tasos Katsoulas <tasos@netdata.cloud>
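The logging commit above mentions `uuid_parse_flexi()`, which parses UUIDs with and without hyphens, and compact UUIDs in the journal. The idea can be sketched in Python as follows. The name `parse_uuid_flexi` and its strictness (exactly 32 hex digits once hyphens are removed) are assumptions of this sketch and say nothing about how the C function validates its input.

```python
import string
import uuid

_HEX_DIGITS = set(string.hexdigits)


def parse_uuid_flexi(text: str) -> uuid.UUID:
    """Parse a UUID given in hyphenated or compact (hyphen-free) form."""
    compact = text.strip().replace("-", "")
    # Accept only exactly 32 hexadecimal digits after removing hyphens.
    if len(compact) != 32 or not all(c in _HEX_DIGITS for c in compact):
        raise ValueError(f"not a UUID: {text!r}")
    return uuid.UUID(hex=compact)


# Both spellings resolve to the same value; .hex gives the compact form.
hyphenated = parse_uuid_flexi("123e4567-e89b-12d3-a456-426614174000")
compact = parse_uuid_flexi("123e4567e89b12d3a456426614174000")
```

Accepting both forms on input while emitting one form on output lets a transaction ID taken from a request header be matched against log entries regardless of how the client formatted it.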
2023-11-21  Add support for gorilla pages for tier 0. (#15969)  vkalintiris
--------- Co-authored-by: Costa Tsaousis <costa@netdata.cloud>
2023-11-21  Remove queue limit from ACLK sync event loop (#16411)  Stelios Fragkakis
Code cleanup
2023-11-13  Switch alarm_log to use the buffer json functions (#16360)  Stelios Fragkakis
* Switch alarm_log to use the buffer json functions * Remove commented out code * Fix finalize when an object is not explicitly closed * Use buffer_json_member_add_boolean