summaryrefslogtreecommitdiffstats
path: root/libnetdata/Makefile.am
AgeCommit message (Collapse)Author
2023-01-30DBENGINE v2 - improvements part 11 (#14337)Costa Tsaousis
* acquiring / releasing interface for metrics * metrics registry statistics * cleanup metrics registry by deleting metrics when they dont have retention anymore; do not double copy the data of pages to be flushed * print the tier in retention summary * Open files with buffered instead of direct I/O (test) * added more metrics stats and fixed unittest * rename writer functions to avoid confusion with refcounting * do not release a metric that is not acquired * Revert to use direct I/O on write -- use direct I/O on read as well * keep track of ARAL overhead and add it to the memory chart * aral full check via api * Cleanup * give names to ARALs and PGCs * aral improvements * restore query expansion to the future * prefer higher resolution tier when switching plans * added extent read statistics * smoother joining of tiers at query engine * fine tune aral max allocation size * aral restructuring to hide its internals from the rest of netdata * aral restructuring; addtion of defrag option to aral to keep the linked list sorted - enabled by default to test it * fully async aral * some statistics and cleanup * fix infinite loop while calculating retention * aral docs and defragmenting disabled by default * fix bug and add optimization when defragmenter is not enabled * aral stress test * aral speed report and documentation * added internal checks that all pages are full * improve internal log about metrics deletion * metrics registry uses one aral per partition * metrics registry aral max size to 512 elements per page * remove data_structures/README.md dependency --------- Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
2023-01-10DBENGINE v2 (#14125)Costa Tsaousis
* count open cache pages refering to datafile * eliminate waste flush attempts * remove eliminated variable * journal v2 scanning split functions * avoid locking open cache for a long time while migrating to journal v2 * dont acquire datafile for the loop; disable thread cancelability while a query is running * work on datafile acquiring * work on datafile deletion * work on datafile deletion again * logs of dbengine should start with DBENGINE * thread specific key for queries to check if a query finishes without a finalize * page_uuid is not used anymore * Cleanup judy traversal when building new v2 Remove not needed calls to metric registry * metric is 8 bytes smaller; timestamps are protected with a spinlock; timestamps in metric are now always coherent * disable checks for invalid time-ranges * Remove type from page details * report scanning time * remove infinite loop from datafile acquire for deletion * remove infinite loop from datafile acquire for deletion again * trace query handles * properly allocate array of dimensions in replication * metrics cleanup * metrics registry uses arrayalloc * arrayalloc free should be protected by lock * use array alloc in page cache * journal v2 scanning fix * datafile reference leaking hunding * do not load metrics of future timestamps * initialize reasons * fix datafile reference leak * do not load pages that are entirely overlapped by others * expand metric retention atomically * split replication logic in initialization and execution * replication prepare ahead queries * replication prepare ahead queries fixed * fix replication workers accounting * add router active queries chart * restore accounting of pages metadata sources; cleanup replication * dont count skipped pages as unroutable * notes on services shutdown * do not migrate to journal v2 too early, while it has pending dirty pages in the main cache for the specific journal file * do not add pages we dont need to pdc * time in range re-work to provide info about past and future matches * finner control on the pages selected for processing; accounting of page related issues * fix invalid reference to handle->page * eliminate data collection handle of pg_lookup_next * accounting for queries with gaps * query preprocessing the same way the processing is done; cache now supports all operations on Judy * dynamic libuv workers based on number of processors; minimum libuv workers 8; replication query init ahead uses libuv workers - reserved ones (3) * get into pdc all matching pages from main cache and open cache; do not do v2 scan if main cache and open cache can satisfy the query * finner gaps calculation; accounting of overlapping pages in queries * fix gaps accounting * move datafile deletion to worker thread * tune libuv workers and thread stack size * stop netdata threads gradually * run indexing together with cache flush/evict * more work on clean shutdown * limit the number of pages to evict per run * do not lock the clean queue for accesses if it is not possible at that time - the page will be moved to the back of the list during eviction * economies on flags for smaller page footprint; cleanup and renames * eviction moves referenced pages to the end of the queue * use murmur hash for indexing partition * murmur should be static * use more indexing partitions * revert number of partitions to number of cpus * cancel threads first, then stop services * revert default thread stack size * dont execute replication requests of disconnected senders * wait more time for services that are exiting gradually * fixed last commit * finer control on page selection algorithm * default stacksize of 1MB * fix formatting * fix worker utilization going crazy when the number is rotating * avoid buffer full due to replication preprocessing of requests * support query priorities * add count of spins in spinlock when compiled with netdata internal checks * remove prioritization from dbengine queries; cache now uses mutexes for the queues * hot pages are now in sections judy arrays, like dirty * align replication queries to optimal page size * during flushing add to clean and evict in batches * Revert "during flushing add to clean and evict in batches" This reverts commit 8fb2b69d068499eacea6de8291c336e5e9f197c7. * dont lock clean while evicting pages during flushing * Revert "dont lock clean while evicting pages during flushing" This reverts commit d6c82b5f40aeba86fc7aead062fab1b819ba58b3. * Revert "Revert "during flushing add to clean and evict in batches"" This reverts commit ca7a187537fb8f743992700427e13042561211ec. * dont cross locks during flushing, for the fastest flushes possible * low-priority queries load pages synchronously * Revert "low-priority queries load pages synchronously" This reverts commit 1ef2662ddcd20fe5842b856c716df134c42d1dc7. * cache uses spinlock again * during flushing, dont lock the clean queue at all; each item is added atomically * do smaller eviction runs * evict one page at a time to minimize lock contention on the clean queue * fix eviction statistics * fix last commit * plain should be main cache * event loop cleanup; evictions and flushes can now happen concurrently * run flush and evictions from tier0 only * remove not needed variables * flushing open cache is not needed; flushing protection is irrelevant since flushing is global for all tiers; added protection to datafiles so that only one flusher can run per datafile at any given time * added worker jobs in timer to find the slow part of it * support fast eviction of pages when all_of_them is set * revert default thread stack size * bypass event loop for dispatching read extent commands to workers - send them directly * Revert "bypass event loop for dispatching read extent commands to workers - send them directly" This reverts commit 2c08bc5bab12881ae33bc73ce5dea03dfc4e1fce. * cache work requests * minimize memory operations during flushing; caching of extent_io_descriptors and page_descriptors * publish flushed pages to open cache in the thread pool * prevent eventloop requests from getting stacked in the event loop * single threaded dbengine controller; support priorities for all queries; major cleanup and restructuring of rrdengine.c * more rrdengine.c cleanup * enable db rotation * do not log when there is a filter * do not run multiple migration to journal v2 * load all extents async * fix wrong paste * report opcodes waiting, works dispatched, works executing * cleanup event loop memory every 10 minutes * dont dispatch more work requests than the number of threads available * use the dispatched counter instead of the executing counter to check if the worker thread pool is full * remove UV_RUN_NOWAIT * replication to fill the queues * caching of extent buffers; code cleanup * caching of pdc and pd; rework on journal v2 indexing, datafile creation, database rotation * single transaction wal * synchronous flushing * first cancel the threads, then signal them to exit * caching of rrdeng query handles; added priority to query target; health is now low prio * add priority to the missing points; do not allow critical priority in queries * offload query preparation and routing to libuv thread pool * updated timing charts for the offloaded query preparation * caching of WALs * accounting for struct caches (buffers); do not load extents with invalid sizes * protection against memory booming during replication due to the optimal alignment of pages; sender thread buffer is now also reset when the circular buffer is reset * also check if the expanded before is not the chart later updated time * also check if the expanded before is not after the wall clock time of when the query started * Remove unused variable * replication to queue less queries; cleanup of internal fatals * Mark dimension to be updated async * caching of extent_page_details_list (epdl) and datafile_extent_offset_list (deol) * disable pgc stress test, under an ifdef * disable mrg stress test under an ifdef * Mark chart and host labels, host info for async check and store in the database * dictionary items use arrayalloc * cache section pages structure is allocated with arrayalloc * Add function to wakeup the aclk query threads and check for exit Register function to be called during shutdown after signaling the service to exit * parallel preparation of all dimensions of queries * be more sensitive to enable streaming after replication * atomically finish chart replication * fix last commit * fix last commit again * fix last commit again again * fix last commit again again again * unify the normalization of retention calculation for collected charts; do not enable streaming if more than 60 points are to be transferred; eliminate an allocation during replication * do not cancel start streaming; use high priority queries when we have locked chart data collection * prevent starvation on opcodes execution, by allowing 2% of the requests to be re-ordered * opcode now uses 2 spinlocks one for the caching of allocations and one for the waiting queue * Remove check locks and NETDATA_VERIFY_LOCKS as it is not needed anymore * Fix bad memory allocation / cleanup * Cleanup ACLK sync initialization (part 1) * Don't update metric registry during shutdown (part 1) * Prevent crash when dashboard is refreshed and host goes away * Mark ctx that is shutting down. Test not adding flushed pages to open cache as hot if we are shutting down * make ML work * Fix compile without NETDATA_INTERNAL_CHECKS * shutdown each ctx independently * fix completion of quiesce * do not update shared ML charts * Create ML charts on child hosts. When a parent runs a ML for a child, the relevant-ML charts should be created on the child host. These charts should use the parent's hostname to differentiate multiple parents that might run ML for a child. The only exception to this rule is the training/prediction resource usage charts. These are created on the localhost of the parent host, because they provide information specific to said host. * check new ml code * first save the database, then free all memory * dbengine prep exit before freeing all memory; fixed deadlock in cache hot to dirty; added missing check to query engine about metrics without any data in the db * Cleanup metadata thread (part 2) * increase refcount before dispatching prep command * Do not try to stop anomaly detection threads twice. A separate function call has been added to stop anomaly detection threads. This commit removes the left over function calls that were made internally when a host was being created/destroyed. * Remove allocations when smoothing samples buffer The number of dims per sample is always 1, ie. we are training and predicting only individual dimensions. * set the orphan flag when loading archived hosts * track worker dispatch callbacks and threadpool worker init * make ML threads joinable; mark ctx having flushing in progress as early as possible * fix allocation counter * Cleanup metadata thread (part 3) * Cleanup metadata thread (part 4) * Skip metadata host scan when running unittest * unittest support during init * dont use all the libuv threads for queries * break an infinite loop when sleep_usec() is interrupted * ml prediction is a collector for several charts * sleep_usec() now makes sure it will never loop if it passes the time expected; sleep_usec() now uses nanosleep() because clock_nanosleep() misses signals on netdata exit * worker_unregister() in netdata threads cleanup * moved pdc/epdl/deol/extent_buffer related code to pdc.c and pdc.h * fixed ML issues * removed engine2 directory * added dbengine2 files in CMakeLists.txt * move query plan data to query target, so that they can be exposed by in jsonwrap * uniform definition of query plan according to the other query target members * event_loop should be in daemon, not libnetdata * metric_retention_by_uuid() is now part of the storage engine abstraction * unify time_t variables to have the suffix _s (meaning: seconds) * old dbengine statistics become "dbengine io" * do not enable ML resource usage charts by default * unify ml chart families, plugins and modules * cleanup query plans from query target * cleanup all extent buffers * added debug info for rrddim slot to time * rrddim now does proper gap management * full rewrite of the mem modes * use library functions for madvise * use CHECKSUM_SZ for the checksum size * fix coverity warning about the impossible case of returning a page that is entirely in the past of the query * fix dbengine shutdown * keep the old datafile lock until a new datafile has been created, to avoid creating multiple datafiles concurrently * fine tune cache evictions * dont initialize health if the health service is not running - prevent crash on shutdown while children get connected * rename AS threads to ACLK[hostname] * prevent re-use of uninitialized memory in queries * use JulyL instead of JudyL for PDC operations - to test it first * add also JulyL files * fix July memory accounting * disable July for PDC (use Judy) * use the function to remove datafiles from linked list * fix july and event_loop * add july to libnetdata subdirs * rename time_t variables that end in _t to end in _s * replicate when there is a gap at the beginning of the replication period * reset postponing of sender connections when a receiver is connected * Adjust update every properly * fix replication infinite loop due to last change * packed enums in rrd.h and cleanup of obsolete rrd structure members * prevent deadlock in replication: replication_recalculate_buffer_used_ratio_unsafe() deadlocking with replication_sender_delete_pending_requests() * void unused variable * void unused variables * fix indentation * entries_by_time calculation in VD was wrong; restored internal checks for checking future timestamps * macros to caclulate page entries by time and size * prevent statsd cleanup crash on exit * cleanup health thread related variables Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com> Co-authored-by: vkalintiris <vasilis@netdata.cloud>
2022-09-19RRD structures managed by dictionaries (#13646)Costa Tsaousis
* rrdset - in progress * rrdset optimal constructor; rrdset conflict * rrdset final touches * re-organization of rrdset object members * prevent use-after-free * dictionary dfe supports also counting of iterations * rrddim managed by dictionary * rrd.h cleanup * DICTIONARY_ITEM now is referencing actual dictionary items in the code * removed rrdset linked list * Revert "removed rrdset linked list" This reverts commit 690d6a588b4b99619c2c5e10f84e8f868ae6def5. * removed rrdset linked list * added comments * Switch chart uuid to static allocation in rrdset Remove unused functions * rrdset_archive() and friends... * always create rrdfamily * enable ml_free_dimension * rrddim_foreach done with dfe * most custom rrddim loops replaced with rrddim_foreach * removed accesses to rrddim->dimensions * removed locks that are no longer needed * rrdsetvar is now managed by the dictionary * set rrdset is rrdsetvar, fixes https://github.com/netdata/netdata/pull/13646#issuecomment-1242574853 * conflict callback of rrdsetvar now properly checks if it has to reset the variable * dictionary registered callbacks accept as first parameter the DICTIONARY_ITEM * dictionary dfe now uses internal counter to report; avoided excess variables defined with dfe * dictionary walkthrough callbacks get dictionary acquired items * dictionary reference counters that can be dupped from zero * added advanced functions for get and del * rrdvar managed by dictionaries * thread safety for rrdsetvar * faster rrdvar initialization * rrdvar string lengths should match in all add, del, get functions * rrdvar internals hidden from the rest of the world * rrdvar is now acquired throughout netdata * hide the internal structures of rrdsetvar * rrdsetvar is now acquired through out netdata * rrddimvar managed by dictionary; rrddimvar linked list removed; rrddimvar structures hidden from the rest of netdata * better error handling * dont create variables if not initialized for health * dont create variables if not initialized for health again * rrdfamily is now managed by dictionaries; references of it are acquired dictionary items * type checking on acquired objects * rrdcalc renaming of functions * type checking for rrdfamily_acquired * rrdcalc managed by dictionaries * rrdcalc double free fix * host rrdvars is always needed * attempt to fix deadlock 1 * attempt to fix deadlock 2 * Remove unused variable * attempt to fix deadlock 3 * snprintfz * rrdcalc index in rrdset fix * Stop storing active charts and computing chart hashes * Remove store active chart function * Remove compute chart hash function * Remove sql_store_chart_hash function * Remove store_active_dimension function * dictionary delayed destruction * formatting and cleanup * zero dictionary base on rrdsetvar * added internal error to log delayed destructions of dictionaries * typo in rrddimvar * added debugging info to dictionary * debug info * fix for rrdcalc keys being empty * remove forgotten unlock * remove deadlock * Switch to metadata version 5 and drop chart_hash chart_hash_map chart_active dimension_active v_chart_hash * SQL cosmetic changes * do not busy wait while destroying a referenced dictionary * remove deadlock * code cleanup; re-organization; * fast cleanup and flushing of dictionaries * number formatting fixes * do not delete configured alerts when archiving a chart * rrddim obsolete linked list management outside dictionaries * removed duplicate contexts call * fix crash when rrdfamily is not initialized * dont keep rrddimvar referenced * properly cleanup rrdvar * removed some locks * Do not attempt to cleanup chart_hash / chart_hash_map * rrdcalctemplate managed by dictionary * register callbacks on the right dictionary * removed some more locks * rrdcalc secondary index replaced with linked-list; rrdcalc labels updates are now executed by health thread * when looking up for an alarm look using both chart id and chart name * host initialization a bit more modular * init rrdlabels on host update * preparation for dictionary views * improved comment * unused variables without internal checks * service threads isolation and worker info * more worker info in service thread * thread cancelability debugging with internal checks * strings data races addressed; fixes https://github.com/netdata/netdata/issues/13647 * dictionary modularization * Remove unused SQL statement definition * unit-tested thread safety of dictionaries; removed data race conditions on dictionaries and strings; dictionaries now can detect if the caller is holds a write lock and automatically all the calls become their unsafe versions; all direct calls to unsafe version is eliminated * remove worker_is_idle() from the exit of service functions, because we lose the lock time between loops * rewritten dictionary to have 2 separate locks, one for indexing and another for traversal * Update collectors/cgroups.plugin/sys_fs_cgroup.c Co-authored-by: Vladimir Kobal <vlad@prokk.net> * Update collectors/cgroups.plugin/sys_fs_cgroup.c Co-authored-by: Vladimir Kobal <vlad@prokk.net> * Update collectors/proc.plugin/proc_net_dev.c Co-authored-by: Vladimir Kobal <vlad@prokk.net> * fix memory leak in rrdset cache_dir * minor dictionary changes * dont use index locks in single threaded * obsolete dict option * rrddim options and flags separation; rrdset_done() optimization to keep array of reference pointers to rrddim; * fix jump on uninitialized value in dictionary; remove double free of cache_dir * addressed codacy findings * removed debugging code * use the private refcount on dictionaries * make dictionary item desctructors work on dictionary destruction; strictier control on dictionary API; proper cleanup sequence on rrddim; * more dictionary statistics * global statistics about dictionary operations, memory, items, callbacks * dictionary support for views - missing the public API * removed warning about unused parameter * chart and context name for cloud * chart and context name for cloud, again * dictionary statistics fixed; first implementation of dictionary views - not currently used * only the master can globally delete an item * context needs netdata prefix * fix context and chart it of spins * fix for host variables when health is not enabled * run garbage collector on item insert too * Fix info message; remove extra "using" * update dict unittest for new placement of garbage collector * we need RRDHOST->rrdvars for maintaining custom host variables * Health initialization needs the host->host_uuid * split STRING to its own files; no code changes other than that * initialize health unconditionally * unit tests do not pollute the global scope with their variables * Skip initialization when creating archived hosts on startup. When a child connects it will initialize properly Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com> Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-07-08array allocator for dbengine page descriptors (#13312)Costa Tsaousis
* array allocator for dbengine page descriptors * full implementation of array allocator with cleanup * faster deallocations * eliminate entierely the need for loops during free * addressed comments * lower the min number of elements to 10
2022-05-09Workers utilization charts (#12807)Costa Tsaousis
* initial version of worker utilization * working example * without mutexes * monitoring DBENGINE, ACLKSYNC, WEB workers * added charts to monitor worker usage * fixed charts units * updated contexts * updated priorities * added documentation * converted threads to stacked chart * One query per query thread * Revert "One query per query thread" This reverts commit 6aeb391f5987c3c6ba2864b559fd7f0cd64b14d3. * fixed priority for web charts * read worker cpu utilization from proc * read workers cpu utilization via /proc/self/task/PID/stat, so that we have cpu utilization even when the jobs are too long to finish within our update_every frequency * disabled web server cpu utilization monitoring - it is now monitored by worker utilization * tight integration of worker utilization to web server * monitoring statsd worker threads * code cleanup and renaming of variables * contrained worker and statistics conflict to just one variable * support for rendering jobs per type * better priorities and removed the total jobs chart * added busy time in ms per job type * added proc.plugin monitoring, switch clock to MONOTONIC_RAW if available, global statistics now cleans up old worker threads * isolated worker thread families * added cgroups.plugin workers * remove unneeded dimensions when then expected worker is just one * plugins.d and streaming monitoring * rebased; support worker_is_busy() to be called one after another * added diskspace plugin monitoring * added tc.plugin monitoring * added ML threads monitoring * dont create dimensions and charts that are not needed * fix crash when job types are added on the fly * added timex and idlejitter plugins; collected heartbeat statistics; reworked heartbeat according to the POSIX * the right name is heartbeat for this chart * monitor streaming senders * added streaming senders to global stats * prevent division by zero * added clock_init() to external C plugins * added freebsd and macos plugins * added freebsd and macos to global statistics * dont use new as a variable; address compiler warnings on FreeBSD and MacOS * refactored contexts to be unique; added health threads monitoring Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
2022-05-03One way allocator to double the speed of parallel context queries (#12787)Costa Tsaousis
* one way allocator to speed up context queries * fixed a bug while expanding memory pages * reworked for clarity and finally fixed the bug of allocating memory beyond the page size * further optimize allocation step to minimize the number of allocations made * implement strdup with memcpy instead of strcpy * added documentation * prevent an uninitialized use of owa * added callocz() interface * integrate onewayalloc everywhere - apart sql queries * one way allocator is now used in context queries using archived charts in sql * align on the size of pointers * forgotten freez() * removed not needed memcpys * give unique names to global variables to avoid conflicts with system definitions
2022-01-18Do not use dbengine headers when dbengine is disabled. (#11967)vkalintiris
Prior to this commit both daemon/commands.c and spawn/spawn.c used to include database/engine/rrdenginelib.h, ie. a header file that is available only when enabling the dbengine feature.
2020-02-17eBPF process plugin (#7979)thiagoftsm
* syscall_plugin: Compilation This commit brings the necessaries changes to the compilation files * syscall_plugin: Collector body This commit brings the collector body to files. * syscall_plugin: .gitignore This commit adds syscall.plugin to .gitignore * syscall_plugin: Plugin adjust Fix reference and remove message * syscall_plugin: Remove limit Remove call to setrlimit * syscall: Fix start This commit fixes problems related with start of the plugin * syscall_plugin: Bring heartbeat This commit removes the sleep and changes to heartbeat to avoid plugin receive a SIGTERM * syscall_plugin: Missing semicolon * syscall_plugin: Fix dimension Brings the initial value of chart for the normal dimension of the other values * syscall_plugin: Fix dimension 2 The previous change did not give the expected results, so I am bringing more a fix * syscall_plugin: adjust values Rename function and adjust pid size * syscall_plugin: Remove Chart and fix var this commit removes a chart that will not be created and fix an error when the bytes were calculated * syscall_plugin: Brings error This commit brings a new variable that will be used to identify errors * syscall_plugin: Rename charts This commit starts to rename the charts properly * syscall_plugin: Rename plugin * syscall_plugin: missing changes for rename * syscall_plugin: fix compilation * syscall_plugin: bring new charts * syscall_plugin: Warnings Remove warnings from compilation time * vfs_plugin: Fix Error chart plot There was an error when the chart was being displayed * vfs_plugin: Change family This commit changes the family of the VFS plugin * vfs_plugin: Fix order This PR fixes the wrong order when creating a chart * vfs_plugin: Remove path Remove path from structure * vfs_plugin: From Perf to HASH This commit converts the main source a hash table and also split the data collection per chart * vfs_plugin: Adjusts and exit This commit brings adjusts to the collect and the complete monitor to exit events * vfs_plugin: Start process This commit brings the monitoring of a process start and thread creation to Netdata * vfs_plugin: Visualization and collection Adjust variables to show and to collect data * vfs_plugin: Connection with apps plugin This commit starts to bring the connection with apps. * vfs_plugin: Various This commit brings new label for charts, fix to error chart and adjusts for new charts, I am sorry * vfs_plugin: basis new chart This commit brings the basis of the new charts for the plugin * vfs_plugin: Apps plugin This commit brings the integration with apps.plugin * vfs_plugin:fix counter This commit fixer the difference between apps plugin and counter * ebpf_plugin: rename charts This commit renames the charts * ebpf_plugin: New charts adjusts and log start * ebpf_plugin: Log thread Creates the log thread that will be used to store error message * ebpf_plugin: Rename Web Group This commit reorganize the charts on dashboard * ebpf_plugin: Restore This commit restore the previous status of the collector where we only have a global vision of the problems * ebpf_plugin: kretprobe This commit brings the initial changes for the collector works with both eBPF program * ebpf_plugin: New syscalls This commit brings the new syscalls that we are monitoring * ebpf_plugin: New charts This commit brings new charts to the collector * ebpf_plugin: Parse config This commit starts the parser of the file * ebpf_plugin: collector debug * ebpf_plugin: Global variables from config This commit brings the global variable update from the config file * ebpf_plugin: Clean kprobe_events This commit brings the clean of kprobe_events and also starts the common library for all eBPF collectors * ebpf_plugin: Check kernel version This function brings a check for the kernel version * ebpf_plugin: Start documentation This commit brings the initial documentation for the users * ebpf_plugin: Documentation This commit brings adjust to code and updates for the documentation * ebpf_plugin: this commit brings the developer mode to the collector * ebpf_plugin: Documentation This commit brings more information to the documentation * ebpf_plugin: Documentation This commit brings more information to the documentation * ebpf_plugin: errno to logs Brings errno number to logs * ebpf_plugin: Documentation This commit brings fixes to the collector documentation * ebpf_plugin: Move description This commit move the chart description from the C code to dashboard_info.js * ebpf_plugin: Rename files This commit rename files to the final version * ebpf_plugin: COntinue renaming This commit continue renaming the files to the final version * ebpf_plugin: Renaming process This commit renames the final plugin * ebpf_plugin: Finish rename This commit finishes the rename processing * ebpf_plugin: fix entry charts This commit removes one chart from mode * ebpf_plugin: Fix remove This commit brings a new function to fix the unload of collector when the collector is running in entry mode * ebpf_plugin: Rename on old kernels This commit brings fixes for syscall names * ebpf_plugin: Timestamp to log This commit brings the timestamp to the logs * ebpf_plugin: Remove syscall With the changes on the backend, we are not monitoring more sys_clone * ebpf_plugin: The syscall is important for 5.3 or newer, so I am returning * ebpf_plugin: Remove concurrency This commit adds variables necessary to interact with the new structor of the eBPF program * ebpf_plugin: Ids to dimension This commit fews the functions name as ids for the dimensions * ebpf_plugin: Missing chart This commit brings the missing chart for Netdata * ebpf_plugin: Remove unecessary message Remove unecessary error message from the collector * ebpf_plugin: Rename dimension This commit renames the dimension for something more meaninful * ebpf_plugin: Optional log This commit converts the developer.log in an optional feature * redirect to stdoou This commit starts to bring the capability to redirect everything to stdout * ebpf_plugin: Disable dev mode This commit removes the possibility to load the dev mode file for while * ebpf_plugin: Disable eBPF process By default this plugin won't be enabled * ebpf_plugin: Update debug message * ebpf_plugin: this commit adjusts documentation to next release. * ebpf_plugin: documentation fix. * ebpf_plugin: Percpu hash This commit moves from an unique hash table for various to speed up the collector * ebpf_plugin: Compatibility This commit set compatibility version between kernels
2019-11-11Makefile.am files indentation (#7252)Konstantinos Natsakis
* Use 4 spaces for indentation of non-recipe lines in Makefile.am files * Be more consistent in the use of space before = in Makefile.am files
2019-10-15Add CMocka unit tests (#6985)Vladimir Kobal
* Add str2ld test * Build test with Autotools * Add storage_number test * Configure tests in CMake
2019-07-01Easily disable alarms, by persisting the silencers configuration (#6360)thiagoftsm
This PR was created to fix #3414, here I am completing the job initiated by Christopher, among the newest features that we are bring we have JSON inside the core - We are bringing to the core the capacity to work with JSON files, this is available either using the JSON-C library case it is present in the system or using JSMN library that was incorporated to our core. The preference is to have JSON-C, because it is a more complete library, but case the user does not have the library installed we are keeping the JSMN for we do not lose the feature. Health LIST - We are bringing more one command to the Health API, now with the LIST it is possible to get in JSON format the alarms active with Netdata. Health reorganized - Previously we had duplicated code in different files, this PR is fixing this (Thanks @cakrit !), the Health is now better organized. Removing memory leak - The first implementation of the json.c was creating SILENCERS without to link it in anywhere. Now it has been linked properly. Script updated - We are bringing some changes to the script that tests the Health. This PR also fixes the race condition created by the previous new position of the SILENCERS creation, I had to move it to daemon/main.c, because after various tests, it was confirmed that the error could happen in different parts of the code, case it was not initialized before the threads starts. Component Name health directory health-cmd Additional Information Fixes #6356 and #3414
2019-06-28Revert "Easily disable alarms, by persisting the silencers configuration ↵Pavlos Emm. Katsoulakis
(#6274)" This reverts commit 60a73e90de2aa1c2eaae2ebbc45dd1fb96034df2. Emergency rollback of potential culprit as per issue #6356 Will be re-merging the change after investigation
2019-06-27Easily disable alarms, by persisting the silencers configuration (#6274)thiagoftsm
* Alarms begin! * Alarms web interface comments! * Alarms web interface comments 2! * Alarms bringing Christopher work! * Alarms bringing Christopher work! * Alarms commenting code that will be rewritten! * Alarms json-c begin! * Alarms json-c end! * Alarms missed script! * Alarms fix json-c parser and change script to test LIST! * Alarms fix test script! * Alarms documentation! * Alarms script step 1! * Alarms fix script! * Alarms fix testing script and code! * Alarms missing arguments to pkg_check_modules * SSL_backend indentation! * Alarms, description in Makefile * Alarms missing extern! * Alarms compilation! * Alarms libnetdata/health! * Alarms fill library! * Alarms fill CMakeList! * Alarm fix version! * Alarm remove readme! * Alarm fix readme version!
2018-10-15modularized all source code (#4391)Costa Tsaousis
* modularized all external plugins * added README.md in plugins * fixed title * fixed typo * relative link to external plugins * external plugins configuration README * added plugins link * remove plugins link * plugin names are links * added links to external plugins * removed unecessary spacing * list to table * added language * fixed typo * list to table on internal plugins * added more documentation to internal plugins * moved python, node, and bash code and configs into the external plugins * added statsd README * fix bug with corrupting config.h every 2nd compilation * moved all config files together with their code * more documentation * diskspace info * fixed broken links in apps.plugin * added backends docs * updated plugins readme * move nc-backend.sh to backends * created daemon directory * moved all code outside src/ * fixed readme identation * renamed plugins.d.plugin to plugins.d * updated readme * removed linux- from linux plugins * updated readme * updated readme * updated readme * updated readme * updated readme * updated readme * fixed README.md links * fixed netdata tree links * updated codacy, codeclimate and lgtm excluded paths * update CMakeLists.txt * updated automake options at top directory * libnetdata slit into directories * updated READMEs * updated READMEs * updated ARL docs * updated ARL docs * moved /plugins to /collectors * moved all external plugins outside plugins.d * updated codacy, codeclimate, lgtm * updated README * updated url * updated readme * updated readme * updated readme * updated readme * moved api and web into webserver * web/api web/gui web/server * modularized webserver * removed web/gui/version.txt