path: root/CMakeLists.txt
Age  Commit message  Author
2020-05-13  Rename eBPF collector (#8822)  thiagoftsm
We renamed the eBPF collector to give it a more meaningful name.
2020-04-13  Revert "Revert changes since v1.21 in preparation for hotfix release."  Austin S. Hemmelgarn
This reverts commit e2874320fc027f7ab51ab3e115d5b1889b8fd747.
2020-04-13  Revert changes since v1.21 in preparation for hotfix release.  Austin S. Hemmelgarn
2020-04-10  Show internal stats for the exporting engine (#8635)  Vladimir Kobal
* Add a print function for internal exporting statistics * Send statistics for simple connectors * Flush sending buffers on failures * Send statistics for the Kinesis connector * Send statistics for the MongoDB connector * Add unit tests
2020-03-30  Add a MongoDB connector to the exporting engine (#8416)  Vladimir Kobal
* Copy files from the MongoDB backend * Update the documentation * Rename functions in the MongoDB backend * Add the connector to the Netdata build * Add an initializer and a worker * Add specific configuration options * Initialize the connector * Add a ring buffer for inserting data to a MongoDB database * Add unit tests
2020-03-16  Add missing files to CMake (#8412)  Timo
2020-03-16  Fix Prometheus Remote Write build (#8411)  Vladimir Kobal
2020-03-15  Support SOCKS5 in ACLK Challenge/Response and rewrite with LWS (#8404)  Timo
* wip * add alpn * beginning of cleanup * move common code * add SOCKS5 support * check HTTP response code * add timeout * separate https_client into own files * fix some mem leaks from master + avoid string copying and alloc/free * fix some PR unrelated warnings
2020-03-12  Add a Prometheus Remote Write connector to the exporting engine (#8292)  Vladimir Kobal
* Copy files from the Prometheus remote write backend * Update the documentation * Rename backend -> exporting * Add the connector to the Netdata build * Separate files for the remote write connector * Add an initializer and formatters * Read a connector specific configuration option * Add a separate function for header sending * Use labels instead of tags * Separate write request for every instance * Add unit tests
2020-03-05  ACLK cmake fixes (#8280)  Timo
* ACLK cmake fixes
2020-02-25  Add an AWS Kinesis connector to the exporting engine (#8145)  Vladimir Kobal
* Prepare files for the AWS Kinesis exporting connector * Update the documentation * Rename functions in backends * Include the connector in the Netdata build * Add initializers and a worker * Add Kinesis specific configuration options * Add a compile time configuration check * Remove the connector data structure * Restore unit tests * Fix the compile-time configuration check * Initialize AWS SDK only once * Don't create an instance for an unknown exporting connector * Separate client and request outcome data for every instance * Fix memory cleanup, document functions * Add unit tests * Update the documentation
2020-02-17  eBPF process plugin (#7979)  thiagoftsm
* syscall_plugin: Compilation This commit brings the necessaries changes to the compilation files * syscall_plugin: Collector body This commit brings the collector body to files. * syscall_plugin: .gitignore This commit adds syscall.plugin to .gitignore * syscall_plugin: Plugin adjust Fix reference and remove message * syscall_plugin: Remove limit Remove call to setrlimit * syscall: Fix start This commit fixes problems related with start of the plugin * syscall_plugin: Bring heartbeat This commit removes the sleep and changes to heartbeat to avoid plugin receive a SIGTERM * syscall_plugin: Missing semicolon * syscall_plugin: Fix dimension Brings the initial value of chart for the normal dimension of the other values * syscall_plugin: Fix dimension 2 The previous change did not give the expected results, so I am bringing more a fix * syscall_plugin: adjust values Rename function and adjust pid size * syscall_plugin: Remove Chart and fix var this commit removes a chart that will not be created and fix an error when the bytes were calculated * syscall_plugin: Brings error This commit brings a new variable that will be used to identify errors * syscall_plugin: Rename charts This commit starts to rename the charts properly * syscall_plugin: Rename plugin * syscall_plugin: missing changes for rename * syscall_plugin: fix compilation * syscall_plugin: bring new charts * syscall_plugin: Warnings Remove warnings from compilation time * vfs_plugin: Fix Error chart plot There was an error when the chart was being displayed * vfs_plugin: Change family This commit changes the family of the VFS plugin * vfs_plugin: Fix order This PR fixes the wrong order when creating a chart * vfs_plugin: Remove path Remove path from structure * vfs_plugin: From Perf to HASH This commit converts the main source a hash table and also split the data collection per chart * vfs_plugin: Adjusts and exit This commit brings adjusts to the collect and the complete monitor to exit events * vfs_plugin: 
Start process This commit brings the monitoring of a process start and thread creation to Netdata * vfs_plugin: Visualization and collection Adjust variables to show and to collect data * vfs_plugin: Connection with apps plugin This commit starts to bring the connection with apps. * vfs_plugin: Various This commit brings new label for charts, fix to error chart and adjusts for new charts, I am sorry * vfs_plugin: basis new chart This commit brings the basis of the new charts for the plugin * vfs_plugin: Apps plugin This commit brings the integration with apps.plugin * vfs_plugin:fix counter This commit fixer the difference between apps plugin and counter * ebpf_plugin: rename charts This commit renames the charts * ebpf_plugin: New charts adjusts and log start * ebpf_plugin: Log thread Creates the log thread that will be used to store error message * ebpf_plugin: Rename Web Group This commit reorganize the charts on dashboard * ebpf_plugin: Restore This commit restore the previous status of the collector where we only have a global vision of the problems * ebpf_plugin: kretprobe This commit brings the initial changes for the collector works with both eBPF program * ebpf_plugin: New syscalls This commit brings the new syscalls that we are monitoring * ebpf_plugin: New charts This commit brings new charts to the collector * ebpf_plugin: Parse config This commit starts the parser of the file * ebpf_plugin: collector debug * ebpf_plugin: Global variables from config This commit brings the global variable update from the config file * ebpf_plugin: Clean kprobe_events This commit brings the clean of kprobe_events and also starts the common library for all eBPF collectors * ebpf_plugin: Check kernel version This function brings a check for the kernel version * ebpf_plugin: Start documentation This commit brings the initial documentation for the users * ebpf_plugin: Documentation This commit brings adjust to code and updates for the documentation * ebpf_plugin: this commit 
brings the developer mode to the collector * ebpf_plugin: Documentation This commit brings more information to the documentation * ebpf_plugin: Documentation This commit brings more information to the documentation * ebpf_plugin: errno to logs Brings errno number to logs * ebpf_plugin: Documentation This commit brings fixes to the collector documentation * ebpf_plugin: Move description This commit move the chart description from the C code to dashboard_info.js * ebpf_plugin: Rename files This commit rename files to the final version * ebpf_plugin: COntinue renaming This commit continue renaming the files to the final version * ebpf_plugin: Renaming process This commit renames the final plugin * ebpf_plugin: Finish rename This commit finishes the rename processing * ebpf_plugin: fix entry charts This commit removes one chart from mode * ebpf_plugin: Fix remove This commit brings a new function to fix the unload of collector when the collector is running in entry mode * ebpf_plugin: Rename on old kernels This commit brings fixes for syscall names * ebpf_plugin: Timestamp to log This commit brings the timestamp to the logs * ebpf_plugin: Remove syscall With the changes on the backend, we are not monitoring more sys_clone * ebpf_plugin: The syscall is important for 5.3 or newer, so I am returning * ebpf_plugin: Remove concurrency This commit adds variables necessary to interact with the new structor of the eBPF program * ebpf_plugin: Ids to dimension This commit fews the functions name as ids for the dimensions * ebpf_plugin: Missing chart This commit brings the missing chart for Netdata * ebpf_plugin: Remove unecessary message Remove unecessary error message from the collector * ebpf_plugin: Rename dimension This commit renames the dimension for something more meaninful * ebpf_plugin: Optional log This commit converts the developer.log in an optional feature * redirect to stdoou This commit starts to bring the capability to redirect everything to stdout * ebpf_plugin: 
Disable dev mode This commit removes the possibility to load the dev mode file for while * ebpf_plugin: Disable eBPF process By default this plugin won't be enabled * ebpf_plugin: Update debug message * ebpf_plugin: this commit adjusts documentation to next release. * ebpf_plugin: documentation fix. * ebpf_plugin: Percpu hash This commit moves from an unique hash table for various to speed up the collector * ebpf_plugin: Compatibility This commit set compatibility version between kernels
2020-02-06  ACLK agent 1 (#7894)  Stelios Fragkakis
* - Add initial mqtt support * [WIP] Agent cloud link - Setup main mqtt thread to connect to a broker using V5 of the MQTT protocol (TBD) - Send alarms to "netdata/alarm" - Add error checks to handle connection failures - Add params for Broker, port Maximum concurrent sent / recev messages - Dummy function to check claiming status - Generic mqtt_send command to publish message to a base topic , sub topic It will end up in the form base_topic/sub_topic - Add host/port in the connection failure error message * Test libmosquitto libs * connect to broker locally (assume localhost:1883) * subscribe to channel netdata/command * Test try a reload command to trigger health reload * publish alerts to netdata/alarm * - Fix compile issues * - Use sleep_usec instead of usleep * - Delay reconnection on failure due to misconfiguration (high cpu usage) * - Remove the TLS connection config * - Fix NETDATA_MQTT_INITIALIZATION_SLEEP_WAIT to use seconds * - Gather ACLK related code under aclk folder - Add aclk_ functions for abstract layer - Moved low level libs intergration in mqtt.c * - Add README.md file with initial comment * - Clean MQTT v5 * - Code cleanup * - Remove alarm log for now - Remove the heart beat * - Remove message properties for V5 * - Remove message properties for V5 (header) * Fixed the netdata target to use a local static version of libmosquitto. The installer does not yet have steps to pull and build the local library. cd project_root git clone ssh://git@github.com/netdata/mosquitto mosquitto/ (cd mosquitto/lib && make) # Ignore the cpp error This will leave mosquitto/lib/libmosquitto.a for the build process to use. * - Fix compile issues with older < 1.6 libmosquitto lib * - Enable alarm events to check it works - Re arrange includes - Rework topic to be agent/guid/. 
Actual id will be returned by the is_agent_claimed * - Add initial metadata info - Added helper function in web_api - Added a debug command (info) * Update the claiming state to retrieve the claimed id. * - Use define for constants like command and metadata topics - Function to wait for initialization of the ACLK link - New aclk_subscribe command with QOS parameter for the mqtt subscription - Use the is_agent_claimed function to get the real claim id and use it to build the topics that will be used for the cloud communication - Change in netdata-claim.sh.in to write the claim id without a trailing \n * - Use define for constants like command and metadata topics - Function to wait for initialization of the ACLK link - New aclk_subscribe command with QOS parameter for the mqtt subscription - Use the is_agent_claimed function to get the real claim id and use it to build the topics that will be used for the cloud communication - Change in netdata-claim.sh.in to write the claim id without a trailing \n * - Remove the alarm log for now - Add code (but disabled) to send charts * - Use dummy anon, anon as username and password for testing purposes * - Use client id anon as well * Testing without TLS * Switching TLS back on to fix docker environment. 
* - Added query processing An incoming URL now calls web_client_api_request_v1_data to handle a request and push the results back to the "data" topic - Move the above processing from the message callback to the query handle loop - Added helper "pause" , "resume" commands to stop and resume query processing to stress test loading the queue with queries before executing them - Changed the endpoint topics to "meta", and "cmd" (previously metadata and command) * make info message follow protocol * move metadata msg generation into new func * move metadata msg generation into new func * - Add metadata to the responses - Add hook to queue chart changes on creation and dimensions - Changed the queue mechanism to include delay for X seconds - Add delayed submittion of charts to the cloud so that all DIMs are defined to avoid resubmission * - Add additional data info for aclk_queue command * - Use web_clinet_api_request_v1 to handle the incoming request This will handle all requests coming from the cloud * - Cleanup and aclk_query structure - Add msg_id parameter - Enable the incoming JSON request - Enable the outgoing JSON response * - Added new thread to handle query processing - Add lock and cond wait to wakeup thread when queries are submitted - Cleanup on the main init function * - Add wait time on agent init, to allow for chart, alarms and other definitions to be completed. 
- During the wait time, no queries will be queued * - Send metadata on query thread init - New generic create header function for the JSON response - Pack info and charts into one message - Modified chart to remove entries (test) - Modified charts mod to remove entries e.g alarms and volatile info - Change input to aclk_update_chart (RRDHOST / instead of hostname) * - When a request fails, add to the payload - We may need to handle in a different key - Error check in json parsing * - Add dummy aclk_update_alarm command * - Move incoming request JSON parsing code away from mqtt.c - Added #ifdef ACLK_ENABLE so that we can have code merged but disabled by default - Added version in incoming and outgoing JSON dict * - Disable code if ACLK_ENABLE is not defined - Remove references to the mqtt (mosquitto) lib - Add dummy stubs in mqtt.c for completeness if ACLK_ENABLE is not defined * - Disable challenge sample code for now * - Remove libmosquitto from makefile * - Fix spaces in Makefile.am - Remove ifdef to avoid warning from LGTM * - Remove for now the code that builds an along log test message to send to the cloud * - Add check for ACLK_ENABLE definition and avoid calling the chart update functions * - Remove commented code * - Move source files to the correct place (ACLK_PLUGIN_FILES) * - Remove include file thats not needed * - Remove include file thats not needed - Add improved checks for load_claiming_state() * - Fix error message. 
Used error() that also logs errno and message * - Fix some codacy issues * - Fix more codacy issues, code cleanup * - Revert code to address codacy warnings * - Revert spaces added in a previous commit by mistake * clean up if/else nest * print error if fopen fails * minor - error already logs errno * - Fix version formatting * - Cleanup all ACLK related compiler warnings - Re-arrange include files - Removed unused defines * - More compilation warnings fixed - Bug with thread creation fixed * - Add condition to skip compilation of the ACLK code entirely. Add env variable ACLK="yes" to enable * - Add condition to skip the libmosquitto * - Change feature flag from ACLK_ENABLE to ENABLE_ACLK in accordance with the rest of ENABLE_xx flags - Typo in info message fix Co-authored-by: Andrew Moss <1043609+amoss@users.noreply.github.com> Co-authored-by: Timo <6674623+underhood@users.noreply.github.com>
2020-01-23  Fix unit tests for the exporting engine (#7784)  Vladimir Kobal
2020-01-07  Restore support for protobuf 3.0 (#7683)  Vladimir Kobal
2019-12-24  Fix a warning in prometheus remote write backend (#7609)  Vladimir Kobal
* Change deprecated method to a new one * Change the minimum required version of protobuf
2019-12-19  Agent claiming (#7525)  Markos Fountoulakis
Initial infrastructure support for agent claiming. This feature is not currently enabled as we are still finalizing the details of the cloud infrastructure w.r.t. agent claiming. The feature will be enabled when we are ready to release it.
2019-12-12  Implement the main flow for the Exporting Engine (#7149)  Vladimir Kobal
* Add top level tests * Add a skeleton for preparing buffers * Initialize graphite instance * Prepare buffers for all instances * Add Graphite collected value formatter * Add support for exporting.conf read and parsing * - Use new exporting_config instead of netdata_config * Implement Graphite worker * Disable exporting engine compilation if libuv is not available * Add mutex locks - Configure connectors as connector_<type> in sections of exporting.conf - Change exporting_select_type to check for connector_ fields * - Override exporting_config structure if there is no exporting.conf so that lookups don't fail and we maintain backwards compatibility * Separate fixtures in unit tests * Test exporting_discard_responce * Test response receiving * Test buffer sending * Test simple connector worker - Instance section has the format connector:instance_name, e.g. graphite:my_graphite_instance - Connectors with : in their name, e.g. graphite:plaintext, are reserved, so graphite:plaintext is not accepted because it would activate an instance named "plaintext"; it should be graphite:plaintext:instance_name * - Enable the add_connector_instance to clean up the internal structure by passing NULL,not NULL arguments * Implement configurable update interval - Add additional check to verify instance uniqueness across connectors * Add host and chart filters * Add the value calculation over a database series * Add the calculated over stored data graphite connector * Add tests for graphite connector * Add JSON connector * Add tests for JSON formatting functions * Add OpenTSDB connector * Add tests for the OpenTSDB connector * Add temporary notes to the documentation
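The section-naming rule described in this commit (connector:instance_name, with connector names such as graphite:plaintext reserved) can be illustrated with a minimal sketch. The function name and the `KNOWN_CONNECTORS` set are hypothetical and only cover the two connector names the commit message mentions; this is not the parser from the Netdata source.

```python
# Hypothetical set of connector type names; only these two appear
# in the commit message, the real list is not given there.
KNOWN_CONNECTORS = {"graphite", "graphite:plaintext"}


def parse_instance_section(section):
    """Return (connector, instance_name) or None if the section is not valid.

    A bare connector name such as "graphite:plaintext" is rejected, because
    splitting it at the colon would activate an instance named "plaintext".
    """
    if section in KNOWN_CONNECTORS:
        return None
    # Split at the LAST colon so connector names may contain colons.
    connector, sep, instance = section.rpartition(":")
    if not sep or not instance or connector not in KNOWN_CONNECTORS:
        return None
    return connector, instance


if __name__ == "__main__":
    for name in ("graphite:my_graphite_instance",
                 "graphite:plaintext",
                 "graphite:plaintext:my_instance"):
        print(name, "->", parse_instance_section(name))
```

Checking the reserved names before splitting is the design choice that keeps graphite:plaintext from being read as an instance called "plaintext".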
2019-12-04  Implement netdata command server and cli tool (#7325)  Markos Fountoulakis
* Checkpoint commit (POC) * Implemented command server in the daemon * Add netdatacli implementation * Added prints in command server setup functions * Make libuv version 1 a hard dependency for the agent * Additional documentation * Improved accuracy of names and documentation * Fixed documentation * Fixed buffer overflow * Added support for exit status in cli. Added prefixes for exit code, stdout and stderr. Fixed parsers. * Fix compilation errors * Fix compile errors * Fix compile errors * Fix compile error * Fix linker error for muslc
2019-12-02  proc.plugin: add pressure stall information (#7209)  Haochen Tong
* proc.plugin: add pressure stall information * dashboard_info: add "Pressure" section * proc.plugin: mention PSI collector in doc * dashboard_info: fix grammar in PSI section * proc_pressure: fix wrong line name for "full" metrics * proc_pressure: fix copypasta * proc_pressure: refactor to prepare for cgroup changes * cgroups.plugin: add pressure monitoring * add proc_pressure.h to targets * Makefile.am: fix indentation * cgroups.plugin: remove a useless comment * cgroups.plugin: fix pressure config name * proc.plugin: arrange pressure charts under corresponding sections * dashboard_info: rearrange pressure chart descriptions * dashboard_info: reword PSI descriptions
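For readers unfamiliar with pressure stall information, the kernel exposes it as small text files (for example /proc/pressure/cpu) with one "some" and, for some resources, one "full" line. The sketch below parses that text format; the function name and sample values are illustrative assumptions, and the actual collector in this commit is written in C.

```python
def parse_pressure(text):
    """Parse the contents of a /proc/pressure/<resource> file.

    Each line looks like:
        some avg10=0.12 avg60=0.05 avg300=0.01 total=123456
    Returns {"some": {...}, "full": {...}} with numeric values.
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind = fields[0]  # "some" or "full"
        values = {}
        for pair in fields[1:]:
            key, _, value = pair.partition("=")
            # "total" is a stall time counter in microseconds; the
            # avg* fields are percentages with two decimals.
            values[key] = int(value) if key == "total" else float(value)
        result[kind] = values
    return result


# Illustrative sample, not captured from a real system.
SAMPLE = (
    "some avg10=0.12 avg60=0.05 avg300=0.01 total=123456\n"
    "full avg10=0.00 avg60=0.00 avg300=0.00 total=7890\n"
)

if __name__ == "__main__":
    print(parse_pressure(SAMPLE))
```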
2019-11-21  CMocka tests for Issue 7274 (#7308)  Andrew Moss
* Start of testing partial requests. Need to stash this to checkout a PR to test. * Disambiguated error messages during header validation. The mocking has blown up in the linker, need to wipe out repo local changes and restart from a known good state. * Test failures. CMocka is really not designed for parametric tests which is making it difficult to test the http validation properly. We have some problems in the web_client.c code that are causing early failures in the testing sequence, and it is causing CMocka to abort the sequence. Need to try a different approach to building the tests... * Pedantic style pass. * Test generation. There must be another value hidden in the system that CMocka uses. This sets up 3278 tests but the results from cmocka_run_group_tests_name show 0 tests were run. * The problem was the "helper"-macro. Calling CMocka directly, moved the setup/teardown into explicit fixtures. Successfully runs the family of tests over the same (empty) state. * Parameterised family of tests runs. The api_next() acts as a counter, the least significant digit is the prefix_len using the web_buffer in the test_family struct as a template to walk throufh. The most significant digit is the number of headers to use in the request. Checked that this walk executes correctly and all the tests run before putting the test payloads back in. We trigger a failure about 3-4 tests in that takes down the process. Currently investigating which parts are not mocked correctly. * Pedantic style pass * Adding a mocking for fatal. That weird thing with the linker has happened again, need to clean repo and rebuild fresh. * Full test sequence executes. The test parameter counter jammed after a failure - we cannot rely on anything in the main test body being executed after we call the functionailty under test. A failure will skip the rest of the execution. Moved the counter stepping to the top of the function (i.e. it is now a ++i instead of a i++). 
Adjusted the initial state to compensate. This now steps through all of the test-sequence, but it raises an ugly issue - the post-test cleanup will not be executed on a failure. TODO: * Move the test-state into the test_family. * Do the clean-up of the previous test (if necesarry) in the step function. * Fix the assertion on the web_client state. * Pedantic style pass * Test state is now in the test_family. This addresses the issue with leaking on failure and not performing clean-up - we don't really care about memory leaks during unit-testing, but we do care about reseting the system-under-test back to a known state to guarantee independence across the tests. The clean-up is now triggered in api_next(). * Flip the wait flag assertions. Partial requests should leave the web_client waiting to receive more data. * Fixing ACL flags in test-driver. This makes some tests pass - but far too many. Probably need a proper debugging function to show the request / response in a readable format. * Result from the api mocking. Setting a successful return code in the api mocking makes the non-partial tests pass. Zero'ing out the web_client before use has not fixed the initialization errors, there is still some history on the parse_tries that needs to be tracked down. Some of the other errors are spurious - they result from stream multiplexing in the testdriver - be careful with less. * Fix warnings. Switched the build configuration to CFLAGS="-O1 -ggdb -Wall -Wextra -Wformat-signedness -fstack-protector-all -DNETDATA_INTERNAL_CHECKS=1 -D_FORTIFY_SOURCE=2 -DNETDATA_VERIFY_LOCKS=1". The memset introduced last night to zero out the initial web_client state had transposed parameters. Now that the state is initially zero before hitting the http request processing most issues have disappeared. There are 3000+ passing tests and 48 boundary cases to track down. * Pushing log entries from each test into a buffer. This will allow suppression of logs from tests that pass. 
* Switched to a unique test definition structure per test. This cleans up the code as it means that a list of tests can be constructed during the first walk through the parameter space. There is no need to walk the space twice and keep both walks aligned. Removed the cmocka_unit_test macro and build the CMUnitTest structures directly -> this allows a real name per test instead of the procedure name. The walking/step function api_info_next has been folded back into the test procedure as it is simpler to walk the list in the shared test state. Current TODO: * There is a bug, the check on the wait state in the buffer is not being handled properly, investigate why everything fails. * The results don't match the old code, are we handing the correct web_buffer to each tested piece of code? * Capture the test success state -> dump the log buffer on failures. * State is properly passed through the tests. Spent a long time chasing a horrible bug that seems to be inside CMocka? The state parameter being passed to each unit test is different on each call, i.e. it looks like a unique void** where the void * (*state) has been overwritten with the original value on each iteration through the testing loop. This behaviour does not match the CMocka source code, which does thread the given valud through the unit test calls. It could be a side-effect of the memory check-pointing, but the net-effect is that we cannot change the shared state between tests. It can be set in the setup-fixture and used in each test, but not altered for the subsequent test. This took a long time to diagnose - the fix is simple, we just share the state in a global pointer. This shared state is used to walk through the list of test_def structures so that each unit-test knows where it is in the parameter- space. * With the correct state the bug in triggering the correct assertions is gone. * Dump out the buffered logs on test failure. 
* The only failing case (relative to these assertions) are the ulta-short partial-requests. * Check the web_client->mode is set properly. * Style pass * Checking values passed to the API despatch point. * Disabled the parametric tests to do some low-level testings. Later on both sets of tests will be active. While the low-level url encoding tests are being developed the dynamically generated set is disabled to make the output easier to read. Working through the W3C URL spec, against RFC3986 and comparing the cases in available url-parsing test-suites to build our test-suite. * Start of the URL test-suite. The percent-decoding in the current implementation is in the wrong place - it happens too early and causes non-delimitor characters in the URL to be treated as delimitors. Current unit-tests seem to cover the range of checks that we need CMocka to make. The handling of output is a little awkward - need something like the dynamic cases that can output the log on a failure or skip it on a pass. * Raw material for low-level testing. * Adding more families in here is getting too messy. About to switch over to multiple testdrivers. * Need to clean repo to work around wrapping failures in CMocka. * CMocka is not compatible with LTO. The weird wrapping issues that come and go are as a result of LTO. My typical netdata-installer command-line that I use to reboot the project state disables LTO, while my normal autoreconf / configure command-line does not causing this problem to reappear seemingly-randomly. To build a single test-driver target this works: autoreconf -ivf && CFLAGS='-O1 -ggdb -Wall -Wextra -Wformat-signedness -fstack-protector-all -DNETDATA_INTERNAL_CHECKS=1 -D_FORTIFY_SOURCE=2 -DNETDATA_VERIFY_LOCKS=1' ./configure --disable-lto && make web/api/tests/web_api_valid_urls The actual change in this commit is just a bug-fix. * Ripping out the parameterized test generator. Each of the URL cases is slightly and subtly different. 
This can't be done using the parameterization and will need a healthy dose of cut and paste. CMocka does not recognise the mocking for mysendfile, which is necessary to capture the exit route from the URL parsing. * Weird bug in CMocka? For some reason CMocka will not mock out the mysendfile() procedure. We need to mock this to capture the behaviour of the URL parsing as it is one of the exit paths. The wrapping is setup the same way as for the procedures so I cannot see any reason that the library would not overwrite the calls. The only difference that I can find is that mysendfile is in the unit being tested and the other mocked procedures are in different translation units. This should not make a difference, but we have to disable LTO to get CMocka to work and the symbol patchs is some kind of linker hack so there could be an issue if LTO is not running and the patch target is inside the same translation unit. Hiding it for now with a #ifndef UNIT_TESTING, which then compiles find and control flow hits the mock... * Converting the ascii comments into unit_tests. * More nasty cases for unit testing. The commented out case will trigger a buffer overflow in the netdata agent and crash it. * Last of the individual unit tests planned before the demo. * Removing warnings. * Switching on the rest of the parametric set - the other case with CRs. * Fix Travis build failure under docker. * Change the name of a define so it does not collide with existing testing in Travis. * Add CMocka unit tests to CMake * Linting pass * Adding RFC comment to test. * Buffer overflow checks on the captured logs. This fixes the seg-fault seen by @vlvkobal and @thiagoftsm during testing. * Chasing down other valgrind reports. This gets rid of all of the uninitialised variable warnings. We stil have a memory leak, the headers that are set during the unit testing switch on compression. This causes the web_client code to call deflatInit2 and allocate structures for the compressor. 
We do not have a matching call to deflateEnd anywhere in the code so the memory leaks. * Cleaning up a comment. * Fixing review comments from @vlvkobal. Also noticed that the buffer overflow fix this morning was killing the logfile output, fixed this as well. * Addressing @thiagoftsm's concerns about the changing number of failures. Switched the log dump for failing cases to repr(). Found a bug in the test case generator (not storing the flag for `\r`). Verified that the 58 failing cases are the correct set of failures for the tested code.
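The commit message above notes that percent-decoding a URL too early lets encoded characters be mistaken for delimiters. A small sketch of that ordering problem follows; it uses Python's standard `urllib.parse.unquote`, and the function names and sample query are illustrative assumptions rather than anything taken from web_client.c.

```python
from urllib.parse import unquote


def decode_then_split(query):
    """Problematic order: decode first, then split on delimiters."""
    params = []
    for part in unquote(query).split("&"):
        name, _, value = part.partition("=")
        params.append((name, value))
    return params


def split_then_decode(query):
    """Safer order: split on the raw delimiters, then decode each piece."""
    params = []
    for part in query.split("&"):
        name, _, value = part.partition("=")
        params.append((unquote(name), unquote(value)))
    return params


# "%26" is an encoded "&" and "%3D" an encoded "=" inside one value.
QUERY = "chart=system.cpu&label=a%26b%3Dc"

if __name__ == "__main__":
    print(decode_then_split(QUERY))   # the value is torn into two parameters
    print(split_then_decode(QUERY))   # the value survives as "a&b=c"
```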
2019-10-21  [collector/proc.plugin] Add /proc/pagetypeinfo parser (#6843)  Adrien Mahieux
* [proc.plugin/proc_pagetypeinfo] Initial commit * [Fix] Generate graphs for pagetypeinfo * [Fix] Create node/zone/type graphs * [Fix] Use directly size and order * [Add] Configuration handling * [Imp] Changed SetId to identify NodeNumber * [Fix] Standard name for chart priority and value * [Fix] use dynamic pagesize * [Enh] allow prefix for containerized netdata * [Fix] global system graph always on, but for explicit no * [Fix] Add more checks for pageorders_cnt and really use it * [Enh] Special config value of netdata_zero_metrics_enabled * [Fix] Check we parsed at least a valid line
2019-10-18  Fix build when CMocka isn't installed (#7129)  Vladimir Kobal
2019-10-15  Add CMocka unit tests (#6985)  Vladimir Kobal
* Add str2ld test * Build test with Autotools * Add storage_number test * Configure tests in CMake
2019-09-18  Collector slabinfo (#6800)  Adrien Mahieux
### Summary Provide a new collector that parses `/proc/slabinfo` to give details on kernel slab structures. Requested in issue #13 (very happy to close the oldest issue in the backlog) ##### Component Name collectors/slabinfo.plugin ##### Additional Information These slabinfo details give clues about actions performed on your system. In the following screenshot, you can clearly see a `find` run on an ext4 filesystem (the counts of `ext4_inode_cache` & `dentry` objects rise fast), and a few seconds later an admin issued `echo 3 > /proc/sys/vm/drop_caches`, as their counts dropped.
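To make the data source concrete, here is a minimal sketch of reading the first columns of `/proc/slabinfo` (format version 2.1). The function name, the derived `mem_bytes` field, and the sample numbers are assumptions for illustration; the plugin added by this commit is written in C and may compute its charts differently.

```python
def parse_slabinfo(text):
    """Parse /proc/slabinfo text into {cache_name: stats}.

    Only the first columns are used:
        name <active_objs> <num_objs> <objsize> ...
    """
    slabs = {}
    for line in text.splitlines():
        # Skip the version banner and the column header comment.
        if not line or line.startswith("slabinfo") or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        active_objs, num_objs, objsize = (int(f) for f in fields[1:4])
        slabs[fields[0]] = {
            "active_objs": active_objs,
            "num_objs": num_objs,
            "objsize": objsize,
            # Illustrative estimate: allocated objects times object size.
            "mem_bytes": num_objs * objsize,
        }
    return slabs


# Made-up sample in the 2.1 layout, not taken from a real machine.
SAMPLE = """slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
ext4_inode_cache 5000 5200 1080 30 8 : tunables 0 0 0 : slabdata 174 174 0
dentry 20000 21000 192 21 1 : tunables 0 0 0 : slabdata 1000 1000 0
"""

if __name__ == "__main__":
    for name, stats in parse_slabinfo(SAMPLE).items():
        print(name, stats)
```

Reading the real file usually requires root, which is why the sketch works on a text sample instead.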
2019-08-14Add MongoDB backend (#6524)Vladimir Kobal
* Add mongodb backend skeleton * Send data to the backend * Send metrics as separate JSON documents * Add a configuration file * Send all metrics in a batch * Update the documentation * Free configuration strings on exit * Make socket timeout configurable
2019-08-12(re-open) ZRAM info collector module (proc.plugin) (#6424)Vilkov Adel
* ZRAM collector module ZRAM: Implemented zram device id detection ZRAM: Implemented zram device enumeration WIP ZRAM: Memory usage graph (needs other graphs) ZRAM: Added ratio and efficiency graph ZRAM: Added chart description and context names, code formatting * ZRAM: Proper handling of zram device removal * ZRAM: Added additional checks, removed redundant logging
2019-07-24Utf8 Badge Fix And URL Parser International Support (initial) (#6426)Timo
#### Summary Fixes #3117 Additionally it adds support for UTF-8 in the URL parser (as it should). Label sizes are now updated by the browser with JavaScript (although an initial guess is still calculated by verdana11_widths with minor improvements) #### Component Name API/Badges, LibNetData/URL #### Additional Information It was found that not only did verdana11_widths need to be updated, but the URL parser also replaced international characters with spaces (one space per byte of a multibyte character). Therefore I updated both to support international chars.
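The difference between the two URL-decoding behaviours described above can be sketched as follows. This is an illustrative Python emulation only, not the code from the PR: `percent_decode_bytes`, `decode_spaces_for_non_ascii` and `decode_utf8` are hypothetical names, and the "old" function merely mimics the described symptom (one space per non-ASCII byte).

```python
import string

_HEX = set(string.hexdigits)


def percent_decode_bytes(text: str) -> bytes:
    """Turn %XX escapes into raw bytes; everything else is passed through."""
    raw = bytearray()
    i = 0
    while i < len(text):
        pair = text[i + 1:i + 3]
        if text[i] == "%" and len(pair) == 2 and all(c in _HEX for c in pair):
            raw.append(int(pair, 16))
            i += 3
        else:
            # Not a valid escape: keep the character as-is.
            raw.extend(text[i].encode("utf-8"))
            i += 1
    return bytes(raw)


def decode_spaces_for_non_ascii(text: str) -> str:
    """Emulates the described bug: each byte >= 0x80 becomes one space."""
    return "".join(chr(b) if b < 0x80 else " " for b in percent_decode_bytes(text))


def decode_utf8(text: str) -> str:
    """Interprets the decoded bytes as UTF-8, so multibyte characters survive."""
    return percent_decode_bytes(text).decode("utf-8", errors="replace")


label = "caf%C3%A9"  # "café" with the final character percent-encoded as UTF-8
print(repr(decode_spaces_for_non_ascii(label)))
print(repr(decode_utf8(label)))
```

With the first function the two-byte character collapses into two spaces, which also throws off any width estimate based on per-character tables; the second keeps it as a single character.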
2019-07-09Revert "Add ZRAM collector module to the proc plugin"Pavlos Emm. Katsoulakis
This reverts commit c7ab028f787f1c3f1325f6195ea0cb2afc95ab95. **Removed as it was seen to cause crashes. Change will be revised and re-published at a later stage**
2019-07-09Add ZRAM collector module to the proc pluginVilkov Adel
The module gets the ZRAM device list by reading /proc/devices to obtain the ZRAM major device number, then enumerating the devices in /dev and filtering them by that major number. It takes the data from /sys/block/{name}/mm_stat.
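The detection steps above can be sketched in Python as follows. This is a simplified illustration, not the module's code: `find_block_major` and `parse_mm_stat` are hypothetical helpers, the sample text is invented, and only the first three `mm_stat` columns (`orig_data_size`, `compr_data_size`, `mem_used_total`, as listed in the kernel's zram documentation) are read. Filtering the nodes in /dev by major number is left out because it needs a live system.

```python
from typing import Dict, Optional

MM_STAT_FIELDS = ("orig_data_size", "compr_data_size", "mem_used_total")


def find_block_major(devices_text: str, driver: str) -> Optional[int]:
    """Return the major number of `driver` from the 'Block devices:' section."""
    in_block = False
    for line in devices_text.splitlines():
        line = line.strip()
        if line == "Block devices:":
            in_block = True
            continue
        if line.endswith("devices:"):  # e.g. "Character devices:"
            in_block = False
            continue
        if not in_block or not line:
            continue
        parts = line.split()
        if len(parts) == 2 and parts[1] == driver and parts[0].isdigit():
            return int(parts[0])
    return None


def parse_mm_stat(text: str) -> Dict[str, int]:
    """Map the leading mm_stat columns to names; extra columns are ignored."""
    values = text.split()
    if len(values) < len(MM_STAT_FIELDS):
        raise ValueError("mm_stat has fewer columns than expected")
    return {name: int(value) for name, value in zip(MM_STAT_FIELDS, values)}


# Invented samples standing in for /proc/devices and /sys/block/zram0/mm_stat.
SAMPLE_DEVICES = """\
Character devices:
  1 mem
  4 tty

Block devices:
  7 loop
  8 sd
252 zram
"""
SAMPLE_MM_STAT = "  8192000  2048000  2200000  0  2300000  0  0\n"

major = find_block_major(SAMPLE_DEVICES, "zram")
stat = parse_mm_stat(SAMPLE_MM_STAT)
ratio = stat["orig_data_size"] / stat["compr_data_size"]
print(f"zram major={major}, compression ratio={ratio:.2f}")
```

Looking the major number up at runtime matters because it is assigned dynamically, so it can differ between systems.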
2019-07-01Easily disable alarms, by persisting the silencers configuration (#6360)thiagoftsm
This PR was created to fix #3414; it completes the work started by Christopher. Among the new features: JSON inside the core - The core can now work with JSON files, using either the JSON-C library when it is present on the system or the JSMN library, which was incorporated into our core. JSON-C is preferred because it is the more complete library, but JSMN is kept so that the feature is not lost when the user does not have JSON-C installed. Health LIST - The Health API gains one more command: LIST returns, in JSON format, the alarms active in Netdata. Health reorganized - Previously we had duplicated code in different files; this PR fixes that (Thanks @cakrit !), and the Health code is now better organized. Removing memory leak - The first implementation of json.c created SILENCERS without linking it anywhere. It is now linked properly. Script updated - The script that tests the Health has been changed. This PR also fixes the race condition created by the previous position of the SILENCERS creation. I had to move it to daemon/main.c because, after various tests, it was confirmed that the error could happen in different parts of the code if it was not initialized before the threads start. Component Name health directory health-cmd Additional Information Fixes #6356 and #3414
2019-06-28Revert "Easily disable alarms, by persisting the silencers configuration (#6274)"Pavlos Emm. Katsoulakis
This reverts commit 60a73e90de2aa1c2eaae2ebbc45dd1fb96034df2. Emergency rollback of potential culprit as per issue #6356. Will be re-merging the change after investigation
2019-06-27Easily disable alarms, by persisting the silencers configuration (#6274)thiagoftsm
* Alarms begin! * Alarms web interface comments! * Alarms web interface comments 2! * Alarms bringing Christopher work! * Alarms bringing Christopher work! * Alarms commenting code that will be rewritten! * Alarms json-c begin! * Alarms json-c end! * Alarms missed script! * Alarms fix json-c parser and change script to test LIST! * Alarms fix test script! * Alarms documentation! * Alarms script step 1! * Alarms fix script! * Alarms fix testing script and code! * Alarms missing arguments to pkg_check_modules * SSL_backend indentation! * Alarms, description in Makefile * Alarms missing extern! * Alarms compilation! * Alarms libnetdata/health! * Alarms fill library! * Alarms fill CMakeList! * Alarm fix version! * Alarm remove readme! * Alarm fix readme version!
2019-06-20Perf plugin (#6225)Vladimir Kobal
* Add perf plugin skeleton * Initialize events * Collect data * Configure default counters * Add charts for hardware and software counters * Add charts for cache counters * Don't show zeroes for non-existent metrics * Reinit events when stalled * Do not reinit disabled events * Update the documentation * Scale values when multiplexing is happening
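For the last item, scaling under multiplexing, a common approach (described in the perf_event_open documentation) is to extrapolate a counter by the ratio of the time the event was enabled to the time it was actually running on the hardware. The Python sketch below shows that arithmetic only; `scale_counter` is a hypothetical name and this is an assumption about the technique, not a copy of the plugin's code.

```python
def scale_counter(raw: int, time_enabled: int, time_running: int) -> int:
    """Extrapolate a multiplexed counter to the full enabled period.

    raw          -- value read from the counter
    time_enabled -- time the event was enabled
    time_running -- time the event was actually counting
    """
    if time_running == 0:
        # The event never got scheduled, so there is nothing to extrapolate.
        return 0
    if time_running >= time_enabled:
        # No multiplexing happened; the raw value is already complete.
        return raw
    return raw * time_enabled // time_running


# An event that counted 1000 while running for half of the enabled time.
print(scale_counter(1000, time_enabled=200, time_running=100))
```

The result is an estimate: it assumes the event rate while the counter was descheduled matched the rate while it was running.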
2019-06-07Prometheus remote write backend (#6062)Vladimir Kobal
* Add Prometheus remote write backend prototype * Fix autotools issues * Send HTTP POST request * Add parameters to HTTP header * Discard HTTP response 200 * Update CMake build configuration * Fix Codacy issue * Check for C++ binary * Fix compilation without remote write backend * Add options to the installer script * Fix configure script warning * Fix make dist * Downgrade to ByteSize for better compatibility * Integrate remote write more tightly into the existing backends code * Cleanup * Fix build error * Parse host tags * Fix Codacy issue * Fix counters for buffered data * Rename preprocessor symbol * Better error handling * Cleanup * Update the documentation
2019-05-31SSL implementation for Netdata (#5956)thiagoftsm
* SSL implementation for Netdata * Upload of fixes asked by @paulkatsoulakis and @cakrit * Fix local computer * Adding openssl to webserver * fixing.. * HTTPS almost there * Codacy * HTTPS day 3 * HTTPS without Bio step 1 * HTTPS without Bio step 2 * HTTPS without Bio step 3 * HTTPS without Bio step 4 * HTTPS without Bio step 5 * HTTPS without Bio step 6 * HTTPS without Bio step 7 * HTTPS without Bio step 8 * HTTPS without Bio step 9 * HTTPS without Bio step 10 * SSL on streaming 1 * Daily pull * HTTPS without Bio step 11 * HTTPS without Bio step 12 * HTTPS without Bio step 13 * HTTPS without Bio step 14 * SSL_Interception change documentation * HTTPS without Bio step 15 * HTTPS without Bio step 16 * SSL_Interception fix Codacy * SSL_Interception fix doc * SSL_Interception comments * SSL_Interception fixing problems! * SSL_Interception killing bugs * SSL_Interception changing parameter * SSL_Implementation documentation and script * SSL_Implementation multiple fixes * SSL_Implementation installer and cipher * SSL_Implementation Redirect 301 * SSL_Implementation webserver doc and install-or-update.sh * SSL_Implementation error 00000001:lib(0):func(0):reason(1) * SSL_Implementation web server doc * SSL_Implementation SEGFAULT on Fedora * SSL_Implementation fix ^SSL=force|optional * SSL_Implementation Redirect and Ciphers * SSL_Implementation race condition 1 * SSL_Implementation Fix Location * SSL_Implementation Fix Location 2 * SSL_Implementation Fix stream * SSL_Implementation Fix stream 2 * SSL_Implementation Fix stream 3 * SSL_Implementation last problems! * SSL_Implementation adjusts to commit! * SSL_Implementation documentation permission! * SSL_Implementation documentation permission 2! * SSL_Implementation documentation permission 3!
2019-05-30DB engine optimize RAM usage (#6134)Markos Fountoulakis
* Optimize memory footprint of DB engine * Update documentation with the new memory requirements of dbengine * Fixed code style * Fix code style * Fix compile error
2019-05-15Database engine (#5282)Markos Fountoulakis
* Database engine prototype version 0 * Database engine initial integration with netdata POC * Scalable database engine with file and memory management. * Database engine integration with netdata * Added MIN MAX definitions to fix alpine build of travis CI * Bugfix for backends and new DB engine, remove useless rrdset_time2slot() calls and erroneous checks * DB engine disk protocol correction * Moved DB engine storage file location to /var/cache/netdata/{host}/dbengine * Fix configure to require openSSL for DB engine * Fix netdata daemon health not holding read lock when iterating chart dimensions * Optimized query API for new DB engine and old netdata DB fallback code-path * netdata database internal query API improvements and cleanup * Bugfix for DB engine queries returning empty values * Added netdata internal check for data queries for old and new DB * Added statistics to DB engine and fixed memory corruption bug * Added preliminary charts for DB engine statistics * Changed DB engine ratio statistics to incremental * Added netdata statistics charts for DB engine internal statistics * Fix for netdata not compiling successfully when missing dbengine dependencies * Added DB engine functional test to netdata unittest command parameter * Implemented DB engine dataset generator based on example.random chart * Fix build error in CI * Support older versions of libuv1 * Fixes segmentation fault when using multiple DB engine instances concurrently * Fix memory corruption bug * Fixed createdataset advanced option not exiting * Fix for DB engine not working on FreeBSD * Support FreeBSD library paths of new dependencies * Workaround for unsupported O_DIRECT in OS X * Fix unittest crashing during cleanup * Disable DB engine FS caching in Apple OS X since O_DIRECT is not available * Fix segfault when unittest and DB engine dataset generator don't have permissions to create temporary host * Modified DB engine dataset generator to create multiple files * Toned down overzealous page cache prefetcher * Reduce internal memory fragmentation for page-cache data pages * Added documentation describing the DB engine * Documentation bugfixes * Fixed unit tests compilation errors since last rebase * Added note to back-up the DB engine files in documentation * Added codacy fix. * Support old gcc versions for atomic counters in DB engine
2019-05-13Add AWS Kinesis backend (#5914)Vladimir Kobal
* Add Kinesis backend * Separate config file * Send data in chunks * Fix minor issues * Add error handling * Use existing JSON functions * Do not retry on send failure * Implement building with autotools * Implement building with CMake * Fix CMake variables * Fix build when C++ compiler is not available * Add checks for C++11 * Don't reinitialize API * Don't reinitialize client * Minor cleanup * Fix Codacy warning * Separate sending records and receiving results * Add documentation * Make connection timeout configurable * Fix operation metrics * Fix typo * Change parameter names for credentials * Allow using the default SDK credentials configuration
2019-03-27Add xenstat plugin (#5660)Vladimir Kobal
* Add xenstat plugin * Add basic domain charts * Initialize xl context * Use domain UUID instead of name * Make charts obsolete * Add tmem charts * Change algorithm for tmem puts and gets * Add VCPU charts * Minor formatting for sending charts functions * Add VBD charts * Add network charts * Assemble VCPU metrics in one chart * Fix chart names * Make write/sent dimensions negative * Minor formatting * Change id and context for domain charts * Add dashboard info * Get rid of global variables * Free libxenstat and libxl resources * Free domain_metrics on VM shutdown * Add domain state chart * Add debug messages * Add branch prediction hints * Minor fix * Fix chart obsoleting * Make names more general * Fix CMake build of nfacct.plugin
2019-03-08Correct PLUGINS_DIR directory in CMakeLists.txt (#5555)Chris Akritidis
2019-02-13Split nfacct plugin into separate process (#5361)Vladimir Kobal
* Prepare build configuration * Prepare plugin for separating * Add command line options * Add debug messages * Use text API * Minor fixes * Update the documentation * Minor documentation formatting * Fix LGTM alerts * Fix building with CMake * Add nfacct and cups plugins to apps.plugin groups