author | Costa Tsaousis <costa@netdata.cloud> | 2022-10-05 14:13:46 +0300 |
---|---|---|
committer | GitHub <noreply@github.com> | 2022-10-05 14:13:46 +0300 |
commit | 8fc3b351a2e7fc96eced8f924de2e9cec9842128 (patch) | |
tree | bde41c66573ccaf8876c280e00742cc6096b587c | |
parent | 6850878e697d66dc90b9af1e750b22238c63c292 (diff) |
Allow netdata plugins to expose functions for querying more information about specific charts (#13720)
* function renames and code cleanup in popen.c; no actual code changes
* netdata popen() now opens both child process stdin and stdout and returns FILE * for both
* pass both input and output to parser structures
* updated rrdset to call custom functions
* RRDSET FUNCTION leading calls for both sync and async operation
* put RRDSET functions to a separate file
* added format and timeout at function definition
* support for synchronous (internal plugins) and asynchronous (external plugins and children) functions
* /api/v1/function endpoint
* functions are now attached to the host and there is a dictionary view per chart
* functions implemented at plugins.d
* remove the defer until keyword hook from plugins.d when it is done
* stream sender implementation of functions
* sanitization of all functions so that only certain characters are allowed
* stricter sanitization
* common max size
* 1st working plugins.d example
* always init inflight dictionary
* properly destroy dictionaries to avoid parallel insertion of items
* add more debugging on disconnection reasons
* add more debugging on disconnection reasons again
* streaming receiver respects newlines
* don't use the same fp for both streaming receive and send
* don't free dbengine memory with internal checks
* make sender proceed in the buffer
* added timing info and garbage collection at plugins.d
* added info about routing nodes
* added info about routing nodes with delay
* added more info about delays
* added more info about delays again
* signal sending thread to wake up
* streaming version labeling and commented code to support capabilities
* added functions to /api/v1/data, /api/v1/charts, /api/v1/chart, /api/v1/info
* redirect top output to stdout
* address coverity findings
* fix resource leaks of popen
* log attempts to connect to individual destinations
* better messages
* properly parse destinations
* try to find a function from the most matching to the least matching
* log added streaming destinations
* rotate destinations bypassing a node in the middle that does not accept our connection
* break the loops properly
* use typedef to define callbacks
* capabilities negotiation during streaming
* functions exposed upstream based on capabilities; compression disabled per node persisting reconnects; always try to connect with all capabilities
* restore functionality to lookup functions
* better logging of capabilities
* remove old versions from capabilities when a newer version is there
* fix formatting
* optimization for plugins.d rrdlabels to avoid creating and destructing dictionaries all the time
* delayed health initialization for rrddim and rrdset
* cleanup health initialization
* fix for popen() not returning the right value
* add health worker jobs for initializing rrdset and rrddim
* added content type support for functions; apps.plugin permanent function to display all the processes
* fixes for functions parameters parsing in apps.plugin
* fix for process matching in apps.plugin
* first working function for apps.plugin
* Dashboard ACL is disabled for functions; Function errors are all in JSON format
* apps.plugin function processes returns json table
* use json_escape_string() to escape message
* fix formatting
* apps.plugin exposes all its metrics to function processes
* fix json formatting when filtering out some rows
* reopen the internal pipe of rrdpush in case of errors
* misplaced statement
* do not use buffer->len
* support for GLOBAL functions (functions that are not linked to a chart)
* added /api/v1/functions endpoint; removed format from the FUNCTIONS api;
* swagger documentation about the new api end points
* added plugins.d documentation about functions
* never re-close a file
* remove unnecessary ifdef
* fixed issues identified by codacy
* fix for null label value
* make edit-config copy-and-paste friendly
* Revert "make edit-config copy-and-paste friendly"
This reverts commit 54500c0e0a97f65a0c66c4d34e966f6a9056698e.
* reworked sender handshake to fix coverity findings
* timeout is zero, for both send_timeout() and recv_timeout()
* properly detect that parent closed the socket
* support caching of function responses; limit function response to 10MB; added protection from malformed function responses
* disabled excessive logging
* added units to apps.plugin function processes and normalized all values to be human readable
* shorter field names
* fixed issues reported
* fixed apps.plugin error response; tested that pluginsd can properly handle faulty responses
* use double linked list macros for double linked list management
* faster apps.plugin function printing by minimizing file operations
* added memory percentage
* fix compatibility issues with older compilers and FreeBSD
* rrdpush sender code cleanup; rrdhost structure cleanup from sender flags and variables
* fix leftover variable in ifdef
* apps.plugin: do not call detach from the thread; exit immediately when input is broken
* exclude AR charts from health
* flush cleaner; prefer sender output
* clarity
* do not fill the cbuffer if not connected
* fix
* don't enable host->sender if streaming is not enabled; send host label updates to parent
* functions are only available through ACLK
* Prepared statement reports only in dev mode
* fix AR chart detection
* fix for streaming not enabling itself
* more cleanup of sender and receiver structures
* moved read-only flags and configuration options to rrdhost->options
* fixed merge with master
* fix for incomplete rename
* prevent service thread from working on charts that are being collected
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
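
One bullet above says the hand-written list handling was replaced with double linked list macros. The internals of those macros are not part of this page, so the following is only a minimal sketch of the logic that the removed lines in `get_pid_entry()` and `del_pid_entry()` carried out by hand. `struct node`, `list_prepend()` and `list_remove()` are hypothetical names used for illustration, not netdata identifiers.

```c
#include <stddef.h>

/* Hypothetical node type; in apps.plugin the real type is struct pid_stat. */
struct node {
    int pid;
    struct node *prev, *next;
};

/* Insert n at the head of the list, as the removed code in get_pid_entry() did. */
void list_prepend(struct node **root, struct node *n) {
    n->prev = NULL;
    n->next = *root;
    if (*root)
        (*root)->prev = n;
    *root = n;
}

/* Unlink n from the list, as the removed code in del_pid_entry() did. */
void list_remove(struct node **root, struct node *n) {
    if (*root == n)
        *root = n->next;
    if (n->next)
        n->next->prev = n->prev;
    if (n->prev)
        n->prev->next = n->next;
    n->prev = n->next = NULL;
}
```

Note that the commit calls an append macro where the old code inserted at the head, so the ordering of the real list may differ from this sketch.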
67 files changed, 4373 insertions, 1604 deletions
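
The commit message says netdata's popen() now opens both the child's stdin and stdout, and the `claim/claim.c` hunk shows the resulting call shape: `netdata_popen(command, &pid, &fp_child_input)` returns the output stream, and `netdata_pclose(fp_child_input, fp_child_output, pid)` closes both. The sketch below illustrates that idea with plain POSIX calls. It is not netdata's implementation; `sketch_popen()`, `sketch_pclose()` and `demo_uppercase()` are hypothetical names, and the `tr` command is only an example child.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for netdata_popen(): runs `command` through /bin/sh,
 * returns a stream for reading the child's stdout and stores a stream for
 * writing to the child's stdin in *fp_child_input. Returns NULL on failure. */
FILE *sketch_popen(const char *command, pid_t *pid, FILE **fp_child_input) {
    int to_child[2], from_child[2];

    if (pipe(to_child) == -1)
        return NULL;
    if (pipe(from_child) == -1) {
        close(to_child[0]); close(to_child[1]);
        return NULL;
    }

    *pid = fork();
    if (*pid == -1) {
        close(to_child[0]); close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        return NULL;
    }

    if (*pid == 0) {
        /* child: connect the pipe ends to stdin and stdout, then exec */
        dup2(to_child[0], STDIN_FILENO);
        dup2(from_child[1], STDOUT_FILENO);
        close(to_child[0]); close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127); /* only reached if exec failed */
    }

    /* parent: keep the write end towards the child and the read end from it */
    close(to_child[0]);
    close(from_child[1]);
    *fp_child_input = fdopen(to_child[1], "w");
    FILE *fp_child_output = fdopen(from_child[0], "r");
    if (!*fp_child_input || !fp_child_output) {
        if (*fp_child_input) fclose(*fp_child_input); else close(to_child[1]);
        if (fp_child_output) fclose(fp_child_output); else close(from_child[0]);
        *fp_child_input = NULL;
        waitpid(*pid, NULL, 0);
        return NULL;
    }
    return fp_child_output;
}

/* Hypothetical stand-in for netdata_pclose(): closes whichever streams are
 * still open, waits for the child and returns its exit code (-1 on error). */
int sketch_pclose(FILE *fp_child_input, FILE *fp_child_output, pid_t pid) {
    int status = 0;
    if (fp_child_input) fclose(fp_child_input);
    if (fp_child_output) fclose(fp_child_output);
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Usage: send one line to `tr` and read back the uppercased reply. */
int demo_uppercase(const char *line, char *reply, int reply_size) {
    pid_t pid;
    FILE *fp_child_input = NULL;
    FILE *fp_child_output = sketch_popen("tr a-z A-Z", &pid, &fp_child_input);
    if (!fp_child_output)
        return -1;

    fputs(line, fp_child_input);
    fclose(fp_child_input); /* EOF lets tr finish and flush its output */

    if (!fgets(reply, reply_size, fp_child_output))
        reply[0] = '\0';
    return sketch_pclose(NULL, fp_child_output, pid);
}
```

Having both streams is what makes the request/response style of plugin functions possible: the parent can write a request to the child and read the answer on the other stream, which a read-only popen() cannot do.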
diff --git a/CMakeLists.txt b/CMakeLists.txt
index e0c114ab4b..a23f80e35d 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -731,6 +731,8 @@ set(RRD_PLUGIN_FILES
         database/rrddimvar.c
         database/rrddimvar.h
         database/rrdfamily.c
+        database/rrdfunctions.c
+        database/rrdfunctions.h
         database/rrdhost.c
         database/rrdlabels.c
         database/rrd.c
diff --git a/Makefile.am b/Makefile.am
index 34026e9888..1a18150314 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -441,6 +441,8 @@ RRD_PLUGIN_FILES = \
     database/rrd.c \
     database/rrd.h \
     database/rrdset.c \
+    database/rrdfunctions.c \
+    database/rrdfunctions.h \
     database/rrdsetvar.c \
     database/rrdsetvar.h \
     database/rrdvar.c \
diff --git a/aclk/aclk_query.c b/aclk/aclk_query.c
index 132d5fe18f..b30da60e1f 100644
--- a/aclk/aclk_query.c
+++ b/aclk/aclk_query.c
@@ -80,7 +80,7 @@ static int http_api_v2(struct aclk_query_thread *query_thr, aclk_query_t query)
     strcpy(w->origin, "*"); // Simulate web_client_create_on_fd()
     w->cookie1[0] = 0; // Simulate web_client_create_on_fd()
     w->cookie2[0] = 0; // Simulate web_client_create_on_fd()
-    w->acl = 0x1f;
+    w->acl = WEB_CLIENT_ACL_ACLK;

     buffer_strcat(log_buffer, query->data.http_api_v2.query);
     size_t size = 0;
diff --git a/claim/claim.c b/claim/claim.c
index b0d40ecf86..c9b1712f86 100644
--- a/claim/claim.c
+++ b/claim/claim.c
@@ -58,7 +58,7 @@ void claim_agent(char *claiming_arguments)
     int exit_code;
     pid_t command_pid;
     char command_buffer[CLAIMING_COMMAND_LENGTH + 1];
-    FILE *fp;
+    FILE *fp_child_output, *fp_child_input;

     // This is guaranteed to be set early in main via post_conf_load()
     char *cloud_base_url = appconfig_get(&cloud_config, CONFIG_SECTION_GLOBAL, "cloud base url", NULL);
@@ -84,14 +84,14 @@ void claim_agent(char *claiming_arguments)
               claiming_arguments);

     info("Executing agent claiming command 'netdata-claim.sh'");
-    fp = mypopen(command_buffer, &command_pid);
-    if(!fp) {
+    fp_child_output = netdata_popen(command_buffer, &command_pid, &fp_child_input);
+    if(!fp_child_output) {
         error("Cannot popen(\"%s\").", command_buffer);
         return;
     }
     info("Waiting for claiming command to finish.");
-    while (fgets(command_buffer, CLAIMING_COMMAND_LENGTH, fp) != NULL) {;}
-    exit_code = mypclose(fp, command_pid);
+    while (fgets(command_buffer, CLAIMING_COMMAND_LENGTH, fp_child_output) != NULL) {;}
+    exit_code = netdata_pclose(fp_child_input, fp_child_output, command_pid);
     info("Agent claiming command returned with code %d", exit_code);
     if (0 == exit_code) {
         load_claiming_state();
diff --git a/collectors/apps.plugin/apps_plugin.c b/collectors/apps.plugin/apps_plugin.c
index 8521e078e8..212374e828 100644
--- a/collectors/apps.plugin/apps_plugin.c
+++ b/collectors/apps.plugin/apps_plugin.c
@@ -10,6 +10,11 @@
 #include "libnetdata/libnetdata.h"
 #include "libnetdata/required_dummies.h"

+#define APPS_PLUGIN_FUNCTIONS() do { \
+        fprintf(stdout, PLUGINSD_KEYWORD_FUNCTION " \"processes\" 10 \"Detailed information on the currently running processes on this node\"\n"); \
+    } while(0)
+
+
 // ----------------------------------------------------------------------------
 // debugging

@@ -191,6 +196,18 @@ struct pid_on_target {
     struct pid_on_target *next;
 };

+struct openfds {
+    kernel_uint_t files;
+    kernel_uint_t pipes;
+    kernel_uint_t sockets;
+    kernel_uint_t inotifies;
+    kernel_uint_t eventfds;
+    kernel_uint_t timerfds;
+    kernel_uint_t signalfds;
+    kernel_uint_t eventpolls;
+    kernel_uint_t other;
+};
+
 // ----------------------------------------------------------------------------
 // target
 //
@@ -235,24 +252,16 @@ struct target {

     kernel_uint_t io_logical_bytes_read;
     kernel_uint_t io_logical_bytes_written;
-    // kernel_uint_t io_read_calls;
-    // kernel_uint_t io_write_calls;
+    kernel_uint_t io_read_calls;
+    kernel_uint_t io_write_calls;
     kernel_uint_t io_storage_bytes_read;
     kernel_uint_t io_storage_bytes_written;
-    // kernel_uint_t io_cancelled_write_bytes;
+    kernel_uint_t io_cancelled_write_bytes;

     int *target_fds;
     int target_fds_size;

-    kernel_uint_t openfiles;
-    kernel_uint_t openpipes;
-    kernel_uint_t opensockets;
-    kernel_uint_t openinotifies;
-    kernel_uint_t openeventfds;
-    kernel_uint_t opentimerfds;
-    kernel_uint_t opensignalfds;
-    kernel_uint_t openeventpolls;
-    kernel_uint_t openother;
+    struct openfds openfds;

     kernel_uint_t starttime;
     kernel_uint_t collected_starttime;
@@ -382,22 +391,24 @@ struct pid_stat {

     kernel_uint_t io_logical_bytes_read_raw;
     kernel_uint_t io_logical_bytes_written_raw;
-    // kernel_uint_t io_read_calls_raw;
-    // kernel_uint_t io_write_calls_raw;
+    kernel_uint_t io_read_calls_raw;
+    kernel_uint_t io_write_calls_raw;
     kernel_uint_t io_storage_bytes_read_raw;
     kernel_uint_t io_storage_bytes_written_raw;
-    // kernel_uint_t io_cancelled_write_bytes_raw;
+    kernel_uint_t io_cancelled_write_bytes_raw;

     kernel_uint_t io_logical_bytes_read;
     kernel_uint_t io_logical_bytes_written;
-    // kernel_uint_t io_read_calls;
-    // kernel_uint_t io_write_calls;
+    kernel_uint_t io_read_calls;
+    kernel_uint_t io_write_calls;
     kernel_uint_t io_storage_bytes_read;
     kernel_uint_t io_storage_bytes_written;
-    // kernel_uint_t io_cancelled_write_bytes;
+    kernel_uint_t io_cancelled_write_bytes;

     struct pid_fd *fds; // array of fds it uses
-    size_t fds_size; // the size of the fds array
+    size_t fds_size; // the size of the fds array
+
+    struct openfds openfds;

     int children_count; // number of processes directly referencing this
     unsigned char keep:1; // 1 when we need to keep this process in memory even after it exited
@@ -447,8 +458,8 @@ kernel_uint_t global_uptime;

 static struct pid_stat
     *root_of_pids = NULL, // global list of all processes running
-    **all_pids = NULL; // to avoid allocations, we pre-allocate the
-    // the entire pid space.
+    **all_pids = NULL; // to avoid allocations, we pre-allocate
+    // a pointer for each pid in the entire pid space.

 static size_t
     all_pids_count = 0; // the number of processes running
@@ -961,15 +972,10 @@ static inline struct pid_stat *get_pid_entry(pid_t pid) {
     p->fds = mallocz(sizeof(struct pid_fd) * MAX_SPARE_FDS);
     p->fds_size = MAX_SPARE_FDS;
     init_pid_fds(p, 0, p->fds_size);
-
-    if(likely(root_of_pids))
-        root_of_pids->prev = p;
-
-    p->next = root_of_pids;
-    root_of_pids = p;
-
     p->pid = pid;

+    DOUBLE_LINKED_LIST_APPEND_UNSAFE(root_of_pids, p, prev, next);
+
     all_pids[pid] = p;
     all_pids_count++;

@@ -986,11 +992,7 @@ static inline void del_pid_entry(pid_t pid) {

     debug_log("process %d %s exited, deleting it.", pid, p->comm);

-    if(root_of_pids == p)
-        root_of_pids = p->next;
-
-    if(p->next) p->next->prev = p->prev;
-    if(p->prev) p->prev->next = p->next;
+    DOUBLE_LINKED_LIST_REMOVE_UNSAFE(root_of_pids, p, prev, next);

     // free the filename
 #ifndef __FreeBSD__
@@ -1554,21 +1556,21 @@ static inline int read_proc_pid_io(struct pid_stat *p, void *ptr) {
 #else
     pid_incremental_rate(io, p->io_logical_bytes_read, str2kernel_uint_t(procfile_lineword(ff, 0, 1)));
     pid_incremental_rate(io, p->io_logical_bytes_written, str2kernel_uint_t(procfile_lineword(ff, 1, 1)));
-    // pid_incremental_rate(io, p->io_read_calls, str2kernel_uint_t(procfile_lineword(ff, 2, 1)));
-    // pid_incremental_rate(io, p->io_write_calls, str2kernel_uint_t(procfile_lineword(ff, 3, 1)));
+    pid_incremental_rate(io, p->io_read_calls, str2kernel_uint_t(procfile_lineword(ff, 2, 1)));
+    pid_incremental_rate(io, p->io_write_calls, str2kernel_uint_t(procfile_lineword(ff, 3, 1)));
     pid_incremental_rate(io, p->io_storage_bytes_read, str2kernel_uint_t(procfile_lineword(ff, 4, 1)));
     pid_incremental_rate(io, p->io_storage_bytes_written, str2kernel_uint_t(procfile_lineword(ff, 5, 1)));
-    // pid_incremental_rate(io, p->io_cancelled_write_bytes, str2kernel_uint_t(procfile_lineword(ff, 6, 1)));
+    pid_incremental_rate(io, p->io_cancelled_write_bytes, str2kernel_uint_t(procfile_lineword(ff, 6, 1)));
 #endif

     if(unlikely(global_iterations_counter == 1)) {
         p->io_logical_bytes_read = 0;
         p->io_logical_bytes_written = 0;
-        // p->io_read_calls = 0;
-        // p->io_write_calls = 0;
+        p->io_read_calls = 0;
+        p->io_write_calls = 0;
         p->io_storage_bytes_read = 0;
         p->io_storage_bytes_written = 0;
-        // p->io_cancelled_write_bytes = 0;
+        p->io_cancelled_write_bytes = 0;
     }

     return 1;
@@ -1577,11 +1579,11 @@ static inline int read_proc_pid_io(struct pid_stat *p, void *ptr) {
 cleanup:
     p->io_logical_bytes_read = 0;
     p->io_logical_bytes_written = 0;
-    // p->io_read_calls = 0;
-    // p->io_write_calls = 0;
+    p->io_read_calls = 0;
+    p->io_write_calls = 0;
     p->io_storage_bytes_read = 0;
     p->io_storage_bytes_written = 0;
-    // p->io_cancelled_write_bytes = 0;
+    p->io_cancelled_write_bytes = 0;
     return 0;
 #endif
 }
@@ -1888,7 +1890,7 @@ static inline int file_descriptor_find_or_add(const char *name, uint32_t hash) {
         else if(likely(strncmp(name, "anon_inode:", 11) == 0)) {
             const char *t = &name[11];

-            if(strcmp(t, "inotify") == 0) type = FILETYPE_INOTIFY;
+            if(strcmp(t, "inotify") == 0) type = FILETYPE_INOTIFY;
             else if(strcmp(t, "[eventfd]") == 0) type = FILETYPE_EVENTFD;
             else if(strcmp(t, "[eventpoll]") == 0) type = FILETYPE_EVENTPOLL;
             else if(strcmp(t, "[timerfd]") == 0) type = FILETYPE_TIMERFD;
@@ -1943,7 +1945,6 @@ static inline void cleanup_negative_pid_fds(struct pid_stat *p) {

 static inline void init_pid_fds(struct pid_stat *p, size_t first, size_t size) {
     struct pid_fd *pfd = &p->fds[first], *pfdend = &p->fds[first + size];
-    size_t i = first;

     while(pfd < pfdend) {
 #ifndef __FreeBSD__
@@ -1951,7 +1952,6 @@ static inline void init_pid_fds(struct pid_stat *p, size_t first, size_t size) {
 #endif
         clear_pid_fd(pfd);
         pfd++;
-        i++;
     }
 }

@@ -2904,24 +2904,24 @@ static size_t zero_all_targets(struct target *root) {

         w->io_logical_bytes_read = 0;
         w->io_logical_bytes_written = 0;
-        // w->io_read_calls = 0;
-        // w->io_write_calls = 0;
+        w->io_read_calls = 0;
+        w->io_write_calls = 0;
         w->io_storage_bytes_read = 0;
         w->io_storage_bytes_written = 0;
-        // w->io_cancelled_write_bytes = 0;
+        w->io_cancelled_write_bytes = 0;

         // zero file counters
         if(w->target_fds) {
             memset(w->target_fds, 0, sizeof(int) * w->target_fds_size);
-            w->openfiles = 0;
-            w->openpipes = 0;
-            w->opensockets = 0;
-            w->openinotifies = 0;
-            w->openeventfds = 0;
-            w->opentimerfds = 0;
-            w->opensignalfds = 0;
-            w->openeventpolls = 0;
-            w->openother = 0;
+            w->openfds.files = 0;
+            w->openfds.pipes = 0;
+            w->openfds.sockets = 0;
+            w->openfds.inotifies = 0;
+            w->openfds.eventfds = 0;
+            w->openfds.timerfds = 0;
+            w->openfds.signalfds = 0;
+            w->openfds.eventpolls = 0;
+            w->openfds.other = 0;
         }

         w->collected_starttime = 0;
@@ -2956,60 +2956,64 @@ static inline void reallocate_target_fds(struct target *w) {
     }
 }

-static inline void aggregate_fd_on_target(int fd, struct target *w) {
-    if(unlikely(!w))
-        return;
-
-    if(unlikely(w->target_fds[fd])) {
-        // it is already aggregated
-        // just increase its usage counter
-        w->target_fds[fd]++;
-        return;
-    }
-
-    // increase its usage counter
-    // so that we will not add it again
-    w->target_fds[fd]++;
-
-    switch(all_files[fd].type) {
+static void aggregage_fd_type_on_openfds(FD_FILETYPE type, struct openfds *openfds) {
+    switch(type) {
         case FILETYPE_FILE:
-            w->openfiles++;
+            openfds->files++;
             break;

         case FILETYPE_PIPE:
-            w->openpipes++;
+            openfds->pipes++;
             break;

         case FILETYPE_SOCKET:
-            w->opensockets++;
+            openfds->sockets++;
             break;

         case FILETYPE_INOTIFY:
-            w->openinotifies++;
+            openfds->inotifies++;
             break;

         case FILETYPE_EVENTFD:
-            w->openeventfds++;
+            openfds->eventfds++;
             break;

         case FILETYPE_TIMERFD:
-            w->opentimerfds++;
+            openfds->timerfds++;
             break;

         case FILETYPE_SIGNALFD:
-            w->opensignalfds++;
+            openfds->signalfds++;
             break;

         case FILETYPE_EVENTPOLL:
-            w->openeventpolls++;
+            openfds->eventpolls++;
             break;

         case FILETYPE_OTHER:
-            w->openother++;
+            openfds->other++;
             break;
     }
 }

+static inline void aggregate_fd_on_target(int fd, struct target *w) {
+    if(unlikely(!w))
+        return;
+
+    if(unlikely(w->target_fds[fd])) {
+        // it is al