netdata - Mirror of https://github.com/netdata/netdata

Age	Commit message	Author
2022-06-03	Fix locking access to chart labels (#13064)	Stelios Fragkakis
	No write lock required
2022-06-02	Check for host labels when linking alerts for children (#13053)	Emmanuel Vasilakis
	check for host labels when linking alerts
2022-06-01	Schedule retention message calculation to a worker thread (#13039)	Stelios Fragkakis
	* Move aclk_update_retention to the proper header file * Do a scan but avoid going through all the dimensions if we have too much to delete -- do not generate a retention message in that case * Schedule the retention calculation to a worker * Adjust messages in the access log * Fix compilation errors with --disable-cloud
2022-05-31	Fix the retry count and netdata_exit check when running an sqlite3_step command (#13040)	Stelios Fragkakis
	* Move retry count to the header file * Add SQL_MAX_RETRY count and fix the netdata_exit check
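The retry logic named in the commit above can be illustrated with a minimal sketch. It is not the Netdata implementation: the value of `SQL_MAX_RETRY`, the `step_fn` callback type, and the `busy_then_done` helper are assumptions made for illustration, and the real code would call `sqlite3_step()` directly. Only the two result codes mirror SQLite's documented `SQLITE_BUSY` (5) and `SQLITE_DONE` (101).

```c
#include <stdbool.h>
#include <stddef.h>

#define SQL_MAX_RETRY 100   /* assumed value; the commit only names the constant */
#define STEP_BUSY 5         /* same numeric value as SQLITE_BUSY */
#define STEP_DONE 101       /* same numeric value as SQLITE_DONE */

/* Stand-in for the agent-wide shutdown flag mentioned in the commit. */
static volatile bool netdata_exit = false;

/* Hypothetical callback type that plays the role of sqlite3_step(). */
typedef int (*step_fn)(void *ctx);

/* Call step() until it stops reporting "busy", the retry budget is used up,
 * or the agent is shutting down. Returns the last result code; if shutdown
 * was already requested, no call is made and STEP_BUSY is returned. */
static int step_with_retry(step_fn step, void *ctx, int *attempts)
{
    int rc = STEP_BUSY;
    int tries = 0;

    while (tries < SQL_MAX_RETRY && !netdata_exit) {
        rc = step(ctx);
        tries++;
        if (rc != STEP_BUSY)
            break;
    }

    if (attempts != NULL)
        *attempts = tries;
    return rc;
}

/* Hypothetical step function: reports busy for the first *ctx calls. */
static int busy_then_done(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0) {
        (*remaining)--;
        return STEP_BUSY;
    }
    return STEP_DONE;
}
```

With `busy_then_done` primed to report busy three times, `step_with_retry` succeeds on the fourth attempt. A production version would normally sleep briefly between attempts.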
2022-05-31	When sending a dimension for the first time, make sure there is a non-zero created_at timestamp (#13035)	Stelios Fragkakis

2022-05-30	Check return value and log an error on failure (#13037)	Stelios Fragkakis

2022-05-30	Trigger queue removed alerts on health log exchange with cloud (#12954)	Emmanuel Vasilakis
	trigger queue removed on health log exchange with cloud
2022-05-25	Delay children chart obsoletion check (#12992)	Emmanuel Vasilakis
	* wait until 2 minutes after the last chart was received before running the obsoletion check * turn write locks into read locks
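The delay described in the commit above reduces to a timestamp comparison. The sketch below is an assumption about how such a check could look; the constant name, the function name, and the convention that `0` means "no chart received yet" are all hypothetical.

```c
#include <stdbool.h>
#include <time.h>

/* Two minutes, as stated in the commit message. The name is hypothetical. */
#define OBSOLETION_DELAY_SECS 120

/* Returns true once at least OBSOLETION_DELAY_SECS have passed since the
 * last chart definition arrived from the child. A timestamp of 0 is treated
 * as "nothing received yet", so the check is never due in that case. */
static bool obsoletion_check_due(time_t last_chart_received, time_t now)
{
    if (last_chart_received == 0)
        return false;
    return (now - last_chart_received) >= OBSOLETION_DELAY_SECS;
}
```

A caller would refresh `last_chart_received` every time a chart definition comes in, so a child that is still sending charts keeps postponing the check.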
2022-05-24	Don't expose the chart definition to streaming if there is no metadata change (#12990)	Stelios Fragkakis
	* Only clear the RRDSET_FLAG_UPSTREAM_EXPOSED chart flag if metadata has changed * Handle modification of units as well * Initialize old_units in the chart state
2022-05-24	Stream and advertise metric correlations to the cloud (#12940)	Emmanuel Vasilakis
	* stream and advertise mc to the cloud * better reporting * remove log * remove aclk debug
2022-05-24	Faster queries (#12988)	Costa Tsaousis
	* faster rrdeng_load_metric_next() * no need to check validity for number - already done at the query side * solve discrepancy between query create and free * inline unpack_storage_number
2022-05-23	modify code to resolve compile warning issue (#12969)	kklionz

2022-05-20	cleanup and optimize rrdeng_load_metric_next() (#12966)	Costa Tsaousis
	* cleanup and optimize rrdeng_load_metric_next() * fixed typo
2022-05-20	Apply some logic to possible streaming destinations (#12866)	Emmanuel Vasilakis
	* replace connect_to_one_of with connect_to_one_of_destinations * move functions from socket.c * use sizeof * move current destination pointer to host * formatting * use snprintfz * get entries in same order * handle single destination as before (or when it is the last of the list), instead of skipping it every other loop * try other destinations on ssl problem
2022-05-19	Cleanup chart hash and map tables on startup (#12956)	Stelios Fragkakis

2022-05-18	Defer the dimension payload check to the ACLK sync thread (#12951)	Stelios Fragkakis
	Defer payload check to the aclk sync thread
2022-05-18	Optimize the dimensions option store to the metadata database (#12952)	Stelios Fragkakis
	* Add a flag to "cache" the latest hidden status written in the database * rrddim hide and unhide will check "cached" state, update the database if needed and set the cache flag accordingly * Check the dimension option and only do the database update if the cached state is different
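The caching idea in the commit above can be sketched as follows. The flag names, the `struct dim` layout, and the `db_store_hidden` stand-in are hypothetical; the sketch also assumes the database starts out recording "not hidden", which matches a zeroed cache flag.

```c
#include <stdbool.h>

/* Hypothetical bit values, not Netdata's actual definitions. */
#define DIM_OPTION_HIDDEN    (1u << 0)  /* current in-memory hidden state */
#define DIM_FLAG_META_HIDDEN (1u << 1)  /* cache: last hidden state written to the database */

struct dim {
    unsigned options;
    unsigned flags;
    int db_writes;      /* counts simulated database updates */
};

/* Stand-in for the real metadata database update. */
static void db_store_hidden(struct dim *d, bool hidden)
{
    (void)hidden;
    d->db_writes++;
}

/* Update the in-memory option and touch the database only when the cached
 * state differs from the requested one. */
static void dim_set_hidden(struct dim *d, bool hide)
{
    if (hide)
        d->options |= DIM_OPTION_HIDDEN;
    else
        d->options &= ~DIM_OPTION_HIDDEN;

    bool cached = (d->flags & DIM_FLAG_META_HIDDEN) != 0;
    if (cached == hide)
        return;         /* the database already holds this state */

    db_store_hidden(d, hide);
    if (hide)
        d->flags |= DIM_FLAG_META_HIDDEN;
    else
        d->flags &= ~DIM_FLAG_META_HIDDEN;
}
```

Hiding an already hidden dimension repeatedly then costs a flag comparison instead of a database write on every call.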
2022-05-17	Adjust the dimension liveness status check (#12933)	Stelios Fragkakis
	* Mark a chart to be exposed only if dimension is created or metadata changes * Add a calculate liveness for the dimension for collected to non collected (live -> stale) and vice versa * queue_dimension_to_aclk will have the rrdset and either 0 or last collected time If 0 then it will be marked as live else it will be marked as stale and last collected time will be sent to the cloud * Add an extra parameter to indicate if the payload check should be done in the database or it has been done already * Queue dimension sets dimension liveness and queues the exact payload to store in the database * Fix compilation error when --disable-cloud is specified
2022-05-16	chore: add links to SQLite init options in the src code (#12920)	Ilya Mashchenko

2022-05-16	user configurable sqlite PRAGMAs (#12917)	Costa Tsaousis
	* user configurable sqlite PRAGMAs * added cache size
2022-05-16	Fix the log entry for incoming cloud start streaming commands (#12908)	Stelios Fragkakis
	Add the correct requested chart sequence id from the cloud and also record the local one we have
2022-05-14	Fix release channel in the node info message (#12905)	Stelios Fragkakis
	Fix release channel in the node info message (was hardcoded)
2022-05-13	Implements new capability fields in aclk_schemas (#12602)	Timotej S
	use new capability fields
2022-05-12	Pause alert pushes to the cloud (#12852)	Emmanuel Vasilakis
	* pause and unpause alert pushes to the cloud * move the check to when creating opcode * check for worker * remove previous checks for dbsync_workers. queue and clean aclk_alert tables even if no workers are up. Get wc then check before setting pause * remove sync_syncronize * remove sync_synchronize_2
2022-05-12	Fix compilation warnings (#12886)	Vladimir Kobal

2022-05-11	Configurable storage engine for Netdata agents: step 2 (#12808)	Adrien Béraud

2022-05-10	Initialize the metadata database when performing dbengine stress test (#12861)	Stelios Fragkakis
	* Remove error (no real value) * Add a parameter to create an in-memory database for stress testing * Add a new parameter to the stresstest command to set the number of desired libuv worker threads
2022-05-09	Add a database checkpoint command (#12859)	Stelios Fragkakis

2022-05-09	Workers utilization charts (#12807)	Costa Tsaousis
	* initial version of worker utilization * working example * without mutexes * monitoring DBENGINE, ACLKSYNC, WEB workers * added charts to monitor worker usage * fixed charts units * updated contexts * updated priorities * added documentation * converted threads to stacked chart * One query per query thread * Revert "One query per query thread" This reverts commit 6aeb391f5987c3c6ba2864b559fd7f0cd64b14d3. * fixed priority for web charts * read worker cpu utilization from proc * read workers cpu utilization via /proc/self/task/PID/stat, so that we have cpu utilization even when the jobs are too long to finish within our update_every frequency * disabled web server cpu utilization monitoring - it is now monitored by worker utilization * tight integration of worker utilization to web server * monitoring statsd worker threads * code cleanup and renaming of variables * constrained worker and statistics conflict to just one variable * support for rendering jobs per type * better priorities and removed the total jobs chart * added busy time in ms per job type * added proc.plugin monitoring, switch clock to MONOTONIC_RAW if available, global statistics now cleans up old worker threads * isolated worker thread families * added cgroups.plugin workers * remove unneeded dimensions when the expected worker is just one * plugins.d and streaming monitoring * rebased; support worker_is_busy() to be called one after another * added diskspace plugin monitoring * added tc.plugin monitoring * added ML threads monitoring * don't create dimensions and charts that are not needed * fix crash when job types are added on the fly * added timex and idlejitter plugins; collected heartbeat statistics; reworked heartbeat according to the POSIX * the right name is heartbeat for this chart * monitor streaming senders * added streaming senders to global stats * prevent division by zero * added clock_init() to external C plugins * added freebsd and macos plugins * added freebsd and macos to global statistics * don't use new as a variable; address compiler warnings on FreeBSD and MacOS * refactored contexts to be unique; added health threads monitoring Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
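The per-job busy-time accounting mentioned in the commit above, including back-to-back busy calls, can be sketched like this. It is a single-threaded illustration under stated assumptions: the `worker_stats` type, the `MAX_JOB_TYPES` limit, and the function names are hypothetical, and timestamps are passed in explicitly (in microseconds) rather than read from a monotonic clock.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_JOB_TYPES 8     /* assumed limit for this sketch */

typedef struct {
    int current_job;                    /* -1 while the worker is idle */
    uint64_t job_started_ut;            /* start time of the current job */
    uint64_t busy_ut[MAX_JOB_TYPES];    /* accumulated busy time per job type */
    uint64_t jobs_done[MAX_JOB_TYPES];  /* completed jobs per job type */
} worker_stats;

static void worker_init(worker_stats *w)
{
    w->current_job = -1;
    w->job_started_ut = 0;
    for (int i = 0; i < MAX_JOB_TYPES; i++) {
        w->busy_ut[i] = 0;
        w->jobs_done[i] = 0;
    }
}

/* Close the running job, if any, and charge its elapsed time to its type. */
static void worker_close_job(worker_stats *w, uint64_t now_ut)
{
    if (w->current_job < 0)
        return;
    w->busy_ut[w->current_job] += now_ut - w->job_started_ut;
    w->jobs_done[w->current_job]++;
    w->current_job = -1;
}

/* Mark the worker busy with a job type. Calling this twice in a row is
 * allowed: the previous job is closed first. */
static void worker_busy(worker_stats *w, int job_type, uint64_t now_ut)
{
    assert(job_type >= 0 && job_type < MAX_JOB_TYPES);
    worker_close_job(w, now_ut);
    w->current_job = job_type;
    w->job_started_ut = now_ut;
}

static void worker_idle(worker_stats *w, uint64_t now_ut)
{
    worker_close_job(w, now_ut);
}
```

A statistics thread could then turn the `busy_ut` deltas between two collection rounds into a utilization percentage, guarding against a zero-length interval before dividing.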
2022-05-09	Resolve coverity issues (#12846)	Stelios Fragkakis
	- Variable "hostname" going out of scope leaks the storage it points to. - Null-checking "rd->name" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
2022-05-07	fix memory leaks and mismatches of the use of the z functions for allocations (#12841)	Costa Tsaousis
	* fix mismatches of the use of the z functions for allocations * when there was no memory, the original name of the dimensions was freed, and with a mismatching deallocator * fixed memory leak at rrdeng_load_metric_() functions * fixed memory leak on exit of plugins.d parser * fixed memory leak on plugins and streaming receiver threads exit * fixed compiler warnings
2022-05-07	speedup queries by providing optimization in the main loop (#12811)	Costa Tsaousis

2022-05-06	Set a page wait timeout to 1 second (#12836)	Stelios Fragkakis
	Retry 3 times, to queue the page request before giving up
2022-05-05	Reduce the number of messages logged to one that sums up the number of metrics ignored (#12829)	Stelios Fragkakis

2022-05-05	Add chart filtering parameter to the allmetrics API query (#12820)	Vladimir Kobal
	* Add chart filtering in the allmetrics API call * Fix compilation warnings * Remove unnecessary function * Update the documentation * Apply suggestions from code review * Check for filter instead of filter_string * Do not check both - chart id and name for prometheus and shell formats * Fix unit tests Co-authored-by: Ilya Mashchenko <ilya@netdata.cloud>
2022-05-05	Cleanup node instance (#12825)	Stelios Fragkakis

2022-05-05	Fill missing removed events after a crash (#12803)	Emmanuel Vasilakis
	* inject removed events when missing from sqlite * pass flag * remove log message
2022-05-04	Optimize linking of foreach alarms to dimensions. (#12813)	vkalintiris
	* Optimize linking of foreach alarms to dimensions. Keep the write-lock on host but use read-lock for charts because it's easy to verify that they aren't modified by the linking of foreach alarms to dimensions. * Protect alarm log modifications with write-lock.
2022-05-04	* Add a parameter for the libuv worker threads to pre-initialize (#12814)	Stelios Fragkakis
	* Set the thread name for libuv threads to LIBUV_WORKER * Make sure the dbengine thread has the correct name
2022-05-04	Metric correlations (#12582)	Emmanuel Vasilakis
	* initial attempt at metric correlations * fix loop * simplify struct * change json * get points from query * comment * dont lock the host as much * add a configuration option to enable/disable metric correlations * remove KSfbar from header file * lock charts * add timeout * cast multiplication * add licencing info * better licencing * use onewayalloc * destroy owa
2022-05-04	fix!: do not replace a hyphen in the chart name with an underscore (#12812)	Ilya Mashchenko

2022-05-03	Improve agent cloud chart synchronization (#12655)	Stelios Fragkakis
	* Try to queue dimension always when: Trying to clean obsolete charts If chart has been sent and liveness apparently changed * delay rotation and skip chart check if not sent to cloud * No need to CLEAR flag during database rotation Do not clear chart ACLK status for dimension requests * Change payload_sent to return timestamp of submitted message * Clear the dimension ACLK flag if we are processing all the charts again * Check if dimension is already queued to ACLK and ignore it If queue fails then reset it to retry Already try to queue the dimension * Improve dimension cleanup during the retention message calculation * Change queue_dimension_to_aclk to return void * If no time range for this dimension then assume it is deleted * Start streaming for inactive nodes * Remove dead code * Correctly report hostname in the access log * Schedule a dimension deletion without trying to submit a message immediately * Enable dimension cleanup -- also delete dimension if not found in the dbengine files Free hostname
2022-05-03	Remove per chart configuration. (#12728)	vkalintiris
	After https://github.com/netdata/netdata/pull/12209 per-chart configuration was used for (a) enabling/disabling a chart, and (b) renaming dimensions. Regarding the first use case: We already have component-specific configuration options/flags to finely control how a chart should behave. E.g. "send charts matching" in streaming, "charts to skip from training" in ML, etc. If we really need the concept of a disabled chart, we can add a host-level simple pattern to match these charts. Regarding the second use case: It's not obvious why we'd need to provide support for remapping dimension names through a chart-specific configuration from the core agent. If the need arises, we could add such support at the right place, i.e. an exporter/streaming config section. This will allow each flag to act independently of the others and avoid managing flag-state manually at various places, e.g.: ``` if(unlikely(!rrdset_flag_check(st, RRDSET_FLAG_ENABLED))) { rrdset_flag_clear(st, RRDSET_FLAG_UPSTREAM_SEND); rrdset_flag_set(st, RRDSET_FLAG_UPSTREAM_IGNORE); } ... ```
2022-05-03	Skip dimension deletion on free (temp fix) (#12777)	Stelios Fragkakis

2022-05-03	Configurable storage engine for Netdata agents: step 1 (#12776)	Adrien Béraud
	* rrd: move API structures out of rrddim_volatile In C, unlike C++, it's not possible to reference a nested structure from outside this structure. Since we later want to use rrddim_query_ops and rrddim_collect_ops separately from rrddim_volatile, move these nested structures out. * rrd: use opaque handle types for different memory modes
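The two steps in the commit above (file-scope operation tables and opaque handles) can be sketched as follows. All names here (`collect_handle`, `collect_ops`, `dimension_state`, the `ram_*` backend) are hypothetical and do not reproduce the fields of `rrddim_collect_ops`. In a real code base the handle's layout would live in the backend's own `.c` file; it appears in the same block here only to keep the sketch self-contained.

```c
#include <stdlib.h>

/* Opaque handle: code outside a backend sees only this incomplete type. */
typedef struct collect_handle collect_handle;

/* Operation table declared at file scope rather than nested inside the
 * dimension structure, so it can be used on its own. */
struct collect_ops {
    collect_handle *(*init)(void);
    void (*store)(collect_handle *h, double value);
    double (*finalize)(collect_handle *h);  /* returns the total and frees the handle */
};

/* The dimension keeps an ops table plus a handle it never looks inside. */
struct dimension_state {
    struct collect_ops ops;
    collect_handle *handle;
};

/* One hypothetical in-memory backend. Only this part knows the layout. */
struct collect_handle {
    double total;
};

static collect_handle *ram_init(void)
{
    return calloc(1, sizeof(collect_handle));
}

static void ram_store(collect_handle *h, double value)
{
    h->total += value;
}

static double ram_finalize(collect_handle *h)
{
    double total = h->total;
    free(h);
    return total;
}

static const struct collect_ops ram_collect_ops = {
    ram_init, ram_store, ram_finalize
};
```

Selecting a storage engine then amounts to copying a different ops table into `dimension_state`; callers go through `ops.store()` and never depend on what the handle contains.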
2022-05-03	Check for chart obsoletion on children re-connections (#12707)	Emmanuel Vasilakis
	* check for chart obsoletion on children connections * use rrdset_is_obsolete
2022-05-03	One way allocator to double the speed of parallel context queries (#12787)	Costa Tsaousis
	* one way allocator to speed up context queries * fixed a bug while expanding memory pages * reworked for clarity and finally fixed the bug of allocating memory beyond the page size * further optimize allocation step to minimize the number of allocations made * implement strdup with memcpy instead of strcpy * added documentation * prevent an uninitialized use of owa * added callocz() interface * integrate onewayalloc everywhere - apart sql queries * one way allocator is now used in context queries using archived charts in sql * align on the size of pointers * forgotten freez() * removed not needed memcpys * give unique names to global variables to avoid conflicts with system definitions
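A one way (arena) allocator of the kind described in the commit above can be sketched as follows. This is an illustration under assumptions, not Netdata's `onewayalloc` code: the `owa_*` names, the default page size, and the page layout are hypothetical. It reflects points the commit message mentions: page-based growth, requests larger than a page, pointer-size alignment, a calloc-style interface, a `memcpy`-based strdup, and a single destroy call.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct owa_page {
    struct owa_page *next;
    size_t size;    /* usable bytes that follow this header */
    size_t used;    /* bytes already handed out */
} owa_page;

typedef struct {
    owa_page *head;     /* most recently added page */
    size_t page_size;   /* default usable size of a new page */
} owa;

/* Round up to a multiple of the pointer size. */
static size_t owa_align(size_t n)
{
    size_t a = sizeof(void *);
    return (n + a - 1) & ~(a - 1);
}

static owa *owa_create(size_t page_size)
{
    owa *a = malloc(sizeof(*a));
    if (a == NULL)
        return NULL;
    a->head = NULL;
    a->page_size = owa_align(page_size ? page_size : 4096);
    return a;
}

/* Bump-pointer allocation; individual blocks are never freed. */
static void *owa_malloc(owa *a, size_t n)
{
    n = owa_align(n ? n : 1);
    owa_page *p = a->head;

    if (p == NULL || p->size - p->used < n) {
        /* A request larger than the default page gets its own page. */
        size_t size = n > a->page_size ? n : a->page_size;
        p = malloc(sizeof(*p) + size);
        if (p == NULL)
            return NULL;
        p->size = size;
        p->used = 0;
        p->next = a->head;
        a->head = p;
    }

    void *ptr = (unsigned char *)(p + 1) + p->used;
    p->used += n;
    return ptr;
}

static void *owa_calloc(owa *a, size_t count, size_t size)
{
    if (size != 0 && count > SIZE_MAX / size)
        return NULL;    /* multiplication would overflow */
    void *ptr = owa_malloc(a, count * size);
    if (ptr != NULL)
        memset(ptr, 0, count * size);
    return ptr;
}

/* strdup built on memcpy: the length is already known after strlen(). */
static char *owa_strdup(owa *a, const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = owa_malloc(a, len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Release every page at once; this is the only way memory is returned. */
static void owa_destroy(owa *a)
{
    owa_page *p = a->head;
    while (p != NULL) {
        owa_page *next = p->next;
        free(p);
        p = next;
    }
    free(a);
}
```

The design trades the ability to free individual blocks for very cheap allocation, which suits a query that builds many short-lived objects and discards them together. Blocks are aligned to the pointer size only, so types with stricter alignment requirements would need a larger alignment.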
2022-05-02	Reduce alert events sent to the cloud. (#12544)	Emmanuel Vasilakis
	* filter * update filter * queue removed directly * more * logging * cleanup * cleanup 2 * cleanup 3 * finalize instead of reset
2022-05-02	Avoid clearing already unset flags. (#12727)	vkalintiris
	If memory mode is save, map or ram the set's flags are initialized to 0. Otherwise, the set is calloc'd which will make the set have 0 flags.
2022-05-02	Make atomics a hard-dep. (#12730)	vkalintiris
	They are used extensively throughout our code base, and without support for them the resulting agent is not thread-safe.
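Why atomics matter for shared flag words can be shown with a small sketch. It uses C11 `<stdatomic.h>`, which is an assumption of this example and not necessarily the mechanism the agent relies on; the `chart` type and the flag names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits for illustration. */
#define CHART_FLAG_OBSOLETE (1u << 0)
#define CHART_FLAG_EXPOSED  (1u << 1)

typedef struct {
    atomic_uint flags;  /* shared between collector, query and streaming threads */
} chart;

/* Each helper is a single atomic read-modify-write or load, so two threads
 * changing different bits cannot overwrite each other's update. */
static void chart_flag_set(chart *c, unsigned f)
{
    atomic_fetch_or(&c->flags, f);
}

static void chart_flag_clear(chart *c, unsigned f)
{
    atomic_fetch_and(&c->flags, ~f);
}

static bool chart_flag_check(chart *c, unsigned f)
{
    return (atomic_load(&c->flags) & f) != 0;
}
```

With a plain `unsigned` and `flags |= f`, a concurrent `flags &= ~g` on another thread could be lost, because each statement is a separate load, modify and store.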