Age | Commit message | Author
2023-01-25 | Introduce the new Structure of the documentation (#13915) | Fotis Voutsas
* Moving the cloud docs under /docs/cloud (previous location: netdata/learn/*)
* Added metadata on almost every document of the old learn site for the new ingest process of learn.
* Map old learn documents to their best fit as topic related docs.
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
Co-authored-by: DShreve2 <david@netdata.cloud>
Co-authored-by: hugovalente-pm <hugo@netdata.cloud>
2023-01-25 | Misc SSL improvements (#14317) | Emmanuel Vasilakis
* set web client to poll when ssl error want read or write
* turn to function
2023-01-25 | minor - kaitai for netdata datafiles (#14312) | Timotej S
2023-01-25 | fix(proc.plugin): add "cpu" label to per core util% charts (#14322) | Ilya Mashchenko
* fix(proc.plugin): add cpu label to per core util% charts
* fix codeql warning
2023-01-25 | Fix up codeowners based on recent staffing changes. (#14320) | Austin S. Hemmelgarn
2023-01-25 | [ci skip] Update changelog and version for nightly build: v1.37.0-182-nightly. | netdatabot
2023-01-25 | DBENGINE v2 - improvements part 8 (#14319) | Costa Tsaousis
* cache 100 pages for each size our tiers need
* smarter page caching
* account the caching structures
* dynamic max number of cached pages
* make variables const to ensure they are not changed
* make sure replication timestamps do not go to the future
* replication now sends chart and dimension states atomically; replication receivers ignore chart and dimension states when rbegin is also ignored
* make sure all pages are flushed on shutdown
* take into account empty points too
* when recalculating retention update first_time_s on metrics only when they are bigger
* Report the datafile number we use to recalculate retention
* Report the datafile number we use to recalculate retention
* rotate db at startup
* make query plans overlap
* Calculate properly first time s
* updated event labels
* negative page caching fix
* Attempt to create missing tables on query failure
* Attempt to create missing tables on query failure (part 2)
* negative page caching for all gaps, to eliminate jv2 scans
* Fix unittest
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
2023-01-24 | [ci skip] Update changelog and version for nightly build: v1.37.0-180-nightly. | netdatabot
2023-01-23 | DBENGINE v2 - improvements part 7 (#14307) | Costa Tsaousis
* run cleanup in workers
* when there is a discrepancy between update every, fix it
* fix the other occurrences of metric update every mismatch
* allow resetting the same timestamp
* validate flushed pages before committing them to disk
* initialize collection with the latest time in mrg
* these should be static functions
* acquire metrics for writing to detect multiple data collections of the same metric
* print the uuid of the metric that is collected twice
* log the discrepancies of completed pages
* 1 second tolerance
* unify validation of pages and related logging across dbengine
* make do_flush_pages() thread safe
* flush pages runs on libuv workers
* added uv events to tp workers
* dont cross datafile spinlock and rwlock
* should be unlock
* prevent the creation of multiple datafiles
* break an infinite replication loop
* do not log the expansion of the replication window due to start streaming
* log all invalid pages with internal checks
* do not shutdown event loop threads
* add information about collected page events, to find the root cause of invalid collected pages
* rewrite of the gap filling to fix the invalid collected pages problem
* handle multiple collections of the same metric gracefully
* added log about main cache page conflicts; fix gap filling once again...
* keep track of the first metric writer
* it should be an internal fatal - it does not harm users
* do not check of future timestamps on collected pages, since we inherit the clock of the children; do not check collected pages validity without internal checks
* prevent negative replication completion percentage
* internal error for the discrepancy of mrg
* better logging of dbengine new metrics collection
* without internal checks it is unused
* prevent pluginsd crash on exit due to calling pthread_cancel() on an exited thread
* renames and atomics everywhere
* if a datafile cannot be acquired for deletion during shutdown, continue - this can happen when there are hot pages in open cache referencing it
* Debug for context load
* rrdcontext uuid debug
* rrddim uuid debug
* rrdeng uuid debug
* Revert "rrdeng uuid debug" This reverts commit 393da190826a582e7e6cc90771bf91b175826d8b.
* Revert "rrddim uuid debug" This reverts commit 72150b30408294f141b19afcfb35abd7c34777d8.
* Revert "rrdcontext uuid debug" This reverts commit 2c3b940dc23f460226e9b2a6861c214e840044d0.
* Revert "Debug for context load" This reverts commit 0d880fc1589f128524e0b47abd9ff0714283ce3b.
* do not use legacy uuids on multihost dbs
* thread safety for journalfile size
* handle other cases of inconsistent collected pages
* make health thread check if it should be running in key loops
* do not log uuids
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
2023-01-23 | remove mqtt-c from websockets (#14181) | Timotej S
* remove MQTT-C (MQTT 3 implementation) from buildsystem
2023-01-23 | Update kickstart script to use new DEB infrastructure. (#14301) | Austin S. Hemmelgarn
* Update kickstart script to use new DEB infrastructure.
* Fix package filename suffix handling for DEB packages.
* Fix the DEB package availability check to use new repo URL.
2023-01-21 | [ci skip] Update changelog and version for nightly build: v1.37.0-176-nightly. | netdatabot
2023-01-20 | DBENGINE v2 - improvements part 6 (#14299) | Costa Tsaousis
* query preparation runs before extent reads
* populate mrg in parallel
* fix formatting warning
* first search for a metric then add it if it does not exist
* Revert "first search for a metric then add it if it does not exist" This reverts commit 4afa6461fcce859d03f1c9cf56dd3b5933ee5ebc.
* Revert "fix formatting warning" This reverts commit 49473493f7f1c3399b5635a573d3c6ed2b6e46f3.
* Revert "populate mrg in parallel" This reverts commit a40166708d4222f6329904f109114c47c44ca666.
* merge journalfiles metrics before committing them to MRG
* Revert "merge journalfiles metrics before committing them to MRG" This reverts commit 50c8934e23a0a09ea4da80e3f88290e46496ad92.
* Revert "Revert "populate mrg in parallel"" This reverts commit f4c149d2ab7a8c9af24a10f95438a0d662a5cf8a.
* Revert "Revert "fix formatting warning"" This reverts commit 78298ff9efc49806ded029f5f1e868cc42e8f6eb.
* Revert "Revert "first search for a metric then add it if it does not exist"" This reverts commit 997b9c813b290882ba18a8c44bf73f9ee5480adf.
* preload first and last journal files v2
* fix formatting warning
* parallel loading of tiers; cleanup of ctx structures
* use half the cores
* add partitions to metrics registry
* revert accidental change
* parallel processing according to MRG partitions; dont recalculate retention on exit
2023-01-20 | Fix Exporting compilation error (#14306) | thiagoftsm
2023-01-20 | Fixes required to make the agent work without crashes on MacOS (#14304) | vkalintiris
* Bump the soft limit on open FDs to the max. On systems with a low soft-limit for open file descriptors, the agent would fail to initialize all dbengine tiers.
* Iterate the right number of dbengine tiers. For whatever reason, this was causing a crash on MacOS but it was running "correctly" on Linux systems.
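The first fix above raises the soft limit on open file descriptors up to the hard limit. The agent is written in C; the Python sketch below only illustrates the idea with the standard `resource` module. The function name `raise_nofile_soft_limit` is hypothetical and is not taken from the Netdata source.

```python
import resource


def raise_nofile_soft_limit():
    """Raise the soft RLIMIT_NOFILE to the hard limit.

    Returns the (soft, hard) pair in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft != hard:
        try:
            # An unprivileged process may raise its soft limit up to the hard limit.
            resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
        except (ValueError, OSError):
            # Some platforms reject the hard value (for example when it is
            # reported as unlimited); keep the current limits in that case.
            pass
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

Calling this once at startup, before opening per-tier database files, is the assumed usage; where exactly the agent does this is not stated in the commit message.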
2023-01-20 | bump go.d.plugin to v0.49.2 (#14305) | Ilya Mashchenko
2023-01-20 | add consul license expiration time alarm (#14298) | Ilya Mashchenko
* add consul license alarm
* minor
2023-01-20 | Fix build CI jobs. (#14302) | Austin S. Hemmelgarn
Switch from OCI to Docker images for transferring containers between CI steps.
2023-01-20 | Switch to self-hosted infrastructure for DEB package distribution. (#14290) | Austin S. Hemmelgarn
* Update DEB repository configuration to new infrastructure.
* Fix typo.
2023-01-20 | Update to SQLITE version 3.40.1 (#14282) | Stelios Fragkakis
Update to sqlite3 version 3.40.1
2023-01-20 | [ci skip] Update changelog and version for nightly build: v1.37.0-167-nightly. | netdatabot
2023-01-20 | track memory footprint of Netdata (#14294) | Costa Tsaousis
* track memory footprint of Netdata
* track db modes alloc/ram/save/map
* track system info; track sender and receiver
* fixes
* more fixes
* track workers memory, onewayalloc memory; unify judyhs size estimation
* track replication structures and buffers
* Properly clear host RRDHOST_FLAG_METADATA_UPDATE flag
* flush the replication buffer every 1000 times the circular buffer is found empty
* dont take timestamp too frequently in sender loop
* sender buffers are not used by the same thread as the sender, so they were never recreated - fixed it
* free sender thread buffer on replication threads when replication is idle
* use the last sender flag as a timestamp of the last buffer recreation
* free cbuffer before reconnecting
* recreate cbuffer on every flush
* timings for journal v2 loading
* inlining of metric and cache functions
* aral likely/unlikely
* free left-over thread buffers
* fix NULL pointer dereference in replication
* free sender thread buffer on sender thread too
* mark ctx as used before flushing
* better logging on ctx datafiles closing
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
2023-01-19 | Fix macos struct definition. (#14297) | vkalintiris
2023-01-19 | Remove archivedcharts endpoint, optimize indices (#14296) | Stelios Fragkakis
Remove undocumented archivedcharts endpoint. Use context endpoint instead
Remove unused functions to lookup chart and dimension UUIDs
Drop/Add new index for dimension and chart tables
2023-01-19 | Improve file descriptor closing loops (#14213) | Dim-P
* Add for_each_open_fd() and fix second instance of _SC_OPEN_MAX
* Add argument to allow exclusion of file descriptors from closing
* Fix clang error
* Address review comments
* Use close_range() if possible and replace macros with enums
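The approach described above (enumerate the descriptors that are actually open instead of looping up to `_SC_OPEN_MAX`, and allow some of them to be excluded from closing) can be sketched in Python. This is an illustration only, not the C code from the commit: `list_open_fds` and `close_open_fds` are hypothetical names, and reading `/proc/self/fd` or `/dev/fd` is an assumption about the platform, with a probing fallback.

```python
import os


def list_open_fds():
    """Return a sorted list of file descriptors open in this process."""
    for fd_dir in ("/proc/self/fd", "/dev/fd"):
        try:
            names = os.listdir(fd_dir)
        except OSError:
            continue
        fds = []
        for name in names:
            if not name.isdigit():
                continue
            fd = int(name)
            try:
                # Filters out the descriptor that was used for the listing itself.
                os.fstat(fd)
            except OSError:
                continue
            fds.append(fd)
        return sorted(fds)
    # Fallback: probe every possible descriptor up to the process limit.
    limit = os.sysconf("SC_OPEN_MAX")
    if limit <= 0:
        limit = 1024  # assumed default when the limit is indeterminate
    fds = []
    for fd in range(limit):
        try:
            os.fstat(fd)
        except OSError:
            continue
        fds.append(fd)
    return fds


def close_open_fds(exclude=(), lowest=3):
    """Close every open descriptor >= lowest that is not in exclude.

    Returns the list of descriptors that were closed.
    """
    keep = set(exclude)
    closed = []
    for fd in list_open_fds():
        if fd < lowest or fd in keep:
            continue
        try:
            os.close(fd)
        except OSError:
            continue
        closed.append(fd)
    return closed
```

The default `lowest=3` leaves stdin, stdout and stderr open. On Linux, the `close_range()` system call mentioned in the commit can close a contiguous range in one call; the sketch closes descriptors one by one so that exclusions stay simple.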
2023-01-18 | DBENGINE v2 - improvements part 5 (#14289) | Costa Tsaousis
* cleanup journal v2 mounts periodically
* fix for last commit
* re-enable loading page from disk when the arrangement of pages requires it
* Remove unused statistics
* Estimate diskspace when the current datafile is full and queue a rotate command (Currently it will not attempt to estimate end size for journals) Queue a command to check quota on startup per tier
* apps.plugin now exposes RSS chart
* shorter thread names to make debugging easier, since thread names can only be 15 characters
* more thread names fixes
* allow an apps_groups.conf target to be pid 0 or 1
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
2023-01-18 | allow multiple local-build/static-install options in kickstart (#14287) | Ilya Mashchenko
2023-01-18 | fix(alarms): treat 0 processors as unknown in load_cpu_number (#14286) | Ilya Mashchenko
2023-01-18 | Revert health to run in a single thread (#14244) | Emmanuel Vasilakis
* revert health to single thread
* remove getting now
* use a health struct
* remove commented code
* cleanup health log from metadata
* dont check for METADATA_UPDATE
2023-01-18 | [ci skip] Update changelog and version for nightly build: v1.37.0-158-nightly. | netdatabot
2023-01-17 | DBENGINE v2 - improvements part 4 (#14285) | Costa Tsaousis
do not lock the entire datafile list while a datafile is being deleted
2023-01-17 | fix for dbengine2 improvements part 3 (#14284) | Costa Tsaousis
return true when the file is already unmounted
2023-01-17 | DBENGINE v2 - improvements part 3 (#14269) | Costa Tsaousis
* reduce journal v2 shared memory using madvise() - not integrated yet
* working attempt to minimize dbengine shared memory
* never call willneed - let the kernel decide which parts of each file are really needed
* journal files get MADV_RANDOM
* dont call MADV_DONTNEED too frequently
* madvise() is always called with the journal unlocked but referenced
* call madvise() even less frequently
* added chart for monitoring database events
* turn batch mode on under critical conditions
* max size to evict is 1/4 of the max
* fix max size to evict calculation
* use dbengine_page/extent_alloc/free to pages and extents allocations, tracking also the size of these allocations at free time
* fix calculation for batch evictions
* allow main and open cache to have as many evictors as needed
* control inline evictors for each cache; report different levels of cache pressure on every cache evaluation
* more inline evictors for extent cache
* bypass max inline evictors above critical level
* current cache usage has to be taken
* re-arrange items in journalfile
* updated docs - work in progress
* more docs work
* more docs work
* Map / unmap journal file
* draw.io diagram for dbengine operations
* updated dbengine diagram
* updated docs
* journal files v2 now get mapped and unmapped as needed
* unmap journal v2 immediately when getting retention
* mmap and munmap do not block queries evaluating journal files v2
* have only one unmap function
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
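Several items above revolve around memory-mapping journal files and giving the kernel an access hint with `madvise()`. As a rough illustration only (the dbengine is C, and the helper name `map_readonly_random` is hypothetical), the same pattern looks like this with Python's standard `mmap` module, assuming a platform that defines `MADV_RANDOM`:

```python
import mmap
import os
import tempfile


def map_readonly_random(path):
    """Map a non-empty file read-only and hint that access will be random."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the mapping stays valid after f is closed.
        mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # MADV_RANDOM asks the kernel not to read ahead aggressively.
    mapping.madvise(mmap.MADV_RANDOM)
    return mapping


if __name__ == "__main__":
    # Stand-in for a journal file: two pages of filler bytes.
    fd, path = tempfile.mkstemp()
    os.write(fd, b"x" * mmap.PAGESIZE * 2)
    os.close(fd)
    mapping = map_readonly_random(path)
    print(len(mapping))
    mapping.close()  # unmap as soon as the data is no longer needed
    os.remove(path)
```

Closing the mapping when it is no longer needed mirrors the "mapped and unmapped as needed" item in the list; how the agent decides when to unmap is not described in the commit message.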
2023-01-17 | Store host and claim info in sqlite as soon as possible (#14263) | Emmanuel Vasilakis
* store host and claim info as soon as possible
* no need to set the flag
* check for metasync_worker.loop
2023-01-17 | Make sure variables are streamed after SENDER_CONNECTED flag is set (#14283) | Emmanuel Vasilakis
make sure vars are sent after SENDER_CONNECTED flag is set
2023-01-17 | readme updates (#14224) | Andrew Maguire
* Clarify the cloud option in the Readme
* Add Netdata Cloud image
* reviewed some typos and did small tweaks
* small typo
* Update README.md Co-authored-by: Chris Akritidis <43294513+cakrit@users.noreply.github.com>
* Update README.md Co-authored-by: Chris Akritidis <43294513+cakrit@users.noreply.github.com>
* Update README.md
* typo
* grammar
* small add
* clean up
Co-authored-by: Alex Malkov <alex.a.malkov@gmail.com>
Co-authored-by: hugovalente-pm <hugo@netdata.cloud>
Co-authored-by: Chris Akritidis <43294513+cakrit@users.noreply.github.com>
2023-01-17 | Check session variable before resuming it (#14279) | Emmanuel Vasilakis
check session variable before resuming it
2023-01-17 | Update infographic image on main README (#14276) | Chris Akritidis
2023-01-17 | minor - add kaitaistruct for journal v2 files (#14267) | Timotej S
add kaitaistruct for journal v2 files
2023-01-17 | [ci skip] Update changelog and version for nightly build: v1.37.0-148-nightly. | netdatabot
2023-01-16 | Fix conditional in matrix generation for packaging jobs. (#14274) | Austin S. Hemmelgarn
This will bring us back to running only native packaging jobs on most PRs instead of running all packaging jobs on all PRs.
2023-01-16 | Set an explicit timeout in updater checks. (#14273) | Austin S. Hemmelgarn
If it takes more than an hour to run the updater, something has gone horribly wrong, so just kill it instead of letting it keep running.
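The idea of killing a run that exceeds a fixed time budget can be sketched with the `timeout` parameter of Python's `subprocess.run`. This is not the updater check's actual code; `run_with_timeout` is a hypothetical helper, and the one-hour default simply mirrors the description above.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s=3600):
    """Run cmd and return its exit code, or None if it was killed on timeout."""
    try:
        # subprocess.run() kills the child and raises TimeoutExpired on timeout.
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        return None


if __name__ == "__main__":
    # Hypothetical stand-in for a hung updater: sleeps longer than the timeout.
    hung = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(hung, timeout_s=1))
```

Returning `None` for a timed-out run keeps it distinguishable from any real exit code, so a caller can report a hang separately from an ordinary failure.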
2023-01-16 | Replace individual collector images/links on infographic (#14262) | Chris Akritidis
Replace individual collector images/links with one Link to www.netdata.cloud/integrations instead
2023-01-16 | bump go.d.plugin to v0.49.1 (#14275) | Ilya Mashchenko
2023-01-16 | Switch nightlies to GitHub releases. (#14020) | Austin S. Hemmelgarn
* Switch nightlies to GitHub releases. Instead of using GCS.
* Fix CI handling.
* Fix handling of download URLs for nightly builds.
* Fix handling of redirects for consolidated artifact checks.
* Avoid redirect issues with the test environment.
* Add more info to logs for updater checks.
* Ignore redirect issues for updater checks.
* Fix base URL handling in updater.
* Dump post-update info in CI before checking if the update worked.
* Special-case a version of `latest` in updater. This is to allow CI to work correctly.
* Update nightly release badge in README.md.
* Fix updater check variable name.
* Add a comment documenting the magic number in parse_version.
2023-01-16 | fix(packaging): fix cpu/memory metrics when running inside LXC container as systemd service (#14255) | Ilya Mashchenko
Fixes https://github.com/netdata/netdata/issues/14238
2023-01-16 | Adds some introspection into the MQTT_WSS (#14039) | Timotej S
2023-01-14 | [ci skip] Update changelog and version for nightly build: v1.37.0-140-nightly. | netdatabot
2023-01-14 | Skip cross-platform validation when preparing releases. (#14268) | Austin S. Hemmelgarn
2023-01-14 | [ci skip] Update changelog and version for nightly build: v1.37.0-138-nightly. | netdatabot