Age | Commit message (Collapse) | Author |
|
Fix CID 385073 Uninitialized scalar variable
|
|
|
|
* generate and store an event_hash_id
* transmit to cloud
* transmit to the cloud
|
|
|
|
only queue an alert to cloud when it's inserted
|
|
|
|
|
|
* use chart labels to filter alerts
* add entry to readme
* support chart_label=val val2 val3
* docs updates
* more docs
* use rc not rt
|
|
|
|
Co-authored-by: Ilya Mashchenko <ilya@netdata.cloud>
|
|
|
|
* pull aclk schemas
* resolve capas
* handle checkpoints and removed from health
* build with disable-cloud
* codacy 1
* misc changes
* one more char in hash
* free buffer
* change topic
* misc fixes
* skip removed alert variables
* change hash functions
* use create and destroy for compatibility with older openssl
|
|
* initial webrtc setup
* missing files
* rewrite of webrtc integration
* initialization and cleanup of webrtc connections
* make it compile without libdatachannel
* add missing webrtc_initialize() function when webrtc is not enabled
* make c++17 optional
* add build/m4/ax_compiler_vendor.m4
* add ax_cxx_compile_stdcxx.m4
* added new m4 files to makefile.am
* id all webrtc connections
* show warning when webrtc is disabled
* fixed message
* moved all webrtc error checking inside webrtc.cpp
* working webrtc connection establishment and cleanup
* remove obsolete code
* rewrote webrtc code in C to remove dependency for c++17
* fixed left-over reference
* detect binary and text messages
* minor fix
* naming of webrtc threads
* added webrtc configuration
* fix for thread_get_name_np()
* smaller web_client memory footprint
* universal web clients cache
* free web clients every 100 uses
* webrtc is now enabled by default only when compiled with internal checks
* webrtc responses to /api/ requests, including LZ4 compression
* fix for binary and text messages
* web_client_cache is now global
* unification of the internal web server API, for web requests, aclk request, webrtc requests
* more cleanup and unification of web client timings
* fixed compiler warnings
* update sent and received bytes
* eliminated almost all big buffers in web client
* registry now uses the new json generation
* cookies are now an array; fixed redirects
* fix redirects, again
* write cookies directly to the header buffer, eliminating the need for cookie structures in web client
* reset the has_cookies flag
* gathered all web client cleanup to one function
* fixes redirects
* added summary.globals in /api/v2/data response
* ars to arc in /api/v2/data
* properly handle host impersonation
* set the context of mem.numa_nodes
|
|
|
|
|
|
* Add commit_stats metrics to BTRFS section
* Add error_stats metrics (per device) to BTRFS section
* Simplify commit stats variables and chart ids/names
* Add basic BTRFS error alarms.
Configured to trip whenever any of the error dimensions is non-zero.
* Add chart descriptions for new charts.
* Remove duplicate code
* Comment out some debugging code
* Always create error stats dimensions, even if zero
* Show rate of commits and commit duration instead of totals
* Change current commit metrics to absolute from incremental
* Change commits dimension to absolute and add separate commits time share chart
* Rename 'device_' rrdlabels to 'filesystem_'
* Replace all snprintf() calls with snprintfz()
* Fix codacy warning
* Provide separate error charts for each filesystem device
* Accept code review suggestions for more descriptive context and labels
Co-authored-by: Ilya Mashchenko <ilya@netdata.cloud>
* Add 'device' prefix to id, name, title of errors chart
* Add 'device_id' label to device_errors
* Update health.d/btrfs.conf to match new errors charts
* Remove commented out code
* Do not disable all BTRFS metrics collection if only commit_stats is missing
* Do not disable all BTRFS metrics collection if only error_stats is missing
* Fix bug of BTRFS device add/remove not being detected properly
* Fix double free() error when deleting a device
* Update dashboard info with bold tags
Co-authored-by: Ilya Mashchenko <ilya@netdata.cloud>
---------
Co-authored-by: Austin S. Hemmelgarn <austin@netdata.cloud>
Co-authored-by: Ilya Mashchenko <ilya@netdata.cloud>
|
|
|
|
fix js tag
|
|
* update health/notifications/README.md
* Alerta notification method documentation update
* Amazon SNS and some alerta changes
* notification methods improvements
* alerta refinements
* awssns refinements
* custom alert refinements
* discord refinements
* email notifications documentation update
* flock notifications documentation update
* alerta edits
* awssns edits
* custom notification method edits
* discord edits
* email notification method edits
* flock edits
* IRC notifications update
* Kavenegar notifications documentation update
* matrix notifications documentation update
* messagebird notifications documentation update
* msteams notifications documentation update
* wording change
* twilio notifications documentation update
* telegram notifications documentation update
* syslog notifications update
* smstools3 notifications documentation update
* rocket.chat notifications documentation update
* pushover notifications documentation update
* pushbullet notifications documentation update
* prowl notifications documentation update
* pagerduty notifications documentation update
* remove comments from example configuration
* slight wording changes
* more notification methods documentation updates
* slack notification documentation update
* add config options to the notifications Introduction page
* crop image twilio
* crop image slack
* crop image pushover
* crop images pushbullet
* crop image messagebird
* crop image kavenegar
|
|
|
|
* query timestamps are now pre-determined and alignment on timestamps is guaranteed
* turn internal_fatal() to internal_error() to investigate the issue
* handle query when no data exist in the db
* check for non NULL dict when running dictionary garbage collect
* support API v2 requests via ACLK
* add nodes detailed information to /api/v2/nodes
* fixed keys and added dummy nodes for completeness
* added nodes_hard_hash, alerts_hard_hash, alerts_soft_hash; started building a nodes status object to reflect the current status of a node
* make sure replication does not double count charts that are already being replicated
* expose min and max in sts structures
* added view_minimum_value and view_maximum_value; percentage calculation is now an additional pass on the data, removed from formatters; absolute value calculation is now done at the query level, removed from formatters
* respect trimming in percentage calculation; updated swagger
* api/v2/weights preparative work to support multi-node queries - still single node though
* multi-node /api/v2/weights endpoint, supporting all the filtering parameters of /api/v2/data
* when passing the raw option, the query exposes the hidden dimensions
* fix compilation issues on older systems
* the query engine now calculates per dimension min, max, sum, count, anomaly count
* use the macro to calculate storage point anomaly rate
* weights endpoint exposing version hashes
* weights method=value shows min, max, average, sum, count, anomaly count, anomaly rate
* query: expose RESET flag; do not add the same point multiple times to the aggregated point
* weights: more compact output
* weights requests can be interrupted
* all /api/v2 requests can be interrupted and timeout
* allow relative timestamps in weights
* fix macos compilation warnings
* Revert "fix macos compilation warnings"
This reverts commit 8a1d24e41e9b58de566ac59f0c4b1c465bcc0592.
* /api/v2/data group-by now works on dimension names, not ids
* /api/v2/weights does not query metrics without retention and new output format
* /api/v2/weights value and anomaly queries do context queries when contexts are filtered; query timeout is now always in ms
|
|
absolute links (#14779)
* Update REFERENCE.md
* replace redirected links
* format the files
* fix redirected link
* format the file
* replace hardcoded links
|
|
* Update freebsd.md
* Update REFERENCE.md
* Update README.md
* Update COLLECTORS.md
|
|
* preparation for /api/v2/contexts
* working /api/v2/contexts
* add anomaly rate information in all statistics; when sum-count is requested, return sums and counts instead of averages
* minor fix
* query target now accurately counts hosts, contexts, instances, dimensions, metrics
* cleanup /api/v2/contexts
* full text search with /api/v2/contexts
* simple patterns now support the option to search ignoring case
* full text search API with /api/v2/q
* simple pattern execution optimization
* do not show q when not given
* full text search accounting
* separated /api/v2/nodes from /api/v2/contexts
* fix ssv queries for group_by
* count query instances queried and failed per context and host
* split rrdcontext.c to multiple files
* add query totals
* fix anomaly rate calculation; provide "ni" for indexing hosts
* do not generate zero valued members
* faster calculation of anomaly rate; by just summing integers for each db points and doing math once for every generated point
* fix typo when printing dimensions totals
* added option minify to remove spaces and newlines from JSON output
* send instance ids and names when they differ
* do not add in query target dimensions, instances, contexts and hosts for which there is no retention in the current timeframe
* fix for the previous + renames and code cleanup
* when a dimension is filtered, include in the response all the other dimensions that are selectable
* do not add nodes that do not have retention in the current window
* move selection of dimensions to query_dimension_add(), instead of query_metric_add()
* increase the pre-processing capacity of queries
* generate instance fqdn ids and names only when they are needed
* provide detailed statistics about tiers retention, queries, points, update_every
* late allocation of query dimensions
* cleanup
* more cleanup
* support for annotations per displayed point, RESET and PARTIAL
* new type annotations
* if a chart is not linked to contexts and it is collected, link it when it is collected
* make ML run reentrant
* make ML rrdr query synchronous
* optimize replication memory allocation of replication_sort_entry
* change units to percentage, when requesting a coefficient of variation, or a percentage query
* initialize replication before starting main threads
* properly decrement no room requests counter
* propagate the non-zero flag to group-by
* the same by avoiding the extra loop
* respect non-zero in all dimension arrays
* remove dictionary garbage collection from dictionary_entries() and dictionary_version()
* be more verbose when jv2 indexing is postponed
* prevent infinite loop
* use hidden dimensions even when dimensions pattern is unset
* traverse hosts using dictionaries
* fix dictionary unittests
|
|
* make the title metadata the H1
* Update collectors/python.d.plugin/zscores/README.md
* Update libnetdata/ebpf/README.md
* Update ml/README.md
* Update libnetdata/string/README.md
---------
Co-authored-by: Chris Akritidis <43294513+cakrit@users.noreply.github.com>
|
|
|
|
* reorg batch 1
* remove duplicate cloud custom dashboard and agent dashboard
* Simplify the root web/README
* Merge streaming references
* Make enable streaming the overall intro and the README the reference
* Remove reference-streaming document
* Update overview pages
|
|
* Reorg getting started
* Streaming
* Remove blanks
* Fix up to cloud alerts
|
|
* Remove varlib_dir from host structure
* Remove unused parameter
|
|
|
|
* fix broken link in ml/README.md
* fix broken link across all files
* fix broken link across all files
* fix broken links and remove what's next sections
* fix broken links and remove what's next section
* Remove related links sections with broken links that link to removed files
* fix broken links
|
|
|
|
* Change titles of agent alert notifications
* Reintroduce netdata for iot
* Eliminate guides category, merge health config docs
* Rename setup to configuration
* Codacy fixes and move health config reference
|
|
optimization (#14493)
* first work on standardizing json formatting
* renamed old grouping to time_grouping and added group_by
* add dummy functions to enable compilation
* buffer json api work
* jsonwrap opening with buffer_json_X() functions
* cleanup
* storage for quotes
* optimize buffer printing for both numbers and strings
* removed ; from define
* contexts json generation using the new json functions
* fix buffer overflow at unit test
* weights endpoint using new json api
* fixes to weights endpoint
* check buffer overflow on all buffer functions
* do synchronous queries for weights
* buffer_flush() now resets json state too
* content type typedef
* print double values that are above the max 64-bit value
* str2ndd() can now parse values above UINT64_MAX
* faster number parsing by avoiding double calculations as much as possible
* faster number parsing
* faster hex parsing
* accurate printing and parsing of double values, even for very large numbers that cannot fit in 64bit integers
* full printing and parsing without using library functions - and related unit tests
* added IEEE754 streaming capability to enable streaming of double values in hex
* streaming and replication to transfer all values in hex
* use our own str2ndd for set2
* remove subnormal check from ieee
* base64 encoding for numbers, instead of hex
* when increasing double precision, also make sure the fractional number printed is aligned to the wanted precision
* str2ndd_encoded() parses all encoding formats, including integers
* prevent uninitialized use
* /api/v1/info using the new json API
* Fix error when compiling with --disable-ml
* Remove redundant 'buffer_unittest' declaration
* Fix formatting
* Fix formatting
* Fix formatting
* fix buffer unit test
* apps.plugin using the new JSON API
* make sure the metrics registry does not accept negative timestamps
* do not allow pages with negative timestamps to be loaded from db files; do not accept pages with negative timestamps in the cache
* Fix more formatting
---------
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
|
|
* Reorg exporter integrations
* Reorg alert notifications.
change mdx to md for alert related files
* Move all .mdx to .md, including links.
|
|
* fix broken links in claim/README.md
* delete broken link in docs/guidelines.md
* fix broken links
* fix broken link
* fix broken links
* fix broken links
* fix broken links
* fix broken links
* fix broken links
* remove broken link
* fix broken link
* fix broken links
* fix broken links
* fix broken links
* fix broken link
* fix linking phrasing
* fix broken links batch
* fix broken links second batch
* fix broken links
* fix broken links
* fix broken links
* Update COLLECTORS.md
* fix broken links
* fix broken links
|
|
* Just formatting
* Remove single threaded
* Only destroy if we are localhost (ie. shutdown)
|
|
* store only rrdvars health needs
* make it simpler
* only set
* fix codacy
|
|
See https://github.com/netdata/netdata/issues/3495#issuecomment-1408452259
|
|
|
|
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
|
|
* Add the docs for the newly added notification/integrations methods of the cloud.
Notifications: Discord/PagerDuty/Slack/Generic WebHook
* Update docs related to managing notifications with the new methods.
Co-authored-by: Shyam Sreevalsan <shyam@netdata.cloud>
|
|
|
|
The info is already in the main README
|
|
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
|
|
* replication cancels pending queries on exit
* log when waiting for inflight queries
* when there are collected and not-collected metrics, use the context priority from the collected only
* Write metadata with a faster pace
* Remove journal file size limit and sync mode to 0 / Drop wal checkpoint for now
* Wrap in a big transaction remaining metadata writes (test 1)
* fix higher tiers when tiering iterations = 2
* dbengine always returns db-aligned points; query engine expands the queries by 2 points in every direction to have enough data for interpolation
* Wrap in a big transaction metadata writes (test 2)
* replication cancelling fix
* do not first and last entry in replication when the db has no retention
* fix internal check condition
* Increase metadata write batch size
* always apply error limit to dbengine logs
* Remove code that processes the obsolete health.db files
* cleanup in query.c
* do not allow queries to go beyond db boundaries
* prevent internal log for +1 delta in timestamp
* detect gap pages in conflicts
* double protection for gap injection in main cache
* Add checkpoint to prevent large WAL while running
Remove unused and duplicate functions
* do not allocate chart cache dir if not needed
* add more info to unittests
* revert query expansion to satisfy unittests
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
|
|
|
|
* Moving the cloud docs under /docs/cloud (previous location: netdata/learn/*)
* Added metadata on almost every document of the old learn site for the new ingest process of learn.
* Map old learn document to their best fit as topic related docs.
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
Co-authored-by: DShreve2 <david@netdata.cloud>
Co-authored-by: hugovalente-pm <hugo@netdata.cloud>
|
|
* run cleanup in workers
* when there is a discrepancy between update every, fix it
* fix the other occurrences of metric update every mismatch
* allow resetting the same timestamp
* validate flushed pages before committing them to disk
* initialize collection with the latest time in mrg
* these should be static functions
* acquire metrics for writing to detect multiple data collections of the same metric
* print the uuid of the metric that is collected twice
* log the discrepancies of completed pages
* 1 second tolerance
* unify validation of pages and related logging across dbengine
* make do_flush_pages() thread safe
* flush pages runs on libuv workers
* added uv events to tp workers
* don't cross datafile spinlock and rwlock
* should be unlock
* prevent the creation of multiple datafiles
* break an infinite replication loop
* do not log the expansion of the replication window due to start streaming
* log all invalid pages with internal checks
* do not shutdown event loop threads
* add information about collected page events, to find the root cause of invalid collected pages
* rewrite of the gap filling to fix the invalid collected pages problem
* handle multiple collections of the same metric gracefully
* added log about main cache page conflicts; fix gap filling once again...
* keep track of the first metric writer
* it should be an internal fatal - it does not harm users
* do not check for future timestamps on collected pages, since we inherit the clock of the children; do not check collected pages validity without internal checks
* prevent negative replication completion percentage
* internal error for the discrepancy of mrg
* better logging of dbengine new metrics collection
* without internal checks it is unused
* prevent pluginsd crash on exit due to calling pthread_cancel() on an exited thread
* renames and atomics everywhere
* if a datafile cannot be acquired for deletion during shutdown, continue - this can happen when there are hot pages in open cache referencing it
* Debug for context load
* rrdcontext uuid debug
* rrddim uuid debug
* rrdeng uuid debug
* Revert "rrdeng uuid debug"
This reverts commit 393da190826a582e7e6cc90771bf91b175826d8b.
* Revert "rrddim uuid debug"
This reverts commit 72150b30408294f141b19afcfb35abd7c34777d8.
* Revert "rrdcontext uuid debug"
This reverts commit 2c3b940dc23f460226e9b2a6861c214e840044d0.
* Revert "Debug for context load"
This reverts commit 0d880fc1589f128524e0b47abd9ff0714283ce3b.
* do not use legacy uuids on multihost dbs
* thread safety for journalfile size
* handle other cases of inconsistent collected pages
* make health thread check if it should be running in key loops
* do not log uuids
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
|
|
* add consul license alarm
* minor
|