path: root/health
Age | Commit message | Author
2020-12-14 | Kubernetes labels (#10107) | Ilya Mashchenko
Co-authored-by: Markos Fountoulakis <markos.fountoulakis.senior@gmail.com> Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2020-12-07 | health: disable 'used_file_descriptors' alarm (#10328) | Ilya Mashchenko
2020-12-03 | File descr alarm v01 (#10192) | Fotis Voutsas
2020-12-02 | Anomalies collector (#10060) | Andrew Maguire
ML based anomaly detection python collector built on top of PyOD.
2020-11-28 | Fix race condition in rrdset_first_entry_t() and rrdset_last_entry_t() (#10276) | Markos Fountoulakis
2020-11-26 | health/web_log: remove `crit` from unmatched alarms (#10280) | Ilya Mashchenko
2020-11-25 | Fix hostname when syslog is used (#10275) | thiagoftsm
2020-11-19 | Docs: Point users to proper configure doc (#10254) | Joel Hans
* Point users to proper configure doc * Remove extra text
2020-11-10 | health: convert `elasticsearch_last_collected` alarm to template (#10226) | Ilya Mashchenko
2020-11-06 | Page duty V2 (#10189) | thiagoftsm
Add Pagerduty V2 to Netdata.
2020-11-04 | Add supported notification platforms to docs (#10170) | Joel Hans
* Add supported notification platforms * Fix for Thiago
2020-10-30 | Hangout thread (#10160) | thiagoftsm
Add threads to Hangouts notification.
2020-10-28 | Opsgenie integration (#9879) | thiagoftsm
Bring full integration with Opsgenie.
2020-10-20 | New alarm entities (#10041) | thiagoftsm
Co-authored-by: Joel Hans <joel.g.hans@gmail.com> Co-authored-by: Ilya Mashchenko <ilya@netdata.cloud>
2020-10-09 | health/portcheck: add `failed` dim to the `connection_fails` alarm (#10048) | Ilya Mashchenko
2020-09-21 | Remove dupplication (#9968) | thiagoftsm
2020-09-21 | Stackpulse integration (#9965) | thiagoftsm
Add integration with Stackpulse.
2020-09-10 | Change instruction to reload HEALTH (#9869) | thiagoftsm
Add netdatacli as instruction to reload health.
2020-09-08 | improve_http_message: Change error message to be nearest possible RFC2616 (#9887) | thiagoftsm
2020-09-01 | collect active processes limit feature v2 (#9843) | Fotis Voutsas
* Add VARIABLE of pid_max to active_processes chart to use on alarms * use function for pid_max and use a single alarm * fix in alarm
2020-08-26 | Fix link and clean up frontmatter in health (#9813) | Joel Hans
2020-08-19 | Docs: Add daemon config to health section and standardize IP references (#8837) | Joel Hans
* Fix Chris' bug and cleanup * Fixes for Thiago * Fixes for Thiago * Rephrase the health files bullets * Fix in quickstart for Thiago * Fix path * Fix broken link
2020-08-14 | Send follow up alarms when the initial status matches the notification (#9698) | Chris Akritidis
2020-08-13 | Update health_alarm_notify.conf (#9740) | Chris Akritidis
Update comments inside `healt_alarm_notify.conf` that were creating confusion.
2020-08-10 | Replace alarm redirection link for cloud (#9688) | Chris Akritidis
As per #9487, the redirection links only work now with private registries. Temporarily replacing the goto_url with a simple link to the cloud.
2020-07-26 | Add alarms for FreeBSD interface errors (#8340) | Lasse Bang Mikkelsen
Based on net.drops alarms.
2020-07-16 | Change all instances of alarm to template (#9553) | Toby Hammond
Fix megacli.conf alarm.
2020-07-11 | Remove health from archived metrics (#9520) | Markos Fountoulakis
* Disassociate health variables and alarms from archived charts and dimensions. * Ignore archived charts during health reload.
2020-07-06 | Fix broken link in Kavenegar notification doc (#9492) | Joel Hans
* Fix broken link * Retrigger CI
2020-06-29 | Fixed duplicate alarm ids in health-log.db (#9428) | Stelios Fragkakis
2020-06-12 | Change streaming terminology to parent-child in the code (#9323) | Andrew Moss
2020-06-12 | Add support for persistent metadata (#9324) | Stelios Fragkakis
* Implemented collector metadata logging * Added persistent GUIDs for charts and dimensions * Added metadata log replay and automatic compaction * Added detection of charts with no active collector (archived) * Added new endpoint to report archived charts via `/api/v1/archivedcharts` * Added support for collector metadata update Co-authored-by: Markos Fountoulakis <44345837+mfundul@users.noreply.github.com>
2020-06-08 | Add revisions to Matrix doc (#9295) | Joel Hans
2020-06-08 | Support for matrix notifications (#9196) | David Heidelberg
2020-06-04 | Move/refactor docs to accomodate new Guides section on Learn (#9266) | Joel Hans
* Move directories and change verbiage to guide * Move health guides * Quick fix to collectors quickstart * Fix broken links * Remove health/tutorials dir * Fix links in collectors quickstart * Fix links to go.d pages
2020-06-03 | Fixes documentation ambiguity leading into issue #8239 (#9255) | Timotej S
* docu update * ilyam8 & joelhans comments on PR resolved
2020-05-26 | New alarms (exporting and Backend) (#9075) | thiagoftsm
New alarms for exporting and backend.
2020-05-14 | Improve the impact of health code on netdata scalability (#8407) | Markos Fountoulakis
* Add support for spawning processes without pipes. * Port health_alarm_execute() from mypopen() to netdata_spawn() * Make alarm notifications asynchronous within a single health thread iteration * Initial version of spawn server. * preliminary integration of spawn client with health
2020-05-14 | Account for zfs.arc_size.min, and correct calc (#8913) | araemo
2020-05-12 | Remove check for old alarm status (#8978) | Stelios Fragkakis
Fixed coverity issue (CID 358436)
2020-05-11 | Docs: Fix internal links and remove obsolete admonitions (#8946) | Joel Hans
* Fixed a few more links * Remove old syntax * Abs-relative links to files in docs folder * Trying to fix nother doc learn link * Fix a few more links * Add testing doc * Tracking down mysteries * Cleanup * Update broken external links * Remove index.html that appeared from testing * Fix remainder of links
2020-05-11 | Enable support for Netdata Cloud. | Andrew Moss
This PR merges the feature-branch to make the cloud live. It contains the following work: Co-authored-by: Andrew Moss <1043609+amoss@users.noreply.github.com> Co-authored-by: Jacek Kolasa <jacek.kolasa@gmail.com> Co-authored-by: Austin S. Hemmelgarn <austin@netdata.cloud> Co-authored-by: James Mills <prologic@shortcircuit.net.au> Co-authored-by: Markos Fountoulakis <44345837+mfundul@users.noreply.github.com> Co-authored-by: Timotej S <6674623+underhood@users.noreply.github.com> Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com> * dashboard with new navbars, v1.0-alpha.9: PR #8478 * dashboard v1.0.11: netdata/dashboard#76 Co-authored-by: Jacek Kolasa <jacek.kolasa@gmail.com> * Added installer code to bundle JSON-c if it's not present. PR #8836 Co-authored-by: James Mills <prologic@shortcircuit.net.au> * Fix claiming config PR #8843 * Adds JSON-c as hard dep. for ACLK PR #8838 * Fix SSL renegotiation errors in old versions of openssl. PR #8840. Also - we have a transient problem with opensuse CI so this PR disables them with a commit from @prologic. Co-authored-by: James Mills <prologic@shortcircuit.net.au> * Fix claiming error handling PR #8850 * Added CI to verify JSON-C bundling code in installer PR #8853 * Make cloud-enabled flag in web/api/v1/info be independent of ACLK build success PR #8866 * Reduce ACLK_STABLE_TIMEOUT from 10 to 3 seconds PR #8871 * remove old-cloud related UI from old dashboard (accessible now via /old suffix) PR #8858 * dashboard v1.0.13 PR #8870 * dashboard v1.0.14 PR #8904 * Provide feedback on proxy setting changes PR #8895 * Change the name of the connect message to update during an ongoing session PR #8927 * Fetch active alarms from alarm_log PR #8944
2020-04-22 | health: fix mdstat `failed devices` alarm (#8794) | Ilya Mashchenko
2020-04-22 | health/portcheck: remove no-clear-notification (#8748) | Ilya Mashchenko
2020-04-20 | added whoisquery health templates (#8700) | Yashar Nesabian
Update Makefile.am to add whoisquery.conf
2020-04-15 | added certificate revocation alert (#8684) | Yashar Nesabian
2020-04-14 | Docs: Standardize links between documentation (#8638) | Joel Hans
* Trying out some absolute-ish links * Try one out on installer * Testing logic * Trying out some more links * Fixing links * Fix links in python collectors * Changed a bunch more links * Fix build errors * Another push of links * Fix build error and add more links * Complete first pass * Fix final broken links * Fix links to files * Fix for Netlify * Two more fixes
2020-04-13 | Revert "Revert changes since v1.21 in pereparation for hotfix release." | Austin S. Hemmelgarn
This reverts commit e2874320fc027f7ab51ab3e115d5b1889b8fd747.
2020-04-13 | Revert changes since v1.21 in pereparation for hotfix release. | Austin S. Hemmelgarn
2020-04-08 | health/alarm_notify: add dynatrace enabled check (#8654) | Ilya Mashchenko