path: root/aclk
Age  Commit message  Author
2020-07-21  Replaces mempcpy with memcpy (#9575)  Timotej S
Both functions do the same thing and differ only in their return value (which we don't use). Some systems do not have mempcpy.
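The difference in return value can be sketched as follows. This is a minimal illustration, not code from the commit: `append_bytes` and `concat2` are hypothetical names. The GNU extension `mempcpy` returns a pointer just past the copied bytes, while standard `memcpy` returns the destination, so code that needs the "end" pointer can compute it itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the netdata source): copies len bytes to
 * dst and returns the position just past them, which is the value the
 * GNU-only mempcpy() would return. */
static char *append_bytes(char *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);   /* memcpy() itself returns dst, not dst + len */
    return dst + len;
}

/* Writes a followed by b into out as one NUL-terminated string.
 * Assumes out has room for strlen(a) + strlen(b) + 1 bytes. */
static void concat2(char *out, const char *a, const char *b)
{
    char *p = out;
    p = append_bytes(p, a, strlen(a));
    p = append_bytes(p, b, strlen(b));
    *p = '\0';
}
```

Where the return value is ignored, as the commit message says it is here, a plain `memcpy` call is a drop-in replacement.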
2020-07-16  Suppress warning -Wformat-truncation in ACLK (#9547)  Timotej S
* Suppress some truncation warnings in places where we want truncation
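The log does not show how the warning was suppressed. One common approach, sketched here under that assumption, is to scope a GCC diagnostic pragma around the call where truncation is intended. `copy_truncated` is a hypothetical name, and the pragma is guarded so that compilers without `-Wformat-truncation` ignore it.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not from the netdata source): copies src into a
 * fixed-size buffer, truncating on purpose if it does not fit.
 * Returns 1 if the output was truncated (or on error), 0 otherwise. */
static int copy_truncated(char *dst, size_t dst_size, const char *src)
{
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 7
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wformat-truncation"
#endif
    /* snprintf() returns the length the full output would have had */
    int needed = snprintf(dst, dst_size, "%s", src);
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 7
#pragma GCC diagnostic pop
#endif
    return needed < 0 || (size_t)needed >= dst_size;
}
```

For example, copying `"netdata"` into a 4-byte buffer leaves `"net"` in the buffer and reports truncation.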
2020-07-10  adds support for multiple ACLK query processing threads (#9355)  Timotej S
2020-07-08  fix ACLK protocol version always parsed as 0 (#9502)  Timotej S
2020-06-18  Fixed ACLK shutdown sequence (#9367)  Timotej S
* fix aclk shutdown sequence
2020-06-16  Add missing slash (#9257)  oneoneonepig
2020-06-12  Add support for persistent metadata (#9324)  Stelios Fragkakis
* Implemented collector metadata logging
* Added persistent GUIDs for charts and dimensions
* Added metadata log replay and automatic compaction
* Added detection of charts with no active collector (archived)
* Added new endpoint to report archived charts via `/api/v1/archivedcharts`
* Added support for collector metadata update
Co-authored-by: Markos Fountoulakis <44345837+mfundul@users.noreply.github.com>
2020-06-11  Adds metrics for ACLK performance and status (#9269)  Timotej S
Adds ACLK charts
2020-05-30  fix compilation for older systems (#9198)  Costa Tsaousis
inherit libs for clock_gettime() when building libmosquitto; Check that X509_VERIFY_PARAM_set1_host is available on the target system
2020-05-20  Regenerate topic base on connect (#9044)  Andrew Moss
Allow agents to be reclaimed while they are running. Fix a race hazard between claiming and the ACLK. Changes the private key, base topic, username and contents of the LWT. Co-authored-by: <hilari@hilarimoragrega.com>
2020-05-20  Add ACLK Connection Details (#9047)  Zack Shoylev
* Add ACLK Connection Details
* Update aclk/README.md
  Co-authored-by: Joel Hans <joel@netdata.cloud>
* Update README.md
  Remove heading, decided to consolidate under `Agent-cloud link (ACLK)` heading
* Update README.md
  Remove uncertain future plans. Update docs if we update the product at a later time.
* review fix
* review feedback
Co-authored-by: Joel Hans <joel@netdata.cloud>
2020-05-13  Add text to ACLK doc mentioning WebSockets and port (#8968)  Joel Hans
* Add text about websockets/port
* Update aclk/README.md
  Co-authored-by: Andrew Moss <1043609+amoss@users.noreply.github.com>
* Tweak for Chris
Co-authored-by: Andrew Moss <1043609+amoss@users.noreply.github.com>
2020-05-12  Fix the latency issue on the ACLK and suppress the diagnostics (#8992)  Andrew Moss
The on-connect payloads were large enough to trigger a massive increase in latency on the link and prevent chart updates due to head-of-line blocking. The default window detection in libwebsockets was under-reporting the size of the available window in the network. Overwritten with some sensible values. The large volume of ACLK per-message info-logging is not produced unless the agent is compiled with NETDATA_INTERNAL_CHECKS. The logging now includes latency measurements on the link. Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
2020-05-11  Docs: Fix internal links and remove obsolete admonitions (#8946)  Joel Hans
* Fixed a few more links
* Remove old syntax
* Abs-relative links to files in docs folder
* Trying to fix another doc learn link
* Fix a few more links
* Add testing doc
* Tracking down mysteries
* Cleanup
* Update broken external links
* Remove index.html that appeared from testing
* Fix remainder of links
2020-05-12  Docs: Update with go-live claiming and ACLK information (#8859) (#8960)  James Mills
* Restore docs from naughty PR
* Address Andrew's comments
* Ini to conf
* Changes based on meeting with Andrew
* Tweak text around claiming
* Some grammar/typo fixes
* Add /var/lib/netdata to Docker instructions on README
* Added a few more ACLK links per Chris
Co-authored-by: Joel Hans <joel@netdata.cloud>
2020-05-11  Enable support for Netdata Cloud.  Andrew Moss
This PR merges the feature-branch to make the cloud live. It contains the following work:
Co-authored-by: Andrew Moss <1043609+amoss@users.noreply.github.com>
Co-authored-by: Jacek Kolasa <jacek.kolasa@gmail.com>
Co-authored-by: Austin S. Hemmelgarn <austin@netdata.cloud>
Co-authored-by: James Mills <prologic@shortcircuit.net.au>
Co-authored-by: Markos Fountoulakis <44345837+mfundul@users.noreply.github.com>
Co-authored-by: Timotej S <6674623+underhood@users.noreply.github.com>
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
* dashboard with new navbars, v1.0-alpha.9: PR #8478
* dashboard v1.0.11: netdata/dashboard#76
  Co-authored-by: Jacek Kolasa <jacek.kolasa@gmail.com>
* Added installer code to bundle JSON-c if it's not present. PR #8836
  Co-authored-by: James Mills <prologic@shortcircuit.net.au>
* Fix claiming config PR #8843
* Adds JSON-c as hard dep. for ACLK PR #8838
* Fix SSL renegotiation errors in old versions of openssl. PR #8840. Also - we have a transient problem with opensuse CI so this PR disables them with a commit from @prologic.
  Co-authored-by: James Mills <prologic@shortcircuit.net.au>
* Fix claiming error handling PR #8850
* Added CI to verify JSON-C bundling code in installer PR #8853
* Make cloud-enabled flag in web/api/v1/info be independent of ACLK build success PR #8866
* Reduce ACLK_STABLE_TIMEOUT from 10 to 3 seconds PR #8871
* remove old-cloud related UI from old dashboard (accessible now via /old suffix) PR #8858
* dashboard v1.0.13 PR #8870
* dashboard v1.0.14 PR #8904
* Provide feedback on proxy setting changes PR #8895
* Change the name of the connect message to update during an ongoing session PR #8927
* Fetch active alarms from alarm_log PR #8944
2020-04-22  Add http headers to responses (#8760)  Andrew Moss
The MQTT payloads for responses to API requests from the cloud now include a headers field with the raw HTTP headers encoded into unicode. This exposes the `Date` and `Expires` fields to the cloud backend.
2020-04-21  Docs: Add Docker instructions to claiming (#8755)  Joel Hans
* Trying some options
* Add Docker command to claiming
* Fix linter error
* Fix broken links
* Add docker run command
* Added sections for running/ephemeral containers
* Fixes for James
2020-04-17  Additional cases for the thread exit fix (#8750)  Andrew Moss
2020-04-17  Fix crash when shutdown with ACLK disabled (#8725)  Lasse Bang Mikkelsen
2020-04-16  Docs: Combined claiming+ACLK documentation (#8724)  Joel Hans
* Init new documents
* Finalize draft of combined claiming doc
* Add notice to anonymous stats
* Remove .. from links
* Update none proxy setting
* Changes for Andrew and Manos
* Remove E2EE from ACLK
* Add details about netdata user
2020-04-16  Improved ACLK reconnection sequence (#8729)  Stelios Fragkakis
Improved ACLK reconnection sequence
2020-04-15  Revert "Improved ACLK reconnection sequence (#8708)" (#8728)  cosmix
This reverts commit 853b23745e7df2f163df1c0213c9de52394de36b.
2020-04-15  Improved ACLK reconnection sequence (#8708)  Stelios Fragkakis
* Change aclk_connecting to a counter and attempt to reconnect in case the LWS layer is not responding (no callbacks received)
  Do not attempt subscription if we are not connected
* Set aclk_connecting before calling the MQTT / LWS layer
2020-04-13  Revert "Revert changes since v1.21 in preparation for hotfix release."  Austin S. Hemmelgarn
This reverts commit e2874320fc027f7ab51ab3e115d5b1889b8fd747.
2020-04-13  Revert changes since v1.21 in preparation for hotfix release.  Austin S. Hemmelgarn
2020-04-09  Improved ACLK memory management and shutdown sequence (#8611)  Stelios Fragkakis
Improved ACLK memory management and shutdown sequence
2020-04-08  Add session-id using connect timestamp (#8633)  Andrew Moss
Added a session-id to the ACLK messages to overcome a problem with the LWT timestamp being out of sequence with the rest of the message flow.
2020-04-03  Fix Coverity defects (#8579)  Andrew Moss
Fix Coverity CID355287 and CID355289: technically it is a false-positive but it is easier to put a pattern in the code that they can recognise as a sanitizer. The compiler will remove it during optimization. Fix CID353973: the security condition is unlikely to occur but we can avoid it completely. Fix resource leak from CID 355286 and CID 355288. Fixing new resource leak introduced by a previous commit (CID355449)
2020-03-31  Switching over to soft feature flag (#8545)  Andrew Moss
Preparing for the cloud release. This changes how we handle the feature flag so that it no longer requires installer switches and can be set from the config file. This still requires internal access to use and is not ready for public access yet.
2020-03-31  Improve the behavior of claiming (#8516)  Andrew Moss
The default cloud url has been updated to app.netdata.cloud ready for the release. The claiming process now checks the current user executing claiming and refuses to perform the claim for the wrong user. If the current UID is 0 then claiming proceeds but the file ownership is adjusted to be the correct netdata user. The default expected user is `netdata` unless the script can identify the user from the current configuration. After the claiming script is executed the CLI is used to reload the claiming state.
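The user check described above can be sketched as a small decision function. This is an illustration in C under stated assumptions: the real check lives in the claiming script, and the names `claim_action` and `claim_action_for` are hypothetical.

```c
#include <sys/types.h>

/* Hypothetical outcome of the user check (names are illustrative only). */
enum claim_action {
    CLAIM_REFUSE,             /* wrong user: do not perform the claim */
    CLAIM_PROCEED,            /* running as the expected netdata user */
    CLAIM_PROCEED_AND_CHOWN   /* running as root: claim, then fix ownership */
};

/* Decides what to do given the UID running the claim and the UID of the
 * expected netdata user (by default the `netdata` account). */
static enum claim_action claim_action_for(uid_t current_uid, uid_t expected_uid)
{
    if (current_uid == 0)
        return CLAIM_PROCEED_AND_CHOWN;
    if (current_uid == expected_uid)
        return CLAIM_PROCEED;
    return CLAIM_REFUSE;
}
```

A caller would pass `getuid()` as `current_uid` and the UID looked up for the configured netdata user as `expected_uid`.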
2020-03-30  Updating the info endpoint for cloud notifications (#8519)  Andrew Moss
2020-03-30  Write the failure reason during ACLK challenge / response (#8538)  Stelios Fragkakis
Improved the error logging in case of ACLK challenge / response failure
2020-03-30  Cleans up cloud config files [agent_cloud_link] -> [cloud] (#8501)  Timo
* config move from [agent_cloud_link] to [cloud]
2020-03-30  Enhanced ACLK header payload to include timestamp-offset-usec (#8499)  Stelios Fragkakis
Enhanced the ACLK header payload to include timestamp-offset-usec
2020-03-26  Improved ACLK (#8498)  Stelios Fragkakis
Improved the stability of the ACLK
2020-03-26  Allow insecure SSL in the testing environment (#8489)  Timo
2020-03-25  Fake collector to provoke ACLK messages (#8427)  Andrew Moss
For internal testing use only.
2020-03-25  Report ACLK Connection Failure (#8456)  Timo
* report callback chain on conn failure
2020-03-25  HTTP proxy support + some cleanup (#8418)  Timo
* HTTP proxy support + some cleanup
* fix unrelated compiler warnings with -Wextra
* minor - log proxy setting
* run changed code through .clang-format
* fix case when url ends with /
* update README
2020-03-19  Fixed response payload to match the new specification (#8420)  Stelios Fragkakis
* Modify the payload object to include a code and body object
* Encoding body as a string
* Send the response as a string as specified in the doc
2020-03-18  ACLK: Implemented Last Will and Testament (#8410)  Stelios Fragkakis
* Added support for Last Will and Testament to the ACLK
* On normal agent shutdown an alternate "graceful shutdown" message is published
2020-03-18  Fixed JSON parsing (#8426)  Stelios Fragkakis
* Fixed an issue with the internal JSON parser which made it fail to parse ACLK challenge/response related payloads
2020-03-17  Fix outstanding problems in claiming and add SOCKS5 support. (#8406)  Andrew Moss
This commit fixes the known problems in claiming: incorrect reports of success, better treatment of error code and improved visibility of what the script is doing. There has been extensive testing against both environments to check that it works. The socks5 proxy support has been integrated and works for both methods of calling the claiming script. Co-authored-by: Timotej Šiškovič <timotej@netdata.cloud>
2020-03-15  Support SOCKS5 in ACLK Challenge/Response and rewrite with LWS (#8404)  Timo
* wip
* add alpn
* beginning of cleanup
* move common code
* add SOCKS5 support
* check HTTP response code
* add timeout
* separate https_client into own files
* fix some mem leaks from master + avoid string copying and alloc/free
* fix some PR unrelated warnings
2020-03-15  Update the type message for the alarm updates (#8403)  Stelios Fragkakis
2020-03-14  Improved the performance of the ACLK. (#8391) (#8401)  Andrew Moss
* Switched back to lws internal scheduler, small fragment size.
* Fixing comment from review
2020-03-14  Improving the ACLK performance - initial changes (#8399)  Andrew Moss
Add an inspection point for VerneMQ in the local dev env. Remove the bottleneck in sending websocket messages, at the expense of increased CPU-load. Fixed the message encoding. Added support for stress testing - it is still enabled in the main loop so will fire stress-testing payloads when the ACLK is established. Next patch will integrate the socket polling properly to reduce the CPU overhead and remove the stress testing payloads.
2020-03-13  ACLK: Improved the agent "pop-corning" phase (#8398)  Stelios Fragkakis
* Ignore the cloud commands when the agent is initializing
* Tune the agent popcorning
* Reorder waiting msg, stable timeout back to 10 seconds
* Moved checks for popcorning to the calling functions for code clarity
2020-03-11  Change topics for ACLK (#8374)  Andrew Moss
Update to topic structure.