Age | Commit message | Author |
|
both functions do the same, they differ in return value only (which we don't use)
some systems do not have mempcpy
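
For context on the note above: `mempcpy` is a GNU extension that copies like `memcpy` but returns a pointer just past the last byte written, while standard `memcpy` returns the destination pointer. A minimal sketch of a portable equivalent follows; `portable_mempcpy` and `join_fragments` are hypothetical names used only for illustration and are not taken from the codebase.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: same result as GNU mempcpy(), built on standard memcpy().
 * Returns a pointer to the byte following the last byte written. */
static void *portable_mempcpy(void *dest, const void *src, size_t n) {
    return (char *)memcpy(dest, src, n) + n;
}

/* Illustration: append two fragments into `out` and NUL-terminate.
 * `out` must have room for alen + blen + 1 bytes. Returns bytes written
 * (excluding the terminator). */
static size_t join_fragments(char *out,
                             const char *a, size_t alen,
                             const char *b, size_t blen) {
    char *p = out;
    p = portable_mempcpy(p, a, alen);
    p = portable_mempcpy(p, b, blen);
    *p = '\0';
    return (size_t)(p - out);
}
```

When the return value is not used, as the message says, a plain `memcpy` call is a drop-in replacement.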
|
|
* suppress some truncation warning in places we want truncation
|
|
|
|
|
|
* fix aclk shutdown sequence
|
|
|
|
* Implemented collector metadata logging
* Added persistent GUIDs for charts and dimensions
* Added metadata log replay and automatic compaction
* Added detection of charts with no active collector (archived)
* Added new endpoint to report archived charts via `/api/v1/archivedcharts`
* Added support for collector metadata update
Co-authored-by: Markos Fountoulakis <44345837+mfundul@users.noreply.github.com>
|
|
Adds ACLK charts
|
|
inherit libs for clock_gettime() when building libmosquitto; Check that X509_VERIFY_PARAM_set1_host is available on the target system
|
|
Allow agents to be reclaimed while they are running. Fix a race hazard between claiming and the ACLK. Changes the private key, base topic, username and contents of the LWT.
Co-authored-by: <hilari@hilarimoragrega.com>
|
|
* Add ACLK Connection Details
* Update aclk/README.md
Co-authored-by: Joel Hans <joel@netdata.cloud>
* Update README.md
Remove heading, decided to consolidate under `Agent-cloud link (ACLK)` heading
* Update README.md
Remove uncertain future plans. Update docs if we update the product at a later time.
* review fix
* review feedback
Co-authored-by: Joel Hans <joel@netdata.cloud>
|
|
* Add text about websockets/port
* Update aclk/README.md
Co-authored-by: Andrew Moss <1043609+amoss@users.noreply.github.com>
* Tweak for Chris
Co-authored-by: Andrew Moss <1043609+amoss@users.noreply.github.com>
|
|
The on-connect payloads were large enough to trigger a massive increase in latency on the link and prevent chart updates due to head-of-line blocking. The default window detection in libwebsockets was under-reporting the size of the available window in the network. The defaults are now overridden with sensible values.
The large volume of ACLK per-message info-logging is not produced unless the agent is compiled with NETDATA_INTERNAL_CHECKS. The logging now includes latency measurements on the link.
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
|
|
* Fixed a few more links
* Remove old syntax
* Abs-relative links to files in docs folder
* Trying to fix another doc learn link
* Fix a few more links
* Add testing doc
* Tracking down mysteries
* Cleanup
* Update broken external links
* Remove index.html that appeared from testing
* Fix remainder of links
|
|
* Restore docs from naughty PR
* Address Andrew's comments
* Ini to conf
* Changes based on meeting with Andrew
* Tweak text around claiming
* Some grammar/typo fixes
* Add /var/lib/netdata to Docker instructions on README
* Added a few more ACLK links per Chris
Co-authored-by: Joel Hans <joel@netdata.cloud>
|
|
This PR merges the feature-branch to make the cloud live. It contains the following work:
Co-authored-by: Andrew Moss <1043609+amoss@users.noreply.github.com>
Co-authored-by: Jacek Kolasa <jacek.kolasa@gmail.com>
Co-authored-by: Austin S. Hemmelgarn <austin@netdata.cloud>
Co-authored-by: James Mills <prologic@shortcircuit.net.au>
Co-authored-by: Markos Fountoulakis <44345837+mfundul@users.noreply.github.com>
Co-authored-by: Timotej S <6674623+underhood@users.noreply.github.com>
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
* dashboard with new navbars, v1.0-alpha.9: PR #8478
* dashboard v1.0.11: netdata/dashboard#76
Co-authored-by: Jacek Kolasa <jacek.kolasa@gmail.com>
* Added installer code to bundle JSON-c if it's not present. PR #8836
Co-authored-by: James Mills <prologic@shortcircuit.net.au>
* Fix claiming config PR #8843
* Adds JSON-c as hard dep. for ACLK PR #8838
* Fix SSL renegotiation errors in old versions of OpenSSL. PR #8840. Also, we have a transient problem with the openSUSE CI checks, so this PR disables them with a commit from @prologic.
Co-authored-by: James Mills <prologic@shortcircuit.net.au>
* Fix claiming error handling PR #8850
* Added CI to verify JSON-C bundling code in installer PR #8853
* Make cloud-enabled flag in web/api/v1/info be independent of ACLK build success PR #8866
* Reduce ACLK_STABLE_TIMEOUT from 10 to 3 seconds PR #8871
* remove old-cloud related UI from old dashboard (accessible now via /old suffix) PR #8858
* dashboard v1.0.13 PR #8870
* dashboard v1.0.14 PR #8904
* Provide feedback on proxy setting changes PR #8895
* Change the name of the connect message to update during an ongoing session PR #8927
* Fetch active alarms from alarm_log PR #8944
|
|
The MQTT payloads for responses to API requests from the cloud now include a headers field with the raw HTTP headers encoded into unicode. This exposes the `Date` and `Expired` fields to the cloud backend.
|
|
* Trying some options
* Add Docker command to claiming
* Fix linter error
* Fix broken links
* Add docker run command
* Added sections for running/ephemeral containers
* Fixes for James
|
|
|
|
|
|
* Init new documents
* Finalize draft of combined claiming doc
* Add notice to anonymous stats
* Remove .. from links
* Update none proxy setting
* Changes for Andrew and Manos
* Remove E2EE from ACLK
* Add details about netdata user
|
|
Improved ACLK reconnection sequence
|
|
This reverts commit 853b23745e7df2f163df1c0213c9de52394de36b.
|
|
* Change aclk_connecting to a counter and attempt to reconnect in case the LWS layer is not responding (no callbacks received)
* Do not attempt subscription if we are not connected
* Set aclk_connecting before calling the MQTT / LWS layer
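
The behaviour described in this commit can be sketched as a small state machine. This is an illustration under stated assumptions, not the actual ACLK code: the struct, the function names and the `CONNECT_WAIT_LIMIT` value are all hypothetical.

```c
#include <stdbool.h>

/* Hypothetical limit: main-loop polls to wait for a callback before retrying. */
#define CONNECT_WAIT_LIMIT 3

struct link_state {
    int  connecting;  /* 0 = idle, >0 = polls spent waiting for a callback */
    bool connected;   /* set only when the transport reports success */
    int  reconnects;  /* counts forced reconnect attempts (for illustration) */
};

/* Mark the attempt as in progress BEFORE handing control to the transport,
 * so a callback that fires immediately sees a consistent state. */
static void start_connect(struct link_state *s) {
    s->connecting = 1;
    s->connected = false;
    /* a real implementation would call into the MQTT / websocket layer here */
}

/* Invoked by the transport layer when the connection is established. */
static void on_connected_callback(struct link_state *s) {
    s->connecting = 0;
    s->connected = true;
}

/* Called once per main-loop iteration: if no callback has arrived after
 * CONNECT_WAIT_LIMIT polls, assume the layer is stuck and reconnect. */
static void poll_link(struct link_state *s) {
    if (s->connected || s->connecting == 0)
        return;
    if (++s->connecting > CONNECT_WAIT_LIMIT) {
        s->reconnects++;
        start_connect(s);
    }
}

/* Subscriptions are only attempted on an established connection. */
static bool may_subscribe(const struct link_state *s) {
    return s->connected;
}
```

Using a counter rather than a boolean lets the main loop distinguish "just started connecting" from "has been waiting too long".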
|
|
This reverts commit e2874320fc027f7ab51ab3e115d5b1889b8fd747.
|
|
|
|
Improved ACLK memory management and shutdown sequence
|
|
Added a session-id to the ACLK messages to overcome a problem with the LWT timestamp being out of sequence with the rest of the message flow.
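
One way to read this change: in MQTT the Last Will message is fixed when the client connects, so any timestamp inside it predates the messages sent later in the same session. A session identifier lets a receiver match the will to the connection it belongs to instead of ordering by time. The sketch below shows that idea from a hypothetical receiver's point of view; every struct, field and function name is an assumption for illustration and does not reflect the actual ACLK message format.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical message envelope. */
struct message {
    uint64_t session_id;  /* identifies the connection that produced it */
    uint64_t timestamp;   /* for an LWT this is the connect time, not "now" */
    bool     is_lwt;      /* true for the broker-delivered last will */
};

/* Hypothetical receiver-side view of one agent. */
struct agent_view {
    uint64_t current_session;
    bool     online;
};

/* A new connection from the agent starts a new session. */
static void begin_session(struct agent_view *v, uint64_t session_id) {
    v->current_session = session_id;
    v->online = true;
}

/* Apply an incoming message. A will only takes the agent offline when it
 * belongs to the current session; its (old) timestamp is never consulted. */
static void receive(struct agent_view *v, const struct message *m) {
    if (!m->is_lwt)
        return;  /* ordinary messages do not change presence in this sketch */
    if (m->session_id == v->current_session)
        v->online = false;
}
```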
|
|
Fix Coverity CID355287 and CID355289: technically these are false positives, but it is easier to put a pattern in the code that Coverity can recognise as a sanitizer. The compiler will remove it during optimization. Fix CID353973: the security condition is unlikely to occur but we can avoid it completely. Fix resource leaks from CID355286 and CID355288. Fix a new resource leak introduced by a previous commit (CID355449).
|
|
Preparing for the cloud release. This changes how we handle the feature flag so that it no longer requires installer switches and can be set from the config file. This still requires internal access to use and is not ready for public access yet.
|
|
The default cloud URL has been updated to app.netdata.cloud, ready for the release. The claiming process now checks which user is executing the claim and refuses to perform it for the wrong user. If the current UID is 0, claiming proceeds but the file ownership is adjusted to the correct netdata user. The default expected user is `netdata` unless the script can identify the user from the current configuration. After the claiming script is executed, the CLI is used to reload the claiming state.
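
The user check described above reduces to a three-way decision. The real claiming logic lives in a script; the C sketch below only restates the rule, and the enum and function names are hypothetical.

```c
#include <sys/types.h>

/* Hypothetical outcome of the pre-claim user check. */
enum claim_action {
    CLAIM_REFUSE,            /* wrong, non-root user: do not claim */
    CLAIM_PROCEED,           /* running as the expected user */
    CLAIM_PROCEED_AND_CHOWN  /* running as root: claim, then fix file ownership */
};

/* `expected` is the UID of the netdata user (by default the user named
 * "netdata", unless the configuration says otherwise). */
static enum claim_action decide_claim(uid_t current, uid_t expected) {
    if (current == expected)
        return CLAIM_PROCEED;
    if (current == 0)
        return CLAIM_PROCEED_AND_CHOWN;
    return CLAIM_REFUSE;
}
```

A caller would obtain `current` from `getuid()` and, in the root case, hand the claiming files to `expected` with `chown()` once the claim succeeds.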
|
|
|
|
Improved the error logging in case of ACLK challenge / response failure
|
|
* config move from [agent_cloud_link] to [cloud]
|
|
Enhanced the ACLK header payload to include timestamp-offset-usec
|
|
Improved the stability of the ACLK
|
|
|
|
For internal testing use only.
|
|
* report callback chain on conn failure
|
|
* HTTP proxy support + some cleanup
* fix unrelated compiler warnings with -Wextra
* minor - log proxy setting
* run changed code through .clang-format
* fix case when URL ends with /
* update README
|
|
* Modify the payload object to include a code and body object
* Encoding body as a string
* Send the response as a string as specified in the doc
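
A minimal sketch of what "encoding body as a string" implies: the response body, even when it is itself JSON, is escaped and embedded as a JSON string next to the numeric code. The field names `code` and `body` come from the commit message; the exact wire format, the helper names and the buffer sizes are assumptions for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Escape `src` for use inside a JSON string literal.
 * Returns the number of bytes written, or -1 if `dst` is too small. */
static int json_escape(char *dst, size_t dst_size, const char *src) {
    size_t o = 0;
    if (dst_size == 0)
        return -1;
    for (; *src; src++) {
        unsigned char c = (unsigned char)*src;
        char tmp[7];  /* longest escape is \uXXXX plus terminator */
        int n;
        if (c == '"' || c == '\\')
            n = snprintf(tmp, sizeof tmp, "\\%c", c);
        else if (c == '\n')
            n = snprintf(tmp, sizeof tmp, "\\n");
        else if (c < 0x20)
            n = snprintf(tmp, sizeof tmp, "\\u%04x", (unsigned)c);
        else
            n = snprintf(tmp, sizeof tmp, "%c", c);
        if (o + (size_t)n >= dst_size)
            return -1;
        memcpy(dst + o, tmp, (size_t)n);
        o += (size_t)n;
    }
    dst[o] = '\0';
    return (int)o;
}

/* Build {"code":<code>,"body":"<escaped body>"} into `out`.
 * Returns the payload length, or -1 if it does not fit. */
static int build_payload(char *out, size_t out_size, int code, const char *body) {
    char escaped[512];  /* arbitrary limit for this sketch */
    if (json_escape(escaped, sizeof escaped, body) < 0)
        return -1;
    int n = snprintf(out, out_size, "{\"code\":%d,\"body\":\"%s\"}", code, escaped);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}
```

Sending the body as a string means the receiver can forward it unchanged without having to parse it first.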
|
|
* Added support for Last Will and Testament to the ACLK
* On normal agent shutdown an alternate "graceful shutdown" message is published
|
|
* Fixed an issue with the internal JSON parser which made it fail to parse ACLK challenge/response related payloads
|
|
This commit fixes the known problems in claiming: it no longer reports success incorrectly, treats error codes better and gives improved visibility of what the script is doing. There has been extensive testing against both environments to check that it works. The SOCKS5 proxy support has been integrated and works for both methods of calling the claiming script.
Co-authored-by: Timotej Šiškovič <timotej@netdata.cloud>
|
|
* wip
* add alpn
* beginning of cleanup
* move common code
* add SOCKS5 support
* check HTTP response code
* add timeout
* separate https_client into own files
* fix some mem leaks from master + avoid string copying and alloc/free
* fix some PR unrelated warnings
|
|
|
|
* Switched back to lws internal scheduler, small fragment size.
* Fixing comment from review
|
|
Add an inspection point for VerneMQ in the local dev env. Remove the bottleneck in sending websocket messages, at the expense of increased CPU-load. Fixed the message encoding. Added support for stress testing - it is still enabled in the main loop so will fire stress-testing payloads when the ACLK is established.
Next patch will integrate the socket polling properly to reduce the CPU overhead and remove the stress testing payloads.
|
|
* Ignore the cloud commands when the agent is initializing
* Tune the agent popcorning
* Reorder waiting msg, stable timeout back to 10 seconds
* Moved checks for popcorning to the calling functions for code clarity
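
Assuming "popcorning" here means the burst of changes while the agent starts up, the stable-timeout idea can be sketched as follows: cloud commands are ignored until the agent has seen no changes for a fixed period. The 10-second value is the one named in the list above; the struct and function names are hypothetical, and the check sits in the caller, as the last bullet describes.

```c
#include <stdbool.h>
#include <time.h>

/* Seconds without changes before the agent is considered stable. */
#define STABLE_TIMEOUT_SECS 10

/* Hypothetical agent-side bookkeeping. */
struct agent_state {
    time_t last_change;  /* when a chart or collector last appeared */
};

/* Record a change; each one restarts the stabilisation window. */
static void note_change(struct agent_state *a, time_t now) {
    a->last_change = now;
}

/* The agent is stable once the window has passed without further changes. */
static bool agent_is_stable(const struct agent_state *a, time_t now) {
    return (long)(now - a->last_change) >= STABLE_TIMEOUT_SECS;
}

/* Caller-side check: returns false (command ignored) while initializing. */
static bool handle_cloud_command(const struct agent_state *a, time_t now) {
    if (!agent_is_stable(a, now))
        return false;
    /* a real implementation would process the command here */
    return true;
}
```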
|
|
Update to topic structure.
|