authorMateusz Bularz <60339703+M4itee@users.noreply.github.com>2023-11-27 10:45:18 +0100
committerMateusz Bularz <60339703+M4itee@users.noreply.github.com>2023-11-27 10:45:18 +0100
commit54d59a9a3a5127c5eee909ee311fc9bfba054486 (patch)
tree009b2bb0c8259b6973f0d99da61763f95913d51f
parent29693397a528e6f9355bbc59c5e3898fbf06c633 (diff)
adding info about hardware requirements
-rw-r--r--docs/netdata-cloud-onprem/getting-started.md29
1 file changed, 29 insertions, 0 deletions
diff --git a/docs/netdata-cloud-onprem/getting-started.md b/docs/netdata-cloud-onprem/getting-started.md
index 482214e24d..dd0c6d5bd2 100644
--- a/docs/netdata-cloud-onprem/getting-started.md
+++ b/docs/netdata-cloud-onprem/getting-started.md
@@ -22,6 +22,35 @@ Helm charts are designed for kubernetes to run as the local equivalent of the ne
- Default storage class configured and working (Persistent volumes based on SSDs are preferred)
`*` - available in dependencies helm chart for PoC applications.
+#### Hardware requirements:
+##### How we tested it:
+- Several VMs on AWS EC2; the instance size was c6a.32xlarge (128 CPUs / 256 GiB memory).
+- Host system - Ubuntu 22.04.
+- Each VM hosts 200 Agent nodes as Docker containers.
+- Agents are connected DIRECTLY to the Cloud (no Parent-Child relationships). This is the worst-case scenario for the Cloud.
+- Cloud hosted on a single Kubernetes node, c6a.8xlarge (32 CPUs / 64 GiB memory).
+- Dependencies were installed on the same node.
+- The maximum number of connected nodes was ~2000.
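+The setup above can be sketched roughly as follows (a minimal illustration of starting 200 directly-claimed Agent containers on one VM; this is not our exact tooling, and the claim token and Cloud URL values are placeholders you must supply):
+
+```bash
+#!/usr/bin/env bash
+# Sketch: start 200 Netdata Agent containers on one VM, each claimed
+# directly to the On-Prem Cloud (no Parent-Child streaming).
+# <your-claim-token> and the Cloud URL are placeholders.
+for i in $(seq 1 200); do
+  docker run -d --name "agent-${i}" \
+    -e NETDATA_CLAIM_TOKEN="<your-claim-token>" \
+    -e NETDATA_CLAIM_URL="https://your-onprem-cloud.example.com" \
+    netdata/netdata:stable
+done
+```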
+
+##### Results
+We did not try to connect more nodes, as ~2000 was sufficient for PoC purposes.
+- During the peak connection phase - all node startups were triggered within ~15 minutes:
+  - Up to 60% (20 cores) CPU usage of the Kubernetes node. Top usage came from:
+    - Ingress controller (we used the haproxy ingress controller)
+    - Postgres
+    - Pulsar
+    - EMQX
+  Combined, they were responsible for ~30-35% of the node's CPU usage.
+- Once all nodes had connected and synchronized their state, CPU usage floated between 30% and 40%, depending on what we were doing in the Cloud (e.g., browsing different views). Here the top offenders were:
+  - Pulsar
+  - Postgres
+  Combined, they were responsible for ~15-20% of the node's CPU usage.
+- Memory usage - ~45 GiB at peak. The largest share (~20 GiB) was consumed by:
+ - Postgres
+ - Elastic
+ - Pulsar
+
+For comparison - a Netdata Cloud On-Prem installation with just 100 nodes connected and without dependencies is going to consume ~2 CPUs and ~2 GiB of memory (REAL usage, not Kubernetes resource requests).
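+To see REAL usage rather than declared requests on your own cluster, something like the following can be used (a sketch, not a prescribed procedure; it assumes the metrics-server addon is installed, and the namespace and node names are placeholders):
+
+```bash
+# Actual CPU/memory consumption per node and per pod (needs metrics-server):
+kubectl top nodes
+kubectl top pods -n <your-netdata-cloud-namespace> --sort-by=memory
+
+# Declared resource requests/limits on a node, for comparison:
+kubectl describe node <node-name> | grep -A 10 "Allocated resources"
+```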
## Pulling the helm chart
The Helm chart for the Netdata Cloud On-Prem installation on Kubernetes is available in the ECR registry.