-rw-r--r--  Makefile.am                                  |   6
-rw-r--r--  README.md                                    |  18
-rw-r--r--  docs/Why-Netdata.md                          | 172
-rwxr-xr-x  docs/generator/buildyaml.sh                  |   9
-rw-r--r--  docs/why-netdata/1s-granularity.md           |  53
-rw-r--r--  docs/why-netdata/README.md                   |  30
-rw-r--r--  docs/why-netdata/immediate-results.md        |  41
-rw-r--r--  docs/why-netdata/meaningful-presentation.md  |  63
-rw-r--r--  docs/why-netdata/unlimited-metrics.md        |  44
-rw-r--r--  web/gui/demosites.html                       | 299
10 files changed, 487 insertions, 248 deletions
diff --git a/Makefile.am b/Makefile.am
index 4d7605e32d..aa4d6291c5 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -72,7 +72,11 @@ dist_noinst_DATA= \
docs/high-performance-netdata.md \
docs/netdata-for-IoT.md \
docs/netdata-security.md \
- docs/Why-Netdata.md \
+ docs/why-netdata/README.md \
+ docs/why-netdata/1s-granularity.md \
+ docs/why-netdata/unlimited-metrics.md \
+ docs/why-netdata/meaningful-presentation.md \
+ docs/why-netdata/immediate-results.md \
docs/GettingStarted.md \
docs/Charts.md \
docs/configuration-guide.md \
diff --git a/README.md b/README.md
index ac0d144191..42faf90d5e 100644
--- a/README.md
+++ b/README.md
@@ -42,7 +42,8 @@ Once you use it on your systems, **there is no going back**! *You have been warn
11. [Documentation](#documentation) - read the docs
12. [Community](#community) - discuss with others and get support
13. [License](#license) - check the license of netdata
-
+14. [Is it any good?](#is-it-any-good) - Yes
+15. [Is it awesome?](#is-it-awesome) - Yes
## How it looks
@@ -487,9 +488,22 @@ You can also find netdata on:
- [Repology](https://repology.org/metapackage/netdata/versions)
- [StackShare](https://stackshare.io/netdata)
-## License
+## License
netdata is [GPLv3+](LICENSE).
Netdata re-distributes other open-source tools and libraries. Please check the [third party licenses](REDISTRIBUTED.md).
+## Is it any good?
+
+Yes.
+
+*When people first hear about a new product, they frequently ask if it is any good. A Hacker News user [remarked](https://news.ycombinator.com/item?id=3067434):*
+
+> Note to self: Starting immediately, all raganwald projects will have a “Is it any good?” section in the readme, and the answer shall be “yes.”
+
+So, we follow the tradition...
+
+## Is it awesome?
+
+[These people](https://github.com/netdata/netdata/stargazers) seem to like it.
diff --git a/docs/Why-Netdata.md b/docs/Why-Netdata.md
deleted file mode 100644
index 5d290e01a4..0000000000
--- a/docs/Why-Netdata.md
+++ /dev/null
@@ -1,172 +0,0 @@
-# Why Netdata
-
-![image8](https://cloud.githubusercontent.com/assets/2662304/14253735/536f4580-fa95-11e5-9f7b-99112b31a5d7.gif)
-
-## Netdata is unique!
-
-The following is an animated GIF showing **netdata**'s ability to monitor QoS. The timings of this animation have not been altered, this is the real thing:
-
-![animation5](https://cloud.githubusercontent.com/assets/2662304/12373715/0da509d8-bc8b-11e5-85cf-39d5234bf976.gif)
-
-Check the details on this animation:
-
-1. At the beginning the charts auto-refresh, in real-time
-2. Charts can be dragged and zoomed (either mouse or touch)
-3. You pan or zoom one, the others follow
-4. Mouse over on one, selects the same timestamp on all
-5. Dimensions can be enabled or disabled
-6. All refreshes are instant (an 8 year old core-2 duo computer was used to record this)
-
-There are a lot of excellent open source tools for collecting and visualizing performance metrics. Check for example [collectd](https://collectd.org/), [OpenTSDB](http://opentsdb.net/), [influxdb](https://influxdata.com/), [Grafana](http://grafana.org/), etc.
-
-So, why **netdata**?
-
-Well, **netdata** has a quite different approach.
-
-## Simplicity
-
-> Most monitoring solutions require endless configuration of whatever imaginable. Well, this is a linux box. Why do we need to configure every single metric we need to monitor. Of course it has a CPU and RAM and a few disks, and ethernet ports, it might run a firewall, a web server, or a database server and so on. Why do we need to configure all these metrics?
-
-**Netdata** has been designed to auto-detect everything. Of course you can enable, tweak or disable things. But by default, if **netdata** can retrieve `/server-status` from an web server you run on your linux box, it will automatically collect all performance metrics. This happens for apache, squid, nginx, mysql, opensips, etc. It will also automatically collect all available system values for CPU, memory, disks, network interfaces, QoS (with labels if you also use [FireQOS](http://firehol.org)), etc. Even for applications that do not offer performance metrics, it will automatically group the whole process tree and provide metrics like CPU usage, memory allocated, opened files, sockets, disk activity, swap activity, etc per application group.
-
-Netdata supports plenty of [configuration](../daemon/config/). However, we have done everything we can to allow netdata to auto-detect as much as possible.
-
-Even netdata plugins are designed to support configuration-less operation. So, you just install and run netdata. You will need to configure something only if it cannot be auto-detected.
-
-> Take any performance monitoring solution and try to troubleshoot a performance problem. At the end of the day you will have to ssh to the server to understand what exactly is happening. You will have to use `iostat`, `iotop`, `vmstat`, `top`, `iperf`, `ethtool` and probably a few dozen more console tools to figure it out.
-
-With **netdata**, this need is eliminated significantly. Of course you will ssh. Just not for monitoring performance.
-
-If you install **netdata** you will prefer it over the console tools. **Netdata** visualizes the data, while the console tools just show their values. The detail is the same - I have spent pretty much time reading the source code of the console tools, to figure out what needs to do done in netdata, so that the data, the values, will be the same. Actually, **netdata** is more precise than most console tools, it will interpolate all collected values to second boundary, so that even if something took a few microseconds more to be collected, netdata will correctly estimate the per second value.
-
-**Netdata** visualizes data in ways you cannot even imagine on a console. It allows you to see the present in real-time, much like the console tools, but also the recent past, compare different metrics with each other, zoom in to see the recent past in detail, or zoom out to have a helicopter view of what is happening in longer durations, build custom dashboards with just the charts you need for a specific purpose.
-
-Most engineers that install netdata, ssh to the server to tweak system or application settings and at the same time they monitor the result of the new settings in **netdata** on their browser.
-
-## Per second data collection and visualization
-
-**Per second data collection and visualization** is usually only available in dedicated console tools, like `top`, `vmstat`, `iostat`, etc. Netdata brings per second data collection and visualization to all applications, accessible through the web.
-
-*You are not convinced per second data collection is important?*
-**Click** this image for a demo:
-
-[![image](https://cloud.githubusercontent.com/assets/2662304/12373555/abd56f04-bc85-11e5-9fa1-10aa3a4b648b.png)](http://netdata.firehol.org/demo2.html)
-
-## Realtime monitoring
-
-> Any performance monitoring solution that does not go down to per second collection and visualization of the data, is useless. It will make you happy to have it, but it will not help you more than that.
-
-Visualizing the present in **real-time and in great detail**, is the most important value a performance monitoring solution should provide. The next most important is the last hour, again per second. The next is the last 8 hours and so on, up to a week, or at most a month. In my 20+ years in IT, I needed just once or twice to look a year back. And this was mainly out of curiosity.
-
-Of course real-time monitoring requires resources. **netdata** is designed to be very efficient:
-
-1. collecting performance data is a repeating process - you do the same thing again and again. **Netdata** has been designed to learn from each iteration, so that the next one will be faster. It learns the sizes of files (it even keeps them open when it can), the number of lines and words per line they contain, the sizes of the buffers it needs to process them, etc. It adapts, so that everything will be as ready as possible for the next iteration.
-2. internally, it uses hashes and indexes (b-trees), to speed up lookups of metrics, charts, dimensions, settings.
-3. it has an in-memory round robin database based on a custom floating point number that allows it to pack values and flags together, in 32 bits, to lower its memory footprint.
-4. its internal web server is capable of generating JSON responses from live performance data with speeds comparable to static content delivery (it does not use `printf`, it is actually 11 times faster than in generating JSON compared to `printf`).
-
-**Netdata** will use some CPU and memory, but it **will not produce any disk I/O at all**, apart its logs (which you can disable if you like).
-
-Most servers should have plenty of CPU resources (I consider a hardware upgrade or application split when a server averages around 40% CPU utilization at the peak hour). Even if a server has limited CPU resources available, you can just lower the data collection frequency of **netdata**. Going from per second to every 2 seconds data collection, will cut the **netdata** CPU requirements in half and you will still get charts that are just 2 seconds behind.
-
-The same goes for memory. If you just keep an hour of data (which is perfect for performance troubleshooting), you will most probably need 15-20MB. You can also enable the kernel de-duper (Kernel Same-Page Merging) and **netdata** will offer to it all its round robin database. KSM can free 20-60% of the memory used by **netdata** (guess why: there are a lot of metrics that are always zero or just constant).
-
-When netdata runs on modern computers (even on CELERON processors), most chart queries are replied in less than 3 milliseconds! **Not seconds, MILLISECONDS!** Less than 3 milliseconds for calculating the chart, generating JSON text, compressing it and sending it to your web browser. Timings are logged in netdata's `access.log` for you to examine.
-
-Netdata is written in plain `C` and the key system plugins are written in `C` too. Its speed can only be compared to the native console system administration tools.
-
-You can also stress test your netdata installation by running the script `tests/stress.sh` found in the distribution. Most modern server hardware can serve more than 300 chart refreshes per second per core. A raspberry pi 2, can serve 300+ chart refreshes per second utilizing all of its 4 cores.
-
-
-## No disk I/O at all
-
-Netdata does not use any disk I/O, apart from its logs and even these can be disabled.
-
-Netdata will use some memory (you size it, check [[Memory Requirements]] and CPU (below 2% of a single core for the daemon, plugins may require more, check [[Performance]]), but normally your systems should have plenty of these resources available and spare.
-
-The design goal of **NO DISK I/O AT ALL** effectively means netdata will not disrupt your applications.
-
-## No root access
-
-You don't need to run netdata as root. If started as root, netdata will switch to the `netdata` user (or any other user given in its configuration or command line argument).
-
-There are a few plugins that in order to collect values need root access. These (and only these) are setuid to root.
-
-## Visualizes QoS
-
-Netdata visualizes `tc` QoS classes automatically. If you also use FireQOS, it will also collect interface and class names.
-
-Check this animated GIF (generated with [ScreenToGif](https://github.com/NickeManarin/ScreenToGif)):
-
-![animation5](https://cloud.githubusercontent.com/assets/2662304/12373715/0da509d8-bc8b-11e5-85cf-39d5234bf976.gif)
-
-## Embedded web server
-
-> Most solutions require dedicated servers to actually use the monitoring console. To my perspective, this is totally unneeded for performance monitoring. All of us have a spectacular tool on our desktops, that allows us to connect in real time to any server in the world: **the web browser**. It shouldn't be so hard to use the same tool to connect in real-time to all our servers.
-
-With **netdata**, there is no need to centralize anything for performance monitoring. You view everything directly from their source. No need to run something else to access netdata. Of course you can use a firewall, or a reverse proxy, to limit access to it. But for most systems, inside your DMZ, just running it will be enough.
-
-Still, with **netdata** you can build dashboards with charts from any number of servers. And these charts will be connected to each other much like the ones that come from the same server. You will hover on one and all of them will show the relative value for the selected timestamp. You will zoom or pan one and all of them will follow. **Netdata** achieves that because the logic that connects the charts together is at the browser, not the server, so that all charts presented on the same page are connected, no matter where they come from.
-
-## Performance monitoring, scaled properly
-
-"Properly"? What is "properly"?
-
-We know software solutions can **scale up** (i.e. you replace its resources with bigger ones), or **scale out** (i.e. you add more smaller resources to it). In both cases, to get more of it, you need to supply **more resources**.
-
-So, what is "scaled properly"?
-
-Traditionally, monitoring solutions centralize all metric data to provide unified dashboards across all servers. So, you install agents on all your servers to collect system and application metrics which are then sent to a central place for storage and processing. Depending on the solution you use, the central place can either **scale up** or **scale out** (or a mix of the two).
-
-"Scaled properly" is something completely different. "Scaled properly" minimizes the need for a "central place", so that **there is nothing to be scaled**!
-
-Wait a moment! You cannot take out the "central place" of a monitoring solution!
-
-Yes, we can! well... most of it, but before explaining how, let's see what happens today:
-
-Monitoring solutions are a key component for any online service. These solutions usually consume considerable amount of resources. This is true for both "scale-up" and "scale-out" solutions. These resources require maintenance and administration too. To balance the resources required, these monitoring solutions follow a few simple rules:
-
-1. The number of metrics collected per server is limited. They collect CPU, RAM, DISK, NETWORK metrics and a few application metrics.
-
-2. The data collection frequency of each metric is also very low, at best it is once every 10 or 15 seconds, at worst every 5 or 10 mins.
-
-Due to all the above, most centralized monitoring solutions are usually good for alarms and **statistics of past performance**. The alarms usually trigger every 1 to 5 minutes and you get a few low-resolution charts about the past performance of your servers.
-
-Well... there is something wrong in this approach! Can you see it?
-
-Let's see the netdata approach:
-
-1. Data collection happens **per second**. This allows true real-time performance monitoring.
-
-2. **Thousands of metrics** per server and application are collected, **every single second**. The number of metrics collected is not a problem.
-
-3. Data do not leave the server they are collected. Data are not centralized, so the need for a huge central place that will process and store gazillions of data is not needed.
-
- > Ok, I hear a few of you complaining already - you will find out... patience...
-
-4. netdata does not use any DISK I/O while running (apart its log files - and even these can be disabled) and netdata runs with the lowest possible process priority, so that **your applications will never be affected by it**.
-
-5. Each netdata is standalone. Your web browser connects directly to each server to present real-time dashboards. The charts are so snappy, so real-time, so fast that we can call netdata, **a console killer for performance monitoring**.
-
-The charting libraries **netdata** uses, are the fastest possible ([Dygraphs](http://dygraphs.com/) do make the difference!) and **netdata** respects browser resources. Data are just rendered on a canvas. No processing in javascript at all.
-
-6. netdata is very efficient: just 2% of a single core is required and some RAM, and you can actually control how much of both you want to allocate to it.
-
-
-Server side, chart data generation scales pretty well. You can expect 400+ chart refreshes per second per core on modern hardware. For a page with 10 charts visible (the page may have hundreds, but only the visible are refreshed), just a tiny fraction of a single CPU core will be used for servicing them. Even these refreshes stop when you switch tabs on your browser, you focus on another window, scroll to a part of the page without charts, zoom or pan a chart. And of course the **netdata** server runs with the lowest possible process priority, so that your production environment, your applications, will not be slowed down by the netdata server.
-
-7. netdata dashboards can be multi-server (check: [http://my-netdata.io](http://my-netdata.io)) - your browser connects to each netdata server directly.
-
-So, using netdata, your monitoring infrastructure is embedded on each server, limiting significantly the need of additional resources. netdata is very resource efficient and utilizes server resources that already exist and are spare (on each server).
-
-Of course, there are a few issues that need to be addressed with this approach:
-
-1. We need an index of all netdata installations we have
-2. We need a place to handle notifications and alarms
-3. We need a place to save statistics of past performance
-
-Our approach uses the netdata [registry](../registry/). The registry solves the problem of maintaining a list of all the netdata installations we have. It does this transparently, without any configuration. It tracks the netdata servers your web browser has visited and bookmarks them at the `my-netdata` menu.
-
-Every netdata can be a registry. You can use the global one we provided for free, or pick one of your netdata servers and turn it to a registry for your network.
-
-[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fdocs%2FWhy-Netdata&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)]()
diff --git a/docs/generator/buildyaml.sh b/docs/generator/buildyaml.sh
index 480476a92f..10811b17e4 100755
--- a/docs/generator/buildyaml.sh
+++ b/docs/generator/buildyaml.sh
@@ -116,14 +116,19 @@ nav:'
navpart 1 . README "About"
-echo -ne " - 'docs/Why-Netdata.md'
- - 'docs/Demo-Sites.md'
+echo -ne " - 'docs/Demo-Sites.md'
- 'docs/netdata-security.md'
- 'docs/Donations-netdata-has-received.md'
- 'docs/a-github-star-is-important.md'
- REDISTRIBUTED.md
- CHANGELOG.md
- CONTRIBUTING.md
+- Why Netdata:
+ - 'docs/why-netdata/README.md'
+ - 'docs/why-netdata/1s-granularity.md'
+ - 'docs/why-netdata/unlimited-metrics.md'
+ - 'docs/why-netdata/meaningful-presentation.md'
+ - 'docs/why-netdata/immediate-results.md'
- Installation:
- 'packaging/installer/README.md'
- 'packaging/docker/README.md'
diff --git a/docs/why-netdata/1s-granularity.md b/docs/why-netdata/1s-granularity.md
new file mode 100644
index 0000000000..0898545436
--- /dev/null
+++ b/docs/why-netdata/1s-granularity.md
@@ -0,0 +1,53 @@
+# 1s granularity
+
+High-resolution metrics are required to effectively monitor and troubleshoot systems and applications.
+
+## Why?
+
+- The world is going real-time. Today, customer experience is significantly affected by response time, so SLAs are tighter than ever before. It is just not practical to monitor a 2-second SLA with 10-second metrics.
+
+- IT is going virtual. Unlike real hardware, virtual environments are neither linear nor predictable. You cannot expect resources to be available when your applications need them. They will eventually be, but not exactly at the time they are needed. The latency of virtual environments is affected by many factors, most of which are outside our control: the maintenance policy of the hosting provider, the workload of third-party virtual machines running on the same physical servers, the resource allocation and throttling policy among virtual machines, the provisioning system of the hosting provider, and so on.
+
+## What do others do?
+
+So, why don't most monitoring platforms and monitoring SaaS providers offer high-resolution metrics?
+
+They want to, but they can't, at least not at scale.
+
+The reasons lie in their design decisions:
+
+1. Time-series databases (Prometheus, Graphite, OpenTSDB, InfluxDB, etc.) centralize all the metrics. At scale, these databases can easily become the bottleneck of the whole infrastructure.
+
+2. SaaS providers base their business models on centralizing all the metrics. On top of the time-series database bottleneck, they also face increased bandwidth costs. So, supporting high-resolution metrics at scale destroys their business model.
+
+Of course, the world solved this kind of scaling problem a couple of decades ago: instead of scaling up, scale out, horizontally. That is, instead of investing in bigger and bigger central components, decentralize the application so that it can scale by adding more, smaller nodes to it.
+
+There have been many attempts to fix this problem for monitoring. But so far, all solutions have required centralization of metrics, which can only scale up. So, although the problem is somewhat managed, it remains the key problem of all monitoring platforms and one of the main reasons for increased monitoring costs.
+
+Another important factor is how resource-efficient data collection can be when running per second. Most solutions fail here: their data collection agents consume significant system resources when running per second, noticeably influencing the monitored systems and applications.
+
+Finally, per-second data collection is a lot harder. Busy virtual environments have [a constant latency of about 100ms, spread randomly across all data sources](https://docs.google.com/presentation/d/18C8bCTbtgKDWqPa57GXIjB2PbjjpjsUNkLtZEz6YK8s/edit#slide=id.g422e696d87_0_57). If data collection is not implemented properly, this latency introduces a random error of +/- 10%, which is quite significant for a monitoring system.
+
+So, the monitoring industry largely fails to provide high-resolution metrics, for three main reasons:
+
+1. Centralizing metrics at that rate makes monitoring cost-inefficient.
+2. Data collection needs careful optimization, otherwise it will significantly affect the monitored systems.
+3. Per-second data collection is a lot harder to implement correctly, especially in busy virtual environments.
+
+## What does netdata do differently?
+
+Netdata decentralizes monitoring completely. Each Netdata node is autonomous: it collects metrics locally, stores them locally, runs checks against them locally to trigger alarms, and provides an API for the dashboards to visualize them. This allows Netdata to scale out with the infrastructure, without limit.
+
+Of course, Netdata can centralize metrics when needed. For example, it is not practical to keep metrics locally on ephemeral nodes. For these cases, Netdata streams the metrics in real-time, from the ephemeral nodes to one or more non-ephemeral nodes nearby. This centralization is again distributed. On a large infrastructure, there may be many centralization points.
+
+To eliminate the error introduced by data collection latencies on busy virtual environments, Netdata interpolates collected metrics. It does this using microsecond timings, per data source, offering measurements with an error rate of 0.0001%. When running [in debug mode, netdata calculates this error rate](https://github.com/netdata/netdata/blob/36199f449852f8077ea915a3a14a33fa2aff6d85/database/rrdset.c#L1070-L1099) for every point collected, ensuring that the database works with acceptable accuracy.
+
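+To make this concrete, below is a minimal, hypothetical sketch of the technique: a reading taken slightly after the intended second boundary is interpolated back to the exact boundary using microsecond timestamps. It only illustrates the idea; Netdata's actual implementation is in `database/rrdset.c` (linked above).
+
+```c
+#include <stdio.h>
+#include <stdint.h>
+
+typedef uint64_t usec_t;   /* microseconds */
+
+/* Linearly interpolate the value of a metric at an exact second
+ * boundary, given two raw readings taken slightly off that boundary. */
+static double interpolate_to_boundary(usec_t t_prev, double v_prev,
+                                      usec_t t_now,  double v_now,
+                                      usec_t t_boundary) {
+    if (t_now == t_prev) return v_now;   /* avoid division by zero */
+    double fraction = (double)(t_boundary - t_prev) / (double)(t_now - t_prev);
+    return v_prev + (v_now - v_prev) * fraction;
+}
+
+int main(void) {
+    /* boundary at 10.000000s; readings taken at 9.000120s and 10.000150s */
+    double v = interpolate_to_boundary(9000120,  1234.0,
+                                       10000150, 1330.0,
+                                       10000000);
+    printf("estimated value at the 10s boundary: %.3f\n", v);
+    return 0;
+}
+```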
+Finally, Netdata is really fast. Optimization is a core product feature. On modern hardware, Netdata can collect metrics at a rate above 1 million metrics per second per core (this includes everything: parsing data sources, interpolating the data, storing it in the time-series database, etc.). So, for a few thousand metrics per second per node, Netdata needs negligible CPU resources (just 1-2% of a single core).
+
+Netdata has been designed to:
+- Solve the centralization problem of monitoring.
+- Replace the console for performance troubleshooting.
+
+So, for Netdata, 1s granularity is easy; it is the natural outcome of its design.
+
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fdocs%2Fwhy-netdata%2F1s-granularity&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)]()
diff --git a/docs/why-netdata/README.md b/docs/why-netdata/README.md
new file mode 100644
index 0000000000..df8c0d02b5
--- /dev/null
+++ b/docs/why-netdata/README.md
@@ -0,0 +1,30 @@
+# Why Netdata
+
+> Any performance monitoring solution that does not go down to per-second
+> collection and visualization of the data is useless.
+> It will make you happy to have it, but it will not help you more than that.
+
+Netdata is built around 4 principles:
+
+1. **[Per-second data collection for all metrics.](1s-granularity.md)**
+
+    *It is impossible to monitor a 2-second SLA with 10-second metrics.*
+
+2. **[Collect and visualize all the metrics from all possible sources.](unlimited-metrics.md)**
+
+    *To troubleshoot slowdowns, we need all the available metrics. The console should not have access to more metrics than the monitoring tool does.*
+
+3. **[Meaningful presentation, optimized for visual anomaly detection.](meaningful-presentation.md)**
+
+    *Metrics are a lot more than name-value pairs over time. The monitoring tool should know all the metrics; its users should not have to.*
+
+4. **[Immediate results, just install and use.](immediate-results.md)**
+
+    *Most of our infrastructure is standardized. There is no point in configuring everything metric by metric.*
+
+Unlike other monitoring solutions that focus on metrics visualization,
+Netdata's goal is to help us troubleshoot slowdowns without touching the console.
+
+So, everything is a bit different.
+
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fdocs%2FWhy-Netdata&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)]()
diff --git a/docs/why-netdata/immediate-results.md b/docs/why-netdata/immediate-results.md
new file mode 100644
index 0000000000..9afe4afdcf
--- /dev/null
+++ b/docs/why-netdata/immediate-results.md
@@ -0,0 +1,41 @@
+# Immediate results
+
+Most of our infrastructure is based on standardized systems and applications.
+
+It is a tremendous waste of time and effort, on a global scale, to require all users to configure their infrastructure dashboards and alarms metric by metric.
+
+## Why?
+
+Most of the existing monitoring solutions focus on providing a platform "for building your monitoring". So, they provide the tools to collect metrics, store them, visualize them, check them and query them. And we are all expected to go through this process.
+
+However, most of our infrastructure is standardized. We run well known Linux distributions, the same kernel, the same database, the same web server, etc.
+
+So, why can't we have a monitoring system that can be installed and instantly provides feature-rich dashboards and alarms about everything we use? Is there any reason you would want to monitor your web server differently than I do?
+
+What a waste of time and money! Hundreds of thousands of people doing the same thing over and over again, trying to understand what the metrics are, how to visualize them, how to configure alarms for them and how to query them when issues arise.
+
+## What do others do?
+
+Open-source solutions rely almost entirely on configuration. So, you have to go through endless metric-by-metric configuration yourself. The result will reflect your skills, your experience, your understanding.
+
+Monitoring SaaS providers offer a very basic set of pre-configured metrics, dashboards and alarms. They assume you will configure whatever else you may need. So, once more, the result will reflect your skills, your experience, your understanding.
+
+## What does netdata do?
+
+1. Metrics are auto-detected, so for 99% of the cases data collection works out of the box.
+2. Metrics are converted to human-readable units, right after data collection and before they are stored in the database.
+3. Metrics are structured: organized into charts, families and applications, so that they can be browsed.
+4. Dashboards are automatically generated, so all metrics are available for exploration immediately after installation.
+5. Dashboards do not just visualize metrics; they are a tool, optimized for visual anomaly detection.
+6. Hundreds of pre-configured alarm templates are automatically attached to collected metrics (an illustrative sketch follows this list).
+
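+As a hedged illustration of the last point, here is a simplified sketch of what a pre-configured alarm template looks like. It follows the general shape of Netdata's health configuration (the files shipped under `health.d/`), but the metric lookup and thresholds below are made up for the example:
+
+```
+# illustrative sketch only - not an alarm template shipped with Netdata
+template: cpu_usage_high
+      on: system.cpu
+  lookup: average -10m unaligned of user,system
+   units: %
+   every: 1m
+    warn: $this > 85
+    crit: $this > 95
+    info: average CPU utilization over the last 10 minutes
+```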
+The result is that Netdata can be used immediately after installation!
+
+Netdata:
+
+- Helps engineers understand and learn what the metrics are.
+- Does not require any configuration. Of course there are thousands of options to tweak, but the defaults are pretty good for most systems.
+- Does not introduce any query languages or any other technology to be learned. Of course some familiarity with the tool is required, but nothing too complicated.
+- Embeds the community's expertise and experience in monitoring systems and applications.
+
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fdocs%2Fwhy-netdata%2Fimmediate-results&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)]()
diff --git a/docs/why-netdata/meaningful-presentation.md b/docs/why-netdata/meaningful-presentation.md
new file mode 100644
index 0000000000..6414d023f3
--- /dev/null
+++ b/docs/why-netdata/meaningful-presentation.md
@@ -0,0 +1,63 @@
+# Meaningful presentation
+
+Metrics are a lot more than name-value pairs over time. It is just not practical to require all users to have a deep understanding of every metric in order to monitor their systems and applications.
+
+## Why?
+
+There is a plethora of metrics. And each of them has a context, a meaning, a way to be interpreted.
+
+Traditionally, monitoring solutions instruct engineers to collect only the metrics they understand. This is a good strategy as long as you have a clear understanding of what you need and you have the skills, the expertise and the experience to select them.
+
+For most people, this is an impossible task. It is just not practical to assume that any engineer will have a deep understanding of how the kernel works, how the networking stack works, how the system manages its memory, how it schedules processes, how web servers work, how databases work, etc.
+
+The result is that for most of the world, monitoring sucks. It is incomplete, inefficient, and in most cases only useful for providing an illusion that the infrastructure is being monitored. It is not! According to the [State of Monitoring 2017](http://start.bigpanda.io/state-of-monitoring-report-2017), only 11% of companies are satisfied with their existing monitoring infrastructure, and on average they use 6-7 monitoring tools.
+
+But even if all the metrics are collected, an even bigger challenge is revealed: What to do with them? How to use them?
+
+Existing monitoring solutions assume that engineers will:
+
+- Design dashboards
+- Configure alarms
+- Use a query language to investigate issues
+
+However, all these have to be configured metric by metric.
+
+The monitoring industry believes in the existence of an "IT Operations Hero", a person combining all of these abilities:
+
+1. Has a deep understanding of IT architectures and is a skillful SysAdmin.
+2. Is a superb Network Administrator (can even read and understand the Linux kernel networking stack).
+3. Is an exceptional database administrator.
+4. Is fluent in software engineering, capable of understanding the internal workings of applications.
+5. Masters data science and statistical algorithms, and is fluent in writing advanced mathematical queries to reveal the meaning of metrics.
+
+Of course this person does not exist!
+
+## What do others do?
+
+Most solutions are based on a time-series database: a database that tracks name-value pairs over time.
+
+Data collection blindly gathers metrics and stores them in the database, while dashboard editors query the database to visualize the metrics. These solutions may also provide a query editor that users can use to query the database by hand.
+
+Of course, it is just not practical to work that way when the database has 10,000 unique metrics. Most of them will be just noise, not because they are not useful, but because no one understands them!
+
+So, they collect a very limited set of metrics. Basic dashboards can be created with these metrics, but for any issue that needs troubleshooting, the monitoring system is just not adequate. It cannot help. So, engineers end up using the console to access the rest of the metrics and find the root cause.
+
+## What does netdata do?
+
+In Netdata, the meaning of metrics is incorporated into the database:
+
+1. All metrics are converted to human-friendly units and stored that way. This happens at data collection time, not at visualization time. For example, CPU utilization in Netdata is stored as a percentage, not as raw kernel ticks (see the sketch after this list).
+
+2. All metrics are organized into human-friendly charts, sharing the same context and units (similar to what other monitoring solutions call `cardinality`). So, when Netdata developers collect metrics, they configure the correlation of the metrics right at data collection, and this structure is stored in the database too.
+
+3. All charts are then organized into families, and chart families are organized into applications. These structures provide the menu on the right-hand side of Netdata dashboards, used for exploring the whole database.
+
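+The sketch below illustrates the first point. It is a minimal, hypothetical example of turning raw CPU tick counters (as exposed by `/proc/stat`) into a utilization percentage at collection time; it shows the idea only, not Netdata's actual collector code:
+
+```c
+#include <stdio.h>
+#include <stdint.h>
+
+/* Raw CPU counters as exposed by /proc/stat (monotonically increasing ticks). */
+typedef struct {
+    uint64_t user, system, idle;
+} cpu_ticks_t;
+
+/* Convert two consecutive raw readings into a human-friendly value:
+ * the percentage of time the CPU was busy between the two samples.
+ * The percentage (not the raw ticks) is what gets stored. */
+static double cpu_busy_percent(cpu_ticks_t prev, cpu_ticks_t now) {
+    uint64_t busy  = (now.user - prev.user) + (now.system - prev.system);
+    uint64_t total = busy + (now.idle - prev.idle);
+    return total ? 100.0 * (double)busy / (double)total : 0.0;
+}
+
+int main(void) {
+    cpu_ticks_t prev = { .user = 1000, .system = 500, .idle = 8500 };
+    cpu_ticks_t now  = { .user = 1060, .system = 520, .idle = 8920 };
+    printf("CPU utilization: %.1f%%\n", cpu_busy_percent(prev, now));
+    return 0;
+}
+```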
+The result is a system that can be browsed by humans, even if the database has 100,000 unique metrics. It is pretty natural for everyone to browse them, understand their meaning and their scope.
+
+Of course, this process makes data collection significantly more time-consuming for the Netdata developers: they need to normalize, correlate and categorize every single metric Netdata collects.
+
+But it simplifies everything else. Data collection, the metrics database and visualization are decoupled, so the query engine is simpler and the visualization is straightforward.
+
+Netdata goes a step further by enriching the dashboard with information that is useful to most people. To improve clarity and help users be more effective, Netdata includes the community's knowledge and expertise about the metrics right in the dashboard, so that Netdata users can focus on solving their infrastructure problems, not on the technicalities of data collection and visualization.
+
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fdocs%2Fwhy-netdata%2Fmeaningful-presentation&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)]()
diff --git a/docs/why-netdata/unlimited-metrics.md b/docs/why-netdata/unlimited-metrics.md
new file mode 100644
index 0000000000..e35034a2b8
--- /dev/null
+++ b/docs/why-netdata/unlimited-metrics.md
@@ -0,0 +1,44 @@
+# Unlimited metrics
+
+All metrics are important and all metrics should be available when you need them.
+
+## Why?
+
+Collecting all the metrics breaks the first rule of every monitoring textbook: "collect only the metrics you need", "collect only the metrics you understand".
+
+Unfortunately, this does not work! Filtering out most metrics is like reading a book by skipping most of its pages...
+
+For many people, monitoring is about:
+
+- Detecting outages
+- Capacity planning
+
+However, **slowdowns are 10 times more common** than outages (check slide 14 of [Online Performance is Business Performance](https://www.slideshare.net/KenGodskind/alertsitetrac), reported by Trac Research/AlertSite). Designing a monitoring system that targets only outages and capacity planning solves just a tiny part of the operational problems we face. Check also [Downtime vs. Slowtime: Which Hurts More?](https://dzone.com/articles/downtime-vs-slowtime-which-hurts-more).
+
+To troubleshoot a slowdown, a lot more metrics are needed. Actually, all the metrics are needed, since the real cause of a slowdown is most probably quite complex. If we knew the possible causes, chances are we would have fixed them before they became a problem.
+
+## What do others do?
+
+Most monitoring solutions, when they are able to detect something, provide just a hint (e.g. "hey, there is a 20% drop in requests per second over the last minute") and they expect us to use the console for determining the root cause.
+
+Of course, this introduces a lot more problems: how do you troubleshoot a slowdown using the console, if the slowdown lasts just a few seconds, randomly spread throughout the day?
+
+You can't! You will spend your entire day on the console, waiting for the problem to happen again while you are logged in. A blame war starts: developers blame the systems, sysadmins blame the hosting provider, someone says it is a DNS problem, someone else believes it is network-related, etc. We have all experienced this, multiple times...
+
+So, why do monitoring solutions and SaaS providers filter out metrics?
+
+They can't do otherwise!
+
+1. Centralization of metrics depends on metrics filtering, to control monitoring costs. Time-series databases limit the number of metrics collected, because the number of metrics influences their performance significantly. They get congested at scale.
+2. It is a lot easier to provide an illusion of monitoring by using a few basic metrics.
+3. Troubleshooting slowdowns is the hardest IT problem to solve, so most solutions just avoid it.
+
+## What does netdata do?
+
+Netdata collects, stores and visualizes everything, every single metric exposed by systems and applications.
+
+Due to Netdata's distributed nature, the number of metrics collected does not have any noticeable effect on the performance or the cost of the monitoring infrastructure.
+
+Of course, since Netdata is also about [meaningful presentation](meaningful-presentation.md), the number of metrics makes Netdata development slower. We, the Netdata developers, need to have a good understanding of the metrics before adding them to Netdata. We need to organize the metrics, add information related to them and configure alarms for them, so that you, the Netdata users, will have the best out-of-the-box experience and all the information required to kill the console for troubleshooting slowdowns.
+
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fdocs%2Fwhy-netdata%2Funlimited-metrics&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)]()
diff --git a/web/gui/demosites.html b/web/gui/demosites.html
index df1d9795ae..33a771db49 100644
--- a/web/gui/demosites.html
+++ b/web/gui/demosites.html
@@ -1,6 +1,6 @@
<!doctype html>
<!-- SPDX-License-Identifier: GPL-3.0-or-later -->
-<html lang=en-us>
+<html lang=en-us xmlns="http://www.w3.org/1999/html">
<head>
<meta charset=utf-8>
<title>NetData: Get control of your Linux Servers. Simple. Effective. Awesome.</title>
@@ -60,7 +60,7 @@ strong {
}
h1 {
- font-size: 3em;
+ font-size: 2.9em;
line-height: 1.2em;
margin: 0 .5em .75em
}
@@ -80,37 +80,43 @@ img {
}
a:active, a:focus, a:hover {
- text-decoration: underline
+ text-decoration: underline;
}
::-moz-selection {
background-color: #b3d4fc;
- text-shadow: none
+ text-shadow: none;
}
::selection {
background-color: #b3d4fc;